Stability AI · 1 month ago
Machine Learning Data Engineer, Research - 3D Data Focus
Stability AI is seeking a talented ML Data Engineer, Research with a focus on 3D data to join their Data team. The role involves improving and scaling data infrastructure for training generative AI models, particularly in the realm of 3D computer graphics.
Artificial Intelligence (AI) · Generative AI · Image Recognition · Information Technology · Software
Responsibilities
Ingest, clean, normalize, and preprocess data in a scalable, parallelizable way to prepare it for our machine learning model training pipelines while ensuring data quality
Design, implement, and maintain scalable data infrastructure for generative AI
Develop tools to search, curate, and serve the data at scale
Collaborate with research teams to understand and meet their data requirements
Maintain and improve data quality and integrity across various databases and data stores
Manage and organize large-scale unstructured data, including image, text, audio, video and 3D
Keep up to date with the latest data trends and research papers, and implement novel ways to create synthetic data
Qualifications
Required
Background in real-time or offline 3D graphics and rendering (either academic or industry)
Experience with 3D technologies such as Blender, Houdini, 3ds Max, Maya, and with concepts such as PBR materials and tools for PBR editing (e.g., Substance 3D)
Experience with shader development (especially for offline rendering)
Experience with common high-end 3D renderers (e.g., Cycles, V-Ray, Octane, RenderMan, Arnold, Corona)
Experience with procedural 3D data generation
Experience with real-time 3D graphics APIs and shader programming (e.g., OpenGL, Vulkan, Direct3D, GLSL, HLSL)
Experience with image processing and computer vision methods and frameworks (e.g., Open3D, OpenCV)
Experience with a diverse range of 3D geometry representations and file formats (e.g., triangular or quad meshes, voxels, NeRFs, SDFs, point clouds, etc.)
Experience with a diverse range of 2D image file formats (e.g., 16/32-bit image depth, HDR data in formats such as OpenEXR)
Experience with automation and scripting for 3D asset pipelines (e.g., Blender BPY, MEL, HScript)
Strong understanding of modern 3D file formats such as USD and glTF
Proven background in large-scale distributed workloads and in multi-component pipeline synchronization and orchestration
Experience fine-tuning large generative models (e.g., diffusion, language, or multimodal) on domain-specific datasets
Experience with cloud storage and file systems; AWS (S3) is strongly preferred, but we are open to other cloud platforms
Experience with Python, and additional languages such as C/C++, JavaScript
Expertise in data warehouse and lakehouse technologies (Postgres, MongoDB, BigQuery, Delta Lake, Iceberg, etc.)
Experience working with AI research teams and curating data for them
Good teamwork and communication skills, grounded in experience working with a distributed international team across time zones and cultures
Experience working effectively remotely
Excellent communication skills to effectively collaborate with users, solve issues, and provide guidance
Attention to detail and the ability to document processes and solutions effectively
Company
Stability AI
Stability AI is an artificial intelligence company focused on developing open-source generative AI models.
Funding
Current Stage: Growth Stage
Total Funding: $256M
Key Investors: WPP, Intel
2025-03-05 · Corporate Round
2024-06-25 · Series Unknown · $80M
2023-11-09 · Convertible Note · $50M
Recent News
TechWire Asia · 2026-01-09
Company data provided by Crunchbase