HeyGen · 3 weeks ago
Tech Lead, AI Compute Infrastructure
HeyGen is a company focused on making visual storytelling accessible to all. They are seeking a seasoned Technical Leader to build and scale the foundational compute infrastructure that powers their AI models, ensuring robust, efficient, and scalable platforms for generative video models.
E-Learning · Generative AI · Software · Web Apps
Responsibilities
Optimize GPU Utilization: Design and implement mechanisms to aggressively optimize GPU and cluster utilization across thousands of devices for inference, training, data processing, and large-scale deployment of our state-of-the-art video generation models
Develop Large-Scale AI Job Framework: Build highly scalable, reliable frameworks for launching and managing massive, heterogeneous compute jobs, including multi-modal high-volume data ingestion/processing, distributed model training, and continuous evaluation/benchmarking
Enhance Observability: Develop world-class observability, tracing, and visualization tools for our compute cluster to ensure reliability and diagnose performance bottlenecks (e.g., memory, bandwidth, communication)
Accelerate Pipelines: Collaborate closely with AI researchers and AI engineers to integrate innovative acceleration techniques (e.g., custom CUDA kernels, distributed training libraries) into production-ready, scalable training and inference pipelines
Infrastructure Management: Champion the adoption and optimization of modern cloud and container technologies (Kubernetes, Ray) for elastic, cost-efficient scaling of our distributed systems
Qualifications
Required
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience
5+ years of full-time industry experience in large-scale MLOps, AI infrastructure, or HPC systems
Experience with data frameworks and standards such as Ray, Apache Spark, and LanceDB
Strong proficiency in Python and a high-performance language such as C++ for developing core infrastructure components
Deep understanding and hands-on experience with modern orchestration and distributed computing frameworks such as Kubernetes and Ray
Experience with core ML frameworks such as PyTorch, TensorFlow, or JAX
Preferred
Master's or PhD in Computer Science or a related technical field
Demonstrated Tech Lead experience, driving projects from conceptual design through to production deployment across cross-functional teams
Prior experience building infrastructure specifically for Generative AI models (e.g., diffusion models, GANs, or large language models) where cost and latency are critical
Proven background in building and operating large-scale data infrastructure (e.g., Ray, Apache Spark) to manage petabytes of multi-modal data (video, audio, text)
Expertise in GPU acceleration and deep familiarity with low-level compute programming, including CUDA, NCCL, or similar technologies for efficient inter-GPU communication
Benefits
Competitive salary and benefits package.
Dynamic and inclusive work environment.
Opportunities for professional growth and advancement.
Collaborative culture that values innovation and creativity.
Access to the latest technologies and tools.
Company
HeyGen
HeyGen is an AI video generation platform that specializes in video creation, AI avatars, and generative AI.
H1B Sponsorship
HeyGen has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (11)
2024 (5)
Funding
Current Stage
Growth Stage
Total Funding
$69M
Key Investors
Benchmark
2024-03-25 · Series A · $60M
2022-11-08 · Seed · $9M
Company data provided by Crunchbase