Manager, Super Intelligence HPC Support jobs in United States

Lambda · 3 months ago

Manager, Super Intelligence HPC Support

Lambda is a company focused on building Gigawatt-scale AI Factories for Training and Inference. They are seeking a hands-on leader to build and guide their Super Intelligence HPC Support Engineering team, responsible for delivering world-class support to complex customers operating hyperscale GPU clusters.

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Data Center, GPU, Machine Learning
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Lead & Develop: Build, coach, and mentor a team of Super Intelligence HPC Support Engineers, ensuring technical excellence and strong execution in customer-facing work
Escalation Ownership: Take point on high-visibility incidents and escalations with hyperscale customers, ensuring timely, transparent, and high-quality outcomes
Customer Advocacy: Represent the needs of Super Intelligence customers in cross-functional discussions, influencing product design and roadmap decisions to improve supportability
Incident Leadership: Guide your team through major incidents, driving consistency in communication, coordination, and resolution under pressure
Operational Excellence: Define and refine support processes, runbooks, and documentation tailored to hyperscale environments
Partnership: Collaborate closely with Product, Engineering, and Data Center teams to ensure Lambda delivers reliable, scalable solutions at the largest levels of deployment
Metrics & Accountability: Monitor team performance, drive improvements in SLA adherence, response/resolution quality, and customer satisfaction
Hands-On Leadership: Step in to troubleshoot complex issues and model the standard of excellence expected from your team

Qualifications

HPC expertise, GPU clusters, Linux administration, Slurm, Kubernetes, InfiniBand, Networking certifications, Customer advocacy, Team leadership, Communication

Required

Proven track record leading technical support or engineering teams serving enterprise or hyperscale customers
Skilled at managing customer escalations and major incidents with clarity, confidence, and urgency
Deep expertise in HPC environments including GPU clusters, InfiniBand/RoCE networks, and Linux system administration
Ability to guide engineers through troubleshooting at scale, from orchestration (Slurm/Kubernetes) down to kernel-level debugging
Strong leadership presence: able to inspire, set direction, and build a culture of accountability and customer-first execution
Excellent communication skills, capable of engaging with both engineers and executive stakeholders

Preferred

Advanced degree in Computer Science, Engineering, or related field
Certifications in HPC, networking, or related technologies
Experience with Slurm, Kubernetes, InfiniBand, and other high-performance interconnects (RoCE, NVLink/NVSwitch)
Background supporting Private Cloud environments or other dedicated enterprise clusters
Experience supporting enterprise AI workloads across startups and Fortune 500 companies

Benefits

Health, dental, and vision coverage for you and your dependents
Wellness and Commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use

Company

Lambda

Lambda is a cloud-based platform that provides high-performance GPU hardware and cloud infrastructure for AI model training and inference.

H1B Sponsorship

Lambda has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (16)
2024 (1)
2023 (3)
2022 (2)
2021 (2)
2020 (3)

Funding

Current Stage: Late Stage
Total Funding: $3.19B
Key Investors: TWG Global, JP Morgan, Macquarie Group
2025-11-18 · Series E · $1.5B
2025-08-19 · Debt Financing · $275M
2025-02-19 · Series D · $480M

Leadership Team

Stephen Balaban
Co-founder, CEO
Michael Balaban
Co-founder, CTO
Company data provided by Crunchbase