San Francisco Compute Company · 3 months ago
Supercomputing Engineer
SF Compute is focused on revolutionizing the compute market by enabling real-time trading of compute contracts. The Supercomputing Engineer will manage high-performance ML training clusters, ensure their smooth operation, and engage with customers while leveraging automation for hardware management.
Telecom & Communications · Information Technology · Internet
Responsibilities
Keeping compute clusters running smoothly
Monitoring hardware health
Participating in on-call rotation
Fixing issues when they arise
Qualifications
Required
You've managed at least one GPU training cluster in the past (ideally one with >1k GPUs, though that is not required)
You deeply understand Linux, networking fundamentals, CUDA, NCCL, and InfiniBand
You enjoy automating hardware deployments, leveraging IaC wherever possible
You appreciate and value good documentation
Preferred
Experience with Rust (our bare metal tooling is written in Rust)
Experience with Linux virtualization (KVM, QEMU, libvirt, etc.)
Experience with Kubernetes implementation, including CRDs and CNIs
Experience with HPC network architectures (eBGP, fat-tree, VXLAN, MCLAG, etc.)
Benefits
GENEROUS EQUITY GRANT
VISA SPONSORSHIPS
RETIREMENT MATCHING
MEDICAL, DENTAL & VISION
TIME OFF
PARENTAL LEAVE
DAILY LUNCH
UNLIMITED OFFICE BOOK BUDGET
Company
San Francisco Compute Company
Compute is a commodity. We think people should buy it like one.
Funding
Current Stage: Early Stage
Total Funding: $52M
Key Investors: DCVC, Wing Venture Capital, Altman Capital
2025-11-26 · Series A · $40M
2024-07-16 · Series Unknown · $12M
Company data provided by Crunchbase