Hyperbolic · 1 month ago
Head of Infrastructure
Hyperbolic Labs is on a mission to democratize AI by breaking down barriers to computing power with its Open-Access AI Cloud. The Head of Infrastructure will architect and scale systems for a peer-to-peer GPU marketplace and lead a high-performing engineering team to ensure reliability and performance in a fast-paced environment.
Artificial Intelligence (AI) · Cloud Computing · Information Technology · Machine Learning · Software
Responsibilities
Lead the design, evolution, and reliability of Hyperbolic’s globally distributed GPU cloud
Own the infrastructure roadmap end-to-end—from distributed systems design and resource orchestration to networking, security, and global capacity strategy
Grow and mentor a world-class engineering organization
Establish engineering excellence standards
Partner closely with Product, Security, Platform, and GTM leadership to translate future AI workloads into infrastructure reality
Qualifications
Required
10+ years in infrastructure, systems engineering, or distributed systems, including 5+ years leading managers and senior ICs
Proven ability to own multi-year infrastructure roadmaps, align stakeholders, and translate ambiguous requirements into crisp technical direction
Experience building, scaling, and mentoring high-performing engineering orgs across infrastructure, platform, and SRE disciplines
Exceptional judgment in balancing velocity with reliability, cost, and security
Comfortable working in fast-moving, high-stakes environments where infrastructure is the product
Deep expertise in distributed systems, operating systems internals, networking, and resource orchestration
Hands-on experience with container orchestration systems (Kubernetes, Nomad, SLURM, custom schedulers) at global scale
Strong engineering background with the ability to read and write production code (Go, Rust, Python, or similar)
Experience architecting multi-cloud + on-prem + edge topologies, including GPU-centric workloads
Expert-level understanding of infrastructure-as-code, automation frameworks, and GitOps workflows
Expertise in designing observability systems (metrics, tracing, logging, alerting) and building operational excellence
A track record of owning 99.9–99.99% uptime targets, incident response processes, and resilience engineering
Passionate about security-first infrastructure, including workload isolation, network security, IAM, hardening, and compliance
Experience leading major capacity planning, load forecasting, and cost optimization initiatives
Preferred
Contributions to open-source infra tools, kernels, schedulers, or distributed systems libraries
Familiarity with service mesh, mTLS, RPC frameworks, or low-latency communication patterns
Benefits
Equity
Health
Remote policy
Hardware budget
Offsites
Company
Hyperbolic
Hyperbolic is the open-access AI cloud made for AI developers, providing fast, affordable access to compute, inference, and AI services.
H1B Sponsorship
Hyperbolic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (5)
2024 (3)
2023 (1)
Funding
Current Stage: Early Stage
Total Funding: $19M
2024-12-10 · Series A · $12M
2024-07-30 · Seed · $7M
2023-01-01 · Seed
Company data provided by Crunchbase