Harrison Clarke · 23 hours ago
Cloud Engineer
Harrison Clarke is seeking a Senior Infrastructure Engineer to design and scale the core systems behind a next-generation AI platform. This hands-on role involves creating the infrastructure layer for large-scale AI workloads while collaborating with applied ML teams to ensure reliable model serving.
Responsibilities
Architecting and scaling infrastructure for low-latency, high-throughput AI inference
Managing GPU resources and multi-tenant workloads using Kubernetes and cloud-native tooling
Designing and operating core infrastructure components including infrastructure-as-code, container orchestration, monitoring, logging, and networking
Building platform-level capabilities such as authentication, rate limiting, telemetry, alerting, and system health monitoring
Owning infrastructure tradeoffs across performance, availability, and cost as usage scales
Working closely with machine learning engineers to productionize and optimize model serving pipelines
Establishing best practices and patterns for operating AI systems at scale in a fast-moving startup environment
Qualifications
Required
Strong background in infrastructure, platform, or systems engineering
Deep experience operating Kubernetes-based systems at scale, including GPU scheduling and workload orchestration
Familiarity with service mesh, global traffic routing, and high-availability architectures
Fluency in infrastructure-as-code, CI/CD workflows, and modern observability stacks
Solid systems fundamentals: distributed systems, performance tuning, concurrency, caching, and cost optimization
Hands-on experience with model inference and serving technologies (e.g., Triton, ONNX Runtime, vLLM, TensorRT, or similar)
Good understanding of cloud security and data management in production environments
Comfort working in early-stage settings with high ownership, ambiguity, and rapid iteration