Tech Lead Architect (AI Infrastructure & Distributed Systems) jobs in United States

Arc.dev · 1 week ago

Tech Lead Architect (AI Infrastructure & Distributed Systems)

Arc.dev is building next-generation AI infrastructure focused on ultra-fast model inference and scalable LLM hosting. They are seeking a Tech Lead Architect with deep experience in distributed systems and cloud-scale engineering to define technical vision and lead the architecture of their AI platform.

Career Planning, Human Resources, Recruiting
H1B Sponsor Likely

Responsibilities

Architect the overall AI serving platform (model execution engine, routing, safety, observability)
Design multi-node LLM inference pipelines optimized for throughput, latency, and cost
Implement architectural frameworks that support thousands of concurrent model requests
Establish core engineering principles and technical direction
Define GPU cluster layout, scheduling strategies, sharding, and resource isolation
Optimize performance across heterogeneous GPU fleets (A100, H100, L40, 4090, etc.)
Lead decisions around vLLM, TensorRT-LLM, DeepSpeed-Inference, or custom kernels
Architect distributed compute layers, RPC frameworks, autoscaling, and fault tolerance
Lead decisions across data plane, control plane, orchestration, and microservices structure
Build high-availability systems that serve AI workloads reliably at scale
Design clean, developer-first APIs for inference, embeddings, fine-tuning, and model management
Work with product teams to define how developers interact with the platform
Architect logging, token accounting, rate limiting, and streaming protocols
Make key architectural decisions that define the company’s long-term technical roadmap
Mentor and guide engineers (backend, infra, ML, frontend)
Interview, hire, and help scale the engineering team
Work directly with the founders on strategy, vision, and roadmap

Qualifications

Distributed systems, GPU programming, Cloud infrastructure, High-scale backend systems, Kubernetes, Model inference, Technical leadership, Microservices, Cost-efficiency, Mentoring

Required

7+ years of experience in software engineering, infrastructure, or systems architecture
Strong experience with distributed systems and microservices
Strong experience with GPU programming, CUDA, or model inference
Strong experience with cloud infrastructure (AWS/GCP/Azure)
Strong experience with Kubernetes, Ray, or container orchestration
Strong experience with high-scale backend systems and APIs
Proven ability to architect large systems end to end
Experience with high-performance systems (latency, throughput, batching, caching)
Strong instincts around reliability, scalability, and cost-efficiency

Preferred

Experience building or contributing to inference frameworks (vLLM, TensorRT-LLM, TGI)
Deep understanding of LLM internals, KV cache, quantization, and tensor parallelism
Experience with data streaming, tracing, profiling, or log-based architectures
Experience with ML training, fine-tuning pipelines, or the Hugging Face ecosystem
Startup/founding experience or appetite for zero-to-one environments
Background in cloud cost optimization or infra financial modeling

Benefits

Competitive salary + founder-level equity

Company

Arc.dev

Arc is a global marketplace of top remote talent, vetted and ready to interview.

H1B Sponsorship

Arc.dev has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (2)
2022 (3)

Funding

Current Stage: Early Stage
Total Funding: $1M
Key Investors: Hyphen Capital
2021-04-28 · Equity Crowdfunding · $1M
2021-03-01 · Seed

Leadership Team

Weiting Liu
Founder & CEO

Company data provided by Crunchbase