Arc.dev · 1 week ago
Tech Lead Architect (AI Infrastructure & Distributed Systems)
Arc.dev is building next-generation AI infrastructure focused on ultra-fast model inference and scalable LLM hosting. They are seeking a Tech Lead Architect with deep experience in distributed systems and cloud-scale engineering to define technical vision and lead the architecture of their AI platform.
Career Planning · Human Resources · Recruiting
Responsibilities
Architect the overall AI serving platform (model execution engine, routing, safety, observability)
Design multi-node LLM inference pipelines optimized for throughput, latency, and cost
Implement architectural frameworks that support thousands of concurrent model requests
Establish core engineering principles and technical direction
Define GPU cluster layout, scheduling strategies, sharding, and resource isolation
Optimize performance across heterogeneous GPU fleets (A100, H100, L40, 4090, etc.)
Lead decisions around vLLM, TensorRT-LLM, DeepSpeed-Inference, or custom kernels
Architect distributed compute layers, RPC frameworks, autoscaling, and fault tolerance
Lead decisions across data plane, control plane, orchestration, and microservices structure
Build high-availability systems that serve AI workloads reliably at scale
Design clean, developer-first APIs for inference, embeddings, fine-tuning, and model management
Work with product teams to define how developers interact with the platform
Architect logging, token accounting, rate limiting, and streaming protocols
Make key architectural decisions that define the company’s long-term technical roadmap
Mentor and guide engineers (backend, infra, ML, frontend)
Interview, hire, and help scale the engineering team
Work directly with the founders on strategy, vision, and roadmap
Qualifications
Required
7+ years of experience in software engineering, infrastructure, or systems architecture
Strong experience with: Distributed systems + microservices
Strong experience with: GPU programming, CUDA, or model inference
Strong experience with: Cloud infrastructure (AWS/GCP/Azure)
Strong experience with: Kubernetes, Ray, or container orchestration
Strong experience with: High-scale backend systems and APIs
Proven ability to architect large systems end to end
Experience with high-performance systems (latency, throughput, batching, caching)
Strong instincts around reliability, scalability, and cost-efficiency
Preferred
Experience building or contributing to inference frameworks (vLLM, TensorRT-LLM, TGI)
Deep understanding of LLM internals, KV cache, quantization, tensor parallelism
Experience with data streaming, tracing, profiling, or log-based architectures
Experience with ML training, fine-tuning pipelines, or the Hugging Face ecosystem
Startup/founding experience or appetite for zero-to-one environments
Background in cloud cost optimization or infra financial modeling
Benefits
Competitive salary + founder-level equity
Company
Arc.dev
Arc is a global marketplace of top remote talent, vetted and ready to interview.
H1B Sponsorship
Arc.dev has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships: 2025 (2), 2022 (3)
Funding
Current Stage: Early Stage
Total Funding: $1M
Key Investors: Hyphen Capital
2021-04-28: Equity Crowdfunding · $1M
2021-03-01: Seed
Company data provided by Crunchbase