Featherless AI · 20 hours ago
AI Researcher — Inference Optimization
Featherless AI is seeking an AI Researcher with deep experience in inference optimization to design, evaluate, and deploy high-performance inference systems for large-scale machine learning models. The role focuses on improving latency, throughput, and cost efficiency in real-world production environments, and involves collaborating with engineering teams to deploy optimized pipelines.
Artificial Intelligence (AI) · Cloud Computing · Database
Responsibilities
Research and develop techniques to optimize inference performance for large neural networks
Improve latency, throughput, memory efficiency, and cost per inference
Design and evaluate model-level optimizations (quantization, pruning, KV-cache optimization, architecture-aware simplifications)
Implement systems-level optimizations (dynamic batching, kernel fusion, multi-GPU inference, prefill vs decode optimization)
Benchmark inference workloads across hardware accelerators
Collaborate with engineering teams to deploy optimized inference pipelines
Translate research insights into production-ready improvements
Qualifications
Required
Strong background in machine learning, deep learning, or AI systems
Hands-on experience optimizing inference for large-scale models
Proficiency in Python and modern ML frameworks (e.g., PyTorch)
Experience with inference tooling (e.g., Triton, TensorRT, vLLM, ONNX Runtime)
Ability to design experiments and communicate results clearly
Preferred
Experience deploying production inference systems at scale
Familiarity with distributed and multi-GPU inference
Experience contributing to open-source ML or inference frameworks
Authorship or co-authorship of peer-reviewed research papers in machine learning, systems, or related fields
Experience working close to hardware (CUDA, ROCm, profiling tools)
Company
Featherless AI
We enable serverless inference via our GPU orchestration and model load-balancing system.
H1B Sponsorship
Featherless AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not reproduced)
Trends of Total Sponsorships: 2025 (1)
Funding
Current Stage: Early Stage
Total Funding: $5M
Key Investors: Airbus Ventures
2025-10-31 · Series A
2025-03-17 · Seed · $5M
Company data provided by Crunchbase