Modular · 1 day ago
Software Engineer, Inference
Modular is on a mission to revolutionize AI infrastructure by rebuilding the AI software stack. The role involves building end-to-end distributed LLM inference deployments, focusing on operational excellence and collaboration with various teams to enhance application performance.
AI Infrastructure · Artificial Intelligence (AI) · Generative AI · Machine Learning · Software
Responsibilities
Build & ship Modular’s LLM-focused inference services using best-in-class inference techniques (e.g., disaggregated inference, multi-node deployment of large models, high-performance networking, high-throughput batch processing)
Build the distributed systems needed to support high-performance inference (e.g., distributed KV cache, expert-parallel request routing & rebalancing)
Push the envelope for operational excellence with request-to-kernel observability, multi-cloud deployments, clever autoscaling, cold-start optimizations, and more
Collaborate with our kernels and genAI teams to achieve SOTA application performance by integrating SOTA kernel & serving optimizations with SOTA cluster optimizations
Build Helm charts, Kubernetes operators, and more to create simple, effective, maintainable deployments
Qualifications
Required
5+ years of experience working in backend engineering
Experience working on high-scale ML inference infrastructure (traditional AI or genAI)
Experience with Kubernetes and operating your own services
Ability to create durable, reusable software tools and libraries that are leveraged across teams and functions
Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture
Strongly identifies with our core company cultural values
Preferred
Experience with high-performance computing / networking (RDMA, RoCE, InfiniBand, etc.)
Experience with LLM frameworks (vLLM, SGLang, TensorRT-LLM)
Familiarity with Go
Benefits
Premier insurance plans
Up to 5% 401(k) matching
Flexible paid time off
Annual target bonus
Equity
Company
Modular
Modular provides AI infrastructure for deployment, serving, and programming GPUs.
H1B Sponsorship
Modular has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is provided below for
your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (10)
2024 (6)
2023 (8)
2022 (4)
Funding
Current Stage: Growth Stage
Total Funding: $380M
Key Investors: US Innovative Technology Fund, General Catalyst, Google Ventures
2025-09-24 · Series C · $250M
2023-08-24 · Series B · $100M
2022-06-30 · Seed · $30M
Company data provided by Crunchbase