Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference jobs in United States

Amazon Web Services (AWS) · 5 hours ago

Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Amazon Web Services (AWS) is seeking a Senior Software Development Engineer to join the Annapurna Labs team, which builds the AWS Neuron SDK for accelerating deep learning workloads. This role involves architecting and implementing critical features, mentoring engineers, and optimizing machine learning models for performance on AWS's custom ML accelerators.

Agentic AI, Consulting, DevOps, Information Technology, Software, Web Development
H1B Sponsor Likely

Responsibilities

Help lead efforts to build distributed inference support for PyTorch in the Neuron SDK
Tune models for the highest performance and maximum efficiency when running on customers' AWS Trainium and Inferentia silicon and servers
Collaborate across compiler, runtime, framework, and hardware teams to optimize machine learning workloads for our global customer base
Work at the intersection of software, hardware, and machine learning systems
Bring expertise in low-level optimization, system architecture, and ML model acceleration
Develop and performance-tune a wide variety of LLM families, including 500B+ large language models such as the Llama family, DeepSeek, and beyond
Work side by side with performance, compiler and runtime engineers to create, build and tune distributed inference solutions with Trainium and Inferentia
Build infrastructure to systematically analyze and onboard multiple models with diverse architecture
Collaborate with the performance team to enable and evaluate optimizations such as fusion, sharding, tiling, and scheduling
Conduct comprehensive testing, including unit and end-to-end model testing with continuous deployment and releases through pipelines
Work directly with customers to enable and optimize their ML models on AWS accelerators
Collaborate across teams to develop innovative optimization techniques
Build online/offline inference serving with vLLM, SGLang, TensorRT or similar platforms in production environments
Debug performance issues, optimize memory usage, and shape the future of Neuron's inference stack across Amazon and the open source community
Create metrics, implement automation and other improvements, and resolve the root cause of software defects
Build high-impact solutions for our large customer base, participate in design discussions and code reviews, and communicate with internal and external stakeholders
Work cross-functionally to help drive business decisions with your technical input

Qualifications

Machine Learning, Python, System Level Programming, Distributed Inference, LLM Optimization, Performance Tuning, Software Development Lifecycle, Collaboration, Mentorship, Problem Solving

Required

5+ years of non-internship professional software development experience
5+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
Knowledge of the fundamentals of machine learning and LLMs, including their architectures and their training and inference lifecycles, along with work experience on optimizations that improve model execution
Experience programming with at least one software programming language

Preferred

5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Master's degree in computer science or equivalent

Benefits

Equity
Sign-on payments
Full range of medical, financial, and/or other benefits

Company

Amazon Web Services (AWS)

Launched in 2006, Amazon Web Services (AWS) began exposing key infrastructure services to businesses in the form of web services -- now widely known as cloud computing.

H1B Sponsorship

Amazon Web Services (AWS) has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (22803)
2024 (21175)
2023 (19057)
2022 (24088)
2021 (12233)
2020 (14881)

Funding

Current Stage
Late Stage
Total Funding
unknown
Key Investors
BIRD Foundation
2025-01-22 · Grant

Leadership Team

Matt Garman
Chief Executive Officer
Anand Desikan
CTO, CXO Advisor, and Enterprise Technologist
Company data provided by Crunchbase