AI Software Development Eng. jobs in United States

AMD · 22 hours ago

AI Software Development Eng.

AMD builds products that enhance computing experiences across domains including AI and data centers. The company is seeking a software engineer to work on distributed inference on AMD GPUs, optimizing the performance and scalability of key applications and benchmarks in the AI/ML space.

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Distributed AI Enablement and Benchmarking: Enable and benchmark AI models on large-scale distributed systems to evaluate performance, accuracy, and scalability
Scalable Systems Optimization: Optimize AI workloads across scale-up (multi-GPU), scale-out (multi-node), and scale-across distributed system configurations
Cross-Team Collaboration: Collaborate closely with internal GPU library teams to analyze and optimize distributed workloads for high throughput and low latency
Parallelization Strategies: Develop and apply optimal parallelization strategies for AI workloads to achieve best-in-class performance across diverse system configurations
Model Infrastructure and Management: Contribute to distributed model management systems, model zoos, monitoring frameworks, benchmarking pipelines, and technical documentation
Performance Monitoring and Visualization: Build and maintain real-time dashboards reporting performance, accuracy, and reliability metrics for internal stakeholders and external users

Qualifications

C++, Python, Distributed Systems, AI Frameworks, Cluster Management, CI/CD Tools, Quality Assurance, Problem-Solving, Collaboration

Required

Strong technical expertise in C++/Python development
Experience solving performance problems and investigating scalability issues on multi-GPU, multi-node clusters
Passionate about quality assurance, benchmarking, and automation in the AI/ML space
Ability to thrive in both collaborative and independent environments
Excellent problem-solving skills
Ownership in defining goals and delivering impactful solutions
Master's or PhD degree in Computer Science, Computer Engineering, or a related field, or equivalent practical experience

Preferred

Hands-on experience with AI inference or serving frameworks such as vLLM, SGLang, and Llama.cpp
Understanding of KV cache transfer mechanisms and technologies (e.g., Mooncake, NIXL/RIXL) and expert parallelization approaches (e.g., DeepEP, MORI, PPLX-Garden)
Strong C/C++ and Python skills, with experience in software design, debugging, performance analysis, and test development
Experience running AI workloads on large-scale, heterogeneous compute clusters
Familiarity with cluster management and orchestration platforms such as SLURM and Kubernetes (K8s)
Experience with GitHub, Jenkins, or similar CI/CD tools and modern development workflows

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

H1B Sponsorship

AMD has a track record of offering H1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su, Chair & CEO
Mark Papermaster, CTO and EVP
Company data provided by Crunchbase