Post-Training Platform Infrastructure Engineer jobs in United States

AMD · 4 hours ago

Post-Training Platform Infrastructure Engineer

AMD is a company dedicated to building innovative products that enhance next-generation computing experiences. They are seeking a systems-minded engineer to focus on post-training and inference infrastructure, emphasizing performance optimization and efficient resource utilization in large-scale model inference and reinforcement learning systems.

Embedded Software · Artificial Intelligence (AI) · Semiconductor · Cloud Computing · Electronics · Hardware · AI Infrastructure · Computer · Embedded Systems · GPU
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Research and deeply understand modern LLM inference frameworks, including:
  Architecture and design tradeoffs of P/D (prefill/decode) disaggregation
  KV cache lifecycle, memory layout, eviction strategies, and reuse
  KV cache offloading mechanisms across GPU, CPU, and storage backends
Analyze and compare inference execution paths to identify:
  Performance bottlenecks (latency, throughput, memory pressure)
  Inefficiencies in scheduling, cache management, and resource utilization
Develop and implement infrastructure-level features to:
  Improve inference latency, throughput, and memory efficiency
  Optimize KV cache management and offloading strategies
  Enhance scalability across multi-GPU and multi-node deployments
Apply the same research-driven approach to RL frameworks:
  Study post-training and RL systems (e.g., policy rollouts, inference-heavy loops)
  Debug performance and correctness issues in distributed RL pipelines
  Optimize inference, rollout efficiency, and memory usage during training
Collaborate with research and applied ML teams to:
  Translate model-level requirements into infrastructure capabilities
  Validate performance gains with benchmarks and real workloads
Document findings, architectural insights, and best practices to guide future system design
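The KV cache eviction and offloading work described above can be illustrated with a deliberately simplified sketch: a two-tier paged cache that evicts least-recently-used pages from a fixed GPU page budget into a CPU offload tier. All names here (`PagedKVCache`, `gpu_pages`, the dict-based tiers) are hypothetical; real serving stacks manage device memory blocks and asynchronous copies rather than Python dictionaries.

```python
from collections import OrderedDict

class PagedKVCache:
    """Toy paged KV cache: a fixed GPU page budget with LRU eviction
    to a CPU offload tier. Illustrative only, not any framework's design."""

    def __init__(self, gpu_pages: int):
        self.gpu_pages = gpu_pages
        self.gpu = OrderedDict()  # page_id -> payload, kept in LRU order
        self.cpu = {}             # offload tier for evicted pages
        self.evictions = 0

    def put(self, page_id, payload):
        self.gpu[page_id] = payload
        self.gpu.move_to_end(page_id)            # mark most recently used
        while len(self.gpu) > self.gpu_pages:
            victim, data = self.gpu.popitem(last=False)  # evict LRU page
            self.cpu[victim] = data              # offload instead of dropping
            self.evictions += 1

    def get(self, page_id):
        if page_id in self.gpu:
            self.gpu.move_to_end(page_id)
            return self.gpu[page_id]
        if page_id in self.cpu:                  # GPU miss: reload from CPU tier
            self.put(page_id, self.cpu.pop(page_id))
            return self.gpu[page_id]
        raise KeyError(page_id)

cache = PagedKVCache(gpu_pages=2)
for i in range(3):
    cache.put(i, f"kv-{i}")
# page 0 was least recently used, so it now lives in the CPU tier
```

The same LRU-with-offload shape generalizes to the storage-backed tiers (NVMe, object stores) mentioned above by adding further levels below the CPU dict.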

Qualifications

LLM inference frameworks · Distributed systems · Performance optimization · GPU-accelerated workloads · Python · C++ · KV cache management · Memory management · Analytical skills · Collaboration · Problem-solving

Required

Strong background in systems engineering, distributed systems, or ML infrastructure
Hands-on experience with GPU-accelerated workloads and memory-constrained systems
Solid understanding of LLM inference workflows (prefill vs. decode), including:
  Attention mechanisms and KV cache behavior
  Multi-process / multi-GPU execution models
Proficiency in Python and C++ (or similar systems languages)
Experience debugging performance issues using profiling tools (GPU, CPU, memory)
Ability to read, understand, and modify complex open-source codebases
Strong analytical skills and comfort working in research-heavy, ambiguous problem spaces
Direct experience with LLM inference frameworks or serving stacks
Familiarity with:
  GPU memory hierarchies (HBM, pinned memory, NUMA considerations)
  KV cache compression, paging, or eviction strategies
  Storage-backed offloading (NVMe, object stores, distributed file systems)
Experience with distributed RL or post-training pipelines
Knowledge of scheduling systems, async execution, or actor-based runtimes
Contributions to open-source ML or systems projects
Experience designing benchmarking suites or performance evaluation frameworks
Bachelor's or master's degree in computer science, computer engineering, electrical engineering, or equivalent
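The prefill/decode distinction called out in the qualifications can be sketched schematically. This toy loop (the function names and list-based "cache" are illustrative assumptions, not a real attention implementation) only shows the shape of the two phases: prefill populates one cache entry per prompt token in a single compute-bound pass, while decode generates one token per step, reading the whole cache and appending a single entry each time.

```python
def prefill(prompt_tokens, cache):
    # Prefill: process the entire prompt at once; compute-bound,
    # fills the KV cache with one entry per prompt token.
    for pos, tok in enumerate(prompt_tokens):
        cache.append((pos, tok))      # stand-in for storing attention K/V
    return len(prompt_tokens)

def decode(next_pos, cache, steps):
    # Decode: one token per step; memory-bandwidth-bound, since each
    # step reads the whole cache and appends a single new entry.
    generated = []
    for _ in range(steps):
        _ = len(cache)                # stand-in for attending over all K/V
        tok = next_pos                # dummy "sampled" token for the sketch
        cache.append((next_pos, tok))
        generated.append(tok)
        next_pos += 1
    return generated
```

The asymmetry between these two loops is what motivates P/D disaggregation: the two phases have different bottlenecks, so serving stacks can schedule them on separate resource pools.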

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics processing units, processors, and media solutions.

H1B Sponsorship

AMD has a track record of offering H1B sponsorship. Note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su
Chair & CEO
Mark Papermaster
CTO and EVP
Company data provided by crunchbase