Senior Software Engineer, Model Inference jobs in United States
This job has closed.

Apple · 8 hours ago

Senior Software Engineer, Model Inference

Apple is seeking a Senior Software Engineer to join its Maps team, focusing on building high-performance inference services for deep learning and large language models. The role involves collaborating with research and product teams to deliver scalable solutions that improve search quality and enhance user experiences across Maps.

Apps, Artificial Intelligence (AI), Broadcasting, Digital Entertainment, Foundational AI, Media and Entertainment, Mobile Devices, Operating Systems, TV, Wearables
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Lead the design and implementation of large-scale, high-performance inference services that support a wide range of models used across Maps, including deep learning and large language models
Collaborate closely with research and product partners to bring models into production, with a strong focus on efficiency, reliability, and scalability
Onboard new use cases, optimize inference across heterogeneous accelerated compute hardware, deploy services on Kubernetes, build and integrate inference engines and control-plane components, and ensure seamless integration with Maps infrastructure

Qualifications

ML inference, GPU acceleration, Large-scale systems, Python, Java, C++, Deep learning frameworks, Model serving tools, GPU optimization, Cloud technologies, ML Ops practices, Distributed systems, Model compression techniques, Attention Fusion, Quantization, Speculative Decoding

Required

Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience)
5+ years in software engineering focused on ML inference, GPU acceleration, and large-scale systems
Expertise in deploying and optimizing LLMs for high-performance, production-scale inference
Proficiency in Python, Java, or C++
Experience with deep learning frameworks like PyTorch, TensorFlow, and Hugging Face Transformers
Experience with model serving tools (e.g., NVIDIA Triton, TensorFlow Serving, vLLM)
Experience with optimization techniques like Attention Fusion, Quantization, and Speculative Decoding
Skilled in GPU optimization (e.g., CUDA, TensorRT-LLM, cuDNN) to accelerate inference tasks
Skilled in cloud technologies like Kubernetes, Ingress, and HAProxy for scalable deployment

Preferred

Master's or PhD in Computer Science, Machine Learning, or a related field
Understanding of ML Ops practices, continuous integration, and deployment pipelines for machine learning models
Familiarity with model distillation, low-rank approximations, and other model compression techniques for reducing memory footprint and improving inference speed
Strong understanding of distributed systems, multi-GPU/multi-node parallelism, and system-level optimization for large-scale inference

Benefits

Comprehensive medical and dental coverage
Retirement benefits
A range of discounted products and free services
Reimbursement for certain educational expenses, including tuition
Discretionary bonuses or commission payments
Relocation

Company

Apple is a technology company that designs, manufactures, and markets consumer electronics, personal computers, and software.

H1B Sponsorship

Apple has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (6998)
2024 (3766)
2023 (3939)
2022 (4822)
2021 (4060)
2020 (3656)

Funding

Current Stage
Public Company
Total Funding
$5.67B
Key Investors
Berkshire Hathaway, Microsoft, Sequoia Capital
2025-05-05 · Post-IPO Debt · $4.5B
2025-01-16 · Post-IPO Debt · $0.31M
2021-04-30 · Post-IPO Equity

Leadership Team

Tim Cook
CEO
Craig Federighi
SVP, Software Engineering
Company data provided by Crunchbase