Software Engineer - GPU Kernel jobs in United States

FriendliAI · 4 days ago

Software Engineer - GPU Kernel

FriendliAI is a San Mateo, CA-based startup building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models. They are seeking a GPU Kernel Engineer to design and optimize GPU kernels for their AI inference platform, ensuring high performance and efficiency.

Artificial Intelligence (AI) · Generative AI · Information Technology · Internet · SaaS · Software
Hiring Manager
Woojin Lee

Responsibilities

Design, implement, and optimize high-performance GPU kernels for AI inference (e.g., GEMM, attention, routing)
Develop and maintain GPU code in CUDA and C++, including low-level assembly when needed
Implement reduced-precision and quantized kernels (FP8/FP4) for low-latency or high-throughput inference
Benchmark and ensure cross-vendor performance parity between NVIDIA and AMD hardware
Contribute to internal GPU libraries and tune performance-critical components
Accelerate multi-modal model pipelines
Investigate and integrate next-generation GPU features

Qualifications

GPU programming · CUDA · C++ · GPU profiling · Performance tuning · GPU architecture · Numerical background · Quantization techniques · Inter-GPU communication · Open-source contributions

Required

3+ years of experience in GPU programming, HPC, or performance-critical systems
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
Strong proficiency in CUDA for NVIDIA GPUs or ROCm/HIP for AMD GPUs
Deep understanding of GPU architecture: warps, threads, memory hierarchy, synchronization, and latency-throughput trade-offs
Proficiency in C++
Experience with GPU profiling and performance tuning
Strong numerical background with understanding of precision trade-offs and quantization techniques

Preferred

Experience optimizing transformer, multi-modal, or Mixture-of-Experts (MoE) architectures at the kernel level
Familiarity with the latest GPU libraries and frameworks (CUTLASS, Triton, …)
Inter-GPU communication programming experience
Open-source contributions related to GPU performance or ML acceleration
Research or conference presentations on GPU optimization, HPC, or numerical computing

Benefits

Competitive compensation
Premium hardware and health support benefits
Health insurance
Startup equity
Other benefits

Company

FriendliAI

FriendliAI is an AI infrastructure company that enables deployment, scaling, and monitoring of large language and multimodal models.

Funding

Current Stage
Early Stage
Total Funding
$26.75M
Key Investors
Capstone Partners
2025-08-28 · Seed · $20M
2021-12-15 · Seed · $6.75M

Leadership Team

Byung-Gon Chun
Chief Executive Officer
Gyeong-In Yu
Chief Technology Officer
Company data provided by Crunchbase