Software Engineer - Model Serving Infrastructure - USDS jobs in United States

TikTok · 2 weeks ago

Software Engineer - Model Serving Infrastructure - USDS

TikTok is the leading destination for short-form mobile video. It is seeking a Software Engineer - Model Serving Infrastructure to join its AML team. The role focuses on designing and implementing distributed inference infrastructure for feeds, ads, and search ranking models, ensuring system reliability and performance, and collaborating with product teams to meet their requirements.

Content Creators · Content Discovery · Media and Entertainment · Social Media · Video
No H1B

Responsibilities

Design and implement distributed inference infrastructure for feeds, ads, and search ranking models
Build monitoring and management tools to oversee the reliability and scalability of online inference servers
Triage system inefficiencies and bottlenecks and improve system performance
Build tools to analyze bottlenecks and sources of instability, then design and implement solutions
Collaborate with product teams and provide general solutions that meet their requirements

Qualifications

C/C++/CUDA · Deep learning frameworks · GPU performance optimization · Open source contributions · Large-scale systems · Hardware-Software Co-Design · Communication · Teamwork skills

Required

Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field, or equivalent years of experience in a software engineering role
Proficient in C/C++/CUDA, with solid programming skills
Familiar with deep learning serving frameworks (TensorFlow Serving/TorchScript)
Experience in GPU performance optimization

Preferred

Experience contributing to an open-source machine learning framework (TensorFlow / JAX / PyTorch / TorchScript / MXNet / TensorRT)
Experience in developing and deploying large-scale systems
Strong background in one of the following fields: Hardware-Software Co-Design, High Performance Computing, ML Hardware Acceleration (e.g., GPU/RDMA), or ML for Systems
Ability to work independently and complete projects end to end in a timely manner
Good communication and teamwork skills, including the ability to explain technical concepts clearly to teammates

Benefits

Medical, dental, and vision insurance
A 401(k) savings plan with company match
Paid parental leave
Short-term and long-term disability coverage
Life insurance
Wellbeing benefits
10 paid holidays per year
10 paid sick days per year
17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)

Company

TikTok is a short-form video entertainment app and social network platform. It is a sub-organization of ByteDance.

Funding

Current Stage: Late Stage

Leadership Team

N Ali Mohamed
CEO
Blake Chandlee
VP Global Business Solutions
Company data provided by Crunchbase