Google · 3 hours ago
Senior Software Engineering Manager, ML Fleet Systems
Google, a leading technology company, is seeking a Senior Software Engineering Manager for its ML Fleet Systems team. The role involves managing a team of engineers, driving the technical strategy for ML resources, and ensuring the efficient operation of Google's ML infrastructure.
Apps · Artificial Intelligence (AI) · Cloud Storage · Search Engine · SEO
Responsibilities
Define and drive the long-term technical goal, strategy, and roadmap for critical software systems that manage Alphabet's ML fleet. This includes building systems for all ML resources such as TPUs, GPUs, compute, storage, and networking
Collaborate closely with engineering partners (e.g., Onefleet, Spatial Flex, ODS) to design and deliver joint engineered solutions to our customers (Product Areas within Google)
Identify, scope, and solve broad and ambiguous challenges that impact the efficiency, reliability, and cost-effectiveness of the entire ML fleet. Turn these challenges into strategic opportunities and actionable plans
Drive significant improvements in ML fleet metrics, such as utilization, scheduling efficiency, and power consumption, through innovative software and system design
Ensure the long-term health, maintainability, and evolution of the software systems underpinning Google's AI/ML development
Qualifications
Required
Bachelor's degree, or equivalent practical experience
8 years of experience programming in C++, Java, Python, Kotlin or Go
5 years of experience in a technical leadership role
5 years of experience in a people management or team leadership role
3 years of experience in designing, analyzing, and troubleshooting distributed systems
Preferred
Master's degree or PhD in Computer Science or related technical field
5 years of experience working in a complex, matrixed organization
Experience with Colossus and other relevant Google storage systems (e.g., Bigtable, Spanner, Woodshed)
Experience with infrastructure optimization, performance analysis, and cost reduction in large-scale environments
Familiarity with Machine Learning hardware accelerators (e.g., TPUs, GPUs) and their life-cycle management
Understanding of resource management systems (e.g., compute infrastructure, Kubernetes, Flex), cluster management, and scheduling algorithms
Benefits
Health, dental, vision, life, disability insurance
Retirement Benefits: 401(k) with company match
Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
Sick Time: 40 hours/year (statutory, where applicable); 5 days/event (discretionary)
Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
Baby Bonding Leave: 18 weeks
Holidays: 13 paid days per year
Company
Google specializes in internet-related services and products, including search, advertising, and software. It is a sub-organization of Alphabet.
H1B Sponsorship
Google has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (8763)
2024 (8872)
2023 (9682)
2022 (11626)
2021 (9109)
2020 (9785)
Funding
Current Stage: Public Company
Total Funding: $26.1M
Key Investors: Andy Bechtolsheim
2004-08-19 · IPO
1999-06-07 · Series Unknown · $25M
1998-11-01 · Angel · $1M
Company data provided by Crunchbase