Lawrence Livermore National Laboratory · 20 hours ago
High-Performance Computing (HPC) Center Operations Manager
Lawrence Livermore National Laboratory (LLNL) has turned bold ideas into world-changing impact, advancing science and technology to strengthen U.S. security and promote global stability. The Laboratory is seeking a High-Performance Computing (HPC) Center Operations Manager to lead a team that provides 24x7 support for HPC systems and facilities, oversees operations, and ensures reliability through innovative solutions.
Information Technology · Marketing · Market Research · Security
Responsibilities
Provide expert technical leadership to team members, including recruiting, hiring, mentoring, conducting performance appraisals, facilitating quarterly feedback sessions and one-on-one meetings, and managing salary and career development to support staff growth and operational excellence
Oversee 24x7 support of HPC systems and networks while utilizing advanced monitoring and diagnostic tools to ensure reliability and rapid incident response for systems and supporting infrastructure. Provide guidance and support to shift supervisors during operational events and ensure effective incident response, resolution, and reporting
Establish, implement, and continuously improve procedures, schedules, and work priorities for HPC operations, identifying and developing key growth areas for staff and processes
Lead the development and deployment of innovative tools and processes to enhance operational efficiency and technical service delivery for HPC facilities and operations
Manage multiple vault-type rooms, oversee siting and infrastructure projects, and ensure strict compliance with safety and security policies and requirements
Develop formal training plans to enhance team skills in alarm response, safety practices, HPC system monitoring, troubleshooting, repair, and issue escalation for operations and facilities teams
Collaborate with senior management in planning, budgeting, and decision-making, and represent the organization in vendor meetings, cross-divisional initiatives, and external organizations such as the Energy Efficient High Performance Computing Working Group, HPC operational reviews, or other professional best-practice groups
Keep pace with the escalating demands of next-generation platforms by providing solutions for highly unusual and complex HPC engineering challenges that arise from the intersection of extreme power density, precision cooling demands, evolving HPC compute loads, and mission-critical uptime requirements
Perform other duties as assigned
Qualifications
Required
This position requires an active Department of Energy (DOE) Q-level clearance or active Top-Secret clearance issued by another U.S. government agency at the time of hire
Bachelor's degree in engineering, computer science or related field, or equivalent combination of education and experience in HPC Facilities and Operations
Significant experience managing and troubleshooting HPC environments, including monitoring and maintenance of systems (e.g., computers, storage) and facilities (e.g., mechanical, electrical, cooling systems)
Advanced technical experience installing and operating HPC equipment, networks, or associated facilities, and resolving issues in cooperation with vendors and staff
Significant experience in recruiting and supervising technical staff, preparing performance reviews, and participating in performance management processes
Advanced communication, facilitation, and collaboration skills to lead a group, explain policies, and interact with management, technical teams, and vendors
Significant experience developing written processes and/or procedures to improve service delivery and operational efficiency, and experience training technicians and engineers and assessing skills
Advanced knowledge of data center infrastructure and equipment
Preferred
Extensive experience working in a High-Performance Computing Center and responding to emergency situations to diagnose and fix significant issues with computers or mechanical equipment while under pressure
Experience in payroll supervision, organizational performance alignment, salary management, and knowledge of DOE/NNSA/LLNL policies and procedures
Experience with HVAC, electrical, and structural systems in a data center environment
Company
Lawrence Livermore National Laboratory
Lawrence Livermore National Laboratory, a national security laboratory, provides transformational solutions to national security challenges.
Funding
Current Stage: Late Stage
Total Funding: $11.4M
Key Investors: ARPA-E, US Department of Energy, DARPA
2023-11-21: Grant
2023-08-14: Grant
2022-09-19: Grant
Company data provided by Crunchbase