Senior Reliability Engineer jobs in United States

CyrusOne · 15 hours ago

Senior Reliability Engineer

CyrusOne, a leading provider of data center solutions, is seeking a Senior Reliability Engineer to oversee infrastructure reliability across mission-critical data center sites. The role involves designing and improving reliability strategies for power, cooling, and control systems, mentoring other engineers, and influencing operational decisions.

Data Center · Information Services · Information Technology · Web Hosting
H1B Sponsor Likely

Responsibilities

Architect and govern portfolio-level, risk-based asset strategies for mission-critical power and cooling infrastructure
Apply advanced RCM principles to define maintenance and inspection strategies aligned to failure risk, system criticality, and redundancy posture
Evaluate and balance tradeoffs between maintenance investment, operational risk, spares coverage, redundancy, and capital replacement
Establish and maintain enterprise PM quality standards, including audits, task effectiveness reviews, and elimination of low-value maintenance
Serve as a final technical authority for high-risk SOPs, MOPs, EOPs, and operational change packages
Perform system-level risk assessments for planned work, incidents, and abnormal operating conditions
Guide site teams in CMMS data integrity, work management maturity, and adherence to approved operating procedures
Lead or oversee complex reliability investigations involving multiple systems, teams, or contributing factors
Design and mature predictive condition-monitoring programs across the portfolio (oil analysis, thermography, vibration, battery monitoring, controls analytics)
Develop and interpret leading reliability indicators and degradation trends to anticipate failures before impact
Apply statistical analysis, reliability modeling, and engineering judgment to evaluate failure likelihood and consequence
Translate analytical insights into strategic maintenance, operational mitigations, or capital recommendations
Define and govern enterprise critical spares strategies, accounting for supplier risk, lead times, and system exposure
Identify systemic spares gaps and drive remediation plans in partnership with Supply Chain and Operations
Lead lifecycle asset assessments to guide long-range capital planning and replacement prioritization
Provide data-driven input to business cases supporting capital investments and infrastructure upgrades
Lead high-impact post-incident RCAs and FMEAs, ensuring depth of analysis beyond proximate causes
Identify and address latent design, procedural, and organizational contributors to reliability events
Ensure lessons learned result in durable changes to standards, procedures, maintenance strategies, or training
Champion continuous improvement initiatives that measurably reduce risk and failure recurrence across sites
Act as a mentor and technical escalation point for Reliability Engineers, site engineers, and CE leaders
Coach teams on reliability methods, risk-based decision-making, and interpretation of condition-monitoring data
Influence and evolve enterprise reliability standards, playbooks, and operating philosophies
Partner with leadership to strengthen operator certification, training rigor, and operational discipline

Qualifications

Reliability Engineering · RCM · FMEA · RCA · Data Analysis · Predictive Analytics · Python · SQL · Executive Communication · Continuous Improvement · Mentoring

Required

10+ years of experience in reliability engineering, maintenance engineering, or facilities engineering within mission-critical environments
Demonstrated leadership of complex, multi-system reliability programs with measurable business impact
Expert-level knowledge of RCM, FMEA, RCA, and maintenance optimization methodologies
Deep technical understanding of mission-critical infrastructure, including UPS, generators, switchgear, chillers, cooling towers, CRAH/CRAC, and BMS/EPMS
Proven experience governing SOP/MOP/EOP programs and assessing operational change risk in live environments
Advanced ability to analyze condition-monitoring, CMMS, and operational datasets and convert insights into strategic actions
Proficiency in data analysis and visualization tools (Excel, Power BI, or similar)
Ability to apply statistical techniques or reliability modeling to support risk-informed decision-making under uncertainty
Strong executive-level communication skills; able to influence senior leaders and defend technical positions
Bachelor's degree in Mechanical, Electrical, or Industrial Engineering (or equivalent experience)

Preferred

Experience designing and scaling enterprise critical spares and lifecycle asset management programs
Hands-on experience with predictive analytics, failure modeling, or reliability simulations
Proficiency with Python, R, or similar tools for advanced reliability analytics
Working knowledge of SQL or other data query languages
Strong familiarity with NFPA, IEEE, ASHRAE, and other relevant codes and standards
Experience presenting reliability risk, capital tradeoffs, and investment recommendations to executive audiences

Company

CyrusOne

CyrusOne is a data center operator that offers colocation and peering services.

H1B Sponsorship

CyrusOne has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2022 (2), 2020 (1)

Funding

Current Stage: Public Company
Total Funding: $11.89B
Key Investors: JANA Partners, TD Securities
2024-07-15 · Debt Financing · $687.1M
2024-07-09 · Debt Financing · $7.9B
2024-05-15 · Debt Financing · $1.18B

Leadership Team

Eric Schwartz, Chief Executive Officer
Owen Morris, Chief Financial Officer
Company data provided by Crunchbase