Lead Cloud Site Reliability Engineer jobs in United States

NiCE · 1 day ago

Lead Cloud Site Reliability Engineer

NiCE is a leading software company known for its innovative solutions in AI, cloud, and digital technologies. The Lead Site Reliability Engineer is a senior technical leader responsible for enhancing the reliability and operational maturity of the SaaS platform, while driving cross-team initiatives and setting engineering standards for the SRE organization.
Enterprise Software · Software · Information Technology · Robotic Process Automation (RPA) · Security
H1B Sponsor Likely

Responsibilities

Define, maintain, and evangelize SRE standards, frameworks, and best practices across the entire SRE organization
Establish consistent patterns for SLI/SLO design, observability instrumentation, incident response, readiness reviews, and postmortem quality
Partner with architecture and engineering leadership to ensure reliability is embedded in solution design
Lead multi-team efforts to reduce toil, improve quality, and increase service resilience
Serve as the technical owner for cross-functional reliability projects—including scope, timelines, and technical decisions
Provide deep technical guidance on cloud architecture, distributed systems reliability, and automation patterns
Create advanced observability dashboards and distributed tracing solutions to provide visibility across product lines
Automate manual operational processes to eliminate toil and increase efficiency across teams
Lead and mentor engineers in performance analysis, capacity planning, and reliability-focused system design
Drive consistency and maturity in monitoring and alerting implementations across services
Oversee and elevate blameless incident response and ensure high-quality postmortems across SRE teams
Partner with Incident & Problem Management to identify systemic weaknesses and lead long-term remediation
Provide highest-tier on-call leadership for critical incidents, guiding teams in improving MTTR and outage prevention
Mentor senior and mid-level SREs, uplift team capability, and provide technical coaching and training
Review complex engineering work and provide robust, actionable feedback
Help teams develop and adopt operational playbooks, engineering processes, and shared troubleshooting libraries

Qualifications

Public cloud ecosystems · Monitoring · Observability tools · Advanced programming/scripting · Kubernetes · Distributed systems · Incident response · Communication skills · Mentorship

Required

Bachelor's degree in Computer Science, Information Systems, or equivalent experience
6+ years in SRE, platform engineering, or cloud reliability roles
Expert-level proficiency in public cloud ecosystems (AWS, GCP, Azure)
Advanced programming/scripting experience (Python, Go, Java, or similar)
Deep experience with monitoring, automation, CI/CD, and observability tools
Proven success leading complex cross-functional engineering initiatives
Outstanding communication skills for both technical and executive-level audiences

Preferred

Experience defining SRE organizational standards or building an SRE practice
Hands-on experience with Kubernetes, microservices, Terraform, or Ansible
Strong background in distributed systems and fault-tolerant architectures

Company

NiCE is transforming the world with AI that puts people first.

H1B Sponsorship

NiCE has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data Powered by US Department of Labor)
Trends of Total Sponsorships
2025 (5)
2024 (14)
2023 (8)
2022 (8)
2021 (11)
2020 (10)

Funding

Current Stage: Public Company
Total Funding: unknown
IPO: 1996-02-02

Leadership Team

David Gustafson
VP, GM of Platform
Matt Reading
VP, Customer Success
Company data provided by Crunchbase