
Heidi · 7 hours ago

Site Reliability Engineer

Heidi is building an AI Care Partner to enhance continuous and human-centric healthcare. The Site Reliability Engineer will participate in incident response, improve operational reliability, and collaborate with engineers to enhance production readiness and reliability expectations.

Artificial Intelligence (AI), Business Process Automation (BPA), Health Care, Software
H1B Sponsor Likely

Responsibilities

Participate in on-call and incident response: Respond to production incidents, contribute to service restoration, and support clear communication during incidents. Over time, take increasing responsibility for leading incidents end-to-end.
Improve operational reliability: Identify recurring issues and reliability risks, and drive fixes through better alerting, automation, system changes, or process improvements
Own parts of the production environment: Operate and improve Kubernetes clusters, cloud infrastructure, and core platform services, with growing ownership as familiarity increases
Strengthen observability: Improve dashboards, alerts, logs, and traces so issues are detected earlier and diagnosed faster, with a strong focus on actionable signals
Reduce operational toil: Automate repetitive tasks, simplify runbooks, and improve tooling to make on-call and day-to-day operations easier and safer
Support safe change: Improve deployments, rollback mechanisms, and operational readiness to reduce the risk of incidents caused by change
Contribute to operational practices: Write and maintain runbooks, participate in blameless post-mortems, and help improve incident response processes over time
Collaborate closely with engineers: Work with product and feature teams to improve production readiness, service ownership, and reliability expectations

Qualifications

Cloud infrastructure, Kubernetes, Infrastructure as Code, Monitoring tools, Scripting/automation, On-call experience, Debugging live systems, Incident response

Required

3–6+ years in SRE, DevOps, Platform, or operations-heavy engineering roles
Experience supporting production systems and participating in on-call rotations
Comfortable debugging live systems under pressure
Experience operating cloud infrastructure (AWS preferred)
Working knowledge of Kubernetes and containerised workloads
Infrastructure as Code experience (Terraform or similar)
Familiarity with monitoring and alerting tools (Datadog, Prometheus, etc.)
Scripting or automation experience (Python, Bash, or similar)

Benefits

Healthcare, dental, and vision benefit options
401k with 3% match
Personal development budget of $500 per annum
Become an owner with shares (equity) in the company. If Heidi wins, we all win.

Company

Heidi

Heidi is the AI Care Partner designed to expand clinical capacity by automating administrative work (documentation, form filling, and task management) so clinicians can focus on patients.

H1B Sponsorship

Heidi has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2024 (1)

Funding

Current Stage: Growth Stage
Total Funding: $91.55M
Key Investors: Point72 Ventures, Grow Digital Health Midlands Accelerator, Headline
2025-10-06: Series B · $65M
2025-05-13: Non Equity Assistance
2025-03-03: Series A · $16.6M

Leadership Team

Thomas Kelly, Co-Founder & CEO
Waleed Mussa, Co-Founder
Company data provided by Crunchbase