Kharon · 6 hours ago

Staff Site Reliability Engineer

Denver Metropolitan Area

Full-time

Onsite

Lead/Staff

$220K/yr - $265K/yr

10+ years exp

Kharon is a highly disruptive organization that navigates risk at the intersection of global security threats and international commerce. They are seeking a Staff Site Reliability Engineer to build resilient and scalable systems, champion best practices in observability and automation, and collaborate across teams to ensure the reliability of critical insights for their clients.

Analytics · Business Intelligence · Compliance · Risk Management

H1B Sponsor Likely

Responsibilities

Stand up and standardize metrics, logging, tracing, and alert hygiene; introduce golden dashboards and alert runbooks

Coach engineers on reliability practices, including leading incident response (MTTA/MTTR), running blameless postmortems, and conducting reliability reviews

Plan capacity, conduct load/perf tests, and drive performance tuning and cost–reliability tradeoffs

Collaborate with DevOps on Kubernetes/cloud/IaC standards, including creating paved roads and production-readiness checklists for app teams

Work cross-functionally on resilient CI/CD (pre-deployment checks, canary/blue-green, automated rollbacks)

Align with security on least privilege, secrets management, and audit-ready operational practices

Define RTO/RPO, backups, and failover drills; document and test recovery playbooks

Identify repetitive work and opportunities to automate it (scripts, jobs, runbooks, self-service tooling)

Help shape on-call rotations, escalation policies, and handbooks, ultimately improving signal-to-noise and engineer well-being

Assist in defining SLIs/SLOs and error budgets with product/engineering, creating visibility into availability, latency, and quality

Qualifications

Site Reliability Engineering · Cloud Computing · Kubernetes · Infrastructure as Code · Incident Management · Networking Fundamentals · Software Development · Communication Skills · Documentation Skills · Cross-team Collaboration

Required

Bachelor's Degree in Computer Science, Engineering, or a related field

10-12+ years of experience in software engineering or DevOps, with at least 5 years in a site reliability engineering (SRE) or reliability-focused role

Strong networking fundamentals including DNS, Kubernetes routing, load balancing, WAF, multi-VPC routing in AWS, Traefik

Solid software fundamentals (one or more of: Python, Java, JavaScript, Go, Scala, or similar) and ability to read/modify production services

Deep experience in a major cloud (AWS/GCP/Azure) and container orchestration (Kubernetes)

Proficiency with IaC (Terraform or equivalent), CI/CD systems, and git-based workflows

Hands-on experience with metrics/logging/tracing systems and alerting best practices

Proven incident commander experience and skillful facilitation of blameless postmortems

Solid grasp of networking, HTTP, load balancing, caching, and data stores (SQL/NoSQL/queues)

Excellent communication, documentation, and cross-team influence

Benefits

Fully sponsored medical, dental, and vision

FSA program for both medical and dependent care

401k + Roth with matching and immediate vesting

Paid time off + 11 paid holidays

Company

Kharon

Network intelligence at the nexus of global security + international commerce

Founded in 2016

Santa Monica, California, USA

51-200 employees

https://www.kharon.com/

H1B Sponsorship

Kharon has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data Powered by US Department of Labor)

[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents the job field similar to this job]

[Chart: Trends of Total Sponsorships; 2021: 1]