Zeta Global · 23 hours ago
Senior Site Reliability Engineer
Zeta Global is an AI-Powered Marketing Cloud that leverages advanced artificial intelligence and consumer signals to enhance marketing efficiency. They are seeking a Senior Site Reliability Engineer to design, implement, and manage reliability metrics, develop production-grade software, and collaborate with engineering teams to ensure system reliability and scalability.
Advertising · Analytics · Information Services · Marketing
Responsibilities
Design, implement, and manage SLOs, SLIs, and error budgets, ensuring reliability aligns with user expectations and business objectives
Develop production-grade software to enhance system reliability and reduce manual toil through automation
Implement and optimize observability solutions using tools like OpenTelemetry, with a focus on high-cardinality metrics, distributed tracing, and actionable insights
Drive postmortem processes and lead in-depth root cause analyses for incidents, ensuring lessons learned are effectively applied to prevent recurrence
Define and monitor MTTx metrics (MTTA, MTTR, MTTF), using them to guide system improvements and measure reliability progress
Design and participate in Chaos Engineering exercises
Collaborate with engineering teams to design systems with reliability and scalability in mind, incorporating capacity planning, resiliency patterns, and modern deployment strategies (e.g., Canary, Blue-Green)
Lead design reviews for alerting strategies, ensuring effective signal-to-noise ratios in monitoring and incident management
Advocate for and implement best practices in incident response and system design to achieve optimal uptime and performance
Qualifications
Required
Can code confidently in Python or Golang and solve real-world problems through automation (not only scripting)
Have hands-on experience implementing SLIs, SLOs, and distributed tracing in production
Understand Kubernetes, Terraform, and Infrastructure as Code tools
Have hands-on experience with Chaos Engineering and anomaly detection
Are excited about working with high-throughput, distributed systems processing millions of transactions daily
4+ years of experience as an SRE or in a similar role with hands-on coding
3+ years of software development experience in Python or Golang, with a focus on building maintainable, production-quality code
Deep understanding of SRE principles, particularly SLIs, SLOs, error budgets, and their real-world application
Hands-on experience conducting postmortems and implementing observability at scale
Hands-on experience conducting chaos engineering exercises
Expertise in designing and implementing end-to-end observability solutions using tools like OpenTelemetry, Prometheus, Grafana, or Honeycomb
Experience with distributed tracing and handling high-cardinality metrics in production environments
3+ years of experience with AWS and proficiency in Kubernetes, Terraform, and Infrastructure as Code (IaC) tools
Strong understanding of distributed systems, microservices architectures, and containerization (Docker, Kubernetes)
Hands-on experience with CI/CD platforms and practices (GitOps, Jenkins, Argo CD) and building automated pipelines
Familiarity with tools and frameworks for incident management and operational automation
Knowledge of modern deployment strategies (e.g., Canary, Blue-Green) and resiliency patterns (e.g., circuit breakers, retries)
Strong analytical skills for statistical analysis of metrics to identify and resolve performance bottlenecks
Benefits
Unlimited PTO
Excellent medical, dental, and vision coverage
Employee Equity and Stock Purchase Plan
Employee Discounts, Virtual Wellness Classes, and Pet Insurance
Company
Zeta Global
Zeta offers technology and marketing services to help brands acquire, engage, and retain customers.
H1B Sponsorship
Zeta Global has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (20)
2024 (17)
2023 (11)
2022 (6)
2021 (8)
2020 (16)
Funding
Current Stage: Public Company
Total Funding: $1.46B
Key Investors: BofA Securities
2024-09-04 · Post-IPO Secondary · $105.26M
2024-09-04 · Post-IPO Equity · $204.94M
2024-09-03 · Post-IPO Debt · $550M
Recent News
The Motley Fool · 2026-01-16
Destination CRM · 2026-01-07
Company data provided by Crunchbase