Senior Site Reliability Engineer – Distributed Systems jobs in United States

Cognizant · 3 days ago

Senior Site Reliability Engineer – Distributed Systems

Cognizant, a leading technology company, is seeking a Senior Site Reliability Engineer to design and implement observability solutions for distributed edge computing environments. The role involves collaborating with cross-functional teams to ensure system reliability and performance, and developing frameworks for monitoring and incident response.

Consulting · Industrial Automation · Information Technology · Software · Software Engineering
H1B Sponsor Likely

Responsibilities

Design and implement observability frameworks for edge computing environments, including monitoring, logging, tracing, and metrics collection
Define and maintain SLIs, SLOs, and business KPIs to measure and enhance system reliability across edge and centralized infrastructure
Build dashboards, visualizations, and alerting systems for real-time insights and incident response
Implement distributed tracing and log aggregation systems to troubleshoot complex edge issues
Collaborate with engineering teams to embed observability best practices into edge applications and infrastructure
Proactively identify issues using advanced observability tools, reducing MTTD and MTTR
Lead incident postmortems and implement observability-driven improvements
Develop automation scripts and tools to optimize observability pipelines for bandwidth-constrained environments
Optimize data storage and querying strategies for performance, cost, and scalability
Stay current with emerging observability trends and advocate for adoption of edge-specific solutions

Qualifications

Distributed systems · Cloud platforms · Automation scripting · Observability frameworks · Programming languages · Monitoring tools · Networking protocols · Containerization · Soft skills

Required

10+ years of IT experience
3–5 years of experience in service reliability/operations for large-scale hybrid environments
3–5 years of experience writing automation scripts and building dashboards for application performance management
2–4 years of experience with programming languages such as Go, Python, Java, or Rust
Working knowledge of databases such as Oracle, SQL Server, Redis, ClickHouse, PostgreSQL, MongoDB, or time-series databases
At least 2 years of experience with cloud platforms and containerization (GCP, AWS, Rancher, Azure, OpenShift)
Experience maintaining containerized apps in GKE/RKE/AKS environments
Experience implementing cloud observability using OpenTelemetry (OTEL)
Experience with GraphQL frameworks (Apollo, Prisma, Hasura)
Strong understanding of networking protocols (TCP/IP, HTTP, DNS, load balancing, service mesh)
Bachelor's degree in computer science, IT or equivalent

Preferred

Proven experience managing application availability and building automation for high-availability platforms
Hands-on experience with monitoring tools like Splunk, AppDynamics, Grafana/Prometheus, and Dynatrace
Experience with CI/CD tools and extenders such as Rally and Confluence
Experience with in-memory caching solutions (Redis preferred)
Strong debugging skills across integrated technical platforms and API gateways
Hands-on experience with GCS, Cloud SQL, Spanner, and Firestore
Experience in enterprise-level infrastructure and operations
Expertise in high-availability and distributed systems, Linux/Windows administration, and support
Experience monitoring and troubleshooting HashiCorp Vault environments
Working knowledge of Vertex AI, Gen AI, and BigQuery

Benefits

Medical/Dental/Vision/Life Insurance
Paid holidays plus Paid Time Off
401(k) plan and contributions
Long-term/Short-term Disability
Paid Parental Leave
Employee Stock Purchase Plan

Company

Cognizant

Cognizant is a professional services company that helps clients alter their business, operating, and technology models for the digital era.

H1B Sponsorship

Cognizant has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (11113)
2024 (11423)
2023 (13054)
2022 (13876)
2021 (12651)
2020 (28659)

Funding

Current Stage
Public Company
Total Funding
$0.24M
Key Investors
Summit Financial Wealth Advisors
2025-03-08 · Post-IPO Equity
2016-11-18 · Post-IPO Equity · $0.24M
1998-06-19 · IPO

Leadership Team

Ravi Kumar S
Chief Executive Officer
Anil Cheriyan
CTO / EVP Strategy & Technology
Company data provided by Crunchbase