Barti · 1 day ago
Senior Site Reliability Engineer
Barti is a venture-backed startup on a mission to revolutionize eye care through AI-powered software. They are seeking a Senior Site Reliability Engineer to ensure the reliability, scalability, and performance of their production systems while leading infrastructure initiatives and mentoring engineers.
Artificial Intelligence (AI) · Electronic Health Record (EHR) · Health Care · SaaS
Responsibilities
Lead and participate in the design, implementation, and maintenance of highly available and scalable infrastructure
Monitor system health, performance metrics, and capacity planning to ensure optimal performance
Establish and track SLIs, SLOs, and error budgets to measure and improve system reliability
Design and implement Infrastructure as Code (IaC) solutions using tools like Terraform, Pulumi, or CloudFormation
Build and maintain CI/CD pipelines to enable rapid, safe deployments
Automate operational tasks and eliminate toil through scripting and tooling
Lead incident response efforts, including on-call rotation, post-mortem analysis, and remediation
Debug and resolve complex production issues across the entire stack
Implement monitoring, alerting, and observability solutions to detect and prevent issues proactively
Provide technical leadership and mentorship to engineers on reliability and infrastructure best practices
Collaborate with cross-functional teams, including Engineering and Product, to ensure reliable product delivery
Lead the technical design of infrastructure solutions, ensuring alignment with architectural principles and business goals
Stay updated with emerging technologies and industry trends in SRE, DevOps, and cloud infrastructure
Propose and drive the adoption of best practices, tools, and processes to enhance system reliability and developer productivity
Conduct chaos engineering experiments and disaster recovery drills to validate system resilience
Implement and maintain security best practices across infrastructure and applications
Manage secrets, access controls, and security monitoring systems
Foster a collaborative environment within the engineering team and across departments
Clearly communicate technical concepts and system health to both technical and non-technical stakeholders
Work closely with engineering teams to define reliability requirements and ensure operational excellence
Qualifications
Required
5+ years (ideally 7+) of relevant work experience in Site Reliability Engineering, DevOps, or Infrastructure roles
1+ years of hands-on scripting experience in Python, Go, or Bash
Experience with cloud platforms (ideally GCP) and container orchestration (Kubernetes, Docker)
Proficiency with Infrastructure as Code tools (Terraform, CloudFormation, or similar)
Strong understanding of Linux systems, networking, and distributed systems
Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
Excellent problem-solving and communication skills
Able to work independently and as part of a team
Preferred
Background in healthcare technology or regulated industries
Experience with GCP, Cloud SQL, and Google Kubernetes Engine (GKE)
Experience with HIPAA compliance and security best practices
Experience with performance tuning and high availability for relational databases (Postgres, MySQL)
Proficiency with CI/CD tools (GitHub Actions, CircleCI, GitLab CI)
Familiarity with APM tools and distributed tracing
Benefits
Equity in a fast-growing startup
Health, vision, and dental benefits
Unlimited PTO
Annual professional development stipend
Company
Barti
Barti offers an AI-powered electronic health record and practice management platform tailored for eye care providers.
Funding
Current Stage: Early Stage
Total Funding: $19.58M
Key Investors: Five Elms Capital, Health Engine
2025-08-25 · Series A · $15.08M
2023-11-30 · Grant
2022-11-28 · Seed · $4.5M
Company data provided by Crunchbase