Staff Engineer, GitLab Delivery - Operate jobs in United States

GitLab · 3 months ago

Staff Engineer, GitLab Delivery - Operate

GitLab is an open-core software company that develops a comprehensive AI-powered DevSecOps Platform used by over 100,000 organizations. The Staff Engineer role within the GitLab Operate team focuses on leading the technical direction for self-managed deployment strategies, emphasizing zero-downtime upgrades and operational excellence. This high-impact position involves architecting and implementing systems that enable organizations to deploy and operate GitLab reliably in their own infrastructure.

Cloud Security · Developer Tools · DevOps · Open Source · SaaS

Responsibilities

Define the technical vision for the future of GitLab's cloud-native deployments and upgrades, balancing operational simplicity, customer needs, and engineering constraints
Lead the design and implementation of the new tooling, including Operator(s), enabling automated lifecycle management and zero-downtime upgrades
Architect upgrade orchestration systems that safely coordinate complex multi-component upgrades across databases, application services, and auxiliary components
Establish operational maturity standards and guidance for new services being integrated into GitLab's deployment tooling, empowering development teams to own their components end to end
Drive technical decisions around service integration patterns, deployment models, and operational interfaces
Design production-grade Kubernetes Operators with reliable reconciliation logic for complex stateful applications
Design and implement upgrade orchestration that handles database migrations, rolling deployments, compatibility checks, and rollback capabilities
Develop tooling and automation to reduce the operational complexity of running GitLab at scale
Create integration frameworks that enable development teams to ship new services with standardized deployment patterns
Maintain and evolve GitLab Helm Charts to support both simple and complex deployment topologies
Contribute to safe database migration strategies for zero-downtime upgrades across PostgreSQL and other stateful components
Implement compatibility layers that enable incremental upgrades without requiring simultaneous updates across all components
Design and contribute to build validation and pre-flight check systems that detect potential upgrade issues before they impact production
Partner with development teams to define integration requirements for new services and features
Collaborate with GitLab Dedicated and GitLab.com SRE teams to align deployment patterns and operational practices
Work with Product Management to translate customer needs into technical requirements
Mentor and guide other engineers on the team, establishing technical standards and best practices
Create technical documentation and runbooks that enable customer success and support teams
Define and implement observability standards for self-managed deployments, including metrics, logging, and alerting
Build automated testing frameworks that validate deployment and upgrade scenarios across reference architectures
Establish performance benchmarks and capacity planning guidance for different deployment scales
Design resilience patterns for handling failures during upgrades and operations
Contribute to incident response and post-mortems for self-managed deployment issues

Qualifications

Go proficiency · Production Kubernetes experience · Cloud-native architecture · Database migration strategies · Helm charts design · Infrastructure automation · Linux systems understanding · Mentoring engineers · Technical documentation · Cross-functional collaboration

Required

8+ years of software engineering experience, including at least 3 years in platform engineering or infrastructure roles
Expert-level Go proficiency (Ruby and Rails are a plus) with demonstrated ability to work in large, complex codebases
Production Kubernetes experience, including building and maintaining Kubernetes Operators; designing Helm charts for complex stateful applications; understanding of Custom Resource Definitions (CRDs), admission controllers, and controller patterns; and experience with stateful workloads, persistent volumes, and storage classes
Cloud-native architecture experience, including service mesh, observability stacks, and infrastructure as code
Experience shipping production software that customers install and operate in their own infrastructure
Understanding of Linux systems, including package management, systemd, and system-level debugging

Preferred

Experience building or maintaining Operators for complex stateful applications (databases, message queues, etc.)
Ruby on Rails expertise and understanding of Rails application architecture
Infrastructure automation using Terraform, Ansible, or similar tools
Background in Site Reliability Engineering or DevOps with production on-call experience
Understanding of compliance and security requirements for enterprise software deployments
Experience with observability platforms
Open source contribution history, particularly in infrastructure or deployment tooling

Benefits

Flexible Paid Time Off
Team Member Resource Groups
Equity Compensation & Employee Stock Purchase Plan
Growth and Development Fund
Parental leave
Home office support

Company

GitLab is a web-based Git repository manager that offers a variety of features for software development teams.

Funding

Current Stage
Public Company
Total Funding
$413.5M
Key Investors
ICONIQ Growth · Google Ventures · August Capital
2021-10-14 · IPO
2019-09-17 · Series E · $268M
2018-09-19 · Series D · $100M

Leadership Team

Bill Staples
Chief Executive Officer
Sytse Sijbrandij
Co-founder and Executive Chair
Company data provided by Crunchbase