Gigster · 2 months ago
Staff SRE (Site Reliability Engineer)
Gigster is a dynamic company that connects top-tier IT engineers with exciting projects in software development and cloud services. They are seeking a highly skilled Staff Site Reliability Engineer to ensure the reliability, scalability, and performance of critical systems, while collaborating with teams to drive infrastructure improvements and automation initiatives.
Analytics, Apps, SaaS, Software
Responsibilities
Design, build, and maintain scalable and reliable infrastructure
Collaborate with engineering teams to ensure systems are designed with reliability and scalability in mind
Evaluate and integrate new technologies to enhance our infrastructure
Implement and maintain monitoring and alerting systems to detect and respond to issues promptly
Lead incident response efforts, ensuring quick resolution and effective communication
Conduct post-incident reviews and drive improvements based on findings
Architect and build automation projects from scratch (preferably in Python or Go) to reduce day-to-day SRE toil
Create Bash scripts to automate manual activities like upgrades, status checks, and deployment
Develop and maintain infrastructure as code (IaC) using tools such as Terraform, Ansible, or similar
Automate repetitive tasks and processes to improve efficiency and reduce manual intervention
Collaborate with cross-functional teams to deliver high-quality products and services
Mentor and guide junior SREs and other team members
Advocate for best practices in reliability engineering across the organization
Drive initiatives to improve service reliability, capacity, and performance
Participate in capacity planning and disaster recovery exercises
Stay current with industry trends and emerging technologies
Qualifications
Required
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
At least 8 years of industry experience as a Software Engineer, SRE, or Platform Engineer
At least 3 years of experience as a Platform Engineer or SRE
Proven experience in managing large-scale, mission-critical infrastructure
Deep understanding of Linux/Unix systems and networking
Proficiency in at least one programming language (e.g., Python, Go, Java)
Intermediate to expert-level skill in Bash scripting
Experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Docker, Kubernetes)
Strong knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack)
Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI)
Excellent problem-solving skills and a proactive attitude
Strong communication and collaboration skills
Ability to work independently and as part of a team
Demonstrated leadership and mentoring abilities
Candidates must be able to work 8am - 5pm Pacific Time and be open to an on-call rotation
Company
Gigster
Gigster is the first team intelligence engine, enabling software development teams to achieve 30% higher efficiency.
H1B Sponsorship
Gigster has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2020 (1)
Funding
Current Stage: Growth Stage
Total Funding: $32.62M
Key Investors: Redpoint, Andreessen Horowitz
2024-03-26: Acquired
2017-08-29: Series B · $20M
2015-12-07: Series A · $10M
Company data provided by Crunchbase