Simplesense · 1 hour ago
Staff Cloud Operations Engineer
Simplesense is a non-traditional defense contractor focused on building and sustaining the Installation Resilience Platform for mission operators. They are seeking a Staff Cloud Operations Engineer to lead the technical direction and operational success of critical cloud services, ensuring reliability and performance while mentoring the team.
Cyber Security, National Security, Public Safety, Security, Software
Responsibilities
End-to-End Operational Ownership : Act as the single technical owner responsible for the operational success of critical cloud systems, defining Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Work with other Staff and Principal Engineers to establish operational Infrastructure as Code (IaC) standards and best practices
Cross-Functional Collaboration : Coordinate with development to ensure repeatable and reliable feature deployments into cloud environments using CI/CD pipelines and maintain infrastructure through IaC practices
Ambiguity Navigation : Tackle vague and complex operational challenges by defining technical strategies and leading the team toward holistic, sustainable solutions
Mentorship and Improvement : Elevate the operational maturity of the team through insightful reviews of operational runbooks, CI/CD pipelines, and automation scripts. Mentor operations engineers on troubleshooting, problem solving, and incident response
Operational Execution : Focus on the health of critical systems, conducting root cause analysis (RCA) for major incidents and resolving complex, intermittent issues that span on-prem/cloud boundaries
Active Operational Support : Participate in periodic help desk rotations and Tier 3 / Tier 4 on-call support, troubleshooting and resolving issues, fixing bugs and implementing solutions to enhance system reliability and performance
CI/CD Development : Build automated delivery pipelines and develop internal self-service tools to enhance operational efficiency
Stakeholder Collaboration : Work with product and development teams to define operational requirements and communicate system trade-offs effectively
Demonstrated experience providing technical leadership, mentorship, and guidance to engineers, with the ability to influence team direction, operational practices, and outcomes. Prior or potential experience supporting people leadership responsibilities (such as onboarding, coaching, or performance feedback) is a plus
Qualifications
Required
7+ years of experience managing cloud environments, systems administration, or related fields, with a focus on cloud-native applications and services
Proficient in Infrastructure as Code (IaC) tools such as CloudFormation or Terraform
Proficient in configuration tooling such as Ansible
Strong understanding of CI/CD pipelines and tooling development
Experience with implementing and tuning observability stacks, including monitoring, logging, and tracing systems
Experience with IP networking fundamentals
Familiarity with the Git command line and IDE tooling
Proven track record in leading complex troubleshooting efforts and root cause analyses related to critical incidents
Experience in mentoring junior engineers and enhancing the team's operational readiness
Excellent interpersonal and communication skills for cross-team collaboration
Must be able to obtain DoD 8570/8140 IAT Level II certification (e.g., CompTIA Security+ CE) within 6 months of hire
10% travel for quarterly team planning
Must be a U.S. Citizen and able to obtain a DoD NIPR network account and Common Access Card (CAC)
Must have, or be able to obtain, a Secret Clearance
Preferred
Experience in the operational intelligence or industrial technology sectors
AWS Networking experience with Transit Gateways, Managed VPNs and Direct Connect
AWS Certifications: Networking Specialty, Security Specialty, or Solutions Architect
Benefits
Equity
Medical, Life, Short-Term Disability, and AD&D insurance
Medical travel coverage
Dental coverage
Vision coverage
401(k) matching
Company
Simplesense
Rapidly authorize and deploy proven cybersecurity solutions for Industrial Control Systems (ICS) / Operational Technology (OT)