Knox Systems, Inc. · 20 hours ago
Level 2 (L2) Cloud Operations Engineer
Knox Systems, Inc. operates the largest Federal managed cloud, providing secure cloud and AI environments for critical government missions. The Cloud Operations Engineer (L2) is responsible for advanced troubleshooting, system administration, and application environment support across Knox’s cloud infrastructure, ensuring system stability and compliance.
Computer · Cyber Security · Government
Responsibilities
Perform advanced troubleshooting for infrastructure, OS, and application issues
Analyze system logs, metrics, and telemetry from monitoring platforms (Grafana, Datadog, Wiz, CrowdStrike)
Coordinate with Platform/DevOps Engineers on root cause analysis and long-term remediation
Ensure timely resolution of escalated incidents in accordance with SLAs
Manage and maintain AWS, Azure, and hybrid environments in accordance with NIST 800-53 controls
Execute system patching, upgrades, and configuration changes via automation or scripts
Perform health checks, deployment validations, and post-change verifications
Maintain infrastructure documentation and system configuration inventories
Perform advanced application troubleshooting for web-based applications and common application architectures
Troubleshoot app-layer issues such as API failures, integration errors, or misconfigurations
Work with DevOps/Platform teams to optimize CI/CD deployment workflows and rollback plans
Ensure adherence to change management and deployment authorization processes
Create or modify automation scripts (Bash, Python, PowerShell) for maintenance and reporting tasks
Leverage Terraform, Ansible, or cloud-native tools for provisioning and environment consistency
Proactively identify opportunities to automate recurring operational processes
Document system changes and incident response details for FedRAMP audits
Support Continuous Monitoring (ConMon) activities through vulnerability reporting and patch compliance tracking
Assist in maintaining logs, baselines, and access control evidence
Qualifications
Required
3–5 years of experience in cloud operations, system administration, or infrastructure support
Hands-on experience with CrowdStrike Falcon endpoint protection, including analyzing detections, reviewing IOM/IOA telemetry, assessing endpoint vulnerability exposure, and executing or supporting SOAR-based automated response actions
Hands-on experience using Grafana or Datadog for operational monitoring and incident response, including building and maintaining dashboards, analyzing time-series metrics, and correlating alerts to identify performance degradation, availability issues, and system failures in production environments
Proficiency in command-line troubleshooting
Strong working knowledge of AWS and/or Azure infrastructure services
Familiarity with CI/CD pipelines and deployment automation tools
Understanding of advanced application troubleshooting techniques for web-based applications and common application architectures
Experience writing and maintaining scripts (Bash, Python, PowerShell)
Familiarity with FedRAMP, NIST 800-53, or similar compliance environments
Certifications: AWS SysOps Administrator, Microsoft Azure Administrator, CompTIA Security+
Due to the nature of our work with federal government clients and compliance with applicable regulations, this position requires U.S. citizenship
Benefits
Medical
Dental
Vision
Life & Disability
Unlimited PTO
Employee-funded 401(k) plan
Company
Knox Systems, Inc.
FedRAMP in 90 Days for 90% less.
Funding
Current Stage: Growth Stage
Total Funding: $6.5M
Key Investors: Felicis
2025-07-10 · Seed · $6.5M
Company data provided by Crunchbase