Site Reliability Engineer jobs in United States

Autonomai Recruitment · 2 hours ago

Site Reliability Engineer

Autonomai Recruitment is seeking a Site Reliability Engineer to join their core engineering group. The role involves designing and scaling Linux platforms that support ML/AI-driven trading, ensuring ultra-reliable and fast trading systems through hands-on leadership and strategic decision-making.

Staffing & Recruiting
Hiring Manager
James McNicholl

Responsibilities

Lead SRE practices for Linux platforms powering low-latency, high-throughput trading workloads
Architect, optimize, and tune Linux for performance, resilience, and minimal latency
Drive incident response, root cause analysis, and continuous reliability improvement across production systems
Oversee system automation and reproducibility—build, deploy, and fleet-manage bare-metal Linux and containerized stacks
Manage and enhance Kubernetes clusters, network configuration, and large-scale orchestration
Set observability standards; expand monitoring, alerting, and performance metrics across platforms
Analyze networking, kernel-level performance, and distributed systems—solving core challenges in a multi-petabyte, multi-cluster environment
Build Python tools for automation, reliability engineering, and performance analysis
Design highly distributed systems

Qualifications

Deep Linux, Scripting - Python, DevOps, Kubernetes, Distributed systems

Required

Deep Linux
Scripting - Python
DevOps
Kubernetes
Experience building technology 0→1
Owning systems end-to-end
Working close to the metal
Architect and own reliability for massive simulation, HPC, and production workloads

Preferred

Experience in a top-tier tech environment (FAANG, elite trading, hyperscale infra)
Collaborative engagement with quants, researchers, and trading experts

Company

Autonomai Recruitment

Autonomai Recruitment is a boutique search agency specializing in tailored recruitment solutions for FinTech, Crypto, and AI.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase