Autonomai Recruitment · 2 hours ago
Site Reliability Engineer
Autonomai Recruitment is seeking a Site Reliability Engineer to join their core engineering group. The role involves designing and scaling Linux platforms that support ML/AI-driven trading, and keeping trading systems fast and highly reliable through hands-on leadership and strategic decision-making.
Responsibilities
Lead SRE practices for Linux platforms powering low-latency, high-throughput trading workloads
Architect, optimize, and tune Linux for performance, resilience, and minimal latency
Drive incident response, root cause analysis, and continuous reliability improvement across production systems
Oversee system automation and reproducibility—build, deploy, and fleet-manage bare-metal Linux and containerized stacks
Manage and enhance Kubernetes clusters, network configuration, and large-scale orchestration
Set observability standards; expand monitoring, alerting, and performance metrics across platforms
Analyze networking, kernel-level performance, and distributed systems—solving core challenges in a multi-petabyte, multi-cluster environment
Build Python tools for automation, reliability engineering, and performance analysis
Design highly distributed systems
Qualifications
Required
Deep Linux
Scripting - Python
DevOps
Kubernetes
Experience building technology 0→1
Owning systems end-to-end
Working close to the metal
Architect and own reliability for massive simulation, HPC, and production workloads
Preferred
Experience in a top-tier tech environment (FAANG, elite trading, hyperscale infra)
Collaborative engagement with quants, researchers, and trading experts
Company
Autonomai Recruitment
Autonomai Recruitment is a boutique search agency specializing in tailored recruitment solutions for FinTech, Crypto, and AI.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase