Intellum · 1 day ago

Lead Site Reliability Engineer

Intellum is the leader in corporate education technology, powering successful learning programs for major brands. They are seeking a Lead Site Reliability Engineer to lead their SRE team, focusing on reliability standards, security posture, and platform scalability.

Education · Enterprise Software · Human Resources · Information Technology · SaaS
H1B Sponsor Likely

Responsibilities

SRE Leadership & Strategy: Set clear goals for the SRE team and partner with Engineering leadership to align platform initiatives with business objectives
Reliability & Observability (SLA/SLO): Lead the definition and enforcement of SLAs, SLIs, and SLOs. Architect observability frameworks to translate telemetry data into actionable roadmaps that reduce toil and enhance resilience
Core Engineering & Performance: Take ownership of critical code components (e.g., Queues, Enrollments) and lead efforts to identify bottlenecks, optimize performance, and improve code quality across the engineering department
Security by Design: Champion infrastructure security. Partner with InfoSec to define hardening standards, manage perimeter defense (WAF/DDoS), and automate vulnerability remediation within the CI/CD pipeline
Incident Command: Participate in the 24x7 on-call rotation and lead post-incident reviews (RCAs), ensuring action items are implemented to improve MTTR and prevent recurrence
Mentorship: Empower developers with better tooling and guidance on performant coding practices, fostering a culture of collaboration, reliability, and "you build it, you run it" ownership

Qualifications

Ruby on Rails · Cloud Computing · Infrastructure as Code · SQL Databases · Deep Observability · SLO Governance · Security Focus · Incident Management · Documentation & Training · Proactive Problem-Solving · Automation Tools · CI/CD Expertise · Kubernetes · Perimeter Defense · Leadership

Required

10+ years of engineering experience, with 5+ years specifically developing Ruby on Rails applications
Expertise in Cloud Computing (AWS/GCP) and Infrastructure as Code (Terraform/Ansible)
Strong proficiency with SQL databases (PostgreSQL) and the ability to quickly navigate and optimize complex, unfamiliar codebases
Proven experience designing monitoring solutions (Datadog, New Relic, Prometheus) based on the 'Golden Signals'
Demonstrated ability to define SLIs/SLOs from scratch, negotiate Error Budgets, and use data to balance feature velocity with reliability
Experience securing cloud environments and container platforms (Kubernetes), including hands-on management of WAF rules and edge security
Experience leading post-incident reviews (RCAs) and implementing action items that directly improve MTTR (Mean Time to Recovery) and MTTD (Mean Time to Detection)
Proven experience leading technical teams, mentoring engineers, and working in a team-oriented, collaborative environment with strong communication skills
Skilled in documenting solutions and training operational teams on how to effectively support and maintain systems
Demonstrated ability to communicate clearly, seek help proactively, and take ownership of tasks, leading them to completion
Bachelor's degree in Computer Science or related technical field

Preferred

Experience developing solutions using server automation tools such as Terraform and Ansible
Experience in writing and maintaining CI/CD pipelines and services
Experience in building, deploying, and optimizing Kubernetes-based infrastructure
Experience configuring and managing Web Application Firewalls (e.g., Cloudflare, AWS WAF, Akamai) and DDoS protection mechanisms

Benefits

Medical - 100% of employee premiums covered for selected individual plans
Dental - 100% of employee premiums covered
Vision - 100% of employee premiums covered
LinkedIn Learning
401(k) plus matching (US Based Only)
Unlimited PTO
Calm subscription
Annual Company Retreat

Company

Intellum

Intellum is a provider of integrated brand-building and web presence strategy solutions and managed services.

H1B Sponsorship

Intellum has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2022 (1)

Funding

Current Stage: Growth Stage
Total Funding: $25M
Key Investors: Guidepost Growth Equity
2023-08-02 · Private Equity · $25M

Leadership Team

David Pitta, Chief Marketing Officer
Greg Rose, Chief Experience Officer
Company data provided by Crunchbase