PSR Associates, Inc. · 1 month ago
Director, Edge Operations/SRE
PSR Associates is a consulting and talent solutions firm connecting qualified IT professionals with opportunities. They are seeking a Director of Edge Operations/SRE to lead a distributed team managing edge computing infrastructure across 100+ countries, ensuring performance, reliability, and operational excellence.
Information Technology, Recruiting, Staffing Agency
Responsibilities
Lead 24x7x365 operations of edge infrastructure, ensuring high availability, reliability, and efficient performance
Build, mentor, and lead a team of SREs, engineers and operations specialists across multiple geographies
Supervise incident response, root cause analysis, and resolution processes for edge-related outages or degradations
Promote the adoption of SRE practices, including SLIs/SLOs, error budgets, and incident management
Collaborate with Platform Engineering and Application teams to build and deploy scalable, resilient systems
Monitor platform capacity and performance and collaborate with the Platform Engineering team to forecast and plan for future edge capacity/performance needs
Develop and maintain monitoring, alerting, and observability to proactively detect and resolve issues
Lead initiatives to automate day-to-day operational tasks and reduce toil
Work with the edge platform vendor and outsourced service provider to ensure SLAs are met and continuously improved
Ensure all edge operations align with security guidelines and meet relevant regulatory and compliance standards
Engage with Platform Engineering, Application, Segment, and Market teams to coordinate edge operations with business objectives
Collaborate with Market Operations and other Global Tech Operations teams to implement changes to the edge platform in line with global and market-specific change protocols
Cultivate an environment of continuous improvement, teamwork, and operational efficiency among team members
Qualifications
Required
Bachelor's or Master's degree in Computer Science, Engineering, or a related field
10+ years of experience in infrastructure, On-Prem/Public Cloud, or SRE roles, with at least 5 years in a leadership capacity
Proven experience leading edge platforms or hybrid cloud environments
Strong knowledge of hardware, Kubernetes, CI/CD pipelines, and infrastructure as code (e.g., Terraform, Ansible)
Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK, New Relic)
Experience driving automation initiatives to reduce toil and improve efficiency
Excellent communication, leadership, and cross-functional collaboration skills
Experience building and leading partnerships with vendors and managed service partners to deliver business value
Demonstrated experience leading Operations/SRE teams within a complex multinational corporation
Strong knowledge of and experience with GCP and/or AWS cloud infrastructure