Palo Alto Networks · 3 months ago
Principal Engineer Machine Learning (MLOps)
Palo Alto Networks is dedicated to being the cybersecurity partner of choice, protecting our digital way of life. They are seeking a Principal MLOps Engineer to lead the design and operation of machine learning infrastructure at scale, ensuring reliable and efficient deployments while collaborating with cross-functional teams.
Agentic AI · Cloud Security · Cyber Security · Network Security · Security
Responsibilities
Lead MLOps architecture: Design and implement scalable ML platforms, CI/CD pipelines, and deployment workflows across cloud and hybrid environments
Operationalize ML models: Build automated systems for training, testing, deployment, monitoring, and rollback of ML models in production
Ensure reliability and governance: Implement model versioning, reproducibility, auditing, and compliance best practices
Drive observability & monitoring: Develop real-time monitoring, alerting, and logging solutions for ML services, ensuring performance, drift detection, and system health
Champion automation & efficiency: Reduce friction between data science, engineering, and operations by introducing best practices for infrastructure-as-code, container orchestration, and continuous delivery
Collaborate cross-functionally: Partner with ML engineers, data scientists, security teams, and product engineering to deliver robust, production-ready AI systems
Lead innovation in MLOps: Evaluate and introduce new tools, frameworks, and practices that elevate the scalability, reliability, and security of ML operations
Optimize ML infrastructure for cost efficiency and reduced bootstrapping times across various environments
Qualifications
Required
MS / PhD in Computer Science, Engineering, or related field, or equivalent military/industry experience
8+ years of software/DevOps/ML engineering experience, with at least 3 years focused on MLOps or ML platform engineering
Strong programming skills (Python, Go, or Java) with deep expertise in building production systems
Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes, Docker)
Proven experience in ML infrastructure: model serving (TensorFlow Serving, TorchServe, Triton) and workflow orchestration (Airflow, Kubeflow, MLflow, Ray, Vertex AI, SageMaker)
Hands-on experience with CI/CD pipelines, infrastructure-as-code (Terraform, Helm), and monitoring/observability tools (Prometheus, Grafana, ELK/EFK stack)
Strong knowledge of data pipelines, feature stores, and streaming systems (Kafka, Spark, Flink)
Understanding of model monitoring, drift detection, retraining pipelines, and governance frameworks
Ability to influence cross-functional stakeholders, define best practices, and mentor engineers at all levels
Passion for operational excellence, scalability, and securing ML systems in mission-critical environments
Benefits
Restricted stock units
Bonus
Company
Palo Alto Networks
Palo Alto Networks is a cybersecurity company that provides security solutions for organizations.
Funding
Current Stage: Public Company
Total Funding: $65M
Key Investors: Icon Ventures, Lehman Holdings, Globespan Capital Partners
2012-07-20 · IPO
2008-11-03 · Series C · $10M
2008-08-18 · Series C · $27M
Company data provided by Crunchbase