Jobs via Dice · 1 day ago
MLOps Engineer (Google Cloud Platform)
Dice is the leading career destination for tech experts at every stage of their careers. Our client, ITBMS Inc., is seeking an MLOps Engineer specializing in Google Cloud Platform to design, implement, and maintain infrastructure for machine learning models. The role involves bridging data science and engineering to ensure reliable and scalable ML systems.
Computer Software
Responsibilities
Model Deployment: Design and implement pipelines for deploying machine learning models into production using Google Cloud Platform services such as AI Platform, Vertex AI, Cloud Run, or Cloud Composer, ensuring high availability and performance
Infrastructure Management: Build and maintain scalable Google Cloud Platform-based infrastructure using services like Google Compute Engine, Google Kubernetes Engine (GKE), and Cloud Storage to support model training, deployment, and inference
Automation: Develop automated workflows for data ingestion, model training, validation, and deployment using Google Cloud Platform tools like Cloud Composer and CI/CD pipelines integrated with GitLab and Bitbucket repositories
Monitoring and Maintenance: Implement monitoring solutions using Google Cloud Monitoring and Logging to track model performance, data drift, and system health, and take corrective actions as needed
Collaboration: Work closely with data scientists, data engineers, and infrastructure and DevOps teams to streamline the ML lifecycle and ensure alignment with business objectives
Versioning and Reproducibility: Manage versioning of datasets, models, and code using Google Cloud Platform tools like Artifact Registry or Cloud Storage to ensure reproducibility and traceability of machine learning experiments
Optimization: Optimize model performance and resource utilization on Google Cloud Platform, leveraging containerization with Docker and GKE, and utilizing cost-efficient resources like preemptible VMs or Cloud TPU/GPU
Security and Compliance: Ensure ML systems comply with data privacy regulations (e.g., GDPR, CCPA) using Google Cloud Platform’s security tools like Cloud IAM, VPC Service Controls, and Data Loss Prevention (DLP)
Tooling: Integrate Google Cloud Platform-native tools (e.g., Vertex AI, Cloud Composer) and open-source MLOps frameworks (e.g., MLflow, Kubeflow) to support the ML lifecycle
Qualifications
Required
Proficiency in programming languages such as Python
Expertise in Google Cloud Platform services, including Vertex AI, Google Kubernetes Engine (GKE), Cloud Run, BigQuery, Cloud Storage, Cloud Composer (managed Airflow), and Dataproc or PySpark
Experience with infrastructure as code (Terraform)
Familiarity with containerization (Docker, GKE) and CI/CD pipelines on GitLab and Bitbucket
Knowledge of ML frameworks (TensorFlow, PyTorch, scikit-learn), MLOps tools compatible with Google Cloud Platform (MLflow, Kubeflow), and GenAI RAG applications
Understanding of data engineering concepts, including ETL pipelines with BigQuery, Dataflow, and Dataproc (PySpark)
Strong problem-solving and analytical skills
Excellent communication and collaboration abilities
Ability to work in a fast-paced, cross-functional environment
Preferred
Experience with large-scale distributed ML systems on Google Cloud Platform, such as Vertex AI Pipelines, Kubeflow on GKE, or Feature Store
Exposure to Generative AI (GenAI) and Retrieval-Augmented Generation (RAG) applications and deployment strategies
Familiarity with Google Cloud Platform's model monitoring tools and techniques for detecting data drift or model degradation
Knowledge of microservices architecture and API development using Cloud Endpoints or Cloud Functions
Google Cloud Professional certifications (e.g., Professional Machine Learning Engineer, Professional Cloud Architect)
Company
Jobs via Dice
Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase