OKAYA INFOCOM
Databricks Engineer | Remote | Cincinnati, OH | Full Time
OKAYA INFOCOM is a technology company seeking a Databricks Engineer. The role involves developing and maintaining CI/CD pipelines for Azure Databricks deployments, automating deployment and configuration, and collaborating with cross-functional teams to design and scale data pipelines and models.
Responsibilities
Workspace & environment engineering: Standardize Dev/UAT/Prod workspaces (network/Private Link, VNets, secure egress), service principals, secret scopes, and Key Vault integrations
Unity Catalog & governance: Configure catalogs/schemas, RBAC, lineage, and data access patterns aligned to our guardrails
CI/CD for Databricks: Implement YAML-based Azure DevOps pipelines to automate notebook/job deployments, dependency management, environment promotions, and approval/compliance checks
IaC for Databricks & Azure: Author reusable Bicep/Terraform modules for workspaces, clusters/pools, UC objects, and supporting Azure resources
Observability & reliability: Establish monitoring/alerting for jobs, clusters, SLAs, autoscaling, and cost controls; build automation for disaster recovery scenarios
Documentation & handover: Patterns, pipeline templates, IaC modules, and operational runbooks for BAU, plus knowledge transfer (KT) during the first two releases
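As a purely illustrative sketch of the job and cluster automation described above, the snippet below builds a Databricks Jobs API 2.1-style job specification with an autoscaling job cluster. All names, paths, and sizing values are hypothetical, not taken from this posting; in practice a CI/CD pipeline or IaC module would generate and submit such a payload per environment.

```python
def build_job_spec(env: str) -> dict:
    """Build a Jobs API 2.1-style job spec with an autoscaling job cluster.

    All identifiers below (job name, repo path, node type) are hypothetical
    examples of per-environment promotion, not values from this posting.
    """
    return {
        "name": f"etl-orders-{env}",  # hypothetical job name per environment
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {
                    # hypothetical notebook path promoted across Dev/UAT/Prod
                    "notebook_path": f"/Repos/data-platform/{env}/ingest",
                },
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "Standard_DS3_v2",
                    # autoscaling bounds balance cost control and load spikes
                    "autoscale": {"min_workers": 2, "max_workers": 8},
                },
            }
        ],
        "max_concurrent_runs": 1,
    }

spec = build_job_spec("prod")
```

A deployment pipeline would typically render one such spec per target workspace and apply it through the Jobs API or a Terraform/Bicep module, so Dev/UAT/Prod stay identical except for the environment-specific fields.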
Qualifications
Required
Minimum of 10-15 years of relevant experience in the areas below:
Develop and maintain CI/CD pipelines for Azure Databricks deployments (Azure DevOps/YAML and related tools)
Automate deployment and configuration of Databricks clusters, jobs, libraries, notebooks, and environment promotions
Implement and manage the Databricks environment for performance, cost efficiency, and scalability; optimize cluster sizing and autoscaling
Collaborate with Data Engineers/Scientists/Software Engineers to design, deploy, and scale data pipelines and models on Databricks
Monitor and troubleshoot clusters, pipelines, jobs, and associated workflows; integrate Azure Monitor/Log Analytics for visibility and metrics
Implement Infrastructure as Code (IaC) using Terraform, ARM templates, or Bicep to manage Azure resources and Databricks artifacts
Design and maintain backup, recovery, and DR strategies for Databricks environments
Support security best practices: RBAC/ABAC, managed identities, Key Vault secrets, compliance controls, and Unity Catalog governance
Produce clear documentation, templates, and runbooks; enable smooth KT to BAU teams
Proven experience as a DevOps/Platform Engineer in cloud environments, with a strong focus on Azure
Deep expertise in Azure Databricks, Azure Data Lake Storage, Azure Resource Manager (ARM), Microsoft Entra, Azure SQL Database
Hands-on experience automating Databricks: clusters, libraries, jobs, notebooks, and environment promotions via pipelines
Proficiency in Unity Catalog and Databricks data governance
Familiarity with Apache Spark (PySpark, Spark SQL)
Strong IaC skills: Terraform, ARM, or Bicep
Scripting (Python/PowerShell) and Git (branching strategies, conflict resolution)
Observability with Azure Monitor, Log Analytics; pipeline orchestration with Azure Data Factory
Security best practices for cloud (RBAC, managed identities, Key Vault)
Excellent problem-solving and collaboration skills across cross-functional teams