Definitive Healthcare · 8 hours ago
Data Engineer
Definitive Healthcare is a leading company in healthcare data analytics, dedicated to transforming data into actionable intelligence for its clients. They are seeking a Data Engineer to build scalable data pipelines, manage complex healthcare datasets, and contribute to a modern cloud-native data architecture.
Artificial Intelligence (AI), Enterprise Software, SaaS, Big Data, Healthcare, Hospital, Information Technology, Analytics, CRM, Information Services, Medical
Responsibilities
Develop and maintain robust data pipelines using Python, Spark, Databricks, SQL, and SSIS
Implement and orchestrate ETL/ELT workflows using Apache Airflow and SSIS
Build reliable, repeatable processes that support the ingestion and transformation of large healthcare datasets
Integrate data from diverse sources (AWS, on‑prem, third‑party vendors) into our enterprise data platform
Work with a wide range of file formats including CSV, XML, Parquet, Delta, and more
Apply strong data quality, cleansing, and curation practices to ensure accuracy and consistency
Optimize storage and compute resources for performance, cost, and scalability
Automate observability and monitoring across data pipelines and workloads
Implement and manage Unity Catalog for metadata, lineage, and access control
Ensure adherence to data governance, security, and privacy standards
Maintain clear documentation, data dictionaries, and lineage tracking
Contribute to automation of data observability and governance workflows
Tune and optimize Spark jobs for speed, reliability, and cost efficiency
Diagnose and resolve performance bottlenecks across distributed systems
Apply JVM tuning and Spark optimization techniques to improve throughput
Support and enhance our Medallion architecture (bronze/silver/gold) to improve data quality and usability
Ensure data is processed, enriched, and validated at each stage of the lifecycle
Partner with data scientists, analysts, product teams, and business stakeholders to understand data needs
Implement CI/CD pipelines to streamline deployment and testing of data assets
Stay current with emerging technologies and bring forward recommendations to evolve our data platform
Qualifications
Required
Strong programming experience in SQL and Python or Scala
Hands‑on experience with Apache Spark and Databricks
Experience with Apache Airflow or similar orchestration tools
Knowledge of data cleansing, curation, and quality frameworks
Familiarity with Unity Catalog or other metadata management tools
Understanding of data governance, security, and compliance best practices
Experience working with AWS cloud services
Proficiency with CI/CD tools (Jenkins, GitLab CI, etc.)
Experience tuning Spark jobs and JVM‑based applications
Experience implementing or working within a Medallion architecture
Strong analytical and problem‑solving abilities
Excellent communication and cross‑functional collaboration skills
Ability to work independently and within a team environment
High attention to detail and commitment to quality
Preferred
AWS certifications (e.g., AWS Certified Data Analytics)
Experience with SQL and NoSQL databases
Background in a fast‑paced, data‑centric SaaS or healthcare environment
Benefits
Medical, dental, and vision coverage
Unlimited paid time off
Participation in the company’s 401(k) plan with employer contribution
Competitive benefits package including great healthcare benefits and a 401(k) match
Company
Definitive Healthcare
Definitive Healthcare aims to transform data, analytics and expertise into healthcare commercial intelligence.
H1B Sponsorship
Definitive Healthcare has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (12)
2024 (4)
2023 (8)
2022 (16)
2021 (24)
2020 (5)
Funding
Current Stage: Public Company
Total Funding: unknown
Key Investors: 22C Capital
2021-09-15: IPO
2019-10-02: Private Equity
2015-03-02: Private Equity
Company data provided by Crunchbase