Investigo · 8 hours ago
Senior Data Engineer
Investigo is a recruitment company partnering with Hippocratic AI, a leading generative AI company in healthcare. They are seeking a Senior Data Engineer to design and scale data infrastructure that supports safe AI deployment across healthcare environments, ensuring data quality and compliance.
Accounting · Finance · Professional Services
Responsibilities
Build & operate data platforms and pipelines (batch/stream) that feed training, RAG, evaluation, and analytics using tools like Prefect, dbt, Airflow, Spark, and cloud data warehouses (Snowflake/BigQuery/Redshift)
Own data governance and access control: implement HIPAA-grade permissioning, lineage, audit logging, and DLP; manage IAM, roles, and policy-as-code
Ensure reliability, observability, and cost efficiency across storage (S3/GCS), warehouses, and ETL/ELT, including SLAs/SLOs, data quality checks, monitoring, and disaster recovery
Enable self-service analytics via curated models and semantic layers; mentor engineers on best practices in schema design, SQL performance, and data lifecycle
Partner with ML/Research to provision high-quality datasets, feature stores, and labeling/eval corpora with reproducibility (versioning, metadata, data contracts)
Qualifications
Required
5+ years of software or data engineering experience, with 3+ years building data infrastructure, ETL/ELT pipelines, or distributed data systems
Deep experience with Python and at least one cloud data platform (Snowflake, Databricks, BigQuery, Redshift, or equivalent)
Familiarity with orchestration and transformation tools (Airflow, Prefect, dbt) and infrastructure-as-code (Terraform, CloudFormation)
Strong understanding of data security, access control, and compliance frameworks (HIPAA, SOC 2, GDPR, or similar)
Proficiency with SQL and experience optimizing query performance and storage design
Excellent problem-solving and collaboration skills, with the ability to work across engineering, ML, and clinical teams
Comfortable navigating trade-offs between performance, cost, and maintainability in complex systems
Preferred
Experience supporting ML pipelines, feature stores, or model training datasets
Familiarity with real-time streaming systems (Kafka, Kinesis) or large-scale unstructured data storage (S3, GCS)
Background in data reliability engineering, data quality monitoring, or governance automation
Experience in healthcare, safety-critical systems, or regulated environments
Company
Investigo
Investigo offers accountancy, finance, audit, risk, compliance, property, and facilities management services and is located in London.
Funding
Current Stage
Growth Stage
Recent News
Bdaily Business News
2025-01-22
Company data provided by Crunchbase