Capgemini · 11 hours ago
Data Architect
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world. The role involves designing and implementing scalable data architectures, managing data infrastructure, and ensuring data governance practices while utilizing cloud platforms and Databricks features.
Responsibilities
Contribute to the design and implementation of scalable data architectures, such as a Lakehouse, using Delta Lake and Unity Catalog
Manage and maintain the underlying data infrastructure, which typically exists on a major cloud platform like AWS, Azure, or GCP
Implement data governance practices, including data lineage, metadata management, and access controls, integrating with third-party products such as Immuta and Protegrity (for tokenization)
Adhere to software engineering best practices, including participating in code reviews and CI/CD (Continuous Integration/Continuous Deployment) automation
Stay up to date with the latest trends and technologies in data engineering and the Databricks ecosystem
Design and build robust ETL (Extract, Transform, Load) and ELT workflows to ingest, transform, and load structured and unstructured data from various sources
Utilize Databricks features like Delta Live Tables and Databricks Workflows to orchestrate and manage complex data processes
Optimize and tune Apache Spark jobs for performance and cost efficiency on large datasets
Implement and enforce data security, access control, and compliance policies in Databricks and Azure
Qualification
Required
Hands-on experience with streaming technologies (Kafka, Event Hubs, Kinesis)
Experience architecting machine learning platforms and advanced analytics workloads, and developing and deploying MLOps on Databricks
Expertise with enterprise security models, networking, and cost governance
Experience working with DevOps teams to establish deployment practices using Terraform or similar tools
Experience optimizing Databricks performance (auto-scaling, caching, Delta optimization, job tuning, cost optimization)
Prior experience in leading enterprise Azure Databricks implementations
Benefits
Flexible work
Healthcare including dental, vision, mental health, and well-being programs
Financial well-being programs such as 401(k) and Employee Share Ownership Plan
Paid time off and paid holidays
Paid parental leave
Family building benefits like adoption assistance, surrogacy, and cryopreservation
Social well-being benefits like subsidized back-up child/elder care and tutoring
Mentoring, coaching and learning programs
Employee Resource Groups
Disaster Relief
Company
Capgemini
Capgemini is a software company that provides consulting, technology, and digital transformation services.
H1B Sponsorship
Capgemini has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (2856)
2024 (3012)
2023 (3424)
2022 (4392)
2021 (3311)
2020 (5871)
Funding
Current Stage
Public Company
Total Funding
$4.72B
2025-09-18 Post-IPO Debt · $4.72B
1999-04-01 IPO
Company data provided by Crunchbase