Prodege, LLC · 10 hours ago
Principal Data Engineer
Prodege, LLC is a cutting-edge marketing and consumer insights platform seeking a Principal Data Engineer to lead the design and modernization of its data architecture. The role involves building scalable data pipelines, ensuring data quality, and supporting AI/ML initiatives to optimize the company’s flagship products.
Advertising · Analytics · Digital Media · Gift Card · Market Research · Mobile · Search Engine · Virtual Currency
Responsibilities
Lead the design and implementation of the Lakehouse architecture (Iceberg/Trino) and refactor complex legacy data systems into modern patterns
Design, build, and optimize high-scale, reliable ELT/ETL data pipelines using expert-level SQL, Python, Snowflake, and dbt
Own the observability, lineage, quality, and governance frameworks for mission-critical datasets across the multi-product ecosystem
Directly support Data Science and ML Engineering teams by delivering production-grade data sets and optimizing feature engineering pipelines
Elevate the engineering bar across the team, championing best practices and utilizing AI-assisted development tools to accelerate workflow
Architect, design, and implement components of the next-generation Lakehouse platform, leveraging Iceberg, Trino, and Snowflake
Lead the simplification and refactoring efforts for complex, high-volume legacy pipelines, migrating them to modern, declarative ELT patterns (primarily via dbt)
Define and implement best practices for data storage, partitioning, clustering, and schema evolution to optimize performance and reduce cloud compute costs
Design, build, and maintain scalable, reliable data pipelines (batch and near real-time) using Python, expert-level SQL, and orchestration tools (e.g., Airflow)
Develop and enhance Snowflake data models, dbt models, and high-performance analytical data marts for consumption by BI, reporting, and product applications
Own the entire pipeline lifecycle: requirements gathering → design → build → unit/integration testing → deployment → monitoring → iteration
Implement and enhance data lineage, quality checks (via dbt tests/Great Expectations), observability, and alerting across core data pipelines
Collaborate with Data Governance and Security teams to enforce data access controls, PII handling, and retention policies
Continuously monitor and tune pipeline performance to meet strict data SLAs (Service Level Agreements) and SLOs (Service Level Objectives)
Work closely with Data Science and ML Engineering teams to understand and enable their training and serving data needs
Design and optimize data feeds for high-volume Machine Learning workloads, including the development of feature stores and model-serving pipelines
Ensure data consistency and integrity for critical AI-driven applications across consumer and business products
Actively use AI-assisted development tools (like GitHub Copilot, Gemini, etc.) to accelerate coding, generate documentation, draft tests, and simplify complex spec generation
Set high technical standards for code quality, testing, and documentation within the Data Engineering team
Provide technical leadership and mentorship to junior and mid-level engineers, running design reviews and driving consensus on architectural trade-offs
Qualifications
Required
Bachelor's degree in Computer Science or an equivalent area of study, or equivalent years of relevant experience
6+ years of hands-on experience in data engineering, ideally in multi-product, high-volume, or consumer-scale environments
Expert-level proficiency in SQL, strong Python, and extensive experience building robust ETL/ELT workflows
Strong experience with Snowflake and dbt (Data Build Tool) for data transformation and analytics engineering
Proven experience with modern data modeling techniques (e.g., Kimball, Data Vault, semantic layers) and performance tuning of large queries
Experience with Iceberg, Trino, or similar open table format/query engine ecosystems in a Lakehouse architecture
Ability to navigate and refactor complex, interconnected data systems with an Ownership Mindset (you build it, you run it)
Preferred
Experience with Kafka, Kinesis, or Apache Flink for streaming ingestion and event-driven data architectures
Familiarity with feature stores, model-serving pipelines, and MLOps practices
Professional experience using AI-driven development tools (e.g., GitHub Copilot) for coding, testing, or documentation generation
Prior experience in a consumer rewards, survey, or performance marketing ecosystem
Benefits
Medical
Dental
Vision
Short-term disability (STD)
Long-term disability (LTD)
Basic life insurance
Flexible PTO
Paid sick leave prorated based on hire date
Eight paid holidays throughout the calendar year
Option to purchase shares of Company stock commensurate with the position, vesting over four years
Company
Prodege, LLC
A cutting-edge marketing and consumer insights platform, Prodege has charted a course of innovation in the evolving technology landscape by helping leading brands, marketers, and agencies uncover the answers to their business questions, acquire new customers, increase revenue, and drive brand loyalty & product adoption.
H-1B Sponsorship
Prodege, LLC has a track record of offering H-1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart omitted; includes job fields similar to this role)
Trends of Total Sponsorships
2025 (2) · 2023 (4) · 2022 (2) · 2020 (5)
Funding
Current Stage: Late Stage
Total Funding: $60M
Key Investors: Great Hill Partners, TCV
2021-11-15: Private Equity
2014-04-01: Series A · $60M
2008-01-01: Series Unknown
Recent News
Canada NewsWire · 2025-10-30
GlobeNewswire News Room · 2025-01-30
GlobeNewswire News Room · 2024-12-03
Company data provided by Crunchbase