CINC · 1 month ago
Data Engineer II
CINC is seeking a Data Engineer II to architect and implement data systems at TruBase. This role focuses on designing distributed data processing systems, optimizing data pipelines, and ensuring system reliability at scale.
Real Estate
Responsibilities
Build distributed data systems
Design for fault tolerance
Balance consistency and availability requirements
Scale processing across multiple instances
Implement database solutions
Recommend appropriate database technologies based on workloads
Design efficient schemas and indices
Optimize query patterns and data access
Build efficient and correct ETL pipelines
Design high-throughput data flows
Implement data transformation logic
Monitor pipeline performance and correctness
Ensure data integrity across systems
Troubleshoot performance issues
Profile system bottlenecks
Analyze memory, CPU, and network usage
Implement distributed tracing
Advance technical capabilities
Stay current with emerging technologies
Implement proof-of-concepts
Document architectural decisions
Mentor junior engineers
Qualifications
Required
Distributed systems knowledge (idempotence, consistency, consensus)
Understanding of database internals (B+-trees vs. LSM trees, columnar vs. row storage, storage tiering)
Performance debugging and profiling practices and tools (strace, py-spy, flamegraphs)
Memory access/usage, IO, and CPU usage optimization
Data transformation and aggregation at scale
Experience with log-based distributed streaming systems (Kafka, Pulsar, Kinesis)
Experience with relational, columnar, and key-value databases (Postgres, Redshift, ClickHouse, DynamoDB)
Advanced language skills (Python, TypeScript)
Container orchestration (Kubernetes)
CI/CD for data pipelines
Preferred
Experience with lower-level languages
Understanding of serialization formats and compression (JSON, Protobuf, FlatBuffers)
AWS cloud architecture
Data governance implementation