Jobs via Dice · 7 hours ago
Data Engineer
Dice is the leading career destination for tech experts at every stage of their careers. Our client, ASCII Group LLC, is seeking a Data Engineer with expertise in AWS EMR to design, deploy, and manage big data environments. The role involves optimizing performance, managing data ingestion pipelines, and ensuring compliance with security standards.
Computer Software
Responsibilities
Design, deploy, configure, and maintain big data environments on AWS EMR
Provision, configure, and manage Amazon EMR clusters
Optimize EMR performance, sizing, and auto scaling
Administer Hadoop ecosystem components including HDFS, YARN, Hive, and Spark
Install, configure, and upgrade big data tools and frameworks
Monitor cluster health and troubleshoot failures
Manage data ingestion pipelines using Kafka, Kinesis, and AWS Glue
Implement EMR security using IAM, Kerberos, KMS, and VPC configurations
Ensure compliance with organizational security and governance standards
Automate EMR provisioning using CloudFormation, CDK, or Terraform
Build CI/CD pipelines using AWS tools or Jenkins
Implement cost optimization strategies for EMR and associated AWS services
Use CloudWatch, CloudTrail, and the EMR Console for monitoring and diagnostics
Tune Spark and Hive jobs for performance improvements
Collaborate with data engineers and cloud teams across projects
Provide production support, including on-call coverage
Qualifications
Required
Proficiency in AWS big data services such as EMR, Glue, Redshift, and S3
Strong expertise with AWS EMR and the Hadoop ecosystem
Proficiency in Linux administration and scripting (Bash, Python)
Familiarity with AWS services such as S3, EC2, IAM, and CloudWatch
Experience with DevOps tools including Terraform and CloudFormation
Preferred
AWS certifications (Solutions Architect, Big Data, Data Engineer)
Experience with EMR Serverless, Docker, or Kubernetes
Company
Jobs via Dice
Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase