This job has closed.

Jobs via Dice · 7 hours ago

Data Engineer

Dice is the leading career destination for tech experts at every stage of their careers. Our client, ASCII Group LLC, is seeking a Data Engineer with expertise in AWS EMR to design, deploy, and manage big data environments. The role involves optimizing performance, managing data ingestion pipelines, and ensuring compliance with security standards.

Computer Software

Responsibilities

Design, deploy, configure, and maintain big data environments on AWS EMR
Provision, configure, and manage Amazon EMR clusters
Optimize EMR performance, sizing, and auto scaling
Administer Hadoop ecosystem components including HDFS, YARN, Hive, and Spark
Install, configure, and upgrade big data tools and frameworks
Monitor cluster health and troubleshoot failures
Manage data ingestion pipelines using Kafka, Kinesis, and AWS Glue
Implement EMR security using IAM, Kerberos, KMS, and VPC configurations
Ensure compliance with organizational security and governance standards
Automate EMR provisioning using CloudFormation, CDK, or Terraform
Build CI/CD pipelines using AWS tools or Jenkins
Implement cost optimization strategies for EMR and associated AWS services
Use CloudWatch, CloudTrail, and the EMR Console for monitoring and diagnostics
Tune Spark and Hive jobs for performance improvements
Collaborate with data engineers and cloud teams across projects
Provide production support, including on-call coverage

Qualifications

AWS EMR, Hadoop ecosystem, Data ingestion pipelines, Linux administration, AWS certifications, Terraform, CloudFormation, Python, Bash scripting, Collaboration

Required

Proficiency in AWS big data services such as EMR, Glue, Redshift, and S3
Strong expertise with AWS EMR and the Hadoop ecosystem
Proficiency in Linux administration and scripting (Bash, Python)
Familiarity with AWS services such as S3, EC2, IAM, and CloudWatch
Experience with DevOps tools including Terraform and CloudFormation

Preferred

AWS certifications (Solutions Architect, Big Data, Data Engineer)
Experience with EMR Serverless, Docker, or Kubernetes

Company

Jobs via Dice

Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase