Data Engineer (AWS)
Company
Infosys
Location
Guadalajara, Mexico
Type
Full Time
Job Description
Required Qualifications:
- 5+ years of experience in data engineering using Python with a focus on AWS S3, EMR, Glue, Step Functions, Apache NiFi and Spark.
- Proven track record of building scalable data pipelines in cloud environments.
- Proficiency in flow design, processors, and data provenance in Apache NiFi.
- Strong expertise in Spark, Hadoop, and distributed computing on AWS EMR.
- In-depth knowledge of AWS services (S3, Glue, Redshift, RDS, Lambda, Step Functions).
- Experience with data formats (JSON, CSV, Parquet, Avro) and transformation techniques.
- Strong problem-solving skills and ability to troubleshoot complex data processing issues.
- Excellent communication skills with the ability to document and explain technical details clearly.
- AWS Certified Solutions Architect or Data Analytics Specialty.
- Experience with data governance frameworks and compliance requirements.
- Familiarity with CI/CD pipelines and version control (GitLab, Jenkins).
Design & Develop Data Pipelines:
- Architect and implement end-to-end data pipelines using AWS S3, EMR, Glue, Step Functions, Apache NiFi, and Spark.
- Manage data ingestion processes from AWS S3, ensuring secure and efficient data transfer.
- Implement initial data routing, validation, and transformations using Apache NiFi processors and Spark data engines.
- Integrate AWS EMR, Apache NiFi, and Spark to perform complex data transformations and analytics.
- Optimize Spark jobs for processing large-scale datasets with a focus on performance and resource utilization.
- Handle both historical and incremental data loads, ensuring data consistency and integrity.
- Define and implement data storage strategies across S3, RDS, and Redshift, adhering to business requirements.
- Manage data catalog creation and schema management using AWS Glue.
- Develop and manage workflows using Apache Airflow and AWS Step Functions to automate data processing tasks.
- Implement monitoring, error handling, and retries within the orchestration framework.
- Ensure data security with encryption (AES-256, TLS) and IAM role-based access controls.
- Implement data governance policies using AWS Glue Data Catalog to ensure compliance with regulatory requirements.
- Utilize AWS CloudWatch to monitor the performance of EMR clusters, NiFi flows and data storage.
- Continuously optimize Spark job configurations and NiFi data flows for maximum throughput and minimal latency.
Date Posted
12/21/2024