Project Role : Data Engineer
Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.
Must have skills : Apache Spark, AWS Glue
Good to have skills : NA
Minimum 3 year(s) of experience is required
Educational Qualification : 15 years of full-time education
Summary:
As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems. You will collaborate with cross-functional teams to understand data requirements, provide innovative solutions to enhance data accessibility and usability, and contribute to the overall data strategy of the organization, ensuring that data solutions are efficient, scalable, and aligned with business objectives.
We are looking for a skilled Data Engineer to join a data migration project. In this developer-focused role, you will build and optimize ETL pipelines to migrate and transform data from legacy on-premises systems to AWS, leveraging AWS-native transformation technologies.
Key Responsibilities
Implement end-to-end ETL pipelines using Apache Spark and AWS-native services such as Glue (PySpark, SQL), Step Functions, EventBridge, and Lambda for data extraction, transformation, and loading (see the sketch after this list).
Use pre-created utilities for seamless migration, handling high-volume datasets with error handling and retry mechanisms.
Collaborate with the Solution Designer to test pipelines for performance, and deploy them with the help of a DevOps engineer.
Monitor and troubleshoot pipelines using CloudWatch, optimize for cost and process scalability, and document code for team handover.
Support data quality validation and incremental loads to maintain data integrity during the hydration process.
Engage with multiple teams and contribute to key decisions. Provide solutions to problems for your immediate team and across multiple teams.
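For illustration only, below is a minimal sketch of the kind of Glue PySpark job these responsibilities describe: extract from the Glue Data Catalog, apply a simple transformation, and load Parquet to S3. The job parameter names (source_database, source_table, target_path) and the partition column are hypothetical; the actual project relies on its own pre-created utilities and configuration.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Resolve job parameters (parameter names are hypothetical)
args = getResolvedOptions(
    sys.argv, ["JOB_NAME", "source_database", "source_table", "target_path"]
)

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read the source table from the Glue Data Catalog; the
# transformation_ctx enables job bookmarks for incremental loads
source_dyf = glue_context.create_dynamic_frame.from_catalog(
    database=args["source_database"],
    table_name=args["source_table"],
    transformation_ctx="source_dyf",
)

# Transform: basic cleansing and audit columns using Spark SQL functions
df = (
    source_dyf.toDF()
    .dropDuplicates()
    .withColumn("load_ts", F.current_timestamp())
    .withColumn("load_dt", F.current_date())
)

# Load: append partitioned Parquet to the target S3 path
(
    df.write.mode("append")
    .partitionBy("load_dt")
    .parquet(args["target_path"])
)

job.commit()
```

In practice, the same pattern would be wrapped in the project's error handling and retry mechanisms and orchestrated via Step Functions or EventBridge, with CloudWatch used for monitoring as noted above.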
Professional & Technical Skills:
- Must To Have Skills: Proficiency in Apache Spark, AWS Glue, Pyspark, SQL, ETL, Unix, Iceberg, Astronomer, DW concepts
- Strong understanding of data pipeline architecture and design principles.
- Experience with data warehousing solutions and ETL processes.
- Familiarity with cloud computing services and data storage solutions.
- Ability to troubleshoot and optimize data workflows for performance.
Professional & Technical Skills:
- Must To Have Skills: Proficiency in Apache Spark, AWS Glue.
- Experience with data warehousing solutions and data lake architectures.
- Strong understanding of ETL processes and data integration techniques.
- Familiarity with cloud platforms and services, particularly AWS.
- Knowledge of data governance and data quality best practices.
Additional Information:
- The candidate should have a minimum of 3 years of experience in Apache Spark.
- This position is based at our Pune office.
- A minimum of 15 years of full-time education is required.