
Ingrain Systems Inc

Data Engineer - AWS/PySpark


Job Description

Job Title : Data Engineer

Location : Hyderabad

Work Mode : Hybrid

Office Timings : 11 AM to 8 PM IST

Rounds of Interview : 2 (face-to-face)

Core Technical Expertise Required For This Role

  • AWS PySpark : Strong hands-on experience using PySpark within AWS environments to process large-scale datasets efficiently (an illustrative Glue/PySpark sketch follows this list).
  • AWS Glue : Experience building, maintaining, and optimizing AWS Glue jobs for data extraction, transformation, and loading.
  • AWS S3 : Proficient in working with Amazon S3 for data storage, data lake architecture, and integration with analytics pipelines.
  • PySpark : Ability to write optimized PySpark code for distributed data processing and transformation.
  • ETL Frameworks : Experience designing, developing, and maintaining scalable ETL frameworks for batch and streaming data pipelines.
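As a rough illustration of the kind of work these skills cover, here is a minimal AWS Glue job sketch in PySpark that reads raw data from S3, applies a transformation, and writes a curated output back to S3. The bucket names, column names, and aggregation are placeholder assumptions for illustration, not details from this posting.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard AWS Glue job setup: resolve job arguments and build contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw data from an S3-backed data lake (bucket and prefix are placeholders).
orders = spark.read.parquet("s3://example-raw-bucket/orders/")

# Example transformation: keep completed orders and aggregate amounts by day.
daily_totals = (
    orders.filter(orders.status == "COMPLETED")
          .groupBy("order_date")
          .agg({"amount": "sum"})
)

# Write the curated output back to S3 for downstream analytics.
daily_totals.write.mode("overwrite").parquet(
    "s3://example-curated-bucket/daily_order_totals/"
)

job.commit()
```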

Skills that will provide additional value to the role :

  • Knowledge of Talend Cloud ETL : Familiarity with Talend Cloud for building and orchestrating ETL pipelines.
  • Kafka : Understanding of event-driven architectures and streaming data platforms.
  • Snowflake Cloud : Experience working with Snowflake for cloud-based data warehousing and analytics (a combined Kafka-to-Snowflake sketch follows this list).
  • Power BI : Exposure to data visualization and reporting using Power BI.
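As a sketch only, the optional Kafka and Snowflake skills might come together in a PySpark structured-streaming job that consumes a Kafka topic and appends micro-batches to a Snowflake table via the Spark-Snowflake connector. The broker address, topic, schema, table, and connection options below are assumptions, not details from the posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("kafka-to-snowflake-sketch").getOrCreate()

# Placeholder schema for the Kafka event payload.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("order_date", StringType()),
    StructField("amount", DoubleType()),
])

# Consume a Kafka topic as a stream (broker and topic names are placeholders).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("payload"))
    .select("payload.*")
)

# Snowflake connection options are placeholders; requires the
# Spark-Snowflake connector to be available on the cluster.
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
    "sfUser": "etl_user",
    "sfPassword": "***",
}

def write_to_snowflake(batch_df, batch_id):
    # Append each micro-batch to a Snowflake table.
    (batch_df.write.format("net.snowflake.spark.snowflake")
        .options(**sf_options)
        .option("dbtable", "ORDERS_STREAM")
        .mode("append")
        .save())

events.writeStream.foreachBatch(write_to_snowflake).start().awaitTermination()
```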

Qualification

(Educational background and professional experience requirements)

  • Bachelor's or Master's degree in Computer Science or a related discipline, or equivalent hands-on industry experience.
  • 3+ years of experience in application design, development, and analysis, with proven experience building and analyzing data-driven applications.
  • Hands-on experience designing and implementing solutions on AWS Cloud; retail industry experience is preferred.

Key Responsibilities

(What the role involves on a day-to-day basis)

  • Process data using Spark (PySpark) : Develop and manage Spark-based data processing pipelines that handle large volumes of structured and unstructured data.
  • Collaborate with data analysts, business, and analytics teams to ensure data meets their reporting and analysis needs and to deliver high-quality, reliable datasets.
  • Design and build end-to-end ETL frameworks that process and extract data from cloud databases using AWS services such as Lambda, Glue, PySpark, Step Functions, SNS, SQS, and Batch (see the orchestration sketch after this list).
  • Act as a strategic thinker: take ownership of data pipelines and contribute to data governance, best practices, and long-term architectural decisions.
  • Maintain sound knowledge of AWS services and how they integrate to build robust data platforms.
  • Ramp up quickly on the existing frameworks built in AWS PySpark and Glue, analyze current implementations, and ensure continuity.
  • Scan existing ETL frameworks to identify performance bottlenecks, improve efficiency, and propose optimizations and cost savings.
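To make the ETL-framework responsibility concrete, here is a hedged sketch of one common orchestration pattern: an AWS Lambda handler (boto3) that receives S3 object notifications delivered through SNS and SQS and starts a Glue job run for each new file. The job name, argument keys, and message structure are illustrative assumptions, not details from the posting.

```python
import json

import boto3

glue = boto3.client("glue")

def handler(event, context):
    """Triggered by S3 notifications delivered through SNS and SQS.

    For each newly arrived object, start a Glue ETL job run. The job name
    and argument keys below are placeholders for illustration only.
    """
    for sqs_record in event.get("Records", []):
        sns_envelope = json.loads(sqs_record["body"])     # SQS message body (SNS envelope)
        s3_event = json.loads(sns_envelope["Message"])    # S3 notification inside the envelope
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]
            glue.start_job_run(
                JobName="example-orders-etl",
                Arguments={"--source_path": f"s3://{bucket}/{key}"},
            )
    return {"status": "ok"}
```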

(ref:hirist.tech)

Job ID: 141447649
