
Tata Communications

Data Engineer (Hadoop)


Job Description

Position Overview

We are seeking an experienced Senior Data Engineer with at least 5 years of hands-on experience in data engineering. The ideal candidate will have a solid understanding of big data technologies and be skilled in building scalable data infrastructure, designing ETL pipelines, and working with tools such as Hadoop, PySpark, Kafka, and Apache NiFi.

Key Responsibilities

  • Design, develop, and maintain large-scale, high-performance data systems and data pipelines using Python, PySpark, Hadoop, and Kafka.
  • Build, deploy, and optimize ETL workflows to process and transform large volumes of structured and unstructured data.
  • Collaborate with cross-functional teams to understand requirements and implement solutions that meet business needs.
  • Work with Apache NiFi for data ingestion, transformation, and flow management.
  • Write and optimize complex SQL queries for data manipulation and reporting.
  • Apply strong knowledge of data structures and algorithms to solve complex technical problems.
  • Automate tasks and processes using shell scripts and Linux-based tools.
  • Participate in code reviews and design discussions.
  • Ensure adherence to best practices in software development, testing, and deployment.
  • Continuously improve software performance, scalability, and reliability.
  • Stay up to date with the latest developments in data engineering and big data technologies, and incorporate them into the team's practices.

Required Skills and Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • At least 5 years of professional software development experience, with strong expertise in the following:
  • Python: Advanced proficiency in Python, including libraries such as pandas and NumPy.
  • PySpark: Experience with distributed data processing using PySpark.
  • Hadoop: Familiarity with the Hadoop ecosystem, including HDFS, MapReduce, and related tools.
  • Kafka: Hands-on experience in building and maintaining Kafka-based messaging systems.
  • SQL: Strong knowledge of relational databases and advanced SQL querying.
  • Data Structures & Algorithms: Strong understanding and practical application of data structures and algorithms.
  • Data Engineering Best Practices: Deep understanding of data modeling, pipeline design, and data infrastructure architecture.
  • ETL Pipelines: Expertise in designing, building, and maintaining efficient ETL pipelines.
  • Apache NiFi: Knowledge of data flow management using Apache NiFi.
  • Shell Scripting: Proficiency in writing efficient shell scripts for task automation.
  • Linux: Strong knowledge of Linux systems and tools for development and deployment.
  • Experience with Agile development methodologies.
  • Excellent problem-solving skills and ability to troubleshoot complex technical issues.
  • Strong communication skills with the ability to work in a collaborative team environment.

Preferred Qualifications

  • Experience with platforms such as Cloudera and Databricks.
  • Familiarity with containerization technologies like Docker and Kubernetes.
  • Knowledge of data warehousing and data lakes.

Job ID: 136398069
