
Tata Communications

Data Engineer (Hadoop)


Job Description

Position Overview

We are seeking an experienced Senior Data Engineer with at least 5 years of hands-on experience in data engineering. The ideal candidate will have a solid understanding of big data technologies and be skilled in building scalable data infrastructure, designing ETL pipelines, and working with tools such as Hadoop, PySpark, Kafka, and Apache NiFi.

Key Responsibilities

  • Design, develop, and maintain large-scale, high-performance data systems and data pipelines using Python, PySpark, Hadoop, and Kafka.
  • Build, deploy, and optimize ETL workflows to process and transform large volumes of structured and unstructured data.
  • Collaborate with cross-functional teams to understand requirements and implement solutions that meet business needs.
  • Work with Apache NiFi for data ingestion, transformation, and flow management.
  • Write and optimize complex SQL queries for data manipulation and reporting.
  • Apply strong knowledge of data structures and algorithms to solve complex technical problems.
  • Automate tasks and processes using shell scripts and Linux-based tools.
  • Participate in code reviews and design discussions.
  • Ensure adherence to best practices in software development, testing, and deployment.
  • Continuously improve software performance, scalability, and reliability.
  • Stay up to date with the latest developments in data engineering and big data technologies, and incorporate them into the team's practices.

Required Skills and Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • At least 5 years of professional software development experience, with strong expertise in the following:
  • Python: Advanced proficiency in Python, including libraries such as pandas and NumPy.
  • PySpark: Experience with distributed data processing using PySpark.
  • Hadoop: Familiarity with the Hadoop ecosystem, including HDFS, MapReduce, and related tools.
  • Kafka: Hands-on experience in building and maintaining Kafka-based messaging systems.
  • SQL: Strong knowledge of relational databases and advanced SQL querying.
  • Data Structures & Algorithms: Strong understanding and practical application of data structures and algorithms.
  • Data Engineering Best Practices: Deep understanding of data modeling, pipeline design, and data infrastructure architecture.
  • ETL Pipelines: Expertise in designing, building, and maintaining efficient ETL pipelines.
  • Apache NiFi: Knowledge of data flow management using Apache NiFi.
  • Shell Scripting: Proficiency in writing efficient shell scripts for task automation.
  • Linux: Strong knowledge of Linux systems and tools for development and deployment.
  • Experience with Agile development methodologies.
  • Excellent problem-solving skills and ability to troubleshoot complex technical issues.
  • Strong communication skills with the ability to work in a collaborative team environment.

Preferred Qualifications

  • Experience with platforms such as Cloudera and Databricks.
  • Familiarity with containerization technologies like Docker and Kubernetes.
  • Knowledge of data warehousing and data lakes.

Job ID: 136398069
