About The Opportunity
Primary title: Big Data Engineer (Hadoop Ecosystem). We operate in the IT services and enterprise data engineering sector, delivering large-scale analytics, data-pipeline modernization, and operational data platforms for clients across industries. This on-site role is based in India and focuses on building and operating Hadoop-based, high-throughput data platforms.
Role & Responsibilities
- Design, deploy and maintain Hadoop ecosystem clusters (HDFS, YARN) ensuring stability, scalability and security for production workloads.
- Develop and optimise batch ETL pipelines using Spark, Hive, Sqoop and related tools to ingest, transform and store large datasets (a batch ETL sketch follows this list).
- Build and operate streaming data pipelines using Kafka (producers/consumers), ensuring low-latency delivery and fault tolerance (see the streaming sketch after this list).
- Perform performance tuning, capacity planning and cluster troubleshooting, including resource contention, job optimisation and storage format tuning (Parquet/ORC/Avro).
- Implement data reliability, lineage and governance best practices; automate routine operations with Oozie/Airflow and scripting (a scheduling sketch follows this list).
- Collaborate with Data Engineering, Data Science and DevOps teams to productionize models, integrate CI/CD, and enable observability and alerting.
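To make the batch ETL responsibility concrete, here is a minimal PySpark sketch of the ingest-transform-store pattern referenced above; the HDFS paths, column names and Hive table are hypothetical placeholders, not details of this role.

```python
# Minimal PySpark batch ETL sketch. The HDFS paths, database/table names
# and columns below are hypothetical placeholders, not part of this posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("daily-orders-etl")   # hypothetical job name
    .enableHiveSupport()           # register output as a Hive table
    .getOrCreate()
)

# Ingest: read raw delimited files landed on HDFS (e.g. via Sqoop).
raw = spark.read.option("header", "true").csv("hdfs:///data/raw/orders/")

# Transform: cast types, derive a partition column, drop bad rows.
cleaned = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount").isNotNull())
)

# Store: columnar Parquet, partitioned by date, queryable from Hive.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .format("parquet")
    .saveAsTable("analytics.orders_daily")  # hypothetical Hive table
)
spark.stop()
```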
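For the streaming responsibility, one common pattern is consuming Kafka through Spark Structured Streaming, with a checkpoint location providing fault tolerance on restart; the broker addresses, topic name and sink paths below are assumptions.

```python
# Minimal Kafka -> Spark Structured Streaming sketch. Broker addresses,
# topic name and output/checkpoint paths are hypothetical placeholders.
# Note: the spark-sql-kafka connector package must be supplied at submit time.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Consume: subscribe to a Kafka topic; Kafka delivers key/value as bytes.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
    .option("subscribe", "events")          # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
    .selectExpr("CAST(value AS STRING) AS json_payload")
)

# Sink: append Parquet files to HDFS; the checkpoint directory lets the
# query resume from committed offsets after a failure.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "hdfs:///data/streams/events/")
    .option("checkpointLocation", "hdfs:///checkpoints/events/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```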
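On the automation side, a minimal Airflow DAG (assuming Airflow 2.4+, where `schedule` replaces `schedule_interval`) could wrap the batch job in a daily spark-submit; the DAG id, schedule and application path are placeholders.

```python
# Minimal Airflow DAG sketch for scheduling a daily Spark batch job.
# The dag_id, schedule and spark-submit command are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="orders_daily_etl",        # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    # Submit the batch ETL job to the YARN cluster.
    run_etl = BashOperator(
        task_id="spark_submit_etl",
        bash_command=(
            "spark-submit --master yarn --deploy-mode cluster "
            "hdfs:///apps/etl/daily_orders_etl.py"
        ),
    )
```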
Skills & Qualifications
Must-Have
- Apache Hadoop
- HDFS
- YARN
- Apache Spark
- Apache Hive
- Apache Kafka
- Sqoop
- Oozie
Preferred
- AWS EMR or cloud-managed Hadoop services
- Apache HBase
- Apache Airflow
Qualifications: Bachelor's degree in Computer Science, IT or equivalent practical experience; demonstrated track record of delivering production Big Data solutions in Hadoop ecosystems.
Benefits & Culture Highlights
- On-site role offering hands-on ownership of large-scale data platforms and opportunities to work across analytics and ML data pipelines.
- Exposure to modern data engineering practices, performance tuning and platform automation.
- Collaborative engineering culture with focus on operational excellence and career growth.
Location: India | Workplace: On-site. Keywords: Big Data Engineer, Hadoop Ecosystem, Spark, Kafka, Hive, ETL, data pipelines, cluster management, performance tuning.