About The Opportunity
This role sits inside the data engineering practice of an IT services & analytics firm that builds large-scale batch and streaming data platforms for enterprise clients. We deliver high-throughput, low-latency data processing solutions using Hadoop ecosystems and modern ETL practices to support analytics, reporting and ML use cases.
Position: Big Data Engineer (Hadoop Developer) | Location: India | Workplace: On-site
Role & Responsibilities
- Design, develop and maintain end-to-end Hadoop-based data pipelines for batch and streaming ingestion, transformation, and delivery to downstream consumers.
- Implement and optimise Spark and MapReduce jobs for high-volume data processing; drive performance tuning and resource optimisation across clusters (see the batch-pipeline sketch after this list).
- Author and maintain Hive schemas, SQL queries, partitioning strategies and data models to support analytics and BI workloads.
- Integrate data movement tools (Sqoop, Flume, Kafka) and manage reliability, fault handling and data validation in production flows (see the streaming sketch after this list).
- Perform cluster-level troubleshooting, deploy fixes, monitor jobs and collaborate with platform/ops teams for capacity planning and security (Kerberos).
- Contribute to CI/CD pipelines, code reviews, runbooks and knowledge transfer to ensure operational excellence and reproducibility.
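To give applicants a concrete sense of the pipeline work above, here is a minimal batch sketch in PySpark: read raw events from HDFS, cleanse them, derive a partition column and append into a partitioned Hive table. All paths, table and column names (events, event_ts, dt, analytics.events_daily) are hypothetical placeholders, not details of any client platform.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive support lets the job write managed, partitioned tables directly.
spark = (
    SparkSession.builder
    .appName("daily-events-batch")  # hypothetical job name
    .enableHiveSupport()
    .getOrCreate()
)

# Ingest: one day of raw Parquet files from HDFS (path is a placeholder).
raw = spark.read.parquet("hdfs:///data/raw/events/dt=2024-01-01")

# Transform: drop malformed rows, derive the partition column, and
# repartition by it so each Hive partition is written efficiently.
cleaned = (
    raw.filter(F.col("event_id").isNotNull())
       .withColumn("dt", F.to_date("event_ts"))
       .repartition("dt")
)

# Deliver: append into a dt-partitioned Hive table for BI consumers.
(
    cleaned.write.mode("append")
           .format("parquet")
           .partitionBy("dt")
           .saveAsTable("analytics.events_daily")
)
```

Repartitioning on the partition column before the write is one common way to avoid producing many small files per Hive partition, a frequent tuning concern in this kind of work.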
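A matching streaming sketch, assuming a hypothetical Kafka broker, topic and message schema, shows the ingestion-plus-validation pattern referenced above:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import LongType, StringType, StructField, StructType

# Requires the spark-sql-kafka connector package on the classpath.
spark = SparkSession.builder.appName("kafka-events-ingest").getOrCreate()

# Expected message layout; this schema is an assumption for illustration.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("ts", LongType()),
    StructField("payload", StringType()),
])

# Ingest: subscribe to a Kafka topic (broker and topic are placeholders).
stream = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "events")
         .load()
)

# Validate: parse the JSON value and discard records that fail parsing.
parsed = (
    stream.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*")
          .filter(F.col("event_id").isNotNull())
)

# Deliver: land validated events on HDFS; checkpointing gives the flow
# fault tolerance across restarts.
query = (
    parsed.writeStream.format("parquet")
          .option("path", "hdfs:///data/validated/events")
          .option("checkpointLocation", "hdfs:///checkpoints/events")
          .start()
)
query.awaitTermination()
```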
Skills & Qualifications
Must-Have
- Strong hands-on experience with Hadoop ecosystem components: HDFS, MapReduce, Hive and Sqoop.
- Proven experience developing and optimising Apache Spark jobs (batch and/or streaming).
- Practical knowledge of event-streaming integration using Kafka or Flume, together with robust ETL design patterns.
- Production Linux experience, the ability to debug cluster- and node-level issues, and proficiency with shell scripting.
Preferred
- Experience with cloud-managed Hadoop services (AWS EMR) or distribution management (Cloudera/Hortonworks).
- Familiarity with workflow orchestration tools (Oozie or Airflow) and monitoring/logging stacks (see the orchestration sketch below).
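For candidates less familiar with orchestration, the sketch below shows roughly how a daily Spark job might be scheduled in Airflow; the DAG id, schedule and spark-submit command are hypothetical placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# DAG id, schedule and command are placeholders (Airflow 2.4+ `schedule` arg).
with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Kick off a batch job like the one sketched earlier via spark-submit.
    run_batch = BashOperator(
        task_id="spark_batch_job",
        bash_command="spark-submit --master yarn daily_events_batch.py",
    )
```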
Benefits & Culture Highlights
- Work on enterprise-scale data projects and accelerate your technical growth in Big Data engineering.
- Support for certifications and hands-on mentoring from senior data platform engineers.
- Collaborative, delivery-focused environment with opportunities to influence architecture and best practices.
How to apply: Candidates based in India with strong on-site availability and demonstrable production Hadoop experience will be prioritised. Please highlight relevant projects, cluster sizes handled, and examples of performance improvements in your application.
Skills: Apache Spark, Hadoop, Sqoop, MapReduce, Linux, Hive, Kafka, Big Data