
Posted 6 days ago

Job Description

Primary Title: PySpark Developer (Big Data Engineer)

About The Opportunity

We are recruiting for the Information Technology & Big Data Analytics sector, supporting enterprise-grade data platform and analytics initiatives. This on-site role in India places a PySpark Developer into engineering teams building scalable ETL/streaming pipelines, data lakes, and real-time analytics for mission-critical systems.

Role & Responsibilities

  • Design, develop and optimise PySpark-based ETL and streaming jobs to transform large-scale datasets for analytics and ML feature stores.
  • Implement robust data ingestion and processing pipelines across batch and streaming sources (Kafka, S3/HDFS, RDBMS), ensuring data quality and low-latency delivery.
  • Collaborate with data engineers and data scientists to productionise models, and define reusable data schemas, partitioning schemes, and performance-tuning strategies.
  • Author unit and integration tests, apply CI/CD practices, and deploy Spark jobs to production environments (on-premises clusters, EMR, or Databricks).
  • Monitor pipeline health, troubleshoot performance issues, and implement observability (metrics, logging, alerting) to maintain SLAs.
  • Document pipeline design, data contracts and runbooks; mentor junior engineers on Spark best practices and coding standards.
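To illustrate the ingestion-with-data-quality responsibility above, here is a minimal sketch in plain Python (rather than a full PySpark job, which would need a running cluster). The field names `event_id` and `event_ts`, and the `s3://lake/events` base path, are hypothetical examples, not part of the role description; the same quality-gate and Hive-style date-partitioning logic would typically be expressed as DataFrame filters and `partitionBy` in an actual PySpark pipeline.

```python
from datetime import datetime, timezone

def is_valid(record: dict) -> bool:
    """Basic quality gate: required keys present and non-empty."""
    return bool(record.get("event_id")) and bool(record.get("event_ts"))

def partition_path(base: str, event_ts: str) -> str:
    """Derive a Hive-style dt= partition path from an ISO-8601 timestamp."""
    dt = datetime.fromisoformat(event_ts).astimezone(timezone.utc)
    return f"{base}/dt={dt:%Y-%m-%d}"

def route(records: list[dict], base: str) -> dict[str, list[dict]]:
    """Group valid records by target partition; silently drop invalid ones
    (a production pipeline would route rejects to a quarantine sink)."""
    out: dict[str, list[dict]] = {}
    for r in records:
        if is_valid(r):
            out.setdefault(partition_path(base, r["event_ts"]), []).append(r)
    return out
```

For example, a batch containing one valid and one invalid record (empty `event_id`) routes only the valid one into its `dt=` partition, which mirrors how a Spark job would filter before `df.write.partitionBy("dt")`.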

Skills & Qualifications

Must-Have

  • PySpark
  • Apache Spark
  • Python
  • SQL
  • Apache Hive
  • Hadoop
  • Apache Kafka
  • Airflow

Preferred

  • Databricks
  • AWS EMR
  • Apache NiFi

Other qualifications: On-site in India; candidates should demonstrate proven delivery of production Spark pipelines and an ability to optimise for throughput and cost. Strong debugging and profiling experience with Spark UI and JVM metrics is highly valued.

Benefits & Culture Highlights

  • Work on high-impact, enterprise data platforms and real-time analytics use-cases.
  • Collaborative engineering teams with emphasis on code quality, observability, and continuous improvement.
  • Opportunity to upskill on cloud data services and modern data engineering patterns.

Location: India (on-site). A recruitment partner is handling the hiring process for our client. To apply, provide an updated CV highlighting PySpark projects, pipeline architectures, and performance-tuning achievements.



Job ID: 134101753
