
Posted 6 days ago

Job Description

Primary Title: PySpark Developer (Big Data Engineer)

About The Opportunity

We are recruiting for the Information Technology & Big Data Analytics sector, supporting enterprise-grade data platform and analytics initiatives. This on-site role in India places a PySpark Developer into engineering teams building scalable ETL/streaming pipelines, data lakes, and real-time analytics for mission-critical systems.

Role & Responsibilities

  • Design, develop and optimise PySpark-based ETL and streaming jobs to transform large-scale datasets for analytics and ML feature stores.
  • Implement robust data ingestion and processing pipelines across batch and streaming sources (Kafka, S3/HDFS, RDBMS), ensuring data quality and low-latency delivery.
  • Collaborate with data engineers and data scientists to productionise models, and define reusable data schemas, partitioning schemes, and performance-tuning strategies.
  • Author unit and integration tests, apply CI/CD practices, and deploy Spark jobs to production environments (on-premises clusters, EMR, or Databricks).
  • Monitor pipeline health, troubleshoot performance issues, and implement observability (metrics, logging, alerting) to maintain SLAs.
  • Document pipeline design, data contracts and runbooks; mentor junior engineers on Spark best practices and coding standards.
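To illustrate the ingestion-with-data-quality responsibility above, here is a minimal sketch in plain Python (rather than a full PySpark job, which would need a running cluster). The field names `event_id` and `event_ts`, and the `s3://lake/events` base path, are hypothetical examples, not part of the role description; the same quality-gate and Hive-style date-partitioning logic would typically be expressed as DataFrame filters and `partitionBy` in an actual PySpark pipeline.

```python
from datetime import datetime, timezone

def is_valid(record: dict) -> bool:
    """Basic quality gate: required keys present and non-empty."""
    return bool(record.get("event_id")) and bool(record.get("event_ts"))

def partition_path(base: str, event_ts: str) -> str:
    """Derive a Hive-style dt= partition path from an ISO-8601 timestamp."""
    dt = datetime.fromisoformat(event_ts).astimezone(timezone.utc)
    return f"{base}/dt={dt:%Y-%m-%d}"

def route(records: list[dict], base: str) -> dict[str, list[dict]]:
    """Group valid records by target partition; silently drop invalid ones
    (a production pipeline would route rejects to a quarantine sink)."""
    out: dict[str, list[dict]] = {}
    for r in records:
        if is_valid(r):
            out.setdefault(partition_path(base, r["event_ts"]), []).append(r)
    return out
```

For example, a batch containing one valid and one invalid record (empty `event_id`) routes only the valid one into its `dt=` partition, which mirrors how a Spark job would filter before `df.write.partitionBy("dt")`.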

Skills & Qualifications

Must-Have

  • PySpark
  • Apache Spark
  • Python
  • SQL
  • Apache Hive
  • Hadoop
  • Apache Kafka
  • Airflow

Preferred

  • Databricks
  • AWS EMR
  • Apache NiFi

Other qualifications: On-site in India; candidates should demonstrate proven delivery of production Spark pipelines and an ability to optimise for throughput and cost. Strong debugging and profiling experience with Spark UI and JVM metrics is highly valued.

Benefits & Culture Highlights

  • Work on high-impact, enterprise data platforms and real-time analytics use-cases.
  • Collaborative engineering teams with emphasis on code quality, observability, and continuous improvement.
  • Opportunity to upskill on cloud data services and modern data engineering patterns.

Location: India (on-site). A recruitment partner is handling the hiring process for our client. To apply, provide an updated CV highlighting PySpark projects, pipeline architectures, and performance-tuning achievements.



Job ID: 134101753
