Data Engineer_Spark/Scala

zorba ai

Pune, India

5-8 Years

Posted 2 days ago

Job Description

We are seeking an experienced Data Engineer with strong expertise in Spark, Scala/Python, Airflow, and Azure Cloud to design, develop, and optimize scalable data processing solutions. The ideal candidate should have hands-on experience in building large-scale data pipelines, tuning Spark applications, configuring clusters, and implementing cloud-based data engineering solutions.

Key Responsibilities

Design, develop, and maintain scalable data pipelines and ETL/ELT workflows using Spark, Scala, and Python.
Build high-performance batch and streaming data processing solutions for large-scale datasets.
Configure, optimize, and troubleshoot Spark applications, cluster settings, and resource management for maximum performance.
Develop and manage workflow orchestration using Apache Airflow.
Implement data engineering solutions on Azure Cloud services.
Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver robust solutions.
Monitor, maintain, and enhance data pipeline reliability, scalability, and performance.
Perform code reviews and ensure adherence to coding standards and best practices.
Troubleshoot production issues and provide timely resolutions.
Contribute to the design and implementation of data platform architecture and modernization initiatives.

Required Skills

5–8 years of experience in Data Engineering and Big Data technologies.
Strong hands-on experience with Apache Spark, including:

Spark architecture and internals
Spark performance tuning
Spark configuration and cluster optimization
Resource allocation and workload management

Solid development experience in Scala and/or Python.
Experience with Apache Airflow for workflow orchestration and scheduling.
Strong knowledge of Azure Cloud services and the Azure data engineering ecosystem.
Experience with distributed data processing and large-scale data platforms.
Good understanding of ETL/ELT concepts and data pipeline development.
Strong problem-solving and debugging skills.

Preferred Qualifications

Experience with Azure Data Lake, Azure Databricks, Azure Synapse, or related Azure data services.
Knowledge of CI/CD practices and DevOps methodologies.
Experience working in Agile/Scrum environments.
Familiarity with data warehousing concepts and data modeling.
Exposure to real-time data processing frameworks is a plus.

Educational Qualification