Lead Data Engineer - Scala/Spark

Xebia

Bengaluru, India

6-12 Years

Posted 5 hours ago

Job Description

Job Title: Lead Data Engineer - Scala/Spark

Job Location: Bengaluru

Exp Range: 6-12 years

Position Overview: We are seeking a Senior Data Engineer with deep expertise in Scala-based Spark development and end-to-end deployment of data pipelines on Kubernetes clusters, orchestrated via Airflow.

Key Responsibilities:

• Design & implement robust, scalable, batch & real-time data engineering solutions using Apache Spark (Scala) & Spark Structured Streaming.

• Architect well-structured Scala projects using reusable, modular, and testable codebases aligned with SOLID and clean architecture principles & practices.

• Develop, deploy & manage Spark jobs on Kubernetes clusters, ensuring efficient resource utilization, fault tolerance, and scalability.

• Orchestrate data workflows using Apache Airflow: manage DAGs, task dependencies, retries, and SLA alerts.

• Write and maintain comprehensive unit tests and integration tests for the pipelines and utilities developed.

• Work on performance tuning, partitioning strategies, and data quality validation.

• Use and enforce version control best practices (branching, PRs, code review) and continuous integration and delivery (CI/CD) for automated testing and deployment.

• Write clear, maintainable documentation (README, inline docs, docstrings).

• Participate in design reviews and provide technical guidance to peers and junior engineers.

Technical Skills:

Primary:

• Languages: Scala

• Big Data Orchestration: Airflow, Spark on Kubernetes, YARN, Oozie