
Xebia

Lead Data Engineer - Scala/Spark

  • Posted 5 hours ago

Job Description

Job Title: Lead Data Engineer - Scala/Spark

Job Location: Bengaluru

Exp Range: 6-12 years

Position Overview: We are seeking a Senior Data Engineer with deep expertise in Scala-based Spark development and end-to-end deployment of data pipelines on Kubernetes clusters, orchestrated via Airflow.

Key Responsibilities:

• Design & implement robust, scalable batch & real-time data engineering solutions using Apache Spark (Scala) & Spark Structured Streaming.

• Architect well-structured Scala projects with reusable, modular, and testable codebases aligned with SOLID and clean architecture principles & practices.

• Develop, deploy & manage Spark jobs on Kubernetes clusters, ensuring efficient resource utilization, fault tolerance, and scalability.

• Orchestrate data workflows using Apache Airflow — manage DAGs, task dependencies, retries, and SLA alerts.

• Write and maintain comprehensive unit and integration tests for the pipelines and utilities developed.

• Work on performance tuning, partitioning strategies, and data quality validation.

• Use and enforce version control best practices (branching, PRs, code review) and continuous integration and delivery (CI/CD) for automated testing and deployment.

• Write clear, maintainable documentation (README, inline docs, docstrings).

• Participate in design reviews and provide technical guidance to peers and junior engineers.

Technical Skills:

Primary:

• Languages: Scala

• Big Data Orchestration: Airflow, Spark on Kubernetes, YARN, Oozie

• Big Data Processing: Hadoop, Kafka, Spark & Spark Structured Streaming.

• Experience applying SOLID & DRY principles, with good software architecture & design implementation experience.

• Advanced Scala experience (e.g. functional programming, case classes, complex data structures & algorithms).

• Proficient in developing automated frameworks for unit & integration testing.

• Experience with Docker and Helm and related container technologies.

• Proficient in deploying and managing Spark workloads on Kubernetes clusters.

• Experience in evaluation and implementation of Data Validation & Data Quality

• DevOps experience with Jenkins, Maven, GitHub, GitHub Actions, and CI/CD.


Job ID: 147475249