  • Posted 5 days ago

Job Description

Role: Sr. Azure Databricks Data Engineer

Experience: 6+ Years

Location: Navi Mumbai

Duration: Full-time

Role Overview

We are looking for a Data Engineer to design and build scalable, production-grade data pipelines on Azure using Databricks. The role involves working with high-volume, high-velocity enterprise data across domains such as telecom, retail, and regulatory systems (e.g., GST, eWay Bill).

You will be responsible for building reliable batch and real-time pipelines and for ensuring data quality, auditability, and performance at scale.

Key Responsibilities

Data Engineering & Pipeline Development

Design and implement end-to-end data pipelines using Azure Databricks and PySpark

Build and maintain batch and real-time ingestion pipelines using:

o Azure Data Factory (ADF)

o Kafka / Azure Event Hubs

Process TB- to PB-scale structured and semi-structured datasets (JSON, Parquet, CSV)

Lakehouse Architecture

Implement and maintain Medallion Architecture (Bronze, Silver, Gold layers)

Develop reusable data models for analytics, reporting, and downstream consumption

Ensure data lineage, traceability, and auditability across layers

Delta Lake & Data Management

Leverage Delta Lake features:

o MERGE (upserts), SCD Type 1/2 implementations

o Schema enforcement and evolution

o ACID-compliant pipelines

Optimize Delta tables using:

o OPTIMIZE, Z-ORDER, VACUUM

Handle incremental processing and CDC pipelines

Streaming & Real-Time Processing

Build low-latency streaming pipelines using Structured Streaming

Handle:

o Late-arriving data

o Watermarking and windowing

o Exactly-once processing semantics

Integrate streaming pipelines with downstream Delta tables and serving layers

Performance Optimization & Scalability

Optimize Spark jobs using:

o Partitioning strategies

o Broadcast joins

o Caching and persistence

o Adaptive Query Execution (AQE)

Troubleshoot performance bottlenecks such as:

o Data skew

o Shuffle issues

o Memory constraints

Orchestration & Workflow Management

Design orchestration workflows using Azure Data Factory:

o Pipeline dependencies

o Scheduling and triggers

o Retry and failure handling

Integrate ADF with Databricks Jobs / Workflows for end-to-end execution

Data Quality & Governance

Implement robust data quality checks, including:

o Schema validation

o Deduplication

o Null and integrity checks

o Data reconciliation across sources

Handle schema drift and evolving data contracts

Ensure compliance with regulatory and audit requirements

Domain-Specific Use Cases

Build pipelines for:

o High-frequency transactional systems

o Regulatory datasets (GST, eWay Bill, financial reporting)

o Retail / telecom data platforms

Ensure data consistency, reconciliation, and reporting accuracy

Technical Skills Required

Core

Strong SQL:

o Window functions

o Complex joins

o Query optimization

PySpark:

o Transformations and actions

o Performance tuning

Azure Ecosystem

Azure Data Factory (ADF)

Azure Data Lake Storage Gen2 (ADLS)

Azure Databricks

Streaming

Kafka or Azure Event Hubs

Structured Streaming

Good to Have

CDC pipeline implementation

Databricks Auto Loader

Delta optimization techniques

Multi-region / multi-source ingestion

Data quality frameworks (e.g., expectation-based validation)

Soft Skills & Expectations

Strong debugging skills in distributed data systems (Spark)

Experience handling production incidents and RCA (Root Cause Analysis)

Ability to work in high-scale, SLA-driven environments

Effective collaboration with business, analytics, and downstream consumers

Flexibility to work in a work-from-office (WFO) or hybrid setup

What Success Looks Like

Reliable, scalable pipelines handling large-scale enterprise data

Reduced pipeline failures and improved data SLAs

High-quality, trusted datasets for business-critical reporting

Efficient Spark jobs with optimized cost and performance

More Info

Job ID: 144027091
