Bajaj Finserv

Senior Data Engineer

3-5 Years

Job Description

Location Name: Pune Corporate Office - Mantri

Job Purpose

Build and maintain reliable, scalable batch and real-time data pipelines on the Enterprise Data Platform to enable analytics, reporting, and downstream applications. The role delivers high-quality data engineering solutions using SQL, Python, and PySpark, with strong focus on streaming, Change Data Capture (CDC), and database mirroring to ensure timely, trusted data delivery.

Duties And Responsibilities

  • Design, develop, and optimize data pipelines using SQL, Python, and PySpark on cloud data platforms.
  • Implement and operate real-time/streaming data ingestion (e.g., Spark Structured Streaming/Kafka) including schema evolution and late-arriving data handling.
  • Set up and manage CDC frameworks and database mirroring for near-real-time replication with minimal-latency updates.
  • Build robust data models and curated datasets for analytics, dashboards, and application consumption.
  • Ensure data quality, lineage, and observability (validation, alerting, SLAs/SLOs) across batch and streaming workloads.
  • Drive performance tuning and cost optimization (partitioning, file formats, caching, autoscaling).
  • Harden solutions with security best practices (access controls, PII handling), governance, and compliance standards.
  • Contribute to CI/CD using Git/GitHub and DevOps pipelines; automate testing and deployments.
  • Partner with Data Platform, BI/Analytics, and Application teams to translate requirements into technical solutions.
  • Provide L2/L3 support for pipelines and jobs; troubleshoot incidents, perform RCA, and implement preventive fixes.
  • Create and maintain technical documentation and runbooks; participate in code reviews and knowledge sharing.
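As an illustration only (not part of the posting), the CDC responsibility above reduces to applying an ordered stream of change events (insert/update/delete) to a target dataset with upsert semantics. A minimal pure-Python sketch, with all names hypothetical:

```python
# Hypothetical sketch: apply ordered CDC events to a target table
# keyed by primary key. Inserts and updates are treated as upserts;
# deletes are idempotent.

def apply_cdc_events(target: dict, events: list) -> dict:
    """Apply ordered change events to a key -> row mapping."""
    for event in events:
        op, key, row = event["op"], event["key"], event.get("row")
        if op in ("insert", "update"):
            target[key] = row          # upsert semantics
        elif op == "delete":
            target.pop(key, None)      # idempotent delete
        else:
            raise ValueError(f"unknown op: {op}")
    return target

table = {1: {"name": "a"}}
events = [
    {"op": "insert", "key": 2, "row": {"name": "b"}},
    {"op": "update", "key": 1, "row": {"name": "a2"}},
    {"op": "delete", "key": 2},
]
apply_cdc_events(table, events)
```

In production this logic would typically be handled by a log-based CDC tool (e.g., Debezium) feeding a merge/upsert into the lakehouse, rather than hand-written code.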

Key Decisions / Dimensions


  • Select appropriate ingestion patterns (batch vs. streaming), CDC/mirroring approaches, and storage formats.
  • Define partitioning, indexing, and optimization strategies to meet SLAs.
  • Recommend tooling and frameworks for orchestration, testing, and observability.
  • Prioritize defect fixes and enhancements based on impact and risk.

Major Challenges


  • Maintaining reliability and low latency for mission-critical streaming and CDC pipelines.
  • Managing schema changes and data drift across diverse source systems.
  • Balancing feature delivery with production support within tight timelines.
  • Optimizing performance and cost at scale across environments.

Educational Qualifications


Required Qualifications and Experience

  • Graduate or Postgraduate degree in Computer Science, Information Technology, or Data Science/Technologies.

Work Experience


  • 3-4 years of hands-on data engineering experience.

Technical Expertise / Skills Keywords


  • SQL, Python, PySpark
  • Data streaming (e.g., Spark Structured Streaming, Kafka), CDC (e.g., Debezium/Log-based), Database Mirroring
  • Data modeling, performance tuning, and optimization
  • Version control (Git/GitHub) and DevOps pipelines (e.g., Azure DevOps)
  • Preferred: Azure Databricks, Azure Data Factory, Data Lake Storage; experience with orchestration and observability tools.

Job ID: 144902451
