
Incedo Inc.

Senior Data Engineer - Databricks

  • Posted 4 hours ago

Job Description

Role Description

We are hiring a Data Engineer to join a Databricks-focused delivery team working on a life sciences data platform build. You will develop ingestion and transformation pipelines, build and maintain medallion-layer data models, and contribute to data quality and governance implementation.

This role is based in India (Gurgaon preferred) and involves working as part of a cross-functional POD alongside senior engineers, analysts, and QA resources. You will report to the onshore technical lead and collaborate closely with client stakeholders during overlap hours.

Role and Responsibilities

  • Build and maintain ETL/ELT pipelines on Databricks using PySpark, Python, and SQL.
  • Develop Bronze, Silver, and Gold layer transformations following medallion architecture patterns.
  • Implement data ingestion from multiple source systems using Auto Loader, batch APIs, and file-based connectors.
  • Write and maintain data quality checks, validation logic, and reconciliation scripts within pipelines.
  • Support Unity Catalog setup and configuration including access control, tagging, and lineage.
  • Participate in CI/CD pipeline setup using GitHub and Databricks Workflows for automated deployments.
  • Collaborate with senior engineers on architecture decisions, code reviews, and design discussions.
  • Troubleshoot pipeline failures, data anomalies, and performance issues in development and production environments.

Requirements

  • 3 to 5+ years of overall data engineering experience.
  • 1-2 years working on Databricks (notebooks, workflows, Delta Lake).
  • 1-2 years hands-on with PySpark or Python for data pipeline development.
  • Familiarity with data modeling concepts and schema design (dimensional models, SCD patterns).
  • Comfortable writing complex SQL queries for transformation, analysis, and data validation.
  • Basic experience with version control (GitHub or GitLab) and CI/CD concepts.
  • Ability to work in an agile, sprint-based delivery environment.
  • Good communication skills and comfort working with distributed teams across time zones.

Good to Have

  • Life sciences or pharma domain exposure.
  • Experience with Auto Loader, Lakeflow Connect, or Kafka.
  • Exposure to Unity Catalog, data governance, or data quality frameworks.
  • AWS or Azure cloud platform experience.
  • Experience with dbt, Airflow, or other orchestration tools.

Regards,

Manvendra

[Confidential Information]

Job ID: 148392577
