Dexian India

Python Data Engineer

  • Posted 17 days ago

Job Description

Key Responsibilities

  • Pipeline Development: Design, build, and deploy robust ETL/ELT pipelines in Databricks (PySpark, SQL, Delta Lake) to ingest, transform, and curate governance and operational metadata from multiple sources landed in Databricks.
  • Granular Data Quality Capture: Implement profiling logic to capture issue-level metadata (source table, column, timestamp, severity, rule type) to support drill-down from dashboards into specific records and enable targeted remediation.
  • Governance Metrics Automation: Develop data pipelines to generate metrics for dashboards covering data quality, lineage, job monitoring, access & permissions, query cost, usage & consumption, retention & lifecycle, policy enforcement, sensitive data mapping, and governance KPIs.
  • Microsoft Purview Integration: Automate asset onboarding, metadata enrichment, classification tagging, and lineage extraction for integration into governance reporting.
  • Data Retention & Policy Enforcement: Implement logic for retention tracking and policy compliance monitoring (masking, RLS, exceptions).
  • Job & Query Monitoring: Build pipelines to track job performance, SLA adherence, and query costs for cost and performance optimization.
  • Metadata Storage & Optimization: Maintain curated Delta tables for governance metrics, structured for efficient dashboard consumption.
  • Testing & Troubleshooting: Monitor pipeline execution, optimize performance, and resolve issues quickly.
  • Collaboration: Work closely with the lead engineer, QA, and reporting teams to validate metrics and resolve data quality issues.
  • Security & Compliance: Ensure all pipelines meet organizational governance, privacy, and security standards.

Required Qualifications

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field
  • 4+ years of hands-on data engineering experience, including Azure Databricks and Azure Data Lake
  • Proficiency in PySpark, SQL, and ETL/ELT pipeline design
  • Demonstrated experience building granular data quality checks and integrating governance logic into pipelines
  • Working knowledge of Microsoft Purview for metadata management, lineage capture, and classification
  • Experience with Azure Data Factory or equivalent orchestration tools
  • Understanding of data modeling, metadata structures, and data cataloging concepts
  • Strong debugging, performance tuning, and problem-solving skills
  • Ability to document pipeline logic and collaborate with cross-functional teams

Job ID: 132695981
