Key Responsibilities
Data Pipeline Development (ETL/ELT)
- Design and develop scalable ETL/ELT pipelines using PySpark and Spark SQL
- Build and maintain Databricks Jobs for workflow orchestration
- Implement Delta Live Tables (DLT) for declarative and automated pipeline development
- Develop both batch and streaming data processing pipelines
Lakehouse Architecture Design
- Design and implement Medallion Architecture (Bronze, Silver, Gold layers)
- Build curated, analytics-ready datasets for business and reporting needs
- Optimize storage and processing using Delta Lake best practices
Data Ingestion & Integration
- Ingest data from multiple sources, including RDBMS, NoSQL databases, APIs, and streaming systems such as Kafka and Event Hubs
- Process structured and semi-structured data in formats such as CSV, JSON, Parquet, and Avro
- Build reliable, scalable data ingestion frameworks that work across source systems
Performance Optimization & Delta Lake Management
- Implement ACID transactions, schema enforcement, and schema evolution using Delta Lake
- Apply Change Data Capture (CDC) techniques for incremental data processing
- Optimize Spark workloads using partitioning, caching, broadcast joins, and Z-order indexing
- Improve job performance through Spark tuning and distributed computing best practices
Data Quality, Governance & Security
- Implement data validation rules and quality checks across pipelines
- Ensure compliance with enterprise data governance standards
- Use Unity Catalog for access control, data lineage, and auditability
- Maintain transparency, traceability, and reliability of data across pipelines
Monitoring & Reliability Engineering
- Build monitoring, logging, and alerting systems for data pipelines
- Troubleshoot pipeline failures and continuously optimize performance
- Ensure high availability, fault tolerance, and production stability
Collaboration & Stakeholder Engagement
- Work closely with data analysts, data scientists, and business stakeholders
- Translate business requirements into scalable data models and pipelines
- Participate in Agile ceremonies including sprint planning, stand-ups, and retrospectives