Key Responsibilities
End-to-End Data Pipeline Development
- Design, develop, and manage scalable data pipelines using Microsoft Fabric (Data Pipelines, Lakehouse, Warehouse, OneLake) and Azure Data Factory
- Build robust ETL/ELT workflows supporting batch and near real-time data processing
- Ensure efficient data ingestion from multiple sources, including APIs, relational databases, files, and streaming systems
Lakehouse Architecture & Data Modeling
- Implement Medallion Architecture (Bronze, Silver, Gold layers) for enterprise data platforms
- Design and optimize data models using star and snowflake schemas
- Support modern lakehouse and data warehousing architectures
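For candidates less familiar with the Medallion pattern named above, the following is a minimal sketch in plain Python, not Fabric or Spark code. The sample records, field names, and the helpers `to_silver` and `to_gold` are hypothetical and chosen only for illustration: Bronze holds raw records as ingested, Silver holds cleaned and deduplicated records, and Gold holds business-level aggregates.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": " east ", "amount": "10.50"},
    {"order_id": "2", "region": "West", "amount": "bad"},
    {"order_id": "3", "region": "east", "amount": "4.50"},
    {"order_id": "1", "region": " east ", "amount": "10.50"},  # duplicate
]


def to_silver(rows):
    """Clean, type, and deduplicate Bronze rows."""
    seen, clean = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        if row["order_id"] in seen:
            continue  # drop repeated order IDs
        seen.add(row["order_id"])
        clean.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].strip().lower(),
            "amount": amount,
        })
    return clean


def to_gold(rows):
    """Aggregate Silver rows into total amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

In a real platform each layer would typically be a set of Lakehouse tables rather than in-memory lists; the sketch only shows how responsibility is divided between the layers.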
Data Processing & Engineering
- Develop data transformation logic using PySpark, Python, Spark SQL, and SQL
- Optimize T-SQL queries, stored procedures, and database performance
- Implement incremental loading, change data capture (CDC), and watermarking techniques
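As an illustration of the watermarking technique mentioned above, here is a minimal sketch in plain Python. It assumes each source row carries a `modified_at` timestamp; that column name, the sample rows, and the helper `incremental_load` are hypothetical. Each run picks up only rows changed since the stored watermark and then advances the watermark.

```python
from datetime import datetime


def incremental_load(source_rows, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark."""
    new_rows = [r for r in source_rows if r["modified_at"] > last_watermark]
    # If nothing changed, keep the previous watermark.
    new_watermark = max(
        (r["modified_at"] for r in new_rows), default=last_watermark
    )
    return new_rows, new_watermark


# Hypothetical source table.
source = [
    {"id": 1, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "modified_at": datetime(2024, 1, 5)},
    {"id": 3, "modified_at": datetime(2024, 1, 9)},
]

# First run: load everything changed after 3 January.
rows, watermark = incremental_load(source, datetime(2024, 1, 3))

# Second run with the advanced watermark: nothing new to load.
rows_again, watermark_again = incremental_load(source, watermark)
```

In practice the watermark would be persisted in a control table or pipeline variable between runs, and the filter would be pushed down to the source query rather than applied in memory.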
Cloud Integration & Storage Management
- Work with Azure Data Lake Storage Gen2, Azure Blob Storage, and OneLake
- Manage structured and unstructured data across distributed storage systems
- Integrate datasets with Power BI for analytics and reporting
Monitoring, Governance & Optimization
- Implement data quality checks, monitoring, logging, and error handling frameworks
- Use Azure and Fabric monitoring tools to identify and resolve performance bottlenecks
- Ensure compliance with data governance, lineage, and security standards (Microsoft Purview preferred)
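The data quality, logging, and error handling responsibilities above could look something like the following minimal sketch. It uses only the Python standard library; the rule (required fields must be non-empty), the sample rows, and the helper `run_quality_checks` are assumptions made for illustration, not a prescribed framework.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("dq_checks")


def run_quality_checks(rows, required_fields):
    """Split rows into valid and rejected, logging each rejection."""
    valid, rejected = [], []
    for row in rows:
        # A field fails the check if it is absent, None, or an empty string.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            logger.warning("Rejected row %r: missing %s", row, missing)
            rejected.append(row)
        else:
            valid.append(row)
    logger.info(
        "Quality check finished: %d valid, %d rejected",
        len(valid), len(rejected),
    )
    return valid, rejected


# Hypothetical input batch.
valid_rows, rejected_rows = run_quality_checks(
    [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": ""},
        {"id": None, "email": "c@example.com"},
    ],
    required_fields=("id", "email"),
)
```

Routing rejected rows to a separate list, rather than raising on the first failure, keeps the pipeline running while still leaving an audit trail in the logs.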
Migration & Modernization
- Support migration from Azure Data Factory and Databricks to Microsoft Fabric
- Contribute to the modernization of enterprise data platforms
- Ensure a smooth transition with minimal disruption to existing data pipelines
Collaboration & Leadership
- Work closely with architects, analysts, and business stakeholders
- Translate business requirements into scalable technical solutions
- Provide technical guidance and mentorship to junior engineers