Immediate Joiners
Location: Bangalore and Pune
Work Mode: Hybrid
Roles & Responsibilities
- Design, develop, and optimize data pipelines using Azure Databricks and Azure Data Factory (ADF); a short PySpark sketch of such a pipeline follows this list.
- Build and manage large-scale distributed data processing solutions using Apache Spark.
- Develop ETL/ELT workflows for structured and unstructured data.
- Implement and manage data ingestion, transformation, and orchestration pipelines.
- Work with data lakes (Azure Data Lake Storage Gen2) and data warehousing solutions.
- Optimize performance of Spark jobs and Databricks clusters.
- Collaborate with data architects and stakeholders to design scalable data solutions.
- Ensure data quality and adherence to governance and security best practices.
- Troubleshoot and resolve data pipeline issues.
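
For illustration, a minimal PySpark sketch of the kind of ingestion-and-transform pipeline described above. The storage account ("examplelake"), container names, and columns are hypothetical, and Delta is assumed as the target format (the default table format on Databricks runtimes).

```python
# Hypothetical storage account, containers, and schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Ingest raw CSV files landed in ADLS Gen2 via the abfss:// scheme.
raw = (spark.read
       .option("header", "true")
       .csv("abfss://raw@examplelake.dfs.core.windows.net/orders/"))

# Transform: cast types, derive a partition column, drop duplicate orders.
clean = (raw
         .withColumn("order_ts", F.to_timestamp("order_ts"))
         .withColumn("amount", F.col("amount").cast("double"))
         .withColumn("order_date", F.to_date("order_ts"))
         .dropDuplicates(["order_id"]))

# Persist a date-partitioned Delta table for downstream consumers.
(clean.write
 .format("delta")
 .mode("overwrite")
 .partitionBy("order_date")
 .save("abfss://curated@examplelake.dfs.core.windows.net/orders/"))
```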
Required Skills & Experience
- 6+ years of experience in data engineering or big data technologies.
- Strong hands-on experience with:
  - Azure Databricks
  - Azure Data Factory (ADF)
  - Apache Spark (PySpark/Scala)
- Good understanding of distributed computing and data processing models.
- Experience with Azure Data Lake Storage (ADLS Gen2).
- Strong SQL skills and experience with data modeling.
- Experience in building scalable ETL/ELT pipelines.
- Knowledge of data partitioning, performance tuning, and optimization techniques (a tuning sketch follows this list).
- Familiarity with CI/CD pipelines (Azure DevOps).
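
As a hedged sketch of the partitioning and performance-tuning techniques listed above, the snippet below enables Adaptive Query Execution and broadcasts a small dimension table to avoid a shuffle join; the table paths and join keys are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("tuning-example").getOrCreate()

# Adaptive Query Execution coalesces shuffle partitions and can switch
# join strategies at runtime (enabled by default on recent Spark versions).
spark.conf.set("spark.sql.adaptive.enabled", "true")

orders = spark.read.format("delta").load("/mnt/curated/orders")        # large fact table
customers = spark.read.format("delta").load("/mnt/curated/customers")  # small dimension

# Broadcast the small dimension table so the join avoids a full shuffle.
enriched = orders.join(F.broadcast(customers), "customer_id")

# Repartition on the partition column so output files align with the layout.
(enriched.repartition("order_date")
 .write.format("delta")
 .mode("overwrite")
 .partitionBy("order_date")
 .save("/mnt/curated/orders_enriched"))
```

Broadcasting suits dimension tables small enough to fit in executor memory; for large-to-large joins, repartitioning on the join key or salting skewed keys are the usual alternatives.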
Preferred Skills
- Experience with Delta Lake and Lakehouse architecture.
- Knowledge of streaming data processing (Spark Structured Streaming, Kafka, Azure Event Hubs); a streaming sketch follows this list.
- Experience with Power BI or other visualization tools.
- Familiarity with programming in Python, Scala, or SQL.
- Understanding of data governance and security frameworks.
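
For the streaming item above, a minimal Structured Streaming sketch using Spark's built-in Kafka source (Azure Event Hubs also exposes a Kafka-compatible endpoint, so the same source can point at an Event Hubs namespace). The broker address, topic, and paths are hypothetical; the Kafka connector is bundled on Databricks runtimes.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-ingest").getOrCreate()

# Read from a Kafka-compatible endpoint; broker and topic are hypothetical.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker.example.com:9092")
          .option("subscribe", "orders")
          .load())

# Kafka delivers key/value as binary; cast the payload before parsing.
parsed = events.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp").alias("event_ts"),
)

# Append into a Delta table; the checkpoint enables exactly-once delivery.
query = (parsed.writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/orders_stream")
         .outputMode("append")
         .start("/mnt/curated/orders_stream"))

query.awaitTermination()
```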