Role: Data Engineer
Experience: 6-7 Years
Location: Remote
Working Hours: 11:00 AM to 9:00 PM IST
Role Overview
We are looking for an experienced Data Engineer with strong expertise in Azure Databricks, Azure Data Factory (ADF), Apache Spark, Python, and SQL. The role focuses on building scalable data pipelines, optimizing distributed data workflows, and enabling analytics and automation across the Azure ecosystem.
Key Responsibilities
Data Engineering & Pipeline Development
- Design, develop, and maintain scalable data pipelines using Azure Databricks and Apache Spark.
- Build and orchestrate workflows using Azure Data Factory (ADF).
Data Integration
- Ingest and process structured, semi-structured, and unstructured data.
- Implement ETL/ELT pipelines using Delta Lake and Azure Data Lake.
Performance Optimization
- Optimize Spark jobs for performance, reliability, and cost efficiency.
- Troubleshoot distributed processing bottlenecks.
Automation & DevOps
- Automate data ingestion, transformation, and deployment workflows.
- Implement CI/CD pipelines using GitHub Actions or similar tools.
Collaboration & Enablement
- Work closely with data analysts, business users, and citizen developers.
- Support self-service analytics and low-code/no-code automation initiatives.
Governance & Security
- Ensure compliance with data governance, security, and privacy standards.
- Work with metadata, lineage, and governance tools.
Required Technical Skills
Must-Have
- Azure Databricks
- Azure Data Factory (ADF)
- Apache Spark
- Python
- SQL
Good-to-Have
- Power Apps, Power Automate, Azure Logic Apps
- MLflow
- Delta Lake
- CI/CD pipelines
- Machine Learning workflows
Preferred Qualifications
- Experience building CI/CD pipelines for Databricks.
- Knowledge of Azure Purview or similar governance tools.
- Experience working in Agile environments.