We are seeking an experienced Data Engineer with strong expertise in Spark, Scala/Python, Airflow, and Azure Cloud to design, develop, and optimize scalable data processing solutions. The ideal candidate has hands-on experience building large-scale data pipelines, tuning Spark applications, configuring clusters, and implementing cloud-based data engineering solutions.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines and ETL/ELT workflows using Spark, Scala, and Python.
- Build high-performance batch and streaming data processing solutions for large-scale datasets.
- Configure, optimize, and troubleshoot Spark applications, cluster settings, and resource management for maximum performance.
- Develop and manage workflow orchestration using Apache Airflow.
- Implement data engineering solutions on Azure Cloud services.
- Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver robust solutions.
- Monitor, maintain, and enhance data pipeline reliability, scalability, and performance.
- Perform code reviews and ensure adherence to coding standards and best practices.
- Troubleshoot production issues and provide timely resolutions.
- Contribute to the design and implementation of data platform architecture and modernization initiatives.
Required Skills
- 5–8 years of experience in Data Engineering and Big Data technologies.
- Strong hands-on experience with Apache Spark, including:
  - Spark architecture and internals
  - Spark performance tuning
  - Spark configuration and cluster optimization
  - Resource allocation and workload management
- Good experience in Scala and/or Python development.
- Experience with Apache Airflow for workflow orchestration and scheduling.
- Strong knowledge of Azure Cloud services and the data engineering ecosystem.
- Experience with distributed data processing and large-scale data platforms.
- Good understanding of ETL/ELT concepts and data pipeline development.
- Strong problem-solving and debugging skills.
Preferred Qualifications
- Experience with Azure Data Lake, Azure Databricks, Azure Synapse, or related Azure data services.
- Knowledge of CI/CD practices and DevOps methodologies.
- Experience working in Agile/Scrum environments.
- Familiarity with data warehousing concepts and data modeling.
- Exposure to real-time data processing frameworks is a plus.
Educational Qualification
- Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
Key Competencies
- Strong analytical and troubleshooting skills.
- Ability to work independently and in a collaborative team environment.
- Excellent communication and stakeholder management skills.
- Focus on performance optimization, scalability, and solution quality.
Skills: Azure, Spark, Scala