- Design, develop, and maintain data pipelines using Databricks and Apache Spark.
- Build scalable data ingestion, transformation, and processing frameworks on AWS.
- Work with AWS services such as S3, Glue, Lambda, EMR, Redshift, and EC2.
- Develop and optimize ETL/ELT processes for large-scale structured and unstructured data.
- Implement Delta Lake architecture and optimize performance in Databricks (a minimal sketch follows this list).
- Collaborate with data architects, data scientists, and business teams to deliver data solutions.
- Ensure data quality, governance, and security standards across data platforms.
- Monitor and troubleshoot data workflows and performance issues.
- Implement CI/CD pipelines and DevOps practices for data engineering workflows.
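
For illustration, here is a minimal PySpark sketch of the ingest-transform-load pattern described above, assuming either a Databricks cluster (where `spark` and Delta Lake come preconfigured) or a local environment with the `delta-spark` package installed. The S3 paths, column names, and partitioning scheme are hypothetical placeholders, not a prescribed design:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Build a Spark session with Delta Lake support; on Databricks the
# provided `spark` session already has this configuration.
spark = (
    SparkSession.builder.appName("orders-etl")
    .config("spark.sql.extensions",
            "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Ingest: read raw JSON from S3 (bucket and prefix are placeholders).
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: derive a partition date and drop invalid records.
cleaned = (
    raw.withColumn("order_date", F.to_date("order_timestamp"))
       .filter(F.col("amount") > 0)
)

# Load: write a partitioned Delta table; in a production pipeline,
# incremental MERGE and OPTIMIZE steps would typically follow.
(cleaned.write.format("delta")
        .mode("overwrite")
        .partitionBy("order_date")
        .save("s3://example-bucket/delta/orders/"))
```

Partitioning by a date column is a common choice here because downstream queries usually filter on recency, letting Spark prune partitions instead of scanning the full table.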
Required Skills
- Strong hands-on experience with Databricks.
- Expertise in AWS cloud services (S3, Glue, Lambda, EMR, Redshift, IAM).
- Proficiency in Python, PySpark, SQL, and Scala for data processing.
- Experience with ETL/ELT frameworks and data pipeline development.
- Knowledge of Delta Lake, data lake architecture, and big data technologies.
- Experience working with large-scale distributed data processing systems.
- Familiarity with workflow orchestration tools such as Apache Airflow (see the sketch below).
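
As a sketch of the orchestration skill above, the following Airflow DAG triggers an existing Databricks job on a nightly schedule, assuming the `apache-airflow-providers-databricks` package is installed. The DAG id, connection id, and `job_id` are hypothetical values that would come from your Airflow and Databricks configuration:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksRunNowOperator,
)

# Nightly orchestration of a Databricks job (Airflow 2.4+ `schedule` syntax).
with DAG(
    dag_id="nightly_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",  # run at 02:00 daily
    catchup=False,
) as dag:
    run_etl = DatabricksRunNowOperator(
        task_id="run_orders_etl",
        databricks_conn_id="databricks_default",  # Airflow connection to the workspace
        job_id=12345,  # hypothetical Databricks job id
    )
```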
Skills: Development, Databricks, AWS