Develop and implement ETL/ELT data pipelines using Python, PySpark, and SQL within the Databricks ecosystem, leveraging features like Unity Catalog and Delta Lake.
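A minimal sketch of such a pipeline step, assuming a hypothetical raw landing path and a hypothetical Unity Catalog table (main.sales.orders); neither name comes from this posting:

```python
# Minimal ETL sketch: raw JSON -> cleaned Delta table in Unity Catalog.
# The path and the three-level table name are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw = spark.read.json("s3://raw-bucket/orders/")  # hypothetical landing zone

cleaned = (
    raw
    .filter(F.col("order_id").isNotNull())           # drop incomplete records
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .dropDuplicates(["order_id"])
)

# Write as a managed Delta table governed by Unity Catalog
# (three-level namespace: catalog.schema.table).
cleaned.write.format("delta").mode("overwrite").saveAsTable("main.sales.orders")
```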
Design and deploy scalable solutions using Databricks Asset Bundles (DABs) and Spark APIs, and build efficient workflow orchestration (e.g., with Airflow).
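One way this orchestration can look is an Airflow DAG that triggers a Databricks job (for example, one deployed via a DAB). This sketch assumes the apache-airflow-providers-databricks package, a configured "databricks_default" connection, and a hypothetical job_id:

```python
# Airflow DAG sketch that triggers an existing Databricks job on a daily schedule.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="nightly_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_job = DatabricksRunNowOperator(
        task_id="run_orders_job",
        databricks_conn_id="databricks_default",
        job_id=123456,  # hypothetical job, e.g. deployed with `databricks bundle deploy`
    )
```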
Optimize complex queries and data processing jobs for performance using advanced SQL and PySpark performance-tuning techniques.
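As one concrete example, a common PySpark tuning move is broadcasting a small dimension table to avoid shuffling the large side of a join, then checking the physical plan. Table names here are illustrative:

```python
# Tuning sketch: broadcast hash join instead of a shuffle (sort-merge) join.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

facts = spark.table("main.sales.orders")     # large fact table (assumed)
dims = spark.table("main.sales.customers")   # small dimension table (assumed)

# Hint Spark to broadcast the small side so the large table is not shuffled.
joined = facts.join(broadcast(dims), "customer_id")

joined.explain()  # verify BroadcastHashJoin appears in the physical plan
```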
Manage data ingestion and storage, interacting with AWS S3 to read from and write to Delta Lake and Snowflake.
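A sketch of that flow, reading a Delta table from S3 and appending it to Snowflake via the Spark–Snowflake connector available on Databricks; all connection values are placeholders and would normally come from a secret scope:

```python
# Ingestion sketch: Delta on S3 -> Snowflake. Paths and credentials are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.format("delta").load("s3://lake-bucket/delta/orders")  # hypothetical path

sf_options = {
    "sfUrl": "myaccount.snowflakecomputing.com",  # placeholder account URL
    "sfUser": "etl_user",
    "sfPassword": "<from-secret-scope>",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
}

(
    df.write.format("snowflake")
    .options(**sf_options)
    .option("dbtable", "ORDERS")
    .mode("append")
    .save()
)
```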
Employ best practices in Git/GitHub for version control and collaborative development.
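In practice this often means a feature-branch workflow reviewed through pull requests; the branch and file names below are illustrative, and this is one common convention rather than a prescribed process:

```bash
# Typical feature-branch workflow for collaborative review on GitHub.
git checkout -b feature/orders-dedup        # isolate work on a short-lived branch
git add etl/orders.py
git commit -m "Deduplicate orders on order_id before Delta write"
git push -u origin feature/orders-dedup     # then open a pull request for review
```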
A strong foundation in Core Python is essential; experience with Object-Oriented Programming (OOP) is a plus.
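To illustrate the kind of OOP structure that often shows up in pipeline code, here is a small sketch of an abstract base class with pluggable transform steps; the class and method names are hypothetical, not from this posting:

```python
# OOP sketch: each pipeline step is a small, independently testable class.
from abc import ABC, abstractmethod

from pyspark.sql import DataFrame
from pyspark.sql import functions as F


class Transform(ABC):
    """One reusable step in an ETL pipeline."""

    @abstractmethod
    def apply(self, df: DataFrame) -> DataFrame: ...


class DropNullKeys(Transform):
    def __init__(self, key: str):
        self.key = key

    def apply(self, df: DataFrame) -> DataFrame:
        return df.filter(F.col(self.key).isNotNull())


def run(df: DataFrame, steps: list[Transform]) -> DataFrame:
    # Apply each step in order; new steps plug in without changing this loop.
    for step in steps:
        df = step.apply(df)
    return df
```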