Role Purpose
The incumbent is responsible for designing, developing and optimising data engineering
solutions on the Azure Databricks platform using PySpark, implementing medallion
architecture patterns (Bronze, Silver, Gold), scheduling and orchestrating data workflows,
and delivering quality analytics and reporting solutions to the business.
Requirements
- Exposure to the full data engineering and analytics development life cycle
- 5 years of core experience in Data Engineering, Data Analytics, or Business Intelligence
- Azure Databricks and PySpark experience essential; experience with Databricks
Workflows (job scheduling and orchestration) required
- Strong knowledge of PySpark, Python, SQL, and data pipeline orchestration using
Databricks Workflows
- Good understanding of medallion architecture (Lakehouse), Delta Lake, data
modelling, and supporting areas (Data Transformation, Governance and Reporting)
- Experience with Power BI or Databricks SQL for reporting and analytics preferred
Duties and Responsibilities
- Participate in the analysis, design, development, troubleshooting and support of the
enterprise Databricks data platform and analytics environment
- Design, build, test and implement data pipelines using PySpark and Databricks
notebooks, following medallion architecture patterns (Bronze, Silver, Gold) to
transform raw data into curated, analytics-ready datasets (see the medallion
sketch following this list)
- Develop and maintain data solutions using PySpark, SQL, Databricks Workflows, Unity
Catalog and Delta Lake on the Azure Databricks platform
- Integrate with diverse source systems (including, but not limited to, in-house,
vendor-based, on-premises, cloud-based, and Office 365 systems)
- Configure, schedule and monitor Databricks Workflows (jobs) to ensure reliable and
timely data processing (see the job-definition sketch following this list), and
collaborate with DevOps on CI/CD practices for notebook and pipeline deployments
- Perform day-to-day data engineering tasks, including developing and optimising
PySpark transformations, managing Delta tables, and maintaining data quality
across the medallion layers
- Maintain and evolve the medallion architecture data models, Unity Catalog
governance structures, and analytics-ready Gold layer datasets
- Apply Spark performance tuning techniques (partitioning, caching, broadcast joins,
cluster sizing) to optimise data pipeline throughput and cost efficiency (see the
tuning sketch following this list)
- Engage directly with business stakeholders to gather requirements, provide
data-driven recommendations, and translate business needs into technical solutions.
Assist the lead developer in coordinating team efforts to achieve business
objectives (strategic and operational)
- Maintain business continuity documentation in Azure DevOps
- Enforce data security and governance standards through Unity Catalog, access
controls and data classification (see the grants sketch following this list),
ensuring adherence by all team members
- Review code implementations, PySpark notebooks and pipeline designs produced by
the Data team, ensuring adherence to best practices and coding standards
- Oversee the quality of data engineering solutions delivered by junior team members,
providing constructive feedback and guidance
- Actively mentor Junior and Intermediate team members, sharing knowledge on
Databricks, PySpark, medallion architecture and data engineering best practices
- Drive implementation of innovative data platform capabilities and Databricks features
(e.g., Delta Live Tables, Databricks SQL, ML integrations) through the Lead Developer
and Enterprise Architecture team
- Involve junior team members in delivering large-scale project developments and
implementations, in consultation with the lead developer
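Purely as illustration of the kind of work described above, a minimal PySpark sketch of a Bronze-Silver-Gold medallion flow on Delta Lake; the source path, catalog, table names and columns (sales_raw, order_id, amount, order_date) are hypothetical, and it assumes a Databricks environment where the spark session and Delta Lake are already available.

    from pyspark.sql import functions as F

    # Bronze: land raw files as-is, adding an ingestion timestamp for lineage
    bronze = (spark.read.format("json").load("/mnt/raw/sales/")  # hypothetical path
              .withColumn("_ingested_at", F.current_timestamp()))
    bronze.write.format("delta").mode("append").saveAsTable("main.bronze.sales_raw")

    # Silver: deduplicate, enforce types and drop invalid records
    silver = (spark.table("main.bronze.sales_raw")
              .dropDuplicates(["order_id"])
              .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
              .filter(F.col("order_id").isNotNull()))
    silver.write.format("delta").mode("overwrite").saveAsTable("main.silver.sales")

    # Gold: business-level aggregate ready for reporting
    gold = (spark.table("main.silver.sales")
            .groupBy("order_date")
            .agg(F.sum("amount").alias("daily_revenue")))
    gold.write.format("delta").mode("overwrite").saveAsTable("main.gold.sales_daily")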
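Similarly illustrative, a sketch of scheduling and orchestration with Databricks Workflows, expressed as a Jobs API 2.1 create payload submitted from Python; the workspace URL, token, notebook paths, cluster id, cron schedule and notification address are hypothetical placeholders.

    import requests

    # Hypothetical workspace URL and token; in practice sourced from secrets/config
    host = "https://adb-1234567890123456.7.azuredatabricks.net"
    token = "<personal-access-token>"

    job_spec = {
        "name": "sales_medallion_pipeline",
        "tasks": [
            {
                "task_key": "bronze_to_silver",
                "notebook_task": {"notebook_path": "/Repos/data/bronze_to_silver"},
                "existing_cluster_id": "0101-123456-abcdefgh",
            },
            {
                "task_key": "silver_to_gold",
                "depends_on": [{"task_key": "bronze_to_silver"}],
                "notebook_task": {"notebook_path": "/Repos/data/silver_to_gold"},
                "existing_cluster_id": "0101-123456-abcdefgh",
            },
        ],
        # Daily run at 02:00, expressed in the quartz cron syntax Databricks uses
        "schedule": {
            "quartz_cron_expression": "0 0 2 * * ?",
            "timezone_id": "Africa/Johannesburg",
        },
        "email_notifications": {"on_failure": ["data-team@example.com"]},
    }

    resp = requests.post(f"{host}/api/2.1/jobs/create",
                         headers={"Authorization": f"Bearer {token}"},
                         json=job_spec)
    resp.raise_for_status()
    print("Created job:", resp.json()["job_id"])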
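A sketch of the tuning techniques named above (broadcast joins, partitioning, caching), again with hypothetical table names and a hypothetical partition column; cluster sizing itself is set on the job or cluster configuration rather than in code.

    from pyspark.sql.functions import broadcast

    sales = spark.table("main.silver.sales")
    stores = spark.table("main.silver.stores")   # small dimension table

    # Broadcast join: ship the small dimension to every executor, avoiding a shuffle
    enriched = sales.join(broadcast(stores), on="store_id", how="left")

    # Cache a dataframe that several downstream aggregations will reuse
    enriched.cache()

    # Repartition on the column most queries filter by, and write partitioned Delta
    (enriched.repartition("order_date")
             .write.format("delta")
             .partitionBy("order_date")
             .mode("overwrite")
             .saveAsTable("main.gold.sales_enriched"))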
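And a sketch of Unity Catalog access control using standard GRANT statements; the catalog, schema, table and group names are hypothetical.

    # Read-only access to the curated Gold layer for an analyst group
    spark.sql("GRANT USE CATALOG ON CATALOG main TO `analysts`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA main.gold TO `analysts`")
    spark.sql("GRANT SELECT ON TABLE main.gold.sales_daily TO `analysts`")

    # Broader rights on the Silver schema for the engineers who maintain it
    spark.sql("GRANT USE SCHEMA, SELECT, MODIFY ON SCHEMA main.silver TO `data_engineers`")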
As an applicant, please verify the legitimacy of this job advert on our company career page.