Experience building data pipelines in the cloud involving ingestion from varied data sources
Should be familiar with target and intermediate data warehouse structures, ideally in environments using data lakes
Well versed in Databricks, Azure SQL Database, Azure Data Factory, and Azure Data Lake
Must have hands-on experience in PySpark, Spark SQL, and T-SQL
Should be able to work through requirements and design specs and create appropriate test scenarios across the various layers/stages of data pipelines
Should be able to write reusable PySpark and T-SQL scripts to validate data across layers
Experience working with Databricks Delta Lake is highly desirable
Ideally should have worked in a team using Agile methodology
Appreciation of data, its quality, and its usage for business benefit
Very good verbal and written communication skills
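
For illustration, the cross-layer validation work described above might look like the following. This is a minimal plain-Python sketch of the comparison logic only; in practice the same checks would run over PySpark DataFrames (e.g. via `df.count()` and key-set anti-joins). All names here (`validate_layers`, the `raw`/`curated` sample data) are hypothetical, not part of the role description.

```python
def validate_layers(source_rows, target_rows, key="id"):
    """Compare two pipeline layers: row counts and key-set differences.

    source_rows / target_rows are lists of dict records standing in for
    the rows of two DataFrames at adjacent pipeline stages.
    """
    result = {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
    }
    src_keys = {r[key] for r in source_rows}
    tgt_keys = {r[key] for r in target_rows}
    # Keys present upstream but dropped downstream, and vice versa
    result["missing_in_target"] = sorted(src_keys - tgt_keys)
    result["unexpected_in_target"] = sorted(tgt_keys - src_keys)
    result["passed"] = (
        result["source_count"] == result["target_count"]
        and not result["missing_in_target"]
        and not result["unexpected_in_target"]
    )
    return result

# Hypothetical sample: the curated layer has dropped record id=2
raw = [{"id": 1}, {"id": 2}, {"id": 3}]
curated = [{"id": 1}, {"id": 3}]
report = validate_layers(raw, curated)
```

A reusable script in this spirit, parameterised by layer name and key column, is what would typically be run at each ingestion/curation stage of the pipeline.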