End-to-End Validation: Validate data as it moves through ingestion, transformation (ETL/ELT), and final reporting.
SQL Mastery: Write complex SQL queries to perform source-to-target mapping, data profiling, and regression testing.
Architectural Analysis: Verify data integrity across the Data Lakehouse (Cloudera Data Platform, CDP), ensuring consistency between source systems and the lakehouse.
Defect Management: Identify, document, and track data anomalies, working closely with Data Engineers to identify root causes and ensure defect closure.
Business Alignment: Understand business logic to ensure the data produced actually meets the needs of stakeholders and analysts.
(Nice to have) Automated Testing: Design and implement automated test scripts to monitor data health and performance.
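As an illustration of the source-to-target checks listed above, the following is a minimal sketch only. It uses Python's standard-library sqlite3 module as a stand-in for the real source and lakehouse engines. The table names (src_orders, tgt_orders), the columns, and the helpers reconcile and build_demo_db are all hypothetical and not part of any actual pipeline.

```python
import sqlite3


def reconcile(conn, source, target, key, columns):
    """Compare a source table with its target copy (all names are illustrative).

    Identifiers are interpolated directly into the SQL, so this sketch is only
    suitable for trusted, hard-coded table and column names.
    """
    cur = conn.cursor()

    # Row-count check: a quick first signal that rows were dropped or duplicated.
    source_rows = cur.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    target_rows = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]

    # Keys present in the source but absent from the target.
    missing = [row[0] for row in cur.execute(
        f"SELECT s.{key} FROM {source} s "
        f"LEFT JOIN {target} t ON s.{key} = t.{key} "
        f"WHERE t.{key} IS NULL ORDER BY s.{key}"
    )]

    # Keys present in both tables whose compared columns differ.
    # IS NOT is SQLite's NULL-safe inequality; other engines spell this differently.
    differs = " OR ".join(f"s.{c} IS NOT t.{c}" for c in columns)
    mismatched = [row[0] for row in cur.execute(
        f"SELECT s.{key} FROM {source} s "
        f"JOIN {target} t ON s.{key} = t.{key} "
        f"WHERE {differs} ORDER BY s.{key}"
    )]

    return {
        "source_rows": source_rows,
        "target_rows": target_rows,
        "missing_in_target": missing,
        "mismatched": mismatched,
    }


def build_demo_db():
    """Create an in-memory database with made-up source and target tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE tgt_orders (order_id INTEGER PRIMARY KEY, amount REAL);
        INSERT INTO src_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        INSERT INTO tgt_orders VALUES (1, 10.0), (2, 25.0);
    """)
    return conn


result = reconcile(build_demo_db(), "src_orders", "tgt_orders", "order_id", ["amount"])
print(result)
```

With this made-up data, order 3 never reached the target and order 2 arrived with a different amount, so both are reported. The same function can be rerun after each load as a simple regression check.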
Technical Skills:
Concepts: Strong grasp of Data Modeling (Star/Snowflake schemas) and the nuances of Data Lakehouses.
Expert-level SQL: Joins, window functions, CTEs, and query optimization.
Tools: Familiarity with CDP / Hadoop, Oracle, SQL Server or any other database.
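To give a sense of the SQL level expected, here is a small sketch that combines a CTE with a window function to profile a staging table for duplicate business keys. It is run through Python's sqlite3 module purely for convenience and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The stg_customers table, its columns, and the helper names are hypothetical.

```python
import sqlite3

# The CTE ranks rows per customer_id from newest to oldest load.
# Any row with rn > 1 is an older duplicate of the same business key.
STALE_DUPLICATES_SQL = """
WITH ranked AS (
    SELECT customer_id,
           email,
           loaded_at,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY loaded_at DESC
           ) AS rn
    FROM stg_customers
)
SELECT customer_id, email, loaded_at
FROM ranked
WHERE rn > 1
ORDER BY customer_id, loaded_at
"""


def build_customers_db():
    """Create an in-memory staging table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stg_customers (customer_id INTEGER, email TEXT, loaded_at TEXT);
        INSERT INTO stg_customers VALUES
            (1, 'a@example.com',     '2024-01-01'),
            (1, 'a.new@example.com', '2024-02-01'),
            (2, 'b@example.com',     '2024-01-15');
    """)
    return conn


def find_stale_duplicates(conn):
    """Return the older rows for every customer_id that appears more than once."""
    return conn.execute(STALE_DUPLICATES_SQL).fetchall()


stale = find_stale_duplicates(build_customers_db())
print(stale)
```

Here customer 1 was loaded twice, so the January row is flagged as the stale copy while the February row is treated as current. On CDP, Oracle, or SQL Server the query shape would be similar, though date types and syntax details vary by engine.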