Role Overview
The Data Architect owns the end-to-end data strategy for the DRT Modernization program, a migration of the internal DPP Recruiting Tool from Appian SaaS (MariaDB backend) to a cloud-native Azure stack (Azure SQL, Azure Data Factory, Azure Data Lake Storage). The role spans schema design, data migration planning, ETL architecture, and governance, and involves working closely with .NET developers, integration engineers, and KPMG stakeholders.
Key Responsibilities
- Lead the design of the target-state Azure SQL schema (50–60 tables across 10 schema groups: ask, nomination, due_diligence, interview, onboarding, notification, admin, employee, audit, lookup).
- Produce and own the Entity Relationship Diagram (ERD), data dictionary, and DDL scripts for Azure SQL.
- Define the data migration strategy (big-bang cutover) from MariaDB/Appian to Azure SQL, including a rollback plan.
- Architect Azure Data Factory (ADF) pipelines for inbound XML data feeds (SFTP/ADM DataBus) and outbound CSV exports to Clarity SaaS.
- Design the Azure Data Lake Storage landing-zone structure (raw, cleansed, processed zones).
- Reverse-engineer and rewrite 20+ MariaDB stored procedures and views into T-SQL / ADF Data Flows.
- Establish a data validation and reconciliation framework to confirm migration completeness (<2 GB, 5+ years of history).
- Define partitioning, soft-delete, and archival patterns to meet the 7-year KPMG data-retention policy.
- Collaborate with the .NET team to define ORM conventions (Entity Framework Core / Dapper) and query performance standards.
- Review and approve data-tier pull requests; provide technical governance across data deliverables.
- Identify and mitigate data risks: undocumented SP logic, Appian audit-trail extraction, in-flight workflow states at cutover.
- Present architecture decisions and trade-offs to KPMG stakeholders and obtain sign-off.
Must Have
- Azure SQL Database — hands-on schema design, T-SQL, performance tuning, indexing strategies.
- Azure Data Factory (ADF) — building and operationalising complex pipelines: Copy Activity, Data Flows, SFTP connectors, XML/CSV transformations.
- Azure Data Lake Storage (ADLS Gen2) — landing-zone design, directory structure, access tiers.
- Data Migration — designing and executing large-scale schema migrations with reconciliation validation and rollback planning.
- Stored Procedure & View Migration — reverse-engineering and rewriting RDBMS stored procedures (MariaDB → T-SQL or equivalent).
- Enterprise Data Modeling — normalised relational schemas, ERDs, and data dictionaries for complex multi-domain applications.
- Cloud Data Architecture — end-to-end reference architectures on Azure (ingestion → store → serve).
- Stakeholder Communication — translating complex data designs into clear documentation for non-technical business stakeholders.
- SQL Performance & Governance — query optimisation, indexing, query-store analysis, schema-level security.
Nice to Have
- Experience with Appian SaaS platform data extraction or low-code-to-pro-code migration projects.
- Familiarity with MariaDB / MySQL syntax and data type mappings to SQL Server / Azure SQL.
- Knowledge of Azure Service Bus or event-driven integration patterns for workflow orchestration.
- Experience with Azure Key Vault, Azure Monitor, and Application Insights for data platform observability.
- Understanding of Azure Active Directory / KPMG SSO patterns for data access control.
- Exposure to EF Core / Dapper and how ORM patterns interact with relational schema design.
- Background at KPMG or another Big-4 professional services firm, or in the financial services sector.
- Familiarity with KPMG's DataBus / ADM integration ecosystem.
- Azure Data Engineer Associate (DP-203) or Azure Solutions Architect Expert certification.
Soft Skills & Ways Of Working
- Proven ability to make pragmatic architectural trade-offs (e.g., shared-DB vs. DB-per-service for current scale).
- Strong written communication — able to produce architecture documents, ADRs, and data dictionaries independently.
- Collaborative: comfortable aligning with KPMG architects, .NET leads, and QA teams simultaneously.
- Detail-oriented: able to track 50+ table dependencies, foreign-key chains, and migration sequencing.