Job Title: Lead Data Domain Architect - Finance Data and Insights
Position Overview: We are looking for an engineer to design and optimize Databricks DDL models that power FORCE analytics, ML, NLP, LLM, and dashboard use cases. You will enforce modeling standards, validation/reconciliation controls, and audit-ready documentation, drive effective collaboration, and tie delivery to measurable OKRs.
As an Analytics Solution Associate within our analytics team, you will design and maintain Databricks DDL models and promote performance engineering across our data platforms.
Job Responsibilities:
- Design and maintain Databricks DDL models: Define normalized schemas, SCDs, and surrogate keys
- Performance engineering: Implement partitioning/ZORDER/OPTIMIZE, constraints, and caching to reduce shuffle and skew.
- Medallion architecture: Engineer bronze/silver/gold transformations with reproducible pipelines.
- Data quality and reconciliation: Evaluate and recommend data quality harnesses (e.g., Great Expectations/Deequ), Atoti checks, and hierarchy reconciliation.
- Documentation excellence: Maintain audit-ready DRDs/DDLs, standards, and runbooks on a monthly update cadence.
- Collaboration and handoffs: Participate in joint planning and reviews and manage efficient handoffs.
- Advanced analytics/NLQ enablement: Ensure models support NLQ and ML use cases.
- Release/change readiness: Partner on release gates, rollback plans, change controls.
- Mentoring and training: Lead workshops on modeling patterns and performance.
Required qualifications, capabilities, and skills
- 6+ years in data modeling for analytics, including 3+ years leading modeling initiatives
- Expert SQL and Databricks (Delta Lake) optimization techniques (partitioning, ZORDER, OPTIMIZE, VACUUM)
- ERWIN or equivalent modeling tools; Git-based workflows and Continuous Integration/Continuous Deployment (CI/CD) for data (tests, linting, code reviews)
- Proven validation/reconciliation frameworks and audit‑ready documentation
- Collaboration, communication, and Agile delivery proficiency
Preferred qualifications, capabilities, and skills
- SQL, PySpark, and Python fluency; Great Expectations/Deequ; ThoughtSpot/Tableau/Sigma consumption patterns
- Schema evolution and Change Data Capture (CDC)