
Key responsibilities
1. Architecture and roadmap
Define reference architectures for lakehouse and medallion patterns using Delta Lake, OneLake, and Synapse/Fabric Lakehouse for scalable analytics and AI.
Create domain-driven data models, canonical schemas, and patterns for batch and streaming integration (bronze/silver/gold).
2. Platform design and build
Design ingestion frameworks for batch (ADF/Fabric Pipelines) and streaming (Event Hubs, Kafka, IoT Hub) into ADLS/OneLake with Delta and Change Data Capture.
Architect Databricks workloads (PySpark/Scala/SQL) for ETL/ELT, feature engineering, and ML data prep with robust job orchestration and scheduling.
3. Real-time streaming
Lead Structured Streaming architectures in Databricks with exactly-once semantics, watermarking, and stateful aggregations; design Kappa/Lambda architectures where appropriate.
Implement low-latency serving layers and materialized views for near-real-time analytics and operational reporting.
4. Microsoft Fabric implementation
Establish Fabric workspaces, Lakehouse, Pipelines, Dataflows Gen2, Shortcuts to ADLS/OneLake, and semantic model standards for governed self-service BI.
Define data product patterns integrating Fabric with Databricks and Power BI for governed, reusable datasets.
5. Data governance and security
Implement RBAC/ABAC, Unity Catalog, Purview (lineage, glossary, classifications), encryption, network isolation, and data masking/tokenization.
Define data quality SLAs, expectations, and contracts; embed quality checks, observability, and lineage in pipelines.
6. DevOps and FinOps
Standardize CI/CD (Azure DevOps/GitHub), environment strategy, IaC (Bicep/Terraform), cluster policies, and workspace baselines.
Optimize cost via right-sized clusters, autoscaling, Photon, Delta optimization/Z-Order, and job scheduling.
7. Delivery leadership
Lead design reviews, threat modeling, performance testing, and production readiness; mentor engineers and partner with product/enterprise architects.
Translate business requirements into technical designs, estimates, and roadmaps; drive stakeholder communication and risk management.
Required skills and experience
8 to 12 years in data engineering/architecture with 4+ years on the Azure data stack; strong leadership in complex enterprise programs.
Deep expertise
Databricks: PySpark/SQL, Delta Lake, Structured Streaming, Jobs/Workflows, Unity Catalog, cluster policies, performance tuning.
Azure: ADLS Gen2, Event Hubs/Kafka, Azure Functions/Logic Apps, Key Vault, ADF, Synapse; VNETs, Private Endpoints, Managed Identity.
Fabric: Lakehouse, OneLake, Pipelines, Dataflows Gen2, Shortcuts, semantic models, governance integration with Purview and Power BI.
Architecture patterns
Lakehouse, medallion, Data Mesh/data products, CDC with Debezium/Fivetran/ADF mapping data flows, SCD handling, schema evolution.
Batch and streaming design, watermarking, state store management, idempotency, backfills, and late/duplicate data handling.
Data management
Dimensional and semantic modeling, Data Vault/Kimball, query performance, partitioning, Z-Order, OPTIMIZE/VACUUM, file sizing.
DQ frameworks (Great Expectations/Deequ), monitoring/observability (Log Analytics, Databricks metrics), SLA/SLO design.
Security and compliance
Purview lineage and classification, Unity Catalog governance, PII/PHI handling, encryption, tokenization; audit, SOC2/ISO, GDPR/DPDP familiarity.
DevOps/IaC and automation
Git-based development, branch strategies, CI/CD for notebooks/SQL/artifacts, IaC for data resources, automated testing.
Communication and leadership
Strong stakeholder engagement, technical writing, solution estimation, and mentoring.
Nice to have
Experience with data products and mesh operating models; product lifecycle and contracts between producer/consumer domains.
ML/feature store integration (Databricks Feature Store), MLOps awareness for data readiness.
Knowledge of dbt, Terraform, Airflow, Confluent, and enterprise SSO/SCIM provisioning with Databricks/Fabric.
Qualifications
Bachelor's/Master's in Computer Science, Engineering, or related field.
Certifications: Azure Solutions Architect Expert, Azure Data Engineer Associate, Databricks Data Engineer Professional/Associate, Microsoft Fabric Data Engineer Associate.
Job ID: 134141239