
PwC India

Cloud Architect

  • Posted 2 days ago

Job Description

Key responsibilities

1. Architecture and roadmap

Define reference architectures for lakehouse and medallion patterns using Delta Lake, OneLake, and Synapse/Fabric Lakehouse for scalable analytics and AI.

Create domain-driven data models, canonical schemas, and patterns for batch and streaming integration (bronze/silver/gold).
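
For illustration, a minimal PySpark sketch of a bronze-to-silver hop in a medallion layout on Delta Lake; `spark` is assumed to be the Delta-enabled session a Databricks or Fabric notebook provides, and all paths and column names are hypothetical:

    # Bronze-to-silver promotion: de-duplicate, enforce types, apply validity rules.
    from pyspark.sql import functions as F

    bronze = spark.read.format("delta").load("/lakehouse/bronze/orders")

    silver = (
        bronze
        .dropDuplicates(["order_id"])                          # remove raw duplicates
        .withColumn("order_ts", F.to_timestamp("order_ts"))    # enforce canonical types
        .filter(F.col("order_id").isNotNull())                 # basic validity rule
    )

    silver.write.format("delta").mode("overwrite").save("/lakehouse/silver/orders")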

2. Platform design and build

Design ingestion frameworks for batch (ADF/Fabric Pipelines) and streaming (Event Hubs, Kafka, IoT Hub) into ADLS/OneLake with Delta and Change Data Capture.

Architect Databricks workloads (PySpark/Scala/SQL) for ETL/ELT, feature engineering, and ML data prep with robust job orchestration and scheduling.
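
As a sketch of the ingestion side, a Databricks Auto Loader stream landing files from ADLS into a bronze Delta table; storage paths, checkpoint locations, and table names are hypothetical, and `spark` is the notebook-provided session:

    # Incremental file ingestion from ADLS into bronze with Auto Loader.
    raw = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/lakehouse/_schemas/orders")
        .load("abfss://landing@account.dfs.core.windows.net/orders/")
    )

    (raw.writeStream
        .format("delta")
        .option("checkpointLocation", "/lakehouse/_checkpoints/bronze_orders")
        .trigger(availableNow=True)        # process backlog, then stop (batch-style run)
        .toTable("bronze.orders"))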

3. Real-time streaming

Lead Structured Streaming architectures in Databricks with exactly-once semantics, watermarking, and stateful aggregations; design Kappa/Lambda architectures where appropriate.

Implement low-latency serving layers and materialized views for near-real-time analytics and operational reporting.
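
A minimal sketch of such a stream, assuming a Kafka-compatible source (e.g., Event Hubs) with hypothetical broker and topic names; the watermark bounds state for late data, and the checkpoint plus Delta sink gives exactly-once end-to-end writes:

    # Windowed stateful aggregation with watermarking over a Kafka source.
    from pyspark.sql import functions as F

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "orders")
        .load()
        .select(F.col("timestamp").alias("event_ts"), F.col("value"))
    )

    counts = (
        events
        .withWatermark("event_ts", "10 minutes")       # tolerate 10 min of lateness
        .groupBy(F.window("event_ts", "5 minutes"))
        .count()
    )

    (counts.writeStream
        .format("delta")
        .outputMode("append")                          # emit windows once finalized
        .option("checkpointLocation", "/lakehouse/_checkpoints/order_counts")
        .start("/lakehouse/gold/order_counts"))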

4. Microsoft Fabric implementation

Establish Fabric workspaces, Lakehouse, Pipelines, Dataflows Gen2, Shortcuts to ADLS/OneLake, and semantic model standards for governed self-service BI.

Define data product patterns integrating Fabric with Databricks and Power BI for governed, reusable datasets.

5. Data governance and security

Implement RBAC/ABAC, Unity Catalog, Purview (lineage, glossary, classifications), encryption, network isolation, and data masking/tokenization.

Define data quality SLAs, expectations, and contracts; embed quality checks, observability, and lineage in pipelines.
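
One way to embed such expectations is Delta Live Tables, sketched below; this runs only inside a Databricks DLT pipeline, and the table and rule names are hypothetical:

    # Quality rules declared as DLT expectations on a silver table.
    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Orders with contract-level quality rules enforced")
    @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")   # violations dropped
    @dlt.expect("positive_amount", "amount > 0")                    # violations logged
    def silver_orders():
        return dlt.read_stream("bronze_orders").withColumn(
            "ingested_at", F.current_timestamp()
        )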

6. DevOps and FinOps

Standardize CI/CD (Azure DevOps/GitHub), environment strategy, IaC (Bicep/Terraform), cluster policies, and workspace baselines.

Optimize cost via right-sized clusters, autoscaling, Photon, Delta optimization/Z-Order, and job scheduling.
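
The Delta maintenance piece of that, as a short sketch with a hypothetical table and clustering column:

    # Compact small files and co-locate hot keys, then reclaim stale files.
    spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")   # compaction + Z-Order
    spark.sql("VACUUM silver.orders RETAIN 168 HOURS")            # drop files older than 7 days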

7. Delivery leadership

Lead design reviews, threat modeling, performance testing, and production readiness; mentor engineers and partner with product/enterprise architects.

Translate business requirements into technical designs, estimates, and roadmaps; drive stakeholder communication and risk management.

Required skills and experience

8–12 years in data engineering/architecture, with 4+ years on the Azure data stack; strong leadership in complex enterprise programs.

Deep expertise

Databricks: PySpark/SQL, Delta Lake, Structured Streaming, Jobs/Workflows, Unity Catalog, cluster policies, performance tuning.

Azure: ADLS Gen2, Event Hubs/Kafka, Azure Functions/Logic Apps, Key Vault, ADF, Synapse; VNETs, Private Endpoints, Managed Identity.

Fabric: Lakehouse, OneLake, Pipelines, Dataflows Gen2, Shortcuts, semantic models, governance integration with Purview and Power BI.

Architecture patterns

Lakehouse, medallion, Data Mesh/data products, CDC with Debezium/Fivetran/ADF mapping data flows, SCD handling, schema evolution.

Batch and streaming design, watermarking, state store management, idempotency, backfills, and late/duplicate data handling.
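
These patterns often meet in a single idempotent upsert: the sketch below de-duplicates a CDC batch (keeping the newest record per key) and ignores late-arriving stale updates via a Delta MERGE; table, key, and timestamp names are hypothetical:

    # Idempotent CDC upsert with de-duplication and late-data protection.
    from delta.tables import DeltaTable
    from pyspark.sql import functions as F
    from pyspark.sql.window import Window

    latest = Window.partitionBy("order_id").orderBy(F.col("event_ts").desc())

    updates = (
        spark.read.format("delta").load("/lakehouse/bronze/orders_cdc")
        .withColumn("rn", F.row_number().over(latest))
        .filter("rn = 1")                  # keep only the newest record per key
        .drop("rn")
    )

    (DeltaTable.forName(spark, "silver.orders").alias("t")
        .merge(updates.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll(condition="s.event_ts > t.event_ts")  # skip stale rows
        .whenNotMatchedInsertAll()
        .execute())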

Data management

Dimensional and semantic modeling, Data Vault/Kimball, query performance, partitioning, Z-Order, OPTIMIZE/VACUUM, file sizing.

DQ frameworks (Great Expectations/Deequ), monitoring/observability (Log Analytics, Databricks metrics), SLA/SLO design.

Security and compliance

Purview lineage and classification, Unity Catalog governance, PII/PHI handling, encryption, tokenization; audit readiness and familiarity with SOC2/ISO, GDPR/DPDP.
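
For the masking piece, a Unity Catalog column mask is one sketchable approach (Databricks SQL issued via PySpark; the function, group, and table names are hypothetical):

    # Mask a PII column for everyone outside an authorized account group.
    spark.sql("""
        CREATE OR REPLACE FUNCTION gov.masks.mask_ssn(ssn STRING)
        RETURN CASE WHEN is_account_group_member('pii_readers')
                    THEN ssn ELSE '***-**-****' END
    """)
    spark.sql("ALTER TABLE silver.customers ALTER COLUMN ssn SET MASK gov.masks.mask_ssn")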

DevOps/IaC and automation

Git-based development, branch strategies, CI/CD for notebooks/SQL/artifacts, IaC for data resources, automated testing.
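
Automated testing here can be as simple as pytest against a local SparkSession in CI; `clean_orders` below is a hypothetical transformation under test:

    # CI-friendly unit test for a pipeline transformation.
    from pyspark.sql import SparkSession

    def clean_orders(df):
        return df.dropDuplicates(["order_id"]).filter("order_id IS NOT NULL")

    def test_clean_orders_drops_nulls_and_duplicates():
        spark = SparkSession.builder.master("local[1]").getOrCreate()
        df = spark.createDataFrame(
            [(1, "a"), (1, "a"), (None, "b")], ["order_id", "item"]
        )
        assert clean_orders(df).count() == 1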

Communication and leadership

Strong stakeholder engagement, technical writing, solution estimation, and mentoring.

Nice to have

Experience with data products and mesh operating models; product lifecycle and contracts between producer/consumer domains.

ML/feature store integration (Databricks Feature Store), MLOps awareness for data readiness.

Knowledge of dbt, Terraform, Airflow, Confluent, and enterprise SSO/SCIM provisioning with Databricks/Fabric.

Qualifications

Bachelor's/Master's in Computer Science, Engineering, or related field.

Certifications: Azure Solutions Architect Expert, Azure Data Engineer Associate, Databricks Data Engineer Professional/Associate, Microsoft Fabric Data Engineer Associate.

Job ID: 134141239
