Role Overview
We are looking for a hands-on, senior Databricks Architect to design, build, and govern our Lakehouse data platform from the ground up. You will own the end-to-end architecture of our data infrastructure, from raw ingestion through the Medallion layers to serving, and establish the engineering standards that will guide the entire data organization.
This is a highly strategic and technical role focused on driving adoption of Databricks, Unity Catalog, and modern Lakehouse patterns across all data products and pipelines.
Key Responsibilities
Lakehouse Architecture & Design
- Design and implement a production-grade Medallion Architecture (Bronze / Silver / Gold) across all data pipelines.
- Establish best practices for Delta Lake table design, partitioning strategies, Z-ordering, and optimization across large-scale datasets (see the sketch after this list).
- Define data modeling standards and schema evolution policies across the Lakehouse.
- Architect end-to-end data flows from ingestion (streaming and batch) through transformation and serving layers.
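To give candidates a concrete flavor of this work, here is a minimal PySpark sketch of a Bronze-to-Silver promotion followed by routine Delta Lake maintenance. All table names, columns, and the retention window are illustrative, not our actual schema.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze holds raw events as landed; Silver is cleaned, typed, and deduplicated.
# Table and column names below are illustrative.
bronze = spark.read.table("lakehouse.bronze.events")

silver = (
    bronze
    .filter(F.col("event_id").isNotNull())
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .dropDuplicates(["event_id"])
)
silver.write.format("delta").mode("overwrite").saveAsTable("lakehouse.silver.events")

# Routine maintenance: compact small files and co-locate data on a hot filter
# column, then remove snapshots older than the retention window (here 7 days).
spark.sql("OPTIMIZE lakehouse.silver.events ZORDER BY (event_ts)")
spark.sql("VACUUM lakehouse.silver.events RETAIN 168 HOURS")
```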
Unity Catalog & Data Governance
- Lead the setup, configuration, and rollout of Unity Catalog as the centralized governance layer for all data assets.
- Design metastore hierarchy, catalog/schema/table organization, and tagging standards.
- Implement fine-grained access control (row-level, column-level), data masking policies, and audit logging (illustrated in the sketch after this list).
- Establish data lineage tracking and ensure end-to-end visibility across all pipelines.
- Define and enforce data classification and sensitivity frameworks for PII and regulated data assets.
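As an illustration of the fine-grained controls above, here is a hedged sketch of Unity Catalog grants and a column mask, issued as SQL from a notebook. Every catalog, schema, function, and group name is hypothetical.

```python
# Runs in a Databricks notebook, where `spark` is predefined.
# Unity Catalog governance sketch using the three-level namespace
# (catalog.schema.object). All names below are hypothetical.
statements = [
    "GRANT USE CATALOG ON CATALOG lakehouse TO `data_engineers`",
    "GRANT SELECT ON SCHEMA lakehouse.gold TO `analysts`",
    # Column mask: reveal email only to members of the pii_readers group.
    """CREATE OR REPLACE FUNCTION lakehouse.governance.mask_email(email STRING)
       RETURN CASE WHEN is_account_group_member('pii_readers')
                   THEN email ELSE '***REDACTED***' END""",
    "ALTER TABLE lakehouse.silver.customers ALTER COLUMN email "
    "SET MASK lakehouse.governance.mask_email",
]
for stmt in statements:
    spark.sql(stmt)
```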
Pipeline Development & Orchestration
- Build and maintain production-grade data pipelines using PySpark, Delta Live Tables (DLT), and Databricks Workflows / Jobs (see the DLT sketch after this list).
- Design modular, reusable pipeline patterns including incremental ingestion, CDC (Change Data Capture), and full-refresh strategies.
- Implement robust pipeline observability: logging, alerting, lineage tracking, and SLA monitoring.
- Leverage Databricks Repos for CI/CD integration, managing code promotion across dev / staging / production environments.
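For a sense of the declarative style this involves, here is a minimal Delta Live Tables sketch with incremental ingestion via Auto Loader and an expectation that drops bad rows. The landing path, table names, and quality rule are placeholders.

```python
import dlt
from pyspark.sql import functions as F

# Runs inside a DLT pipeline, where `spark` is predefined.

# Bronze: incrementally ingest raw JSON with Auto Loader (path is a placeholder).
@dlt.table(comment="Raw orders, ingested incrementally")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/lakehouse/landing/orders")
    )

# Silver: enforce a basic quality rule; failing rows are dropped and counted
# in the pipeline's data quality metrics.
@dlt.table(comment="Validated, typed orders")
@dlt.expect_or_drop("non_negative_amount", "amount >= 0")
def orders_silver():
    return dlt.read_stream("orders_bronze").withColumn(
        "order_ts", F.to_timestamp("order_ts")
    )
```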
Performance & Compute Optimization
- Optimize Spark execution plans and identify and resolve performance bottlenecks across large-scale distributed workloads.
- Right-size cluster configurations: serverless warehouses, auto-scaling job clusters, and Photon-enabled SQL warehouses (example spec after this list).
- Leverage serverless SQL warehouses for BI and ad hoc analytics workloads, minimizing cost and cold-start latency.
- Manage cost governance for compute, storage, and DBU consumption across workspaces.
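To make the right-sizing concrete, here is an example of the `new_cluster` block of a Databricks Jobs API payload: an auto-scaling, Photon-enabled job cluster with cost-attribution tags. The runtime version and node type are examples, not prescriptions.

```python
# Example job-cluster spec (the "new_cluster" block of a Jobs API payload).
# spark_version and node_type_id are illustrative; choose per workload.
job_cluster = {
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "runtime_engine": "PHOTON",                       # Photon for eligible workloads
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "custom_tags": {"cost_center": "data-platform"},  # enables DBU cost attribution
}
```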
Developer Experience & Standards
- Set up and maintain Databricks Repos with standardized project structures and Git integration.
- Define Python coding standards, notebook best practices, and modular library patterns for the data engineering team.
- Build reusable Python utility libraries for common patterns: schema validation, data quality checks, Delta operations, and logging.
- Establish unit testing and integration testing frameworks for Spark pipelines (see the pytest sketch below).
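As a flavor of the testing standard, a minimal pytest sketch that exercises a hypothetical transformation against a local SparkSession:

```python
import pytest
from pyspark.sql import SparkSession, functions as F

# Hypothetical transformation under test: drop null IDs, then deduplicate.
def clean_events(df):
    return df.filter(F.col("event_id").isNotNull()).dropDuplicates(["event_id"])

@pytest.fixture(scope="session")
def spark():
    # Local session so the suite runs on CI without a Databricks cluster.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()

def test_clean_events_drops_nulls_and_duplicates(spark):
    df = spark.createDataFrame(
        [("a", 1), ("a", 1), (None, 2)], ["event_id", "value"]
    )
    result = clean_events(df)
    assert result.count() == 1
    assert result.first()["event_id"] == "a"
```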
Security, Compliance & Networking
- Configure workspace-level and account-level security: Private Link, IP access lists, and secrets management via Databricks Secrets or AWS Secrets Manager (see the sketch after this list).
- Design and enforce network isolation for sensitive data workloads.
- Ensure compliance with data residency and access control requirements for customer data.
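A small sketch of the secrets pattern we expect in pipelines (credentials resolved at runtime, never hard-coded); the scope, key, and connection details are hypothetical:

```python
# Runs in a Databricks notebook or job, where `spark` and `dbutils` are
# predefined. Scope and key names below are hypothetical.
jdbc_password = dbutils.secrets.get(scope="prod-warehouse", key="jdbc-password")

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://host:5432/db")  # placeholder URL
    .option("user", "etl_user")                       # placeholder user
    .option("password", jdbc_password)
    .option("dbtable", "public.orders")
    .load()
)
```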
Collaboration & Enablement
- Partner with data engineers, data scientists, and analytics engineers to ensure the platform meets diverse workload needs.
- Mentor the engineering team on Databricks, Spark optimization, and Lakehouse best practices.
- Produce architectural documentation, runbooks, and internal knowledge bases.
- Evaluate and recommend new Databricks features and third-party integrations relevant to the organization's data roadmap.
Required Qualifications
Core Databricks & Lakehouse
- 5+ years of hands-on experience with Databricks, with at least 2 years in an architect or senior lead role.
- Deep expertise in Unity Catalog: metastore setup, three-level namespace, ACL design, and data governance workflows.
- Mastery of the Medallion Architecture and Delta Lake: ACID transactions, time travel, compaction, and OPTIMIZE/VACUUM strategies.
- Proven experience designing and deploying production pipelines with Databricks Jobs and Workflows, including multi-task job DAGs, retry logic, and notifications.
- Hands-on experience with Databricks Repos and CI/CD integration for notebook and Python library deployments.
- Experience configuring and operating Serverless SQL Warehouses and Serverless compute for Jobs.
Apache Spark
- Expert-level PySpark development: DataFrames, Spark SQL, window functions, broadcast joins, and UDFs (see the example after this list).
- Strong understanding of Spark internals: DAG execution, shuffle optimization, memory management, and speculative execution.
- Experience with structured streaming and micro-batch processing patterns.
- Proven ability to diagnose and resolve Spark performance issues using Spark UI and event logs.
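To calibrate expectations, a short self-contained example of the fluency we look for: a broadcast join to avoid shuffling the large side, plus a window function to keep each customer's latest order. Data and names are invented.

```python
from pyspark.sql import SparkSession, Window, functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

# Toy data: a fact table and a small dimension (all names invented).
facts = spark.createDataFrame(
    [(1, "p1", "2024-01-01"), (1, "p2", "2024-02-01"), (2, "p1", "2024-01-15")],
    ["customer_id", "product_id", "order_ts"],
)
dim_products = spark.createDataFrame(
    [("p1", "widget"), ("p2", "gadget")], ["product_id", "product_name"]
)

# Broadcast the small dimension so the join avoids shuffling the fact table.
enriched = facts.join(broadcast(dim_products), on="product_id", how="left")

# Window function: rank orders per customer by recency and keep the latest.
w = Window.partitionBy("customer_id").orderBy(F.col("order_ts").desc())
latest = (
    enriched.withColumn("rn", F.row_number().over(w))
    .filter(F.col("rn") == 1)
    .drop("rn")
)
latest.show()
```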
Python & Software Engineering
- Advanced Python skills with a strong software engineering background: packaging, testing (pytest), virtual environments, and dependency management.
- Experience building modular Python libraries for data engineering use cases.
- Familiarity with common data engineering libraries: pandas, pydantic, and great_expectations or similar data quality (DQ) frameworks (see the pydantic sketch below).
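As an example of the schema-validation style we mean, a tiny pydantic (v2 API) sketch; the record contract is hypothetical:

```python
from pydantic import BaseModel, field_validator

# Hypothetical record contract validated before rows are promoted to Silver.
class Order(BaseModel):
    order_id: str
    amount: float

    @field_validator("amount")
    @classmethod
    def amount_non_negative(cls, v: float) -> float:
        if v < 0:
            raise ValueError("amount must be non-negative")
        return v

Order(order_id="o-123", amount=42.0)    # passes validation
# Order(order_id="o-456", amount=-1.0)  # would raise a ValidationError
```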
Cloud & Infrastructure
- Experience deploying Databricks on AWS, including workspace provisioning, IAM integration, and VPC configuration.
- Familiarity with cloud-native storage (S3/ADLS), external locations in Unity Catalog, and storage credentials management.
- Exposure to infrastructure-as-code tooling (Terraform, Databricks Asset Bundles, or similar).
Preferred Qualifications
- Certifications such as Databricks Certified Data Engineer Professional or Databricks Certified Associate Developer for Apache Spark.
- Experience with Delta Live Tables (DLT) for declarative pipeline authoring.
- Familiarity with dbt (data build tool) integrated with Databricks SQL.
- Experience with Databricks Feature Store or MLflow for ML platform use cases.
- Exposure to Databricks Marketplace and Partner Connect integrations.
- Experience with Elasticsearch, Apache Kafka, or other streaming/search technologies complementary to the Lakehouse.