

Job Description
Databricks Solution Architect
Position: Databricks Solution Architect
Experience Level: 8+ years
Key Responsibilities
Next-Gen Architecture Delivery: Design and deploy modern Data Lakehouse solutions using Serverless
Compute (for SQL, Jobs, and Notebooks) to eliminate infrastructure overhead and optimize TCO.
Production Engineering: Write, optimize, and deploy production-grade code. You are expected to fix
what breaks, whether it's a shuffle error in Spark, a CI/CD failure in Databricks Asset Bundles (DABs),
or a fault in a complex Mosaic AI vector search pipeline.
Deep Performance Tuning: Diagnose and resolve bottlenecks (Skew, OOM, Spill). You must know when
to apply legacy tuning (Z-Order/Partitioning) vs. modern Liquid Clustering to handle changing data
patterns automatically.
Governance & Federation: Implement strict data governance using Unity Catalog. Configure Lakehouse
Federation to query external systems without data movement, and enforce Attribute-Based Access
Control (ABAC).
Customer Obsession: Act as the Expert in the Room. Explain complex concepts (like why Serverless
reduces cold-start latency or how AI/BI Genie handles hallucinations) to non-technical stakeholders.
Must-Have Skills
Core Spark & Liquid Clustering: Expert knowledge of the Catalyst Optimizer and AQE. You must
understand the architectural shift from static partitioning to Liquid Clustering and how it impacts file
layout and data skipping.
Coding Fluency: Advanced proficiency in Python (PySpark) and SQL. You must be comfortable live-coding complex UDFs and transformation logic without relying on Google.
Delta Lakehouse Mastery: Deep knowledge of ACID transactions, Time Travel, and Delta Live Tables
(DLT) for declarative pipeline management.
Unity Catalog & Security: Hands-on experience with System Tables for observability, Lakehouse
Federation for cross-platform querying, and setting up Volume-based access controls.
Operational Excellence: Experience with Infrastructure-as-Code (Terraform), Databricks Asset Bundles
(DABs) for CI/CD, and Databricks Connect v2 for local development.
Nice-to-Have Skills
SAP Data Integration (High Value):
Experience extracting data from SAP systems (ECC, S/4HANA) into Databricks Delta Lake.
Familiarity with SAP Datasphere, SAP SLT, or Partner Connectors (e.g., Fivetran/Qlik for SAP) for
operational reporting.
Understanding of common SAP data structures and interfaces (IDocs, BAPIs) is a major plus.
Advanced AI & Machine Learning:
ML Engineering: Hands-on experience with MLflow for experiment tracking and model registry.
Familiarity with Feature Store implementation.
GenAI/Mosaic AI: Experience fine-tuning LLMs (Foundation Models), using Mosaic AI Gateway for
governance, or building conversational analytics with AI/BI Genie.
Libraries: Proficiency with core ML libraries (Scikit-learn, XGBoost, PyTorch) for non-generative use
cases.
Serverless Architectures: Proven experience migrating workloads from Classic Compute to Serverless,
including cost analysis.
Certifications: Databricks Certified Data Engineer Professional (Strongly Preferred).
Job ID: 150040173