
Marktine Technology Solutions Pvt Ltd

Databricks solution architect


Job Description


Position: Databricks Solution Architect

Experience Level: 8+ years

Key Responsibilities

Next-Gen Architecture Delivery: Design and deploy modern Data Lakehouse solutions using Serverless Compute (for SQL, Jobs, and Notebooks) to eliminate infrastructure overhead and optimize TCO.

Production Engineering: Write, optimize, and deploy production-grade code. You are expected to fix what breaks, whether it's a shuffle error in Spark, a CI/CD failure in Databricks Asset Bundles (DABs), or a complex Mosaic AI vector search pipeline.

Deep Performance Tuning: Diagnose and resolve bottlenecks (Skew, OOM, Spill). You must know when to apply legacy tuning (Z-Order/Partitioning) vs. modern Liquid Clustering to handle changing data patterns automatically.

Governance & Federation: Implement strict data governance using Unity Catalog. Configure Lakehouse Federation to query external systems without data movement, and enforce Attribute-Based Access Control (ABAC).

Customer Obsession: Act as the Expert in the Room. Explain complex concepts (like why Serverless reduces cold starts or how AI/BI Genie handles hallucination) to non-technical stakeholders.

Must-Have Skills

Core Spark & Liquid Clustering: Expert knowledge of the Catalyst Optimizer and AQE. You must understand the architectural shift from static partitioning to Liquid Clustering and how it impacts file layout and data skipping.

Coding Fluency: Advanced proficiency in Python (PySpark) and SQL. You must be comfortable live-coding complex UDFs and transformation logic without relying on Google.

Delta Lakehouse Mastery: Deep knowledge of ACID transactions, Time Travel, and Delta Live Tables (DLT) for declarative pipeline management.

Unity Catalog & Security: Hands-on experience with System Tables for observability, Lakehouse Federation for cross-platform querying, and setting up Volume-based access controls.

Operational Excellence: Experience with Infrastructure-as-Code (Terraform), Databricks Asset Bundles (DABs) for CI/CD, and Databricks Connect v2 for local development.

Nice-to-Have Skills

SAP Data Integration (High Value):

Experience extracting data from SAP systems (ECC, S/4HANA) into Databricks Delta Lake.

Familiarity with SAP Datasphere, SAP SLT, or Partner Connectors (e.g., Fivetran/Qlik for SAP) for operational reporting.

Understanding of common SAP data structures (IDocs, BAPIs) is a major plus.

Advanced AI & Machine Learning:

ML Engineering: Hands-on experience with MLflow for experiment tracking and model registry. Familiarity with Feature Store implementation.

GenAI/Mosaic AI: Experience fine-tuning LLMs (Foundation Models), using Mosaic AI Gateway for governance, or building conversational analytics with AI/BI Genie.

Libraries: Proficiency with core ML libraries (Scikit-learn, XGBoost, PyTorch) for non-generative use cases.

Serverless Architectures: Proven experience migrating workloads from Classic Compute to Serverless, including cost analysis.

Certifications: Databricks Certified Data Engineer Professional (Strongly Preferred).


Job ID: 150040173