
Job Description: Databricks Resident Solutions Architect (RSA)
Experience: 8+ years (with a minimum of 5 years of deep specialization in Databricks/Spark)
The Opportunity
As a Resident Solutions Architect (RSA) at Celebal Technologies (a Databricks Elite Partner), you will work shoulder-to-shoulder with Databricks to deliver mission-critical data/AI transformations for Fortune 500 clients.
This is not a slide-ware architecture role. We are looking for a Bar Raiser: someone who is an architect on paper but an engineer at heart. You will leverage the full Databricks Premium & Enterprise suite, moving clients beyond basic ETL into Serverless, GenAI Agents, and Federated Data Mesh. You must be a Hands-on Keyboard expert, spending 70-80% of your time writing production code, debugging distributed systems, and implementing the platform's newest features.
What You Will Do (Key Responsibilities)
Next-Gen Architecture Delivery: Design and deploy modern Data Lakehouse solutions using Serverless Compute (for SQL, Jobs, and Notebooks) to eliminate infrastructure overhead and optimize TCO.
Production Engineering: Write, optimize, and deploy production-grade code. You are expected to fix what breaks, whether it's a shuffle error in Spark, a CI/CD failure in Databricks Asset Bundles (DABs), or a complex Mosaic AI vector search pipeline.
Deep Performance Tuning: Diagnose and resolve bottlenecks (Skew, OOM, Spill). You must know when to apply legacy tuning (Z-Order/Partitioning) vs. modern Liquid Clustering to handle changing data patterns automatically; see the tuning sketch after this list.
Governance & Federation: Implement strict data governance using Unity Catalog. Configure Lakehouse Federation to query external systems without data movement, and enforce Attribute-Based Access Control (ABAC); see the federation sketch after this list.
Customer Obsession: Act as the Expert in the Room. Explain complex concepts (like why Serverless reduces cold starts or how AI/BI Genie handles hallucination) to non-technical stakeholders.
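As a concrete illustration of the clustering trade-off above, a minimal sketch, assuming a Databricks notebook where spark is the ambient session; the table and column names (main.sales.events, customer_id, event_date) are hypothetical:

```python
# Assumes a Databricks notebook: `spark` is the ambient SparkSession.
# Table and column names are illustrative.

# Legacy approach: Z-Order co-locates data on the chosen columns, but the
# layout is fixed at OPTIMIZE time and must be re-run as patterns change.
spark.sql("OPTIMIZE main.sales.events ZORDER BY (customer_id)")

# Liquid Clustering: keys are declared on the table itself and can be
# changed later without rewriting directory structures.
spark.sql("ALTER TABLE main.sales.events CLUSTER BY (customer_id, event_date)")
spark.sql("OPTIMIZE main.sales.events")  # incrementally clusters new files
```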
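Likewise for Lakehouse Federation: a hedged sketch of exposing an external PostgreSQL database through Unity Catalog, where the connection name, host, secret scope, and database names are all assumptions:

```python
# Create a Unity Catalog connection to an external PostgreSQL instance.
# Credentials come from a hypothetical secret scope via the SQL secret() function.
spark.sql("""
  CREATE CONNECTION IF NOT EXISTS pg_sales TYPE postgresql
  OPTIONS (
    host 'pg.example.internal',
    port '5432',
    user secret('federation', 'pg_user'),
    password secret('federation', 'pg_password')
  )
""")

# Mirror one remote database as a foreign catalog; no data is copied.
spark.sql("""
  CREATE FOREIGN CATALOG IF NOT EXISTS pg_sales_cat
  USING CONNECTION pg_sales
  OPTIONS (database 'sales')
""")

# Query the external table in place through Unity Catalog.
spark.sql("SELECT COUNT(*) FROM pg_sales_cat.public.orders").show()
```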
The Bar Raiser Profile (Must-Have Skills)
Core Spark & Liquid Clustering: Expert knowledge of the Catalyst Optimizer and AQE (Adaptive Query Execution). You must understand the architectural shift from static partitioning to Liquid Clustering and how it impacts file layout and data skipping.
Coding Fluency: Advanced proficiency in Python (PySpark) and SQL. You must be comfortable live-coding complex UDFs and transformation logic without reliance on Google; see the UDF sketch after this list.
Delta Lakehouse Mastery: Deep knowledge of ACID transactions, Time Travel, and Delta Live Tables (DLT) for declarative pipeline management; see the DLT sketch after this list.
Unity Catalog & Security: Hands-on experience with System Tables for observability, Lakehouse Federation for cross-platform querying, and setting up Volume-based access controls; see the system-tables query after this list.
Operational Excellence: Experience with Infrastructure-as-Code (Terraform), Databricks Asset Bundles (DABs) for CI/CD, and Databricks Connect v2 for local development; see the Databricks Connect sketch after this list.
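On the live-coding point, a self-contained pandas UDF sketch; the business logic and column names are invented for illustration:

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()  # pre-created in Databricks notebooks

@pandas_udf("double")
def margin_pct(revenue: pd.Series, cost: pd.Series) -> pd.Series:
    # Vectorized batch computation: far cheaper than a row-at-a-time Python UDF.
    return ((revenue - cost) / revenue * 100.0).fillna(0.0)

df = spark.createDataFrame([(100.0, 60.0), (250.0, 200.0)], ["revenue", "cost"])
df.withColumn("margin_pct", margin_pct("revenue", "cost")).show()
```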
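For DLT, a minimal declarative pipeline table with quality expectations; the source table raw_events and its columns are hypothetical:

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Cleansed events with basic data-quality gates.")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")
@dlt.expect_or_drop("positive_amount", "amount > 0")
def events_clean():
    # Incremental read from a bronze table managed elsewhere in the pipeline.
    return (
        dlt.read_stream("raw_events")
        .select("event_id", "amount", col("ts").cast("timestamp").alias("event_ts"))
    )
```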
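System Tables observability can start as a simple query against system.billing.usage (columns shown follow the documented schema; the aggregation itself is illustrative):

```python
# Summarize the last week of DBU consumption by SKU from Unity Catalog
# system tables (requires the billing schema to be enabled for the account).
spark.sql("""
  SELECT usage_date, sku_name, SUM(usage_quantity) AS dbus
  FROM system.billing.usage
  WHERE usage_date >= date_sub(current_date(), 7)
  GROUP BY usage_date, sku_name
  ORDER BY usage_date, dbus DESC
""").show()
```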
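And Databricks Connect v2 local development reduces to pointing a DatabricksSession at a workspace; authentication is assumed to be configured via a profile or environment variables:

```python
# Runs on a laptop; execution happens on the remote Databricks cluster.
# Assumes `pip install databricks-connect` and a configured auth profile.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()
spark.range(5).selectExpr("id", "id * id AS squared").show()
```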
Differentiators & Nice-to-Have Skills
SAP Data Integration (High Value):
Experience extracting data from SAP systems (ECC, S/4HANA) into Databricks Delta Lake.
Familiarity with SAP Datasphere, SAP SLT, or Partner Connectors (e.g., Fivetran/Qlik for SAP) for operational reporting.
Understanding of common SAP data structures (IDOCs, BAPIs) is a major plus.
Advanced AI & Machine Learning:
ML Engineering: Hands-on experience with MLflow for experiment tracking and model registry. Familiarity with Feature Store implementation; see the MLflow sketch after this list.
GenAI/Mosaic AI: Experience fine-tuning LLMs (Foundation Models), using Mosaic AI Gateway for governance, or building conversational analytics with AI/BI Genie; see the endpoint-query sketch after this list.
Libraries: Proficiency with core ML libraries (Scikit-learn, XGBoost, PyTorch) for non-generative use cases.
Serverless Architectures: Proven experience migrating workloads from Classic Compute to Serverless, including cost analysis.
Certifications: Databricks Certified Data Engineer Professional (Strongly Preferred).
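To make the MLflow expectation concrete, a minimal tracking-and-registry sketch; the registered model name main.ml.demand_forecast is a hypothetical Unity Catalog path:

```python
import mlflow
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=42)

with mlflow.start_run():
    model = RandomForestRegressor(max_depth=6, random_state=42).fit(X, y)
    mlflow.log_param("max_depth", 6)
    mlflow.log_metric("train_r2", model.score(X, y))
    # Registering under a Unity Catalog model name is illustrative.
    mlflow.sklearn.log_model(
        model, "model", registered_model_name="main.ml.demand_forecast"
    )
```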
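And on the GenAI side, querying a workspace-hosted foundation model endpoint through the MLflow Deployments client; the endpoint name below is an assumption and varies by workspace:

```python
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
# Endpoint name is hypothetical; enumerate real ones with client.list_endpoints().
response = client.predict(
    endpoint="databricks-meta-llama-3-1-70b-instruct",
    inputs={
        "messages": [
            {"role": "user", "content": "Summarize last week's DBU spend by SKU."}
        ]
    },
)
print(response["choices"][0]["message"]["content"])
```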
Please share CVs at [Confidential Information]
Job ID: 142649653