
Mandatory Skills:
Azure Databricks
Structured Streaming
Delta Lake
Delta Live Tables (DLT)
Spark Declarative Pipelines (SDP)
Databricks Asset Bundles
Unity Catalog
Auto Loader
Cloud Files
Apache Spark
Preferred Experience
Migration of traditional ETL pipelines to Spark Declarative Pipelines
Enterprise-scale Lakehouse implementation
Data Quality frameworks and governance solutions
Large-scale streaming architectures processing millions of events per day
Experience implementing Medallion Architecture
Streaming & Real-Time Data Processing
Design and develop real-time streaming pipelines using Databricks Structured Streaming.
Build and maintain Kafka-based ingestion frameworks.
Handle late-arriving events using watermarks, event-time processing, and stateful streaming concepts.
Implement exactly-once processing and checkpointing mechanisms.
Monitor and optimize streaming workloads for performance and reliability.
Spark Declarative Pipelines (SDP) & Delta Live Tables (DLT)
Automated lineage tracking
Improved maintainability
Enhanced observability
Databricks Asset Bundles
Azure Data Engineering
Work extensively with:
Azure Data Lake Storage Gen2 (ADLS Gen2)
Azure Service Principals
Azure Key Vault
Azure Data Factory (ADF)
Azure Databricks
Implement secure authentication and authorization mechanisms.
Troubleshoot and debug ADF pipeline failures.
Spark Optimization & Performance Tuning
Optimize Spark jobs using:
Partitioning strategies
Adaptive Query Execution (AQE)
Broadcast joins
Caching and persistence
Z-Ordering
File compaction techniques
Analyze Spark UI for job failures and performance bottlenecks.
Troubleshoot executor failures, stage failures, skew issues, memory problems, and shuffle bottlenecks.
Unity Catalog & Governance
Implement and manage Unity Catalog.
Configure data access controls and governance policies.
Establish lineage tracking and data security standards.
Manage catalog, schema, and table-level permissions.
CI/CD & DevOps
Implement CI/CD pipelines for Databricks projects.
Work with Git branching strategies:
Feature Branches
Pull Requests
Code Reviews
Merge Processes
Integrate VS Code with Databricks.
Automate deployments using Databricks Asset Bundles and DevOps pipelines.
Experience with Azure DevOps, GitHub Actions, Jenkins, or equivalent CI/CD platforms.
Data Modeling & SQL
Develop complex SQL transformations.
Build optimized analytical data models.
Write performant SQL queries for large-scale datasets.
Design dimensional and Lakehouse data models.
Job ID: 150035175