We are seeking an experienced Databricks Developer / Data Engineer to design, develop, and optimize data pipelines, ETL workflows, and big data solutions using Databricks. The ideal candidate should have expertise in Apache Spark, PySpark, SQL, and cloud-based data platforms (Azure, AWS, GCP). This role involves working with large-scale datasets, data lakes, and data warehouses to drive business intelligence and analytics.
Key Responsibilities:
- Design, build, and optimize ETL and ELT pipelines using Databricks and Apache Spark
- Work with big data processing frameworks (PySpark, Scala, SQL) for data transformation and analytics
- Implement Delta Lake architecture for data reliability, ACID transactions, and schema evolution
- Integrate Databricks with cloud services like Azure Data Lake, AWS S3, GCP BigQuery, and Snowflake
- Develop and maintain data models, data lakes, and data warehouse solutions
- Tune Spark performance, including job scheduling and cluster configuration
- Work with Azure Synapse, AWS Glue, or GCP Dataflow to enable seamless data integration
- Implement CI/CD automation for data pipelines using Azure DevOps, GitHub Actions, or Jenkins
- Perform data quality checks, validation, and governance using Databricks Unity Catalog
- Collaborate with data scientists, analysts, and business teams to support analytics and AI/ML models
Required Skills & Qualifications:
- 6+ years of experience in data engineering and big data technologies
- Strong expertise in Databricks, Apache Spark, and PySpark/Scala
- Hands-on experience with SQL, NoSQL, and structured/unstructured data processing
- Experience with cloud platforms (Azure, AWS, GCP) and their data services
- Proficiency in Python, SQL, and Spark optimizations
- Experience with Delta Lake, Lakehouse architecture, and metadata management
- Strong understanding of ETL/ELT processes, data lakes, and warehousing concepts
- Experience with streaming data processing (Kafka, Event Hubs, Kinesis, etc.)
- Knowledge of security best practices, role-based access control (RBAC), and compliance
- Experience in Agile methodologies and working in cross-functional teams
Preferred Qualifications:
- Databricks certification (Databricks Certified Data Engineer Associate or Professional)
- Experience building AI/ML pipelines on Databricks
- Hands-on experience with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation