
DataPattern

Senior Data Engineer (Databricks Certified)


Job Description

Job Title: Senior Data Engineer (Databricks Certified)

Experience: 8+ Years

Location: Chennai

Work Mode: Onsite

Job Summary:

We are seeking a highly skilled Senior Data Engineer with 8+ years of experience in building scalable data platforms and pipelines. The ideal candidate must be Databricks Certified and possess strong expertise in Spark Streaming, distributed data processing, and cloud-based data engineering frameworks. This role involves designing, developing, and optimizing modern data solutions that support real-time and batch workloads.

Key Responsibilities:

  • Design and develop scalable data pipelines using Apache Spark, PySpark, and Spark Streaming.
  • Build and optimize complex data workflows on Azure Databricks / AWS Databricks.
  • Implement real-time data streaming solutions using Structured Streaming or Delta Live Tables (DLT); see the sketch after this list.
  • Work with Delta Lake, data lakehouse architectures, and medallion frameworks.
  • Develop ETL/ELT pipelines integrating structured, semi-structured, and unstructured data.
  • Collaborate with data architects and analysts to design robust data models and transformations.
  • Optimize Spark jobs for performance, reliability, and cost efficiency.
  • Use CI/CD practices for data pipeline deployments (Azure DevOps / GitHub Actions / Jenkins).
  • Work with cloud storage and compute services such as ADLS, S3, Azure Synapse, Glue, and Data Factory (depending on the cloud platform).
  • Ensure data quality, governance, and security standards throughout the pipeline.
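
For context on the Structured Streaming, Delta Lake, and medallion items above, the following is a minimal sketch of a bronze-layer streaming ingest on Databricks. It assumes Auto Loader and Delta Lake are available; the source path, checkpoint locations, and the bronze.orders table name are hypothetical placeholders, not details taken from this posting.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()  # in a Databricks notebook, `spark` is predefined

    # Incrementally ingest raw JSON files with Auto Loader (the cloudFiles source).
    raw = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/_schema")  # hypothetical path
        .load("/mnt/landing/orders/")  # hypothetical landing zone
    )

    # Stamp each record with an ingestion timestamp, then append to the bronze Delta table.
    bronze = raw.withColumn("_ingested_at", F.current_timestamp())

    query = (
        bronze.writeStream.format("delta")
        .option("checkpointLocation", "/mnt/checkpoints/orders/bronze")  # hypothetical path
        .outputMode("append")
        .trigger(availableNow=True)  # incremental batch-style run; use processingTime for continuous
        .toTable("bronze.orders")  # hypothetical medallion (bronze) table
    )

The same pattern extends to the silver and gold layers: read each Delta table as a stream and apply cleansing or aggregation transformations before the next write.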

Required Skills & Qualifications:

  • 8+ years of experience as a Data Engineer in large-scale environments.
  • Databricks certified (Data Engineer Associate / Professional, or Associate Developer for Apache Spark).
  • Strong hands-on experience with Spark Streaming and real-time processing.
  • Expert in PySpark, SQL, Delta Lake, and distributed data processing.
  • Proficiency in one major cloud platform: Azure / AWS / GCP.
  • Strong experience with ETL/ELT design, performance tuning, and pipeline orchestration.
  • Experience with data modeling, partitioning strategies, and big data storage formats (Parquet, ORC, Avro).
  • Solid understanding of DevOps, version control (Git), and CI/CD workflows.

Nice to Have:

  • Experience with Databricks Unity Catalog, governance, and lineage.
  • Knowledge of Kafka / Azure Event Hubs / Kinesis for streaming ingestion.
  • Background in Airflow, dbt, or similar orchestration tools.
  • Knowledge of machine learning pipelines and MLOps concepts.

Job ID: 134548497