About the Role
We are looking for an experienced Senior Data Engineer with 8+ years of hands-on expertise in Python, Databricks, and large-scale distributed data systems. This role involves leading data engineering initiatives, architecting advanced data solutions, mentoring junior engineers, and partnering with business and technology teams to drive data-driven decision-making across the organization.
Key Responsibilities
Technical Leadership
- Lead the design, architecture, and implementation of end-to-end data pipelines using Python, Databricks, Spark, and Delta Lake.
- Provide technical direction on data modeling, ETL/ELT frameworks, and best practices.
- Mentor and guide junior and mid-level engineers, conduct code reviews, and enforce coding standards.
Advanced Data Engineering
- Architect optimized data lake and lakehouse environments, including multi-layered data models (bronze/silver/gold).
- Implement high-performance batch and streaming pipelines using Apache Spark and Databricks Workflows.
- Build scalable ingestion frameworks for structured, semi-structured, and unstructured data from diverse sources.
Cloud & Platform Ownership
- Manage and optimize cloud-native data environments (Azure preferred):
  - ADLS, ADF, Azure Event Hubs, Azure Synapse
  - Databricks cluster tuning, job orchestration, cost optimization
- Integrate Databricks with enterprise systems, APIs, and CI/CD pipelines (Azure DevOps / GitHub Actions).
Data Quality, Security & Governance
- Implement enterprise-grade data quality frameworks and automated validation pipelines.
- Ensure compliance with security, data governance, and privacy standards (Unity Catalog / Purview).
- Define and maintain metadata, lineage, and documentation across data assets.
Cross-functional Collaboration
- Partner with data scientists, ML engineers, BI teams, and business stakeholders to translate requirements into scalable data solutions.
- Collaborate with product owners to prioritize data engineering roadmaps.
- Communicate technical decisions and trade-offs to technical and non-technical audiences.
Required Skills & Qualifications
Technical Expertise
- 8+ years of experience in Data Engineering with strong hands-on command of Python.
- Deep expertise in Databricks, Spark (PySpark), Delta Lake, job orchestration, and cluster tuning.
- Strong SQL experience, including performance tuning and complex transformations.
- Proven experience with cloud data ecosystems (Azure preferred):
  - ADLS Gen2, ADF, Azure Databricks, Key Vault
- Strong understanding of distributed systems, data partitioning, caching, and performance optimization in Spark.
Preferred Skills
- Experience with streaming frameworks: Kafka, Azure Event Hubs, Spark Structured Streaming.
- Data warehousing and dimensional modeling experience.
- Exposure to MLOps or ML lifecycle workflows in Databricks is an advantage.
- Experience implementing CI/CD for data pipelines and infra-as-code for data platforms.
Soft Skills
- Strong problem-solving, analytical thinking, and decision-making abilities.
- Excellent communication and leadership skills.
- Comfortable working in agile environments and managing multiple parallel initiatives.
- Ability to influence architecture and strategy decisions with strong technical judgment.
Education & Experience
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of professional data engineering experience working on enterprise-scale data projects.
Why Join Us
- Opportunity to work on cutting-edge cloud and data technologies.
- Influence data architecture and contribute to long-term data strategy.
- Lead impactful projects with cross-functional visibility.
- A collaborative culture with strong focus on innovation and continuous improvement.
Job Location: Multiple Locations in India