Location: Bangalore
Experience: 6-10 years
Description:
We are looking for a Senior Data Engineer with a minimum of 6 years of experience in Apache Spark and cloud technologies — someone who is passionate about technology, motivated to learn continuously, and views every client interaction as an opportunity to create an exceptional customer experience.
Qualifications:
Must have:
· BE/B.Tech/MCA/MS-IT/CS/B.Sc/BCA or any other degree in a related field
· Expertise and hands-on experience in big data, Python, PySpark, SQL, Spark, and cloud platforms.
· Expert-level knowledge of Databricks.
Job Description
- Lead Engineer responsible for architecting, building, and scaling enterprise data platforms on Databricks Lakehouse for analytics and downstream consumption.
- Own end‑to‑end delivery of batch and streaming data engineering solutions using Spark and modern distributed systems at large operational scale.
- Drive adoption of Databricks best practices across lakehouse design, governance, security, and cost optimization.
- Collaborate closely with product, engineering, and platform teams to deliver product‑driven, data‑backed outcomes.
- Establish engineering standards, reusable frameworks, and patterns for scalable, maintainable data platforms.
- Leverage Generative AI tools to accelerate design, development, debugging, and optimization while maintaining strong engineering rigor.
Roles & Responsibilities
- 6+ years of experience in data engineering.
- Design, build, and optimize scalable batch and real‑time data pipelines using Databricks (Spark, PySpark, Structured Streaming, Auto Loader) and Kafka.
- Architect and implement lakehouse data models using Delta Lake, open table formats, distributed file formats, and transactional data management.
- Lead development of robust ETL/ELT frameworks ensuring data quality, observability, lineage, and performance across platforms.
- Own orchestration, automation, and CI/CD for data pipelines using Databricks Workflows, Airflow/Dagster/Oozie, and containerized runtimes (Docker, Kubernetes).
- Ensure strong governance and access control using Unity Catalog, along with proactive monitoring, tuning, and cost optimization.
- Apply deep expertise in SQL/NoSQL, Python, Unix, Hadoop, and object storage operations to debug and optimize complex, data‑intensive systems.
- Mentor engineers, influence architectural decisions, and collaborate with cross‑functional stakeholders to deliver production‑grade, reliable data platforms on Databricks.