CG-VAK Software & Exports Ltd.

Lead Data Engineer (Scala, Spark, Pyspark, Databricks)

  • Posted 4 days ago

Job Description

Responsibilities

  • Design, develop, and maintain robust and scalable data pipelines using Apache Spark and Scala on the Databricks platform.
  • Implement ETL (Extract, Transform, Load) processes for various data sources, ensuring data quality, integrity, and efficiency.
  • Optimize Spark applications for performance and cost-efficiency within the Databricks environment.
  • Work with Delta Lake for building reliable data lakes and data warehouses, ensuring ACID transactions and data versioning.
  • Collaborate with data scientists, analysts, and other engineering teams to understand data requirements and deliver solutions.
  • Implement data governance and security best practices within Databricks.
  • Troubleshoot and resolve data-related issues, ensuring data availability and reliability.
  • Stay updated with the latest advancements in Spark, Scala, Databricks, and related big data technologies.
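The responsibilities above could be sketched roughly as follows in Scala on Databricks — a minimal illustration only, not part of the posting; the paths, table names, and columns are hypothetical.

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

// Minimal ETL sketch: read raw JSON, apply basic quality checks,
// and write the result to a Delta Lake table.
// All paths and column names here are hypothetical.
object OrdersEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-etl")
      .getOrCreate()

    // Extract: raw landing zone.
    val raw = spark.read.json("/mnt/raw/orders")

    // Transform: drop rows missing the key, deduplicate, stamp ingest time.
    val cleaned = raw
      .filter(F.col("order_id").isNotNull)
      .dropDuplicates("order_id")
      .withColumn("ingested_at", F.current_timestamp())

    // Load: Delta Lake provides ACID writes and data versioning
    // (time travel) on the curated table.
    cleaned.write
      .format("delta")
      .mode("append")
      .save("/mnt/curated/orders")

    spark.stop()
  }
}
```

On Databricks such a job would typically run as a scheduled workflow, with the curated path registered as a table in Unity Catalog.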

Required Skills and Experience

  • Proven experience as a Data Engineer with a strong focus on big data technologies.
  • Expertise in Scala programming language for data processing and Spark application development.
  • In-depth knowledge and hands-on experience with Apache Spark, including Spark SQL, Spark Streaming, and Spark Core.
  • Proficiency in using Databricks platform features, including notebooks, jobs, workflows, and Unity Catalog.
  • Experience with Delta Lake and its capabilities for building data lakes.
  • Strong understanding of data warehousing concepts, data modeling, and relational databases.
  • Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services.
  • Experience with version control systems like Git.
  • Excellent problem-solving and analytical skills.
  • Ability to work independently and as part of a team.

Preferred Qualifications

  • Experience with other big data technologies like Kafka, Flink, or Hadoop ecosystem components.
  • Knowledge of data visualization tools.
  • Understanding of DevOps principles and CI/CD pipelines for data engineering.
  • Relevant certifications in Spark or Databricks.

Skills: big data, Apache Spark, Scala, Delta Lake

Job ID: 141867361