Senior Data Engineer

Koantek

Mumbai, India

5-7 Years

Posted 2 months ago

Job Description

We are seeking an experienced Full Stack Data Engineer with 5-6 years of industry experience. The ideal candidate will have a proven track record of working on live projects, preferably within the manufacturing or energy sectors. He/she will play a key role in developing and maintaining scalable data solutions using PySpark, SQL, and modern data engineering frameworks.

Key Responsibilities

Develop and deploy end-to-end data pipelines and solutions integrating with various data sources and systems.
Collaborate with cross-functional teams to understand data requirements and deliver effective BI and analytical solutions.
Implement data ingestion, transformation, and processing workflows using Spark (PySpark/Scala) and SQL.
Develop and maintain data models and ETL/ELT processes, ensuring high performance, scalability, reliability, and data quality.
Build and maintain APIs and data services to support analytics, reporting, and application integration.
Ensure data quality, integrity, and security across all stages of the data lifecycle.
Monitor, troubleshoot, and optimize pipeline performance in a cloud-based environment.
Write clean, modular, and well-documented Python/Scala/SQL/PySpark code.
Integrate data from various sources including APIs, relational/non-relational databases, IoT devices, and external providers.
Ensure adherence to data governance, security, and compliance policies.

Required Skills & Experience

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
5-6 years of hands-on experience in Data Engineering, with a strong focus on Apache Spark (PySpark).
Strong programming skills in Python/PySpark and/or Scala, with a deep understanding of Apache Spark.
Strong SQL skills for data manipulation, analysis, and performance tuning.
Strong understanding of data architecture, data modeling, ETL/ELT processes, and data warehousing concepts.
Experience building and maintaining ETL/ELT pipelines in production environments.
Experience working with structured and unstructured data, including JSON, Parquet, Avro, and time-series data.
Familiarity with cloud-based data platforms (Azure/AWS/GCP preferred).
Familiarity with CI/CD pipelines, tools such as Azure DevOps and Git, and DevOps practices for data engineering.
Excellent problem-solving skills, attention to detail, and ability to work independently or as part of a team.
Strong communication skills for interaction with technical and non-technical stakeholders.