
Tvarit

Senior Data Engineer

7-9 Years
  • Posted 20 hours ago

Job Description

We are looking for a Senior Data Engineer with strong expertise in Azure Databricks, PySpark, and distributed computing to develop and optimize scalable ETL pipelines for manufacturing analytics. The role involves working with high-frequency industrial data to enable real-time and batch data processing.

Responsibilities

  • Build scalable real-time and batch processing workflows using Azure Databricks, PySpark, and Apache Spark.
  • Perform data pre-processing, including cleaning, transformation, deduplication, normalization, encoding, and scaling to ensure high-quality input for downstream analytics.
  • Design and maintain cloud-based data architectures, including data lakes, lakehouses, and warehouses, following Medallion Architecture.
  • Deploy and optimize data solutions on Azure (preferred), AWS, or GCP with a focus on performance, security, and scalability.
  • Develop and optimize ETL/ELT pipelines for structured and unstructured data from IoT, MES, SCADA, LIMS, and ERP systems.
  • Automate data workflows using CI/CD and DevOps best practices, ensuring security and compliance with industry standards.
  • Monitor, troubleshoot, and enhance data pipelines for high availability and reliability.
  • Utilize Docker and Kubernetes for scalable data processing.
  • Collaborate with the automation team, data scientists, and engineers to provide clean, structured data for AI/ML models.

Requirements

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
  • 7+ years of experience in core data engineering, with a strong focus on cloud platforms such as Azure (preferred), AWS, or GCP.
  • Proficiency in PySpark, Azure Databricks, Python, and Apache Spark.
  • 2 years of experience leading a team.
  • Expertise in relational databases (e.g., SQL Server, PostgreSQL), time series databases (e.g., InfluxDB), and NoSQL databases (e.g., MongoDB, Cassandra).
  • Experience in containerization (Docker, Kubernetes).
  • Strong analytical and problem-solving skills with attention to detail.
  • Good to have: experience with MLOps and DevOps, including model lifecycle management.
  • Excellent communication and collaboration skills, with a proven ability to work effectively as a team player.
  • Comfortable working in a dynamic, fast-paced startup environment, adapting quickly to changing priorities and responsibilities.

This job was posted by Dr Soumya Sahadevan from Tvarit.



Job ID: 149019123

Pune, India

Skills:

Docker, Cassandra, PySpark, PostgreSQL, SQL Server, Apache Spark, Azure Databricks, MongoDB, Kubernetes, InfluxDB