zorba ai

Data Engineer - PySpark

  • Posted 13 hours ago

Job Description

JOB OVERVIEW

We are seeking a Data Engineer who can work closely with cross-functional engineering, data science, and product teams to design, build, and enhance scalable data pipelines across batch and streaming systems. This role is responsible for maintaining high-quality, high-performance data products that support analytics, machine learning, personalization, and real-time business operations.

The Data Engineer will contribute to quarterly business and technical objectives by modernizing core data assets, improving operational reliability, and ensuring best-in-class data quality standards.

KEY RESPONSIBILITIES

  • Design, build, and maintain scalable, efficient, and reliable data pipelines for ingestion, transformation, and integration across diverse data sources and destinations.
  • Develop and optimize batch workflows (PySpark, SQL, orchestration) and support real-time/streaming pipelines (Kafka or similar) when applicable.
  • Improve pipeline performance, cost efficiency, and scalability across large and complex datasets.
  • Implement and maintain automated data quality checks, regression testing, and validation frameworks to ensure accuracy, reliability, and compliance with organizational standards.
  • Draft and review architectural diagrams and support processes to ensure clarity and alignment across engineering teams.
  • Work closely with engineering, product, and data science partners to deliver high-quality, production-ready data solutions.

REQUIRED SKILLS

  • Data Processing: PySpark, SQL, Spark engine/architecture knowledge, performance tuning
  • Programming: Python (nice to have)
  • Cloud & Platforms: Databricks, Azure
  • Streaming: Kafka or similar streaming platforms (nice to have)
  • Version Control / CI-CD: Git, GitHub, GitHub Actions, CI/CD best practices
  • Collaboration: JIRA, Confluence, MS Teams

PREFERRED TRAITS

  • Understanding of modern software design patterns, data modeling practices, and distributed system fundamentals.
  • Ability to write clean, testable, scalable code following engineering best practices.
  • Follows established architecture patterns and engineering standards across the data platform.
  • Proactively suggests improvements to minimize tech debt and increase reliability.
  • Capable of identifying, triaging, and resolving data defects in both production and non-prod environments.
  • Partners with QA/Automation teams to implement functional and data testing strategies.
  • Experience working in Agile environments with cross-functional engineering teams of 5 or more.
  • Collaborates effectively with outside teams to support product adoption and operational stability.
  • Demonstrates strong technical communication skills: can explain trade-offs, ask the right questions, and provide/receive feedback constructively.

Skills: Spark, PySpark, SQL

Job ID: 145322531