Primathon

Data Engineer

  • Posted 2 hours ago

Job Description

We are looking for a Data Engineer who can operate as a high-impact Individual Contributor with the depth and ownership of a Tech Lead. This role requires someone who can architect, build, and scale data systems end-to-end, with a strong focus on open-source and self-managed data infrastructure.

Responsibilities

  • Design and build scalable, high-performance data platforms and pipelines.
  • Work on distributed data systems across batch and real-time processing.
  • Take end-to-end ownership from architecture to deployment and optimization.
  • Debug, optimize, and extend open-source data systems.
  • Solve problems at the infrastructure and systems level (not just tool usage).
  • Collaborate across teams and drive data engineering best practices.

Requirements

  • 5+ years of experience in Data Engineering / Data Platform roles.
  • Strong understanding of distributed systems, data storage, and compute layers.
  • Ability to design systems from first principles, not just use tools.
  • Hands-on experience with open-source or self-managed architectures.
  • Strong programming skills in Python or Go (Golang).
  • Experience with system design, performance tuning, and debugging at scale.

Nice To Have

  • Backend engineering experience (APIs, services, system design).
  • Experience working on infrastructure and deployment.
  • Exposure to high-scale or real-time systems.

Key Skills

  • Query and OLAP: Trino, ClickHouse, Apache Pinot.
  • Batch and Stream Processing: Apache Spark (OSS), Apache Flink (OSS).
  • Table Format: Apache Iceberg.
  • Cataloging: AWS Glue.
  • Cloud: AWS.
  • Languages: Go, Python.

This job was posted by Sneha Singh from Primathon.

More Info


Job ID: 148093561
