
Company: Kuinbee
Location: Pune, Maharashtra
Mode: Hybrid
Role Type: Full-Time
About Kuinbee
Kuinbee is building a unified data ecosystem that combines a scalable data marketplace with
an end-to-end AI-driven pipeline. Our platform enables automated ingestion,
transformation, quality checks, lineage tracking, modelling, and metadata intelligence,
allowing organisations to integrate, manage, and operationalise their data with minimal
engineering effort. By merging marketplace accessibility with intelligent automation,
Kuinbee aims to redefine how modern data systems are built, governed, and scaled.
Role Overview
The ideal candidate will have deep knowledge of end-to-end data workflows, strong
architectural thinking, and the ability to translate engineering processes into modular,
automated agents. You will work closely with the product and AI teams to formalise the
logic that powers Kuinbee's data automation platform.
Key Responsibilities
Document complete pipeline flows from source to serving, including raw, clean,
transformed, and model-ready stages.
Identify technical pain points in real-world pipelines, including failure modes, schema
drift, refresh inconsistencies, and orchestration issues.
Demonstrate how heterogeneous sources such as databases, APIs, files, and streams are
combined, validated, modelled, and monitored.
Present two to three real pipelines you have built, including architecture diagrams, key
design decisions, and failure-recovery strategies.
Collaborate with AI engineers to design agent equivalents for schema mapping, data
cleaning, transformations, validation, and lineage.
Define metadata requirements for Kuinbee's Supermemory Layer to support governance,
semantic consistency, and automated monitoring.
Core Requirements
5+ years of experience building and maintaining production data pipelines end to end.
Expertise with relational databases such as Postgres, MySQL, or SQL Server.
Experience with data warehouses including BigQuery, Snowflake, or Redshift.
Familiarity with file formats such as Parquet, CSV, and Excel, along with API-based and
streaming data sources.
Advanced skills in SQL, Python, and modern transformation frameworks such as dbt.
Hands-on experience with Spark, Dask, or other distributed compute engines.
Experience with data quality and observability tools such as Great Expectations, Soda, or
Deequ.
Knowledge of lineage systems such as OpenLineage, DataHub, or OpenMetadata.
Strong data modelling foundation, including star schemas, semantic layers, metrics, and
feature preparation.
Experience with orchestration frameworks such as Airflow, Dagster, or Prefect.
Understanding of performance optimisation, including partitioning, indexing, clustering,
and query planning.
Exposure to machine learning workflows integrated into data pipelines, such as feature
engineering and inference paths.
Ability to design, reason about, and evaluate modern data architecture.
Compensation: Paid (Contract Based)
How to Apply
Send your CV or portfolio to [Confidential Information].
Applicants who include examples of real pipelines or architecture documents will receive
priority consideration.
Job ID: 133385007