About Hero Vired:
Would you like to be part of an exciting, innovative, and high-growth startup from one of the largest and most well-respected business houses in the country, the Hero Group?
Hero Vired is a premium learning experience offering industry-relevant programs and world-class partnerships to create the change-makers of tomorrow.
At Hero Vired, we believe everyone is made of big things. Backed by the experience and expertise of the Hero Group, Hero Vired is on a mission to change the way we learn. It aims to give learners knowledge, skills, and expertise through deeply engaging, holistic experiences closely mapped to industry, empowering them to transform their aspirations into reality. The focus is on disrupting and reimagining university education and skilling for working professionals by offering high-impact online certification and degree programs.
The renowned US$5 billion Hero Group is a diversified conglomerate of Indian companies with primary interests and operations in automotive manufacturing, financing, renewable energy, electronics manufacturing, and education. The Hero Group (BML Munjal family) companies include Hero MotoCorp, Hero FinCorp, Hero Future Energies, Rockman Industries, Hero Electronix, Hero Mindmine, and BML Munjal University.
For detailed information, visit Hero Vired.
Role: Data Engineer
Job Type: Full Time (Work from Office)
Experience: 2 to 5 years
Location: Gurugram/Delhi/NCR
Department: Technology
Role Overview
We are looking for a fundamentally strong and passionate Data Engineer with deep expertise in relational databases, real-time data streaming, and scalable pipeline architectures. You will design and own the data infrastructure that powers Hero Vired's learning platforms, analytics systems, and operational workflows — ensuring data is reliable, timely, and accessible across the organisation.
Key Responsibilities:
Relational Database Design & Management
- Design, architect, and optimise relational schemas in PostgreSQL for high-volume transactional and analytical workloads.
- Write complex SQL queries, stored procedures, and views; implement indexing strategies and query performance tuning.
- Define and enforce data contracts, ensure schema evolution remains non-breaking, and manage migrations with zero downtime (a migration sketch follows this list).
- Collaborate with backend and product teams to model entities and relationships that reflect real business logic.
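By way of illustration, here is a minimal sketch of the zero-downtime migration pattern referenced above, assuming a hypothetical `orders` table and the psycopg2 driver; the DSN, table, column, and index names are placeholders, not a description of our actual stack.

```python
import psycopg2

# Placeholder DSN; substitute real connection details.
conn = psycopg2.connect("dbname=app user=etl host=localhost")
conn.autocommit = True  # CREATE INDEX CONCURRENTLY cannot run inside a transaction
cur = conn.cursor()

# 1) Add the column as nullable first: a fast metadata change, no table rewrite.
cur.execute("ALTER TABLE orders ADD COLUMN IF NOT EXISTS channel TEXT;")

# 2) Backfill in small batches so row locks stay short.
while True:
    cur.execute("""
        UPDATE orders SET channel = 'web'
        WHERE id IN (SELECT id FROM orders WHERE channel IS NULL LIMIT 10000);
    """)
    if cur.rowcount == 0:
        break

# 3) Build the supporting index without blocking concurrent writes.
cur.execute(
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_channel ON orders (channel);"
)

# 4) Enforce NOT NULL last. This takes a brief lock plus a validation scan;
#    on very hot tables, add a NOT VALID check constraint and VALIDATE it first.
cur.execute("ALTER TABLE orders ALTER COLUMN channel SET NOT NULL;")

cur.close()
conn.close()
```

The ordering matters: nullable column first, batched backfill, concurrent index, constraint last, so no single step holds a long lock on a busy table.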
Streaming & Real-Time Data
- Build and maintain real-time data pipelines using Apache Kafka, including topic design, partitioning strategies, consumer group management, and offset handling (see the consumer sketch after this list).
- Design event-driven architectures for reliable, ordered, and fault-tolerant data flow across microservices and systems.
- Implement and maintain change data capture (CDC) pipelines to sync operational databases with downstream consumers in real time.
- Integrate PowerSync or similar sync frameworks to enable offline-first data availability for client-facing applications.
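As a flavour of the day-to-day work, here is a minimal at-least-once consumer sketch using the confluent-kafka client; the broker address, topic, group id, and `process()` sink are hypothetical.

```python
from confluent_kafka import Consumer

def process(key: bytes, value: bytes) -> None:
    """Placeholder sink; in practice this would upsert into Postgres or a warehouse."""
    print(key, value)

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # placeholder broker
    "group.id": "enrolment-sync",            # hypothetical consumer group
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,             # commit manually, only after a successful write
})
consumer.subscribe(["lms.enrolments"])       # hypothetical topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Kafka error: {msg.error()}")  # a real pipeline would alert here
            continue
        # Keyed messages keep all events for one entity on one partition, preserving order.
        process(msg.key(), msg.value())
        consumer.commit(message=msg)  # at-least-once: offset advances only after processing
finally:
    consumer.close()
```

Disabling auto-commit and committing per message is the simplest way to avoid silently losing events when a sink write fails.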
ETL / ELT Pipeline Development
- Design and maintain robust ETL/ELT pipelines to ingest, transform, and load data from diverse sources (CRM, LMS, marketing platforms, third-party APIs).
- Build pipeline orchestration using tools like Apache Airflow or similar schedulers, with proper error handling, retries, and alerting (a DAG sketch follows this list).
- Ensure data quality through validation, deduplication, reconciliation checks, and lineage tracking at each stage of the pipeline.
- Work closely with analytics and product teams to deliver clean, well-modelled datasets for reporting and decision-making.
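A minimal sketch of the orchestration style described above, as an Airflow DAG with retries and failure alerting; the DAG id, schedule, and task bodies are placeholders for illustration.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

# Illustrative task bodies; real extract/transform/load logic goes here.
def extract(**context): ...
def transform(**context): ...
def load(**context): ...

default_args = {
    "owner": "data-eng",
    "retries": 3,                           # retry transient failures
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,               # wire to the team's alerting channel
}

with DAG(
    dag_id="crm_daily_sync",                # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                      # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args=default_args,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)

    t1 >> t2 >> t3  # linear dependency: extract, then transform, then load
```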
Data Infrastructure & Observability
- Manage and optimise data storage solutions including PostgreSQL, MongoDB, and cloud-based data warehouses (e.g., BigQuery, Redshift, or Snowflake).
- Set up monitoring, alerting, and observability for pipelines to proactively detect failures, lag, and data drift (see the freshness probe after this list).
- Implement data cataloguing and documentation practices to improve discoverability and governance across the data platform.
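For example, a bare-bones freshness probe of the kind this work involves, assuming a timestamptz load column and a hypothetical `fact_enrolments` table; production setups usually export this as a metric (e.g., to Prometheus) rather than printing.

```python
from datetime import datetime, timedelta, timezone

import psycopg2

FRESHNESS_SLA = timedelta(minutes=30)  # assumed SLA; tune per table

def check_freshness(table: str, ts_column: str) -> None:
    """Alert if the newest row in `table` is older than the SLA.

    Identifiers come from trusted config, never user input, so string
    formatting of the table name is acceptable here.
    """
    conn = psycopg2.connect("dbname=warehouse user=monitor")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute(f"SELECT max({ts_column}) FROM {table};")
        latest = cur.fetchone()[0]
    if latest is None:
        print(f"ALERT: {table} is empty")
        return
    lag = datetime.now(timezone.utc) - latest  # assumes a timezone-aware column
    if lag > FRESHNESS_SLA:
        # Placeholder alert; in practice page via Slack/PagerDuty webhook.
        print(f"ALERT: {table} is {lag} behind (SLA {FRESHNESS_SLA})")

check_freshness("fact_enrolments", "loaded_at")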
Collaboration & Data Contracts
- Partner with backend engineers, data scientists, and product managers to define upstream data contracts and downstream consumption needs (a contract sketch follows this list).
- Contribute to a culture of data ownership — champion data quality, documentation, and engineering best practices across teams.
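A small sketch of what an enforced data contract can look like in Python, using pydantic; the event shape and field names are invented for illustration.

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ValidationError

class EnrolmentEvent(BaseModel):
    """Hypothetical contract for an upstream 'enrolment created' event."""
    learner_id: int
    program_code: str
    enrolled_at: datetime
    source: str = "web"   # defaulted so older producers may omit it

def validate_payload(raw: dict) -> Optional[EnrolmentEvent]:
    try:
        return EnrolmentEvent(**raw)
    except ValidationError as exc:
        # Quarantine bad records for inspection instead of silently dropping them.
        print(f"Contract violation: {exc}")
        return None

event = validate_payload({
    "learner_id": 42,
    "program_code": "DSML",
    "enrolled_at": "2024-06-01T10:00:00Z",
})
```

Validating at the boundary turns vague "bad data" incidents into explicit, attributable contract violations.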
Required Skills
- Deep, hands-on expertise in PostgreSQL — schema design, query optimisation, indexing, and migrations.
- Strong understanding of Apache Kafka — producers, consumers, topics, partitioning, and stream processing.
- Experience building and maintaining ETL/ELT pipelines at scale with proper orchestration and error handling.
- Proficiency in Python for data engineering tasks — scripting, transformation logic, and pipeline development.
- Solid understanding of data modelling principles for both OLTP and OLAP use cases.
- Experience with CDC tools (e.g., Debezium) and real-time sync frameworks like PowerSync (a connector sketch follows this list).
- Familiarity with cloud platforms (AWS, GCP, or Azure) and managed data services.
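To make the CDC expectation concrete, a hedged sketch of registering a Debezium Postgres connector through the Kafka Connect REST API; hostnames, credentials, and table names are placeholders, and exact config keys vary between Debezium versions.

```python
import requests

connector = {
    "name": "app-postgres-cdc",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",            # logical decoding plugin built into PG 10+
        "database.hostname": "db.internal",   # placeholders from here down
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "change-me",
        "database.dbname": "app",
        "topic.prefix": "app",                # Debezium 2.x key; 1.x uses database.server.name
        "table.include.list": "public.orders,public.enrolments",
    },
}

resp = requests.post("http://connect.internal:8083/connectors", json=connector)
resp.raise_for_status()  # Kafka Connect returns an error status if creation fails
```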
Nice-to-Haves
- Experience with data warehouse platforms — BigQuery, Redshift, or Snowflake.
- Exposure to dbt for transformation layer management.
- Familiarity with RabbitMQ, AWS SQS/SNS, or other messaging systems.
- Understanding of data governance, lineage tracking, and cataloguing tools.
- Prior experience in EdTech, FinTech, or high-traffic consumer platforms.