Senior Data Engineer

TheHireHub.Ai

Gurugram, Gurugram, India

6-9 Years

Posted 2 months ago

Job Description

Lead / Senior Data Engineer

Function: Digital Technology Services

Level: 6-10 years (Data Engineering with exposure to AI / ML workloads)

About the Organization

The organization is a global consulting firm with over 10,000 entrepreneurial, action- and results-oriented professionals across more than 40 countries. It takes a hands-on approach to solving complex client problems and helping organizations reach their full potential. The culture celebrates independent thinkers and doers who create meaningful impact and shape the industry. A collaborative environment, guided by strong core values, defines how teams work and succeed together.

Role Overview

The Senior Data Engineer/Manager is a critical role responsible for building and maintaining trusted, scalable, and AI-ready data foundations that power enterprise analytics, machine learning, and Generative AI solutions.
This role sits at the intersection of data platforms, AI systems, and business consumption layers, ensuring that data used by AI models is accurate, consistent, well-modeled, and governed. The Senior Data Engineer works closely with AI architects, data scientists, backend engineers, and business stakeholders to translate raw operational data into high-quality, reusable data assets and features.
In enterprise AI environments, failures often originate from data gaps, poor semantics, or inconsistent pipelines rather than from models themselves. This role exists to prevent such silent failures by enforcing strong data engineering discipline, observability, and reliability.

Key Responsibilities

Data Architecture & Canonical Modelling

Define and maintain canonical data models across source systems, analytical platforms, and AI consumption layers
Establish and enforce data contracts between data producers and consumers
Manage schema evolution and backward compatibility to protect downstream dependencies
Partner with AI architects to optimize data models for ML and GenAI workloads, including features, embeddings, and metadata
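
As an illustration of the schema evolution and backward compatibility work described above, here is a minimal sketch in plain Python. It is not the organization's tooling: the `Schema` alias, the `breaking_changes` helper, and the sample column names are hypothetical, and the rule that added columns are compatible while removed or retyped columns are breaking is an assumption that a real data contract may define differently.

```python
from typing import Dict, List

# Hypothetical representation: column name -> type name.
Schema = Dict[str, str]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """List changes in `new` that would break consumers relying on `old`.

    Assumption: adding a column is backward compatible; removing a column
    or changing its type is not.
    """
    problems: List[str] = []
    for column, old_type in old.items():
        if column not in new:
            problems.append(f"removed column: {column}")
        elif new[column] != old_type:
            problems.append(
                f"type change on {column}: {old_type} -> {new[column]}"
            )
    return problems


# Hypothetical contract versions for an orders dataset.
v1 = {"order_id": "string", "amount": "double"}
v2 = {"order_id": "string", "amount": "decimal", "currency": "string"}

print(breaking_changes(v1, v2))
```

A check of this kind could run in CI before a producer publishes a new schema version, so that downstream consumers are not broken silently.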

Data Ingestion & Integration

Design, build, and operate robust data ingestion pipelines from enterprise systems, external APIs, files, logs, and streaming sources
Select and implement batch or streaming ingestion patterns based on latency, cost, and business requirements
Build reusable ingestion frameworks and connectors to accelerate onboarding of new data sources

Data Transformation, Quality & Observability

Develop scalable transformation pipelines using PySpark and SQL, optimized for performance and cost
Implement transformations supporting analytics, ML feature engineering, and GenAI grounding use cases
Define and enforce data quality standards covering completeness, accuracy, consistency, and timeliness
Implement automated data validation, reconciliation, anomaly detection, and freshness checks
Build observability dashboards and alerts to detect pipeline failures, data drift, and volume anomalies early
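
To make the completeness and freshness checks above concrete, here is a minimal sketch in plain Python. The helper names (`completeness`, `is_fresh`), the sample rows, and the 24-hour threshold are all hypothetical; in practice such checks would more likely run inside PySpark or a data quality framework.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def completeness(rows: List[Row], column: str) -> float:
    """Fraction of rows with a non-null value in `column`.

    Assumption: an empty batch counts as fully complete; a separate
    volume check would be expected to flag it.
    """
    if not rows:
        return 1.0
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


def is_fresh(last_loaded_at: datetime, max_age: timedelta, now: datetime) -> bool:
    """True if the most recent load is no older than `max_age`."""
    return now - last_loaded_at <= max_age


# Hypothetical sample batch: one of four rows is missing an email.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]

now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)

print(completeness(rows, "email"))
print(is_fresh(last_load, timedelta(hours=24), now))
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the freshness check deterministic and easy to test.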

Semantic Layers, Feature Engineering & Platforms

Design and maintain semantic layers with consistent business definitions for analytics and AI
Build and manage reusable, versioned feature pipelines for ML and GenAI use cases
Ensure feature lineage and traceability back to source systems
Support training, inference-time feature access, and RAG pipelines in collaboration with AI teams
Work with modern data platforms including Databricks, Spark-based processing, and cloud-native storage and compute
Support multi-cloud data architectures across Azure, AWS, and GCP

Security, Governance & Compliance

Implement data access controls, masking, encryption, and secure data handling patterns
Ensure compliance with enterprise security, privacy, and governance standards
Support lineage, auditability, and traceability in collaboration with governance teams
Prepare data assets to support Responsible AI and regulatory compliance initiatives

Collaboration & Delivery

Collaborate with GenAI & Data Solution Architects, Data Scientists, AI Engineers, Backend and Platform Engineers, QA, and Delivery teams
Participate in Agile ceremonies including sprint planning, backlog refinement, and reviews
Provide technical guidance and mentoring to junior data engineers

Experience

6-9 years of experience in data engineering or analytics engineering roles
Proven experience building and operating production-grade data pipelines in enterprise environments
Hands-on experience supporting AI/ML or advanced analytics workloads

Technical Skills