Job Description

Lead / Senior Data Engineer

Function: Digital Technology Services

Level: 6-10 years (Data Engineering with exposure to AI / ML workloads)

About the Organization

The organization is a global consulting firm with over 10,000 entrepreneurial, action- and results-oriented professionals across more than 40 countries. It takes a hands-on approach to solving complex client problems and helping organizations reach their full potential. The culture celebrates independent thinkers and doers who create meaningful impact and shape the industry. A collaborative environment, guided by strong core values, defines how teams work and succeed together.

Role Overview

  • The Senior Data Engineer/Manager holds a critical role, responsible for building and maintaining trusted, scalable, and AI-ready data foundations that power enterprise analytics, machine learning, and Generative AI solutions.
  • This role sits at the intersection of data platforms, AI systems, and business consumption layers, ensuring that data used by AI models is accurate, consistent, well-modeled, and governed. The Senior Data Engineer works closely with AI architects, data scientists, backend engineers, and business stakeholders to translate raw operational data into high-quality, reusable data assets and features.
  • In enterprise AI environments, failures often originate from data gaps, poor semantics, or inconsistent pipelines rather than from models themselves. This role exists to prevent such silent failures by enforcing strong data engineering discipline, observability, and reliability.

Key Responsibilities

Data Architecture & Canonical Modelling

  • Define and maintain canonical data models across source systems, analytical platforms, and AI consumption layers
  • Establish and enforce data contracts between data producers and consumers
  • Manage schema evolution and backward compatibility to protect downstream dependencies
  • Partner with AI architects to optimize data models for ML and GenAI workloads, including features, embeddings, and metadata

Data Ingestion & Integration

  • Design, build, and operate robust data ingestion pipelines from enterprise systems, external APIs, files, logs, and streaming sources
  • Select and implement batch or streaming ingestion patterns based on latency, cost, and business requirements
  • Build reusable ingestion frameworks and connectors to accelerate onboarding of new data sources

Data Transformation, Quality & Observability

  • Develop scalable transformation pipelines using PySpark and SQL, optimized for performance and cost
  • Implement transformations supporting analytics, ML feature engineering, and GenAI grounding use cases
  • Define and enforce data quality standards covering completeness, accuracy, consistency, and timeliness
  • Implement automated data validation, reconciliation, anomaly detection, and freshness checks
  • Build observability dashboards and alerts to detect pipeline failures, data drift, and volume anomalies early

Semantic Layers, Feature Engineering & Platforms

  • Design and maintain semantic layers with consistent business definitions for analytics and AI
  • Build and manage reusable, versioned feature pipelines for ML and GenAI use cases
  • Ensure feature lineage and traceability back to source systems
  • Support training, inference-time feature access, and RAG pipelines in collaboration with AI teams
  • Work with modern data platforms including Databricks, Spark-based processing, and cloud-native storage and compute
  • Support multi-cloud data architectures across Azure, AWS, and GCP

Security, Governance & Compliance

  • Implement data access controls, masking, encryption, and secure data handling patterns
  • Ensure compliance with enterprise security, privacy, and governance standards
  • Support lineage, auditability, and traceability in collaboration with governance teams
  • Prepare data assets to support Responsible AI and regulatory compliance initiatives

Collaboration & Delivery

  • Collaborate with GenAI & Data Solution Architects, Data Scientists, AI Engineers, Backend and Platform Engineers, QA, and Delivery teams
  • Participate in Agile ceremonies including sprint planning, backlog refinement, and reviews
  • Provide technical guidance and mentoring to junior data engineers

Experience

  • 6-9 years of experience in data engineering or analytics engineering roles
  • Proven experience building and operating production-grade data pipelines in enterprise environments
  • Hands-on experience supporting AI/ML or advanced analytics workloads

Technical Skills

  • Strong proficiency in Python, PySpark, and SQL
  • Hands-on experience with Databricks or similar Spark-based platforms
  • Experience with data ingestion tools, connectors, and APIs
  • Familiarity with feature engineering pipelines and AI data requirements
  • Exposure to cloud and multi-cloud data platforms (Azure, AWS, GCP)

Job ID: 139015117
