
Blue Cloud Softech Solutions Limited

Senior AI Data Management, Training & Quality Engineer

  • Posted 4 days ago

Job Description

BU / FUNCTION DESCRIPTION

We are building a new AI Transformation Center that will integrate with parts of ADITRAC (Accelerated Digital Transformation Center) and become our strategic advisory and technology partner for solving complex challenges on the journey to achieving digital objectives. We work with all business functions within the Transportation Solutions and Sensors BU to drive digitalization and create AI-driven solutions.

Our set-up is based on 3 main pillars to drive and deliver digitalization:

  • Consulting and digital / AI advisory: Partner with each function to better understand challenges and design state-of-the-art solutions and agents
  • AI Solutioning and Technology center: Mastering all technical disciplines and solutions
  • Project Management and Security: Ensuring delivery on time, on budget, and with the necessary security levels.

ROLE OBJECTIVE

AI performance is founded on data quality and data governance. This role ensures the team has the right data, with the right quality and the right controls, so model outcomes are dependable. Own the end-to-end AI data lifecycle - from governed ingestion to training/evaluation datasets, data quality gates, lineage, reproducibility, and run-time monitoring - using AWS + Databricks as the production backbone. Guide and prepare the transformation of Sensors from a dashboard-driven to an AI-driven organization.

RESPONSIBILITIES

AI Data Strategy & Ownership (Operating Model)

Translate AI use cases into data requirements

  • Features, labels, context documents, metadata, refresh cadence, retention rules.
  • Define the AI data products needed for each solution (training set, evaluation set, inference inputs, reference corpora)
  • Develop and maintain an AI data roadmap aligned to the data product roadmap specific for Sensors BU

Develop a data strategy to transform from a dashboard-oriented organization to an AI-first model

  • Collaborate with our DIA Dashboard organization (Philippines spoke team)
  • Develop a data strategy for our Sensors internal databases (e.g. SBI)

Data Ingestion & Curation on AWS + Databricks

  • Build and operate robust ingestion pipelines from enterprise sources into AWS + Databricks
  • Ensure data pipelines are:
  • Incremental (cost-aware)
  • Observed (metrics & logs)
  • Reliable (SLAs for freshness and completeness)
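The three pipeline properties above can be illustrated with a minimal sketch. This is plain Python rather than a Databricks job, and the records, SLA value, and function names are hypothetical, chosen only to show incremental (watermark-based) ingestion, observability (metric logging), and reliability (a freshness SLA check) in one place:

```python
import logging
from datetime import datetime, timedelta, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ingest")

FRESHNESS_SLA = timedelta(hours=6)  # assumed SLA; tune per source

def incremental_ingest(records, last_watermark):
    """Ingest only records newer than the stored watermark (incremental,
    cost-aware), emit metrics (observed), and flag SLA breaches (reliable)."""
    # Incremental: skip anything at or before the last processed timestamp.
    new = [r for r in records if r["updated_at"] > last_watermark]

    # Observed: emit counts so a dashboard/alert can track pipeline health.
    log.info("ingested=%d skipped=%d", len(new), len(records) - len(new))

    # Reliable: warn when data freshness falls outside the agreed SLA.
    latest = max((r["updated_at"] for r in new), default=last_watermark)
    lag = datetime.now(timezone.utc) - latest
    if lag > FRESHNESS_SLA:
        log.warning("freshness SLA breached: lag=%s", lag)
    return new, latest
```

In a real deployment the watermark would be persisted (e.g. in a checkpoint table) so each run resumes where the last one stopped.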

Establish BU-oriented AI Data Governance (Unity Catalog + AWS controls)

  • Leverage Databricks Unity Catalog for table, column, and row-level controls
  • Implement classification & handling standards
  • PII/PCI/Confidential tagging
  • Retention and deletion rules (e.g., right-to-delete)
  • Audit trails and access logging

  • Define and maintain data contracts with source owners for schema, semantics, quality SLAs, and change processes
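A data contract as described above can be enforced mechanically. The sketch below is illustrative, not a Unity Catalog feature: the `CONTRACT` columns and types are hypothetical examples of what a source owner might agree to, and the check flags the breaking schema changes the contract is meant to catch:

```python
# Hypothetical data contract: column names and types agreed with the source owner.
CONTRACT = {"sensor_id": str, "reading": float, "captured_at": str}

def validate_contract(rows, contract=CONTRACT):
    """Flag breaking schema changes against the agreed contract:
    missing columns, unexpected columns, and type mismatches."""
    violations = []
    for i, row in enumerate(rows):
        missing = contract.keys() - row.keys()
        extra = row.keys() - contract.keys()
        if missing:
            violations.append((i, f"missing columns: {sorted(missing)}"))
        if extra:
            violations.append((i, f"unexpected columns: {sorted(extra)}"))
        for col, typ in contract.items():
            if col in row and not isinstance(row[col], typ):
                violations.append((i, f"{col}: expected {typ.__name__}"))
    return violations
```

Running such a check at the ingestion boundary turns the contract's change process into an enforced gate rather than a document.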

Data Quality Engineering (Hard Gates for AI Readiness)

  • Define data quality dimensions and SLAs (AI-specific):
  • Completeness, consistency, timeliness, uniqueness
  • Distribution stability (for drift-sensitive features)
  • Implement automated quality checks:
  • Schema validation (breaking changes)
  • Null/missingness thresholds
  • Referential integrity
  • Distribution checks (mean/variance, quantiles, KL divergence where appropriate)
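A minimal sketch of two of the automated checks listed above: a null/missingness threshold and a simple distribution-stability check. The thresholds are assumptions for illustration, and the z-score on means stands in for the fuller quantile/KL-divergence checks a production gate would run:

```python
import math

NULL_THRESHOLD = 0.05   # assumed: fail if more than 5% of values are missing
DRIFT_Z_LIMIT = 3.0     # assumed: flag if means differ by more than 3 std errors

def missingness(values):
    """Fraction of null values in a column batch."""
    return sum(v is None for v in values) / len(values)

def mean_drift(baseline, current):
    """Distribution-stability check: z-score of the difference in means
    between a baseline batch and the current batch."""
    def stats(xs):
        m = sum(xs) / len(xs)
        var = sum((x - m) ** 2 for x in xs) / len(xs)
        return m, var
    m0, v0 = stats(baseline)
    m1, v1 = stats(current)
    se = math.sqrt(v0 / len(baseline) + v1 / len(current)) or 1e-12
    return abs(m1 - m0) / se

def gate(values, baseline):
    """Hard gate: return False (block training) when a critical check fails."""
    clean = [v for v in values if v is not None]
    return (missingness(values) <= NULL_THRESHOLD
            and mean_drift(baseline, clean) <= DRIFT_Z_LIMIT)
```

Wired into a pipeline, a `False` result from the gate is what blocks training or deployment rather than letting degraded data through silently.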

Consider data quality dashboards & alerting:

  • Pipeline failures and/or data freshness breaches
  • Quality test failures (e.g. block training or deployment when critical checks fail)

Performance & Cost Optimization (AWS + Databricks economics)

  • Optimize data storage and compute:
  • Partitioning strategies and file sizing
  • Delta optimization/compaction strategy
  • Cluster sizing, autoscaling, job scheduling
  • Ensure cost transparency

Production Operations & Support Readiness (Run Phase)

  • Provide operational artifacts and support:
  • Runbooks (pipeline recovery, backfills, reprocessing)
  • On-call / escalation participation for data incidents
  • Root cause analysis for quality issues
  • Ensure observability via SLAs/health checks for critical pipelines

EDUCATION/KNOWLEDGE

Bachelor's degree: Computer Science, Software Engineering, Data Science, Artificial Intelligence / Machine Learning, Applied Mathematics, or Engineering (with strong CS content)

QUALIFICATIONS & EXPERIENCE

  • Data Engineering & Data Management
  • AI / ML Data Foundations
  • Data Quality Engineering
  • Cloud & Platform Fundamentals
  • Platform-Specific Qualifications (Databricks + AWS)

  • Certifications (Optional but highly valuable)
  • Databricks
  • Databricks Data Engineer Professional
  • Databricks Machine Learning Professional
  • AWS
  • AWS Certified Data Analytics - Specialty
  • AWS Solutions Architect (Associate/Professional)

5+ years of overall experience

MOTIVATIONAL/CULTURAL FIT

  • Innovation mindset
  • Problem solving
  • Proactive
  • Working in a fast-paced and dynamic environment
  • Passion for technology
  • Self development
  • Results driven
  • Clear and concise communication both locally and globally


Job ID: 144015309