
Yugen.ai

Machine Learning Engineer

  • Posted a day ago

Job Description

We're hiring an ML Engineer to work alongside Data Scientist(s) and support a leading client in the ad tech domain. You will own the infrastructure, low-latency APIs, data pipelines, deployment, and reliability of recommendation and ranking models in production. You'll be the bridge between data science and engineering: taking prototypes from the Data Scientist and turning them into robust, low-latency, high-availability services that operate at ad-tech scale. You should be comfortable with asynchronous communication (written updates, docs, Slack-style collaboration) with both the client and our internal team across time zones.

The candidate will have responsibilities across the following functions:

Model Productionization And Serving

  • Design, build, and maintain low-latency APIs for serving recommendation and ranking models.
  • Take Data Scientist-built models (in Python) and productionize them for real-time or near-real-time serving.
  • Implement and maintain model serving endpoints (e.g., using SageMaker, Vertex AI, custom Docker/Kubernetes-based services, or similar); a minimal serving sketch follows this list.
  • Optimise for low latency and high throughput, suitable for ad-serving workloads.
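For illustration only, here is a minimal sketch of the kind of low-latency scoring endpoint described above, assuming a FastAPI service wrapping a pickled Python model. The framework choice and all names (ranker.pkl, ScoreRequest, /score) are assumptions for this sketch, not part of the client's actual stack.

    # Minimal sketch of a low-latency scoring endpoint (illustrative only).
    # Assumes a pickled scikit-learn-style binary ranker at MODEL_PATH; the
    # path, endpoint, and request schema are hypothetical.
    import pickle
    from pathlib import Path

    from fastapi import FastAPI
    from pydantic import BaseModel

    MODEL_PATH = Path("models/ranker.pkl")  # hypothetical artifact from the Data Scientist

    app = FastAPI()

    # Load the model once at startup so each request only pays for inference.
    with MODEL_PATH.open("rb") as f:
        model = pickle.load(f)


    class ScoreRequest(BaseModel):
        # One feature vector per candidate ad/item to be ranked.
        candidates: list[list[float]]


    class ScoreResponse(BaseModel):
        scores: list[float]


    @app.post("/score", response_model=ScoreResponse)
    def score(req: ScoreRequest) -> ScoreResponse:
        # predict_proba returns [p(negative), p(positive)] per row for a binary ranker.
        scores = model.predict_proba(req.candidates)[:, 1].tolist()
        return ScoreResponse(scores=scores)

In practice such a service would be run behind an ASGI server (e.g., uvicorn) and load-tested against the client's latency and throughput budgets before taking ad-serving traffic.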

Feature Pipelines And Data Engineering

  • Design and build feature pipelines for training and inference:
    • Batch pipelines using tools like Airflow, dbt, Beam, or Spark.
    • Streaming / real-time features using Kafka, Pub/Sub, etc.
  • Design, integrate with, or operate an online feature store to serve low-latency features for real-time scoring (a minimal sketch follows this list).
  • Ensure training-serving skew is minimised; maintain clear contracts for feature definitions and data schemas.
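The posting does not specify the client's feature store, so the following is only a sketch of the online lookup pattern, assuming Redis hashes as the online store; the key layout and feature names are hypothetical. The shared FEATURE_COLUMNS list stands in for the feature contract that keeps the training pipeline and the serving path aligned.

    # Illustrative sketch of an online feature lookup backed by Redis hashes.
    # Redis, the key schema, and all feature names here are assumptions.
    import redis

    # Single source of truth for feature names, shared by the offline training
    # pipeline and the serving path to limit training-serving skew.
    FEATURE_COLUMNS = ["ctr_7d", "impressions_7d", "last_click_age_hours"]

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)


    def write_user_features(user_id: str, features: dict[str, float]) -> None:
        """Called from the batch/streaming pipeline after feature computation."""
        r.hset(f"features:user:{user_id}",
               mapping={col: features[col] for col in FEATURE_COLUMNS})


    def read_user_features(user_id: str) -> list[float]:
        """Called on the request path; falls back to 0.0 for missing features."""
        raw = r.hgetall(f"features:user:{user_id}")
        return [float(raw.get(col, 0.0)) for col in FEATURE_COLUMNS]

The same FEATURE_COLUMNS definition would be imported by the offline training code, so both sides read identical feature definitions and schemas.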

Infrastructure And MLOps

  • Implement CI/CD for ML models and pipelines (e.g., GitHub Actions, GitLab CI, Cloud Build, etc.).
  • Manage containerization and deployment using Docker and Kubernetes (or managed equivalents).
  • Set up and maintain model versioning, configuration management, and rollback strategies (a minimal sketch follows this list).
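As one possible shape for version pinning and rollback, here is a small sketch based on a manifest file that the serving process reads; the real setup would use the client's registry and CD tooling, and the file name and manifest format here are assumptions.

    # Illustrative sketch of version pinning and rollback via a manifest file.
    # The manifest path and format are hypothetical.
    import json
    from pathlib import Path

    MANIFEST = Path("model_manifest.json")  # e.g. {"current": "v7", "previous": "v6"}


    def current_model_version() -> str:
        """The serving process reads this at startup (or on a config reload)."""
        return json.loads(MANIFEST.read_text())["current"]


    def promote(new_version: str) -> None:
        """CD step: promote a newly validated model version to serving."""
        state = json.loads(MANIFEST.read_text())
        state["previous"], state["current"] = state["current"], new_version
        MANIFEST.write_text(json.dumps(state, indent=2))


    def rollback() -> None:
        """Repoint serving to the last known-good version."""
        state = json.loads(MANIFEST.read_text())
        state["current"], state["previous"] = state["previous"], state["current"]
        MANIFEST.write_text(json.dumps(state, indent=2))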

Monitoring, Observability And Reliability

  • Work with the Data Scientist to define metrics and implement monitoring for:
    • Model performance (prediction distribution, drift, business KPIs).
    • System performance (latency, error rates, resource utilisation).
    • Data quality (schema checks, nulls, outliers, volume anomalies).
  • Build alerting and logging using the client's stack (e.g., Prometheus, Grafana, Cloud Monitoring, CloudWatch, etc.); a minimal instrumentation sketch follows this list.
  • Investigate and resolve production issues, from infrastructure to data to model-related problems.
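For illustration, here is a minimal instrumentation sketch using the prometheus_client library, covering the three monitoring angles above (latency, errors, and prediction distribution as a simple drift signal). The metric names, port, and label values are assumptions; the client's actual stack determines the real setup.

    # Illustrative sketch of request-level instrumentation with prometheus_client.
    # Metric names, the port, and label values are hypothetical.
    import time

    from prometheus_client import Counter, Histogram, start_http_server

    REQUEST_LATENCY = Histogram(
        "model_request_latency_seconds", "End-to-end scoring latency", ["model_version"])
    PREDICTION_SCORE = Histogram(
        "model_prediction_score", "Distribution of predicted scores (drift signal)",
        ["model_version"], buckets=[i / 10 for i in range(11)])
    REQUEST_ERRORS = Counter(
        "model_request_errors_total", "Scoring failures", ["model_version"])

    start_http_server(9100)  # exposes /metrics for Prometheus to scrape


    def score_with_metrics(model, features, model_version="v1"):
        start = time.perf_counter()
        try:
            score = float(model.predict_proba([features])[0][1])
            PREDICTION_SCORE.labels(model_version).observe(score)
            return score
        except Exception:
            REQUEST_ERRORS.labels(model_version).inc()
            raise
        finally:
            REQUEST_LATENCY.labels(model_version).observe(time.perf_counter() - start)

Dashboards and alert rules (e.g., in Grafana or Cloud Monitoring) would then be built on top of these series, alongside data-quality checks in the pipelines themselves.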

Experimentation Platform Support

  • Integrate models with the client's A/B testing/experimentation framework.
  • Implement traffic splits, routing logic, and variant toggles (feature flags); a minimal bucketing sketch follows this list.
  • Ensure metrics and logs needed for experiment analysis are correctly captured and accessible.
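As a sketch of deterministic traffic splitting, the snippet below hashes (experiment, user) pairs into buckets so each user consistently sees the same variant. The experiment name, variant weights, and hashing scheme are assumptions; in practice this logic would live inside, or integrate with, the client's experimentation framework.

    # Illustrative sketch of deterministic variant assignment for an A/B test.
    # Experiment name and variant weights are hypothetical.
    import hashlib

    VARIANT_WEIGHTS = {"control": 0.5, "ranker_v2": 0.5}


    def assign_variant(user_id: str, experiment: str = "ranking_model_test") -> str:
        """Hash (experiment, user) into [0, 1) so a user always gets the same variant."""
        digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
        bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform value in [0, 1]
        cumulative = 0.0
        for variant, weight in VARIANT_WEIGHTS.items():
            cumulative += weight
            if bucket <= cumulative:
                return variant
        return "control"  # fallback for floating-point edge cases

    # Example: route the request and log the assignment for experiment analysis.
    variant = assign_variant("user_123")
    print({"user_id": "user_123", "variant": variant})  # would go to structured logs

Logging the assignment alongside prediction and outcome events is what makes the downstream experiment analysis possible.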

Collaboration And Client Interaction

  • Work closely with the Data Scientist to understand modelling assumptions and requirements.
  • Collaborate with the client Product and Engineering teams to align on SLAs, integration points, and architectural choices.
  • Participate in technical discussions with client partners; communicate trade-offs and propose pragmatic solutions.
  • Provide clear async updates (tickets, comments, design docs, status summaries) so both the client and internal teams stay aligned without needing constant meetings.

Requirements

  • Experience: 2-5 years as an ML Engineer / Data Engineer / Software Engineer working on ML-heavy systems.
  • Programming: Strong skills in Python.
  • Cloud: Hands-on experience with GCP or AWS.
  • Data Engineering: Experience building and operating data pipelines (batch and/or streaming) using tools like Airflow, dbt, Beam, Spark, or similar.
  • MLOps / Infra: Experience with:
    • Containerization (Docker) and orchestration (Kubernetes or managed alternatives).
    • CI/CD for services or ML workflows.
    • Monitoring/logging tools (Prometheus, Grafana, CloudWatch, Stackdriver, etc.).
  • Collaboration and Communication:
    • Comfortable working in a remote, async-first environment: writing good design docs, giving structured written updates, and collaborating over Slack/email/tickets with distributed teams.

Nice-to-Have

  • Experience with real-time / low-latency systems, especially in ad tech, recommendation, ranking, or search.
  • Familiarity with feature stores and online feature serving.
  • Familiarity with online experimentation frameworks and traffic routing for A/B tests.
  • Familiarity with model registries and ML platforms (e.g., MLflow, SageMaker, Vertex AI pipelines).
  • Comfort reading Data Scientist code/notebooks and refactoring them into clean, production-ready modules.

This job was posted by Akshay Singh from Yugen.ai.


Job ID: 134396737
