About AiSensy
AiSensy is a WhatsApp-based Marketing & Engagement platform helping businesses like Adani, Delhi Transport Corporation, Yakult, Godrej, Aditya Birla Hindalco, Wipro, Asian Paints, India Today Group, Skullcandy, Vivo, Physicswallah, and Cosco grow their revenues via WhatsApp.
- Enabling 210,000+ Businesses with WhatsApp Engagement & Marketing
- 800 Crores+ WhatsApp Messages exchanged between Businesses and Users via AiSensy per year
- Working with top brands like Delhi Transport Corporation, Vivo, Physicswallah & more
- High impact: businesses drive 25-80% of their revenue using the AiSensy platform
- Mission-Driven and Growth Stage Startup backed by Marsshot.vc, Bluelotus.vc & 50+ Angel Investors
About the Role
You will own the ML systems behind AiSensy's conversational AI stack — serving 200,000+ SMBs across India. This is a deep-IC role with significant architectural influence. You will work directly with senior engineering leadership on systems currently being benchmarked against Intercom Fin and Chatbase.
This is not a research role. You will ship to production, own latency and cost budgets, and be measured on whether real bots stop failing.
Core Responsibilities
1. Conversational AI
- Own the end-to-end ML pipeline for Conversational AI: retrieval quality, tool-calling routing, guardrails, and response synthesis.
- Design and tune hybrid retrieval (BM25 + ColBERT + dense embeddings) on Vector Databases. Build retrieval quality gates that catch failures before they hit users.
- Work across the LangGraph + DSPy orchestration layer (or equivalent orchestration frameworks) on prompt isolation, capability discovery, tool-path selection, and structured-output reliability.
- Evolve the three-tier memory architecture (STM, summary, LTM) spanning Qdrant and Valkey VSS.
- Build guardrail systems (PII detection, advice boundaries, safety) that cleanly separate platform-absolute rules, overridable defaults, and bot-owner rules. No conflation of layers.
- Run rigorous offline and online evals; tie model quality to product KPIs (resolution rate, handoff rate, cost per conversation).
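To make the hybrid-retrieval bullet concrete, here is a minimal sketch of one common way to fuse a lexical (BM25) ranking with a dense-embedding ranking: reciprocal rank fusion. The document IDs and rankings are illustrative, not from any real AiSensy index.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists of doc IDs into one hybrid ranking.

    rankings: ranked doc-ID lists (e.g. one from BM25, one from a dense
    retriever). k=60 is the constant commonly used for RRF.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc_a", "doc_b", "doc_c"]   # lexical ranking (illustrative)
dense_hits = ["doc_c", "doc_a", "doc_d"]  # embedding ranking (illustrative)
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

A document that ranks well in both lists (here `doc_a`) floats to the top, which is the basic property a hybrid retriever is after; production systems typically layer reranking and quality gates on top of a fusion step like this.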
2. Behavioral ML
- Design and productionize representation-learning models that encode multi-turn conversational behavior into embeddings suitable for downstream clustering, retrieval, and personalization.
- Build multi-level behavioral segmentation pipelines — both cross-tenant behavioral archetypes and tenant-scoped business clusters — with incremental updates that stay fresh as new user data arrives.
- Partner with platform engineering on the feature infrastructure spanning raw event storage, behavioral feature computation, vector storage at scale, and low-latency online feature serving for real-time journey orchestration.
- Own the end-to-end ML lifecycle in production: training, batch and online inference, retrain cadence, drift detection, and safe rollback paths.
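As a toy illustration of "incremental updates that stay fresh": a mini-batch clustering model can absorb new behavioral embeddings without a full retrain. The embeddings below are synthetic; cluster count and dimensionality are invented for the sketch.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

# Initial fit on a batch of (synthetic) behavioral embeddings.
model = MiniBatchKMeans(n_clusters=4, random_state=0, n_init=3)
embeddings = rng.normal(size=(500, 32))
model.fit(embeddings)

# Incremental update as new user activity arrives -- no full retrain.
new_batch = rng.normal(size=(50, 32))
model.partial_fit(new_batch)

# Assign a behavioral archetype (cluster ID) to a new user.
archetype = int(model.predict(new_batch[:1])[0])
```

A real pipeline would add drift detection on the cluster assignments and a rollback path when the updated model degrades downstream metrics, per the lifecycle bullet above.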
3. Platform & Production
- Take models from notebook to production on Amazon Bedrock (Nova), SageMaker, and the AWS stack (ECS, ECR, Kubernetes).
- Own latency, cost, and quality budgets — particularly for WhatsApp-scale conversational throughput across multi-tenant workloads.
- Write low-level design documents before implementation; "no LLD, no start" is a team-wide gate. You will also review LLDs from peers.
- Contribute to the evaluation framework that benchmarks us against Intercom Fin, Chatbase, and other category leaders.
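A toy sketch of the kind of budget gate the latency/cost bullet implies: check a tenant's serving stats against budgets before a rollout proceeds. The threshold values and field names here are invented for illustration.

```python
from dataclasses import dataclass

# Invented budgets for illustration only; real values are per-workload.
P95_LATENCY_BUDGET_MS = 2500
COST_BUDGET_USD = 0.02

@dataclass
class ConversationStats:
    tenant_id: str
    p95_latency_ms: float
    cost_per_conversation_usd: float

def within_budget(stats: ConversationStats) -> bool:
    """Deploy gate: both p95 latency and per-conversation cost must pass."""
    return (stats.p95_latency_ms <= P95_LATENCY_BUDGET_MS
            and stats.cost_per_conversation_usd <= COST_BUDGET_USD)

ok = within_budget(ConversationStats("tenant_42", 1800.0, 0.015))
```

The point is that quality work ships only when both budgets hold; in a multi-tenant system a gate like this runs per tenant, not just globally.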
What We're Looking For
Must-Have
- 5+ years building production ML systems, with at least 2 years focused on LLM / conversational AI.
- Deep familiarity with RAG systems — not just wiring them up, but diagnosing retrieval failures, tuning hybrid retrievers, and knowing when not to use RAG.
- Hands-on experience with LangGraph, DSPy, or equivalent LLM orchestration frameworks.
- Strong grasp of vector databases (Qdrant, pgvector, or similar) at non-trivial scale.
- Production experience with a major cloud ML platform (SageMaker preferred; Bedrock, Vertex AI, Azure ML acceptable).
- Solid foundation in classical ML: embedding models, clustering, contrastive learning, evaluation methodology.
- Ability to write LLDs that a senior backend engineer can review and build against.
- Strong Python; experience serving models with FastAPI or a similar framework.
Strong Signals
- You can articulate when an LLM is the wrong tool — e.g., why reaching for RAG on a classification problem is a smell.
- You have shipped multi-tenant ML at SMB scale. Cost per tenant matters to you, not just model accuracy.
- You have tuned MuRIL, IndicBERT, or similar Indic-language models in production.
- Experience with behavioral / sequence modeling for user journeys.
- Full-stack comfort — you can argue about MongoDB vs ClickHouse partitioning, not just model architecture.
Nice-to-Have
- WhatsApp Business API / CPaaS background.
- Contributions to open-source ML tooling.
- Published work on retrieval, conversational AI, or behavioral ML.
How We Evaluate
We value first-principles thinking over pattern-matching. In our interview loop:
- Take-home submissions with runtime crashes are disqualifying. We expect what you submit to run.
- Proposing a generic solution (add RAG, throw an LLM at it) to a problem that does not need it is a signal we read carefully.
- We want engineers who decompose a problem, pick the lightest tool that solves it, and can defend that choice under scrutiny.
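As one illustration of "the lightest tool that solves it": a routing problem with a fixed label set is often a plain supervised classifier, not a RAG pipeline. The intents and training rows below are made up for the sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy intent-routing data (illustrative, not a real dataset).
texts = [
    "where is my order", "track my shipment", "order status please",
    "cancel my subscription", "stop my plan", "unsubscribe me",
    "talk to a human", "connect me to an agent", "need a real person",
]
labels = ["track", "track", "track",
          "cancel", "cancel", "cancel",
          "handoff", "handoff", "handoff"]

# TF-IDF features + a linear classifier: cheap, fast, auditable.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

pred = clf.predict(["please cancel the plan"])[0]
```

A model like this costs microseconds per query and needs no LLM call; reaching for retrieval plus generation here would be exactly the smell the bullets above describe.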