CarDekho Group

Lead AI Engineer

  • Posted 15 hours ago

Job Description

CarDekho Group is hiring a Lead GenAI Engineer to own the technical direction of two adjacent product surfaces:

  1. OEM AI Solutions - agentic chatbot and assistant products we ship to automotive OEMs (Maruti, Hyundai, Tata, Mahindra, Kia, etc.) for lead qualification, customer support, dealer enablement, and post-sales engagement.
  2. Consumer Product Experience - in-product AI features across CarDekho.com and the app: buying assistants, comparison agents, personalized recommendations, conversational search, and content generation embedded in the model/variant/dealer journey.

You will set the architecture, codify the eval and safety bar, and lead a small pod of GenAI engineers shipping production agents that get measurably better month over month, not just demos.

What you'll own

Agentic systems and orchestration

  • Design and ship multi-agent systems for OEM and consumer flows: planner / executor / critic patterns, tool-using agents, and human-in-the-loop checkpoints for high-stakes turns (price quotes, lead capture, booking).
  • Build on Pydantic-AI as the orchestration layer — own the patterns for typed agents, dependency injection, structured outputs, and graph-style multi-step flows. Evolve the stack based on observability, latency, and cost trade-offs.
  • Build and maintain MCP servers that expose CarDekho's catalogue, pricing, dealer inventory, BigQuery analytics, and CRM as first-class tools for our agents and partner agents.
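
To make the human-in-the-loop checkpoint concrete, here is a minimal sketch in plain Python. It does not use Pydantic-AI; the names `AgentTurn`, `HIGH_STAKES`, and `route_turn` are hypothetical, and the set of high-stakes intents is simply the three examples named above.

```python
from dataclasses import dataclass

# Hypothetical intent labels, taken from the high-stakes examples in the list above.
HIGH_STAKES = {"price_quote", "lead_capture", "booking"}


@dataclass
class AgentTurn:
    intent: str       # what the agent is about to do on this turn
    draft_reply: str  # the reply proposed by the executor agent


def route_turn(turn: AgentTurn) -> str:
    """Return 'human_review' for high-stakes intents, otherwise 'auto_send'."""
    return "human_review" if turn.intent in HIGH_STAKES else "auto_send"
```

In a real flow the review branch would pause the agent and hand the draft to a person; here it only returns a label so the routing rule can be tested on its own.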

Models and prompting

  • Make build-vs-buy calls across the current frontier: Claude 4.x (Opus / Sonnet / Haiku), GPT-5, Gemini 2.5, Llama 4, Mistral, and reasoning-tier models. Match model to job — don't pay Opus prices for classification.
  • Apply prompt caching, batch APIs, structured outputs, and extended-thinking budgets to keep unit economics viable at OEM scale (millions of conversations/month).
  • Define and enforce prompt engineering standards: versioning, A/B testing, regression suites, and prompt-as-code reviews in PRs.
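
As a rough illustration of "match model to job", the sketch below routes each task type to a model tier and estimates monthly spend. Everything in it is an assumption made for illustration: the tier names, the task labels, and especially the prices, which are placeholders and not any vendor's real pricing.

```python
# Placeholder prices per 1,000 tokens. These are made-up numbers, not vendor pricing.
PRICE_PER_1K_TOKENS = {"small": 0.001, "large": 0.015}

# Hypothetical mapping from task type to the cheapest tier assumed to be good enough.
TIER_FOR_TASK = {
    "classification": "small",
    "extraction": "small",
    "negotiation": "large",
}


def pick_tier(task: str) -> str:
    # Unknown tasks fall back to the large tier so quality is not silently degraded.
    return TIER_FOR_TASK.get(task, "large")


def monthly_cost(task: str, conversations: int, tokens_per_conversation: int) -> float:
    """Estimated monthly spend for one task type at the routed tier."""
    tier = pick_tier(task)
    total_tokens = conversations * tokens_per_conversation
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS[tier]
```

With these placeholder numbers, a million 500-token classification calls cost 15x less on the small tier than on the large one, which is the kind of gap the routing decision is meant to capture.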

Evals, observability, and safety

  • Build the eval harness that gates every prompt, model, and agent change: golden sets, LLM-as-judge with calibrated rubrics, automated red-teaming, and online quality metrics tied to business KPIs (lead conversion, T2L, deflection, CSAT).
  • Stand up production observability (Langfuse / LangSmith / Arize / Helicone): traces, token spend per intent, latency p95s, tool-call success rates, and hallucination flags.
  • Own the safety and guardrails layer: PII handling, jailbreak resistance, brand-safety filters per OEM, factuality checks on price/spec/availability, and escalation paths to human agents.
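
One possible shape for an eval gate over a golden set is sketched below. It is an assumption about how such a gate could work, not a description of CarDekho's harness: `CaseResult`, `gate`, and the `max_drop` tolerance are hypothetical names, and each pass/fail flag is assumed to come from an upstream grader such as an LLM-as-judge rubric.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    case_id: str
    critical: bool        # e.g. a price, spec, or availability factuality case
    baseline_pass: bool   # grader verdict for the current production prompt
    candidate_pass: bool  # grader verdict for the proposed change


def gate(results: list[CaseResult], max_drop: float = 0.0) -> bool:
    """Allow a change only if no critical case regresses and the overall
    pass rate does not fall by more than max_drop."""
    if not results:
        return False  # no evidence, no ship
    critical_regression = any(
        r.critical and r.baseline_pass and not r.candidate_pass for r in results
    )
    base_rate = sum(r.baseline_pass for r in results) / len(results)
    cand_rate = sum(r.candidate_pass for r in results) / len(results)
    return not critical_regression and cand_rate >= base_rate - max_drop
```

A production gate would also need a large enough golden set and a significance check before trusting a small difference in pass rate; this sketch only shows the blocking logic.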

Multimodal

  • Use vision models for damage assessment, document understanding (RC, insurance), and image-grounded comparison features.

Leadership

  • Lead a pod of 3–6 GenAI / ML engineers. Set the technical bar, run design reviews, mentor on prompt craft and eval rigor, and unblock.
  • Partner deeply with Product, Design, Data Science, and OEM account teams — translate ambiguous business problems into agent specs with measurable success criteria.
  • Represent CarDekho in technical conversations with OEM CTOs and partner platforms; influence the AI roadmap shipped to millions of car buyers.

What we expect you to bring

Must-have

  • 4+ years in ML / AI engineering, with 2+ years shipping LLM-powered products to production (not just notebooks or POCs).
  • Demonstrated ownership of at least one agentic system in production: multi-step tool use, state management, recovery from tool failures, and a real eval story.
  • Strong Python; comfortable with async patterns, FastAPI / equivalent, and the modern GenAI stack (Pydantic-AI, LiteLLM, Pydantic, observability SDKs).
  • Hands-on with at least two of: Claude, GPT, Gemini, open-weights (Llama / Mistral / Qwen) — and informed opinions on when to use which.
  • Eval-driven mindset — you can describe how you'd prove a prompt change is better, not just feel that it is.
  • Solid statistics and ML fundamentals — you can read a calibration plot, design an A/B test, and reason about distribution shift.

Good to have

  • Experience with MCP (servers and clients), function-calling at scale, or agent-to-agent protocols.
  • Practical RAG / retrieval understanding: chunking strategy, embedding model choice, hybrid search (BM25 + dense), reranking, and how to debug "the agent can't find X" without guessing. Hands-on with at least one vector store (pgvector, Qdrant, Pinecone, Weaviate).
  • Voice / real-time stack experience (OpenAI Realtime, Deepgram, ElevenLabs, telephony integration, VAD, barge-in, sub-second TTS).
  • Worked with BigQuery / large analytical stores as agent tools, or have built text-to-SQL systems with guardrails.
  • Prior experience in automotive, marketplace, or high-intent consumer commerce — understanding lead funnels, dealer ecosystems, or large product catalogues.
  • Open-source contributions, papers, or talks in the agentic space.

How you work

  • You ship. You instrument. You iterate against numbers.
  • You push back with reasons when a roadmap item is the wrong bet, and you propose the alternative.
  • You write down decisions: eval results, model choices, failure modes.
  • You treat OEM trust and consumer safety as engineering constraints, not afterthoughts.

Why this role is interesting

  • Real scale: 30M+ monthly users on CarDekho consumer surfaces and direct OEM deployments that change how India buys cars.
  • A greenfield-but-not-toy mandate — existing data infra (BigQuery, GA4, the full inventory + leads stack) to build agents on top of, with budget and exec sponsorship to actually ship.
  • Genuine technical latitude on the agentic stack, with a sharp eval culture to keep us honest.

Job ID: 148328569
