AI TEST LEAD - AI TEST LEAD

Happiest Minds Technologies

Bengaluru, India

7-9 Years

Save

Posted 2 days ago
Be among the first 10 applicants

Early Applicant

Job Description

Job Description

About the Role

We are seeking a Lead Quality Engineering (QE) Engineer to define, operationalize, and own the quality strategy for our Agentic AI application teams. This leader will be accountable for functional, operational, and security quality across 25+ engineers (AI + UI engineers in India & USA).

This role requires deep awareness of quality challenges unique to LLM- and SLM-powered Agentic AI applications, especially in healthcare and education, where correctness, reliability, and compliance are essential.

Typical Quality Challenges Include

LLM/SLM Latency & Token Efficiency: unpredictable response times, throughput constraints, and cost-performance tradeoffs.
Non-Deterministic Outputs: validating variable responses in sensitive domains (medical correctness, educational appropriateness).
RAG & Vector DB Use Cases: testing retrieval relevance, embedding coverage, semantic accuracy, and fallback handling.
SME-Driven UAT Cycles: unpredictable validation cycles with clinicians or educators.
Operational Risks: agent workflow reliability, cand system behavior under load.
Security Risks: prompt injection, adversarial inputs, data leakage, and access control.

This is a transformational role: you will move the organization from manual QA toward automation-first and AI-driven evaluation, enabling every engineer to take responsibility for quality.

Key ResponsibilitiesQuality Leadership & Culture

Own accountability for end-to-end quality outcomes across 2 global teams (25 engineers).
Champion a shift-left quality culture, embedding testing in design, code reviews, and CI/CD.
Partner closely with AI Engineers to embed quality into day-to-day development.
Partner with the Platform QE Engineering team to ensure AI apps meet platform-level quality and scalability standards.
Partner with Technical Product Managers (TPMs) and Technical Product Owners (TPOs) to ensure quality requirements are captured and addressed.
Define and track team-level quality OKRs and KPIs.

Functional Quality

Architect and implement automation frameworks (UI, backend, API, mobile).
Build evaluation frameworks for:

LLM/SLM non-deterministic responses.
Prompt and agent orchestration reliability.
RAG + Vector DB use cases (retrieval relevance, semantic correctness, failure fallback).
Hallucination detection, bias, fairness, and safety.

Integrate AI evaluation into CI/CD pipelines with dashboards and gating criteria.

Operational Quality (Enablement Role)

Define strategies for load, performance, and reliability testing.
Establish frameworks and test patterns for evaluating latency, concurrency, token efficiency, and response unpredictability.
Ensure teams conduct and observe LnP (Load & Performance) tests and capture quality signals.
Act as an enabler and coach, ensuring practices are scalable and team owned.

Security & Compliance Quality

Collaborate with the Ascend Penetration Testing team to ensure coverage of security risks (prompt injection, adversarial attacks, access control, and data leakage prevention).
Establish additional security validation practices (input/output sanitization for healthcare/education data).
Ensure compliance with Ascend ITGC, PCI, PII, CCPA where applicable.

QualificationsMust Have

7+ years in Quality Engineering/Automation, with 3 years in QA leadership roles.
Proven experience transforming teams from manual QA to automation-first.
Awareness of LLM/SLM quality challenges (latency unpredictability, token inefficiency, hallucinations, SME UAT cycles).
Strong automation expertise (Playwright, PyTest, Cypress, JUnit, REST API testing).
Familiarity with Agentic AI frameworks (LangChain, LangGraph, RAG pipelines, Vector DBs).
Experience in healthcare or education applications with regulatory constraints.
Solid background in CI/CD, DevOps, and cloud-native systems (Azure, Kubernetes, GitHub Actions).

Nice to Have (Big Plus)

Experience with Playwright MCP (multi-context automation) for scaling automation.
Hands-on with AI evaluation tools (Promptfoo, DeepEval, OpenAI Evals).
Familiarity with AI observability & monitoring (Datadog).
Background in AI security testing (prompt injection, adversarial robustness).