Job Description
A QA Engineer for AI Initiatives is responsible for ensuring the quality, reliability, fairness, and performance of AI/ML-powered products and systems. Unlike traditional QA, this role requires a deep understanding of non-deterministic model behavior, data quality, and AI-specific failure modes such as hallucinations, bias, and model drift.
Key Responsibilities
- Design and execute test strategies tailored to AI/ML models, LLM-based applications, and data pipelines
- Develop automated test frameworks for model validation, regression testing, and performance benchmarking
- Evaluate model outputs for accuracy, consistency, relevance, hallucination, and bias across diverse inputs
- Test RAG (Retrieval-Augmented Generation) pipelines, chatbots, recommendation systems, and other AI-driven features
- Collaborate with data scientists and ML engineers to define acceptance criteria and quality thresholds
- Build and maintain evaluation datasets, ground truth sets, and adversarial test cases
- Monitor models in production for drift, degradation, and anomalous behavior
- Validate data quality, data pipelines, and feature stores that feed AI systems
- Document defects, edge cases, and failure patterns specific to AI behavior
- Ensure AI systems meet ethical, fairness, and compliance standards through bias audits and explainability checks
Required Skills & Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 3–6 years of QA experience, with at least 1–2 years in AI/ML quality assurance
- Strong proficiency in Python for test automation and data analysis
- Familiarity with LLM evaluation frameworks (e.g., RAGAS, DeepEval, Promptfoo, LangSmith)
- Hands-on experience with testing tools such as pytest, Selenium, Postman, or similar
- Understanding of the ML lifecycle: training, validation, deployment, and monitoring
- Knowledge of data quality tools and pipeline testing (e.g., Great Expectations, dbt tests)
Nice to Have
- Experience with prompt engineering and red-teaming LLMs
- Familiarity with MLOps platforms (MLflow, SageMaker, Vertex AI)
- Knowledge of vector databases and embedding quality evaluation
- Understanding of AI safety, responsible AI principles, and fairness frameworks
- Experience with A/B testing and shadow deployment strategies
Soft Skills
- Analytical and inquisitive mindset, with the confidence to challenge model outputs
- Ability to think like both a user and an adversary (red-team thinking)
- Strong documentation and communication skills
- Collaborative approach with data science, engineering, and product teams
- High attention to detail with a quality-first attitude
Preferred Location
Bangalore