Key Responsibilities
- Quality Strategy Ownership: Define the end-to-end testing strategy for GenAI applications (e.g., summarization, conversational flows, content generation), including selecting appropriate methodologies for agentic AI systems.
- Mentorship & Team Guidance: Lead and mentor junior QA team members, providing direction on AI-specific testing practices, prompt engineering, and automation frameworks.
- LLM Evaluation: Validate model outputs for factual accuracy, tone, relevance, and consistency.
- Prompt Engineering & Validation: Test and refine prompt-response workflows to identify hallucinations and edge cases.
- Performance Engineering: Evaluate model latency, throughput, token usage, and API response times.
Required Skills & Experience
- Experience: 5-8 years in software testing, with 1-3 years specialized in testing GenAI/LLM models.
- GenAI Tools: Experience with LLMs (e.g., OpenAI GPT, Claude).
- Cloud Platforms: Familiarity with cloud AI services such as AWS Bedrock and Azure OpenAI.
- Automation Tools: Knowledge of API testing tools (Postman, REST Assured), CI/CD tools (Jenkins, GitHub Actions), and Cypress with TypeScript.
- QA Fundamentals: Strong understanding of Agile methodologies, functional testing, and defect tracking (Jira).
- Root Cause Analysis (RCA): Perform deep-dive analyses of production failures in AI systems to differentiate between code bugs and model behavioral issues.