Freelance Gen AI Testing QA (Senior)

WillWare Technologies

Coimbatore, India

5-8 Years

Posted 2 days ago

Job Description

Role: Freelance Gen AI Testing QA Evaluation Engineer (Senior)

Company Name : WillWare Technologies

Work Model : Remote / Contract / Full-time

Experience : 5-8 Years

Work Location : Chennai/Bangalore/Kochi/Jaipur/Coimbatore/Remote

Experience:

• 5-8 years in QA automation, with 1-3 years in GenAI / API-based testing.

Key Responsibilities:

• Develop and maintain automated evaluation pipelines.

• Implement evaluation scripts using Python frameworks (e.g., DeepEval, custom frameworks)

• Integrate LLM/Chatbot APIs and agent workflows into evaluation pipelines

• Execute dataset-driven evaluations, then capture and process the responses.

• Support manual test scenario execution and validation

• Assist in dataset creation and enrichment

• Generate evaluation reports and logs

• Debug and troubleshoot execution issues.

• Enable CI/CD integration for continuous evaluation.

Key Skills :

Core GenAI Evaluation Skills:

• Experience with evaluation frameworks (e.g., DeepEval or Arize)

• Understanding of LLM-as-a-Judge methodology (e.g., G-Eval)

• Strong prompt engineering and evaluation design skills

• Experience in manual evaluation of LLM outputs.

Technical Skills:

• Strong programming in Python

• Experience in API testing and integration

• Proficiency in JSON handling, parsing, and data processing

• Automation framework development/integration.

• Knowledge of logging, reporting, and debugging tools

Agent Manual Testing & Dataset Skills:

• Experience in:

o Test scenario creation for GenAI use cases

o Manual validation of LLM responses (qualitative assessment)

o Dataset creation and curation

o Writing expected outputs or golden answers.

• Ability to design edge cases, negative scenarios, and adversarial inputs (e.g., prompt injection, jailbreaks)

Domain & QA Skills:

• Strong foundation in software testing principles:

o Functional, integration, and regression testing

• Experience in test design, defect tracking, and reporting.

• Strong analytical and problem-solving skills.

• Conversational AI testing experience.

• Understanding of AI agent behavior, workflows, and edge cases.