Prompt Engineer

NicheHR Global

India

5-7 Years

This job is no longer accepting applications

Posted 4 months ago

Job Description

Senior Prompt Engineer — Data Science & Quality Analysis

Remote | US Shift

We are building advanced Voice AI systems for some of the largest restaurant and retail brands in the U.S., including several in the top 10. Our AI solutions are already live in production, delivering over 80% accuracy and powering real customer interactions daily. With a $1B market opportunity ahead, this is a pivotal moment to join the team and shape AI products used by thousands of staff and consumers. Be part of a fast-scaling environment where cutting-edge LLM innovation meets real-world business impact.

Essential Job Functions

Design, test, and optimize LLM prompts for conversational AI, text classification, and structured data extraction.
Build evaluation pipelines to analyze prompt performance using metrics, human-in-the-loop feedback, and business KPIs.
Run prompt experiments and regression testing to ensure performance stability, accuracy, and safety as models evolve.
Collaborate with ML, Product, and Operations teams to translate business needs into scalable prompt-engineering strategies that improve accuracy, efficiency, and real-world usability.
Use Python/SQL to inspect outputs, detect anomalies, and automate quality-check workflows.
Document best practices and contribute to internal frameworks for prompt evaluation and continuous improvement.
Communicate insights to technical and non-technical stakeholders, driving measurable improvements in product quality and performance.

Requirements

B.S. or higher in Data Science, Computer Science, Engineering, Linguistics, Philosophy, Cognitive Science, or related fields.
5+ years of relevant experience with a B.S. degree, or 3+ years with a Master's degree.
Strong proficiency in Python for automation, evaluation, and experimentation within LLM workflows.
Proven experience in prompt engineering and working with LLMs (GPT-4, Claude, Gemini, LLaMA, etc.) for text generation, reasoning, and structured data extraction.
Proficiency in Python and SQL for data analysis, evaluation scripting, and workflow automation.
Strong background in A/B testing, statistical analysis, and performance metric design for continuous optimization.
Familiarity with prompt-evaluation tools such as LangFuse, Galileo, and Weights & Biases for experiment management and regression testing.
Deep understanding of advanced prompting techniques, including few-shot prompting, reasoning-based prompting, multi-turn dialogue design, agentic orchestration, and DSPy/AdalFlow-style programmatic prompting.
Experience applying the CO-STAR and TIDD-EC prompting frameworks for structured reasoning, instruction design, and context control in production environments.
Excellent requirement-elicitation and communication skills.
Analytical, process-driven mindset focused on optimizing model behavior, data quality, and operational workflows.
Research experience in language models, prompt engineering, or LLM-based systems is a plus.
Familiarity with LLM architectures, embeddings, and fine-tuning techniques preferred.
Experience with LLM red-teaming, adversarial evaluation, or safety testing is a plus.
Must be flexible to work US hours until at least 5 PM EST and must have a personal system/work setup suitable for remote work.