Voice AI Prompt Engineer
Location: Bengaluru, Karnataka, India
Experience: 3–6 years
Compensation Range: 20–35 LPA
About the Company
Gnani.ai is a conversational AI company building voice-first, agentic AI platforms that automate customer interactions across voice, chat, and workflows in more than 40 languages, including 12+ Indian languages. The company serves over 200 global enterprises, including Fortune 500 brands, and processes more than 30 million voice interactions every day. Its solutions are used across BFSI, retail, automotive, healthcare, telecom, government, and other enterprise environments where scale, accuracy, and multilingual performance matter.

Gnani.ai's mission is to help businesses deliver better customer experiences and stronger operational outcomes through AI agents that understand context, act in real time, and work reliably in production. The company's vision is to become the world's most trusted Agentic AI platform for secure, intelligent automation across customer touchpoints. Backed by Samsung Ventures and Info Edge Ventures, and selected under the IndiaAI Mission, Gnani.ai combines enterprise credibility with deep product and research capability.
Company Proof Points
- Trusted by over 200 global enterprises, including Fortune 500 brands, which shows enterprise adoption at meaningful scale across demanding use cases.
- Processes more than 30 million voice interactions daily, demonstrating production maturity and the operational depth of its voice AI platform.
- Deployed across BFSI, telecom, government, and enterprise environments, giving teams exposure to high-stakes workflows where accuracy and reliability directly affect business outcomes.
Company Recognition & Credentials
- Backed by Samsung Ventures and Info Edge Ventures, reflecting strong investor confidence in the company's product direction and market opportunity.
- Selected under the IndiaAI Mission to build foundational AI models for India, highlighting national-level recognition and relevance in the AI ecosystem.
About the Role
This role owns prompt design and agentic workflow engineering for a live Voice AI platform. The work spans real-time telephony systems, IVR and outbound bots, multilingual conversational agents, post-call analytics, and multi-step LLM applications. Success depends on understanding ASR noise, spoken-language constraints, and latency budgets, then turning those constraints into prompts and workflows that perform reliably in production. The role also requires close collaboration with product and engineering teams to translate requirements into prompt specifications, structured outputs, and evaluation-ready experiments. The right candidate will treat prompts as production assets, not one-off experiments, and will know how to improve call containment, conversation quality, and downstream enterprise outcomes through disciplined iteration.
Key Responsibilities
- Design ASR-noise-robust prompts for IVR bots, outbound calling agents, intent detection, entity extraction, and dialog state tracking, ensuring outputs remain accurate within real-time latency budgets.
- Optimize spoken responses for TTS delivery so they sound concise, natural, and telephony-appropriate, free of markdown and structured for clear audio delivery.
- Engineer prompt strategies for code-mixed and multilingual inputs such as Hinglish and Tanglish, while defining fallback logic that preserves user understanding when language signals are unstable.
- Build agentic workflows using ReAct-style reasoning, tool calling, and RAG-backed patterns so voice and non-voice enterprise applications can complete multi-step tasks reliably.
- Create prompt evaluation frameworks with voice-specific metrics, regression tests, and versioned prompt libraries so changes can be measured and safely shipped.
- Manage long, multi-turn conversations through summarization, memory injection, and context prioritization, keeping critical information available without overwhelming the model context window.
- Embed prompt logic into production inference pipelines and define structured output schemas that downstream telephony systems can consume without manual cleanup.
- Translate product requirements into prompt specifications and curated instruction datasets, turning prompt experiments into reusable assets for fine-tuning and operational improvement.
Essential Skills & Technologies
- Strong Python experience with the ability to build prompt tooling, evaluation scripts, API integrations, and data-processing flows that support production experimentation.
- Hands-on experience with LLMs and conversational AI, including practical understanding of how model behavior changes when ASR outputs, latency, and spoken-language constraints are introduced.
- Experience with agentic frameworks such as LangChain, LlamaIndex, LangGraph, or equivalent RAG architectures, including chunking, retrieval, and prompt integration.
- Ability to build evaluation pipelines with quantitative metrics, regression testing, and failure-mode analysis so prompt changes can be measured rather than guessed.
- Strong critical reading of model outputs, with the ability to articulate precise failure modes and translate them into prompt or workflow changes.
- Practical understanding of voice AI systems, including telephony integrations, IVR flows, outbound bot behavior, and the downstream effect of noisy ASR transcripts on LLM performance.
- Familiarity with structured outputs, production inference pipelines, and API-based integration patterns that connect prompts to enterprise systems cleanly.
Nice to Have
- Experience with multilingual Indian-language voice systems and code-mixed speech handling in live production environments.
- Exposure to fine-tuning data curation, prompt experiment analysis, or post-call analytics workflows in enterprise AI settings.
- Experience working on systems where call containment, customer experience, and operational efficiency are measured as direct business outcomes.
What You'll Bring
- You bring 3–6 years of experience working with LLMs and a clear understanding of how to turn model capability into dependable product behavior.
- You bring at least 1 year of hands-on experience in voice AI or conversational AI, with direct exposure to ASR output characteristics and their downstream impact.
- You bring strong Python skills, disciplined experimentation habits, and the ability to explain exactly why a prompt or workflow fails and how to fix it.
- You bring experience with agentic and retrieval-based systems, plus the judgment to build output structures that production telephony systems can trust.
Why Join Us
Join a company that is already operating at meaningful enterprise scale, where your prompt decisions directly shape customer conversations and business outcomes. This role gives you the chance to work on live Voice AI systems processing more than 30 million calls per day across BFSI, telecom, government, and other enterprise verticals. You will have access to proprietary multilingual speech datasets, in-house ASR and TTS models, and real call transcripts across more than 40 languages, which means your work can move quickly from experimentation to production impact. The role is well suited to someone who wants to go deeper into the systems side of Voice AI and build durable expertise in prompt engineering, agentic workflows, and production evaluation. There is also a clear growth path into Senior Voice AI Engineer, Voice AI Platform Lead, or LLM Systems Architect, depending on where you want to build your career.
What We Offer
- The opportunity to work on live Voice AI infrastructure with immediate, measurable impact on call containment, user experience, and enterprise efficiency.
- Access to proprietary multilingual datasets, in-house ASR and TTS models, and large-scale production transcripts that make experimentation and learning practical.
- A clear path toward senior technical roles for engineers who want to grow into platform, systems, or architecture leadership.