Voice AI Engineer

Lyzr AI

Bengaluru, India

2-4 Years

Posted 2 days ago

Job Description

About Lyzr

Lyzr is a full-stack agent infrastructure platform that helps enterprises build, deploy, and govern autonomous AI agents across sales, marketing, operations, support, and more. Customers use Lyzr to run production agents in their own cloud (AWS / GCP / on-prem) with enterprise-grade observability, security, and control.

Voice is becoming one of the most important front doors to these agents, from AI SDRs doing outbound calls to service agents handling complex customer conversations. That's where you come in.

About the Role

We're hiring a Senior AI Voice Engineer with deep real-time voice AI experience and strong backend engineering skills.

You'll own and scale our end-to-end voice agent pipeline that powers AI SDRs, customer support agents, and internal automation agents on calls. This is a hands-on, highly technical role where you'll design and optimize low-latency, high-reliability voice systems on top of the Lyzr agent platform.

You'll work closely with our founders, product, and platform teams, with significant ownership over architecture, benchmarks, and how voice shows up across all Lyzr agents.

What You'll Do

Own the voice stack end-to-end: from telephony / WebRTC entrypoints to STT, turn-taking, LLM reasoning, and TTS back to the caller.
Design for real-time: architect and optimize streaming pipelines for sub-second latency, barge-in, interruptions, and graceful recovery on bad networks.
Integrate and tune models: evaluate, select, and integrate STT/TTS/LLM/VAD providers (and self-hosted models) for different use cases, balancing quality, speed, and cost.
Build orchestration & tooling: implement agent orchestration logic, evaluation frameworks, call simulators, and dashboards for latency, quality, and reliability.
Harden for production: ensure high availability, observability, and robust fault tolerance for thousands of concurrent calls in customer VPCs.
Collaborate with GTM teams: work with product, sales, and customer teams to prototype new voice experiences (AI SDRs, support agents, internal hotlines) and take them from PoC to production.
Shape the voice roadmap: influence how voice fits into our broader Agentic OS vision (simulation, analytics, multi-agent collaboration, etc.).

You're a Great Fit If You Have

2+ years of software engineering experience (backend or full-stack) in production systems.
Strong experience building real-time voice agents or similar systems using:
STT / ASR (e.g. Whisper, Deepgram, AssemblyAI, AWS Transcribe, GCP Speech)
TTS (e.g. ElevenLabs, PlayHT, AWS Polly, Azure Neural TTS)
VAD / turn-taking and streaming audio pipelines
LLMs (e.g. OpenAI, Anthropic, Gemini, local models)
Proven track record designing and operating low-latency, high-throughput streaming systems (WebRTC, gRPC, websockets, Kafka, etc.).
Hands-on experience integrating ML models into live, user-facing applications with real-time inference & monitoring.
Solid backend skills with Python and TypeScript/Node.js; strong fundamentals in distributed systems, concurrency, and performance optimization.
Experience with cloud infrastructure, especially AWS (EKS, ECS, Lambda, SQS/Kafka, API Gateway, load balancers).
Comfortable working in Kubernetes / Docker environments, including logging, metrics, and alerting.
Startup DNA: at least 2 years in an early or mid-stage startup where you shipped fast, owned outcomes, and worked close to the customer.

Nice to Have

Experience self-hosting AI models (ASR / TTS / LLMs) and optimizing them for latency, cost, and reliability.
Telephony integration experience (e.g. Twilio, Vonage, Aircall, SignalWire, or similar).
Experience with evaluation frameworks for conversational agents (call quality scoring, hallucination checks, compliance rules, etc.).
Background in speech processing, signal processing, or dialog systems.
Experience deploying into enterprise VPC / on-prem environments and working with security/compliance constraints.

Why Lyzr

Massive leverage: Your work becomes the voice of multiple agents across banks, PE firms, manufacturers, and global enterprises.
Greenfield voice platform: You're not just plugging into a legacy stack; you're shaping how voice is done in an Agentic OS from the ground up.
High ownership: Direct access to founders and customers. You'll see your work ship fast and impact real revenue.
Deep tech + real usage: This is where cutting-edge LLMs, voice, and serious enterprise use-cases meet.