Job Description
Experience: 4–5 Years
We are building low-latency, high-reliability voice agents powered by the Lyzr platform. You will own the architecture and core systems that enable live voice conversations. The key focus areas are end-to-end latency, robustness, and scalability.
What you'll do
Architect and build the real-time voice pipeline
Drive latency down across the stack
Optimize LLM inference
Collaborate with research and product on model selection and training/finetuning
Ensure reliability and safety with guardrails
Mentor engineers, set best practices for streaming service design, and contribute to technical roadmaps
Minimum Qualifications
3+ years building production distributed systems with a focus on real-time, low-latency, or high-throughput services.
Proficiency in Python and in either Go or Rust.
Hands-on experience with streaming audio/video and protocols such as WebRTC, RTP/SRTP, Opus, or gRPC streaming.
Experience working with a range of speech models and LLMs.
Strong understanding of performance engineering: profiling, async I/O, batching, and cache design.
Track record of shipping reliable services.