About Us
We are building next-generation real-time voice AI solutions that combine Text-to-Speech (TTS), Speech-to-Text (STT), Large Language Models (LLMs), and Speech-to-Speech (S2S) intelligence. Our mission is to create highly interactive and human-like AI agents for global use cases in communication, customer experience, and automation.
We are looking for self-motivated, innovative, and hands-on AI/ML Engineers to join our growing team.
Key Responsibilities
- Research, design, and implement TTS models (primary focus), with strong emphasis on naturalness, prosody, and real-time synthesis.
- Work with STT pipelines to enable accurate, low-latency transcription in noisy and multi-lingual environments.
- Integrate and fine-tune LLMs for conversational AI and dialogue management.
- Build and optimize Speech-to-Speech (S2S) models by combining STT → LLM → TTS pipelines and/or direct end-to-end approaches.
- Optimize models for latency, scalability, and deployment in production environments (cloud + edge).
- Collaborate with product, research, and engineering teams to design APIs, SDKs, and scalable infrastructure.
- Stay updated with cutting-edge research in speech AI and apply it to production-ready systems.
Requirements
- Proven hands-on experience with TTS models (e.g., Tacotron, FastSpeech, VITS, F5-TTS, VALL-E, or other neural TTS systems).
- Strong background in Deep Learning, NLP, and Speech Processing.
- Experience with STT systems (e.g., Whisper, Deepgram, Wav2Vec2, Kaldi, or similar).
- Familiarity with LLMs (e.g., OpenAI models, LLaMA, Mistral, or custom fine-tuned models).
- Solid programming skills in Python and frameworks like PyTorch / TensorFlow.
- Experience deploying models in real-time production systems (e.g., Docker, Kubernetes, REST/gRPC APIs).
- Strong problem-solving skills, self-motivation, and ability to work in a fast-paced, team-driven startup environment.
Nice-to-Have
- Experience with speech enhancement, voice cloning, or emotion modeling.
- Knowledge of multi-lingual or code-switching (Hinglish, etc.) speech systems.
- Familiarity with GPU optimization, quantization, model distillation, or edge deployment.
- Publications or contributions to open-source AI/ML projects.
What We Offer
- Opportunity to work on cutting-edge speech AI with real-world impact.
- Fast-growing startup environment with ownership and innovation freedom.
- Competitive compensation, ESOPs, and performance-based rewards.
- Collaborative, motivated, and research-driven team culture.