Experience: 1-2 Years
Location: Gurugram
About Us
At Josh Talks, we believe voice will be the primary interface between humans and machines.
Our mission:
- Help machines talk like humans
- Build benchmarks & datasets that power global speech AI
- Drive breakthroughs with high-quality, diverse, real-world data (not just algorithms)
Today's models sound robotic because they're slow and inaccurate. Our goals are under 200 ms latency and under 5% word error rate (WER), so conversations feel natural.
We build benchmarks, datasets, and foundational speech models using State Space Models (SSMs) where Transformers fall short.
We're a small, fast team obsessed with precision; every decision is measured, documented, and shipped with intent.
The Role
Join our Dataset team and help build high-quality speech datasets for automatic speech recognition (ASR) across Indian languages. This role blends operations and product, with real ownership and scale.
What you'll do:
- Manage speakers, raters, and vendors for large-scale data collection
- Own onboarding, scheduling, payouts, and issue resolution
- Ensure data quality through clear guidelines and QC checks
- Track dataset and quality metrics
- Work with Product & Tech teams to improve data workflows, tools, and dashboards
Who you are:
- Strong in execution, coordination, and communication
- Detail-oriented with a quality-first mindset
- Interested in AI, speech datasets, or product ops
- Familiarity with Indian languages is a plus