In this Senior Speech R&D role, you will own the end-to-end speech data lifecycle that powers advanced speech and speech-to-speech models. Your primary responsibility is to build, curate, validate, and deliver high-quality datasets that enable robust speech understanding, generation, and conversational interaction.
This role is critical to ensuring that models are trained on clean, diverse, well-annotated, and model-ready data.
Key Responsibilities
Speech Data Curation
· Build datasets supporting:
    o Speech recognition and understanding
    o Multilingual and code-mixed speech
    o Conversational and dialog-style speech
    o Speech generation and synthetic voice data
    o Audio-to-audio conversational scenarios
· Prepare datasets that capture:
    o Speaker variation and continuity
    o Emotional and expressive speech cues
    o Real-world noise and acoustic conditions
    o Conversational turn structure and timing
Ensemble-Based Data Curation
· Implement data curation pipelines using outputs from:
    o Multiple in-house speech models
    o External or open-source speech models
· Aggregate, reconcile, and validate model outputs to:
    o Generate reliable annotations
    o Filter low-confidence samples
    o Detect inconsistencies and label noise
· Apply rule-based and confidence-driven selection strategies.
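
For illustration only, the ensemble idea above (aggregate several model outputs, keep samples where the models agree, drop low-confidence ones) might be sketched as follows. The `Hypothesis` structure, the `select_label` helper, and both thresholds are hypothetical assumptions, not a description of any existing pipeline.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One model's transcript for an utterance (hypothetical structure)."""
    model: str
    text: str
    confidence: float  # assumed to lie in [0, 1]


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are not counted as disagreement."""
    return " ".join(text.lower().split())


def select_label(hyps, min_agreement=2, min_confidence=0.8):
    """Return the consensus transcript, or None if the sample should be filtered out."""
    if not hyps:
        return None
    votes = Counter(normalize(h.text) for h in hyps)
    best_text, count = votes.most_common(1)[0]
    if count < min_agreement:
        return None  # rule-based check: too few models agree, likely label noise
    supporters = [h.confidence for h in hyps if normalize(h.text) == best_text]
    mean_conf = sum(supporters) / len(supporters)
    if mean_conf < min_confidence:
        return None  # confidence-driven check: the agreeing models are unsure
    return best_text


hyps = [
    Hypothesis("model_a", "Hello world", 0.90),
    Hypothesis("model_b", "hello  world", 0.95),
    Hypothesis("model_c", "yellow world", 0.60),
]
print(select_label(hyps))  # hello world
```

Exact-match voting is the simplest possible agreement rule; a real pipeline would more likely compare hypotheses with a word-level distance, but the filter-then-select structure stays the same.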
Validation & Quality Control
· Perform automated validation for:
    o Audio integrity and format consistency
    o Transcript alignment and correctness
    o Language and speaker metadata accuracy
· Run sampling-based manual audits.
· Produce dataset quality reports and summaries.
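
As a rough example of what an automated audio integrity and format check could look like, the sketch below uses Python's standard `wave` module. The `check_wav` helper and its default values (16 kHz, mono, 0.1 s minimum) are assumptions chosen for illustration, not requirements stated in this posting.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1, min_seconds=0.1):
    """Return a list of problems found in a WAV file; an empty list means it passed."""
    try:
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            channels = wav.getnchannels()
            frames = wav.getnframes()
    except (wave.Error, EOFError, OSError) as exc:
        # Corrupt, truncated, or missing files fail the integrity check outright.
        return [f"unreadable: {exc}"]

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate {rate} != {expected_rate}")
    if channels != expected_channels:
        problems.append(f"channels {channels} != {expected_channels}")
    duration = frames / rate if rate else 0.0
    if duration < min_seconds:
        problems.append(f"too short: {duration:.3f}s")
    return problems
```

Returning a list of problems rather than a single boolean makes it easy to tally failure reasons in a dataset quality report.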
Engineering & Operations
· Build scalable, reproducible data pipelines in Python and C++.
· Handle large audio corpora efficiently on Linux systems.
· Generate training-ready manifests and metadata.
· Maintain dataset versions, lineage, and reproducibility.
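
To make "training-ready manifests" and "dataset versions" concrete, one possible approach is a JSON Lines manifest with a content digest that identifies the version. The field names and the `write_manifest` helper below are hypothetical and shown only as a minimal sketch.

```python
import hashlib
import json


def write_manifest(entries, out_path):
    """Write one JSON object per line and return a SHA-256 digest of the manifest content."""
    digest = hashlib.sha256()
    with open(out_path, "w", encoding="utf-8", newline="\n") as f:
        # Sort entries and keys so the same data always yields the same file and digest.
        for entry in sorted(entries, key=lambda e: e["audio_filepath"]):
            line = json.dumps(entry, ensure_ascii=False, sort_keys=True) + "\n"
            f.write(line)
            digest.update(line.encode("utf-8"))
    return digest.hexdigest()


entries = [
    {"audio_filepath": "clips/0002.wav", "text": "good morning", "duration": 1.4, "language": "en"},
    {"audio_filepath": "clips/0001.wav", "text": "hello world", "duration": 1.1, "language": "en"},
]
version_id = write_manifest(entries, "train_manifest.jsonl")
```

Because the output is deterministic, the returned digest can be stored alongside the manifest as a simple lineage record: if the digest changes, the dataset changed.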
Required Skills
· 4+ years of experience in R&D or ML data pipelines.
· Strong Python skills for large-scale data processing.
· Experience working with audio or speech datasets.
· Familiarity with annotation formats and metadata schemas.