Job Description: Data Analyst, AI Systems
About the Role
We are seeking a Data Analyst to join our AI team, focused on data cleaning, dataset curation, annotation quality control, and model performance analysis for AI systems deployed in real-world environments.
Key Responsibilities
Data Cleaning & Dataset Curation
- Clean, validate, and preprocess customer-generated radar, image, and video datasets
- Identify and resolve corrupted, duplicate, mislabeled, or low-quality data samples
- Curate datasets for training, validation, and testing to support model development
- Maintain dataset versions, metadata, lineage, and documentation
- Ensure data quality standards are consistently met across datasets
Annotation & Labeling Quality Control
- Manage and coordinate external annotation resources (e.g., Upwork contractors or vendors)
- Review labeled images and videos to ensure accuracy, consistency, and guideline adherence
- Provide clear feedback, examples, and labeling instructions to annotators
- Track and report on annotation quality, turnaround time, and recurring issues
- Collaborate with the AI team to design, implement, and maintain a quality control (QC) pipeline for data labeling
Model Performance & Data Analysis Support
- Analyze annotated datasets to identify patterns, errors, and data gaps
- Support AI engineers with model evaluation and error analysis
- Monitor basic model performance metrics such as accuracy, precision, recall, and confusion matrices
- Help identify data-driven improvements to model performance in production environments
Required Qualifications
- 2-5 years of experience as a Data Analyst, Applied Data Scientist, or in a similar role
- Strong proficiency in Python, including Pandas, NumPy, and scripting
- Proven experience cleaning and validating messy, real-world datasets
- Comfort working with unstructured data, such as radar signals, images, videos, or logs
- Working knowledge of SQL; experience with Snowflake preferred
- Understanding of basic ML model evaluation metrics
- Exceptional attention to detail and a strong focus on data quality