Research Analyst – Egocentric AI & Robotics Data Intelligence
About the Role
At Humyn Labs, we are building egocentric video datasets from real-world environments — residential, agricultural, manufacturing, and construction — to train the next generation of robotic AI models.
We believe human-collected, real-world data fundamentally outperforms synthetic or simulation-based data for robotic training. Your job is to prove it.
We are looking for a Research Analyst who can rigorously compare Humyn Labs datasets against sim/synthetic alternatives, publish compelling research that demonstrates the superiority of real human-collected egocentric data, and help position Humyn Labs as the definitive source of ground-truth robotics training data.
What You'll Own
Robotics Model Performance Research
- Evaluate how robotic models trained on Humyn Labs egocentric data perform vs. models trained on synthetic or simulation data
- Benchmark across real-world domains:
  - Residential (household tasks, navigation, object interaction)
  - Agricultural (field operations, crop handling, terrain variability)
  - Manufacturing (assembly, quality inspection, tool use)
  - Construction (site navigation, material handling, safety scenarios)
- Track metrics such as task success rate, generalization, robustness, and sim-to-real transfer gap
- Continuously publish performance comparisons that highlight real-world data advantages
Egocentric Video Dataset Analysis
- Dive deep into Humyn Labs egocentric video datasets to understand what makes them uniquely valuable
- Analyze dataset characteristics including:
  - First-person perspective richness and scene diversity
  - Labeling precision, bounding-box quality, and annotation consistency
  - Temporal depth and action continuity
  - Environmental variability (lighting, motion, noise, terrain)
- Compare against publicly available sim datasets (e.g., AI2-THOR, Habitat, Isaac Sim, CARLA) and synthetic alternatives
- Identify and articulate what differentiates Humyn Labs data quality from that of other vendors
Labeling & Annotation Quality Intelligence
- Develop a structured framework to evaluate and score dataset annotation quality
- Focus on what matters for robotics training:
  - Bounding box precision and consistency
  - Action and event labeling accuracy
  - Depth, pose, and spatial annotation quality
  - Edge case coverage in real-world conditions
- Showcase how Humyn Labs labeling standards outperform industry benchmarks
Research Publishing & Thought Leadership
- Publish research reports, white papers, and blog posts that:
  - Demonstrate the superiority of human-collected data over sim/synthetic data for robotic training
  - Highlight performance gaps when models trained on sim data are deployed in the real world
  - Position Humyn Labs as a pioneer in real-world egocentric robotics data
- Stay current on:
  - Robotics learning research (imitation learning, behavior cloning, reinforcement learning from demonstrations)
  - Egocentric video understanding and first-person AI
  - Sim-to-real transfer literature
  - Competing dataset vendors and benchmark ecosystems
What We're Looking For
- 2–5 years of experience in ML research, robotics data, computer vision, or applied AI
- Strong understanding of robotics training pipelines and data requirements
- Familiarity with egocentric or first-person video datasets (e.g., Ego4D, EPIC-Kitchens, or similar)
- Knowledge of sim/synthetic data platforms (Isaac Sim, AI2-THOR, Habitat, CARLA, or similar)
- Experience with dataset evaluation, annotation quality assessment, or benchmarking
- Ability to write clear, publishable research for both technical and non-technical audiences
- Genuine curiosity about the real-world vs. synthetic data debate in AI
Technical Skills
- Python (mandatory)
- PyTorch or TensorFlow
- Video processing tools (OpenCV, FFmpeg, or similar)
- Familiarity with:
  - Robotics learning frameworks (ROS, LeRobot, or similar)
  - Annotation and labeling tools (CVAT, Scale AI, Labelbox, or similar)
  - Evaluation metrics for robotics and video understanding
- Experience reading and synthesizing ML research papers
- Bonus: hands-on experience with sim environments or robotic datasets
Ideal Mindset
- Well read in robotics AI, egocentric video, and dataset research
- Analytical and detail-oriented — able to spot what makes one dataset better than another
- Passionate about real-world data and its role in making robots actually work
- A strong communicator who can turn data comparisons into compelling research narratives
- Excited to build Humyn Labs' reputation as the gold standard in robotics training data
What Success Looks Like in 90 Days
- First research report published comparing Humyn Labs egocentric data vs. sim/synthetic alternatives in at least one robotics domain
- Benchmarking framework live across 2–3 robotics or video models
- Dataset quality scoring system operational with clear differentiation metrics
- At least 2 domain-specific analyses (e.g., residential and agricultural) highlighting real-world data advantages