Drive the next wave of generative and multimodal intelligence that powers our autonomous robots.
As a core member of the AI Research team, you'll turn cutting-edge vision-language and diffusion advances into robust, real-time systems that see, reason, and act on dynamic construction sites.
Key Responsibilities
- Research and develop diffusion-based generative models for photorealistic wall-surface simulation, defect synthesis, and domain adaptation
- Architect and train Vision-Language Models (VLMs) with Vision-Language Alignment (VLA) objectives that connect textual work orders, CAD plans, and sensor data to pixel-level understanding
- Lead development of auto-annotation pipelines (active learning, self-training, synthetic data) that scale to millions of frames and point clouds with minimal human effort
- Optimize and compress models (INT8, LoRA, distillation) for deployment on Jetson-class edge devices under ROS 2 (see the quantization sketch after this list)
- Own the full lifecycle: problem definition, literature review, prototyping, offline/online evaluation, and production hand-off to perception & controls teams
- Publish internal tech reports and external conference papers; mentor interns and junior engineers
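To make the model-compression bullet concrete, below is a minimal sketch of post-training dynamic INT8 quantization in PyTorch. The tiny network is a hypothetical stand-in for a perception head, and the layer sizes are illustrative assumptions, not project details:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a perception head; sizes are illustrative only.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Post-training dynamic quantization: Linear weights are stored as INT8,
# and activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([1, 10])
```

On Jetson-class hardware, INT8 deployment typically goes through TensorRT calibration rather than PyTorch's CPU-side dynamic quantization; the sketch above only illustrates the general shape of the workflow.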
Qualifications & Skills
- 3+ years in deep-learning R&D, or a Ph.D./M.S. in CS, EE, Robotics, or a related field with a strong publication record
- Demonstrated expertise in diffusion models (DDPM, LDM, ControlNet) and multimodal transformers/VLMs (CLIP, BLIP-2, LLaVA, Flamingo); a minimal CLIP-style sketch follows this list
- Proven success building large-scale, data-centric AI workflows: active learning, pseudo-labeling, weak supervision
- Advanced proficiency in Python, PyTorch (or JAX), experiment tracking, and scalable training (PyTorch Lightning, DeepSpeed, Ray)
- Familiarity with edge-AI runtimes (TensorRT, ONNX Runtime) and CUDA/C++ performance tuning
- Strong mathematical foundation (probability, information theory, optimization) and ability to translate theory into production code
- Bonus: experience with synthetic data generation in Isaac Sim or robotics perception stacks (ROS 2, Nav2, MoveIt 2, Open3D)
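For a flavor of the VLM expertise listed above, here is a minimal sketch of zero-shot image-text matching with a public CLIP checkpoint via Hugging Face `transformers`. The image path and text prompts are made-up examples, not project assets:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Public checkpoint, used purely for illustration.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

# Hypothetical site photo and work-order-style prompts.
image = Image.open("wall_section.jpg")
texts = ["a cracked concrete wall", "a freshly painted drywall surface"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-to-text similarity scores
probs = logits.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```

The same zero-shot pattern generalizes to other contrastive VLMs by swapping the checkpoint and processor.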
Why join us
- Own breakthrough tech from idea to autonomous robot on active job sites: your work leaves the lab fast
- Collaborate cross-functionally with perception, controls, and product teams, and publish at top venues with company support
- Shape an industry by replacing dangerous, repetitive construction labor with intelligent robots
- Competitive salary + equity, hardware budget, flexible hybrid work, and a culture that prizes deep work and rapid iteration