Computer Vision Research Engineer

3-5 Years
  • Posted a day ago

Job Description

Role Description

We are hiring a Senior Computer Vision Research Engineer to design and deploy scalable, low-latency video analytics systems for large-scale CCTV networks. The core focus is building best-in-class Vision-Language Models (VLMs) optimized for edge deployment, enabling multimodal reasoning (VQA, semantic search, event description) in resource-constrained environments.

Key Responsibilities:

  • Architect end-to-end pipelines: MOT, Re-ID, action/anomaly detection, scene understanding.
  • Develop and optimize sub-2B-parameter VLMs for edge deployment (e.g., surpassing Moondream2/Qwen2-VL benchmarks) using QAT, PTQ, pruning, distillation, and efficient architectures.
  • Scale real-time processing of thousands of streams with sub-second latency.
  • Profile and resolve bottlenecks in video analytics and multimodal systems.
  • Optimize for edge hardware (Jetson, Coral, Hailo) via TensorRT/OpenVINO/TVM.
  • Design hybrid cloud-edge architectures and production monitoring.

Qualifications:

  • At least 3 years of industry experience in developing and deploying computer vision systems for video analytics at scale.
  • Proven track record of production deployments across large-scale camera networks, including the full lifecycle from prototyping to monitoring.
  • Demonstrated expertise in building and optimizing Vision-Language Models (VLMs) for edge environments, with hands-on experience in architectures like unified embedding, cross-modality attention, or efficient variants (e.g., SmolVLM, LFM2-VL, MobileVLM).
  • Deep understanding of performance bottlenecks in contemporary video analytics and VLM systems (e.g., GPU/CPU saturation, PCIe bandwidth contention, codec latency, drift due to domain shift, high token counts in multimodal processing, and privacy-preserving inference).
  • Hands-on expertise in edge model optimization using TensorFlow Lite, ONNX Runtime, PyTorch Mobile, OpenVINO, TensorRT, or TVM, achieving 25x reductions in latency/memory while maintaining accuracy, including techniques for VLM compression like token pruning or multi-scale pooling.
  • Strong proficiency in Python/C++, with extensive experience in PyTorch/TensorFlow, OpenCV, CUDA, and distributed training/inference frameworks.
  • Solid foundation in modern CV architectures (Transformers, CNNs, hybrid models), real-time tracking algorithms (DeepSORT, ByteTrack, BoT-SORT), and VLM components (e.g., vision encoders like ViT, multimodal pre-training strategies).

What we offer:

  • Competitive compensation package with equity.
  • Comprehensive health benefits and flexible working arrangements.
  • Access to cutting-edge hardware, cloud credits, and conference attendance support.
  • Opportunity to shape the future of AI-powered physical security systems.

Job ID: 138613181