doAZ

Computer Vision & Multimodal LLM Intern (Engineering Drawing Analysis Agent)

Fresher

This job is no longer accepting applications

  • Posted 6 months ago

Job Description

Computer Vision & Multimodal LLM Intern (Drawing Change Analysis Agent)

About Doaz

Doaz turns fragmented industrial knowledge into instant, actionable insight. We build LLM- and Vision-AI solutions for construction, heavy industry, and finance—helping teams convert drawings, specs, and regulations into real-time decisions. We're expanding our GeoAI programs (incl. joint work with POSCO E&C) and launching drawing-change detection services that compare plan versions, detect deltas, and explain design impacts.

Why You'll Love Working Here
  • Ship real things: Your models and tools can reach production pilots in weeks.
  • Mentorship, not bureaucracy: Learn directly from senior CV/LLM engineers and domain SMEs.
  • Global crew: 30 teammates across KR / PK / IN; English-first collaboration.
  • Tech playground: YOLO/RT-DETR, Gemma-VL/Qwen-VL/LLaVA, PaddleOCR, LayoutLMv3, Triton—hands-on.
Role Overview

As a CV & Multimodal LLM Intern, you'll support the end-to-end development of a version-aware drawing-diff engine (PDF/DWG raster & vector), symbol/text extraction, and change-impact narratives powered by RAG/LLM. You'll prototype, evaluate, and iterate with fast feedback from real engineering users.

What You'll Do (Intern Scope)
  • Drawing Change Analysis (CV): assist in rasterization, layer parsing, vector geometry ops; train/evaluate detectors (YOLOv8/RT-DETR/SAM); implement geometry-aware post-processing (IoU/topology/snapping).
  • Document & Layout Understanding: combine OCR (PaddleOCR/Tesseract) with layout models (DocFormer/LayoutLMv3/Donut); normalize to structured JSON; help with version-aware entity tracking (gridlines, BH IDs, coordinates).
  • GeoAI & LLM/RAG: set up retrieval (BM25 + vector with reranking); ground LLM answers with citations and clickable evidence; draft change-impact summaries with rule prompts + LLM verification.
  • Productization Basics: package prototypes as FastAPI services or notebooks; write READMEs; contribute datasets, labeling guides, and simple A/B or ablation tests.
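
To give a feel for the geometry-aware post-processing mentioned above, here is a minimal sketch of IoU-based matching between symbols detected in two versions of a drawing. It is an illustration only, not Doaz's actual pipeline: the names `Box`, `iou`, and `diff_boxes`, the greedy matching strategy, and the 0.5 threshold are all assumptions chosen for the example.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (hypothetical type for this sketch)."""
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    iw = min(a.x2, b.x2) - max(a.x1, b.x1)
    ih = min(a.y2, b.y2) - max(a.y1, b.y1)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    return inter / (area_a + area_b - inter)


def diff_boxes(old: list[Box], new: list[Box], threshold: float = 0.5) -> dict:
    """Greedily pair old and new boxes by descending IoU (one-to-one).

    Pairs at or above the threshold count as unchanged; leftover old
    boxes are reported as removed and leftover new boxes as added.
    The threshold of 0.5 is an assumed default, not a recommended value.
    """
    candidates = sorted(
        ((iou(o, n), i, j) for i, o in enumerate(old) for j, n in enumerate(new)),
        reverse=True,
    )
    matched_old: set[int] = set()
    matched_new: set[int] = set()
    unchanged = []
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining candidates overlap too little to match
        if i in matched_old or j in matched_new:
            continue  # each box may be matched at most once
        matched_old.add(i)
        matched_new.add(j)
        unchanged.append((i, j))
    return {
        "unchanged": sorted(unchanged),
        "removed": [i for i in range(len(old)) if i not in matched_old],
        "added": [j for j in range(len(new)) if j not in matched_new],
    }
```

In practice the boxes would come from a detector run on each rasterized drawing version, and a real system would likely add class labels, snapping tolerances, and topology checks on top of this basic overlap test.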
Minimum Qualifications
  • BS/MS student or recent graduate in CS/EE/CE/Geoinformatics/Civil (or similar).
  • Solid Python (3.x); foundations in DS/algorithms, linear algebra, probability.
  • Coursework/projects in CV and/or document AI (detection, segmentation, OCR, layout).
  • Familiar with PyTorch or TensorFlow; Git, Linux, Jupyter.
  • Clear written English; high learning velocity and ownership.
Nice to Have
  • Hands-on with YOLO/RT-DETR/Detectron2/SAM; PaddleOCR/Tesseract; LayoutLMv3/Donut.
  • Exposure to VLMs (Gemma-VL, Qwen-VL, LLaVA), CLIP, rerankers.
  • Experience with engineering drawings/CAD/PDF toolchains.
  • Basic FastAPI, Docker, ONNX/TensorRT/Triton.
  • Frontend (TypeScript/React) for quick review UIs.
Internship Details & Benefits
  • Type/Duration: Paid internship — 4 months (full-time preferred).
  • Compensation (India): Stipend prorated from 6 LPA (INR 600,000 annualized), paid monthly (INR 50,000/month during the internship).
  • For candidates outside India, compensation will be benchmarked to local market equivalents.
  • Conversion: High performers will receive a full-time offer upon successful completion of the 4-month internship.
  • Perks: Mentorship, cloud/GPU credits, real production impact.
Hiring Process (fast)
  1. Intro call (15–20 min).
  2. 48-hour mini task: simple drawing diff or OCR/layout extraction + short README (clarity > polish).
  3. Tech chat (45–60 min): approach, trade-offs, evaluation.
  4. Founder chat on culture & goals.
  5. Offer.
How to Apply

Email [Confidential Information] with subject [CV/LLM Intern – Your Name] and include:

  • Résumé/CV (highlight courses/projects; metrics if available).
  • GitHub or demo links (CV/doc-AI/RAG preferred).
  • Availability (start date, weekly hours).
  • (Optional) A one-page diagram of your Drawing → Revision Detection → Evidence → LLM Narrative pipeline.

Ready to learn fast and turn messy drawings into trusted intelligence? Join Doaz and build with us.

Job ID: 132130921