Experience: 6+ years backend/ML development, 1+ year with GenAI/RAG systems
About the Role
We're hiring a Generative AI Developer to build an advanced RAG pipeline with agentic and multimodal capabilities for processing complex contract documents (PDFs with text, tables, and images). You'll work with open-source LLMs, build agent workflows with AgenticX/LangGraph, and design systems that combine vision and language models for deeper contract understanding.
Key Responsibilities
- Build end-to-end RAG pipelines: parse → chunk → embed → retrieve → generate
- Develop agentic workflows using AgenticX, LangGraph, or custom orchestration
- Parse contract PDFs using layout-aware and OCR-based parsers (Unstructured, PyMuPDF, PDFPlumber)
- Enable multimodal processing (text + images/tables) using models like GPT-4V, LLaVA, Donut
- Integrate open-source models (Mistral, Llama 3), served through runtimes such as Ollama, and vector databases (Milvus, FAISS, Qdrant)
- Build modular APIs (FastAPI) for Q&A, summarization, and classification across documents
- Optimize retrieval/generation for long-context and multi-document inputs
Required Skills
- 6+ years in backend/ML; 1+ year in GenAI, RAG, or agentic architectures
- Strong Python skills; experience with LangChain, LlamaIndex, Haystack
- Deep experience in PDF parsing (text, tables, images) and chunking strategies
- Familiarity with multimodal models (e.g., GPT-4V, LLaVA, or equivalents)
- Hands-on experience with vector DBs, FastAPI, and Redis/Postgres
- Understanding of legal/contract structure and NLP challenges
Bonus Skills
- Experience with LangGraph, AgenticX, or custom multi-agent systems
- Familiarity with multimodal fusion techniques and prompt engineering
- Knowledge of document layout analysis and long-form reasoning
- Prior work in legal tech, compliance, or enterprise contract analytics