AI/ML Engineer – RAG

Viamagus

Chennai, India

5-7 Years

Posted 23 hours ago

Job Description

Company Overview

Viamagus is a forward-thinking technology company focused on building intelligent systems and solutions that drive business growth. Our teams work at the intersection of innovation and engineering, solving complex problems with cutting-edge AI/ML technologies. At Viamagus, we invest in our people and foster a culture of collaboration, curiosity, and continuous improvement.

Position Overview

Viamagus is hiring an AI/ML Engineer – RAG: a hands-on engineer specializing in Retrieval-Augmented Generation (RAG) who will design, build, and optimize production-grade systems that ground LLM responses in enterprise knowledge. You will own end-to-end retrieval pipelines, from ingestion and indexing to hybrid search, reranking, and evaluation, ensuring high relevance, low latency, and measurable reductions in hallucinations and answer failures.

Key Responsibilities

RAG Pipeline Design & Production Deployment

Design and implement robust RAG pipelines: ingestion, parsing, chunking, enrichment, embedding, indexing, retrieval, reranking, and answer generation.
Choose and tune retrieval strategies (dense, sparse/lexical, and hybrid) to maximize recall and precision for real enterprise queries.
Build citation/grounding mechanisms and response policies to ensure traceable, trustworthy outputs.

Indexing, Search Quality & Ranking

Implement and optimize vector and hybrid search over structured and unstructured data (documents, wikis, tickets, logs, and metadata).
Develop reranking strategies (cross-encoder, late-interaction, or LLM-based) and fusion methods (RRF/weighted fusion) to improve ranking quality.
Establish query understanding and rewriting techniques (intent classification, expansions, entity/keyword boosting) to improve retrieval robustness.
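
For candidates unfamiliar with the fusion methods named above, the following is a minimal sketch of Reciprocal Rank Fusion (RRF), not a description of Viamagus's actual stack. The function name, the document IDs, and the constant k=60 (a commonly used default) are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense retriever and a lexical (BM25-style) retriever.
dense = ["doc_a", "doc_b", "doc_c"]
lexical = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense, lexical])
print(fused)
```

Documents that rank well in both lists (here doc_b and doc_a) rise to the top without any need to normalize the two retrievers' raw scores, which is the main reason rank-based fusion is often preferred over weighted score fusion as a starting point.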

Evaluation, Guardrails & Continuous Improvement

Define an evaluation harness for retrieval and generation using offline datasets and online telemetry (precision/recall@k, MRR/nDCG, groundedness).
Implement automated regression tests and quality gates for new prompts, retrievers, and model updates.
Create feedback loops using human review and lightweight labeling to improve relevance over time.
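
As a rough illustration of the offline retrieval metrics mentioned above, here is a small sketch of recall@k and MRR over labeled queries. The function names and the sample data are hypothetical, and a real harness would also cover nDCG and groundedness checks.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical labeled queries: (ranked results, set of relevant document IDs).
runs = [
    (["d1", "d2", "d3"], {"d2"}),        # first relevant hit at rank 2
    (["d4", "d5", "d6"], {"d4", "d6"}),  # first relevant hit at rank 1
    (["d7", "d8"], {"d9"}),              # no relevant hit
]
mrr = mean_reciprocal_rank(runs)
recall = recall_at_k(["d4", "d5", "d6"], {"d4", "d6"}, k=2)
```

Metrics like these can back an automated quality gate: a change to a prompt, retriever, or model is blocked if the score on a fixed evaluation set drops below an agreed threshold.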

Performance, Reliability & Cost Efficiency

Optimize latency and throughput using caching, batching, streaming responses, and efficient retrieval/index configurations.
Instrument the full pipeline with logs, metrics, traces, dashboards, and alerting; triage failures with runbooks.
Drive cost-aware design across embedding, retrieval, and generation (token budgets, context windows, adaptive retrieval).

Security, Access Control & Compliance

Implement document-level security and access control in retrieval (ACL-aware indexing, filtering, or query-time authorization checks).
Ensure safe handling of sensitive data, auditability, and compliance with enterprise governance standards.
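
One of the access-control patterns named above, a query-time authorization check, can be sketched as follows. This is an assumption-laden illustration: the Chunk type, its allowed_groups field, and the group names are hypothetical, and production systems usually push such filters into the index query rather than filtering afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A retrieved passage plus the groups allowed to read its source document."""
    doc_id: str
    text: str
    allowed_groups: frozenset


def filter_by_acl(candidates, user_groups):
    """Keep only chunks the user may read, before they reach the LLM prompt."""
    user_groups = set(user_groups)
    return [c for c in candidates if c.allowed_groups & user_groups]


# Hypothetical retrieval results with document-level ACLs.
candidates = [
    Chunk("hr-001", "Salary bands ...", frozenset({"hr"})),
    Chunk("eng-042", "Deployment runbook ...", frozenset({"engineering", "sre"})),
    Chunk("pub-007", "Holiday calendar ...", frozenset({"all-staff"})),
]
visible = filter_by_acl(candidates, user_groups={"engineering", "all-staff"})
visible_ids = [c.doc_id for c in visible]
```

Filtering after retrieval is simple to audit, but it can leave fewer than k results; ACL-aware indexing or index-side filters avoid that at the cost of more complex index maintenance.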

Collaboration & Enablement

Partner with domain owners and engineering teams to prioritize use cases and integrate RAG into products and workflows.
Document best practices and provide reusable templates for ingestion, evaluation, and deployment.

Required Qualifications

Bachelor's degree in Computer Science, Engineering, Data Science, Human-Computer Interaction, or a related field with 5+ years of relevant experience; OR a Master's/PhD with 3+ years of relevant experience.
Strong programming skills in Python and experience with LLM/RAG development in production environments.
Experience with vector databases or search engines and retrieval concepts (ANN indexes, BM25/lexical search, hybrid retrieval).
Experience designing evaluation methods for retrieval and LLM outputs (grounding, relevance, factuality, and regression testing).
Experience building scalable services and APIs (REST/gRPC), with attention to reliability and performance.
Strong understanding of data processing pipelines, metadata design, and information retrieval fundamentals.
Excellent communication skills and ability to work effectively in cross-functional teams.

Preferred Qualifications

Experience with ranking/reranking techniques (cross-encoders, late-interaction, learning-to-rank) and fusion methods (RRF, weighted scoring).
Experience with document parsing for PDFs/HTML and handling tables, diagrams, or mixed layouts.
Experience with observability and SRE practices for AI systems (SLOs/SLIs, incident response, runbooks).
Experience implementing ACL-aware retrieval and security patterns for enterprise knowledge systems.
Experience building prompt/tooling libraries and maintaining multi-model compatibility across LLM providers.