
Orbion Infotech

ML Engineer: Python, RAG & LLM (4+ Yrs)

4-6 Years
  • Posted 16 days ago
  • Be among the first 10 applicants

Job Description

Primary Job Title: Machine Learning Engineer (LLM & RAG)

Industry: Enterprise AI / Software & Cloud Solutions
Sector: Large Language Model (LLM) applications, Retrieval-Augmented Generation (RAG), and production ML services for business workflows
Location: India (Remote)

About The Opportunity

Join a fast-moving engineering team building production-grade LLM-powered services and RAG pipelines that enable intelligent search, document understanding, and agentic automation for enterprise customers. You will design, implement, and operate scalable retrieval, embedding, and inference pipelines, turning research-grade models into reliable, low-latency products.

Role & Responsibilities

  • Design and implement end-to-end RAG workflows: document ingestion, embedding generation, vector indexing, retrieval, and LLM inference.
  • Develop robust Python services that integrate Transformers-based models, LangChain pipelines, and vector search (FAISS/Milvus) for production APIs.
  • Optimize embedding strategies, retrieval quality, and prompt templates to improve relevance, latency, and cost-efficiency.
  • Build scalable inference stacks with serving, batching, caching, and monitoring to meet SLA targets for throughput and latency.
  • Collaborate with data scientists and product teams to evaluate model architectures, run A/B tests, and implement continuous retraining/validation loops.
  • Implement observability, CI/CD, and reproducible deployments (Docker-based containers, model versioning, and automated tests).
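The end-to-end RAG workflow in the first bullet can be sketched in a few lines. The block below is a minimal, self-contained illustration of the retrieve-then-generate loop only: a toy hashing-trick embedding and a NumPy similarity matrix stand in for a Transformers encoder and a FAISS/Milvus index, and the final step returns the assembled prompt rather than calling an LLM. All names here (`DOCS`, `embed`, `retrieve`, `answer`) are illustrative, not part of any particular stack.

```python
import zlib

import numpy as np

# Toy corpus standing in for ingested enterprise documents.
DOCS = [
    "invoices are processed nightly by the billing service",
    "the search api returns the top ranked documents",
    "models are versioned and deployed in docker containers",
]

def embed(text, dim=64):
    """Deterministic toy embedding via the hashing trick.

    A real pipeline would call a Transformers embedding model here.
    """
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def build_index(docs):
    # One row per document; in production this matrix would live
    # in a vector store such as FAISS or Milvus.
    return np.stack([embed(d) for d in docs])

def retrieve(query, docs, index, k=2):
    # Cosine similarity reduces to a dot product on unit vectors.
    scores = index @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

def answer(query, docs, index):
    # Inference step: a production service would send the retrieved
    # context plus the query to an LLM; here we return the prompt.
    context = "\n".join(retrieve(query, docs, index))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Swapping the toy pieces for a real encoder and a FAISS index changes the implementations of `embed` and `build_index` but not the shape of the loop, which is the point of the exercise.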

Skills & Qualifications

Must-Have

  • 4+ years of professional experience in ML or software engineering with hands-on LLM/RAG work.
  • Strong Python programming and system-design skills for production services.
  • Experience with Transformers-based models and fine-tuning/inference workflows.
  • Proven experience building retrieval pipelines using vector search (FAISS, Milvus) and embeddings.
  • Familiarity with LangChain or equivalent orchestration libraries for LLM workflows.
  • Practical experience containerizing and deploying ML workloads (Docker, CI/CD, basic infra automation).
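As a concrete (and entirely hypothetical) shape for the containerization experience in the last bullet, a Python model service like the one described might ship with a Dockerfile along these lines; the module path, port, and requirements file are assumptions, not details from this posting.

```dockerfile
FROM python:3.11-slim

WORKDIR /srv

# Pin dependencies for reproducible builds (hypothetical requirements file).
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the service code (hypothetical module layout).
COPY app/ app/

# Serve the API; app.main:app is an assumed FastAPI entry point.
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```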

Preferred

  • Experience with cloud ML infra (AWS, Azure or GCP) and model serving at scale.
  • Familiarity with Kubernetes or other orchestration for production deployments.
  • Experience with retrieval evaluation, relevance metrics, and A/B experimentation.

Benefits & Culture Highlights

  • Fully remote role with flexible hours and an outcomes-driven culture.
  • Opportunity to ship end-to-end LLM products and influence architecture choices.
  • Mentorship-oriented environment with access to modern tools and model stacks.

Why apply: This role offers hands-on ownership of RAG systems and LLM deployment in production, ideal for engineers who want to move fast, optimize for real-world impact, and work with cutting-edge LLM tooling.

Skills: python, backend, rag, llm


Job ID: 134139271