A leader in the Enterprise AI & Cloud Solutions sector, we build production-grade Generative AI and LLM-powered products, along with knowledge-driven applications, for enterprise customers across search, automation, and decision-support domains. We are expanding our Bangalore engineering hub and seeking a senior on-site architect to design, deploy, and operationalize LLM and RAG solutions that run at scale.
Role & Responsibilities
- Architect end-to-end Generative AI solutions: model selection, RAG design, embedding pipelines, vector storage, and inference infrastructure for production workloads.
- Design and implement scalable data ingestion, embedding generation, and retrieval pipelines that integrate with vector databases and search layers.
- Lead model integration and deployment: build model serving (REST/gRPC), autoscaling, batching, and low-latency inference for live traffic.
- Define and drive MLOps best practices: CI/CD for models, monitoring, observability, metrics for accuracy/latency/cost, and automated retraining workflows.
- Collaborate with product, data science, and security teams to ensure prompt engineering, model evaluation, safety, privacy, and compliance requirements are met.
- Mentor engineers, create architecture playbooks, and establish governance for model lifecycle, cost control, and performance SLAs.
Skills & Qualifications
Must-Have
- Expertise with Large Language Models and Generative AI architectures
- Hands-on experience designing and implementing Retrieval-Augmented Generation (RAG) systems
- Practical experience with LangChain or similar orchestration frameworks
- Experience with Hugging Face Transformers and model fine-tuning/serving
- Strong experience with vector databases and similarity search (e.g., FAISS, Milvus, Elasticsearch k-NN)
- Production deployment experience using Kubernetes and Docker with model serving patterns
Preferred
- Familiarity with cloud-managed LLM services (Azure OpenAI, AWS Bedrock, GCP Vertex AI)
- Experience with PEFT/LoRA/TRLX fine-tuning techniques and evaluation pipelines
- Background in LLM safety, prompt governance, cost/latency optimization, and observability for models
Benefits & Culture Highlights
- On-site leadership role in Bangalore with ownership of major GenAI initiatives and direct impact on product roadmaps
- Highly collaborative, product-driven engineering culture with opportunities for technical mentorship and career growth
- Access to cutting-edge LLM tooling, professional development, and a strong focus on engineering excellence
Skills: architect, AI, LLM, RAG, GenAI