Experience: 3-8 years
Location: Pune
About The Role
We are looking for a highly skilled Generative AI Engineer with strong expertise in Large Language
Models (LLMs), model fine-tuning, Reinforcement Learning (RL/RLHF), and RAG-based systems. The
ideal candidate must have a deep understanding of how LLMs work internally and proven hands-on
experience building, customizing, optimizing, and deploying production-grade generative AI systems.
This role requires both strong research depth and production engineering experience.
Key Responsibilities
Design, build, and deploy LLM-powered applications in production.
Fine-tune open-source foundation models (LLaMA, Mistral, Falcon, etc.) for domain-specific
use cases.
Implement parameter-efficient fine-tuning (LoRA, QLoRA, PEFT).
Develop and implement Reinforcement Learning pipelines, including RLHF and PPO.
Build and optimize RAG (Retrieval-Augmented Generation) systems.
Design and implement AI agents and multi-step reasoning workflows.
Develop scalable training and inference pipelines using GPUs.
Benchmark, evaluate, and improve LLM performance.
Implement evaluation frameworks and model monitoring systems.
Collaborate cross-functionally to integrate AI into enterprise systems.
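To make the parameter-efficient fine-tuning responsibility concrete, here is a minimal plain-Python sketch of the LoRA update rule (W_eff = W + (alpha / r) * B @ A). All names here are illustrative; a real implementation would use PyTorch tensors and a library such as Hugging Face PEFT.

```python
def matmul(a, b):
    # Plain-Python matrix multiply: (m x k) @ (k x n) -> (m x n).
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def lora_effective_weight(w, a, b, alpha, r):
    # LoRA adds a low-rank delta to a frozen weight matrix:
    #   W_eff = W + (alpha / r) * (B @ A)
    # where B is (d_out x r) and A is (r x d_in), with r << d_in.
    scale = alpha / r
    delta = matmul(b, a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]
```

Note that with B initialized to zeros (the standard LoRA initialization), the effective weight equals the frozen base weight before any training step.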
Required Skills & Qualifications (Must Have)
LLM & Generative AI Expertise
Deep understanding of Transformer architecture, attention mechanisms, embeddings, and
tokenization.
Clear Understanding Of The LLM Training Lifecycle
- Supervised Fine-Tuning (SFT)
Strong Hands-on Experience With
- PyTorch
- Hugging Face Transformers
- Fine-tuning large language models
- LoRA / QLoRA / PEFT
Experience implementing Reinforcement Learning techniques
- PPO
- Policy gradients
- Reward modeling
Experience building RAG pipelines end-to-end.
Experience building AI agents / tool-using LLM systems.
Understanding of evaluation techniques for generative models.
Experience working with multi-modal models (text + vision) is a plus.
Strong Python programming skills.
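As one small illustration of the reward-modeling skill listed above: RLHF reward models are commonly trained with a pairwise (Bradley-Terry style) preference loss, -log(sigmoid(r_chosen - r_rejected)). The function below is a toy sketch of that loss in plain Python, not a production training objective.

```python
import math

def pairwise_preference_loss(r_chosen, r_rejected):
    # Bradley-Terry pairwise loss used in RLHF reward modeling:
    #   loss = -log(sigmoid(r_chosen - r_rejected))
    #        = log(1 + exp(-(r_chosen - r_rejected)))
    # The loss shrinks as the chosen response's reward exceeds the
    # rejected response's reward by a larger margin.
    margin = r_chosen - r_rejected
    return math.log1p(math.exp(-margin))
```

In a real pipeline this loss would be averaged over a batch of preference pairs and backpropagated through the reward model's parameters.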
Infrastructure & Deployment
- Hands-on experience with GPU-based training and inference.
- Experience with distributed training and mixed precision training.
- Familiarity with Docker and Kubernetes.
- Experience deploying models as APIs/services.
- Experience with vector databases (FAISS, Pinecone, Weaviate).
- Experience with cloud platforms (AWS / GCP / Azure).
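The vector-database experience above boils down to dense similarity search. As a hedged, library-free sketch (a real system would use FAISS, Pinecone, or Weaviate over learned embeddings), retrieval can be reduced to ranking stored vectors by cosine similarity to a query vector:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve_top_k(query, index, k):
    # index: list of (doc_id, embedding) pairs.
    # Returns the k document ids most similar to the query embedding,
    # which a RAG system would then pass to the LLM as context.
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

Production vector databases replace this exhaustive scan with approximate nearest-neighbor indexes (e.g. IVF or HNSW) to keep latency flat as the corpus grows.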
What We're Looking For
Strong research-oriented mindset.
Ability to read and implement research papers.
Strong debugging and optimization skills.
Clear understanding of cost, latency, and scaling trade-offs.
Strong communication and ownership.