Location: Hyderabad, India (US Timings) | Experience Level: 3-4 years (AI/Data Science) + 2 years (MLOps/LLMOps/AIOps)
About The Role
We're seeking an experienced AI Engineer to design, deploy, and manage autonomous agent systems on proprietary infrastructure. You'll own the full lifecycle, from optimizing model weights to building production-grade agents with fine-tuning and reinforcement learning in on-premises or private cloud environments.
Key Responsibilities
Design and deploy autonomous agent architectures in AWS VPCs and on-premises environments
Manage model weights and optimize for inference; implement LoRA and QLoRA fine-tuning for domain-specific tasks
Develop reinforcement learning pipelines for agent training with reward modeling and policy optimization
Implement MLOps/LLMOps infrastructure: model versioning, A/B testing, rollbacks, and evaluation frameworks
Architect RAG systems integrating vector databases with proprietary and fine-tuned models
Optimize model serving infrastructure (vLLM, TorchServe, TensorRT) for production inference
Build monitoring and observability systems for agent behavior and RL training quality
Ensure model security, data privacy, and audit compliance in enterprise deployments
Required Qualifications
3-4 years of hands-on experience in AI/ML/Data Science, with at least two projects shipped to production
2+ years of dedicated experience in MLOps, LLMOps, or AIOps (model deployment, inference optimization, pipeline automation, model management)
Proficiency across AWS services: EC2, VPC, S3, IAM, SageMaker, Bedrock, and Lambda, or equivalent custom ML infrastructure
Strong software engineering fundamentals: containerization (Docker), orchestration (Kubernetes), CI/CD, and API design
Hands-on experience deploying and serving large language models or foundation models in production environments
Practical experience with LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA) fine-tuning techniques for efficient model adaptation
Understanding of reinforcement learning fundamentals and experience implementing RL-based training: policy gradients, reward shaping, or preference-based optimization
Working knowledge of vector databases and RAG implementation
Solid understanding of model optimization techniques and inference constraints (GPU memory, latency, throughput)
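To give a rough sense of why LoRA matters for efficient model adaptation (per the fine-tuning requirement above), the parameter savings can be sketched in a few lines. The dimensions below are illustrative, not tied to any specific model:

```python
# Minimal illustration of the LoRA idea (illustrative numbers, no ML libraries):
# instead of updating a full d_out x d_in weight matrix, LoRA trains two
# low-rank factors B (d_out x r) and A (r x d_in), with r << min(d_out, d_in).

def lora_param_counts(d_out: int, d_in: int, r: int):
    full = d_out * d_in        # trainable params for full fine-tuning of one matrix
    lora = r * (d_out + d_in)  # trainable params for the two LoRA factors
    return full, lora

full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full // lora)  # 16777216 65536 256
```

At rank 8 on a hypothetical 4096x4096 projection, the LoRA factors carry roughly 1/256th of the trainable parameters of full fine-tuning; QLoRA pushes memory further down by quantizing the frozen base weights.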
Preferred Qualifications
Experience building autonomous agents with RL frameworks (DPO, PPO, RLHF) and fine-tuning frameworks (Hugging Face Transformers, PEFT)
QLoRA experience on consumer-grade GPUs in memory-constrained environments
Migration experience from cloud APIs (OpenAI, Anthropic) to self-hosted models
On-premises or VPC-only deployment experience
Familiarity with agent frameworks (LangChain, LlamaIndex, AutoGen) and MLOps tools (MLflow, W&B, DVC)
Strong debugging skills and a systems-thinking, evidence-based approach to problem-solving
We're looking for pragmatic engineers who balance performance and cost, communicate clearly, and own projects end-to-end.
Application Requirements
Submit a GitHub portfolio with production ML systems, MLOps implementations, fine-tuning work (LoRA/QLoRA), and RL-based agent training examples. Bonus: on-premises deployments or model optimization projects.