Job Title: Senior Machine Learning Engineer
Experience: 6-9 Years
Location: Bangalore / Mumbai
Notice Period: Immediate Joiners Only
Role Overview
We are seeking an experienced Senior Machine Learning Engineer to lead the development, fine-tuning, and optimization of Small Language Models (SLMs) for enterprise clients. This role requires deep expertise in model distillation, production-grade RAG architectures, and GPU optimization.
As a senior technical contributor, you will mentor junior engineers, guide architectural decisions, and ensure high-quality delivery across multiple concurrent client engagements in domains such as FinTech, Healthcare, Insurance, and Retail.
Key Responsibilities
- Lead complex fine-tuning and knowledge distillation pipelines for Small Language Models (1B-13B parameters), including model families such as LLaMA, Mistral, Phi, Qwen, and Gemma.
- Architect and implement production-grade Retrieval-Augmented Generation (RAG) systems with vector database integration for enterprise use cases.
- Drive model selection decisions by evaluating performance benchmarks, licensing constraints, and deployment requirements across multiple client environments.
- Collaborate with MLOps teams to optimize inference performance, including quantization (INT8/INT4), latency tuning, and GPU resource utilization on Amazon Web Services (EC2, SageMaker, EKS).
- Design and generate high-quality synthetic datasets while addressing data privacy and regulatory constraints.
- Mentor and guide mid-level ML engineers, reviewing their experimentation frameworks and code quality and promoting ML best practices.
- Evaluate emerging SLM architectures, fine-tuning strategies, and optimization frameworks to maintain technical leadership.
- Support pre-sales efforts by contributing to solution design, technical assessments, and model benchmarking.
- Contribute to building reusable, domain-specific SLM accelerators for rapid deployment across priority verticals.
- Stay current with advancements in SLM research, new model releases, and optimization techniques through continuous learning and knowledge-sharing sessions.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Mathematics, Electrical Engineering, or a related field.
- 6+ years of hands-on experience in applied Machine Learning, Deep Learning, or AI systems engineering.
- Strong proficiency in Python and ML frameworks such as PyTorch, TensorFlow, Hugging Face Transformers, and LangChain.
- Proven experience with model compression, quantization, distillation, and Retrieval-Augmented Generation (RAG) workflows.
- Strong understanding of vector databases, modern LLM/SLM architectures, and distributed training/inference.
- Experience working with GPU infrastructure and cloud-based ML environments.
- Excellent problem-solving, collaboration, and communication skills.
- Prior experience mentoring or leading engineers is highly desirable.