Important:
To successfully apply for this position, filling out this form is mandatory. Applications without a completed form will not be considered.
Position: Sr. ML Engineer
Location: On-site
Type: Full Time
About the Role:
We are seeking a driven NLP Engineer who can help scale, optimize, and deploy large language model (LLM)-based solutions within the healthcare domain. The primary focus of this role is on building and maintaining production-grade, end-to-end NLP systems, including ML/NLP architecture design, DL model inference optimization, and efficient model deployment pipelines. While there will be opportunities to train or fine-tune LLMs for specific use cases, your core responsibility is to ensure that these models run at scale, efficiently, and reliably in production environments.
In addition to working with cutting-edge LLMs, you will also build and maintain NLP pipelines utilizing already-trained LLMs and embedding models. This includes constructing retrieval-augmented generation (RAG) systems and agentic systems that integrate multiple models and data sources to deliver robust, real-time NLP functionalities.
What We Expect You to Bring:
- Bachelor's or Master's degree in Computer Science or related field.
- 3+ years of professional experience (or 2+ years with an advanced degree) in building and deploying ML/NLP systems.
- Strong knowledge of containerization and version control for building reliable, production-grade systems.
- Proficiency in working with NLP frameworks and tools (e.g., spaCy, NLTK, Hugging Face Transformers, LangChain, vector databases), deep learning libraries (e.g., PyTorch), and common data preprocessing techniques.
- Practical experience in designing, implementing, and maintaining robust, scalable backend infrastructures for NLP and LLM-based applications.
- Experience deploying NLP models in production environments, including load balancing and latency reduction.
- Hands-on experience optimizing LLM inference performance using frameworks like vLLM, TensorRT, Ray, etc.
- Understanding of prompt engineering, model fine-tuning, and large-scale inference optimization for LLMs.
- Familiarity with building retrieval-augmented generation (RAG) pipelines and integrating embedding models into NLP workflows.
- Exposure to agentic systems that combine multiple models or tools for more dynamic, context-aware NLP solutions.
What You Will Be Doing:
- Production-Grade NLP Systems:
  - Design and implement scalable, efficient NLP pipelines leveraging already-trained LLMs and embedding models.
  - Integrate RAG and agentic components to enhance the capabilities and adaptability of NLP systems.
- Inference Optimization & Deployment:
  - Optimize model inference performance, reduce latency, and improve throughput using techniques and frameworks designed for large-scale LLM deployments.
  - Implement best practices for containerization, CI/CD, monitoring, and observability to ensure rapid, reliable deployments.
- Occasional Model Adaptation:
  - As needed, assist with fine-tuning or adapting LLMs to specific healthcare use cases, while maintaining a focus on long-term scalability and performance.
- Collaboration & Continuous Improvement:
  - Work closely with cross-functional teams, including NLP researchers, backend engineers, product managers, and front-end developers, to deliver high-quality NLP solutions.
  - Participate in code reviews, contribute to architectural discussions, and remain current on emerging NLP and LLM optimization techniques.
Why Join:
- We are revolutionizing an industry in ways that can benefit patients all over the world, so you can create impact at scale.
- We have access to top-tier computing resources, including NVIDIA H100 and A100 GPUs, among others.
- We have had company-sponsored workations in Bali, Sri Lanka, and Manali and take pride in our hard-working yet super fun culture.
- We are working on some of the most challenging problems in a highly regulated industry, which gives you the opportunity to solve genuinely interesting problems.
- You will get a chance to work with experts from multiple industries, receive industry-leading compensation, and continue building your own (and, of course, new) projects.
Perks & Benefits:
- Unlimited Leave Policy: take time off when you need it. We believe in trust, not tracking.
- Lunch Provided at the Office: one less daily decision, one happier employee.
- Flexible Working Hours: we care about output, not clock-ins.
- Health Insurance: comprehensive coverage for you and your family.
- Zomato Meal Benefit: breakfast and dinner can be ordered when you come in early or leave late, because effort deserves fuel.