Senior ML Engineer

Triomics

Bengaluru, India

3-5 Years

Posted 4 days ago

Job Description

Important:

To successfully apply for this position, filling out this form is mandatory. Applications without a completed form will not be considered.

Position: Sr. ML Engineer

Location: On-site

Type: Full Time

About the Role:

We are seeking a driven NLP Engineer who can help scale, optimize, and deploy large language model (LLM)-based solutions within the healthcare domain. The primary focus of this role is on building and maintaining production-grade, end-to-end NLP systems, including ML/NLP architecture design, DL model inference optimization, and efficient model deployment pipelines. While there will be opportunities to train or fine-tune LLMs for specific use cases, your core responsibility is to ensure that these models run at scale, efficiently, and reliably in production environments.

In addition to working with cutting-edge LLMs, you will also build and maintain NLP pipelines utilizing already-trained LLMs and embedding models. This includes constructing retrieval-augmented generation (RAG) systems and agentic systems that integrate multiple models and data sources to deliver robust, real-time NLP functionalities.

What We Expect You to Bring:

Bachelor's or Master's degree in Computer Science or related field.
3+ years of professional experience (or 2+ years with an advanced degree) in building and deploying ML/NLP systems.
Strong knowledge of containerization and version control for building reliable, production-grade systems.
Proficiency with NLP frameworks and tooling (e.g., spaCy, NLTK, HuggingFace Transformers, LangChain, vector databases), deep learning libraries (e.g., PyTorch), and common data preprocessing techniques.
Practical experience in designing, implementing, and maintaining robust, scalable backend infrastructures for NLP and LLM-based applications.
Experience deploying NLP models in production environments, including load balancing and latency reduction.
Hands-on experience optimizing LLM inference performance using frameworks like vLLM, TensorRT, Ray, etc.
Understanding of prompt engineering, model fine-tuning, and large-scale inference optimization for LLMs.
Familiarity with building retrieval-augmented generation (RAG) pipelines and integrating embedding models into NLP workflows.
Exposure to agentic systems that combine multiple models or tools for more dynamic, context-aware NLP solutions.

What You Will Be Doing:

Production-Grade NLP Systems:
Design and implement scalable, efficient NLP pipelines leveraging already-trained LLMs and embedding models.
Integrate RAG and agentic components to enhance the capabilities and adaptability of NLP systems.
Inference Optimization & Deployment:
Optimize model inference performance, reduce latency, and improve throughput using techniques and frameworks designed for large-scale LLM deployments.
Implement best practices for containerization, CI/CD, monitoring, and observability to ensure rapid, reliable deployments.
Occasional Model Adaptation:
As needed, assist with fine-tuning or adapting LLMs to specific healthcare use cases, while maintaining a focus on long-term scalability and performance.
Collaboration & Continuous Improvement:
Work closely with cross-functional teams, including NLP researchers, backend engineers, product managers, and front-end developers, to deliver high-quality NLP solutions.
Participate in code reviews, contribute to architectural discussions, and remain current on emerging NLP and LLM optimization techniques.

Why Join

We are revolutionizing a unique industry that has the potential to impact and benefit patients from all over the world - you can create impact at scale.
We have access to the best computing resources available, including NVIDIA H100 and A100 GPUs, among others.
We have had company-sponsored workations in Bali, Sri Lanka, and Manali and take pride in our hard-working yet super fun culture.
We are working on some of the most challenging problems in a highly regulated industry, which gives you the opportunity to solve some of the most interesting ones.
You will work with experts from multiple industries, receive best-in-industry compensation, and continue building your own (and, of course, new) projects.

Perks & Benefits:

Unlimited Leave Policy: take time off when you need it. We believe in trust, not tracking.
Lunch Provided at the Office: one less daily decision, one happier employee.
Flexible Working Hours: we care about output, not clock-ins.
Health Insurance: comprehensive coverage for you and your family.
Zomato Meal Benefit: breakfast and dinner can be ordered when you come in early or leave late, because effort deserves fuel.