

This role is with one of our clients, an AI-powered Revenue Cycle Intelligence Platform that is transforming the healthcare billing stack with autonomous medical coding, proactive denial prevention, and workflow automation solutions.
Location: Bengaluru
Experience: 2-7 yrs
Role: ML Engineer / Senior AI/ML Engineer
Mode: Work from Office
Key requirement: Candidates must be from an IIT
Job Overview
We are hiring an ML Engineer / Senior AI/ML Engineer to own the end-to-end applied LLM, retrieval, and evaluation layer of our healthcare AI platform. You will build production systems that automate mid- and end-revenue-cycle workflows for US healthcare, spanning coding, claim edits, denials triage, appeal generation, and payer-rule reasoning. This is a production engineering role (not research) focused on building scalable, auditable, and cost-efficient LLM systems in a regulated healthcare environment.
What You'll Own
1. Self-Hosted LLM Infrastructure
• Deploy, fine-tune, and operate open-source models (Llama, Qwen, MedGemma, and successors) as our primary inference stack
• Work with vLLM / SGLang / TensorRT-LLM for serving at scale, with disciplined attention to throughput, tail latency, batching, KV-cache, and GPU economics
• Own fine-tuning workflows end-to-end (SFT, LoRA, QLoRA, DPO) on clinical notes, claims, and payer-rule data
• Optimize GPU usage, latency, batching, and cost; make build-vs-buy and hosted-vs-self-hosted trade-offs explicit and measured
2. Knowledge Graphs & Embedding-Based Retrieval
• Design and maintain the knowledge graph encoding ICD-10-CM, CPT, HCPCS, modifiers, HCC, NCCI edits, LCD/NCD policies, and payer-specific rules, along with the relationships between them
• Build embedding-based retrieval over clinical notes, historical claims, denial reasons, and payer-policy corpora — including chunking, embedding model selection, hybrid search, and reranking
• Combine graph traversal and dense retrieval so every coded line, scrubbed edit, and appeal response is grounded in auditable evidence
• Own ingestion, versioning, and quality of underlying knowledge sources (CMS, AHA, AMA, NCCI, payer bulletins)
3. Evaluation & Monitoring
• Build continuous evaluation pipelines that gate every model, prompt, retrieval, and graph change before production
• Run offline eval suites grounded in coder- and biller-validated labels; use LLM-as-judge where appropriate, calibrated against human ground truth
• Monitor drift, hallucinations, regressions, and output quality in production; operate shadow-mode rollouts and per-cohort accuracy tracking (specialty, payer, chart type)
• Track business metrics: chart-level and opportunity-level coding accuracy, denial rate impact, clean-claim rate, cost per chart, and end-to-end latency
4. LLM Systems & Prompt Engineering
• Design prompts and context pipelines for coding (CPT, ICD, HCC, E/M), claim edits, denial classification, and appeal drafting
• Implement structured outputs (JSON, function calling, constrained decoding) on top of the self-hosted stack
• Apply RAG over medical coding standards (CMS, ICD-10, AHA, NCCI) and payer policies, grounded in the knowledge graph and embedding stores
• Treat prompts as a thin, well-versioned, well-evaluated layer — never the load-bearing piece
5. Agentic Workflows & Tooling — MCP
• Build MCP servers for internal tools: code lookup, NCCI / rule checks, payer logic, eligibility, denial classification
• Design multi-step agent workflows with audit trails and human-in-the-loop checkpoints for coder, biller, and AR-analyst review
• Define clear boundaries between deterministic and LLM-based tools; reliability comes from knowing which is which
What We're Looking For
Must-Have
• 5+ years in ML/AI engineering, including 6+ months in production LLM systems
• Hands-on experience deploying and operating self-hosted LLMs (vLLM, SGLang, TensorRT-LLM, or equivalent)
• Strong experience designing embedding-based retrieval and/or knowledge graphs for grounded LLM applications
• Demonstrated ownership of evaluation infrastructure — offline benchmarks, online monitoring, drift and regression detection
• Strong Python + PyTorch + Hugging Face experience
• Production experience with monitoring, incidents, and system ownership
Strongly Preferred
• Fine-tuning experience (SFT, LoRA, QLoRA, DPO) on domain-specific corpora
• Experience with graph databases (Neo4j, ArangoDB, or equivalent) and graph-aware retrieval
• Experience with vector databases and hybrid search (BM25 + dense, rerankers)
• Familiarity with LLM observability tools (Langfuse, LangSmith, Arize, Braintrust, or in-house equivalents)
• Exposure to healthcare, RCM, claims, or other regulated domains
• Experience with MCP or similar tool-orchestration frameworks
• Strong prompt-engineering and LLM-evaluation instincts
What We Offer
• Work on high-impact healthcare AI systems used in real billing and RCM workflows
• Ownership of production LLM, retrieval, and evaluation systems end-to-end
• Solve real-world problems with real constraints (cost, latency, compliance, auditability)
Job ID: 148879423
Skills:
OpenCV, TensorFlow, PyTorch, MLOps, Python, AWS, LangChain, edge computing, Hugging Face, Go, vector databases, RAG-based applications, LLM architectures