Data Scientist

OmniMD

Ahmedabad, India

2-4 Years

Posted a day ago

Job Description

Position: Data Scientist - LLM & Applied AI

Experience: 2-3 Years

Employment Type: Full-Time

Domain: Healthcare AI / Digital Health / SaaS Platforms

Reporting To: CTO

Role Summary

We are seeking a highly hands-on Data Scientist with 2-3 years of experience who is deeply proficient in Large Language Models (LLMs), both open-source and commercial, and has strong expertise in prompt engineering, applied machine learning, and local LLM deployments.

This role is not purely academic. The ideal candidate will work on real-world AI systems including AI Frontdesk, AI Clinician, AI RCM, multimodal agents, and healthcare-specific automation, with a focus on production-grade AI, domain-aligned reasoning, and privacy-aware architectures.

Key Responsibilities

1. LLM Research, Evaluation & Selection

Evaluate, benchmark, and compare open-source LLMs (LLaMA-2/3, Mistral, Mixtral, Falcon, Qwen, Phi, etc.) and commercial LLMs (OpenAI, Anthropic, Google, Azure).
Select appropriate models based on latency, accuracy, cost, explainability, and data-privacy requirements.
Maintain an internal LLM capability matrix mapped to specific business use cases.

2. Prompt Engineering & Reasoning Design

Design, test, and optimize prompt strategies: zero-shot, few-shot, and chain-of-thought (where applicable); tool-calling and function-calling prompts; multi-agent and planner-executor patterns.
Build domain-aware prompts for healthcare workflows (clinical notes, scheduling, RCM, patient communication).
Implement prompt versioning, prompt A/B testing, and regression checks.

3. Applied ML & Model Development

Build and fine-tune ML/DL models (classification, NER, summarization, clustering, recommendation).
Apply traditional ML + LLM hybrids where LLMs alone are not optimal.
Perform feature engineering, model evaluation, and error analysis.
Work with structured (SQL/FHIR) and unstructured (text, audio) data.

4. Local LLM & On-Prem Deployment

Deploy and optimize local LLMs using frameworks such as Ollama, vLLM, llama.cpp, and HuggingFace Transformers.
Implement quantization (4-bit/8-bit) and performance tuning.
Support air-gapped / HIPAA-compliant inference environments.
Integrate local models with microservices and APIs.

5. RAG & Knowledge Systems

Design and implement Retrieval-Augmented Generation (RAG) pipelines.
Work with vector databases (FAISS, Chroma, Weaviate, Pinecone).
Optimize chunking, embedding strategies, and relevance scoring.
Ensure traceability and citation of retrieved sources.

6. AI System Integration & Productionization

Collaborate with backend and frontend teams to integrate AI models into Spring Boot / FastAPI services and React-based applications.
Implement monitoring for accuracy drift, latency, hallucinations, and cost.
Document AI behaviors clearly for BA, QA, and compliance teams.

7. Responsible AI & Compliance Awareness

Apply PHI-safe design principles (prompt redaction, data minimization).
Understand healthcare AI constraints (HIPAA, auditability, explainability).
Support human-in-the-loop and fallback mechanisms.

Required Skills & Qualifications

Core Technical Skills

Strong proficiency in Python (NumPy, Pandas, Scikit-learn).
Solid understanding of ML fundamentals (supervised/unsupervised learning).
Hands-on experience with LLMs (open-source + commercial).
Strong command of prompt engineering techniques.
Experience deploying models locally or in controlled environments.

LLM & AI Tooling

HuggingFace ecosystem
OpenAI / Anthropic APIs
Vector databases
LangChain / LlamaIndex (or equivalent orchestration frameworks)

Data & Systems

SQL and data modeling
REST APIs
Git, Docker (basic)
Linux environments

Preferred / Good-to-Have Skills

Experience in healthcare data (EHR, clinical text, FHIR concepts).
Exposure to multimodal AI (speech-to-text, text-to-speech).
Knowledge of model evaluation frameworks for LLMs.
Familiarity with agentic AI architectures.
Experience working in startup or fast-moving product teams.

Research & Mindset Expectations (Important)

Strong inclination toward applied research, not just model usage.
Ability to read and translate research papers into working prototypes.
Curious, experimental, and iterative mindset.
Clear understanding that accuracy, safety, and explainability matter more than flashy demos.

What We Offer

Opportunity to work on real production AI systems used in US healthcare.
Exposure to the end-to-end AI lifecycle: research → prototype → production.
Work with local LLMs, agentic systems, and multimodal AI.
High ownership, high visibility, and a steep learning curve.

If you're ready to take on a challenging and rewarding role in the evolving world of healthcare IT, we want to hear from you! Please share your CV at [HIDDEN TEXT]