Sr. Data Scientist

OmniMD

Ahmedabad, India

5-7 Years

Posted 14 days ago

Job Description

We are seeking a highly hands-on Data Scientist with 5+ years of experience who is deeply proficient in Large Language Models (LLMs), both open-source and commercial, and has strong expertise in prompt engineering, applied machine learning, and local LLM deployments.

This role is not purely academic. The ideal candidate will work on real-world AI systems, including AI Frontdesk, AI Clinician, AI RCM, multimodal agents, and healthcare-specific automation, with a focus on production-grade AI, domain-aligned reasoning, and privacy-aware architectures.

Key Responsibilities

1. LLM Research, Evaluation & Selection

Evaluate, benchmark, and compare open-source LLMs (LLaMA-2/3, Mistral, Mixtral, Falcon, Qwen, Phi, etc.) and commercial LLMs (OpenAI, Anthropic, Google, Azure).
Select appropriate models based on latency, accuracy, cost, explainability, and data-privacy requirements.
Maintain an internal LLM capability matrix mapped to specific business use cases.

2. Prompt Engineering & Reasoning Design

Design, test, and optimize prompt strategies:
Zero-shot, few-shot, chain-of-thought (where applicable)
Tool-calling and function-calling prompts
Multi-agent and planner-executor patterns
Build domain-aware prompts for healthcare workflows (clinical notes, scheduling, RCM, patient communication).
Implement prompt versioning, prompt A/B testing, and regression checks.

3. Applied ML & Model Development

Build and fine-tune ML/DL models (classification, NER, summarization, clustering, recommendation).
Apply traditional ML + LLM hybrids where LLMs alone are not optimal.
Perform feature engineering, model evaluation, and error analysis.
Work with structured (SQL/FHIR) and unstructured (text, audio) data.

4. Local LLM & On-Prem Deployment

Deploy and optimize local LLMs using frameworks such as:
Ollama, vLLM, llama.cpp, HuggingFace Transformers
Implement quantization (4-bit/8-bit) and performance tuning.
Support air-gapped / HIPAA-compliant inference environments.
Integrate local models with microservices and APIs.

5. RAG & Knowledge Systems

Design and implement Retrieval-Augmented Generation (RAG) pipelines.
Work with vector databases (FAISS, Chroma, Weaviate, Pinecone).
Optimize chunking, embedding strategies, and relevance scoring.
Ensure traceability and citation of retrieved sources.

6. AI System Integration & Productionization

Collaborate with backend and frontend teams to integrate AI models into:
Spring Boot / FastAPI services
React-based applications
Implement monitoring for accuracy drift, latency, hallucinations, and cost.
Document AI behaviors clearly for BA, QA, and compliance teams.

7. Responsible AI & Compliance Awareness

Apply PHI-safe design principles (prompt redaction, data minimization).
Understand healthcare AI constraints (HIPAA, auditability, explainability).
Support human-in-the-loop and fallback mechanisms.

Required Skills & Qualifications

Core Technical Skills

Strong proficiency in Python (NumPy, Pandas, Scikit-learn).
Solid understanding of ML fundamentals (supervised/unsupervised learning).
Hands-on experience with LLMs (open-source + commercial).
Strong command of prompt engineering techniques.
Experience deploying models locally or in controlled environments.

LLM & AI Tooling

HuggingFace ecosystem
OpenAI / Anthropic APIs
Vector databases
LangChain / LlamaIndex (or equivalent orchestration frameworks)

Data & Systems

SQL and data modeling
REST APIs
Git, Docker (basic)
Linux environments

Preferred / Good-to-Have Skills

Experience in healthcare data (EHR, clinical text, FHIR concepts).
Exposure to multimodal AI (speech-to-text, text-to-speech).
Knowledge of model evaluation frameworks for LLMs.
Familiarity with agentic AI architectures.
Experience working in startup or fast-moving product teams.

Research & Mindset Expectations (Important)

Strong inclination toward applied research, not just model usage.
Ability to read and translate research papers into working prototypes.
Curious, experimental, and iterative mindset.
Clear understanding that accuracy, safety, and explainability matter more than flashy demos.

What We Offer