OmniMD

Data Scientist

  • Posted 8 hours ago

Job Description

Position: Data Scientist, LLM & Applied AI

Experience: 2-3 Years

Employment Type: Full-Time

Domain: Healthcare AI / Digital Health / SaaS Platforms

Reporting To: CTO

Role Summary

We are seeking a highly hands-on Data Scientist with 2-3 years of experience who is deeply proficient in Large Language Models (LLMs), both open-source and commercial, and has strong expertise in prompt engineering, applied machine learning, and local LLM deployments.

This role is not purely academic. The ideal candidate will work on real-world AI systems including AI Frontdesk, AI Clinician, AI RCM, multimodal agents, and healthcare-specific automation, with a focus on production-grade AI, domain-aligned reasoning, and privacy-aware architectures.

Key Responsibilities

1. LLM Research, Evaluation & Selection

  • Evaluate, benchmark, and compare open-source LLMs (LLaMA-2/3, Mistral, Mixtral, Falcon, Qwen, Phi, etc.) and commercial LLMs (OpenAI, Anthropic, Google, Azure).
  • Select appropriate models based on latency, accuracy, cost, explainability, and data-privacy requirements.
  • Maintain an internal LLM capability matrix mapped to specific business use cases.

2. Prompt Engineering & Reasoning Design

  • Design, test, and optimize prompt strategies:
      • Zero-shot, few-shot, and chain-of-thought (where applicable)
      • Tool-calling and function-calling prompts
      • Multi-agent and planner-executor patterns
  • Build domain-aware prompts for healthcare workflows (clinical notes, scheduling, RCM, patient communication).
  • Implement prompt versioning, prompt A/B testing, and regression checks.

3. Applied ML & Model Development

  • Build and fine-tune ML/DL models (classification, NER, summarization, clustering, recommendation).
  • Combine traditional ML with LLMs in hybrid approaches where LLMs alone are not optimal.
  • Perform feature engineering, model evaluation, and error analysis.
  • Work with structured (SQL/FHIR) and unstructured (text, audio) data.

4. Local LLM & On-Prem Deployment

  • Deploy and optimize local LLMs using frameworks such as Ollama, vLLM, llama.cpp, and HuggingFace Transformers.
  • Implement quantization (4-bit/8-bit) and performance tuning.
  • Support air-gapped / HIPAA-compliant inference environments.
  • Integrate local models with microservices and APIs.

5. RAG & Knowledge Systems

  • Design and implement Retrieval-Augmented Generation (RAG) pipelines.
  • Work with vector databases (FAISS, Chroma, Weaviate, Pinecone).
  • Optimize chunking, embedding strategies, and relevance scoring.
  • Ensure traceability and citation of retrieved sources.

6. AI System Integration & Productionization

  • Collaborate with backend and frontend teams to integrate AI models into Spring Boot / FastAPI services and React-based applications.
  • Implement monitoring for accuracy drift, latency, hallucinations, and cost.
  • Document AI behaviors clearly for BA, QA, and compliance teams.

7. Responsible AI & Compliance Awareness

  • Apply PHI-safe design principles (prompt redaction, data minimization).
  • Understand healthcare AI constraints (HIPAA, auditability, explainability).
  • Support human-in-the-loop and fallback mechanisms.

Required Skills & Qualifications

Core Technical Skills

  • Strong proficiency in Python (NumPy, Pandas, Scikit-learn).
  • Solid understanding of ML fundamentals (supervised/unsupervised learning).
  • Hands-on experience with LLMs (open-source + commercial).
  • Strong command of prompt engineering techniques.
  • Experience deploying models locally or in controlled environments.

LLM & AI Tooling

  • HuggingFace ecosystem
  • OpenAI / Anthropic APIs
  • Vector databases
  • LangChain / LlamaIndex (or equivalent orchestration frameworks)

Data & Systems

  • SQL and data modeling
  • REST APIs
  • Git, Docker (basic)
  • Linux environments

Preferred / Good-to-Have Skills

  • Experience with healthcare data (EHR, clinical text, FHIR concepts).
  • Exposure to multimodal AI (speech-to-text, text-to-speech).
  • Knowledge of model evaluation frameworks for LLMs.
  • Familiarity with agentic AI architectures.
  • Experience working in startup or fast-moving product teams.

Research & Mindset Expectations (Important)

  • Strong inclination toward applied research, not just model usage.
  • Ability to read and translate research papers into working prototypes.
  • Curious, experimental, and iterative mindset.
  • Clear understanding that accuracy, safety, and explainability matter more than flashy demos.

What We Offer

  • Opportunity to work on real production AI systems used in US healthcare.
  • Exposure to the end-to-end AI lifecycle: research → prototype → production.
  • Work with local LLMs, agentic systems, and multimodal AI.
  • High ownership and visibility, with a steep learning curve.

If you're ready to take on a challenging and rewarding role in the evolving world of healthcare IT, we want to hear from you! Please share your CV at [HIDDEN TEXT]

Job ID: 136219837
