Key Responsibilities
- Design, build, and deploy end-to-end machine learning solutions across structured (tabular) medical datasets and unstructured/non-tabular data (text, images, OCR output, signals).
- Apply advanced ML, deep learning, and LLM-based approaches to solve real-world healthcare/medical challenges.
- Collaborate with cross-functional teams (engineering, product, domain experts) to translate business requirements into scalable data science solutions.
- Drive MLOps best practices: experiment tracking, model versioning, CI/CD, monitoring, and automated retraining pipelines.
- Ensure production-grade robustness in ML pipelines by applying Test-Driven Development (TDD) and object-oriented programming (OOP) principles.
- Contribute to architecture discussions and mentor junior team members.
Technical Skills & Requirements
- Strong proficiency in Machine Learning, Deep Learning, and Statistical Modeling.
- Expertise in handling tabular medical data (EHRs, lab reports, prescriptions) and non-tabular data (text, imaging, audio).
- Experience with LLMs (fine-tuning, prompting, integration into downstream tasks).
- Hands-on experience with MLOps frameworks and practices:
  - MLflow, DVC, Airflow, Kubeflow, Weights & Biases
  - Model deployment and monitoring in production
- Strong software engineering skills: Python, SQL, TDD, OOP, design patterns.
- Containerization & orchestration: Docker, Kubernetes (nice-to-have).
- Familiarity with cloud platforms (AWS / GCP / Azure) for scalable ML solutions.
- Knowledge of medical data compliance standards (HIPAA, GDPR) is a plus.
Preferred Frameworks/Tools
- ML/Deep Learning: scikit-learn, PyTorch, TensorFlow, Hugging Face, XGBoost, LightGBM
- NLP/LLMs: Transformers, LangChain, RAG pipelines
- MLOps: MLflow, DVC, Airflow, Kubeflow
- Data Processing: pandas, PySpark, SQL, Polars
Soft Skills
- Strong problem-solving and analytical mindset.
- Ability to explain technical concepts to non-technical stakeholders.
- Leadership and mentorship capabilities.