Carelon Global Solutions India

Carelon - Senior Data Scientist - NLP/LLM

  • Posted 18 hours ago

Job Description

Job Position

We are looking to hire a Senior Data Scientist with strong analytical skills and a background in US healthcare. The ideal candidate should have:

  • A minimum of 5 years of overall experience in data science or related fields
  • At least 3 years of hands-on experience in Machine Learning (ML) and Natural Language Processing (NLP)
  • Proven expertise in healthcare data analytics and a solid understanding of US healthcare systems (preferred)

Job Responsibility

Key Responsibilities:

  • Demonstrate expertise in programming with a strong background in machine learning and data processing.
  • Possess strong analytical skills to interpret complex healthcare datasets and derive actionable insights.
  • Collaborate closely with AI/ML engineers, data scientists, and product teams to acquire and process data, debug issues, and enhance ML models.
  • Develop and maintain enterprise-grade data pipelines to support state-of-the-art AI/ML models.
  • Work with diverse data types including structured, semi-structured, and textual data.
  • Communicate effectively and collaborate with cross-functional teams including engineering, product, and customer stakeholders.
  • Operate independently with minimal guidance from product managers and architects, demonstrating strong decision-making capabilities.
  • Embrace complex problems and deliver intelligence-driven solutions with a focus on innovation and scalability.
  • Quickly understand product requirements and adapt to evolving business needs and technical environments.

Technical Responsibilities

  • Design and implement statistical and machine learning models (e.g., regression, classification, clustering) using frameworks such as scikit-learn, TensorFlow, and PyTorch.
  • Build robust data preprocessing pipelines to handle missing values, outliers, feature scaling, and dimensionality reduction.
  • Specialize in Large Language Model (LLM) development, including fine-tuning, prompt engineering, and embedding optimization using frameworks like Hugging Face Transformers.
  • Develop and optimize LLM evaluation frameworks using metrics such as ROUGE, BLEU, and custom human-aligned evaluation techniques.
  • Apply advanced statistical methods including hypothesis testing, confidence intervals, and experimental design to extract insights from complex datasets.
  • Create NLP solutions for text classification, sentiment analysis, and topic modeling using both classical and deep learning approaches.
  • Design and execute A/B testing strategies, including sample size determination, metric selection, and statistical analysis (e.g., t-tests, ANOVA).
  • Implement comprehensive data visualization strategies using tools like Matplotlib, Seaborn, and Plotly to present insights effectively.
  • Maintain detailed documentation of model architectures, experiments, and validation results using tools like MLflow or DVC.
  • Research and apply LLM optimization techniques such as quantization, pruning, and knowledge distillation to improve efficiency.
  • Stay up to date with the latest advancements in statistical learning, deep learning, and LLM research, with a focus on emerging architectures and training.

Education

  • Bachelor's or master's degree in Computer Science, Mathematics, Statistics, Computational Linguistics, Engineering, or a related field. Ph.D. preferred.

Experience

  • 5+ years of overall professional experience in data science, analytics, or related fields.
  • 3+ years of hands-on experience working with large-scale structured and unstructured data to develop data-driven insights and solutions using Machine Learning (ML), Natural Language Processing (NLP), and Computer Vision.
  • 3+ years of proven experience with core technologies including Python (mandatory), SQL, Hugging Face, TensorFlow, Keras, PyTorch, and Apache Spark.
  • 3+ years of experience in developing NLP models, with a strong focus on transformer-based architectures.
  • 2+ years of experience implementing information retrieval systems at scale, including both keyword-based and semantic search using embeddings.
  • Hands-on experience with cloud platforms such as Google Cloud Platform (GCP) and Amazon Web Services (AWS).
  • Strong expertise in Large Language Models (LLMs) and Generative AI (GAI), including model development, fine-tuning, and optimization.
  • Demonstrated ability to work independently with minimal supervision and exercise sound judgment in technical and business decision-making.
  • In-depth experience with LLMs (both extractive and generative), including prompt engineering, fine-tuning, and familiarity with open-source ecosystems.
  • Experience in prompt development and optimization for NLP applications.
  • Strategic thinker with a blend of technical expertise and business acumen, capable of solving complex problems and influencing outcomes.
  • Proficient in creating analytical reports, projections, models, and presentations to support business objectives.
  • Excellent written and verbal communication skills, with strong stakeholder management capabilities.
  • Prior experience in the healthcare industry, with an understanding of domain-specific data and regulatory considerations.

Skills And Competencies

  • Must have: Machine Learning, LLM, NLP, Python, SQL, Hugging Face, TensorFlow, and Keras.
  • Good to have: PyTorch, Spark, and experience with any cloud platform.

(ref:hirist.tech)

Job ID: 147214865