Senior Data Scientist

Optum

Noida, India

8-10 Years

Posted 6 days ago

Job Description

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.

The Provider Enablement Methods and Data Science team is seeking a Senior Data Scientist to support various Optum Insights data science projects. This position is responsible for end-to-end model development for our clients in provider and payer segments. We are investing significantly in new capabilities to design, develop and deliver better solutions for these clients. We are seeking people with proven experience conceptualizing, defining and delivering solutions that enable data science teams to build and scale Machine Learning and Artificial Intelligence models that will power these solutions.

Primary Responsibilities:

Build and put into production LLM solutions across major use cases (Q&A, generation, summarization, classification, conversational/assistant, and multimodal)
Apply modern LLM development techniques including PEFT (LoRA/adapters), RAG, prompt design, tool/function calling (incl. MCP patterns where relevant), and structured output enforcement
Establish robust LLM evaluation and quality programs using automated metrics plus structured human review, covering relevance, faithfulness, hallucinations, robustness, bias, readability, and offline/online evaluation strategy
Implement Responsible AI guardrails including prompt-injection defense, toxicity and safety filtering, privacy/PII controls, scope/refusal behavior, adversarial testing, and mitigation of automation bias in review processes
Design and optimize retrieval and vector search workflows (chunking, indexing, reranking, grounding, and handling low-signal/conflicting context)
Orchestrate multi-step AI workflows with reliable control logic (branching, retries, tool execution, state/graph orchestration) integrated with evaluation gates and guardrails
Lead end-to-end ML delivery from data preparation and feature engineering through modeling, evaluation, deployment, monitoring, and iterative improvement using reproducible practices
Build scalable data foundations using advanced SQL and distributed processing, ensuring correctness, performance, and stability for large multi-source healthcare datasets
Develop advanced modeling solutions across statistics, ML, deep learning, and NLP (including NER, OCR-to-text pipelines, LSTM/sequence models) with solid Transformer proficiency and clear method tradeoff reasoning
Own MLOps practices including model lifecycle management, CI/CD releases, monitoring and drift detection (data/concept), structured peer review/validation, data pipelines/orchestration, governance, architecture/design documentation, healthcare analytics application, agile execution, and solid Python engineering practices
Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so

Required Qualifications:

A Bachelor's degree in Data Science, Statistics, Mathematics, Computer Science, Machine Learning, Economics, Engineering, or a related quantitative field, or equivalent practical experience
8+ years of hands-on experience designing, developing, and validating statistical, machine learning, and/or deep learning models in production or applied research settings
8+ years of experience working with large-scale structured and unstructured datasets, preferably within healthcare, life sciences, financial services, or other regulated domains
Experience designing and evaluating models using appropriate performance metrics, validation strategies, and error analysis techniques to ensure robustness and reliability
Experience communicating complex technical concepts, modeling results, and analytical insights to both technical and non-technical stakeholders through clear documentation and presentations
Demonstrated experience applying statistical methods, machine learning algorithms, and computational techniques to extract insights and build predictive solutions from complex datasets
Solid proficiency in core data science and engineering tools, including Python, SQL, and distributed processing frameworks such as Spark or equivalent technologies
Demonstrated ability to work collaboratively within cross-functional teams while also operating independently and taking technical ownership of complex data science initiatives
Proven ability to translate ambiguous business or product questions into clearly defined analytical problems, end-to-end modeling approaches, and measurable success criteria

Preferred Qualifications:

Experience with containerization practices and tools such as Docker, including packaging ML/LLM services for reproducible deployment and environment consistency
Experience with orchestration platforms such as Kubernetes for deploying and managing scalable ML/LLM workloads in production environments
Experience with cloud services and managed ML platforms for deployment and operations, including designing cloud-native architectures that balance latency, reliability, and security requirements
Experience with MLOps tooling such as MLflow, Kubeflow, and/or TensorFlow Extended (TFX) for experiment tracking, pipeline automation, model registry workflows, and production governance
Experience implementing robust evaluation and monitoring for GenAI systems beyond development environments, including post-deployment observability, drift detection for prompts and responses, and operational alerting for quality regressions
Experience building and maintaining cross-platform data and model infrastructure, including cost/performance optimization decisions across compute and storage
Experience with deploying ML models in Azure, AWS, and/or Google Cloud

More Info

Job Type:

Permanent Job

Industry:

Hotels /Hospitality /Restaurant

Role:

Data Science

Function:

Healthcare

Employment Type:

Full time

About Company

Optum

Job Source: careers.unitedhealthgroup.com

Optum, Inc. is an American pharmacy benefit manager and health care provider. It has been a subsidiary of UnitedHealth Group since 2011. UHG formed Optum by merging its existing pharmacy and care delivery services into the single Optum brand, comprising three main businesses: OptumHealth, OptumInsight and OptumRx. In 2017, Optum accounted for 44 percent of UnitedHealth Group's profits, and as of 2019, Optum's revenues had surpassed $100 billion. Also in early 2019, Optum gained significant media attention over a trade secrets lawsuit that the company filed against former executive David William Smith, after Smith left Optum to join Haven, the joint healthcare venture of Amazon, JPMorgan Chase, and Berkshire Hathaway.

Job ID: 142754785