Data Scientist – Insurance Analytics
Role Summary
We are seeking a Senior Data Scientist (6–7 years of experience) to support an analytics initiative focused on the insurance domain. This is a client-facing role requiring strong analytical expertise, hands-on modeling experience, and the ability to independently drive analysis, present insights, and collaborate with stakeholders.
The ideal candidate will have a solid foundation in statistical modeling and hypothesis testing, deep experience with tree-based and ensemble machine learning models, exposure to cloud-based data platforms, and working knowledge of modern Generative AI and Large Language Model (LLM) techniques relevant to insurance analytics.
Key Responsibilities
- Perform exploratory data analysis (EDA), feature engineering, and hypothesis testing to identify fraud patterns and anomalies.
- Build, evaluate, and optimize traditional statistical models as well as tree-based ML models such as Random Forest, XGBoost, CatBoost, and LightGBM.
- Explore and apply LLM-based approaches (e.g., text classification, summarization, entity extraction) to make use of unstructured data such as claim notes, adjuster comments, and documents.
- Develop GenAI-powered accelerators for documentation, feature ideation, data enrichment, and model insight generation.
- Independently conduct data analysis, research, and model experimentation, and translate findings into actionable insights.
- Write clean, efficient, and production-ready code using Python and SQL.
- Work extensively with large datasets using cloud platforms, primarily Google Cloud Platform (GCP).
- Query and manage data in BigQuery and work with datasets stored in Cloud Storage buckets.
- Use Git for version control, collaboration and code review.
- Prepare clear, concise, and impactful presentations for clients, explaining analytical findings to both technical and non-technical stakeholders.
- Collaborate with business, data engineering, and client teams to ensure models align with investigation strategies and broader business objectives.
Required Skills & Experience
- 5–11 years of hands-on experience in data science, analytics, or applied machine learning
- Strong understanding of statistical modeling, probability concepts, and hypothesis testing
- Proven experience with tree-based and ensemble machine learning models (RF, XGBoost, CatBoost, LightGBM)
- Experience working with unstructured data and NLP techniques, preferably including LLMs (OpenAI, Gemini, Llama, etc.)
- Practical exposure to GenAI workflows such as prompt engineering, fine-tuning, retrieval-augmented generation (RAG), or automated insight generation
- Expert-level SQL for data extraction, transformation, and analysis
- Strong Python skills for data analysis, machine learning, and LLM-based pipelines
- Experience using Git for source code management
- Solid exposure to cloud-based analytics environments, preferably Google Cloud Platform (GCP), including BigQuery and Cloud Storage
- Ability to work independently, manage deliverables, and drive tasks end to end
- Excellent verbal and written communication skills, essential for a client-facing role
Candidate Profile
- Bachelor's/Master's degree in economics, statistics, mathematics, computer science/engineering, operations research, or related analytics areas.
- Strong data analysis experience with complex, real-world datasets.
- Demonstrated capability in solving business problems using both traditional ML and emerging GenAI/LLM-based approaches.
- Superior analytical thinking and problem-solving skills.
- Outstanding written and verbal communication skills with confidence in client interactions.