The role requires a strong focus on data analysis, machine learning model development, and fraud detection across large-scale datasets. The ideal candidate collaborates closely with engineering and product teams to build scalable and reliable machine learning solutions that support data-driven decision-making. Exposure to model development, feature engineering, experiment tracking, and modern MLOps practices is a strong advantage.
Key Responsibilities
- Design, develop, and refine high-performance Fraud Prevention models using Python and Gradient Boosting frameworks such as XGBoost, LightGBM, or CatBoost.
- Manage the complete machine learning lifecycle including data extraction, feature engineering, model training, evaluation, and deployment support.
- Conduct data research, behavioural analysis, and performance benchmarking on production datasets.
- Write and optimize SQL queries to extract and analyse data from PostgreSQL databases for model development and validation.
- Utilize MLflow for experiment tracking, model versioning, and ensuring reproducibility across development stages.
- Maintain code integrity and collaborative workflows using Git and Bitbucket.
- Work within Linux environments and utilize shell scripting (Bash) to automate workflows and operational tasks.
- Develop visualizations and analytical insights using data visualization tools.
- Collaborate with cross-functional teams to improve model performance and data-driven decision-making.
- Ensure data privacy, security, and compliance best practices while working with production data.
Required Skills
- 5-8 years of overall experience in Data Science, Machine Learning, or related roles.
- 3-5 years of hands-on experience in Python-based data science and machine learning.
- Strong proficiency in Python and data science libraries such as Pandas, NumPy, and Scikit-learn.
- 1-2 years of experience working with Gradient Boosting frameworks such as XGBoost, LightGBM, or CatBoost.
- Strong knowledge of SQL and PostgreSQL for data extraction and analysis.
- Hands-on experience with MLflow for experiment tracking and model versioning.
- Experience with Jupyter Notebook or JupyterHub for model experimentation and data exploration.
- Proficiency in Git and Bitbucket for version control and collaborative development.
- Familiarity with Linux/Unix environments and basic shell scripting (Bash).
- Understanding of machine learning techniques including classification, anomaly detection, and feature engineering.
- Knowledge of data visualization tools such as Plotly, Matplotlib, or Seaborn.
- Strong analytical thinking, problem-solving ability, and attention to detail.
- Good communication and collaboration skills.
Added Advantage
- Experience working with large datasets or big data technologies such as Spark or Dask.
- Prior experience in Fintech, Banking, or Cybersecurity domains.
- Understanding of MLOps concepts including model deployment and monitoring in production environments.
- Familiarity with package and environment management tools such as Conda, pip, or Python virtual environments.
- Knowledge of data privacy and security best practices when handling production data.