Experience: 5 years
Notice Period: Immediate joiner
Job Location: Bangalore (Hybrid)
Job Description:
Key Responsibilities
Core Modeling & Algorithmic Work
- Develop and optimize models for classification, regression, clustering, forecasting, and recommendation systems.
- Use a range of algorithms such as (a minimal sketch follows this list):
  - Regression Models: Linear, Ridge, Lasso, ElasticNet, Quantile, Poisson, etc.
  - Classification Models: Logistic Regression, Decision Trees, Random Forests, XGBoost, LightGBM, SVM, Neural Networks, etc.
  - Unsupervised Learning: K-Means, DBSCAN, Hierarchical Clustering, PCA, t-SNE, Autoencoders.
  - Time Series & Forecasting: ARIMA, SARIMA, Prophet, LSTM, and hybrid models.
  - Recommendation Systems: Collaborative filtering, Matrix factorization, Content-based and hybrid approaches.
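As a rough illustration of the modeling breadth above, here is a minimal scikit-learn sketch fitting one regression and one classification model on synthetic data; the data, model choices, and hyperparameters are illustrative only, not a prescribed stack.

```python
# Minimal sketch: one regression model and one classification model on synthetic data.
# Data, model choices, and hyperparameters are illustrative only.
from sklearn.datasets import make_regression, make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Regression: Ridge on a synthetic continuous target
X_reg, y_reg = make_regression(n_samples=1000, n_features=20, noise=10.0, random_state=0)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(X_reg, y_reg, random_state=0)
ridge = Ridge(alpha=1.0).fit(Xr_tr, yr_tr)
print("Ridge R^2:", ridge.score(Xr_te, yr_te))

# Classification: random forest on a synthetic binary target
X_clf, y_clf = make_classification(n_samples=1000, n_features=20, random_state=0)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(X_clf, y_clf, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xc_tr, yc_tr)
print("Forest accuracy:", forest.score(Xc_te, yc_te))
```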
Evaluation Metrics & Model Assessment
- Select appropriate evaluation metrics based on business goals and problem types (see the sketch after this section):
  - Classification: Accuracy, Precision, Recall, F1-score, ROC-AUC, PR-AUC, Log Loss, Cohen's Kappa, Matthews Correlation Coefficient.
  - Regression: RMSE, MAE, R², Adjusted R², MAPE, sMAPE.
  - Ranking/Recommenders: NDCG, MAP@K, Recall@K, Hit Rate.
  - Clustering: Silhouette score, Davies-Bouldin Index, Calinski-Harabasz score.
  - Forecasting: MSE, RMSE, MAPE, sMAPE, Theil's U statistic.
- Perform cross-validation, bootstrapping, and A/B testing for robust model validation.
- Monitor model drift, bias, and fairness across data slices.
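For context, a minimal sketch of metric computation and cross-validation with scikit-learn on synthetic, imbalanced data; the metric choices and thresholds are examples, not a prescription.

```python
# Minimal sketch: classification metrics plus 5-fold cross-validation.
# Synthetic data and metric choices are illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

# Threshold-dependent and threshold-free views of the same model
print("F1:", f1_score(y_te, pred))
print("ROC-AUC:", roc_auc_score(y_te, proba))

# 5-fold cross-validation for a more robust estimate
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("CV ROC-AUC: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```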
Research & Experimentation
- Stay current with research trends in ML, DL, and applied AI (e.g., transformer models, self-supervised learning, and causal inference).
- Conduct experiments to improve baseline models using new architectures or ensemble approaches.
- Document hypotheses, results, and model interpretation clearly for cross-functional collaboration.
Required Skills & Qualifications
- Education: Master's or Bachelor's in Computer Science, Mathematics, Statistics, Data Science, or a related quantitative discipline.
- Experience: 6–7 years in core data science or applied ML, with end-to-end project ownership.
- Programming: Proficient in Python (pandas, NumPy, scikit-learn, statsmodels, XGBoost, LightGBM, TensorFlow/PyTorch).
- Data Handling: Strong in SQL and data wrangling with large-scale structured and unstructured datasets.
- Mathematics & Statistics: Excellent foundation in probability, linear algebra, optimization, and hypothesis testing.
- Model Evaluation: Proven expertise in selecting and interpreting metrics aligned to business goals.
- Visualization: Skilled in Matplotlib, Seaborn, and Plotly, with the ability to turn data-driven insights into clear narratives.
- MLOps: Experience with MLOps practices, A/B testing, and data versioning tools (e.g., DVC, MLflow); a minimal tracking sketch follows.
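A minimal experiment-tracking sketch with MLflow, assuming `mlflow` and `scikit-learn` are installed; the experiment name, parameters, and metric are placeholders.

```python
# Minimal sketch: logging one training run to MLflow.
# Experiment name, parameters, and metric are placeholders.
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlflow.set_experiment("demo-experiment")  # hypothetical experiment name
with mlflow.start_run():
    params = {"n_estimators": 100, "learning_rate": 0.1}
    model = GradientBoostingClassifier(**params).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

    mlflow.log_params(params)
    mlflow.log_metric("roc_auc", auc)
```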
Nice to Have
- Knowledge of causal inference, Bayesian modeling, and Monte Carlo simulations.
- Familiarity with transformer-based models (BERT, GPT, etc.) for NLP tasks.
- Hands-on experience with graph analytics or network science.
- Experience mentoring junior data scientists and reviewing model design.
- Exposure to cloud ML stacks (AWS SageMaker, GCP Vertex AI, or Azure ML Studio).
Soft Skills
- Strong analytical thinking and problem-solving orientation.
- Ability to balance scientific rigor with business pragmatism.
- Excellent communication with both technical and non-technical audiences.
- Curious, self-driven, and comfortable working in fast-paced environments.