Data Scientist I (Last Mile)

Zepto

Bengaluru, India

Fresher

Job Description

About Zepto

Zepto is a leading Indian consumer technology company and a pioneer in the fast-growing quick-commerce sector. Founded in 2021, Zepto delivers groceries and everyday essentials to millions of customers within minutes through a technology-driven logistics and supply chain network. The company operates at the intersection of product innovation, data science, and customer experience, building scalable consumer platforms in one of the world's most dynamic digital markets.

Zepto is backed by prominent global and institutional investors, including General Catalyst, Lightspeed Venture Partners, Nexus Venture Partners, StepStone Group, Avra Ventures, and the California Public Employees' Retirement System (CalPERS). In its most recent pre-IPO funding round, Zepto raised approximately USD 450 million at a valuation of nearly USD 7 billion, underscoring strong investor confidence in the company's long-term growth and market leadership.

Responsibilities, aka what we'd like you to deliver:

Perform advanced feature engineering using real-time and historical data (e.g., spatial signals, environmental factors)
Design and train regression and tree-based models for high-scale, low-latency production systems
Work on feature selection, model tuning, and validation to improve prediction accuracy and robustness
Collaborate with data engineering teams to build scalable data pipelines
Contribute to real-time inference systems
Monitor model performance, detect drift, and support automated retraining pipelines
Analyze model errors and build correction layers to capture residual patterns and edge cases
Partner with product and operations teams to translate business problems into ML solutions

Are you who we're searching for?

We are looking for a highly motivated Data Scientist I to join our team. This role focuses on building and improving machine learning models that power real-time delivery time predictions at scale. The role involves designing robust regression and tree-based models, developing high-quality features from real-time and historical data, and contributing to systems that operate under strict latency and scale constraints.

You will collaborate closely with Data Engineers, MLOps Engineers, Product, and Operations teams to build scalable data pipelines, deploy models into production, and continuously improve prediction accuracy through monitoring and retraining.

The ideal candidate is passionate about solving real-world problems using data, has a strong foundation in machine learning and statistics, and is excited to work on large-scale systems where model performance directly impacts customer experience.

Technical Skills (Must Have)

Strong programming skills in Python
Strong SQL skills for data extraction, transformation, and analysis
Solid understanding of machine learning algorithms, especially tree-based models (XGBoost, LightGBM, Random Forest) and regression techniques (linear, quantile, regularized models)
Hands-on experience with feature engineering and feature selection techniques
Understanding of model evaluation metrics and validation strategies
Basic knowledge of model monitoring concepts

Core Competencies

Strong analytical and problem-solving skills with attention to detail
Ability to break down complex real-world problems into structured ML components
Curiosity to explore non-obvious features and signals from data
Strong debugging and data validation skills
Ability to work in high-scale, fast-paced environments with evolving data
Clear communication and collaboration with cross-functional teams
Ownership mindset with a focus on impact and continuous improvement

Preferred Experience

Experience working on time prediction / forecasting problems (e.g., ETA, demand forecasting)
Exposure to large-scale ML systems or real-time prediction pipelines
Experience with PySpark and Databricks
Familiarity with spatial data processing (e.g., geo-coordinates, H3 indexing)
Understanding of model monitoring, drift detection, and retraining pipelines
Experience with A/B testing and experimentation frameworks
Knowledge of handling high-throughput, low-latency ML systems

0–2 years of experience in Data Science, Machine Learning, or related roles
Freshers with strong internships / academic projects in ML, data analysis, or large-scale data processing are encouraged to apply
Hands-on exposure to building and evaluating ML models
Experience working with real-world datasets, feature engineering, and model evaluation is a plus

Qualification

Bachelor's or Master's degree in Computer Science, Engineering, Mathematics, Statistics, or a related quantitative field
Strong foundation in probability, statistics, linear algebra, and machine learning fundamentals
Demonstrated academic or project experience in applied machine learning and data-driven problem solving

Sounds like we're talking about you? Please add your application to our cart!

Equal Opportunity Statement: Zepto is an equal opportunity employer and is committed to building an inclusive workplace. All qualified applicants will receive consideration without regard to race, color, religion, gender, sexual orientation, national origin, or disability status.