
Myntra

Senior Software Engineer

  • Posted 9 hours ago

Job Description

About The Company

Who are we?

Myntra is India's leading fashion and lifestyle platform, where technology meets creativity. As pioneers in fashion e-commerce, we've always believed in disrupting the ordinary.

We thrive on a shared passion for fashion, a drive to innovate and lead, and an environment that empowers each one of us to pave our own way. We're bold in our thinking, agile in our execution, and collaborative in spirit.

Here, we create MAGIC by inspiring vibrant and joyous self-expression and expanding fashion possibilities for India, while staying true to what we believe in.

We believe in taking bold bets and changing the fashion landscape of India. We are a company that is constantly evolving into newer and better forms and we look for people who are ready to evolve with us.

From our humble beginnings as a customization company in 2007 to being technology and fashion pioneers today, Myntra is going places and we want you to take part in this journey with us.

Working at Myntra is challenging but fun: we are a young and dynamic team, firm believers in meritocracy and equal opportunity, and we encourage intellectual curiosity and empower our teams with the right tools, space, and opportunities.

You will be part of: the Myntra Data Science Team

Myntra is a one-stop shop for all your fashion and lifestyle needs. As India's largest online store for fashion and lifestyle products, Myntra.com aims to provide a hassle-free and enjoyable shopping experience to shoppers across the country, with the widest range of brands and products on offer. The brand makes a conscious effort to bring the power of fashion to shoppers with an array of the latest and trendiest products available in the country.

The Myntra Data Science team is at the forefront of innovation, delivering cutting-edge solutions that drive significant revenue and enhance customer experiences across various touchpoints. Our model systems impact millions of customers every day, leveraging real-time, near-real-time, and offline state-of-the-art models and data systems with diverse latency and throughput requirements. These models are built on massive datasets and event streams, creating opportunities to innovate in model-centric AI, data-centric AI, agile active-learning systems, incremental feedback-loop-driven systems, and vision and text content sciences, with further opportunities in a rapidly growing customer and catalogue/category space. Customers, supply, and the business all benefit from these models.

We take pride in training and deploying solutions that not only utilize state-of-the-art ML, DL, and GenAI techniques but also contribute to the research community through multiple peer-reviewed publications. The success of these systems depends heavily on massive event and dimension datasets, real-time and batch data pipeline infrastructure, and fault-tolerant API systems that adapt to all seasons, including very-high-scale seasonal sales.

Roles and Responsibilities

  • Design and build resilient, scalable ML/DL recommender APIs with measurable online metrics.
  • Build cost-effective yet scalable, ultra-low-latency, semi-real-time event streams.
  • Build deep-learning model pipelines, pipeline abstractions, and Python libraries; partner with applied data scientists to build world-class agile, incremental, sequential learners.
  • Set up and manage Continuous Integration/Continuous Deployment (CI/CD) pipelines for automated testing, deployment, and model integration.
  • Experiment with and upgrade data science (DS) inference systems, and build best-in-class RESTful frameworks. Streamline OS and Python environment and version migrations; co-own end-to-end production model service testing, capacity planning, container/service health monitoring, and alert automation.
  • Co-create DS service asynchronous logging modules, rate limiters, and circuit-breaker modules; automate smart service management in Kubernetes clusters; and integrate with code/PII data protection modules.
  • Build generic and charter-centric derived event tables stitching together events, features, and user feedback loops, with appropriate partitions, to streamline sequential or online ML learners, contextual bandit learners, etc.
  • GPU ops opportunity: learn and optimise GPU services for encoder/predictor models using TensorRT, Triton, and other frameworks, and champion GPU ML inference optimisations across the org.
  • Opportunity to learn, own, and add capabilities to offline ML/DL model performance evaluation metric systems (MLflow, Arize, W&B, Neptune, etc.).
  • Build ML/DL observability 2.0 modules combining statistical methods, causal ML, and LLM-as-judge techniques.
  • Co-own cost optimisation to maximise the ratio of impact per user to infra cost per user.
  • Assist data science teams in improving dev cost efficiency.
  • Opportunity to co-create an LLM-as-judge framework.
  • Opportunity to streamline embedding ANNs, HNSW, quantisation, and vector databases.
  • Opportunity to build services with open-source SLMs and computer vision models.
  • Opportunity to experiment with and deploy agentic AI solutions using pre-trained LLMs, context/prompt engineering principles, and best-in-class RAG/GraphRAG/API retrieval mechanisms.
  • Build a culture of strong production coding discipline in APIs, and of design and solution documentation.
  • Work with data analysts, data experts, and product/engineering to create custom queries and pipelines so that applied DS can efficiently create A/B metric dashboards, conduct error analyses, and run faster A/B iterations.
  • Bring a strong engineering mindset: build automated monitoring, alerting, and self-healing capabilities. Review designs and implementations in collaboration with architects and advocate the latest DL/ML engineering practices across the tech org.
  • Collaboration: Work closely with Product Managers, Platforms, and Engineering teams to ensure smooth deployment and integration of ML models into Myntra production systems.
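To give a flavour of the service-resilience work mentioned above (rate limiters and circuit-breaker modules), here is a minimal circuit-breaker sketch in plain Python. All names and thresholds are illustrative, not Myntra's actual implementation:

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: opens after `max_failures` consecutive
    failures, then rejects calls until `reset_timeout` seconds pass."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                # Fast-fail instead of hammering an unhealthy downstream
                raise RuntimeError("circuit open: call rejected")
            # Timeout elapsed: half-open, let one trial call through
            self.opened_at = None
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # a success resets the failure count
        return result
```

In a production inference service this pattern typically wraps calls to downstream feature stores or model servers, so that a struggling dependency degrades gracefully rather than cascading timeouts through the API.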

Desired Skills And Experience

  • 3 to 5 years of hands-on experience and proven expertise building end-to-end, robust, complex, and large data engineering pipelines in PySpark or Scala.
  • 1+ years of experience with ML batch feature pipelines; experience with near-real-time Kafka consumer pipelines (Spark Structured Streaming or Flink) is ideal.
  • Broad experience working with data blobs, delta lakes, and storage for ML workflows.
  • 3+ years of experience in ML API and ML pipeline engineering is a must.
  • 2+ years of excellent coding experience in Python, PySpark (Python 3), and Flask/Falcon/FastAPI.
  • Solid experience with Kafka consumers/producers and read/write connectors to Aerospike/Redis DBs.
  • Experience with ML orchestration tools (Airflow, Kubeflow, MLflow).
  • Must have experience with Qdrant/Milvus or other vector DBs.
  • Understanding of the architecture and design of data engineering products; able to articulate the trade-offs.
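The vector-DB requirement above boils down to similarity search over embeddings. As a minimal sketch of the underlying idea, here is exact top-k retrieval by cosine similarity in pure Python; real systems like Qdrant or Milvus replace the brute-force scan with approximate indexes such as HNSW, and all item names here are made up:

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k(query, index, k=2):
    """Exact top-k neighbours of `query` by cosine similarity.
    `index` maps item id -> embedding vector.
    Returns (score, item) pairs, best first."""
    scored = ((cosine(query, vec), item) for item, vec in index.items())
    return heapq.nlargest(k, scored)
```

A vector DB makes the same query sub-linear over millions of items via ANN indexes and quantisation, which is exactly the trade-off (recall vs. latency vs. memory) the role expects candidates to articulate.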

Nice To Have

  • Experience building feature store pipelines is a plus.
  • Experience with CI/CD, API testing, and monitoring is a plus.
  • Solid experience with GPU ops is a plus.
  • Experience with LLM ops for inference and training is a plus.
  • Experience building production-grade context/prompt engineering pipelines and production-grade multi-agentic AI frameworks is a plus.

Required Skills

PySpark, Python, Scala, API, LLM, ML pipelines, MLflow, Kubeflow

Required Education

B.Tech


Job ID: 134377269
