Holcim

Data Scientist (Gen/Agentic AI solutions)

Job Description

Location:

Navi Mumbai, MH, IN, 400708

Requisition ID: 17702

About Holcim

As the global leader in innovative and sustainable building materials, Holcim is reinventing the way the world builds. Supported by a 60,000-strong global team spread across 70 markets and four industry segments, we are committed to shaping a greener, smarter and healthier world. We at Holcim believe that logistics performance is best managed locally at the country level, with support from corporate logistics. We envision a digitized supply chain that manages the group's logistics performance in real time, and we aim to establish a true cost-to-serve model, with data-based decisions that support cost reduction and business growth by improving customer service and achieving zero harm.

About The Role

Qualifications:

  • BE / B. Tech in Computer Science, Engineering or relevant field
  • Graduate degree in Data Science or other quantitative field is preferred
  • Strong mathematics skills (e.g. statistics, algebra)
  • Certification in Gen/Agentic AI solutions
  • Certification in platforms (Databricks, AWS) is preferred

Experience:


  • 8+ years of progressive experience in data science and machine learning, with a minimum of 3 years focused on Generative AI and LLM-based systems.
  • Demonstrated track record of delivering AI solutions at enterprise scale
  • Hands-on experience with full AI/ML lifecycle management: data engineering, feature stores, model training, evaluation, deployment, and monitoring.
  • Industry experience is preferred, especially in the manufacturing function of the building materials industry, or in the manufacturing, process, or pharma industries.

Required skills:


  • Proficiency in Python; strong working knowledge of relevant libraries & frameworks: LangChain, LangGraph, HuggingFace Transformers, PyTorch, scikit-learn, Pandas, and NumPy.
  • Deep experience with LLM APIs (OpenAI, Anthropic, Google, Mistral) and open-source model deployment via Ollama, vLLM, or TGI.
  • Solid command of vector databases, semantic search, and knowledge graph technologies for enterprise RAG architectures.
  • Proficiency with MLOps tooling: MLflow, Weights & Biases, Kubeflow, or similar; experience with LLMOps tools such as LangSmith.
  • Strong SQL and experience with modern data platforms such as Databricks for AI-ready data preparation.
  • Understanding of software engineering best practices: version control (Git), containerization (Docker/Kubernetes), API design (REST/gRPC), and CI/CD pipelines.
  • Knowledge of how to benchmark GenAI models beyond simple accuracy (e.g., toxicity, bias, and reasoning depth).
  • Exposure to multi-modal AI systems incorporating vision, audio, or structured document understanding (PDFs, tables, charts).
  • Good understanding of GenAI standards (MCP, A2A, A2UI, etc.).

Key Responsibilities:


  • Platform Prototyping: Design and implement core ML components, such as feature stores, model registries, and automated evaluation pipelines.
  • Standardization: Establish best practices for the ML lifecycle, from data labeling and experimentation to CI/CD for ML (MLOps).
  • Scalability: Optimize model inference and training workflows to handle high-throughput, low-latency requirements.
  • Internal Consulting: Act as a subject matter expert for product-facing data science teams, helping them leverage platform tools to solve complex business problems.
  • Tooling & Automation: Build internal libraries and SDKs that simplify the transition from a local research environment to a distributed production environment.
  • RAG Infrastructure: Design and optimize high-performance retrieval systems using vector databases (e.g., Pinecone, Weaviate) and advanced semantic search techniques.
  • LLM Evaluation Frameworks: Build automated vibe-check replacements. Develop rigorous evaluation pipelines using LLM-as-a-judge, G-Eval, or custom scoring rubrics to measure hallucination, faithfulness, and relevancy.
  • Agentic Orchestration: Develop and standardize the use of agentic frameworks (e.g., LangGraph, CrewAI) to allow product teams to build complex, multi-step AI workflows.
  • Model Lifecycle Management: Manage the transition between model providers (OpenAI, Anthropic, Google) and open-source alternatives (Llama 3+, Mistral) through unified abstraction layers.
  • Cost & Latency Optimization: Implement caching strategies (e.g., GPTCache), prompt compression, and token-usage monitoring to ensure the platform remains economically viable.
  • Guardrails & Safety: Integrate real-time content filtering and PII masking to ensure all LLM outputs comply with corporate security and ethical standards.

Results-oriented, with a work ethic of delivering on time and in scope.

Did we spark your interest? Build your future with us and apply.

Job ID: 147316295