Larsen & Toubro-Vyoma

AI Architect

15-20 Years
  • Posted 2 days ago

Job Description

Job Purpose

Design and architect end-to-end AI Cloud platforms with a focus on security, cost-efficiency, and performance. This position involves direct client engagement to translate requirements into technical solutions, including GPU infrastructure right-sizing and optimal model selection. We are looking for a cloud expert with a demonstrated ability to take complex AI models from concept to large-scale production. The ideal candidate brings extensive experience in AI/Cloud ecosystems and a successful track record of architecting and managing production-grade, large-scale AI platforms.

Key Responsibilities

  • Translate business requirements into scalable, high-performance AI/GenAI architectures featuring NVIDIA GPU clusters.
  • Design end-to-end AI Cloud and next-generation platforms optimized for deep learning workloads and distributed training.
  • Architect HPC cluster topologies utilizing high-speed InfiniBand (NDR/HDR) and RoCE v2 interconnects for low-latency communication.
  • Right-size platform components, including GPUs, CPUs, memory and NVMe storage for comprehensive client proposals.
  • Architect distributed training and inference environments optimized for MPI frameworks and workload scheduling via Slurm.
  • Design scalable container orchestration platforms using Kubernetes and Kubeflow to manage AI workloads.
  • Propose optimized inference strategies using vLLM, Triton, and TensorRT-LLM to meet specific latency and throughput KPIs.
  • Bring hands-on experience with RAG systems, agentic ecosystems, and multi-agent orchestration frameworks such as LangGraph.
  • Develop private AI cloud environments focused on data sovereignty and regulatory compliance, such as the India DPDP Act.
  • Define integration strategies for LLMs and open-source models within existing enterprise data systems, APIs, and knowledge graphs.
  • Establish reference architectures for CI/CD/CT pipelines and automated model retraining workflows to ensure reproducibility.
  • Implement automation and observability frameworks for monitoring GPU utilization, performance tuning, and failure handling.
  • Drive technical validation through Proof of Concept (PoC) engagements, focusing on scalability and performance benchmarks for LLM training.
  • Establish Infrastructure-as-Code (IaC) practices to ensure reproducible and reliable cluster deployments.
  • Collaborate with C-suite stakeholders and cross-functional teams to drive technical decision-making, innovation, and roadmap alignment.

Experience & Educational Requirements

Qualifications and Experience

EDUCATIONAL QUALIFICATIONS: (degree, training, or certification required)

BE/B-Tech or equivalent with Computer Science or Electronics & Communication

RELEVANT EXPERIENCE: 15-20 years of IT experience, with a minimum of 5 years in AI platforms

Required Technical Skills

Core AI/ML Expertise

  • Strong experience with NVIDIA, Intel, and Google GPU architectures and InfiniBand
  • Strong expertise in Kubernetes, Slurm and OpenShift
  • Good experience with Python, PyTorch, and TensorFlow
  • Good knowledge of LangChain and LangGraph
  • Deep understanding of Transformers, Attention mechanisms, Diffusion, MoE
  • Knowledge of RLHF, Pinecone, FAISS, Chroma, OpenAI, and vLLM
  • Expertise in RAG and agentic AI workflows
  • Knowledge of high-performance storage (Lustre, PFS, Object NVMe)
  • Good knowledge of NVIDIA architectures (Hopper, Blackwell)

Soft Skills

  • Strong problem-solving and analytical thinking
  • Excellent communication and stakeholder management
  • Ability to influence leadership and drive strategic decisions
  • Innovation mindset with focus on enterprise impact

Preferred Experience

  • Currently working in an AI/Cloud presales team.
  • Able to right-size infrastructure and choose the right GPU model for each client's requirements.
  • Hands-on with Python, vector DBs (Pinecone, FAISS, Chroma), and LLM APIs (OpenAI, Anthropic).
  • Solid understanding of cloud-native architecture: OpenStack, KVM, public clouds (Azure/AWS/GCP), microservices, Kubernetes, serverless, and API gateways.
  • Good knowledge of deep learning: CNNs, RNNs/LSTMs, Transformers, and attention mechanisms.
  • Proficiency in Python for ML: NumPy, pandas, scikit-learn, and frameworks such as PyTorch or TensorFlow.
  • Experience in integrating LLMs (GPT, Claude, Gemini, LLaMA, Mistral) into applications.
  • Prompt engineering skills: zero-shot, few-shot, chain-of-thought, ReAct, and structured output patterns.
  • Experience building RAG systems: document chunking, embedding models, vector search, and retrieval optimization.
  • Understanding of AI agent patterns, tool use, and agentic workflows.
  • Familiarity with Docker, CI/CD pipelines, and Git-based workflows.
  • Strong communication, stakeholder management, and solution design skills.

Job ID: 146194235