Glance

SDE III Gen AI

  • Posted 10 days ago

Job Description

What You Will Be Doing

- Design and implement production-ready generative AI applications that serve millions of users, from initial architecture through deployment and monitoring

- Build advanced RAG (Retrieval-Augmented Generation) pipelines that combine vector databases, hybrid search, and intelligent caching to deliver sub-second response times

- Develop multimodal AI systems that seamlessly integrate text, vision, and audio capabilities using state-of-the-art models

- Architect scalable microservices that handle thousands of concurrent AI requests while optimizing for cost, latency, and reliability

- Lead code reviews and technical design sessions, establishing best practices and architectural patterns that elevate the entire team's capabilities

- Optimize large language models through fine-tuning techniques to achieve domain-specific performance improvements

- Implement comprehensive MLOps practices including automated testing, model versioning, A/B testing frameworks, and real-time monitoring dashboards

- Collaborate with product managers and stakeholders to translate complex business requirements into innovative AI solutions

- Deploy AI models on cloud platforms (GCP) using containerization and orchestration technologies

- Create and maintain technical documentation, runbooks, and architectural decision records that enable knowledge sharing across teams

- Mentor junior engineers through pair programming, technical talks, and hands-on guidance to accelerate their growth

- Research and prototype emerging AI technologies to identify opportunities for competitive advantage

Gen AI Responsibilities

- Fine-tune and optimize state-of-the-art language models for specific business use cases, achieving significant improvements in accuracy and relevance

- Design multi-agent AI systems using orchestration frameworks such as LangGraph and Google ADK to coordinate complex workflows and decision-making processes

- Implement advanced prompt engineering strategies including Tree of Thoughts, ReAct patterns, and automatic prompt optimization to maximize model performance

- Build production-grade embedding systems that handle billions of vectors, implementing efficient indexing strategies and hybrid search capabilities

- Develop computer vision pipelines for tasks ranging from object detection to visual question answering

- Create secure AI applications with robust safeguards against prompt injection, jailbreaking, and data leakage while maintaining compliance with AI governance standards

- Optimize token usage and implement intelligent caching strategies to reduce costs by 50-70% while maintaining quality

- Design and implement evaluation frameworks that go beyond traditional metrics, incorporating human feedback loops and domain-specific quality measures

- Build real-time AI inference systems capable of processing streaming data with sub-100ms latency requirements

- Integrate multiple foundation models into unified applications, implementing fallback mechanisms and load balancing for high availability

- Develop custom tools and functions that extend LLM capabilities, enabling models to interact with databases, APIs, and external systems

- Implement advanced RAG techniques including contextual embeddings, cross-encoder reranking, and Graph RAG for complex reasoning tasks

- Create multimodal search systems that enable users to query across text, images, and documents using natural language

- Build AI-powered data processing pipelines that automatically extract, transform, and enrich unstructured data at scale

- Deploy edge AI solutions using frameworks like ONNX and TensorRT, optimizing models for resource-constrained environments

What We're Looking For

- 5+ years of hands-on experience building and deploying ML/AI systems, with at least 2 years focused on generative AI and LLMs

- Expert-level Python programming skills with deep knowledge of async programming, multiprocessing, and performance optimization

- Strong experience with modern AI frameworks including PyTorch, Transformers, LangChain, and vector databases

- Proven track record of deploying AI applications to production environments serving real users at scale

- Deep understanding of transformer architectures, attention mechanisms, and the latest advances in generative AI

- Experience with cloud platforms (GCP) and containerization technologies (Docker, Kubernetes)

- Excellent communication skills with the ability to explain complex AI concepts to both technical and non-technical audiences

- Proven experience improving large-scale product search and discovery — including dense retrieval with bi-encoders, cross-encoder reranking, query understanding, and hybrid BM25 + vector search across catalogs of tens of millions of SKUs

- Hands-on experience building and deploying production multi-agent systems using orchestration frameworks such as LangGraph and Google ADK — designing stateful, tool-augmented agents for complex, real-world workflows

- Bachelor's degree in Computer Science, Mathematics, or a related field (Master's preferred, but not required for candidates with relevant experience)

Nice to Have

- Published research papers or significant contributions to open-source AI projects

- Experience with multimodal AI systems combining vision, language, and audio

- Domain expertise in specific verticals (healthcare, finance, legal, e-commerce)

- Knowledge of AI safety, alignment, and constitutional AI principles

- Experience building AI infrastructure and platforms used by other engineers

- Familiarity with emerging technologies like neural architecture search, mixture of experts, or neuromorphic computing

Job ID: 147506127

Bengaluru, India

Skills:

MLOps, PyTorch, Docker, Kubernetes, Python, LangChain, orchestration frameworks, multi-agent systems, generative AI, vector databases, Transformers, large language models, AI frameworks, cloud platforms (GCP)