Bridge.ai

Lead Principal Agentic RAG Engineer – Python & AI Platforms

  • Posted an hour ago

Job Description

About the Role

We are hiring a senior, hands-on Agentic RAG Engineer to design, build, and operate the Retrieval-Augmented Generation platforms that power our autonomous AI agents.

This is a lead-by-example role:

  • You design the architecture
  • You write the Python
  • You ship to production
  • You mentor engineers by building real systems

You will lead the technical direction of RAG and agent memory systems, while remaining deeply involved in implementation, observability, and operational readiness.

GCP is our primary platform, but all designs should be multi-cloud capable.

 

Key Responsibilities

RAG & Backend Engineering (Python-First)

  • Design and build production-grade RAG pipelines
  • Implement:
      • Retrieval strategies
      • Vector database integrations
      • Agent memory and state management
      • Prompt orchestration and chaining
  • Build scalable Python services using FastAPI / Django / similar
  • Integrate LLM APIs (OpenAI, Claude, Gemini) and open-source models (Llama, Mistral)
  • Implement model/version rollout, rollback, and simulation testing

 

Agentic Systems & Workflow Design

  • Build and operate multi-step agent workflows
  • Enable:
      • Tool calling
      • Human-in-the-loop interventions
      • Safe agent execution patterns
  • Define patterns for:
      • Prompt versioning
      • Context management
      • Token and cost control
  • Collaborate closely with AgentOps to ensure production-safe execution

 

Full-Stack & Observability

  • Design and contribute to internal UIs for:
      • Agent execution monitoring
      • Decision and reasoning audits
      • Prompt testing and visualization
  • Implement structured logging and telemetry for:
      • Retrieval quality
      • Agent decisions
      • Token usage and latency
  • Work with Prometheus, Grafana, OpenTelemetry, or ELK-based stacks

 

Cloud, DevOps & Production Readiness

  • Own deployment pipelines for RAG and agent services
  • Work hands-on with:
      • Docker
      • Kubernetes
      • Terraform
      • CI/CD pipelines
  • Ensure secure API design, auth, sandboxing, and operational guardrails
  • Optimise for scalability, performance, and cloud cost efficiency on GCP

 

Technical Leadership & Team Enablement

  • Act as technical lead for Agentic RAG engineering
  • Set architectural standards and best practices
  • Review code and designs with a high bar
  • Mentor engineers in:
      • Pythonic system design
      • RAG correctness and evaluation
      • Production-grade GenAI systems
  • Partner with Product and Platform leads on roadmap and delivery

 

Required Skills & Experience

Core Engineering

  • 5+ years of strong Python engineering experience
  • Proven backend or full-stack development background
  • Experience with FastAPI, Django, Flask, or similar frameworks
  • Comfort contributing to frontend systems (React / Next.js / Vue) when needed

 

RAG, LLMs & Agentic Systems

  • Hands-on experience building RAG pipelines in production
  • Strong understanding of:
      • Vector databases and retrieval strategies
      • Prompt chaining and context handling
      • Agent workflows and tool invocation
  • Experience with frameworks such as:
      • LangChain
      • LangGraph
      • LlamaIndex
      • AutoGen / CrewAI

 

Cloud & Platform Engineering

  • Strong experience with GCP, including:
      • Vertex AI
      • GKE / Compute Engine
      • Cloud Functions
      • Cloud Storage, Pub/Sub
  • Hands-on DevOps skills:
      • Docker
      • Kubernetes
      • Terraform
      • CI/CD tooling
  • Understanding of secure APIs, auth, and sandboxing patterns

 

Nice to Have

  • Multi-cloud experience (AWS, Azure)
  • Experience with Responsible AI, guardrails, and eval frameworks
  • Contributions to open-source AI or infrastructure projects
  • Experience building internal tooling or monitoring dashboards for AI systems

 

What Success Looks Like

  • RAG systems are accurate, observable, and cost-efficient
  • Agent failures are explainable and debuggable
  • Engineers follow clear, scalable RAG patterns you defined
  • Product teams trust agent outputs in production
  • You are the technical authority for RAG at BridgeAI

 

What This Role Is (and Is Not)

✔ Deeply hands-on technical leadership
✔ Python-first engineering
✔ Production-grade RAG ownership
✔ Mentorship through real code

✖ Not a research-only role
✖ Not a people-manager-only role
✖ Not a demo or prototype position

Job ID: 148379449