GEN AI Architect

Tredence Inc.

Bengaluru, India

Fresher

Posted 2 months ago

Job Description

Role: AI Architect, Agentic AI Systems

Context: Agentic analytics pipeline (root cause analysis, hypothesis testing, question expansion, report generation)

Stack: Python, LangGraph, LangChain, FastAPI, Celery, Redis, PostgreSQL, Databricks, Docker, Kubernetes, Azure, Context Engineering

About the Role

We are looking for an AI Architect to own the end-to-end design and evolution of our agentic AI pipeline. The candidate should be well versed in event-driven architecture and in using LangGraph to orchestrate multi-step workflows. You will define graph topologies, state schemas, routing and parallelization, and production patterns for observability, checkpointing, and scale.

What You'll Do

·Design and evolve LangGraph pipelines: Define and implement state graphs, node contracts, conditional routing, and parallel execution (e.g. the Send API) for question expansion, data sufficiency, and hypothesis-testing flows. Ensure clear state boundaries and reusable subgraphs.

·Own the agent framework and conventions: Maintain and extend our modular LangGraph framework: base node abstractions, lifecycle decorators, automatic registration, and fluent graph builders. Keep patterns consistent and reduce boilerplate across nodes.

·Orchestrate multi-LLM and tool usage: Integrate and tune multiple LLM providers (OpenAI, Google Vertex, Anthropic, Mistral) and tool chains. Design tool contracts, error handling, and human-in-the-loop or clarification flows where needed.

·Productionize agent pipelines: Integrate with FastAPI, Celery, and Redis for async execution, streaming (e.g. SSE/Redis Streams), and event publishing. Ensure observability (e.g. Langfuse or similar), logging, and tracing for debugging and SLA monitoring.

·Scale execution and data: Evolve execution from in-process to distributed where needed (e.g. Celery workers, Databricks Serverless for heavy or sensitive code execution). Design for security, isolation, and cost.

·Collaborate with product and data: Turn product requirements into graph design (nodes, edges, routing). Work with data/analytics on schema, SQL validation, and statistical testing integration, so agent outputs are reliable and interpretable.

·Design memory architectures: Long-term (cross-session knowledge, vector-backed retrieval), short-term (working memory within agent runs), and episodic (learning from past analyses).

·Manage context windows: Token budgeting across multi-step pipelines, summarization strategies, selective context injection, and graceful degradation when context limits are hit.
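
As a rough illustration of what "token budgeting" and "graceful degradation" can mean in practice, here is a minimal sketch. It is not taken from our codebase: the names ContextItem, estimate_tokens, and select_context are hypothetical, and the 4-characters-per-token estimate is a crude assumption standing in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    name: str
    text: str
    priority: int  # hypothetical convention: lower number = more important


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token. A real pipeline
    # would use the model provider's tokenizer instead.
    return max(1, len(text) // 4)


def select_context(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Keep the highest-priority items that fit within the token budget."""
    selected = []
    remaining = budget
    for item in sorted(items, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if cost <= remaining:
            selected.append(item)
            remaining -= cost
        # Items that do not fit are skipped; a fuller version might
        # summarize them rather than drop them.
    return selected


items = [
    ContextItem("system_prompt", "x" * 400, priority=0),    # about 100 tokens
    ContextItem("latest_question", "x" * 200, priority=1),  # about 50 tokens
    ContextItem("old_analysis", "x" * 4000, priority=2),    # about 1000 tokens
]
kept = select_context(items, budget=200)
print([item.name for item in kept])  # ['system_prompt', 'latest_question']
```

With a budget of 200 estimated tokens, the low-priority "old_analysis" item is left out while the two higher-priority items are kept.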

What We're Looking For

Must-have

·Strong Python software engineering and experience with async (asyncio) and production APIs (e.g. FastAPI).

·Hands-on LangGraph and LangChain (or equivalent agent/graph frameworks): building state graphs, conditional edges, subgraphs, and checkpointing. Understanding of state management and reducer patterns (e.g. operator.add for lists).

·LLM integration experience: multiple providers, prompt design, tool/function calling, and basic cost/latency tradeoffs.

·Systems and production mindset: APIs, task queues (e.g. Celery), Redis, RabbitMQ, PostgreSQL. Comfort with Docker and basic DevOps (logging, health checks, env-based config).

·Docker and Kubernetes deployment in a microservice architecture.

·Ability to design for clarity and maintainability: modular graphs, clear node boundaries, and documentation of flows and state.
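
For readers less familiar with the reducer pattern mentioned above, the following sketch shows the general idea using only the Python standard library. The state schema style (a TypedDict with an Annotated reducer) mirrors how LangGraph state is typically declared, but apply_update is a hypothetical helper written for illustration; it is not LangGraph's implementation, and AnalysisState is an assumed example schema.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class AnalysisState(TypedDict):
    question: str                                    # no reducer: overwritten
    hypotheses: Annotated[list[str], operator.add]   # reducer: lists are concatenated


def apply_update(state: dict, update: dict, schema=AnalysisState) -> dict:
    """Merge a node's partial update into the state (illustrative only)."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types carry their extra arguments in __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            reducer = metadata[0]
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"question": "Why did sales drop?", "hypotheses": ["pricing change"]}
state = apply_update(state, {"hypotheses": ["seasonality"]})
state = apply_update(state, {"question": "Why did Q3 sales drop?"})
print(state["hypotheses"])  # ['pricing change', 'seasonality']
print(state["question"])    # Why did Q3 sales drop?
```

Keys with a reducer accumulate updates from successive nodes, while keys without one are simply replaced by the latest value.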

Nice-to-have

·Experience with MCP (Model Context Protocol) or similar agent tool protocols.

·Observability for LLM/agent systems (e.g. Langfuse, OpenTelemetry, or custom tracing).

·Databricks (or Spark): serverless jobs, notebooks, or code execution for large or sensitive data.

·Statistics/ML: hypothesis testing, EDA, or working with data scientists on automated analysis pipelines.

·Azure (or other cloud): Blob/Storage, app hosting, and security (e.g. OAuth, API keys).

·Experience with knowledge graphs.

How to Apply

Share a resume and a short note on:

1. A system you designed or significantly changed that involved agents, workflows, or multi-step LLM pipelines (what you built, tradeoffs, and what you'd do differently).

2. Your experience with LangGraph/LangChain (or similar) and production deployment of agent systems.

3. Your GitHub repository or OSS contributions.