
Tredence Inc.

GEN AI Architect

  • Posted 3 days ago

Job Description


Role: AI Architect, Agentic AI Systems


Context: Agentic analytics pipeline (root cause analysis, hypothesis testing, question expansion, report generation)


Stack: Python, LangGraph, LangChain, FastAPI, Celery, Redis, PostgreSQL, Databricks, Docker, Kubernetes, Azure, Context Engineering


About the Role


We are looking for an AI Architect to own the end-to-end design and evolution of our agentic AI pipeline. The ideal candidate is well versed in event-driven architecture and in using LangGraph to orchestrate multi-step workflows. You will define graph topologies, state schemas, routing and parallelization strategies, and production patterns for observability, checkpointing, and scale.


What You'll Do


Design and evolve LangGraph pipelines: Define and implement state graphs, node contracts, conditional routing, and parallel execution (e.g. the Send API) for question expansion, data sufficiency, and hypothesis-testing flows. Ensure clear state boundaries and reusable subgraphs.


Own the agent framework and conventions: Maintain and extend our modular LangGraph framework, including base node abstractions, lifecycle decorators, automatic registration, and fluent graph builders. Keep patterns consistent and reduce boilerplate across nodes.


Orchestrate multi-LLM and tool usage: Integrate and tune multiple LLM providers (OpenAI, Google Vertex, Anthropic, Mistral) and tool chains. Design tool contracts, error handling, and human-in-the-loop or clarification flows where needed.


Productionize agent pipelines: Integrate with FastAPI, Celery, and Redis for async execution, streaming (e.g. SSE/Redis Streams), and event publishing. Ensure observability (e.g. Langfuse or similar), logging, and tracing for debugging and SLA monitoring.


Scale execution and data: Evolve execution from in-process to distributed where needed (e.g. Celery workers, Databricks Serverless for heavy or sensitive code execution). Design for security, isolation, and cost.


Collaborate with product and data: Turn product requirements into graph design (nodes, edges, routing). Work with data/analytics on schema, SQL validation, and statistical testing integration, so agent outputs are reliable and interpretable.


Design memory architectures: Long-term (cross-session knowledge, vector-backed retrieval), short-term (working memory within agent runs), and episodic (learning from past analyses).


Manage context windows: Token budgeting across multi-step pipelines, summarization strategies, selective context injection, and graceful degradation when context limits are hit.


What We're Looking For


Must-have


Strong Python software engineering and experience with async (asyncio) and production APIs (e.g. FastAPI).


Hands-on experience with LangGraph and LangChain (or equivalent agent/graph frameworks): building state graphs, conditional edges, subgraphs, and checkpointing. Understanding of state management and reducer patterns (e.g. operator.add for lists).


LLM integration experience: multiple providers, prompt design, tool/function calling, and basic cost/latency tradeoffs.


Systems and production mindset: APIs, task queues (e.g. Celery), Redis, RabbitMQ, PostgreSQL. Comfort with Docker and basic DevOps (logging, health checks, env-based config).


Docker and Kubernetes deployment in a microservice architecture.


Ability to design for clarity and maintainability: modular graphs, clear node boundaries, and documentation of flows and state.


Nice-to-have


Experience with MCP (Model Context Protocol) or similar agent tool protocols.


Observability for LLM/agent systems (e.g. Langfuse, OpenTelemetry, or custom tracing).


Databricks (or Spark): serverless jobs, notebooks, or code execution for large or sensitive data.


Statistics/ML: hypothesis testing, EDA, or working with data scientists on automated analysis pipelines.


Azure (or other cloud): Blob/Storage, app hosting, and security (e.g. OAuth, API keys).


Experience with knowledge graphs.


How to Apply


Share a resume and a short note on:


1. A system you designed or significantly changed that involved agents, workflows, or multi-step LLM pipelines (what you built, tradeoffs, and what you'd do differently).


2. Your experience with LangGraph/LangChain (or similar) and production deployment of agent systems.


3. Your GitHub repository or open-source contributions.



Job ID: 144720329