Role Summary
We are looking for a Senior LLM Engineer with strong hands-on experience in agentic AI systems. The role focuses on designing and implementing multi-agent workflows, building LLM evaluation pipelines, and optimizing LLM behavior using prompt engineering. This is a core IC role requiring ownership of production-grade LLM systems.
- Agentic Workflow Design (Primary - Core):
- Design and implement agent-based workflows using LangChain and LangGraph.
- Build stateful, cyclic, and conditional agent graphs for autonomous decision-making.
- Manage agent memory, tool usage, retries, and execution control.
- Ensure reliability and scalability of agent pipelines in production.
- LLM Evaluation & Quality Control:
- Implement LLM-as-a-Judge evaluation frameworks.
- Design evaluation prompts in which stronger models validate accuracy, safety, and instruction adherence.
- Automate response scoring and regression testing for prompt and model changes.
- Prompt Engineering & Model Usage:
- Perform advanced prompt engineering and prompt tuning to align outputs with product requirements.
- Manage and optimize calls to multiple LLM providers (OpenAI, Anthropic, and open-source models).
- Optimize for latency, cost, and response quality.
- Data Integrity & Validation:
- Enforce strict schema validation on LLM inputs and outputs.
- Analyze LLM responses to ensure compliance with business and technical requirements.
- Implement guardrails to reduce hallucinations and invalid outputs.
- Ownership & Delivery:
- Operate as an individual contributor, owning features end-to-end.
- Collaborate with product and engineering teams on AI system design.
- Contribute to architectural decisions related to agentic systems.
Required Skills & Experience
- 7+ years of strong Python development experience
- Hands-on experience building LLM-based applications
- LangChain (mandatory)
- LangGraph (highly preferred)
- Experience with LLM evaluation frameworks (LLM-as-a-Judge or equivalent)
- Strong prompt engineering and prompt tuning experience
Good to Have (Optional)
- Exposure to PEFT / LoRA / adapters
- Experience with open-source LLMs
- Familiarity with RAG pipelines and vector databases
(ref:hirist.tech)