We're looking for an applied AI specialist with strong hands-on experience working with Large Language Models (LLMs). This is a practical, implementation-focused role, not a pure software engineering position.
What you'll do
- Design and run LLM evaluation pipelines to assess quality, accuracy, and performance
- Work with RAG systems, including chunking strategies, embeddings, and retrieval tuning
- Build and iterate on agent-based workflows for real-world use cases
- Experiment with prompting, fine-tuning, and model selection across LLM providers
- Analyze outputs, identify failure modes, and propose improvements
- Collaborate with product and engineering to translate use cases into AI solutions
What we're looking for
- Strong hands-on experience with LLMs (GPT, Gemini, Claude, etc.)
- Experience building and maintaining LLM evaluation frameworks
- Practical knowledge of RAG, vector databases, and retrieval strategies
- Experience with fine-tuning LLMs and training ML models
- Solid understanding of agents, tools, and multi-step reasoning workflows
- Comfortable working with experimentation, metrics, and qualitative analysis
Nice to have
- Experience with production AI systems
- Familiarity with AI observability and monitoring
- Background in applied ML or data science
This role is ideal for someone who enjoys applied AI problem-solving, experimentation, and improving real-world LLM performance rather than building large systems from scratch.