Job Description
About the Role
This is a hands-on role for someone who's equally strong in research thinking and engineering execution. You will define what good looks like for AI-powered features, build systems that meet that bar, and own them through launch and beyond. From writing eval plans to debugging failures in production, you will work across the stack and across functions to ship reliable, high-quality LLM-driven systems.
What You Will Do
You will design, build, and ship AI-backed features that are reliable in production.
- Define the quality bar: design eval rubrics, test plans, and rollout criteria. Make sure they're measurable and enforced.
- Build with real-world constraints: write and extend production code, set up monitoring, and add tests that catch regressions before users do.
- Own features end to end: from problem framing and modeling through system design, rollout, and iteration.
- Debug failures across the stack, including data, infra, model, and prompt logic, and harden the system with what you learn.
- Design and implement systems such as retrieval pipelines, agents, or hybrid patterns, based on what the problem actually needs.
- Work across functions: collaborate with product, infra, and engineering teams to ship features that stick.
What It Takes
You have built and shipped AI systems before and carried the load when things broke post-launch.
- Strong research instincts: you are good at defining what "working" means and designing evaluations that reflect real-world usage.
- Solid engineering skills: you write clean, testable Python, debug at system boundaries, and know your way around production stacks.
- LLM understanding: you have worked with modern models and know how to prompt, fine-tune, or wrap them with tooling and evaluation.
- Systems mindset: you think in interfaces, data contracts, failure modes, and rollout plans, not just model tweaks.
- Practical bias: you care more about what ships and survives than what's novel.
- Ownership: you take initiative, communicate clearly, and push for quality without being asked.
Required Skills
- Python
Additional Information
Preferred background: B.Tech/M.Tech from a Tier-1 college, with experience at B2B AI companies or companies operating at large scale.