Exp Range - 5+ Years
Work Mode - 3 days WFO / Bangalore
Company & Project Overview:
We are moving beyond traditional automation into the world of multi-agent logic. As a Production Support Engineer, you won't just be monitoring logs. As a core member of the operations team, you will be the first line of defense in ensuring our AI agents think, act, and communicate accurately.
The Role:
We are seeking a Production Support Engineer who thrives on technical discovery. This is a high-impact role that bridges the gap between traditional backend support and modern AI engineering. You will maintain the health of our high-concurrency Python environment, troubleshoot complex data flows in Snowflake, and refine the behavior of our AI agents in real time.
Key Responsibilities:
- Agentic Oversight: Monitor and debug multi-agent orchestration (Google ADK), focusing on tool-calling accuracy and logic flow.
- Incident Management: Triage and resolve production issues across a high-concurrency backend built on FastAPI and async Python.
- Data Integrity: Execute and optimize SQL queries in Snowflake to validate data consistency and resolve discrepancies.
- AI Reliability: Identify and mitigate LLM hallucinations or logic errors through Prompt Engineering and Vertex AI observability tools.
- Full-Stack Troubleshooting: Assist in UI/UX debugging (React/TypeScript) to ensure seamless integration between the AI backend and the user interface.
- Cloud Operations: Manage services within the GCP Ecosystem, specifically Vertex AI, Discovery Engine, and Cloud Run.
Technical Profile:
We prioritize problem-solving ability and technical curiosity over a perfect 1:1 match on years of experience. If you have the foundation and the drive to learn the rest, we want to hear from you.
Core Competencies:
- Backend: Proficiency in Python (Async experience is a major plus) and FastAPI.
- Data: Strong SQL skills and experience with cloud data warehouses like Snowflake.
- Cloud: Experience with GCP (Vertex AI, Cloud Run) or similar cloud environments (AWS/Azure).
- Frontend: Familiarity with React and TypeScript for basic troubleshooting.
- AI/LLM: An understanding of Prompt Engineering and the mechanics of Agentic frameworks.
Qualifications & Mindset:
- The Can-Do Spirit: You are a self-starter who owns a problem from discovery to resolution.
- Curiosity-Driven: You enjoy taking apart complex systems to understand how they work.
- Adaptability: You are comfortable jumping into unfamiliar stacks and learning on the fly.
- Communication: Ability to understand and communicate technical AI behaviors to both engineering and non-technical stakeholders.