Aviato Consulting is looking for an AI Engineer who has moved past the prototype phase of Generative AI. At this level, you aren't just calling APIs—you are building resilient, cost-effective, and deeply integrated AI architectures on Google Cloud. You will be responsible for designing systems that can reason, use external tools, and orchestrate complex workflows while maintaining strict enterprise standards for latency, cost, and data security.
If you know the difference between a cool demo and a Day 2 production system that handles edge cases, state management, and rigorous evaluation, this role is for you.
What You Will Do (Key Responsibilities)
- Architect Advanced Retrieval Systems: Design highly scalable retrieval pipelines for complex, unstructured enterprise data. You will make strategic build vs. buy decisions to balance retrieval latency, token costs, and multi-tenant security.
- Build Tool-Using Agents: Develop autonomous AI agents that behave predictably while executing multi-step reasoning workflows. You will integrate these agents with external APIs, ensuring strict schema compliance and robust error handling.
- Orchestrate Multi-Agent Workflows: Design asynchronous, event-driven architectures where specialized AI personas collaborate to solve complex tasks. You will implement robust state management, idempotency, and Human-in-the-Loop (HITL) circuit breakers.
- Establish Scientific Evaluation Pipelines: Move beyond vibe checks by building automated, metric-driven CI/CD pipelines for LLMs. You will create frameworks to quantitatively measure model groundedness and reasoning quality before migrating to new foundation models.
- Optimize Model Performance: Apply parameter-efficient tuning and context-management strategies to inject domain-specific knowledge into foundation models, ensuring they remain highly capable without degrading base intelligence or inflating inference costs.
What You Bring (Required Skills)
- Cloud Architecture: 3+ years of deep, hands-on experience in the Google Cloud ecosystem, specifically with serverless deployments (Cloud Run, Cloud Functions) and asynchronous event handling (Cloud Tasks, Pub/Sub).
- Applied Generative AI: Proven experience building production-grade LLM applications (using Gemini, GPT-4, or Claude).
- Agentic Frameworks: Deep understanding of how to build and debug agentic loops (Reasoning/Acting), enforce structured API outputs, and manage conversational memory state at scale.
- Search & Vector Math: Hands-on experience with vector databases, semantic search algorithms, hybrid retrieval strategies, and advanced document parsing methodologies.
- Engineering Rigor: A strong foundation in Python backend development. You write testable, modular code and understand how to handle the inherent unpredictability of LLMs with strict backend validation.
What We Offer (Beyond the Standard)
- Continuous Learning: We are committed to your growth, providing opportunities to deepen your AI and GCP skills.
- Equity Opportunity: Become a part-owner of Aviato (after 6 months) and share in our success.
- Remote Flexibility: Work remotely from anywhere in India, aligning with IST to collaborate effectively with our clients.
- Direct Contribution: Your ideas and contributions will be valued and have a tangible impact on our practice.
Ready to be a key player in our growing AI practice and make a real difference for enterprise clients?