
teraops

Agentic System Engineer

Fresher
  • Posted 4 hours ago
  • Be among the first 10 applicants

Job Description

TeraOps is building the next generation of cloud cost and margin intelligence for modern, consumption-based platforms. As cloud-native architectures evolve to support AI-heavy workloads, traditional cost-optimization approaches break down. GPU runtimes, bursty inference patterns, and autonomous systems introduce a new class of cost risk that legacy tools cannot handle.

TeraOps addresses these challenges by combining deep cloud infrastructure intelligence with agentic AI systems that can analyze, reason about, and act on complex AWS environments. Our goal is not to build another reporting dashboard, but to create a system that understands cloud behavior and enforces economic discipline at scale.

The Role

We are hiring an AI & Agentic Systems Engineer to join the TeraOps engineering team. This role is designed for a hands-on AI expert who moves beyond simple prompt engineering to build sophisticated, multi-step agentic architectures that interact with real-world infrastructure.

You will serve as:

●       The Agentic AI Subject Matter Expert (SME) for the TeraOps platform.

●       An expert engineer building the reasoning engine that translates cloud metadata into autonomous actions.

●       A technical leader defining how our agents adapt to new tools, manage memory, and optimize their own inference costs.

This is not a research role; it is a systems engineering role. You will guide how our AI observes, plans, and executes within AWS, ensuring our Expert Agents are as efficient as they are intelligent.

What You Will Do

  1. Agentic Architecture & Tool Adaptation

●      Design and implement multi-agent workflows using frameworks like MCP, LangGraph, or Bedrock Agents to solve complex cloud optimization problems.

●      Develop the Tools Adaptation layer to enable agents to surgically interact with AWS APIs (S3, EC2, RDS) for last-mile remediation.

●      Define the safeguards and constraints that govern how AI systems act on customer environments to ensure risk-free execution.

  2. Context Intelligence & Memory Management

●      Build the Context Intelligence engine that ingests multi-dimensional data (CUR, CloudWatch, App Telemetry) to provide agents with a full-picture view of infrastructure.

●      Implement advanced Memory Management strategies, including RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol), so that agents have the right data at the right time.

●      Optimize retrieval patterns to reduce token noise, directly lowering the AI unit cost of the platform.
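
As an illustration of the retrieval optimization described above, here is a minimal Python sketch that packs the highest-scoring retrieved chunks into a fixed token budget. It is not TeraOps' implementation: the function names, the relevance scores, and the rough four-characters-per-token estimate are all assumptions made for the example.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: roughly 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_context(chunks: List[Tuple[float, str]], budget_tokens: int) -> Tuple[List[str], int]:
    """Greedily keep the highest-scoring chunks that still fit the budget.

    chunks is a list of (relevance_score, text) pairs, for example the
    output of a vector search. Returns the selected texts and tokens used.
    """
    selected: List[str] = []
    used = 0
    for _score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue  # skip chunks that would overflow the budget
        selected.append(text)
        used += cost
    return selected, used
```

A large but only moderately relevant chunk is skipped when it would crowd out smaller, relevant ones, which is one simple way to cut token noise and cost per request.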

  3. Inference Optimization & Model Routing

●      Implement Model Routing logic to balance performance, latency, and cost—automatically choosing the right model (e.g., Claude 3.5 Sonnet vs. Haiku) based on task complexity.

●      Track and optimize AI Unit Economics to ensure the platform remains profitable even during 10,000x usage spikes from power users.

●      Design for Agentic Efficiency—reducing the number of reasoning loops required to reach a confident execution plan.
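
The model routing idea above can be sketched in a few lines of Python. This is a hedged illustration only: the tier labels echo the models named in this posting, while the prices, complexity thresholds, task fields, and scoring heuristic are hypothetical placeholders rather than real rates or TeraOps logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelTier:
    name: str                  # label only, not a real model identifier
    cost_per_1k_tokens: float  # placeholder price, not a real rate
    max_complexity: int        # highest complexity score this tier handles


# Ordered cheapest first, so the first match is the cheapest adequate tier.
TIERS: List[ModelTier] = [
    ModelTier("haiku", 0.001, 3),
    ModelTier("sonnet", 0.01, 10),
]


def complexity_score(task: dict) -> int:
    """Hypothetical heuristic: more tools and write access mean a harder task."""
    score = len(task.get("tools", []))
    if task.get("mutates_infrastructure", False):
        score += 5
    return score


def route(task: dict) -> ModelTier:
    """Return the cheapest tier whose threshold covers the task."""
    score = complexity_score(task)
    for tier in TIERS:
        if score <= tier.max_complexity:
            return tier
    return TIERS[-1]  # fall back to the most capable tier
```

For example, a read-only task with one tool routes to the cheaper tier, while a task that changes infrastructure routes to the more capable one.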

  4. Leadership

●      Collaborate closely with AWS Architects to align AI reasoning with AWS Well-Architected principles.

●      Own technical outcomes, not just code—success is measured by the realized savings and ROI our agents deliver to customers.

●      Set the engineering standards for building, testing, and deploying agentic systems at scale.

Required Qualifications

●      Deep LLM & Agentic Expertise: Proven experience building production-grade AI systems that go beyond simple chat—specifically involving agentic workflows, tool-use, and multi-step reasoning.

●      Advanced Python Engineering: Mastery of Python for systems integration, with experience using AI and cloud SDKs (LangChain, Boto3, OpenAI/Anthropic).

●      Memory & RAG Mastery: Strong understanding of vector databases, context window management, and semantic retrieval strategies.

●      Cloud-Native Mindset: Familiarity with AWS services and how to programmatically observe and control them.

●      Pragmatic AI Specialist: Ability to reason about the trade-offs between model accuracy, inference speed, and cost.

●      Systems Thinker: Ability to design AI that can connect the dots between disparate data sources like billing logs and runtime telemetry.

●      High Ownership Mentality: Self-directed and comfortable operating with the ambiguity of a founding-stage startup.

Why This Role Is Different

  • Traditional AI roles are about building models. This role is about building an AI workforce. You are creating a system that doesn't just talk about problems but solves them autonomously within the world's most complex cloud environments. If you are excited by the challenge of making AI economically viable and operationally reliable at scale, this role is core to TeraOps' mission.

Job ID: 147499361