
ValueMomentum

Sr. Gen AI Engineer

Posted 14 hours ago

Job Description

Position Summary:

This role is hands-on and delivery-focused: you will move from ambiguity to production-ready designs while ensuring solutions are secure, monitored, and resilient, with clear human-in/on-the-loop checkpoints and strong controls against data leakage. You'll also apply classical ML (supervised/unsupervised learning, feature engineering, evaluation, deployment) alongside GenAI to build hybrid solutions that perform reliably in production.

Role:

You will lead and contribute to high-complexity initiatives including:

Agentic AI Delivery

  • Architect and implement AI agents and agentic workflows for insurance use cases (e.g., claim triage, fraud detection signals, document understanding, decision support).
  • Design multi-agent orchestration patterns (tool routing, planning, reflection/verification, delegation, prioritization, and fallback strategies).
  • Build and maintain MCP-based integrations (Model Context Protocol) to connect agents with enterprise tools/services in a governed, repeatable way.
  • Create reusable agent frameworks, templates, and accelerators for consistent delivery across multiple problem domains.
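The orchestration patterns above can be illustrated with a minimal tool-routing sketch. The tool names and the keyword-based routing heuristic are illustrative only, not a real agent framework API; a production system would use an LLM planner and governed MCP tool integrations.

```python
# Minimal sketch of agent tool routing with a fallback strategy.
from typing import Callable, Dict

def lookup_policy(query: str) -> str:
    # Stand-in for a real policy-system tool call.
    return f"policy-record for: {query}"

def summarize(query: str) -> str:
    # Stand-in for a safe, general-purpose fallback tool.
    return f"summary of: {query}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "policy_lookup": lookup_policy,
    "summarize": summarize,
}

def route(query: str) -> str:
    """Pick a tool by a simple keyword heuristic; default to summarize."""
    if "policy" in query.lower():
        return "policy_lookup"
    return "summarize"

def run_agent_step(query: str) -> str:
    tool_name = route(query)
    try:
        return TOOLS[tool_name](query)
    except Exception:
        # Fallback strategy: degrade to the safest tool rather than failing.
        return TOOLS["summarize"](query)
```

The same skeleton extends to planning, reflection/verification, and delegation by layering additional routing and review steps around `run_agent_step`.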

RAG, Retrieval, and Knowledge Systems

  • Design and implement RAG pipelines using embeddings, indexing strategies, chunking, reranking, and query rewriting.
  • Build and tune vector embedding strategies for insurance corpora (policies, claims notes, SIU artifacts, adjuster docs, knowledge bases).
  • Implement vector cache patterns to reduce latency/cost and improve response stability.
  • Apply and integrate graph-based retrieval and reasoning (knowledge graphs / graph networks) for entity relationships and multi-hop retrieval.
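The retrieve-and-rank step of a RAG pipeline can be sketched as follows. The bag-of-words embedding here is a toy stand-in; a real pipeline would call an embedding model and a vector database, and add chunking, reranking, and query rewriting around this core.

```python
# Minimal sketch of retrieval by cosine similarity over toy embeddings.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: token counts. Real systems use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the top-k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

For example, retrieving against chunks of claims notes and policy text returns the chunk with the greatest lexical overlap first; swapping in dense embeddings changes only `embed` and `cosine`.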

Traditional Machine Learning

  • Design, train, and deploy classical ML models for risk and operations use cases (fraud scoring, triage/prioritization, anomaly detection, severity prediction, propensity/next-best-action).
  • Perform feature engineering across structured/semi-structured data sources (claims, policy, billing, customer interactions, documents/metadata).
  • Select appropriate algorithms and techniques (e.g., logistic regression, tree-based models such as XGBoost/LightGBM/CatBoost, random forests, time series, clustering, outlier/anomaly detection, graph-based features, and calibration methods).
  • Build robust evaluation pipelines (AUC/PR, lift, calibration, stability/drift metrics, fairness checks) and model validation aligned to the business decision context.
  • Implement ML lifecycle best practices: reproducible training, versioning, experiment tracking, packaging, deployment, and monitoring.
  • Develop hybrid AI systems where LLM/agents augment ML (e.g., using LLMs for enrichment/extraction while ML performs scoring; or ML acts as guardrails/routing logic for agent decisions).
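One of the evaluation metrics named above, ROC AUC, can be computed directly from scores and labels via the rank-comparison (Mann-Whitney) formulation. This is a sketch for intuition; a production evaluation pipeline would use a library such as scikit-learn and add lift, calibration, stability/drift, and fairness checks alongside it.

```python
# Minimal sketch: ROC AUC as the fraction of (positive, negative) pairs
# where the positive example receives the higher score (ties count 0.5).
def roc_auc(labels: list[int], scores: list[float]) -> float:
    pairs = 0
    concordant = 0.0
    for i, (yi, si) in enumerate(zip(labels, scores)):
        for yj, sj in zip(labels[i + 1:], scores[i + 1:]):
            if yi == yj:
                continue  # only mixed-label pairs contribute
            pairs += 1
            pos_score = si if yi == 1 else sj
            neg_score = sj if yi == 1 else si
            if pos_score > neg_score:
                concordant += 1.0
            elif pos_score == neg_score:
                concordant += 0.5
    return concordant / pairs if pairs else 0.0
```

An AUC of 0.5 means the scorer is no better than chance at ranking positives above negatives, which is why AUC is paired with calibration and drift metrics before a score drives a business decision.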

Observability, Monitoring, and Automation

  • Implement agentic workflow monitoring automation (latency, cost, tool success rates, retrieval hit-rate, hallucination indicators, quality metrics).
  • Build traceability across prompts, tools, retrieval sources, and model outputs to support debugging and audit needs.
  • Establish model output lineage and run-level provenance (input → retrieval context → tool calls → output → downstream actions).
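Run-level provenance can be captured with a single record per run that links input, retrieval context, tool calls, output, and downstream actions. The field names below are illustrative, not a standard schema.

```python
# Minimal sketch of a run-level provenance record for audit and debugging.
import json
from dataclasses import dataclass, field, asdict
from typing import Any

@dataclass
class RunRecord:
    run_id: str
    user_input: str
    retrieval_context: list[str] = field(default_factory=list)
    tool_calls: list[dict[str, Any]] = field(default_factory=list)
    output: str = ""
    downstream_actions: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialized records can be shipped to a log store for audit retention.
        return json.dumps(asdict(self))

# Example: populate the record as the run progresses.
record = RunRecord(run_id="run-001", user_input="summarize claim C-42")
record.retrieval_context.append("claim C-42 adjuster notes")
record.tool_calls.append({"tool": "summarize", "status": "ok"})
record.output = "Claim C-42: water damage, low severity."
record.downstream_actions.append("routed to fast-track queue")
```

Because every stage appends to the same record, a reviewer can replay input → retrieval context → tool calls → output → downstream actions for any single run.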

Governance, Risk Controls, and Fail-Safe Design

  • Engineer solutions with explicit controls for:
      • Data leakage prevention (prompt injection defense, secrets handling, policy enforcement, data minimization).
      • Auditability (decision rationale, evidence capture, reproducibility, and retention of run artifacts).
      • Fail-safe behavior (timeouts, retries, circuit breakers, graceful degradation, safe defaults).
  • Design human-in-the-loop / human-on-the-loop checkpoints for review, escalation, and override in high-risk steps.
  • Partner with security and governance stakeholders to ensure solutions meet Erie's enterprise controls and compliance expectations.
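The fail-safe behaviors listed above can be combined into one small pattern: bounded retries, a circuit breaker, and a safe default that escalates to a human. The threshold values and the `ESCALATE_TO_HUMAN` default are illustrative assumptions, not a prescribed policy.

```python
# Minimal sketch: retries + circuit breaker + graceful degradation.
class CircuitBreaker:
    def __init__(self, max_failures: int = 3):
        self.failures = 0
        self.max_failures = max_failures

    @property
    def open(self) -> bool:
        # Open circuit means: stop calling the failing dependency.
        return self.failures >= self.max_failures

def call_with_failsafe(fn, breaker: CircuitBreaker, retries: int = 2,
                       safe_default: str = "ESCALATE_TO_HUMAN") -> str:
    if breaker.open:
        return safe_default  # fail fast instead of hammering a down service
    for _ in range(retries + 1):
        try:
            result = fn()
            breaker.failures = 0  # success resets the breaker
            return result
        except Exception:
            breaker.failures += 1
            if breaker.open:
                break
    return safe_default  # graceful degradation after exhausting retries
```

Returning a safe default rather than raising keeps the workflow alive while routing the high-risk decision to the human checkpoint.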

Fraud Detection & Insurance Analytics Enablement

  • Collaborate with fraud/SIU and analytics teams to design agent-supported workflows for:
      • Case summarization and evidence gathering
      • Signal enrichment and prioritization
      • Pattern discovery across claims, payments, providers, and narratives
  • Support experimentation and production rollout with measurable success criteria and controls.

Duties & Responsibilities

Essential Functions

  • Designs and implements production-ready agentic AI systems with orchestration, tools, and retrieval under minimal supervision.
  • Designs, trains, and operationalizes traditional ML models with strong evaluation, validation, and monitoring.
  • Builds secure, auditable, and traceable AI workflows with end-to-end observability and run-level lineage.
  • Leads optimization of agent workflows for quality, latency, cost, and reliability; identifies bottlenecks and eliminates failure modes.
  • Troubleshoots complex distributed failures across AI services, retrieval systems, tool integrations, and CI/CD pipelines.

Additional Responsibilities

  • Develops technical design documentation (architecture, data flow, threat modeling, observability plans, runbooks).
  • Implements automated testing strategies (unit, integration, evaluation harnesses, regression suites for prompts/retrieval).
  • Establishes best practices for prompt management, versioning, evaluation, and controlled rollout.
  • Mentors engineers and contributes to engineering standards for enterprise AI delivery.

Required Qualifications

Education & Experience

  • Bachelor's degree in Computer Science, Engineering, Data Science, or related field (or equivalent practical experience).
  • Minimum 5 years of hands-on experience building AI solutions, with deep, current hands-on delivery in modern agentic workflows.
  • Extensive hands-on experience with machine learning concepts, demonstrated through proven industry delivery.

Core Technical Requirements (Must Have)

  • Proven experience architecting and implementing:
      • AI agents, agent orchestration, and tool-using systems
      • Agentic workflow optimization (quality/cost/latency/reliability tradeoffs)
      • Agent monitoring automation and operational runbooks
  • Strong, hands-on traditional ML experience:
      • Supervised learning, model selection, feature engineering, evaluation, calibration, and deployment
      • Experience with at least one of: XGBoost/LightGBM/CatBoost, scikit-learn, PyTorch/TensorFlow (as appropriate)
      • Proven experience building production scoring systems and monitoring model performance/drift
  • Strong understanding and implementation experience with:
      • Data leakage risks, prompt injection vectors, and mitigations
      • Traceability, auditability, and evidence-based outputs (citations/grounding)
      • Fail-safe system design, robust error handling, and workflow troubleshooting
  • Hands-on experience with:
      • AWS Bedrock (or equivalent managed LLM platforms) and secure enterprise integration patterns
      • RAG, embeddings, vector databases, and retrieval tuning
      • Vector caching and performance optimization
      • Graph networks / knowledge graph concepts applied to retrieval or reasoning
  • Experience designing human-in/on-the-loop workflow checkpoints and escalation patterns.
  • Strong system design skills: distributed components, reliability patterns, scaling, and production support.
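The vector-caching requirement above can be sketched as a semantic cache: reuse a cached answer when a new query embeds close enough to a previously answered one, skipping the model call. The toy embedding and the 0.8 similarity threshold are illustrative assumptions.

```python
# Minimal sketch of a vector (semantic) cache for latency/cost reduction.
import math

def embed_query(text: str) -> dict[str, int]:
    # Toy embedding: token counts. Real caches use dense model embeddings.
    vec: dict[str, int] = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def similarity(a: dict[str, int], b: dict[str, int]) -> float:
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorCache:
    def __init__(self, threshold: float = 0.8):
        self.entries: list[tuple[dict[str, int], str]] = []
        self.threshold = threshold

    def get(self, query: str):
        q = embed_query(query)
        best = max(self.entries, key=lambda e: similarity(q, e[0]), default=None)
        if best and similarity(q, best[0]) >= self.threshold:
            return best[1]  # cache hit: skip the expensive model call
        return None  # cache miss: caller invokes the model, then put()

    def put(self, query: str, answer: str):
        self.entries.append((embed_query(query), answer))
```

Beyond cost and latency, a semantic cache improves response stability, since near-duplicate queries return the identical cached answer.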

Tooling & Engineering Practices

  • Strong proficiency in Python and/or TypeScript/Node.js in production environments.
  • Experience with modern CI/CD, infrastructure-as-code, and cloud-native practices.
  • Comfort working across APIs, event-driven workflows, and integration patterns.

Preferred Qualifications (Nice to Have)

  • Insurance domain experience: claims, underwriting, billing, SIU/fraud workflows, or document-heavy enterprise operations.
  • Experience with AWS AI/ML services beyond Bedrock (or equivalents), such as:
      • Document extraction, entity recognition, speech-to-text, vision, search, personalization, etc.
  • Experience with feature stores, offline/online feature consistency, and real-time scoring architectures.
  • Production experience with evaluation frameworks (offline/online evals, groundedness checks, red-teaming, regression testing).
  • Experience implementing policy-as-code style controls for AI (guardrails, content filters, tool allowlists, PII handling).
  • Experience building knowledge graphs / graph retrieval systems (entity resolution, relationship inference, graph queries).

What Success Looks Like (First 30–60 Days)

  • Delivers at least one end-to-end agentic workflow to a production-ready standard (or strong pilot) with:
      • A measurable ML component (model + evaluation + monitoring) and/or a hybrid ML + agent architecture
      • Retrieval grounding and clear evidence trails
      • Monitoring dashboards and alerting
      • Documented failure modes and safe fallbacks
      • Human review points for high-risk decisions
  • Establishes reusable patterns for MCP tools, agent orchestration, ML lifecycle, and evaluation/monitoring that other teams can adopt.
  • Demonstrates measurable improvements in quality and/or efficiency (latency, cost, throughput, or reduced manual effort).

Working Style & Collaboration

  • Operates effectively in ambiguity and can translate business problems into robust technical designs.
  • Communicates clearly with both engineering peers and non-technical stakeholders.
  • Comfortable partnering with governance/security/data teams to ensure compliant delivery.

Keywords (for sourcing)

Agentic AI, AI Agents, Orchestration, MCP (Model Context Protocol), AWS Bedrock, RAG, Embeddings, Vector DB, Vector Cache, Knowledge Graph, Graph Networks, Observability, Traceability, Auditability, Prompt Injection Defense, Data Leakage Prevention, Human-in-the-Loop, Fraud Detection, Machine Learning.

Job ID: 147213239

Skills:

PyTorch, C, Java, TensorFlow, Python, Milvus, Vector Databases, Pinecone