
insiteverse

Founding AI OPS Engineer (AIOps + DevOps + CloudOps)

10-12 Years
  • Posted a day ago

Job Description

Founding AI OPS Engineer

AIOps + DevOps + CloudOps — Where Intelligence Meets Infrastructure

Equity Only | Pre-Seed Stage Startup | India Only

About Us

We are a US-based startup with a registered office in India. We are building an industry-leading FinTech mobile app that brings hedge-fund-grade trading intelligence to everyday investors. Think Robinhood, but powered by AI-driven insights, ultra-low-latency systems, and radically transparent user experiences. This role starts part-time: we need a minimum commitment of 20 hours per week in exchange for significant equity while we secure funding, and we plan to onboard you as a full-time employee within the next 6–8 months. You will partner with our data science, AI engineering, quantitative research, frontend, and backend teams.

As part of our founding technical team, you will inherit, optimize, and implement the operational (Dev/Cloud Ops) foundation that powers our entire platform, from AI model infrastructure and trading services to real-time observability and self-healing systems.

Role Overview

An AI-driven Cloud/DevOps Engineer (GitHub CI/CD + Cloud Provisioning/Operations) is a rare hybrid who combines deep knowledge of cloud infrastructure with AIOps principles. You will bridge development and cloud operations, using AI to automate, optimize, and proactively manage our complex, distributed FinTech environment, reducing manual toil and eliminating incidents before they reach production.

Experience

  • Minimum 10 years of hands-on work experience in DevOps, CloudOps, AIOps, or a closely related engineering discipline.

Key Responsibilities

Cloud Operations (CloudOps)

  • Provision and manage cloud resources on AWS and Azure using both portal-based workflows and Infrastructure as Code (IaC) via Terraform.
  • Design and maintain cloud IAM policies in JSON and YAML for both AWS and Azure, ensuring least-privilege access controls.
  • Architect scalable, cost-efficient cloud environments aligned with FinTech compliance and security requirements.

Development Operations (DevOps)

  • Develop, manage, and optimize GitHub Actions workflows for CI/CD pipelines across all services.
  • Implement and maintain container orchestration using Docker and Kubernetes (EKS) for reliable deployments.
  • Manage deployment lifecycle, rollbacks, blue-green deployments, and feature flag strategies.

AIOps & Intelligent Monitoring

  • Build ML-powered predictive alerting systems to identify anomalies and forecast incidents before they impact production.
  • Integrate and administer AIOps platforms such as Splunk and Datadog into the existing IT infrastructure.
  • Configure observability stacks using Prometheus and Grafana for real-time dashboards and SLO tracking.

Automated Incident Response

  • Design and implement self-healing mechanisms and automated runbooks to resolve recurring IT issues without manual intervention.
  • Develop intelligent playbooks that leverage AI to triage, categorize, and route incidents automatically.
  • Define and own Mean Time to Resolution (MTTR) SLAs; continuously improve incident response workflows.

Log, Data & Root Cause Analysis

  • Analyze high-volume logs, metrics, and distributed traces for root cause analysis (RCA) using ML-assisted tooling.
  • Build pipelines that ingest operational telemetry into analytics platforms for continuous insight generation.
  • Correlate signals across systems to surface hidden dependencies and failure patterns.

AI Model Lifecycle Management

  • Maintain, scale, and monitor AI/ML models in production to ensure consistent performance and reliability.
  • Implement model versioning, A/B deployment strategies, and automated rollback on degraded performance.
  • Collaborate with ML engineers to operationalize new models with minimal downtime.

Essential Skills & Qualifications

Programming & Scripting

  • High proficiency in Python — automation scripts, data pipelines, CLI tooling, and ML integrations.
  • Proficiency in Bash/Shell scripting for system-level automation and operational tasks.
  • Familiarity with YAML and JSON for configuration management and IAM policy authoring.

Cloud & Infrastructure

  • Hands-on experience with AWS services: EC2, Lambda, S3, EKS, IAM, CloudWatch, VPC.
  • Working knowledge of Microsoft Azure services and Azure IAM (RBAC, Managed Identities).
  • Proficiency with Terraform for Infrastructure as Code across multi-cloud environments.
  • Experience with Docker and Kubernetes for containerized workloads and microservices.

AIOps & Observability

  • Experience with AIOps platforms: Splunk, Datadog, Dynatrace, or equivalent.
  • Proficiency in Prometheus and Grafana for metrics collection, alerting, and dashboards.
  • Familiarity with distributed tracing tools (Jaeger, OpenTelemetry) and log aggregation (ELK Stack).

AI / ML Knowledge

  • Working understanding of ML algorithms, anomaly detection techniques, and NLP.
  • Exposure to Generative AI and Large Language Models (LLMs, GPT-based systems) in operational contexts.
  • Ability to interpret model performance metrics and identify drift or degradation in production.

DevOps & CI/CD

  • Proficiency with GitHub Actions for CI/CD pipeline development and management.
  • Understanding of GitOps principles, branching strategies, and release management.
  • Experience with secrets management tools such as AWS Secrets Manager.

Analytical & Debugging Skills

  • Strong ability to debug complex, distributed systems at scale.
  • Systematic approach to root cause analysis using data-driven methodologies.
  • Comfortable operating in high-ambiguity, early-stage environments with shifting priorities.

Why Join Us

  • Founding equity stake: You will own a meaningful share of what we are building.
  • Greenfield architecture: No legacy systems; design the stack from scratch, the right way.
  • High-impact role at the intersection of AI and financial technology.
  • Direct collaboration with founders; your decisions shape the product and company direction.
  • A mission that democratizes institutional-grade trading intelligence for everyday investors.

Job ID: 147320615