DevOps Engineer

GeekyAnts

Mumbai, India

6-8 Years

Posted 2 days ago

Job Description

We are seeking an experienced DevOps / Site Reliability Engineer (L5) to own and scale the production operations of a large-scale, AI-first platform. In this role, you will be responsible for reliability, performance, observability, and cost efficiency across cloud-native workloads running on GCP and Kubernetes. You will work closely with platform, data, and AI teams to ensure resilient, secure, and highly available systems in production.

Key Responsibilities

Own day-2 production operations for a large-scale AI-driven platform running on Google Cloud Platform (GCP).

Run, scale, and harden GKE-based Kubernetes workloads integrated with GCP managed services (data, messaging, AI, networking, and security).

Define, implement, and operate SLIs, SLOs, and error budgets across platform and AI services.

Build and manage end-to-end observability using New Relic (APM, infrastructure monitoring, logging, alerts, and dashboards).

Design, improve, and maintain CI/CD pipelines and Terraform-driven infrastructure automation.

Operate and integrate Azure AI Foundry for LLM deployments and model lifecycle management.

Lead incident response, conduct postmortems, and drive long-term reliability and resilience improvements.

Optimize cost, performance, and autoscaling for AI- and data-intensive workloads.

Collaborate with engineering and leadership teams to drive best practices in reliability, security, and operations.

Key Skills

6+ years of hands-on experience in DevOps, SRE, or Platform Engineering roles.

Strong, production-grade expertise in Google Cloud Platform (GCP), especially GKE and core managed services.

Proven experience running Kubernetes at scale in live, mission-critical environments.

Deep hands-on expertise with New Relic in complex, distributed systems.

Solid experience operating AI/ML or LLM-powered platforms in production.

Strong background in Terraform, infrastructure as code, and CI/CD pipelines.

Good understanding of cloud networking, security, and reliability engineering principles.

Ability to own and operate production systems end-to-end with minimal supervision.

Good-to-Have Skills

Experience with multi-cloud environments (GCP + Azure).

Familiarity with FinOps practices for cloud cost optimization.

Exposure to service mesh, advanced autoscaling strategies, and capacity planning.

Experience with data-intensive or real-time systems.

Knowledge of security best practices, compliance, and IAM in cloud environments.

Prior experience mentoring junior engineers or leading operational initiatives.

Educational Qualifications

Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.

Master's degree in a relevant discipline is a plus, but not mandatory.

More Info

Job Type:

Permanent Job

Industry:

Other

Function:

DevOps / Site Reliability Engineering

Employment Type:

Full time

About Company

GeekyAnts

Job Source: www.linkedin.com

Job ID: 143228891

Last Updated: 22-02-2026 06:17:56 PM
