About the job
Location - Noida
We're building an AI-driven, real-time enterprise operations platform and are looking for a Lead DevOps who combines strong Azure DevOps expertise with a solid full-stack engineering background (Python + React).
This role is DevOps-first, but your full-stack understanding will help drive smarter automation, better CI/CD pipelines, and high-reliability engineering across the platform.
Candidates who are available to join immediately (or at short notice) will be given preference.
What You Will Lead
- Architect and scale CI/CD pipelines for microservices deployed on Azure Kubernetes Service (AKS) using Terraform + GitOps (ArgoCD)
- Own DevOps & SRE frameworks including deployments, reliability engineering, self-healing systems, and incident readiness
- Lead observability efforts with Grafana, Azure Monitor, distributed tracing, and structured logging
- Define and enforce SLOs, SLIs, runbooks, and reliability automation practices
- Collaborate closely with backend, frontend, data, and automation teams, leveraging your full-stack knowledge to align infrastructure with application needs
- Contribute to architectural reviews across services built with Python (FastAPI), Node.js, React, and Postgres
- Implement scalable, secure, cost-efficient cloud environments and FinOps best practices (Kubecost + Azure telemetry)
Primary Expertise: DevOps & SRE
- Deep hands-on experience with Azure Cloud: AKS, Event Hubs, VNets/security, Azure Postgres, Azure Storage
- CI/CD experience with GitHub Actions or Azure DevOps; Infrastructure-as-Code via Terraform
- Strong command of containers, service meshes, secrets management, and cloud security hardening
- Expertise with monitoring, logging, tracing, alerting, and operational dashboards
- Familiarity with distributed systems, microservices reliability, and performance tuning
- Ability to transform infrastructure into an automated, resilient, self-recovering system
Secondary Expertise: Full-Stack Engineering & ML Ops
- Experience in application development across frontend and backend
- Working understanding of ML Ops (good to have)
- Strong familiarity with Python (FastAPI) and the modern frontend stack (React, Tailwind CSS)
- Ability to review, debug, and optimize application code to enhance CI/CD, observability, and performance
- Experience with APIs, caching, Postgres, serialization, and microservice communication patterns
- A developer-first mindset applied to DevOps tooling and workflows
Why This Role Is Unique
- DevOps/SRE leadership with meaningful involvement in full-stack engineering
- High impact on platform design, reliability, automation, and developer workflow experience
- Work at the intersection of infrastructure, real-time data, full-stack engineering, and applied AI
- Drive DevOps/SRE maturity for a next-gen cognitive automation platform