  • Posted 9 days ago

Job Description

Industry & Sector: Operating in the Cloud Infrastructure and Enterprise SaaS sector, this high-availability engineering team builds and runs resilient, containerised production platforms that support mission-critical customer applications. We deliver scalable, observable, and secure cloud-native services for global users.

Role: Site Reliability Engineer (SRE), On-site (India)

Role & Responsibilities

  • Design, deploy, and maintain production-grade Kubernetes-based platforms to ensure high availability, scalability, and security.
  • Author and maintain Infrastructure-as-Code to provision and manage cloud resources, enabling repeatable, auditable deployments.
  • Build and operate CI/CD pipelines and automated release processes to accelerate safe delivery of features and fixes.
  • Implement observability: metrics, logging, tracing, and alerting; define SLOs/SLIs and automate incident detection and response.
  • Lead incident management and post-incident reviews to drive reliability improvements and reduce MTTR.
  • Collaborate with development and product teams to optimize performance, reduce costs, and harden the platform for production traffic.

Skills & Qualifications

Must-Have

  • Kubernetes
  • Docker
  • Terraform
  • AWS
  • Linux
  • Prometheus
  • Grafana
  • Jenkins

Preferred

  • Helm
  • Ansible
  • Python

Qualifications

  • Proven experience operating production cloud infrastructure and container platforms (demonstrable projects or on-call history preferred).
  • Strong troubleshooting skills across distributed systems, networking, and storage.
  • Willingness to work on-site in India and participate in an on-call rotation.

Benefits & Culture Highlights

  • Hands-on exposure to large-scale cloud-native systems and the opportunity to drive reliability best practices.
  • Collaborative engineering culture with a focus on learning, ownership, and measurable impact.
  • Competitive compensation and benefits aligned to on-site roles in India.

We are looking for proactive SREs who enjoy end-to-end ownership of platform reliability, automation-first engineering, and close collaboration with developers to deliver reliable services at scale. Apply if you thrive on solving complex operational challenges and driving continuous improvement.

More Info

Job ID: 141471625