PwC India

Site Reliability Engineer

  • Posted 11 hours ago

Job Description

Experience: 5-12 years

Role Overview

As a Lead SRE, you will be the bridge between software engineering and systems operations. You will apply an engineering mindset to system administration, focusing on optimizing high-availability Azure environments through automation. You will own the error budget and ensure that our SLOs (Service Level Objectives) are met through robust IaC and CI/CD practices.

Key Responsibilities

  • Reliability Engineering: Design and implement self-healing infrastructure on Azure to ensure 99.9%+ uptime for mission-critical services.
  • Infrastructure as Code (IaC): Lead the development of standardized, reusable Terraform modules to ensure consistent environment provisioning and prevent configuration drift.
  • Platform Orchestration: Manage and optimize Azure Kubernetes Service (AKS) clusters, using Helm to manage complex application lifecycles and deployments.
  • Automation & CI/CD: Architect end-to-end delivery pipelines using Azure DevOps (YAML & Classic) and GitHub Actions, prioritizing Security-as-Code and automated testing.
  • Incident Response & Post-mortems: Lead the On-Call rotation strategy and conduct blameless post-mortems to identify root causes and prevent recurrence.
  • Observability: Implement comprehensive monitoring, logging, and alerting (using Azure Monitor, Log Analytics, or Prometheus/Grafana) to track SLIs (Service Level Indicators).

Job ID: 145618565
