Site Reliability Engineer - Azure

4-6 Years
  • Posted 9 hours ago

Job Description

We are seeking an experienced Azure Site Reliability Engineer (SRE) to design, implement, and maintain highly available, scalable, and secure cloud infrastructure on Microsoft Azure. The ideal candidate will have strong expertise in cloud operations, automation, monitoring, incident management, and DevOps practices.

Responsibilities

  • Manage and support Azure cloud infrastructure and services
  • Ensure platform reliability, availability, performance, and scalability
  • Implement Infrastructure as Code (IaC) using Terraform, Bicep, or ARM templates
  • Automate operational tasks using PowerShell, Python, Bash, or Azure CLI
  • Configure and manage CI/CD pipelines using Azure DevOps, GitHub Actions, or Jenkins
  • Monitor applications and infrastructure using Azure Monitor, Log Analytics, Application Insights, Grafana, and Prometheus
  • Troubleshoot production issues and perform root cause analysis (RCA)
  • Implement backup, disaster recovery, and business continuity solutions
  • Collaborate with development and operations teams to improve system reliability
  • Participate in on-call support and incident management activities

Requirements

  • 4+ years of experience in Site Reliability Engineering or a related field
  • Strong experience with Microsoft Azure services, including Azure Virtual Machines, Azure Kubernetes Service (AKS), and Azure App Service
  • Expertise in Azure Storage, Azure Networking (VNet, NSG, Load Balancer, Application Gateway), and Microsoft Entra ID (formerly Azure AD)
  • Background in Kubernetes and containerization technologies such as Docker and AKS
  • Hands-on proficiency in Terraform, ARM Templates, or Bicep
  • Strong knowledge of Linux and/or Windows administration
  • Familiarity with CI/CD tools such as Azure DevOps, GitHub Actions, or Jenkins
  • Competency in monitoring, logging, and observability tools
  • Understanding of security best practices and cloud governance
  • Scripting skills in PowerShell, Python, or Bash
  • Proficient English communication skills (B2 level or higher)

Nice to have

  • Experience with SRE principles and SLI/SLO/SLA implementation
  • Knowledge of Chaos Engineering and Reliability Engineering practices
  • Microsoft Azure certifications, such as Azure Administrator Associate, Azure DevOps Engineer Expert, or Azure Solutions Architect Expert
  • Familiarity with ServiceNow, ITIL processes, and cloud cost optimization

Job ID: 149057893
