Site Reliability Engineer

5-7 Years
  • Posted 9 hours ago

Job Description

What You Will Do

  • Design, build, and improve CI/CD pipelines for applications and infrastructure.
  • Develop automation frameworks that reduce manual effort and increase consistency.
  • Configure and optimize cloud infrastructure to align with security, scalability, and performance best practices.
  • Collaborate with development teams to remove deployment blockers and improve delivery workflows.
  • Monitor reliability and performance, identify issues early, and implement data-driven improvements to increase uptime and efficiency.
  • Participate in on-call rotations and drive incident resolution with clear postmortems and preventive actions.
  • Maintain technical documentation for pipelines, configurations, and runbooks.
  • Perform readiness assessments and validation tests before production rollouts.
  • Implement Infrastructure as Code using Terraform and ARM templates with version control and reproducibility.
  • Troubleshoot complex deployment, provisioning, and performance issues across multi-cloud and containerized environments.

Minimum Qualifications

  • 5+ years in SRE or DevOps roles operating production systems.
  • Hands-on experience running production workloads on Kubernetes in a cloud environment, including cluster design, autoscaling, upgrades, and network policies.
  • Proven CI/CD delivery using GitHub Actions or Jenkins, including promotion across environments, approvals, and rollback strategies.
  • Infrastructure as Code expertise with Terraform and ARM templates, including modules, remote state, workspaces, and policy enforcement.
  • Strong scripting in PowerShell, Bash, or Python for automation and diagnostics.
  • GitOps experience with Argo CD or Flux, managing multi-environment application delivery and drift remediation.
  • Containerization with Docker and Kubernetes, including health probes, PodDisruptionBudgets, resource quotas, HorizontalPodAutoscaler, and operators.
  • Networking fundamentals with cloud network security practices such as VNet design, NSGs, Private Link, and ingress controllers.
  • Working knowledge of cloud security and compliance, including least privilege, secrets management, audit trails, and control evidence.
  • Excellent written and spoken English.
  • Ability to collaborate across US time zones.

Preferred Qualifications

  • Microsoft Azure certification, such as Developer Associate, Administrator, or DevOps Engineer Expert.
  • Observability using Application Insights, Elastic Stack (ELK), Grafana, and Prometheus for metrics, logs, and traces.
  • Experience with log aggregation and alerting at scale using Elastic and Prometheus.
  • Understanding of high availability, scalability, disaster recovery, and cost optimization.
  • Experience managing Windows-based containerized applications.

More Info

Job ID: 147253263
