Headout

Senior Site Reliability Engineer

  • Posted a month ago

Job Description

The Role

As a Senior Site Reliability Engineer, you will own and operate cloud-native infrastructure and Kubernetes platforms that power customer-facing services at scale. You will design and optimize CI/CD workflows, improve deployment reliability, and drive observability, incident management, and performance improvements across the organization. You will build platform tooling to improve developer velocity, enforce security guardrails, and standardize best practices. This role expects strong ownership, architectural thinking, and mentorship of junior engineers.

What makes this role special

  • Full Platform Exposure – Work across DevOps, infrastructure, observability, performance, and reliability
  • Architecture Ownership – Influence platform and tooling decisions using benchmarks and metrics
  • High Impact – Build systems that reduce deployment turnaround time (TAT), improve p99 latencies, and scale across teams
  • Flexibility – Freedom to work across stacks, tools, and evolving platforms

What skills & experience do you need

  • 4-7 years of experience operating customer-facing services at scale
  • Strong hands-on experience with Kubernetes cluster operations and workload optimization
  • Experience with service mesh and distributed tracing tools (e.g., Istio, Jaeger)
  • Comfortable with at least one cloud provider (AWS preferred; GCP or Azure acceptable)
  • Hands-on experience with monitoring and alerting stacks (Prometheus, Grafana, Thanos, Datadog, New Relic)
  • Proven experience designing robust CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins)
  • Proficiency in Infrastructure as Code (Terraform or Pulumi)
  • Strong programming skills in Python, Go, or Java/Kotlin, plus shell scripting
  • Experience with databases such as MySQL and MongoDB, including application and query profiling
  • Solid understanding of security best practices and compliance
  • High-ownership mindset with the ability to proactively identify and resolve platform issues

Job ID: 144677675
