Site Reliability Engineer, AVP

7-11 Years
  • Posted a month ago
  • Over 50 applicants

Job Description

As a Site Reliability Engineer, you will work closely with feature teams to support application changes, participate in delivery activities, and resolve production issues to ensure changes are delivered without negatively impacting customer experience. You will play a key role in maintaining reliable, secure, and high-performing production systems while driving continuous improvement in operational excellence.

Key Responsibilities

  • Collaborate with feature teams to understand application changes and support delivery into production
  • Participate in production support, incident response, on-call rota, and site reliability operations
  • Proactively improve release quality and ensure high availability, performance, and security of production systems
  • Design and deliver automation solutions to reduce or eliminate manual operational tasks
  • Maintain a deep understanding of the full technology stack supporting the application
  • Define alerting and monitoring requirements based on customer journeys and system behaviour
  • Assess and improve the resilience of end-to-end application and infrastructure stacks
  • Reduce operational toil and minimise hand-offs during customer-impacting incident resolution

Required Experience and Knowledge

  • Minimum eight years of experience supporting live production services for customer-facing applications
  • Strong knowledge of ITIL processes and IT security principles with a focus on compliance and risk prevention
  • Hands-on experience with Azure Cloud environments
  • Expertise in full-stack observability using tools such as Log Analytics, Application Insights, Grafana, and Splunk
  • Experience supporting Java and microservices-based applications, including Kafka stream processing
  • Working knowledge of relational databases such as Oracle and Postgres
  • Proven experience transitioning new applications and major releases into production support

Job ID: 137372521
