empenofore technologies

Site Reliability Engineer

  • Posted 13 hours ago

Job Description

Senior SRE – Observability & Surrounding Systems

Location - Gurugram (On-site)

Responsibilities

  • Own the end-to-end observability stack (Prometheus, Apache SkyWalking, Elasticsearch, Grafana), from ingestion to alerting.
  • Operate and maintain critical surrounding systems: MongoDB, Kafka, Redis, Vault, WSO2.
  • Provide L2/L3 support for platform stability and incident resolution.
  • Automate monitoring, alerting, and recovery workflows using Bash/Python.
  • Troubleshoot cross-layer issues: apps, K8s, nodes, networks, storage.
  • Collaborate with DevSecOps engineers to harden platform resilience.
  • Ensure observability coverage for all production services.

Profile

  • 3+ years in SRE/DevOps with a focus on observability and infrastructure.
  • Proven hands-on experience with Prometheus, Elasticsearch, Apache SkyWalking or another APM tool.
  • Operational expertise in MongoDB, Kafka, Redis, Vault, WSO2 APIM or similar.
  • Strong scripting in Bash or Python for automation.
  • Deep understanding of distributed systems and failure modes.
  • Must have experience taking ownership of incidents.

Job ID: 147215755
