
Snapmint

Site Reliability Engineer

  • Posted 2 days ago

Job Description

About Role

We are looking for a Site Reliability Engineer (SRE) to ensure the availability, reliability, and performance of our production systems. The ideal candidate will be responsible for monitoring infrastructure and applications, managing incidents, analyzing logs, performing system health checks, and supporting operational excellence across the organization.

Roles & Responsibilities

  • Monitor servers, applications, databases, cloud infrastructure, and third-party integrations
  • Configure and manage alerts, dashboards, and observability tools
  • Respond to incidents, perform initial diagnosis, and coordinate escalations within defined SLAs
  • Analyze system and application logs to identify issues, anomalies, and recurring patterns
  • Perform daily health checks for production systems, databases, APIs, dashboards, and infrastructure components
  • Maintain SOPs, runbooks, incident reports, RCA documents, and shift handover reports
  • Provide L1/L2 production support and assist engineering teams with troubleshooting and system validation
  • Ensure timely stakeholder communication and accurate incident tracking

Education and Experience

  • B.Tech/B.E. or equivalent
  • 1-4 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or Systems Engineering

Skills

  • Strong Linux/Unix fundamentals
  • Experience with AWS or GCP
  • Monitoring tools: Grafana, Prometheus, CloudWatch, New Relic
  • Log management tools: ELK, Loki, Wazuh
  • Databases: MySQL, Aurora DB, MongoDB
  • Basic networking knowledge (DNS, TCP/IP, Load Balancers)
  • Ticketing and incident management tools such as Jira and PagerDuty
  • Strong troubleshooting, analytical, communication, and documentation skills
  • Willingness to work in rotational shifts

Job ID: 149338477
