  • Posted 3 hours ago

Job Description

  • A Site Reliability Engineer acts as a frontline defender who monitors and protects customer applications, taking charge of operational tasks to keep systems functioning efficiently.
  • They are responsible for monitoring applications, automating operational work, and improving application reliability, performance, and availability.
  • Working 24x7 rotational shifts is mandatory, as is retail domain knowledge.
  • Must have knowledge of production application support at Level 2 (L2).
  • Experience on the Shopify support side is preferred.
  • Hands-on experience in monitoring, logging, alerting, dashboarding, and report generation in monitoring tools such as AppDynamics, Splunk, Dynatrace, Datadog, CloudWatch, ELK, Prometheus, or New Relic.
  • The customer for this engagement uses New Relic and PagerDuty, so expertise in both is good to have.
  • Should know how to write SQL queries to fetch data from databases and from observability tools.
  • Must have knowledge of the ITIL framework, specifically alerts, incident and change management, CAB, production deployments, and risk and mitigation planning.
  • Should be able to lead P1 calls, brief the customer on the P1, and proactively bring leads and customers into P1 calls through to RCA.
  • Experience working with Postman.
  • Should have knowledge of building and executing SOPs and runbooks, and of handling ITSM platforms (JIRA/ServiceNow/BMC Remedy).
  • Should know how to work with the Dev team and with cross-functional teams across time zones.
  • Should be able to generate WSRs/MSRs by extracting tickets from ITSM platforms.

Job ID: 144374473