  • Posted a month ago
Job Description

Job Summary:

We are seeking a skilled Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of critical systems and applications. The ideal candidate will have strong expertise in monitoring tools (Splunk, New Relic) and hands-on experience managing and optimizing AWS API Gateway and other cloud-native services.

Key Responsibilities:

  • Ensure system reliability, uptime, and scalability through proactive monitoring and incident management.
  • Develop and maintain observability dashboards and alerts using Splunk and New Relic.
  • Manage and optimize AWS API Gateway configurations for secure and efficient traffic handling.
  • Collaborate with development and DevOps teams to automate deployments and implement SRE best practices.
  • Conduct root cause analysis (RCA) for incidents and drive post-incident improvements.
  • Tune performance and implement fault-tolerant, high-availability solutions.
  • Maintain infrastructure as code and support continuous integration and delivery (CI/CD) pipelines.

Required Skills & Experience:

Primary Skills:

  • Site Reliability Engineering (SRE)
  • Splunk (Monitoring & Log Analysis)
  • New Relic (Application Performance Monitoring)
  • AWS API Gateway

Secondary Skills:

  • AWS Cloud Infrastructure
  • CI/CD Automation
  • Incident & Problem Management

More Info

Open to candidates from:
Indian

Job ID: 131118261