SRE

Capgemini Technology Services India Limited

Hyderabad, Chennai, Pune

6-9 Years

Posted a month ago

Job Description

Job Summary:

We are seeking a skilled Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of critical systems and applications. The ideal candidate will have strong expertise in monitoring tools (Splunk, New Relic) and hands-on experience managing and optimizing AWS API Gateway and other cloud-native services.

Key Responsibilities:

Ensure system reliability, uptime, and scalability through proactive monitoring and incident management.
Develop and maintain observability dashboards and alerts using Splunk and New Relic.
Manage and optimize AWS API Gateway configurations for secure and efficient traffic handling.
Collaborate with development and DevOps teams to automate deployments and implement SRE best practices.
Conduct root cause analysis (RCA) for incidents and drive post-incident improvements.
Tune system performance and implement fault-tolerant, high-availability solutions.
Maintain infrastructure as code and support continuous integration and delivery (CI/CD) pipelines.