6+ years of overall experience in DevOps/SRE production support and incident management.
You will actively participate in a 24/7 on-call rotation, taking full responsibility as a primary or secondary on-call engineer to ensure production stability and quick incident resolution.
You will continuously monitor system health and performance, proactively identifying issues before they escalate into critical incidents. When incidents occur, you will triage them efficiently, lead or support resolution efforts, and ensure they are resolved within defined SLAs.
During non-incident periods, you will focus on improving system reliability by building automation, enhancing monitoring and alerting systems, and creating clear runbooks and documentation to streamline operations.
You will collaborate closely with engineering, infrastructure, and product teams to support deployments, troubleshoot issues, and contribute to system improvements. You will also take part in post-incident reviews to identify root causes and implement preventive measures.
Overall, you will work toward minimizing downtime, improving system resilience, and driving continuous operational excellence.
What You Know
You have strong experience in DevOps and Site Reliability Engineering, particularly in production support environments where system uptime and reliability are critical. You understand how to operate and maintain high-availability systems in fast-paced, 24/7 environments.
You are proficient in monitoring and incident management tools such as PagerDuty, ServiceNow, and observability platforms like Datadog, Splunk, and New Relic. You have a solid foundation in cloud platforms including AWS, Azure, and GCP, along with hands-on experience in Linux/Unix systems and networking fundamentals.
You are comfortable writing scripts in Python and Bash to automate repetitive tasks, improve operational efficiency, and reduce manual intervention. You also understand the importance of structured incident response, root cause analysis (RCA), and maintaining detailed operational documentation.
Beyond technical skills, you bring a strong ownership mindset, the ability to stay calm under pressure, and effective communication skills for coordinating with cross-functional teams during critical situations.
Education
Bachelor's degree in Computer Science, Information Systems, Engineering, Computer Applications, or a related field.
Benefits
In addition to competitive salaries and benefits packages, Nisum India offers its employees some unique and fun extras.
Certifications - Sponsored by the company on an as-needed basis. We support our team in excelling in their field.
Parental Medical Insurance - Nisum believes our team is the heart of our business and we want to make sure to take care of the heart of theirs. We offer opt-in parental medical insurance in addition to our medical benefits.
Activities - From the Nisum Premier League's cricket tournaments to hosted hackathons, Nisum employees can participate in a variety of team-building activities such as skits and dance performances, in addition to festival celebrations.
Free Meals - Free snacks and dinner are provided daily, in addition to a subsidized lunch.