Lead 24/7 DevOps operations during assigned shifts within a Managed Services framework
Act as the primary escalation point for P1/P2 incidents and drive end-to-end resolution
Ensure adherence to SLAs, OLAs, and contractual KPIs across production environments
Govern production releases, deployments, and emergency changes during shift hours
Ensure compliance with ITIL-based Incident, Change, and Problem Management processes
Coordinate with infrastructure, cloud, database, application, and security teams for issue resolution
Monitor CI/CD pipeline health and ensure successful release execution
Oversee system monitoring, alert management, and automation workflows
Drive root cause analysis (RCA) and implement preventive action tracking
Provide structured shift handovers and daily operational reports
Communicate real-time updates to client stakeholders during high-severity incidents
Mentor shift engineers and enforce adherence to SOPs, runbooks, and escalation matrices
Identify automation, cost optimisation, and continuous improvement opportunities
What You Know
Must have 10+ years of experience in DevOps, Production Support, or SRE within Managed Services environments
Strong hands-on experience in cloud platforms (GCP or AWS)
Experience working in SLA-driven, client-facing production environments
Strong knowledge of ITIL processes (Incident, Change, Problem Management)
Experience managing major incidents and minimising MTTR
Hands-on experience with CI/CD tools such as Jenkins, GitLab CI, and Azure DevOps
Experience with containerization and orchestration (Docker, Kubernetes)
Proficiency with monitoring and observability tools such as Splunk, Datadog, Dynatrace, AppDynamics, and Grafana
Exposure to Infrastructure as Code and automation tools such as Terraform and Ansible
Strong understanding of Linux systems, networking, and cloud security fundamentals
Willingness to work in rotational shifts
Education Details
BE / BTech / MTech
Preferred Qualifications
ITIL Foundation or Intermediate Certification
Cloud Certifications (GCP or AWS)
Experience in multi-cloud production environments
Exposure to reliability engineering practices and operational governance frameworks
Mandatory Skills
Shift-based operational leadership in Managed Services environments
Major Incident Management (P1/P2) and RCA ownership
SLA governance and client communication
CI/CD pipeline monitoring and release governance
Kubernetes and container platform management
Monitoring, alerting, and observability implementation
Infrastructure automation using Terraform / Ansible
Strong knowledge of ITIL processes and operational compliance
Benefits
In addition to competitive salaries and benefits packages, Nisum India offers its employees some unique and fun extras:
Continuous Learning - Year-round training sessions are offered for skill enhancement, and the company sponsors certifications on an as-needed basis. We support our team to excel in their field.
Parental Medical Insurance - Nisum believes our team is the heart of our business, and we want to make sure to take care of the heart of theirs. We offer opt-in parental medical insurance in addition to our medical benefits.
Activities - From the Nisum Premier League's cricket tournaments to Hack-a-thons, Nisum employees can participate in a variety of team-building activities, such as skits, dance performances, and festival celebrations.
Free Meals - Free snacks and dinner are provided daily, in addition to subsidised lunch.