Critical Incident Manager

Job Description

Job Summary:

The Critical Incident Manager (CIM) is responsible for managing and coordinating high-severity incidents across cloud platforms, enterprise systems, and client environments. This role ensures minimal business impact by leading incident response, monitoring cloud and security operations, and supporting client onboarding initiatives. The CIM acts as a single point of accountability for resolving critical issues while collaborating with cross-functional teams.

Note: The role requires willingness to work 24/7 shifts with two days off per week. Candidates may also be required to work on-call shifts on weekends, for which an additional allowance will be provided as applicable.

Key Responsibilities:

  • Critical Incident Management

Own the end-to-end lifecycle of critical incidents including identification, escalation, coordination, and resolution.

Conduct initial triage and impact assessment for all incoming incidents and prioritize accordingly.

Lead post-incident reviews to capture root causes, lessons learned, and continuous improvement actions.

Ensure accurate timelines and event logs are captured during major incidents to support real-time coordination.

Maintain incident documentation, reporting, and communication to internal stakeholders and clients.

Act as the Shift Manager on Duty (MOD), serving as the first point of escalation for all incidents and operational concerns during the shift. (include for RRD at level 9)

Ensure proper handover between shifts, documenting open incidents and key activities.

  • Cloud Outage Monitoring

Proactively monitor cloud infrastructure (AWS, Azure, GCP, etc.) for outages or performance degradations.

  • Security Vulnerability Management

Coordinate responses to security vulnerabilities, threats, and breaches.

  • Client Onboarding Support

Facilitate the onboarding of new clients in line with the established process.

  • Continuous Improvement

Analyze incident trends and implement process improvements to reduce recurrence and downtime.

  • Mail Monitoring and Responding (include for RRD at level 10/11)

Monitor operational and client-related emails for critical issues and respond promptly as per priority and escalation guidelines.

Qualifications:

Education & Experience

Bachelor's degree in Computer Science, Information Technology, Cybersecurity, or related field.

6+ years of experience in IT operations, critical incident management, problem management, or cloud operations.

Proven experience handling critical incidents.

Willingness to work 24/7 shifts with two days off per week and to work on-call weekend shifts as required.

Technical Skills

Strong knowledge of cloud platforms: AWS, Azure, GCP.

Understanding of cybersecurity frameworks, vulnerability management, and remediation practices.

Familiarity with ITIL/ITSM processes and incident management frameworks.

Soft Skills

Excellent communication and leadership skills for coordinating cross-functional teams.

Ability to work under pressure during critical outages or security incidents.

Strong problem-solving and analytical skills with attention to detail.

Preferred Certifications

ITIL v4 Foundation or higher

AWS/Azure/GCP Cloud Certification

Level 9: 6+ years of experience in IT operations, critical incident management, problem management, or cloud operations.

Level 11: 2+ years of experience in IT operations, critical incident management, or cloud operations.

Job ID: 145437023