The Incident Manager is responsible for managing major incidents end-to-end, ensuring minimal disruption to business operations, restoring services within agreed SLAs, and preventing recurrence through post-incident reviews. The role involves coordination across global technical teams, vendors, and business stakeholders in a high-pressure, 24x7 environment.
Key Responsibilities:
- Incident Ownership: Act as the single point of contact (SPOC) for all major and critical incidents (P1/P2).
- Restoration Management: Coordinate with technical support teams, vendors, and third parties to restore services within defined SLAs.
- Impact Assessment: Evaluate business impact and prioritize incident response accordingly.
- Communication: Provide timely and transparent updates to stakeholders throughout the incident lifecycle, including business impact statements and recovery progress.
- Escalation Management: Proactively escalate critical issues to senior management and ensure timely decision-making.
- Root Cause Analysis (RCA): Facilitate post-incident reviews, document RCA findings, and track corrective and preventive actions to closure.
- Process Governance: Enforce ITIL-aligned incident and problem management processes, ensuring compliance and continuous improvement.
- Reporting: Generate daily, weekly, and monthly incident metrics, trend analyses, and SLA reports for management.
- Continuous Improvement: Identify process gaps and work with service delivery teams to enhance operational resilience and reduce incident frequency.
- Collaboration: Work closely with Service Delivery Managers, Change Managers, and Problem Managers to ensure service stability.
- Shift Operations: Support 24x7 operations with on-call availability for major incidents.
Required Education:
- Bachelor's Degree in IT, Computer Science, or a related field
Preferred Education:
- Master's Degree in a relevant field
Required Technical and Professional Expertise:
- Experience managing major incidents in IT operations or service management
- Strong knowledge of ITIL processes and best practices
- Proven ability to coordinate cross-functional teams and vendors under pressure
- Experience with incident tracking, reporting, and root cause analysis
Preferred Technical and Professional Expertise:
- Experience supporting 24x7 global operations
- Familiarity with service management tools (e.g., ServiceNow, Jira)
- Strong experience in process improvement and governance