
antzlab technology services pvt. ltd.

Major Incident Commander

  • Posted 16 hours ago

Job Description

We are seeking a highly accomplished Principal Incident Commander / Director – Incident Management to lead enterprise-wide response to critical incidents across complex, large-scale, and globally distributed infrastructure environments.

This role operates at the intersection of technology leadership, crisis management, and business continuity, requiring the ability to make high-stakes decisions, influence senior stakeholders, and drive rapid resolution during mission-critical outages. The individual will serve as the ultimate authority during major incidents, ensuring minimal business disruption and long-term resilience.

Requirements

Strategic Responsibilities
  • Own and lead enterprise-level incident management strategy across global operations
  • Act as the executive Incident Commander for P0/P1 incidents impacting business-critical systems
  • Establish and drive incident governance frameworks, SLAs, and response protocols
  • Lead cross-functional crisis response involving Network, Cloud, Infrastructure, Security, and Field Operations
  • Influence and align with C-suite and senior leadership during high-impact incidents
  • Drive business continuity and service resilience initiatives
Operational Leadership
  • Command and orchestrate war rooms and global bridge calls with multiple stakeholders
  • Serve as the highest escalation point for critical outages and service disruptions
  • Ensure rapid triage, containment, and resolution of incidents with minimal downtime
  • Drive real-time decision-making under ambiguity and pressure
  • Oversee post-incident reviews and enforce accountability across teams
Technical Expertise
  • Deep expertise in enterprise networking and distributed systems:
      • BGP, OSPF, EIGRP, TCP/IP, QoS
      • WAN, SD-WAN, Data Center architectures (Spine-Leaf)
  • Strong understanding of:
      • Load balancing, DNS, DHCP, Network Security
      • Latency, packet loss, and performance optimization
  • Familiarity with cloud platforms and hybrid infrastructure environments
  • Ability to engage in hands-on technical triage when required
  • Lead Root Cause Analysis (RCA) at an organizational level
  • Drive preventive engineering, automation, and process maturity
  • Establish a culture of proactive monitoring and early detection
  • Enhance incident response playbooks, runbooks, and training programs
Preferred Qualifications
  • ITIL Expert / Advanced Incident Management certifications
  • Exposure to Disaster Recovery (DR) & Business Continuity Planning (BCP)
  • Experience with automation, observability platforms, and AI-driven monitoring
  • Track record of driving transformation in incident management practices
  • 5-7 years of experience in Network Engineering, SRE, NOC, or Cloud Operations
  • Proven experience handling enterprise-scale, high-impact incidents globally
  • Prior experience in large enterprises / telecom / hyperscalers / global tech organizations
  • Strong leadership presence with the ability to influence without authority
  • Experience working in 24x7, mission-critical environments

More Info

Job ID: 146334285
