Powerbridge Technologies

Major Incident Commander

  • Posted 2 hours ago

Job Description

We are seeking a highly accomplished Principal Incident Commander / Director – Incident Management to lead enterprise-wide response to critical incidents across complex, large-scale, and globally distributed infrastructure environments.

This role operates at the intersection of technology leadership, crisis management, and business continuity, requiring the ability to make high-stakes decisions, influence senior stakeholders, and drive rapid resolution during mission-critical outages. The individual will serve as the ultimate authority during major incidents, ensuring minimal business disruption and long-term resilience.

Requirements

Strategic Responsibilities

  • Own and lead enterprise-level incident management strategy across global operations.
  • Act as the executive Incident Commander for P0/P1 incidents impacting business-critical systems.
  • Establish and drive incident governance frameworks, SLAs, and response protocols.
  • Lead cross-functional crisis response involving Network, Cloud, Infrastructure, Security, and Field Operations.
  • Influence and align with C-suite and senior leadership during high-impact incidents.
  • Drive business continuity and service resilience initiatives.

Operational Leadership

  • Command and orchestrate war rooms and global bridge calls with multiple stakeholders.
  • Serve as the highest escalation point for critical outages and service disruptions.
  • Ensure rapid triage, containment, and resolution of incidents with minimal downtime.
  • Drive real-time decision-making under ambiguity and pressure.
  • Oversee post-incident reviews and enforce accountability across teams.

Technical Expertise

  • Deep expertise in enterprise networking and distributed systems:
      • BGP, OSPF, EIGRP, TCP/IP, QoS
      • WAN, SD-WAN, Data Center architectures (Spine-Leaf)
  • Strong understanding of:
      • Load balancing, DNS, DHCP, Network Security
      • Latency, packet loss, and performance optimization
  • Familiarity with cloud platforms and hybrid infrastructure environments
  • Ability to engage in hands-on technical triage when required
  • Lead Root Cause Analysis (RCA) at an organizational level
  • Drive preventive engineering, automation, and process maturity
  • Establish a culture of proactive monitoring and early detection
  • Enhance incident response playbooks, runbooks, and training programs

Preferred Qualifications

  • ITIL Expert / Advanced Incident Management certifications
  • Exposure to Disaster Recovery (DR) & Business Continuity Planning (BCP)
  • Experience with automation, observability platforms, and AI-driven monitoring
  • Track record of driving transformation in incident management practices
  • 12–18+ years of experience in Network Engineering, SRE, NOC, or Cloud Operations
  • Proven experience handling enterprise-scale, high-impact incidents globally
  • Prior experience in large enterprises / telecom / hyperscalers / global tech organizations
  • Strong leadership presence with the ability to influence without authority
  • Experience working in 24x7, mission-critical environments

Benefits

  • Health insurance coverage for employees and their families.
  • Retirement savings plan with employer matching contributions.
  • Opportunities for professional development and advancement within the organization.

More Info

Job ID: 146430779