Role Overview
The Command Centre Lead is responsible for managing enterprise-wide monitoring, event correlation, and incident command functions. This role ensures proactive detection, rapid response, and continuous improvement of IT operations using tools like LogicMonitor, driving high availability and operational excellence.
Key Responsibilities
Command Centre Operations
- Lead 24x7 Command Centre / NOC operations, ensuring SLA adherence and meeting uptime targets
- Act as Incident Commander for P1/P2 incidents and drive war room coordination
- Establish real-time monitoring, alerting, and escalation frameworks
Monitoring & Observability (LogicMonitor)
- Own and optimize the LogicMonitor platform (deployment, configuration, dashboards)
- Develop monitoring standards across infrastructure, network, cloud, and applications
- Implement alert tuning, threshold optimization, and noise reduction strategies
- Build executive dashboards and service health views
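As an illustration of what noise reduction can mean in practice, the sketch below suppresses repeat alerts for the same resource and datapoint inside a fixed time window. It is a minimal, tool-agnostic example: the tuple layout, the `suppress_duplicates` name, and the 10-minute window are assumptions made for illustration and do not represent LogicMonitor's API or configuration.

```python
from datetime import datetime, timedelta


def suppress_duplicates(alerts, window=timedelta(minutes=10)):
    """Keep the first alert per (resource, datapoint) and drop repeats
    that arrive less than `window` after the last alert that was kept.

    `alerts` is an iterable of (timestamp, resource, datapoint) tuples,
    assumed to be sorted by timestamp (hypothetical layout).
    """
    last_kept = {}  # (resource, datapoint) -> timestamp of last kept alert
    kept = []
    for ts, resource, datapoint in alerts:
        key = (resource, datapoint)
        previous = last_kept.get(key)
        if previous is None or ts - previous >= window:
            kept.append((ts, resource, datapoint))
            last_kept[key] = ts
    return kept


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 9, 0)
    raw = [
        (t0, "db-01", "cpu"),
        (t0 + timedelta(minutes=1), "web-01", "latency"),
        (t0 + timedelta(minutes=5), "db-01", "cpu"),   # repeat inside window
        (t0 + timedelta(minutes=15), "db-01", "cpu"),  # outside window
    ]
    for alert in suppress_duplicates(raw):
        print(alert)
```

The window length is the main tuning decision: too short and repeats still page the team, too long and a genuine recurrence may be hidden.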
Incident & Problem Management
- Drive MTTR reduction through structured incident response and RCA governance
- Ensure proper incident triage, prioritization, and escalation compliance
- Lead Problem Management initiatives to eliminate recurring incidents
Automation & Continuous Improvement
- Identify automation opportunities for alert remediation and runbooks
- Integrate monitoring tools with ITSM (ServiceNow or equivalent)
- Drive AIOps adoption for predictive monitoring and anomaly detection
Stakeholder Management
- Act as the single point of contact for leadership during major outages
- Provide executive-level reporting on service health, trends, and risks
- Collaborate with Infra, Cloud, Network, and Security teams
Required Skills & Experience
Core Technical Skills
- 5+ years hands-on experience with LogicMonitor
- Strong understanding of infrastructure monitoring (servers, network, and cloud platforms such as AWS and Azure)
- Experience in event correlation, alert management, and observability tools
- Knowledge of ITSM tools (ServiceNow preferred)
Operational Expertise
- Proven experience in leading NOC / Command Centre environments
- Strong Incident, Problem, and Change Management knowledge (ITIL aligned)
- Experience managing large-scale enterprise environments
Leadership Skills
- Excellent stakeholder communication and executive reporting skills
- Ability to lead cross-functional teams and drive accountability
Key Metrics / KPIs
- MTTR (Mean Time to Resolve), tracked alongside Mean Time to Respond
- Alert Noise Reduction (%)
- SLA Compliance & Service Availability
- Incident Volume Reduction
- Automation Coverage
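For reference, the sketch below shows one common way the first three KPIs above could be calculated. The `Incident` record and the function names are hypothetical, and the exact definitions (for example, what counts as downtime) are assumptions that each organization would set for itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record with open and resolve timestamps."""
    opened: datetime
    resolved: datetime


def mean_time_to_resolve(incidents):
    """Average of (resolved - opened) across all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((i.resolved - i.opened for i in incidents), timedelta())
    return total / len(incidents)


def percent_reduction(baseline, current):
    """Reduction relative to a baseline count, as a percentage.

    Usable for alert noise or incident volume, e.g. alerts per month.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) * 100.0 / baseline


def availability_percent(total_minutes, downtime_minutes):
    """Share of the period during which the service was up, as a percentage."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return (total_minutes - downtime_minutes) * 100.0 / total_minutes


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    incidents = [
        Incident(start, start + timedelta(minutes=30)),
        Incident(start, start + timedelta(minutes=90)),
    ]
    print("MTTR:", mean_time_to_resolve(incidents))
    print("Alert noise reduction %:", percent_reduction(1000, 650))
    print("Availability %:", availability_percent(10000, 5))
```

Reporting each KPI against a fixed baseline period keeps month-to-month comparisons meaningful.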
Preferred Qualifications
- ITIL Certification (v3/v4)
- Certifications in Cloud (AWS/Azure) or Monitoring tools