Job Description - L3 Monitoring Tools Engineer (Nagios & SCOM)
Location: Noida
Experience: 8-15 Years
Role Overview
We are looking for a Senior (L3) Monitoring Tools Engineer with strong expertise in Nagios and Microsoft SCOM to lead architecture design, implementation, optimization, SOW creation, and transition of enterprise monitoring environments.
The role requires deep technical expertise, solutioning capability, stakeholder interaction, and ownership of large-scale monitoring transformation programs.
Key Responsibilities
Monitoring Architecture & Design (Nagios & SCOM)
- Design enterprise-grade monitoring architecture for hybrid environments (Windows, Linux, Network, Applications).
- Lead installation, configuration, and upgrade of Nagios (Core/XI) and SCOM platforms.
- Define monitoring standards, naming conventions, alert taxonomy, and threshold frameworks.
- Design distributed monitoring setups, gateway architecture, and HA/DR strategy.
- Implement custom plugins (Nagios) and Management Pack customization (SCOM).
- Configure overrides, monitors, rules, distributed applications, dashboards, and reporting.
SOW Creation & Solutioning
- Lead pre-sales technical discussions for monitoring solutions.
- Define Scope of Work (SOW), effort estimation, timelines, and implementation roadmap.
- Perform infrastructure assessment and monitoring gap analysis.
- Create HLD/LLD for monitoring deployment.
- Define migration strategy from legacy tools to Nagios/SCOM.
- Provide cost optimization recommendations (license & infra sizing).
- Present solution architecture to client stakeholders.
Transition & Knowledge Transfer
- Lead end-to-end transition of monitoring environments from incumbent teams.
- Create transition plans including knowledge capture, risk register, and mitigation strategy.
- Define monitoring coverage matrix aligned with SLA and business criticality.
- Develop SOPs, runbooks, and operational playbooks.
- Conduct KT sessions and certify L1/L2 teams.
- Establish governance model for steady-state monitoring operations.
- Ensure zero disruption during monitoring tool migration or transition.
Alert Engineering & Optimization
- Perform threshold engineering and noise reduction initiatives.
- Design alert suppression, correlation, and dependency mapping.
- Integrate monitoring tools with ServiceNow (Event Management & Auto-ticketing).
- Reduce false positives and improve MTTR.
- Implement auto-resolution for repetitive alerts using scripting.
Automation & Integration
- Develop PowerShell automation scripts for bulk configuration & alert updates.
- Implement API-based integrations with ITSM platforms.
- Align monitoring with CMDB and service mapping.
- Drive continuous improvement through automation initiatives.
Incident & Governance Support
- Provide L3 support during P1/P2 incidents and perform monitoring validation.
- Participate in CAB, PIR, and audit discussions.
- Maintain compliance with ITIL & ISO 20000 frameworks.
- Provide monthly monitoring performance & improvement reports.
Technical Skills Required
Primary Tools
- Nagios (Core/XI) - Advanced configuration, plugin development, distributed monitoring.
- Microsoft SCOM - Management Packs, overrides, health model tuning, gateway setup.
Secondary / Supporting Skills
- SNMP, WMI, NRPE, NSClient++
- Windows Server & Linux administration basics
- PowerShell scripting
- ServiceNow integration
- Monitoring of Network, Virtualization, DB, Middleware & Applications
Key Competencies
- Solution Architecture & Design
- SOW Drafting & Effort Estimation
- Client Communication & Stakeholder Management
- Transition & Transformation Leadership
- Monitoring Strategy Development
- Cost & License Optimization
- Documentation & Governance
KPIs for L3 Role
- Successful monitoring transition within defined timelines.
- 20-40% alert noise reduction through optimization.
- Zero major monitoring gaps post-transition.
- Improved MTTR through better alert quality.
- Successful delivery of SOW-based implementations within budget.
This JD positions the role as:
- Monitoring Architect
- L3 SME
- Transition Lead
- Pre-Sales & Solutioning Contributor
- Platform Owner