Job Description
Professional Profile - Monitoring Tools Engineer (L2 | SolarWinds, New Relic & Azure Monitor)
Professional Summary
Monitoring Tools Engineer with 5+ years of experience administering, configuring, and supporting enterprise monitoring tools, including SolarWinds, New Relic, and Azure Monitor. Skilled in day-to-day monitoring operations, alert configuration, dashboard creation, and basic automation across on-premises and cloud environments.
Hands-on experience in infrastructure, application, and cloud monitoring, with a strong focus on alert tuning, incident support, and SLA-based monitoring coverage. Familiar with ITIL processes and experienced in supporting monitoring transitions and implementations.
Core Skills
Monitoring Tools
- SolarWinds (NPM, SAM - configuration, alerting, dashboards)
- New Relic (APM, Infrastructure monitoring, alerts, dashboards)
- Azure Monitor (Metrics, Logs, Alerts, Application Insights - basic configuration)
Monitoring Capabilities
- Alert configuration & threshold tuning
- Dashboard and report creation
- Infrastructure, application & cloud monitoring
- Log and metrics analysis (basic level)
- Monitoring health checks & validation
Automation & Integration
- Basic PowerShell scripting
- Azure Monitor alert integration with Action Groups
- ServiceNow integration (incident creation, alert forwarding)
- API/Webhook-based alerting (basic level)
Key Responsibilities
SolarWinds Administration (L2)
- Configure and maintain SolarWinds (NPM, SAM) monitoring environment.
- Add and monitor network devices, servers, and applications using SNMP/WMI.
- Create and update alerts, thresholds, and notification rules.
- Build dashboards and reports for operational visibility.
- Troubleshoot polling issues, alert failures, and performance gaps.
New Relic Administration (L2)
- Configure APM and Infrastructure monitoring.
- Create alerts and dashboards for application performance tracking.
- Monitor application health and escalate issues based on defined thresholds.
- Support basic application instrumentation and alert tuning.
Azure Monitor Administration (L2)
- Configure Azure Monitor for virtual machines, services, and resources.
- Create and manage alerts using Metrics and Log Analytics (KQL - basic queries).
- Set up Action Groups for notifications (email, webhook, ITSM tools).
- Monitor Application Insights for application performance (basic level).
- Create dashboards using Azure Workbooks.
- Perform basic troubleshooting for missing metrics/logs.
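As an illustration of the webhook-based Action Group integration listed above, the following is a minimal Python sketch of how an alert payload might be translated into incident fields for an ITSM tool. It assumes the Action Group sends the Azure Monitor common alert schema (the `data.essentials` keys used here come from that schema). The severity-to-urgency/impact mapping and the function names `alert_to_incident` and `should_open_incident` are hypothetical and would need to match the local incident process.

```python
# Hypothetical mapping from Azure Monitor severity to ITSM (urgency, impact),
# where 1 = high and 3 = low. Adjust to the local priority matrix.
SEVERITY_TO_PRIORITY = {
    "Sev0": (1, 1),
    "Sev1": (1, 2),
    "Sev2": (2, 2),
    "Sev3": (3, 3),
    "Sev4": (3, 3),
}


def should_open_incident(payload: dict) -> bool:
    """Open incidents only for alerts that fired, not for resolved notifications."""
    return payload["data"]["essentials"].get("monitorCondition") == "Fired"


def alert_to_incident(payload: dict) -> dict:
    """Translate a common-alert-schema payload into incident fields (illustrative)."""
    essentials = payload["data"]["essentials"]
    severity = essentials.get("severity", "Unknown")
    rule = essentials.get("alertRule", "Unnamed alert")
    targets = essentials.get("alertTargetIDs", [])
    # Unknown severities fall back to the lowest urgency and impact.
    urgency, impact = SEVERITY_TO_PRIORITY.get(severity, (3, 3))

    description_lines = [
        f"Alert rule: {rule}",
        f"Severity: {severity}",
        f"Fired at: {essentials.get('firedDateTime', 'n/a')}",
        "Affected resources:",
    ]
    description_lines += [f"  {target}" for target in targets]

    return {
        "short_description": f"[{severity}] {rule}",
        "description": "\n".join(description_lines),
        "urgency": urgency,
        "impact": impact,
    }
```

The returned dictionary is only a data structure; sending it to ServiceNow (or any other ITSM tool) would be a separate step that depends on how the integration is set up in that environment.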
Monitoring Operations
- Monitor alerts and ensure timely incident creation and escalation.
- Support P1/P2 incidents by validating alerts and providing monitoring insights.
- Perform daily health checks of monitoring tools.
- Ensure monitoring coverage in line with SLA requirements.
Transition & Implementation Support
- Support onboarding of servers, applications, and cloud resources into monitoring tools.
- Assist in knowledge transfer (KT) sessions, documentation, and transition activities.
- Follow SOPs and runbooks for steady-state operations.
- Support implementation tasks defined in the statement of work (SOW) at execution level.
Alert Optimization
- Perform threshold tuning to reduce false-positive alerts.
- Update alert configurations based on operational feedback.
- Support noise reduction initiatives under L3/Lead guidance.
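The threshold tuning described above can be sketched in Python. This is a minimal example under stated assumptions, not a feature of any of the tools named in this profile: it derives a candidate threshold from historical samples using a nearest-rank percentile plus a safety margin, and only treats a breach as alert-worthy when it is sustained over several consecutive polls. The function names, the 95th percentile, the 10% margin, and the three-poll window are all illustrative choices.

```python
import math


def suggest_threshold(samples: list[float], percentile: float = 95.0,
                      margin: float = 1.1) -> float:
    """Suggest a threshold as the nearest-rank percentile of history times a margin."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least `percentile` % of samples at or below it.
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[rank - 1] * margin


def sustained_breach(samples: list[float], threshold: float,
                     consecutive: int = 3) -> bool:
    """Return True if `consecutive` samples in a row exceed the threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return True
    return False


# Example: a single spike is ignored, a sustained breach is flagged.
cpu_history = [float(v) for v in range(1, 101)]   # placeholder historical CPU %
threshold = suggest_threshold(cpu_history)         # roughly 104.5 for this data
spike_only = sustained_breach([50, 95, 60, 96, 70], threshold=90)
sustained = sustained_breach([50, 95, 60, 96, 97, 98], threshold=90)
```

Requiring consecutive breaches trades a slightly later alert for fewer false positives from short spikes; the right window depends on the polling interval and the SLA for detection.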
Key Achievements
- Onboarded infrastructure and cloud resources into SolarWinds and Azure Monitor.
- Improved alert accuracy through threshold tuning and configuration updates.
- Created dashboards for infrastructure, application, and cloud monitoring.
- Supported smooth transition of monitoring tools with minimal operational impact.
Technical Skills
- Protocols & Data Sources: SNMP, WMI, Azure Metrics & Logs
- Query: Basic KQL (Azure Log Analytics)
- Scripting: PowerShell (basic)
- Platforms: Windows, Linux, Azure
- ITSM: ServiceNow (Incident Management & Alert Integration)