Proactive monitoring and management of business-critical, 24x7, real-time applications, rectifying issues in a timely fashion where required to restore application functionality.
Ensure incidents are correctly processed, assessing business and technical impact and severity.
Taking ownership of application incidents and ensuring that they are resolved, including retaining ownership of incidents that require 3rd Line or IT Change activity to resolve.
Ensuring that communication with the business community remains active.
Application responsibilities will cover Application Infrastructure, Data Fixes, User Queries, User Education and Incident Investigation.
Monitoring of application events, alerts, job schedules, capacity monitors and performance KPIs. Creation and ownership of change requests raised to address any issues arising from the above.
Proactively share knowledge with the team and update the knowledge base (Confluence) with support documentation.
Work to provide services to agreed Service Level Targets and Operating Level Agreements.
Leverage AI Ops techniques to analyse logs, metrics, traces and event data, enabling proactive trend identification and continuous optimisation of system performance.