Join our dynamic team to innovate and refine technology operations, impacting the core of our business services.
As a Technology Support Lead in Consumer & Community Banking, you will play a leadership role in ensuring the operational stability, availability, and performance of our production services. Critical thinking will be key to your success as you oversee day-to-day maintenance of the firm's systems and identify, troubleshoot, and resolve issues to ensure a seamless user experience.
Job responsibilities
- Oversee root cause analysis (RCA) on major impacting incidents and standard incidents with potential for impact, ensuring root causes and tactical/strategic actions are identified and delivered
- Coordinate, convene, and facilitate major problem review meetings across the North America region and other regions as needed, ensuring effective collaboration and follow-through
- Proactively analyze and define problem areas, developing and driving strategic efforts across all levels of priority/severity, and applying RCA lessons learned across the technology environment
- Partner with business resources and develop actions to eliminate recurrence on business-owned incidents, ensuring alignment with business objectives
- Collaborate with subject matter experts to refine operating processes and procedures, delivering and restoring service more efficiently
- Ensure accuracy and timely progression of problem records through the Problem Management process, maintaining information in ServiceNow and other artifacts as necessary
- Own and run stability and service level improvement programs for applications/services and other initiatives, using an agile approach
- Drive continuous improvement initiatives and implement best practices in Problem Management, fostering a culture of learning and innovation
- Communicate effectively with senior leadership and stakeholders, providing regular updates on status, progress, and key metrics related to problem management activities
- Lead problem management conversations with precision and urgency, partnering with SRE and Application Development Engineers to research production incidents and develop post-incident analysis
- Apply AI-assisted analysis to accelerate Problem Management outcomes (e.g., incident pattern clustering, summarizing post-incident narratives, identifying recurring failure modes from tickets/alerts/timelines), with appropriate validation and human judgment
Required qualifications, capabilities, and skills
- 5+ years of experience or equivalent expertise troubleshooting, resolving, and maintaining information technology services
- Experience managing Root Cause Analysis (RCA) in a system of record such as ServiceNow
- Proficient in pattern recognition and data correlation, with strong analytical and problem-solving skills
- Advanced Excel knowledge with the ability to dissect large data files, utilizing formulas, minor scripting, and filtering
- Strong organizational skills with the ability to track progress and ensure deliverables are met within prescribed timelines until full problem closure
- Understanding of observability and monitoring tools and techniques
- Excellent communication, technical writing, presentation, and relationship management skills
- Experience managing high-pressure situations and making decisions quickly to minimize impact on business operations
- AI literacy for IT Service Management: working knowledge of how ML/GenAI can be applied to ITIL workflows, plus awareness of limitations/risks (e.g., incorrect outputs, bias, sensitive data handling)
- Ability to use AI-assisted analytics for operational insights (e.g., ticket/log/alert correlation concepts, qualitative-to-quantitative categorization) with appropriate validation
Preferred qualifications, capabilities, and skills
- Working knowledge of dashboard reporting using tools such as Tableau, Power BI, or Qlik
- ITIL Foundation certification or higher, with exposure to processes in scope of the ITIL framework
- Practical knowledge of engineering principles, design patterns, and failure mode and effects analysis (FMEA)
- Practical experience with public cloud
- Familiarity with AI governance concepts (model risk, data lineage, monitoring, change management) as applied to IT service management routines and operational reporting
- Experience partnering with data/analytics teams to validate and operationalize insights into service management workflows (runbooks, dashboards, problem records)