Job Description
Job Specification: Monitoring Engineer
Band: C1/C2
The Monitoring & Observability engineer leads a small group of junior DevOps engineers and provides advanced technical support. The role involves understanding technical requirements in order to design and develop optimal monitoring for IT systems, applying knowledge of business processes, systems architecture, and cutting-edge monitoring technologies.
Working within the product-centric model, this role promotes a strong reliability and recoverability culture by partnering with software engineering, security, and architecture to solve complex problems and development challenges as required.
Contributes to and drives the overall design, installation, maintenance, configuration, and integrity of Allstate's enterprise systems management and monitoring tools and related software.
Drives actions to ensure that research, testing, deployment strategy, analysis, administration, support, problem resolution, and maintenance of the environments and hardware/software technology components occur efficiently, and that the toolsets provide Allstate a competitive advantage.
Uses proven systems and scripting skills, and oversees development work, to execute highly complex tasks related to hardware/software technology component analysis, integration, and incident and problem resolution.
Drives the setup of effective end-to-end system performance and reliability monitoring, providing data and alerts that help avoid issues or troubleshoot outages should they occur; designs, develops, and integrates solutions that improve the client experience.
Job Responsibilities
Manage the day-to-day maintenance of monitoring systems, the delivery of engineering solutions, and their integration into the broader monitoring ecosystem
Lead team meetings and ensure workflow where required
Mentor the team and peer group
Drive opportunities for improvement in both the tools managed and the systems being monitored
Move technologies, where appropriate, to microservices-based cloud solutions
Required Skills
Strong systems management skills and experience with the full lifecycle (design, implementation, configuration, and management) of the SCOM tool, along with administration knowledge of Datadog or Tivoli Monitoring Portal
Ability to apply knowledge of tools to create solutions that are maintainable, follow enterprise patterns, and address new requirements
Proven knowledge of and interest in software that enables observability of systems broadly, and an understanding of the multiplier effect this can have on operations and DevOps teams
Drive and review efforts to research, design, plan, and maintain new or existing hardware and software technology components
Ability to use advanced systems, scripting, and developer skills to develop methodologies to implement, integrate, and maintain new and emerging enterprise-wide hardware/software technology components
Provide leadership and technical guidance in project management, planning, task definition, estimating, reporting, scheduling, and workflow
Direct, review, and validate the work of more junior engineers, serving as Subject Matter Expert for requirements within areas of responsibility
Experience operating in an environment driven by KPIs with accountability to determine the best course of action to meet departmental goals
Excellent written/verbal communication and facilitation skills for communicating at all levels, including strong presentation skills
Actively participate in the agile team and its ceremonies
Welcome new ideas and learn from successes and failures
Primary Skills
Desirable Skills
Observability Principles
Product Centric Mindset
Collaborative Partner
Forward Thinker
Problem Solver
Customers Influencer
Site Reliability Practices
Experience
7-10 years of experience with SCOM, Datadog, Tivoli, and other monitoring tools
Shift Timing
1:00 PM - 9:30 PM (24/7 support)