Operations Engineer

  • Posted 22 hours ago

Job Description

Project Role : Operations Engineer

Project Role Description : Support the operations and/or manage delivery for production systems and services based on operational requirements and service agreement.

Must have skills : Critical Incident Management

Good to have skills : NA

Minimum 3 years of experience is required

Educational Qualification : 15 years of full-time education

Summary:

The Major Incident Manager (MIM) is responsible for leading and governing the end-to-end lifecycle of high-impact incidents and escalations, ensuring rapid service restoration while minimizing business disruption. The role requires strong leadership, cross-functional coordination, and real-time decision-making during critical situations, along with a focus on continuous service improvement.

Roles & Responsibilities:

Lead and own Major Incident Management (MIM) and escalated incidents, including bridge initiation, stakeholder communication, decision facilitation, and incident closure.

Drive incident triage, impact assessment, and recovery coordination across multiple resolver groups and support teams.

Act as a Subject Matter Expert (SME), collaborating with and guiding teams to ensure effective incident handling and operational excellence.

Take ownership of decisions during major incidents and steady-state operations, ensuring timely and effective outcomes.

Engage with cross-functional teams to contribute to critical operational and technical decision-making.

Provide timely and effective solutions to complex issues impacting both the immediate team and broader organizational units.

Ensure clear, consistent, and proactive communication with clients, stakeholders, and leadership during major incidents and service disruptions.

Design and implement appropriate workarounds and permanent fixes based on deep product, infrastructure, and service knowledge.

Collaborate with internal and external teams to ensure system stability, resilience, and service continuity.

Manage and prioritize incidents effectively to meet SLA commitments and MIM response timelines.

Drive Post Incident Reviews (PIRs), Root Cause Analysis (RCA), and continuous improvement initiatives to prevent recurrence of incidents.

Maintain adherence to ITIL-based incident, problem, and change management processes and governance standards.

Professional & Technical Skills:

Strong decision-making ability in high-pressure environments

Excellent stakeholder and client management skills

Ability to lead cross-functional teams and drive accountability

Proactive mindset focused on continuous service improvement and resilience engineering

Must-Have Skills (Core)

Strong proficiency in Infrastructure Service Management

Hands-on experience in Major Incident Management (MIM)

Deep understanding of IT infrastructure, applications, and service management principles

Experience in Incident, Problem, and Change Management processes

Solid knowledge of ITIL framework and best practices

Proven ability to handle high-severity incidents, escalations, and executive-level communications

Excellent communication, coordination, and leadership skills under pressure

Must-Have AI & Automation Skills (Modern Requirement)

Ability to leverage AI-powered tools (Copilot, GenAI ITSM automation tools) for:

  • Incident summarization and communication drafting
  • RCA insights and pattern analysis
  • Automated triage and decision support

Experience in using AI/ML-driven monitoring and alerting tools for proactive incident detection

Understanding of AI-assisted incident management workflows (predictive analytics, anomaly detection)

Capability to analyze incident trends using AI insights and recommend preventive actions

Familiarity with automation tools (Power Automate, ServiceNow workflows, scripting) to reduce manual effort and improve response speed

Ability to drive adoption of AI use cases within MIM processes for efficiency and continuous improvement

Additional Information:

  • The candidate should have a minimum of 3 years of experience in Critical Incident Management.
  • This position is based at our Gurugram office.
  • 15 years of full-time education is required.



Job ID: 148681169
