Airtel Prepaid

Product Manager (Service Assurance Platform)

  • Posted 6 hours ago

Job Description

We are looking for a Product Manager to own and evolve our Service Assurance Platform, the system that ensures every service is observable, reliable, and supported at scale.

This platform sits at the core of our operations, enabling real-time visibility, incident response, change governance, and customer communication across all services and regions.

The focus of this role is to improve reliability, reduce operational friction, and increase customer trust by making the platform consistent, measurable, and scalable.

What You Will Own

You will own the end-to-end product strategy and evolution of the Service Assurance platform, including:

Observability & Monitoring

A unified system for logs, metrics, and traces with alerting, retention, and data access capabilities across all services.

Service Health & SLA Reporting

Accurate, real-time visibility into service status, uptime, and SLA performance for every product and region.

Incident Management & Remediation

Standardized workflows to detect, triage, and resolve incidents, supported by automation, runbooks, and full auditability.

Change Governance

Consistent processes to test, approve, and roll out changes safely, minimizing risk to production systems.

Customer Communication & Support

Reliable delivery of alerts, maintenance updates, and incident notifications, along with structured case management and escalation handling.

Documentation & Knowledge

A centralized, versioned, and searchable documentation system covering all services and APIs.

Key Responsibilities

• Define and drive the platform roadmap focused on reliability and operational excellence

• Establish standard workflows and practices across:

o Observability

o Incident management

o Change management

• Improve key operational outcomes:

o Faster detection and resolution of incidents

o Higher service availability and SLA compliance

o Better customer communication and transparency

• Define and track platform-level metrics (e.g., latency of detection, resolution times, uptime)

• Ensure the platform scales consistently across services and regions

• Partner closely with engineering, SRE, and support teams to drive adoption and execution

• Identify gaps and eliminate fragmentation across tools and processes

What We're Looking For

Required

• 6–10+ years of product management or equivalent experience

• Experience working on reliability, observability, or operational platforms

• Strong understanding of:

o Distributed systems

o Monitoring and alerting

o Incident response workflows

• Experience in at least two of:

o Observability platforms (logs, metrics, tracing)

o Incident or change management systems

o Customer support or case management platforms

• Ability to work closely with engineering and SRE teams

• Strong systems thinking and operational mindset

Preferred

• Experience at a cloud provider or large-scale SaaS platform

• Familiarity with SRE practices (SLOs, SLIs, error budgets)

• Exposure to observability tools (Prometheus, Grafana, OpenTelemetry, ELK)

• Experience with ITSM platforms (e.g., ServiceNow, Jira Service Management)

• Technical background (engineering or equivalent)

What Success Looks Like

Job ID: 145692815