Mode: Hybrid (2-3 days per week)
Location: Hyderabad, Telangana
Type: Full-Time
Working Days: Monday - Friday
Notice Period: Immediate joiners, or candidates serving a notice period of up to 30 days
About the Role
We are looking for a Technical Product Manager (TPM) - Data Platform / AI / Observability to lead the development of a next-generation AI-powered Observability Platform.
This platform goes beyond traditional monitoring — it ingests, processes, and analyzes large-scale system data (logs, metrics, traces) to enable:
- Intelligent anomaly detection
- Automated root cause analysis
- AI-driven insights and recommendations
- Semi-autonomous or fully autonomous remediation workflows
This role sits at the intersection of product, engineering, data platforms, and AI systems, requiring strong technical depth and the ability to convert complex problems into scalable, production-ready solutions.
Key Responsibilities
Product Strategy & Ownership
- Own and drive the product roadmap for observability (logs, metrics, traces, pipelines)
- Define long-term vision: Monitoring → Insights → Prediction → Autonomous Systems
- Identify gaps in current monitoring/alerting systems and translate them into product opportunities
- Anticipate future needs in AI-driven observability and proactive system intelligence
AI & Intelligent Systems (Core Focus)
- Define AI-driven capabilities such as:
  - Root Cause Analysis (RCA) agents
  - Anomaly detection systems
  - Alert summarization engines
  - Incident co-pilot / assistant systems
- Design workflows where AI:
  - Correlates logs, metrics, and traces
  - Generates actionable insights
  - Enables automated or assisted remediation
- Partner with ML teams to:
  - Define model inputs/outputs
  - Establish feedback loops
  - Ensure explainability and trust
Technical Product Management
- Act as a bridge between engineering and business with deep system understanding
- Write high-quality PRDs aligned with architecture and scale requirements
- Define system behavior across:
  - Data ingestion pipelines
  - Storage layers (Elasticsearch, MongoDB)
  - AI inference layers
User Story Excellence (Critical)
- Break down complex features into granular, sprint-ready user stories
- Define:
  - Acceptance criteria
  - Edge cases
  - Failure scenarios
  - API and data flow expectations
Observability Domain
- Design features for:
  - Log ingestion & parsing
  - Metrics aggregation
  - Distributed tracing
  - Alerting & anomaly detection
  - Pipeline/job monitoring (Airflow, ADF, etc.)
- Build intuitive dashboards, KPIs, and drill-down workflows
Execution & Delivery
- Work closely with engineering in agile/scrum environments
- Drive sprint planning, backlog grooming, and releases
- Ensure high-quality, scalable, and performant delivery
Data-Driven Decision Making
- Define product success metrics (adoption, MTTR, alert accuracy, etc.)
- Analyze usage patterns and system performance
- Continuously optimize the product based on data insights
Required Skills
Technical Expertise
- Strong understanding of:
  - Elasticsearch
  - MongoDB
  - Distributed systems & microservices
- Experience with:
  - Data pipelines & streaming systems (Kafka, etc.)
  - REST APIs and backend architectures
Product Management
- 6–8+ years in Technical Product Management / Platform PM roles
- Proven experience in:
  - Product roadmap ownership
  - PRD creation
  - Writing detailed user stories
Domain Knowledge (Highly Preferred)
- Observability / Monitoring tools: ELK Stack, Grafana, Prometheus, Datadog, etc.
- Experience with large-scale data platforms
Good to Have
- Background in backend engineering / data engineering
- Exposure to AI/ML systems or data-driven products
- Experience with cloud platforms (AWS / Azure / GCP)
- Prior experience building platform products (not just features)
What We're Looking For
- A hands-on TPM who can go deep into architecture and systems
- Someone who thinks like an engineer and communicates like a product leader
- Ability to convert ambiguity into structured execution
- Strong ownership mindset with attention to detail
- Experience working on 0 → 1 or platform evolution journeys
Why Join
- Build a next-gen AI-first observability platform
- Work on high-scale, data-intensive systems
- Solve cutting-edge problems in AI + distributed systems
- Be part of the transition from monitoring tools → autonomous systems