Product Manager - AI/ML Operations & Deployment
About the Role
We are seeking a seasoned and highly technical Senior Technical Product Manager (with 10+ years of experience) to focus on the operationalization, health, and performance of our existing and new AI/ML product portfolio. This role is focused on the execution layer: ensuring that models are deployed reliably, performing optimally in production, delivering the expected business value, and iterating quickly based on real-world data and user feedback.
Key Responsibilities
- MLOps & Deployment Ownership: Drive the operational roadmap for deploying, monitoring, and maintaining production-grade machine learning models. Ensure products meet strict uptime, latency, and scalability requirements.
- Product Health & Performance: Define and rigorously track operational Key Performance Indicators (KPIs) for deployed models, including model drift, data quality, inference latency, and business value realization.
- Iterative Optimization: Work closely with Data Science and Engineering teams to establish feedback loops from production data. Prioritize maintenance, bug fixes, and continuous minor improvements that enhance operational efficiency and model accuracy.
- User Experience & Documentation: Ensure the operational experience for internal users (e.g., engineering, support) is streamlined. Develop clear documentation and runbooks for product support and issue resolution.
- Feature Prioritization (Execution Focus): Manage the product backlog with a focus on immediate operational needs, technical debt reduction, and compliance requirements, balancing these with small-scale feature enhancements.
- Cross-Functional Coordination: Act as the primary interface between development, site reliability engineering (SRE), and business operations teams to manage product releases and maintenance schedules.
- ML Pipeline Orchestration Strategy: Define and own the strategy for ML pipeline orchestration, ensuring the selection, implementation, and optimization of platform tools or equivalent cloud-native services to support automated training, testing, and continuous deployment.
- Responsible AI & Governance: Integrate Responsible AI principles into the product lifecycle, including defining requirements for model explainability, fairness, and bias detection in production.
Skills and Attributes for Success
- 10+ years of progressive experience in product management or a highly relevant technical role (e.g., Technical Program Management, ML Engineering), with a minimum of 5 years focused on the operationalization and maintenance of AI/ML models in production.
- Deep understanding of MLOps principles, tools, and best practices (e.g., model serving, monitoring, continuous integration/delivery for ML).
- Proven ability to manage technical debt and operational risks associated with deployed AI/ML systems.
- Technical fluency: Comfortable interpreting detailed technical logs, understanding model performance reports, and translating operational issues into engineering requirements.
- Experience with cloud platforms (AWS, Azure, or GCP) for managing compute resources, data pipelines, and ML services.
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related field.
Desired Skills
- Direct operational experience with AI/ML products integrated within the SAP landscape (e.g., monitoring model performance related to Finance, SCM, or CRM processes within SAP environments).
- Experience managing products in highly regulated industries or environments requiring strict compliance and audit trails.
- Certifications or extensive experience in Agile methodologies (Scrum/Kanban) focusing on rapid iteration cycles.
- Data & Feature Engineering Decisions: Proven ability to collaborate with Data Scientists and ML Engineers on defining and managing features for ML models, including feature versioning.
- Microservices Architecture Experience: Proven experience designing, implementing, and managing microservices required for existing and future projects.
- API Architecture Design: Experience designing API architecture for new projects based on component and microservice requirements.