Role Overview
The Lead of Cloud Operations will lead the end-to-end operations of the organization's shared cloud services across multi-cloud and hybrid environments. This leader will be responsible for ensuring high availability, operational excellence, cost efficiency, security compliance, and continuous service improvement across all cloud platforms.
The role requires deep technical expertise, strong operational leadership, and the ability to define and implement cloud operating models aligned with business and transformation goals.
Key Responsibilities
1. Cloud Operations Management
- Lead daily operations for shared cloud services across Azure, AWS, GCP, and hybrid environments.
- Ensure reliable, secure, and high-performance cloud service delivery within agreed SLAs.
- Oversee monitoring, incident response, problem management, and change management for cloud workloads.
- Drive automation and IaC adoption for provisioning, configuration, and operations.
2. Service Reliability & Availability
- Implement SRE frameworks, error budgets, and observability standards.
- Establish proactive monitoring using tools such as CloudWatch, Azure Monitor, Datadog, and Grafana.
- Ensure continuous improvement of uptime, performance, failover, and disaster recovery capabilities.
3. Cloud Governance & Compliance
- Define and enforce cloud governance frameworks, tagging policies, resource standards, guardrails, and best practices.
- Ensure compliance with industry standards (ISO 27001, SOC 2, PCI DSS) and applicable regulatory requirements.
- Partner with security teams to maintain a strong cloud security posture.
4. Cost & Capacity Management
- Own cloud financial management (FinOps).
- Optimize cloud resource consumption, budgets, reservations, and cost-saving strategies.
- Provide periodic cloud cost visibility to leadership.
5. Team Leadership & Vendor Management
- Lead and mentor cloud operations engineers, SREs, and platform administrators.
- Manage MSP/cloud vendors, negotiate SLAs, and ensure contract compliance.
- Build capability roadmaps and maturity models for cloud operations.
6. Deployment & Lifecycle Management
- Oversee deployment pipelines, patch management, scaling, backup, and DR processes.
- Ensure standardization and lifecycle management of cloud resources.
7. Architecture & Transformation Collaboration
- Work closely with Solution Architects, DevOps, Security, and Application teams for seamless operations.
- Contribute to cloud roadmap, modernization initiatives, and migration programs.
Required Skills & Experience
Technical Skills
- 10-15+ years of experience in cloud & infrastructure operations.
- Deep expertise in Azure / AWS / GCP (at least two platforms preferred).
- Proven experience with multi-cloud, hybrid cloud, virtualization, containers (K8s), and networking.
- Strong understanding of:
  - IaC tools: Terraform, ARM, CloudFormation
  - CI/CD: Azure DevOps, GitHub Actions, Jenkins
  - Monitoring tools: Dynatrace, AppInsights, Splunk
  - Security posture tools: CSPM, CWPP, SIEM, SOC
Leadership & Operational Skills
- Strong people management and vendor management experience.
- Demonstrated ability to run large-scale 24x7 cloud operations.
- Experience implementing ITIL practices for Incident/Problem/Change.
- Strong analytical, decision-making, and crisis-management skills.
Key Attributes
- Customer-first mindset with strong service quality orientation.
- Ability to work under pressure, manage cross-functional teams, and resolve complex operational issues.
- Strategic thinking with a hands-on approach to operational excellence.
Regards,
HR Team