
Role Overview
This role focuses on managing cloud infrastructure, automation, and DevOps practices across AWS and Azure. The ideal candidate has 5+ years of experience and will play a key role in ensuring system reliability, monitoring, and continuous improvement of cloud operations.
Key Responsibilities
Manage and maintain cloud infrastructure including compute, storage, databases, and container platforms across AWS and Azure
Support deployment and configuration of monitoring tools such as DataDog to ensure visibility into application and system performance
Build and maintain automation for infrastructure provisioning and configuration using Ansible, Terraform, or similar tools
Manage containerized applications using Docker and orchestration platforms like Kubernetes (EKS/AKS)
Develop and maintain CI/CD pipelines to automate application and infrastructure deployment processes
Monitor system performance and proactively identify and resolve issues before they impact production
Perform root cause analysis for incidents and implement long-term fixes to improve reliability
Optimize cloud resource usage and control costs by analysing consumption and implementing governance practices
Support disaster recovery planning and ensure backup strategies meet business requirements
Handle operational tasks and support requests through ITSM tools like ServiceNow
Collaborate with security, application, and infrastructure teams for smooth system integration and operations
Contribute to documentation, standards, and best practices within the cloud operations team
Requirements
5+ years of experience in cloud infrastructure, DevOps, and systems engineering
Strong experience with AWS and/or Azure cloud services
Hands-on experience with automation tools like Terraform, Ansible, or CloudFormation
Experience with containerization (Docker) and orchestration (Kubernetes)
Knowledge of monitoring tools such as DataDog, Prometheus, or New Relic
Understanding of CI/CD pipelines and DevOps practices
Strong troubleshooting and incident management skills
Preferred Skills
Experience with Microsoft Fabric or data platform tools
Knowledge of OpenFlow or network data flow systems
Hands-on experience with DataDog implementation
Experience in enterprise or retail/CPG environments
Certifications (Preferred, Not Mandatory)
AWS Solutions Architect
Azure Administrator
Benefits
Exposure to modern DevOps and cloud automation practices
Opportunity to work on large-scale enterprise cloud systems
Involvement in transformation from reactive to proactive cloud operations
Job ID: 145306471