Description:
We are seeking a skilled CloudOps Engineer to join our team. The successful candidate will be responsible for managing, automating, and optimizing cloud operations to ensure high availability, performance, and security of our cloud-based systems. The ideal candidate will have strong expertise in cloud platforms (GCP/AWS/Azure), Infrastructure as Code (Terraform/Ansible), Kubernetes, and monitoring/observability tools.
Responsibilities:
- Manage, monitor, and optimize cloud infrastructure across multi-cloud environments
- Implement Infrastructure as Code (IaC) using Terraform/Ansible for scalable and repeatable deployments
- Configure and operate Kubernetes clusters for containerized workloads
- Automate operational processes such as scaling, backups, patching, and recovery
- Monitor cloud resources using tools such as Google Cloud Monitoring (formerly Stackdriver), Prometheus, Grafana, or equivalent
- Troubleshoot and resolve infrastructure incidents with root cause analysis and preventative measures
- Collaborate with DevOps, Development, and Security teams to ensure compliance and governance
- Drive continuous improvement in reliability, availability, and performance of services
- Stay up to date with advancements in cloud-native operations, FinOps, and security best practices
- Willingness to work in rotational shifts
Requirements:
- 6-9 years of experience in CloudOps, SRE, or DevOps roles
- Strong background in cloud platforms (GCP or Azure preferred, AWS a plus)
- Hands-on experience with Kubernetes, Terraform, and CI/CD pipelines
- Proficiency in scripting languages (Python, Bash, or similar) for automation
- Experience with observability and logging tools (e.g., Prometheus, ELK, Grafana)
- Good understanding of networking, security, and compliance in cloud environments
- Strong problem-solving, communication, and collaboration skills
- Ability to work in a fast-paced, globally distributed environment