Cloud & Infrastructure Management
- Manage and support hybrid infrastructure including physical data centers, AWS (EC2, S3, RDS, Route53), and Azure services.
- Design and implement Infrastructure-as-Code (IaC) using Terraform, Pulumi, or CloudFormation.
- Configure and maintain Kubernetes clusters (EKS), ensuring scalability and high availability.
CI/CD & Automation
- Build, maintain, and optimize CI/CD pipelines with Jenkins and Bitbucket.
- Automate deployments, scaling, and monitoring processes.
Monitoring & Troubleshooting
- Set up and maintain observability using Grafana, CloudWatch, Kibana, and OpenSearch.
- Perform root cause analysis of performance issues, scaling bottlenecks, and high utilization scenarios.
- Participate in on-call support, ensuring timely resolution of incidents.
Security & Compliance
- Manage SSL and code signing certificates (issuing, renewing, deploying).
- Implement best practices for infrastructure security, password management, and access control.
- Support compliance and audit requirements through documentation and monitoring.
Collaboration & Knowledge Sharing
- Work with cross-functional teams to improve infrastructure reliability and delivery processes.
- Maintain clear documentation in Confluence, track work in Jira, and contribute to knowledge sharing.
- Write root cause analyses (5 Whys) and incident post-mortems to prevent recurring issues.
Requirements
- Hands-on expertise with AWS services (EC2, S3, RDS, Route53) and Azure cloud
- Strong experience with Infrastructure-as-Code tools (Terraform, Pulumi, CloudFormation)
- Experience with Kubernetes/EKS setup and operations
- CI/CD pipelines (Jenkins, Bitbucket)
- Monitoring and observability tools (Grafana, CloudWatch, Kibana, OpenSearch)
- Scripting and automation skills in Python or TypeScript
- Networking (IP, subnetting), Linux/Windows administration, and troubleshooting
- Managing SSL/code signing certificates
- Security best practices in infrastructure and access control