Experience: 8+ Years
Location: Hyderabad (Hybrid)
Job Summary
We are looking for an experienced Site Reliability Engineer (SRE) / DevOps Engineer with strong expertise in cloud platforms, CI/CD pipelines, Infrastructure as Code (IaC), and container orchestration. The ideal candidate will be responsible for building, deploying, monitoring, and scaling highly available systems while ensuring reliability, performance, and security across environments.
Key Responsibilities
- Design, implement, and maintain highly available and scalable infrastructure following SRE best practices
- Build and manage CI/CD pipelines using tools such as Harness, Jenkins, GitHub Actions, or equivalent
- Implement Infrastructure as Code (IaC) using Terraform and configuration management tools
- Manage cloud infrastructure on GCP or another major cloud platform (AWS or Azure)
- Containerize applications using Docker and manage workloads on Kubernetes
- Deploy, manage, and monitor Kubernetes clusters including workloads, configuration, security, and storage
- Develop and support API/Web Service integrations for cloud-native applications
- Implement monitoring, logging, alerting, and incident response to ensure system reliability
- Collaborate with development teams to improve system performance, resilience, and deployment strategies
- Support DevOps culture by automating processes and driving continuous improvement
- Use ITSM tools such as ServiceNow or BMC Remedy for incident, change, and problem management
Mandatory Skills
- SRE & DevOps experience (7.5+ years)
- Cloud Platforms: GCP (preferred) or AWS/Azure
- CI/CD Tools: Harness / Jenkins / GitHub Actions / GitLab CI/CD
- Infrastructure as Code: Terraform (mandatory)
- Containerization: Docker
- Kubernetes:
  - Workload management
  - Configuration & deployment
  - Security & access control
  - Storage & networking
- Programming/Scripting: Java, Python, or Shell
- API & Web Services: RESTful APIs, JSON, XML
- Version Control: Git / GitHub
Good to Have / Preferred Skills
- Experience with Temporal workflows
- Knowledge of Spring Boot, Spring Data, and Spring Cloud
- Exposure to GCP services and cloud-native architecture
- Observability tools (Prometheus, Grafana, ELK, Splunk)
- Experience in microservices-based architectures
- Strong understanding of DevOps & SRE best practices (SLIs, SLOs, SLAs)
Soft Skills
- Strong problem-solving and analytical skills
- Excellent communication and collaboration abilities
- Ability to work in fast-paced, high-availability environments
- Ownership mindset with strong operational discipline
Education
- Bachelor's degree in Engineering or equivalent qualification
Why Join Us
- Work on large-scale, cloud-native platforms
- Exposure to modern DevOps & SRE tooling
- Opportunity to design and operate highly reliable systems
- Collaborative, learning-driven engineering culture