
Site Reliability Engineer Lead - GC

  • Posted 4 hours ago

Job Description

1. System Reliability:
   a. Collaborate with software development teams to ensure reliability is a key consideration throughout the software development life cycle.
   b. Design and implement scalable and resilient architectures for mission-critical applications.
2. Kubernetes Operations:
   a. Design, implement, and manage Kubernetes clusters, ensuring high availability, fault tolerance, and scalability.
   b. Perform upgrades, patch management, and security enhancements for Kubernetes infrastructure.
3. Automation and Infrastructure as Code (IaC):
   a. Drive automation efforts to streamline deployment, scaling, and management of applications on Kubernetes and/or cloud environments.
   b. Implement CI/CD pipelines for deploying and updating Kubernetes applications.
   c. Develop and maintain Infrastructure as Code scripts (e.g., Terraform, Ansible) for provisioning and managing cloud and container resources.
4. Cloud Integration:
   a. Leverage cloud services (AWS, GCP, Azure) to optimize Kubernetes infrastructure and seamlessly integrate with other cloud-native solutions.
   b. Implement best practices for deploying and managing Kubernetes on cloud platforms.
5. Monitoring and Alerting:
   a. Implement effective monitoring and alerting solutions for Kubernetes clusters, applications, and underlying infrastructure.
   b. Proactively identify and address performance bottlenecks and reliability issues.
6. Incident Response:
   a. Respond to and resolve incidents related to Kubernetes infrastructure and applications, ensuring minimal downtime and impact on users.
   b. Conduct post-incident reviews and implement improvements to prevent future issues.
7. Capacity Planning:
   a. Perform capacity planning to ensure the Kubernetes infrastructure can accommodate current and future workloads in the cloud.
8. Security:
   a. Collaborate with the security team to implement and maintain security best practices for Kubernetes environments in the cloud.
   b. Conduct regular security audits and vulnerability assessments.
9. Collaboration and Documentation:
   a. Work closely with development, operations, and other teams to ensure a collaborative approach to infrastructure and application reliability.
   b. Maintain clear and comprehensive documentation for processes, configurations, and troubleshooting steps.


Job ID: 144636607