Project Role : Infra Tech Support Practitioner
Project Role Description : Provide ongoing technical support and maintenance of production and development systems and software products (both remote and onsite) and for configured services running on various platforms (operating within a defined operating model and processes). Provide hardware/software support and implement technology at the operating system level across all server and network areas, and for particular software solutions/vendors/brands. Work includes L1 and L2 (basic and intermediate level) troubleshooting.
Must have skills : Kubernetes
Good to have skills : NA
Minimum 5 Year(s) Of Experience Is Required
Educational Qualification : 15 years of full-time education
Summary:
We are seeking a Kubernetes Support Engineer with deep technical expertise in Kubernetes, GKE, and automation.
You will be a key technical point of contact for advanced troubleshooting, platform reliability, and change management, ensuring the stability and performance of our large-scale Kubernetes infrastructure.
As part of the Kubernetes & Platform Engineering team, you will contribute to the support, governance, and continuous improvement of our internal cloud-native platform.
Roles & Responsibilities:
Advanced Troubleshooting & Support (L3):
- Diagnose and resolve complex Kubernetes incidents (networking, scheduling, API server, CNI, storage, etc.) across GKE clusters and potentially other distributions.
- Perform root cause analysis.
Incident & Problem Management:
- Take ownership of critical incidents, perform in-depth investigations, and ensure permanent resolutions through RCA reports and automation.
- Collaborate with DevOps, SRE, and Security teams to improve reliability and resilience.
Platform Automation & GitOps:
- Drive automation initiatives through GitOps practices using ArgoCD and internal platform engineering tools.
- Maintain consistency, compliance, and reproducibility of Kubernetes environments at scale.
Platform Change Management:
- Participate in and oversee change management processes for the Kubernetes platform:
  - Assess and validate proposed changes (cluster upgrades, configuration updates, policy enforcement).
  - Coordinate with internal teams to ensure minimal disruption and proper rollback strategies.
  - Ensure all changes are properly tracked, reviewed, tested, and documented.
- Contribute to defining governance and standard operating procedures (SOPs) for platform lifecycle management.
- Your goal: guarantee safe, auditable, and transparent platform evolution across all environments.
Performance & Reliability Optimization:
- Continuously monitor and optimize Kubernetes cluster performance, scalability, and cost efficiency.
- Contribute to reliability metrics and proactive incident prevention.
Cross-team Collaboration:
- Work closely with application teams, platform engineers, and architects to ensure operational excellence and continuous improvement.
Documentation & Knowledge Sharing:
- Write and maintain high-quality technical documentation, troubleshooting guides, and platform knowledge bases.
- Contribute to internal training and technical enablement sessions.
Technology Watch:
- Stay up to date with Kubernetes releases, CNCF ecosystem tools, and GKE innovations.
- Recommend improvements and help drive the evolution of the platform roadmap.
Professional & Technical Skills:
- Certification: Certified Kubernetes Administrator (CKA) required (CKAD or CKS certifications are a plus)
Kubernetes Expertise:
- 3+ years of hands-on experience managing and supporting Kubernetes clusters in production (preferably GKE).
- Solid understanding of Kubernetes internals: control plane components, scheduling, networking, and security.
Cloud & Infrastructure:
- Strong knowledge of Google Cloud Platform (GCP): IAM, Cloud Logging, Monitoring, and Networking.
Automation & GitOps:
- Experience with ArgoCD and GitOps-based workflows.
- Familiarity with CI/CD tools such as GitLab CI/CD, plus experience with automation (Golang if possible, but not mandatory).
Observability & Reliability:
- Experience with observability tools such as Dynatrace.
Change Management & Governance:
- Understanding of ITIL or internal change management best practices.
- Experience coordinating platform upgrades, patch management, and rollout strategies.
Security & Compliance:
- Knowledge of RBAC, Kyverno, network policies, and secret management.
Additional Information:
- Master's degree in computer science or equivalent experience
- 3+ years as a Kubernetes Engineer, SRE, or Platform Support Engineer
- Analytical mindset and passion for solving complex platform issues
- Excellent communication, teamwork, and documentation skills
- Proactive, rigorous, and focused on reliability, automation, and user satisfaction