Kubernetes Expert

HARMAN India

Bengaluru, India

7-9 Years

Posted 13 days ago

Job Description

Job Summary

We are seeking a highly skilled Site Reliability Engineer (SRE) to architect, operate, and scale Kubernetes-based infrastructure across on-premises and cloud environments. This role emphasizes manual application deployments, observability, security, and uptime accountability, while promoting resilience, automation, and operational excellence.

You will work closely with cross-functional teams to ensure performance, availability, and reliability of mission-critical services, while evolving platform capabilities using modern tools like Terraform, Helm, and Argo CD.

Key Responsibilities

Design, build, and manage highly available Kubernetes clusters across hybrid environments (on-premises and cloud platforms such as AWS EKS, Azure AKS).
Deploy and manage applications manually using tools such as kubectl and Helm, with growing integration of GitOps practices (e.g., Argo CD).
Implement and manage observability stacks using Prometheus, Grafana, Loki, and Mimir to monitor infrastructure, applications, and system performance.
Define, monitor, and improve SLA/SLO/SLI metrics and alerting systems to ensure platform reliability.
Automate provisioning and configuration of infrastructure using Terraform, Helm, and scripting languages (e.g., Bash, Python).
Plan, implement, and test backup and disaster recovery (DR) strategies using tools such as Velero and Commvault.
Manage Kubernetes-native networking, storage, and security configurations (Ceph, NFS, Ingress, Pod Security Standards, etc.).
Configure and enforce Kubernetes security best practices using RBAC, OPA/Gatekeeper, NetworkPolicies, and secrets management tools.
Integrate and operate Kubernetes ecosystem tools such as Karpenter, MicroK8s, Service Meshes, and kubectl plugins.
Conduct root cause analysis (RCA) and lead resolution efforts for incidents.
Participate in the on-call rotation for platform availability and incident management.
Maintain up-to-date documentation, architecture diagrams, runbooks, and SOPs.
Mentor engineers and advocate for Kubernetes, security, observability, and deployment best practices across teams.
Stay current with industry trends in container orchestration, GitOps, security, and cloud-native tooling.

Required Qualifications

7-9 years of IT/Infrastructure/DevOps experience, with 5+ years in Kubernetes operations in production environments.
Strong hands-on experience in Kubernetes architecture, cluster operations, and manual application deployment practices.
Intermediate-level experience in Kubernetes Security, including:

Cluster hardening, secrets management
Pod Security Standards (PSS), OPA/Gatekeeper
Network policies, image scanning, and runtime protections

Intermediate experience with Argo CD for GitOps-style Kubernetes deployments.
Solid proficiency in Linux system administration (Ubuntu, CentOS, RHEL) and troubleshooting.
Hands-on experience with Kubernetes-native storage (e.g., Ceph, NFS) and persistent volume provisioning.
Strong familiarity with observability tools: Grafana, Prometheus, Loki, Mimir, etc.
Proficiency in Infrastructure as Code using Terraform, Helm, and scripting.
Experience with Velero, Commvault, or similar for backup and DR.
Experience operating and optimizing cloud-native Kubernetes platforms like EKS, AKS.
Exposure to tools like Karpenter, MicroK8s, Service Mesh, and Ingress Controllers.
Familiarity with AI/ML workloads running on Kubernetes is a plus.
Excellent collaboration, communication, documentation, and incident resolution skills.

Preferred Qualifications

Kubernetes certifications: CKA, CKAD, or CKS.
Strong understanding of container security, networking, and distributed system architecture.
Experience using Portainer for container and Kubernetes management.
Advanced knowledge of Grafana and other enterprise-grade observability tools.
Experience managing large-scale Kubernetes clusters (200+ nodes) is highly preferred.
Prior experience supporting production-grade, high-availability platforms and environments.

Why Join Us

Help shape and operate mission-critical, modern Kubernetes infrastructure.
Be part of a team focused on platform reliability, observability, and secure operations.
Contribute to and influence the evolution of deployment and automation practices (GitOps, IaC).
Access cutting-edge tools, industry best practices, and continuous learning.

Enjoy competitive compensation, flexible working options, and a growth-focused engineering culture.