Role Overview
We are looking for a skilled Spark on Kubernetes Support Engineer to provide L1/L2 support for large-scale data platforms. This role involves monitoring, troubleshooting, and optimizing Spark workloads running on Kubernetes to ensure high availability and performance of data pipelines.
Key Responsibilities
- Serve as the first level of escalation for 24×7 monitoring of Spark (batch & streaming) workloads on Kubernetes
- Troubleshoot Spark job failures, performance issues, and resource bottlenecks
- Diagnose Kubernetes issues (pod failures, OOMKilled, evictions, DiskPressure, scaling issues)
- Monitor Spark UI, cluster health, and resource utilization
- Collaborate with development teams to debug and optimize pipelines
- Handle Sev1/Sev2 incidents, including root-cause analysis (RCA) and war-room coordination
- Build and maintain monitoring dashboards and alerting frameworks (Prometheus/Grafana/ELK)
- Support CI/CD pipelines and deployment automation using Azure DevOps
- Maintain SOPs and runbooks, and drive continuous improvement
Required Skills
- 3–10 years of experience in Big Data, Distributed Systems, or Cloud Support
- Strong expertise in Apache Spark (Core, SQL, Structured Streaming)
- Hands-on experience with Spark on Kubernetes
- Good understanding of Kubernetes architecture & troubleshooting
- Experience with Azure DevOps (CI/CD pipelines, Git, deployments)
- Strong knowledge of Linux, SQL, and scripting (Python/Shell)
- Familiarity with monitoring tools: Prometheus, Grafana, ELK
Good to Have
- Experience with Kafka / streaming ecosystems
- Exposure to cloud platforms (Azure/AWS/GCP)