Senior Site Reliability Engineer

3-7 Years
  • Posted 12 days ago
  • Over 50 applicants
Job Description

As a Site Reliability Engineer at Aruba (HPE), you will drive reliability, scalability, and performance for HPE Networking SASE products. You will work at the intersection of software development and operations, focusing on cloud-native applications, observability, and incident management to ensure highly available and resilient services.

Key Responsibilities:

  • Enable SRE support and implement monitoring for cloud-based applications to meet performance and availability requirements.
  • Create strategies to detect, troubleshoot, and resolve issues using tools such as Prometheus, Grafana, or Datadog.
  • Ensure high availability, reliability, and performance of cloud-native applications.
  • Collaborate with development teams to improve application performance and reliability in production.
  • Analyze logs, metrics, traces, and dashboards to gain actionable insights for product optimization.
  • Manage Kubernetes clusters and containerized applications (Docker, Helm).
  • Lead and participate in incident management, including root cause analysis to prevent recurrence.
  • Deploy, operate, and maintain applications in public cloud environments (AWS, Azure, GCP).
  • Apply networking knowledge of routing, TCP/IP, UDP, DNS, firewalls, SNMP, and Internet traffic engineering.
  • Take ownership of SRE responsibilities and contribute proactively to operational excellence.

Education & Experience Required:

  • Bachelor's degree in Computer Science, Information Systems, or equivalent.
  • 5-7 years of overall experience in DevOps or SRE.
  • 3+ years of experience developing cloud-native applications and integrating observability tools for Kubernetes, Helm, or Docker environments.
  • Familiarity with incident management processes and on-call responsibilities (night shifts may be required).

Technical Knowledge & Skills:

  • Expertise in monitoring and observability tools: Prometheus, Grafana, Datadog.
  • Cloud-native application development and deployment in AWS, Azure, or GCP.
  • Containerization and orchestration: Kubernetes, Docker, Helm.
  • Logs, metrics, traces, and dashboards for performance monitoring.
  • Networking protocols and concepts: TCP/IP, UDP, DNS, routing, firewalls, SNMP, traffic engineering.
  • Incident management, troubleshooting, and root cause analysis.

More Info

Open to candidates from: Indian

About Company

The Hewlett-Packard Company, commonly shortened to Hewlett-Packard or HP, was an American multinational information technology company headquartered in Palo Alto, California. HP developed and provided a wide variety of hardware components, as well as software and related services to consumers, small and medium-sized businesses (SMBs), and large enterprises, including customers in the government, health, and education sectors. The company was founded in a one-car garage in Palo Alto by Bill Hewlett and David Packard in 1939, and initially produced a line of electronic test and measurement equipment. The HP Garage at 367 Addison Avenue is now designated an official California Historical Landmark, and is marked with a plaque calling it the "Birthplace of 'Silicon Valley'".

Job ID: 139848253