Coredge.io

Site Reliability Engineer - 2

  • Posted 6 days ago

Job Description

We are seeking a highly skilled and motivated Site Reliability Engineer to join our team. The ideal candidate will have at least 3 years of DevOps experience and a strong technical background in Linux, Kubernetes, monitoring tools, and automation. The role involves deploying, monitoring, and managing infrastructure and applications to ensure optimal performance and reliability.

Responsibilities

  • Work with Linux-based systems to deploy and manage applications.
  • Troubleshoot Linux-related issues to ensure high availability and performance.
  • Maintain system stability, security, and performance tuning.
  • Deploy, configure, and maintain Kubernetes clusters.
  • Debug issues related to Kubernetes environments, including container orchestration and service failures.
  • Ensure seamless containerised application deployments and scaling.
  • Implement, configure, and maintain Prometheus and Grafana for system and application monitoring.
  • Develop and maintain real-time Grafana dashboards for critical insights.
  • Troubleshoot system performance and application issues using monitoring data.
  • Understand cloud-based environments and basic cloud computing principles.
  • Work with cloud services for infrastructure management and monitoring.
  • Assist in troubleshooting cloud-related issues when required.
  • Gain an understanding of the Cloud/Horizon portal for managing project-related tasks.
  • Monitor and track cloud-based infrastructure using Horizon.
  • Utilise the portal for operational insights and incident management.
  • Set up, manage, and troubleshoot CronJobs for automating scheduled tasks.
  • Ensure automated tasks execute as planned and investigate failures.
  • Enhance automation processes to optimise system operations.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
  • 3-5 years of experience in a DevOps Support Engineer role.
  • Strong expertise in Linux system administration.
  • Hands-on experience with Kubernetes deployment, debugging, and troubleshooting.
  • Proficiency in Prometheus and Grafana for monitoring and dashboard management.
  • Basic knowledge of cloud computing environments.
  • Experience with the Horizon portal (preferred but not mandatory).
  • Strong scripting and automation skills; experience with Shell, Python, or Ansible is a plus.
  • Ability to work independently and handle production incidents with minimal supervision.
  • Excellent troubleshooting and analytical skills.
  • Certification in Kubernetes (CKA, CKAD) is a plus.
  • Experience with CI/CD pipelines and DevOps automation.
  • Exposure to cloud providers such as AWS, Azure, or OpenStack.
  • Strong understanding of networking fundamentals in a cloud-native environment.

This job was posted by Sajal Saxena from CorEdge.

Job ID: 134108173