SRE DevOps Engineer

  • Posted 10 days ago

Job Description

Job Overview

We are seeking an experienced SRE DevOps Engineer to join our team in Hyderabad. This is a full-time, mid-level position requiring 4 to 6 years of relevant work experience. The role demands hands-on expertise in site reliability engineering and DevOps practices, particularly within AWS environments, to ensure the smooth and efficient operation of our systems.

Qualifications and Skills

  • 4 to 6 years of proven experience as an SRE DevOps Engineer in complex cloud environments.
  • Expertise in AWS, SRE, and DevOps practices (mandatory) for architecting, monitoring, and maintaining scalable infrastructure.
  • Proficiency in Python for scripting and automation to improve system efficiency and performance.
  • In-depth understanding of CI/CD pipelines, enabling continuous integration and delivery of software updates.
  • Experience using Grafana to monitor system performance and respond swiftly to incidents.
  • Ability to troubleshoot and resolve system issues with minimal downtime and service disruption.
  • Strong communication skills to collaborate effectively with cross-functional teams and stakeholders.
  • Ability to maintain system reliability and availability through proactive planning and capacity management.

Roles and Responsibilities

  • Design, implement, and manage reliable and scalable infrastructure leveraging AWS cloud services.
  • Develop automation and monitoring solutions that improve system performance and eliminate repetitive manual tasks.
  • Collaborate with development teams to integrate SRE and DevOps best practices into the software development lifecycle.
  • Establish and maintain CI/CD pipelines for seamless software integration and deployment.
  • Monitor system performance and implement strategies to improve application stability and reduce downtime.
  • Respond to system alerts and incidents, performing root cause analysis and implementing corrective actions.
  • Participate in on-call rotations to ensure 24/7 availability and quick resolution of critical issues.
  • Document processes, system configurations, and best practices to enhance team knowledge sharing and efficiency.

Job ID: 133661175
