SRE DevOps Engineer <> Persistent Systems

6-12 Years
  • Posted 2 days ago
  • Over 50 applicants

Job Description

Key Deliverables

  • Maintain and support a highly available production SaaS environment on AWS, ensuring optimal uptime and performance.
  • Manage and optimize AWS services (EKS, EC2, S3, VPC), Kafka and Cassandra clusters, and networking components.
  • Implement and manage Infrastructure as Code (IaC) using Terraform to ensure repeatable and compliant infrastructure.
  • Handle production deployments, upgrades, patching, and release rollouts with minimal downtime.
  • Monitor system performance, troubleshoot issues, and ensure high reliability through proactive measures.
  • Own backup and disaster recovery strategies, including planning and executing regular DR exercises.
  • Participate in 24/7/365 on-call support via PagerDuty, responding to and resolving critical incidents.
  • Collaborate effectively across global teams, ensuring smooth operations and communication across different time zones.
  • Apply DevSecOps best practices and maintain CI/CD pipelines to enhance security, reliability, and automation.

Essential Requirements

  • 6 to 12 years of hands-on experience running and supporting highly available, mission-critical SaaS platforms on AWS.
  • Deep expertise in managing and optimizing complex AWS environments including VPCs, EC2, S3, and EKS.
  • Strong operational experience with Kubernetes, container orchestration, and distributed systems like Kafka and Cassandra in production.
  • Expert-level proficiency in Terraform for infrastructure automation.
  • Proven ability to handle production deployments, upgrades, patching, and release rollouts with minimal downtime.
  • Experience with observability, alerting, and incident management, including participation in 24/7 on-call rotations.
  • Demonstrated expertise in defining, implementing, and validating backup, disaster recovery, and failover strategies.

Preferred Qualifications

  • Experience with GitOps workflows and DevSecOps best practices.
  • Familiarity with Ansible for configuration management.
  • A strong SRE mindset focused on SLIs, SLOs, error budgets, and continuous reliability improvements.

Education: Bachelor of Technology (B.Tech/B.E.)

More Info

Open to candidates from: Indian

About Company

Easy Refer is a recruitment and referral-based staffing platform that connects job seekers with employers through a streamlined hiring process.

Job ID: 147598351