Site Reliability Engineer

TECEZE

Pune, India

5-7 Years

Posted 4 hours ago

Job Description

Location: Pune

Experience: 5+ years in Site Reliability Engineering, DevOps, or Infrastructure Operations.

Employment Type: Full-time

Job Overview

We are seeking a skilled Site Reliability Engineer (SRE) with experience in private cloud and infrastructure operations. The role focuses on ensuring the reliability, scalability, performance, and security of enterprise infrastructure while driving automation, observability, and DevOps best practices.

Key Responsibilities

Design and maintain highly available and fault-tolerant infrastructure systems.
Monitor system performance and ensure infrastructure reliability and scalability.
Lead incident response, root cause analysis (RCA), and system performance improvements.
Develop automation tools and scripts to streamline operations and reduce manual tasks.
Implement Infrastructure as Code (IaC) using tools like Terraform or Ansible.
Support containerized environments using Docker, Kubernetes, or OpenShift.
Build and maintain monitoring, logging, and alerting systems.
Collaborate with DevOps, development, and security teams to support CI/CD pipelines and ensure secure infrastructure operations.

Required Skills

Strong experience in Linux/Unix system administration.
Proficiency in Python, Go, or shell scripting (e.g., Bash).
Experience with cloud platforms (AWS, Azure, or GCP).
Hands-on experience with containerization and orchestration technologies.
Good understanding of networking concepts (DNS, TCP/IP, load balancing, firewalls).
Experience with monitoring and observability tools such as Prometheus, Grafana, ELK, or Datadog.