Datum Technologies Group

Site Reliability Engineer

  • Posted 24 days ago

Job Description

Job Title: Site Reliability Engineer (SRE), AWS

Experience: 8+ years

Location: Chennai / Mumbai

Work Mode: Hybrid

Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog

Job Summary:

We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and large-scale distributed systems.

Responsibilities:

Manage and optimize cloud infrastructure using AWS IaaS.

Implement SRE practices to enhance reliability, performance, and SDLC efficiency.

Build and maintain CI/CD pipelines (Jenkins, GitLab, Terraform).

Work with containers and orchestration (Docker, ECS, Kubernetes).

Troubleshoot performance, networking, and distributed system issues.

Drive DevOps and QA best practices across teams.

Implement observability: SLIs/SLOs, error budgets, monitoring, logging, tracing, and alerting.

Lead incident resolution and perform root cause analysis (RCA).

Automate tasks using Python/Bash/PowerShell.

Collaborate effectively with cross-functional teams and work with minimal supervision.

Qualifications:

Strong AWS cloud experience

Proven DevOps & SRE implementation skills

Good understanding of Linux, networking, and distributed systems

Hands-on experience with observability tools

Strong scripting and automation expertise

Excellent communication and teamwork skills

More Info

Job ID: 133342111
