
Datum Technologies Group

Lead Site Reliability Engineer (SRE)

  • Posted 21 hours ago

Job Description

Job Details:

Job Title: Lead Site Reliability Engineer (SRE)

Duration: Contract to Hire (on the payroll of Datum Technologies Group)

Location: Chennai, Mumbai, or Gurugram

Interview Process: Virtual (2 Rounds) + 1 Technical screening.

Job Description:

  • We are seeking a highly skilled and experienced Lead Site Reliability Engineer (SRE) to drive reliability, scalability, and performance across our cloud infrastructure, with a strong emphasis on cloud security, compliance, networking, and operating systems expertise.
  • This role blends reliability engineering with security best practices to ensure our cloud infrastructure is not only scalable and resilient but also secure and compliant.

Responsibilities:

  • Develop and maintain Infrastructure as Code (IaC) using Terraform, including advanced module design and best practices for highly complex environments.
  • Design and optimize CI/CD pipelines with a focus on automation, scalability, and deployment efficiency. Candidates should be able to discuss and implement pipeline optimizations drawing on prior experience.
  • Collaborate with development teams to integrate security and observability tools into CI/CD pipelines, automating security checks.
  • Troubleshoot and debug networking issues across cloud and hybrid environments, drawing on a deep understanding of networking layers, components, and configurations.
  • Administer and optimize Linux-based operating systems, including troubleshooting, performance tuning, and implementing best practices for security and reliability.
  • Address vulnerabilities in code libraries and infrastructure (e.g., OS packages) through patching and remediation.
  • Partner with application teams to resolve specific security findings and improve overall system resilience.

Requirements:

  • 9+ years of experience in DevOps, Site Reliability Engineering (SRE), or Cloud Engineering.
  • Some experience leading or managing a team of engineers.
  • Deep knowledge of networking fundamentals, Linux operating systems, and CI/CD optimization strategies.
  • Very strong expertise in writing complex Terraform code, including advanced module design and best practices for large-scale, highly complex environments.
  • Proficiency in scripting or programming languages (e.g., Python, Bash, Go).
  • Hands-on experience with the Azure cloud platform.

Bonus/Preferred Skills:

  • Experience with Docker and Kubernetes for containerization and orchestration.


Job ID: 134703363