Coforge

Site Reliability Engineer/Lead

8-10 Years
  • Posted 5 months ago

Job Description

Role: SRE Lead Engineer

Skills: Docker, Prometheus, Grafana, ELK, Datadog

Location: Noida

Experience: 8+ Years

Mode: Work from office

We at Coforge are hiring a highly skilled and experienced SRE Lead Engineer to drive reliability, scalability, and performance across our infrastructure and applications. You will lead a team of SREs, collaborate with development and operations teams, and implement best practices to ensure high availability and resilience of our systems.

Key Responsibilities:

  • Lead and mentor a team of SREs to build scalable and reliable systems.
  • Design and implement monitoring, alerting, and incident response strategies.
  • Drive automation of operational tasks and improve deployment pipelines.
  • Collaborate with software engineers to ensure reliability is built into the product from the ground up.
  • Conduct root cause analysis and postmortems for production incidents.
  • Define and track SLAs, SLOs, and SLIs to measure and improve system reliability.
  • Champion DevOps and SRE best practices across the organization.
  • Manage capacity planning and performance tuning.
  • Ensure security and compliance standards are met in infrastructure operations.

Required Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 8+ years of experience in software engineering, DevOps, or SRE roles.
  • Strong experience with Azure platforms.
  • Proficiency in programming/scripting languages (Python, Go, Bash, etc.).
  • Expertise in CI/CD tools (Jenkins, GitLab CI, etc.).
  • Deep understanding of containerization and orchestration (Docker, Kubernetes).
  • Experience with observability tools (Prometheus, Grafana, ELK, Datadog).
  • Excellent problem-solving and communication skills.
  • Proven leadership experience in technical teams.

Preferred Qualifications:

  • Certifications in cloud technologies or DevOps practices.
  • Experience with Infrastructure as Code (Terraform, Ansible).
  • Familiarity with chaos engineering and resilience testing.
  • Exposure to regulatory compliance (e.g., SOC 2, ISO 27001).

Job ID: 128149357