ITC Infotech India Limited

Sr. Site Reliability Engineer

  • Posted 3 hours ago

Job Description

Key Responsibilities:

  • Configure, deploy, and operate public cloud services (Azure, AWS, GCP).
  • Ensure high availability, security, performance, and disaster recovery best practices.
  • Handle production incidents, plan escalations, conduct post-mortems, and perform impact analysis.
  • Develop and manage CI/CD pipelines for automation and continuous deployment.
  • Balance a development mindset with an SRE mindset (software and infrastructure).
  • Implement and maintain Application Performance Monitoring (APM) tools (Zabbix, Grafana, CloudWatch, etc.).
  • Work with network and security components (BGP, TCP/IP, DNS, SMTP, HTTPS, Security Guardrails).
  • Identify and resolve performance bottlenecks and anomalous system behavior.
  • Use Infrastructure as Code (IaC) tools (Terraform, Ansible, Chef, Puppet, CloudFormation, ARM).
  • Design and manage high availability infrastructure (regions, availability zones, replication).
  • Automate and optimize Kubernetes cluster deployment and monitoring.
  • Implement network architectures suitable for cloud topologies and cloud service expectations.
  • Maintain firewalls and security solutions (Palo Alto, Fortinet, WAF, Cisco routers).
  • Work with containerized environments (Docker, Kubernetes, EKS, GKE, Anthos, OpenShift).
  • Implement DevOps best practices, rapid prototyping, and agile development methodologies.
  • Troubleshoot and debug Kubernetes clusters and cloud infrastructure issues.
  • Use SQL/NoSQL databases, such as PostgreSQL, for cloud storage management.
  • Document troubleshooting processes, automation scripts, and procedural workflows.
  • Provide client management support and collaborate with cross-functional teams.
  • Stay updated on Kubernetes and cloud technology trends.

Required Skills & Qualifications:

  • years of experience in Site Reliability Engineering, Cloud Engineering, or DevOps.
  • Expertise in Azure, AWS, or GCP cloud platforms.
  • Hands-on experience with Infrastructure as Code (IaC) and automation tools.
  • Strong understanding of networking, security, and scalability in cloud environments.
  • Experience in designing and deploying Kubernetes clusters for large-scale applications.
  • Proficiency in Python, PowerShell, or Shell scripting for automation.
  • Experience with firewalls, security policies, and monitoring tools.
  • Strong knowledge of cloud security best practices.
  • Ability to work independently and in a team, including 24x7 shifts when required.

Preferred Qualifications:

  • Certifications in Azure, AWS, or GCP.
  • Experience with Google Cloud Platform (GCP) services like Compute Engine, Cloud Storage, and Kubernetes Engine.
  • Experience in a scaled agile environment with DevOps methodologies.

More Info

Open to candidates from: Indian

Job ID: 105663559