
Job Description
We are seeking a highly skilled and motivated Senior Site Reliability Engineer (SRE) -
Disaster Recovery Specialist. The ideal candidate will be responsible for designing,
implementing, and maintaining systems and processes that ensure the reliability,
scalability, and disaster resilience of our infrastructure and applications. This role
requires a good understanding of both SRE/DevOps practices and Disaster Recovery
strategies.
• Manage all activities related to the disaster recovery program to ensure that
IT systems can recover in the event of a disaster, and perform DR testing
exercises in an AWS environment.
• Plan, design, document, and test disaster recovery solutions to
meet business or technology requirements.
• Implement and manage monitoring, logging, and alerting solutions to ensure
system health and performance.
• Automate repetitive tasks to improve efficiency and reduce manual intervention
in disaster recovery.
• Drive continuous improvement initiatives to enhance the effectiveness and
efficiency of disaster recovery processes and technologies.
• Develop, implement, and maintain disaster recovery plans to ensure the quick
restoration of critical systems and data in the event of a disaster.
• Conduct regular DR drills and simulations to test the effectiveness of the
disaster recovery plan and identify areas for improvement.
• Collaborate with stakeholders to identify critical systems and data that require
protection under the DR plan.
• Manage backup solutions and ensure that data is regularly backed up and can be
quickly restored.
• Understanding of DevOps principles of continuous delivery, deployment, and
improvement.
• Good knowledge of DevOps platform tooling (Git-based repositories, Jenkins, Docker,
Kubernetes).
• Maintain and enhance infrastructure-as-code (IaC) practices using tools like
Terraform, Ansible, or CloudFormation.
• Manage containerization and orchestration platforms such as Docker and
Kubernetes.
Required Qualifications:
• Bachelor's degree in Computer Science, Information Technology, or a related field,
or equivalent experience.
• 5+ years of experience in an SRE role with expertise in disaster recovery.
• Proven experience in disaster recovery planning and implementation.
• Strong knowledge of cloud platforms (e.g., AWS, Google Cloud).
• Proficiency in scripting languages (e.g., Python, Bash).
• Familiarity with CI/CD tools (e.g., Jenkins, GitHub).
• Familiarity with containerization and orchestration tools (e.g., Docker,
Kubernetes).
• Strong understanding of networking, security, and infrastructure best
practices.
• Excellent problem-solving skills and the ability to work under pressure.
Preferred Qualifications:
• Certifications in cloud platforms (e.g., AWS Certified Solutions Architect).
• Experience with infrastructure-as-code tools (e.g., Terraform, Ansible).
• Experience with monitoring and logging tools (e.g., Dynatrace, AWS OpenSearch,
Site24x7).
• Knowledge of database management and DR solutions for databases.
• Strong communication and collaboration skills.
ITC Infotech is a leading global technology services and solutions provider, led by Business and Technology Consulting. ITC Infotech provides business-friendly solutions to help clients succeed and be future-ready, by seamlessly bringing together digital expertise, strong industry specific alliances and the unique ability to leverage deep domain expertise from ITC Group businesses. The company provides technology solutions and services to enterprises across industries such as Banking & Financial Services, Healthcare, Manufacturing, Consumer Goods, Travel and Hospitality, through a combination of traditional and newer business models, as a long-term sustainable partner.
ITC Infotech is a wholly owned subsidiary of ITC Ltd. ITC is one of India’s leading private sector companies and a diversified conglomerate with businesses spanning Consumer Goods, Hotels, Paperboards and Packaging, Agri Business and Information Technology.
Job ID: 105663591