ModMed Technologies India Private Ltd

Site Reliability Engineer 2

4-8 Years

This job is no longer accepting applications

  • Posted 9 months ago

Job Description

Key Responsibilities:

  • Architect and manage secure, scalable cloud infrastructure and services, focusing on automation, reliability, and proactive cost management to ensure efficient operations.
  • Implement and refine observability and monitoring solutions using DataDog, ensuring proactive issue identification and efficient resource utilization.
  • Lead CI/CD pipeline development, maintenance, and optimization with Jenkins, integrating AWS services to enhance development workflows and infrastructure automation.
  • Drive the containerization and orchestration of applications using Kubernetes, enhancing scalability, deployment efficiency, and cost-effectiveness.
  • Monitor application and infrastructure performance in AWS, applying tuning and optimizations to ensure optimal resource utilization and user experience while managing costs.
  • Design and manage disaster recovery and backup strategies on AWS, prioritizing data integrity, system availability, and cost efficiency.
  • Provide expert troubleshooting and problem-solving across various platforms and applications within AWS, aiming for minimal disruption and quick resolution.
  • Ensure strict adherence to AWS security standards and compliance with data protection regulations, with a keen eye on cost implications.
  • Keep abreast of new cloud technologies and trends, recommending and implementing improvements for competitive advantage and cost savings.
  • Mentor and support junior team members, fostering a culture of learning, collaboration, and cost-consciousness.
  • Work closely with cross-functional teams to understand requirements and deliver AWS-based solutions that meet business objectives efficiently and cost-effectively.

Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or related field, or equivalent experience.
  • A minimum of 3 years of experience in Site Reliability Engineering, Cloud Engineering, or a similar role, with a demonstrated track record of problem-solving in complex, cloud-based environments. This should include extensive experience with designing, implementing, and managing scalable, highly available, and fault-tolerant systems.
  • Strong expertise in managing cloud environments (preferably in AWS), with hands-on experience in observability platforms such as DataDog.
  • Proficiency in automation and scripting languages (e.g., Python, Bash) and infrastructure as code (IaC) tools (e.g., Terraform, Ansible).
  • Extensive experience with CI/CD tools, notably Jenkins, and familiarity with containerization and orchestration technologies like Kubernetes.
  • Solid understanding of networking, cloud security best practices, performance optimization, and cost management strategies.
  • Demonstrated commitment to implementing industry-standard site reliability principles and a proactive approach to cost management in daily operations.
  • Proven leadership skills and the ability to mentor junior team members, guide teams through complex operational challenges, and foster a culture of continuous improvement.
  • Excellent verbal and written communication skills, with the ability to work effectively in a team environment and communicate complex technical concepts to a non-technical audience.

More Info

Open to candidates from: Indian

Job ID: 108709421

Similar Jobs

Hyderabad, India

Skills:

Java, Prometheus, Grafana, Windows, Jenkins, GCP, Docker, Linux, Azure, Kubernetes, Python, AWS, Azure DevOps, GitHub Actions, Azure Monitor