Senior Site Reliability Engineer - Logging Metrics and Monitoring

5-8 Years
  • Posted 5 days ago

Job Description

Job Responsibilities

  • Automate the deployment of logging, metrics, and monitoring services through configuration management utilizing Puppet.
  • Address and resolve production incidents by applying Linux administration and engineering expertise.
  • Lead projects from inception to completion, including designing technical solutions, managing timelines, and executing deliverables.
  • Design and implement metrics dashboards and alert criteria to effectively monitor and scale services.
  • Participate in a week-long on-call rotation in collaboration with team members.
  • Assist development teams in enhancing their logging and metrics collection processes.
  • Be able to handle on-call duty every few weeks.

Typical Qualifications

  • Possess 5 to 8 years of experience in a production environment, with strong system administration and DevOps skills for managing services in a Linux environment.
  • Demonstrate hands-on experience with configuration management tools such as Puppet or Ansible.
  • Strong experience troubleshooting production services in a Linux environment and participating in on-call rotations.
  • Proficient in programming with experience writing and maintaining scripts in the following languages: Bash, Ruby, Python, Perl, C++, Java, and Golang.
  • Experience developing Infrastructure as Code utilizing Terraform and CloudFormation.
  • Display adaptability and flexibility in response to changing environmental and business demands.

Additional Qualifications

  • Demonstrated experience in managing production server fleets at a scale of thousands.
  • Subject matter expertise in relevant technologies, including Fluentd, Kafka, Elasticsearch, Graphite, ClickHouse, Prometheus, Grafana, Graylog, Terraform, CloudFormation, Docker, Jenkins, and Git.
  • Exposure to Amazon Web Services (AWS) for deploying, managing, and scaling applications, with a foundational understanding of AWS services, architecture, and best practices.
  • Proficient in using protocol analyzers such as tcpdump and Wireshark.

More Info

Open to candidates from: Indian

About Company

Athenahealth Technology Private Limited is the Indian subsidiary of U.S.-based Athenahealth Inc., specializing in healthcare software development and IT services. Founded in 2005 and headquartered in Chennai, the company supports global operations with offices in Bengaluru and Pune. It plays a key role in building innovative solutions to improve healthcare access and delivery.

Job ID: 118196293