ManageServe Technologies

Linux Site Reliability Consultant

3-5 Years
  • Posted 26 days ago
  • Over 100 applicants

Job Description

What will you be doing

  • Operate, maintain, and administer solutions that contribute to the operational efficiency, availability, and visibility of customer infrastructure.
  • Plan maintenance activities and prepare design documentation and standard procedures.
  • Provide Root Cause Analysis reports for outages/incidents (ITIL - Problem Management)
  • Observe and provide feedback on the current state of the client's infrastructure, and identify opportunities to improve resiliency, reduce incident occurrence, and automate repetitive administrative and operational tasks.
  • Contribute to, improve, and maintain team documentation about client systems and infrastructure, procedures, policies, and schedules.
  • Gather and document information about client environments through audit activities, and analyze the information to identify opportunities for improvement and application of best practices.
  • Work collaboratively with teammates to contribute to the continuous improvement of our working culture.
  • Act as a technology leader for clients, as well as drive client discussions on technology road maps.
  • Participate in an on-call rotation in an escalation capacity.

What do we need from you

  • Experience working with Google Cloud and AWS (including infrastructure-as-code deployment with CloudFormation, Terraform, OpsWorks, etc.)
  • Scripting and automation of administrative tasks using Python and Scala is a must
  • Solid understanding of microservices architecture and container technologies (Kubernetes is a must; Docker, LXC, etc.)
  • Clear understanding of software development lifecycles and best practices from an infrastructure point of view (PRs, merge, rebase, etc.)
  • Understanding of the end-to-end operation of a business system, as opposed to its individual components
  • Comprehensive systems hardware and network troubleshooting experience
  • Installation, configuration, performance tuning, and cloud migration of common Linux distributions
  • TCP/IP networking, NIC bonding, and network services configuration (DNS, NTP, DHCP, SMTP, etc.)
  • Operation and administration of virtual infrastructure, including experience with at least one hypervisor (VMware, Hyper-V, KVM, etc.)
  • Ability to describe IaaS, PaaS, SaaS, pros and cons of each, use cases for virtualization and cloud
  • Administration of web servers and supporting technologies, including network load balancers
  • Experience with the design, development, and deployment of Puppet-based configuration management
  • System and application error investigation, troubleshooting of access/availability issues including deep multi-system root cause analysis
  • Experience managing networking devices, such as switches and firewalls, from a variety of vendors
  • Solid understanding of DevOps tools, processes, and culture
  • Ability to pick up new technologies quickly
  • Ability to provide accurate work scheduling and task estimations for work delivery

More Info

Open to candidates from: Indian

Job ID: 115967051