Mphasis

Infrastructure Services Manager

  • Posted 2 days ago

Job Description

Role: Automation Lead

Automation Lead: leads the Automation SRE team and is responsible for delivering end-to-end self-healing automation solutions that reduce manual effort (toil).

Primary Skills: Python, Ansible, Observability, SRE

Secondary Skills: Shell scripting, Linux, monitoring tools (Splunk, AppDynamics, Grafana, ITRS, etc.)

Location: Bangalore/Hyderabad

Automation Engineer

  • 12+ years of experience in leading Automation SRE teams.
  • Advanced working experience with two or more of the following: Unix/Linux, Windows Server, Oracle, MSSQL, MongoDB.
  • Experience with Python, Java, cURL scripting, or any other scripting language.
  • Experience with two or more of the following observability tools: AppDynamics, BigPanda, Elasticsearch (ELK), Google Cloud Logging, Grafana, Prometheus, Splunk, ThousandEyes.
  • Experience with logging, monitoring, and event detection on cloud or distributed platforms.
  • Experience working with one or more of the following: AutoSys, cron, Windows Task Scheduler, or other batch schedulers.
  • Provides technical direction on monitoring and logging to less experienced staff, or develops highly complex original solutions. Acts as an expert technical resource for modeling, simulation, and analysis efforts.
  • Experience creating and modifying technical documentation such as environment flows, functional requirements, and nonfunctional requirements.
  • Outstanding problem-solving and analytical skills, with the ability to turn findings into strategic imperatives.
  • Technical operations application support experience.
  • Minimum 4-6 years of hands-on SRE experience implementing and developing monitoring systems for application reliability using Splunk, Grafana, AppDynamics, and BigPanda.
  • The environment is entirely on-premises, so we require candidates who are strong in the above skills.
  • Overall, we are looking for an Automation Engineer who can reduce toil and improve the reliability and scalability of the system.

Nature Of The Job

  • Collaborate with the Production Support team to identify existing manual activities and automate them.
  • Identify areas of toil that can be automated to avoid manual intervention.
  • Build a monitoring system and observability platform that provides more stack traces and dashboards.
  • Define SLAs, SLOs, and SLIs, and implement them for better monitoring.
  • Scalability, reliability, and observability are the primary goals, with the aim of reducing MTTD and MTTR.

Job ID: 148995243