Dear Connections,
Greetings from People Prime Worldwide!
We have an excellent job opportunity with one of our top clients.
Job Title: Sr. System Reliability Engineer - (Contract to Hire)
Experience: 6 - 9 Years
Location: Pune (Hybrid)
Notice: Immediate to 15 Days (Currently Serving Only)
Job Description:
The Role
- Plan, manage, and oversee all aspects of a Production Environment
- Define strategies for application performance monitoring and optimization in the production environment.
- Respond to incidents, improve the platform based on feedback, and measure the reduction in incidents over time.
- Support deployment of code into multiple lower environments, and support current processes with an emphasis on automating everything as soon as possible.
- Design, develop, and standardize monitoring and alerting mechanisms for the supported applications.
- Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, to optimize mean time to recover.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Analyse ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead in DevOps automation and best practices.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Scale systems sustainably through mechanisms like automation and evolving systems by pushing for changes that improve reliability and velocity.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Share knowledge and explain processes and procedures to others.
- Perform on-call duties on a rotational basis.
- Occasional off-hours work is required.
Requirements:
Must Have
- Linux
- Shell Scripting
- ITIL / ITSM
- SQL - Basic
- Application Troubleshooting
- Monitoring tools (preferred - Splunk/Dynatrace)
- Jenkins - CI/CD
- Cloud - Good knowledge of cloud technology in AWS / Azure (preferred - Azure)
- Groovy Scripting/YAML
- Git (basic)/Bitbucket
- Ansible/Chef
Good To Have
- Event Framework architecture
- NiFi
- Hadoop