
The SRE Engineer will play a critical role in ensuring our trading services are always available, scalable, and
engineered to withstand unparalleled demand.
You will be deeply involved in incident management, troubleshooting, and root cause analysis, with a strong
emphasis on automation and improving our operational processes.
Your expertise in DevOps practices, coupled with your strong skills in Dynatrace, Splunk, and Grafana, will be
essential in monitoring, visualizing, and troubleshooting our systems.
Key Responsibilities:
Incident Management: Lead and manage incident response and blameless post-mortems, ensuring quick
recovery and future prevention.
Monitoring and Observability: Utilize Dynatrace, Splunk, and Grafana to implement comprehensive monitoring
and observability frameworks for proactive incident detection and performance metrics visualization.
DevOps Integration: Collaborate with development teams to integrate DevOps practices into the software
development lifecycle, enhancing CI/CD pipelines for better reliability and efficiency.
Root Cause Analysis: Conduct in-depth root cause analysis for incidents and outages, developing long-term
solutions to prevent recurrence.
Performance Tuning: Optimize system performance by identifying bottlenecks and implementing scalable
solutions.
Automation: Develop automation tools and scripts to reduce manual intervention, improve system reliability,
and streamline operational processes.
Documentation: Create and maintain detailed documentation for system architecture, incident reports, and
operational procedures.
Collaboration and Leadership: Work closely with cross-functional teams to share knowledge, mentor junior
team members, and promote a culture of reliability and continuous improvement.
SRE exposure (5-15 yrs), DevOps, exposure to high-volume transaction management with the APM tools Dynatrace, Splunk, and Grafana
[Confidential Information]
Job ID: 144753737