We are looking for Senior Site Reliability Engineers to support and scale applications deployed worldwide. This role is deeply rooted in software engineering, with a strong emphasis on automation, tooling, and reliability through code. You'll apply senior-level engineering skills to solve complex reliability challenges, enabling systems to detect, respond to, and recover from failures automatically.
Role Mission:
Reduce operational toil by building software, not manual processes.
Apply senior software engineering expertise to reliability problems at scale.
Key Responsibilities:
- Design and build automation tools and internal platforms to improve system reliability
- Implement self-healing mechanisms that automatically detect and remediate failures
- Write effective glue code that connects monitoring signals to remediation actions
- Eliminate repetitive and manual operational work through automation
- Debug and resolve complex system-level and distributed system issues
- Collaborate closely with SRE, Platform, and Engineering teams to improve reliability through code
- Contribute to best practices for reliability, observability, and automation across the organization
Skills & Experience:
Core Requirements
- Strong background as a Senior Software Engineer (not operations-only)
- Excellent coding and design skills, with a focus on automation and tooling
- Python (mandatory) for automation, tooling, and integration
- Experience building production-grade tools and services
- Solid understanding of distributed systems, reliability patterns, and failure modes
- Experience integrating automation with monitoring, alerting, and observability systems
- Strong debugging skills across application and infrastructure layers
Strong Advantages
- C# / .NET experience
- Experience with:
  - APIs and service integrations
  - Event-driven architectures
  - Queues and asynchronous systems
Engineering Practices
- Git-based workflows
- CI/CD pipelines and automated deployments
- Infrastructure-aware development mindset
Cloud & Platform Experience:
- Cloud experience required
- Strong hands-on experience with Microsoft Azure
- Familiarity with cloud-native architectures and managed services