This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer, Tenant Services: Geo in India.
This role is focused on ensuring the reliability, scalability, and operational excellence of large-scale distributed systems that support data replication and disaster recovery workflows for enterprise customers. You will join a high-impact SRE team responsible for executing and improving complex migration and cutover processes in a SaaS environment. The position blends deep infrastructure engineering with hands-on operational work, including incident response, automation, and observability. You will help ensure that critical customer data migrations are safe, repeatable, and increasingly low-risk over time. Working in a fully remote, global setup, you will collaborate closely with multiple engineering, support, and infrastructure teams. The environment is fast-paced, highly collaborative, and driven by strong engineering values and automation-first thinking.
Accountabilities
- Execute end-to-end data migrations and cutovers, including planning, validation, execution, and post-cutover verification and cleanup activities.
- Participate in on-call rotations and shift coverage to handle incidents, ensure system availability, and support live migration events across global time zones.
- Operate and improve replication and migration systems, including data hygiene checks, validation workflows, and escalation handling.
- Design and maintain automation, tooling, and runbooks to reduce operational complexity and make processes repeatable and reliable.
- Build and enhance observability systems, including monitoring, alerting, dashboards, and SLO tracking for migrations and system health.
- Collaborate with multiple engineering and support teams to improve reliability, capacity planning, and disaster recovery processes.
- Contribute to incident response, post-incident reviews, and root cause analysis, ensuring learnings are converted into long-term improvements.
- Continuously reduce operational toil through automation and process optimization.
Requirements
- Strong experience operating large-scale, highly available distributed systems in a SaaS or cloud environment.
- Hands-on experience with major cloud platforms, including networking, compute, storage, and managed services.
- Solid Kubernetes experience, including deployment, troubleshooting, and ecosystem tooling such as Helm.
- Proficiency with infrastructure as code and configuration tools such as Terraform, Ansible, or Chef.
- Strong programming ability in at least one language (preferably Go or Ruby) plus scripting skills in Python or Shell.
- Experience with observability stacks such as Prometheus, Grafana, and logging systems for troubleshooting and performance analysis.
- Exposure to data replication, backup/restore, or migration scenarios where data integrity and downtime risk are critical.
- Experience working in on-call environments and handling production incidents under pressure.
- Strong communication skills with the ability to engage customers during migrations and incidents.
- Ability to work independently in a remote, asynchronous environment with a strong ownership mindset.
- Strong problem-solving skills with a focus on long-term system improvements, not just short-term fixes.
Benefits
- Flexible Paid Time Off
- Equity compensation and Employee Stock Purchase Plan
- Growth and Development Fund
- Parental leave
- Home office support
- Team Member Resource Groups
- Global remote-first working environment
- Inclusive and values-driven culture
How Jobgether Works
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.