Riverbed Technology

Lead Site Reliability Engineer


Job Description

Riverbed, the leader in AIOps for observability, helps organizations optimize their users' experiences by leveraging AI automation for the prevention, identification, and resolution of IT issues. With over 20 years of experience in data collection, AI, and machine learning, Riverbed's open and AI-powered observability platform and solutions optimize digital experiences and greatly improve IT efficiency. Riverbed also offers industry-leading Acceleration solutions that provide fast, agile, secure acceleration of any app, over any network, to users anywhere. Together with our thousands of market-leading customers globally – including 95% of the FORTUNE 100 – we are empowering next-generation digital experiences. Learn more at riverbed.com.

Position

Position: Lead Site Reliability Engineer

Location: Bangalore

Join Riverbed Technology and be part of shaping the future of digital experience management!

At Riverbed Technology, we are on a mission to help the world's leading enterprises deliver superior digital experiences. Our Digital Experience Management (DEM) solutions provide deep visibility, AI-driven insights, and performance optimization across complex, global infrastructures.

We are expanding our Site Reliability Engineering (SRE) team and looking for an experienced SRE Lead in India to drive reliability, scalability, and operational excellence across our production environments. This is a unique opportunity to join a global company, lead technical initiatives, mentor engineers across Israel, the US, and beyond, and be instrumental in keeping Riverbed's SaaS solutions reliable and trusted by customers worldwide.

What You Will Do

  • Lead incident response and resolution - coordinate investigations during critical production incidents, drive root cause analysis, and ensure rapid resolution
  • Architect and implement reliability solutions - design and deploy infrastructure improvements, automation frameworks, and observability systems to prevent issues proactively
  • Own production stability initiatives - drive strategic projects that improve system resilience, reduce MTTR, and optimize infrastructure performance
  • Mentor and guide SRE team members - provide technical leadership, conduct code/design reviews, and develop team capabilities
  • Lead post-incident reviews and blameless postmortems - facilitate learning, document findings, and drive continuous improvement in incident response playbooks
  • Collaborate with DevOps and Engineering leadership - partner with cross-functional teams to influence architectural decisions and reliability standards
  • Establish and track SLIs/SLOs/SLAs - define reliability metrics, implement monitoring strategies, and drive data-driven operational improvements
  • Participate in and help coordinate global on-call rotation - ensure continuous coverage and mentor team members on escalation procedures

What Makes You An Ideal Candidate

  • 4+ years of hands-on experience with AWS - expert-level knowledge of EC2, ECS, EKS, RDS, S3, VPC, Load Balancing, CloudFormation, and multi-account strategies
  • Strong leadership and mentorship experience - proven track record of leading technical initiatives and developing engineering talent
  • Expert-level proficiency in Linux systems administration and performance tuning
  • Advanced experience with infrastructure-as-code - Terraform and Ansible in production environments at scale
  • Deep expertise in container orchestration - Kubernetes (K8S) and ECS, including cluster management, scaling strategies, and troubleshooting
  • Strong CI/CD pipeline design and implementation experience (Jenkins, GitLab CI, or similar)
  • Advanced knowledge of the observability stack - CloudWatch, Prometheus, Grafana, ELK/EFK, Datadog, or equivalent platforms
  • Expert networking skills - DNS, load balancing, TLS/SSL, VPNs, service mesh architectures, and complex connectivity troubleshooting
  • Automation and scripting proficiency - Python, Bash, or Go for building tools and automation frameworks
  • Excellent communication and technical documentation skills - able to clearly articulate complex technical concepts to both technical and non-technical stakeholders
  • Experience with DORA metrics and SRE best practices - understanding of error budgets, toil reduction, and reliability engineering principles

Nice to Have

  • Background in security and compliance (SOC2, ISO, FedRAMP)
  • Contributions to open-source SRE/DevOps projects
  • Experience with multi-region, high-availability architectures
  • Knowledge of FinOps and cloud cost optimization at scale
  • Familiarity with GitOps practices (ArgoCD, Flux)

What We Offer

Our employee benefits, including flexible workplace policies, employee resource groups, learning and development resources, career progression pathways, and community engagement initiatives, are some of the reasons why we have had great success in bringing in new talent. In addition, our global employee wellness programs are crafted to support the physical, emotional, and financial well-being of our employees.

Benefits and perks vary by country.

About Riverbed

With a 20-year history of innovation, Riverbed is agile and proven, and we continue to disrupt the market with differentiated solutions that help customers deliver secure, seamless digital experiences and accelerate enterprise performance. We are relentlessly customer-first: we listen, learn, and act with urgency to solve real problems and earn trust every day. We pair bold ideas with operational efficiency, simplifying how we work, focusing on what matters most, and delivering with quality and speed. Fueled by a will to win, we set ambitious goals, hold ourselves accountable, and raise the bar through measurable outcomes. At the center of it all are our people, bringing our best selves to work with a shared commitment to excellence, transparency, and open communication. We strive to be an inclusive, fair, and enjoyable workplace where respect and wellbeing are prioritized. We are committed to our people, partners, and customers while supporting the communities where we work and live. It's the Power of WE that binds us together and drives high-impact success.

Riverbed is an equal employment opportunity/Affirmative Action (EEO/AA) employer and provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, gender, sexual orientation, gender identity or expression, national origin, age, physical disability (including HIV and AIDS), mental disability, medical condition, pregnancy or childbirth (including breastfeeding), genetics, genetic information, marital status, veteran status or any other basis protected by and in accordance with applicable federal, state and local laws.

Check Us Out On

www.riverbed.com

@LifeAtRiverbed

Job ID: 147534135
