ideas revenue solutions

Senior Site Reliability Engineer

  • Posted 29 days ago

Job Description

We're looking for strong Java engineers who've owned production systems and want to focus on reliability, scalability, and resilience.

This is for you if you've:

  • Built and operated large-scale Java/JVM services
  • Carried on-call and handled real production incidents
  • Debugged JVM, GC, latency, and concurrency issues under pressure
  • Implemented resilience patterns (circuit breakers, timeouts, graceful degradation)

What you'll do:

  • Own availability, latency, and reliability of critical services
  • Improve systems through code-level reliability work, not just infrastructure changes
  • Define SLIs/SLOs, lead incident reviews, reduce toil
  • Partner with product teams to design for failure

Reliability mindset (what differentiates this role):

  • Experience implementing or driving:
      • Circuit breakers, bulkheads, rate limiting, backpressure
      • Graceful degradation and fallback strategies
  • Familiarity with observability concepts:
      • Metrics (e.g., latency percentiles, saturation)
      • Distributed tracing
      • Health checks & readiness probes

Nice to have (not mandatory):

  • Exposure to SRE / Platform / Production Engineering
  • Kubernetes, observability, or chaos engineering experience

Software-first SRE role | Real ownership | Strong growth into reliability leadership

Job ID: 147231431