We are looking for a Staff Engineering Manager to run the Vulcan backend platform: the serving, delivery, and runtime layer that other product teams build on.
This role is closer to running a SaaS product than running a traditional infrastructure team: the platform has a defined catalog of capabilities, real tenants with real expectations, and explicit guarantees on latency, availability, throughput, and cost. The role works closely with Product, Applied Science, and other engineering managers across the org, and is accountable for both the architectural direction and the operational reliability of the platform.
What you would own:
- The Vulcan charter — the capability catalog: what the platform offers, what it doesn't, what's on the roadmap, and what's being deprecated.
- SLOs tenants rely on — define, publish, and hold the guarantees on latency, availability, throughput, and cost, and the observability and operational practices that make them real.
- Architectural direction — lead design reviews, raise the bar on interface design and rollout safety, and keep the platform from fragmenting into team-specific implementations.
- A self-service capability catalog — so tenants discover, onboard, and operate capabilities without the platform team in the room (in the spirit of Stripe's internal platforms or Spotify's Backstage).
- Team health and growth — stabilize and run a team of 7–10 engineers, grow Senior engineers into tech leads, and protect the team's focus.
- Roadmap, prioritization, and operational excellence — own trade-offs in the open, and turn incidents into systemic fixes rather than burnout.
What we're looking for:
- 10+ years in software engineering, with meaningful time leading teams that built and operated platforms, infrastructure products, or shared services for other engineering teams.
- A track record of running teams of 7–15 engineers across levels — hiring, performance, and growing seniors into tech leads.
- Experience defining and holding SLOs that other teams genuinely depended on, and building the operational culture behind them.
- Experience building self-service platforms that internal tenants chose to adopt. Running a SaaS product (internal or external) is a strong plus.
- Architectural depth in backend and distributed systems at consumer scale: multi-tenant isolation, observability, rollout safety, and lifecycle management.
- Working knowledge of modern infrastructure and AI ecosystems (orchestration, streaming/batch data, model and inference services, and cloud-native cost characteristics).
- Strong written and verbal communication, and the ability to lead through influence rather than authority.