recrew ai

Backend & Platform Architect – High-Throughput Streaming

  • Posted 12 hours ago

Job Description

Role: Backend & Platform Architect – High-Throughput Streaming

Function: Backend Engineering / Platform Architecture

Location: Mumbai

Type: Full-time

Industry: Consumer AI

About Company

The company is building the AI layer for India at massive scale. Backed by partnerships with global tech leaders like Meta and Google, it serves the entire Indian user base—across languages, contexts, and daily needs.

The platform is engineered from day one for 100M+ users, with 1B-ready constraints on latency, cost, reliability, and safety. It combines deep India-first AI capability with unmatched India-scale distribution.

The culture emphasises engineering excellence, high autonomy, and tangible impact across sectors that matter to India.

Position Overview

We're looking for a Backend & Platform Architect who builds high-throughput, low-latency systems that handle millions of concurrent voice and data streams without breaking a sweat. You'll own the core streaming infrastructure that powers the company's consumer AI platform—from WebRTC voice pipelines to gRPC orchestration layers. This is a hands-on IC role for someone who treats sub-10ms p99 latency as a personal standard, not a stretch goal.

Role & Responsibilities

• Design and own high-throughput streaming APIs handling millions of concurrent voice and data connections

• Build and optimize WebRTC-based voice pipelines for real-time, low-latency AI interactions at consumer scale

• Architect gRPC and WebSocket service layers for bidirectional, streaming AI orchestration across modalities

• Implement backpressure, flow control, and adaptive bitrate mechanisms to maintain stability under peak load

• Drive performance profiling and optimization cycles targeting sub-50ms end-to-end latency at p99

• Build observability tooling—distributed tracing, stream health dashboards, and latency SLO alerting

• Collaborate with AI/ML platform teams to integrate model inference endpoints into the streaming pipeline efficiently

Must Have Criteria

• 5–8 years of backend engineering experience, including at least 2 years building high-throughput streaming systems in Go or Rust

• Production experience with WebRTC—media servers, signaling, ICE/STUN/TURN, and codec negotiation

• Hands-on experience designing and operating gRPC services with bidirectional streaming at scale

• Demonstrated experience building systems handling 100K+ concurrent connections or equivalent stream volume

• Deep understanding of TCP/UDP internals, congestion control, and network-layer performance tuning

• Experience with async/concurrent programming patterns—goroutines, channels, or Rust async runtimes

• Track record of owning latency and reliability SLOs in production (p99 latency targets, uptime commitments)

Nice to Have

• Experience building or operating media servers (e.g., Mediasoup, Janus, Pion) at consumer scale

• Prior work at a consumer internet company with 10M+ active users (voice/video product preferred)

• Familiarity with AI inference serving—integrating streaming LLM outputs (token streaming) into real-time pipelines

• Experience with QUIC/HTTP3 for low-latency transport in mobile-first environments

• Open source contributions to networking, streaming, or infrastructure tooling

What We Offer

• Opportunity to architect streaming infrastructure for 100M+ users from day one—not a future roadmap item

• Small, high-ownership pods with direct accountability and zero bureaucracy

• Work alongside AI researchers and platform engineers building India's most ambitious consumer AI product

• Competitive compensation with meaningful equity in a high-conviction, well-backed venture

• Locations in Mumbai or Bangalore with a fast-iteration, production-first engineering culture

Job ID: 149070659