Senior Software Engineer - Backend (Gen AI & Audio Production Systems)
Location: Bengaluru
Experience: 3 - 6 years
About Pocket FM
Pocket FM, founded in 2018, is India's leading audio storytelling platform, transforming the way millions consume stories. Offering high-quality serialized content across genres such as Romance, Drama, Thriller, Fantasy, Sci-Fi and Mythology in eight languages, Pocket FM has built a strong global presence with over 200 million listeners worldwide. With users spending an average of 120 minutes daily on the platform, it has emerged as one of the fastest-growing audio platforms, rapidly expanding its reach across the US, Europe, LATAM and Southeast Asia.
Our unique model combines free listening with micropayments for premium content, powering strong business growth. In FY25, we reached an ARR of INR 2,000 crore, with over 100,000 hours of content on the platform. We are at the forefront of innovation, leveraging AI-generated content to scale efficiently and enable new creative workflows in audio storytelling.
Role Overview
We are seeking a highly motivated Senior Software Engineer to join our Generative AI research team, with a primary focus on AI-driven audio generation and text-to-speech systems. This role sits at the intersection of research, engineering and creative tooling.
You will build and evolve AI-powered audio pipelines, enable iterative, patch-based audio revisions and contribute to the development of an AI-driven audio workstation. The role involves working in ambiguous problem spaces, experimenting with new approaches and translating research prototypes into reliable, production-ready systems.
This is a hands-on builder role where depth in engineering fundamentals, system design and code architecture is essential. The role includes technical ownership and solution leadership within defined problem areas.
Key Responsibilities
- Design, develop and maintain backend systems that support large-scale, AI-driven audio generation and speech synthesis workflows.
- Own the design and implementation of complex services and subsystems, ensuring clean architecture, maintainability and scalability.
- Work closely with AI researchers and audio engineers to integrate text-to-speech and generative audio models into production pipelines.
- Lead solution design for scoped problem areas, translating ambiguous requirements into clear technical designs and implementation plans.
- Experiment with prompts, conditioning strategies and model parameters to improve speech naturalness, expressiveness and consistency.
- Build systems that enable iterative, patch-based audio revisions rather than full regeneration, supporting efficient creative workflows.
- Develop tools and services for previewing, evaluating, editing and regenerating audio assets.
- Contribute to research prototyping, including rapid experimentation with new models, APIs or approaches to audio generation.
- Own the transition of successful research experiments into scalable, observable and fault-tolerant production services.
- Write clean, maintainable and well-tested code, following strong engineering practices including code reviews and CI pipelines.
- Collaborate cross-functionally with product, content and platform teams to align technical solutions with creative and business needs.
- Design and implement scalable, distributed backend systems, integrating cloud services, caching and storage solutions to support high-throughput, AI-driven features.
- Ensure security, fault tolerance and observability of backend services.
- Collaborate with AI teams to integrate model APIs and orchestrate inference workflows.
Preferred Qualifications
- Proficiency in Python or Go (Golang) and backend frameworks (FastAPI, Django, Gin, Echo).
- Strong understanding of system design, service architecture and backend code structure.
- Solid understanding of distributed systems, scaling and cloud infrastructure (AWS, GCP or Azure).
- Exposure to modern speech or audio generation models and frameworks.
- Experience building or contributing to creative tools, enterprise platforms or authoring workflows.
- Experience integrating AI or ML models into production systems.
- Experience with LLM APIs (OpenAI, Anthropic, etc.) and complex AI integration workflows.
- Familiarity with model orchestration or experimentation frameworks.
- Familiarity with containerization and deployment using Docker and Kubernetes.
- Knowledge of observability tools and best practices for monitoring AI-driven systems.
Experience with some, though not necessarily all, of the above is expected.
Nice-to-Have Skills
- Experience with message queues (Kafka, RabbitMQ) and event-driven architectures.
- Knowledge of security best practices for API and backend systems.
- Observability experience (Prometheus, Grafana, Datadog).
What We Value
- Strong problem-solving skills and comfort working in research-oriented, evolving problem spaces.
- A mindset that balances experimentation speed with engineering rigor.
- Clear communication and a collaborative approach to cross-functional work.
- Curiosity about audio, storytelling and the creative applications of Generative AI.
Why Join Pocket FM
This role offers a rare opportunity to shape the future of AI-driven audio storytelling at global scale. You will work on problems that directly impact how stories are created, refined and experienced by millions of listeners worldwide, while helping define the next generation of creative audio tools.
You can find more updates, insights and behind-the-scenes content about Pocket FM here: Pocket FM