Python Backend Developer - Generative AI
A fast-scaling company in the Generative AI and cloud-native backend services sector, building production-grade LLM-powered applications and APIs for enterprise workflows. The team focuses on delivering low-latency, secure inference services, reliable data pipelines, and seamless integration of GenAI capabilities into product surfaces.
Location: India (on-site; candidates must be based in, or able to work from, the assigned office).
Role & Responsibilities
- Design, implement, and maintain scalable Python backend services and REST/gRPC APIs to serve GenAI-based features in production.
- Integrate LLMs and embedding pipelines into backend systems using GenAI toolkits (e.g., LangChain, Hugging Face) and vector search solutions.
- Build and optimize inference endpoints: implement batching, caching, rate-limiting, and request routing to minimize latency and cost.
- Develop data ingestion and pre/post-processing pipelines for model inputs/outputs; instrument logging, metrics, and tracing for observability.
- Containerize services, author CI/CD pipelines, and deploy/manage workloads on Kubernetes and cloud platforms; ensure reliability and autoscaling.
- Maintain high code quality through automated tests and code reviews; collaborate closely with ML engineers and product teams to ship robust features.
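To give a flavour of the serving work described above, here is a minimal sketch of response caching and rate limiting in plain Python. The names (`TokenBucket`, `infer`, `handle`) are illustrative only; a production service would layer these behind its API framework and back them with shared infrastructure rather than process-local state.

```python
import time
from functools import lru_cache


class TokenBucket:
    """Token-bucket rate limiter: refill at `rate` tokens/second,
    allowing bursts up to `capacity` requests."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


@lru_cache(maxsize=1024)
def infer(prompt: str) -> str:
    """Stand-in for an LLM call; lru_cache provides exact-match
    response caching for repeated prompts."""
    return f"response to: {prompt}"


bucket = TokenBucket(rate=10.0, capacity=5)


def handle(prompt: str) -> str:
    """Gate each request through the rate limiter before inference."""
    if not bucket.allow():
        raise RuntimeError("429: rate limit exceeded")
    return infer(prompt)
```

In a real deployment the cache would typically live in Redis and the limiter in a shared store or gateway, so limits apply across replicas rather than per process.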
Skills & Qualifications
Must-Have
- Proven experience building production Python backend services (FastAPI / equivalent) and designing REST/gRPC APIs.
- Hands-on integration experience with GenAI frameworks or toolchains (e.g., LangChain, Hugging Face integrations).
- Strong database experience: PostgreSQL (schema design, indexing) and Redis (caching, pub/sub).
- Containerization and orchestration: Docker and Kubernetes; experience with CI/CD and deployment automation.
- Practical knowledge of cloud platforms (AWS/Azure/GCP) and monitoring/observability tooling (Prometheus, Grafana, ELK).
- Solid engineering fundamentals: testing, API security, performance tuning, and production troubleshooting.
Preferred
- Experience with PyTorch or TensorFlow for model serving or inference optimization.
- Familiarity with vector search / similarity tooling (FAISS, Pinecone, Milvus).
- Background in ML inference engineering, model quantization, or accelerating low-latency inference.
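The vector-search tooling mentioned above all reduces to nearest-neighbour lookup over embeddings. A brute-force sketch of the core idea, in plain Python with illustrative names (`top_k`, a dict as the "index"):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k vectors most similar to the query.
    This linear scan is what FAISS, Pinecone, or Milvus replace with
    approximate index structures at scale."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]
```

The linear scan is O(n) per query; the dedicated libraries trade a little recall for sub-linear lookup, which is what makes them relevant to low-latency inference.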
Benefits & Culture Highlights
- Collaborative, on-site engineering teams with direct exposure to product and ML research; fast decision cycles.
- Opportunities for technical ownership, mentorship, and career growth working on production GenAI systems.
- Competitive compensation, upskilling budget, and regular knowledge-sharing sessions focused on LLMs and backend excellence.
How to apply: Submit a resume highlighting relevant backend and GenAI integrations, examples of production services you built, and your preferred contact details. We seek pragmatic engineers who can deliver reliable, low-latency GenAI backend systems on-site in India.
Skills: Docker, PostgreSQL, Kubernetes, Python, Redis, FastAPI