AI Systems Engineer

AbleCredit - GenAI Infra for BFSI

Pune, India

4-8 Years

Posted 8 days ago

Job Description

SDE 2 / SDE 3 - AI Infrastructure & LLM Systems Engineer

Location: Pune / Bangalore (India)

Experience: 4-8 years

Compensation: No bar for the right candidate

Bonus: Up to 10% of base

About The Company

AbleCredit builds production-grade AI systems for BFSI enterprises, reducing OPEX by up to 70% across onboarding, credit, collections, and claims.

We run our own LLMs on GPUs, operate high-concurrency inference systems, and build AI workflows that must scale reliably under real enterprise traffic.

Role Summary (What We're Really Hiring For)

We are looking for a strong backend / systems engineer who can:

Deploy AI models on GPUs
Expose them via APIs
Scale inference under high parallel load using async systems and queues

This is not a prompt-engineering or UI-AI role.

Core Responsibilities

Deploy and operate LLMs on GPU infrastructure (cloud or on-prem).
Run inference servers such as vLLM / TGI / SGLang / Triton or equivalents.
Build FastAPI / gRPC APIs on top of AI models.
Design async, queue-based execution for AI workflows (fan-out, retries, backpressure).
Plan and reason about capacity and scaling: GPU count vs RPS, batching vs latency, cost vs throughput.
Add observability around latency, GPU usage, queue depth, failures.
Work closely with AI researchers to productionize models safely.

Must-Have Skills

Strong backend engineering fundamentals (distributed systems, async workflows).
Hands-on experience running GPU workloads in production.
Proficiency in Python (Golang acceptable).
Experience with Docker + Kubernetes (or equivalent).
Practical knowledge of queues / workers (Redis, Kafka, SQS, Celery, Temporal, etc.).
Ability to reason quantitatively about performance, reliability, and cost.

Strong Signals (Recruiter Screening Clues)

Look For Candidates Who Have