In this role, you will be responsible for taking a foundation language model and turning it into a production-ready AI system. This includes fine-tuning, inference optimization, deployment workflows, observability, and long-term operability.
You will work closely with the founding team to translate product intent into clear system architecture, ensuring the platform is reliable, controllable, and extensible. This is a hands-on individual contributor role with architectural responsibility.
Key Responsibilities
- Design and operate infrastructure for fine-tuning, serving, and running LLMs
- Build and optimize inference pipelines with a focus on latency, throughput, and stability
- Implement observability for:
  - Model behavior and performance
  - Latency and throughput
  - GPU utilization and memory efficiency
- Own the complete model lifecycle:
  - Experimentation
  - Deployment
  - Versioning
  - Rollback and iteration
- Design and maintain tool-driven AI execution flows (intent → policy → action)
- Support multimodal pipelines, including video and vision processing where required
- Create reproducible packaging, release, and deployment workflows
- Ensure the system is secure, predictable, and maintainable as it scales
Minimum Qualifications
- 8+ years of experience in LLMOps, MLOps, AI systems, or infrastructure engineering
- Strong proficiency in Python with production-grade coding practices
- Deep understanding of transformer architectures, tokenization, attention mechanisms, batching, and memory behavior
- Hands-on experience running LLMs in production environments
- Experience with GPU inference, performance tuning, and failure handling
- Familiarity with containerized deployments and orchestration
- Ability to work independently and take full technical ownership
Preferred Qualifications
- Experience operating AI systems that combine language, vision, or video pipelines
- Familiarity with model optimization techniques (quantization, low-precision inference, compilation)
- Experience designing internal AI APIs or system services
- Exposure to secure, controlled model deployment environments
- Background in building systems that others later scale and extend
What This Role Requires
- Comfort with ambiguity and early-stage systems
- Willingness to be the primary technical owner in Phase-1
- Strong system-level thinking over isolated model experimentation
This role is not focused on people management or research publishing.
What Kalbii Offers
- 18-40+ LPA (market-aligned, experience-based)
- Full technical ownership during Phase-1
- Direct collaboration with founders
- Clear growth path into AI Systems Lead / Chief Architect roles
- Opportunity to define the foundation that future teams will build on
Work Location
- On-site only
- No remote or contract roles
How to Apply
Submit your profile to:
[Confidential Information]
Subject: Senior LLM Systems & Ops Engineer - Kalbii
This role is intended for engineers who want to build and operate the core system, not just integrate components.