Qubrid AI

AI Architect Engineer (Hands-on coder) WFH

3-5 Years

Job Description

Read everything carefully. The requirements and screening questions are critical: answers that are incorrect or unsatisfactory will result in auto-rejection and waste your time.

  • Work from Home.
  • This is a full-time role. If you plan to hold two or more jobs at the same time, or want to do this part-time, that won't work for us. In that case please do not apply, as your application will be auto-rejected.
  • Note: this job requires working late at night, until 4 AM India time, to overlap with US working hours. Do not apply if this timing doesn't work for you.
  • Salary depends on experience and on current compensation, verifiable through paychecks.
  • Mid-level candidates with 3-5 years of experience are suitable.

AI Architect (Hands-On) — Multi-Agent AI Platform

About Qubrid AI

Qubrid AI is building next-generation AI infrastructure focused on inference, GPUs, multi-model orchestration, and scalable AI deployments. Our mission is simple: democratize access to AI infrastructure - from developers spending their first $5 to enterprise-scale AI deployments processing billions of inference requests. We are looking for a deeply technical AI Architect who can design and build production-grade AI systems end-to-end, not just create architecture diagrams.

This role is for builders. You should be equally comfortable:

  • writing production Python code
  • optimizing inference pipelines
  • working with open-source models
  • building multi-agent systems
  • designing scalable backend architectures
  • deploying AI systems into production

If you are primarily theoretical or management-focused, this role is probably not the right fit.

What You'll Build

You will help architect and build a full-stack multi-agent AI SaaS platform including:

  • Multi-agent orchestration systems
  • AI inference pipelines
  • Fine-tuning workflows
  • RAG systems
  • Tool-calling architectures
  • Memory and context management systems
  • Model routing and optimization layers
  • Backend APIs and distributed systems
  • GPU-aware inference infrastructure
  • Enterprise-grade scalable deployments

This is a highly hands-on engineering role where architecture and implementation go together.

Responsibilities

AI Systems & Multi-Agent Architecture

  • Design and build production-grade multi-agent AI systems
  • Develop orchestration frameworks for autonomous workflows
  • Implement agent communication, memory, planning, and tool usage
  • Build scalable RAG and retrieval pipelines
  • Design long-context and multi-modal workflows

Inference & Model Infrastructure

  • Optimize inference pipelines for latency and throughput
  • Work with open-source models including Llama, Qwen, Kimi, Mistral, DeepSeek, Gemma, Flux, SDXL, and other frontier/open models
  • Implement model serving infrastructure using technologies like vLLM, TensorRT-LLM, TGI, Ollama, SGLang, and Ray Serve
  • Build intelligent model routing and fallback systems
  • Improve GPU utilization and inference efficiency

Fine-Tuning & Model Optimization

  • Build and manage fine-tuning pipelines
  • Work with LoRA / QLoRA, PEFT, RLHF/RLAIF concepts, quantization, and distillation
  • Evaluate models across latency, quality, and cost tradeoffs

Backend & Platform Engineering

  • Develop scalable backend systems using Python
  • Design APIs, microservices, async workflows, and distributed systems
  • Build production-grade SaaS architecture
  • Implement observability, logging, monitoring, and reliability systems
  • Work with vector databases, caching systems, queues, and storage layers

Deployment & Infrastructure

  • Deploy AI systems on cloud and GPU infrastructure
  • Work with Kubernetes, Docker, and scalable orchestration systems
  • Build highly available inference infrastructure
  • Optimize infrastructure costs and scalability

Requirements

General requirements

  • 3-5 years of experience in AI architecture and system design
  • Strong hands-on Python expertise
  • Proven experience building production AI systems
  • Experience with LLM inference optimization
  • Deep understanding of transformer architectures and modern LLM ecosystems
  • Experience with open-source model deployment
  • Strong backend engineering experience
  • Experience designing scalable SaaS platforms
  • Experience with APIs, async systems, and distributed architectures
  • Strong debugging and systems-thinking ability

AI/ML Experience

  • Multi-agent systems
  • RAG architectures
  • Fine-tuning pipelines
  • Embeddings and vector databases
  • Tool-calling frameworks
  • Model evaluation and benchmarking
  • Prompt orchestration and workflow systems

Infrastructure Experience

  • Docker
  • Kubernetes
  • GPU infrastructure
  • CI/CD pipelines
  • Cloud platforms (AWS/GCP/Azure)
  • Distributed inference systems

What We're Looking For

We are specifically looking for engineers who:

  • build things themselves
  • move fast
  • can go from idea to production
  • understand both AI and systems engineering
  • can architect and implement
  • are comfortable operating in ambiguity
  • care about performance and scalability
  • are obsessed with execution

This is not a slide-deck architect role.

You should be able to:

  • write production code daily
  • review system bottlenecks
  • optimize inference performance
  • debug distributed systems
  • build MVPs rapidly
  • scale products into production systems

Bonus Points

  • Experience building AI SaaS products from scratch
  • Experience with agentic frameworks
  • Experience with GPU optimization
  • Contributions to open-source AI projects
  • Experience with large-scale inference systems
  • Startup experience
  • Experience working with high-growth engineering teams

If you want to help shape the future of AI infrastructure and build systems that can scale from startup experimentation to enterprise deployments, we'd love to talk.

Job ID: 148314243