
Mako IT Lab

Generative AI Engineer


Job Description

About MAKO:

Founded in 2013, Mako IT Lab is a global software development company with a strong presence across the USA, UK, India, and Nepal. Over the years, we've partnered with companies globally, helping them solve complex challenges and build meaningful digital experiences.

What truly defines Mako is our culture. We believe in creating an environment where people feel empowered to take ownership, exercise freedom in their ideas, and contribute to solutions that genuinely make an impact. Learning is at the heart of who we are: our teams constantly grow through hands-on exposure, real-world problem solving, and continuous knowledge sharing across functions and geographies.

We don't just build long-term partnerships with clients; we build long-term careers for our people. At Mako, you'll be part of a collaborative, supportive, and fast-growing global team where curiosity is encouraged, initiative is celebrated, and every individual plays a meaningful role in shaping the company's journey.

Role Overview:

We are seeking an experienced AI Engineer with deep expertise in LLM-driven architectures, RAG systems, agentic workflows, and multimodal AI development. The ideal candidate will be skilled in building scalable AI pipelines using FastAPI, Kafka, FastMCP, and Tavily Web Search, and will have hands-on experience with vLLM-based inference and Stable Diffusion pipelines. You will architect and implement intelligent systems leveraging Large Language Models, vision models, and autonomous agents, with a strong focus on observability, performance, and production reliability.

Key Responsibilities:

1. LLM, vLLM & Agentic System Development

Build autonomous LLM agents using LangChain, LangGraph, and FastMCP.

Develop RAG workflows using embeddings, vector stores, and knowledge-grounded reasoning.

Integrate vLLM / SGLang / other high-throughput inference backends for low-latency model serving (see the sketch after this list).

Implement Tavily web-search integrations for real-time knowledge augmentation.

Optimize inference using quantized GGUF, tensorized formats, and GPU-accelerated pipelines.
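
For a sense of how these pieces fit together, here is a minimal sketch of a RAG call against a model served through vLLM's OpenAI-compatible endpoint. The embedding model, Qdrant collection, served model name, and endpoint URLs are all placeholder assumptions, not a prescribed stack:

```python
# Minimal RAG + vLLM sketch; endpoints, model names, and the collection are placeholders.
from openai import OpenAI
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")            # example embedding model
vectors = QdrantClient(url="http://localhost:6333")           # example vector store
llm = OpenAI(base_url="http://localhost:8000/v1",             # vLLM's OpenAI-compatible server
             api_key="unused")

def answer(question: str) -> str:
    # Embed the question and pull the closest chunks from the vector store.
    hits = vectors.search(
        collection_name="docs",                               # placeholder collection
        query_vector=embedder.encode(question).tolist(),
        limit=3,
    )
    context = "\n".join((hit.payload or {}).get("text", "") for hit in hits)

    # Ground the model's answer in the retrieved context.
    response = llm.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",             # whatever model vLLM is serving
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer("What does the onboarding document say about VPN access?"))
```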

2. Multimodal & Image Generation Systems

Build and deploy Stable Diffusion (SDXL / SD 1.5 / ControlNet / T2I) pipelines for image generation tasks (see the sketch after this list).

Integrate LoRAs, control modules, and diffusion-based fine-tuning for custom domains.

Develop multimodal agents that combine LLM reasoning with vision tasks such as classification, captioning, or image prompts.
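
As an illustration of the image-generation side, the sketch below loads an SDXL pipeline with Hugging Face diffusers and attaches a LoRA adapter. The base model ID is the public SDXL checkpoint, while the LoRA repository and prompt are hypothetical:

```python
# Minimal SDXL + LoRA sketch using diffusers; the LoRA path and prompt are placeholders.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Attach a domain-specific LoRA adapter on top of the base weights.
pipe.load_lora_weights("your-org/product-photography-lora")   # hypothetical adapter

image = pipe(
    prompt="studio photo of a ceramic coffee mug, soft lighting",
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("mug.png")
```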

3. Backend & Infrastructure Engineering

Build robust FastAPI services for orchestrating LLMs, Stable Diffusion, retrieval, and agentic tasks.

Develop event-driven workflows using Kafka for distributed AI systems (see the sketch after this list).

Implement auditing, agent-output monitoring, and API-layer logging for end-to-end traceability.
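
A minimal sketch of this pattern, assuming a local Kafka broker, an example topic name, and a simple request schema (none of which are specified in this posting):

```python
# Minimal FastAPI + Kafka sketch: accept a generation request, log it, and emit an event
# for downstream workers. Broker address, topic name, and payload shape are assumptions.
import json
import logging
import uuid

from confluent_kafka import Producer
from fastapi import FastAPI
from pydantic import BaseModel

logger = logging.getLogger("ai-gateway")
app = FastAPI()
producer = Producer({"bootstrap.servers": "localhost:9092"})   # example broker

class GenerateRequest(BaseModel):
    prompt: str
    task: str = "chat"   # e.g. "chat", "image", "rag"

@app.post("/generate")
def generate(req: GenerateRequest) -> dict:
    request_id = str(uuid.uuid4())
    # API-layer logging for end-to-end traceability.
    logger.info("request %s task=%s prompt_len=%d", request_id, req.task, len(req.prompt))

    # Publish an event; a separate worker consumes it and runs the LLM / diffusion job.
    producer.produce(
        "ai.requests",                                          # example topic
        key=request_id,
        value=json.dumps({"id": request_id, **req.model_dump()}).encode("utf-8"),
    )
    producer.flush()
    return {"request_id": request_id, "status": "queued"}
```

Keeping generation work behind a queue like this lets GPU workers scale independently of the API layer.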

4. High-level API & Third-party Integrations

Integrate third-party services: authentication, analytics, search APIs, cloud inference APIs, and enterprise data sources.

Build secure and scalable API layers for production deployments.
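
One common way to secure such an API layer is an API-key check enforced as a FastAPI dependency; the header name and key lookup below are illustrative assumptions, not a specified design:

```python
# Minimal API-key-protected layer in FastAPI; header name and key source are placeholders.
import os

from fastapi import Depends, FastAPI, HTTPException, Security
from fastapi.security import APIKeyHeader

app = FastAPI()
api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)

def require_api_key(key: str = Security(api_key_header)) -> str:
    # In production this would check a secrets manager or database, not an env var.
    if key != os.environ.get("SERVICE_API_KEY"):
        raise HTTPException(status_code=401, detail="Invalid or missing API key")
    return key

@app.get("/v1/health", dependencies=[Depends(require_api_key)])
def health() -> dict:
    return {"status": "ok"}
```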

5. Fine-tuning & Model Lifecycle Management

Fine-tune LLaMA, Mistral, Phi-3, and diffusion models for domain-specific tasks.

Use MLflow for tracking experiments, hyperparameters, metrics, and versioning (see the sketch after this list).

Conduct evaluation on hallucinations, retrieval consistency, reasoning depth, and multimodal accuracy.
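
A minimal MLflow sketch of this lifecycle, with an assumed experiment name and illustrative hyperparameters and evaluation metrics:

```python
# Minimal MLflow tracking sketch: log fine-tuning hyperparameters and evaluation metrics.
# Experiment name, parameter values, and metric names are illustrative placeholders.
import mlflow

mlflow.set_experiment("llama-domain-finetune")

with mlflow.start_run(run_name="lora-r16-epoch3"):
    mlflow.log_params({
        "base_model": "meta-llama/Llama-3.1-8B",
        "method": "LoRA",
        "lora_rank": 16,
        "learning_rate": 2e-4,
        "epochs": 3,
    })
    # ...training and evaluation would happen here...
    mlflow.log_metrics({
        "eval_loss": 1.23,
        "hallucination_rate": 0.07,       # fraction of answers contradicting retrieved context
        "retrieval_consistency": 0.91,    # agreement between retrieved chunks and cited sources
    })
```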

Required Skills & Qualifications:

Core AI/LLM Skills

Experience with LLMs, RAG systems, LangChain, LangGraph, and LlamaIndex

Hands-on experience with vLLM, SGLang, or similar inference engines

Model quantization (GGUF), optimization, and GPU memory tuning

Agent frameworks & tool calling (FastMCP, Groq, Hugging Face)

Multimodal & Image Generation

Stable Diffusion, ControlNet, LoRA fine-tuning, custom pipelines

Diffusers, ComfyUI, or InvokeAI experience (bonus)

Engineering & Systems

Kafka-based event-driven systems

FastAPI/Flask/Node.js backend development

Third-party API integrations

Docker, CI/CD, and cloud platforms (GCP/Azure)

Databases & Retrieval

MongoDB, DuckDB

Embedding stores, vector databases (Pinecone / Qdrant), retrieval optimization

Observability & MLOps

MLflow for experiment tracking and model lifecycle

Performance monitoring, logging, auditing, API observability

Frontend (Good to have)

React, Redux, Next.js, Electron.js for dashboards and AI interfaces

Job ID: 133395367
