
Qubrid AI

Senior GPU and AI Infrastructure Engineer

  • Posted a day ago

Job Description

Work from Home. Join one of the most advanced AI companies in the world.

Note: this role requires working late nights India time (until 4 AM) to overlap with US working hours.

We are searching for a startup-minded engineer with hands-on experience in NVIDIA GPU technologies and open-source AI models, someone who has previously owned an entire product and roadmap from an AI and back-end standpoint. You must be a coder, not just a DevOps manager, and willing to go above and beyond to launch a world-class product.

This is a full-time role. If you plan to hold two or more jobs at the same time, or want to work part-time, this role is not a fit; in that case, please do not apply.

Salary depends on experience and current verifiable compensation (paychecks).

Company Description

Headquartered in McLean, Virginia, USA, Qubrid is a global provider of Artificial Intelligence (AI), Data Center and IoT products, solutions, and services. As pioneers in the realm of advanced computing technologies, we pride ourselves on being at the forefront of innovation, empowering businesses with the transformative capabilities of GPUs, Artificial Intelligence, Quantum Computing, IoT and more. We specialize in offering a wide array of hardware and software solutions for industries such as healthcare, manufacturing, finance, government, education and more.


Senior GPU & AI Infrastructure Engineer
About the Role

We are looking for a highly experienced Senior GPU & AI Infrastructure Engineer to build and optimize production-scale AI systems focused on large language models (LLMs), multimodal models, and high-performance inference infrastructure.

This is a hands-on engineering role for someone who has real experience deploying AI products at scale—not just training models in research environments.

You will work on:

  • LLM deployment and serving
  • high-throughput inference systems
  • GPU optimization
  • distributed AI infrastructure
  • cloud GPU environments
  • multi-tenant AI workloads
  • model optimization and batching systems

The ideal candidate understands both AI systems and low-level infrastructure performance.

Responsibilities
AI Model Deployment & Serving
  • Deploy and maintain production-grade LLM and multimodal inference systems
  • Build scalable APIs and serving infrastructure for AI products
  • Implement high-throughput and low-latency inference pipelines
  • Design systems for:
      • batch inferencing
      • streaming inference
      • concurrent request handling
      • model routing
      • autoscaling
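To make the batching and concurrent-request-handling responsibilities concrete, here is a minimal pure-Python sketch of the dynamic micro-batching idea: queued requests are flushed as a batch once the batch is full or the oldest request has waited past a deadline. The class name and thresholds are illustrative assumptions, not any framework's API; production servers such as vLLM or Triton implement this internally with far more sophistication.

```python
import time
from collections import deque

class MicroBatcher:
    """Toy dynamic batcher: flushes a batch when it is full or when the
    oldest queued request has waited longer than max_wait_s."""

    def __init__(self, max_batch_size=8, max_wait_s=0.01):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue = deque()  # (request, enqueue_time) pairs

    def submit(self, request):
        self.queue.append((request, time.monotonic()))

    def next_batch(self):
        """Return a batch if the size or latency threshold is met, else []."""
        if not self.queue:
            return []
        oldest_age = time.monotonic() - self.queue[0][1]
        if len(self.queue) >= self.max_batch_size or oldest_age >= self.max_wait_s:
            take = min(self.max_batch_size, len(self.queue))
            return [self.queue.popleft()[0] for _ in range(take)]
        return []

batcher = MicroBatcher(max_batch_size=4, max_wait_s=0.05)
for i in range(6):
    batcher.submit(f"prompt-{i}")
first = batcher.next_batch()  # full batch of 4 flushes immediately
```

The size threshold maximizes GPU utilization while the wait deadline bounds tail latency for requests that arrive when traffic is light, which is the core trade-off behind batching in inference serving.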
GPU Optimization & Performance Engineering
  • Optimize NVIDIA GPU environments for maximum throughput and efficiency
  • Work with:
      • CUDA
      • TensorRT
      • NCCL
      • ONNX Runtime
      • vLLM
      • Triton Inference Server
      • NVIDIA Dynamo
  • Improve:
      • GPU memory utilization
      • token throughput
      • batching efficiency
      • inference latency
      • GPU scheduling and allocation
  • Diagnose and resolve GPU bottlenecks, memory fragmentation, and scaling issues
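Diagnosing GPU memory pressure in LLM serving usually starts with back-of-envelope KV cache math. The sketch below shows the standard estimate (two tensors, K and V, per layer, each `num_kv_heads * head_dim` elements per token); the 7B-class configuration plugged in is an illustrative assumption, not a specific model's published spec.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, dtype_bytes=2):
    """Estimate KV cache size: 2 tensors (K and V) per layer, each
    num_kv_heads * head_dim elements per token, at dtype_bytes precision."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * dtype_bytes
    return per_token * seq_len * batch_size

# Illustrative 7B-class config: 32 layers, 32 KV heads, head_dim 128, fp16.
# At 4096-token context and batch size 8 the cache alone is 16 GiB:
gib = kv_cache_bytes(32, 32, 128, seq_len=4096, batch_size=8) / 2**30
```

Numbers like this explain why techniques such as grouped-query attention (fewer KV heads) and paged KV cache allocation matter so much for batching capacity on a fixed-memory GPU.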
Infrastructure & Distributed Systems
  • Design scalable AI infrastructure across cloud and dedicated GPU environments
  • Work with distributed inference and multi-node deployments
  • Implement GPU partitioning and resource isolation strategies
  • Manage containerized AI workloads using Docker and orchestration systems
  • Build infrastructure for resilient and fault-tolerant AI services
Cloud & Production Operations
  • Deploy and manage AI systems across providers such as:
      • AWS
      • GCP
      • Azure
      • bare-metal GPU clusters
  • Monitor infrastructure reliability, scaling, and cost efficiency
  • Build observability systems for GPU and inference monitoring
  • Create CI/CD workflows for AI deployments
Engineering & Documentation
  • Write production-grade infrastructure and deployment code
  • Create technical documentation and deployment runbooks
  • Participate in architecture reviews and infrastructure planning
  • Collaborate with AI engineers, backend engineers, and product teams
Requirements
Must-Have
  • Strong hands-on experience deploying LLMs or AI models in production
  • Deep understanding of NVIDIA GPU architecture and optimization
  • Experience with large-scale inference systems and batching strategies
  • Strong Linux, Python, and systems engineering background
  • Experience with containerization and distributed systems
  • Familiarity with model optimization techniques including:
      • quantization
      • KV cache optimization
      • tensor parallelism
      • pipeline parallelism
      • memory optimization
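Of the optimization techniques above, quantization has the most direct memory arithmetic: weight footprint scales linearly with bits per weight. A minimal sketch, using a 70B-parameter model as an illustrative assumption and ignoring activations and KV cache:

```python
def weight_memory_gib(num_params_b, bits_per_weight):
    """Approximate weight memory (GiB) for a model with num_params_b
    billion parameters stored at bits_per_weight precision.
    Ignores activations, KV cache, and framework overhead."""
    bytes_total = num_params_b * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

fp16 = weight_memory_gib(70, 16)  # ~130.4 GiB: needs multiple GPUs
int4 = weight_memory_gib(70, 4)   # ~32.6 GiB: fits on one large GPU
```

The 4x reduction from fp16 to int4 is why quantization is often the first lever pulled when a model must fit a given GPU budget, before reaching for tensor or pipeline parallelism across devices.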
Preferred
  • Experience with multi-tenant AI serving systems
  • Experience building AI products used in production environments
  • Familiarity with Kubernetes and GPU orchestration
  • Experience with fine-tuning pipelines and model training infrastructure
  • Understanding of networking and high-performance compute systems
What We're Looking For
  • Someone who has built and operated real AI infrastructure at scale
  • Strong systems and performance engineering mindset
  • Ability to diagnose deep infrastructure and GPU-level issues
  • Product-focused engineer who understands reliability and user experience
  • Fast execution with strong ownership mentality

Compensation

Competitive salary depending on experience.

To apply, please include:

  • Relevant infrastructure or deployment experience
  • GPU systems and frameworks worked with
  • Links to GitHub, projects, or deployed AI systems if available

Job ID: 147319205