ML Inference & Optimization Engineer

2-4 Years
  • Posted 2 months ago

Job Description

You Bring

  • 3+ years of experience deploying and optimizing machine learning models in production, with 1+ years of experience deploying deep learning models
  • Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.)
  • Understanding of PyTorch internals and inference-time optimization
  • Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
  • Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines

More Info

Open to candidates from: Indian

Job ID: 126000997