neuralgarage

Senior ML Engineer

  • Posted 20 hours ago

Job Description

Responsibilities

  • Architect and scale a next-generation multimodal inference platform powering multiple production AI pipelines.
  • Build GPU-resident multimodal pipelines for image, audio, and video inference.
  • Optimize PyTorch models with TensorRT and Torch-TensorRT.
  • Design dynamic batching and low-latency inference systems.
  • Build GPU-native preprocessing pipelines using NVIDIA DALI and Kornia.
  • Support inference across heterogeneous GPU fleets (H100, A100, A10G, RTX 4090, etc.).
  • Improve observability, throughput, reliability, and deployment automation.

Requirements

  • Strong Python and PyTorch skills.
  • Experience deploying ML models in production.
  • Understanding of GPU inference optimization and CUDA fundamentals.
  • Familiarity with Docker and Linux.
  • Strong debugging and problem-solving skills.
  • Experience with NVIDIA Triton, TensorRT, or DALI.
  • Familiarity with CUDA profiling and performance optimization.
  • Experience with distributed inference systems.
  • Knowledge of audio/video ML pipelines.
  • Experience with ONNX, AWS GPU infrastructure, and model quantization techniques (FP16/INT8).

This job was posted by Subrina Ahoy Lai from NeuralGarage.

Job ID: 148900333
