InfiVR

Machine Learning Engineer

  • Posted 6 hours ago

Job Description

Job Title: ML Engineer — Edge AI (Vision, Voice & On-Device LLMs)

Company: InfiVR

Location: Bangalore, India (On-Site)

Employment Type: Full-time

Experience Level: 2–5 Years

About InfiVR

InfiVR delivers AI, computer vision, and immersive digital solutions for industrial enterprises across Oil & Gas, Defense, Healthcare, and Aerospace. We build intelligent applications that run in real-world operational environments — on edge devices, in the field, often without connectivity. Our clients include Fortune 500 companies and leading global organizations.

About the Role

We are looking for an ML Engineer who can take AI models from research to production on mobile and edge hardware. This is not a training-in-the-cloud role — you will be selecting, optimizing, and deploying models that run entirely on-device under tight compute, memory, and power constraints. You will work across computer vision, speech-to-text, and small language models, shipping them on Qualcomm chipsets with NPU acceleration for industrial field applications.

Responsibilities

  • Evaluate and benchmark pre-trained models for on-device deployment across object detection, OCR, speech-to-text, and conversational AI.
  • Quantize and optimize models (INT8, INT4, W4A16) using AIMET or equivalent tools, targeting ONNX, TFLite, and QNN formats.
  • Profile and optimize inference latency, memory usage, and thermal performance on actual target hardware.
  • Integrate models into Android applications using ONNX Runtime (QNN Execution Provider), whisper.cpp, llama.cpp, and NDK.
  • Build and maintain on-device RAG pipelines using local vector stores and small language models.
  • Collaborate with Android developers on camera, audio, and sensor integration for AI-powered field applications.

Requirements

  • 2+ years deploying ML models on mobile or edge devices: not just training, but shipping on real hardware.
  • Hands-on experience with model quantization and optimization for constrained environments.
  • Strong working knowledge of at least two of: object detection (YOLO family, EfficientDet), speech-to-text (Whisper, wav2vec), or small language models (Phi, Gemma, Llama).
  • Proficiency in Python and PyTorch for model preparation.
  • Comfortable with C/C++ and Android NDK for on-device integration.
  • Understanding of the ONNX model format and runtime ecosystem.

Good to Have

  • Experience with the Qualcomm AI stack (AI Hub, QNN SDK, Hexagon NPU).
  • Familiarity with PaddleOCR or similar mobile OCR frameworks.
  • Prior work on industrial, field-deployed, or offline-first applications.
  • Exposure to on-device embedding models and vector search.


Job ID: 147488427
