Opening For Edge AI Architect

10-16 Years
  • Posted 5 hours ago

Job Description

Responsibilities:

Agentic Architecture & Interoperability

  • Define end-to-end Edge AI system architecture covering data acquisition, preprocessing, model execution, orchestration, and edge-cloud integration.
  • Evaluate and select hardware accelerators (GPU, NPU, DSP, TPU, VPU) based on workload characteristics and performance requirements.
  • Architect solutions using platforms such as NVIDIA Jetson, Intel OpenVINO, Qualcomm AI Engine, ARM Ethos, and Edge TPU.
  • Design real-time model pipelines for vision, audio, signal processing, and sensor fusion workloads.
  • Implement decentralized multi-agent systems using agentic frameworks and graph-based orchestration.
  • Design Agent-to-Agent (A2A) communication protocols for interoperability across heterogeneous environments.
  • Integrate Model Context Protocol (MCP) servers to securely enable agents to access enterprise data, tools, and services.

Generative AI & Small Language Model Customization

  • Lead selection and customization of Small Language Models (SLMs) for domain-specific use cases.
  • Apply parameter-efficient fine-tuning techniques such as LoRA and QLoRA to optimize compute efficiency.
  • Adapt models for on-device intelligence and enterprise-grade agentic workflows.

Edge AI & Inference Optimization

  • Optimize AI models for deployment on resource-constrained devices including smartphones, smart glasses, wearables, IoT gateways, and embedded Linux systems.
  • Implement Post-Training Quantization (PTQ), Quantization-Aware Training (QAT), pruning, and sparsity techniques.
  • Optimize inference using TFLite, PyTorch Mobile, and ONNX Runtime with hardware acceleration support (NPU/DSP).
  • Perform advanced optimizations including INT8/INT4 quantization, mixed precision, KV-cache optimization, speculative decoding, and batch and streaming inference tuning.
  • Profile and optimize inference pipelines across CPU, GPU, NPU, and DSP to reduce cold-start latency and enhance real-time responsiveness.

Embedded & System Integration

  • Develop high-performance inference engines and middleware in C/C++ to interface AI models with sensors and actuators.
  • Build Android-native AI services using Java/Kotlin and Android NDK with optimized background execution and battery efficiency.
  • Ensure seamless integration between AI workloads and embedded hardware platforms.

More Info

Open to candidates from: Indian

Job ID: 143736433