About the Company: Quantiphi is an award-winning Applied AI and Big Data software and services company, driven by a deep desire to solve transformational problems at the heart of businesses. Our signature approach combines groundbreaking machine-learning research with disciplined cloud and data-engineering practices to create breakthrough impact at unprecedented speed.
Company Highlights:
- An AI-First Digital Engineering Services & Platforms company with 2.5x year-over-year growth since its inception in 2013
- Headquartered in Boston, with 3200+ data science professionals across 11 global offices
- 3X winner of the NVIDIA AI Partner of the Year award
- 13X winner of the Google Cloud Partner of the Year award, including Machine Learning, Breakthrough, and Social Impact Partner
- 3X winner of the AWS AI/ML Partner of the Year award
- Preferred and Premier Partner for AWS, Google Cloud, NVIDIA, Snowflake, Databricks and more
About the Role:
We are looking for a highly skilled Senior Machine Learning Engineer to lead the design and implementation of next-generation Agentic AI ecosystems. In this role, you will go beyond simple automation to build sophisticated Multi-Agent Systems (MAS), develop robust Agent Platforms, and ensure seamless Agent Interoperability. The ideal candidate will bridge the gap between high-level agent orchestration and low-level GPU Tuning, ensuring our autonomous solutions are both intelligent and computationally efficient.
Responsibilities:
- Agent Platform & Registry: Design and maintain a centralized Agent Platform and Agent Registry to manage the lifecycle, discovery, and versioning of specialized AI agents across the organization.
- Multi-Agent System (MAS) Orchestration: Develop complex Multi-Agent Systems where autonomous agents collaborate, negotiate, and execute intricate business processes with minimal human oversight.
- Agent Interoperability: Define and implement communication protocols and standards to ensure Agent Interoperability across different frameworks, tools, and LLM providers.
- Agent Evaluations (Evals): Build and scale rigorous Agent Evaluation frameworks to measure performance, accuracy, safety, and reliability of agentic workflows in production.
- GPU Tuning & Optimization: Perform deep-level GPU Tuning and optimization (e.g., quantization, kernel tuning, memory management) to maximize throughput and minimize latency for large-scale model deployments.
- LLM Fine-Tuning: Execute domain-specific fine-tuning using techniques like PEFT and SFT on models such as Llama or Mistral to power specialized agents.
- Research & Prototyping: Stay at the forefront of Generative AI research, specifically in autonomous decision-making and reinforcement learning, to maintain a competitive technological edge.
Qualifications:
Technical Expertise:
- Agentic Frameworks: Proficiency in building and scaling agentic workflows using tools like LangGraph, CrewAI, AutoGen, or PhiData.
- Evaluation & Monitoring: Experience with LLM and Agent evaluation tools (e.g., RAGAS, DeepEval) and building custom Evals for multi-step reasoning.
- Optimization: Deep knowledge of GPU optimization techniques and libraries (e.g., vLLM, TensorRT, NVIDIA Triton, or CUDA-based tuning).
- Programming: Mastery of Python and its machine learning ecosystem (PyTorch, TensorFlow, or JAX).
- Systems Design: Experience designing Agent Registries and scalable infrastructure on cloud platforms like AWS, GCP, or Azure.
- NLP & RL: Extensive experience with NLP tasks (summarization, QA) and familiarity with Reinforcement Learning (RL) for autonomous decision-making.
Soft Skills:
- Strong analytical skills to debug complex agentic interactions and logic loops.
- Ability to collaborate with cross-functional teams to define product requirements for autonomous systems.
- Proven ability to drive high-impact projects from research to production with minimal supervision.
Good to Have Skills:
- Contributions to open-source Agentic AI or ML optimization projects.
- Experience with MLOps practices for continuous integration and deployment of agentic systems.
- Background in Multi-Agent Reinforcement Learning (MARL) or game theory.