
Key Responsibilities:
Production Application Development: Lead the end-to-end lifecycle of LLM
applications, transitioning functional prototypes into robust, scalable, and
resilient production systems.
LLM API Integration & Orchestration: Design and implement robust integrations
with multiple LLM APIs (e.g., OpenAI, Anthropic, internal models), optimizing
for performance, cost, and reliability.
Prompt Engineering & Optimization: Develop, test, and refine advanced prompt
engineering techniques to ensure accurate, relevant, and reliable model outputs
tailored to specific business use cases.
Context Management & RAG Implementation: Implement strategies for effective
context management, including Retrieval-Augmented Generation (RAG) systems,
vector databases, and memory structures, to improve the relevance and
accuracy of model outputs.
Output Validation & Quality Assurance: Establish rigorous validation frameworks
to automatically check and verify LLM outputs against predefined constraints,
minimizing hallucinations and ensuring compliance with quality standards.
AI Security & Risk Mitigation: Implement security controls that protect
against adversarial attacks, with particular focus on prompt injection, indirect
prompt injection, and SQL injection vulnerabilities within the LLM application
stack.
Production Deployment & Monitoring: Apply MLOps principles to deploy
applications across cloud infrastructures (e.g., AWS, GCP, Azure), and set up
comprehensive monitoring of performance metrics, latency, token usage, and
drift, using tools such as MLflow, Weights & Biases, or Prometheus.
Required Skills and Qualifications:
Experience: 5+ years of professional experience as an ML Engineer or MLOps
Engineer, including significant experience deploying LLM applications into
production environments (not just demos).
Technical Proficiency:
- Strong programming skills in Python.
- Hands-on experience with ML frameworks (e.g., PyTorch, TensorFlow) and
  orchestration tools (e.g., Kubeflow, Airflow).
- Proficiency with cloud platforms (AWS, GCP, or Azure) and
  containerization technologies (Docker, Kubernetes).
- Experience with vector databases (e.g., Pinecone, Weaviate, Chroma) and
  RAG architecture patterns.
- Familiarity with MLOps tools for tracking, deployment, and monitoring.
LLM Domain Knowledge: Deep understanding of current LLM capabilities,
limitations, prompt engineering best practices, and emerging security
vulnerabilities in generative AI.
Problem-Solving: Strong analytical skills with a proactive approach to
troubleshooting complex production issues related to model performance,
latency, and system stability.
Communication: Excellent collaboration and communication skills, capable of
working effectively within cross-functional teams (Data Scientists, Software
Engineers, Security Teams).
Job ID: 140414135