

Job description:
KEY RESPONSIBILITIES
· Design and own multi-stage ingestion pipelines — handling HTML, PDF, and image sources with layout parsing, metadata extraction, and vector storage
· Architect RAG systems with hybrid search (BM25 + semantic), document versioning, and cross-reference resolution
· Build production-grade FastAPI services with typed response envelopes, OpenAPI compliance, and Langfuse tracing integration
· Engineer prompt systems — structured prompts, prompt versioning, few-shot strategies, and judge-based evaluation
· Integrate and manage LLM routing via LiteLLM: model fallback, cost control, and per-route configuration
· Design agentic workflows using LangGraph: multi-step retrieval, tool use, and conditional branching
· Build and maintain knowledge graphs in NebulaGraph / Neo4j — entity extraction, relationship modelling, and domain ontology alignment
· Implement graph-augmented retrieval (GraphRAG) — combining vector search with graph traversal to surface contextually connected information beyond chunk-level retrieval
· Own entity linking and co-reference resolution pipelines that connect ingested documents to graph nodes
· Lead RAG evaluation initiatives — define metrics, build eval datasets, and run regression cycles
· Drive observability standards — tracing, cost attribution, and latency profiling via Langfuse
· Collaborate on K8s deployment patterns for AI services: resource limits, GPU scheduling, and health probes
· Mentor junior developers and conduct code and prompt reviews
REQUIRED SKILLS
· Python (5+ years) — async, concurrency patterns, production packaging
· Deep understanding of RAG — hybrid retrieval, reranking, chunking strategies, embedding model selection
· FastAPI — dependency injection, middleware, background tasks, async patterns
· Prompt engineering — structured prompting, chain-of-thought, evaluation-driven iteration
· LLM API integration — OpenAI-compatible APIs, AWS Bedrock, or similar
· Vector DB expertise — Weaviate or equivalent: schema design, indexing, and filtering
· Document parsing at scale — Docling, layout models, VLM-based extraction from PDFs, HTML, and images
· Graph DB — NebulaGraph or Neo4j: schema design, Cypher / nGQL queries, knowledge graph construction
· Observability mindset — tracing, evaluation loops, cost-aware system design
Job ID: 149316503
Skills:
API Development, TensorFlow, PyTorch, Python, LLaMA, LLMs, Hugging Face, Mistral AI, GPT, LangGraph, GitHub Copilot, RAG techniques, BERT, Claude Code