Job Description
We are looking for a Python Gen AI Engineer who can design, build, and deploy MCP (Model Context Protocol) servers that power scalable Generative AI and Agentic AI systems. The role requires strong backend engineering skills combined with hands-on experience in LLMs, prompt engineering, and AI agent frameworks.
Key Responsibilities
Design and deploy MCP servers in Python using FastMCP.
Implement tool-calling, memory, and structured context handling for LLM-driven systems.
Build and optimize RAG pipelines using embeddings and vector databases.
Develop and orchestrate agentic workflows using frameworks such as LangChain, LlamaIndex, Semantic Kernel, AutoGen, or similar.
Apply advanced prompt engineering techniques (ReAct, CoT, function-calling, structured outputs).
Integrate Azure OpenAI, Gemini, and other LLM APIs into production systems.
Implement caching, logging, monitoring, and performance optimization.
Containerize and deploy services using Docker and cloud platforms (Azure/GCP).
Build secure APIs using modern web standards (REST, WebSockets, OAuth2/JWT).
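To illustrate the tool-calling and structured-output responsibilities above, here is a minimal, stdlib-only sketch of the pattern. The registry, tool name, and JSON payload shape are illustrative assumptions, not the actual MCP wire format or the FastMCP API:

```python
import json

# Hypothetical tool registry mapping tool names to callables, mimicking
# the tool-calling surface an MCP server exposes to an LLM.
TOOLS = {}

def tool(fn):
    """Register a function as a callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> dict:
    # Stub: a real tool would call an external weather API.
    return {"city": city, "forecast": "sunny"}

def dispatch(call_json: str) -> str:
    """Execute a model-emitted tool call and return a structured JSON result."""
    call = json.loads(call_json)
    result = TOOLS[call["name"]](**call["arguments"])
    return json.dumps({"tool": call["name"], "result": result})

# A model's function-call payload arrives as structured JSON:
print(dispatch('{"name": "get_weather", "arguments": {"city": "Pune"}}'))
```

In production, FastMCP replaces the hand-rolled registry with its own tool decorator and handles the protocol framing; the dispatch-on-structured-JSON idea is the same.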
Required Skills
Strong Python backend development experience.
Strong expertise in asynchronous programming (asyncio), concurrent request handling, and streaming responses.
Hands-on experience with Generative AI, LLMs, and Agentic AI systems.
Experience building MCP servers or similar AI-serving architectures.
Experience with RAG, embeddings, and vector databases.
Solid understanding of prompt engineering best practices.
Knowledge of web protocols (HTTP, REST, WebSockets) and API security.
Experience with MLOps and CI/CD best practices using Azure DevOps.
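The asyncio skills listed above, concurrent request handling plus streamed responses, can be sketched in a few lines. The token-by-token "stream" below is a stand-in for a real LLM streaming API, and the function names are illustrative:

```python
import asyncio

async def stream_tokens(text: str):
    # Simulate an LLM streaming response one token at a time.
    for token in text.split():
        await asyncio.sleep(0)  # yield control, as a real network await would
        yield token

async def handle_request(prompt: str) -> str:
    # Consume the async stream and assemble the full response.
    return " ".join([tok async for tok in stream_tokens(prompt)])

async def main():
    # Serve several requests concurrently with asyncio.gather.
    return await asyncio.gather(
        handle_request("hello world"),
        handle_request("agentic ai systems"),
    )

print(asyncio.run(main()))
```

The same shape (async generator producing chunks, gather for concurrency) carries over to streaming LLM completions over WebSockets or server-sent events.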