Job description
We are looking for a Sr. GenAI Engineer to lead the design, development, integration, and deployment of Generative AI solutions, ranging from enterprise chatbots and document summarization systems to AI-powered search, knowledge management, and content generation platforms. You'll work closely with Data Scientists, DevOps Engineers, and Full-Stack Developers to operationalize scalable GenAI products for our clients.
Responsibilities and Duties
- Lead the solution design and architecture of Generative AI applications based on business needs and technical feasibility.
- Integrate and deploy Large Language Models (LLMs) using open-source models (e.g., Llama, Mistral, Falcon via Hugging Face) and commercial offerings (OpenAI, Azure OpenAI, Anthropic, Gemini).
- Build and operationalize RAG (Retrieval-Augmented Generation) pipelines, integrating vector databases (e.g., FAISS, ChromaDB, Pinecone) for contextual enterprise search.
- Perform LLM fine-tuning, including supervised fine-tuning (SFT), LoRA, and QLoRA, and continued pre-training (CPT) on domain-specific data where required.
- Develop APIs and backend services to serve LLM and GenAI functionalities securely and efficiently.
- Integrate cloud-native AI services (AWS Bedrock, Azure AI, Google Vertex AI) into enterprise applications.
- Collaborate with Data Engineers to curate, preprocess, and vectorize data for RAG and AI pipelines.
- Implement MLOps for GenAI, ensuring model versioning, CI/CD, and monitoring for deployed models.
- Develop quick POCs, pilots, and scalable production-ready applications in collaboration with product teams.
- Stay current with the rapidly evolving GenAI ecosystem and assess emerging tools, frameworks, and models for applicability.
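To make the RAG responsibility above concrete, here is a minimal sketch of the retrieval step. It is purely illustrative: a toy bag-of-words embedding and brute-force cosine similarity stand in for a real embedding model and a vector database such as FAISS, ChromaDB, or Pinecone; the function names and sample documents are assumptions for this sketch.

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (brute force,
    where a vector database would do approximate nearest-neighbor search)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Quarterly revenue grew 12 percent year over year.",
    "The onboarding guide covers laptop setup and accounts.",
    "Revenue growth was driven by enterprise subscriptions.",
]
context = retrieve("What drove revenue growth?", docs)
# The retrieved chunks would then be injected into the LLM prompt.
```

In production, the same retrieve-then-generate shape holds; only the embedding model and index are swapped for scalable components.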
Desired Experience & Qualifications
- Bachelor's or Master's Degree in Computer Science, AI/ML, Data Science, or a related technical discipline.
- Minimum 4 years of AI/ML engineering experience, with at least 2 years of hands-on Generative AI and agentic AI implementation.
- Proven experience with LLM frameworks and libraries such as Hugging Face Transformers, LangChain, and LlamaIndex.
- Hands-on experience with LLM deployment frameworks such as TGI (Text Generation Inference), vLLM, Ollama, BentoML.
- Expertise in building RAG pipelines and integrating vector databases (FAISS, ChromaDB, Pinecone, Weaviate).
- Solid experience with Python for AI/ML model development, API development (FastAPI, Flask), and automation.
- Familiarity with cloud AI platforms (AWS Bedrock, Azure OpenAI, GCP Vertex AI, Databricks).
- Experience in containerization (Docker) and deploying AI services with MLOps pipelines and CI/CD integration.
- Strong understanding of AI security, prompt engineering, data privacy, and model governance practices.
- Excellent problem-solving, critical-thinking, and cross-functional collaboration skills.
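As a small illustration of the prompt-engineering skill listed above, the sketch below assembles a grounded prompt that constrains the model to retrieved context, a common pattern in RAG systems. The template wording, chunk-numbering scheme, and function name are assumptions for this sketch, not a fixed API.

```python
def build_grounded_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved context chunks and a user question into one prompt,
    instructing the model to answer only from the supplied context."""
    # Number each chunk so the model (and reviewers) can cite sources.
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(context_chunks))
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What drove revenue growth?",
    ["Revenue growth was driven by enterprise subscriptions."],
)
```

Instructing the model to refuse when the context is insufficient is a simple but effective guard against hallucinated answers.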
Preferred Qualifications
- Experience with parameter-efficient fine-tuning methods such as LoRA and QLoRA (e.g., via the Hugging Face PEFT library).
- Familiarity with Generative AI frameworks for images, audio, or video (e.g., Stable Diffusion, Whisper, Bark).
- Experience integrating GenAI into enterprise chatbots, documentation search, summarization systems, or code generation tools.
- Working knowledge of AWS Lambda, API Gateway, and serverless AI application development.
- Experience with Streamlit, Gradio, or Dash for rapid prototyping of GenAI applications.
- Cloud certifications in AWS AI/ML, Azure AI Engineer, or GCP AI Engineer tracks.
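For context on why the LoRA/QLoRA experience above matters, here is a back-of-the-envelope sketch of LoRA's parameter savings: instead of updating a full d x k weight matrix, LoRA trains two low-rank factors B (d x r) and A (r x k), so the trainable count for that layer drops from d * k to r * (d + k). The layer dimensions below are illustrative, not tied to any specific model.

```python
def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter on a d x k layer:
    B has d * r entries and A has r * k entries."""
    return r * (d + k)

d = k = 4096          # illustrative size of one attention projection
full = d * k          # full fine-tuning of this layer: 16,777,216 params
lora = lora_trainable_params(d, k, r=8)   # rank-8 adapter: 65,536 params
reduction = full // lora                  # 256x fewer trainable params
```

This is why LoRA-style adapters make domain fine-tuning feasible on modest GPU budgets: only the small factors (and optionally a quantized base model, as in QLoRA) need gradients and optimizer state.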