We are CirrusLabs. Our vision is to become the world's most sought-after niche digital transformation company that helps customers realize value through innovation. Our mission is to co-create success with our customers, partners and community. Our goal is to enable employees to dream, grow and make things happen. We are committed to excellence. We are a dependable partner organization that delivers on commitments. We strive to maintain integrity with our employees and customers. Every action we take is driven by value. Our well-knit teams and employees are the core of who we are. You are the core of a values-driven organization.
You have an entrepreneurial spirit. You enjoy working as part of well-knit teams. You value the team over the individual. You welcome diversity at work and within the greater community. You aren't afraid to take risks. You appreciate a growth path, worked out with your leadership team, that maps how you can grow inside and outside of the organization. You thrive on company-sponsored continuing education programs that strengthen your skills and help you become a thought leader ahead of the industry curve.
You are excited about creating change because your skills can help the greater good of every customer, industry and community. We are hiring a talented Data Scientist & Agentic AI Developer to join our team. If you're excited to be part of a winning team, CirrusLabs (http://www.cirruslabs.io) is a great place to grow your career.
Experience - 5-8 years
Location - Bengaluru
Work Timings - 2pm - 11pm IST
Tech Stack: Python, LangChain, LangSmith, Phoenix, TensorFlow, PyTorch, OpenAI API, Anthropic Claude, Azure OpenAI, AWS/Azure/GCP. We value innovation, collaboration, and a commitment to ethical AI development.
Position Overview
We are seeking an experienced Data Scientist and Agentic AI Developer to design, develop, and evaluate intelligent AI agents that deliver high-quality, reliable, and compliant solutions. This role requires a blend of data science expertise, AI/ML engineering capabilities, and hands-on experience building agentic systems.
Experience Required
5-6 years of professional experience in data science, machine learning, and AI development
Key Responsibilities
AI Agent Development & Deployment
- Design and develop agentic AI systems powered by Large Language Models (LLMs) with tool-calling capabilities
- Build and optimize multi-step reasoning workflows and agent orchestration frameworks
- Implement retrieval-augmented generation (RAG) pipelines for knowledge-intensive applications
- Integrate external tools, APIs, and databases into agent workflows
- Deploy and monitor production-grade AI agents at scale
Evaluation & Quality Assurance
- Develop comprehensive evaluation frameworks using LLM-as-a-judge methodologies
- Implement automated scoring systems for output quality metrics (correctness, helpfulness, coherence, relevance)
- Design and execute robustness testing including adversarial attack scenarios
- Monitor and reduce hallucination rates and ensure factual accuracy
- Track performance metrics including latency, throughput, and cost-per-interaction
Data Science & Analytics
- Analyze agent performance data to identify improvement opportunities
- Build custom evaluation pipelines and scoring rubrics
- Conduct A/B testing and statistical analysis for model optimization
- Create dashboards and visualization tools for stakeholder reporting
- Implement RAGAs (Retrieval Augmented Generation Assessment) frameworks
Safety, Ethics & Compliance
- Ensure AI systems meet ethical standards including bias detection and fairness
- Implement safety guardrails to prevent harmful content generation
- Develop compliance monitoring systems for regulatory frameworks (EU AI Act, GDPR, HIPAA, DPDP)
- Document transparency and explainability measures
- Establish human oversight protocols
Required Skills & Qualifications
Technical Expertise
- Programming: Strong proficiency in Python; experience with AI/ML frameworks (LangChain, LangSmith, Phoenix, or similar)
- LLM Expertise: Hands-on experience with GPT, Claude, or other frontier models; prompt engineering and fine-tuning
- Machine Learning: Deep understanding of NLP, deep learning architectures, and model evaluation
- Tools & Platforms: Experience with MLOps tools, vector databases, and observability platforms
- Data Engineering: Proficiency in SQL, data pipelines, and ETL processes
Domain Knowledge
- Understanding of agentic AI architectures and autonomous systems
- Knowledge of RAG systems and information retrieval techniques
- Familiarity with LLM evaluation methodologies and benchmarks
- Experience with conversational AI and dialogue systems
- Understanding of AI safety, alignment, and interpretability
Evaluation & Metrics
- Experience designing evaluation rubrics and scoring systems
- Proficiency with automated evaluation frameworks (RAGAs, custom evaluators)
- Understanding of quality metrics: coherence, fluency, factual accuracy, hallucination detection
- Knowledge of performance metrics: latency optimization, token usage, throughput analysis
- Experience with user experience metrics (CSAT, NPS, turn count analysis)
Soft Skills
- Strong analytical and problem-solving abilities
- Excellent communication skills for cross-functional collaboration
- Ability to balance innovation with practical constraints
- Detail-oriented with a focus on quality and reliability
- Self-driven, with the ability to work in fast-paced environments
Preferred Qualifications
- Experience with compliance frameworks and regulatory requirements
- Background in conversational AI or chatbot development
- Knowledge of reinforcement learning from human feedback (RLHF)
- Experience with multi-modal AI systems
Tools & Technologies
- Frameworks: LangChain, LangSmith, Phoenix, TensorFlow, PyTorch
- LLM Platforms: OpenAI API, Anthropic Claude, Azure OpenAI
- Databases: Vector databases (Pinecone, Weaviate, Chroma), PostgreSQL, MongoDB
- Monitoring: LangSmith, Phoenix, custom observability tools
- Cloud: AWS/Azure/GCP experience preferred
- Version Control: Git, CI/CD pipelines
What You'll Deliver
- Production-ready agentic AI systems with measurable quality improvements
- Comprehensive evaluation frameworks with automated scoring
- Performance dashboards and reporting systems
- Documentation for technical specifications and compliance standards
- Continuous improvement strategies based on data-driven insights