Job Post :- Senior Data Engineer
Experience :- 7+ years
Timing :- Shift starts at approx. 5:30 PM or 6:30 PM IST and runs 8 hours a day (aligned with the US Eastern time zone)
Location :- Remote (India)
Contract Duration :- 6 months to 1 year, depending on client requirements
Position Overview
We are seeking experienced Data/GenAI Engineers to join our Professional Services team. You will work directly on client engagements delivering production-grade Generative AI solutions, including conversational AI assistants, document processing automation, RAG (Retrieval-Augmented Generation) systems, and AI-powered data analytics platforms. This role requires hands-on technical execution, client interaction, and the ability to work independently within an agile delivery framework.
Key Responsibilities
- Design and implement production-ready Generative AI applications using Amazon Bedrock, Anthropic Claude, and other foundation models
- Build and optimise RAG (Retrieval-Augmented Generation) pipelines with vector databases (Weaviate, OpenSearch, Pinecone)
- Develop AI agents and multi-agent orchestration systems using frameworks like LangChain, LlamaIndex, or custom implementations
- Create conversational AI interfaces with natural language understanding, intent detection, and context management
- Implement prompt engineering strategies, few-shot learning, and fine-tuning approaches for domain-specific applications
- Build serverless architectures using AWS Lambda, API Gateway, Step Functions, and EventBridge
- Design and implement data pipelines for AI model training, inference, and feedback loops
- Develop RESTful APIs and WebSocket connections for real-time AI interactions
- Configure and optimise AWS services including S3, DynamoDB, RDS, SQS, SNS, and CloudWatch
- Implement infrastructure-as-code using CloudFormation, CDK, or Terraform
Data Engineering & ML Operations
- Design and build data ingestion pipelines for structured and unstructured data sources
- Implement ETL/ELT workflows for data preparation, cleaning, and transformation
- Create vector embeddings and semantic search capabilities for knowledge retrieval
- Develop data validation, quality monitoring, and observability frameworks
- Optimise model inference performance, latency, and cost efficiency
Client Engagement & Delivery
- Participate in sprint planning, daily standups, and client review sessions
- Translate business requirements into technical specifications and implementation plans
- Provide technical guidance and recommendations to clients on AI/ML best practices
- Document architecture decisions, code, and deployment procedures
- Troubleshoot production issues and implement solutions quickly
Technical Skills
Tier 1 - Critical Must-Haves
- Amazon Bedrock - Hands-on experience with foundation models (Claude, Nova, Llama, or others), model invocation, streaming responses, and guardrails
- Agent Frameworks & Orchestration - Production experience with LangChain, LlamaIndex, Bedrock Agents, or custom multi-agent orchestration systems
- Python - Advanced proficiency with modern Python (3.9+), including async/await, type hints, and testing frameworks (pytest, unittest)
- AWS Lambda & Serverless - Production experience building event-driven architectures, function optimisation, and cold start mitigation
- Vector Databases - Practical experience with at least one: Weaviate, OpenSearch, Pinecone, Chroma, or FAISS for semantic search
- LLM Integration - Direct experience with LLM APIs (Anthropic, OpenAI, Cohere), prompt engineering, and response parsing
- API Development - RESTful API design and implementation using FastAPI, Flask, or similar frameworks
Tier 2 - Highly Valuable
- Amazon Bedrock AgentCore - Experience with AgentCore Runtime, Memory, Gateway, and Observability for building production agent systems
- AWS API Gateway - Configuration, authorisation, throttling, and integration with Lambda/backend services
- DynamoDB - NoSQL data modelling, single-table design, GSI/LSI optimisation, and DynamoDB Streams
- AWS Step Functions - Workflow orchestration for complex AI pipelines and multi-step processes
- Docker & Containers - Containerisation, ECR, and ECS/Fargate deployment for AI workloads
- Data Processing - Experience with Pandas, PySpark, AWS Glue, or similar data transformation tools
Tier 3
- RAG Architecture - End-to-end RAG system design including chunking strategies, retrieval optimisation, and context management
- Embedding Models - Working knowledge of text embeddings (Bedrock Titan, OpenAI, Cohere) and embedding optimisation
- AWS S3 & Data Lakes - S3 event notifications, lifecycle policies, and data lake architecture patterns
- CloudWatch & Observability - Logging, metrics, alarms, and distributed tracing for AI applications
- IAM & Security - AWS security best practices, least privilege access, secrets management (Secrets Manager, Parameter Store)
- CI/CD Pipelines - Experience with CodePipeline, GitHub Actions, or GitLab CI for automated deployments
Tier 4 - Nice to Have
- SageMaker - Model training, deployment, endpoints, and feature stores
- OpenSearch - Full-text search, vector search, and hybrid search implementations
- EventBridge - Event-driven architectures and cross-service integrations
- WebSockets - Real-time bidirectional communication for streaming AI responses
- AWS CDK - Infrastructure-as-code using Python or TypeScript CDK constructs
- Fine-tuning & Training - Experience with model fine-tuning, PEFT methods, or custom model training
Required Experience & Qualifications
- 7+ years of software engineering experience, with at least 2-3 years focused on AI/ML, data engineering, or cloud-native development
- At least 2-3 years of hands-on AWS experience with production deployments
- At least 1-2 years of direct Generative AI experience (LLMs, embeddings, RAG, agents)
- Proven track record delivering production AI applications from concept to deployment
- Strong understanding of software engineering best practices (version control, testing, code review, documentation)
- Experience working in agile/scrum environments with distributed teams
- Excellent problem-solving skills and ability to work independently with minimal supervision
- Strong written and verbal communication skills for client-facing interactions