Cloud Database/Platform Engineer — AI/ML Infrastructure
This is a software engineering and platform engineering role focused on building scalable data infrastructure, distributed systems, cloud-native data platforms, and developer tooling. This is not a database administration (DBA) role.
The Opportunity
Join FICO's platform engineering team as a Database Platform Engineer — a role at the intersection of databases, data platforms, and AI/ML infrastructure. You will architect and operate the data platform capabilities powering FICO Assistant — our client-facing AI product serving financial institutions across APAC, EMEA, and North America. Your job is not to administer databases, but to build the platform capabilities that application developers, data engineers, data scientists, and ML engineers use. You'll own production AI/ML data systems end-to-end: model serving infrastructure, training pipelines, vector search, cloud database platforms, and 24x7 operational support.
What You'll Do
1. Build Database Platform Capabilities
Define and build platform features that enable scalable, resilient data infrastructure:
- Design and build database platform capabilities — query optimization, data storage engines, replication and consistency mechanisms, backup and recovery systems
- Build Infrastructure as Code (Terraform, Crossplane, CloudFormation) for all database and AI/ML resources
- Implement performance monitoring and observability — CloudWatch, Grafana, and custom database metrics
- Design and enforce database security — IAM authentication, TLS, encryption at rest/in-transit, and fine-grained access controls
2. Enable AI/ML Workloads on Data Platforms
Modern databases increasingly support AI use cases. Build the capabilities that make this possible:
- Design and operate production AI/ML infrastructure — model serving (SageMaker, Bedrock, self-hosted LLMs on EKS), training pipelines, and inference optimization
- Build vector search and embedding index management — OpenSearch k-NN, dimension tuning, index optimization for Retrieval-Augmented Generation (RAG)
- Implement AI observability using Langfuse — latency tracking, token economics, hallucination detection, and response quality metrics
- Support generative AI applications with feature stores, similarity search, and embedding storage
- Implement CI/CD pipelines for ML systems with automated testing and model validation gates
3. Build Data-to-AI Pipelines
Create integrations between data platforms and ML/LLM systems:
- Build data-to-ML workflows connecting data warehouses, data lakes, ML platforms, and LLM platforms
- Create data pipelines that feed training data, embeddings, and feature stores into AI systems
- Collaborate with data scientists, ML engineers, and product teams to deliver scalable infrastructure
- Own 24x7 production support for FICO Assistant — proactive monitoring, incident management, and SLA compliance
What We're Looking For
Must Have:
- 8+ years of infrastructure/platform engineering experience; 3+ years focused on AI/ML infrastructure or data platforms
- Hands-on experience with ML model serving: SageMaker, Bedrock, vLLM, or TGI
- Infrastructure as Code: Terraform, Crossplane, or CloudFormation for database and AI/ML resources
Nice to Have:
- Foundation model fine-tuning (LoRA, QLoRA, RLHF)
- AI agent frameworks and autonomous system orchestration
- Graph databases (Neo4j, Neptune) for knowledge graphs
- AWS ML Specialty or Database Specialty certification
- Experience at companies building data platforms (Snowflake, Databricks, AWS, Google, Microsoft)
- In-memory cache architecture — Redis cluster management, eviction policies, memory monitoring
- PostgreSQL administration — Aurora PostgreSQL performance tuning, connection pooling, query optimization
- Vector search / embedding index management — OpenSearch k-NN, dimension tuning, index optimization for RAG
- MLOps tooling: experiment tracking (MLflow, W&B), model registries, CI/CD for ML
- Kubernetes (EKS) for ML workloads — GPU node pools, autoscaling, service mesh
- LLM application patterns: conversation memory, guardrails, agent frameworks (LangChain, LlamaIndex)
Working Arrangements
- Hours: 2 PM – 11 PM India Standard Time (IST), weekdays
- On-Call: Rotating, on alternate weekends
- Support Model: 24x7 follow-the-sun support for the client-facing AI product
Tech Stack
- AI/ML: SageMaker, Bedrock, EKS (GPU), Langfuse, LangChain
- IaC: Terraform, Crossplane, CloudFormation
- Monitoring: CloudWatch, Grafana, Prometheus
- CI/CD: GitHub Actions, ArgoCD, MLflow
- Programming languages: Go and Python
Why FICO
- Build the data platform behind a live, client-facing AI product at enterprise scale
- Work with cutting-edge ML/AI technologies in production (not just POCs)
- Global impact — powering fraud detection and credit decisioning for major financial institutions
- Join an emerging engineering discipline, Database Platform Engineering, that is recognized at AWS, Snowflake, Databricks, Google, and Microsoft
- Competitive compensation, flexible work, and comprehensive benefits