1. Role Summary
The Senior Cloud Infrastructure & Platform Engineer will own the design, automation, and scaling of the company's multi-cloud infrastructure across AWS, Azure, and GCP. This role is responsible for building a fully containerized, stateless, and highly portable application platform that supports our Node.js backend services, React frontend, Redis caching, ClickHouse analytics engine, and LLM-driven workloads.
You will architect and manage Kubernetes-based deployments, implement infrastructure using Terraform, build CI/CD pipelines with Azure DevOps, optimize cloud cost and performance, and deliver an observable, secure, and resilient platform. This role is ideal for an engineer who thrives in fast-paced product environments, deeply understands cloud and DevOps fundamentals, and can drive the next generation of our cloud-agnostic architecture.
2. Required Skills & Experience (5–7 Years)
Core Experience (5–7 years):
- Strong experience designing, deploying, and managing cloud infrastructure on AWS, Azure, or GCP.
- Hands-on expertise in Kubernetes (EKS, AKS, GKE, or K3s) and containerized architectures.
- Strong proficiency with Terraform (or Pulumi) for Infrastructure-as-Code.
- Solid understanding of stateless service design, scaling patterns, service discovery, and load balancing.
- Experience building and managing CI/CD pipelines using Azure DevOps (pipelines, releases, automation).
- Strong scripting and automation skills with TypeScript/Node.js, Bash, or Python.
- Experience managing and optimizing Redis, ClickHouse, or similar real-time/caching systems.
- Exposure to LLM/AI infrastructure (API integrations or container-based inference servers).
- Deep understanding of cloud observability (Prometheus, Grafana, Loki/ELK, OpenTelemetry).
- Experience with cloud cost optimization, autoscaling strategies, and resource efficiency.
Mindset & Attributes
- Strong automation-first mentality.
- Excellent problem-solving skills for distributed systems.
- Comfortable owning architecture end-to-end: design, build, deploy, and support.
- Thrives in a dynamic, startup-paced environment.
- Strong communication and cross-team collaboration.
3. Tech Stack You Will Work With
Frontend, Backend & Data
- React
- Node.js / TypeScript microservices
- Redis for caching and ephemeral state
- ClickHouse for real-time analytics
- Mistral / LLM APIs and self-hosted inference containers
Infrastructure & Platform
- Kubernetes (EKS, AKS, GKE, K3s)
- Docker / OCI containers
- Terraform for Infrastructure-as-Code
- Helm charts for deployments
- Azure DevOps for CI/CD orchestration
- Cloudflare, Nginx Ingress, TLS automation
- Prometheus, Grafana, Alertmanager, Loki, OpenTelemetry
- Multi-cloud networking, autoscaling, and cost optimization tools