Company Description
Blend is a premier AI services provider, committed to co-creating meaningful impact for its clients through the power of data science, AI, technology, and people. We help organisations solve complex business challenges by combining deep domain understanding with modern data and AI capabilities. Our teams work across strategy, analytics, engineering, and product delivery to create scalable, high-value solutions that improve decision-making, efficiency, and growth.
Job Description
We are looking for an experienced Senior Cloud & DevOps Engineer to support the build and production readiness of an AI-powered analytics capability for a large enterprise client. This role will focus on establishing cloud infrastructure, CI/CD, environment management, deployment automation, monitoring, logging, and operational controls needed to take the solution through Dev, Test, and Production. The ideal candidate will have strong expertise in cloud-native architecture, infrastructure-as-code, release engineering, observability, and secure platform operations. This person will work closely with AI Engineers, Software Engineers, Data Engineers, and Data Scientists to ensure the solution is deployable, scalable, secure, and aligned with enterprise standards.
Responsibilities
- Design and implement cloud infrastructure and deployment patterns for the solution in collaboration with client platform teams.
- Build and maintain CI/CD pipelines to support repeatable, controlled releases across Development, Test, and Production environments.
- Implement infrastructure-as-code for cloud resources, services, permissions, and environment configuration.
- Support deployment of backend services, orchestration components, data services, and front-end applications.
- Enable monitoring, logging, alerting, and telemetry covering both platform health and end-user usage feedback.
- Define and implement operational controls for reliability, performance, scalability, and incident response.
- Support secure access patterns, secrets management, environment separation, and role-based controls.
- Collaborate with engineering and AI teams to operationalise LLM-enabled workloads in a governed enterprise environment.
- Ensure the solution aligns with architecture, security, and service transition requirements.
- Support non-functional testing, release readiness, and path-to-production activities.
- Contribute to support models, handover documentation, and internal enablement for ongoing operations.
- Help define scalable platform patterns for future phases, including additional datasets and advanced AI capabilities.
Qualifications
- 4+ years of experience in Cloud Engineering, DevOps, or Platform Engineering roles.
- Strong hands-on experience with CI/CD tooling and release automation.
- Experience with infrastructure-as-code using Terraform or similar tools.
- Experience deploying and operating cloud-native workloads on GCP.
- Strong understanding of containerisation, serverless and managed compute services, and environment promotion strategies.
- Experience with observability tooling covering logging, monitoring, alerting, and service health.
- Knowledge of security best practices including IAM, RBAC, secrets management, and policy-driven access control.
- Experience supporting production-grade data or AI platforms in enterprise environments.
- Familiarity with Git-based workflows and collaborative engineering practices.
- Strong troubleshooting, communication, and stakeholder management skills.
Additional Information
- Experience with GCP services such as Cloud Run, Pub/Sub, BigQuery, Vertex AI, and Cloud Build.
- Familiarity with MLOps or operationalising LLM-based services.
- Experience with telemetry and monitoring patterns for AI systems, including user interaction and model or service performance.
- Understanding of enterprise governance requirements for AI workloads, including auditability and safe release practices.
- Experience supporting service transition into managed support models.
- Exposure to QA automation and non-functional testing in cloud-native systems.