Cloud AI Infrastructure Architect

This job is no longer accepting applications

  • Posted 22 days ago

Job Description

  • Proven experience designing, implementing, and managing cloud solutions on major cloud platforms (e.g., AWS, Azure, GCP).
  • Strong understanding of cloud computing concepts, architectures, and services (IaaS, PaaS, SaaS).
  • Hands-on experience with cloud automation and infrastructure-as-code tools (e.g., Terraform, CloudFormation, ARM).
  • Experience with cloud security best practices and tools.
  • Prompt Engineering Environments: Designing and implementing infrastructure to support prompt engineering workflows and experimentation.
  • Agent orchestration & tool integration (e.g., LangChain).
  • Infrastructure as Code (IaC): Terraform (expert), CloudFormation, Google Deployment Manager, Bicep.
  • Containerization & Orchestration: Docker, Kubernetes (EKS, GKE, AKS).
  • MLOps/Gen AIOps: CI/CD pipelines for AI models/agents, model versioning, monitoring.
  • Programming/Scripting: Python (strong).

As a Cloud AI Infra Architect, you should have a minimum of 10 years of experience managing enterprise cloud infrastructure projects and driving automation through Gen AI. You will drive the adoption and optimization of our cloud infrastructure and services.

  • Design, implement, and evolve highly available, scalable, and secure multi-cloud architectures specifically tailored for large language models (LLMs), foundation models, vector databases, prompt engineering environments, fine-tuning, and real-time inference for Gen AI.
  • Develop infrastructure patterns and frameworks to support the deployment, orchestration, and management of autonomous AI agents, including their interaction with external tools, data sources, and reasoning engines.
  • Drive the adoption and implementation of advanced IaC to automate the provisioning, configuration, and governance of all AI infrastructure.
  • Proactively identify bottlenecks and implement innovative strategies for optimizing the performance, cost-efficiency, and resource utilization of high-compute AI workloads across all cloud providers.
  • Define and enforce stringent security architectures, data governance policies, and compliance frameworks for sensitive AI data, models, and agent interactions (e.g., data privacy, responsible AI principles).
  • Partner with Data Engineering to design and optimize data pipelines for large-scale, unstructured, and vector data required for Gen AI model training, fine-tuning, and retrieval-augmented generation.
  • Collaborate closely with Data Scientists and ML/Gen AI Engineers to design and implement robust MLOps/Gen AIOps pipelines for continuous integration, continuous delivery (CI/CD), continuous training (CT), and continuous evaluation (CE) of Gen AI models and agents.
  • Good communication skills.
  • Good analytical and problem-solving skills.

Job ID: 132337819