Cloud AI Infrastructure Architect

Infosys

Bengaluru, India

10-12 Years

Posted 23 days ago

Job Description

Proven experience designing, implementing, and managing cloud solutions on major cloud platforms (e.g., AWS, Azure, GCP).
Strong understanding of cloud computing concepts, architectures, and services (IaaS, PaaS, SaaS).
Hands-on experience with cloud automation and infrastructure-as-code tools (e.g., Terraform, CloudFormation, ARM templates).
Experience with cloud security best practices and tools.
Prompt Engineering Environments: Designing and implementing infrastructure to support prompt engineering workflows and experimentation.
Agent orchestration & tool integration (e.g., LangChain).
Infrastructure as Code (IaC): Terraform (expert), CloudFormation, Google Deployment Manager, Bicep.
Containerization & Orchestration: Docker, Kubernetes (EKS, GKE, AKS).
MLOps/Gen AIOps: CI/CD pipelines for AI models/agents, model versioning, monitoring.
Programming/Scripting: Python (strong).

As a Cloud AI Infra Architect, you should have a minimum of 10 years of experience managing cloud enterprise infrastructure projects and driving automation through Gen AI. You will drive the adoption and optimization of our cloud infrastructure and services.

Design, implement, and evolve highly available, scalable, and secure multi-cloud architectures specifically tailored for large language models (LLMs), foundation models, vector databases, prompt engineering environments, fine-tuning, and real-time inference for Gen AI.
Develop infrastructure patterns and frameworks to support the deployment, orchestration, and management of autonomous AI agents, including their interaction with external tools, data sources, and reasoning engines.
Drive the adoption and implementation of advanced IaC to automate the provisioning, configuration, and governance of all AI infrastructure.
Proactively identify bottlenecks and implement innovative strategies for optimizing the performance, cost-efficiency, and resource utilization of high-compute AI workloads across all cloud providers.
Define and enforce stringent security architectures, data governance policies, and compliance frameworks for sensitive AI data, models, and agent interactions (e.g., data privacy, responsible AI principles).
Partner with Data Engineering to design and optimize data pipelines for large-scale, unstructured, and vector data required for Gen AI model training, fine-tuning, and retrieval-augmented generation (RAG).
Collaborate closely with Data Scientists and ML/Gen AI Engineers to design and implement robust MLOps/Gen AIOps pipelines for continuous integration, continuous delivery (CI/CD), continuous training (CT), and continuous evaluation (CE) of Gen AI models and agents.
Good communication skills.
Good analytical and problem-solving skills.