AWS Infra

5-7 Years

Job Description

Generative AI Platform Support Engineer Responsibilities:

Provide technical support for our AI platform, focusing on cloud infrastructure integration, deployment, and ongoing maintenance.
Work closely with cross-functional teams to troubleshoot technical issues, implement platform enhancements, monitor system performance, and ensure the platform runs efficiently and effectively.
Leverage expertise in AWS cloud administration and infrastructure management to support platform operations and ensure optimal system performance.

Key Responsibilities:

Assess and enhance the AI platform's cloud infrastructure and data pipeline resilience using AWS and cloud-based technologies.
Ensure scalability and fault tolerance of AI/ML models within cloud environments.
Identify and resolve bottlenecks in model inference and training pipelines, focusing on performance and resource optimization.
Optimize cloud resource utilization on AWS for real-time use cases, including AI model deployment.
Collaborate with the DevOps team to improve cloud deployment processes and manage AWS infrastructure.
Implement automated testing that simulates failures to verify fault tolerance and ensure high availability.
Provide ongoing technical support for users of the Generative AI platform, troubleshooting issues and responding to queries to ensure seamless operations.
Monitor cloud platform performance on AWS, identifying and implementing optimization strategies to improve cost efficiency and scalability.
Work with AWS cloud services (e.g., EC2, S3, Lambda, VPC) to ensure proper configuration management and performance.
Document key processes, issues, and solutions for knowledge sharing and future reference.
Stay up to date with industry trends in Generative AI, cloud technologies, and AWS cloud administration.