At goML, we design and build cutting-edge Generative AI, AI/ML, and Data Engineering solutions that help businesses unlock the full potential of their data, drive intelligent automation, and create transformative AI-powered experiences. Our mission is to bridge the gap between state-of-the-art AI research and real-world enterprise applications – helping organizations innovate faster, make smarter decisions, and scale AI solutions seamlessly.
We're looking for a DevOps Engineer with strong cloud expertise (AWS & Azure) and hands-on experience in container orchestration, CI/CD automation, and Infrastructure as Code (IaC). In this role, you'll help design, implement, and optimize the secure, scalable, and efficient cloud infrastructure that powers our AI/ML and GenAI workloads. If you thrive in fast-paced engineering environments and love automating infrastructure at scale, we'd love to hear from you!
Why You, Why Now
As enterprises rapidly adopt AI, the need for reliable, secure, and automated infrastructure is growing just as quickly. This role is perfect for someone who loves solving cloud challenges, building robust DevOps pipelines, and enabling engineering teams to ship high-quality products quickly and confidently.
What You'll Do (Key Responsibilities)
First 30 Days: Foundation & Orientation
- Deep dive into goML's AI/ML & GenAI pipelines and DevOps architecture
- Familiarize yourself with our AWS & Azure environments, CI/CD workflows, and containerized infrastructure
- Review current deployment processes and identify improvement opportunities
- Shadow engineering teams to understand environment, release, and automation needs
First 60 Days: Execution & Impact
- Design, deploy, and manage cloud infrastructure using:
  - AWS: ECS, EKS, Lambda, EC2, VPC, S3, API Gateway
  - Azure: AKS, Virtual Machines, Azure Functions, Virtual Network, Blob Storage, API Management
- Support ML/AI workloads using:
  - AWS: Bedrock, SageMaker
  - Azure: Azure Machine Learning, Azure OpenAI Service
- Build and optimize CI/CD pipelines using tools such as Jenkins, GitHub Actions, AWS CodePipeline, and Azure DevOps
- Implement IaC using Terraform, AWS CDK, Azure Bicep, or CloudFormation
- Automate infrastructure tasks using Python/Bash scripts
- Enhance monitoring, alerting, and logging using CloudWatch, Azure Monitor, Application Insights, and other observability tools
- Collaborate closely with developers to streamline deployments and integrations
First 180 Days: Ownership & Transformation
- Own and evolve the DevOps architecture for large-scale AI/ML deployments across a multi-cloud environment (AWS & Azure)
- Optimize infrastructure for performance, resilience, and cost efficiency
- Improve Kubernetes & container orchestration standards (EKS & AKS)
- Strengthen cloud governance, compliance, and security posture
- Build automated workflows to reduce manual ops and accelerate delivery
- Troubleshoot complex production issues and drive long-term stability improvements
What You Bring (Qualifications & Skills)
Must-Have
- 3+ years of experience in DevOps engineering
- Strong hands-on experience with AWS and/or Azure cloud platforms
  - AWS services: ECS, EKS, Lambda, EC2, VPC, API Gateway, Load Balancers, S3, CloudWatch
  - Azure services: AKS, Azure Functions, Virtual Machines, VNet, API Management, Blob Storage, Azure Monitor
- Proficiency with IaC tools: Terraform, AWS CDK, Azure Bicep, or CloudFormation
- Strong knowledge of Docker & Kubernetes
- Hands-on experience building CI/CD pipelines (Jenkins, GitHub Actions, AWS-native tools, Azure DevOps)
- Scripting experience using Python or Bash
- Solid understanding of cloud security, monitoring, and logging
- Strong troubleshooting skills and effective communication
Nice-to-Have
- AWS Certified DevOps Engineer - Professional
- AWS Certified Solutions Architect
- Microsoft Certified: DevOps Engineer Expert
- Microsoft Certified: Azure Solutions Architect Expert
- Certified Kubernetes Administrator (CKA)
- Experience with performance tuning and production-grade Kubernetes clusters
Why Work With Us
- Remote-first, with offices in Coimbatore for in-person collaboration
- Work on cutting-edge AI/ML & GenAI cloud challenges at scale
- Direct ownership of DevOps systems powering enterprise AI deployments
- Competitive salary and career growth opportunities