  • Posted a day ago

Job Description

Job Scope:

Yotta is building a family of advanced AI platforms, from large language model infrastructure to no-code AI builders and autonomous agent orchestration tools.

As a Senior DevOps Engineer, you will own the reliability, scalability, and automation of Yotta's platform infrastructure. You will design and operate CI/CD pipelines, cloud-native environments, and production monitoring systems that enable engineering and AI teams to ship safely, frequently, and at scale.

This role sits at the intersection of software engineering, infrastructure, and operations, with a strong focus on automation, resilience, and cost efficiency.

Key Responsibilities:

A. Infrastructure Automation & Cloud Operations

Design, build, and maintain scalable infrastructure across cloud (AWS/GCP/Azure) and on-prem environments.

Implement Infrastructure as Code (IaC) using tools such as Terraform, Ansible, or equivalent.

Manage Kubernetes clusters, container orchestration, and node-level optimizations.

Ensure high availability, fault tolerance, and disaster recovery readiness.

B. CI/CD & Release Engineering

Design and maintain CI/CD pipelines for backend, frontend, and AI workloads.

Automate build, test, security scanning, and deployment processes.

Implement safe deployment strategies such as blue-green, canary, and rolling releases.

Partner with engineering teams to improve release velocity without compromising stability.

C. Observability, Reliability & Incident Management

Build and maintain observability stacks for metrics, logs, and traces (Prometheus, Grafana, ELK, Datadog, etc.).

Define SLIs, SLOs, and SLAs for critical services.

Lead incident response, root-cause analysis, and post-incident reviews.

Proactively identify and resolve reliability and performance bottlenecks.

D. Security, Compliance & Governance

Implement security best practices across infrastructure, CI/CD, and runtime environments.

Manage secrets, access controls, and identity management securely.

Support compliance requirements such as SOC 2, ISO 27001, GDPR, and India's DPDP Act.

Collaborate with security and compliance teams on audits and risk mitigation.

E. Cost Optimization & Platform Efficiency

Monitor and optimize infrastructure costs across compute, storage, networking, and AI workloads.

Implement auto-scaling, resource quotas, and cost-aware scheduling.

Provide visibility into infra usage and cost drivers for leadership and product teams.

F. Cross-Functional Collaboration & Enablement

Work closely with backend, frontend, AI/ML, and MLOps teams to support production workloads.

Create reusable infrastructure templates and DevOps best practices.

Mentor junior DevOps or platform engineers through reviews and guidance.

Act as a technical advisor on scalability, reliability, and deployment strategies.

Qualification Criteria:

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

8+ years of experience in DevOps, Site Reliability Engineering, or Platform Engineering roles.

Proven experience operating production systems with real uptime and SLA commitments.

Strong hands-on experience with Docker, Kubernetes, and container ecosystems.

Deep experience with cloud platforms (AWS, GCP, or Azure).

Proficiency in Terraform, Ansible, Helm, or similar tooling.

Experience with Git-based CI/CD systems (GitHub Actions, GitLab CI, Jenkins, Argo CD).

Experience with monitoring, logging, and tracing stacks.

Knowledge of networking, load balancing, and CDN architectures.

Strong scripting skills (Bash, Python, or equivalent).

Hands-on experience managing large-scale distributed systems.

Experience in SaaS, cloud platforms, or AI infrastructure environments is a major plus.

Interested candidates can share their updated resume at [Confidential Information]

Regards,

YOTTA

Job ID: 143294497