Innodata India Private Limited

Site Reliability Engineer-SRE

  • Posted a day ago

Job Description

Site Reliability Engineer (SRE): AWS, Kubernetes (EKS), CI/CD

Location: Remote | Experience: 4-6 years

Role Summary

We are looking for an SRE to ensure the reliability, scalability, and automation of AWS and Kubernetes platforms. Responsibilities include CI/CD, infrastructure provisioning, monitoring, incident management, and collaboration with engineering teams to deliver secure, high-availability systems.

Key Responsibilities

  • Reliability & Ops: Manage availability, performance, SLIs/SLOs, and incident response; improve MTTR.
  • Kubernetes (EKS): Operate EKS clusters, deployments (Helm/Kustomize), autoscaling, ingress, and security policies.
  • CI/CD: Build and maintain pipelines (GitHub Actions/Jenkins/GitLab), enforce best practices, and manage releases.
  • IaC & AWS: Provision infrastructure using Terraform/CloudFormation; manage AWS services (EKS, EC2, IAM, VPC, RDS, S3, etc.) and cost optimization.
  • Observability: Implement monitoring, logging, and tracing (Prometheus, Grafana, CloudWatch).
  • Security: Enforce IAM, WAF, encryption, and vulnerability management.
  • Collaboration: Maintain runbooks, SOPs, and work with teams on design and deployment.

Must-Have Skills

  • AWS (EKS, EC2, IAM, VPC, etc.)
  • Kubernetes (production experience)
  • CI/CD pipelines & release management
  • Observability (Prometheus, Grafana)
  • Terraform (or CloudFormation/CDK)
  • Linux, networking, Python/Bash

Good-to-Have

  • Service mesh (Istio/Linkerd), GitOps (Argo CD/Flux)
  • Observability tools such as Datadog or New Relic
  • WAF tuning, DB basics (Postgres/MySQL)
  • High-volume data systems experience

More Info

Job ID: 145402201