Senior Cloud/DevOps Engineer (AWS)_Offshore

7-10 Years
  • Posted 11 hours ago

Job Description

Job Summary

Photon is seeking a Cloud / Site Reliability Engineer (SRE) to design, build, and operate highly available, scalable, and resilient cloud platforms supporting large-scale e-commerce, loyalty, and digital engagement ecosystems in the Quick Service Restaurant (QSR) domain.

This role focuses on platform reliability, cloud infrastructure, automation, observability, and operational excellence. You will work closely with backend, frontend, and product teams to ensure systems are secure, performant, cost-efficient, and production-ready, enabling seamless customer experiences across ordering, payments, loyalty, and engagement channels.

Key Responsibilities

Cloud Infrastructure & Platform Engineering

  • Design, build, and manage scalable, secure, and highly available cloud infrastructure on AWS.
  • Support cloud-native architectures using EC2, Lambda, ECS/EKS, API Gateway, ALB/NLB, VPC, IAM, and CloudFront.
  • Implement multi-environment cloud setups following best practices for isolation, security, and reliability.
  • Drive capacity planning, performance tuning, and cloud cost optimization (FinOps).

Site Reliability Engineering (SRE)

  • Define and implement SRE best practices, including SLIs, SLOs, SLAs, and error budgets.
  • Ensure high availability through auto-scaling, failover strategies, and disaster recovery planning.
  • Lead incident response, root cause analysis (RCA), and postmortems, driving reliability improvements.
  • Build and maintain runbooks and operational playbooks.

Automation, Git Pipelines & CI/CD

  • Design, implement, and maintain Git-based CI/CD pipelines using tools such as GitHub Actions, GitLab CI, or Bitbucket Pipelines.
  • Implement Infrastructure as Code (IaC) using Terraform to provision and manage cloud resources.
  • Enable progressive delivery strategies including blue-green deployments, canary releases, and feature flags.
  • Automate operational tasks to reduce toil and improve deployment reliability and speed.

Observability & Monitoring

  • Build and operate production-grade observability solutions, including monitoring, logging, alerting, and distributed tracing.
  • Implement dashboards, alerts, and telemetry using OpenTelemetry and cloud-native monitoring tools.
  • Partner with engineering teams to ensure applications are observable by default.

Security & Compliance

  • Enforce cloud security best practices, including IAM, secrets management, encryption, and network security.
  • Support security compliance, vulnerability management, and audit readiness.

Platform & QSR Support

  • Ensure platform reliability during QSR peak traffic periods, promotions, and loyalty campaigns.
  • Collaborate closely with backend engineers, architects, QA, product owners, and DevOps teams.
  • Act as a trusted advisor on cloud reliability, scalability, and operational readiness.

Required Qualifications

  • 7-10 years of experience in Cloud Engineering, DevOps, or Site Reliability Engineering roles.
  • Strong hands-on experience with AWS cloud services and cloud-native architectures.
  • Proven experience designing and maintaining Git-based CI/CD pipelines (GitHub Actions, GitLab CI, Bitbucket Pipelines, or similar).
  • Strong experience with Infrastructure as Code (Terraform preferred).
  • Hands-on experience with monitoring, logging, alerting, and observability platforms.
  • Solid understanding of Linux systems, networking, and distributed systems fundamentals.
  • Experience supporting high-availability, high-traffic production systems.
  • Strong troubleshooting, incident management, and problem-solving skills.

Nice to Have

  • Experience in QSR, retail, hospitality, or large-scale consumer digital platforms.
  • Exposure to Kubernetes (EKS) and container orchestration.
  • Experience supporting event-driven architectures (Kafka, SQS, SNS, MSK).
  • Familiarity with database reliability and scaling (DynamoDB, Cosmos DB, RDS).
  • Experience with cost optimization (FinOps) practices.
  • Knowledge of cloud security, compliance frameworks, and governance.
  • Track record of contributing to continuous improvement in engineering practices and delivery quality.

About Company

Photon, a global leader in digital transformation services and IT consulting, works with 40% of Fortune 100 companies as their digital agency of choice. Photon Infotech Private Limited is an information technology and services company based in OMR, Chennai, Tamil Nadu, India.

Job ID: 144140815