About The Role
We are seeking a highly experienced and technically strong Senior AWS DevOps Engineer to join our engineering team. The ideal candidate will have deep expertise in AWS cloud infrastructure, enterprise-scale CI/CD automation, Infrastructure as Code (IaC), Kubernetes orchestration, and production reliability engineering.
You will play a critical role in architecting scalable cloud platforms, leading DevOps transformation initiatives, improving deployment automation, strengthening security and observability practices, and ensuring high availability of mission-critical systems across multiple environments.
This role requires strong hands-on expertise, strategic thinking, leadership capabilities, and the ability to collaborate across development, QA, security, and infrastructure teams in a fast-paced Agile environment.
What You'll Do
Cloud Infrastructure & Architecture
- Design, build, and manage highly scalable, secure, and resilient AWS cloud infrastructure
- Lead cloud modernization and infrastructure automation initiatives across multiple environments and regions
- Implement high-availability, disaster recovery, backup, and failover strategies for production workloads
- Architect and manage multi-account AWS environments using AWS Organizations and IAM best practices
- Optimize infrastructure for scalability, performance, reliability, and cost efficiency
DevOps & CI/CD Engineering
- Design and manage enterprise-grade CI/CD pipelines using Jenkins, CircleCI, GitHub Actions, AWS CodePipeline, CodeBuild, and CodeDeploy
- Build automated deployment workflows supporting microservices and containerized applications
- Implement blue-green, canary, and rolling deployment strategies
- Standardize release management and environment consistency across development, staging, and production
- Integrate automated testing, security scanning, and quality gates into CI/CD pipelines
Infrastructure as Code & Automation
- Lead Infrastructure as Code implementations using Terraform, Terragrunt, AWS CloudFormation, and AWS CDK
- Develop reusable Terraform modules and automated provisioning frameworks
- Automate infrastructure lifecycle management, configuration management, and environment provisioning
- Implement GitOps and infrastructure version control best practices
Containerization & Kubernetes
- Manage and optimize containerized workloads using Docker and Kubernetes (EKS)
- Deploy and maintain Kubernetes clusters with Helm charts and advanced orchestration strategies
- Implement auto-scaling, ingress controllers, service mesh, and container security best practices
- Troubleshoot Kubernetes networking, storage, and cluster performance issues
Monitoring, Observability & Reliability
- Implement enterprise observability solutions using Prometheus, Grafana, CloudWatch, OpenSearch, ELK Stack, and AWS X-Ray
- Define monitoring, logging, alerting, and incident management standards
- Perform root cause analysis and drive resolution of critical production incidents
- Improve system reliability, uptime, and operational efficiency through automation and proactive monitoring
Security & Compliance
- Implement DevSecOps practices and cloud security standards across infrastructure and pipelines
- Manage IAM policies, secrets management, encryption, and compliance controls
- Integrate security tools such as SonarQube, Snyk, Trivy, or Checkov into CI/CD pipelines
- Ensure infrastructure compliance with organizational and industry security standards
Collaboration & Leadership
- Mentor junior DevOps engineers and provide technical leadership across teams
- Collaborate closely with development, QA, architecture, and security teams
- Drive DevOps best practices, automation culture, and continuous improvement initiatives
- Participate in architecture reviews, capacity planning, and technical decision-making
Qualifications
- 7+ years of hands-on experience in DevOps, Cloud Infrastructure, and Site Reliability Engineering
- Strong expertise in AWS cloud services including EC2, ECS, EKS, Lambda, VPC, Route 53, RDS, S3, CloudFront, IAM, and CloudWatch
- Expert-level experience with Terraform, Terragrunt, Ansible, and Infrastructure as Code practices
- Extensive experience building and managing CI/CD pipelines using Jenkins, CircleCI, GitHub Actions, and AWS services such as CodePipeline, CodeBuild, and CodeDeploy
- Strong experience with Kubernetes (EKS), Docker, Helm, and container orchestration
- Proficiency in scripting and programming using Bash, Python, and Golang
- Strong Linux system administration and troubleshooting skills
- Experience implementing monitoring and observability solutions using Prometheus, Grafana, ELK/OpenSearch, and AWS monitoring tools
- Deep understanding of distributed systems, microservices architecture, and cloud-native applications
- Experience with GitOps, release automation, and deployment strategies
- Strong understanding of networking, DNS, load balancing, SSL/TLS, and security best practices
- Proven experience handling production incidents, root cause analysis, and reliability engineering
- Excellent analytical, communication, and stakeholder management skills
- Strong ownership mindset and ability to work independently in fast-paced environments
Preferred Skills (Nice-to-Have)
- AWS Certified DevOps Engineer – Professional or AWS Certified Solutions Architect certification
- Experience with GitOps-based deployment models using tools such as Argo CD or Flux CD
- Hands-on experience with Redis, Memcached, Kafka, or RabbitMQ
- Experience working with SQL and NoSQL databases such as PostgreSQL, MySQL, MongoDB, or DynamoDB
- Exposure to service mesh technologies like Istio or Linkerd
- Experience with Nexus, Artifactory, SonarQube, or security scanning tools
- Familiarity with multi-cloud environments (Azure/GCP)
- Experience working in Agile/Scrum environments
- Knowledge of FinOps and cloud cost optimization practices
- Exposure to SRE principles, SLIs, SLOs, and error budgets