Description
We are seeking a hands-on Senior AWS Cloud Engineer to manage and optimize a multi-account, multi-environment AWS infrastructure. The ideal candidate will have deep expertise in DevOps, FinOps, SecOps, and infrastructure automation for large-scale production systems. This role requires end-to-end ownership of AWS infrastructure activities including provisioning, governance, monitoring, cost optimization, and security compliance.
Key Responsibilities
Cloud Infrastructure & Workload Modernization
- Deploy and manage services: ECS, EKS, EC2, Lambda (API Gateway + Step Functions), AWS Batch, RDS, ALB, S3.
- Execute workload migrations (e.g., EC2 to ECS/EKS, Windows Server 2012 to 2019, SUSE SLES upgrades to 15 SP5).
- Architect and manage scalable serverless and container-based platforms.
Networking & Availability
- Design/manage VPCs, subnets, route tables, NAT/Internet gateways, Transit Gateway.
- Resolve subnet IP exhaustion and optimize cross-region network scalability.
- Integrate a CDN; re-architect WAF rules to minimize internet exposure and secure internal access.
Security & Governance
- Implement advanced IAM strategies (permission boundaries, stale key automation, MFA enforcement).
- Apply SCPs, AWS Config Rules, Security Hub, GuardDuty, WAF across all accounts.
- Enforce governance on tagging, backup, and resource scheduling through IaC and automation.
Automation & Infrastructure as Code (IaC)
- Manage infrastructure with CloudFormation (including nested stacks).
- Automate compliance checks, patching, utilization reports, and resource cleanup.
- Build serverless workflows using Lambda, Step Functions, and event-driven orchestration.
Observability & Monitoring
- Centralize logging via CloudWatch and export to S3 + Athena for cost-efficient querying.
- Manage 1800+ CloudWatch alarms with proper SNS routing and ownership mapping.
- Extend observability using Grafana and Prometheus for infrastructure and application insights.
FinOps & Cost Optimization
- Drive cost governance via Budgets, Anomaly Detection, Cost Explorer.
- Transition from Reserved Instances to 100% utilized Savings Plans (>95% coverage).
- Implement lifecycle policies (ECR/S3), automate cleanup of orphaned resources, and present cost reports.
Backup & Disaster Recovery
- Replace scripts with centralized AWS Backup policies across accounts.
- Define structured backup tiers (daily, weekly, monthly, yearly).
- Design and test DR strategies with RPO
Knowledge Sharing & Documentation
- Develop runbooks, tagging standards, dashboards, and compliance playbooks.
- Mentor junior DevOps/support engineers on operations, automation, and cost hygiene.
Must-Have Skills
- 4+ years hands-on AWS experience with ECS, EKS, Lambda, EC2, RDS, VPC, CloudWatch, ALB.
- Proficient in CloudFormation (nested stacks) and Step Functions state machine design.
- Automation using AWS Lambda, Step Functions, Python, Bash.
- Security: IAM, Config Rules, WAF, GuardDuty, Security Hub.
- Cost optimization: Budgets, Anomaly Detection, Savings Plans, lifecycle policies.
- Monitoring tools: CloudWatch, Athena, Prometheus, Grafana.
- CI/CD and Jenkins pipelines; deep understanding of the AWS networking stack (VPC, TGW, IGW, NACLs, SGs).
Good-to-Have Skills
- AWS Batch automation and complex CloudFormation templates.
- Advanced Grafana dashboards and Prometheus exporters.
- Performance tuning for Aurora and RDS.
- OS-level knowledge of both Windows and Linux.
- Task scheduling using DynamoDB, SSM, and Lambda orchestration.
Preferred Certifications
- AWS Certified Solutions Architect - Associate / Professional
- AWS Certified DevOps Engineer - Professional
- AWS Certified Security - Specialty
(ref:hirist.tech)