Designation: Senior CI/CD & Cloud Operations Engineer
Location: Kolkata (On-site)
Seniority: Senior (Individual Contributor; expected to lead by influence)
Team: Platform / Cloud Engineering / DevOps
Role Type: Full-time
Role Summary
We are looking for a Senior CI/CD & Cloud Operations Engineer to own and continuously improve our AWS-based deployment and runtime operations.
You will be responsible for designing and managing reliable CI/CD pipelines, provisioning and maintaining multiple environments (Dev/QA/Staging/Prod), setting up monitoring and alerting aligned to SLI/SLO/SLA, and driving production support, backups, and disaster recovery (BCM/DR).
This is a critical role directly impacting delivery speed, platform stability, and system uptime.
Key Responsibilities
1. CI/CD & Release Engineering
- Build, operate, and enhance CI/CD pipelines for build, testing, security scanning, artifact management, and deployment
- Standardize release processes (versioning, approvals, rollbacks, blue/green or canary deployments)
- Automate deployments across microservices, APIs, jobs, and infrastructure components
2. AWS Environment & Platform Operations
- Provision and maintain AWS environments (Dev/QA/Staging/Prod) using Infrastructure as Code
- Manage environment lifecycle including account structure, IAM, networking basics, and secrets management
- Ensure configuration consistency across environments (drift control, tagging, cost visibility)
3. Observability & SLO/SLA Management
- Implement monitoring and alerting using Datadog and/or AWS tools (CloudWatch, X-Ray, etc.)
- Define and track SLIs/SLOs (availability, latency, error rates, throughput)
- Build dashboards, runbooks, and incident response playbooks
- Conduct post-incident reviews (RCA and corrective actions)
4. Production Support, Backup & DR
- Support and improve 24/7 production incident management
- Own backup strategies, restore drills, retention policies, and access controls
- Design and test BCM/DR plans (RTO/RPO, failover processes, DR drills)
5. Security & Compliance (DevSecOps)
- Integrate security checks into CI/CD pipelines (SAST, DAST, dependency scanning, secrets scanning)
- Implement least privilege access and audit-compliant deployment practices
- Collaborate with engineering teams to enhance system security and reduce risks
Must Have
- Strong hands-on experience with AWS (compute, networking basics, IAM, monitoring)
- Proven ownership of CI/CD pipelines and automated deployments
- Infrastructure as Code experience (Terraform preferred / CloudFormation acceptable)
- Experience with observability tools (Datadog / AWS CloudWatch metrics, logs, alerts)
- Incident management experience aligned to SLAs/SLOs (on-call, RCA, remediation)
- Backup, restore, and disaster recovery (BCM/DR) experience
- Strong Linux and scripting skills (Bash/Python)
- Ability to collaborate with developers to improve deployment reliability
Should Have
- Experience with containerization and orchestration (Docker, ECS/EKS/Kubernetes)
- Knowledge of deployment strategies (blue/green, canary, feature flags, rollback mechanisms)
- Experience with secrets management (AWS Secrets Manager / Parameter Store)
- Understanding of APM, distributed tracing, and log aggregation
- Experience defining operational standards (runbooks, SRE practices, error budgets)
- Awareness of AWS cost optimization (tagging, budgets, rightsizing)
Nice to Have
- Experience with CI/CD tools (GitLab CI, Jenkins, GitHub Actions, etc.)
- Advanced Datadog experience (APM, synthetics, RUM, service maps)
- Exposure to compliance, audit readiness, or regulated environments
- Experience with SaaS platforms or multi-tenant systems
- Experience setting up centralized logging/monitoring frameworks
Experience & Qualifications
- 6-10+ years of experience in DevOps / CI/CD / Cloud Ops / SRE roles
- Bachelor's degree in Engineering/Computer Science (or equivalent practical experience)