Role Summary:
We are seeking a Senior DevOps Engineer with 6+ years of hands-on experience designing, automating, and optimizing scalable infrastructure and CI/CD ecosystems. The ideal candidate will lead DevOps initiatives, drive automation strategies, and ensure the high availability, reliability, security, and performance of mission-critical systems.
Key Responsibilities:
1. CI/CD Architecture & Automation Strategy
- Drive DevSecOps integration, including automated security, compliance, and quality gates.
- Increase deployment frequency and reliability, and reduce lead time for changes, through automation best practices.
2. Infrastructure as Code (IaC) & Automation
- Lead infrastructure provisioning and management using Terraform, Ansible, or similar IaC tools.
- Design modular, reusable, and scalable IaC frameworks.
- Implement infrastructure versioning, policy enforcement, and environment standardization.
3. Monitoring, Observability & Incident Management
- Architect and implement comprehensive monitoring solutions using Prometheus, Grafana, or similar tools.
- Establish logging, tracing, and observability frameworks.
- Define SLIs, SLOs, and SLAs to ensure system reliability.
- Lead root cause analysis (RCA) and incident response processes.
- Proactively identify and mitigate performance bottlenecks and reliability risks.
4. Security & Compliance Leadership
- Implement DevSecOps principles across CI/CD pipelines.
- Manage secrets, IAM roles, RBAC policies, and encryption strategies.
- Lead vulnerability management and compliance audits.
Required Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field.
- 6+ years of experience in DevOps, Site Reliability Engineering (SRE), or Platform Engineering roles.
- Advanced expertise in CI/CD tools such as Jenkins and GitLab CI/CD.
- Strong hands-on experience with Docker and Kubernetes in production environments.
- Deep knowledge of Infrastructure as Code tools (Terraform, Ansible).
- Strong Linux/Unix system administration experience.
- Experience implementing monitoring solutions such as Prometheus and Grafana.
- Proven experience with cloud platforms (AWS, Azure, or GCP).
Preferred Qualifications:
- Strong scripting/programming skills (Bash, Python, or Go).
- Experience with microservices and distributed system architectures.
- Familiarity with service mesh technologies (Istio, Linkerd).