We are hiring a Senior DevOps Engineer with 8+ years of overall experience and 5+ years of hands-on Kubernetes expertise. The ideal candidate should be highly proficient in AWS cloud, CI/CD, containerization, Kubernetes cluster management, and infrastructure automation.
Must-Have Skills
- 8+ years of DevOps / SRE / Infrastructure Engineering experience
- 5+ years of hands-on experience with Kubernetes
- Expertise in deploying and managing Kubernetes clusters (cloud-managed and on-prem)
- Strong experience in Kubernetes Ingress setup and management
- Strong proficiency in AWS, including EKS, Route 53, CloudFront, IAM, VPC, Load Balancers, S3, ECR, CloudWatch
- Experience with CI/CD tools like Jenkins, GitHub Actions, GitLab CI, CircleCI
- Experience with Docker, NGINX, Terraform, PostgreSQL, Redis, and Kafka
- Strong Linux/Ubuntu administration skills
- Proven scripting ability with Bash, and optionally Python or Go
- Solid understanding of networking (DNS, firewalls, routing, SSL/TLS, VPNs) and cloud security
- Comfortable with Git workflows and version control best practices
- Experience managing and securing container registries (Docker Hub, GitHub Container Registry, GitLab Container Registry, etc.)
Responsibilities
- Design, deploy, and manage Kubernetes clusters, both cloud-managed (EKS/AKS/GKE) and on-premises
- Implement and manage CI/CD pipelines using tools like GitLab CI, GitHub Actions, Jenkins, or CircleCI
- Manage and secure VPNs, preferably using WireGuard, for secure remote access and internal communications
- Configure and maintain container registries, including Docker Hub, GitHub Container Registry, and self-hosted/private registries
- Deploy and manage databases including PostgreSQL, Redis, and TimescaleDB
- Write and maintain Bash scripts for automation and system administration on Ubuntu-based systems
- Deploy and manage Apache Kafka for distributed event streaming and real-time data pipelines
- Use NGINX as a reverse proxy, load balancer, and/or ingress controller, including TLS/SSL management
- Build infrastructure using Infrastructure as Code (IaC) tools such as Terraform or Pulumi
- Set up monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK, Datadog)
- Collaborate with developers, QA, and security teams to ensure reliable, secure, and scalable deployments
- Support cloud infrastructure (AWS, Azure, GCP) and contribute to cost optimization and environment standardization
- Participate in on-call rotations, and troubleshoot and resolve production incidents