Job description
- Building software to support DevOps: overseeing the building and implementation of tools and services that help Engineering improve agile development and delivery and drive deeper reliability into our production systems.
- Being comfortable and effective in a customer-facing role
- Adding automation and context to alerts, leading to better real-time collaborative response from technical responders; updating runbooks, tools, and documentation to help prepare Engineering for future incidents.
- Monitoring processes throughout the entire lifecycle for adherence, and updating or creating processes to drive improvement and minimize waste
- Understanding customer requirements and project KPIs
- Setting up tools and required infrastructure
- Incident management and root cause analysis
- Coordinating and communicating within the team and with customers
- Selecting and deploying appropriate CI/CD tools
- Striving for continuous improvement and building continuous integration, continuous delivery, and continuous deployment pipelines
- Mentoring and guiding the team members
- Defining scalable, reliable platform architecture and patterns.
- Translating requirements into robust solutions; conducting architectural reviews.
- Ensuring scalability and reliability through strong technical processes.
- Optimizing cloud services, containers, and monitoring tools.
- Automating provisioning, scaling, and configuration management.
- Aligning with product and engineering teams; communicating solutions clearly.
- Building partnerships with vendors and open-source communities.
JOB REQUIREMENTS/SKILLS:
- Possess 4+ years of experience in a DevOps position.
- Cloud Expertise: Advanced knowledge of AWS (EC2, S3, EBS, VPC, ELB, AMI, SNS, RDS, IAM, Route 53) and Azure (Azure VM, SSE, Azure VNet, Azure Monitor).
- Containerization & Infrastructure: Proficient with Docker, Kubernetes, Helm
- Infrastructure as Code: Terraform / OpenTofu, AWS CDK, Azure Bicep and Pulumi
- Experience with monitoring and logging tools such as Prometheus, VictoriaMetrics, Loki, Grafana or ELK stack
- Programming & Scripting: Skilled in Python and Bash, with strong system design and architecture expertise.
- DevOps & CI/CD: Experience with agile methodologies, GitOps, and tools like GitLab CI and GitHub Actions.
- Linux & Open Source: Deep understanding of Linux OS, open-source ecosystems, and scalable system automation.
- Good understanding of networking protocols: DNS, HTTP, SSL/TLS, SMTP, TCP
- Communication & Collaboration: Strong leadership, interpersonal, and decision-making skills, with a focus on driving results and improvement.
- Experience with security best practices for data at rest and data in transit using AWS tools such as SSE-KMS, security policies, and cryptographic protocols
- Proactive approach to learning about and adapting to new technologies
- Knowledge of agile software development and DevOps philosophies
- Experience with Jira and Confluence