Responsibilities
- Build and manage infrastructure platforms to improve the flow of value through the developer experience.
- Collaborate across teams and lead technical discussions with clients and internal teams.
- Mentor junior team members through mob-pairing/1:1 pairing sessions.
- Adapt to different roles and technologies as project needs evolve.
- Work proactively and communicate clearly, both in collaboration with others and in writing.
- Use methodologies like Agile and Lean to manage projects and deliverables.
- Document technical solutions and knowledge for team reference.
- Stay current with industry trends and emerging technologies.
Requirements
- Experience managing large-scale infrastructure systems.
- Experience as a reliability engineer is a plus.
- Proficiency in at least one programming language.
- Good understanding of software delivery principles.
- Technical agility: Ability to adapt to different roles and technologies when needed.
Required Technical Skills
- You have prior experience building internal platforms, either infrastructure platforms or business-focused platforms such as a notification service.
- You have a good understanding of Continuous Delivery concepts and experience with at least one tool such as Jenkins, GitHub Actions, or GitLab CI.
- You have a good understanding of application reliability and can enable product teams to build and emit the right application telemetry.
- You have a good understanding of observability systems and incident management.
- You have an excellent understanding of the principles of distributed systems.
- You are familiar with the Cloud-Native ecosystem and tools like Prometheus, OpenTelemetry, Envoy, etc.
- You have an in-depth understanding of containers and container orchestrators such as Nomad and Kubernetes.
- You have a good understanding of SQL and NoSQL databases, along with best practices of using and managing large-scale database clusters.
- You are familiar with best practices for architecting cloud environments, with a focus on workload reliability, cost, and security.
- You have a good grasp of Linux environments, virtualization and networking.
- You have a strong understanding of Infrastructure as Code and have worked with tools such as Terraform, Pulumi, CloudFormation, or AWS CDK.
- You have hands-on experience with configuration management tools such as Ansible and Chef.
This job was posted by Vibha from Infraspec.