

As a Site Reliability Engineer - Application Support, you will:
We are looking for someone with:
● 5-7 years of hands-on experience in SRE, DevOps, or Application Support roles, preferably in high-availability production environments
● Linux Administration: Strong experience with Linux systems, proficiency in shell scripting for automation, system monitoring, and troubleshooting
● Kubernetes: Hands-on experience managing Kubernetes clusters, troubleshooting pod issues, analyzing logs, configuring deployments, and understanding networking concepts
● AWS Cloud Services: Working knowledge of AWS services (EC2, S3, RDS, Lambda, CloudWatch, ECS, etc.) with experience in troubleshooting and optimizing cloud infrastructure
● Infrastructure as Code: Experience with Terraform or similar tools for provisioning and managing cloud resources
● Monitoring & Observability: Practical experience with APM tools (Dynatrace or similar), Grafana for dashboard creation, and log analysis using Elasticsearch/Kibana
● Database Management: Experience with Redis for caching solutions and Oracle databases, including basic PL/SQL querying and performance troubleshooting
● CI/CD Tools: Familiarity with GitLab, Jenkins, Argo CD, or similar CI/CD platforms for deployment automation
● Scripting & Programming: Proficiency in shell scripting; knowledge of Python or other scripting languages is a plus
● Incident Management: Experience with ServiceNow or similar ITSM tools, understanding of ITIL framework for incident, problem, and change management
● SRE Principles: Understanding of SLIs, SLOs, SLAs, error budgets, and capacity planning concepts
● Problem-Solving Skills: Strong analytical and troubleshooting abilities with attention to detail
● Communication Skills: Ability to collaborate effectively with cross-functional teams and document technical processes clearly
● Education: Bachelor's degree in Computer Science, Information Technology, or equivalent practical experience
The following aspects would be a plus:
Job ID: 149363725
Skills:
ELK, CloudFormation, Prometheus, Bash, Grafana, Jenkins, GCP, Terraform, Ansible, Kubernetes, Python, AWS, OpenTelemetry