Senior Site Reliability Engineer

Karix

Bengaluru, India

Fresher

Posted 21 hours ago

Job Description

We are looking for an experienced Senior Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our production systems. The ideal candidate will have strong troubleshooting skills and hands-on experience with messaging queues, in-memory queues, Kubernetes, and deployment automation, along with expertise in Infrastructure as Code and microservices architecture.

Key Responsibilities

Application Troubleshooting: Diagnose and resolve complex application issues in production environments.
Queue Management: Work with messaging queues (Kafka, RabbitMQ) and in-memory queues (Redis) to maintain system performance.
Deployment & Automation: Manage deployments using CI/CD pipelines and automation tools.
Kubernetes Administration: Maintain and optimize Kubernetes clusters for high availability and scalability.
Production Support: Provide support for critical production systems, ensuring uptime and reliability.
Monitoring & Alerting: Implement and maintain monitoring solutions (Prometheus, Grafana, ELK stack).
Incident Management: Lead root cause analysis and post-mortem reviews for production incidents.

Must-Have Skills

Strong experience in troubleshooting application issues in distributed systems.
Hands-on experience with messaging queues (Kafka, RabbitMQ) and in-memory queues (Redis).
Proficiency in Kubernetes and container orchestration.
Experience with CI/CD pipelines and deployment automation.
Solid understanding of Linux systems, networking, and cloud platforms (AWS, Azure, or GCP).
Infrastructure as Code experience (Terraform, Ansible).
Knowledge of microservices architecture.
Strong scripting and automation skills (Python, Bash, or similar).
Database expertise: working experience with MySQL, Oracle, or MongoDB.

Nice-to-Have

Experience integrating with WhatsApp Business Messaging APIs.
Experience with security best practices in production environments.
Familiarity with observability tools and performance tuning.