Site Reliability Engineer (SRE)

Hitachi Ltd.

Experience: 0-2 years
Posted: 2 months ago

Job Description

**Technical Skills:**
- Proven experience with web conferencing platforms.
- Strong knowledge of Linux/Unix systems and network protocols.
- Proficiency in scripting languages such as Python, Bash, or similar.
- Experience with cloud platforms (AWS, Azure, or Google Cloud).
- Familiarity with containerization technologies (Docker, Kubernetes).

**Tools and Automation:**
- Experience with monitoring and logging tools (Prometheus, Grafana, ELK stack).
- Knowledge of CI/CD pipelines and related tools (Jenkins, GitLab CI).
- Expertise in Infrastructure as Code (IaC) using tools like Terraform, Ansible, or Puppet.

**Soft Skills:**
- Strong problem-solving skills and attention to detail.
- Excellent communication and teamwork abilities.
- Ability to work independently and manage time effectively during night shifts.

**Work Conditions:**
- Must be available to work night shifts.
- On-call rotation as required.

Job Responsibilities:

**System Monitoring and Incident Response:**
- Monitor the health and performance of the Connect platform.
- Respond to incidents and troubleshoot issues to minimize downtime.
- Implement and manage alerting systems to proactively identify potential issues.

**Performance Optimization:**
- Analyze and improve system performance, reliability, and scalability.
- Conduct root cause analysis for incidents and implement preventive measures.
- Optimize infrastructure and application performance.

**Automation and Tooling:**
- Develop and maintain automation scripts to improve operational efficiency.
- Create and enhance tools for system management and monitoring.
- Implement Infrastructure as Code (IaC) using tools like Terraform, Salt, or similar.

**Maintenance and Upgrades:**
- Perform regular system maintenance, including patching and upgrades.
- Manage backups, disaster recovery, and data integrity measures.
- Ensure compliance with security standards and best practices.

**Collaboration and Documentation:**
- Collaborate with development, QA, and other operational teams to enhance system reliability.
- Document procedures, configurations, and troubleshooting guides.
- Participate in on-call rotations and handover processes.

Last Updated: 12-07-2024 10:01:09 AM