Job Title: Network Reliability Engineer (NRE)
Location: Singapore
Experience: 7-12 Years
Job Summary
We are looking for an experienced Network Reliability Engineer (NRE) with strong expertise in hybrid and cloud networking environments. The ideal candidate should have hands-on experience across enterprise data center, cloud, network security, automation, and observability platforms.
This role focuses on improving reliability, scalability, automation, and operational efficiency of network services using NRE/SRE principles, Infrastructure as Code (IaC), CI/CD practices, and API-driven integrations.
The candidate should also have hands-on experience in network automation using Ansible Automation Platform.
Key Responsibilities
Network Reliability & Operations
- Manage and improve reliability, availability, and performance of hybrid and cloud-connected network services.
- Apply Network Reliability Engineering (NRE) and Site Reliability Engineering (SRE) principles to reduce operational overhead and improve service resilience.
- Design, implement, and optimize highly available network architectures across on-premises and cloud environments.
- Perform advanced troubleshooting and root cause analysis for complex network and security incidents.
- Build automation and self-healing workflows for Day-1, Day-2, and Day-N operational activities.
- Define and improve SLIs, SLOs, and error budgets for network services.
- Collaborate with infrastructure, cloud, security, and application teams to support operational and business objectives.
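For candidates less familiar with the SLO and error-budget terminology above, the arithmetic can be sketched as follows. This is only an illustration: the function names, the 30-day window, and the 99.9% target are assumptions chosen for the example, not targets defined by this role.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows per window.

    slo_target is a fraction, e.g. 0.999 for "99.9% available".
    The 30-day default window is an assumption for illustration.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the fraction of the error budget still unspent.

    A negative result means the budget for the window is exhausted.
    """
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # Hypothetical service: 99.9% SLO with 10.8 minutes of downtime so far.
    print(round(error_budget_minutes(0.999), 1))
    print(round(budget_remaining(0.999, 10.8), 2))
```

A team would typically slow down risky changes as the remaining fraction approaches zero.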
Network Automation (Ansible)
- Develop and maintain Ansible playbooks, roles, and inventories for multi-vendor network automation.
- Automate network configuration management, backups, compliance validation, provisioning, and firmware upgrades.
- Convert operational procedures and runbooks into reusable Infrastructure-as-Code (IaC) workflows.
- Integrate Ansible automation with monitoring tools, APIs, and ITSM platforms for automated remediation and faster incident response.
- Implement configuration compliance and drift detection across network infrastructure.
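As a rough illustration of the drift detection mentioned above, the sketch below compares an intended configuration with a device's running configuration using Python's standard `difflib`. The function name `config_drift` and the sample configuration lines are hypothetical. In practice the comparison would more likely be driven from Ansible or a dedicated compliance tool.

```python
import difflib


def config_drift(intended: str, running: str) -> list[str]:
    """Return the lines that differ between intended and running config.

    Lines prefixed with "-" are in the intended config but missing from the
    device; lines prefixed with "+" are on the device but not intended.
    An empty list means no drift was detected.
    """
    diff = list(difflib.unified_diff(
        intended.splitlines(),
        running.splitlines(),
        fromfile="intended",
        tofile="running",
        lineterm="",
    ))
    # The first two entries are the "---"/"+++" file headers; skip them and
    # keep only added or removed lines, dropping context and hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Hypothetical configs for illustration only.
    intended = "hostname sw1\nntp server 10.0.0.1\n"
    running = "hostname sw1\nntp server 10.0.0.9\n"
    for line in config_drift(intended, running):
        print(line)
```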
Hybrid, Cloud & Data Center Networking
- Support enterprise data center and hybrid cloud network environments.
- Manage Cisco enterprise and data center networking platforms including:
  - Cisco ACI
  - Cisco IOS-XE
  - Cisco NX-OS
  - Wireless LAN Controllers (WLC)
- Support AWS networking services including:
  - VPC
  - Routing
  - Security Groups
  - NACLs
  - Transit Gateway
  - VPN
  - Direct Connect
- Support Azure networking services including:
  - VNets
  - UDRs
  - NSGs
  - VPN Gateway
  - ExpressRoute
  - Azure Firewall
- Manage hybrid connectivity solutions including MPLS, VPN, internet connectivity, and remote access solutions.
Network Security & Proxy Platforms
- Administer and support enterprise firewall platforms including:
  - Check Point
  - Palo Alto
  - Cisco Firepower (FTD)
- Support firewall policy lifecycle management and automation platforms.
- Manage proxy and secure web gateway technologies.
- Support cloud-delivered security and Zero Trust Network Access (ZTNA) solutions for hybrid workforce environments.
Application Delivery & Traffic Management
- Support and manage F5 application delivery and load balancing platforms.
- Work with F5 modules including:
  - LTM
  - GTM
  - APM
- Design resilient application delivery solutions across on-premises and cloud platforms.
Monitoring, Observability & Telemetry
- Monitor hybrid and cloud network services using enterprise monitoring tools.
- Build dashboards, alerts, and observability solutions.
- Support network assurance, telemetry, device profiling, and compliance monitoring.
Automation, CI/CD & Engineering
- Develop and maintain network automation solutions using Ansible and Python.
- Integrate Git-based version control and CI/CD pipelines for infrastructure and network changes.
- Use REST APIs, SDKs, CLI tools, and GUIs for platform integrations.
- Develop automation tools, dashboards, or services using Python and web frameworks where required.
- Implement Infrastructure as Code (IaC) and automated validation processes.
Required Skills
- Strong experience in enterprise networking and network reliability engineering.
- Hands-on experience with hybrid and cloud networking technologies.
- Strong knowledge of routing, switching, network security, and application delivery.
- Experience with network automation and Infrastructure as Code.
- Good understanding of monitoring, observability, and incident management practices.
- Strong troubleshooting, analytical, and communication skills.
Preferred Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Relevant certifications are an advantage, such as:
  - CCNP / CCIE
  - AWS or Azure Networking certifications
  - Palo Alto / Check Point certifications
  - F5 certifications
Work Environment
- Participation in on-call support rotations and major incident response activities.
- Collaboration with regional and global teams supporting enterprise network infrastructure.
- Exposure to CI/CD-driven operational and infrastructure management practices.