StackNexus

SysEng - Network Engineer

  • Posted 19 hours ago

Job Description

Position Overview

Looking for a network engineer with experience in datacenter environments and at least light programming experience.

SONiC Programmatic Iterative Configuration (gnmi/yang, Swss) Experience Required

SONiC base configuration (L2, mclag, lag/portchannel, bgp, bfd, etc) experience preferred

FRR Experience Preferred (OSPF)

OpenGear experience nice to have

Light systems programming language (C, C++, Golang, Rust, etc) experience preferred, stronger experience nice to have

Scripting Language (Python, Bash, Etc) Experience Required

Linux administration (Bash, systemd units, general system navigation) experience preferred

Virtual Networking (VXLAN) Experience Preferred

AWS networking (VPC, Direct Connect) nice to have

Task Expectations

  • Programmatic Iterative Configuration of SONiC Switches (yang/gnmi, swss, etc.)

Expected Experience/abilities:

Has used the above previously to configure, or can trivially identify how to implement CRUD operations (or at least CRD) against constructs such as but not limited to:

Physical and sub interfaces

VXLAN/VNI

VRF

ACL

At minimum must provide correctly functioning examples

Functionality will ultimately be written in Golang. A network engineer is merely expected to identify, document, and demonstrate interface functionality for the SingleStore team to implement.

Network engineer being able to implement the CRUD/CRD functionality as a library/module in Golang would be a plus but is not expected

Actual virtual networking control plane implementation is expected to be the responsibility of the SingleStore team

Network engineer contributing here would be high value, freeing up team to focus on storage implementation
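
As a rough illustration of the CRUD/CRD expectation above, the sketch below maps each verb onto the gNMI RPC that would carry it (Get for reads; the delete, replace, and update lists of a Set for writes). It is plain data only and makes no device calls. The `SetRequest` stand-in, the `crud_to_gnmi` helper, the VRF name, and the YANG path are assumptions for illustration; paths must be checked against the YANG models shipped in the target SONiC build. Mapping create to replace rather than update is also just one possible choice.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical path, loosely modeled on community sonic-*.yang naming.
# Verify against the YANG models in the actual SONiC image.
VRF_PATH = "/sonic-vrf:sonic-vrf/VRF/VRF_LIST[name={name}]"


@dataclass
class SetRequest:
    """Plain-data stand-in for a gNMI SetRequest (not a real client object)."""
    delete: list[str] = field(default_factory=list)
    replace: list[tuple[str, Any]] = field(default_factory=list)
    update: list[tuple[str, Any]] = field(default_factory=list)


def crud_to_gnmi(op: str, path: str, value: Any = None):
    """Return (rpc_name, payload) for one CRUD verb against one path."""
    if op == "read":
        return ("Get", [path])
    req = SetRequest()
    if op == "create":
        # Assumption: create overwrites whatever is at the path.
        req.replace.append((path, value))
    elif op == "update":
        # gNMI update merges into existing config at the path.
        req.update.append((path, value))
    elif op == "delete":
        req.delete.append(path)
    else:
        raise ValueError(f"unknown operation: {op}")
    return ("Set", req)


if __name__ == "__main__":
    path = VRF_PATH.format(name="Vrf_example")
    print(crud_to_gnmi("create", path, {"name": "Vrf_example"}))
    print(crud_to_gnmi("delete", path))
```

The same shape would apply to interfaces, VXLAN/VNI, and ACL entries, each with its own path.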

  • Base Configuration of SONiC Switches

Inband Configuration:

L2, mclag, lag/portchannel, bgp, bfd

Need to set up anycast addresses for metadata service IP, SAG, etc

Bifurcated Spine-Leaf Topology (Inband):

Each side of aisle has 2x spines to be mclag'd

Each spine has 2x connections to each other spine to be lag'd

Each spine has 2x connections to each ToR/leaf on same side to be lag'd

Each side of aisle has private AS

ToR-to-compute-node connections use breakouts; ToR-to-storage-node connections are standard

Spine-Leaf Topology (Out Of Band):

Each side of aisle has 1x spine

Each spine has connection to each ToR/leaf on both sides

Currently each side of aisle has private AS

Arguably should be a single AS

Currently hardcoded L3
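
Both topologies above assign a private AS per side of aisle. A minimal sketch of a sanity check for a planned AS assignment, using the private-use ranges reserved by RFC 6996. The `plan` names and numbers are made up for illustration and are not the values in use.

```python
# Private-use AS number ranges reserved by RFC 6996.
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range
    (4200000000, 4294967294),  # 32-bit private range
)


def is_private_asn(asn: int) -> bool:
    """True if the AS number falls in either private-use range."""
    return any(lo <= asn <= hi for lo, hi in PRIVATE_ASN_RANGES)


if __name__ == "__main__":
    # Hypothetical per-side plan, for illustration only.
    plan = {"inband-side-a": 65001, "inband-side-b": 65002}
    for name, asn in plan.items():
        print(name, asn, "private" if is_private_asn(asn) else "NOT private")
```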

Additional Notes:

Spine model in use has insufficient resources for a unified DHCP stack; had to settle on this model due to tariff season (isc dhcpd usable)

Server BMCs previously static IP'd and/or infinite lease'd via DHCP by vendor, require crash carting/manual full reset in order to DHCP

Switch management ports physically connected but not currently configured to be reachable via OOB network

PDU management ports do not DHCP, require on-site troubleshooting to bring into network

Coordination:

Coordinate with DevOps on switch integration with monitoring

Transition from ad-hoc to code-driven base configuration

Coordinate with DevOps on switch provisioning (ZTP or otherwise)

Coordinate with DevOps on SONiC build pipeline

  • Palo Alto Firewall Configuration Remediation

Transition from ad-hoc to code-driven configuration

Multipath & Traffic Handling:

Ensure multipath functioning correctly

Firewall rules engine appears to favor single source interface for all src/dst resulting in erroneous packet drops

Ensure upstream egress/ingress A/P functioning correctly

Will need to work with network team of colocation vendor providing IP transit to remedy IP transit only having one functioning leg at present

May require on-site work/coordination

Direct Connect:

Ensure direct connect multipath correctly working

Ensure no overly eager security features negatively impacting legitimate traffic (session drops/throttling, unreasonable latency impacts; currently a 200 ms hit on some traffic)

Ensure no unlicensed security features enabled (dnssec currently erroneously enabled)

Additional Configuration:

NAT public IPs for use

Interzone traffic rules currently permissive; a more mature tiered scheme is necessary for the long term

Coordinate with DevOps on firewall integration with monitoring

  • Console Network Setup

Physical Topology:

2x OpenGear OM2224 spines

16x OpenGear IM7248 ToRs

Current State:

Spines routable, providing loop for firewall management ports

Cellular not active

ToRs lack ethernet routing

All end-device access currently through nested console sessions

Requirements:

FRR experience required, OpenGear experience bonus

Challenges & Fixes:

OpenGear cellular fallback does not work well with multipath (destroys routing when triggered)

Should be manually implemented using systemd timer with heartbeats over multiple paths
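
One way the heartbeat logic could look, as a sketch under stated assumptions: a script run once per systemd timer tick probes each wired path and enables cellular only after every path has failed for several consecutive ticks, so a single flapping path does not tear down routing. The `probe` and `next_state` helpers, the threshold of 3, and the interface names and target address in the demo are all hypothetical; persisting the failure counter between ticks and the actual switch to cellular are left out.

```python
import subprocess


def probe(interface: str, target: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo out of a specific interface (Linux iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), "-I", interface, target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def next_state(path_ok: dict[str, bool], failures: int, threshold: int = 3):
    """Return (consecutive_failures, use_cellular) after one timer tick.

    Any healthy path resets the counter. Cellular is requested only once
    every path has been down for `threshold` consecutive ticks.
    """
    if any(path_ok.values()):
        return 0, False
    failures += 1
    return failures, failures >= threshold


if __name__ == "__main__":
    # Hypothetical interface names and heartbeat target (documentation address).
    paths = {"net1": "192.0.2.1", "net2": "192.0.2.1"}
    results = {iface: probe(iface, target) for iface, target in paths.items()}
    print(next_state(results, failures=0))
```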

Setup Tasks:

Set up direct-to-end-device serial console via SSH

Configure standardized versions for IM7248s and OM2224s

Set up standardized credentials

Cellular Requirements:

Business cellular plan required for OM2224s

Minimum 10GB/month, 50GB+ preferred

Needed for emergency access and recovery scenarios

Constraints:

AT&T allows OpenGears but blocks Palo Alto traffic on consumer plans

Verizon 4G coverage inconsistent in colo area

One T-Mobile 4G band unsupported by OM2224 modem

Coordination:

Work with colocation vendor for antenna extension installation on roof to ensure reliable cellula

Job ID: 149583431