
Trantor

System Engineer

3-5 Years

Job Description

Job Title: SysEng - Network Engineer (SONiC)

Experience - 3+ years

Shift - 7 Pm to 3 Am

Position Overview

Looking for a network engineer with experience in datacenter environments and at least light programming experience.

● SONiC programmatic iterative configuration (gnmi/yang, swss) experience required

● SONiC base configuration (L2, mclag, lag/portchannel, bgp, bfd, etc) experience preferred

● FRR experience preferred (OSPF)

● OpenGear experience nice to have

● Light systems programming language (c, c++, golang, rust, etc) experience preferred, stronger experience nice to have

● Scripting language (python, bash, etc) experience required

● Linux administration (bash, systemd units, general system navigation) experience preferred

● Virtual networking (VXLAN) experience preferred

● AWS networking (vpc, direct connect) nice to have

Task Expectations

● Programmatic iterative configuration of SONiC switches (yang/gnmi, swss, etc). Expected experience/abilities:

○ Has used the above previously to configure, or can trivially identify how to implement, CRUD operations (or at least CRD) against constructs such as but not limited to: physical and sub interfaces, vxlan/vni, vrf, acl

■ At minimum must provide correctly functioning examples

○ Functionality will ultimately be written in golang. Network engineer is merely expected to identify, document, and demonstrate interface functionality for SingleStore team to implement. Network engineer being able to implement the CRUD/CRD functionality as a library/module in golang would be a plus but is not expected.

■ Actual virtual networking control plane implementation is expected to be responsibility of SingleStore team. Network engineer contributing here would be high value, freeing up team to focus on storage implementation.
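As an illustration of the "correctly functioning examples" asked for above, here is a minimal sketch of CRUD semantics against VRF and VXLAN constructs. It runs against an in-memory dictionary and does not talk to a switch. The table and field names (`VRF`, `VXLAN_TUNNEL`, `VXLAN_TUNNEL_MAP`, `src_ip`, `vni`, `vlan`) are assumed to follow SONiC CONFIG_DB conventions, and the entry names (`Vrf-tenant1`, `vtep1`) and addresses are hypothetical. A real demonstration would issue the equivalent gNMI or swss operations.

```python
import json

# Hypothetical in-memory stand-in for a switch configuration store.
# Table/field names are assumed to mirror SONiC CONFIG_DB conventions.


def create(db, table, key, fields):
    """Add a new entry; refuse to silently overwrite an existing one."""
    entries = db.setdefault(table, {})
    if key in entries:
        raise KeyError(f"{table}|{key} already exists")
    entries[key] = dict(fields)


def read(db, table, key):
    """Return a copy of an entry, or None if it is absent."""
    entry = db.get(table, {}).get(key)
    return None if entry is None else dict(entry)


def update(db, table, key, fields):
    """Merge fields into an existing entry; fail if it does not exist."""
    entries = db.get(table, {})
    if key not in entries:
        raise KeyError(f"{table}|{key} does not exist")
    entries[key].update(fields)


def delete(db, table, key):
    """Remove an entry; deleting a missing entry is a no-op (idempotent)."""
    entries = db.get(table, {})
    entries.pop(key, None)
    if table in db and not db[table]:
        del db[table]  # drop empty tables so the store stays tidy


db = {}
create(db, "VRF", "Vrf-tenant1", {})
create(db, "VXLAN_TUNNEL", "vtep1", {"src_ip": "10.0.0.1"})
create(db, "VXLAN_TUNNEL_MAP", "vtep1|map_1000_Vlan100",
       {"vni": "1000", "vlan": "Vlan100"})
update(db, "VRF", "Vrf-tenant1", {"vni": "5000"})
delete(db, "VXLAN_TUNNEL_MAP", "vtep1|map_1000_Vlan100")
print(json.dumps(db, indent=2, sort_keys=True))
```

Keeping create strict and delete idempotent is one possible choice for iterative configuration, since a retried delete should not fail while a duplicate create usually signals a bug in the caller.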

● Base configuration of SONiC switches

○ Inband L2, mclag, lag/portchannel, bgp, bfd

■ Aside: Need to set up anycast addresses for metadata service ip, SAG, etc

○ Bifurcated spine-leaf topology inband

■ Each side of aisle has 2x spines to be mclag'd

■ Each spine has 2x connections to each other spine to be lag'd

■ Each spine has 2x connections to each ToR/leaf on same side to be lag'd

■ Each side of aisle has private AS

■ ToR-compute node connections are breakouts, ToR-storage node connections are standard

○ Spine-leaf topology out of band

■ Each side of aisle has 1x spine

■ Each spine has connection to each ToR/leaf on both sides

■ Currently each side of aisle has private AS

● Can be argued should be single AS

■ Currently hardcoded L3

● Spine model in use has insufficient resources for unified dhcp stack - had to settle on model due to tariff season. isc dhcpd usable.

● Server BMCs previously static IP'd and/or infinite lease'd via DHCP by vendor, require crash carting/manual full reset in order to DHCP

● Switch management ports physically connected but not currently configured to be reachable via OOB network

● PDU mgmt ports do not DHCP, require on-site troubleshooting to bring into network

○ Coordinate w/devops on switch integration with monitoring

○ Transition from ad-hoc to code-driven base configuration

■ Coordinate w/devops on switch provisioning (ZTP or otherwise)

■ Coordinate w/devops on SONiC build pipeline
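As a starting point for code-driven base configuration, the inband topology described above can be enumerated programmatically. The sketch below lists the LAGs implied by those bullets. It assumes "each other spine" means every other spine on either side of the aisle, and the device names, side labels, and leaf count per side are hypothetical, since the posting does not state them.

```python
from itertools import combinations


def inband_lags(leaves_per_side, sides=("A", "B"), spines_per_side=2, members=2):
    """Return (endpoint_a, endpoint_b, member_links) for every inband LAG.

    Assumptions (not confirmed by the posting): spine-to-spine LAGs exist
    between every pair of spines, including across sides of the aisle.
    """
    spines = {
        side: [f"spine-{side}{i}" for i in range(1, spines_per_side + 1)]
        for side in sides
    }
    leaves = {
        side: [f"leaf-{side}{i}" for i in range(1, leaves_per_side + 1)]
        for side in sides
    }

    lags = []
    # 2x connections between each pair of spines, bundled as one LAG per pair.
    all_spines = [s for side in sides for s in spines[side]]
    for a, b in combinations(all_spines, 2):
        lags.append((a, b, members))
    # 2x connections from each spine to each ToR/leaf on the same side.
    for side in sides:
        for spine in spines[side]:
            for leaf in leaves[side]:
                lags.append((spine, leaf, members))
    return lags


# Hypothetical example: two leaves per side of the aisle.
for a, b, n in inband_lags(leaves_per_side=2):
    print(f"{a} <-> {b}: {n} member links")
```

A list like this could feed whatever templating or provisioning step is eventually chosen, so that portchannel definitions are generated rather than typed by hand.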

● Palo Alto Firewall configuration remediation

○ Transition from ad-hoc to code-driven configuration

○ Ensure multipath functioning correctly

■ Firewall rules engine appears to favor single source interface for all src/dst resulting in erroneous packet drops

■ Ensure upstream egress/ingress A/P functioning correctly

● Will need to work with network team of colocation vendor providing IP transit to remedy IP transit only having one functioning leg at present

○ May require on-site work/coordination

● Ensure direct connect multipath correctly working

○ Ensure no overly eager security features negatively impacting legitimate traffic (session drops/throttling, unreasonable latency impacts, etc - currently see 200ms hit on some traffic)

○ Ensure no unlicensed security features enabled (I believe dnssec is currently erroneously enabled)

○ NAT public ips for use

○ Interzone traffic rules currently permissive to avoid issues in early deployment - more mature tiered scheme necessary for long term

○ Coordinate w/devops on firewall integration with monitoring

● Console network setup

○ Physical topology of 2x OpenGear OM2224 spines and 16x OpenGear IM7248 ToRs

○ Current state is spines routable, providing loop for firewall mgmt ports, cellular not active, ToRs lack ethernet routing, all end-device access currently through nested console sessions

○ FRR experience required, OpenGear experience bonus

○ OpenGear cellular fallback does not play nicely with multipath - destroys routing when triggered - cellular fallback should be manually implemented, simple systemd timer with heartbeats over various paths is all that's needed

○ Set up direct to end-device serial console via ssh (existing feature, just needs to be config'd after ethernet routing is set up)

○ Set up standardized versions for IM7248s, OM2224s

○ Set up standardized credentials for IM7248s + OM2224s

○ Need business cellular plan for OM2224s - previously used prepaid consumer (not an option - AT&T allows OpenGears but bars Palo Alto traffic on consumer plan). Verizon 4G coverage in colo area is inconsistent. One of T-Mobile's 4G bands is unsupported by OM2224's modem.

■ At least 10GB/mo on each, 50+ preferred - It's for emergency access, there is a world where catastrophic failure requires recovery involving pushing images over these connections, hitting plan limit in an emergency is the last thing we should ever deal with

○ Will need to work with colocation vendor to coordinate antenna extension installation on roof to ensure reliable cell signal a
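The manually implemented cellular fallback mentioned above (a systemd timer with heartbeats over various paths) could be driven by logic along these lines. This is a minimal sketch, not a tested implementation for OpenGear devices. The path names, thresholds, and the idea of requiring several consecutive results before switching are assumptions. `probe` relies on the Linux iputils `ping` options `-c`, `-W`, and `-I`, and the actual enabling or disabling of the cellular route is deliberately left out.

```python
import subprocess
from dataclasses import dataclass


def probe(target, interface, timeout_s=2):
    """Send one ICMP heartbeat out a specific interface (Linux iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), "-I", interface, target],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


@dataclass
class FallbackState:
    cellular_active: bool = False
    bad_runs: int = 0   # consecutive timer runs with every wired path down
    good_runs: int = 0  # consecutive timer runs with at least one path up


def step(state, path_results, fail_after=3, recover_after=5):
    """Compute the next state from one round of heartbeat results.

    path_results maps a path name to True (heartbeat answered) or False.
    Thresholds are hypothetical and add hysteresis so one lost packet
    does not flip routing.
    """
    if any(path_results.values()):
        good, bad = state.good_runs + 1, 0
    else:
        good, bad = 0, state.bad_runs + 1

    active = state.cellular_active
    if not active and bad >= fail_after:
        active = True    # all wired paths down long enough: fall back
    elif active and good >= recover_after:
        active = False   # wired connectivity stable again: fall forward
    return FallbackState(active, bad, good)


# Simulated timer runs with hypothetical path names (no packets are sent).
state = FallbackState()
history = [{"transit-a": False, "transit-b": False}] * 3 \
        + [{"transit-a": True, "transit-b": False}] * 5
for results in history:
    state = step(state, results)
    print(results, "-> cellular_active =", state.cellular_active)
```

Because a systemd timer starts a fresh process on each run, the state would need to be persisted between runs, for example in a small file. That part is omitted here.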


Job ID: 147214889
