Deployment Guide
Switch Hardware Selection
We have verified and therefore recommend using the switch model listed in Aether-verified Switch Hardware. Other Stratum-enabled switches listed in White Box Switch Hardware should also work in theory but more integration work may be required.
To use the P4 UPF, you must use fabric switches based on the Intel (formerly Barefoot) Tofino chipset. There are two variants of this switching chipset, with different resources and capabilities. The Dual Pipe Tofino ASIC is less expensive, while the Quad Pipe Tofino ASIC has more chip resources and a faster embedded system with more memory and storage.
The P4 UPF and SD-Fabric features run within the constraints of the Dual Pipe system for production deployments, but for development of features in P4, the larger capacity of the Quad Pipe is desirable.
These switches feature 32 QSFP28 ports capable of running in 100GbE, 40GbE, or 4x 10GbE mode (using a split DAC or fiber cable) and have a 1GbE management network interface.
See also Rackmount of Equipment for how the fabric switches should be rack-mounted to ensure proper airflow within a rack.
Deployment Overview
SD-Fabric is released as Helm charts and container images. We recommend using Kubernetes and Helm to deploy SD-Fabric. Here is a list of the high-level steps required to deploy SD-Fabric:
1. Provision switches
We first need to install an operating system with Docker and Kubernetes on the bare-metal switches.
2. Prepare switches as special Kubernetes nodes
Kubernetes labels and taints are used to configure switches as special Kubernetes worker nodes. This makes sure that we deploy Stratum (and only Stratum) on switches.
3. Prepare ONOS network configuration
Network configuration defines properties such as switch pipeconf, subnet and VLAN.
4. Prepare Stratum chassis configuration for each switch
Chassis config defines switch properties such as port speed and breakout.
5. Install SD-Fabric using Helm
Finally, we install SD-Fabric with the information we prepared in Steps 1 to 4.
Step 1: Provision Switches
We use the Open Network Install Environment (ONIE) to install an Open Network Linux (ONL) image on the switch. To work with the SD-Fabric environment, we have customized the ONL image to include the required packages and dependencies.
The image source can be found in the ONF repository opennetworkinglab/OpenNetworkLinux. You can also download pre-compiled artifacts from the GitHub Releases page.
Note
If you're not familiar with the ONIE/ONL environment, see Getting Started to learn how to install an ONL image on an ONIE-supported switch.
Below is an example of how to install the ONL image.
1. Prepare a server that the switch can reach, then download the pre-compiled installer from the release page and serve it over HTTP.
wget https://github.com/opennetworkinglab/OpenNetworkLinux/releases/download/v1.4.3/ONL-onf-ONLPv2_ONL-OS_2021-07-16.2159-5195444_AMD64_INSTALLED_INSTALLER -O onie-installer
sudo python3 -m http.server 80
2. Reboot the switch to enter ONIE installation mode
To reinstall an ONL image, you must change the ONIE bootloader to "Rescue Mode".
Once the switch is powered on, it should obtain an IP address on the OpenBMC interface via DHCP. Here we use 10.0.0.131 as an example. OpenBMC uses these default credentials:
username: root
password: 0penBmc
Log in to OpenBMC with SSH:
$ ssh root@10.0.0.131
The authenticity of host '10.0.0.131 (10.0.0.131)' can't be established.
ECDSA key fingerprint is SHA256:...
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.0.131' (ECDSA) to the list of known hosts.
root@10.0.0.131's password:
root@bmc:~#
Using the Serial-over-LAN console, enter ONL:
root@bmc:~# /usr/local/bin/sol.sh
You are in SOL session.
Use ctrl-x to quit.
-----------------------
root@onl:~#
Note
If sol.sh is unresponsive, try restarting the mainboard from the BMC shell with:
root@bmc:~# wedge_power.sh reset
Change the boot mode to rescue mode and reboot:
root@onl:~# onl-onie-boot-mode rescue
[1053033.768512] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[1053033.936893] EXT4-fs (sda3): re-mounted. Opts: (null)
[1053033.996727] EXT4-fs (sda3): re-mounted. Opts: (null)
The system will boot into ONIE rescue mode at the next restart.
root@onl:~# reboot
At this point, ONL will go through its shutdown sequence and ONIE will start. If it does not start right away, press the Enter/Return key a few times; it may show you a boot selection screen. Pick ONIE and Rescue if given a choice.
3. Install the ONL installer
Now that the switch is in rescue mode, run the onie-nos-install command with the URL of the installer on the management server (here we use 10.0.0.129 as an example) on the management network segment:
ONIE:/ # onie-nos-install http://10.0.0.129/onie-installer
discover: Rescue mode detected. No discover stopped.
ONIE: Unable to find 'Serial Number' TLV in EEPROM data.
Info: Fetching http://10.0.0.129/onie-installer ...
Connecting to 10.0.0.129 (10.0.0.129:80)
installer 100% |*******************************| 322M 0:00:00 ETA
ONIE: Executing installer: http://10.0.0.129/onie-installer
installer: computing checksum of original archive
installer: checksum is OK
...
The installation will now start, and then ONL will boot, culminating in:
Open Network Linux OS ONL-wedge100bf-32qs, 2020-11-04.19:44-64100e9
localhost login:
The default ONL login is:
username: root
password: onl
If you log in, you can verify that the switch is getting its IP address via DHCP:
root@localhost:~# ip addr
...
3: ma1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:90:fb:5c:e1:97 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.130/25 brd 10.0.0.255 scope global ma1
...
4. (Optional) Set up the switch IP address and hostname after the installation if DHCP is not available
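As a rough illustration of the optional last step, the sketch below assigns a static management address and a hostname with standard iproute2 and hostname commands. The interface name ma1 and the address 10.0.0.130/25 come from the example output above; the gateway 10.0.0.129 and the hostname leaf1 are assumptions made for illustration only. The settings are applied at runtime and do not persist across reboots; making them persistent depends on how your ONL image manages its network configuration.

```shell
#!/bin/sh
# Sketch only: the values below are examples or assumptions, adjust them to
# your own management network.
MGMT_IF="${MGMT_IF:-ma1}"               # management interface shown by `ip addr`
MGMT_ADDR="${MGMT_ADDR:-10.0.0.130/25}" # example address from this guide
MGMT_GW="${MGMT_GW:-10.0.0.129}"        # assumed default gateway
SWITCH_NAME="${SWITCH_NAME:-leaf1}"     # assumed hostname

# RUN defaults to `echo`, so the script only prints the commands.
# Invoke it with RUN= (empty) as root on the switch to execute them.
RUN="${RUN-echo}"

configure_mgmt() {
  $RUN ip addr add "$MGMT_ADDR" dev "$MGMT_IF"
  $RUN ip link set "$MGMT_IF" up
  $RUN ip route add default via "$MGMT_GW" dev "$MGMT_IF"
  $RUN hostname "$SWITCH_NAME"
}

configure_mgmt
```

Printing the commands first makes it easy to review them over the serial console before changing the only interface you may be connected through.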
Warning
If you came from the Aether docs, stop here, return to Post-ONL configuration, and continue the remaining steps there. Otherwise, continue with the rest of this page.
Step 2: Configure switches as special Kubernetes nodes
Our ONL image includes all packages required to run Kubernetes on the switch. Once Kubernetes is ready, the Stratum application is deployed to the switch to manage it.
Unlike servers, switches have limited CPU and memory resources, so we should avoid deploying unnecessary workloads on them. In addition, the Stratum application should be deployed to all switches, and only to switches.
To achieve these goals, apply the following configuration to your Kubernetes cluster:
Set up a label on all switch nodes, e.g. node-role.kubernetes.io=switch
Set up a taint with the NoSchedule effect on all switch nodes, e.g. node-role.kubernetes.io=switch:NoSchedule
Properly configure the NodeSelector and Toleration when deploying Stratum via a DaemonSet
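The label and taint described above can be applied with the standard kubectl label and kubectl taint commands. The following is a minimal sketch, assuming two switch nodes named leaf1 and leaf2 (the names used in this guide's example cluster). It prints the commands by default rather than running them.

```shell
#!/bin/sh
# Assumed node names; replace with the names of your switch nodes.
SWITCH_NODES="${SWITCH_NODES:-leaf1 leaf2}"
# Dry run by default: commands are printed, not executed.
# Run with KUBECTL=kubectl to apply them to the current cluster.
KUBECTL="${KUBECTL:-echo kubectl}"

prepare_switch_nodes() {
  for node in $SWITCH_NODES; do
    # Label: lets a node selector pick out the switch nodes.
    $KUBECTL label nodes "$node" node-role.kubernetes.io=switch --overwrite
    # Taint: keeps pods without a matching toleration off the switch.
    $KUBECTL taint nodes "$node" node-role.kubernetes.io=switch:NoSchedule --overwrite
  done
}

prepare_switch_nodes
```

The --overwrite flag makes the script safe to re-run on nodes that already carry the label or taint.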
Example of a five-node Kubernetes cluster with two switches and three servers:
$ kubectl get node -o custom-columns=NAME:.metadata.name,TAINT:.spec.taints
NAME       TAINT
compute1   <none>
compute2   <none>
compute3   <none>
leaf1      [map[effect:NoSchedule key:node-role.kubernetes.io value:switch]]
leaf2      [map[effect:NoSchedule key:node-role.kubernetes.io value:switch]]
$ kubectl get nodes -lnode-role.kubernetes.io=switch
NAME    STATUS   ROLES    AGE   VERSION
leaf1   Ready    worker   27d   v1.18.8
leaf2   Ready    worker   27d   v1.18.8
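For reference, the NodeSelector and Toleration that pair with the label and taint shown in this step map onto two standard Kubernetes pod-spec fields. The sketch below prints an illustrative fragment. It is not the actual template used by the Stratum chart, and the function name is hypothetical.

```shell
#!/bin/sh
# Prints an illustrative pod-spec fragment (not the Stratum chart's template).
switch_scheduling_fragment() {
  cat <<'EOF'
# Goes under spec.template.spec of a DaemonSet.
nodeSelector:
  node-role.kubernetes.io: switch      # schedule only on labelled switch nodes
tolerations:
  - key: node-role.kubernetes.io       # tolerate the switch taint
    operator: Equal
    value: switch
    effect: NoSchedule
EOF
}

switch_scheduling_fragment
```

The node selector restricts the pods to switch nodes, while the toleration allows them past the NoSchedule taint that keeps every other workload away.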
Step 3: Prepare ONOS network configuration
See Network Configuration for instructions.
Step 4: Prepare Stratum chassis configuration
See Stratum Chassis Configuration for instructions.
Step 5: Install SD-Fabric with Helm
To install SD-Fabric into your Kubernetes cluster, follow the instructions in the SD-Fabric Helm Chart README.
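The general shape of a Helm 3 installation is sketched below. The release name, namespace, chart reference, and values file name are all placeholders rather than values taken from the SD-Fabric README. The README remains the authoritative source for the chart location and the structure of the values file. The script prints the command by default.

```shell
#!/bin/sh
# All names below are placeholders, not values from the SD-Fabric README.
RELEASE="${RELEASE:-sdfabric}"
NAMESPACE="${NAMESPACE:-sdfabric}"
CHART="${CHART:-<chart-reference-from-README>}"
VALUES="${VALUES:-my-values.yaml}"   # values file you assemble per the README
# Dry run by default; run with HELM=helm to perform the installation.
HELM="${HELM:-echo helm}"

install_sd_fabric() {
  $HELM install "$RELEASE" "$CHART" \
    --namespace "$NAMESPACE" --create-namespace \
    --values "$VALUES"
}

install_sd_fabric
```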