System Integration Test (TestON)

Each component of SD-Fabric has its own testing infrastructure for component-level tests. However, we also need a way to test all the components together. For SD-Fabric, we use a framework called TestON to write these tests. TestON manipulates the components via CLI sessions over SSH or via REST APIs. More information on how TestON works can be found in the TestON Guide. The dataplane for these tests is either an emulated network running in Mininet using software switches, or hardware pods located in our lab. We use a variety of traffic generation tools running on the hosts to create traffic in the fabric.

The current topology of the hardware pod used for integration tests is a pair of leaf switches, a dual-homed management server, and 3 compute nodes. One compute node is dual-homed, one is dual-homed with a second link to the second leaf switch, and one is single-homed to the second leaf switch.

../_images/paired-leaves-pod.svg
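To make the homing arrangement concrete, the sketch below models the pod as a mapping from each host to the leaf at the far end of each of its fabric links, then works out which hosts stay attached to the fabric when one leaf fails. The host and leaf names are hypothetical, and the exact link layout of the dual-homed hosts (in particular the extra link on `compute2`) is an assumed reading of the description above rather than the lab's actual wiring.

```python
# Hypothetical model of the paired-leaves pod; all names are illustrative.
# Each host maps to the leaf switch at the far end of each fabric link.
POD_LINKS = {
    "mgmt": ["leaf1", "leaf2"],               # dual-homed management server
    "compute1": ["leaf1", "leaf2"],           # dual-homed compute node
    "compute2": ["leaf1", "leaf2", "leaf2"],  # assumed: dual-homed plus an extra link to leaf2
    "compute3": ["leaf2"],                    # single-homed to the second leaf
}


def homing(links):
    """Classify a host by the number of distinct leaves it attaches to."""
    return "dual-homed" if len(set(links)) > 1 else "single-homed"


def hosts_surviving(failed_leaf, pod=POD_LINKS):
    """Return the hosts that still have at least one link to a working leaf."""
    return sorted(
        host
        for host, links in pod.items()
        if any(leaf != failed_leaf for leaf in links)
    )


if __name__ == "__main__":
    for host, links in POD_LINKS.items():
        print(host, homing(links))
    print("leaf2 down:", hosts_surviving("leaf2"))
```

Under this model, losing the second leaf isolates only the single-homed compute node, which is why the switch failure tests treat dual- and single-homed hosts as separate cases.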

We also run soak tests and failure/recovery tests in a 2x2 hardware pod.

../_images/topology-2x2.png

Test Suites

  • Functionality Tests

    • Paired Leaves

      • Basic Connectivity - All hosts can ping each other and their fabric interfaces.

      • Link Failure - Verify a traffic stream fails over when there is a link failure. We test failure of both the source and destination links for dual- and single-homed hosts.

      • Switch Failure - Verify a traffic stream fails over when there is a switch failure. We test several device failure modes for both the source and destination leaves for dual- and single-homed hosts.

    • Bridging - Verify bridging works between different host VLAN configurations.

    • Routing - Verify routing works with different route configurations.

    • INT - Verify INT reports for different packet drop reasons.

    • UPF - Test the ONOS UP4 APIs, such as attachment and detachment of UEs, and verify encapsulation of both upstream and downstream traffic.

    • QOS

      • QOS - Generate traffic flows with different QFIs and verify QOS using port statistics.

      • QOSNonMobile - Generate traffic flows with non-mobile traffic classification and verify QOS using port statistics.

  • Failure/Recovery Tests

    • Link Failure Tests - Set up a flow between hosts connected to different leaves, disable a link between a leaf and a spine used by the flow, then verify that the flow is rerouted to another link and measure how fast this happens.

      • Access Leaf - Disable a link between the access leaf and a spine.

      • Upstream Leaf - Disable a link between the upstream leaf and a spine.

    • Switch Failure Tests - Set up a flow between hosts connected to different leaves, disable a spine used by the flow, then verify that the flow is rerouted to another spine and measure how fast this happens.

      • Switch OS Reboot - Simulate a switch failure by restarting ONL on the switch.

      • Stratum Restart - Simulate a switch failure by killing the Stratum agent on the switch.
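One common way to estimate how fast a flow is rerouted is to send a constant-rate stream across the failure and convert the number of lost packets into time. The sketch below shows only that arithmetic. It is not taken from the TestON suites, and the function name, parameters, and sample numbers are hypothetical.

```python
def recovery_time_ms(sent, received, rate_pps):
    """Estimate the traffic outage in milliseconds from packet loss.

    Assumes the stream was sent at a constant rate (packets per second)
    and that every lost packet was lost during the failover window.
    """
    if rate_pps <= 0:
        raise ValueError("rate_pps must be positive")
    if received > sent:
        raise ValueError("received cannot exceed sent")
    lost = sent - received
    return lost * 1000.0 / rate_pps


if __name__ == "__main__":
    # Illustrative numbers only: 10,000 packets sent at 1,000 pps, 9,950 received.
    print(recovery_time_ms(sent=10_000, received=9_950, rate_pps=1_000))
```

The resolution of this estimate is one packet interval, so a higher send rate gives a finer measurement. Loss unrelated to the failure would inflate the result, which is one reason to check for a clean baseline before disabling the link or switch.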

Test Results

The nightly test results can be found by looking at the SD-Fabric Nightly Tests view on the Aether Jenkins website. The functionality test suites are run on the Paired Leaves Pod and the failure/recovery jobs are run on the Staging Pod.