Slicing and QoS

Overview

Network slicing enables sharing the same physical infrastructure between independent logical networks, each one targeting different use cases while providing isolation and security guarantees. Slicing permits the implementation of tailor-made applications with Quality of Service (QoS) specific to the needs of each slice, rather than a one-size-fits-all approach.

SD-Fabric supports slicing and QoS using dedicated hardware resources such as scheduling queues and meters. Once a packet enters the fabric, it is associated with a slice ID and a Traffic Class (TC). The slice ID is an arbitrary identifier, while the TC determines the QoS parameters. SD-Fabric uses the combination of slice ID and TC to select the switch hardware queue for the packet.

We provide fabric-wide isolation and QoS guarantees. Packets are classified by the first leaf switch in the path. That switch then marks them using a custom DSCP-based scheme, so that all other switches in the path apply the same treatment.
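
To illustrate the marking idea, the sketch below packs a slice ID and a TC into the 6-bit DSCP field and unpacks them again. The 4-bit/2-bit split and the function names are assumptions made for this example; the text above does not specify the actual bit layout.

```python
from typing import Tuple

# Hypothetical layout: upper 4 bits carry the slice ID, lower 2 bits the TC.
SLICE_ID_BITS = 4
TC_BITS = 2


def encode_dscp(slice_id: int, tc: int) -> int:
    """Pack slice ID and TC into a 6-bit DSCP value."""
    if not 0 <= slice_id < (1 << SLICE_ID_BITS):
        raise ValueError("slice_id does not fit in the assumed field width")
    if not 0 <= tc < (1 << TC_BITS):
        raise ValueError("tc does not fit in the assumed field width")
    return (slice_id << TC_BITS) | tc


def decode_dscp(dscp: int) -> Tuple[int, int]:
    """Recover (slice_id, tc) from a DSCP value, as a downstream switch would."""
    if not 0 <= dscp < (1 << (SLICE_ID_BITS + TC_BITS)):
        raise ValueError("dscp must be a 6-bit value")
    return dscp >> TC_BITS, dscp & ((1 << TC_BITS) - 1)
```

With this layout, the first leaf encodes once and every later hop only needs the decode step to pick the same queue.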

Classification can be configured either via REST APIs, for regular traffic, or via PFCP integration, for GTP-U traffic terminated by P4-UPF.

Traffic Classes

We support the following traffic classes, which cover the spectrum of applications from latency-sensitive to throughput-intensive.

Control

For applications demanding ultra-low latency and jitter guarantees, with non-bursty, low throughput requirements on the order of hundreds of packets per second. Examples of such applications are consensus protocols, industrial automation, timing, etc. This class uses a queue shared by all slices, serviced with the highest priority. To enforce isolation between slices, and to avoid starvation of lower priority classes, each slice is processed through a single-rate two-color meter. Slices sending at rates higher than the configured meter rate might observe packet drops.
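
The behavior of a single-rate two-color meter can be sketched as a token bucket. The class below is a simplified software model that counts packets rather than bytes; the class name, the color labels, and the explicit `now` timestamp are assumptions for this example, not the switch implementation.

```python
class TwoColorMeter:
    """Simplified single-rate two-color meter (token bucket, in packets)."""

    def __init__(self, rate_pps: float, burst_pkts: int):
        self.rate_pps = rate_pps
        self.burst_pkts = burst_pkts
        self.tokens = float(burst_pkts)  # bucket starts full
        self.last_ts = 0.0

    def mark(self, now: float) -> str:
        """Return "GREEN" if the packet conforms, "RED" otherwise."""
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last_ts
        self.last_ts = now
        self.tokens = min(self.burst_pkts, self.tokens + elapsed * self.rate_pps)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return "GREEN"
        return "RED"  # out of profile; such packets may be dropped
```

For example, with `TwoColorMeter(rate_pps=100, burst_pkts=10)`, a slice that sends 12 back-to-back packets at the same instant sees the first 10 marked green and the last 2 marked red.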

Real-Time

For applications that require both low-latency and sustained throughput. Examples of such applications are video and audio streaming. Each slice gets a dedicated Real-Time queue serviced in a Round-Robin fashion to guarantee the lowest latency at all times even with bursty senders. To avoid starvation of lower priority classes, Real-Time queues are shaped at a maximum rate. Slices sending at rates higher than the configured one might observe higher latency because of the shaping. Real-Time queues have priority lower than Control, but higher than Elastic.

Elastic

For throughput-intensive applications with no latency requirements. This class is best suited for large file transfers, Intranet/enterprise applications, prioritized Internet access, etc. Each slice gets a dedicated Elastic queue serviced in Weighted Round-Robin (WRR) fashion with configurable weights. During congestion, Elastic queues are guaranteed to receive minimum bandwidth that can grow up to the link capacity if other queues are empty.
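
The effect of WRR weights under congestion can be approximated with a simple proportional-share calculation. This is a sketch of the idealized long-run behavior only: it ignores higher-priority classes and packet-size effects, and the function and parameter names are illustrative.

```python
def elastic_shares(available_bps, weights, backlogged):
    """Split available bandwidth among backlogged Elastic queues by weight.

    weights: dict mapping queue name to its WRR weight.
    backlogged: set of queue names that currently have packets to send.
    """
    active = {q: w for q, w in weights.items() if q in backlogged}
    total = sum(active.values())
    if total == 0:
        return {}
    # Idle queues are skipped, so their share is redistributed to active ones.
    return {q: available_bps * w / total for q, w in active.items()}
```

When every queue is backlogged, each one gets its weighted minimum. When only one queue is backlogged, it gets the whole available bandwidth, which matches the "can grow up to the link capacity" behavior described above.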

Best-Effort

This is the default traffic class, used by packets not classified with any of the above classes. All slices share the same Best-Effort queue, which has the lowest priority.

Classification

Slice ID and TC classification can be performed in two ways.

Regular traffic

We provide ACL-like APIs that support wildcard match rules on the IPv4 5-tuple.

P4-UPF traffic

When using the embedded UPF function, for GTP-U mobile traffic terminated by the fabric, we support integration with PFCP QoS features such as prioritization via QoS Flow Identifier (QFI), Maximum Bitrate (MBR) limits, and Guaranteed Bitrate (GBR).

You can configure a static one-to-one mapping between 3GPP’s QFIs and SD-Fabric’s TCs using the ONOS netcfg JSON file (work-in-progress), while MBR and GBR configurations are translated into meter configurations.

QoS classification uses the same table as GTP-U tunnel termination. For this reason, to achieve fabric-wide QoS enforcement, we recommend enabling the UPF function on each leaf switch using the distributed UPF mode, so that packets are classified as soon as they enter the network.

Support for slicing of mobile traffic is work-in-progress and will be added in the next SD-Fabric release.

Configuration

Note

QoS and slicing are currently configured statically at switch startup. Dynamic configuration will be supported in a future SD-Fabric release.

QoS and slicing use the switch queue configuration provided via the vendor_config portion of the Stratum Chassis Config (see Stratum Chassis Configuration), where the queues and schedulers can be configured. For more information on the format of vendor_config, see the guide for running Stratum on Tofino-based switches in the Stratum repository.

We provide a convenient script to generate the configuration starting from a higher-level description provided via a YAML file. This file lets you configure the parameters for the traffic classes listed in the section above.

Here’s a list of parameters that you can configure via the YAML QoS configuration file:

max_cells: Maximum number of buffer cells, depends on the ASIC SKU/revision.
pool_allocations: Percentage of buffer cells allocated to each traffic class. The sum should be 100. Usually, we leave a portion of the buffer unassigned for queues that do not have a pool (yet). Examples of such queues are those for the recirculation port, CPU port, etc.
```
pool_allocations:
  control: 1
  realtime: 9
  elastic: 80
  besteffort: 9
  unassigned: 1
```
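
As a rough illustration of how the percentages relate to max_cells, the sketch below checks that the allocations sum to 100 and converts them into cell counts. The function name and the rounding-down behavior are assumptions for this example; the configuration script may compute the values differently.

```python
def cells_per_pool(max_cells, pool_allocations):
    """Convert percentage allocations into buffer cell counts."""
    total = sum(pool_allocations.values())
    if total != 100:
        raise ValueError(f"pool_allocations must sum to 100, got {total}")
    # Integer division rounds down, so no pool exceeds its percentage.
    return {name: max_cells * pct // 100 for name, pct in pool_allocations.items()}


# Example with the allocations shown above and a made-up max_cells value.
pools = cells_per_pool(
    200000,  # hypothetical; the real value depends on the ASIC SKU/revision
    {"control": 1, "realtime": 9, "elastic": 80, "besteffort": 9, "unassigned": 1},
)
```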
Control Traffic Class: The available bandwidth dedicated to Control traffic is divided into slots. Each slot has a maximum rate and burst (in packets of the given MTU). A slice can use one or more slots by appropriately configuring meters in the fabric ingress pipeline.
- control_slot_count: Number of slots.
- control_slot_rate_pps: Rate of each slot, in packets per second.
- control_slot_burst_pkts: Number of packets per burst of each slot.
- control_mtu_bytes: MTU of packets for the PPS and burst values.
```
control_slot_count: 50
control_slot_rate_pps: 100
control_slot_burst_pkts: 10
control_mtu_bytes: 1500
```
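
To get a feel for what these values mean in bandwidth terms, the following sketch converts the slot parameters into bits per second and bytes. The helper names are illustrative and not part of the configuration script.

```python
def control_bandwidth_bps(slot_count, slot_rate_pps, mtu_bytes):
    """Worst-case aggregate Control bandwidth, assuming MTU-sized packets."""
    return slot_count * slot_rate_pps * mtu_bytes * 8


def control_slot_burst_bytes(slot_burst_pkts, mtu_bytes):
    """Burst allowed to a single slot, expressed in bytes."""
    return slot_burst_pkts * mtu_bytes


# Values from the example above: 50 slots x 100 pps x 1500 bytes.
total_bps = control_bandwidth_bps(50, 100, 1500)  # 60000000, i.e. 60 Mbps
burst_bytes = control_slot_burst_bytes(10, 1500)  # 15000 bytes per slot
```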
Real-Time Traffic Class Configuration:
- realtime_max_rates_bps: List of maximum shaping rates for Real-Time queues, one per slice requesting such service.
- realtime_max_burst_s: Maximum amount of time that a Real-Time queue can burst at the port speed. This parameter is used to limit delay for Elastic queues.
```
realtime_max_rates_bps:
  - 45000000 # 45 Mbps
  - 30000000 # 30 Mbps
  - 25000000 # 25 Mbps
realtime_max_burst_s: 0.005 # 5 ms
```
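
Since realtime_max_burst_s is expressed as time at port speed, the corresponding burst size in bytes depends on the port rate. The conversion below is a sketch under that reading; the function name is hypothetical.

```python
def realtime_burst_bytes(port_rate_bps, max_burst_s):
    """Bytes a Real-Time queue can send while bursting at port speed."""
    return round(port_rate_bps * max_burst_s / 8)


# 5 ms at 1 Gbps corresponds to 625000 bytes.
burst_1g = realtime_burst_bytes(1000000000, 0.005)

# Aggregate of the three shaping rates listed above: 100 Mbps.
realtime_total_bps = sum([45000000, 30000000, 25000000])
```

The same 5 ms setting on a faster port permits a proportionally larger burst in bytes, which is why the limit is given as a duration.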
Elastic Traffic Class Configuration:
- elastic_min_rates_bps: List of minimum guaranteed rates for Elastic queues, one per slice requesting such service.
```
elastic_min_rates_bps:
  - 100000000 # 100 Mbps
  - 200000000 # 200 Mbps
```
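
For the minimum rates to be meaningful on a given port, the bandwidth left after the higher-priority classes should cover the sum of the Elastic guarantees. The check below sketches that reasoning; it is an assumption about how the budgets relate, not a description of what the configuration script validates.

```python
def elastic_headroom_bps(port_rate_bps, control_bps, realtime_max_rates_bps,
                         elastic_min_rates_bps):
    """Bandwidth left on a port after all guarantees are accounted for."""
    # Control and Real-Time have higher priority, so subtract their worst case.
    remaining = port_rate_bps - control_bps - sum(realtime_max_rates_bps)
    headroom = remaining - sum(elastic_min_rates_bps)
    if headroom < 0:
        raise ValueError("Elastic minimum rates exceed the remaining bandwidth")
    return headroom


# 1 Gbps port, 60 Mbps Control (50 slots x 100 pps x 1500 bytes x 8),
# and the Real-Time and Elastic rates from the examples above.
headroom = elastic_headroom_bps(
    1000000000,
    60000000,
    [45000000, 30000000, 25000000],
    [100000000, 200000000],
)
```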
port_templates section: List of switch ports for which we want to configure queues.

Every port_templates element contains:
- descr: Description of the port purpose.
- rate_bps: Port speed in bits per second.
- is_shaping_enabled: true if the rate is enforced using shaping, false if the rate is the channel speed.
- shaping_burst_bytes: Burst size in bytes, meaningful only if port speed is shaped (when is_shaping_enabled: true).
- queue_count: Number of queues assigned to the port.
- port_ids: List of Stratum port IDs (Singleton Port from Stratum Chassis Config) using this port template. Used for ports that correspond to switch front-panel ports.
  
  Mutually exclusive with sdk_port_ids field.
- sdk_port_ids: List of SDK port numbers (i.e., Tofino DP_ID) using this port template. Used for internal ports (e.g., recirculation ports).
  
  Mutually exclusive with port_ids field.
```
port_templates:
  - descr: "Base station"
    rate_bps: 1000000000 # 1 Gbps
    is_shaping_enabled: true
    shaping_burst_bytes: 18000 # 2x jumbo frames
    queue_count: 16
    port_ids:
      - 100
  - descr: "Servers"
    port_ids:
      - 200
    rate_bps: 40000000000 # 40 Gbps
    is_shaping_enabled: false
    queue_count: 16
  - descr: "Recirculation"
    sdk_port_ids:
      - 68
    rate_bps: 100000000000 # 100 Gbps
    is_shaping_enabled: false
    queue_count: 16
```
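
The field constraints described above can be checked before generating the switch configuration. The validator below is a minimal sketch operating on a template already loaded as a Python dict. Treating a missing shaping_burst_bytes as an error when shaping is enabled is an assumption of this example.

```python
def validate_port_template(template):
    """Return a list of problems found in one port_templates element."""
    errors = []
    if "port_ids" in template and "sdk_port_ids" in template:
        errors.append("port_ids and sdk_port_ids are mutually exclusive")
    # Assumption: a shaped port needs an explicit burst size.
    if template.get("is_shaping_enabled", False) and "shaping_burst_bytes" not in template:
        errors.append("shaping_burst_bytes is missing but is_shaping_enabled is true")
    return errors


base_station = {
    "descr": "Base station",
    "rate_bps": 1000000000,
    "is_shaping_enabled": True,
    "shaping_burst_bytes": 18000,
    "queue_count": 16,
    "port_ids": [100],
}
problems = validate_port_template(base_station)  # empty list: template is consistent
```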

An example of a complete QoS and Slicing configuration can be found here.

REST API

We provide REST APIs with support for adding/removing/querying slices and traffic classes, as well as flow classification.

Slice

Add a slice

A POST request with Slice ID as path parameter. /slicing/slice/{sliceId}

Remove a slice

A DELETE request with Slice ID as path parameter. /slicing/slice/{sliceId}

Get all slices

A GET request. Returns a collection of slice IDs. /slicing/slice
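
The three slice operations can be expressed with the Python standard library as follows. This sketch only builds the requests without sending them. The base URL is a placeholder, and authentication and the response format are not covered here.

```python
import urllib.request

# Hypothetical base URL; replace with the address where the REST API is served.
BASE_URL = "http://onos.example:8181"


def slice_request(method, slice_id=None):
    """Build a request for the /slicing/slice endpoints.

    method: "POST" to add, "DELETE" to remove, "GET" to list all slices.
    slice_id: required for POST and DELETE, omitted for GET.
    """
    if method in ("POST", "DELETE") and slice_id is None:
        raise ValueError("slice_id is required for POST and DELETE")
    path = "/slicing/slice" if slice_id is None else f"/slicing/slice/{slice_id}"
    return urllib.request.Request(BASE_URL + path, method=method)


add_req = slice_request("POST", 1)
list_req = slice_request("GET")
```

A request built this way can be sent with `urllib.request.urlopen(add_req)` once the base URL points to a reachable controller.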

Traffic Class

Tip

A Traffic Class takes one of the following values: BEST_EFFORT, CONTROL, REAL_TIME, ELASTIC.

Add a traffic class to a slice

A POST request with Slice ID and Traffic Class as path parameters. /slicing/tc/{sliceId}/{tc}

Remove a traffic class from a slice

A DELETE request with Slice ID and Traffic Class as path parameters. /slicing/tc/{sliceId}/{tc}

Get all traffic classes from a slice

A GET request with Slice ID as path parameter. Returns a collection of traffic classes. /slicing/tc/{sliceId}
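
A small helper can build the traffic class paths and reject values outside the four listed above before any request is made. The helper name is illustrative.

```python
TRAFFIC_CLASSES = ("BEST_EFFORT", "CONTROL", "REAL_TIME", "ELASTIC")


def tc_path(slice_id, tc=None):
    """Return the path for the /slicing/tc endpoints.

    With tc set, the path targets one traffic class of the slice (POST/DELETE).
    Without tc, the path lists all traffic classes of the slice (GET).
    """
    if tc is None:
        return f"/slicing/tc/{slice_id}"
    if tc not in TRAFFIC_CLASSES:
        raise ValueError(f"unknown traffic class: {tc}")
    return f"/slicing/tc/{slice_id}/{tc}"


add_rt_path = tc_path(1, "REAL_TIME")  # "/slicing/tc/1/REAL_TIME"
```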

Classify Flow

A flow can be defined as

```
{
  "criteria": [
    {
      "type": "IPV4_SRC",
      "ip": "10.0.0.1/32"
    },
    {
      "type": "IPV4_DST",
      "ip": "10.0.0.2/32"
    },
    {
      "type": "IP_PROTO",
      "protocol": 6
    },
    {
      "type": "TCP_SRC",
      "tcpPort": 1000
    },
    {
      "type": "TCP_DST",
      "tcpPort": 80
    },
    {
      "type": "UDP_SRC",
      "udpPort": 1000
    },
    {
      "type": "UDP_DST",
      "udpPort": 1812
    }
  ]
}
```

IPV4_SRC: Source IPv4 prefix
IPV4_DST: Destination IPv4 prefix
IP_PROTO: IP protocol number; accepts 6 (TCP) and 17 (UDP)
TCP_SRC: Source L4 (TCP) port
TCP_DST: Destination L4 (TCP) port
UDP_SRC: Source L4 (UDP) port
UDP_DST: Destination L4 (UDP) port
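
The criteria list can be generated from a 5-tuple. The builder below is a sketch that emits only the L4 criteria matching the chosen protocol, which is an assumption about how a single flow would normally be expressed. The function name is hypothetical.

```python
import ipaddress

# Criterion name prefix and port key for each accepted IP protocol number.
L4_FIELDS = {6: ("TCP", "tcpPort"), 17: ("UDP", "udpPort")}


def build_flow(src_prefix, dst_prefix, protocol, src_port, dst_port):
    """Build a flow definition (as a dict) from an IPv4 5-tuple."""
    if protocol not in L4_FIELDS:
        raise ValueError("protocol must be 6 (TCP) or 17 (UDP)")
    for prefix in (src_prefix, dst_prefix):
        ipaddress.IPv4Network(prefix)  # raises ValueError if not a valid IPv4 prefix
    for port in (src_port, dst_port):
        if not 0 <= port <= 65535:
            raise ValueError("L4 port out of range")
    name, key = L4_FIELDS[protocol]
    return {
        "criteria": [
            {"type": "IPV4_SRC", "ip": src_prefix},
            {"type": "IPV4_DST", "ip": dst_prefix},
            {"type": "IP_PROTO", "protocol": protocol},
            {"type": f"{name}_SRC", key: src_port},
            {"type": f"{name}_DST", key: dst_port},
        ]
    }


web_flow = build_flow("10.0.0.1/32", "10.0.0.2/32", 6, 1000, 80)
```

Serializing `web_flow` with `json.dumps` yields a body in the format shown above.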

Note

SD-Fabric currently supports classification on the IPv4 5-tuple only.

Classify a flow to a slice and traffic class

A POST request with Slice ID and Traffic Class as path parameters, and the JSON definition of a flow as the request body. /slicing/flow/{sliceId}/{tc}

Remove a flow from a slice and traffic class

A DELETE request with Slice ID and Traffic Class as path parameters, and the JSON definition of a flow as the request body. /slicing/flow/{sliceId}/{tc}
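
Adding and removing a flow classification differ only in the HTTP method, since both carry the flow JSON in the body. The sketch below builds such requests without sending them. The base URL is a placeholder, and the Content-Type header is an assumption of this example.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the address where the REST API is served.
BASE_URL = "http://onos.example:8181"


def flow_request(method, slice_id, tc, flow):
    """Build a POST (classify) or DELETE (remove) request for a flow."""
    if method not in ("POST", "DELETE"):
        raise ValueError("method must be POST or DELETE")
    body = json.dumps(flow).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/slicing/flow/{slice_id}/{tc}",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method=method,
    )


flow = {
    "criteria": [
        {"type": "IPV4_SRC", "ip": "10.0.0.1/32"},
        {"type": "IP_PROTO", "protocol": 6},
        {"type": "TCP_DST", "tcpPort": 80},
    ]
}
classify_req = flow_request("POST", 1, "REAL_TIME", flow)
remove_req = flow_request("DELETE", 1, "REAL_TIME", flow)
```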


Get all classified flows from a slice and traffic class

A GET request with Slice ID and Traffic Class as path parameters. Returns a collection of flows. /slicing/flow/{sliceId}/{tc}