
RFC 7640

Traffic Management Benchmarking

Pages: 51
Informational
Part 2 of 3 – Pages 17 to 32

Top   ToC   RFC7640 - Page 17   prevText

6. Traffic Benchmarking Methodology

   The traffic benchmarking methodology uses the test setup from
   Section 1.2 and metrics defined in Section 4.

   Each test SHOULD compare the network device's internal statistics
   (available via command line management interface, SNMP, etc.) to the
   measured metrics defined in Section 4.  This evaluates the accuracy
   of the internal traffic management counters under individual test
   conditions and capacity test conditions as defined in Sections 4.1
   and 4.2.  This comparison is not intended to compare real-time
   statistics, but rather the cumulative statistics reported after the
   test has completed and device counters have updated (it is common
   for device counters to update after an interval of 10 seconds or
   more).

   From a device configuration standpoint, scheduling and shaping
   functionality can be applied to logical ports (e.g., Link Aggregation
   (LAG)).  This would result in the same scheduling and shaping
   configuration applied to all of the member physical ports.  The focus
   of this document is only on tests at a physical-port level.

   The following sections provide the objective, procedure, metrics, and
   reporting format for each test.  For all test steps, the following
   global parameters must be specified:

   Test Runs (Tr):
      The number of times the test needs to be run to ensure accurate
      and repeatable results.  The recommended value is a minimum of 10.

   Test Duration (Td):
      The duration of a test iteration, expressed in seconds.  The
      recommended minimum value is 60 seconds.

   The variability in the test results MUST be measured between test
   runs, and if the variation is characterized as a significant portion
   of the measured values, the next step may be to revise the methods to
   achieve better consistency.
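   One way to quantify the variability between test runs is the
   coefficient of variation of a metric over the Tr runs.  The Python
   sketch below is illustrative only: the function name
   run_variability, the sample packet-delay values, and the 5% review
   threshold are assumptions and are not defined by this methodology.

```python
from statistics import mean, pstdev


def run_variability(values):
    """Return (mean, coefficient of variation) of one metric across test runs."""
    if len(values) < 2:
        raise ValueError("need at least two test runs")
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; relative variation is undefined")
    return avg, pstdev(values) / abs(avg)


# Hypothetical packet-delay (PD) results in microseconds from Tr = 10 runs.
pd_us = [101.0, 99.5, 100.2, 100.8, 99.9, 100.1, 100.4, 99.7, 100.3, 100.1]
avg, cv = run_variability(pd_us)

# The 5% threshold is an assumption; the tester decides what counts as
# "a significant portion of the measured values".
needs_review = cv > 0.05
```

   With these sample values the variation is well under 1% of the mean,
   so the runs would be treated as consistent under the assumed
   threshold.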

6.1. Policing Tests

A policer is defined as the entity performing the policy function. The intent of the policing tests is to verify the policer performance (i.e., CIR/CBS and EIR/EBS parameters). The tests will verify that the network device can handle the CIR with CBS and the EIR with EBS, and will use back-to-back packet-testing concepts as described in [RFC2544] (but adapted to burst size algorithms and terminology). Also, [MEF-14], [MEF-19], and [MEF-37] provide some bases for
   specific components of this test.  The burst hunt algorithm defined
   in Section 5.1.1 can also be used to automate the measurement of the
   CBS value.

   The tests are divided into two (2) sections: individual policer tests
   and then full-capacity policing tests.  It is important to benchmark
   the basic functionality of the individual policer and then proceed
   into the fully rated capacity of the device.  This capacity may
   include the number of policing policies per device and the number of
   policers simultaneously active across all ports.

6.1.1. Policer Individual Tests

   Objective:
      Test a policer as defined by [RFC4115] or [MEF-10.3], depending
      upon the equipment's specification.  In addition to verifying that
      the policer allows the specified CBS and EBS bursts to pass, the
      policer test MUST verify that the policer will remark or drop
      excess packets, and pass traffic at the specified CBS/EBS values.

   Test Summary:
      Policing tests should use stateless traffic.  Stateful TCP test
      traffic will generally be adversely affected by a policer in the
      absence of traffic shaping.  So, while TCP traffic could be used,
      it is more accurate to benchmark a policer with stateless traffic.

      As an example of a policer as defined by [RFC4115], consider a
      CBS/EBS of 64 KB and CIR/EIR of 100 Mbps on a 1 GigE physical link
      (in color-blind mode).  A stateless traffic burst of 64 KB would
      be sent into the policer at the GigE rate.  This equates to an
      approximately 0.512-millisecond burst time (64 KB at 1 GigE).  The
      traffic generator must space these bursts to ensure that the
      aggregate throughput does not exceed the CIR.  The Ti between the
      bursts would equal CBS * 8 / CIR = 5.12 milliseconds in this
      example.

   Test Metrics:
      The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV)
      SHALL be measured at the egress port and recorded.

   Procedure:
      1. Configure the DUT policing parameters for the desired CIR/EIR
         and CBS/EBS values to be tested.

      2. Configure the tester to generate a stateless traffic burst
         equal to CBS and an interval equal to Ti (CBS in bits/CIR).

      3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic
         into the policer ingress port, and measure the metrics defined
         in Section 4.1 (BSA, LP, OOS, PD, and PDV) at the egress port
         and across the entire Td (default 60-second duration).

      4. Excess Traffic Test: Generate bursts of greater than CBS + EBS
         bytes into the policer ingress port, and verify that the
         policer only allowed the BSA bytes to exit the egress.  The
         excess burst MUST be recorded; the recommended value is
         1000 bytes.  Additional tests beyond the simple color-blind
         example might include color-aware mode, configurations where
         EIR is greater than CIR, etc.
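
   The burst timing used in the Test Summary example can be checked
   with a short Python sketch.  It assumes 1 KB = 1000 bytes (which
   reproduces the 0.512 ms and 5.12 ms figures); the function and
   constant names are illustrative and not part of this document.

```python
def burst_time_s(burst_bytes, link_bps):
    """Time to serialize one burst onto the ingress link at line rate."""
    return burst_bytes * 8 / link_bps


def transmission_interval_s(cbs_bytes, cir_bps):
    """Ti = CBS * 8 / CIR: spacing between burst starts that averages to CIR."""
    return cbs_bytes * 8 / cir_bps


CBS_BYTES = 64_000          # 64 KB, assuming 1 KB = 1000 bytes
LINK_BPS = 1_000_000_000    # 1 GigE ingress link
CIR_BPS = 100_000_000       # 100 Mbps committed rate

burst_ms = burst_time_s(CBS_BYTES, LINK_BPS) * 1000
ti_ms = transmission_interval_s(CBS_BYTES, CIR_BPS) * 1000

# One CBS-sized burst every Ti averages out to exactly the CIR.
avg_rate_bps = CBS_BYTES * 8 / transmission_interval_s(CBS_BYTES, CIR_BPS)
```

   The tester therefore sends for about 0.512 ms and stays idle for the
   remainder of each 5.12 ms interval.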

   Reporting Format:
      The policer individual report MUST contain all results for each
      CIR/EIR/CBS/EBS test run.  A recommended format is as follows:

      ***********************************************************

      Test Configuration Summary: Tr, Td

      DUT Configuration Summary: CIR, EIR, CBS, EBS

      The results table should contain entries for each test run,
      as follows (Test #1 to Test #Tr):

      -  Compliant Traffic Test: BSA, LP, OOS, PD, and PDV

      -  Excess Traffic Test: BSA

      ***********************************************************

6.1.2. Policer Capacity Tests

   Objective:
      The intent of the capacity tests is to verify the policer
      performance in a scaled environment with multiple ingress customer
      policers on multiple physical ports.  This test will benchmark the
      maximum number of active policers as specified by the device
      manufacturer.

   Test Summary:
      The specified policing function capacity is generally expressed in
      terms of the number of policers active on each individual physical
      port as well as the number of unique policer rates that are
      utilized.  For all of the capacity tests, the benchmarking test
      procedure and reporting format described in Section 6.1.1 for a
      single policer MUST be applied to each of the physical-port
      policers.

      For example, a Layer 2 switching device may specify that each of
      the 32 physical ports can be policed using a pool of policing
      service policies.  The device may carry a single customer's
      traffic on each physical port, and a single policer is
      instantiated per physical port.  Another possibility is that a
      single physical port may carry multiple customers, in which case
      many customer flows would be policed concurrently on an individual
      physical port (separate policers per customer on an individual
      port).

   Test Metrics:
      The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV)
      SHALL be measured at the egress port and recorded.

   The following sections provide the specific test scenarios,
   procedures, and reporting formats for each policer capacity test.

6.1.2.1. Maximum Policers on Single Physical Port

   Test Summary:
      The first policer capacity test will benchmark a single physical
      port, with maximum policers on that physical port.

      Assume multiple categories of ingress policers at rates
      r1, r2, ..., rn.  There are multiple customers on a single
      physical port.  Each customer could be represented by a
      single-tagged VLAN, a double-tagged VLAN, a Virtual Private LAN
      Service (VPLS) instance, etc.  Each customer is mapped to a
      different policer.  Each of the policers can be of rates
      r1, r2, ..., rn.

      An example configuration would be

      -  Y1 customers, policer rate r1

      -  Y2 customers, policer rate r2

      -  Y3 customers, policer rate r3

      ...

      -  Yn customers, policer rate rn

      Some bandwidth on the physical port is dedicated for other traffic
      (i.e., other than customer traffic); this includes network control
      protocol traffic.  There is a separate policer for the other
      traffic.  Typical deployments have three categories of policers;
      some deployments may have more or fewer than three categories of
      ingress policers.

   Procedure:
      1. Configure the DUT policing parameters for the desired CIR/EIR
         and CBS/EBS values for each policer rate (r1-rn) to be tested.

      2. Configure the tester to generate a stateless traffic burst
         equal to CBS and an interval equal to Ti (CBS in bits/CIR) for
         each customer stream (Y1-Yn).  The encapsulation for each
         customer must also be configured according to the service
         tested (VLAN, VPLS, IP mapping, etc.).

      3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic
         into the policer ingress port for each customer traffic stream,
         and measure the metrics defined in Section 4.1 (BSA, LP, OOS,
         PD, and PDV) at the egress port for each stream and across the
         entire Td (default 60-second duration).

      4. Excess Traffic Test: Generate bursts of greater than CBS + EBS
         bytes into the policer ingress port for each customer traffic
         stream, and verify that the policer only allowed the BSA bytes
         to exit the egress for each stream.  The excess burst MUST be
         recorded; the recommended value is 1000 bytes.
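
   The per-customer stream setup in steps 1 and 2 can be sketched as a
   small planning helper that expands the categories (Y1 customers at
   rate r1, and so on) into one stream per customer, each with its own
   Ti.  The Stream class, the build_streams function, the customer
   naming scheme, and the rates and burst sizes shown are hypothetical
   values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    customer: str
    cir_bps: int
    cbs_bytes: int

    @property
    def ti_s(self) -> float:
        """Transmission interval for this stream: CBS in bits / CIR."""
        return self.cbs_bytes * 8 / self.cir_bps


def build_streams(categories):
    """Expand (customer_count, cir_bps, cbs_bytes) tuples into streams."""
    streams = []
    for idx, (count, cir_bps, cbs_bytes) in enumerate(categories, start=1):
        for n in range(1, count + 1):
            streams.append(Stream(f"r{idx}-cust{n}", cir_bps, cbs_bytes))
    return streams


# Hypothetical example: three policer rate categories on one port.
categories = [
    (2, 10_000_000, 16_000),    # Y1 = 2 customers at r1 = 10 Mbps
    (3, 50_000_000, 32_000),    # Y2 = 3 customers at r2 = 50 Mbps
    (1, 100_000_000, 64_000),   # Y3 = 1 customer  at r3 = 100 Mbps
]
streams = build_streams(categories)
total_cir_bps = sum(s.cir_bps for s in streams)
```

   Each resulting stream still needs its encapsulation (VLAN, VPLS, IP
   mapping, etc.) configured on the tester, which this sketch does not
   model.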

   Reporting Format:
      The policer individual report MUST contain all results for each
      CIR/EIR/CBS/EBS test run, per customer traffic stream.  A
      recommended format is as follows:

      *****************************************************************

      Test Configuration Summary: Tr, Td

      Customer Traffic Stream Encapsulation: Map each stream to VLAN,
      VPLS, IP address

      DUT Configuration Summary per Customer Traffic Stream: CIR, EIR,
      CBS, EBS

      The results table should contain entries for each test run,
      as follows (Test #1 to Test #Tr):

      -  Customer Stream Y1-Yn (see note) Compliant Traffic Test:
         BSA, LP, OOS, PD, and PDV

      -  Customer Stream Y1-Yn (see note) Excess Traffic Test: BSA

      *****************************************************************

      Note: For each test run, there will be two (2) rows for each
      customer stream: the Compliant Traffic Test result and the Excess
      Traffic Test result.

6.1.2.2. Single Policer on All Physical Ports

   Test Summary:
      The second policer capacity test involves a single policer
      function per physical port with all physical ports active.  In
      this test, there is a single policer per physical port.  The
      policer can have one of the rates r1, r2, ..., rn.  All of the
      physical ports in the networking device are active.

   Procedure:
      The procedure for this test is identical to the procedure listed
      in Section 6.1.1.  The configured parameters must be reported per
      port, and the test report must include results per measured egress
      port.

6.1.2.3. Maximum Policers on All Physical Ports

   The third policer capacity test is a combination of the first and
   second capacity tests, i.e., maximum policers active per physical
   port and all physical ports active.

   Procedure:
      The procedure for this test is identical to the procedure listed
      in Section 6.1.2.1.  The configured parameters must be reported
      per port, and the test report must include per-stream results per
      measured egress port.

6.2. Queue/Scheduler Tests

Queues and traffic scheduling are closely related in that a queue's priority dictates the manner in which the traffic scheduler transmits packets out of the egress port. Since device queues/buffers are generally an egress function, this test framework will discuss testing at the egress (although the technique can be applied to ingress-side queues). Similar to the policing tests, these tests are divided into two sections: individual queue/scheduler function tests and then full-capacity tests.

6.2.1. Queue/Scheduler Individual Tests

The various types of scheduling techniques include FIFO, Strict Priority (SP) queuing, and Weighted Fair Queuing (WFQ), along with other variations. This test framework recommends testing at least these three techniques; benchmarking other device-scheduling algorithms is left to the discretion of the tester.

6.2.1.1. Testing Queue/Scheduler with Stateless Traffic

   Objective:
      Verify that the configured queue and scheduling technique can
      handle stateless traffic bursts up to the queue depth.

   Test Summary:
      A network device queue is memory based, unlike a policing
      function, which is token or credit based.  However, the same
      concepts from Section 6.1 can be applied to testing network device
      queues.

      The device's network queue should be configured to the desired
      size in KB (i.e., Queue Length (QL)), and then stateless traffic
      should be transmitted to test this QL.

      A queue should be able to handle repetitive bursts with the
      transmission gaps sized according to the Bottleneck Bandwidth
      (BB).  The transmission gap is referred to here as the
      transmission interval (Ti).  The Ti can be defined for the traffic
      bursts and is based on the QL and BB of the egress interface.

         Ti = QL * 8 / BB

      Note that this equation is similar to the Ti required for
      transmission into a policer (QL = CBS, BB = CIR).  Note also that
      the burst hunt algorithm defined in Section 5.1.1 can be used to
      automate the measurement of the queue value.

      The stateless traffic burst SHALL be transmitted at the link speed
      and spaced within the transmission interval (Ti).  The metrics
      defined in Section 4.1 SHALL be measured at the egress port and
      recorded; the primary intent is to verify the BSA and verify that
      no packets are dropped.

      The scheduling function must also be characterized to benchmark
      the device's ability to schedule the queues according to the
      priority.  An example would be two levels of priority that include
      SP and FIFO queuing.  Under a flow load greater than the egress
      port speed, the higher-priority packets should be transmitted
      without drops (and also maintain low latency), while packets in
      the lower-priority (or best-effort) queue may be dropped.
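
      The expected strict-priority behavior under oversubscription can
      be illustrated with a toy model.  The Python sketch below is a
      deliberately simplified, tick-based simulation with tail drop and
      fixed packet counts per tick; it is not how any particular device
      implements scheduling, and the function name simulate_sp and all
      numeric values are assumptions.

```python
def simulate_sp(arrivals, egress_per_tick, queue_limit):
    """Simulate two queues served in strict priority order.

    arrivals: list of (high_pkts, low_pkts) offered per tick.
    Returns (sent, dropped) packet counts per priority class.
    """
    queued = {"high": 0, "low": 0}
    sent = {"high": 0, "low": 0}
    dropped = {"high": 0, "low": 0}
    for high_in, low_in in arrivals:
        # Enqueue with tail drop once a queue reaches its limit.
        for name, count in (("high", high_in), ("low", low_in)):
            accepted = min(count, queue_limit - queued[name])
            queued[name] += accepted
            dropped[name] += count - accepted
        # Strict priority: the high queue is drained before the low queue.
        budget = egress_per_tick
        for name in ("high", "low"):
            n = min(queued[name], budget)
            queued[name] -= n
            sent[name] += n
            budget -= n
    return sent, dropped


# Offered load of 14 packets per tick against an egress capacity of 10.
arrivals = [(4, 10)] * 100
sent, dropped = simulate_sp(arrivals, egress_per_tick=10, queue_limit=20)
```

      In this model every high-priority packet is forwarded, while the
      low-priority queue fills and begins to drop, which is the pattern
      the test expects to observe at the egress port.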

   Test Metrics:
      The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV)
      SHALL be measured at the egress port and recorded.

   Procedure:
      1. Configure the DUT QL and scheduling technique parameters (FIFO,
         SP, etc.).

      2. Configure the tester to generate a stateless traffic burst
         equal to QL and an interval equal to Ti (QL in bits/BB).

      3. Generate bursts of QL traffic into the DUT, and measure the
         metrics defined in Section 4.1 (LP, OOS, PD, and PDV) at the
         egress port and across the entire Td (default 60-second
         duration).

   Reporting Format:
      The Queue/Scheduler Stateless Traffic individual report MUST
      contain all results for each QL/BB test run.  A recommended format
      is as follows:

      ****************************************************************

      Test Configuration Summary: Tr, Td

      DUT Configuration Summary: Scheduling technique (e.g., FIFO, SP,
      WFQ), BB, and QL

      The results table should contain entries for each test run,
      as follows (Test #1 to Test #Tr):

      -  LP, OOS, PD, and PDV

      ****************************************************************

6.2.1.2. Testing Queue/Scheduler with Stateful Traffic

   Objective:
      Verify that the configured queue and scheduling technique can
      handle stateful traffic bursts up to the queue depth.

   Test Background and Summary:
      To provide a more realistic benchmark and to test queues in
      Layer 4 devices such as firewalls, stateful traffic testing is
      recommended for the queue tests.  Stateful traffic tests will also
      utilize the Network Delay Emulator (NDE) from the network setup
      configuration in Section 1.2.

      The BDP of the TCP test traffic must be calibrated to the QL of
      the device queue.  Referencing [RFC6349], the BDP is equal to:

         BB * RTT / 8 (in bytes)

      The NDE must be configured to an RTT value that is large enough to
      allow the BDP to be greater than QL.  An example test scenario is
      defined below:

      -  Ingress link = GigE

      -  Egress link = 100 Mbps (BB)

      -  QL = 32 KB

      RTT(min) = QL * 8 / BB and would equal 2.56 ms (and the
      BDP = 32 KB)

      In this example, one (1) TCP connection with window size / SSB of
      32 KB would be required to test the QL of 32 KB.  This Bulk
      Transfer Test can be accomplished using iperf, as described in
      Appendix A.

      Two types of TCP tests MUST be performed: the Bulk Transfer Test
      and the Micro Burst Test Pattern, as documented in Appendix B.
      The Bulk Transfer Test only bursts during the TCP Slow Start (or
      Congestion Avoidance) state, while the Micro Burst Test Pattern
      emulates application-layer bursting, which may occur any time
      during the TCP connection.

      Other types of tests SHOULD include the following: simple web
      sites, complex web sites, business applications, email, and
      SMB/CIFS (Common Internet File System) file copy (all of which are
      also documented in Appendix B).
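
      The BDP calibration in the example scenario (GigE ingress,
      100 Mbps egress, QL = 32 KB) can be reproduced with the Python
      sketch below.  It assumes 1 KB = 1000 bytes, which matches the
      2.56 ms figure.  The function names are illustrative, and the
      connections_needed helper (rounding QL / window size up) is an
      assumption that generalizes the single-connection case described
      in the text.

```python
import math


def bdp_bytes(bb_bps, rtt_s):
    """Bandwidth-delay product in bytes: BB * RTT / 8."""
    return bb_bps * rtt_s / 8


def min_rtt_s(ql_bytes, bb_bps):
    """Smallest RTT at which the BDP reaches the queue length."""
    return ql_bytes * 8 / bb_bps


def connections_needed(ql_bytes, window_bytes):
    """Assumed rule: enough TCP connections for the windows to cover QL."""
    return math.ceil(ql_bytes / window_bytes)


QL_BYTES = 32_000        # 32 KB queue, assuming 1 KB = 1000 bytes
BB_BPS = 100_000_000     # 100 Mbps egress link (bottleneck bandwidth)

rtt = min_rtt_s(QL_BYTES, BB_BPS)
bdp = bdp_bytes(BB_BPS, rtt)
conns = connections_needed(QL_BYTES, window_bytes=32_000)
```

      The NDE would be set to an RTT of at least the computed value so
      that a single connection with a 32 KB window can fill the queue.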

   Test Metrics:
      The test results will be recorded per the stateful metrics defined
      in Section 4.2 -- primarily the TCP Test Pattern Execution Time
      (TTPET), TCP Efficiency, and Buffer Delay.

   Procedure:
      1. Configure the DUT QL and scheduling technique parameters (FIFO,
         SP, etc.).

      2. Configure the test generator* with a profile of an emulated
         application traffic mixture.

         -  The application mixture MUST be defined in terms of
            percentage of the total bandwidth to be tested.

         -  The rate of transmission for each application within the
            mixture MUST also be configurable.

         *  To ensure repeatable results, the test generator MUST be
            capable of generating precise TCP test patterns for each
            application specified.

      3. Generate application traffic between the ingress (client side)
         and egress (server side) ports of the DUT, and measure the
         metrics (TTPET, TCP Efficiency, and Buffer Delay) per
         application stream and at the ingress and egress ports (across
         the entire Td, default 60-second duration).

      A couple of items require clarification concerning application
      measurements: an application session may consist of a single TCP
      connection or multiple TCP connections.

      If an application session utilizes a single TCP connection, the
      application throughput/metrics have a 1-1 relationship to the TCP
      connection measurements.

      If an application session (e.g., an HTTP-based application)
      utilizes multiple TCP connections, then all of the TCP connections
      are aggregated in the application throughput measurement/metrics
      for that application.

      Then, there is the case of multiple instances of an application
      session (e.g., multiple FTPs emulating multiple clients).  In this
      situation, the test should measure/record each FTP application
      session independently, tabulating the minimum, maximum, and
      average for all FTP sessions.

      Finally, application throughput measurements are based on Layer 4
      TCP throughput and do not include bytes retransmitted.  The TCP
      Efficiency metric MUST be measured during the test, because it
      provides a measure of "goodput" during each test.
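
      The per-session tabulation and the TCP Efficiency calculation can
      be sketched in Python as follows.  The efficiency formula follows
      the definition in [RFC6349] (transmitted bytes minus retransmitted
      bytes, divided by transmitted bytes, as a percentage); the
      function names and the sample FTP throughput values are
      hypothetical.

```python
from statistics import mean


def tcp_efficiency_pct(bytes_transmitted, bytes_retransmitted):
    """TCP Efficiency (%) per the RFC 6349 definition."""
    if bytes_transmitted <= 0:
        raise ValueError("no bytes transmitted")
    return (bytes_transmitted - bytes_retransmitted) / bytes_transmitted * 100


def summarize(values):
    """Minimum, maximum, and average across application sessions."""
    return {"min": min(values), "max": max(values), "avg": mean(values)}


# Hypothetical Layer 4 throughput (bps) for three emulated FTP sessions.
ftp_throughput_bps = [91_000_000, 88_500_000, 93_250_000]
summary = summarize(ftp_throughput_bps)

# 100,000 bytes sent, of which 2,000 were retransmissions.
efficiency = tcp_efficiency_pct(100_000, 2_000)
```

      The same summarize helper can be applied to TTPET and Buffer Delay
      values when multiple instances of an application session are run.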

   Reporting Format:
      The Queue/Scheduler Stateful Traffic individual report MUST
      contain all results for each traffic scheduler and QL/BB test run.
      A recommended format is as follows:

      ******************************************************************

      Test Configuration Summary: Tr, Td

      DUT Configuration Summary: Scheduling technique (e.g., FIFO, SP,
      WFQ), BB, and QL

      Application Mixture and Intensities: These are the percentages
      configured for each application type.

      The results table should contain entries for each test run, with
      minimum, maximum, and average per application session, as follows
      (Test #1 to Test #Tr):

      -  Throughput (bps) and TTPET for each application session

      -  Bytes In and Bytes Out for each application session

      -  TCP Efficiency and Buffer Delay for each application session

      ******************************************************************

6.2.2. Queue/Scheduler Capacity Tests

   Objective:
      The intent of these capacity tests is to benchmark queue/scheduler
      performance in a scaled environment with multiple
      queues/schedulers active on multiple egress physical ports.  These
      tests will benchmark the maximum number of queues and schedulers
      as specified by the device manufacturer.  Each priority in the
      system will map to a separate queue.

   Test Metrics:
      The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV)
      SHALL be measured at the egress port and recorded.

   The following sections provide the specific test scenarios,
   procedures, and reporting formats for each queue/scheduler capacity
   test.

6.2.2.1. Multiple Queues, Single Port Active

   For the first queue/scheduler capacity test, multiple queues per port
   will be tested on a single physical port.  In this case, all of the
   queues (typically eight) are active on a single physical port.
   Traffic from multiple ingress physical ports is directed to the same
   egress physical port.  This will cause oversubscription on the egress
   physical port.

   There are many types of priority schemes and combinations of
   priorities that are managed by the scheduler.  The following sections
   specify the priority schemes that should be tested.

6.2.2.1.1. Strict Priority on Egress Port

   Test Summary:
      For this test, SP scheduling on the egress physical port should be
      tested, and the benchmarking methodologies specified in
      Sections 6.2.1.1 (stateless) and 6.2.1.2 (stateful) (procedure,
      metrics, and reporting format) should be applied here.  For a
      given priority, each ingress physical port should get a fair share
      of the egress physical-port bandwidth.

      Since this is a capacity test, the configuration and report
      results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also
      include:

      Configuration:

      -  The number of physical ingress ports active during the test

      -  The classification marking (DSCP, VLAN, etc.) for each physical
         ingress port

      -  The traffic rate for stateless traffic and the traffic
         rate/mixture for stateful traffic for each physical
         ingress port

      Report Results:

      -  For each ingress port traffic stream, the achieved throughput
         rate and metrics at the egress port

6.2.2.1.2. Strict Priority + WFQ on Egress Port

   Test Summary:
      For this test, SP and WFQ should be enabled simultaneously in the
      scheduler, but on a single egress port.  The benchmarking
      methodologies specified in Sections 6.2.1.1 (stateless) and
      6.2.1.2 (stateful) (procedure, metrics, and reporting format)
      should be applied here.  Additionally, the egress port
      bandwidth-sharing among weighted queues should be proportional to
      the assigned weights.  For a given priority, each ingress physical
      port should get a fair share of the egress physical-port
      bandwidth.

      Since this is a capacity test, the configuration and report
      results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also
      include:

      Configuration:

      -  The number of physical ingress ports active during the test

      -  The classification marking (DSCP, VLAN, etc.) for each physical
         ingress port

      -  The traffic rate for stateless traffic and the traffic
         rate/mixture for stateful traffic for each physical
         ingress port

      Report Results:

      -  For each ingress port traffic stream, the achieved throughput
         rate and metrics at each queue of the egress port (both the SP
         queue and the WFQ)

      Example:

      -  Egress Port SP Queue: throughput and metrics for ingress
         streams 1-n

      -  Egress Port WFQ: throughput and metrics for ingress streams 1-n
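
      The requirement that weighted queues share bandwidth in
      proportion to their weights can be turned into expected per-queue
      rates for comparison against the measured results.  The Python
      sketch below assumes that the SP queue is always served first and
      that every WFQ queue is continuously backlogged; the function
      names, queue names, weights, and the 5% tolerance are all
      hypothetical.

```python
def expected_wfq_rates(egress_bps, sp_load_bps, weights):
    """Split the bandwidth left after SP traffic in proportion to weights."""
    remaining = max(egress_bps - sp_load_bps, 0)
    total_weight = sum(weights.values())
    return {q: remaining * w / total_weight for q, w in weights.items()}


def within_tolerance(measured, expected, tol=0.05):
    """True if every queue is within tol (as a fraction) of its expected rate."""
    return all(
        abs(measured[q] - expected[q]) <= tol * expected[q] for q in expected
    )


# 1 Gbps egress port, 200 Mbps of SP traffic, three weighted queues (4:2:2).
expected = expected_wfq_rates(
    1_000_000_000, 200_000_000, {"q1": 4, "q2": 2, "q3": 2}
)

# Hypothetical measured throughput per WFQ queue (bps).
measured = {"q1": 395_000_000, "q2": 204_000_000, "q3": 201_000_000}
shares_ok = within_tolerance(measured, expected)
```

      A failed check would indicate that the scheduler is not dividing
      the residual bandwidth according to the configured weights, under
      the stated assumptions.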

6.2.2.2. Single Queue per Port, All Ports Active

   Test Summary:
      Traffic from multiple ingress physical ports is directed to the
      same egress physical port.  This will cause oversubscription on
      the egress physical port.  Also, the same amount of traffic is
      directed to each egress physical port.

      The benchmarking methodologies specified in Sections 6.2.1.1
      (stateless) and 6.2.1.2 (stateful) (procedure, metrics, and
      reporting format) should be applied here.  Each ingress physical
      port should get a fair share of the egress physical-port
      bandwidth.  Additionally, each egress physical port should receive
      the same amount of traffic.

      Since this is a capacity test, the configuration and report
      results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also
      include:

      Configuration:

      -  The number of ingress ports active during the test

      -  The number of egress ports active during the test

      -  The classification marking (DSCP, VLAN, etc.) for each physical
         ingress port

      -  The traffic rate for stateless traffic and the traffic
         rate/mixture for stateful traffic for each physical
         ingress port

      Report Results:

      -  For each egress port, the achieved throughput rate and metrics
         at the egress port queue for each ingress port stream

      Example:

      -  Egress Port 1: throughput and metrics for ingress streams 1-n

      -  Egress Port n: throughput and metrics for ingress streams 1-n
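
      One way to express "fair share" as a single number per egress
      port is Jain's fairness index.  This index is not part of this
      methodology; it is swapped in here as a common summary statistic,
      where 1.0 means all ingress streams received equal throughput.
      The function names and the measured values below are
      hypothetical.

```python
def jain_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in (0, 1]."""
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if squares == 0:
        raise ValueError("no traffic measured")
    return total * total / (len(throughputs) * squares)


def fair_share_bps(egress_bps, ingress_ports):
    """Equal split of the egress port bandwidth among ingress ports."""
    return egress_bps / ingress_ports


# Hypothetical throughput (bps) seen at Egress Port 1 per ingress stream.
measured = {
    "ingress1": 248e6,
    "ingress2": 252e6,
    "ingress3": 251e6,
    "ingress4": 249e6,
}
index = jain_index(list(measured.values()))
share = fair_share_bps(1_000_000_000, len(measured))
```

      The same calculation can be repeated for each egress port, and
      the per-port totals compared to confirm that each egress port
      received the same amount of traffic.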

6.2.2.3. Multiple Queues per Port, All Ports Active

   Test Summary:
      Traffic from multiple ingress physical ports is directed to all
      queues of each egress physical port.  This will cause
      oversubscription on the egress physical ports.  Also, the same
      amount of traffic is directed to each egress physical port.

      The benchmarking methodologies specified in Sections 6.2.1.1
      (stateless) and 6.2.1.2 (stateful) (procedure, metrics, and
      reporting format) should be applied here.  For a given priority,
      each ingress physical port should get a fair share of the egress
      physical-port bandwidth.  Additionally, each egress physical port
      should receive the same amount of traffic.

      Since this is a capacity test, the configuration and report
      results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also
      include:

      Configuration:

      -  The number of physical ingress ports active during the test

      -  The classification marking (DSCP, VLAN, etc.) for each physical
         ingress port

      -  The traffic rate for stateless traffic and the traffic
         rate/mixture for stateful traffic for each physical
         ingress port

      Report Results:

      -  For each egress port, the achieved throughput rate and metrics
         at each egress port queue for each ingress port stream

      Example:

      -  Egress Port 1, SP Queue: throughput and metrics for ingress
         streams 1-n

      -  Egress Port 2, WFQ: throughput and metrics for ingress
         streams 1-n

      ...

      -  Egress Port n, SP Queue: throughput and metrics for ingress
         streams 1-n

      -  Egress Port n, WFQ: throughput and metrics for ingress
         streams 1-n


