
RFC 7640

Informational
Pages: 51

Traffic Management Benchmarking

Internet Engineering Task Force (IETF)                    B. Constantine
Request for Comments: 7640                                          JDSU
Category: Informational                                      R. Krishnan
ISSN: 2070-1721                                                Dell Inc.
                                                          September 2015


                    Traffic Management Benchmarking

Abstract

   This framework describes a practical methodology for benchmarking the
   traffic management capabilities of networking devices (e.g.,
   policing and shaping).  The goals are to provide a repeatable test
   method that objectively compares performance of the device's traffic
   management capabilities and to specify the means to benchmark traffic
   management with representative application traffic.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7640.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Traffic Management Overview
      1.2. Lab Configuration and Testing Overview
   2. Conventions Used in This Document
   3. Scope and Goals
   4. Traffic Benchmarking Metrics
      4.1. Metrics for Stateless Traffic Tests
      4.2. Metrics for Stateful Traffic Tests
   5. Tester Capabilities
      5.1. Stateless Test Traffic Generation
           5.1.1. Burst Hunt with Stateless Traffic
      5.2. Stateful Test Pattern Generation
           5.2.1. TCP Test Pattern Definitions
   6. Traffic Benchmarking Methodology
      6.1. Policing Tests
           6.1.1. Policer Individual Tests
           6.1.2. Policer Capacity Tests
                  6.1.2.1. Maximum Policers on Single Physical Port
                  6.1.2.2. Single Policer on All Physical Ports
                  6.1.2.3. Maximum Policers on All Physical Ports
      6.2. Queue/Scheduler Tests
           6.2.1. Queue/Scheduler Individual Tests
                  6.2.1.1. Testing Queue/Scheduler with Stateless Traffic
                  6.2.1.2. Testing Queue/Scheduler with Stateful Traffic
           6.2.2. Queue/Scheduler Capacity Tests
                  6.2.2.1. Multiple Queues, Single Port Active
                           6.2.2.1.1. Strict Priority on Egress Port
                           6.2.2.1.2. Strict Priority + WFQ on Egress Port
                  6.2.2.2. Single Queue per Port, All Ports Active
                  6.2.2.3. Multiple Queues per Port, All Ports Active
      6.3. Shaper Tests
           6.3.1. Shaper Individual Tests
                  6.3.1.1. Testing Shaper with Stateless Traffic
                  6.3.1.2. Testing Shaper with Stateful Traffic
           6.3.2. Shaper Capacity Tests
                  6.3.2.1. Single Queue Shaped, All Physical Ports Active
                  6.3.2.2. All Queues Shaped, Single Port Active
                  6.3.2.3. All Queues Shaped, All Ports Active
      6.4. Concurrent Capacity Load Tests
   7. Security Considerations
   8. References
      8.1. Normative References
      8.2. Informative References
   Appendix A. Open Source Tools for Traffic Management Testing
   Appendix B. Stateful TCP Test Patterns
   Acknowledgments
   Authors' Addresses

1.  Introduction

   Traffic management (e.g., policing and shaping) is an increasingly
   important component when implementing network Quality of Service
   (QoS).

   There is currently no framework to benchmark these features, although
   some standards address specific areas as described in Section 1.1.

   This document provides a framework to conduct repeatable traffic
   management benchmarks for devices and systems in a lab environment.

   Specifically, this framework defines the methods to characterize the
   capacity of the following traffic management features in network
   devices: classification, policing, queuing/scheduling, and traffic
   shaping.

   This benchmarking framework can also be used as a test procedure to
   assist in the tuning of traffic management parameters before service
   activation.  In addition to Layer 2/3 (Ethernet/IP) benchmarking,
   Layer 4 (TCP) test patterns are proposed by this document in order to
   more realistically benchmark end-user traffic.

1.1.  Traffic Management Overview

   In general, a device with traffic management capabilities performs
   the following functions:

   -  Traffic classification: identifies traffic according to various
      configuration rules (for example, IEEE 802.1Q Virtual LAN (VLAN),
      Differentiated Services Code Point (DSCP)) and marks this traffic
      internally to the network device.  Multiple external priorities
      (DSCP, 802.1p, etc.) can map to the same priority in the device.

   -  Traffic policing: limits the rate of traffic that enters a network
      device according to the traffic classification.  If the traffic
      exceeds the provisioned limits, the traffic is either dropped or
      remarked and forwarded onto the next network device.

   -  Traffic scheduling: provides traffic classification within the
      network device by directing packets to various types of queues and
      applies a dispatching algorithm to assign the forwarding sequence
      of packets.

   -  Traffic shaping: controls traffic by actively buffering and
      smoothing the output rate in an attempt to adapt bursty traffic to
      the configured limits.

   -  Active Queue Management (AQM): involves monitoring the status of
      internal queues and proactively dropping (or remarking) packets,
      which causes hosts using congestion-aware protocols to "back off"
      and in turn alleviate queue congestion [RFC7567].  On the other
      hand, classic traffic management techniques reactively drop (or
      remark) packets based on queue-full conditions.  The benchmarking
      scenarios for AQM are different and are outside the scope of this
      testing framework.

   Even though AQM is outside the scope of this framework, it should be
   noted that the TCP metrics and TCP test patterns (defined in
   Sections 4.2 and 5.2, respectively) could be useful to test new AQM
   algorithms (targeted to alleviate "bufferbloat").  Examples of these
   algorithms include Controlled Delay [CoDel] and Proportional Integral
   controller Enhanced [PIE].

   The following diagram is a generic model of the traffic management
   capabilities within a network device.  It is not intended to
   represent all variations of manufacturer traffic management
   capabilities, but it provides context for this test framework.

    |----------|   |----------------|   |--------------|   |----------|
    |          |   |                |   |              |   |          |
    |Interface |   |Ingress Actions |   |Egress Actions|   |Interface |
    |Ingress   |   |(classification,|   |(scheduling,  |   |Egress    |
    |Queues    |   | marking,       |   | shaping,     |   |Queues    |
    |          |-->| policing, or   |-->| active queue |-->|          |
    |          |   | shaping)       |   | management,  |   |          |
    |          |   |                |   | remarking)   |   |          |
    |----------|   |----------------|   |--------------|   |----------|

   Figure 1: Generic Traffic Management Capabilities of a Network Device

   Ingress actions such as classification are defined in [RFC4689] and
   include IP addresses, port numbers, and DSCP.  In terms of marking,
   [RFC2697] and [RFC2698] define a Single Rate Three Color Marker and a
   Two Rate Three Color Marker, respectively.
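
   As an illustration of the marking behavior that a policer test
   exercises, the following is a minimal sketch of a color-blind Single
   Rate Three Color Marker in the style of [RFC2697].  The class name,
   the continuous-time token refill, and the example rates are
   assumptions made for this sketch; they do not describe any
   particular DUT implementation.

```python
from dataclasses import dataclass


@dataclass
class SrTcm:
    """Color-blind Single Rate Three Color Marker, after RFC 2697."""
    cir: float  # Committed Information Rate, bytes per second
    cbs: float  # Committed Burst Size, bytes
    ebs: float  # Excess Burst Size, bytes

    def __post_init__(self) -> None:
        # Both token buckets start full.
        self.tc = self.cbs
        self.te = self.ebs
        self.last = 0.0  # arrival time of the previous packet, seconds

    def mark(self, now: float, size: int) -> str:
        # Add the tokens earned since the last packet: fill the committed
        # bucket first, then spill any surplus into the excess bucket.
        tokens = self.cir * (now - self.last)
        self.last = now
        room = self.cbs - self.tc
        self.tc += min(tokens, room)
        self.te = min(self.ebs, self.te + max(0.0, tokens - room))
        if self.tc >= size:
            self.tc -= size
            return "green"
        if self.te >= size:
            self.te -= size
            return "yellow"
        return "red"


# Assumed example: 1 Mbit/s CIR (125,000 bytes/s) with 3000-byte buckets.
meter = SrTcm(cir=125_000, cbs=3_000, ebs=3_000)
# Five 1500-byte packets arriving back-to-back at t = 0.
colors = [meter.mark(0.0, 1500) for _ in range(5)]
print(colors)
```

   With both buckets initially full, the first two packets consume the
   committed bucket, the next two consume the excess bucket, and the
   fifth finds neither bucket sufficient.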

   The Metro Ethernet Forum (MEF) specifies policing and shaping in
   terms of ingress and egress subscriber/provider conditioning
   functions as described in MEF 12.2 [MEF-12.2], as well as ingress and
   bandwidth profile attributes as described in MEF 10.3 [MEF-10.3] and
   MEF 26.1 [MEF-26.1].

1.2.  Lab Configuration and Testing Overview

   The following diagram shows the lab setup for the traffic management
   tests:

     +--------------+     +-------+     +----------+    +-----------+
     | Transmitting |     |       |     |          |    | Receiving |
     | Test Host    |     |       |     |          |    | Test Host |
     |              |-----| Device|---->| Network  |--->|           |
     |              |     | Under |     | Delay    |    |           |
     |              |     | Test  |     | Emulator |    |           |
     |              |<----|       |<----|          |<---|           |
     |              |     |       |     |          |    |           |
     +--------------+     +-------+     +----------+    +-----------+

             Figure 2: Lab Setup for Traffic Management Tests

   As shown in the test diagram, the framework supports unidirectional
   and bidirectional traffic management tests (where the transmitting
   and receiving roles would be reversed on the return path).

   This testing framework describes the tests and metrics for each of
   the following traffic management functions:

   -  Classification

   -  Policing

   -  Queuing/scheduling

   -  Shaping

   The tests are divided into individual and rated capacity tests.  The
   individual tests are intended to benchmark the traffic management
   functions according to the metrics defined in Section 4.  The
   capacity tests verify traffic management functions under the load of
   many simultaneous individual tests and their flows.

   This involves concurrent testing of multiple interfaces with the
   specific traffic management function enabled, and increasing the load
   to the capacity limit of each interface.

   For example, a device is specified to be capable of shaping on all of
   its egress ports.  The individual test would first be conducted to
   benchmark the specified shaping function against the metrics defined
   in Section 4.  Then, the capacity test would be executed to test the
   shaping function concurrently on all interfaces and with maximum
   traffic load.

   The Network Delay Emulator (NDE) is required for TCP stateful tests
   in order to allow TCP to utilize a TCP window of significant size in
   its control loop.

   Note also that the NDE SHOULD be passive in nature (e.g., a fiber
   spool).  This is recommended to eliminate the potential effects that
   an active delay element (i.e., test impairment generator) may have on
   the test flows.  In the case where a fiber spool is not practical due
   to the desired latency, an active NDE MUST be independently verified
   to be capable of adding the configured delay without loss.  In other
   words, the Device Under Test (DUT) would be removed and the NDE
   performance benchmarked independently.

   Note that the NDE SHOULD be used only as emulated delay.  Most NDEs
   allow for per-flow delay actions, emulating QoS prioritization.  For
   this framework, the NDE's sole purpose is simply to add delay to all
   packets (emulate network latency).  So, to benchmark the performance
   of the NDE, the maximum offered load should be tested against the
   following frame sizes: 128, 256, 512, 768, 1024, 1500, and
   9600 bytes.  The delay accuracy at each of these packet sizes can
   then be used to calibrate the range of expected Bandwidth-Delay
   Product (BDP) for the TCP stateful tests.
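
   The calibration step above can be sketched as a short calculation.
   The BDP in bytes is the bottleneck bandwidth (bits per second)
   multiplied by the RTT (seconds) and divided by 8.  The measured RTT
   values and the 100 Mbit/s bottleneck below are hypothetical
   calibration data used only for illustration.

```python
def bdp_bytes(bottleneck_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-Delay Product in bytes: BB (bit/s) * RTT (s) / 8."""
    return bottleneck_bps * rtt_seconds / 8


# Hypothetical NDE calibration results: measured RTT in milliseconds at
# each frame size, for a configured RTT of 25 ms.
measured_rtt_ms = {
    128: 25.02, 256: 25.02, 512: 25.03, 768: 25.04,
    1024: 25.05, 1500: 25.06, 9600: 25.20,
}
bb = 100e6  # assumed bottleneck bandwidth: 100 Mbit/s

bdps = [bdp_bytes(bb, ms / 1000) for ms in measured_rtt_ms.values()]
nominal = bdp_bytes(bb, 0.025)
print(f"nominal BDP:  {nominal:.0f} bytes")
print(f"expected BDP: {min(bdps):.0f} to {max(bdps):.0f} bytes")
```

   The spread between the smallest and largest value gives the range of
   BDP that the stateful tests can expect from this NDE.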

2.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The following acronyms are used:

      AQM: Active Queue Management

      BB: Bottleneck Bandwidth

      BDP: Bandwidth-Delay Product

      BSA: Burst Size Achieved

      CBS: Committed Burst Size

      CIR: Committed Information Rate

      DUT: Device Under Test

      EBS: Excess Burst Size

      EIR: Excess Information Rate

      NDE: Network Delay Emulator

      QL: Queue Length

      QoS: Quality of Service

      RTT: Round-Trip Time

      SBB: Shaper Burst Bytes

      SBI: Shaper Burst Interval

      SP: Strict Priority

      SR: Shaper Rate

      SSB: Send Socket Buffer

      SUT: System Under Test

      Ti: Transmission Interval

      TTP: TCP Test Pattern

      TTPET: TCP Test Pattern Execution Time

3.  Scope and Goals

   The scope of this work is to develop a framework for benchmarking and
   testing the traffic management capabilities of network devices in the
   lab environment.  These network devices may include but are not
   limited to:

   -  Switches (including Layer 2/3 devices)

   -  Routers

   -  Firewalls

   -  General Layer 4-7 appliances (Proxies, WAN Accelerators, etc.)

   Essentially, any network device that performs traffic management as
   defined in Section 1.1 can be benchmarked or tested with this
   framework.

   The primary goal is to assess the maximum forwarding performance
   deemed to be within the provisioned traffic limits that a network
   device can sustain without dropping or impairing packets, and without
   compromising the accuracy of multiple instances of traffic management
   functions.  This is the benchmark for comparison between devices.

   Within this framework, the metrics are defined for each traffic
   management test but do not include pass/fail criteria, which are not
   within the charter of the BMWG.  This framework provides the test
   methods and metrics to conduct repeatable testing, which will provide
   the means to compare measured performance between DUTs.

   As mentioned in Section 1.2, these methods describe the individual
   tests and metrics for several management functions.  It is also
   within scope that this framework will benchmark each function in
   terms of overall rated capacity.  This involves concurrent testing of
   multiple interfaces with the specific traffic management function
   enabled, up to the capacity limit of each interface.

   It is not within the scope of this framework to specify the procedure
   for testing multiple configurations of traffic management functions
   concurrently.  The number of possible combinations is almost
   unbounded, and identifying functional "break points" would be
   almost impossible.

   However, Section 6.4 provides suggestions for some profiles of
   concurrent functions that would be useful to benchmark.  The key
   requirement for any concurrent test function is that tests MUST
   produce reliable and repeatable results.

   Also, it is not within scope to perform conformance testing.  Tests
   defined in this framework benchmark the traffic management functions
   according to the metrics defined in Section 4 and do not address any
   conformance to standards related to traffic management.

   Current specifications do not define exact behavior or
   implementation, and the specifications that do exist (cited in
   Section 1.1) allow implementations to vary with regard to short-term
   rate accuracy and other factors.  This is a primary driver for this
   framework: to provide an objective means to compare vendor traffic
   management functions.

   Another goal is to devise methods that utilize flows with congestion-
   aware transport (TCP) as part of the traffic load and still produce
   repeatable results in the isolated test environment.  This framework
   will derive stateful test patterns (TCP or application layer) that
   can also be used to further benchmark the performance of applicable
   traffic management techniques such as queuing/scheduling and traffic
   shaping.  In cases where the network device is stateful in nature
   (e.g., a firewall), stateful test pattern traffic is important to
   test, along with stateless UDP traffic in specific test scenarios
   (e.g., applications using TCP transport and UDP VoIP).

   As mentioned earlier in this document, repeatability of test results
   is critical, especially considering the nature of stateful TCP
   traffic.  To this end, the stateful tests will use TCP test patterns
   to emulate applications.  This framework also provides guidelines for
   application modeling and open source tools to achieve the repeatable
   stimulus.  Finally, TCP metrics from [RFC6349] MUST be measured for
   each stateful test and provide the means to compare each repeated
   test.

   Even though this framework targets the testing of TCP applications
   (e.g., web, email, database), it could also be applied to the
   Stream Control Transmission Protocol (SCTP) in terms of test
   patterns.  WebRTC, Signaling System 7 (SS7) signaling, and 3GPP are
   SCTP-based applications that could be modeled with this framework to
   benchmark SCTP's effect on traffic management performance.

   Note that at the time of this writing, this framework does not
   address tcpcrypt (encrypted TCP) test patterns, although the metrics
   defined in Section 4.2 can still be used because the metrics are
   based on TCP retransmission and RTT measurements (versus any of the
   payload).  Thus, if tcpcrypt becomes popular, it would be natural for
   benchmarkers to consider encrypted TCP patterns and include them in
   test cases.

4.  Traffic Benchmarking Metrics

   The metrics to be measured during the benchmarks are divided into two
   (2) sections: packet-layer metrics used for the stateless traffic
   testing and TCP-layer metrics used for the stateful traffic testing.

4.1.  Metrics for Stateless Traffic Tests

   Stateless traffic measurements require that a sequence number and
   timestamp be inserted into the payload for lost-packet analysis.
   Delay analysis may be achieved by insertion of timestamps directly
   into the packets or timestamps stored elsewhere (packet captures).
   This framework does not specify the packet format to carry sequence
   number or timing information.

   However, [RFC4737] and [RFC4689] provide recommendations for sequence
   tracking, along with definitions of in-sequence and out-of-order
   packets.

   The following metrics MUST be measured during the stateless traffic
   benchmarking components of the tests:

   -  Burst Size Achieved (BSA): For the traffic policing and network
      queue tests, the tester will be configured to send bursts to test
      either the Committed Burst Size (CBS) or Excess Burst Size (EBS)
      of a policer or the queue/buffer size configured in the DUT.  The
      BSA metric is a measure of the actual burst size received at the
      egress port of the DUT with no lost packets.  For example, the
      configured CBS of a DUT is 64 KB, and after the burst test, only a
      63 KB burst can be achieved without packet loss.  Then, 63 KB is
      the BSA.  Also, the average Packet Delay Variation (PDV) (see
      below) as experienced by the packets sent at the BSA burst size
      should be recorded.  This metric SHALL be reported in units of
      bytes, KB, or MB.

   -  Lost Packets (LP): For all traffic management tests, the tester
      will transmit the test packets into the DUT ingress port, and the
      number of packets received at the egress port will be measured.
      The difference between packets transmitted into the ingress port
      and received at the egress port is the number of lost packets as
      measured at the egress port.  These packets must have unique
      identifiers such that only the test packets are measured.  For
      cases where multiple flows are transmitted from the ingress port
      to the egress port (e.g., IP conversations), each flow must have
      sequence numbers within the stream of test packets.

   [RFC6703] and [RFC2680] describe the need to establish the time
   threshold to wait before a packet is declared as lost.  This
   threshold MUST be reported, with the results reported as an integer
   number that cannot be negative.

   -  Out-of-Sequence (OOS): In addition to the LP metric, the test
      packets must be monitored for sequence.  [RFC4689] defines the
      general function of sequence tracking, as well as definitions for
      in-sequence and out-of-order packets.  Out-of-order packets will
      be counted per [RFC4737].  This metric SHALL be reported as an
      integer number that cannot be negative.

   -  Packet Delay (PD): The PD metric is the difference between the
      timestamp of the received egress port packets and the packets
      transmitted into the ingress port, as specified in [RFC1242].  The
      transmitting host and receiving host time must be in time sync
      (achieved by using NTP, GPS, etc.).  This metric SHALL be reported
      as a real number of seconds, where a negative measurement usually
      indicates a time synchronization problem between test devices.

   -  Packet Delay Variation (PDV): The PDV metric is the variation in
      delay among the packets received at the egress port, as
      specified in [RFC5481].  Note that per [RFC5481], this PDV is the
      variation of one-way delay across many packets in the traffic
      flow.  Per the measurement formula in [RFC5481], select the high
      percentile of 99%; the unit of measure is a real number of
      seconds (a negative value is not possible for the PDV and would
      indicate a measurement error).

   -  Shaper Rate (SR): The SR represents the average DUT output rate
      (bps) over the test interval.  The SR is only applicable to the
      traffic-shaping tests.

   -  Shaper Burst Bytes (SBB): A traffic shaper will emit packets in
      "trains" of different sizes; these frames are emitted "back-to-
      back" with respect to the mandatory interframe gap.  This metric
      characterizes the method by which the shaper emits traffic.  Some
      shapers transmit larger bursts per interval, and a burst of
      one packet would apply to the less common case of a shaper sending
      a constant-bitrate stream of single packets.  This metric SHALL be
      reported in units of bytes, KB, or MB.  The SBB metric is only
      applicable to the traffic-shaping tests.

   -  Shaper Burst Interval (SBI): The SBI is the time between bursts
      emitted by the shaper and is measured at the DUT egress port.
      This metric SHALL be reported as a real number of seconds.  The
      SBI is only applicable to the traffic-shaping tests.
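
   The packet-layer metrics above can be derived from sequence numbers
   and timestamps, as in the following minimal sketch.  It assumes a
   single flow, synchronized clocks, no duplicate packets, and at least
   one received packet.  The record layout, the function name, and the
   nearest-rank percentile method are assumptions of this sketch, since
   this framework does not specify a packet format.

```python
import math


def stateless_metrics(sent, received, pdv_percentile=99.0):
    """Derive LP, OOS, PD, and PDV for one flow.

    sent:     {sequence number: transmit timestamp in seconds}
    received: [(sequence number, receive timestamp)] in arrival order
    """
    lost = len(sent) - len({seq for seq, _ in received})

    # Out-of-order counting in the style of RFC 4737: a packet is
    # reordered when its sequence number is below the next expected one.
    next_expected = 0
    out_of_sequence = 0
    for seq, _ in received:
        if seq >= next_expected:
            next_expected = seq + 1
        else:
            out_of_sequence += 1

    # One-way delay per packet; clocks are assumed to be synchronized.
    delays = sorted(rx - sent[seq] for seq, rx in received)
    # PDV in the style of RFC 5481: delay relative to the minimum delay,
    # summarized at a high percentile (nearest-rank method, an assumption).
    rank = max(1, math.ceil(pdv_percentile / 100 * len(delays)))
    pdv = delays[rank - 1] - delays[0]
    return {
        "LP": lost,
        "OOS": out_of_sequence,
        "PD_avg": sum(delays) / len(delays),
        "PDV": pdv,
    }


# Hypothetical capture: five packets sent 1 ms apart; packet 3 is lost
# and packet 1 arrives after packet 2.
sent = {0: 0.000, 1: 0.001, 2: 0.002, 3: 0.003, 4: 0.004}
received = [(0, 0.0105), (2, 0.0127), (1, 0.0130), (4, 0.0145)]
metrics = stateless_metrics(sent, received)
print(metrics)
```

   In this example, one packet is lost and one is out of sequence; the
   PDV is the spread between the largest and smallest one-way delay
   because only four samples are available.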

4.2.  Metrics for Stateful Traffic Tests

   The stateful metrics will be based on [RFC6349] TCP metrics and MUST
   include:

   -  TCP Test Pattern Execution Time (TTPET): [RFC6349] defined the TCP
      Transfer Time for bulk transfers, which is simply the measured
      time to transfer bytes across single or concurrent TCP
      connections.  The TCP test patterns used in traffic management
      tests will include bulk transfer and interactive applications.
      The interactive patterns include instances such as HTTP business
      applications and database applications.  The TTPET will be the
      measure of the time for a single execution of a TCP Test Pattern
      (TTP).  Average, minimum, and maximum times will be measured or
      calculated and expressed as a real number of seconds.

   An example would be an interactive HTTP TTP session that should take
   5 seconds on a GigE network with 0.5-millisecond latency.  During ten
   (10) executions of this TTP, the TTPET results might be an average of
   6.5 seconds, a minimum of 5.0 seconds, and a maximum of 7.9 seconds.

   -  TCP Efficiency: After the execution of the TTP, TCP Efficiency
      represents the percentage of bytes that were not retransmitted.

                         Transmitted Bytes - Retransmitted Bytes
     TCP Efficiency % =  ---------------------------------------  X 100
                                  Transmitted Bytes

   "Transmitted Bytes" is the total number of TCP bytes to be
   transmitted, including the original bytes and the retransmitted
   bytes.  To avoid any misinterpretation that a reordered packet is a
   retransmitted packet (as may be the case with packet decode
   interpretation), these retransmitted bytes should be recorded from
   the perspective of the sender's TCP/IP stack.

   -  Buffer Delay: Buffer Delay represents the increase in RTT during a
      TCP test versus the baseline DUT RTT (non-congested, inherent
      latency).  RTT and the technique to measure RTT (average versus
      baseline) are defined in [RFC6349].  Referencing [RFC6349], the
      average RTT is derived from the total of all measured RTTs during
      the actual test sampled at every second divided by the test
      duration in seconds.

                                      Total RTTs during transfer
     Average RTT during transfer =  ------------------------------
                                     Transfer duration in seconds


                     Average RTT during transfer - Baseline RTT
   Buffer Delay % =  ------------------------------------------  X 100
                                 Baseline RTT

   Note that even though this was not explicitly stated in [RFC6349],
   retransmitted packets should not be used in RTT measurements.

   Also, the test results should record the average RTT in milliseconds
   across the entire test duration, as well as the number of samples.
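
   The stateful metrics defined in this section reduce to a few lines
   of arithmetic.  The sketch below assumes one RTT sample per second,
   so that the number of samples equals the transfer duration in
   seconds.  The function names and all input values are hypothetical.

```python
def tcp_efficiency_pct(transmitted_bytes, retransmitted_bytes):
    """Percentage of transmitted bytes that were not retransmissions.

    transmitted_bytes includes both original and retransmitted bytes.
    """
    return (transmitted_bytes - retransmitted_bytes) / transmitted_bytes * 100


def average_rtt(rtt_samples):
    """Mean of RTT samples taken once per second during the transfer."""
    return sum(rtt_samples) / len(rtt_samples)


def buffer_delay_pct(avg_rtt, baseline_rtt):
    """Increase in RTT over the baseline, as a percentage."""
    return (avg_rtt - baseline_rtt) / baseline_rtt * 100


def ttpet_summary(execution_times):
    """Average, minimum, and maximum TTPET in seconds."""
    return (sum(execution_times) / len(execution_times),
            min(execution_times), max(execution_times))


# Hypothetical results from one stateful test.
efficiency = tcp_efficiency_pct(102_000, 2_000)
avg = average_rtt([0.030, 0.032, 0.034, 0.032])  # seconds
delay = buffer_delay_pct(avg, baseline_rtt=0.025)
ttpet = ttpet_summary([5.0, 6.6, 7.9])

print(f"TCP Efficiency: {efficiency:.2f}%")
print(f"Average RTT:    {avg * 1000:.1f} ms")
print(f"Buffer Delay:   {delay:.1f}%")
print(f"TTPET avg/min/max: {ttpet[0]:.1f}/{ttpet[1]:.1f}/{ttpet[2]:.1f} s")
```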

5.  Tester Capabilities

   The testing capabilities of the traffic management test environment
   are divided into two (2) sections: stateless traffic testing and
   stateful traffic testing.

5.1.  Stateless Test Traffic Generation

   The test device MUST be capable of generating traffic at up to the
   link speed of the DUT.  The test device must be calibrated to verify
   that it will not drop any packets.  The test device's inherent PD and
   PDV must also be calibrated and subtracted from the PD and PDV
   metrics.  The test device must support the encapsulation to be
   tested, e.g., IEEE 802.1Q VLAN, IEEE 802.1ad Q-in-Q, Multiprotocol
   Label Switching (MPLS).  Also, the test device must allow control of
   the classification techniques defined in [RFC4689] (e.g., IP address,
   DSCP, classification of Type of Service).

   The open source tool "iperf" can be used to generate stateless UDP
   traffic and is discussed in Appendix A.  Since iperf is a software-
   based tool, there will be performance limitations at higher link
   speeds (e.g., 1 GigE, 10 GigE).  Careful calibration of any test
   environment using iperf is important.  At higher link speeds, using
   hardware-based packet test equipment is recommended.

5.1.1.  Burst Hunt with Stateless Traffic

   A central theme for the traffic management tests is to benchmark the
   specified burst parameter of a traffic management function, since
   burst parameters listed in Service Level Agreements (SLAs) are
   specified in bytes.  For testing efficiency, including a burst hunt
   feature is recommended, as this feature automates the manual process
   of determining the maximum burst size that can be supported by a
   traffic management function.

   The burst hunt algorithm should start at the target burst size
   (maximum burst size supported by the traffic management function) and
   will send single bursts until it can determine the largest burst that
   can pass without loss.  If the target burst size passes, then the
   test is complete.  The "hunt" aspect occurs when the target burst
   size is not achieved; the algorithm will drop down to a configured
   minimum burst size and incrementally increase the burst until the
   maximum burst supported by the DUT is discovered.  The recommended
   granularity of the incremental burst size increase is 1 KB.

   For a policer function, if the burst size passes, the burst should be
   increased by increments of 1 KB to verify that the policer is truly
   configured properly (or enabled at all).
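
   The burst hunt described above can be sketched as follows.  The
   send_burst callable is a hypothetical tester hook that sends a
   single burst through the DUT and reports whether it passed without
   loss.  The 1000-byte definition of 1 KB and the simulated DUT are
   likewise assumptions of this sketch.

```python
from typing import Callable, Optional

KB = 1000  # assumption: 1 KB = 1000 bytes; use 1024 if the SLA does


def burst_hunt(send_burst: Callable[[int], bool], target: int,
               minimum: int, step: int = KB) -> Optional[int]:
    """Return the largest burst size (bytes) that passes without loss.

    send_burst(size) is a hypothetical tester hook: it sends one burst of
    `size` bytes through the DUT and returns True if no packet was lost.
    """
    if send_burst(target):
        return target  # target achieved, no hunt needed
    best = None
    size = minimum
    # Climb from the configured minimum in fixed increments until a
    # burst is lost or the target size is reached.
    while size < target and send_burst(size):
        best = size
        size += step
    return best  # None means even the minimum burst saw loss


# Simulated DUT that passes bursts of up to 63 KB without loss.
def simulated_dut(size: int) -> bool:
    return size <= 63 * KB


bsa = burst_hunt(simulated_dut, target=64 * KB, minimum=8 * KB)
print(f"BSA: {bsa} bytes")
```

   A linear climb is used here because it follows the description
   directly; a tester could substitute a binary search to reduce the
   number of bursts sent, provided that the reported granularity is
   unchanged.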

5.2.  Stateful Test Pattern Generation

   The TCP test host will have many of the same attributes as the TCP
   test host defined in [RFC6349].  The TCP test device may be a
   standard computer or a dedicated communications test instrument.  In
   both cases, it must be capable of emulating both a client and a
   server.

   For any test using stateful TCP test traffic, the Network Delay
   Emulator (the NDE function as shown in the lab setup diagram in
   Section 1.2) must be used in order to provide a meaningful BDP.  As
   discussed in Section 1.2, the target traffic rate and configured RTT
   MUST be verified independently, using just the NDE for all stateful
   tests (to ensure that the NDE can add delay without inducing any
   packet loss).

   The TCP test host MUST be capable of generating and receiving
   stateful TCP test traffic at the full link speed of the DUT.  As a
   general rule of thumb, testing TCP throughput at rates greater than
   500 Mbps may require high-performance server hardware or dedicated
   hardware-based test tools.

   The TCP test host MUST allow the adjustment of both Send and Receive
   Socket Buffer sizes.  The Socket Buffers must be large enough to fill
   the BDP for bulk transfer of TCP test application traffic.
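
   As a rough worked example of sizing the Socket Buffers, the BDP in
   bytes is the bottleneck rate (in bits per second) multiplied by the
   RTT (in seconds), divided by 8.  The sketch below computes this value
   and requests matching Send and Receive Socket Buffers through the
   standard sockets API.  The function names and the 100 Mbps / 25 ms
   figures are illustrative assumptions; note that the operating system
   may clamp or adjust the requested sizes, so the granted value should
   be read back.

```python
import socket


def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: rate (bit/s) * RTT (s) / 8."""
    return rate_bps * rtt_ms // 8000


def open_test_socket(buf_bytes: int) -> socket.socket:
    """Create a TCP socket and request send/receive buffers of buf_bytes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    return sock


if __name__ == "__main__":
    bdp = bdp_bytes(rate_bps=100_000_000, rtt_ms=25)
    with open_test_socket(bdp) as sock:
        # Read back what the OS actually granted; it may differ.
        granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    print(f"BDP = {bdp} bytes, receive buffer granted = {granted} bytes")
```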

   Measuring RTT and retransmissions per connection will generally
   require a dedicated communications test instrument.  In the absence
   of dedicated hardware-based test tools, these measurements may need
   to be conducted with packet capture tools; i.e., conduct TCP
   throughput tests, and analyze RTT and retransmissions in packet
   captures.
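
   The packet-capture approach can be illustrated with a simplified
   analysis pass.  The sketch below is not a capture parser: the
   "Packet" record is a hypothetical, already-decoded view of one TCP
   connection (relative sequence numbers, one data direction), and the
   heuristics are deliberately basic.  A data segment that does not
   advance the highest byte sent is counted as a retransmission, and
   RTT samples are taken only from segments sent exactly once.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Packet:
    time_s: float       # capture timestamp in seconds
    from_sender: bool   # True: data direction; False: returning ACKs
    seq: int = 0        # relative sequence number of first payload byte
    length: int = 0     # payload bytes
    ack: int = 0        # cumulative ACK number (used for returning ACKs)


def analyze(packets: List[Packet]) -> Tuple[int, List[float]]:
    """Return (retransmission count, list of RTT samples in seconds)."""
    highest_end = 0
    retransmissions = 0
    first_sent: Dict[int, float] = {}   # segment end -> first send time
    rtt_samples: List[float] = []

    for pkt in packets:
        if pkt.from_sender:
            if pkt.length == 0:
                continue  # ignore pure ACKs in the data direction
            end = pkt.seq + pkt.length
            if end <= highest_end:
                retransmissions += 1
                # Do not time a segment that was sent more than once.
                first_sent.pop(end, None)
            else:
                highest_end = end
                first_sent[end] = pkt.time_s
        else:
            acked = [end for end in first_sent if end <= pkt.ack]
            if acked:
                # One sample per ACK, from the newest segment it covers.
                rtt_samples.append(pkt.time_s - first_sent[max(acked)])
                for end in acked:
                    del first_sent[end]
    return retransmissions, rtt_samples


if __name__ == "__main__":
    capture = [
        Packet(0.000, True, seq=0, length=1000),
        Packet(0.050, False, ack=1000),
        Packet(0.052, True, seq=1000, length=1000),
        Packet(0.300, True, seq=1000, length=1000),  # sent again
        Packet(0.350, False, ack=2000),
    ]
    print(analyze(capture))
```

   Partially overlapping retransmissions and selective acknowledgments
   are not handled; a production analysis would need to account for
   both.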

   The TCP implementation used by the test host MUST be specified in the
   test results (e.g., TCP NewReno, TCP options supported).
   Additionally, the test results SHALL provide specific congestion
   control algorithm details, as per [RFC3148].

   While [RFC6349] defined the means to conduct throughput tests of TCP
   bulk transfers, the traffic management framework will extend TCP test
   execution into interactive TCP application traffic.  Examples include
   email, HTTP, and business applications.  This interactive traffic is
   bidirectional and can be chatty, meaning many turns in traffic
   communication during the course of a transaction (versus the
   relatively unidirectional flow of bulk transfer applications).

   The test device MUST support both bulk TCP transfer application
   traffic and chatty traffic.  A valid stress test SHOULD include both
   traffic types.  This is due to the non-uniform, bursty nature of
   chatty applications versus the relatively uniform nature of bulk
   transfers (a bulk transfer smoothly stabilizes to an equilibrium
   state under lossless conditions).

   While iperf is an excellent choice for TCP bulk transfer testing, the
   "netperf" open source tool provides the ability to control client and
   server request/response behavior.  The netperf-wrapper tool is a
   Python script that runs multiple simultaneous netperf instances and
   aggregates the results.  Appendix A provides an overview of
   netperf/netperf-wrapper, as well as iperf.  As with any software-
   based tool, the performance must be qualified to the link speed to be
   tested.  Hardware-based test equipment should be considered for
   reliable results at higher link speeds (e.g., 1 GigE, 10 GigE).

5.2.1.  TCP Test Pattern Definitions

   As mentioned in the goals of this framework, this section defines
   techniques for specifying TCP traffic test patterns that benchmark
   traffic management functions and produce repeatable results.  Some
   network devices, such as firewalls, will not process stateless test
   traffic; this is another reason why stateful TCP test traffic must
   be used.

   An application could be fully emulated up to Layer 7; however, this
   framework proposes that stateful TCP test patterns be used in order
   to provide granular and repeatable control for the benchmarks.  The
   following diagram illustrates a simple web-browsing application
   (HTTP).

                             GET URL

             Client      ------------------------->   Web
                                                  |
             Web             200 OK        100 ms |
                                                  |
             Browser     <-------------------------   Server

            Figure 3: Simple Flow Diagram for a Web Application

   In this example, the Client Web Browser (client) requests a URL, and
   then the Web Server delivers the web page content to the client
   (after a server delay of 100 milliseconds).  This asynchronous
   "request/response" behavior is intrinsic to most TCP-based
   applications, such as email (SMTP), file transfers (FTP and Server
   Message Block (SMB)), databases (SQL), and web applications (SOAP and
   Representational State Transfer (REST)).  The impact on the network
   elements is due to the multitudes of clients and the variety of
   bursty traffic, which stress traffic management functions.  The
   actual emulation of the specific application protocols is not
   required, and TCP test patterns can be defined to mimic the
   application network traffic flows and produce repeatable results.
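
   The request/response turn in Figure 3 can be mimicked at the TCP
   layer without emulating HTTP itself.  The sketch below runs one such
   turn over the loopback interface: the client sends a fixed number of
   request bytes, the server waits for the server delay, and then
   returns a fixed number of response bytes.  The 100 ms delay comes
   from the figure; the request and response sizes, the function names,
   and the use of loopback are illustrative assumptions.

```python
import socket
import threading
import time
from typing import Tuple


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(min(65536, n - len(buf)))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf.extend(chunk)
    return bytes(buf)


def serve_once(listener: socket.socket, request_bytes: int,
               response_bytes: int, server_delay_s: float) -> None:
    """Accept one connection and answer one request after a delay."""
    conn, _ = listener.accept()
    with conn:
        recv_exact(conn, request_bytes)
        time.sleep(server_delay_s)            # emulated server think time
        conn.sendall(b"\x00" * response_bytes)


def run_transaction(request_bytes: int = 200,
                    response_bytes: int = 8000,
                    server_delay_s: float = 0.1) -> Tuple[int, float]:
    """Run one turn; return (response bytes received, elapsed seconds)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))       # any free loopback port
        listener.listen(1)
        server = threading.Thread(
            target=serve_once,
            args=(listener, request_bytes, response_bytes, server_delay_s))
        server.start()
        start = time.monotonic()
        with socket.create_connection(listener.getsockname()) as client:
            client.sendall(b"\x00" * request_bytes)
            received = len(recv_exact(client, response_bytes))
        elapsed = time.monotonic() - start
        server.join()
    return received, elapsed


if __name__ == "__main__":
    print(run_transaction())
```

   In a real test, the client and server roles would run on the TCP
   test hosts on either side of the DUT and NDE rather than on
   loopback.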

   Application modeling techniques have been proposed in
   [3GPP2-C_R1002-A], which provides examples to model the behavior of
   HTTP, FTP, and Wireless Application Protocol (WAP) applications at
   the TCP layer.  The models have been defined with various
   mathematical distributions for the request/response bytes and
   inter-request gap times.  The model definition formats described in
   [3GPP2-C_R1002-A] are the basis for the guidelines provided in
   Appendix B and are also similar to formats used by network modeling
   tools.  Packet captures can also be used to characterize application
   traffic and specify some of the test patterns listed in Appendix B.
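
   A distribution-driven test pattern of the kind described above can be
   sketched as follows.  The specific distributions and parameters
   (uniform request sizes, lognormal response sizes, exponential
   inter-request gaps) are illustrative assumptions and are not taken
   from [3GPP2-C_R1002-A] or Appendix B.  A fixed seed is used so that
   the same pattern can be regenerated, in keeping with the goal of
   repeatable results.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Turn:
    request_bytes: int    # bytes sent by the client
    response_bytes: int   # bytes returned by the server
    gap_s: float          # idle time before the next request


def generate_pattern(turns: int, seed: int) -> List[Turn]:
    """Generate a repeatable list of request/response turns."""
    rng = random.Random(seed)   # private generator: same seed, same pattern
    pattern = []
    for _ in range(turns):
        pattern.append(Turn(
            # Assumed parameters, chosen only for illustration.
            request_bytes=rng.randint(200, 600),
            response_bytes=max(1, int(rng.lognormvariate(9.0, 1.0))),
            gap_s=rng.expovariate(1 / 2.0),   # mean gap of 2 seconds
        ))
    return pattern


if __name__ == "__main__":
    for turn in generate_pattern(turns=5, seed=7640):
        print(turn)
```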

   This framework does not specify a fixed set of TCP test patterns but
   does provide test cases that SHOULD be performed; see Appendix B.
   Some of these examples reflect those specified in [CA-Benchmark],
   which suggests traffic mixes for a variety of representative
   application profiles.  Other examples are simply well-known
   application traffic types such as HTTP.

