Network Working Group D. Newman Request for Comments: 4814 Network Test Category: Informational T. Player Spirent Communications March 2007 Hash and Stuffing: Overlooked Factors in Network Device Benchmarking Status of This Memo This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The IETF Trust (2007).
AbstractTest engineers take pains to declare all factors that affect a given measurement, including intended load, packet length, test duration, and traffic orientation. However, current benchmarking practice overlooks two factors that have a profound impact on test results. First, existing methodologies do not require the reporting of addresses or other test traffic contents, even though these fields can affect test results. Second, "stuff" bits and bytes inserted in test traffic by some link-layer technologies add significant and variable overhead, which in turn affects test results. This document describes the effects of these factors; recommends guidelines for test traffic contents; and offers formulas for determining the probability of bit- and byte-stuffing in test traffic.
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. General Considerations . . . . . . . . . . . . . . . . . . . . 4 3.1. Repeatability . . . . . . . . . . . . . . . . . . . . . . 4 3.2. Randomness . . . . . . . . . . . . . . . . . . . . . . . . 4 4. Packet Content Variations . . . . . . . . . . . . . . . . . . 5 4.1. Problem Statement . . . . . . . . . . . . . . . . . . . . 5 4.2. IEEE 802 MAC Addresses . . . . . . . . . . . . . . . . . . 7 4.2.1. Randomized Sets of MAC Addresses . . . . . . . . . . . 8 4.3. MPLS Addressing . . . . . . . . . . . . . . . . . . . . . 9 4.4. Network-layer Addressing . . . . . . . . . . . . . . . . . 9 4.5. Transport-Layer Addressing . . . . . . . . . . . . . . . . 10 4.6. Application-Layer Patterns . . . . . . . . . . . . . . . . 10 5. Control Character Stuffing . . . . . . . . . . . . . . . . . . 11 5.1. Problem Statement . . . . . . . . . . . . . . . . . . . . 11 5.2. PPP Bit-Stuffing . . . . . . . . . . . . . . . . . . . . . 12 5.2.1. Calculating Bit-Stuffing Probability . . . . . . . . . 14 5.2.2. Bit-Stuffing for Finite Strings . . . . . . . . . . . 15 5.2.3. Applied Bit-Stuffing . . . . . . . . . . . . . . . . . 16 5.3. POS Byte-Stuffing . . . . . . . . . . . . . . . . . . . . 16 5.3.1. Nullifying ACCM . . . . . . . . . . . . . . . . . . . 17 5.3.2. Other Stuffed Characters . . . . . . . . . . . . . . . 17 5.3.3. Applied Byte-Stuffing . . . . . . . . . . . . . . . . 17 6. Security Considerations . . . . . . . . . . . . . . . . . . . 18 7. Normative References . . . . . . . . . . . . . . . . . . . . . 19 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 20 Appendix B. Proof of Formula for Finite Bit-Stuffing . . . . . . 20 Appendix C. Explicit Calculation of Bit-Stuffing Overhead for IPv4 . . . . . . . . . . . . . . . . . . . . . . . . 21 Appendix D. Explicit Calculation of Bit-Stuffing Overhead for IPv6 . . . . . . . . . . . . . . . . . . . . . . . . 23 Appendix E. Terminology . . . . . . . . . . . . . . . . . . . . . 24
RFC2544] and [RFC2889] do not require any declaration of packet contents. These methodologies do require the declaration of test parameters such as traffic distribution and traffic orientation, and yet packet contents can have at least as great an impact on test results as the other factors. Variations in packet contents also can lead to non-repeatability of test results: Two individuals may follow methodology procedures to the letter, and still obtain very different results. A related issue is the insertion of stuff bits or bytes by link-layer technologies using PPP with High-Level Data Link Control (HDLC)-like framing. This stuffing is done to ensure sequences in test traffic will not be confused with control characters. Stuffing adds significant and variable overhead. Currently there is no standard method for determining the probability that stuffing will occur for a given pattern, and thus no way to determine what impact stuffing will have on test results. This document covers two areas. First, we discuss strategies for dealing with randomness and nonrandomness in test traffic. Second, we present formulas to determine the probability of bit- and byte- stuffing on Point-to-Point Protocol (PPP) and Packet over SONET (POS) circuits. In both areas, we provide recommendations for obtaining better repeatability in test results. Benchmarking activities as described in this memo are limited to technology characterization using controlled stimuli in a laboratory environment, using dedicated address space. The benchmarking network topology will be an independent test setup and MUST NOT be connected to devices that may forward the test traffic into a production network, or misroute traffic to the test management network. RFC2119].
Specifically, for any random bit pattern of length L, the probability of generating that specific pattern SHOULD equal 1 over 2 to the Lth power. Section 3.2 describes random uniform distributions in more detail.) In practice, the actual outcome of the hash (and thus any test results) will be very different depending on the degree of randomness in test traffic.
Suppose the traffic is nonrandom so that every interface of the test instrument uses this pattern in its source MAC addresses: 00:00:PP:00:00:01 where PP is the source interface number of the test instrument. In this case, the least significant 3 bits of every source and destination MAC address are 001, regardless of interface number. Thus, the outcome of the XOR operation will always be 0, given the same three least significant bits: 001 ^ 001 = 000 Thus, the switch will assign all traffic to NP0, leaving the other seven NPs idle. Given a heavy enough load, NP0 and the switch will become congested, even though seven other NPs are available. At most, this device will be able to utilize approximately 12.5 percent of its total capacity, with the remaining 87.5 percent of capacity unused. Now consider the same example with randomly distributed addresses. In this case, the test instrument offers traffic using MAC addresses with this pattern: 00:00:PP:00:00:RR where PP is the source interface number of the test instrument and RR is a pseudorandom number. In this case, there should be an equal probability of the least significant 3 bits of the MAC address having any value from 000 to 111 inclusive. Thus, the outcome of XOR operations should be equally distributed from 0 to 7, and distribution across NPs should also be equal (at least for this particular 3-bit hashing algorithm). Absent other impediments, the device should be able to utilize 100 percent of available capacity. This simple example presumes knowledge on the tester's part of the hashing algorithm used by the device under test. Knowledge of such algorithms is not always possible beforehand, and in any event violates the "black box" spirit of many documents produced by the IETF Benchmarking Working Group (BMWG). Therefore, this memo adds a new consideration for benchmarking methodologies to select traffic patterns that overcome the effects of nonrandomness, regardless of the hashing algorithms in use. The balance of this section offers recommendations for test traffic patterns to avoid these effects, starting at the link layer and working up to the application layer.
Note also that since only 30 of 48 bits in the MAC address have pseudorandom values, there is no possibility of randomly generating a broadcast or multicast value by accident.
Every MAC address SHOULD be pseudorandom, not just the starting one. When generating traffic with multiple addresses, it is RECOMMENDED that all addresses use pseudorandom values. There are multiple ways to use sets of pseudorandom numbers. One strategy would be for the test instrument to iterate over an array of pseudorandom values rather than incrementing/decrementing from a starting address. The actual method is an implementation detail; in the end, any method that uses multiple addresses with pseudorandom patterns will be sufficient. Experience with benchmarking of IEEE 802.11 devices suggests suboptimal test outcomes may result if different pseudorandom MAC and IP addresses are used from trial to trial. In such cases (not just for 802.11 but for any device using IEEE 802 MAC and IP addresses), testers MAY generate a pseudorandom set of MAC and IP addresses once, or MAY generate a nonrandom set of MAC and IP addresses once. In either case, the same MAC and IP addresses MUST be used in all trials. RFC3032], this avoids using reserved MPLS labels in the range of 0-15 inclusive.
purpose. Others may perform a hash on one or more bytes in the source and destination IP addresses. Just as in the case of MAC addresses, nonrandom IP addresses can have an adverse effect on the outcome of ECMP link assignment decisions. When benchmarking devices that implement ECMP or any other form of Layer 3 aggregation, it is RECOMMENDED to use a randomly distributed range of IP addresses. In particular, testers SHOULD NOT use addresses that produce the undesired effects of address processing. If, for example, a DUT can be observed to exhibit high packet loss when offered IPv4 network addresses that take the form x.x.1.x/24, and relatively low packet loss when the source and destination network addresses take the form of x.x.R.x/24 (where R is some random value between 0 and 9), test engineers SHOULD use the random pattern.
layer-aware hashing algorithms. Given the vast number of application-layer protocols in use, we make no recommendation for specific test traffic patterns to be used; however, test engineers SHOULD be aware that application-layer traffic contents MAY produce nonrandom outcomes with some hashing algorithms. The same issues that apply with lower-layer traffic patterns also apply at the application layer. As discussed in section 5, the potential for stuffing exists with any part of a test packet, including application-layer contents. For example, some traffic generators insert fields into packet payloads to distinguish test traffic. These fields may contain a transmission timestamp; sequence number; test equipment interface identifier and/or "stream" number; and a cyclic redundancy check (CRC) over the contents of the test payload or test packet. All these fields are potential candidates for stuffing.
As described in [RFC1661] and [RFC1662], PPP and HDLC-like framing introduce two kinds of escape sequences: bit- and byte-stuffing. Bit-stuffing refers to the insertion of an escape bit on bit- synchronous links. Byte-stuffing refers to the insertion of an escape byte on byte-synchronous links. We discuss each in turn. RFC1662], section 5.2, specifies that any sequence of five contiguous "1" bits within a frame must be escaped by inserting a "0" bit prior to the sequence. This escaping is necessary to avoid confusion with the HDLC control character 0x7E, which contains six "1" bits. Consider the following PPP frame containing a TCP/IP packet. Not shown is the 1-byte flag sequence (0x7E), at least one of which must occur between frames. The contents of the various frame fields can be described one of three ways: 1. Field contents never change over the test duration. An example would be the IP version number. 2. Field contents change over the test duration. Some of these changes are known prior to the test duration. An example would be the use of incrementing IP addresses. Some of these changes are unknown. An example would be a dynamically calculated field such as the TCP checksum. 3. Field contents may not be known. An example would be proprietary payload fields in test packets.
In the diagram below, 30 out of 48 total bytes in the packet headers are subject to change over the test duration. Additionally, the payload field could be subject to change both content and size. The fields containing the changeable bytes are given in ((double parentheses)). 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Address | Control | Protocol | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| IHL |Type of Service| Total Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identification |Flags| Fragment Offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Time to Live | Protocol | ((Header Checksum)) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((Source Address)) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((Destination Address)) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((Source Port)) | ((Destination Port)) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((Sequence Number)) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((Acknowledgment Number)) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Data | |U|A|P|R|S|F| | | Offset| Reserved |R|C|S|S|Y|I| ((Window)) | | | |G|K|H|T|N|N| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((Checksum)) | Urgent Pointer | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | / ((payload)) / | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ((FCS (4 bytes) )) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ None of the other fields are known to contain sequences subject to bit-stuffing, at least not in their entirety. Note that there is no payload in this simple example; as noted in section 4.6, the payload contents of test traffic often will present additional opportunities for stuffing to occur, and MUST be taken into account when calculating stuff probability.
Given the information at hand, and assuming static contents for the rest of the fields, the challenge is to determine the probability that bit-stuffing will occur.
With this transition matrix we can build the following system of equations. If P(x) represents the probability of reaching state x, then: P(start) = 0.5 * P(start) + 0.5 * P(1) + 0.5 * P(2) + 0.5 * P(3) + 0.5 * P(4) + 0.5 * P(5) P(1) = 0.5 * P(start) + 0.5 * P(5) P(2) = 0.5 * P(1) P(3) = 0.5 * P(2) P(4) = 0.5 * P(3) P(5) = 0.5 * P(4) P(start) + P(1) + P(2) + P(3) + P(4) + P(5) = 1 Solving this system of equations yields: P(start) = 0.5 P(1) = 8/31 P(2) = 4/31 P(3) = 2/31 P(4) = 1/31 P(5) = 1/62 Thus, for an infinitely long string of uniformly random bits, the probability of any individual bit causing a transition to state 5, and thus causing a stuff, is 1/62. Appendix B. Now, the expected number of stuffing events, E[stuffs], can be found by dividing the total number of stuffs in all possible strings by the total number of strings. Thus for any L, E[stuffs] = f(L) / 2^L.
Similarly, the probability that any particular bit is the cause of a bit-stuff can be calculated by dividing the total number of stuffs in the set of all strings of length L by the total number of bits in the set of all strings of length L. Hence for any L, the probability that L_n, where 5 <= n <= L, caused a stuff is f(L) / (L * 2^L). RFC1662] requires that "Each Flag Sequence, Control Escape octet, and any octet which is flagged in the sending Async-Control- Character-Map (ACCM), is replaced by a two octet sequence consisting of the Control Escape octet followed by the original octet exclusive- or'd with hexadecimal 0x20". The practical effect of this is to insert a stuff byte for instances of up to 34 characters: 0x7E, 0x7D, or any of 32 ACCM values. A common implementation of PPP in HDLC-like framing is in PPP over SONET/SDH (POS), as defined in [RFC2615].
As with the bit-stuffing case, the requirement in characterizing POS test traffic is to determine the probability that byte-stuffing will occur for a given sequence. This is much simpler to do than with bit-synchronous links, since there is no possibility of overlap across byte boundaries. RFC2661], section 4.4.6. When the default ACCM values are used, the probability of stuffing for any given random byte is 34 in 256, or approximately 13.3 percent.
Note that the percentages given above are approximations. For greatest precision, the actual intended load SHOULD be explicitly calculated from the test traffic Note also that the DUT/SUT may be able to forward intended loads higher than the calculated theoretical maximum rate without packet loss. This results from queuing on the part of the DUT/SUT. While a device's throughput may be above this level, delay-related measurements may be affected. Accordingly, it is RECOMMENDED to reduce offered levels by the amount of byte-stuffing overhead when testing devices using byte-synchronous links. This recommendation applies for all measurements, including throughput. RFC2615], section 6, discusses a denial-of-service attack involving the intentional transmission of characters that require stuffing. This attack could consume up to 100 percent of available bandwidth. However, the test networks described in BMWG documents generally SHOULD NOT be reachable by anyone other than the tester(s).
[RFC1661] Simpson, W., "The Point-to-Point Protocol (PPP)", STD 51, RFC 1661, July 1994. [RFC1662] Simpson, W., "PPP in HDLC-like Framing", STD 51, RFC 1662, July 1994. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for Network Interconnect Devices", RFC 2544, March 1999. [RFC2615] Malis, A. and W. Simpson, "PPP over SONET/SDH", RFC 2615, June 1999. [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", RFC 2661, August 1999. [RFC2889] Mandeville, R. and J. Perser, "Benchmarking Methodology for LAN Switching Devices", RFC 2889, August 2000. [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack Encoding", RFC 3032, January 2001.
set of all bit strings of length L can be represented as f(L) = 2^(L-5) + (L-5) * 2^(L-6) + f(L-5) for all L >= 5.
Additionally, both the protocol and source IP address must be considered, as they provide a source of extra leading and trailing "1" bits. An exhaustive search shows that cycling through all 254 headers will produce 102 bit-stuffs for the CRC. This gives us an expectation of 102/254 stuffs per packet due to the CRC. Since our destination IP address is even and the UDP length is less than 32768, the random source and destination ports provide 32 bits of sequential random data without forcing us to consider the boundary bits. Additionally, we will assume that since our payload is pseudorandom, our UDP CRC will be too. The even UDP length field again allows us to only consider the bits explicitly contained within the CRC and data fields. So, using the formula for the expected number of stuffs in a finite string from section 5.2.2, we determine that E[UDP stuffs] = f(32)/2^32 + f(8000+16)/2^(8000+16). Now, f(32)/2^32 is calculable without too much difficulty and is approximately 0.465. However, f(8016)/2^8016 is a little large to calculate easily, so we will approximate this value by using the probability value obtained in section 5.2.1. Thus, E[UDP] ~ 0.465 + 8016/62 ~ 129.755. Hence, E[packet stuffs] = 51/254 + 102/254 + 129.755 = 130.357. However, since we cannot have a fractional stuff, we round down to 130. Thus, we expect 130 stuffs per packet. Finally, we can calculate bit-stuffing overhead by dividing the expected number of stuff bits by the total number of bits in the IP datagram. So, this example traffic would generate 1.58% overhead. If our payload had consisted exclusively of zero bits, our overhead would have been 0.012%. An all-ones payload would produce 19.47% overhead.
Similar to the flow label case above, the fourth quad of our destination IP address is even and the UDP length field is less than 32768, so the random source and destination ports provide 32 bits of sequential random data without forcing us to consider the boundary bits. Additionally, we will assume that since our payload is pseudorandom, our UDP CRC will be too. The even UDP length field again allows us to only consider the bits explicitly contained within the CRC and data fields. So, using the formula for the expected number of stuffs in a finite string from section 5.2.2, we determine that E[UDP stuffs] = f(32)/2^32 + f(8000+16)/2^(8000+16). Now, f(32)/2^32 is calculable without too much difficulty and is approximately 0.465. However, f(8016)/2^8016 is a little large to calculate easily, so we will approximate this value by using the probability value obtained in section 5.2.1. Thus, E[UDP stuffs] ~ 0.465 + 8016/62 ~ 129.755. Now we may explicitly calculate that E[packet stuffs] = 20/255 + 0.272 + 129.755 = 130.105. However, since we cannot have a fractional stuff, we round down to 130. Thus, we expect 130 stuffs per packet. Finally, we can calculate bit-stuffing overhead by dividing the expected number of stuff bits by the total number of bits in the IP datagram. So, this example traffic would generate 1.55% overhead. If our payload had consisted exclusively of zero bits, our overhead would have been 0.010%. An all-ones payload would produce 19.09% overhead.
Repeatability The precision of test results obtained on a single test bed, but from trial to trial. See also "reproducibility". Reproducibility The precision of test results between different setups, possibly at different locations. See also "repeatability". Stuffing The insertion of a bit or byte within a frame to avoid confusion with control characters. For example, RFC 1662 requires the insertion of a "0" bit prior to any sequence of five contiguous "1" bits within a frame to avoid confusion with the HDLC control character 0x7E.
Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at email@example.com. Acknowledgement Funding for the RFC Editor function is currently provided by the Internet Society.