Internet Engineering Task Force (IETF) S. Shalunov Request for Comments: 6817 G. Hazel Category: Experimental BitTorrent, Inc. ISSN: 2070-1721 J. Iyengar Franklin and Marshall College M. Kuehlewind University of Stuttgart December 2012 Low Extra Delay Background Transport (LEDBAT)
AbstractLow Extra Delay Background Transport (LEDBAT) is an experimental delay-based congestion control algorithm that seeks to utilize the available bandwidth on an end-to-end path while limiting the consequent increase in queueing delay on that path. LEDBAT uses changes in one-way delay measurements to limit congestion that the flow itself induces in the network. LEDBAT is designed for use by background bulk-transfer applications to be no more aggressive than standard TCP congestion control (as specified in RFC 5681) and to yield in the presence of competing flows, thus limiting interference with the network performance of competing flows. Status of This Memo This document is not an Internet Standards Track specification; it is published for examination, experimental implementation, and evaluation. This document defines an Experimental Protocol for the Internet community. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6817.
Copyright Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
1. Introduction ....................................................4 1.1. Requirements Notation ......................................4 1.2. Design Goals ...............................................4 1.3. Applicability ..............................................5 2. LEDBAT Congestion Control .......................................6 2.1. Overview ...................................................6 2.2. Preliminaries ..............................................6 2.3. Receiver-Side Operation ....................................7 2.4. Sender-Side Operation ......................................7 2.4.1. An Overview .........................................7 2.4.2. The Complete Sender Algorithm .......................8 2.5. Parameter Values ..........................................11 3. Understanding LEDBAT Mechanisms ................................13 3.1. Delay Estimation ..........................................13 3.1.1. Estimating Base Delay ..............................13 3.1.2. Estimating Queueing Delay ..........................13 3.2. Managing the Congestion Window ............................14 3.2.1. Window Increase: Probing for More Bandwidth ........14 3.2.2. Window Decrease: Responding to Congestion ..........14 3.3. Choosing the Queuing Delay Target .........................15 4. Discussion .....................................................15 4.1. Framing and ACK Frequency Considerations ..................15 4.2. Competing with TCP Flows ..................................15 4.3. Competing with Non-TCP Flows ..............................16 4.4. Fairness among LEDBAT Flows ...............................16 5. Open Areas for Experimentation .................................17 5.1. Network Effects and Monitoring ............................17 5.2. Parameter Values ..........................................18 5.3. Filters ...................................................19 5.4. Framing ...................................................19 6. Security Considerations ........................................19 7. Acknowledgements ...............................................20 8. References .....................................................20 8.1. Normative References ......................................20 8.2. Informative References ....................................20 Appendix A. Measurement Errors ....................................22 A.1. Clock Offset ...............................................22 A.2. Clock Skew .................................................22
RFC5681] seeks to share bandwidth at a bottleneck link equitably among flows competing at the bottleneck, and it is the predominant congestion control mechanism used on the Internet. However, not all applications seek an equitable share of network throughput. "Background" applications, such as software updates or file-sharing applications, seek to operate without interfering with the performance of more interactive and delay- and/or bandwidth-sensitive "foreground" applications. Standard TCP congestion control, as specified in [RFC5681], may be too aggressive for use with such background applications. Low Extra Delay Background Transport (LEDBAT) is an experimental delay-based congestion control mechanism that reacts early to congestion in the network, thus enabling "background" applications to use the network while avoiding interference with the network performance of competing flows. A LEDBAT sender uses one-way delay measurements to estimate the amount of queueing on the data path, controls the LEDBAT flow's congestion window based on this estimate, and minimizes interference with competing flows by adding low extra queueing delay on the end-to-end path. Delay-based congestion control protocols, such as TCP-Vegas [Bra94][Low02], are generally designed to achieve more, not less throughput than standard TCP, and often outperform TCP under particular network settings. For further discussion on Lower-than- Best-Effort transport protocols see [RFC6297]. In contrast, LEDBAT is designed to be no more aggressive than TCP [RFC5681]; LEDBAT is a "scavenger" congestion control mechanism that seeks to utilize all available bandwidth and yields quickly when competing with standard TCP at a bottleneck link. In the rest of this document, we refer to congestion control specified in [RFC5681] as "standard TCP". RFC2119].
2. to add limited queuing delay to that induced by concurrent flows, and 3. to yield quickly to standard TCP flows that share the same bottleneck link. RFC3168]. LEDBAT is designed to reduce buildup of a standing queue by long- lived LEDBAT flows at a link with a tail-drop FIFO queue, so as to avoid persistently delaying other flows sharing the queue. If Active Queue Management (AQM) is configured to drop or ECN-mark packets before the LEDBAT flow starts reacting to persistent queue buildup, LEDBAT reverts to standard TCP behavior rather than yielding to other TCP flows. However, such an AQM is still desirable since it keeps queuing delay low, achieving an outcome that is in line with LEDBAT's goals. Additionally, a LEDBAT transport that supports ECN enjoys the advantages that an ECN-capable TCP enjoys over an ECN-agnostic TCP; avoiding losses and possible retransmissions. Weighted Fair Queuing (WFQ), as employed by some home gateways, seeks to isolate and protect delay-sensitive flows from delays due to standing queues built up by concurrent long-lived flows. Consequently, while it prevents LEDBAT from yielding to other TCP flows, it again achieves an outcome that is in line with LEDBAT's goals [Sch10].
RFC5681] or an ECN mark is received [RFC3168], which, in the absence of link errors in the network, occurs only when the queue at the bottleneck link on the end-to-end path overflows or an AQM is applied. Since packet loss or marking at the bottleneck link is expected to be preceded by an increase in the queueing delay at the bottleneck link, LEDBAT congestion control uses this increase in queueing delay as an early signal of congestion, enabling it to respond to congestion earlier than standard TCP and enabling it to yield bandwidth to a competing TCP flow. LEDBAT employs one-way delay measurements to estimate queueing delay. When the estimated queueing delay is less than a predetermined target, LEDBAT infers that the network is not yet congested and increases its sending rate to utilize any spare capacity in the network. When the estimated queueing delay becomes greater than the predetermined target, LEDBAT decreases its sending rate as a response to potential congestion in the network. RFC5681]. Since slow start leads to faster increase in the window than that specified in LEDBAT, conservative congestion control implementations employing LEDBAT may skip slow start altogether and start with an initial window of INIT_CWND * MSS. (INIT_CWND is described later in Section 2.5.) The term "MSS", or the sender's Maximum Segment Size, used in this document refers to the size of the largest segment that the sender can transmit. The value of MSS can be based on the path MTU discovery [RFC4821] algorithm and/or on other factors.
Section 2.4.2. TARGET is the maximum queueing delay that LEDBAT itself may introduce in the network, and GAIN determines the rate at which the cwnd responds to changes in queueing delay; both constants are specified later. off_target is a normalized value representing the difference between the measured current queueing delay and the predetermined TARGET delay. off_target can be positive or negative; consequently, cwnd increases or decreases in proportion to off_target. on initialization: base_delay = +INFINITY
on acknowledgement: current_delay = acknowledgement.delay base_delay = min(base_delay, current_delay) queuing_delay = current_delay - base_delay off_target = (TARGET - queuing_delay) / TARGET cwnd += GAIN * off_target * bytes_newly_acked * MSS / cwnd The simplified mechanism above ignores multiple delay samples in an acknowledgement, noise filtering, base delay expiration, and sender idle times, which we now take into account in our complete sender algorithm below. Section 4.4 for a more complete discussion. The full sender-side algorithm is given below: on initialization: # cwnd is the amount of data that is allowed to be # outstanding in an RTT and is defined in bytes. # CTO is the congestion timeout value. create current_delays list with CURRENT_FILTER elements create base_delays list with BASE_HISTORY number of elements initialize elements in base_delays to +INFINITY initialize elements in current_delays according to FILTER() last_rollover = -INFINITY # More than a minute in the past flightsize = 0 cwnd = INIT_CWND * MSS CTO = 1 second
on acknowledgement: # flightsize is the amount of data outstanding before this ACK # was received and is updated later; # bytes_newly_acked is the number of bytes that this ACK # newly acknowledges, and it MAY be set to MSS. for each delay sample in the acknowledgement: delay = acknowledgement.delay update_base_delay(delay) update_current_delay(delay) queuing_delay = FILTER(current_delays) - MIN(base_delays) off_target = (TARGET - queuing_delay) / TARGET cwnd += GAIN * off_target * bytes_newly_acked * MSS / cwnd max_allowed_cwnd = flightsize + ALLOWED_INCREASE * MSS cwnd = min(cwnd, max_allowed_cwnd) cwnd = max(cwnd, MIN_CWND * MSS) flightsize = flightsize - bytes_newly_acked update_CTO() on data loss: # at most once per RTT cwnd = min (cwnd, max (cwnd/2, MIN_CWND * MSS)) if data lost is not to be retransmitted: flightsize = flightsize - bytes_not_to_be_retransmitted if no ACKs are received within a CTO: # extreme congestion, or significant RTT change. # set cwnd to 1MSS and backoff the congestion timer. cwnd = 1 * MSS CTO = 2 * CTO update_CTO() # implements an RTT estimation mechanism using data # transmission times and ACK reception times, # which is used to implement a congestion timeout (CTO). # If implementing LEDBAT in TCP, sender SHOULD use # mechanisms described in RFC 6298 [RFC6298], # and the CTO would be the same as the retransmission timeout (RTO). update_current_delay(delay) # Maintain a list of CURRENT_FILTER last delays observed. delete first item in current_delays list append delay to current_delays list
update_base_delay(delay) # Maintain BASE_HISTORY delay-minima. # Each minimum is measured over a period of a minute. # 'now' is the current system time if round_to_minute(now) != round_to_minute(last_rollover) last_rollover = now delete first item in base_delays list append delay to base_delays list else base_delays.tail = MIN(base_delays.tail, delay) The LEDBAT sender seeks to extract the actual delay estimate from the current_delay samples by implementing FILTER() to eliminate any outliers. Different types of filters MAY be used for FILTER() -- a NULL filter, that does not filter at all, is a reasonable candidate as well, since LEDBAT's use of a linear controller for cwnd increase and decrease may allow it to recover quickly from errors induced by bad samples. Another example of a filter is the exponentially weighted moving average (EWMA) function, with weights that enable agile tracking of changing network delay. A simple MIN filter applied over a small window (much smaller than BASE_HISTORY) may also provide robustness to large delay peaks, as may occur with delayed ACKs in TCP. Care should be taken that the filter used, while providing robustness to noise, remains sensitive to persistent congestion signals. We note that when multiple delay samples are bundled within a single ACK, the sender's resulting cwnd may be slightly different than when the samples are sent individually in separate ACKs. The cwnd is adjusted based on the total number of bytes ACKed and the final filtered value of queueing_delay, irrespective of the number of delay samples in an ACK. To implement an approximate minimum over the past few minutes, a LEDBAT sender stores BASE_HISTORY separate minima -- one each for the last BASE_HISTORY-1 minutes, and one for the running current minute. At the end of the current minute, the window moves -- the earliest minimum is dropped and the latest minimum is added. If the connection is idle for a given minute, no data is available for the one-way delay and, therefore, a value of +INFINITY has to be stored in the list. If the connection has been idle for BASE_HISTORY minutes, all minima in the list are thus set to +INFINITY and measurement begins anew. LEDBAT thus requires that during idle periods, an implementation must maintain the base delay list.
LEDBAT restricts cwnd growth after a period of inactivity. When the sender is application-limited, the sender's cwnd is clamped down using max_allowed_cwnd to a little more than flightsize. To be TCP- friendly, LEDBAT halves its cwnd on data loss. LEDBAT uses a congestion timeout (CTO) to avoid transmitting data during periods of heavy congestion and to avoid congestion collapse. A CTO is used to detect heavy congestion indicated by loss of all outstanding data or acknowledgements, resulting in reduction of the cwnd to 1 MSS and an exponential backoff of the CTO interval. This backoff of the CTO value avoids sending more data into an overloaded queue, and it also allows the sender to cope with sudden changes in the RTT of the path. The function of a CTO is similar to that of an retransmission timeout (RTO) in TCP [RFC6298], but since LEDBAT separates reliability from congestion control, a retransmission need not be triggered by a CTO. LEDBAT, however, does not preclude a CTO from triggering retransmissions, as could be the case if LEDBAT congestion control were to be used with TCP framing and reliability. The CTO is a gating mechanism that ensures exponential backoff of sending rate under heavy congestion, and it may be implemented with or without a timer. An implementation choosing to avoid timers may consider using a "next-time-to-send" variable, set based on the CTO, to control the earliest time a sender may transmit without receiving any ACKs. A maximum value MAY be placed on the CTO, and if placed, it MUST be at least 60 seconds. Section 3.3. Note that using the same TARGET value across LEDBAT flows enables equitable sharing of the bottleneck bandwidth. A flow with a higher TARGET value than other competing LEDBAT flows may get a larger share of the bottleneck bandwidth. It is possible to consider the use of different TARGET values for implementing a relative priority between two competing LEDBAT flows by setting a higher TARGET value for the higher-priority flow. ALLOWED_INCREASE SHOULD be 1, and it MUST be greater than 0. An ALLOWED_INCREASE of 0 results in no cwnd growth at all, and an ALLOWED_INCREASE of 1 allows and limits the cwnd increase based on flightsize in the previous RTT. An ALLOWED_INCREASE greater than 1 MAY be used when interactions between LEDBAT and the framing protocol provide a clear reason for doing so. GAIN MUST be set to 1 or less. A GAIN of 1 limits the maximum cwnd ramp-up to the same rate as TCP Reno in Congestion Avoidance. While this document specifies the use of the same GAIN for both cwnd
increase (when off_target is greater than zero) and decrease (when off_target is less than zero), implementations MAY use a higher GAIN for cwnd decrease than for the increase; our justification follows. When a competing non-LEDBAT flow increases its sending rate, the LEDBAT sender may only measure a small amount of additional delay and decrease the sending rate slowly. To ensure no impact on a competing non-LEDBAT flow, the LEDBAT flow should decrease its sending rate at least as quickly as the competing flow increases its sending rate. A higher decrease-GAIN MAY be used to allow the LEDBAT flow to decrease its sending rate faster than the competing flow's increase rate. The size of the base_delays list, BASE_HISTORY, SHOULD be 10. If the actual base delay decreases, due to a route change, for instance, a LEDBAT sender adapts immediately, irrespective of the value of BASE_HISTORY. If the actual base delay increases, however, a LEDBAT sender will take BASE_HISTORY minutes to adapt and may wrongly infer a little more extra delay than intended (TARGET) in the meanwhile. A value for BASE_HISTORY is thus a trade-off: a higher value may yield a more accurate measurement when the base delay is unchanging, and a lower value results in a quicker response to actual increase in base delay. A LEDBAT sender uses the current_delays list to maintain only delay measurements made within an RTT amount of time in the past, seeking to eliminate noise spikes in its measurement of the current one-way delay through the network. The size of this list, CURRENT_FILTER, may be variable, and it depends on the FILTER() function as well as the number of successful measurements made within an RTT amount of time in the past. The sender should seek to gather enough delay samples in each RTT so as to have statistical confidence in the measurements. While the number of delay samples required for such confidence will vary depending on network conditions, the sender SHOULD use at least 4 delay samples in each RTT, unless the number of samples is lower due to a small congestion window. The value of CURRENT_FILTER will depend on the filter being employed, but CURRENT_FILTER MUST be limited such that samples in the list are not older than an RTT in the past. INIT_CWND and MIN_CWND SHOULD both be 2. An INIT_CWND of 2 should help seed FILTER() at the sender when there are no samples at the beginning of a flow, and a MIN_CWND of 2 allows FILTER() to use more than a single instantaneous delay estimate while not being too aggressive. Slight deviations may be warranted, for example, when these values of INIT_CWND and MIN_CWND interact poorly with the framing protocol. However, INIT_CWND and MIN_CWND MUST be no larger than the corresponding values specified for TCP [RFC5681].
filtered (depending on the usage scenario) to eliminate noise in the delay estimation, such as due to spikes in processing delay at a node on the path.
g114]. Thus, the delay induced by LEDBAT must be well below 150 ms to limit its impact on concurrent delay- sensitive traffic sharing the same bottleneck queue. A target that is too low, on the other hand, increases the sensitivity of the sender's algorithm to noise in the one-way delays and in the delay measurement process, and may lead to reduced throughput for the LEDBAT flow and to under-utilization of the bottleneck link. Our recommendation of 100 ms or less as the target is a trade-off between these considerations. Anecdotal evidence indicates that this value works well -- LEDBAT has been implemented and successfully deployed with a target value of 100 ms in two BitTorrent implementations: as the exclusive congestion control mechanism in BitTorrent Delivery Network Accelerator (DNA), and as an experimental mechanism in uTorrent [uTorrent]. RFC5681]. A LEDBAT flow gets more aggressive as the queueing delay estimate gets lower; since the queueing delay estimate is non-negative, LEDBAT is most aggressive when the queueing delay estimate is zero. In this case, LEDBAT ramps up its congestion window at the same rate as standard TCP [RFC5681]. LEDBAT may reduce its rate earlier than standard TCP and always
halves its congestion window on loss. Thus, in the worst case, where the delay estimates are completely and consistently off, a LEDBAT flow falls back to standard TCP behavior, and is no more aggressive than standard TCP [RFC5681].
delay and backs off, while the latecomer keeps building up, using up the entire link's capacity, and effectively shutting the old flow out. This advantage is called the "latecomer's advantage". In the worst case, if the first flow yields at the same rate as the new flow increases its sending rate, the new flow will see constant end-to-end delay, which it assumes is the base delay, until the first flow backs off completely. As a result, by the time the second flow stops increasing its cwnd, it would have added twice the target queueing delay to the network. This advantage can be reduced if the first flow yields and empties the bottleneck queue faster than the incoming flow increases its occupancy in the queue. In such a case, the latecomer might measure correctly a delay that is closer to the base delay. While such a reduction might be achieved through a multiplicative decrease of the congestion window, this may cause strong fluctuations in flow throughput during the flow's steady state. Thus, we do not recommend a multiplicative decrease scheme. We note that in certain use-case scenarios, it is possible for a later LEDBAT flow to gain an unfair advantage over an existing one [Car10]. In practice, this concern ought to be alleviated by the burstiness of network traffic: all that's needed to measure the base delay is one small gap in transmission schedules between the LEDBAT flows. These gaps can occur for a number of reasons such as latency introduced due to application sending patterns, OS scheduling at the sender, processing delay at the sender or any network node, and link contention. When such a gap occurs in the first sender's transmission while the latecomer is starting, base delay is immediately correctly measured. With a small number of LEDBAT flows, system noise may sufficiently regulate the latecomer's advantage.
broad goals: (i) to understand the effects of different network mechanisms on LEDBAT, and (ii) to understand the impact of LEDBAT on the network. Network mechanisms and dynamics can influence LEDBAT flows in unintended ways. For instance, frequent route changes that result in increasing base delays may, in the worst case, throttle a LEDBAT flow's throughput significantly. The influence of different network traffic management mechanisms on LEDBAT throughput should be studied. An increasing number of LEDBAT flows in the network will likely result in operator-visible network effects as well, and these should thus be studied. For instance, as long as the bottleneck queue in a network is larger than TARGET (in terms of delay), we expect that both the average queueing delay and loss rate in the network should reduce as LEDBAT traffic increasingly dominates the traffic mix in the network. Note that for bottleneck queues that are smaller than TARGET, LEDBAT will appear to behave very similar to standard TCP and its flow-level behavior may not be distinguishable from that of standard TCP. We note that a network operator may be able to verify the operation of a LEDBAT flow by monitoring per-flow behavior and queues in the network -- when the queueing delay at a bottleneck queue is above TARGET as specified in this document, LEDBAT flows should be expected to back off and reduce their sending rate.
LEDBAT is not known to introduce any new concerns with privacy, integrity, or other security issues for flows that use it. LEDBAT is compatible with use of IPsec and Transport Layer Security (TLS) / Datagram Transport Layer Security (DTLS). [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001. [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU Discovery", RFC 4821, March 2007. [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009. [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, "Computing TCP's Retransmission Timer", RFC 6298, June 2011. [Bra94] Brakmo, L., O'Malley, S., and L. Peterson, "TCP Vegas: New techniques for congestion detection and avoidance", Proceedings of SIGCOMM '94, pages 24-35, August 1994. [Car10] Carofiglio, G., Muscariello, L., Rossi, D., Testa, C., and S. Valenti, "Rethinking Low Extra Delay Background Transport Protocols", October 2010, <http://arxiv.org/abs/1010.5623v1>. [Low02] Low, S., Peterson, L., and L. Wang, "Understanding TCP Vegas: A Duality Model", JACM 49 (2), March 2002.
[RFC5905] Mills, D., Martin, J., Burbank, J., and W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithms Specification", RFC 5905, June 2010. [RFC6297] Welzl, M. and D. Ros, "A Survey of Lower-than-Best-Effort Transport Protocols", RFC 6297, June 2011. [Sch10] Schneider, J., Wagner, J., Winter, R., and H. Kolbe, "Out of my Way -- Evaluating Low Extra Delay Background Transport in an ADSL Access Network", Proceedings of 22nd International Teletraffic Congress (ITC22), September 2010. [g114] "SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS; International telephone connections and circuits - General; Recommendations on the transmission quality for an entire international telephone connection; One-way transmission time", ITU-T Recommendation G.114, 05/2003. [uTorrent] Hazel, G., "uTorrent Transport Protocol library", July 2012, <http://github.com/bittorrent/libutp>.
RFC5905]. In particular, NTP uses the terms "offset" to refer to the difference between measured time and true time, and "skew" to refer to difference of clock rate from the true rate. A clock thus has two time measurement errors: a fixed offset from the true time, and a skew. We now consider these errors in the context of LEDBAT. RFC5905], where a skew of 100 ppm translates to an error accumulation of 6 milliseconds per minute. This accumulation is limited in LEDBAT, since any error accumulation is limited to the amount of history maintained by the base delay estimator, as dictated by the BASE_HISTORY parameter. The effects of clock skew error on LEDBAT should generally be insignificant unless the skew is unusually high, or unless extreme values have been chosen for TARGET (extremely low) and BASE_HISTORY (extremely large). Nevertheless, we now consider the possible impact of skew on LEDBAT behavior.
Clock skew can manifest in two ways: the sender's clock can be faster than the receiver's clock, or the receiver's clock can be faster than the sender's clock. In the first case, the measured one-way delay will decrease as the sender's clock drifts forward. While this drift can lead to an artificially low estimate of the queueing delay, the drift should also lead to a lower base delay measurement, which consequently absorbs the erroneous reduction in the one-way delay estimates. In the second case, the one-way delay estimate will artificially increase with time. This increase can reduce a LEDBAT flow's throughput unnecessarily. In this case, a skew correction mechanism can be beneficial. We now discuss an example clock skew correction mechanism. In this example, the receiver sends back raw (sending and receiving) timestamps. Using this information, the sender can estimate one-way delays in both directions, and the sender can also compute and maintain an estimate of the base delay as would be observed by the receiver. If the sender detects the receiver reducing its estimate of the base delay, it may infer that this reduction is due to clock drift. The sender then compensates by increasing its base delay estimate by the same amount. To apply this mechanism, timestamps need to be transmitted in both directions. We now outline a few other ideas that can be used for skew correction. o Skew correction with faster virtual clock: Since having a faster clock on the sender will result in continuous updates of the base delay, a faster virtual clock can be used for sender timestamping. This virtual clock can be computed from the default machine clock through a linear transformation. For instance, with a 500 ppm speed-up the sender's clock is very likely to be faster than a receiver's clock. Consequently, LEDBAT will benefit from the implicit correction when updating the base delay. o Skew correction with estimating drift: A LEDBAT sender maintains a history of base delay minima. This history can provide a base to compute the clock skew difference between the two hosts. The slope of a linear function fitted to the set of minima base delays gives an estimate of the clock skew. This estimation can be used to correct the clocks. If the other endpoint is doing the same, the clock should be corrected by half of the estimated skew amount.
o Byzantine skew correction: When it is known that each host maintains long-lived connections to a number of different other hosts, a byzantine scheme can be used to estimate the skew with respect to the true time. Namely, a host calculates the skew difference for each of the peer hosts as described with the previous approach, then take the median of the skew differences. While this scheme is not universally applicable, it combines well with other schemes, since it is essentially a clock training mechanism. The scheme also corrects fast, since state is preserved between connections.
http://shlang.com Greg Hazel BitTorrent, Inc. 303 Second St., Suite S200 San Francisco, CA 94107 USA EMail: firstname.lastname@example.org Janardhan Iyengar Franklin and Marshall College 415 Harrisburg Ave. Lancaster, PA 17603 USA EMail: email@example.com Mirja Kuehlewind University of Stuttgart Stuttgart DE EMail: firstname.lastname@example.org