RFC 2757

Long Thin Networks

Pages: 46
Informational

Part 1 of 2 – Pages 1 to 20

RFC2757 - Page 1

Network Working Group                                      G. Montenegro
Request for Comments: 2757                        Sun Microsystems, Inc.
Category: Informational                                       S. Dawkins
                                                         Nortel Networks
                                                                 M. Kojo
                                                  University of Helsinki
                                                               V. Magret
                                                                 Alcatel
                                                               N. Vaidya
                                                    Texas A&M University
                                                            January 2000


                           Long Thin Networks

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2000).  All Rights Reserved.

Abstract

   In view of the unpredictable and problematic nature of long thin
   networks (for example, wireless WANs), arriving at an optimized
   transport is a daunting task.  We have reviewed the existing
   proposals along with future research items. Based on this overview,
   we also recommend mechanisms for implementation in long thin
   networks.

   Our goal is to identify a TCP that works for all users, including
   users of long thin networks. We started from the working
   recommendations of the IETF TCP Over Satellite Links (tcpsat) working
   group with this end in mind.

   We recognize that not every tcpsat recommendation will be required
   for long thin networks as well, and work toward a set of TCP
   recommendations that are 'benign' in environments that do not require
   them.

RFC2757 - Page 2

Table of Contents

   1 Introduction .................................................    3
      1.1 Network Architecture ....................................    5
      1.2 Assumptions about the Radio Link ........................    6
   2 Should it be IP or Not?  .....................................    7
      2.1 Underlying Network Error Characteristics ................    7
      2.2 Non-IP Alternatives .....................................    8
         2.2.1 WAP ................................................    8
         2.2.2 Deploying Non-IP Alternatives ......................    9
      2.3 IP-based Considerations .................................    9
         2.3.1 Choosing the MTU [Stevens94, RFC1144] ..............    9
         2.3.2 Path MTU Discovery [RFC1191] .......................   10
         2.3.3 Non-TCP Proposals ..................................   10
   3 The Case for TCP .............................................   11
   4 Candidate Optimizations ......................................   12
      4.1 TCP: Current Mechanisms .................................   12
         4.1.1 Slow Start and Congestion Avoidance ................   12
         4.1.2 Fast Retransmit and Fast Recovery ..................   12
      4.2 Connection Setup with T/TCP [RFC1397, RFC1644] ..........   14
      4.3 Slow Start Proposals ....................................   14
         4.3.1 Larger Initial Window ..............................   14
         4.3.2 Growing the Window during Slow Start ...............   15
            4.3.2.1 ACK Counting ..................................   15
            4.3.2.2 ACK-every-segment .............................   16
         4.3.3 Terminating Slow Start .............................   17
         4.3.4 Generating ACKs during Slow Start ..................   17
      4.4 ACK Spacing .............................................   17
      4.5 Delayed Duplicate Acknowlegements .......................   18
      4.6 Selective Acknowledgements [RFC2018] ....................   18
      4.7 Detecting Corruption Loss ...............................   19
         4.7.1 Without Explicit Notification ......................   19
         4.7.2 With Explicit Notifications ........................   20
      4.8 Active Queue Management .................................   21
      4.9 Scheduling Algorithms ...................................   21
      4.10 Split TCP and Performance-Enhancing Proxies (PEPs) .....   22
         4.10.1 Split TCP Approaches ..............................   23
         4.10.2 Application Level Proxies .........................   26
         4.10.3 Snoop and its Derivatives .........................   27
         4.10.4 PEPs to handle Periods of Disconnection ...........   29
      4.11 Header Compression Alternatives ........................   30
      4.12 Payload Compression ....................................   31
      4.13 TCP Control Block Interdependence [Touch97] ............   32
   5 Summary of Recommended Optimizations .........................   33
   6 Conclusion ...................................................   35
   7 Acknowledgements .............................................   35
   8 Security Considerations ......................................   35

RFC2757 - Page 3

   9 References ...................................................   36
   Authors' Addresses .............................................   44
   Full Copyright Statement .......................................   46

1 Introduction

   Optimized wireless networking is one of the major hurdles that Mobile
   Computing must solve if it is to enable ubiquitous access to
   networking resources. However, current data networking protocols have
   been optimized primarily for wired networks.  Wireless environments
   have very different characteristics in terms of latency, jitter, and
   error rate as compared to wired networks.  Accordingly, traditional
   protocols are ill-suited to this medium.

   Mobile Wireless networks can be grouped in W-LANs (for example,
   802.11 compliant networks) and W-WANs (for example, CDPD [CDPD],
   Ricochet, CDMA [CDMA], PHS, DoCoMo, GSM [GSM] to name a few).  W-WANs
   present the most serious challenge, given that the length of the
   wireless link (expressed as the delay*bandwidth product) is typically
   4 to 5 times as long as that of its W-LAN counterparts.  For example,
   for an 802.11 network, assuming the delay (round-trip time) is about
   3 ms.  and the bandwidth is 1.5 Mbps, the delay*bandwidth product is
   4500 bits. For a W-WAN such as Ricochet, a typical round-trip time
   may be around 500 ms. (the best is about 230 ms.), and the sustained
   bandwidth is about 24 Kbps. This yields a delay*bandwidth product
   roughly equal to 1.5 KB. In the near future, 3rd Generation wireless
   services will offer 384Kbps and more.  Assuming a 200 ms round-trip,
   the delay*bandwidth product in this case is 76.8 Kbits (9.6 KB). This
   value is larger than the default 8KB buffer space used by many TCP
   implementations. This means that, whereas for W-LANs the default
   buffer space is enough, future W-WANs will operate inefficiently
   (that is, they will not be able to fill the pipe) unless they
   override the default value. A 3rd Generation wireless service
   offering 2 Mbps with 200-millisecond latency requires a 50 KB buffer.

   Most importantly,  latency across a link adversely affects
   throughput. For example,  [MSMO97] derives an upper bound on TCP
   throughput. Indeed, the resultant expression is inversely related to
   the round-trip time.

   The long latencies also push the limits (and commonly transgress
   them) for what is acceptable to users of interactive applications.

   As a quick glance to our list of references will reveal, there is a
   wealth of proposals that attempt to solve the wireless networking
   problem. In this document, we survey the different solutions
   available or under investigation, and issue the corresponding
   recommendations.

RFC2757 - Page 4

   There is a large body of work on the subject of improving TCP
   performance over satellite links. The documents under development by
   the tcpsat working group of the IETF [AGS98, ADGGHOSSTT98] are very
   relevant. In both cases, it is essential to start by improving the
   characteristics of the medium by using forward error correction (FEC)
   at the link layer to reduce the BER (bit error rate) from values as
   high as 10-3 to 10-6 or better. This makes the BER manageable. Once
   in this realm, retransmission schemes like ARQ (automatic repeat
   request) may be used to bring it down even further. Notice that
   sometimes it may be desirable to forego ARQ because of the additional
   delay it implies.  In particular, time sensitive traffic (video,
   audio) must be delivered within a certain time limit beyond which the
   data is obsolete. Exhaustive retransmissions in this case merely
   succeed in wasting time in order to deliver data that will be
   discarded once it arrives at its destination.  This indicates the
   desirability of augmenting the protocol stack implementation on
   devices such that the upper protocol layers can inform the link and
   MAC layer when to avoid such costly retransmission schemes.

   Networks that include satellite links are examples of "long fat
   networks" (LFNs or "elephants"). They are "long" networks because
   their round-trip time is quite high (for example, 0.5 sec and higher
   for geosynchronous satellites). Not all satellite links fall within
   the LFN regime. In particular, round-trip times in a low-earth
   orbiting (LEO) satellite network may be as little as a few
   milliseconds (and never extend beyond 160 to 200 ms). W-WANs share
   the "L" with LFNs. However, satellite networks are also "fat" in the
   sense that they may have high bandwidth. Satellite networks may often
   have a delay*bandwidth product above 64 KBytes, in which case they
   pose additional problems to TCP [TCPHP]. W-WANs do not generally
   exhibit this behavior. Accordingly, this document only deals with
   links that are "long thin pipes", and the networks that contain them:
   "long thin networks". We call these "LTNs".

   This document does not give an overview of the API used to access the
   underlying transport. We believe this is an orthogonal issue, even
   though some of the proposals below have been put forth assuming a
   given interface.  It is possible, for example, to support the
   traditional socket semantics without fully relying on TCP/IP
   transport [MOWGLI].

   Our focus is on the on-the-wire protocols. We try to include the most
   relevant ones and briefly (given that we provide the references
   needed for further study) mention their most salient points.

RFC2757 - Page 5

1.1 Network Architecture

   One significant difference between LFNs and LTNs is that we assume
   the W-WAN link is the last hop to the end user. This allows us to
   assume that a single intermediate node sees all packets transferred
   between the wireless mobile device and the rest of the Internet.
   This is only one of the topologies considered by the TCP Satellite
   community.

   Given our focus on mobile wireless applications, we only consider a
   very specific architecture that includes:

      -  a wireless mobile device, connected via

      -  a wireless link (which may, in fact comprise several hops at
         the link layer), to

      -  an intermediate node (sometimes referred to as a base station)
         connected via

      -  a wireline link, which in turn interfaces with

      -  the landline Internet and millions of legacy servers and web
         sites.

   Specifically, we are not as concerned with paths that include two
   wireless segments separated by a wired one. This may occur, for
   example, if one mobile device connects across its immediate wireless
   segment via an intermediate node to the Internet, and then via a
   second wireless segment to another mobile device.  Quite often,
   mobile devices connect to a legacy server on the wired Internet.

   Typically, the endpoints of the wireless segment are the intermediate
   node and the mobile device. However, the latter may be a wireless
   router to a mobile network. This is also important and has
   applications in, for example, disaster recovery.

   Our target architecture has implications which concern the
   deployability of candidate solutions. In particular, an important
   requirement is that we cannot alter the networking stack on the
   legacy servers. It would be preferable to only change the networking
   stack at the intermediate node, although changing it at the mobile
   devices is certainly an option and perhaps a necessity.

   We envision mobile devices that can use the wireless medium very
   efficiently, but overcome some of its traditional constraints.  That
   is, full mobility implies that the devices have the flexibility and
   agility to use whichever happens to be the best network connection

RFC2757 - Page 6

   available at any given point in time or space.  Accordingly, devices
   could switch from a wired office LAN and hand over their ongoing
   connections to continue on, say, a wireless WAN. This type of agility
   also requires Mobile IP [RFC2002].

1.2 Assumptions about the Radio Link

   The system architecture described above assumes at most one wireless
   link (perhaps comprising more than one wireless hop).  However, this
   is not enough to characterize a wireless link.  Additional
   considerations are:

      -  What are the error characteristics of the wireless medium?  The
         link may present a higher BER than a wireline network due to
         burst errors and disconnections. The techniques below usually
         do not address all the types of errors. Accordingly, a complete
         solution should combine the best of all the proposals.
         Nevertheless, in this document we are more concerned with (and
         give preference to solving) the most typical case: (1) higher
         BER due to random errors (which implies longer and more
         variable delays due to link-layer error corrections and
         retransmissions) rather than (2) an interruption in service due
         to a handoff or a disconnection.  The latter are also important
         and we do include relevant proposals in this survey.

      -  Is the wireless service datagram oriented, or is it a virtual
         circuit?  Currently, switched virtual circuits are more common,
         but packet networks are starting to appear, for example,
         Metricom's Starmode [CB96], CDPD [CDPD] and General Packet
         Radio Service (GPRS) [GPRS],[BW97] in GSM.

      -  What kind of reliability does the link provide? Wireless
         services typically retransmit a packet (frame) until it has
         been acknowledged by the target. They may allow the user to
         turn off this behavior. For example, GSM allows RLP [RLP]
         (Radio Link Protocol)  to be turned off.  Metricom has a
         similar "lightweight" mode. In GSM RLP, a frame is
         retransmitted until the maximum number of retransmissions
         (protocol parameter) is reached. What happens when this limit
         is reached is determined by the telecom operator:  the physical
         link connection is either disconnected or a link reset is
         enforced where the sequence numbers are resynchronized and the
         transmit and receive buffers are flushed resulting in lost
         data. Some wireless services, like CDMA IS95-RLP [CDMA,
         Karn93], limit the latency on the wireless link by
         retransmitting a frame only a couple of times. This decreases
         the residual frame error rate significantly, but does not
         provide fully reliable link service.

RFC2757 - Page 7

      -  Does the mobile device transmit and receive at the same time?
         Doing so increases the cost of the electronics on the mobile
         device. Typically, this is not the case. We assume in this
         document that mobile devices do not transmit and receive
         simultaneously.

      -  Does the mobile device directly address more than one peer on
         the wireless link? Packets to each different peer may traverse
         spatially distinct wireless paths. Accordingly, the path to
         each peer may exhibit very different characteristics.  Quite
         commonly, the mobile device addresses only one peer (the
         intermediate node) at any given point in time.  When this is
         not the case, techniques such as Channel-State Dependent Packet
         Scheduling come into play (see the section "Packet Scheduling"
         below).

2 Should it be IP or Not?

   The first decision is whether to use IP as the underlying network
   protocol or not. In particular, some data protocols evolved from
   wireless telephony are not always -- though at times they may be --
   layered on top of IP [MOWGLI, WAP]. These proposals are based on the
   concept of proxies that provide adaptation services between the
   wireless and wireline segments.

   This is a reasonable model for mobile devices that always communicate
   through the proxy. However, we expect many wireless mobile devices to
   utilize wireline networks whenever they are available. This model
   closely follows current laptop usage patterns: devices typically
   utilize LANs, and only resort to dial-up access when "out of the
   office."

   For these devices, an architecture that assumes IP is the best
   approach, because it will be required for communications that do not
   traverse the intermediate node (for example, upon reconnection to a
   W-LAN or a 10BaseT network at the office).

2.1 Underlying Network Error Characteristics

   Using IP as the underlying network protocol requires a certain (low)
   level of link robustness that is expected of wireless links.

   IP, and the protocols that are carried in IP packets, are protected
   end-to-end by checksums that are relatively weak [Stevens94,
   Paxson97] (and, in some cases, optional). For much of the Internet,
   these checksums are sufficient; in wireless environments, the error
   characteristics of the raw wireless link are much less robust than
   the rest of the end-to-end path.  Hence for paths that include

RFC2757 - Page 8

   wireless links, exclusively relying on end-to-end mechanisms to
   detect and correct transmission errors is undesirable. These should
   be complemented by local link-level mechanisms. Otherwise, damaged IP
   packets are propagated through the network only to be discarded at
   the destination host. For example, intermediate routers are required
   to check the IP header checksum, but not the UDP or TCP checksums.
   Accordingly, when the payload of an IP packet is corrupted, this is
   not detected until the packet arrives at its ultimate destination.

   A better approach is to use link-layer mechanisms such as FEC,
   retransmissions, and so on in order to improve the characteristics of
   the wireless link and present a much more reliable service to IP.
   This approach has been taken by CDPD, Ricochet and CDMA.

   This approach is roughly analogous to the successful deployment of
   Point-to-Point Protocol (PPP), with robust framing and 16-bit
   checksumming, on wireline networks as a replacement for the Serial
   Line Interface Protocol (SLIP), with only a single framing byte and
   no checksumming.

   [AGS98] recommends the use of FEC in satellite environments.

   Notice that the link-layer could adapt its frame size to the
   prevalent BER.  It would perform its own fragmentation and reassembly
   so that IP could still enjoy a large enough MTU size [LS98].

   A common concern for using IP as a transport is the header overhead
   it implies. Typically, the underlying link-layer appears as PPP
   [RFC1661] to the IP layer above. This allows for header compression
   schemes [IPHC, IPHC-RTP, IPHC-PPP] which greatly alleviate the
   problem.

2.2 Non-IP Alternatives

   A number of non-IP alternatives aimed at wireless environments have
   been proposed. One representative proposal is discussed here.

2.2.1 WAP

   The Wireless Application Protocol (WAP) specifies an application
   framework and network protocols for wireless devices such as mobile
   telephones, pagers, and PDAs [WAP]. The architecture requires a proxy
   between the mobile device and the server. The WAP protocol stack is
   layered over a datagram transport service.  Such a service is
   provided by most wireless networks; for example, IS-136, GSM
   SMS/USSD, and UDP in IP networks like CDPD and GSM GPRS. The core of

RFC2757 - Page 9

   the WAP protocols is a binary HTTP/1.1 protocol with additional
   features such as header caching between requests and a shared state
   between client and server.

2.2.2 Deploying Non-IP Alternatives

   IP is such a fundamental element of the Internet that non-IP
   alternatives face substantial obstacles to deployment, because they
   do not exploit the IP infrastructure. Any non-IP alternative that is
   used to provide gatewayed access to the Internet must map between IP
   addresses and non-IP addresses, must terminate IP-level security at a
   gateway, and cannot use IP-oriented discovery protocols (Dynamic Host
   Configuration Protocol, Domain Name Services, Lightweight Directory
   Access Protocol, Service Location Protocol, etc.) without translation
   at a gateway.

   A further complexity occurs when a device supports both wireless and
   wireline operation. If the device uses IP for wireless operation,
   uninterrupted operation when the device is connected to a wireline
   network is possible (using Mobile IP). If a non-IP alternative is
   used, this switchover is more difficult to accomplish.

   Non-IP alternatives face the burden of proof that IP is so ill-suited
   to a wireless environment that it is not a viable technology.

2.3 IP-based Considerations

   Given its worldwide deployment, IP is an obvious choice for the
   underlying network technology. Optimizations implemented at this
   level benefit traditional Internet application protocols as well as
   new ones layered on top of IP or UDP.

2.3.1 Choosing the MTU [Stevens94, RFC1144]

   In slow networks, the time required to transmit the largest possible
   packet may be considerable.  Interactive response time should not
   exceed the well-known human factors limit of 100 to 200 ms. This
   should be considered the maximum time budget to (1) send a packet and
   (2) obtain a response. In most networking stack implementations, (1)
   is highly dependent on the maximum transmission unit (MTU). In the
   worst case, a small packet from an interactive application may have
   to wait for a large packet from a bulk transfer application before
   being sent. Hence, a good rule of thumb is to choose an MTU such that
   its transmission time is less than (or not much larger than) 200 ms.

RFC2757 - Page 10

   Of course, compression and type-of-service queuing (whereby
   interactive data packets are given a higher priority) may alleviate
   this problem. In particular, the latter may reduce the average wait
   time to about half the MTU's transmission time.

2.3.2 Path MTU Discovery [RFC1191]

   Path MTU discovery benefits any protocol built on top of IP. It
   allows a sender to determine what the maximum end-to-end transmission
   unit is to a given destination. Without Path MTU discovery, the
   default IPv4 MTU size is 576. The benefits of using a larger MTU are:

      -  Smaller ratio of header overhead to data

      -  Allows TCP to grow its congestion window faster, since it
         increases in units of segments.

   Of course, for a given BER, a larger MTU has a correspondingly larger
   probability of error within any given segment. The BER may be reduced
   using lower level techniques like FEC and link-layer retransmissions.
   The issue is that now delays may become a problem due to the
   additional retransmissions, and the fact that packet transmission
   time increases with a larger MTU.

   Recommendation: Path MTU discovery is recommended. [AGS98] already
   recommends its use in satellite environments.

2.3.3 Non-TCP Proposals

   Other proposals assume an underlying IP datagram service, and
   implement an optimized transport either directly on top of IP
   [NETBLT] or on top of UDP [MNCP]. Not relying on TCP is a bold move,
   given the wealth of experience and research related to it.  It could
   be argued that the Internet has not collapsed because its main
   protocol, TCP, is very careful in how it uses the network, and
   generally treats it as a black box assuming all packet losses are due
   to congestion and prudently backing off. This avoids further
   congestion.

   However, in the wireless medium, packet losses may also be due to
   corruption due to high BER, fading, and so on. Here, the right
   approach is to try harder, instead of backing off. Alternative
   transport protocols are:

      -  NETBLT [NETBLT, RFC1986, RFC1030]

      -  MNCP [MNCP]

RFC2757 - Page 11

      -  ESRO [RFC2188]

      -  RDP [RFC908, RFC1151]

      -  VMTP [VMTP]

3 The Case for TCP

   This is one of the most hotly debated issues in the wireless arena.
   Here are some arguments against it:

      -  It is generally recognized that TCP does not perform well in
         the presence of significant levels of non-congestion loss.  TCP
         detractors argue that the wireless medium is one such case, and
         that it is hard enough to fix TCP. They argue that it is easier
         to start from scratch.

      -  TCP has too much header overhead.

      -  By the time the mechanisms are in place to fix it, TCP is very
         heavy, and ill-suited for use by lightweight, portable devices.

   and here are some in support of TCP:

      -  It is preferable to continue using the same protocol that the
         rest of the Internet uses for compatibility reasons. Any
         extensions specific to the wireless link may be negotiated.

      -  Legacy mechanisms may be reused (for example three-way
         handshake).

      -  Link-layer FEC and ARQ can reduce the BER such that any losses
         TCP does see are, in fact, caused by congestion (or a sustained
         interruption of link connectivity). Modern W-WAN technologies
         do this (CDPD, US-TDMA, CDMA, GSM), thus improving TCP
         throughput.

      -  Handoffs among different technologies are made possible by
         Mobile IP [RFC2002], but only if the same protocols, namely
         TCP/IP, are used throughout.

      -  Given TCP's wealth of research and experience, alternative
         protocols are relatively immature, and the full implications of
         their widespread deployment not clearly understood.

   Overall, we feel that the performance of TCP over long-thin networks
   can be improved significantly. Mechanisms to do so are discussed in
   the next sections.

RFC2757 - Page 12

4 Candidate Optimizations

   There is a large volume of work on the subject of optimizing TCP for
   operation over wireless media. Even though satellite networks
   generally fall in the LFN regime, our current LTN focus has much to
   benefit from it.  For example, the work of the TCP-over-Satellite
   working group of the IETF has been extremely helpful in preparing
   this section [AGS98, ADGGHOSSTT98].

4.1 TCP: Current Mechanisms

   A TCP sender adapts its use of bandwidth based on feedback from the
   receiver. The high latency characteristic of LTNs implies that TCP's
   adaptation is correspondingly slower than on networks with shorter
   delays.  Similarly, delayed ACKs exacerbate the perceived latency on
   the link. Given that TCP grows its congestion window in units of
   segments, small MTUs may slow adaptation even further.

4.1.1 Slow Start and Congestion Avoidance

   Slow Start and Congestion Avoidance [RFC2581] are essential the
   Internet's stability.  However there are two reasons why the wireless
   medium adversely affects them:

      -  Whenever TCP's retransmission timer expires, the sender assumes
         that the network is congested and invokes slow start. This is
         why it is important to minimize the losses caused by
         corruption, leaving only those caused by congestion (as
         expected by TCP).

      -  The sender increases its window based on the number of ACKs
         received. Their rate of arrival, of course, is dependent on the
         RTT (round-trip-time) between sender and receiver, which
         implies long ramp-up times in high latency links like LTNs. The
         dependency lasts until the pipe is filled.

      -  During slow start, the sender increases its window in units of
         segments. This is why it is important to use an appropriately
         large MTU which, in turn, requires requires link layers with
         low loss.

4.1.2 Fast Retransmit and Fast Recovery

   When a TCP sender receives several duplicate ACKs, fast retransmit
   [RFC2581] allows it to infer that a segment was lost.  The sender
   retransmits what it considers to be this lost segment without waiting
   for the full timeout, thus saving time.

RFC2757 - Page 13

   After a fast retransmit, a sender invokes the fast recovery [RFC2581]
   algorithm. Fast recovery allows the sender to transmit at half its
   previous rate (regulating the growth of its window based on
   congestion avoidance), rather than having to begin a slow start. This
   also saves time.

   In general, TCP can increase its window beyond the delay-bandwidth
   product. However, in LTN links the congestion window may remain
   rather small, less than four segments, for long periods of time due
   to any of the following reasons:

      1. Typical "file size" to be transferred over a connection is
         relatively small (Web requests, Web document objects, email
         messages, files, etc.) In particular, users of LTNs are not
         very willing to carry out large transfers as the response time
         is so long.

      2. If the link has high BER, the congestion window tends to stay
         small

      3. When an LTN is combined with a highly congested wireline
         Internet path, congestion losses on the Internet have the same
         effect as 2.

      4. Commonly, ISPs/operators configure only a small number of
         buffers (even as few as for 3 packets) per user in their dial-
         up routers

      5. Often small socket buffers are recommended with LTNs in order
         to prevent the RTO from inflating and to diminish the amount of
         packets with competing traffic.

   A small window effectively prevents the sender from taking advantage
   of Fast Retransmits. Moreover, efficient recovery from multiple
   losses within a single window requires adoption of new proposals
   (NewReno [RFC2582]). In addition, on slow paths with no packet
   reordering waiting for three duplicate ACKs to arrive postpones
   retransmission unnecessarily.

   Recommendation: Implement Fast Retransmit and Fast Recovery at this
   time. This is a widely-implemented optimization and is currently at
   Proposed Standard level. [AGS98] recommends implementation of Fast
   Retransmit/Fast Recovery in satellite environments.  NewReno
   [RFC2582] apparently does help a sender better handle partial ACKs
   and multiple losses in a single window, but at this point is not
   recommended due to its experimental nature.  Instead, SACK [RFC2018]
   is the preferred mechanism.

RFC2757 - Page 14

4.2 Connection Setup with T/TCP [RFC1397, RFC1644]

   TCP engages in a "three-way handshake" whenever a new connection is
   set up.  Data transfer is only possible after this phase has
   completed successfully.  T/TCP allows data to be exchanged in
   parallel with the connection set up, saving valuable time for short
   transactions on long-latency networks.

   Recommendation: T/TCP is not recommended, for these reasons:

   -  It is an Experimental RFC.

   -  It is not widely deployed, and it has to be deployed at both ends
      of a connection.

   -  Security concerns have been raised that T/TCP is more vulnerable
      to address-spoofing attacks than TCP itself.

   -  At least some of the benefits of T/TCP (eliminating three-way
      handshake on subsequent query-response transactions, for instance)
      are also available with persistent connections on HTTP/1.1, which
      is more widely deployed.

   [ADGGHOSSTT98] does not have a recommendation on T/TCP in satellite
   environments.

4.3 Slow Start Proposals

   Because slow start dominates the network response seen by interactive
   users at the beginning of a TCP connection, a number of proposals
   have been made to modify or eliminate slow start in long latency
   environments.

   Stability of the Internet is paramount, so these proposals must
   demonstrate that they will not adversely affect Internet congestion
   levels in significant ways.

4.3.1 Larger Initial Window

   Traditional slow start, with an initial window of one segment, is a
   time-consuming bandwidth adaptation procedure over LTNs. Studies on
   an initial window larger than one segment [RFC2414, AHO98] resulted
   in the TCP standard supporting a maximum value of 2 [RFC2581]. Higher
   values are still experimental in nature.

RFC2757 - Page 15

   In simulations with an increased initial window of three packets
   [RFC2415], this proposal does not contribute significantly to packet
   drop rates, and it has the added benefit of improving initial
   response times when the peer device delays acknowledgements during
   slow start (see next proposal).

   [RFC2416] addresses situations where the initial window exceeds the
   number of buffers available to TCP and indicates that this situation
   is no different from the case where the congestion window grows
   beyond the number of buffers available.

   [RFC2581] now allows an initial congestion window of two segments. A
   larger initial window, perhaps as many as four segments, might be
   allowed in the future in environments where this significantly
   improves performance (LFNs and LTNs).

   Recommendation: Implement this on devices now. The research on this
   optimization indicates that 3 segments is a safe initial setting, and
   is centering on choosing between 2, 3, and 4. For now, use 2
   (following RFC2581), which at least allows clients running query-
   response applications to get an initial ACK from unmodified servers
   without waiting for a typical delayed ACK timeout of 200
   milliseconds, and saves two round-trips. An initial window of 3
   [RFC2415] looks promising and may be adopted in the future pending
   further research and experience.

4.3.2 Growing the Window during Slow Start

   The sender increases its window based on the flow of ACKs coming back
   from the receiver. Particularly during slow start, this flow is very
   important.  A couple of the proposals that have been studied are (1)
   ACK counting and (2) ACK-every-segment.

4.3.2.1 ACK Counting

   The main idea behind ACK counting is:

      -  Make each ACK count to its fullest by growing the window based
         on the data being acknowledged (byte counting) instead of the
         number of ACKs (ACK counting). This has been shown to cause
         bursts which lead to congestion. [Allman98] shows that Limited
         Byte Counting (LBC), in which the window growth is limited to 2
         segments, does not lead to as much burstiness, and offers some
         performance gains.

   Recommendation: Unlimited byte counting is not recommended.  Van
   Jacobson cautions against byte counting [TCPSATMIN] because it leads
   to burstiness, and recommends ACK spacing [ACKSPACING] instead.

RFC2757 - Page 16

   ACK spacing requires ACKs to consistently pass through a single ACK-
   spacing router.  This requirement works well for W-WAN environments
   if the ACK-spacing router is also the intermediate node.

   Limited byte counting warrants further investigation before we can
   recommend this proposal, but it shows promise.

4.3.2.2 ACK-every-segment

   The main idea behind ACK-every-segment is:

      -  Keep a constant stream of ACKs coming back by turning off
         delayed ACKs [RFC1122] during slow start. ACK-every-segment
         must be limited to slow start, in order to avoid penalizing
         asymmetric-bandwidth configurations. For instance, a low
         bandwidth link carrying acknowledgements back to the sender,
         hinders the growth of the congestion window, even if the link
         toward the client has a greater bandwidth [BPK99].

   Even though simulations confirm its promise (it allows receivers to
   receive the second segment from unmodified senders without waiting
   for a typical delayed ACK timeout of 200 milliseconds), for this
   technique to be practical the receiver must acknowledge every segment
   only when the sender is in slow start.  Continuing to do so when the
   sender is in congestion avoidance may have adverse effects on the
   mobile device's battery consumption and on traffic in the network.

   This violates a SHOULD in [RFC2581]:  delayed acknowledgements SHOULD
   be used by a TCP receiver.

   "Disabling Delayed ACKs During Slow Start" is technically
   unimplementable, as the receiver has no way of knowing when the
   sender crosses ssthresh (the "slow start threshold") and begins using
   the congestion avoidance algorithm.  If receivers follow
   recommendations for increased initial windows, disabling delayed ACKs
   during an increased initial window would open the TCP window more
   rapidly without doubling ACK traffic in general.  However, this
   scheme might double ACK traffic if most connections remain in slow-
   start.

   Recommendation: ACK only the first segment on a new connection with
   no delay.

RFC2757 - Page 17

4.3.3 Terminating Slow Start

   New mechanisms [ADGGHOSSTT98] are being proposed to improve TCP's
   adaptive properties such that the available bandwidth is better
   utilized while reducing the possibility of congesting the network.
   This results in the closing of the congestion window to 1 segment
   (which precludes fast retransmit), and the subsequent slow start
   phase.

   Theoretically, an optimum value for slow-start threshold (ssthresh)
   allows connection bandwidth utilization to ramp up as aggressively as
   possible without "overshoot" (using so much bandwidth that packets
   are lost and congestion avoidance procedures are invoked).

   Recommendation: Estimating the slow start threshold is not
   recommended.  Although this would be helpful if we knew how to do it,
   rough consensus on the tcp-impl and tcp-sat mailing lists is that in
   non-trivial operational networks there is no reliable method to probe
   during TCP startup and estimate the bandwidth available.

4.3.4 Generating ACKs during Slow Start

   Mitigations that inject additional ACKs (whether "ACK-first-segment"
   or "ACK-every-segment-during-slow-start") beyond what today's
   conformant TCPs inject are only applicable during the slow-start
   phases of a connection. After an initial exchange, the connection
   usually completes slow-start, so TCPs only inject additional ACKs
   when (1) the connection is closed, and a new connection is opened, or
   (2) the TCPs handle idle connection restart correctly by performing
   slow start.

   Item (1) is typical when using HTTP/1.0, in which each request-
   response transaction requires a new connection.  Persistent
   connections in HTTP/1.1 help in maintaining a connection in
   congestion avoidance instead of constantly reverting to slow-start.
   Because of this, these optimizations which are only enabled during
   slow-start do not get as much of a chance to act. Item (2), of
   course, is independent of HTTP version.

4.4 ACK Spacing

   During slow start, the sender responds to the incoming ACK stream by
   transmitting N+1 segments for each ACK, where N is the number of new
   segments acknowledged by the incoming ACK.  This results in data
   being sent at twice the speed at which it can be processed by the
   network.  Accordingly, queues will form, and due to insufficient
   buffering at the bottleneck router, packets may get dropped before
   the link's capacity is full.

RFC2757 - Page 18

   Spacing out the ACKs effectively controls the rate at which the
   sender will transmit into the network, and may result in little or no
   queueing at the bottleneck router [ACKSPACING].  Furthermore, ack
   spacing reduces the size of the bursts.

   Recommendation: No recommendation at this time. Continue monitoring
   research in this area.

4.5 Delayed Duplicate Acknowlegements

   As was mentioned above, link-layer retransmissions may decrease the
   BER enough that congestion accounts for most of packet losses; still,
   nothing can be done about interruptions due to handoffs, moving
   beyond wireless coverage, etc. In this scenario, it is imperative to
   prevent interaction between link-layer retransmission and TCP
   retransmission as these layers duplicate each other's efforts. In
   such an environment it may make sense to delay TCP's efforts so as to
   give the link-layer a chance to recover. With this in mind, the
   Delayed Dupacks [MV97, Vaidya99] scheme selectively delays duplicate
   acknowledgements at the receiver.  It is preferable to allow a local
   mechanism to resolve a local problem, instead of invoking TCP's end-
   to-end mechanism and incurring the associated costs, both in terms of
   wasted bandwidth and in terms of its effect on TCP's window behavior.

   The Delayed Dupacks scheme can be used despite IP encryption since
   the intermediate node does not need to examine the TCP headers.

   Currently, it is not well understood how long the receiver should
   delay the duplicate acknowledgments. In particular, the impact of
   wireless medium access control (MAC) protocol on the choice of delay
   parameter needs to be studied. The MAC protocol may affect the
   ability to choose the appropriate delay (either statically or
   dynamically). In general, significant variabilities in link-level
   retransmission times can have an adverse impact on the performance of
   the Delayed Dupacks scheme. Furthermore, as discussed later in
   section 4.10.3, Delayed Dupacks and some other schemes (such as Snoop
   [SNOOP]) are only beneficial in certain types of network links.

   Recommendation: Delaying duplicate acknowledgements may be useful in
   specific network topologies, but a general recommendation requires
   further research and experience.

4.6 Selective Acknowledgements [RFC2018]

   SACK may not be useful in many LTNs, according to Section 1.1 of
   [TCPHP].  In particular, SACK is more useful in the LFN regime,
   especially if large windows are being used, because there is a

RFC2757 - Page 19

   considerable probability of multiple segment losses per window. In
   the LTN regime, TCP windows are much smaller, and burst errors must
   be much longer in duration in order to damage multiple segments.

   Accordingly, the complexity of SACK may not be justifiable, unless
   there is a high probability of burst errors and congestion on the
   wireless link. A desire for compatibility with TCP recommendations
   for non-LTN environments may dictate LTN support for SACK anyway.

   [AGS98] recommends use of SACK with Large TCP Windows in satellite
   environments, and notes that this implies support for PAWS
   (Protection Against Wrapped Sequence space) and RTTM (Round Trip Time
   Measurement) as well.

   Berkeley's SNOOP protocol research [SNOOP] indicates that SACK does
   improve throughput for SNOOP when multiple segments are lost per
   window [BPSK96]. SACK allows SNOOP to recover from multi-segment
   losses in one round-trip. In this case, the mobile device needs to
   implement some form of selective acknowledgements.  If SACK is not
   used, TCP may enter congestion avoidance as the time needed to
   retransmit the lost segments may be greater than the retransmission
   timer.

   Recommendation: Implement SACK now for compatibility with other TCPs
   and improved performance with SNOOP.

4.7 Detecting Corruption Loss

4.7.1 Without Explicit Notification

   In the absence of explicit notification from the network, some
   researchers have suggested statistical methods for congestion
   avoidance [Jain89, WC91, VEGAS]. A natural extension of these
   heuristics would enable a sender to distinguish between losses caused
   by congestion and other causes.  The research results on the
   reliability of sender-based heuristics is unfavorable [BV97, BV98].
   [BV98a] reports better results in constrained environments using
   packet inter-arrival times measured at the receiver, but highly-
   variable delay - of the type encountered in wireless environments
   during intercell handoff - confounds these heuristics.

   Recommendation: No recommendation at this time - continue to monitor
   research results.

RFC2757 - Page 20

4.7.2 With Explicit Notifications

   With explicit notification from the network it is possible to
   determine when a loss is due to congestion. Several proposals along
   these lines include:

      -  Explicit Loss Notification (ELN) [BPSK96]

      -  Explicit Bad State Notification (EBSN) [BBKVP96]

      -  Explicit Loss Notification to the Receiver (ELNR), and Explicit
         Delayed Dupack Activation Notification (EDDAN) (notifications
         to mobile receiver) [MV97]

      -  Explicit Congestion Notification (ECN) [ECN]

   Of these proposals, Explicit Congestion Notification (ECN) seems
   closest to deployment on the Internet, and will provide some benefit
   for TCP connections on long thin networks (as well as for all other
   TCP connections).

   Recommendation: No recommendation at this time. Schemes like ELNR and
   EDDAN [MV97], in which  the only systems that need to be modified are
   the intermediate node and the mobile device, are slated for adoption
   pending further research.  However, this solution has some
   limitations. Since the intermediate node must have access to the TCP
   headers, the IP payload must not be encrypted.

   ECN uses the TOS byte in the IP header to carry congestion
   information (ECN-capable and Congestion-encountered).  This byte is
   not encrypted in IPSEC, so ECN can be used on TCP connections that
   are encrypted using IPSEC.

   Recommendation: Implement ECN. In spite of this, mechanisms for
   explicit corruption notification are still relevant and should be
   tracked.

   Note: ECN provides useful information to avoid deteriorating further
   a bad situation, but has some limitations for wireless applications.
   Absence of packets marked with ECN should not be interpreted by ECN-
   capable TCP connections as a green light for aggressive
   retransmissions. On the contrary, during periods of extreme network
   congestion routers may drop packets marked with explicit notification
   because their buffers are exhausted - exactly the wrong time for a
   host to begin retransmitting aggressively.

(next page on part 2)