4.2.3 SPECIFIC ISSUES

4.2.3.1 Retransmission Timeout Calculation

A host TCP MUST implement Karn's algorithm and Jacobson's
algorithm for computing the retransmission timeout ("RTO").

o Jacobson's algorithm for computing the smoothed round-trip
("RTT") time incorporates a simple measure of the
variance [TCP:7].

o Karn's algorithm for selecting RTT measurements ensures
that ambiguous round-trip times will not corrupt the
calculation of the smoothed round-trip time [TCP:6].

This implementation also MUST include "exponential backoff"
for successive RTO values for the same segment.
Retransmission of SYN segments SHOULD use the same algorithm
as data segments.

DISCUSSION:
There were two known problems with the RTO calculations
specified in RFC-793. First, the accurate measurement
of RTTs is difficult when there are retransmissions.
Second, the algorithm to compute the smoothed round-trip
time is inadequate [TCP:7], because it incorrectly
assumed that the variance in RTT values would be small
and constant. These problems were solved by Karn's and
Jacobson's algorithms, respectively.
The performance increase resulting from the use of
these improvements varies from noticeable to dramatic.
Jacobson's algorithm for incorporating the measured RTT
variance is especially important on a low-speed link,
where the natural variation of packet sizes causes a
large variation in RTT. One vendor found link
utilization on a 9.6kb line went from 10% to 90% as a
result of implementing Jacobson's variance algorithm in
TCP.
The following values SHOULD be used to initialize the
estimation parameters for a new connection:
(a) RTT = 0 seconds.
(b) RTO = 3 seconds. (The smoothed variance is to be
initialized to the value that will result in this RTO).
The recommended upper and lower bounds on the RTO are known
to be inadequate on large internets. The lower bound SHOULD
be measured in fractions of a second (to accommodate high
speed LANs) and the upper bound should be 2*MSL, i.e., 240
seconds.
DISCUSSION:
Experience has shown that these initialization values
are reasonable, and that in any case the Karn and
Jacobson algorithms make TCP behavior reasonably
insensitive to the initial parameter choices.
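
The estimator described in this section can be sketched as follows.
This is a minimal illustration, not a definitive implementation: the
gains of 1/8 and 1/4 and the deviation multiplier of 4 are assumptions
taken from common practice and are not stated in this section, and the
0.2 second lower bound is a hypothetical value standing in for
"fractions of a second". The initial values and the 240 second upper
bound follow the text above.

```python
from dataclasses import dataclass

# Assumed constants: the gains and the multiplier are not given in this section.
ALPHA = 1 / 8      # gain for the smoothed RTT
BETA = 1 / 4       # gain for the smoothed mean deviation
K = 4              # deviation multiplier used in the RTO
RTO_MIN = 0.2      # hypothetical lower bound ("fractions of a second")
RTO_MAX = 240.0    # upper bound, 2*MSL


@dataclass
class RtoEstimator:
    srtt: float = 0.0          # (a) RTT = 0 seconds
    rttvar: float = 3.0 / K    # chosen so that the initial RTO is 3 seconds
    rto: float = 3.0           # (b) RTO = 3 seconds

    def on_ack_sample(self, rtt: float, retransmitted: bool) -> None:
        """Feed one RTT measurement taken when a segment is acknowledged."""
        if retransmitted:
            # Karn: the sample is ambiguous, so it is discarded and the
            # (possibly backed-off) RTO is kept.
            return
        err = rtt - self.srtt
        self.srtt += ALPHA * err
        self.rttvar += BETA * (abs(err) - self.rttvar)
        self.rto = min(RTO_MAX, max(RTO_MIN, self.srtt + K * self.rttvar))

    def on_timeout(self) -> None:
        """Exponential backoff for successive retransmissions of a segment."""
        self.rto = min(RTO_MAX, self.rto * 2)
```

A caller would invoke on_timeout() each time the retransmission timer
fires and on_ack_sample() for each acknowledged segment, passing
retransmitted=True whenever the segment was sent more than once.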
4.2.3.2 When to Send an ACK Segment
A host that is receiving a stream of TCP data segments can
increase efficiency in both the Internet and the hosts by
sending fewer than one ACK (acknowledgment) segment per data
segment received; this is known as a "delayed ACK" [TCP:5].
A TCP SHOULD implement a delayed ACK, but an ACK should not
be excessively delayed; in particular, the delay MUST be
less than 0.5 seconds, and in a stream of full-sized
segments there SHOULD be an ACK for at least every second
segment.
DISCUSSION:
A delayed ACK gives the application an opportunity to
update the window and perhaps to send an immediate
response. In particular, in the case of character-mode
remote login, a delayed ACK can reduce the number of
segments sent by the server by a factor of 3 (ACK,
window update, and echo character all combined in one
segment).
In addition, on some large multi-user hosts, a delayed
ACK can substantially reduce protocol processing
overhead by reducing the total number of packets to be
processed [TCP:5]. However, excessive delays on ACK's
can disturb the round-trip timing and packet "clocking"
algorithms [TCP:7].
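
One way to meet both constraints (a bounded delay and an ACK for at
least every second full-sized segment) is sketched below. The class
and method names are hypothetical, and the 0.2 second timer is an
assumed value; the text only requires that the delay be less than 0.5
seconds.

```python
MAX_ACK_DELAY = 0.2   # assumed timer value; must stay below 0.5 seconds


class DelayedAck:
    def __init__(self, mss: int) -> None:
        self.mss = mss
        self.full_unacked = 0   # full-sized segments received since the last ACK
        self.deadline = None    # time at which a pending ACK must be sent

    def on_segment(self, length: int, now: float) -> bool:
        """Return True if an ACK should be sent immediately."""
        if length >= self.mss:
            self.full_unacked += 1
        if self.full_unacked >= 2:
            # ACK at least every second full-sized segment.
            self._reset()
            return True
        if self.deadline is None:
            # Start the delay timer at the first unacknowledged segment.
            self.deadline = now + MAX_ACK_DELAY
        return False

    def on_tick(self, now: float) -> bool:
        """Return True if the delay timer has expired and an ACK is due."""
        if self.deadline is not None and now >= self.deadline:
            self._reset()
            return True
        return False

    def _reset(self) -> None:
        self.full_unacked = 0
        self.deadline = None
```

An ACK that is piggybacked on outgoing data would also clear the
pending state; that path is omitted here for brevity.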
4.2.3.3 When to Send a Window Update
A TCP MUST include a SWS avoidance algorithm in the receiver
[TCP:5].
IMPLEMENTATION:
The receiver's SWS avoidance algorithm determines when
the right window edge may be advanced; this is
customarily known as "updating the window". This
algorithm combines with the delayed ACK algorithm (see
Section 4.2.3.2) to determine when an ACK segment
containing the current window will really be sent to
the receiver. We use the notation of RFC-793; see
Figures 4 and 5 in that document.
The solution to receiver SWS is to avoid advancing the
right window edge RCV.NXT+RCV.WND in small increments,
even if data is received from the network in small
segments.
Suppose the total receive buffer space is RCV.BUFF. At
any given moment, RCV.USER octets of this total may be
tied up with data that has been received and
acknowledged but which the user process has not yet
consumed. When the connection is quiescent, RCV.WND =
RCV.BUFF and RCV.USER = 0.
Keeping the right window edge fixed as data arrives and
is acknowledged requires that the receiver offer less
than its full buffer space, i.e., the receiver must
specify a RCV.WND that keeps RCV.NXT+RCV.WND constant
as RCV.NXT increases. Thus, the total buffer space
RCV.BUFF is generally divided into three parts:
    |<------- RCV.BUFF ---------------->|
         1             2            3
----|---------|------------------|------|----
           RCV.NXT               ^
                              (Fixed)
1 - RCV.USER = data received but not yet consumed;
2 - RCV.WND = space advertised to sender;
3 - Reduction = space available but not yet
advertised.
The suggested SWS avoidance algorithm for the receiver
is to keep RCV.NXT+RCV.WND fixed until the reduction
satisfies:
RCV.BUFF - RCV.USER - RCV.WND >=
min( Fr * RCV.BUFF, Eff.snd.MSS )
where Fr is a fraction whose recommended value is 1/2,
and Eff.snd.MSS is the effective send MSS for the
connection (see Section 4.2.2.6). When the inequality
is satisfied, RCV.WND is set to RCV.BUFF-RCV.USER.
Note that the general effect of this algorithm is to
advance RCV.WND in increments of Eff.snd.MSS (for
realistic receive buffers: Eff.snd.MSS < RCV.BUFF/2).
Note also that the receiver must use its own
Eff.snd.MSS, assuming it is the same as the sender's.
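
The receiver-side test above can be written as a small function. This
is a sketch only; the function name is hypothetical, and the caller is
assumed to have already reduced rcv_wnd by the amount of data that has
arrived, so that the right window edge has stayed fixed.

```python
FR = 0.5   # recommended value of Fr


def updated_rcv_wnd(rcv_buff: int, rcv_user: int, rcv_wnd: int,
                    eff_snd_mss: int) -> int:
    """Return the window to advertise in the next ACK."""
    # Space that is free but has not yet been advertised to the sender.
    reduction = rcv_buff - rcv_user - rcv_wnd
    if reduction >= min(FR * rcv_buff, eff_snd_mss):
        # Enough space has accumulated: advance the right window edge.
        return rcv_buff - rcv_user
    # Otherwise keep RCV.NXT+RCV.WND where it is.
    return rcv_wnd
```

For example, with an 8192-octet buffer and an effective send MSS of
1460, a reduction of 500 octets leaves the window unchanged, while a
reduction of 1500 octets reopens it to the full free space.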
4.2.3.4 When to Send Data
A TCP MUST include a SWS avoidance algorithm in the sender.
A TCP SHOULD implement the Nagle Algorithm [TCP:9] to
coalesce short segments. However, there MUST be a way for
an application to disable the Nagle algorithm on an
individual connection. In all cases, sending data is also
subject to the limitation imposed by the Slow Start
algorithm (Section 4.2.2.15).
DISCUSSION:
The Nagle algorithm is generally as follows:
If there is unacknowledged data (i.e., SND.NXT >
SND.UNA), then the sending TCP buffers all user
data (regardless of the PSH bit), until the
outstanding data has been acknowledged or until
the TCP can send a full-sized segment (Eff.snd.MSS
bytes; see Section 4.2.2.6).
Some applications (e.g., real-time display window
updates) require that the Nagle algorithm be turned
off, so small data segments can be streamed out at the
maximum rate.
IMPLEMENTATION:
The sender's SWS avoidance algorithm is more difficult
than the receiver's, because the sender does not know
(directly) the receiver's total buffer space RCV.BUFF.
An approach which has been found to work well is for
the sender to calculate Max(SND.WND), the maximum send
window it has seen so far on the connection, and to use
this value as an estimate of RCV.BUFF. Unfortunately,
this can only be an estimate; the receiver may at any
time reduce the size of RCV.BUFF. To avoid a resulting
deadlock, it is necessary to have a timeout to force
transmission of data, overriding the SWS avoidance
algorithm. In practice, this timeout should seldom
occur.
The "useable window" [TCP:5] is:
U = SND.UNA + SND.WND - SND.NXT
i.e., the offered window less the amount of data sent
but not acknowledged. If D is the amount of data
queued in the sending TCP but not yet sent, then the
following set of rules is recommended.
Send data:
(1) if a maximum-sized segment can be sent, i.e., if:
min(D,U) >= Eff.snd.MSS;
(2) or if the data is pushed and all queued data can
be sent now, i.e., if:
[SND.NXT = SND.UNA and] PUSHED and D <= U
(the bracketed condition is imposed by the Nagle
algorithm);
(3) or if at least a fraction Fs of the maximum window
can be sent, i.e., if:
[SND.NXT = SND.UNA and]
min(D,U) >= Fs * Max(SND.WND);
(4) or if data is PUSHed and the override timeout
occurs.
Here Fs is a fraction whose recommended value is 1/2.
The override timeout should be in the range 0.1 - 1.0
seconds. It may be convenient to combine this timer
with the timer used to probe zero windows (Section
4.2.2.17).
Finally, note that the SWS avoidance algorithm just
specified is to be used instead of the sender-side
algorithm contained in [TCP:5].
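
The four sending rules, together with the bracketed Nagle condition,
can be sketched as a single predicate. The function and parameter
names are hypothetical. Sequence numbers are treated as plain integers
(wraparound is ignored), and the zero-window probe is assumed to be
handled elsewhere.

```python
FS = 0.5   # recommended value of Fs


def usable_window(snd_una: int, snd_wnd: int, snd_nxt: int) -> int:
    """U = SND.UNA + SND.WND - SND.NXT (no sequence wraparound handling)."""
    return snd_una + snd_wnd - snd_nxt


def should_send(d: int, u: int, eff_snd_mss: int, max_snd_wnd: int,
                idle: bool, pushed: bool, override_expired: bool,
                nagle: bool = True) -> bool:
    """d = queued unsent data, u = usable window, idle = (SND.NXT == SND.UNA)."""
    if d <= 0:
        return False                                 # nothing to send
    # The bracketed condition applies only while the Nagle algorithm is enabled.
    nagle_ok = idle or not nagle
    if min(d, u) >= eff_snd_mss:                     # rule (1)
        return True
    if nagle_ok and pushed and d <= u:               # rule (2)
        return True
    if nagle_ok and min(d, u) >= FS * max_snd_wnd:   # rule (3)
        return True
    if pushed and override_expired:                  # rule (4)
        return True
    return False
```

Passing nagle=False models an application that has disabled the Nagle
algorithm on the connection, in which case rules (2) and (3) no longer
wait for outstanding data to be acknowledged.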
4.2.3.5 TCP Connection Failures
Excessive retransmission of the same segment by TCP
indicates some failure of the remote host or the Internet
path. This failure may be of short or long duration. The
following procedure MUST be used to handle excessive
retransmissions of data segments [IP:11]:
(a) There are two thresholds R1 and R2 measuring the amount
of retransmission that has occurred for the same
segment. R1 and R2 might be measured in time units or
as a count of retransmissions.
(b) When the number of transmissions of the same segment
reaches or exceeds threshold R1, pass negative advice
(see Section 3.3.1.4) to the IP layer, to trigger
dead-gateway diagnosis.
(c) When the number of transmissions of the same segment
reaches a threshold R2 greater than R1, close the
connection.
(d) An application MUST be able to set the value for R2 for
a particular connection. For example, an interactive
application might set R2 to "infinity," giving the user
control over when to disconnect.
(e) TCP SHOULD inform the application of the delivery
problem (unless such information has been disabled by
the application; see Section 4.2.4.1), when R1 is
reached and before R2. This will allow a remote login
(User Telnet) application program to inform the user,
for example.
The value of R1 SHOULD correspond to at least 3
retransmissions, at the current RTO. The value of R2 SHOULD
correspond to at least 100 seconds.
An attempt to open a TCP connection could fail with
excessive retransmissions of the SYN segment or by receipt
of a RST segment or an ICMP Port Unreachable. SYN
retransmissions MUST be handled in the general way just
described for data retransmissions, including notification
of the application layer.
However, the values of R1 and R2 may be different for SYN
and data segments. In particular, R2 for a SYN segment MUST
be set large enough to provide retransmission of the segment
for at least 3 minutes. The application can close the
connection (i.e., give up on the open attempt) sooner, of
course.
DISCUSSION:
Some Internet paths have significant setup times, and
the number of such paths is likely to increase in the
future.
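
The R1/R2 procedure of this section can be sketched as a threshold
check that runs after each retransmission. The names below are
hypothetical. Measuring R1 as a retransmission count and R2 in seconds
is an assumption made for illustration; the text allows either unit
for both thresholds.

```python
from enum import Enum


class Action(Enum):
    NONE = 0
    NEGATIVE_ADVICE = 1   # R1 reached: advise IP and inform the application
    CLOSE = 2             # R2 reached: close the connection


R1_RETX = 3             # "at least 3 retransmissions"
R2_SECONDS = 100.0      # "at least 100 seconds" for data segments
R2_SYN_SECONDS = 180.0  # "at least 3 minutes" for SYN segments


def check_thresholds(retx_count: int, elapsed: float,
                     r2_seconds: float = R2_SECONDS) -> Action:
    """Decide what to do after a retransmission of the same segment."""
    if elapsed >= r2_seconds:
        return Action.CLOSE
    if retx_count >= R1_RETX:
        return Action.NEGATIVE_ADVICE
    return Action.NONE
```

An interactive application that wants R2 to be "infinity" could pass
r2_seconds=float("inf"), so that the connection is never closed by
this check.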
4.2.3.6 TCP Keep-Alives
Implementors MAY include "keep-alives" in their TCP
implementations, although this practice is not universally
accepted. If keep-alives are included, the application MUST
be able to turn them on or off for each TCP connection, and
they MUST default to off.
Keep-alive packets MUST only be sent when no data or
acknowledgement packets have been received for the
connection within an interval. This interval MUST be
configurable and MUST default to no less than two hours.
It is extremely important to remember that ACK segments that
contain no data are not reliably transmitted by TCP.
Consequently, if a keep-alive mechanism is implemented it
MUST NOT interpret failure to respond to any specific probe
as a dead connection.
An implementation SHOULD send a keep-alive segment with no
data; however, it MAY be configurable to send a keep-alive
segment containing one garbage octet, for compatibility with
erroneous TCP implementations.
DISCUSSION:
A "keep-alive" mechanism periodically probes the other
end of a connection when the connection is otherwise
idle, even when there is no data to be sent. The TCP
specification does not include a keep-alive mechanism
because it could: (1) cause perfectly good connections
to break during transient Internet failures; (2)
consume unnecessary bandwidth ("if no one is using the
connection, who cares if it is still good?"); and (3)
cost money for an Internet path that charges for
packets.
Some TCP implementations, however, have included a
keep-alive mechanism. To confirm that an idle
connection is still active, these implementations send
a probe segment designed to elicit a response from the
peer TCP. Such a segment generally contains SEG.SEQ =
SND.NXT-1 and may or may not contain one garbage octet
of data. Note that on a quiet connection the prober's
SND.NXT equals the peer's RCV.NXT, so that this SEG.SEQ
will be outside the peer's receive window. Therefore,
the probe causes the receiver to
return an acknowledgment segment, confirming that the
connection is still live. If the peer has dropped the
connection due to a network partition or a crash, it
will respond with a RST instead of an acknowledgment
segment.
Unfortunately, some misbehaved TCP implementations fail
to respond to a segment with SEG.SEQ = SND.NXT-1 unless
the segment contains data. Alternatively, an
implementation could determine whether a peer responded
correctly to keep-alive packets with no garbage data
octet.
A TCP keep-alive mechanism should only be invoked in
server applications that might otherwise hang
indefinitely and consume resources unnecessarily if a
client crashes or aborts a connection during a network
failure.
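
The timing rules for keep-alives can be sketched as a small state
machine. The class is hypothetical, and the probe spacing and probe
count are assumed values that this section does not specify. The
sketch reflects three requirements from the text: keep-alives default
to off, probing starts only after the idle interval, and a single
unanswered probe is never taken to mean that the connection is dead.

```python
KEEPALIVE_IDLE = 2 * 60 * 60   # default interval: no less than two hours
PROBE_INTERVAL = 75            # assumed spacing between unanswered probes
MAX_PROBES = 9                 # assumed number of probes before giving up


class KeepAlive:
    def __init__(self, enabled: bool = False, idle: float = KEEPALIVE_IDLE):
        self.enabled = enabled     # MUST default to off
        self.idle = idle           # MUST be configurable
        self.last_heard = 0.0      # time of last data or ACK from the peer
        self.unanswered = 0        # probes sent since the peer was last heard

    def on_segment_received(self, now: float) -> None:
        self.last_heard = now
        self.unanswered = 0

    def poll(self, now: float):
        """Return "probe", "dead", or None."""
        if not self.enabled:
            return None
        due = self.last_heard + self.idle + self.unanswered * PROBE_INTERVAL
        if now < due:
            return None
        if self.unanswered >= MAX_PROBES:
            return "dead"
        self.unanswered += 1
        return "probe"
```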
4.2.3.7 TCP Multihoming

If an application on a multihomed host does not specify the
local IP address when actively opening a TCP connection,
then the TCP MUST ask the IP layer to select a local IP
address before sending the (first) SYN. See the function
GET_SRCADDR() in Section 3.4.

At all other times, a previous segment has either been sent
or received on this connection, and TCP MUST use the same
local address that was used in those previous segments.

4.2.3.8 IP Options

When received options are passed up to TCP from the IP
layer, TCP MUST ignore options that it does not understand.

A TCP MAY support the Time Stamp and Record Route options.

An application MUST be able to specify a source route when
it actively opens a TCP connection, and this MUST take
precedence over a source route received in a datagram.

When a TCP connection is OPENed passively and a packet
arrives with a completed IP Source Route option (containing
a return route), TCP MUST save the return route and use it
for all segments sent on this connection. If a different
source route arrives in a later segment, the later
definition SHOULD override the earlier one.

4.2.3.9 ICMP Messages

TCP MUST act on an ICMP error message passed up from the IP
layer, directing it to the connection that created the
error. The necessary demultiplexing information can be
found in the IP header contained within the ICMP message.

o Source Quench
TCP MUST react to a Source Quench by slowing
transmission on the connection. The RECOMMENDED
procedure is for a Source Quench to trigger a "slow
start," as if a retransmission timeout had occurred.

o Destination Unreachable -- codes 0, 1, 5
Since these Unreachable messages indicate soft error
conditions, TCP MUST NOT abort the connection, and it
SHOULD make the information available to the
application.
DISCUSSION:
TCP could report the soft error condition directly
to the application layer with an upcall to the
ERROR_REPORT routine, or it could merely note the
message and report it to the application only when
and if the TCP connection times out.
o Destination Unreachable -- codes 2-4
These are hard error conditions, so TCP SHOULD abort
the connection.
o Time Exceeded -- codes 0, 1
This should be handled the same way as Destination
Unreachable codes 0, 1, 5 (see above).
o Parameter Problem
This should be handled the same way as Destination
Unreachable codes 0, 1, 5 (see above).
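
The handling of the ICMP messages listed above can be summarized as a
dispatch on type and code. The function name and the returned action
strings are hypothetical; the numeric ICMP type values are those
assigned in RFC-792. Returning "ignore" for any other message is an
assumption of this sketch.

```python
# ICMP type numbers from RFC-792.
DEST_UNREACH = 3
SOURCE_QUENCH = 4
TIME_EXCEEDED = 11
PARAM_PROBLEM = 12


def icmp_action(icmp_type: int, code: int) -> str:
    """Map an ICMP error to the action TCP takes on the affected connection."""
    if icmp_type == SOURCE_QUENCH:
        return "slow_start"              # RECOMMENDED reaction
    if icmp_type == DEST_UNREACH:
        if code in (0, 1, 5):
            return "report_soft_error"   # MUST NOT abort
        if code in (2, 3, 4):
            return "abort"               # hard error
    if icmp_type == TIME_EXCEEDED and code in (0, 1):
        return "report_soft_error"
    if icmp_type == PARAM_PROBLEM:
        return "report_soft_error"
    return "ignore"                      # assumption: not covered by the text
```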
4.2.3.10 Remote Address Validation
A TCP implementation MUST reject as an error a local OPEN
call for an invalid remote IP address (e.g., a broadcast or
multicast address).
An incoming SYN with an invalid source address MUST be
ignored either by TCP or by the IP layer (see Section
3.2.1.3).
A TCP implementation MUST silently discard an incoming SYN
segment that is addressed to a broadcast or multicast
address.
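
A check for the invalid addresses named above might look like the
following sketch. The function name is hypothetical. Directed
broadcast addresses can only be recognized for networks the host knows
about, so they are supplied by the caller here; that parameter is an
assumption of this example, and only IPv4 is considered.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def is_invalid_remote(addr: str, local_networks=()) -> bool:
    """Return True if addr is a multicast or broadcast IPv4 address."""
    ip = ipaddress.IPv4Address(addr)
    if ip.is_multicast or ip == LIMITED_BROADCAST:
        return True
    # Directed broadcast on any network the caller knows to be attached.
    return any(ip == net.broadcast_address for net in local_networks)
```

A local OPEN call would be rejected when this returns True for the
remote address, and an incoming SYN would be silently discarded when
it returns True for the destination address.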
4.2.3.11 TCP Traffic Patterns
IMPLEMENTATION:
The TCP protocol specification [TCP:1] gives the
implementor much freedom in designing the algorithms
that control the message flow over the connection --
packetizing, managing the window, sending
acknowledgments, etc. These design decisions are
difficult because a TCP must adapt to a wide range of
traffic patterns. Experience has shown that a TCP
implementor needs to verify the design on two extreme
traffic patterns:
o Single-character Segments
Even if the sender is using the Nagle Algorithm,
when a TCP connection carries remote login traffic
across a low-delay LAN the receiver will generally
get a stream of single-character segments. If
remote terminal echo mode is in effect, the
receiver's system will generally echo each
character as it is received.
o Bulk Transfer
When TCP is used for bulk transfer, the data
stream should be made up (almost) entirely of
segments of the size of the effective MSS.
Although TCP uses a sequence number space with
byte (octet) granularity, in bulk-transfer mode
its operation should be as if TCP used a sequence
space that counted only segments.
Experience has furthermore shown that a single TCP can
effectively and efficiently handle these two extremes.
The most important tool for verifying a new TCP
implementation is a packet trace program. There is a
large volume of experience showing the importance of
tracing a variety of traffic patterns with other TCP
implementations and studying the results carefully.
4.2.3.12 Efficiency
IMPLEMENTATION:
Extensive experience has led to the following
suggestions for efficient implementation of TCP:
(a) Don't Copy Data
In bulk data transfer, the primary CPU-intensive
tasks are copying data from one place to another
and checksumming the data. It is vital to
minimize the number of copies of TCP data. Since
the ultimate speed limitation may be fetching data
across the memory bus, it may be useful to combine
the copy with checksumming, doing both with a
single memory fetch.
(b) Hand-Craft the Checksum Routine
A good TCP checksumming routine is typically two
to five times faster than a simple and direct
implementation of the definition. Great care and
clever coding are often required and advisable to
make the checksumming code "blazing fast". See
[TCP:10].
(c) Code for the Common Case
TCP protocol processing can be complicated, but
for most segments there are only a few simple
decisions to be made. Per-segment processing will
be greatly speeded up by coding the main line to
minimize the number of decisions in the most
common case.
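
For reference, the "simple and direct implementation of the
definition" mentioned in (b) can be sketched as below. This is the
slow baseline, not the hand-crafted routine the text recommends. It
computes the 16-bit one's complement sum over the given octets only;
the TCP pseudo-header is assumed to have been prepended by the caller.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"              # pad an odd-length buffer with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF
```

A receiver can verify a segment by running the same function over the
data with the checksum field included; a result of zero indicates that
the checksum matches.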
4.2.4 TCP/APPLICATION LAYER INTERFACE
4.2.4.1 Asynchronous Reports
There MUST be a mechanism for reporting soft TCP error
conditions to the application. Generically, we assume this
takes the form of an application-supplied ERROR_REPORT
routine that may be upcalled [INTRO:7] asynchronously from
the transport layer:
ERROR_REPORT(local connection name, reason, subreason)
The precise encoding of the reason and subreason parameters
is not specified here. However, the conditions that are
reported asynchronously to the application MUST include:
* ICMP error message arrived (see 4.2.3.9)
* Excessive retransmissions (see 4.2.3.5)
* Urgent pointer advance (see 4.2.2.4).
However, an application program that does not want to
receive such ERROR_REPORT calls SHOULD be able to
effectively disable these calls.
DISCUSSION:
These error reports generally reflect soft errors that
can be ignored without harm by many applications. It
has been suggested that these error report calls should
default to "disabled," but this is not required.
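
The upcall mechanism can be sketched as an application-supplied
callback held by the connection. The class, its methods, and the
reason strings are hypothetical; only the three ERROR_REPORT
parameters come from the text.

```python
class Connection:
    def __init__(self, name, error_report=None):
        self.name = name                   # local connection name
        self.error_report = error_report   # application-supplied routine
        self.reports_enabled = True

    def disable_reports(self) -> None:
        """Let the application turn off asynchronous reports."""
        self.reports_enabled = False

    def report(self, reason, subreason=None) -> None:
        """Upcall ERROR_REPORT(local connection name, reason, subreason)."""
        if self.reports_enabled and self.error_report is not None:
            self.error_report(self.name, reason, subreason)
```

The transport layer would call report() when an ICMP error arrives,
when retransmissions become excessive, or when the urgent pointer
advances.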
4.2.4.2 Type-of-Service
The application layer MUST be able to specify the Type-of-
Service (TOS) for segments that are sent on a connection.
It is not required, but the application SHOULD be able to
change the TOS during the connection lifetime. TCP SHOULD
pass the current TOS value without change to the IP layer,
when it sends segments on the connection.
The TOS will be specified independently in each direction on
the connection, so that the receiver application will
specify the TOS used for ACK segments.
TCP MAY pass the most recently received TOS up to the
application.
DISCUSSION:
Some applications (e.g., SMTP) change the nature of
their communication during the lifetime of a
connection, and therefore would like to change the TOS
specification.
Note also that the OPEN call specified in RFC-793
includes a parameter ("options") in which the caller
can specify IP options such as source route, record
route, or timestamp.
4.2.4.3 Flush Call
Some TCP implementations have included a FLUSH call, which
will empty the TCP send queue of any data for which the user
has issued SEND calls but which is still to the right of the
current send window. That is, it flushes as much queued
send data as possible without losing sequence number
synchronization. This is useful for implementing the "abort
output" function of Telnet.
4.2.4.4 Multihoming

The user interface outlined in sections 2.7 and 3.8 of
RFC-793 needs to be extended for multihoming. The OPEN call
MUST have an optional parameter:

OPEN( ... [local IP address,] ... )

to allow the specification of the local IP address.

DISCUSSION:
Some TCP-based applications need to specify the local
IP address to be used to open a particular connection;
FTP is an example.

IMPLEMENTATION:
A passive OPEN call with a specified "local IP address"
parameter will await an incoming connection request to
that address. If the parameter is unspecified, a
passive OPEN will await an incoming connection request
to any local IP address, and then bind the local IP
address of the connection to the particular address
that is used.

For an active OPEN call, a specified "local IP address"
parameter will be used for opening the connection. If
the parameter is unspecified, the networking software
will choose an appropriate local IP address (see
Section 3.3.4.2) for the connection.

4.2.5 TCP REQUIREMENT SUMMARY

                                                 |        | | | |S| |
                                                 |        | | | |H| |F
                                                 |        | | | |O|M|o
                                                 |        | |S| |U|U|o
                                                 |        | |H| |L|S|t
                                                 |        |M|O| |D|T|n
                                                 |        |U|U|M| | |o
                                                 |        |S|L|A|N|N|t
                                                 |        |T|D|Y|O|O|t
FEATURE                                          |SECTION | | | |T|T|e
-------------------------------------------------|--------|-|-|-|-|-|--
                                                 |        | | | | | |
Push flag                                        |        | | | | | |
  Aggregate or queue un-pushed data              |4.2.2.2 | | |x| | |
  Sender collapse successive PSH flags           |4.2.2.2 | |x| | | |
  SEND call can specify PUSH                     |4.2.2.2 | | |x| | |
    If cannot: sender buffer indefinitely        |4.2.2.2 | | | | |x|
    If cannot: PSH last segment                  |4.2.2.2 |x| | | | |
  Notify receiving ALP of PSH                    |4.2.2.2 | | |x| | |1
  Send max size segment when possible            |4.2.2.2 | |x| | | |
                                                 |        | | | | | |
Window                                           |        | | | | | |
  Treat as unsigned number                       |4.2.2.3 |x| | | | |
  Handle as 32-bit number                        |4.2.2.3 | |x| | | |
  Shrink window from right                       |4.2.2.16| | | |x| |
  Robust against shrinking window                |4.2.2.16|x| | | | |
  Receiver's window closed indefinitely          |4.2.2.17| | |x| | |
  Sender probe zero window                       |4.2.2.17|x| | | | |
    First probe after RTO                        |4.2.2.17| |x| | | |
    Exponential backoff                          |4.2.2.17| |x| | | |
  Allow window stay zero indefinitely            |4.2.2.17|x| | | | |
  Sender timeout OK conn with zero wind          |4.2.2.17| | | | |x|
                                                 |        | | | | | |
Urgent Data                                      |        | | | | | |
  Pointer points to last octet                   |4.2.2.4 |x| | | | |
  Arbitrary length urgent data sequence          |4.2.2.4 |x| | | | |
  Inform ALP asynchronously of urgent data       |4.2.2.4 |x| | | | |1
  ALP can learn if/how much urgent data Q'd      |4.2.2.4 |x| | | | |1
                                                 |        | | | | | |
TCP Options                                      |        | | | | | |
  Receive TCP option in any segment              |4.2.2.5 |x| | | | |
  Ignore unsupported options                     |4.2.2.5 |x| | | | |
  Cope with illegal option length                |4.2.2.5 |x| | | | |
  Implement sending & receiving MSS option       |4.2.2.6 |x| | | | |
  Send MSS option unless 536                     |4.2.2.6 | |x| | | |
  Send MSS option always                         |4.2.2.6 | | |x| | |
  Send-MSS default is 536                        |4.2.2.6 |x| | | | |
  Calculate effective send seg size              |4.2.2.6 |x| | | | |
                                                 |        | | | | | |
TCP Checksums                                    |        | | | | | |
  Sender compute checksum                        |4.2.2.7 |x| | | | |
  Receiver check checksum                        |4.2.2.7 |x| | | | |
                                                 |        | | | | | |
Use clock-driven ISN selection                   |4.2.2.9 |x| | | | |
                                                 |        | | | | | |
Opening Connections                              |        | | | | | |
  Support simultaneous open attempts             |4.2.2.10|x| | | | |
  SYN-RCVD remembers last state                  |4.2.2.11|x| | | | |
  Passive Open call interfere with others        |4.2.2.18| | | | |x|
  Function: simultan. LISTENs for same port      |4.2.2.18|x| | | | |
  Ask IP for src address for SYN if necc.        |4.2.3.7 |x| | | | |
    Otherwise, use local addr of conn.           |4.2.3.7 |x| | | | |
  OPEN to broadcast/multicast IP Address         |4.2.3.10| | | | |x|
  Silently discard seg to bcast/mcast addr       |4.2.3.10|x| | | | |
                                                 |        | | | | | |
Closing Connections                              |        | | | | | |
  RST can contain data                           |4.2.2.12| |x| | | |
  Inform application of aborted conn             |4.2.2.13|x| | | | |
  Half-duplex close connections                  |4.2.2.13| | |x| | |
    Send RST to indicate data lost               |4.2.2.13| |x| | | |
  In TIME-WAIT state for 2xMSL seconds           |4.2.2.13|x| | | | |
    Accept SYN from TIME-WAIT state              |4.2.2.13| | |x| | |
                                                 |        | | | | | |
Retransmissions                                  |        | | | | | |
  Jacobson Slow Start algorithm                  |4.2.2.15|x| | | | |
  Jacobson Congestion-Avoidance algorithm        |4.2.2.15|x| | | | |
  Retransmit with same IP ident                  |4.2.2.15| | |x| | |
  Karn's algorithm                               |4.2.3.1 |x| | | | |
  Jacobson's RTO estimation alg.                 |4.2.3.1 |x| | | | |
  Exponential backoff                            |4.2.3.1 |x| | | | |
  SYN RTO calc same as data                      |4.2.3.1 | |x| | | |
  Recommended initial values and bounds          |4.2.3.1 | |x| | | |
                                                 |        | | | | | |
Generating ACK's:                                |        | | | | | |
  Queue out-of-order segments                    |4.2.2.20| |x| | | |
  Process all Q'd before send ACK                |4.2.2.20|x| | | | |
  Send ACK for out-of-order segment              |4.2.2.21| | |x| | |
  Delayed ACK's                                  |4.2.3.2 | |x| | | |
    Delay < 0.5 seconds                          |4.2.3.2 |x| | | | |
    Every 2nd full-sized segment ACK'd           |4.2.3.2 |x| | | | |
  Receiver SWS-Avoidance Algorithm               |4.2.3.3 |x| | | | |
                                                 |        | | | | | |
Sending data                                     |        | | | | | |
  Configurable TTL                               |4.2.2.19|x| | | | |
  Sender SWS-Avoidance Algorithm                 |4.2.3.4 |x| | | | |
  Nagle algorithm                                |4.2.3.4 | |x| | | |
    Application can disable Nagle algorithm      |4.2.3.4 |x| | | | |
                                                 |        | | | | | |
Connection Failures:                             |        | | | | | |
  Negative advice to IP on R1 retxs              |4.2.3.5 |x| | | | |
  Close connection on R2 retxs                   |4.2.3.5 |x| | | | |
  ALP can set R2                                 |4.2.3.5 |x| | | | |1
  Inform ALP of R1<=retxs<R2                     |4.2.3.5 | |x| | | |1
  Recommended values for R1, R2                  |4.2.3.5 | |x| | | |
  Same mechanism for SYNs                        |4.2.3.5 |x| | | | |
    R2 at least 3 minutes for SYN                |4.2.3.5 |x| | | | |
                                                 |        | | | | | |
Send Keep-alive Packets:                         |4.2.3.6 | | |x| | |
  - Application can request                      |4.2.3.6 |x| | | | |
  - Default is "off"                             |4.2.3.6 |x| | | | |
  - Only send if idle for interval               |4.2.3.6 |x| | | | |
  - Interval configurable                        |4.2.3.6 |x| | | | |
  - Default at least 2 hrs.                      |4.2.3.6 |x| | | | |
  - Tolerant of lost ACK's                       |4.2.3.6 |x| | | | |
                                                 |        | | | | | |
IP Options                                       |        | | | | | |
  Ignore options TCP doesn't understand          |4.2.3.8 |x| | | | |
  Time Stamp support                             |4.2.3.8 | | |x| | |
  Record Route support                           |4.2.3.8 | | |x| | |
  Source Route:                                  |        | | | | | |
    ALP can specify                              |4.2.3.8 |x| | | | |1
      Overrides src rt in datagram               |4.2.3.8 |x| | | | |
    Build return route from src rt               |4.2.3.8 |x| | | | |
    Later src route overrides                    |4.2.3.8 | |x| | | |
                                                 |        | | | | | |
Receiving ICMP Messages from IP                  |4.2.3.9 |x| | | | |
  Dest. Unreach (0,1,5) => inform ALP            |4.2.3.9 | |x| | | |
  Dest. Unreach (0,1,5) => abort conn            |4.2.3.9 | | | | |x|
  Dest. Unreach (2-4) => abort conn              |4.2.3.9 | |x| | | |
  Source Quench => slow start                    |4.2.3.9 | |x| | | |
  Time Exceeded => tell ALP, don't abort         |4.2.3.9 | |x| | | |
  Param Problem => tell ALP, don't abort         |4.2.3.9 | |x| | | |
                                                 |        | | | | | |
Address Validation                               |        | | | | | |
  Reject OPEN call to invalid IP address         |4.2.3.10|x| | | | |
  Reject SYN from invalid IP address             |4.2.3.10|x| | | | |
  Silently discard SYN to bcast/mcast addr       |4.2.3.10|x| | | | |
                                                 |        | | | | | |
TCP/ALP Interface Services                       |        | | | | | |
  Error Report mechanism                         |4.2.4.1 |x| | | | |
  ALP can disable Error Report Routine           |4.2.4.1 | |x| | | |
  ALP can specify TOS for sending                |4.2.4.2 |x| | | | |
    Passed unchanged to IP                       |4.2.4.2 | |x| | | |
  ALP can change TOS during connection           |4.2.4.2 | |x| | | |
  Pass received TOS up to ALP                    |4.2.4.2 | | |x| | |
  FLUSH call                                     |4.2.4.3 | | |x| | |
  Optional local IP addr parm. in OPEN           |4.2.4.4 |x| | | | |
-------------------------------------------------|--------|-|-|-|-|-|--

FOOTNOTES:
(1) "ALP" means Application-Layer program.
5. REFERENCES
INTRODUCTORY REFERENCES
[INTRO:1] "Requirements for Internet Hosts -- Application and Support,"
IETF Host Requirements Working Group, R. Braden, Ed., RFC-1123,
October 1989.
[INTRO:2] "Requirements for Internet Gateways," R. Braden and J.
Postel, RFC-1009, June 1987.
[INTRO:3] "DDN Protocol Handbook," NIC-50004, NIC-50005, NIC-50006,
(three volumes), SRI International, December 1985.
[INTRO:4] "Official Internet Protocols," J. Reynolds and J. Postel,
RFC-1011, May 1987.
This document is republished periodically with new RFC numbers; the
latest version must be used.
[INTRO:5] "Protocol Document Order Information," O. Jacobsen and J.
Postel, RFC-980, March 1986.
[INTRO:6] "Assigned Numbers," J. Reynolds and J. Postel, RFC-1010, May
1987.
This document is republished periodically with new RFC numbers; the
latest version must be used.
[INTRO:7] "Modularity and Efficiency in Protocol Implementations," D.
Clark, RFC-817, July 1982.
[INTRO:8] "The Structuring of Systems Using Upcalls," D. Clark, 10th ACM
SOSP, Orcas Island, Washington, December 1985.
Secondary References:
[INTRO:9] "A Protocol for Packet Network Intercommunication," V. Cerf
and R. Kahn, IEEE Transactions on Communication, May 1974.
[INTRO:10] "The ARPA Internet Protocol," J. Postel, C. Sunshine, and D.
Cohen, Computer Networks, Vol. 5, No. 4, July 1981.
[INTRO:11] "The DARPA Internet Protocol Suite," B. Leiner, J. Postel,
R. Cole and D. Mills, Proceedings INFOCOM 85, IEEE, Washington DC,
March 1985. Also in: IEEE Communications Magazine, March 1985.
Also available as ISI-RS-85-153.
[INTRO:12] "Final Text of DIS8473, Protocol for Providing the
Connectionless Mode Network Service," ANSI, published as RFC-994,
March 1986.
[INTRO:13] "End System to Intermediate System Routing Exchange
Protocol," ANSI X3S3.3, published as RFC-995, April 1986.
LINK LAYER REFERENCES
[LINK:1] "Trailer Encapsulations," S. Leffler and M. Karels, RFC-893,
April 1984.
[LINK:2] "An Ethernet Address Resolution Protocol," D. Plummer, RFC-826,
November 1982.
[LINK:3] "A Standard for the Transmission of IP Datagrams over Ethernet
Networks," C. Hornig, RFC-894, April 1984.
[LINK:4] "A Standard for the Transmission of IP Datagrams over IEEE 802
Networks," J. Postel and J. Reynolds, RFC-1042, February 1988.
This RFC contains a great deal of information of importance to
Internet implementers planning to use IEEE 802 networks.
IP LAYER REFERENCES
[IP:1] "Internet Protocol (IP)," J. Postel, RFC-791, September 1981.
[IP:2] "Internet Control Message Protocol (ICMP)," J. Postel, RFC-792,
September 1981.
[IP:3] "Internet Standard Subnetting Procedure," J. Mogul and J. Postel,
RFC-950, August 1985.
[IP:4] "Host Extensions for IP Multicasting," S. Deering, RFC-1112,
August 1989.
[IP:5] "Military Standard Internet Protocol," MIL-STD-1777, Department
of Defense, August 1983.
This specification, as amended by RFC-963, is intended to describe
the Internet Protocol but has some serious omissions (e.g., the
mandatory subnet extension [IP:3] and the optional multicasting
extension [IP:4]). It is also out of date. If there is a
conflict, RFC-791, RFC-792, and RFC-950 must be taken as
authoritative, while the present document is authoritative over
all.
[IP:6] "Some Problems with the Specification of the Military Standard
Internet Protocol," D. Sidhu, RFC-963, November 1985.
[IP:7] "The TCP Maximum Segment Size and Related Topics," J. Postel,
RFC-879, November 1983.
Discusses and clarifies the relationship between the TCP Maximum
Segment Size option and the IP datagram size.
[IP:8] "Internet Protocol Security Options," B. Schofield, RFC-1108,
October 1989.
[IP:9] "Fragmentation Considered Harmful," C. Kent and J. Mogul, ACM
SIGCOMM-87, August 1987. Published as ACM Comp Comm Review, Vol.
17, no. 5.
This useful paper discusses the problems created by Internet
fragmentation and presents alternative solutions.
[IP:10] "IP Datagram Reassembly Algorithms," D. Clark, RFC-815, July
1982.
This and the following paper should be read by every implementor.
[IP:11] "Fault Isolation and Recovery," D. Clark, RFC-816, July 1982.
SECONDARY IP REFERENCES:
[IP:12] "Broadcasting Internet Datagrams in the Presence of Subnets," J.
Mogul, RFC-922, October 1984.
[IP:13] "Name, Addresses, Ports, and Routes," D. Clark, RFC-814, July
1982.
[IP:14] "Something a Host Could Do with Source Quench: The Source Quench
Introduced Delay (SQUID)," W. Prue and J. Postel, RFC-1016, July
1987.
This RFC first described directed broadcast addresses. However,
the bulk of the RFC is concerned with gateways, not hosts.
UDP REFERENCES:
[UDP:1] "User Datagram Protocol," J. Postel, RFC-768, August 1980.
TCP REFERENCES:
[TCP:1] "Transmission Control Protocol," J. Postel, RFC-793, September
1981.
[TCP:2] "Transmission Control Protocol," MIL-STD-1778, US Department of
Defense, August 1984.
This specification as amended by RFC-964 is intended to describe the
same protocol as RFC-793 [TCP:1]. If there is a conflict, RFC-793
takes precedence, and the present document is authoritative over
both.
[TCP:3] "Some Problems with the Specification of the Military Standard
Transmission Control Protocol," D. Sidhu and T. Blumer, RFC-964,
November 1985.
[TCP:4] "The TCP Maximum Segment Size and Related Topics," J. Postel,
RFC-879, November 1983.
[TCP:5] "Window and Acknowledgment Strategy in TCP," D. Clark, RFC-813,
July 1982.
[TCP:6] "Round Trip Time Estimation," P. Karn & C. Partridge, ACM
SIGCOMM-87, August 1987.
[TCP:7] "Congestion Avoidance and Control," V. Jacobson, ACM SIGCOMM-88,
August 1988.
SECONDARY TCP REFERENCES:
[TCP:8] "Modularity and Efficiency in Protocol Implementation," D.
Clark, RFC-817, July 1982.
[TCP:9] "Congestion Control in IP/TCP," J. Nagle, RFC-896, January 1984.
[TCP:10] "Computing the Internet Checksum," R. Braden, D. Borman, and C.
Partridge, RFC-1071, September 1988.
[TCP:11] "TCP Extensions for Long-Delay Paths," V. Jacobson & R. Braden,
RFC-1072, October 1988.

Security Considerations

There are many security issues in the communication layers of host
software, but a full discussion is beyond the scope of this RFC.

The Internet architecture generally provides little protection
against spoofing of IP source addresses, so any security mechanism
that is based upon verifying the IP source address of a datagram
should be treated with suspicion. However, in restricted
environments some source-address checking may be possible. For
example, there might be a secure LAN whose gateway to the rest of the
Internet discarded any incoming datagram with a source address that
spoofed the LAN address. In this case, a host on the LAN could use
the source address to test for local vs. remote source. This problem
is complicated by source routing, and some have suggested that
source-routed datagram forwarding by hosts (see Section 3.3.5) should
be outlawed for security reasons.

Security-related issues are mentioned in sections concerning the IP
Security option (Section 3.2.1.8), the ICMP Parameter Problem message
(Section 3.2.2.5), IP options in UDP datagrams (Section 4.1.3.2), and
reserved TCP ports (Section 4.2.2.1).

Author's Address

Robert Braden
USC/Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292-6695

Phone: (213) 822 1511

EMail: Braden@ISI.EDU