Network Working Group                                          A. Terzis
Request for Comments: 2745                                          UCLA
Category: Standards Track                                      B. Braden
                                                                     ISI
                                                              S. Vincent
                                                           Cisco Systems
                                                                L. Zhang
                                                                    UCLA
                                                            January 2000


                        RSVP Diagnostic Messages

Status of this Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2000). All Rights Reserved.
Abstract

This document specifies the RSVP diagnostic facility, which allows a user to collect information about the RSVP state along a path. This specification describes the functionality, diagnostic message formats, and processing rules.

In the basic RSVP protocol [RSVP], error messages are the only means for an end host to receive feedback regarding a failure in setting up either path state or reservation state. An error message carries back only the information from the failed point, without any information about the state at other hops before or after the failure. In the absence of failures, a host receives no feedback regarding the details of a reservation that has been put in place, such as whether, or where, or how, its own reservation request is being merged with that of others. Such missing information can be highly desirable for debugging purposes, or for network resource management in general.
This document specifies the RSVP diagnostic facility, which is designed to fill this information gap. The diagnostic facility can be used to collect and report RSVP state information along the path from a receiver to a specific sender. It uses Diagnostic messages that are independent of other RSVP control messages and produce no side-effects; that is, they do not change any RSVP state at either routers or hosts. Similarly, they provide not an error report but rather a collection of requested RSVP state information.

The RSVP diagnostic facility was designed with the following goals:

-  To collect RSVP state information from every RSVP-capable hop along a path defined by path state, either for an existing reservation or before a reservation request is made. More specifically, we want to be able to collect information about flowspecs, refresh timer values, and reservation merging at each hop along the path.

-  To collect the IP hop count across each non-RSVP cloud.

-  To avoid diagnostic packet implosion or explosion.

The following is specifically identified as a non-goal:

-  Checking the resource availability along a path. Such functionality may be useful for future reservation requests, but it would require modifications to existing admission control modules that are beyond the scope of RSVP.
When the DREQ packet reaches the ending node, the message type is changed to Diagnostic Reply (DREP) and the completed response is sent to the original requester node. Partial responses may also be returned before the DREQ packet reaches the ending node if an error condition along the path, such as "no path state", prevents further forwarding of the DREQ packet. To avoid packet implosion or explosion, all diagnostic packets are forwarded via unicast only.

Thus, there are generally three nodes (hosts and/or routers) involved in performing the diagnostic function: the requester node, the starting node, and the ending node, as shown in Figure 1. It is possible that the client invoking the diagnosis function may reside directly on the starting node, in which case the first two nodes are the same. The starting node is named "LAST-HOP", meaning the last hop of the path segment to be diagnosed. The LAST-HOP node can be either a receiver node or an intermediate node along the path. The ending node is usually the specified sender host. However, the client can limit the length of the path segment to be diagnosed by specifying a hop-count limit in the DREQ message.

                LAST-HOP                    Ending
   Receiver       node                       node          Sender
    __            __         __            __              __
   |  |---------|  |------>|  |--> ...-->|  |--> ...---->|  |
   |__|         |__| DREQ  |__|   DREQ   |__|    DREQ    |__|
                 ^  .                      |
                 |    .                    |
                 | DREQ . DREP             | DREP
                 |        .                |
                _|_  DREP   V              V
   Requester   |   | <------------------------------------
   (client)    |___|

                           Figure 1

DREP packets can be unicast from the ending node back to the requester either directly or hop-by-hop along the reverse of the path taken by the DREQ message to the LAST-HOP, and thence to the requester. The direct return is faster and more efficient, but the hop-by-hop reverse-path route may be the only choice if the packets have to cross firewalls. Hop-by-hop return is accomplished using an optional ROUTE object, which is built incrementally to contain a list of node addresses that the DREQ packet has passed through.
The ROUTE object is then used in reverse as a source route to forward the DREP hop-by-hop back to the LAST-HOP node.
A DREQ message always consists of a single unfragmented IP datagram. On the other hand, one DREQ message can generate multiple DREP packets, each containing a fragment of the total DREP message. When the path consists of many hops, the accumulated responses can exceed the MTU size before the DREQ reaches the ending node; thus, the reply has to be fragmented. Relying on IP fragmentation and reassembly, however, can be problematic, especially when DREP messages are returned to the requester hop-by-hop, in which case fragmentation/reassembly would have to be performed at every hop.

To avoid such excessive overhead, we let the requester define a default path MTU size that is carried in every DREQ packet. If an intermediate node finds that the default MTU size is bigger than the MTU of the incoming interface, it reduces the default MTU size to the MTU size of the incoming interface. If an intermediate node detects that a DREQ packet size is larger than the default MTU size, it returns to the requester (in either manner described above) a DREP fragment containing accumulated responses. It then removes these responses from the DREQ and continues to forward it. The requester node can reassemble the resulting DREP fragments into a complete DREP message.

When discussing diagnostic packet handling, this document uses direction terminology that is consistent with the RSVP functional specification [RSVP], relative to the direction of data packet flow. Thus, a DREQ packet enters a node through an "outgoing interface" and is forwarded towards the sender through an "incoming interface", because DREQ packets travel in the reverse direction to the data flow.

Notice that DREQ packets can be forwarded only after the RSVP path state has been set up. If no path state exists, one may resort to the traceroute or mtrace facility to examine whether the unicast/multicast routing is working correctly.
   +-----------------------------------+
   |        RSVP Common Header         |
   +-----------------------------------+
   |          Session object           |
   +-----------------------------------+
   |     Next-Hop RSVP_HOP object      |
   +-----------------------------------+
   |         DIAGNOSTIC object         |
   +-----------------------------------+
   |   (optional) DIAG_SELECT object   |
   +-----------------------------------+
   |     (optional) ROUTE object       |
   +-----------------------------------+
   | zero or more DIAG_RESPONSE objects|
   +-----------------------------------+

The session object identifies the RSVP session for which the state information is being collected. We describe each of the other parts below.

The RSVP message common header is defined in [RSVP]. The following specific exceptions and extensions are needed for DREP and DREQ.

Type field: define:

   Type = 8: DREQ  Diagnostic Request

   Type = 9: DREP  Diagnostic Reply

RSVP length:

   If this is a DREP message and the MF flag in the DIAGNOSTIC object (see below) is set, this field indicates the length of this single DREP fragment rather than the total length of the complete DREP reply message (which cannot generally be known in advance).
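As an illustration of the Type and RSVP length fields described above, the following is a minimal sketch of packing and parsing an 8-octet RSVP common header. The field layout (version and flags, message type, checksum, Send_TTL, reserved octet, length) is taken from the RSVP functional specification [RSVP] rather than from this document, and the function names are hypothetical; checksum computation is deliberately left out.

```python
import struct

MSG_DREQ = 8        # Diagnostic Request (value defined in this document)
MSG_DREP = 9        # Diagnostic Reply (value defined in this document)
RSVP_VERSION = 1    # assumption: version number from the RSVP specification

# Assumed layout: vers/flags (1), type (1), checksum (2),
# Send_TTL (1), reserved (1), RSVP length (2), network byte order.
_HEADER = struct.Struct("!BBHBBH")


def pack_common_header(msg_type, send_ttl, length, checksum=0, flags=0):
    """Build the common header; the caller fills in the checksum later."""
    vers_flags = (RSVP_VERSION << 4) | (flags & 0x0F)
    return _HEADER.pack(vers_flags, msg_type, checksum, send_ttl, 0, length)


def unpack_common_header(data):
    """Parse the first 8 octets of an RSVP message into a dict."""
    vers_flags, msg_type, checksum, send_ttl, _reserved, length = \
        _HEADER.unpack(data[:_HEADER.size])
    return {
        "version": vers_flags >> 4,
        "flags": vers_flags & 0x0F,
        "type": msg_type,
        "checksum": checksum,
        "send_ttl": send_ttl,
        "length": length,
    }
```

For a DREP fragment with the MF flag set, the length passed here would be the length of that fragment only, as stated above.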
While the IP address is not really used during DREQ processing, for consistency with the use of the RSVP_HOP object in other RSVP messages, the IP address in the RSVP_HOP object is set to contain the address of the interface through which the DREQ was sent.
the problem. If this value is 1, the starting node and the ending node of the query will be the same. If it is zero, there is no hop limit.

RSVP-hop-count

   Records the number of RSVP hops that have been traversed so far. If the starting and ending nodes are the same, this value will be 1 in the resulting DREP message.

Fragment Offset

   Indicates where this DREP fragment belongs in the complete DREP message, measured in octets. The first fragment has offset zero. Fragment Offset is used also to determine if a DREQ message containing zero DIAG_RESPONSE objects should be processed at an RSVP capable node.

MF flag

   Flag means "more fragments". It must be set to zero (0) in all DREQ messages. It must be set to one (1) in all DREP packets that carry partial results and are returned by intermediate nodes due to the MTU limit. When the DREQ message is converted to a DREP message in the ending node, the MF flag must remain zero.

Request ID

   Identifies an individual DREQ message and the corresponding DREP message (or all the fragments of the reply message). One possible way to define the Request ID would use 16 bits to specify the ID of the process making the query and 16 bits to distinguish different queries from this process.

Path MTU

   Specifies a default MTU size in octets for DREP and DREQ messages. This value should not be smaller than the size of the "base" DREQ packet. A "base" DREQ packet is one that contains a Common Header, a Session object, a Next-Hop RSVP_HOP object, a DIAGNOSTIC object, an empty ROUTE object and a single default DIAG_RESPONSE (see below). The assumption made here is that a diagnostic packet of this size can always be forwarded without IP fragmentation.
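The Request ID layout suggested above (16 bits of process ID plus 16 bits of per-process query number) could be sketched as follows. This is only one possible encoding; the helper names and the choice to put the process ID in the high-order half are assumptions made for illustration.

```python
import itertools
import os

# Hypothetical per-process counter used to tell queries apart.
_query_counter = itertools.count(1)


def make_request_id(pid=None, seq=None):
    """Combine a 16-bit process ID and a 16-bit query number into 32 bits."""
    if pid is None:
        pid = os.getpid()
    if seq is None:
        seq = next(_query_counter)
    # Both halves are truncated to 16 bits, so large PIDs wrap around.
    return ((pid & 0xFFFF) << 16) | (seq & 0xFFFF)


def split_request_id(request_id):
    """Recover the (process ID, query number) halves of a Request ID."""
    return (request_id >> 16) & 0xFFFF, request_id & 0xFFFF
```

A client would compare the Request ID of every incoming DREP against its outstanding requests before treating the reply, or a fragment of it, as an answer.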
LAST-HOP Address

   The IP address of the LAST-HOP node. The DREQ message starts collecting information at this node and proceeds toward the sender.

SENDER_TEMPLATE object

   This IPv4/IPv6 SENDER_TEMPLATE object contains the IP address and the port of a sender for the session being diagnosed. The DREQ packet is forwarded hop-by-hop towards this address.

Requester FILTER_SPEC Object

   This IPv4/IPv6 FILTER_SPEC object contains the IP address and the port from which the request originated and to which the DREP message(s) should be sent.
Depending on the type of objects requested, a node can find the associated information in the path or reservation state stored for the session described in the SESSION object. Specifically, information for the RSVP_HOP, SENDER_TEMPLATE, SENDER_TSPEC and ADSPEC objects can be extracted from the node's path state, while information for the FLOWSPEC, FILTER_SPEC, CONFIRM, STYLE and SCOPE objects can be found in the node's reservation state (if it exists).

If the number of [Class, C-Type] pairs is odd, the last two octets of the DIAG_SELECT object must be zero. A maximum DIAG_SELECT object is one that contains the [Class, C-Type] pairs for all the RSVP objects that can be requested in a Diagnostic query.

Section 4.2 for details), but it is incremented as each hop adds its incoming interface address in the ROUTE object.
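The padding rule for the DIAG_SELECT object can be illustrated with a short sketch. It assumes that each [Class, C-Type] pair occupies two octets (one for the class number, one for the C-Type), which is what makes an odd number of pairs require two zero octets to reach a 32-bit boundary. Only the pair list is encoded here, not the object header, and the function names are hypothetical.

```python
def encode_diag_select_body(pairs):
    """Encode [Class, C-Type] pairs, padding an odd count with two zero octets."""
    body = bytearray()
    for class_num, c_type in pairs:
        if not (0 <= class_num <= 255 and 0 <= c_type <= 255):
            raise ValueError("Class and C-Type must each fit in one octet")
        body += bytes((class_num, c_type))
    if len(pairs) % 2 == 1:
        body += b"\x00\x00"          # keep the body a multiple of 4 octets
    return bytes(body)


def decode_diag_select_body(body):
    """Decode the pair list, dropping the zero padding pair if present."""
    if len(body) % 4 != 0:
        raise ValueError("body must be a multiple of 4 octets")
    pairs = [(body[i], body[i + 1]) for i in range(0, len(body), 2)]
    if pairs and pairs[-1] == (0, 0):
        pairs.pop()                  # trailing (0, 0) is padding, not a request
    return pairs
```

For example, requesting SENDER_TSPEC, FLOWSPEC and STYLE (class numbers 12, 9 and 8 in the RSVP specification; the C-Type values used in the call are illustrative) gives three pairs and therefore one padding pair: `encode_diag_select_body([(12, 2), (9, 2), (8, 1)])`.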
o  IPv6 ROUTE object: Class = 31, C-Type = 2

   The same, except RSVP Node List contains IPv6 addresses.

In a DREQ message, RSVP Node List specifies all RSVP hops between the LAST-HOP address specified in the DIAGNOSTIC object, and the last RSVP node the DREQ message has visited. In a DREP message, RSVP Node List specifies all RSVP hops between the LAST-HOP and the node that returns this DREP message.
DREQ Arrival Time

   A 32-bit NTP timestamp specifying the time the DREQ message arrived at this node. The 32-bit form of an NTP timestamp consists of the middle 32 bits of the full 64-bit form, that is, the low 16 bits of the integer part and the high 16 bits of the fractional part.

Incoming Interface Address

   Specifies the IP address of the interface on which messages from the sender are expected to arrive, or 0 if unknown.

Outgoing Interface Address

   Specifies the IP address of the interface through which the DREQ message arrived and to which messages from the given sender and for the specified session address flow, or 0 if unknown.

Previous-RSVP-Hop Router Address

   Specifies the IP address from which this node receives RSVP PATH messages for this source, or 0 if unknown. This is also the interface to which the DREQ will be forwarded.

D-TTL

   The number of IP hops this DREQ message traveled from the downstream RSVP node to the current node.

M flag

   A single-bit flag which indicates whether the reservation described by the response objects is merged with reservations from other downstream interfaces when being forwarded upstream.

R-error

   A 3-bit field that indicates error conditions at a node. Currently defined values are:

      0x00: no error
      0x01: No PATH state
      0x02: packet too big
      0x04: ROUTE object too big
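The 32-bit timestamp and the R-error bits described above can be sketched as follows. The conversion from Unix time uses the well-known 2208988800-second offset between the NTP epoch (1900) and the Unix epoch (1970); the function and class names are hypothetical.

```python
import enum

NTP_UNIX_OFFSET = 2208988800   # seconds from 1900-01-01 to 1970-01-01


def ntp32_from_unix(unix_seconds):
    """Return the middle 32 bits of the 64-bit NTP timestamp for a Unix time."""
    ntp_seconds = unix_seconds + NTP_UNIX_OFFSET
    integer = int(ntp_seconds)
    fraction = int((ntp_seconds - integer) * (1 << 32))
    ntp64 = ((integer & 0xFFFFFFFF) << 32) | (fraction & 0xFFFFFFFF)
    # Low 16 bits of the integer part, high 16 bits of the fractional part.
    return (ntp64 >> 16) & 0xFFFFFFFF


class RError(enum.IntFlag):
    """R-error values listed above; they combine as independent bits."""
    NO_PATH_STATE = 0x01
    PACKET_TOO_BIG = 0x02
    ROUTE_TOO_BIG = 0x04
```

Because only 16 bits of the integer part survive, the timestamp wraps roughly every 18 hours, so it is suited to measuring per-hop delays rather than absolute time.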
K

   The refresh timer multiple (defined in [RSVP]).

Timer value

   The local refresh timer value in seconds.

The set of response objects to be included at the end of the DIAG_RESPONSE object is determined by a DIAG_SELECT object, if one is present. If no DIAG_SELECT object is present, the response objects belong to the default list of classes:

   SENDER_TSPEC object
   FILTER_SPEC object
   FLOWSPEC object
   STYLE object

Any C-Type present in the local RSVP state will be used. These response objects may be in any order but they must all be at the end of the DIAG_RESPONSE object.

A default DIAG_RESPONSE object is one containing the default list of classes described above.

RSVPTUN].
Fragment Offset, the node should forward the DREQ packet towards the LAST-HOP without doing any of the processing mentioned below. The reason is that such conditions apply only for nodes downstream of the LAST-HOP where no information should be collected.

Processing begins when a DREQ message, DREQ_in, arrives at a node.

1. Create a new DIAG_RESPONSE object. Compute the IP hop count from the previous RSVP hop by subtracting the TTL value in the IP header from Send_TTL in the RSVP common header. Save the result in the D-TTL field of the DIAG_RESPONSE object.

2. Set the DREQ Arrival Time and the Outgoing Interface Address in the DIAG_RESPONSE object. If this node is the LAST-HOP, then the Outgoing Interface Address field in the DIAG_RESPONSE object contains the following value depending on the session being diagnosed.

   *  If the session in question is a unicast session, then the Outgoing Interface Address field contains the address of the interface LAST-HOP uses to send PATH messages and data to the receiver specified by the session address.

   *  Otherwise, if it is a multicast session and there is at least one receiver for this session, LAST-HOP should use the address of one of the local interfaces used to reach one of the receivers.

   *  Otherwise Outgoing Interface Address should be zero.

3. Increment the RSVP-hop-count field in the DIAGNOSTIC object by one.

4. If no PATH state exists for the specified session, set R-error = 0x01 (No PATH state) and go to Step 7.

5. Set the rest of the fields in the DIAG_RESPONSE object. If DREQ_in contains a DIAG_SELECT object, the response object classes are those specified in the DIAG_SELECT; otherwise, they are the default SENDER_TSPEC, FILTER_SPEC, FLOWSPEC, and STYLE objects. If no reservation state exists for the specified RSVP session, the DIAG_RESPONSE object will contain no FLOWSPEC, FILTER_SPEC or STYLE object. If neither PATH nor reservation state exists for the specified RSVP session, then no response objects will be appended to the DIAG_RESPONSE object.

6. If RSVP-hop-count is less than Max-RSVP-hops and this node is not the sender, then the DREQ is eligible for forwarding; set the Path MTU to the minimum of the Path MTU and the MTU size of the incoming interface for the sender being diagnosed.

7. If the size of DREQ_in plus the size of the new DIAG_RESPONSE object plus the size of an IP address (if a ROUTE object exists and R-error = 0) is larger than Path MTU, then the new diagnostic message will be too large to be forwarded or returned without fragmentation; set the "packet too big" (0x02) error bit in DIAG_RESPONSE and go to Step SD1 in Send_DREP (below).

8. If the "No PATH state" (0x01) error bit is set or if RSVP-hop-count is equal to Max-RSVP-hops or if this node is the sender, then the DREQ cannot be forwarded further; go to Step 10.

9. Forward the DREQ towards the sender, as follows. If a ROUTE object exists, append the "Incoming Interface Address" to the end of the ROUTE object and increment R-pointer by one. Update the Next-Hop RSVP_HOP object, append the new DIAG_RESPONSE object to the list of DIAG_RESPONSE objects, and update the message length field in the RSVP common header accordingly. Finally, recompute the checksum, forward DREQ_in to the next hop towards the sender, and return.

10. Turn the DREQ into a DREP and return it to the requester, as follows. Append the DIAG_RESPONSE object to the end of DREQ_in and update the packet length. If a ROUTE object is present in the message, decrement the R-pointer and set the target address to the last address in the ROUTE object; otherwise set the target address to the requester address. Change the Type field in the common header from DREQ to DREP. Finally, recompute the checksum, send the DREP to the target address, and return. Note that the MF bit must be off in this case.
Send_DREP:

This sequence is entered if the DREQ message augmented with the new DIAG_RESPONSE object is too large to be forwarded towards the sender or, if it is not eligible for forwarding, too large to be returned as a DREP.

SD1. Make a copy of DREQ_in and change the message type field from DREQ to DREP. Trim all DIAG_RESPONSE objects from DREQ_in and adjust the Fragment Offset. The DREP message contains the DIAG_RESPONSE objects accumulated by prior nodes.

SD2. Send the DREP message towards the requester, as follows. If a ROUTE object is present in the DREP message, decrement the R-pointer and set target address to the last address in the ROUTE object, otherwise set target address to the requester address. Set the MF bit, recompute the checksum and send the DREP message back to the target address.

SD3. If the reduced size of DREQ_in plus the size of DIAG_RESPONSE plus the size of an IP address (if a ROUTE object exists) is smaller than or equal to Path MTU, then return to Step 8 of the main DREQ processing sequence above.

SD4. If a ROUTE object exists, replace the ROUTE object in DREQ_in with an empty ROUTE object and turn on the "ROUTE object too big" (0x04) error bit in the DIAG_RESPONSE. In either case, return to Step 8 of the main DREQ processing sequence above.
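The control flow of the main sequence and of Send_DREP can be summarized in a small model. This is a sketch under several simplifying assumptions, not an implementation: objects are represented only by their sizes, the ROUTE object is a plain list of addresses, R-pointer and checksum handling are omitted, Max-RSVP-hops of zero is treated as "no limit" when testing eligibility, and Fragment Offset is assumed to count the octets of trimmed DIAG_RESPONSE objects. All class, field and parameter names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

NO_PATH_STATE, PACKET_TOO_BIG, ROUTE_TOO_BIG = 0x01, 0x02, 0x04


@dataclass
class Dreq:
    base_size: int                      # octets without ROUTE addresses/responses
    path_mtu: int
    max_rsvp_hops: int = 0              # 0 means "no hop limit"
    rsvp_hop_count: int = 0
    fragment_offset: int = 0
    route: Optional[List[str]] = None   # None models "no ROUTE object"
    responses: List[Tuple[int, int]] = field(default_factory=list)  # (size, r_error)

    def size(self, addr_size: int) -> int:
        route_octets = len(self.route) * addr_size if self.route is not None else 0
        return self.base_size + route_octets + sum(s for s, _ in self.responses)


def process_dreq(msg, resp_size, in_if_addr, in_if_mtu,
                 has_path_state=True, is_sender=False, addr_size=4):
    """Return (action, fragments): action is "forward" or "reply"; fragments
    holds the response lists sent back early as DREP fragments (MF set)."""
    fragments = []
    r_error = 0
    msg.rsvp_hop_count += 1                                   # step 3
    at_limit = msg.max_rsvp_hops != 0 and msg.rsvp_hop_count >= msg.max_rsvp_hops
    if not has_path_state:                                    # step 4
        r_error |= NO_PATH_STATE
    elif not at_limit and not is_sender:                      # step 6
        msg.path_mtu = min(msg.path_mtu, in_if_mtu)

    has_route = msg.route is not None
    extra = addr_size if has_route and r_error == 0 else 0    # step 7
    if msg.size(addr_size) + resp_size + extra > msg.path_mtu:
        r_error |= PACKET_TOO_BIG
        # SD1/SD2: return the accumulated responses, trim them from the DREQ.
        fragments.append(list(msg.responses))
        msg.fragment_offset += sum(s for s, _ in msg.responses)
        msg.responses.clear()
        # SD3/SD4: if it still does not fit, empty the ROUTE object.
        extra = addr_size if has_route else 0
        if msg.size(addr_size) + resp_size + extra > msg.path_mtu and has_route:
            msg.route = []
            r_error |= ROUTE_TOO_BIG

    msg.responses.append((resp_size, r_error))
    if r_error & NO_PATH_STATE or at_limit or is_sender:      # steps 8 and 10
        return "reply", fragments
    if msg.route is not None:                                 # step 9
        msg.route.append(in_if_addr)
    return "forward", fragments
```

Calling `process_dreq` once per hop on the same `Dreq` instance shows when a node forwards, when it replies, and when it first has to send back a DREP fragment because the accumulated responses no longer fit in the Path MTU.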
the MTU of the incoming interface (that the DREQ message will be forwarded to), the node changes the MTU value in the header to the smaller value. Whenever a DREQ message size becomes larger than the Path MTU value, an intermediate RSVP node makes a copy of the message, converts it to a DREP message to send back, and then trims off the partial results from the DREQ message. If, even after this trimming, the DREQ cannot be forwarded upstream because of a large ROUTE object, the "ROUTE object too big" error bit is set and the ROUTE object is trimmed. As a result of the ROUTE object trimming, DREP(s) will come hop-by-hop up to this node and will then immediately be forwarded to the requester address.

Even if the steps shown above are followed, there are a few cases where fragmentation at the IP layer will happen. For example, non-RSVP hops with smaller MTUs may exist before the LAST-HOP is reached, or, if the response is sent directly back to the requester (as opposed to hop by hop), the DREP may take a different route to the requester than the DREQ took from the requester. Another case is when there exists a link with an MTU smaller than the minimum Path MTU value defined in Section 3.3.
If the requester is a third party host and is separated from the LAST-HOP address by a firewall (either the requester is behind a firewall, or the LAST-HOP is a node behind a firewall, or both), we currently know of no solution other than changing the LAST-HOP to a node that is on the same side of the firewall as the requester.
The second problem is that traceroute provides the path from the requester to the sender which, due to routing asymmetries, may be different from the path that traffic from the sender to the LAST-HOP uses. There is (at least) one case where this asymmetry will cause the diagnosis to fail. We present this case below.

                            Downstream Path
 Sender                     __         __            __       __  Receiver
                    +------|  |<------|  |<-- ...---|  |-----|  |
  __          __   /       |__|       |__|          |__|     |__|
 |  |--....--|X |_/                    ^
 |__|        |__| \      Router B      |
            Black  \        __         |
            Hole    +----->|  |---->---+
                           |__|
        Upstream Path    Router A

                               Figure 2

Here the first hop upstream of the black hole is different on the upstream path and the downstream path. Traceroute will indicate router A as the previous hop (instead of router B, which is the right one). Sending a DREQ to router A will result in A responding with R-error 0x01 (No PATH state). If the two paths converge again, then the requester can use the solution proposed above to get any (partial) information from the rest of the path.

We don't have, for the moment, any complete solutions for the problematic scenarios described here.
DREQ message sent. Set the Path_MTU to the smaller of the user request and the MTU of the link through which the DREQ will be sent. The port of the UDP socket on which the Diagnostic Client is listening for replies should be included in the Requester FILTER_SPEC object.

2. Set a retransmission timer, waiting for the reply (one or more DREP messages). Listen to the specified UDP port for responses from the LAST-HOP RSVP node. The LAST-HOP RSVP node, upon receiving DREP messages, sends them to the Diagnostic Client as UDP packets, using the port supplied in the Requester FILTER_SPEC object.

3. Upon receiving a DREP message for an outstanding diagnostic request, the client should clear the retransmission timer and check whether the reply contains the complete result of the requested diagnosis. If so, it should pass the result up to the invoking entity immediately.

4. Reassemble DREP fragments. If the first reply to an outstanding diagnostic request contains only a fragment of the expected result, the client should set up a reassembly timer in a way similar to the IP packet reassembly timer. If the timer goes off before all fragments arrive, the client should pass the partial result to the invoking entity.

5. Use retransmission and reassembly timers to gracefully handle packet losses and fragmented replies. In the absence of a response to the first diagnostic request, a client should retransmit the request a few times. If all the retransmissions also fail, the client should invoke traceroute or mtrace to obtain the list of hops along the path segment to be diagnosed, and then perform an iteration of diagnosis with increasing hop count as suggested in Section 5.6 in order to cross RSVP-capable but diagnosis-incapable nodes.

6. If all the above efforts fail, the client must notify the invoking entity.
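The reassembly described in step 4 can be sketched as below. The sketch assumes that each received DREP has already been reduced to a (Fragment Offset, MF flag, payload) triple, where the payload is the run of DIAG_RESPONSE octets it carries, and that the reply converted at the ending node arrives with MF clear. Timers are not modeled: the caller would invoke the function whenever a fragment arrives or the reassembly timer fires. The function name is hypothetical.

```python
def reassemble(fragments):
    """Return (data, complete) for an iterable of (offset, mf, payload) triples.

    data is the contiguous prefix that could be rebuilt starting at offset 0;
    complete is True only when that prefix reaches the end of the fragment
    that arrived with MF clear.
    """
    by_offset = {}
    final_end = None
    for offset, mf, payload in fragments:
        by_offset[offset] = payload          # a duplicate replaces the old copy
        if not mf:
            final_end = offset + len(payload)

    data = bytearray()
    expected = 0
    for offset in sorted(by_offset):
        if offset != expected:
            break                            # gap: a fragment is still missing
        data += by_offset[offset]
        expected += len(by_offset[offset])

    complete = final_end is not None and expected == final_end
    return bytes(data), complete
```

When the reassembly timer expires with `complete` still false, the partial `data` is what the client would hand to the invoking entity.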
Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.