Tech-invite3GPPspaceIETFspace
959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 3208

PGM Reliable Transport Protocol Specification

Pages: 111
Experimental
Errata
Part 1 of 5 – Pages 1 to 12
None   None   Next

Top   ToC   RFC3208 - Page 1
Network Working Group                                        T. Speakman
Request for Comments: 3208                                 Cisco Systems
Category: Experimental                                      J. Crowcroft
                                                                     UCL
                                                              J. Gemmell
                                                               Microsoft
                                                            D. Farinacci
                                                        Procket Networks
                                                                  S. Lin
                                                        Juniper Networks
                                                           D. Leshchiner
                                                          TIBCO Software
                                                                 M. Luby
                                                        Digital Fountain
                                                           T. Montgomery
                                                    Talarian Corporation
                                                                L. Rizzo
                                                      University of Pisa
                                                              A. Tweedly
                                                              N. Bhaskar
                                                           R. Edmonstone
                                                         R. Sumanasekera
                                                             L. Vicisano
                                                           Cisco Systems
                                                           December 2001


             PGM Reliable Transport Protocol Specification

Status of this Memo

   This memo defines an Experimental Protocol for the Internet
   community.  It does not specify an Internet standard of any kind.
   Discussion and suggestions for improvement are requested.
   Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

Abstract

Pragmatic General Multicast (PGM) is a reliable multicast transport protocol for applications that require ordered or unordered, duplicate-free, multicast data delivery from multiple sources to multiple receivers. PGM guarantees that a receiver in the group either receives all data packets from transmissions and repairs, or is able to detect unrecoverable data packet loss. PGM is
Top   ToC   RFC3208 - Page 2
   specifically intended as a workable solution for multicast
   applications with basic reliability requirements.  Its central design
   goal is simplicity of operation with due regard for scalability and
   network efficiency.

Table of Contents

1. Introduction and Overview .................................. 3 2. Architectural Description .................................. 9 3. Terms and Concepts ......................................... 12 4. Procedures - General ....................................... 18 5. Procedures - Sources ....................................... 19 6. Procedures - Receivers ..................................... 22 7. Procedures - Network Elements .............................. 27 8. Packet Formats ............................................. 31 9. Options .................................................... 40 10. Security Considerations .................................... 56 11. Appendix A - Forward Error Correction ...................... 58 12. Appendix B - Support for Congestion Control ................ 72 13. Appendix C - SPM Requests .................................. 79 14. Appendix D - Poll Mechanism ................................ 82 15. Appendix E - Implosion Prevention .......................... 92 16. Appendix F - Transmit Window Example ....................... 98 17 Appendix G - Applicability Statement ....................... 103 18. Abbreviations .............................................. 105 19. Acknowledgments ............................................ 106 20. References ................................................. 106 21. Authors' Addresses.......................................... 108 22. Full Copyright Statement ................................... 111 Nota Bene: The publication of this specification is intended to freeze the definition of PGM in the interest of fostering both ongoing and prospective experimentation with the protocol. The intent of that experimentation is to provide experience with the implementation and deployment of a reliable multicast protocol of this class so as to be able to feed that experience back into the longer-term standardization process underway in the Reliable Multicast Transport Working Group of the IETF. Appendix G provides more specific detail on the scope and status of some of this experimentation. Reports of experiments include [16-23]. Additional results and new experimentation are encouraged.
Top   ToC   RFC3208 - Page 3

1. Introduction and Overview

A variety of reliable protocols have been proposed for multicast data delivery, each with an emphasis on particular types of applications, network characteristics, or definitions of reliability ([1], [2], [3], [4]). In this tradition, Pragmatic General Multicast (PGM) is a reliable transport protocol for applications that require ordered or unordered, duplicate-free, multicast data delivery from multiple sources to multiple receivers. PGM is specifically intended as a workable solution for multicast applications with basic reliability requirements rather than as a comprehensive solution for multicast applications with sophisticated ordering, agreement, and robustness requirements. Its central design goal is simplicity of operation with due regard for scalability and network efficiency. PGM has no notion of group membership. It simply provides reliable multicast data delivery within a transmit window advanced by a source according to a purely local strategy. Reliable delivery is provided within a source's transmit window from the time a receiver joins the group until it departs. PGM guarantees that a receiver in the group either receives all data packets from transmissions and repairs, or is able to detect unrecoverable data packet loss. PGM supports any number of sources within a multicast group, each fully identified by a globally unique Transport Session Identifier (TSI), but since these sources/sessions operate entirely independently of each other, this specification is phrased in terms of a single source and extends without modification to multiple sources. More specifically, PGM is not intended for use with applications that depend either upon acknowledged delivery to a known group of recipients, or upon total ordering amongst multiple sources. Rather, PGM is best suited to those applications in which members may join and leave at any time, and that are either insensitive to unrecoverable data packet loss or are prepared to resort to application recovery in the event. Through its optional extensions, PGM provides specific mechanisms to support applications as disparate as stock and news updates, data conferencing, low-delay real-time video transfer, and bulk data transfer. In the following text, transport-layer originators of PGM data packets are referred to as sources, transport-layer consumers of PGM data packets are referred to as receivers, and network-layer entities in the intervening network are referred to as network elements.
Top   ToC   RFC3208 - Page 4
   Unless otherwise specified, the term "repair" will be used to
   indicate both the actual retransmission of a copy of a missing packet
   or the transmission of an FEC repair packet.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [14] and
   indicate requirement levels for compliant PGM implementations.

1.1. Summary of Operation

PGM runs over a datagram multicast protocol such as IP multicast [5]. In the normal course of data transfer, a source multicasts sequenced data packets (ODATA), and receivers unicast selective negative acknowledgments (NAKs) for data packets detected to be missing from the expected sequence. Network elements forward NAKs PGM-hop-by- PGM-hop to the source, and confirm each hop by multicasting a NAK confirmation (NCF) in response on the interface on which the NAK was received. Repairs (RDATA) may be provided either by the source itself or by a Designated Local Repairer (DLR) in response to a NAK. Since NAKs provide the sole mechanism for reliability, PGM is particularly sensitive to their loss. To minimize NAK loss, PGM defines a network-layer hop-by-hop procedure for reliable NAK forwarding. Upon detection of a missing data packet, a receiver repeatedly unicasts a NAK to the last-hop PGM network element on the distribution tree from the source. A receiver repeats this NAK until it receives a NAK confirmation (NCF) multicast to the group from that PGM network element. That network element responds with an NCF to the first occurrence of the NAK and any further retransmissions of that same NAK from any receiver. In turn, the network element repeatedly forwards the NAK to the upstream PGM network element on the reverse of the distribution path from the source of the original data packet until it also receives an NCF from that network element. Finally, the source itself receives and confirms the NAK by multicasting an NCF to the group. While NCFs are multicast to the group, they are not propagated by PGM network elements since they act as hop-by-hop confirmations.
Top   ToC   RFC3208 - Page 5
   To avoid NAK implosion, PGM specifies procedures for subnet-based NAK
   suppression amongst receivers and NAK elimination within network
   elements.  The usual result is the propagation of just one copy of a
   given NAK along the reverse of the distribution path from any network
   with directly connected receivers to a source.

   The net effect is that unicast NAKs return from a receiver to a
   source on the reverse of the path on which ODATA was forwarded, that
   is, on the reverse of the distribution tree from the source.  More
   specifically, they return through exactly the same sequence of PGM
   network elements through which ODATA was forwarded, but in reverse.
   The reasons for handling NAKs this way will become clear in the
   discussion of constraining repairs, but first it's necessary to
   describe the mechanisms for establishing the requisite source path
   state in PGM network elements.

   To establish source path state in PGM network elements, the basic
   data transfer operation is augmented by Source Path Messages (SPMs)
   from a source, periodically interleaved with ODATA.  SPMs function
   primarily to establish source path state for a given TSI in all PGM
   network elements on the distribution tree from the source.  PGM
   network elements use this information to address returning unicast
   NAKs directly to the upstream PGM network element toward the source,
   and thereby insure that NAKs return from a receiver to a source on
   the reverse of the distribution path for the TSI.

   SPMs are sent by a source at a rate that serves to maintain up-to-
   date PGM neighbor information.  In addition, SPMs complement the role
   of DATA packets in provoking further NAKs from receivers, and
   maintaining receive window state in the receivers.

   As a further efficiency, PGM specifies procedures for the constraint
   of repairs by network elements so that they reach only those network
   segments containing group members that did not receive the original
   transmission.  As NAKs traverse the reverse of the ODATA path
   (upward), they establish repair state in the network elements which
   is used in turn to constrain the (downward) forwarding of the
   corresponding RDATA.

   Besides procedures for the source to provide repairs, PGM also
   specifies options and procedures that permit designated local
   repairers (DLRs) to announce their availability and to redirect
   repair requests (NAKs) to themselves rather than to the original
   source.  In addition to these conventional procedures for loss
   recovery through selective ARQ, Appendix A specifies Forward Error
   Correction (FEC) procedures for sources to provide and receivers to
   request general error correcting parity packets rather than selective
   retransmissions.
Top   ToC   RFC3208 - Page 6
   Finally, since PGM operates without regular return traffic from
   receivers, conventional feedback mechanisms for transport flow and
   congestion control cannot be applied.  Appendix B specifies a TCP-
   friendly, NE-based solution for PGM congestion control, and cites a
   reference to a TCP-friendly, end-to-end solution for PGM congestion
   control.

   In its basic operation, PGM relies on a purely rate-limited
   transmission strategy in the source to bound the bandwidth consumed
   by PGM transport sessions and to define the transmit window
   maintained by the source.

   PGM defines four basic packet types:  three that flow downstream
   (SPMs, DATA, NCFs), and one that flows upstream (NAKs).

1.2. Design Goals and Constraints

PGM has been designed to serve that broad range of multicast applications that have relatively simple reliability requirements, and to do so in a way that realizes the much advertised but often unrealized network efficiencies of multicast data transfer. The usual impediments to realizing these efficiencies are the implosion of negative and positive acknowledgments from receivers to sources, repair latency from the source, and the propagation of repairs to disinterested receivers.

1.2.1. Reliability.

Reliable data delivery across an unreliable network is conventionally achieved through an end-to-end protocol in which a source (implicitly or explicitly) solicits receipt confirmation from a receiver, and the receiver responds positively or negatively. While the frequency of negative acknowledgments is a function of the reliability of the network and the receiver's resources (and so, potentially quite low), the frequency of positive acknowledgments is fixed at at least the rate at which the transmit window is advanced, and usually more often. Negative acknowledgments primarily determine repairs and reliability. Positive acknowledgments primarily determine transmit buffer management. When these principles are extended without modification to multicast protocols, the result, at least for positive acknowledgments, is a burden of positive acknowledgments transmitted to the source that quickly threatens to overwhelm it as the number of receivers grows. More succinctly, ACK implosion keeps ACK-based reliable multicast protocols from scaling well.
Top   ToC   RFC3208 - Page 7
   One of the goals of PGM is to get as strong a definition of
   reliability as possible from as simple a protocol as possible.  ACK
   implosion can be addressed in a variety of effective but complicated
   ways, most of which require re-transmit capability from other than
   the original source.

   An alternative is to dispense with positive acknowledgments
   altogether, and to resort to other strategies for buffer management
   while retaining negative acknowledgments for repairs and reliability.
   The approach taken in PGM is to retain negative acknowledgments, but
   to dispense with positive acknowledgments and resort instead to
   timeouts at the source to manage transmit resources.

   The definition of reliability with PGM is a direct consequence of
   this design decision.  PGM guarantees that a receiver either receives
   all data packets from transmissions and repairs, or is able to detect
   unrecoverable data packet loss.

   PGM includes strategies for repeatedly provoking NAKs from receivers,
   and for adding reliability to the NAKs themselves.  By reinforcing
   the NAK mechanism, PGM minimizes the probability that a receiver will
   detect a missing data packet so late that the packet is unavailable
   for repair either from the source or from a designated local repairer
   (DLR).  Without ACKs and knowledge of group membership, however, PGM
   cannot eliminate this possibility.

1.2.2. Group Membership

A second consequence of eliminating ACKs is that knowledge of group membership is neither required nor provided by the protocol. Although a source may receive some PGM packets (NAKs for instance) from some receivers, the identity of the receivers does not figure in the processing of those packets. Group membership MAY change during the course of a PGM transport session without the knowledge of or consequence to the source or the remaining receivers.

1.2.3. Efficiency

While PGM avoids the implosion of positive acknowledgments simply by dispensing with ACKs, the implosion of negative acknowledgments is addressed directly. Receivers observe a random back-off prior to generating a NAK during which interval the NAK is suppressed (i.e. it is not sent, but the receiver acts as if it had sent it) by the receiver upon receipt of a matching NCF. In addition, PGM network elements eliminate duplicate NAKs received on different interfaces on the same network element.
Top   ToC   RFC3208 - Page 8
   The combination of these two strategies usually results in the source
   receiving just a single NAK for any given lost data packet.

   Whether a repair is provided from a DLR or the original source, it is
   important to constrain that repair to only those network segments
   containing members that negatively acknowledged the original
   transmission rather than propagating it throughout the group.  PGM
   specifies procedures for network elements to use the pattern of NAKs
   to define a sub-tree within the group upon which to forward the
   corresponding repair so that it reaches only those receivers that
   missed it in the first place.

1.2.4. Simplicity

PGM is designed to achieve the greatest improvement in reliability (as compared to the usual UDP) with the least complexity. As a result, PGM does NOT address conference control, global ordering amongst multiple sources in the group, nor recovery from network partitions.

1.2.5. Operability

PGM is designed to function, albeit with less efficiency, even when some or all of the network elements in the multicast tree have no knowledge of PGM. To that end, all PGM data packets can be conventionally multicast routed by non-PGM network elements with no loss of functionality, but with some inefficiency in the propagation of RDATA and NCFs. In addition, since NAKs are unicast to the last-hop PGM network element and NCFs are multicast to the group, NAK/NCF operation is also consistent across non-PGM network elements. Note that for NAK suppression to be most effective, receivers should always have a PGM network element as a first hop network element between themselves and every path to every PGM source. If receivers are several hops removed from the first PGM network element, the efficacy of NAK suppression may degrade.

1.3. Options

In addition to the basic data transfer operation described above, PGM specifies several end-to-end options to address specific application requirements. PGM specifies options to support fragmentation, late joining, redirection, Forward Error Correction (FEC), reachability, and session synchronization/termination/reset. Options MAY be appended to PGM data packet headers only by their original transmitters. While they MAY be interpreted by network elements, options are neither added nor removed by network elements.
Top   ToC   RFC3208 - Page 9
   All options are receiver-significant (i.e., they must be interpreted
   by receivers).  Some options are also network-significant (i.e., they
   must be interpreted by network elements).

   Fragmentation MAY be used in conjunction with data packets to allow a
   transport-layer entity at the source to break up application-layer
   data packets into multiple PGM data packets to conform with the
   maximum transmission unit (MTU) supported by the network layer.

   Late joining allows a source to indicate whether or not receivers may
   request all available repairs when they initially join a particular
   transport session.

   Redirection MAY be used in conjunction with Poll Responses to allow a
   DLR to respond to normal NCFs or POLLs with a redirecting POLR
   advertising its own address as an alternative re-transmitter to the
   original source.

   FEC techniques MAY be applied by receivers to use source-provided
   parity packets rather than selective retransmissions to effect loss
   recovery.

2. Architectural Description

As an end-to-end transport protocol, PGM specifies packet formats and procedures for sources to transmit and for receivers to receive data. To enhance the efficiency of this data transfer, PGM also specifies packet formats and procedures for network elements to improve the reliability of NAKs and to constrain the propagation of repairs. The division of these functions is described in this section and expanded in detail in the next section.

2.1. Source Functions

Data Transmission Sources multicast ODATA packets to the group within the transmit window at a given transmit rate. Source Path State Sources multicast SPMs to the group, interleaved with ODATA if present, to establish source path state in PGM network elements.
Top   ToC   RFC3208 - Page 10
      NAK Reliability

         Sources multicast NCFs to the group in response to any NAKs
         they receive.

      Repairs

         Sources multicast RDATA packets to the group in response to
         NAKs received for data packets within the transmit window.

      Transmit Window Advance

         Sources MAY advance the trailing edge of the window according
         to one of a number of strategies.  Implementations MAY support
         automatic adjustments such as keeping the window at a fixed
         size in bytes, a fixed number of packets or a fixed real time
         duration.  In addition, they MAY optionally delay window
         advancement based on NAK-silence for a certain period.  Some
         possible strategies are outlined later in this document.

2.2. Receiver Functions

Source Path State Receivers use SPMs to determine the last-hop PGM network element for a given TSI to which to direct their NAKs. Data Reception Receivers receive ODATA within the transmit window and eliminate any duplicates. Repair Requests Receivers unicast NAKs to the last-hop PGM network element (and MAY optionally multicast a NAK with TTL of 1 to the local group) for data packets within the receive window detected to be missing from the expected sequence. A receiver MUST repeatedly transmit a given NAK until it receives a matching NCF. NAK Suppression Receivers suppress NAKs for which a matching NCF or NAK is received during the NAK transmit back-off interval.
Top   ToC   RFC3208 - Page 11
      Receive Window Advance

         Receivers immediately advance their receive windows upon
         receipt of any PGM data packet or SPM within the transmit
         window that advances the receive window.

2.3. Network Element Functions

Network elements forward ODATA without intervention. Source Path State Network elements intercept SPMs and use them to establish source path state for the corresponding TSI before multicast forwarding them in the usual way. NAK Reliability Network elements multicast NCFs to the group in response to any NAK they receive. For each NAK received, network elements create repair state recording the transport session identifier, the sequence number of the NAK, and the input interface on which the NAK was received. Constrained NAK Forwarding Network elements repeatedly unicast forward only the first copy of any NAK they receive to the upstream PGM network element on the distribution path for the TSI until they receive an NCF in response. In addition, they MAY optionally multicast this NAK upstream with TTL of 1. Nota Bene: Once confirmed by an NCF, network elements discard NAK packets; NAKs are NOT retained in network elements beyond this forwarding operation, but state about the reception of them is stored. NAK Elimination Network elements discard exact duplicates of any NAK for which they already have repair state (i.e., that has been forwarded either by themselves or a neighboring PGM network element), and respond with a matching NCF.
Top   ToC   RFC3208 - Page 12
      Constrained RDATA Forwarding

         Network elements use NAKs to maintain repair state consisting
         of a list of interfaces upon which a given NAK was received,
         and they forward the corresponding RDATA only on these
         interfaces.

      NAK Anticipation

         If a network element hears an upstream NCF (i.e., on the
         upstream interface for the distribution tree for the TSI), it
         establishes repair state without outgoing interfaces in
         anticipation of responding to and eliminating duplicates of the
         NAK that may arrive from downstream.



(page 12 continued on part 2)

Next Section