Network Working Group T. Speakman
Request for Comments: 3208 Cisco Systems
Category: Experimental J. Crowcroft
University of Pisa
A. Tweedly
N. Bhaskar
R. Edmonstone
R. Sumanasekera
L. Vicisano
December 2001

PGM Reliable Transport Protocol Specification
Status of this Memo
This memo defines an Experimental Protocol for the Internet
community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Copyright (C) The Internet Society (2001). All Rights Reserved.
Pragmatic General Multicast (PGM) is a reliable multicast transport
protocol for applications that require ordered or unordered,
duplicate-free, multicast data delivery from multiple sources to
multiple receivers. PGM guarantees that a receiver in the group
either receives all data packets from transmissions and repairs, or
is able to detect unrecoverable data packet loss. PGM is
specifically intended as a workable solution for multicast
applications with basic reliability requirements. Its central design
goal is simplicity of operation with due regard for scalability and
network efficiency.
Table of Contents
1. Introduction and Overview
2. Architectural Description
3. Terms and Concepts
4. Procedures - General
5. Procedures - Sources
6. Procedures - Receivers
7. Procedures - Network Elements
8. Packet Formats
9. Options
10. Security Considerations
11. Appendix A - Forward Error Correction
12. Appendix B - Support for Congestion Control
13. Appendix C - SPM Requests
14. Appendix D - Poll Mechanism
15. Appendix E - Implosion Prevention
16. Appendix F - Transmit Window Example
17. Appendix G - Applicability Statement
18. Abbreviations
19. Acknowledgments
20. References
21. Authors' Addresses
22. Full Copyright Statement

Nota Bene:
The publication of this specification is intended to freeze the
definition of PGM in the interest of fostering both ongoing and
prospective experimentation with the protocol. The intent of that
experimentation is to provide experience with the implementation and
deployment of a reliable multicast protocol of this class so as to be
able to feed that experience back into the longer-term
standardization process underway in the Reliable Multicast Transport
Working Group of the IETF. Appendix G provides more specific detail
on the scope and status of some of this experimentation. Reports of
experiments include [16-23]. Additional results and new
experimentation are encouraged.
1. Introduction and Overview
A variety of reliable protocols have been proposed for multicast data
delivery, each with an emphasis on particular types of applications,
network characteristics, or definitions of reliability. In this
tradition, Pragmatic General Multicast (PGM) is a
reliable transport protocol for applications that require ordered or
unordered, duplicate-free, multicast data delivery from multiple
sources to multiple receivers.
PGM is specifically intended as a workable solution for multicast
applications with basic reliability requirements rather than as a
comprehensive solution for multicast applications with sophisticated
ordering, agreement, and robustness requirements. Its central design
goal is simplicity of operation with due regard for scalability and
network efficiency.
PGM has no notion of group membership. It simply provides reliable
multicast data delivery within a transmit window advanced by a source
according to a purely local strategy. Reliable delivery is provided
within a source's transmit window from the time a receiver joins the
group until it departs. PGM guarantees that a receiver in the group
either receives all data packets from transmissions and repairs, or
is able to detect unrecoverable data packet loss. PGM supports any
number of sources within a multicast group, each fully identified by
a globally unique Transport Session Identifier (TSI), but since these
sources/sessions operate entirely independently of each other, this
specification is phrased in terms of a single source and extends
without modification to multiple sources.
More specifically, PGM is not intended for use with applications that
depend either upon acknowledged delivery to a known group of
recipients, or upon total ordering amongst multiple sources.
Rather, PGM is best suited to those applications in which members may
join and leave at any time, and that are either insensitive to
unrecoverable data packet loss or are prepared to resort to
application recovery in the event. Through its optional extensions,
PGM provides specific mechanisms to support applications as disparate
as stock and news updates, data conferencing, low-delay real-time
video transfer, and bulk data transfer.
In the following text, transport-layer originators of PGM data
packets are referred to as sources, transport-layer consumers of PGM
data packets are referred to as receivers, and network-layer entities
in the intervening network are referred to as network elements.
Unless otherwise specified, the term "repair" will be used to
indicate either the actual retransmission of a copy of a missing
packet or the transmission of an FEC repair packet.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 and
indicate requirement levels for compliant PGM implementations.
1.1. Summary of Operation
PGM runs over a datagram multicast protocol such as IP multicast.
In the normal course of data transfer, a source multicasts sequenced
data packets (ODATA), and receivers unicast selective negative
acknowledgments (NAKs) for data packets detected to be missing from
the expected sequence. Network elements forward NAKs PGM-hop-by-
PGM-hop to the source, and confirm each hop by multicasting a NAK
confirmation (NCF) in response on the interface on which the NAK was
received. Repairs (RDATA) may be provided either by the source
itself or by a Designated Local Repairer (DLR) in response to a NAK.
Since NAKs provide the sole mechanism for reliability, PGM is
particularly sensitive to their loss. To minimize NAK loss, PGM
defines a network-layer hop-by-hop procedure for reliable NAK
forwarding.
Upon detection of a missing data packet, a receiver repeatedly
unicasts a NAK to the last-hop PGM network element on the
distribution tree from the source. A receiver repeats this NAK until
it receives a NAK confirmation (NCF) multicast to the group from that
PGM network element. That network element responds with an NCF to
the first occurrence of the NAK and any further retransmissions of
that same NAK from any receiver. In turn, the network element
repeatedly forwards the NAK to the upstream PGM network element on
the reverse of the distribution path from the source of the original
data packet until it also receives an NCF from that network element.
Finally, the source itself receives and confirms the NAK by
multicasting an NCF to the group.
While NCFs are multicast to the group, they are not propagated by PGM
network elements since they act as hop-by-hop confirmations.
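The receiver side of this hop-by-hop confirmation can be sketched as a small state object. This is an illustrative sketch only, not the procedure defined later in this specification: the class name, the retry limit, and the use of a string as an opaque TSI are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NakRetry:
    """One outstanding NAK at a receiver (hypothetical helper)."""
    tsi: str                 # opaque stand-in for the Transport Session Identifier
    sqn: int                 # sequence number of the missing data packet
    max_attempts: int = 5    # assumed retry limit, not taken from the specification
    attempts: int = 0
    confirmed: bool = False

    def on_timer(self) -> bool:
        """Repeat interval expired: return True if the NAK should be (re)sent."""
        if self.confirmed or self.attempts >= self.max_attempts:
            return False
        self.attempts += 1
        return True

    def on_ncf(self, tsi: str, sqn: int) -> None:
        """A matching NCF confirms this hop, so repetition stops."""
        if (tsi, sqn) == (self.tsi, self.sqn):
            self.confirmed = True
```

A caller would invoke on_timer() each time its repeat timer fires and unicast the NAK whenever it returns True. An NCF for a different sequence number leaves the state untouched.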
To avoid NAK implosion, PGM specifies procedures for subnet-based NAK
suppression amongst receivers and NAK elimination within network
elements. The usual result is the propagation of just one copy of a
given NAK along the reverse of the distribution path from any network
with directly connected receivers to a source.
The net effect is that unicast NAKs return from a receiver to a
source on the reverse of the path on which ODATA was forwarded, that
is, on the reverse of the distribution tree from the source. More
specifically, they return through exactly the same sequence of PGM
network elements through which ODATA was forwarded, but in reverse.
The reasons for handling NAKs this way will become clear in the
discussion of constraining repairs, but first it's necessary to
describe the mechanisms for establishing the requisite source path
state in PGM network elements.
To establish source path state in PGM network elements, the basic
data transfer operation is augmented by Source Path Messages (SPMs)
from a source, periodically interleaved with ODATA. SPMs function
primarily to establish source path state for a given TSI in all PGM
network elements on the distribution tree from the source. PGM
network elements use this information to address returning unicast
NAKs directly to the upstream PGM network element toward the source,
and thereby ensure that NAKs return from a receiver to a source on
the reverse of the distribution path for the TSI.
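As a rough illustration of source path state, a network element can be pictured as keeping a table keyed by TSI. The sketch below assumes that each SPM makes the address of the upstream PGM neighbor available to the element; the class and method names are hypothetical, and addresses and TSIs are shown as plain strings.

```python
class SourcePathState:
    """Per-TSI record of the upstream PGM neighbor (illustrative only)."""

    def __init__(self):
        self._upstream = {}  # TSI -> address of the upstream PGM network element

    def on_spm(self, tsi, upstream_addr):
        # Each SPM refreshes the entry, so the table follows route changes.
        self._upstream[tsi] = upstream_addr

    def nak_next_hop(self, tsi):
        # Unicast destination for a NAK on this TSI, or None if no SPM has been seen.
        return self._upstream.get(tsi)
```

Because the entry is overwritten on every SPM, the periodic SPM rate bounds how long a stale neighbor can remain in the table.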
SPMs are sent by a source at a rate that serves to maintain up-to-
date PGM neighbor information. In addition, SPMs complement the role
of DATA packets in provoking further NAKs from receivers, and
maintaining receive window state in the receivers.
As a further efficiency, PGM specifies procedures for the constraint
of repairs by network elements so that they reach only those network
segments containing group members that did not receive the original
transmission. As NAKs traverse the reverse of the ODATA path
(upward), they establish repair state in the network elements which
is used in turn to constrain the (downward) forwarding of the
corresponding RDATA.
Besides procedures for the source to provide repairs, PGM also
specifies options and procedures that permit designated local
repairers (DLRs) to announce their availability and to redirect
repair requests (NAKs) to themselves rather than to the original
source. In addition to these conventional procedures for loss
recovery through selective ARQ, Appendix A specifies Forward Error
Correction (FEC) procedures for sources to provide and receivers to
request general error correcting parity packets rather than selective
retransmissions.
Finally, since PGM operates without regular return traffic from
receivers, conventional feedback mechanisms for transport flow and
congestion control cannot be applied. Appendix B specifies a TCP-
friendly, NE-based solution for PGM congestion control, and cites a
reference to a TCP-friendly, end-to-end solution for PGM congestion
control.
In its basic operation, PGM relies on a purely rate-limited
transmission strategy in the source to bound the bandwidth consumed
by PGM transport sessions and to define the transmit window
maintained by the source.
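One common way to realize a rate limit of this kind is a token bucket. The text above does not prescribe any particular mechanism, so the following is a sketch of a substituted, well-known technique; the class name, the burst parameter, and the explicit passing of the current time are all assumptions for illustration.

```python
class TokenBucket:
    """Token-bucket rate limiter (one possible realization, not PGM-specific)."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0      # refill rate in bytes per second
        self.capacity = burst_bytes     # largest burst the source may emit
        self.tokens = float(burst_bytes)
        self.last = 0.0                 # time of the previous call, in seconds

    def try_send(self, size, now):
        """Return True if a packet of `size` bytes may be sent at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```

With an 8000 bit/s limit and a 1500-byte burst, a 1500-byte packet at time 0 is admitted, a 1000-byte packet half a second later is refused, and the same packet is admitted at the one-second mark.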
PGM defines four basic packet types: three that flow downstream
(SPMs, DATA, NCFs), and one that flows upstream (NAKs).
1.2. Design Goals and Constraints
PGM has been designed to serve that broad range of multicast
applications that have relatively simple reliability requirements,
and to do so in a way that realizes the much advertised but often
unrealized network efficiencies of multicast data transfer. The
usual impediments to realizing these efficiencies are the implosion
of negative and positive acknowledgments from receivers to sources,
repair latency from the source, and the propagation of repairs to
disinterested receivers.
Reliable data delivery across an unreliable network is conventionally
achieved through an end-to-end protocol in which a source (implicitly
or explicitly) solicits receipt confirmation from a receiver, and the
receiver responds positively or negatively. While the frequency of
negative acknowledgments is a function of the reliability of the
network and the receiver's resources (and so, potentially quite low),
the frequency of positive acknowledgments is at least the rate at
which the transmit window is advanced, and usually more often.
Negative acknowledgments primarily determine repairs and reliability.
Positive acknowledgments primarily determine transmit buffer
management.
When these principles are extended without modification to multicast
protocols, the result, at least for positive acknowledgments, is a
burden of positive acknowledgments transmitted to the source that
quickly threatens to overwhelm it as the number of receivers grows.
More succinctly, ACK implosion keeps ACK-based reliable multicast
protocols from scaling well.
One of the goals of PGM is to get as strong a definition of
reliability as possible from as simple a protocol as possible. ACK
implosion can be addressed in a variety of effective but complicated
ways, most of which require re-transmit capability from other than
the original source.
An alternative is to dispense with positive acknowledgments
altogether, and to resort to other strategies for buffer management
while retaining negative acknowledgments for repairs and reliability.
The approach taken in PGM is to retain negative acknowledgments, but
to dispense with positive acknowledgments and resort instead to
timeouts at the source to manage transmit resources.
The definition of reliability with PGM is a direct consequence of
this design decision. PGM guarantees that a receiver either receives
all data packets from transmissions and repairs, or is able to detect
unrecoverable data packet loss.
PGM includes strategies for repeatedly provoking NAKs from receivers,
and for adding reliability to the NAKs themselves. By reinforcing
the NAK mechanism, PGM minimizes the probability that a receiver will
detect a missing data packet so late that the packet is unavailable
for repair either from the source or from a designated local repairer
(DLR). Without ACKs and knowledge of group membership, however, PGM
cannot eliminate this possibility.
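The two outcomes in this definition of reliability can be illustrated with a small classification function. The sketch assumes that the receiver knows the trailing and leading edges of the source's transmit window, and it ignores sequence number wraparound; the function name and parameters are hypothetical.

```python
def classify_missing(received, first_expected, trail, lead):
    """Split missing sequence numbers into recoverable and unrecoverable.

    received:       set of sequence numbers already delivered
    first_expected: first sequence number this receiver is responsible for
    trail, lead:    assumed known edges of the source's transmit window
    """
    recoverable, unrecoverable = [], []
    for sqn in range(first_expected, lead + 1):
        if sqn in received:
            continue
        if sqn >= trail:
            recoverable.append(sqn)     # still in the window: a NAK can fetch it
        else:
            unrecoverable.append(sqn)   # window has moved on: loss is detected only
    return recoverable, unrecoverable
```

A missing packet that falls behind the trailing edge cannot be repaired, but it is still reported, which is the second half of the guarantee.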
1.2.2. Group Membership
A second consequence of eliminating ACKs is that knowledge of group
membership is neither required nor provided by the protocol.
Although a source may receive some PGM packets (NAKs for instance)
from some receivers, the identity of the receivers does not figure in
the processing of those packets. Group membership MAY change during
the course of a PGM transport session without the knowledge of or
consequence to the source or the remaining receivers.
While PGM avoids the implosion of positive acknowledgments simply by
dispensing with ACKs, the implosion of negative acknowledgments is
addressed directly.
Receivers observe a random back-off interval before generating a NAK.
If a matching NCF is received during that interval, the receiver
suppresses the NAK (i.e., it does not send it, but acts as if it had
sent it). In addition, PGM network elements eliminate duplicate
NAKs received on different interfaces on the same network element.
The combination of these two strategies usually results in the source
receiving just a single NAK for any given lost data packet.
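The effect of the random back-off can be seen in a toy model of one subnet. The sketch assumes that every receiver misses the same packet at time zero and that an NCF is heard by all receivers a fixed delay after the first NAK is sent; both function names and the millisecond units are illustrative choices.

```python
import random


def draw_backoffs(n_receivers, max_backoff_ms, seed=None):
    """Pick one random back-off (in ms) per receiver."""
    rng = random.Random(seed)
    return [rng.randrange(max_backoff_ms) for _ in range(n_receivers)]


def nak_senders(backoffs_ms, ncf_delay_ms):
    """Return the indices of receivers whose NAK is actually transmitted.

    A receiver sends only if its back-off expires before the NCF provoked
    by the earliest NAK is heard; all later receivers are suppressed.
    """
    if not backoffs_ms:
        return []
    first = min(backoffs_ms)
    ncf_heard_at = first + ncf_delay_ms
    return [i for i, b in enumerate(backoffs_ms)
            if b == first or b < ncf_heard_at]
```

For back-offs of 300, 50, 200, and 60 ms and an NCF delay of 20 ms, only the receivers at indices 1 and 3 transmit; the other two are suppressed by the NCF.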
Whether a repair is provided from a DLR or the original source, it is
important to constrain that repair to only those network segments
containing members that negatively acknowledged the original
transmission rather than propagating it throughout the group. PGM
specifies procedures for network elements to use the pattern of NAKs
to define a sub-tree within the group upon which to forward the
corresponding repair so that it reaches only those receivers that
missed it in the first place.
PGM is designed to achieve the greatest improvement in reliability
(as compared to the usual UDP) with the least complexity. As a
result, PGM does NOT address conference control, global ordering
amongst multiple sources in the group, nor recovery from network
partitions.
PGM is designed to function, albeit with less efficiency, even when
some or all of the network elements in the multicast tree have no
knowledge of PGM. To that end, all PGM data packets can be
conventionally multicast routed by non-PGM network elements with no
loss of functionality, but with some inefficiency in the propagation
of RDATA and NCFs.
In addition, since NAKs are unicast to the last-hop PGM network
element and NCFs are multicast to the group, NAK/NCF operation is
also consistent across non-PGM network elements. Note that for NAK
suppression to be most effective, receivers should always have a PGM
network element as a first hop network element between themselves and
every path to every PGM source. If receivers are several hops
removed from the first PGM network element, the efficacy of NAK
suppression may degrade.
In addition to the basic data transfer operation described above, PGM
specifies several end-to-end options to address specific application
requirements. PGM specifies options to support fragmentation, late
joining, redirection, Forward Error Correction (FEC), reachability,
and session synchronization/termination/reset. Options MAY be
appended to PGM data packet headers only by their original
transmitters. While they MAY be interpreted by network elements,
options are neither added nor removed by network elements.
All options are receiver-significant (i.e., they must be interpreted
by receivers). Some options are also network-significant (i.e., they
must be interpreted by network elements).
Fragmentation MAY be used in conjunction with data packets to allow a
transport-layer entity at the source to break up application-layer
data packets into multiple PGM data packets to conform with the
maximum transmission unit (MTU) supported by the network layer.
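The idea of fragmentation can be sketched independently of the option encoding. In the example below each fragment carries its byte offset and the total length of the application data; these fields and both function names are assumptions for illustration and do not describe the option format defined later in this specification.

```python
def fragment(apdu, max_payload):
    """Split application data into (offset, total_length, chunk) tuples."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    total = len(apdu)
    return [(off, total, apdu[off:off + max_payload])
            for off in range(0, total, max_payload)]


def reassemble(fragments):
    """Rebuild the application data from fragments received in any order."""
    if not fragments:
        return b""
    total = fragments[0][1]
    buf = bytearray(total)
    for off, _, chunk in fragments:
        buf[off:off + len(chunk)] = chunk
    return bytes(buf)
```

Carrying the offset in every fragment lets the receiver reassemble fragments that arrive out of order, for instance when one of them is delivered late as a repair.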
Late joining allows a source to indicate whether or not receivers may
request all available repairs when they initially join a particular
transport session.
Redirection MAY be used in conjunction with Poll Responses to allow a
DLR to respond to normal NCFs or POLLs with a redirecting POLR
advertising its own address as an alternative re-transmitter to the
original source.
FEC techniques MAY be applied by receivers to use source-provided
parity packets rather than selective retransmissions to effect loss
recovery.
2. Architectural Description
As an end-to-end transport protocol, PGM specifies packet formats and
procedures for sources to transmit and for receivers to receive data.
To enhance the efficiency of this data transfer, PGM also specifies
packet formats and procedures for network elements to improve the
reliability of NAKs and to constrain the propagation of repairs. The
division of these functions is described in this section and expanded
in detail in the next section.
2.1. Source Functions
Sources multicast ODATA packets to the group within the
transmit window at a given transmit rate.
Source Path State
Sources multicast SPMs to the group, interleaved with ODATA if
present, to establish source path state in PGM network
elements.
Sources multicast NCFs to the group in response to any NAKs
they receive.
Sources multicast RDATA packets to the group in response to
NAKs received for data packets within the transmit window.
Transmit Window Advance
Sources MAY advance the trailing edge of the window according
to one of a number of strategies. Implementations MAY support
automatic adjustments such as keeping the window at a fixed
size in bytes, a fixed number of packets or a fixed real time
duration. In addition, they MAY optionally delay window
advancement based on NAK-silence for a certain period. Some
possible strategies are outlined later in this document.
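Of the strategies listed above, keeping the window at a fixed number of packets is the simplest to illustrate. The sketch below is one possible shape for such a window, with hypothetical names; it ignores sequence number wraparound and the optional delay based on NAK-silence.

```python
from collections import OrderedDict


class TransmitWindow:
    """Transmit window holding at most `max_packets` packets for repair."""

    def __init__(self, max_packets):
        self.max_packets = max_packets
        self._buf = OrderedDict()   # sequence number -> payload, oldest first
        self.next_sqn = 0

    def send(self, payload):
        """Buffer a new packet and advance the trailing edge if needed."""
        sqn = self.next_sqn
        self._buf[sqn] = payload
        self.next_sqn += 1
        while len(self._buf) > self.max_packets:
            self._buf.popitem(last=False)   # drop the oldest packet
        return sqn

    @property
    def trail(self):
        """Oldest sequence number still available for repair."""
        return next(iter(self._buf)) if self._buf else self.next_sqn

    @property
    def lead(self):
        """Most recently sent sequence number (-1 before any send)."""
        return self.next_sqn - 1

    def repair(self, sqn):
        """Payload for a NAKed packet, or None if it has left the window."""
        return self._buf.get(sqn)
```

A NAK for a sequence number behind the trailing edge returns None here, which corresponds to the case in which the receiver can only detect, not recover, the loss.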
2.2. Receiver Functions
Source Path State
Receivers use SPMs to determine the last-hop PGM network
element for a given TSI to which to direct their NAKs.
Receivers receive ODATA within the transmit window and
eliminate any duplicates.
Receivers unicast NAKs to the last-hop PGM network element (and
MAY optionally multicast a NAK with TTL of 1 to the local
group) for data packets within the receive window detected to
be missing from the expected sequence. A receiver MUST
repeatedly transmit a given NAK until it receives a matching
NCF.
Receivers suppress NAKs for which a matching NCF or NAK is
received during the NAK transmit back-off interval.
Receive Window Advance
Receivers immediately advance their receive windows upon
receipt of any PGM data packet or SPM within the transmit
window that advances the receive window.
2.3. Network Element Functions
Network elements forward ODATA without intervention.
Source Path State
Network elements intercept SPMs and use them to establish
source path state for the corresponding TSI before multicast
forwarding them in the usual way.
Network elements multicast NCFs to the group in response to any
NAK they receive. For each NAK received, network elements
create repair state recording the transport session identifier,
the sequence number of the NAK, and the input interface on
which the NAK was received.
Constrained NAK Forwarding
Network elements repeatedly unicast forward only the first copy
of any NAK they receive to the upstream PGM network element on
the distribution path for the TSI until they receive an NCF in
response. In addition, they MAY optionally multicast this NAK
upstream with TTL of 1.
Nota Bene: Once confirmed by an NCF, network elements discard NAK
packets; NAKs are NOT retained in network elements beyond this
forwarding operation, but state about the reception of them is
stored.
Network elements discard exact duplicates of any NAK for which
they already have repair state (i.e., that has been forwarded
either by themselves or a neighboring PGM network element), and
respond with a matching NCF.
Constrained RDATA Forwarding
Network elements use NAKs to maintain repair state consisting
of a list of interfaces upon which a given NAK was received,
and they forward the corresponding RDATA only on these
interfaces.
If a network element hears an upstream NCF (i.e., on the
upstream interface for the distribution tree for the TSI), it
establishes repair state without outgoing interfaces in
anticipation of responding to and eliminating duplicates of the
NAK that may arrive from downstream.
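The network element functions above (NAK confirmation, NAK elimination, constrained RDATA forwarding, and anticipation on an upstream NCF) can be tied together in one small sketch. It is illustrative only: the names are hypothetical, interfaces and TSIs are plain strings, timers are omitted, and discarding the repair state as soon as the RDATA is forwarded is a simplifying assumption.

```python
class NetworkElementRepairState:
    """Repair state keyed by (TSI, sequence number); illustrative only."""

    def __init__(self):
        self._state = {}  # (tsi, sqn) -> set of interfaces that sent a NAK

    def on_nak(self, tsi, sqn, in_iface):
        """Handle a NAK from downstream.

        Returns (ncf_iface, forward_upstream): an NCF is always multicast
        on the receiving interface, but only the first NAK for a given
        packet is forwarded upstream; later copies are eliminated.
        """
        key = (tsi, sqn)
        first = key not in self._state
        self._state.setdefault(key, set()).add(in_iface)
        return in_iface, first

    def on_upstream_ncf(self, tsi, sqn):
        """An NCF heard upstream means some element already forwarded the NAK.

        State is created with no outgoing interfaces so that a later NAK
        from downstream is confirmed but not forwarded again.
        """
        self._state.setdefault((tsi, sqn), set())

    def on_rdata(self, tsi, sqn):
        """Return the interfaces on which to forward the repair, then clear state."""
        return sorted(self._state.pop((tsi, sqn), set()))
```

RDATA for which no repair state exists is forwarded on no interface in this sketch, which is how the repair stays confined to the sub-tree that asked for it.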