Network Working Group B. Whetten Request for Comments: 3048 Talarian Category: Informational L. Vicisano Cisco R. Kermode Motorola M. Handley ACIRI 9 S. Floyd ACIRI M. Luby Digital Fountain January 2001 Reliable Multicast Transport Building Blocks for One-to-Many Bulk-Data Transfer Status of this Memo This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The Internet Society (2001). All Rights Reserved.
AbstractThis document describes a framework for the standardization of bulk- data reliable multicast transport. It builds upon the experience gained during the deployment of several classes of contemporary reliable multicast transport, and attempts to pull out the commonalities between these classes of protocols into a number of building blocks. To that end, this document recommends that certain components that are common to multiple protocol classes be standardized as "building blocks". The remaining parts of the protocols, consisting of highly protocol specific, tightly intertwined functions, shall be designated as "protocol cores". Thus, each protocol can then be constructed by merging a "protocol core" with a number of "building blocks" which can be re-used across multiple protocols.
1 Introduction .................................................. 2 1.1 Protocol Families ........................................... 5 2 Building Blocks Rationale ..................................... 6 2.1 Building Blocks Advantages .................................. 6 2.2 Building Block Risks ........................................ 7 2.3 Building Block Requirements ................................. 8 3 Protocol Components ........................................... 8 3.1 Sub-Components Definition ................................... 9 4 Building Block Recommendations ................................ 12 4.1 NACK-based Reliability ...................................... 13 4.2 FEC coding .................................................. 13 4.3 Congestion Control .......................................... 13 4.4 Generic Router Support ...................................... 14 4.5 Tree Configuration .......................................... 14 4.6 Data Security ............................................... 15 4.7 Common Headers .............................................. 15 4.8 Protocol Cores .............................................. 15 5 Security ...................................................... 15 6 IANA Considerations ........................................... 15 7 Conclusions ................................................... 16 8 Acknowledgements .............................................. 16 9 References .................................................... 16 10 Authors' Addresses ........................................... 19 11 Full Copyright Statement ..................................... 20 RFC 2357 lays out the requirements for reliable multicast protocols that are to be considered for standardization by the IETF. They include: o Congestion Control. The protocol must be safe to deploy in the widespread Internet. Specifically, it must adhere to three mandates: a) it must achieve good throughput (i.e., it must not consistently overload links with excess data or repair traffic), b) it must achieve good link utilization, and c) it must not starve competing flows. o Scalability. The protocol should be able to work under a variety of conditions that include multiple network topologies, link speeds, and the receiver set size. It is more important to have a good understanding of how and when a protocol breaks than when it works.
o Security. The protocol must be analyzed to show what is necessary to allow it to cope with security and privacy issues. This includes understanding the role of the protocol in data confidentiality and sender authentication, as well as how the protocol will provide defenses against denial of service attacks. These requirements are primarily directed towards making sure that any standards will be safe for widespread Internet deployment. The advancing maturity of current work on reliable multicast congestion control (RMCC) [HFW99] in the IRTF Reliable Multicast Research Group (RMRG) has been one of the events that has allowed the IETF to charter the RMT working group. RMCC only addresses a subset of the design space for reliable multicast. Fortuitously, the requirements it addresses are also the most pressing application and market requirements. A protocol's ability to meet the requirements of congestion control, scalability, and security is affected by a number of secondary requirements that are described in a separate document [RFC2887]. In summary, these are: o Ordering Guarantees. A protocol must offer at least one of either source ordered or unordered delivery guarantees. Support for total ordering across multiple senders is not recommended, as it makes it more difficult to scale the protocol, and can more easily be implemented at a higher level. o Receiver Scalability. A protocol should be able to support a "large" number of simultaneous receivers per transport group. A typical receiver set could be on the order of at least 1,000 - 10,000 simultaneous receivers per group, or could even eventually scale up to millions of receivers in the large Internet. o Real-Time Feedback. Some versions of RMCC may require soft real- time feedback, so a protocol may provide some means for this information to be measured and returned to the sender. While this does not require that a protocol deliver data in soft real-time, it is an important application requirement that can be provided easily given real-time feedback. o Delivery Guarantees. In many applications, a logically defined unit or units of data is to be delivered to multiple clients, e.g., a file or a set of files, a software package, a stock quote or package of stock quotes, an event notification, a set of slides, a frame or block from a video. An application data unit is defined to be a logically separable unit of data that is useful to the application. In some cases, an application data unit may be short enough to fit into a single packet (e.g., an event
notification or a stock quote), whereas in other cases an application data unit may be much longer than a packet (e.g., a software package). A protocol must provide good throughput of application data units to receivers. This means that most data that is delivered to receivers is useful in recovering the application data unit that they are trying to receive. A protocol may optionally provide delivery confirmation, i.e., a mechanism for receivers to inform the sender when data has been delivered. There are two types of confirmation, at the application data unit level and at the packet level. Application data unit confirmation is useful at the application level, e.g., to inform the application about receiver progress and to decide when to stop sending packets about a particular application data unit. Packet confirmation is useful at the transport level, e.g., to inform the transport level when it can release buffer space being used for storing packets for which delivery has been confirmed. Packet level confirmation may also aid in application data unit confirmation. o Network Topologies. A protocol must not break the network when deployed in the full Internet. However, we recognize that intranets will be where the first wave of deployments happen, and so are also very important to support. Thus, support for satellite networks (including those with terrestrial return paths or no return paths at all) is encouraged, but not required. o Group Membership. The group membership algorithms must be scalable. Membership can be anonymous (where the sender does not know the list of receivers), or fully distributed (where the sender receives a count of the number of receivers, and optionally a list of failures). o Example Applications. Some of the applications that a RM protocol could be designed to support include multimedia broadcasts, real time financial market data distribution, multicast file transfer, and server replication. In the rest of this document the following terms will be used with a specific connotation: "protocol family", "protocol component", "building block", "protocol core", and "protocol instantiation". A "protocol family" is a broad class of RM protocols which share a common characteristic. In our classification, this characteristic is the mechanism used to achieve reliability. A "protocol component" is a logical part of the protocol that addresses a particular functionality. A "building block" is a constituent of a protocol that implements one, more than one or a part of a component. A "protocol core" is the set of functionality required for the
instantiation of a complete protocol, that is not specified by any building block. Finally a "protocol instantiation" is an actual RM protocol defined in term of building blocks and a protocol core. RFC2887] also provides a taxonomy of the most popular approaches that have been proposed over the last ten years. After congestion control, the primary challenge has been that of meeting the requirement for ensuring good throughput in a way that scales to a large number of receivers. For protocols that include a back-channel for recovery of lost packets, the ability to take advantage of support of elements in the network has been found to be very beneficial for supporting good throughput for a large numbers of receivers. Other protocols have found it very beneficial to transmit coded data to achieve good throughput for large numbers of receivers. This taxonomy breaks proposed protocols into four families. Some protocols in the family provide packet level delivery confirmation that may be useful to the transport level. All protocols in all families can be supplemented with higher level protocols that provide delivery confirmation of application data units. 1 NACK only. Protocols such as SRM [FJM95] and MDP2 [MA99] attempt to limit traffic by only using NACKs for requesting packet retransmission. They do not require network infrastructure. 2 Tree based ACK. Protocols such as RMTP [LP96, PSLM97], RMTP-II [WBPM98] and TRAM [KCW98], use positive acknowledgments (ACKs). ACK based protocols reduce the need for supplementary protocols that provide delivery confirmation, as the ACKS can be used for this purpose. In order to avoid ACK implosion in scaled up deployments, the protocol can use servers placed in the network. 3 Asynchronous Layered Coding (ALC). These protocols (examples include [RV97] and [BLMR98]) use sender-based Forward Error Correction (FEC) methods with no feedback from receivers or the network to ensure good throughput. These protocols also used sender-based layered multicast and receiver-driven protocols to join and leave these layers with no feedback to the sender to achieve scalable congestion control. 4 Router assist. Like SRM, protocols such as PGM [FLST98] and [LG97] also use negative acknowledgments for packet recovery. These protocols take advantage of new router software to do constrained negative acknowledgments and retransmissions. Router assist protocols can also provide other functionality more efficiently than end to end protocols. For example, [LVS99] shows
how router assist can provide fine grained congestion control for ALC protocols. Router assist protocols can be designed to complement all protocol families described above. Note that the distinction in protocol families in not necessarily precise and mutually exclusive. Actual protocols may use a combination of the mechanisms belonging to different classes. For example, hybrid NACK/ACK based protocols (such as [WBPM98]) are possible. Other examples are protocols belonging to class 1 through 3 that take advantage of router support. RFC 2357 [MRBP98], no single reliable multicast protocol will likely meet the needs of all applications. Therefore, the IETF expects to standardize a number of protocols that are tailored to application and network specific needs. This document concentrates on the requirements for "one-to-many bulk-data transfer", but in the future, additional protocols and building- blocks are expected that will address the needs of other types of applications, including "many-to- many" applications. Note that bulk-data transfer does not refer to the timeliness of the data, rather it states that there is a large amount of data to be transferred in a session. The scope and approach taken for the development of protocols for these additional scenarios will depend upon large part on the success of the "building-block" approach put forward in this document.
o Easier Future Upgrades. There is still ongoing research in reliable multicast, and we expect the state of the art to continue to evolve. Building protocols with smaller modules allows them to be more easily upgraded to reflect future research. o Common Diagnostics. To the extent that multiple protocols share common packet headers, packet analyzers and other diagnostic tools can be built which work with multiple protocols. o Reduces Effort for New Protocols. As new application requirements drive the need for new standards, some existing modules may be reused in these protocols. o Parallelism of Development. If the APIs are defined clearly, the development of each module can proceed in parallel.
o Data Reliability (ensuring good throughput) | | - Loss Detection/Notification | - Loss Recovery | - Loss Protection o Congestion Control | | - Congestion Feedback | - Rate Regulation | - Receiver Controls o Security o Group membership | | - Membership Notification | - Membership Management o Session Management | | - Group Membership Tracking | - Session Advertisement | - Session Start/Stop | - Session Configuration/Monitoring o Tree Configuration Note that not all components are required by all protocols, depending upon the fully defined service that is being provided by the protocol. In particular, some minimal service models do not require many of these functions, including loss notification, loss recovery, and group membership.
Loss Protection. This function attempts to mask packet-losses so that they don't become Loss Notification events. This function can be realized through the pro-active transmission of FEC packets. Each FEC packet is created from an entire application data unit [LMSSS97] or a portion of an application data unit [RV97], [BKKKLZ95], a fact that allows a receiver to recover from some packet loss without further retransmissions. The number of losses that can be recovered from without requiring retransmission depends on the amount of FEC packets sent in the first place. Loss protection can also be pushed to the extreme when good throughput is achieved without any Loss Detection/Notification and Loss Recovery functionality, as in the ALC family of protocols defined above. Congestion Feedback. For sender driven congestion control protocols, the receiver must provide some type of feedback on congestion to the sender. This typically involves loss rate and round trip time measurements. Rate Regulation. Given the congestion feedback, the sender then must adjust its rate in a way that is fair to the network. One proposal that defines this notion of fairness and other congestion control requirements is [Whetten99]. Receiver Controls. In order to avoid allowing a receiver that has an extremely slow connection to the sender to stop all progress within single rate schemes, a congestion control algorithm will often require receivers to leave groups. For multiple rate approaches, receivers of all connection speeds can have data delivered to them according to the rate of their connection without slowing down other receivers. Security. Security for reliable multicast contains a number of complex and tricky issues that stem in large part from the IP multicast service model. In this service model, hosts do not send traffic to another host, but instead elect to receive traffic from a multicast group. This means that any host may join a group and receive its traffic. Conversely, hosts may also leave a group at any time. Therefore, the protocol must address how it impacts the following security issues: o Sender Authentication (since any host can send to a group), o Data Encryption (since any host can join a group) o Transport Protection (denial of service attacks, through corruption of transport state, or requests for unauthorized resources)
o Group Key Management (since hosts may join and leave a group at any time) [WHA98] In particular, a transport protocol needs to pay particular attention to how it protects itself from denial of service attacks, through mechanisms such as lightweight authentication of control packets [HW99]. With Source Specific Multicast service model (SSM), a host joins specifically to a sender and group pair. Thus, SSM offers more security against hosts receiving traffic from a denial of service attack where an arbitrary sender sends packets that hosts did not specifically request to receive. Nevertheless, it is recommended that additional protections against such attacks should be provided when using SSM, because the protection offered by SSM against such attacks may not be enough. Sender Authentication, Data Encryption, and Group Key Management. While these functions are not typically part of the transport layer per se, a protocol needs to understand what ramifications it has on data security, and may need to have special interfaces to the security layer in order to accommodate these ramifications. Transport Protection. The primary security task for a transport layer is that of protecting the transport layer itself from attack. The most important function for this is typically lightweight authentication of control packets in order to prevent corruption of state and other denial of service attacks. Membership Notification. This is the function through which the data source--or upper level agent in a possible hierarchical organization--learns about the identity and/or number of receivers or lower level agents. To be scalable, this typically will not provide total knowledge of the identity of each receiver. Membership Management. This implements the mechanisms for members to join and leave the group, to accept/refuse new members, or to terminate the membership of existing members. Group Membership Tracking. As an optional feature, a protocol may interface with a component which tracks the identity of each receiver in a large group. If so, this feature will typically be implemented out of band, and may be implemented by an upper level protocol. This may be useful for services that require tracking of usage of the system, billing, and usage reports.
Session Advertisement. This publishes the session name/contents and the parameters needed for its reception. This function is usually performed by an upper layer protocol (e.g., [HPW99] and [HJ98]). Session Start/Stop. These functions determine the start/stop time of sender and/or receivers. In many cases this is implicit or performed by an upper level application or protocol. In some protocols, however, this is a task best performed by the transport layer due to scalability requirements. Session Configuration/Monitoring. Due to the potentially far reaching scope of a multicast session, it is particularly important for a protocol to include tools for configuring and monitoring the protocol's operation. Tree Configuration. For protocols which include hierarchical elements (such as PGM and RMTP-II), it is important to configure these elements in a way that has approximate congruence with the multicast routing topology. While tree configuration could be included as part of the session configuration tools, it is clearly better if this configuration can be made automatic. section 1.1 generally use different mechanisms to implement the protocol functional components described in section 3. This section tries to group these mechanisms in macro components that define protocol building blocks. A building block is defined as "a logical protocol component that results in explicit APIs for use by other building blocks or by the protocol client." Building blocks are generally specified in terms of the set of algorithms and packet formats that implement protocol functional components. A building block may also have API's through which it communicates to applications and/or other building blocks. Most building blocks should also have a management API, through which it communicates to SNMP and/or other management protocols. In the following section we will list a number of building blocks which, at this stage, seem to cover most of the functional components needed to implement the protocol families presented in section 1.1. Nevertheless this list represents the "best current guess", and as such it is not meant to be exhaustive. The actual building block decomposition, i.e., the division of functional components into building blocks, may also have to be revised in the future.
VRC98] or router filtering of a single layer [LVS99]. The multiple rate approach is most applicable for ALC protocols. Both approaches are still in the phase of study, however the first seems to be mature enough [HFW99] to allow the standardization process to begin. At the time of writing this document, a third class of congestion control algorithm based on router support is beginning to emerge in the IRTF RMRG [LVS99]. This work may lead to the future standardization of one or more additional building blocks for congestion control.
FLST98]. Functional components which can take advantage of router support include feedback aggregation/suppression (both for loss notification and congestion control) and constrained retransmission of repair packets. Another component that can take advantage of router support is intentional packet filtering to provide different rates of delivery of packets to different receivers from the same multicast packet stream. This could be most advantageous when combined with ALC protocols [LVS99]. The process of designing and deploying these mechanisms inside routers can be much slower than the one required for end-host protocol mechanisms. Therefore, it would be highly advantageous to define these mechanisms in a generic way that multiple protocols can use if it is available, but do not necessarily need to depend on. This component has two halves, a signaling protocol and actual router algorithms. The signaling protocol allows the transport protocol to request from the router the functions that it wishes to perform, and the router algorithms actually perform these functions. It is more urgent to define the signaling protocol, since it will likely impact the common case protocol headers. An important component of the signaling protocol is some level of commonality between the packet headers of multiple protocols, which allows the router to recognize and interpret the headers.
this building block would be to construct and manage the logical tree connecting the agents. Ideally, this building block would perform these functions in a manner that adapts to changes in session membership, routing topology, and network availability. section 3 that appear to meet the requirements for being implemented as building blocks presented in section 2. The other functions from section 3, which are not covered above, should be implemented as part of "protocol cores", specific to each protocol standardized. RFC 2357 specifically states that "reliable multicast Internet-Drafts reviewed by the Transport Area Directors must explicitly explore the security aspects of the proposed design." Specifically, RMT building block works in progress must examine the denial-of-service attacks that can be made upon building blocks and affected by building blocks upon the Internet at large. This requirement is in addition to any discussions regarding data-security, that is the manipulation of or exposure of session information to unauthorized receivers. Readers are referred to section 5.e of RFC 2357 for further details.
that will be administered by IANA. Initially, this registry will be empty, since the building blocks described in sections 4.1 to 4.3 are presented for example and design purposes. The requested IANA building block registry will be populated from specifications as they are approved for RFC publication (using the "Specification Required" policy as described in RFC 2434 [RFC2434]). A registration will consist of a building block name, a version number, a brief text description, a specification RFC number, and a responsible person, to which IANA will assign the type number. [BKKKLZ95] J. Bloemer, M. Kalfane, M. Karpinski, R. Karp, M. Luby, D. Zuckerman, "An XOR-based Erasure Resilient Coding Scheme," ICSI Technical Report No. TR-95-048, August 1995. [BLMR98] J. Byers, M. Luby, M. Mitzenmacher, A. Rege, "A Digital Fountain Approach to Reliable Distribution of Bulk Data," Proc ACM SIGCOMM 98. [FJM95] S. Floyd, V. Jacobson, S. McCanne, "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing," Proc ACM SIGCOMM 95, Aug 1995 pp. 342-356.
[FLST98] D. Farinacci, S. Lin, T. Speakman, and A. Tweedly, "PGM reliable transport protocol specification," Work in Progress. [HFW99] M. Handley, S. Floyd, B. Whetten, "Strawman Specification for TCP Friendly (Reliable) Multicast Congestion Control (TFMCC)," Work in Progress. [HJ98] Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998. [HPW99] M. Handley, C. Perkins, E. Whelan, "Session Announcement Protocol," Work in Progress, June 1999. [HW99] T. Hardjorno, B. Whetten, "Security Requirements for RMTP-II," Work in Progress, June 1999. [RFC2887] Handley, M., Whetten, B., Kermode, R., Floyd, S., Vicisano, L. and M. Luby, "The Reliable Multicast Design Space for Bulk Data Transfer", RFC 2887, August 2000. [KCW98] M. Kadansky, D. Chiu, and J. Wesley, "Tree-based reliable multicast (TRAM)," Work in Progress. [Kermode98] R. Kermode, "Scoped Hybrid Automatic Repeat Request with Forward Error Correction," Proc ACM SIGCOMM 98, Sept 1998. [LDW98] M. Lucas, B. Dempsey, A. Weaver, "MESH: Distributed Error Recovery for Multimedia Streams in Wide-Area Multicast Networks". [LESZ97] C-G. Liu, D. Estrin, S. Shenkar, L. Zhang, "Local Error Recovery in SRM: Comparison of Two Approaches," USC Technical Report 97-648, Jan 1997. [LG97] B.N. Levine, J.J. Garcua-Luna-Aceves, "Improving Internet Multicast Routing with Routing Labels," IEEE International Conference on Network Protocols (ICNP-97), Oct 28-31, 1997, p.241-250. [LP96] K. Lin and S. Paul. "RMTP: A Reliable Multicast Transport Protocol," IEEE INFOCOMM 1996, March 1996, pp. 1414-1424. [LMSSS97] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, V. Stemann, "Practical Loss-Resilient Codes", Proc ACM Symposium on Theory of Computing, 1997.
[LVS99] M. Luby, L. Vicisano, T. Speakman. "Heterogeneous multicast congestion control based on router packet filtering", RMT working group, June 1999, Pisa, Italy. [MA99] J. Macker, B. Adamson. "Multicast Dissemination Protocol version 2 (MDPv2)," Work in Progress, http://manimac.itd.nrl.navy.mil/MDP [MRBP98] Mankin, A., Romanow, A., Brander, S. and V.Paxson, "IETF Criteria for Evaluating Reliable Multicast Transport and Application Protocols", RFC 2357, June 1998. [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 2434, October 1998. [OXB99] O. Ozkasap, Z. Xiao, K. Birman. "Scalability of Two Reliable Multicast Protocols", Work in Progress, May 1999. [PSLB97] "Reliable Multicast Transport Protocol (RMTP)," S. Paul, K. K. Sabnani, J. C. Lin, and S. Bhattacharyya, IEEE Journal on Selected Areas in Communications, Vol. 15, No. 3, April 1997. [RV97] L. Rizzo, L. Vicisano, "A Reliable Multicast Data Distribution Protocol Based on Software FEC Techniques," Proc. of The Fourth IEEE Workshop on the Architecture and Implementation of High Performance Communication Systems (HPCS'97), Sani Beach, Chalkidiki, Greece June 23-25, 1997. [VRC98] L. Vicisano, L. Rizzo, J. Crowcroft, "TCP-Like Congestion Control for Layered Multicast Data Transfer", Proc. of IEEE Infocom'98, March 1998. [WBPM98] B. Whetten, M. Basavaiah, S. Paul, T. Montgomery, N. Rastogi, J. Conlan, and T. Yeh, "THE RMTP-II PROTOCOL," Work in Progress. [WHA98] D. Wallner, E. Hardler, R. Agee, "Key Management for Multicast: Issues and Architectures," Work in Progress. [Whetten99] B. Whetten, "A Proposal for Reliable Multicast Congestion Control Requirements," Work in Progress. http://www.talarian.com/rmtp-ii/overview.htm
Acknowledgement Funding for the RFC Editor function is currently provided by the Internet Society.