Internet Engineering Task Force (IETF)                      B. Ver Steeg
Request for Comments: 6285                                      A. Begen
Category: Standards Track                                          Cisco
ISSN: 2070-1721                                          T. Van Caenegem
                                                          Alcatel-Lucent
                                                                  Z. Vax
                                                    Magnum Semiconductor
                                                               June 2011

       Unicast-Based Rapid Acquisition of Multicast RTP Sessions

Abstract

When an RTP receiver joins a multicast session, it may need to acquire and parse certain Reference Information before it can process any data sent in the multicast session. Depending on the join time, length of the Reference Information repetition (or appearance) interval, size of the Reference Information, and the application and transport properties, the time lag before an RTP receiver can usefully consume the multicast data, which we refer to as the Acquisition Delay, varies and can be large. This is an undesirable phenomenon for receivers that frequently switch among different multicast sessions, such as video broadcasts.

In this document, we describe a method using the existing RTP and RTP Control Protocol (RTCP) machinery that reduces the acquisition delay. In this method, an auxiliary unicast RTP session carrying the Reference Information to the receiver precedes or accompanies the multicast stream. This unicast RTP flow can be transmitted at a faster than natural bitrate to further accelerate the acquisition. The motivating use case for this capability is multicast applications that carry real-time compressed audio and video. However, this method can also be used in other types of multicast applications where the acquisition delay is long enough to be a problem.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6285.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1. Introduction
      1.1. Acquisition of RTP Streams vs. RTP Sessions
      1.2. Outline
   2. Requirements Notation
   3. Definitions
   4. Elements of Delay in Multicast Applications
   5. Protocol Design Considerations and Their Effect on Resource Management for Rapid Acquisition
   6. Rapid Acquisition of Multicast RTP Sessions (RAMS)
      6.1. Overview
      6.2. Message Flows
      6.3. Synchronization of Primary Multicast Streams
      6.4. Burst Shaping and Congestion Control in RAMS
      6.5. Failure Cases
   7. Encoding of the Signaling Protocol in RTCP
      7.1. Extensions
         7.1.1. Vendor-Neutral Extensions
         7.1.2. Private Extensions
      7.2. RAMS Request
      7.3. RAMS Information
         7.3.1. Response Code Definitions
      7.4. RAMS Termination
   8. SDP Signaling
      8.1. Definitions
      8.2. Requirements
      8.3. Example and Discussion
   9. NAT Considerations
   10. Security Considerations
   11. IANA Considerations
      11.1. Registration of SDP Attributes
      11.2. Registration of SDP Attribute Values
      11.3. Registration of FMT Values
      11.4. SFMT Values for RAMS Messages Registry
      11.5. RAMS TLV Space Registry
      11.6. RAMS Response Code Space Registry
   12. Contributors
   13. Acknowledgments
   14. References
      14.1. Normative References
      14.2. Informative References

1. Introduction

Most multicast flows carry a stream of inter-related data. Receivers need to acquire certain information to start processing any data sent in the multicast session. This document refers to this information as Reference Information.
The Reference Information is conventionally sent periodically in the multicast session (although its content can change over time) and usually consists of items such as a description of the schema for the rest of the data, references to which data to process, encryption information including keys, and any other information required to process the data in the multicast stream [IC2009]. Real-time multicast applications require receivers to buffer data. Receivers may have to buffer data to smooth out the network jitter, to allow loss-repair methods such as Forward Error Correction and retransmission to recover the missing packets, and to satisfy the data-processing requirements of the application layer. When a receiver joins a multicast session, it has no control over what point in the flow is currently being transmitted. Sometimes the receiver might join the session right before the Reference
Information is sent in the session. In this case, the required waiting time is usually minimal. Other times, the receiver might join the session right after the Reference Information has been transmitted. In this case, the receiver has to wait for the Reference Information to appear again in the flow before it can start processing any multicast data. In some other cases, the Reference Information is not contiguous in the flow but dispersed over a large period, which forces the receiver to wait for the whole Reference Information to arrive before starting to process the rest of the data. The net effect of waiting for the Reference Information and waiting for various buffers to fill up is that receivers can experience significantly large delays in data processing. In this document, we refer to the difference between the time an RTP receiver wants to join the multicast session and the time the RTP receiver acquires all the necessary Reference Information as the Acquisition Delay. The acquisition delay might not be the same for different receivers; it usually varies depending on the join time, length of the Reference Information repetition (or appearance) interval, and size of the Reference Information, as well as the application and transport properties. The varying nature of the acquisition delay adversely affects the receivers that frequently switch among multicast sessions. While this problem equally applies to both any-source multicast (ASM) and source-specific multicast (SSM) applications, in this specification we address it for the SSM-based applications by describing a method that uses the fundamental tools offered by the existing RTP and RTCP protocols [RFC3550]. 
In this method, either the multicast source (or the distribution source in an SSM session) retains the Reference Information for a period after its transmission, or an intermediary network element (that we refer to as Retransmission Server) joins the multicast session and continuously caches the Reference Information as it is sent in the session and acts as a feedback target (see [RFC5760]) for the session. When an RTP receiver wishes to join the same multicast session, instead of simply issuing a Source Filtering Group Management Protocol (SFGMP) Join message, it sends a request to the feedback target for the session and asks for the Reference Information. The retransmission server starts a new unicast RTP (retransmission) session and sends the Reference Information to the RTP receiver over that session. If there is residual bandwidth, the retransmission server might burst the Reference Information faster than its natural rate. As soon as the receiver acquires the Reference Information, it can join the multicast session and start processing the multicast data. A simplified network diagram showing this method through an intermediary network element is depicted in Figure 1.
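As a non-normative illustration, the ordering of events described above can be sketched as follows. This is a toy model rather than the protocol encoding: the class names, method names, and log strings are all hypothetical, and the sketch assumes a retransmission server that caches only the most recent Reference Information.

```python
# Illustrative sketch only: every class, method, and string below is
# hypothetical and does not come from the specification.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RetransmissionServer:
    """Caches Reference Information seen in the multicast session."""
    cached_reference_info: Optional[bytes] = None

    def on_multicast_packet(self, is_reference_info: bool, payload: bytes) -> None:
        # Keep only the most recent Reference Information (an assumption).
        if is_reference_info:
            self.cached_reference_info = payload

    def handle_request(self) -> Optional[bytes]:
        # Acting as feedback target: answer with the cached data, or with
        # nothing if no Reference Information has been cached yet.
        return self.cached_reference_info


@dataclass
class Receiver:
    log: List[str] = field(default_factory=list)

    def acquire(self, server: RetransmissionServer) -> None:
        self.log.append("request sent to feedback target")
        burst = server.handle_request()
        if burst is not None:
            self.log.append("Reference Information received over unicast")
        # With or without the burst, the receiver ends up joining the
        # multicast session; a missing burst only costs the optimization.
        self.log.append("SFGMP Join sent")


server = RetransmissionServer()
server.on_multicast_packet(is_reference_info=True, payload=b"ref-info")
receiver = Receiver()
receiver.acquire(server)
print(receiver.log)
```

The point of the sketch is only the ordering: the request to the feedback target comes first, and the SFGMP Join follows once the Reference Information has been acquired (or the request has gone unanswered).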
This method potentially reduces the acquisition delay. We refer to this method as Unicast-Based Rapid Acquisition of Multicast RTP Sessions. A primary use case for this method is to reduce the channel-change times in IPTV networks where compressed video streams are multicast in different SSM sessions and viewers randomly join these sessions.

                               -----------------------
                         +--->|     Intermediary      |
                         |    |    Network Element    |
                         | ...|(Retransmission Server)|
                         | :   -----------------------
                         | :
                         | v
 -----------          ----------             ----------
| Multicast |        |          |---------->| Joining  |
|  Source   |------->|  Router  |..........>|   RTP    |
|           |        |          |           | Receiver |
 -----------          ----------             ----------
                          |
                          |                  ----------
                          +---------------->| Existing |
                                            |   RTP    |
                                            | Receiver |
                                             ----------

   -------> Multicast RTP Flow
   .......> Unicast RTP Flow

Figure 1: Rapid Acquisition through an Intermediary Network Element

A principal design goal in this solution is to use the existing tools in the RTP/RTCP protocol family. This improves the versatility of the existing implementations and promotes faster deployment and better interoperability. To this end, we use the unicast retransmission support of RTP [RFC4588] and the capabilities of RTCP to handle the signaling needed to accomplish the acquisition.

A reasonable effort has been made in this specification to design a solution that works reliably in both engineered and best-effort networks. However, proper congestion control is difficult to combine with the desired behavior of this solution. Instead, this solution has been designed on the assumption that the retransmission server and the RTP receivers have some knowledge about where the bottleneck between them is. This assumption does not generally hold unless both the retransmission server and the RTP receivers are in the same edge network. Thus, this solution should
not be used across any best-effort path of the Internet. Furthermore, this solution should only be used in networks that are already carrying non-congestion-responsive multicast traffic and have throttling mechanisms in the retransmission servers to ensure the (unicast) burst traffic is a known constant upper-bound multiplier on the multicast load. 1.1. Acquisition of RTP Streams vs. RTP Sessions In this memo, we describe a protocol that handles the rapid acquisition of a single multicast RTP session (called a primary multicast RTP session) carrying one or more RTP streams (called primary multicast streams). If desired, multiple instances of this protocol may be run in parallel to acquire multiple RTP sessions simultaneously. When an RTP receiver requests the Reference Information from the retransmission server, it can opt to rapidly acquire a specific subset of the available RTP streams in the primary multicast RTP session. Alternatively, the RTP receiver can request the rapid acquisition of all of the RTP streams in that RTP session. Regardless of how many RTP streams are requested by the RTP receiver or how many will be actually sent by the retransmission server, only one unicast RTP session will be established by the retransmission server. This unicast RTP session is separate from the associated primary multicast RTP session. As a result, there are always two different RTP sessions in a single instance of the rapid acquisition protocol: (i) the primary multicast RTP session with its associated unicast feedback and (ii) the unicast RTP session. If the RTP receiver wants to rapidly acquire multiple RTP sessions simultaneously, separate unicast RTP sessions will be established for each of them. 1.2. Outline The rest of this specification is as follows. Section 3 provides a list of the definitions frequently used in this document. In Section 4, we describe the delay components in generic multicast applications. 
Section 5 presents an overview of the protocol design considerations for rapid acquisition. We provide the protocol details of the rapid acquisition method in Sections 6 and 7. Sections 8 and 9 discuss the Session Description Protocol (SDP) signaling issues with examples and NAT-related issues, respectively. Finally, Section 10 discusses the security considerations, and Section 11 details the IANA considerations.
2. Requirements Notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Definitions

This document frequently uses the following acronyms and definitions:

(Primary) SSM Session: The multicast session that RTP receivers can join at a random point in time. A primary SSM session can carry multiple RTP streams.

Primary Multicast RTP Session: The multicast RTP session an RTP receiver is interested in acquiring rapidly. From the RTP receiver's viewpoint, the primary multicast RTP session has one associated unicast RTCP feedback stream to a Feedback Target, in addition to the primary multicast RTP stream(s).

Primary Multicast (RTP) Streams: The RTP stream(s) carried in the primary multicast RTP session.

Source Filtering Group Management Protocol (SFGMP): Following the definition in [RFC4604], SFGMP refers to the Internet Group Management Protocol (IGMP) version 3 [RFC3376] and the Multicast Listener Discovery Protocol (MLD) version 2 [RFC3810] in the IPv4 and IPv6 networks, respectively. However, the rapid acquisition method introduced in this document does not depend on a specific version of either of these group management protocols. In the remainder of this document, SFGMP will refer to any group management protocol that has Join and Leave functionalities.

Feedback Target (FT): Unicast RTCP feedback target as defined in [RFC5760]. FT_Ap denotes a specific feedback target running on a particular address and port.

Retransmission (or Burst) Packet: An RTP packet that is formatted as defined in Section 4 of [RFC4588]. The payload of a retransmission or burst packet comprises the retransmission payload header followed by the payload of the original RTP packet.

Reference Information: The set of certain media content and metadata information that is sufficient for an RTP receiver to start usefully consuming a media stream. The meaning, format, and size of this information are specific to the application and are out of the scope of this document.

Preamble Information: A more compact form of the whole or a subset of the Reference Information transmitted out-of-band.

(Unicast) Burst (or Retransmission) RTP Session: The unicast RTP session used to send one or more unicast burst RTP streams and their associated RTCP messages. The terms "burst RTP session" and "retransmission RTP session" can be used interchangeably.

(Unicast) Burst (Stream): A unicast stream of RTP retransmission packets that enable an RTP receiver to rapidly acquire the Reference Information associated with a primary multicast stream. Each burst stream is identified by its Synchronization Source (SSRC) identifier that is unique in the primary multicast RTP session. Following the session-multiplexing guidelines in [RFC4588], each unicast burst stream will use the same SSRC and Canonical Name (CNAME) as its primary multicast RTP stream.

Retransmission Server (RS): The RTP/RTCP endpoint that can generate the retransmission packets and the burst streams. The RS may also generate other non-retransmission packets to aid rapid acquisition.

4. Elements of Delay in Multicast Applications

In a source-specific multicast (SSM) delivery system, there are three major elements that contribute to the acquisition delay when an RTP receiver switches from one multicast session to another one. These are:

   o Multicast-switching delay

   o Reference Information latency

   o Buffering delays

Multicast-switching delay is the delay that is experienced when leaving the current multicast session (if any) and joining the new multicast session. In typical systems, the multicast join and leave operations are handled by a group management protocol. For example, the receivers and routers participating in a multicast session can use the Internet Group Management Protocol (IGMP) version 3 [RFC3376] or the Multicast Listener Discovery Protocol (MLD) version 2 [RFC3810].
In either of these protocols, when a receiver wants to join a multicast session, it sends a message to its upstream router and the routing infrastructure sets up the multicast forwarding state to deliver the packets of the multicast session to the new receiver. The join times vary depending on the proximity of the upstream router, the current state of the multicast tree, the load on the system, and the protocol implementation. Current systems provide
join latencies, usually less than 200 milliseconds (ms). If the receiver had been participating in another multicast session before joining the new session, it needs to send a Leave message to its upstream router to leave that session. In common multicast routing protocols, the leave times are usually smaller than the join times; however, the Leave and Join messages might get lost, in which case the multicast-switching delay inevitably increases.

Reference Information latency is the time it takes the receiver to acquire the Reference Information. It depends strongly on how close the receiver's join time is to the next transmission of the Reference Information in the session, on whether or not the Reference Information is sent contiguously, and on the size of the Reference Information. For some multicast flows, there is little or no interdependency in the data, in which case the Reference Information latency will be nil or negligible. For other multicast flows, there is a high degree of interdependency. One example of interest is the multicast flows that carry compressed audio/video. For these flows, the Reference Information latency can become quite large and be a major contributor to the overall delay.

The buffering component of the overall acquisition delay is driven by the way the application layer processes the payload. In many multicast applications, an unreliable transport protocol such as UDP [RFC0768] is used to transmit the data packets, and reliability, if needed, is usually addressed through other means such as Forward Error Correction (e.g., [RFC6015]) and retransmission. These loss-repair methods require buffering at the receiver side to function properly. In many applications, it is also necessary to de-jitter the incoming data packets before feeding them to the application. The de-jittering process also increases the buffering delays.
Besides these network-related buffering delays, there are also specific buffering needs that are required by the individual applications. For example, standard video decoders typically require a certain amount, sometimes up to a few seconds, of coded video data to be available in the pre-decoding buffers prior to starting to decode the video bitstream.
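The dependence of the Reference Information latency on the join time can be illustrated with a small, non-normative calculation. The sketch below assumes that the Reference Information is sent contiguously at a fixed repetition interval and that its own transmission time is negligible; it also treats the three delay elements as simply additive, which is a simplification. The function names and all numeric values are hypothetical.

```python
# Illustrative model only; the function names and all numeric values are
# assumptions, not figures from the specification. Times are in milliseconds.

def reference_info_latency(join_time_ms: int, interval_ms: int) -> int:
    """Wait from the join time until the next Reference Information.

    Assumes the Reference Information starts at times 0, interval_ms,
    2 * interval_ms, ... and is sent contiguously.
    """
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return (-join_time_ms) % interval_ms


def acquisition_delay(switching_ms: int, ref_latency_ms: int, buffering_ms: int) -> int:
    # Simplified additive model of the three elements listed above.
    return switching_ms + ref_latency_ms + buffering_ms


# Joining just after the Reference Information was sent: long wait.
late = reference_info_latency(join_time_ms=100, interval_ms=2000)
# Joining just before the next Reference Information: short wait.
early = reference_info_latency(join_time_ms=1900, interval_ms=2000)

print(late, early)
print(acquisition_delay(switching_ms=150, ref_latency_ms=late, buffering_ms=500))
```

Under these assumptions, two receivers joining the same session less than two seconds apart see very different waits, which is the variability that rapid acquisition aims to remove.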
5. Protocol Design Considerations and Their Effect on Resource Management for Rapid Acquisition

This section is for informational purposes and does not contain requirements for implementations.

Rapid acquisition is an optimization of a system that is expected to continue to work correctly and properly whether or not the optimization is effective or even fails due to lost control and feedback messages, congestion, or other problems. This is fundamental to the overall design requirements surrounding the protocol definition and to the resource management schemes to be employed together with the protocol (e.g., Quality of Service (QoS) machinery and server load management). In particular, the system needs to operate within a number of constraints:

   o First, a rapid acquisition operation must fail gracefully. The user experience must not be significantly worse for trying and failing to complete rapid acquisition compared to simply joining the multicast session.

   o Second, providing the rapid acquisition optimizations must not cause collateral damage to either the multicast session being joined or other multicast sessions sharing resources with the rapid acquisition operation. In particular, the rapid acquisition operation must avoid interference with the multicast session that might be simultaneously being received by other hosts. In addition, it must also avoid interference with other multicast and non-multicast sessions sharing the same network resources.

These properties are possible but are usually difficult to achieve.

One challenge is the existence of multiple bandwidth bottlenecks between the receiver and the server(s) in the network providing the rapid acquisition service. In commercial IPTV deployments, for example, bottlenecks are often present in the aggregation network connecting the IPTV servers to the network edge, the access links (e.g., DSL, Data Over Cable Service Interface Specification (DOCSIS)), and the home network of the subscribers.
Some of these links might serve only a single subscriber, limiting congestion impact to the traffic of only that subscriber, but others can be shared links carrying multicast sessions of many subscribers. Also note that the state of these links can vary over time. The receiver might have knowledge of a portion of this network or might have partial knowledge of the entire network. The methods employed by the devices to acquire this network state information are outside the scope of this document. The receiver should be able to signal to the server the bandwidth that it believes it can handle. The server also needs to be able to rate-limit the flow in order to stay within the
performance envelope that it knows about. Both the server and receiver need to be able to inform the other of changes in the requested and delivered rates. However, the protocol must be robust in the presence of packet loss, so this signaling must include the appropriate default behaviors.

A second challenge is that for some uses (e.g., high-bitrate video) the unicast burst bitrate is high while the flow duration of the unicast burst is short. This is because the purpose of the unicast burst is to allow the RTP receiver to join the multicast quickly and thereby limit the overall resources consumed by the burst. Such high-bitrate, short-duration flows are not amenable to conventional admission-control techniques. For example, end-to-end per-flow signaled admission-control techniques such as Resource Reservation Protocol (RSVP) have too much latency and control channel overhead to be a good fit for rapid acquisition. Similarly, using a TCP (or TCP-like) approach with a 3-way handshake and slow-start to avoid inducing congestion would defeat the purpose of attempting rapid acquisition in the first place by introducing many round-trip times (RTTs) of delay.

These observations lead to certain unavoidable requirements and goals for a rapid acquisition protocol. These are:

   o The protocol must be designed to allow a deterministic upper bound on the extra bandwidth used (compared to just joining the multicast session). A reasonable size bound is e*B, where B is the nominal bandwidth of the primary multicast streams and e is an excess-bandwidth coefficient. The total duration of the unicast burst must have a reasonable bound; long unicast bursts devolve to the bandwidth profile of multi-unicast for the whole system.

   o The scheme should minimize (or, better, eliminate) the overlap of the unicast burst and the primary multicast stream. This minimizes the window during which congestion could be induced on a bottleneck link compared to just carrying the multicast or unicast packets alone.

   o The scheme must minimize (or, better, eliminate) any gap between the unicast burst and the primary multicast stream, which has to be repaired later or, in the absence of repair, will result in loss being experienced by the application.

In addition to the above, there are some other protocol design issues to be considered. First, there is at least one RTT of "slop" in the control loop. In starting a rapid acquisition burst, this manifests as the time between the client requesting the unicast burst and the burst description and/or the first unicast burst packets arriving at the receiver. For managing and terminating the unicast burst, there are two possible approaches for the control loop. First, the receiver can adapt to the unicast burst as received, converge based on observation, and explicitly terminate the unicast burst with a second control loop exchange (which takes a minimum of one RTT, just as starting the unicast burst does). Alternatively, the server generating the unicast burst can precompute the burst parameters based on the information in the initial request and tell the receiver the burst duration.

The protocol described in the next section allows either method of controlling the rapid acquisition unicast burst.
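As a non-normative illustration of the bandwidth bound and of precomputing the burst duration, consider the sketch below. B and e are the symbols used in the bound e*B above. Everything else is an assumption made for the example: the burst is taken to run at a constant rate of (1 + e)*B without overlapping the multicast stream, and to start at a point that lies a known amount of media time (the hypothetical `backlog_s`) behind the live multicast stream. The function names are likewise hypothetical.

```python
# Illustrative sketch only; function names, the constant-rate burst, and
# the backlog model are assumptions, not part of the specification.

def max_burst_rate(nominal_bps: float, e: float) -> float:
    """Burst rate whose excess over the nominal rate B is exactly e * B."""
    if e < 0:
        raise ValueError("e must be non-negative")
    return (1.0 + e) * nominal_bps


def catch_up_time(backlog_s: float, e: float) -> float:
    """Seconds of bursting needed to reach the live multicast position.

    While the burst delivers (1 + e) seconds of media per second, the
    multicast stream advances by 1 second, so the gap shrinks by e per second.
    """
    if e <= 0:
        raise ValueError("e must be positive for the burst to catch up")
    return backlog_s / e


# Hypothetical numbers: a 4 Mbit/s stream, 25% excess bandwidth, and a burst
# that starts one second of media behind the live stream.
print(max_burst_rate(4_000_000, 0.25))  # 5000000.0
print(catch_up_time(1.0, 0.25))         # 4.0
```

Under these assumptions, a smaller e tightens the bandwidth bound but lengthens the burst, which is the tension between the first requirement above and the bound on the total burst duration.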