Network Working Group                                             J. Ott
Request for Comments: 4585             Helsinki University of Technology
Category: Standards Track                                      S. Wenger
                                                                   Nokia
                                                                 N. Sato
                                                                     Oki
                                                           C. Burmeister
                                                                  J. Rey
                                                              Matsushita
                                                               July 2006

Extended RTP Profile for Real-time Transport Control Protocol
(RTCP)-Based Feedback (RTP/AVPF)

Status of This Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract
Real-time media streams that use RTP are, to some degree, resilient
against packet losses. Receivers may use the base mechanisms of the
Real-time Transport Control Protocol (RTCP) to report packet
reception statistics and thus allow a sender to adapt its
transmission behavior in the mid-term. This is the sole means for
feedback and feedback-based error repair (besides a few
codec-specific mechanisms).

This document defines an extension to the Audio-visual Profile (AVP)
that enables receivers to provide, statistically, more immediate
feedback to the senders and thus allows for short-term adaptation and
efficient feedback-based repair mechanisms to be implemented. This
early feedback profile (AVPF) maintains the AVP bandwidth constraints
for RTCP and preserves scalability to large groups.
Table of Contents
1. Introduction
   1.1. Definitions
   1.2. Terminology
2. RTP and RTCP Packet Formats and Protocol Behavior
   2.1. RTP
   2.2. Underlying Transport Protocols
3. Rules for RTCP Feedback
   3.1. Compound RTCP Feedback Packets
   3.2. Algorithm Outline
   3.3. Modes of Operation
   3.4. Definitions and Algorithm Overview
   3.5. AVPF RTCP Scheduling Algorithm
      3.5.1. Initialization
      3.5.2. Early Feedback Transmission
      3.5.3. Regular RTCP Transmission
      3.5.4. Other Considerations
   3.6. Considerations on the Group Size
      3.6.1. ACK Mode
      3.6.2. NACK Mode
   3.7. Summary of Decision Steps
      3.7.1. General Hints
      3.7.2. Media Session Attributes
4. SDP Definitions
   4.1. Profile Identification
   4.2. RTCP Feedback Capability Attribute
   4.3. RTCP Bandwidth Modifiers
   4.4. Examples
5. Interworking and Coexistence of AVP and AVPF Entities
6. Format of RTCP Feedback Messages
   6.1. Common Packet Format for Feedback Messages
   6.2. Transport Layer Feedback Messages
      6.2.1. Generic NACK
   6.3. Payload-Specific Feedback Messages
      6.3.1. Picture Loss Indication (PLI)
      6.3.2. Slice Loss Indication (SLI)
      6.3.3. Reference Picture Selection Indication (RPSI)
   6.4. Application Layer Feedback Messages
7. Early Feedback and Congestion Control
8. Security Considerations
9. IANA Considerations
10. Acknowledgements
11. References
   11.1. Normative References
   11.2. Informative References
1. Introduction
Real-time media streams that use RTP are, to some degree, resilient
against packet losses. RTP [1] provides all the necessary mechanisms
to restore ordering and timing present at the sender to properly
reproduce a media stream at a recipient. RTP also provides
continuous feedback about the overall reception quality from all
receivers -- thereby allowing the sender(s) in the mid-term (in the
order of several seconds to minutes) to adapt their coding scheme and
transmission behavior to the observed network quality of service
(QoS). However, except for a few payload-specific mechanisms [6],
RTP makes no provision for timely feedback that would allow a sender
to repair the media stream immediately: through retransmissions,
retroactive Forward Error Correction (FEC) control, or media-specific
mechanisms for some video codecs, such as reference picture
selection.

Current mechanisms available with RTP to improve error resilience
include audio redundancy coding [13], video redundancy coding [14],
RTP-level FEC [11], and general considerations on more robust media
stream transmission [12]. These mechanisms may be applied
proactively (thereby increasing the bandwidth of a given media
stream). Alternatively, in sufficiently small groups with small
round-trip times (RTTs), the senders may perform repair on-demand,
using the above mechanisms and/or media-encoding-specific approaches.
Note that "small group" and "sufficiently small RTT" are both highly
application dependent.

This document specifies a modified RTP profile for audio and video
conferences with minimal control based upon [1] and [2] by means of
two modifications/additions:

Firstly, to achieve timely feedback, the concept of Early RTCP
messages as well as algorithms allowing for low-delay feedback in
small multicast groups (and preventing feedback implosion in large
ones) are introduced. Special consideration is given to
point-to-point scenarios.

Secondly, a small number of general-purpose feedback messages as well
as a format for codec- and application-specific feedback information
are defined for transmission in the RTCP payloads.

1.1. Definitions
The definitions from RTP/RTCP [1] and the "RTP Profile for Audio and Video Conferences with Minimal Control" [2] apply. In addition, the following definitions are used in this document:
Early RTCP mode:
The mode of operation in which a receiver of a media stream is
often (but not always) capable of reporting events of interest
back to the sender close to their occurrence. In Early RTCP mode,
RTCP packets are transmitted according to the timing rules defined
in this document.
Early RTCP packet:
An Early RTCP packet is a packet which is transmitted earlier than
would be allowed if following the scheduling algorithm of [1], the
reason being an "event" observed by a receiver. Early RTCP
packets may be sent in Immediate Feedback and in Early RTCP mode.
Sending an Early RTCP packet is also referred to as sending Early
Feedback in this document.
Event:
An observation made by the receiver of a media stream that is
(potentially) of interest to the sender -- such as a packet loss
or packet reception, frame loss, etc. -- and thus useful to be
reported back to the sender by means of a feedback message.
Feedback (FB) message:
An RTCP message as defined in this document is used to convey
information about events observed at a receiver -- in addition to
long-term receiver status information that is carried in RTCP
receiver reports (RRs) -- back to the sender of the media stream.
For the sake of clarity, feedback message is referred to as FB
message throughout this document.
Feedback (FB) threshold:
The FB threshold indicates the transition between Immediate
Feedback and Early RTCP mode. For a multiparty scenario, the FB
threshold indicates the maximum group size at which, on average,
each receiver is able to report each event back to the sender(s)
immediately, i.e., by means of an Early RTCP packet without having
to wait for its regularly scheduled RTCP interval. This threshold
is highly dependent on the type of feedback to be provided,
network QoS (e.g., packet loss probability and distribution),
codec and packetization scheme in use, the session bandwidth, and
application requirements. Note that the algorithms do not depend
on all senders and receivers agreeing on the same value for this
threshold. It is merely intended to provide conceptual guidance
to application designers and is not used in any calculations. For
the sake of clarity, the term feedback threshold is referred to as
FB threshold throughout this document.
Immediate Feedback mode:
A mode of operation in which each receiver of a media stream is,
statistically, capable of reporting each event of interest
immediately back to the media stream sender. In Immediate
Feedback mode, RTCP FB messages are transmitted according to the
timing rules defined in this document.
Media packet:
A media packet is an RTP packet.
Regular RTCP mode:
Mode of operation in which no preferred transmission of FB
messages is allowed. Instead, RTCP messages are sent following
the rules of [1]. Nevertheless, such RTCP messages may contain
feedback information as defined in this document.
Regular RTCP packet:
An RTCP packet that is not sent as an Early RTCP packet.
RTP sender:
An RTP sender is an RTP entity that transmits media packets as
well as RTCP packets and receives Regular as well as Early RTCP
(i.e., feedback) packets. Note that the RTP sender is a logical
role and that the same RTP entity may at the same time act as an
RTP receiver.
RTP receiver:
An RTP receiver is an RTP entity that receives media packets as
well as RTCP packets and transmits Regular as well as Early RTCP
(i.e., feedback) packets. Note that the RTP receiver is a logical
role and that the same RTP entity may at the same time act as an
RTP sender.
1.2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [5].
2. RTP and RTCP Packet Formats and Protocol Behavior
2.1. RTP
The rules defined in [2] also apply to this profile except for those
rules mentioned in the following:

RTCP packet types: Two additional RTCP packet types are registered
and the corresponding FB messages to convey feedback information
are defined in Section 6 of this memo.

RTCP report intervals: This document describes three modes of
operation that influence the RTCP report intervals (see Section 3.3
of this memo). In Regular RTCP mode, all rules from [1] apply
except for the recommended minimal interval of five seconds between
two RTCP reports from the same RTP entity. In both Immediate
Feedback and Early RTCP modes, the minimal interval of five seconds
between two RTCP reports is dropped and, additionally, the rules
specified in Section 3 of this memo apply if RTCP packets
containing FB messages (defined in Section 6 of this memo) are to
be transmitted.

The rules set forth in [1] may be overridden by session
descriptions specifying different parameters (e.g., for the
bandwidth share assigned to RTCP for senders and receivers,
respectively). For sessions defined using the Session Description
Protocol (SDP) [3], the rules of [4] apply.

Congestion control: The same basic rules as detailed in [2] apply.
Beyond this, in Section 7, further consideration is given to the
impact of feedback and a sender's reaction to FB messages.

2.2. Underlying Transport Protocols
RTP is intended to be used on top of unreliable transport protocols,
including UDP and the Datagram Congestion Control Protocol (DCCP).
This section briefly describes the specifics beyond plain RTP
operation introduced by RTCP feedback as specified in this memo.

UDP: UDP provides best-effort delivery of datagrams for
point-to-point as well as for multicast communications. UDP does
not support congestion control or error repair. The RTCP-based
feedback defined in this memo is able to provide minimal support
for limited error repair. As RTCP feedback is not guaranteed to
operate on sufficiently small timescales (in the order of RTT),
RTCP feedback is not suitable to support congestion control. This
memo addresses both unicast and multicast operation.
DCCP: DCCP [19] provides for congestion-controlled but unreliable
datagram flows for unicast communications. With TCP Friendly Rate
Control (TFRC)-based [20] congestion control (CCID 3), DCCP is
particularly suitable for audio and video communications. DCCP's
acknowledgement messages may provide detailed feedback reporting
about received and missed datagrams (and thus about congestion).
When running RTP over DCCP, congestion control is performed at the
DCCP layer and no additional mechanisms are required at the RTP
layer. Furthermore, an RTCP-feedback-capable sender may leverage
the more frequent DCCP-based feedback and thus a receiver may
refrain from using (additional) Generic Feedback messages where
appropriate.
3. Rules for RTCP Feedback
3.1. Compound RTCP Feedback Packets
Two components constitute RTCP-based feedback as described in this
document:
o Status reports are contained in sender report (SR)/receiver report
(RR) packets and are transmitted at regular intervals as part of
compound RTCP packets (which also include source description
(SDES) and possibly other messages); these status reports provide
an overall indication for the recent reception quality of a media
stream.
o FB messages as defined in this document that indicate loss or
reception of particular pieces of a media stream (or provide some
other form of rather immediate feedback on the data received).
Rules for the transmission of FB messages are newly introduced in
this document.
RTCP FB messages are just another RTCP packet type (see Section 6).
Therefore, multiple FB messages MAY be combined in a single compound
RTCP packet and they MAY also be sent combined with other RTCP
packets.
Compound RTCP packets containing FB messages as defined in this
document MUST contain RTCP packets in the order defined in [1]:
o OPTIONAL encryption prefix that MUST be present if the RTCP
packet(s) is to be encrypted according to Section 9.1 of [1].
o MANDATORY SR or RR.
o MANDATORY SDES, which MUST contain the CNAME item; all other SDES
items are OPTIONAL.
o One or more FB messages.
The FB message(s) MUST be placed in the compound packet after RR and
SDES RTCP packets defined in [1]. The ordering with respect to other
RTCP extensions is not defined.
Two types of compound RTCP packets carrying feedback packets are used
in this document:
a) Minimal compound RTCP feedback packet
A minimal compound RTCP feedback packet MUST contain only the
mandatory information as listed above: encryption prefix if
necessary, exactly one RR or SR, exactly one SDES with only the
CNAME item present, and the FB message(s). This is to minimize
the size of the RTCP packet transmitted to convey feedback and
thus to maximize the frequency at which feedback can be provided
while still adhering to the RTCP bandwidth limitations.
This packet format SHOULD be used whenever an RTCP FB message is
sent as part of an Early RTCP packet. This packet type is
referred to as minimal compound RTCP packet in this document.
b) (Full) compound RTCP feedback packet
A (full) compound RTCP feedback packet MAY contain any additional
number of RTCP packets (additional RRs, further SDES items, etc.).
The above ordering rules MUST be adhered to.
This packet format MUST be used whenever an RTCP FB message is
sent as part of a Regular RTCP packet or in Regular RTCP mode. It
MAY also be used to send RTCP FB messages in Immediate Feedback or
Early RTCP mode. This packet type is referred to as full compound
RTCP packet in this document.
RTCP packets that do not contain FB messages are referred to as non-
FB RTCP packets. Such packets MUST follow the format rules in [1].
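The ordering rules above can be illustrated with a minimal sketch. It
assumes a compound packet has already been parsed into a list of RTCP
packet type values; the function names check_fb_compound and
is_minimal are hypothetical and not part of this memo. The packet
type values 200 (SR), 201 (RR), and 202 (SDES) are those of [1], and
205 and 206 are the two feedback packet types registered by this
memo. The optional encryption prefix is not an RTCP packet and is
left out, and whether the SDES carries only the CNAME item cannot be
derived from packet types alone.

```python
from typing import List, Optional

# RTCP packet type values: SR, RR and SDES come from RTP [1];
# 205 (RTPFB) and 206 (PSFB) are the two types registered by this memo.
SR, RR, SDES, RTPFB, PSFB = 200, 201, 202, 205, 206
FB_TYPES = {RTPFB, PSFB}


def check_fb_compound(packet_types: List[int],
                      sdes_has_cname: bool = True) -> Optional[str]:
    """Return None if the ordering rules hold, else a short reason."""
    if not packet_types or packet_types[0] not in (SR, RR):
        return "compound packet must start with SR or RR"
    if SDES not in packet_types:
        return "SDES is mandatory"
    if not sdes_has_cname:
        return "SDES must contain the CNAME item"
    fb_positions = [i for i, pt in enumerate(packet_types) if pt in FB_TYPES]
    if not fb_positions:
        return "no FB message present"
    # Index of the last SR, RR, or SDES packet in the compound packet.
    last_report = max(i for i, pt in enumerate(packet_types)
                      if pt in (SR, RR, SDES))
    if fb_positions[0] < last_report:
        return "FB messages must follow the RR/SR and SDES packets"
    return None


def is_minimal(packet_types: List[int]) -> bool:
    """True if the packet is one SR/RR, one SDES, then only FB messages."""
    return (check_fb_compound(packet_types) is None
            and len(packet_types) >= 3
            and packet_types[1] == SDES
            and all(pt in FB_TYPES for pt in packet_types[2:]))
```

For example, [RR, SDES, RTPFB] passes both checks, while
[RR, RR, SDES, PSFB] is a valid full compound packet but not a
minimal one. Other RTCP packet types are ignored by the check, since
their ordering relative to FB messages is not defined.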
3.2. Algorithm Outline
FB messages are part of the RTCP control streams and thus subject to
the RTCP bandwidth constraints. This means, in particular, that it
may not be possible to report an event observed at a receiver
immediately back to the sender. However, the value of feedback
given to a sender typically decreases over time -- in terms of the
media quality as perceived by the user at the receiving end and/or
the cost required to achieve media stream repair.

RTP [1] and the commonly used RTP profile [2] specify rules for when
compound RTCP packets should be sent. This document modifies those
rules in order to allow applications to report events in a timely
manner (e.g., loss or reception of RTP packets) and to accommodate
algorithms that use FB messages.

The modified RTCP transmission algorithm can be outlined as follows:
As long as no FB messages have to be conveyed, compound RTCP packets
are sent following the rules of RTP [1] -- except that the
five-second minimum interval between RTCP reports is not enforced.
Hence, the interval between RTCP reports is only derived from the
average RTCP packet size and the RTCP bandwidth share available to
the RTP/RTCP entity. Optionally, a minimum interval between Regular
RTCP packets may be enforced.

If a receiver detects the need to send an FB message, it may do so
earlier than the next regular RTCP reporting interval (for which it
would be scheduled following the above regular RTCP algorithm).
Feedback suppression is used to avoid feedback implosion in
multiparty sessions: The receiver waits for a (short) random
dithering interval to check whether it sees a corresponding FB
message from any other receiver reporting the same event. Note that
for point-to-point sessions there is no such delay. If a
corresponding FB message from another member is received, this
receiver refrains from sending the FB message and continues to follow
the Regular RTCP transmission schedule. In case the receiver has not
yet seen a corresponding FB message from any other member, it checks
whether it is allowed to send Early feedback. If sending Early
feedback is permissible, the receiver sends the FB message as part of
a minimal compound RTCP packet. The permission to send Early
feedback depends on the type of the previous RTCP packet sent by this
receiver and the time the previous Early feedback message was sent.

FB messages may also be sent as part of full compound RTCP packets,
which are transmitted as per [1] (except for the five-second lower
bound) in regular intervals.

3.3. Modes of Operation
RTCP-based feedback may operate in one of three modes (Figure 1) as described below. The mode of operation is just an indication of whether or not the receiver will, on average, be able to report all events to the sender in a timely fashion; the mode does not influence the algorithm used for scheduling the transmission of FB messages.
And, depending on the reception quality and the locally monitored
state of the RTP session, individual receivers may not (and do not
have to) agree on a common perception on the current mode of
operation.
a) Immediate Feedback mode: In this mode, the group size is below the
FB threshold, which gives each receiving party sufficient
bandwidth to transmit the RTCP feedback packets for the intended
purpose. This means that, for each receiver, there is enough
bandwidth to report each event by means of a virtually "immediate"
RTCP feedback packet.
The group size threshold is a function of a number of parameters
including (but not necessarily limited to): the type of feedback
used (e.g., ACK vs. NACK), bandwidth, packet rate, packet loss
probability and distribution, media type, codec, and the (worst
case or observed) frequency of events to report (e.g., frame
received, packet lost).
As a rough estimate, let N be the average number of events to be
reported per interval T by a receiver, B the RTCP bandwidth
fraction for this particular receiver, and R the average RTCP
packet size, then the receiver operates in Immediate Feedback mode
as long as N<=B*T/R.
b) Early RTCP mode: In this mode, the group size and other parameters
no longer allow each receiver to react to each event that would be
worth reporting (or that needed reporting). But feedback can
still be given sufficiently often so that it allows the sender to
adapt the media stream transmission accordingly and thereby
increase the overall media playback quality.
Using the above notation, Early RTCP mode can be roughly
characterized by N > B*T/R as "lower bound". An estimate for an
upper bound is more difficult. Setting N=1, we obtain for a given
R and B the interval T = R/B as average interval between events to
be reported. This information can be used as a hint to determine
whether or not early transmission of RTCP packets is useful.
c) Regular RTCP Mode: From some group size upwards, it is no longer
useful to provide feedback for individual events from receivers at
all -- because of the time scale in which the feedback could be
provided and/or because in large groups the sender(s) have no
chance to react to individual feedback anymore.
No precise group size threshold can be specified at which this
mode starts but, obviously, this boundary matches the upper bound
of the Early RTCP mode as specified in item b) above.
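The rough estimates in items a) and b) can be written down as a small
sketch. The function names are hypothetical, and the units are an
assumption made here for illustration: B in bytes per second, R in
bytes, and T in seconds. As noted in the definitions, the FB
threshold is conceptual guidance only and is not used by the
scheduling algorithm itself.

```python
def immediate_feedback_possible(n_events: float, rtcp_bw: float,
                                interval: float, avg_pkt_size: float) -> bool:
    """Rough test N <= B*T/R for Immediate Feedback mode.

    n_events     -- N, average number of events to report per interval T
    rtcp_bw      -- B, RTCP bandwidth share of this receiver (bytes/s, assumed)
    interval     -- T, observation interval (seconds, assumed)
    avg_pkt_size -- R, average RTCP packet size (bytes, assumed)
    """
    return n_events <= rtcp_bw * interval / avg_pkt_size


def min_event_interval(avg_pkt_size: float, rtcp_bw: float) -> float:
    """T = R/B: average interval between reportable events for N = 1."""
    return avg_pkt_size / rtcp_bw
```

With B = 500 bytes/s, R = 100 bytes, and T = 1 s, a receiver can
report up to five events per second on average: three events per
second satisfy the estimate, eight do not, and the hint interval
T = R/B is 0.2 seconds.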
As the feedback algorithm described in this document scales smoothly,
there is no need for an agreement among the participants on the
precise values of the respective FB thresholds within the group.
Hence, the borders between all these modes are soft.

                      ACK
                    feedback
                      V
         :<- - - -  NACK feedback - - - ->//
         :
         :   Immediate   ||
         : Feedback mode ||Early RTCP mode   Regular RTCP mode
         :<=============>||<=============>//<=================>
         :               ||
        -+---------------||---------------//------------------> group size
         2               ||
          Application-specific FB Threshold
             = f(data rate, packet loss, codec, ...)

                    Figure 1: Modes of operation

As stated before, the respective FB thresholds depend on a number of
technical parameters (of the codec, the transport, the type of
feedback used, etc.) but also on the respective application
scenarios. Section 3.6 provides some useful hints (but no precise
calculations) on estimating these thresholds.

3.4. Definitions and Algorithm Overview
The following pieces of state information need to be maintained per
receiver (largely taken from [1]). Note that all variables (except
in item h) below) are calculated independently at each receiver.
Therefore, their local values may differ at any given point in time.

a) Let "senders" be the number of active senders in the RTP session.
b) Let "members" be the current estimate of the number of receivers
in the RTP session.
c) Let tn and tp be the time for the next (last) scheduled RTCP RR
transmission calculated prior to timer reconsideration.
d) Let Tmin be the minimal interval between RTCP packets as per [1].
Unlike in [1], the initial Tmin is set to 1 second to allow for
some group size sampling before sending the first RTCP packet.
After the first RTCP packet is sent, Tmin is set to 0.
e) Let T_rr be the interval after which, having just sent a regularly
scheduled RTCP packet, a receiver would schedule the transmission
of its next Regular RTCP packet. This value is obtained following
the rules of [1] but with Tmin as defined in this document: T_rr =
T (the "calculated interval" as defined in [1]) with tn = tp + T.
T_rr always refers to the last value of T that has been computed
(because of reconsideration or to determine tn). T_rr is also
referred to as Regular RTCP interval in this document.
f) Let t0 be the time at which an event that is to be reported is
detected by a receiver.
g) Let T_dither_max be the maximum interval for which an RTCP
feedback packet MAY be additionally delayed to prevent implosions
in multiparty sessions; the value for T_dither_max is dynamically
calculated based upon T_rr (or may be derived by means of another
mechanism common across all RTP receivers to be specified in the
future). For point-to-point sessions (i.e., sessions with exactly
two members with no change in the group size expected, e.g.,
unicast streaming sessions), T_dither_max is set to 0.
h) Let T_max_fb_delay be the upper bound within which feedback to an
event needs to be reported back to the sender to be useful at all.
This value is application specific, and no values are defined in
this document.
i) Let te be the time for which a feedback packet is scheduled.
j) Let T_fd be the actual (randomized) delay for the transmission of
FB message in response to an event at time t0.
k) Let allow_early be a Boolean variable that indicates whether the
receiver currently may transmit FB messages prior to its next
regularly scheduled RTCP interval tn. This variable is used to
throttle the feedback sent by a single receiver. allow_early is
set to FALSE after Early feedback transmission and is set to TRUE
as soon as the next Regular RTCP transmission takes place.
l) Let avg_rtcp_size be the moving average on the RTCP packet size as
defined in [1].
m) Let T_rr_interval be an OPTIONAL minimal interval to be used
between Regular RTCP packets. If T_rr_interval == 0, then this
variable does not have any impact on the overall operation of the
RTCP feedback algorithm. If T_rr_interval != 0, then the next
Regular RTCP packet will not be scheduled T_rr after the last
Regular RTCP transmission (i.e., at tp+T_rr). Instead, the next
Regular RTCP packet will be delayed until at least T_rr_interval
after the last Regular RTCP transmission, i.e., it will be
scheduled at or later than tp+T_rr_interval. Note that
T_rr_interval does not affect the calculation of T_rr and tp;
instead, Regular RTCP packets scheduled for transmission before
tp+T_rr_interval will be suppressed if, for example, they do not
contain any FB messages. The T_rr_interval does not affect
transmission scheduling of Early RTCP packets.
Note: Providing T_rr_interval as an independent variable is meant
to minimize Regular RTCP feedback (and thus bandwidth consumption)
as needed by the application while additionally allowing the use
of more frequent Early RTCP packets to provide timely feedback.
This goal could not be achieved by reducing the overall RTCP
bandwidth as RTCP bandwidth reduction would also impact the
frequency of Early feedback.
n) Let t_rr_last be the point in time at which the last Regular RTCP
packet has been scheduled and sent, i.e., has not been suppressed
due to T_rr_interval.
o) Let T_retention be the time window for which past FB messages are
stored by an AVPF entity. This is to ensure that feedback
suppression also works for entities that have received FB messages
from other entities prior to noticing the feedback event itself.
T_retention MUST be set to at least 2 seconds.
p) Let M*Td be the timeout value for a receiver to be considered
inactive (as defined in [1]).
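As a hedged illustration, the per-receiver state listed above could
be grouped as follows. The class name AvpfState, the field names,
and the three helper methods are hypothetical; the defaults follow
items d), k), n), and o) and assume a multiparty session. The
handling of allow_early mirrors only the summary in item k), not the
full scheduling rules of Section 3.5.

```python
import math
from dataclasses import dataclass


@dataclass
class AvpfState:
    """Per-receiver state, one field per item in the list above (sketch)."""
    senders: int = 0              # a) active senders
    members: int = 2              # b) estimated number of receivers
    tp: float = 0.0               # c) last scheduled Regular RTCP time
    tn: float = 0.0               # c) next scheduled Regular RTCP time
    t_min: float = 1.0            # d) initial Tmin for a multiparty session
    t_rr: float = 0.0             # e) Regular RTCP interval
    t_rr_interval: float = 0.0    # m) 0 means "no effect"
    t_rr_last: float = math.nan   # n) no Regular RTCP packet sent yet
    t_retention: float = 2.0      # o) retention window for past FB messages
    allow_early: bool = True      # k) Early feedback currently permitted
    avg_rtcp_size: float = 0.0    # l) moving average of RTCP packet size

    def __post_init__(self) -> None:
        if self.t_retention < 2.0:
            raise ValueError("T_retention MUST be at least 2 seconds")

    def on_first_rtcp_sent(self) -> None:
        # Item d): Tmin drops to 0 once the first RTCP packet is sent.
        self.t_min = 0.0

    def on_early_sent(self) -> None:
        # Item k): throttle further Early feedback.
        self.allow_early = False

    def on_regular_sent(self, now: float) -> None:
        # Items k) and n): re-enable Early feedback, record the send time.
        self.allow_early = True
        self.t_rr_last = now
```

A freshly created AvpfState permits Early feedback and has an invalid
t_rr_last; calling on_early_sent() blocks further Early feedback
until on_regular_sent() is called.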
The feedback situation for an event to report at a receiver is
depicted in Figure 2 below. At time t0, such an event (e.g., a
packet loss) is detected at the receiver. The receiver decides --
based upon current bandwidth, group size, and other application-
specific parameters -- that an FB message needs to be sent back to
the sender.
To avoid an implosion of feedback packets in multiparty sessions, the
receiver MUST delay the transmission of the RTCP feedback packet by a
random amount of time T_fd (with the random number evenly distributed
in the interval [0, T_dither_max]). Transmission of the compound
RTCP packet MUST then be scheduled for te = t0 + T_fd.
The T_dither_max parameter is derived from the Regular RTCP interval,
T_rr, which, in turn, is based upon the group size. A future
document may also specify other calculations for T_dither_max (e.g.,
based upon RTT) if it can be assured that all RTP receivers will use
the same mechanism for calculating T_dither_max.
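The dithering rule can be sketched as follows. The function names
dither_max and schedule_feedback are hypothetical. The factor
l = 0.5 for multiparty sessions and the value 0 for point-to-point
sessions are the minimal values given in Section 3.5.2; passing a
seeded random.Random instance is merely an assumption made here to
keep the sketch reproducible.

```python
import random
from typing import Optional


def dither_max(t_rr: float, point_to_point: bool, l: float = 0.5) -> float:
    """T_dither_max: 0 for point-to-point sessions, l * T_rr otherwise."""
    return 0.0 if point_to_point else l * t_rr


def schedule_feedback(t0: float, t_dither_max: float,
                      rng: Optional[random.Random] = None) -> float:
    """Return te = t0 + T_fd with T_fd uniform in [0, T_dither_max]."""
    rng = rng if rng is not None else random.Random()
    t_fd = rng.uniform(0.0, t_dither_max)
    return t0 + t_fd
```

For a point-to-point session, T_dither_max is 0 and te equals t0, so
no additional delay is introduced. For a multiparty session with
T_rr = 4 seconds, te falls somewhere in [t0, t0 + 2].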
For a certain application scenario, a receiver may determine an upper
bound for the acceptable local delay of FB messages: T_max_fb_delay.
If an a priori estimation or the actual calculation of T_dither_max
indicates that this upper bound MAY be violated (e.g., because
T_dither_max > T_max_fb_delay), the receiver MAY decide not to send
any feedback at all because the achievable gain is considered
insufficient.

If an Early RTCP packet is scheduled, the time slot for the next
Regular RTCP packet MUST be updated accordingly to have a new tn
(tn=tp+2*T_rr) and a new tp (tp=tp+T_rr) afterwards. This is to
ensure that the short-term average RTCP bandwidth used with Early
feedback does not exceed the bandwidth used without Early feedback.

               event to
               report
               detected
                  |
                  |  RTCP feedback range
                  |   (T_max_fb_delay)
                  vXXXXXXXXXXXXXXXXXXXXXXXXXXX      ) )
     |---+--------+-------------+-----+------------|    |--------+--->
         |        |             |     |             ( (          |
         |        t0            te                               |
         tp                                                      tn
                   \_______  ________/
                           \/
                      T_dither_max

   Figure 2: Event report and parameters for Early RTCP scheduling

3.5. AVPF RTCP Scheduling Algorithm
Let S0 be an active sender (out of S senders) and let N be the number of receivers with R being one of these receivers. Assume that R has verified that using feedback mechanisms is reasonable at the current constellation (which is highly application specific and hence not specified in this document). Assume further that T_rr_interval is 0, if no minimal interval between Regular RTCP packets is to be enforced, or T_rr_interval is set to some meaningful value, as given by the application. This value then denotes the minimal interval between Regular RTCP packets. With this, a receiver R MUST use the following rules for transmitting one or more FB messages as minimal or full compound RTCP packet.
3.5.1. Initialization
Initially, R MUST set allow_early = TRUE and t_rr_last = NaN
(Not-a-Number, i.e., some invalid value that can be distinguished
from a valid time).

Furthermore, the initialization of the RTCP variables as per [1]
applies except for the initial value for Tmin. For a point-to-point
session, the initial Tmin is set to 0. For a multiparty session,
Tmin is initialized to 1.0 seconds.

3.5.2. Early Feedback Transmission
Assume that R had scheduled the last Regular RTCP RR packet for
transmission at tp (and sent or suppressed this packet at tp) and has
scheduled the next transmission (including possible reconsideration
as per [1]) for tn = tp + T_rr. Assume also that the last Regular
RTCP packet transmission has occurred at t_rr_last.

The Early Feedback algorithm then comprises the following steps:

1. At time t0, R detects the need to transmit one or more FB
messages, e.g., because media "units" need to be ACKed or NACKed,
and finds that providing the feedback information is potentially
useful for the sender.
2. R first checks whether there is already a compound RTCP packet
containing one or more FB messages scheduled for transmission
(either as Early or as Regular RTCP packet).
2a) If so, the new FB message MUST be included in the scheduled
packet; the scheduling of the waiting compound RTCP packet MUST
remain unchanged. When doing so, the available feedback
information SHOULD be merged to produce as few FB messages as
possible. This completes the course of immediate actions to be
taken.
2b) If no compound RTCP packet is already scheduled for
transmission, a new (minimal or full) compound RTCP packet MUST
be created and the minimal interval for T_dither_max MUST be
chosen as follows:
i) If the session is a point-to-point session, then
T_dither_max = 0.
ii) If the session is a multiparty session, then
T_dither_max = l * T_rr
with l=0.5.
The value for T_dither_max MAY be calculated differently
(e.g., based upon RTT), which MUST then be specified in a
future document. Such a future specification MUST ensure that
all RTP receivers use the same mechanism to calculate
T_dither_max.
The values given above for T_dither_max are minimal values.
Application-specific feedback considerations may make it
worthwhile to increase T_dither_max beyond this value. This
is up to the discretion of the implementer.
3. Then, R MUST check whether its next Regular RTCP packet would be
within the time bounds for the Early RTCP packet triggered at t0,
i.e., if t0 + T_dither_max > tn.
3a) If so, an Early RTCP packet MUST NOT be scheduled; instead,
the FB message(s) MUST be stored to be included in the Regular
RTCP packet scheduled for tn. This completes the course of
immediate actions to be taken.
3b) Otherwise, the following steps are carried out.
4. R MUST check whether it is allowed to transmit an Early RTCP
packet, i.e., allow_early == TRUE, or not.
4a) If allow_early == FALSE, then R MUST check the time for the
next scheduled Regular RTCP packet:
1. If tn - t0 < T_max_fb_delay, then the feedback could still
be useful for the sender, despite the late reporting.
Hence, R MAY create an RTCP FB message to be included in
the Regular RTCP packet for transmission at tn.
2. Otherwise, R MUST discard the RTCP FB message.
This completes the immediate course of actions to be taken.
4b) If allow_early == TRUE, then R MUST schedule an Early RTCP
packet for te = t0 + RND * T_dither_max with RND being a
pseudo random function evenly distributed between 0 and 1.
5. R MUST detect overlaps in FB messages received from other members
of the RTP session and the FB messages R wants to send.
Therefore, while a member of the RTP session, R MUST continuously
monitor the arrival of (minimal) compound RTCP packets and store
each FB message contained in these RTCP packets for at least
T_retention. When scheduling the transmission of its own FB
message following steps 1 through 4 above, R MUST check each of
the stored and newly received FB messages from the RTCP packets
received during the interval [t0 - T_retention ; te] and act as
follows:
5a) If R understands the received FB message's semantics and the
message content is a superset of the feedback R wanted to
send, then R MUST discard its own FB message and MUST
re-schedule the next Regular RTCP packet transmission for tn
(as calculated before).
5b) If R understands the received FB message's semantics and the
message content is not a superset of the feedback R wanted to
send, then R SHOULD transmit its own FB message as scheduled.
If there is an overlap between the feedback information to
send and the feedback information received, the amount of
feedback transmitted is up to R: R MAY leave its feedback
information to be sent unchanged, or R MAY eliminate any
redundancy between its own feedback and the feedback received
so far from other session members.
5c) If R does not understand the received FB message's semantics,
R MAY keep its own FB message scheduled as an Early RTCP
packet, or R MAY re-schedule the next Regular RTCP packet
transmission for tn (as calculated before) and MAY append the
FB message to the now regularly scheduled RTCP message.
Note: With 5c), receiving unknown FB messages may not lead to
feedback suppression at a particular receiver. As a
consequence, a given event may cause M different types of FB
messages (which are all appropriate but not mutually
understood) to be scheduled, so that a "large" receiver group
may effectively be partitioned into at most M groups. Among
members of each of these M groups, feedback suppression will
occur following 5a and 5b but no suppression will happen
across groups. As a result, O(M) RTCP FB messages may be
received by the sender. Hence, there is a chance for a very
limited feedback implosion. However, as sender(s) and all
receivers make up the same application using the same (set of)
codecs in the same RTP session, only little divergence in
semantics for FB messages can safely be assumed and,
therefore, M is assumed to be small in the general case.
Given further that the O(M) FB messages are randomly
distributed over a time interval of T_dither_max, we find that
the resulting limited number of extra compound RTCP packets
(a) is assumed not to overwhelm the sender and (b) should be
conveyed as all contain complementary pieces of information.
6. If R's FB message(s) was not suppressed by other receiver FB
messages as per 5, when te is reached, R MUST transmit the
(minimal) compound RTCP packet containing its FB message(s). R
then MUST set allow_early = FALSE, MUST recalculate tn = tp +
2*T_rr, and MUST set tp to the previous tn. As soon as the newly
calculated tn is reached, regardless of whether R sends its next
Regular RTCP packet or suppresses it because of T_rr_interval, it
MUST set allow_early = TRUE again.
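The following non-normative sketch illustrates the decisions of steps
2 through 4 and the bookkeeping of step 6. It is only one possible
reading of the steps above, under stated assumptions: the class and
function names (ReceiverState, t_dither_max, on_feedback_event,
on_early_sent) and the fb_scheduled flag are hypothetical, the
optional behavior in step 4a is resolved by always keeping the FB
message for the Regular RTCP packet, and the suppression logic of
step 5 is omitted.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReceiverState:
    """Hypothetical scheduling state of receiver R (times in seconds)."""
    point_to_point: bool
    t_rr: float                  # current Regular RTCP interval T_rr
    tp: float                    # time of the last scheduled Regular RTCP packet
    tn: float                    # time of the next scheduled Regular RTCP packet
    t_max_fb_delay: float        # application-specific T_max_fb_delay
    allow_early: bool = True     # initial value as per Section 3.5.1
    te: Optional[float] = None   # time of a pending Early RTCP packet, if any
    fb_scheduled: bool = False   # assumption: a packet with FB is already scheduled


def t_dither_max(state: ReceiverState, l: float = 0.5) -> float:
    """Step 2b: minimal T_dither_max (0 if point-to-point, else l * T_rr)."""
    return 0.0 if state.point_to_point else l * state.t_rr


def on_feedback_event(state: ReceiverState, t0: float,
                      rnd: Callable[[], float] = random.random) -> str:
    """Steps 2-4: decide how a FB message detected at t0 is handled."""
    if state.fb_scheduled:
        return "merge"                      # 2a: add to the waiting packet
    d = t_dither_max(state)
    if t0 + d > state.tn:
        state.fb_scheduled = True
        return "regular"                    # 3a: wait for the packet at tn
    if not state.allow_early:
        if state.tn - t0 < state.t_max_fb_delay:
            state.fb_scheduled = True
            return "regular"                # 4a.1: still useful at tn
        return "discard"                    # 4a.2: too late to be useful
    state.te = t0 + rnd() * d               # 4b: dithered Early RTCP packet
    state.fb_scheduled = True
    return "early"


def on_early_sent(state: ReceiverState) -> None:
    """Step 6: bookkeeping after the Early RTCP packet was sent at te."""
    previous_tn = state.tn
    state.allow_early = False
    state.tn = state.tp + 2 * state.t_rr    # skip one Regular transmission
    state.tp = previous_tn
    state.te = None
    state.fb_scheduled = False
```

A caller would reset fb_scheduled after each Regular RTCP packet and
set allow_early back to TRUE once the recalculated tn is reached; both
are left out here to keep the sketch short.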
3.5.3. Regular RTCP Transmission
Full compound RTCP packets MUST be sent in regular intervals. These
packets MAY also contain one or more FB messages. Transmission of
Regular RTCP packets is scheduled as follows:
If T_rr_interval == 0, then the transmission MUST follow the rules as
specified in Sections 3.2 and 3.4 of this document and MUST adhere to
the adjustments of tn specified in Section 3.5.2 (i.e., skip one
regular transmission if an Early RTCP packet transmission has
occurred). Timer reconsideration takes place when tn is reached as
per [1]. The Regular RTCP packet is transmitted after timer
reconsideration. Whenever a Regular RTCP packet is sent or
suppressed, allow_early MUST be set to TRUE and tp, tn MUST be
updated as per [1]. After the first transmission of a Regular RTCP
packet, Tmin MUST be set to 0.
If T_rr_interval != 0, then the calculation for the transmission
times MUST follow the rules as specified in Sections 3.2 and 3.4 of
this document and MUST adhere to the adjustments of tn specified in
Section 3.5.2 (i.e., skip one regular transmission if an Early RTCP
transmission has occurred). Timer reconsideration takes place when
tn is reached as per [1]. After timer reconsideration, the following
actions are taken:
1. If no Regular RTCP packet has been sent before (i.e., if t_rr_last
== NaN), then a Regular RTCP packet MUST be scheduled. Stored FB
messages MAY be included in the Regular RTCP packet. After the
scheduled packet has been sent, t_rr_last MUST be set to tn. Tmin
MUST be set to 0.
2. Otherwise, a temporary value T_rr_current_interval is calculated
as follows:
T_rr_current_interval = RND*T_rr_interval
with RND being a pseudo random function evenly distributed between
0.5 and 1.5. This dithered value is used to determine one of the
following alternatives:
2a) If t_rr_last + T_rr_current_interval <= tn, then a Regular
RTCP packet MUST be scheduled. Stored RTCP FB messages MAY be
included in the Regular RTCP packet. After the scheduled
packet has been sent, t_rr_last MUST be set to tn.
2b) If t_rr_last + T_rr_current_interval > tn and RTCP FB messages
have been stored and are awaiting transmission, an RTCP packet
MUST be scheduled for transmission at tn. This RTCP packet
MAY be a minimal or a Regular RTCP packet (at the discretion
of the implementer), and the compound RTCP packet MUST include
the stored RTCP FB message(s). t_rr_last MUST remain
unchanged.
2c) Otherwise (if t_rr_last + T_rr_current_interval > tn but no
stored RTCP FB messages are awaiting transmission), the
compound RTCP packet MUST be suppressed (i.e., it MUST NOT be
scheduled). t_rr_last MUST remain unchanged.
In all four cases above (1, 2a, 2b, and 2c), allow_early MUST be
set to TRUE (possibly after sending the Regular RTCP packet) and tp
and tn MUST be updated following the rules of [1] except for the
five-second minimum.
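As a non-normative illustration, the four cases for T_rr_interval != 0
can be sketched as a single decision function that is evaluated after
timer reconsideration at tn. The function name, the action strings,
and the use of None in place of NaN are assumptions made for this
sketch only.

```python
import random
from typing import Callable, Optional, Tuple


def regular_rtcp_decision(
    tn: float,
    t_rr_last: Optional[float],
    t_rr_interval: float,
    fb_stored: bool,
    rnd: Callable[[], float] = random.random,
) -> Tuple[str, Optional[float]]:
    """Return (action, t_rr_last to use once the action has been carried out).

    t_rr_last is None where the text uses NaN (no Regular RTCP packet
    has been sent yet). rnd() is assumed to be uniform in [0, 1).
    """
    if t_rr_last is None:
        return "send_regular", tn               # case 1
    # RND evenly distributed between 0.5 and 1.5
    t_rr_current_interval = (0.5 + rnd()) * t_rr_interval
    if t_rr_last + t_rr_current_interval <= tn:
        return "send_regular", tn               # case 2a
    if fb_stored:
        return "send_with_fb", t_rr_last        # case 2b: t_rr_last unchanged
    return "suppress", t_rr_last                # case 2c: t_rr_last unchanged
```

The caller remains responsible for setting allow_early to TRUE, for
updating tp and tn, and for setting Tmin to 0 after case 1; none of
this is modeled in the sketch.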
3.5.4. Other Considerations
If T_rr_interval != 0, then the timeout calculation for RTP/AVPF
entities (Section 6.3.5 of [1]) MUST be modified to use T_rr_interval
instead of Tmin for computing Td and thus M*Td for timing out RTP
entities.
Whenever a compound RTCP packet is sent or received -- minimal or
full compound, Early or Regular -- the avg_rtcp_size variable MUST be
updated accordingly (see [1]) and subsequent computations of tn MUST
use the new avg_rtcp_size.
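For illustration only, the update of avg_rtcp_size can be sketched as
the exponentially weighted average used by the RTCP interval
algorithm of [1]. The weights 1/16 and 15/16 are taken from that
algorithm and should be checked against [1]; the function name is
hypothetical.

```python
def update_avg_rtcp_size(avg_rtcp_size: float, packet_size: float) -> float:
    """Fold the size of one sent or received compound RTCP packet into the
    running average (weights assumed from the interval algorithm of [1])."""
    return packet_size / 16.0 + 15.0 * avg_rtcp_size / 16.0


# A 96-byte minimal compound packet pulls an average of 160 bytes down.
print(update_avg_rtcp_size(160.0, 96.0))  # prints 156.0
```

Because Early and minimal compound packets tend to be small, counting
them lowers avg_rtcp_size and thereby shortens subsequently computed
intervals.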
3.6. Considerations on the Group Size
This section provides some guidelines to the group sizes at which the
various feedback modes may be used.

3.6.1. ACK Mode
The RTP session MUST have exactly two members and this group size
MUST NOT grow, i.e., it MUST be point-to-point communications.
Unicast addresses SHOULD be used in the session description.

For unidirectional as well as bi-directional communication between
two parties, 2.5% of the RTP session bandwidth is available for RTCP
traffic from the receivers including feedback. For a 64-kbit/s
stream this yields 1,600 bit/s for RTCP. If we assume an average of
96 bytes (=768 bits) per RTCP packet, a receiver can report 2 events
per second back to the sender. If acknowledgements for 10 events are
collected in each FB message, then 20 events can be acknowledged per
second. At 256 kbit/s, 8 events could be reported per second; thus,
the ACKs may be sent in a finer granularity (e.g., only combining
three ACKs per FB message). From 1 Mbit/s upwards, a receiver would
be able to acknowledge each individual frame (not packet!) in a
30-fps video stream.

ACK strategies MUST be defined to work properly with these bandwidth
limitations. An indication whether or not ACKs are allowed for a
session and, if so, which ACK strategy should be used, MAY be
conveyed by out-of-band mechanisms, e.g., media-specific attributes
in a session description using SDP.

3.6.2. NACK Mode
Negative acknowledgements (and the other types of feedback exhibiting
similar reporting characteristics) MUST be used for all sessions with
a group size that may grow larger than two. Of course, NACKs MAY be
used for point-to-point communications as well.

Whether or not the use of Early RTCP packets should be considered
depends upon a number of parameters including session bandwidth,
codec, special type of feedback, and number of senders and receivers.

The most important parameters when determining the mode of operation
are the allowed minimal interval between two compound RTCP packets
(T_rr) and the average number of events that presumably need
reporting per time interval (plus their distribution over time, of
course). The minimum interval can be derived from the available RTCP
bandwidth and the expected average size of an RTCP packet. The
number of events to report (e.g., per second) may be derived from the
packet loss rate and sender's rate of transmitting packets. From
these two values, the allowable group size for the Immediate Feedback
mode can be calculated.

As stated in Section 3.3:

Let N be the average number of events to be reported per interval T
by a receiver, B the RTCP bandwidth fraction for this particular
receiver, and R the average RTCP packet size, then the receiver
operates in Immediate Feedback mode as long as N<=B*T/R.

The upper bound for the Early RTCP mode then solely depends on the
acceptable quality degradation, i.e., how many events per time
interval may go unreported.

As stated in Section 3.3:

Using the above notation, Early RTCP mode can be roughly
characterized by N > B*T/R as "lower bound". An estimate for an
upper bound is more difficult. Setting N=1, we obtain for a given R
and B the interval T = R/B as average interval between events to be
reported. This information can be used as a hint to determine
whether or not early transmission of RTCP packets is useful.

Example: If a 256-kbit/s video with 30 fps is transmitted through a
network with an MTU size of some 1,500 bytes, then, in most cases,
each frame would fit into one packet, leading to a packet rate of 30
packets per second. If 5% packet loss occurs in the network (equally
distributed, no inter-dependence between receivers), then each
receiver will, on average, have to report three lost packets every
two seconds. Assuming a single sender and more than three receivers,
3.75% of the session bandwidth is allocated to the receivers for
RTCP, i.e., 9.6 kbit/s. Assuming further an average compound RTCP
packet size of 120 bytes, 10 RTCP packets can be sent per second, or
20 in two seconds. If every receiver needs to report three lost
packets per two seconds, this yields a maximum group size of 6-7
receivers if all loss events are reported.
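The arithmetic of this example can be reproduced with the following
non-normative sketch. The function names are hypothetical, and the
sketch assumes, as the example implicitly does, that each RTCP packet
reports exactly one loss event.

```python
def rtcp_packets_per_interval(rtcp_bw_bps: float, avg_packet_bytes: float,
                              interval_s: float) -> float:
    """B*T/R: number of RTCP packets that fit into interval T."""
    return rtcp_bw_bps * interval_s / (avg_packet_bytes * 8)


def immediate_feedback_possible(n_events: float, rtcp_bw_bps: float,
                                avg_packet_bytes: float,
                                interval_s: float) -> bool:
    """N <= B*T/R, with B being this receiver's own RTCP bandwidth share."""
    return n_events <= rtcp_packets_per_interval(
        rtcp_bw_bps, avg_packet_bytes, interval_s)


def max_group_size(session_bw_bps: float, receiver_fraction: float,
                   avg_packet_bytes: float, events_per_receiver: float,
                   interval_s: float) -> float:
    """Receivers that can each report all their events in the interval
    (assumption: one reported event per RTCP packet)."""
    receivers_bw = session_bw_bps * receiver_fraction
    packets = rtcp_packets_per_interval(receivers_bw, avg_packet_bytes,
                                        interval_s)
    return packets / events_per_receiver


# 30 packets/s at 5% loss over two seconds: three events per receiver.
events = 30 * 0.05 * 2
size = max_group_size(256_000, 0.0375, 120, events, 2.0)
print(round(size, 2))  # prints 6.67, i.e., 6-7 receivers
```

Lowering events_per_receiver to 2 in the same call gives a group size
of 10, which corresponds to tolerating one unrepaired loss per two
seconds.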
The rules for transmission of Early RTCP packets should provide
sufficient flexibility for most of this reporting to occur in a
timely fashion.

Extending this example to determine the upper bound for Early RTCP
mode could lead to the following considerations: assume that the
underlying coding scheme and the application (as well as the tolerant
users) allow on the order of one loss without repair per two seconds.
Thus, the number of packets to be reported by each receiver decreases
to two per two seconds, which increases the supportable group size to
10. Assuming further that some number of packet losses are
correlated, feedback traffic is further reduced and group sizes of
some 12 to 16 (maybe even 20) can be reasonably well supported using
Early RTCP mode. Note that all these considerations are based upon
statistics and will fail to hold in some cases.

3.7. Summary of Decision Steps
3.7.1. General Hints
Before even considering whether or not to send RTCP feedback
information, an application has to determine whether this mechanism
is applicable:

1) An application has to decide whether -- for the current ratio of
packet rate with the associated (application-specific) maximum
feedback delay and the currently observed round-trip time (if
available) -- feedback mechanisms can be applied at all.
This decision may be based upon (and dynamically revised
following) RTCP reception statistics as well as out-of-band
mechanisms.
2) The application has to decide -- for a certain observed error
rate, assigned bandwidth, frame/packet rate, and group size --
whether (and which) feedback mechanisms can be applied.
Regular RTCP reception statistics provide valuable input to this
step, too.
3) If the application decides to send feedback, the application has
to follow the rules for transmitting Early RTCP packets or
Regular RTCP packets containing FB messages.
4) The type of RTCP feedback sent should not duplicate information
available to the sender from a lower layer transport protocol.
That is, if the transport protocol provides negative or positive
acknowledgements about packet reception (such as DCCP), the
receiver should avoid repeating the same information at the RTCP
layer (i.e., abstain from sending Generic NACKs).

3.7.2. Media Session Attributes
Media sessions are typically described using out-of-band mechanisms to convey transport addresses, codec information, etc., between sender(s) and receiver(s). Such a mechanism is two-fold: a format used to describe a media session and another mechanism for transporting this description.
In the IETF, the Session Description Protocol (SDP) is currently used
to describe media sessions while protocols such as SIP, Session
Announcement Protocol (SAP), Real Time Streaming Protocol (RTSP), and
HTTP (among others) are used to convey the descriptions.

A media session description format MAY include parameters to indicate
that RTCP feedback mechanisms are supported in this session and which
of the feedback mechanisms MAY be applied.

To do so, the profile "AVPF" MUST be indicated instead of "AVP".
Further attributes may be defined to show which type(s) of feedback
are supported.

Section 4 contains the syntax specification to support RTCP feedback
with SDP. Similar specifications for other media session description
formats are outside the scope of this document.
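As a non-normative illustration of the profile indication, the sketch
below scans an SDP description for media lines that use "RTP/AVPF" as
the transport protocol. The session description, including host name,
ports, and payload types, is purely illustrative, and the function
name is hypothetical; the authoritative syntax is given in Section 4.

```python
from typing import List

# Illustrative session description (all values are made up for this sketch).
EXAMPLE_SDP = """\
v=0
o=alice 3203093520 3203093520 IN IP4 host.example.com
s=Media with feedback
t=0 0
c=IN IP4 host.example.com
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVPF 98
a=rtpmap:98 H263-1998/90000
a=rtcp-fb:98 nack
"""


def avpf_media_types(sdp: str) -> List[str]:
    """Return the media type of every m= line that indicates RTP/AVPF."""
    result = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("m="):
            continue
        # m=<media> <port> <proto> <fmt> ...
        fields = line[2:].split()
        if len(fields) >= 3 and fields[2] == "RTP/AVPF":
            result.append(fields[0])
    return result


print(avpf_media_types(EXAMPLE_SDP))  # prints ['video']
```

In this description, only the video stream indicates the feedback
profile; the audio stream remains a plain "RTP/AVP" stream.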