Internet Engineering Task Force (IETF) C. Perkins
Request for Comments: 8083 University of Glasgow
Updates: 3550 V. Singh
Category: Standards Track callstats.io
ISSN: 2070-1721 March 2017
Multimedia Congestion Control: Circuit Breakers for Unicast RTP Sessions
Abstract
The Real-time Transport Protocol (RTP) is widely used in telephony,
video conferencing, and telepresence applications. Such applications
are often run on best-effort UDP/IP networks. If congestion control
is not implemented in these applications, then network congestion can
lead to uncontrolled packet loss and a resulting deterioration of the
user's multimedia experience. The congestion control algorithm acts
as a safety measure by stopping RTP flows from using excessive
resources and protecting the network from overload. At the time of
this writing, however, while there are several proprietary solutions,
there is no standard algorithm for congestion control of interactive
real-time media applications that use the Internet.
This document does not propose a congestion control algorithm. It
instead defines a minimal set of RTP circuit breakers: conditions
under which an RTP sender needs to stop transmitting media data to
protect the network from excessive congestion. It is expected that,
in the absence of long-lived excessive congestion, RTP applications
running on best-effort IP networks will be able to operate without
triggering these circuit breakers. To avoid triggering the RTP
circuit breaker, any Standards Track congestion control algorithms
defined for RTP will need to operate within the envelope set by these
RTP circuit breaker algorithms.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
1. Introduction
The Real-time Transport Protocol (RTP) [RFC3550] is widely used in
voice-over-IP, video teleconferencing, and telepresence systems.
Many of these systems run over best-effort UDP/IP networks and can
suffer from packet loss and increased latency if network congestion
occurs. Designing effective RTP congestion control algorithms to
adapt the transmission of RTP-based media to match the available
network capacity while also maintaining the user experience is a
difficult but important problem. Many such congestion control and
media adaptation algorithms have been proposed, but to date there is
no consensus on the correct approach or even that a single standard
algorithm is desirable.
This memo does not attempt to propose a new RTP congestion control
algorithm. Instead, we propose a small set of RTP circuit breakers:
mechanisms that terminate RTP flows in conditions under which there
is general agreement that serious network congestion is occurring.
The RTP circuit breakers proposed in this memo are a specific
instance of the general class of network transport circuit breakers
[RFC8084] designed to act as a protection mechanism of last resort to
avoid persistent excessive congestion. To avoid triggering the RTP
circuit breaker, any Standards Track congestion control algorithms
defined for RTP will need to operate within the envelope set by the
RTP circuit breaker algorithms defined by this memo.
2. Background
We consider congestion control for unicast RTP traffic flows. This
is the problem of adapting the transmission of an audio/visual data
flow, encapsulated within an RTP transport session, from one sender
to one receiver so that it does not use more capacity than is
available along the network path. Such adaptation needs to be done
in a way that limits the disruption to the user experience caused by
both packet loss and excessive rate changes. Congestion control for
multicast flows is outside the scope of this memo. Multicast traffic
needs different solutions since the available capacity estimator for
a group of receivers will differ from that for a single receiver, and
because multicast congestion control has to consider issues of
fairness across groups of receivers that do not apply to unicast
flows.
Congestion control for unicast RTP traffic can be implemented in one
of two places in the protocol stack. One approach is to run the RTP
traffic over a congestion-controlled transport protocol (for example,
over TCP), and to adapt the media encoding to match the dictates of
the transport-layer congestion control algorithm. This is safe for
the network but can be suboptimal for the media quality unless the
transport protocol is designed to support real-time media flows. We
do not consider this class of applications further in this memo, as
their network safety is guaranteed by the underlying transport.
Alternatively, RTP flows can be run over a non-congestion-controlled
transport protocol (for example, UDP) performing rate adaptation at
the application layer based on RTP Control Protocol (RTCP) feedback.
With a well-designed, network-aware application, this allows highly
effective media quality adaptation, but there is potential to cause
persistent congestion in the network if the application does not
adapt its sending rate in a timely and effective manner. We consider
this class of applications in this memo.
Congestion control relies on monitoring the delivery of a media flow
and responding to adapt the transmission of that flow when there are
signs that the network path is congested. Network congestion can be
detected in one of three ways:
1) a receiver can infer the onset of congestion by observing an
increase in one-way delay caused by queue build-up within the
network;
2) if Explicit Congestion Notification (ECN) [RFC3168] is supported,
the network can signal the presence of congestion by marking
packets using ECN Congestion Experienced (CE) marks (this could
potentially be augmented by mechanisms such as Congestion
Exposure (ConEx) [RFC7713] or other future protocol extensions
for network signaling of congestion); or
3) in the extreme case, congestion will cause packet loss that can
be detected by observing a gap in the received RTP sequence
numbers.
Once the onset of congestion is observed, the receiver has to send
feedback to the sender to indicate that the transmission rate needs
to be reduced. How the sender reduces the transmission rate is
highly dependent on the media codec being used and is outside the
scope of this memo.
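As an illustration of the third signal listed above, a receiver might count missing packets from the gap between consecutively received 16-bit RTP sequence numbers. The following is a minimal Python sketch only; the function name and the half-range heuristic for treating a packet as reordered are assumptions made for this example and are not defined by this memo.

```python
def missing_between(prev_seq: int, seq: int) -> int:
    """Count packets skipped between two consecutively received
    16-bit RTP sequence numbers (hypothetical helper)."""
    # Modulo-2^16 difference handles sequence number wraparound.
    delta = (seq - prev_seq) & 0xFFFF
    # Assumption: a forward jump of less than half the sequence space
    # is a gap; anything else is a duplicate or a late/reordered packet.
    if 0 < delta < 0x8000:
        return delta - 1
    return 0
```

A non-zero result indicates a gap; whether that gap reflects congestion or reordering that will later be filled is a judgment the sketch does not attempt to make.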
There are several ways in which a receiver can send feedback to a
media sender within the RTP framework:
o The base RTP specification [RFC3550] defines RTCP Receiver Report
(RR) packets to convey reception quality feedback information and
Sender Report (SR) packets to convey information about the media
transmission. RTCP SR packets contain data that can be used to
reconstruct media timing at a receiver along with a count of the
total number of octets and packets sent. RTCP RR packets report
on the fraction of packets lost in the last reporting interval,
the cumulative number of packets lost, the highest sequence number
received, and the inter-arrival jitter. The RTCP RR packets also
contain timing information that allows the sender to estimate the
network Round-Trip Time (RTT) to the receivers. RTCP reports are
sent periodically, with the reporting interval being determined by
the number of Synchronization Sources (SSRCs) used in the session
and a configured session bandwidth estimate (the number of SSRCs
used is usually two in a unicast session, one for each
participant, but can be greater if the participants send multiple
media streams). The interval between reports sent from each
receiver is on the order of a few seconds on average, although it
varies with the session bandwidth and is randomized to avoid
synchronization of reports from multiple receivers. The interval
can be less than a second in a high-bandwidth session. RTCP RR
packets allow a receiver to report ongoing network congestion to
the sender. However, if a receiver detects the onset of
congestion part way through a reporting interval, the base RTP
specification contains no provision for sending the RTCP RR packet
early, and the receiver has to wait until the next scheduled
reporting interval.
o The RTCP Extended Reports (XR) [RFC3611] allow reporting of more
complex and sophisticated reception quality metrics but do not
change the RTCP timing rules. RTCP extended reports of potential
interest for congestion control purposes are the extended packet
loss, discard, and burst metrics [RFC3611] [RFC7002] [RFC7097]
[RFC7003] [RFC6958] as well as the extended delay metrics
[RFC6843] [RFC6798]. Other RTCP Extended Reports that could be
helpful for congestion control purposes might be developed in
the future.
o Rapid feedback about the occurrence of congestion events can be
achieved using the Extended RTP Profile for RTCP-Based Feedback
(RTP/AVPF) [RFC4585] (or its secure variant, RTP/SAVPF [RFC5124])
in place of the RTP/AVP profile [RFC3551]. This modifies the RTCP
timing rules to allow RTCP reports to be sent early, in some cases
immediately, provided the RTCP transmission rate keeps within its
bandwidth allocation. It also defines transport-layer feedback
messages, including Negative Acknowledgements (NACKs), that can be
used to report on specific congestion events. RTP Codec Control
Messages [RFC5104] extend the RTP/AVPF profile with additional
feedback messages that can be used to influence the way in which
rate adaptation occurs but do not further change the dynamics of
how rapidly feedback can be sent. Use of the RTP/AVPF profile is
dependent on signaling.
o Finally, ECN for RTP over UDP [RFC6679] can be used to provide
feedback on the number of packets that received an ECN-CE mark.
This RTCP extension builds on the RTP/AVPF profile to allow rapid
congestion feedback when ECN is supported.
In addition to these mechanisms for providing feedback, the sender
can include an RTP header extension in each packet to record packet
transmission times [RFC5450]. Accurate transmission timestamps can
be helpful for estimating queuing delays to get an early indication
of the onset of congestion.
Taken together, these various mechanisms allow receivers to provide
feedback to the senders when congestion events occur, with varying
degrees of timeliness and accuracy. The key distinction is between
systems that use only the basic RTCP mechanisms, without RTP/AVPF
rapid feedback, and those that use the RTP/AVPF extensions to respond
to congestion more rapidly.
3. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
This interpretation of these key words applies only when written in
ALL CAPS. Mixed- or lower-case uses of these key words are not to be
interpreted as carrying special significance in this memo.
The definition of the RTP circuit breaker is specified in terms of
the following variables:
o Td is the deterministic RTCP reporting interval, as defined in
Section 6.3.1 of [RFC3550].
o Tdr is the sender's estimate of the deterministic RTCP reporting
interval, Td, calculated by a receiver of the data it is sending.
Tdr is not known at the sender but can be estimated by executing
the algorithm in Section 6.2 of [RFC3550] using the average RTCP
packet size seen at the sender, the number of members reported in
the receiver's SR/RR report blocks, and whether the receiver is
sending SR or RR packets. Tdr is recalculated when each new RTCP
SR/RR report is received, but the media timeout circuit breaker
(see Section 4.2) is only reconsidered when Tdr increases.
o Tr is the network round-trip time, which is calculated by the
sender using the algorithm in Section 6.4.1 of [RFC3550] and is
smoothed using an exponentially weighted moving average as
Tr = (0.8 * Tr) + (0.2 * Tr_new) where Tr_new is the latest RTT
estimate obtained from an RTCP report. The weight is chosen so
old estimates decay over k intervals.
o k is the non-reporting threshold (see Section 4.2).
o Tf is the media framing interval at the sender. For applications
sending at a constant frame rate, Tf is the inter-frame interval.
For applications that switch between a small set of possible frame
rates (for example, when sending speech with comfort noise, such
that comfort noise frames are sent less often than speech frames),
Tf is set to the longest of the inter-frame intervals of the
different frame rates. For applications that send periodic frames
but dynamically vary their frame rate, Tf is set to the largest
inter-frame interval used in the last 10 seconds. For
applications that send less than one frame every 10 seconds, or
that have no concept of periodic frames (e.g., text conversation
[RFC4103], or pointer events [RFC2862]), when each frame is sent,
Tf is set to the time interval since the previous frame.
o G is the frame group size. That is, the number of frames that are
coded together based on a particular sending rate setting. If the
codec used by the sender can change its rate on each frame, then G
= 1; otherwise, G is set to the number of frames before the codec
can adjust to the new rate. For codecs that have the concept of a
Group of Pictures (GOP), G is likely the GOP length.
o T_rr_interval is the minimal interval between RTCP reports, as
defined in Section 3.4 of [RFC4585]; it is only meaningful for
implementations of the RTP/AVPF profile [RFC4585] or the RTP/SAVPF
profile [RFC5124].
o X is the estimated throughput a TCP connection would achieve over
a path, in bytes per second.
o s is the size of RTP packets being sent, in bytes. If the RTP
packets being sent vary in size, then the average size over the
packets comprising the last 4 * G frames MUST be used (this is
intended to be comparable to the four loss intervals used in
o p is the loss event rate, between 0.0 and 1.0, that would be seen
by a TCP connection over a particular path. When used in the RTP
congestion circuit breaker, this is approximated as described in
o t_RTO is the retransmission timeout value that would be used by a
TCP connection over a particular path, in seconds. This MUST be
approximated using t_RTO = 4 * Tr when used as part of the RTP
congestion circuit breaker.
o b is the number of packets that are acknowledged by a single TCP
acknowledgement. Following [RFC5348], it is RECOMMENDED that the
value b = 1 is used as part of the RTP congestion circuit breaker.
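Several of the variables above are simple arithmetic on values the sender already tracks. The following minimal Python sketch shows the Tr smoothing, the t_RTO approximation, and the average packet size s. The function names are hypothetical, and seeding Tr with the first sample is an assumption of this example, since the text above does not say how the average is initialized.

```python
def smooth_rtt(tr, tr_new):
    """Update the smoothed RTT: Tr = (0.8 * Tr) + (0.2 * Tr_new)."""
    if tr is None:
        # Assumption: no previous estimate, so start from the first sample.
        return tr_new
    return (0.8 * tr) + (0.2 * tr_new)


def approx_rto(tr):
    """Approximate the TCP retransmission timeout as t_RTO = 4 * Tr."""
    return 4.0 * tr


def average_packet_size(sizes):
    """Average size s, in bytes, of the RTP packets passed in.

    The caller is assumed to pass the sizes of the packets that made
    up the last 4 * G frames.
    """
    return sum(sizes) / len(sizes)
```

For example, a smoothed Tr of 100 ms followed by a new sample of 200 ms gives a new Tr of about 120 ms and a t_RTO of about 480 ms.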
4. RTP Circuit Breakers for Systems Using the RTP/AVP Profile
The feedback mechanisms defined in [RFC3550] and available under the
RTP/AVP profile [RFC3551] are the minimum that can be assumed for a
baseline circuit breaker mechanism that is suitable for all unicast
applications of RTP. Accordingly, for an RTP circuit breaker to be
useful, it needs to be able to detect that an RTP flow is causing
excessive congestion using only basic RTCP features without needing
RTCP XR feedback or the RTP/AVPF profile for rapid RTCP reports.
RTCP is a fundamental part of the RTP protocol, and the mechanisms
described here rely on the implementation of RTCP. Implementations
that claim to support RTP, but that do not implement RTCP, will be
unable to use the circuit breaker mechanisms described in this memo.
Such implementations SHOULD NOT be used on networks that might be
subject to congestion unless equivalent mechanisms are defined using
some non-RTCP feedback channel to report congestion and signal
circuit breaker conditions.
The RTCP timeout circuit breaker (Section 4.1) will trigger if an
implementation of this memo attempts to interwork with an endpoint
that does not support RTCP. Implementations that sometimes need to
interwork with endpoints that do not support RTCP need to disable the
RTP circuit breakers if they don't receive some confirmation via
signaling that the remote endpoint implements RTCP (the presence of a
Session Description Protocol (SDP) "a=rtcp:" attribute in an answer
might be such an indication). The RTP circuit breaker SHOULD NOT be
disabled on networks that might be subject to congestion unless
equivalent mechanisms are defined using some non-RTCP feedback
channel to report congestion and signal circuit breaker conditions.
Three potential congestion signals are available from the basic RTCP
SR/RR packets and are reported for each SSRC in the RTP session:
1. The sender can estimate the network round-trip time once per RTCP
reporting interval based on the contents and timing of RTCP SR
and RR packets.
2. Receivers report a jitter estimate (the statistical variance of
the RTP data packet inter-arrival time) calculated over the RTCP
reporting interval. Due to the nature of the jitter calculation
(Section 6.4.4 of [RFC3550]), the jitter is only meaningful for
RTP flows that send a single data packet for each RTP timestamp
value (i.e., audio flows, or video flows where each packet
comprises one video frame).
3. Receivers report the fraction of RTP data packets lost during the
RTCP reporting interval and the cumulative number of RTP packets
lost over the entire RTP session.
These congestion signals limit the possible circuit breakers since
they give only limited visibility into the behavior of the network.
RTT estimates are widely used in congestion control algorithms as a
proxy for queuing delay measures in delay-based congestion control or
to determine connection timeouts. RTT estimates derived from RTCP SR
and RR packets sent according to the RTP/AVP timing rules are too
infrequent to be useful for congestion control and don't give enough
information to distinguish a delay change due to routing updates from
queuing delay caused by congestion. Accordingly, we cannot use the
RTT estimate alone as an RTP circuit breaker.
Increased jitter can be a signal of transient network congestion, but
in the highly aggregated form reported in RTCP RR packets, it offers
insufficient information to estimate the extent or persistence of
congestion. Jitter reports are a useful early warning of potential
network congestion but provide an insufficiently strong signal to be
used as a circuit breaker.
The remaining congestion signals are the packet loss fraction and the
cumulative number of packets lost. If considered carefully, and over
an appropriate time frame to distinguish transient problems from long
term issues [RFC8084], these can be effective indicators that
persistent excessive congestion is occurring in networks where packet
loss is primarily due to queue overflows, although loss caused by
non-congestive packet corruption can distort the result in some
networks. TCP congestion control [RFC5681] intentionally tries to
fill the router queues and uses the resulting packet loss as
congestion feedback. An RTP flow competing with TCP traffic will
therefore expect to see a non-zero packet loss fraction, and some
variation in queuing latency, in normal operation when sharing a path
with other flows, which needs to be accounted for when determining
the circuit breaker threshold [RFC8084]. This behavior of TCP is
reflected in the congestion circuit breaker below and will affect the
design of any RTP congestion control protocol.
Two packet loss regimes can be observed: 1) RTCP RR packets show a
non-zero packet loss fraction while the extended highest sequence
number received continues to increment; and 2) RR packets show a loss
fraction of zero, but the extended highest sequence number received
does not increment even though the sender has been transmitting RTP
data packets. The former corresponds to the TCP congestion avoidance
state and indicates a congested path that is still delivering data;
the latter corresponds to a TCP timeout and is most likely due to a
path failure. A third condition is that data is being sent but no
RTCP feedback is received at all, corresponding to a failure of the
reverse path. We derive circuit breaker conditions for these loss
regimes in the following.
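The two loss regimes can be told apart from fields already present in each RTCP RR report block. The following Python sketch is illustrative only: the function name, the parameter names, and the returned labels are hypothetical, and the third condition (no RTCP feedback at all) cannot be classified this way because it is detected by a timer rather than by a received report.

```python
def classify_report(fraction_lost: float, ehsn: int, prev_ehsn: int,
                    sender_active: bool) -> str:
    """Classify one RR report block against the loss regimes.

    ehsn / prev_ehsn are the extended highest sequence number received
    in this report and in the previous one.
    """
    if ehsn > prev_ehsn:
        # Data is still being delivered; non-zero loss suggests a
        # congested path (regime 1).
        return "congested" if fraction_lost > 0 else "ok"
    if sender_active and fraction_lost == 0:
        # Nothing new arrived although media was sent (regime 2).
        return "media-timeout"
    # Assumption: anything else (for example, an idle sender) is left
    # unclassified by this sketch.
    return "other"
```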
4.1. RTP/AVP Circuit Breaker #1: RTCP Timeout
An RTCP timeout can occur when RTP data packets are being sent, but
there are no RTCP reports returned from the receiver. This is either
due to a failure of the receiver to send RTCP reports or a failure of
the return path that is preventing those RTCP reports from being
delivered. In either case, it is not safe to continue transmission
since the sender has no way of knowing if it is causing congestion.
An RTP sender that has not received any RTCP SR or RTCP RR packets
reporting on the SSRC it is using, for a time period of at least
three times its deterministic RTCP reporting interval, Td (where Td
is calculated without the randomization factor and using the fixed
minimum interval of Tmin=5 seconds), SHOULD cease transmission (see
Section 4.5). The rationale for this choice of timeout is as
described in Section 6.2 of [RFC3550] ("so that implementations which
do not use the reduced value for transmitting RTCP packets are not
timed out by other participants prematurely") and has been updated by
Section 6.1.4 of [RFC8108] to account for the use of the RTP/AVPF
profile [RFC4585] or the RTP/SAVPF profile [RFC5124].
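The timeout test itself reduces to a comparison against three times Td. The Python sketch below is a minimal illustration under stated assumptions: `computed_interval` is a hypothetical input standing for the unrandomized interval produced by the RFC 3550 rules before the minimum is applied, and the function names are not defined by this memo.

```python
T_MIN = 5.0  # fixed minimum interval Tmin, in seconds


def deterministic_interval(computed_interval: float) -> float:
    """Td: the unrandomized interval, never less than Tmin."""
    return max(T_MIN, computed_interval)


def rtcp_timeout_triggered(now: float, last_report_time: float,
                           computed_interval: float) -> bool:
    """True once no SR/RR reporting on our SSRC has arrived for 3 * Td."""
    td = deterministic_interval(computed_interval)
    return (now - last_report_time) >= 3 * td
```

With Td at its 5-second minimum, the breaker condition in this sketch is met after 15 seconds without a report.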
To reduce the risk of premature timeout, implementations SHOULD NOT
configure the RTCP bandwidth such that Td is larger than 5 seconds.
Similarly, implementations that use the RTP/AVPF profile [RFC4585] or
the RTP/SAVPF profile [RFC5124] SHOULD NOT configure T_rr_interval to
values larger than 4 seconds (the reduced limit for T_rr_interval
follows Section 6.1.3 of [RFC8108]).
The choice of three RTCP reporting intervals as the timeout is made
following Section 6.3.5 of RFC 3550 [RFC3550]. This specifies that
participants in an RTP session will time out and remove an RTP sender
from the list of active RTP senders if no RTP data packets have been
received from that RTP sender within the last two RTCP reporting
intervals. Using a timeout of three RTCP reporting intervals is
therefore large enough that the other participants will have timed
out the sender if a network problem stops the data packets it is
sending from reaching the receivers, even allowing for loss of some
RTCP packets.
If a sender is transmitting a large number of RTP media streams, such
that the corresponding RTCP SR or RR packets are too large to fit
into the network MTU, the receiver will generate RTCP SR or RR
packets in a round-robin manner. In this case, the sender SHOULD
treat receipt of an RTCP SR or RR packet corresponding to any SSRC it
sent on the same 5-tuple of source and destination IP address, port,
and protocol as an indication that the receiver and return path are
working, thus preventing the RTCP timeout circuit breaker from
triggering.
4.2. RTP/AVP Circuit Breaker #2: Media Timeout
If RTP data packets are being sent but the RTCP SR or RR packets
reporting on that SSRC indicate a non-increasing extended highest
sequence number received, this is an indication that those RTP data
packets are not reaching the receiver. This could be a short-term
issue affecting only a few RTP packets, perhaps caused by a slow-to-
open firewall or a transient connectivity problem, but if the issue
persists, it is a sign of a more ongoing and significant problem (a
"media timeout").
The time needed to declare a media timeout depends on the parameters
Tdr, Tr, Tf, and on the non-reporting threshold k. The value of k is
chosen so that when Tdr is large compared to Tr and Tf, receipt of at
least k RTCP reports with non-increasing extended highest sequence
number received gives reasonable assurance that the forward path has
failed and that the RTP data packets have not been lost by chance.
The RECOMMENDED value for k is 5 reports.
When Tdr < Tf, then RTP data packets are being sent at a rate less
than one per RTCP reporting interval of the receiver, so the extended
highest sequence number received can be expected to be non-increasing
for some receiver RTCP reporting intervals. Similarly, when
Tdr < Tr, some receiver RTCP reporting intervals might pass before
the RTP data packets arrive at the receiver, also leading to reports
where the extended highest sequence number received is non-
increasing. Both issues require the media timeout interval to be
scaled relative to the threshold, k.
The media timeout RTP circuit breaker is therefore as follows. When
starting sending, calculate MEDIA_TIMEOUT using:
MEDIA_TIMEOUT = ceil(k * max(Tf, Tr, Tdr) / Tdr)
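To show how the formula scales with the inputs, the following Python sketch evaluates it directly. The function name and the sample values are chosen for illustration only; all times are in seconds.

```python
import math


def media_timeout(k: int, tf: float, tr: float, tdr: float) -> int:
    """MEDIA_TIMEOUT = ceil(k * max(Tf, Tr, Tdr) / Tdr), in reports."""
    return math.ceil(k * max(tf, tr, tdr) / tdr)


# Tdr dominates (typical case): the timeout is simply k reports.
typical = media_timeout(k=5, tf=0.02, tr=0.1, tdr=5.0)

# Tdr < Tf (frames sent less often than reports): timeout is scaled up.
sparse_media = media_timeout(k=5, tf=2.0, tr=0.1, tdr=0.5)
```

In the first case the result is 5 reports; in the second, where frames are sent four times less often than reports arrive, it grows to 20 reports.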
When a sender receives an RTCP packet that indicates reception of the
media it has been sending, then it cancels the media timeout circuit
breaker. If it is still sending, then it MUST calculate a new value
for MEDIA_TIMEOUT and set a new media timeout circuit breaker.
If a sender receives an RTCP packet indicating that its media was not
received, it MUST calculate a new value for MEDIA_TIMEOUT. If the
new value is larger than the previous, it replaces MEDIA_TIMEOUT with
the new value, extending the media timeout circuit breaker;
otherwise, it keeps the original value of MEDIA_TIMEOUT. This
process is known as reconsidering the media timeout circuit breaker.
If MEDIA_TIMEOUT consecutive RTCP packets are received indicating
that the media being sent was not received, and the media timeout
circuit breaker has not been canceled, then the media timeout circuit
breaker triggers. When the media timeout circuit breaker triggers,
the sender SHOULD cease transmission (see Section 4.5).
When stopping sending an RTP stream, a sender MUST cancel the
corresponding media timeout circuit breaker.
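The set, cancel, reconsider, and trigger rules above can be read as a small state machine driven by incoming RTCP reports. The following Python sketch is one possible reading under stated assumptions, not a definitive implementation: the class and method names are hypothetical, and the caller is assumed to decide whether a given report indicates that media was received.

```python
import math


class MediaTimeoutBreaker:
    """Illustrative media timeout circuit breaker for one RTP stream."""

    def __init__(self, k: int = 5):
        self.k = k          # non-reporting threshold
        self.limit = None   # MEDIA_TIMEOUT; None while cancelled
        self.count = 0      # consecutive reports showing no reception

    def _compute(self, tf, tr, tdr):
        return math.ceil(self.k * max(tf, tr, tdr) / tdr)

    def start(self, tf, tr, tdr):
        """Set the breaker when starting to send."""
        self.limit = self._compute(tf, tr, tdr)
        self.count = 0

    def stop(self):
        """Cancel the breaker when the stream stops."""
        self.limit = None
        self.count = 0

    def on_report(self, media_received, tf, tr, tdr, still_sending=True):
        """Process one RTCP report; return True if the breaker triggers."""
        if self.limit is None:
            return False
        if media_received:
            # Cancel, then set a fresh breaker if still sending.
            self.stop()
            if still_sending:
                self.start(tf, tr, tdr)
            return False
        # Reconsider: only ever extend MEDIA_TIMEOUT, never shorten it.
        self.limit = max(self.limit, self._compute(tf, tr, tdr))
        self.count += 1
        return self.count >= self.limit
```

When `on_report` returns True, the sender would be expected to cease transmission as described in the text; how it does so is outside the scope of this sketch.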