RFC 7667


RTP Topologies

4.  Topology Properties

   The topologies discussed in Section 3 have different properties.
   This section describes these properties.  Note that, even if a
   certain property is supported within a particular topology concept,
   the necessary functionality may be optional to implement.

4.1.  All-to-All Media Transmission

   To recapitulate, multicast, and in particular ASM, provides the
   functionality that everyone may send to, or receive from, everyone
   else within the session.  SSM can provide a similar functionality by
   having anyone intending to participate as a sender to send its media
   to the SSM Distribution Source.  The SSM Distribution Source forwards
   the media to all receivers subscribed to the multicast group.  Mesh,
   MCUs, mixers, Selective Forwarding Middleboxes (SFMs), and
   translators may all provide that functionality at least on some basic
   level.  However, there are some differences in which type of
   reachability they provide.

   The topologies that come closest to emulating Any-Source IP
   Multicast, with all-to-all transmission capabilities, are the
   Transport Translator function called "relay" in Section 3.5, as well
   as the Mesh with joint RTP sessions (Section 3.4).  Media
   Translators, Mesh with independent RTP Sessions, mixers, SFMs, and
   the MCU variants do not provide a fully meshed forwarding on the
   transport level; instead, they only allow limited forwarding of
   content from the other session participants.

   The "all-to-all media transmission" requires that any media
   transmitting endpoint considers the path to the least-capable
   receiving endpoint.  Otherwise, the media transmissions may overload
   that path.  Therefore, a sending endpoint needs to monitor the path
   from itself to any of the receiving endpoints, to detect the
   currently least-capable receiver and adapt its sending rate
   accordingly.  As multiple endpoints may send simultaneously, the
   available resources may vary.  RTCP's receiver reports help perform
   this monitoring, at least on a medium time scale.
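
   As a rough illustration of the adaptation described above, the
   sketch below picks a sending rate from per-receiver path estimates.
   The function name, the estimates_bps mapping, and the idea that each
   RTCP receiver report has already been converted into a bitrate
   estimate are assumptions made for this example; RTP itself does not
   define them.

```python
def pick_send_rate(estimates_bps, floor_bps=0):
    """Return (receiver, rate) for the least-capable path.

    estimates_bps maps a receiver identifier to an estimated usable
    bitrate in bits per second. This input is hypothetical; it could be
    derived from RTCP receiver reports by some congestion controller.
    """
    if not estimates_bps:
        raise ValueError("no receivers to adapt to")
    # The receiver with the smallest estimate bounds the sending rate.
    receiver = min(estimates_bps, key=estimates_bps.get)
    return receiver, max(floor_bps, estimates_bps[receiver])


estimates = {"B": 2_000_000, "C": 350_000, "D": 900_000}
slowest, rate = pick_send_rate(estimates)
print(slowest, rate)
```

   In an all-to-all session, every sender would have to repeat this
   choice whenever the estimates change, since other senders compete
   for the same paths.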

   The resource consumption for performing all-to-all transmission
   varies depending on the topology.  Both ASM and SSM have the benefit
   that only one copy of each packet traverses a particular link.  Using
   a relay causes the transmission of one copy of a packet per
   endpoint-to-relay path and packet transmitted.  However, in most
   cases, the links carrying the multiple copies will be the ones close
   to the relay (which can be assumed to be part of the network
   infrastructure with good connectivity to the backbone) rather than
   the endpoints (which may be behind slower access links).  The Mesh
   topologies cause N-1 streams of transmitted packets to traverse the
   first-hop link from the endpoint, in a mesh with N endpoints.  How
   much of their length the different paths have in common is highly
   situation dependent.
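
   The per-link packet counts above can be written down as a small
   calculation.  This is only a sketch: the topology labels and
   function names are invented for the example, and it counts copies of
   one transmitted packet, ignoring RTCP and retransmissions.

```python
def uplink_copies(topology, n_endpoints):
    """Copies of each sent packet on the sender's first-hop link."""
    if n_endpoints < 2:
        raise ValueError("need at least two endpoints")
    if topology in ("asm", "ssm", "relay"):
        # One copy leaves the sender; the network or relay replicates it.
        return 1
    if topology == "mesh":
        # One unicast copy per other endpoint.
        return n_endpoints - 1
    raise ValueError("unknown topology: %r" % topology)


def relay_egress_copies(n_endpoints):
    """Copies a relay sends out for each packet it receives."""
    return n_endpoints - 1


for n in (3, 10):
    print(n, uplink_copies("mesh", n), uplink_copies("relay", n),
          relay_egress_copies(n))
```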

   The transmission of RTCP by design adapts to any changes in the
   number of participants due to the transmission algorithm, defined in
   the RTP specification [RFC3550], and the extensions in AVPF [RFC4585]
   (when applicable).  That way, the resources utilized for RTCP stay
   within the bounds configured for the session.
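
   The scaling behaviour can be sketched with a simplified,
   deterministic version of the interval calculation in RFC 3550.  The
   sketch omits the randomization, the initial-interval halving, and
   the reconsideration rules of the full algorithm; the function name
   and the example numbers are assumptions made for illustration.

```python
def rtcp_interval(members, senders, session_bw_bps, avg_rtcp_size_bytes,
                  we_sent, rtcp_fraction=0.05, t_min=5.0):
    """Simplified deterministic RTCP reporting interval in seconds."""
    rtcp_bw = session_bw_bps * rtcp_fraction / 8.0  # bytes per second
    n = members
    if senders <= members * 0.25:
        # Few senders: they share 25% of the RTCP bandwidth and the
        # receivers share the remaining 75%.
        if we_sent:
            rtcp_bw *= 0.25
            n = senders
        else:
            rtcp_bw *= 0.75
            n = members - senders
    return max(t_min, avg_rtcp_size_bytes * n / rtcp_bw)


# A receiver in a 64 kbit/s session with one sender: the interval grows
# with the group so the aggregate RTCP rate stays bounded.
for members in (4, 400, 4000):
    print(members, rtcp_interval(members, 1, 64000, 100, we_sent=False))
```

   With 4000 members, the interval grows to roughly 1333 seconds per
   receiver, so the receivers together still send about 300 bytes per
   second, which is their share of the configured RTCP bandwidth.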

4.2.  Transport or Media Interoperability

   All translators, mixers, RTCP-terminating MCUs, and Mesh with
   individual RTP sessions allow the media encoding or the transport to
   be changed to match the properties of the other domain, thereby
   providing extended interoperability in cases where the endpoints
   lack a common set of media codecs and/or transport protocols.
   Selective Forwarding Middleboxes can adapt the transport and (at
   least) selectively forward the encoded streams that match a
   receiving endpoint's capability.  An additional translator is
   required to change the media encoding if the encoded streams do not
   match the receiving endpoint's capability.

4.3.  Per-Domain Bitrate Adaptation

   Endpoints are often connected to each other with a heterogeneous set
   of paths.  This makes congestion control in a Point-to-Multipoint set
   problematic.  In the ASM, SSM, Mesh with common RTP session, and
   Transport Relay scenarios, each individual sending endpoint has to
   adapt to the receiving endpoint behind the least-capable path,
   yielding suboptimal quality for the endpoints behind the more capable
   paths.  This is no longer an issue when Media Translators, mixers,
   SFMs, or MCUs are involved, as each endpoint only needs to adapt to
   the slowest path within its own domain.  The translator, mixer, SFM,
   or MCU topologies all require their respective outgoing RTP streams
   to adjust the bitrate, packet rate, etc., to adapt to the least-
   capable path in each of the other domains.  That way one can avoid
   lowering the quality to the least-capable endpoint in all the domains
   at the cost (complexity, delay, equipment) of the mixer, SFM, or
   translator, and potentially the media sender (multicast/layered
   encoding and sending the different representations).
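
   The difference between session-wide and per-domain adaptation can be
   shown with a small calculation.  The domain names, path estimates,
   and function names below are hypothetical and serve only to
   illustrate the reasoning in this section.

```python
def single_session_rate(domains):
    """Rate when every sender must adapt to the slowest path overall
    (ASM, SSM, Mesh with common RTP session, Transport Relay)."""
    return min(min(paths.values()) for paths in domains.values())


def per_domain_rates(domains):
    """Rate a translator, mixer, SFM, or MCU could use towards each
    domain: the slowest path inside that domain only."""
    return {name: min(paths.values()) for name, paths in domains.items()}


# Hypothetical path estimates in bits per second.
domains = {
    "office": {"A": 4_000_000, "B": 3_000_000},
    "mobile": {"C": 300_000},
}
print(single_session_rate(domains))
print(per_domain_rates(domains))
```

   In this example, the office endpoints are held to 300 kbit/s without
   a middlebox, but could receive 3 Mbit/s when the middlebox adapts
   each domain separately.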

4.4.  Aggregation of Media

   In the all-to-all media property mentioned above and provided by ASM,
   SSM, Mesh with common RTP session, and relay, all simultaneous media
   transmissions share the available bitrate.  For endpoints with
   limited reception capabilities, this may result in a situation where
   even a minimal, acceptable media quality cannot be accomplished,
   because multiple RTP streams need to share the same resources.  One
   solution to this problem is to use a mixer, or MCU, to aggregate the
   multiple RTP streams into a single one, where the single RTP stream
   takes up less resources in terms of bitrate.  This aggregation can be
   performed according to different methods.  Mixing or selection are
   two common methods.  Selection is almost always possible and easy to
   implement.  Mixing requires resources in the mixer and may be
   relatively easy and not impair the quality too badly (audio) or quite
   difficult (video tiling, which is not only computationally complex
   but also reduces the pixel count per stream, with corresponding loss
   in perceptual quality).
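
   The two aggregation methods can be sketched for uncompressed audio
   as follows.  This is a minimal example under stated assumptions: the
   inputs are already decoded, time-aligned frames of signed 16-bit
   samples, and loudness is approximated by the peak sample value.  A
   real mixer would also decode, resample, and re-encode the media.

```python
def mix_pcm16(frames):
    """Mix equal-length frames by summing samples and clipping."""
    if not frames:
        raise ValueError("nothing to mix")
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("frames must have equal length")
    # Clip each sum to the signed 16-bit range.
    return [max(-32768, min(32767, sum(samples)))
            for samples in zip(*frames)]


def select_loudest(frames_by_source):
    """Selection: return the source whose frame has the highest peak."""
    return max(frames_by_source,
               key=lambda src: max(abs(s) for s in frames_by_source[src]))


frames = {"A": [1000, -2000, 30000], "B": [500, -500, 10000]}
print(mix_pcm16(list(frames.values())))
print(select_loudest(frames))
```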

4.5.  View of All Session Participants

   The RTP protocol includes functionality to identify the session
   participants through the use of the SSRC and CSRC fields.  In
   addition, it is capable of carrying some further identity information
   about these participants using the RTCP SDES.  In topologies that
   provide a full all-to-all functionality, i.e., ASM, Mesh with common
   RTP session, and relay, a compliant RTP implementation offers the
   functionality directly as specified in RTP.  In topologies that do
   not offer all-to-all communication, it is necessary that RTCP is
   handled correctly in domain bridging functions.  RTP includes
   explicit specification text for translators and mixers, and for SFMs
   the required functionality can be derived from that text.  However,
   the MCU described in Section 3.8 cannot offer the full functionality
   for session participant identification through RTP means.  The
   topologies that create independent RTP sessions per endpoint or pair
   of endpoints, like a Back-to-Back RTP session, Mesh with independent
   RTP sessions, and the RTCP-terminating MCU (Section 3.9), with the
   exception of the SFM, do not support RTP-based identification of
   session
   participants.  In all those cases, other non-RTP-based mechanisms
   need to be implemented if such knowledge is required or desirable.
   When it comes to SFM, the SSRC namespace is not necessarily joint.
   Instead, identification will require knowledge of SSRC/CSRC mappings
   that the SFM performed; see Section 3.7.
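
   As an illustration of where the SSRC and CSRC identifiers are
   carried, the sketch below reads them from the fixed RTP header
   defined in RFC 3550.  The function name is an assumption made for
   this example, and the sketch ignores header extensions, padding, and
   the payload.

```python
import struct


def parse_rtp_sources(packet):
    """Return (ssrc, [csrc, ...]) from a raw RTP packet."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    first, _m_pt, _seq, _timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    csrc_count = first & 0x0F  # CC field: number of CSRC entries
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack("!%dI" % csrc_count, packet[12:end]))
    return ssrc, csrcs


# A mixer-style packet: its own SSRC plus two contributing sources.
header = struct.pack("!BBHII", 0x82, 96, 1, 160, 0x11111111)
header += struct.pack("!II", 0xAAAA0001, 0xBBBB0002)
print(parse_rtp_sources(header))
```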

4.6.  Loop Detection

   In complex topologies with multiple interconnected domains, it is
   possible to unintentionally form media loops.  RTP and RTCP support
   detecting such loops, as long as the SSRC and CSRC identities are
   maintained and correctly set in forwarded packets.  Loop detection
   will work in ASM, SSM, Mesh with joint RTP session, and relay.  It is
   likely that loop detection works for the video-switching MCU,
   Section 3.8, at least as long as it forwards the RTCP between the
   endpoints.  However, the Back-to-Back RTP sessions, Mesh with
   independent RTP sessions, and SFMs will definitely break the loop
   detection mechanism.
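
   A heavily simplified sketch of the identifier bookkeeping that loop
   detection relies on is given below.  It is loosely modeled on the
   collision and loop handling in RFC 3550 but is not that algorithm:
   the class name and return values are invented for the example, and
   it does not try to tell a loop from an SSRC collision.

```python
class SourceTable:
    """Track which transport address each SSRC was first seen from."""

    def __init__(self, own_ssrc):
        self.own_ssrc = own_ssrc
        self.seen = {}  # SSRC -> transport address (host, port)

    def check(self, ssrc, csrcs, addr):
        # Our own identifier coming back, as SSRC or in a CSRC list,
        # points to a collision or to our own traffic being looped.
        if ssrc == self.own_ssrc or self.own_ssrc in csrcs:
            return "own-identifier"
        known = self.seen.setdefault(ssrc, addr)
        if known != addr:
            # Same SSRC from a second address: collision or loop.
            return "conflict"
        return "ok"


table = SourceTable(own_ssrc=0x1)
print(table.check(0x2, [], ("192.0.2.2", 5004)))
print(table.check(0x2, [], ("192.0.2.9", 5004)))
print(table.check(0x3, [0x1], ("192.0.2.3", 5004)))
```

   The check only works if middleboxes preserve the SSRC and CSRC
   values, which is why topologies that rewrite or terminate them
   defeat it.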

4.7.  Consistency between Header Extensions and RTCP

   Some RTP header extensions have relevance not only end to end but
   also hop to hop, meaning at least some of the middleboxes in the path
   are aware of their potential presence through signaling, intercept
   and interpret such header extensions, and potentially also rewrite or
   generate them.  Modern header extensions generally follow "A General
   Mechanism for RTP Header Extensions" [RFC5285], which allows for all
   of the above.  Examples of such header extensions include the Media
   ID (MID) in [SDP-BUNDLE].  At the time of writing, there was also a
   proposal for how to include some SDES into an RTP header extension
   [SDES-EXT].
   When such header extensions are in use, any middlebox that
   understands them must ensure consistency between the extensions it
   sees and/or generates and the RTCP it receives and generates.  For
   example, the MID of the bundle is sent in an RTP header extension and
   also in an RTCP SDES message.  This apparent redundancy was
   introduced as unaware middleboxes may choose to discard RTP header
   extensions.  Obviously, inconsistency between the MID sent in the RTP
   header extension and in the RTCP SDES message could lead to
   undesirable results, and, therefore, consistency is needed.
   Middleboxes unaware of the nature of a header extension, as specified
   in [RFC5285], are free to forward or discard header extensions.
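
   A middlebox-side consistency check for the MID example could look
   roughly like the sketch below.  It parses the element area of a
   one-byte-header extension block as described in [RFC5285] and
   compares the MID value with the one learned from RTCP SDES.  The
   function names and the mid_ext_id parameter are assumptions; the
   actual extension identifier is assigned by signaling.

```python
def parse_one_byte_elements(ext_data):
    """Parse one-byte-header extension elements into {id: value}.

    ext_data is the element area that follows the 0xBEDE marker and
    the 16-bit length field of the RTP header extension.
    """
    elements = {}
    i = 0
    while i < len(ext_data):
        b = ext_data[i]
        if b == 0:  # padding byte
            i += 1
            continue
        ext_id, length = b >> 4, (b & 0x0F) + 1
        if ext_id == 15:  # reserved identifier: stop processing
            break
        value = ext_data[i + 1:i + 1 + length]
        if len(value) < length:
            raise ValueError("truncated extension element")
        elements[ext_id] = value
        i += 1 + length
    return elements


def mid_is_consistent(ext_data, sdes_mid, mid_ext_id):
    """True if the MID in the header extension, when present, equals
    the MID received in RTCP SDES for the same source."""
    mid = parse_one_byte_elements(ext_data).get(mid_ext_id)
    return mid is None or mid.decode("ascii") == sdes_mid


data = bytes([0x12]) + b"a1v" + bytes([0])  # id 1, 3 bytes, padding
print(parse_one_byte_elements(data))
print(mid_is_consistent(data, "a1v", mid_ext_id=1))
```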

5.  Comparison of Topologies

   The table below attempts to summarize the properties of the different
   topologies.  The topology abbreviations are as follows:
   Topo-Point-to-Point (PtP), Topo-ASM (ASM), Topo-SSM (SSM), Topo-Trn-
   Translator (TT), Topo-Media-Translator (including Transport
   Translator) (MT), Topo-Mesh with joint session (MJS), Topo-Mesh with
   individual sessions (MIS), Topo-Mixer (Mix), Topo-Asymmetric (ASY),
   Topo-Video-switch-MCU (VSM), Topo-RTCP-terminating-MCU (RTM), and
   Selective Forwarding Middlebox (SFM).  In the table below, Y
   indicates Yes or full support, N indicates No support, (Y) indicates
   partial support, and N/A indicates not applicable.

   Property             PtP  ASM SSM  TT MT MJS MIS Mix ASY VSM RTM SFM
   All-to-All Media      N    Y  (Y)  Y  Y   Y  (Y) (Y) (Y) (Y) (Y) (Y)
   Interoperability      N/A  N   N   Y  Y   Y   Y   Y   Y   N   Y   Y
   Per-Domain Adaptation N/A  N   N   N  Y   N   Y   Y   Y   N   Y   Y
   Aggregation of Media  N    N   N   N  N   N   N   Y  (Y)  Y   Y   N
   Full Session View     Y    Y   Y   Y  Y   Y   N   Y   Y  (Y)  N   Y
   Loop Detection        Y    Y   Y   Y  Y   Y   N   Y   Y  (Y)  N   N

   Please note that the Media Translator also includes the Transport
   Translator functionality.
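
   For readers who want to query the comparison programmatically, the
   table can be transcribed into a small data structure.  The sketch
   below copies the cell values from the table above; the variable and
   function names are assumptions made for this example.

```python
TOPOLOGIES = "PtP ASM SSM TT MT MJS MIS Mix ASY VSM RTM SFM".split()

# One string of cells per property, in the column order of TOPOLOGIES.
ROWS = {
    "All-to-All Media":      "N Y (Y) Y Y Y (Y) (Y) (Y) (Y) (Y) (Y)",
    "Interoperability":      "N/A N N Y Y Y Y Y Y N Y Y",
    "Per-Domain Adaptation": "N/A N N N Y N Y Y Y N Y Y",
    "Aggregation of Media":  "N N N N N N N Y (Y) Y Y N",
    "Full Session View":     "Y Y Y Y Y Y N Y Y (Y) N Y",
    "Loop Detection":        "Y Y Y Y Y Y N Y Y (Y) N N",
}

TABLE = {prop: dict(zip(TOPOLOGIES, cells.split()))
         for prop, cells in ROWS.items()}


def topologies_with(properties, allow_partial=False):
    """List topologies that support every property in `properties`."""
    accepted = {"Y", "(Y)"} if allow_partial else {"Y"}
    return [t for t in TOPOLOGIES
            if all(TABLE[p][t] in accepted for p in properties)]


wanted = ["Aggregation of Media", "Full Session View"]
print(topologies_with(wanted))
print(topologies_with(wanted, allow_partial=True))
```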

6.  Security Considerations

   The use of mixers, SFMs, and translators has an impact on security and
   the security functions used.  The primary issue is that mixers, SFMs,
   and translators modify packets, thus preventing the use of integrity
   and source authentication, unless they are trusted devices that take
   part in the security context, e.g., the device can send Secure Real-
   time Transport Protocol (SRTP) and Secure Real-time Transport Control
   Protocol (SRTCP) [RFC3711] packets to endpoints in the Communication
   Session.  If encryption is employed, the Media Translator, SFM, and
   mixer need to be able to decrypt the media to perform their function.
   A Transport Translator may be used without access to the encrypted
   payload in cases where it translates parts that are not included in
   the encryption and integrity protection, for example, IP address and
   UDP port numbers in a media stream using SRTP [RFC3711].  However, in
   general, the translator, SFM, or mixer needs to be part of the
   signaling context and get the necessary security associations (e.g.,
   SRTP crypto contexts) established with its RTP session participants.

   Including the mixer, SFM, and translator in the security context
   allows the entity, if subverted or misbehaving, to perform a number
   of very serious attacks as it has full access.  It can perform all
   the attacks possible (see RFC 3550 and any applicable profiles) as if
   the media session were not protected at all, while giving the
   impression to the human session participants that they are protected.

   Transport Translators have no interactions with cryptography that
   works above the transport layer, such as SRTP, since that sort of
   translator leaves the RTP header and payload unaltered.  Media
   Translators, on the other hand, have strong interactions with
   cryptography, since they alter the RTP payload.  A Media Translator
   in a session that uses cryptographic protection needs to perform
   cryptographic processing on both inbound and outbound packets.
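
   The difference can be illustrated with a toy integrity check.  The
   sketch below computes a truncated HMAC-SHA1 tag over an RTP packet,
   in the spirit of SRTP authentication but much simplified: it skips
   encryption, key derivation, and the rollover counter that real SRTP
   includes.  The key, packet bytes, and function names are invented
   for the example.

```python
import hashlib
import hmac


def auth_tag(key, rtp_packet):
    """80-bit truncated HMAC-SHA1 over the RTP header and payload."""
    return hmac.new(key, rtp_packet, hashlib.sha1).digest()[:10]


def verify(key, rtp_packet, tag):
    return hmac.compare_digest(auth_tag(key, rtp_packet), tag)


key = b"k" * 20
packet = b"\x80\x60" + bytes(10) + b"encoded-media"
tag = auth_tag(key, packet)

# Transport Translator: only the IP address and UDP port change, and
# those are outside the protected RTP packet, so the tag still checks.
relayed = packet
print(verify(key, relayed, tag))

# Media Translator: the payload is rewritten, so the original tag no
# longer matches and the translator has to produce a new one.
transcoded = packet[:12] + b"other-encoding"
print(verify(key, transcoded, tag))
```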

   A Media Translator may need to use different cryptographic keys for
   the inbound and outbound processing.  For SRTP, different keys are
   required, because an RFC 3550 Media Translator leaves the SSRC
   unchanged during its packet processing, and SRTP key sharing is only
   allowed when distinct SSRCs can be used to protect distinct packet

   When the Media Translator uses different keys to process inbound and
   outbound packets, each session participant needs to be provided with
   the appropriate key, depending on whether they are listening to the
   translator or the original source.  (Note that there is an
   architectural difference between RTP media translation, in which
   participants can rely on the RTP payload type field of a packet to
   determine appropriate processing, and cryptographically protected
   media translation, in which participants must use information that is
   not carried in the packet.)

   When using security mechanisms with translators, SFMs, and mixers, it
   is possible that the translator, SFM, or mixer could create different
   security associations for the different domains they are working in.
   Doing so has some implications:

   First, it might weaken security if the mixer/translator accepts a
   weaker algorithm or key in one domain than in another.
   Therefore, care should be taken that appropriately strong security
   parameters are negotiated in all domains.  In many cases,
   "appropriate" translates to "similar" strength.  If a key-management
   system does allow the negotiation of security parameters resulting in
   a different strength of the security, then this system should notify
   the participants in the other domains about this.

   Second, the number of crypto contexts (keys and security-related
   state) needed (for example, in SRTP [RFC3711]) may vary between
   mixers, SFMs, and translators.  A mixer normally needs to represent
   only a single SSRC per domain and therefore needs to create only one
   security association (SRTP crypto context) per domain.  In contrast,
   a translator needs one security association per participant it
   translates towards, in the opposite domain.  Considering Figure 11,
   the translator needs two security associations towards the multicast
   domain: one for B and one for D.  It may be forced to maintain a set
   of totally independent security associations between itself and B and
   D, respectively, so as to avoid two-time pad occurrences.  These
   contexts must also be capable of handling all the sources present in
   the other domains.  Hence, using completely independent security
   associations (for certain keying mechanisms) may force a translator
   to handle N*DM keys and related state, where N is the total number of
   SSRCs used over all domains and DM is the total number of domains.
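
   The state counts in this paragraph amount to simple arithmetic,
   sketched below.  The function names and the example numbers are
   assumptions; the formulas are the ones stated above (one context per
   domain for a mixer, N*DM for a translator that keeps completely
   independent security associations).

```python
def mixer_contexts(num_domains):
    """A mixer represents a single SSRC per domain."""
    return num_domains


def translator_contexts_independent(total_ssrcs, num_domains):
    """Upper bound N*DM for fully independent security associations."""
    return total_ssrcs * num_domains


# Hypothetical session: 6 SSRCs spread over 3 domains.
print(mixer_contexts(3))
print(translator_contexts_independent(6, 3))
```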

   The ASM, SSM, Relay, and Mesh (with common RTP session) topologies
   each have multiple endpoints that require shared knowledge about the
   different crypto contexts for the endpoints.  These multiparty
   topologies have special requirements on the key management as well as
   the security functions.  Specifically, source authentication in these
   environments has special requirements.

   There exist a number of different mechanisms to provide keys to the
   different participants.  One example is the choice between group keys
   and unique keys per SSRC.  The appropriate keying model is impacted
   by the topologies one intends to use.  The final security properties
   are dependent on both the topologies in use and the keying
   mechanisms' properties and need to be considered by the application.
   Exactly which mechanisms are used is outside of the scope of this
   document.  Please review RTP Security Options [RFC7201] to get a
   better understanding of most of the available options.

7.  References

7.1.  Normative References

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              DOI 10.17487/RFC4585, July 2006.

   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
              B. Burman, Ed., "A Taxonomy of Grouping Semantics and
              Mechanisms for Real-Time Transport Protocol (RTP)
              Sources", RFC 7656, November 2015.

7.2.  Informative References

   [MULTI-STREAM-OPT]
              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
              "Sending Multiple Media Streams in a Single RTP Session:
              Grouping RTCP Reception Statistics and Other Feedback",
              Work in Progress, draft-ietf-avtcore-rtp-multi-stream-
              optimisation-08, October 2015.

   [RFC1112]  Deering, S., "Host extensions for IP multicasting", STD 5,
              RFC 1112, DOI 10.17487/RFC1112, August 1989.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              DOI 10.17487/RFC3022, January 2001.

   [RFC3569]  Bhattacharyya, S., Ed., "An Overview of Source-Specific
              Multicast (SSM)", RFC 3569, DOI 10.17487/RFC3569, July
              2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004.

   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A
              Session Initiation Protocol (SIP) Event Package for
              Conference State", RFC 4575, DOI 10.17487/RFC4575, August
              2006.

   [RFC4607]  Holbrook, H. and B. Cain, "Source-Specific Multicast for
              IP", RFC 4607, DOI 10.17487/RFC4607, August 2006.

   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
              February 2008.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
              DOI 10.17487/RFC5117, January 2008.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
              2008.

   [RFC5760]  Ott, J., Chesterfield, J., and E. Schooler, "RTP Control
              Protocol (RTCP) Extensions for Single-Source Multicast
              Sessions with Unicast Feedback", RFC 5760,
              DOI 10.17487/RFC5760, February 2010.

   [RFC5766]  Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
              Relays around NAT (TURN): Relay Extensions to Session
              Traversal Utilities for NAT (STUN)", RFC 5766,
              DOI 10.17487/RFC5766, April 2010.

   [RFC6285]  Ver Steeg, B., Begen, A., Van Caenegem, T., and Z. Vax,
              "Unicast-Based Rapid Acquisition of Multicast RTP
              Sessions", RFC 6285, DOI 10.17487/RFC6285, June 2011.

   [RFC6465]  Ivov, E., Ed., Marocco, E., Ed., and J. Lennox, "A Real-
              time Transport Protocol (RTP) Header Extension for Mixer-
              to-Client Audio Level Indication", RFC 6465,
              DOI 10.17487/RFC6465, December 2011.

   [RFC7201]  Westerlund, M. and C. Perkins, "Options for Securing RTP
              Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014.

   [SDES-EXT] Westerlund, M., Burman, B., Even, R., and M. Zanaty, "RTP
              Header Extension for RTCP Source Description Items", Work
              in Progress, draft-ietf-avtext-sdes-hdr-ext-02, July 2015.

   [SDP-BUNDLE]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Negotiating Media Multiplexing Using the Session
              Description Protocol (SDP)", Work in Progress,
              draft-ietf-mmusic-sdp-bundle-negotiation-23, July 2015.

Acknowledgements

   The authors would like to thank Mark Baugher, Bo Burman, Ben
   Campbell, Umesh Chandra, Alex Eleftheriadis, Roni Even, Ladan Gharai,
   Geoff Hunt, Suresh Krishnan, Keith Lantz, Jonathan Lennox, Scarlet
   Liuyan, Suhas Nandakumar, Colin Perkins, and Dan Wing for their help
   in reviewing and improving this document.

Authors' Addresses

   Magnus Westerlund
   Farogatan 2
   SE-164 80 Kista

   Phone: +46 10 714 82 87

   Stephan Wenger
   433 Hackensack Ave
   Hackensack, NJ  07601
   United States