
RFC 3550

RTP: A Transport Protocol for Real-Time Applications

Pages: 104
Internet Standard: 64
Obsoletes:  1889
Updated by:  5506, 5761, 6051, 6222, 7022, 7160, 7164, 8083, 8108, 8860
Part 3 of 4 – Pages 53 to 74

Top   ToC   RFC3550 - Page 53   prevText

7. RTP Translators and Mixers

In addition to end systems, RTP supports the notion of "translators" and "mixers", which could be considered as "intermediate systems" at the RTP level. Although this support adds some complexity to the protocol, the need for these functions has been clearly established by experiments with multicast audio and video applications in the Internet. Example uses of translators and mixers given in Section 2.3 stem from the presence of firewalls and low bandwidth connections, both of which are likely to remain.

7.1 General Description

An RTP translator/mixer connects two or more transport-level "clouds". Typically, each cloud is defined by a common network and transport protocol (e.g., IP/UDP) plus a multicast address and transport level destination port or a pair of unicast addresses and ports. (Network-level protocol translators, such as IP version 4 to IP version 6, may be present within a cloud invisibly to RTP.) One system may serve as a translator or mixer for a number of RTP sessions, but each is considered a logically separate entity.

   In order to avoid creating a loop when a translator or mixer is
   installed, the following rules MUST be observed:

   o  Each of the clouds connected by translators and mixers
      participating in one RTP session either MUST be distinct from all
      the others in at least one of these parameters (protocol, address,
      port), or MUST be isolated at the network level from the others.

   o  A derivative of the first rule is that there MUST NOT be multiple
      translators or mixers connected in parallel unless by some
      arrangement they partition the set of sources to be forwarded.
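
   The first rule can be checked mechanically when a translator or
   mixer is configured.  The following is a minimal sketch, not part of
   the specification: the Cloud tuple and the clouds_are_distinct
   function are hypothetical names, and the sketch covers only the
   (protocol, address, port) comparison.  It cannot tell whether two
   identical clouds are isolated at the network level.

```python
from typing import Iterable, NamedTuple


class Cloud(NamedTuple):
    """Hypothetical description of one transport-level cloud."""
    protocol: str   # e.g. "IP/UDP"
    address: str    # multicast or unicast address
    port: int       # transport-level destination port


def clouds_are_distinct(clouds: Iterable[Cloud]) -> bool:
    """True if every cloud differs from all others in at least one parameter."""
    seen = set()
    for cloud in clouds:
        if cloud in seen:
            # Same protocol, address and port as an earlier cloud: connecting
            # both would risk a loop unless they are isolated at the network
            # level, which this sketch does not model.
            return False
        seen.add(cloud)
    return True
```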

   Similarly, all RTP end systems that can communicate through one or
   more RTP translators or mixers share the same SSRC space, that is,
   the SSRC identifiers MUST be unique among all these end systems.
   Section 8.2 describes the collision resolution algorithm by which
   SSRC identifiers are kept unique and loops are detected.

   There may be many varieties of translators and mixers designed for
   different purposes and applications.  Some examples are to add or
   remove encryption, change the encoding of the data or the underlying
   protocols, or replicate between a multicast address and one or more
   unicast addresses.  The distinction between translators and mixers is
   that a translator passes through the data streams from different
   sources separately, whereas a mixer combines them to form one new
   stream:

   Translator: Forwards RTP packets with their SSRC identifier
      intact; this makes it possible for receivers to identify
      individual sources even though packets from all the sources pass
      through the same translator and carry the translator's network
      source address.  Some kinds of translators will pass through the
      data untouched, but others MAY change the encoding of the data and
      thus the RTP data payload type and timestamp.  If multiple data
      packets are re-encoded into one, or vice versa, a translator MUST
      assign new sequence numbers to the outgoing packets.  Losses in
      the incoming packet stream may induce corresponding gaps in the
      outgoing sequence numbers.  Receivers cannot detect the presence
      of a translator unless they know by some other means what payload
      type or transport address was used by the original source.

   Mixer: Receives streams of RTP data packets from one or more
      sources, possibly changes the data format, combines the streams in
      some manner and then forwards the combined stream.  Since the
      timing among multiple input sources will not generally be
      synchronized, the mixer will make timing adjustments among the
      streams and generate its own timing for the combined stream, so it
      is the synchronization source.  Thus, all data packets forwarded
      by a mixer MUST be marked with the mixer's own SSRC identifier.
      In order to preserve the identity of the original sources
      contributing to the mixed packet, the mixer SHOULD insert their
      SSRC identifiers into the CSRC identifier list following the fixed
      RTP header of the packet.  A mixer that is also itself a
      contributing source for some packet SHOULD explicitly include its
      own SSRC identifier in the CSRC list for that packet.

      For some applications, it MAY be acceptable for a mixer not to
      identify sources in the CSRC list.  However, this introduces the
      danger that loops involving those sources could not be detected.

   The advantage of a mixer over a translator for applications like
   audio is that the output bandwidth is limited to that of one source
   even when multiple sources are active on the input side.  This may be
   important for low-bandwidth links.  The disadvantage is that
   receivers on the output side don't have any control over which
   sources are passed through or muted, unless some mechanism is
   implemented for remote control of the mixer.  The regeneration of
   synchronization information by mixers also means that receivers can't
   do inter-media synchronization of the original streams.  A multi-
   media mixer could do it.

         [E1]                                    [E6]
          |                                       |
    E1:17 |                                 E6:15 |
          |                                       |   E6:15
          V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
         (M1)-------------><T1>-----------------><T2>-------------->[E7]
          ^                 ^     E4:47           ^   E4:47
     E2:1 |           E4:47 |                     |   M3:89 (64,45)
          |                 |                     |
         [E2]              [E4]     M3:89 (64,45) |
                                                  |        legend:
   [E3] --------->(M2)----------->(M3)------------|        [End system]
          E3:64        M2:12 (64)  ^                       (Mixer)
                                   | E5:45                 <Translator>
                                   |
                                  [E5]          source: SSRC (CSRCs)
                                                ------------------->

   Figure 3: Sample RTP network with end systems, mixers and translators

   A collection of mixers and translators is shown in Fig. 3 to
   illustrate their effect on SSRC and CSRC identifiers.  In the figure,
   end systems are shown as rectangles (named E), translators as
   triangles (named T) and mixers as ovals (named M).  The notation "M1:
   48(1,17)" designates a packet originating at mixer M1, identified by
   M1's (random) SSRC value of 48 and two CSRC identifiers, 1 and 17,
   copied from the SSRC identifiers of packets from E1 and E2.

7.2 RTCP Processing in Translators

In addition to forwarding data packets, perhaps modified, translators and mixers MUST also process RTCP packets. In many cases, they will take apart the compound RTCP packets received from end systems to aggregate SDES information and to modify the SR or RR packets. Retransmission of this information may be triggered by the packet arrival or by the RTCP interval timer of the translator or mixer itself.

   A translator that does not modify the data packets, for example one
   that just replicates between a multicast address and a unicast
   address, MAY simply forward RTCP packets unmodified as well.  A
   translator that transforms the payload in some way MUST make
   corresponding transformations in the SR and RR information so that it
   still reflects the characteristics of the data and the reception
   quality.  These translators MUST NOT simply forward RTCP packets.  In
   general, a translator SHOULD NOT aggregate SR and RR packets from
   different sources into one packet since that would reduce the
   accuracy of the propagation delay measurements based on the LSR and
   DLSR fields.

   SR sender information:  A translator does not generate its own
      sender information, but forwards the SR packets received from one
      cloud to the others.  The SSRC is left intact but the sender
      information MUST be modified if required by the translation.  If a
      translator changes the data encoding, it MUST change the "sender's
      byte count" field.  If it also combines several data packets into
      one output packet, it MUST change the "sender's packet count"
      field.  If it changes the timestamp frequency, it MUST change the
      "RTP timestamp" field in the SR packet.

   SR/RR reception report blocks:  A translator forwards reception
      reports received from one cloud to the others.  Note that these
      flow in the direction opposite to the data.  The SSRC is left
      intact.  If a translator combines several data packets into one
      output packet, and therefore changes the sequence numbers, it MUST
      make the inverse manipulation for the packet loss fields and the
      "extended last sequence number" field.  This may be complex.  In
      the extreme case, there may be no meaningful way to translate the
      reception reports, so the translator MAY pass on no reception
      report at all or a synthetic report based on its own reception.
      The general rule is to do what makes sense for a particular
      translation.

      A translator does not require an SSRC identifier of its own, but
      MAY choose to allocate one for the purpose of sending reports
      about what it has received.  These would be sent to all the
      connected clouds, each corresponding to the translation of the
      data stream as sent to that cloud, since reception reports are
      normally multicast to all participants.

   SDES:  Translators typically forward without change the SDES
      information they receive from one cloud to the others, but MAY,
      for example, decide to filter non-CNAME SDES information if
      bandwidth is limited.  The CNAMEs MUST be forwarded to allow SSRC
      identifier collision detection to work.  A translator that
      generates its own RR packets MUST send SDES CNAME information
      about itself to the same clouds that it sends those RR packets.

   BYE:  Translators forward BYE packets unchanged.  A translator
      that is about to cease forwarding packets SHOULD send a BYE packet
      to each connected cloud containing all the SSRC identifiers that
      were previously being forwarded to that cloud, including the
      translator's own SSRC identifier if it sent reports of its own.

   APP:  Translators forward APP packets unchanged.
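
   The SDES rule above (non-CNAME items may be dropped when bandwidth
   is limited, CNAMEs must always be forwarded) can be sketched as
   follows.  This is an illustration under stated assumptions: SDES
   chunks are modelled as already-parsed (identifier, items) tuples,
   and filter_sdes and its low_bandwidth flag are hypothetical names.
   The only value taken from RTP itself is the CNAME item type, 1.

```python
from typing import List, Sequence, Tuple

SDES_CNAME = 1  # SDES item type assigned to CNAME

SdesItem = Tuple[int, bytes]             # (item type, item value)
SdesChunk = Tuple[int, List[SdesItem]]   # (SSRC/CSRC identifier, items)


def filter_sdes(chunks: Sequence[SdesChunk], low_bandwidth: bool) -> List[SdesChunk]:
    """Forward SDES chunks, keeping only CNAME items when bandwidth is limited."""
    if not low_bandwidth:
        # Typical case: forward the SDES information without change.
        return [(ident, list(items)) for ident, items in chunks]
    # Limited bandwidth: drop everything except CNAME, which receivers
    # need for SSRC identifier collision detection.
    return [
        (ident, [item for item in items if item[0] == SDES_CNAME])
        for ident, items in chunks
    ]
```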

7.3 RTCP Processing in Mixers

Since a mixer generates a new data stream of its own, it does not pass through SR or RR packets at all and instead generates new information for both sides.

   SR sender information:  A mixer does not pass through sender
      information from the sources it mixes because the characteristics
      of the source streams are lost in the mix.  As a synchronization
      source, the mixer SHOULD generate its own SR packets with sender
      information about the mixed data stream and send them in the same
      direction as the mixed stream.

   SR/RR reception report blocks:  A mixer generates its own reception
      reports for sources in each cloud and sends them out only to the
      same cloud.  It MUST NOT send these reception reports to the other
      clouds and MUST NOT forward reception reports from one cloud to
      the others because the sources would not be SSRCs there (only
      CSRCs).

   SDES:  Mixers typically forward without change the SDES information
      they receive from one cloud to the others, but MAY, for example,
      decide to filter non-CNAME SDES information if bandwidth is
      limited.  The CNAMEs MUST be forwarded to allow SSRC identifier
      collision detection to work.  (An identifier in a CSRC list
      generated by a mixer might collide with an SSRC identifier
      generated by an end system.)  A mixer MUST send SDES CNAME
      information about itself to the same clouds that it sends SR or RR
      packets.

      Since mixers do not forward SR or RR packets, they will typically
      be extracting SDES packets from a compound RTCP packet.  To
      minimize overhead, chunks from the SDES packets MAY be aggregated
      into a single SDES packet which is then stacked on an SR or RR
      packet originating from the mixer.  A mixer which aggregates SDES
      packets will use more RTCP bandwidth than an individual source
      because the compound packets will be longer, but that is
      appropriate since the mixer represents multiple sources.
      Similarly, a mixer which passes through SDES packets as they are
      received will be transmitting RTCP packets at higher than the
      single source rate, but again that is correct since the packets
      come from multiple sources.  The RTCP packet rate may be different
      on each side of the mixer.

      A mixer that does not insert CSRC identifiers MAY also refrain
      from forwarding SDES CNAMEs.  In this case, the SSRC identifier
      spaces in the two clouds are independent.  As mentioned earlier,
      this mode of operation creates a danger that loops can't be
      detected.

   BYE:  Mixers MUST forward BYE packets.  A mixer that is about to
      cease forwarding packets SHOULD send a BYE packet to each
      connected cloud containing all the SSRC identifiers that were
      previously being forwarded to that cloud, including the mixer's
      own SSRC identifier if it sent reports of its own.

   APP:  The treatment of APP packets by mixers is application-specific.

7.4 Cascaded Mixers

An RTP session may involve a collection of mixers and translators as shown in Fig. 3. If two mixers are cascaded, such as M2 and M3 in the figure, packets received by a mixer may already have been mixed and may include a CSRC list with multiple identifiers. The second mixer SHOULD build the CSRC list for the outgoing packet using the CSRC identifiers from already-mixed input packets and the SSRC identifiers from unmixed input packets. This is shown in the output arc from mixer M3 labeled M3:89(64,45) in the figure. As in the case of mixers that are not cascaded, if the resulting CSRC list has more than 15 identifiers, the remainder cannot be included.
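
   The CSRC list construction for a cascaded mixer can be sketched as
   below.  This is a minimal illustration, not a definitive
   implementation: InputPacket and build_csrc_list are hypothetical
   names, and an input packet with an empty CSRC list is assumed to be
   unmixed (a mixer that omits CSRCs would be indistinguishable here).

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

MAX_CSRC = 15  # limit on the number of identifiers in one CSRC list


@dataclass(frozen=True)
class InputPacket:
    """Hypothetical, minimal view of an RTP packet arriving at a mixer."""
    ssrc: int
    csrcs: Tuple[int, ...] = ()  # empty for a packet that has not been mixed


def build_csrc_list(inputs: Sequence[InputPacket]) -> List[int]:
    """Collect the contributing sources for one mixed output packet."""
    out: List[int] = []
    for pkt in inputs:
        # Already-mixed input: carry its CSRC identifiers forward.
        # Unmixed input: its SSRC identifier is the contributing source.
        contributors = pkt.csrcs if pkt.csrcs else (pkt.ssrc,)
        for ident in contributors:
            if ident not in out:
                out.append(ident)
    # Identifiers beyond the limit cannot be included.
    return out[:MAX_CSRC]
```

   With the inputs of mixer M3 in Fig. 3, a packet M2:12 (64) and a
   packet E5:45, build_csrc_list returns [64, 45], matching the arc
   labeled M3:89(64,45).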

8. SSRC Identifier Allocation and Use

The SSRC identifier carried in the RTP header and in various fields of RTCP packets is a random 32-bit number that is required to be globally unique within an RTP session. It is crucial that the number be chosen with care in order that participants on the same network or starting at the same time are not likely to choose the same number. It is not sufficient to use the local network address (such as an IPv4 address) for the identifier because the address may not be unique. Since RTP translators and mixers enable interoperation among multiple networks with different address spaces, the allocation patterns for addresses within two spaces might result in a much higher rate of collision than would occur with random allocation. Multiple sources running on one host would also conflict. It is also not sufficient to obtain an SSRC identifier simply by calling random() without carefully initializing the state. An example of how to generate a random identifier is presented in Appendix A.6.

8.1 Probability of Collision

Since the identifiers are chosen randomly, it is possible that two or more sources will choose the same number. Collision occurs with the highest probability when all sources are started simultaneously, for example when triggered automatically by some session management event. If N is the number of sources and L the length of the identifier (here, 32 bits), the probability that two sources independently pick the same value can be approximated for large N [26] as 1 - exp(-N**2 / 2**(L+1)). For N=1000, the probability is roughly 10**-4.

   The typical collision probability is much lower than the worst-case
   above.  When one new source joins an RTP session in which all the
   other sources already have unique identifiers, the probability of
   collision is just the fraction of numbers used out of the space.
   Again, if N is the number of sources and L the length of the
   identifier, the probability of collision is N / 2**L.  For N=1000,
   the probability is roughly 2*10**-7.

   The probability of collision is further reduced by the opportunity
   for a new source to receive packets from other participants before
   sending its first packet (either data or control).  If the new
   source keeps track of the other participants (by SSRC identifier),
   then before transmitting its first packet the new source can verify
   that its identifier does not conflict with any that have been
   received, or else choose again.
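
   The two estimates above can be reproduced numerically.  The sketch
   below simply evaluates the two expressions from this section; the
   function names are illustrative only.

```python
import math


def simultaneous_start_collision(n: int, bits: int = 32) -> float:
    """Approximate probability that any two of n sources pick the same
    identifier when all start at once: 1 - exp(-N**2 / 2**(L+1))."""
    # -expm1(-x) equals 1 - exp(-x) but keeps precision for small x.
    return -math.expm1(-(n ** 2) / 2 ** (bits + 1))


def late_joiner_collision(n: int, bits: int = 32) -> float:
    """Probability that one new source collides with n existing unique
    identifiers: N / 2**L."""
    return n / 2 ** bits


if __name__ == "__main__":
    print(simultaneous_start_collision(1000))  # on the order of 10**-4
    print(late_joiner_collision(1000))         # on the order of 2*10**-7
```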

8.2 Collision Resolution and Loop Detection

Although the probability of SSRC identifier collision is low, all RTP implementations MUST be prepared to detect collisions and take the appropriate actions to resolve them. If a source discovers at any time that another source is using the same SSRC identifier as its own, it MUST send an RTCP BYE packet for the old identifier and choose another random one. (As explained below, this step is taken only once in case of a loop.) If a receiver discovers that two other sources are colliding, it MAY keep the packets from one and discard the packets from the other when this can be detected by different source transport addresses or CNAMEs. The two sources are expected to resolve the collision so that the situation doesn't last.

   Because the random SSRC identifiers are kept globally unique for
   each RTP session, they can also be used to detect loops that may be
   introduced by mixers or translators.  A loop causes duplication of
   data and control information, either unmodified or possibly mixed,
   as in the following examples:

   o  A translator may incorrectly forward a packet to the same
      multicast group from which it has received the packet, either
      directly or through a chain of translators.  In that case, the
      same packet appears several times, originating from different
      network sources.

   o  Two translators incorrectly set up in parallel, i.e., with the
      same multicast groups on both sides, would both forward packets
      from one multicast group to the other.  Unidirectional translators
      would produce two copies; bidirectional translators would form a
      loop.

   o  A mixer can close a loop by sending to the same transport
      destination upon which it receives packets, either directly or
      through another mixer or translator.  In this case a source might
      show up both as an SSRC on a data packet and a CSRC in a mixed
      data packet.

   A source may discover that its own packets are being looped, or that
   packets from another source are being looped (a third-party loop).
   Both loops and collisions in the random selection of a source
   identifier result in packets arriving with the same SSRC identifier
   but a different source transport address, which may be that of the
   end system originating the packet or an intermediate system.
   Therefore, if a source changes its source transport address, it MAY
   also choose a new SSRC identifier to avoid being interpreted as a
   looped source.  (This is not a MUST because in some applications of RTP
   sources may be expected to change addresses during a session.)  Note
   that if a translator restarts and consequently changes the source
   transport address (e.g., changes the UDP source port number) on which
   it forwards packets, then all those packets will appear to receivers
   to be looped because the SSRC identifiers are applied by the original
   source and will not change.  This problem can be avoided by keeping
   the source transport address fixed across restarts, but in any case
   will be resolved after a timeout at the receivers.

   Loops or collisions occurring on the far side of a translator or
   mixer cannot be detected using the source transport address if all
   copies of the packets go through the translator or mixer; however,
   collisions may still be detected when chunks from two RTCP SDES
   packets contain the same SSRC identifier but different CNAMEs.

   To detect and resolve these conflicts, an RTP implementation MUST
   include an algorithm similar to the one described below, though the
   implementation MAY choose a different policy for which packets from
   colliding third-party sources are kept.  The algorithm described
   below ignores packets from a new source or loop that collide with an
   established source.  It resolves collisions with the participant's
   own SSRC identifier by sending an RTCP BYE for the old identifier and
   choosing a new one.  However, when the collision was induced by a
   loop of the participant's own packets, the algorithm will choose a
   new identifier only once and thereafter ignore packets from the
   looping source transport address.  This is required to avoid a flood
   of BYE packets.

   This algorithm requires keeping a table indexed by the source
   identifier and containing the source transport addresses from the
   first RTP packet and first RTCP packet received with that identifier,
   along with other state for that source.  Two source transport
   addresses are required since, for example, the UDP source port
   numbers may be different on RTP and RTCP packets.  However, it may be
   assumed that the network address is the same in both source transport
   addresses.

   Each SSRC or CSRC identifier received in an RTP or RTCP packet is
   looked up in the source identifier table in order to process that
   data or control information.  The source transport address from the
   packet is compared to the corresponding source transport address in
   the table to detect a loop or collision if they don't match.  For
   control packets, each element with its own SSRC identifier, for
   example an SDES chunk, requires a separate lookup.  (The SSRC
   identifier in a reception report block is an exception because it
   identifies a source heard by the reporter, and that SSRC identifier
   is unrelated to the source transport address of the RTCP packet sent
   by the reporter.)  If the SSRC or CSRC is not found, a new entry is
   created.  These table entries are removed when an RTCP BYE packet is
   received with the corresponding SSRC identifier and validated by a
   matching source transport address, or after no packets have arrived
   for a relatively long time (see Section 6.2.1).

   Note that if two sources on the same host are transmitting with the
   same source identifier at the time a receiver begins operation, it
   would be possible that the first RTP packet received came from one of
   the sources while the first RTCP packet received came from the other.
   This would cause the wrong RTCP information to be associated with the
   RTP data, but this situation should be sufficiently rare and harmless
   that it may be disregarded.

   In order to track loops of the participant's own data packets, the
   implementation MUST also keep a separate list of source transport
   addresses (not identifiers) that have been found to be conflicting.
   As in the source identifier table, two source transport addresses
   MUST be kept to separately track conflicting RTP and RTCP packets.
   Note that the conflicting address list should be short, usually
   empty.  Each element in this list stores the source addresses plus
   the time when the most recent conflicting packet was received.  An
   element MAY be removed from the list when no conflicting packet has
   arrived from that source for a time on the order of 10 RTCP report
   intervals (see Section 6.2).
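
   One possible shape for the conflicting address list is sketched
   below.  It is an assumption-laden illustration: ConflictList, its
   method names, and the (address, port) tuple used for a source
   transport address are all hypothetical.  RTP and RTCP addresses are
   tracked separately, each with the time of its most recent conflicting
   packet, and entries are dropped after about 10 report intervals.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Address = Tuple[str, int]  # (network address, port): an assumed representation


@dataclass
class ConflictList:
    """Conflicting source transport addresses, tracked per packet kind."""
    rtp: Dict[Address, float] = field(default_factory=dict)   # address -> last conflict time
    rtcp: Dict[Address, float] = field(default_factory=dict)  # address -> last conflict time

    def _table(self, is_rtcp: bool) -> Dict[Address, float]:
        return self.rtcp if is_rtcp else self.rtp

    def mark(self, addr: Address, is_rtcp: bool, now: float) -> None:
        """Record or refresh the time of the most recent conflicting packet."""
        self._table(is_rtcp)[addr] = now

    def contains(self, addr: Address, is_rtcp: bool) -> bool:
        return addr in self._table(is_rtcp)

    def expire(self, now: float, rtcp_interval: float, intervals: int = 10) -> None:
        """Remove entries with no conflict for `intervals` report intervals."""
        cutoff = now - intervals * rtcp_interval
        for table in (self.rtp, self.rtcp):
            for addr in [a for a, seen in table.items() if seen < cutoff]:
                del table[addr]
```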

   For the algorithm as shown, it is assumed that the participant's own
   source identifier and state are included in the source identifier
   table.  The algorithm could be restructured to first make a separate
   comparison against the participant's own source identifier.

      if (SSRC or CSRC identifier is not found in the source
          identifier table) {
          create a new entry storing the data or control source
              transport address, the SSRC or CSRC and other state;
      }

      /* Identifier is found in the table */

      else if (table entry was created on receipt of a control packet
               and this is the first data packet or vice versa) {
          store the source transport address from this packet;
      }
      else if (source transport address from the packet does not match
               the one saved in the table entry for this identifier) {
          /* An identifier collision or a loop is indicated */

          if (source identifier is not the participant's own) {
              /* OPTIONAL error counter step */
              if (source identifier is from an RTCP SDES chunk
                  containing a CNAME item that differs from the CNAME
                  in the table entry) {
                  count a third-party collision;
              } else {
                  count a third-party loop;
              }
              abort processing of data packet or control element;
              /* MAY choose a different policy to keep new source */
          }

          /* A collision or loop of the participant's own packets */

          else if (source transport address is found in the list of
                   conflicting data or control source transport
                   addresses) {
              /* OPTIONAL error counter step */
              if (source identifier is not from an RTCP SDES chunk
                  containing a CNAME item or CNAME is the
                  participant's own) {
                  count occurrence of own traffic looped;
              }
              mark current time in conflicting address list entry;
              abort processing of data packet or control element;
          }

          /* New collision, change SSRC identifier */

          else {
              log occurrence of a collision;
              create a new entry in the conflicting data or control
                  source transport address list and mark current time;
              send an RTCP BYE packet with the old SSRC identifier;
              choose a new SSRC identifier;
              create a new entry in the source identifier table with
                  the old SSRC plus the source transport address from
                  the data or control packet being processed;
          }
      }
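
   The pseudocode above can be turned into a small runnable sketch.
   The version below is one interpretation under stated assumptions,
   not a reference implementation: the Participant and Entry classes,
   the accept method, and the send_bye callback are hypothetical; the
   OPTIONAL error counters are omitted; table timeouts are not
   modelled; and Python's secrets.randbits stands in for the identifier
   generator of Appendix A.6.  A newly drawn identifier is re-drawn if
   it is already present in the source identifier table.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Address = Tuple[str, int]  # (network address, port): an assumed representation


@dataclass
class Entry:
    """Source identifier table entry (other per-source state omitted)."""
    rtp_addr: Optional[Address] = None
    rtcp_addr: Optional[Address] = None


class Participant:
    def __init__(self, own_rtp: Address, own_rtcp: Address,
                 send_bye: Callable[[int], None]) -> None:
        self.own_rtp, self.own_rtcp = own_rtp, own_rtcp
        self.send_bye = send_bye  # hypothetical hook that emits an RTCP BYE
        self.table: Dict[int, Entry] = {}
        self.conflicts: Dict[Tuple[Address, bool], float] = {}
        self.ssrc = self._adopt_new_ssrc()

    def _adopt_new_ssrc(self) -> int:
        """Pick an unused 32-bit identifier and enter it in the table."""
        while True:
            candidate = secrets.randbits(32)
            if candidate not in self.table:  # re-draw if already in use
                self.table[candidate] = Entry(self.own_rtp, self.own_rtcp)
                return candidate

    def accept(self, ident: int, addr: Address, is_rtcp: bool, now: float) -> bool:
        """Return False when processing of this packet or element is aborted."""
        entry = self.table.get(ident)
        if entry is None:  # identifier not found: create a new entry
            entry = self.table[ident] = Entry()
        saved = entry.rtcp_addr if is_rtcp else entry.rtp_addr
        if saved is None:  # first data packet after control, or vice versa
            if is_rtcp:
                entry.rtcp_addr = addr
            else:
                entry.rtp_addr = addr
            return True
        if saved == addr:
            return True
        # Address mismatch: an identifier collision or a loop is indicated.
        if ident != self.ssrc:
            return False  # third party: keep the established source
        key = (addr, is_rtcp)
        if key in self.conflicts:  # known loop of the participant's own packets
            self.conflicts[key] = now
            return False
        # New collision: send BYE for the old identifier, then change SSRC.
        self.conflicts[key] = now
        old = self.ssrc
        self.send_bye(old)
        self.ssrc = self._adopt_new_ssrc()
        # The old identifier now belongs to the conflicting source.
        self.table[old] = Entry(rtcp_addr=addr) if is_rtcp else Entry(rtp_addr=addr)
        return True  # the pseudocode does not abort processing in this branch
```

   A caller would invoke accept once per SSRC or CSRC identifier in a
   received packet (excluding the SSRC in reception report blocks) and
   skip the packet or control element when it returns False.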

   In this algorithm, packets from a newly conflicting source address
   will be ignored and packets from the original source address will be
   kept.  If no packets arrive from the original source for an extended
   period, the table entry will be timed out and the new source will be
   able to take over.  This might occur if the original source detects
   the collision and moves to a new source identifier, but in the usual
   case an RTCP BYE packet will be received from the original source to
   delete the state without having to wait for a timeout.

   If the original source address was received through a mixer (i.e.,
   learned as a CSRC) and later the same source is received directly,
   the receiver may be well advised to switch to the new source address
   unless other sources in the mix would be lost.  Furthermore, for
   applications such as telephony in which some sources such as mobile
   entities may change addresses during the course of an RTP session,
   the RTP implementation SHOULD modify the collision detection
   algorithm to accept packets from the new source transport address.
   To guard against flip-flopping between addresses if a genuine
   collision does occur, the algorithm SHOULD include some means to
   detect this case and avoid switching.

   When a new SSRC identifier is chosen due to a collision, the
   candidate identifier SHOULD first be looked up in the source
   identifier table to see if it was already in use by some other
   source.  If so, another candidate MUST be generated and the process
   repeated.

   A loop of data packets to a multicast destination can cause severe
   network flooding.  All mixers and translators MUST implement a loop
   detection algorithm like the one here so that they can break loops.
   This should limit the excess traffic to no more than one duplicate
   copy of the original traffic, which may allow the session to continue
   so that the cause of the loop can be found and fixed.  However, in
   extreme cases where a mixer or translator does not properly break the
   loop and high traffic levels result, it may be necessary for end
   systems to cease transmitting data or control packets entirely.  This
   decision may depend upon the application.  An error condition SHOULD
   be indicated as appropriate.  Transmission MAY be attempted again
   periodically after a long, random time (on the order of minutes).

8.3 Use with Layered Encodings

For layered encodings transmitted on separate RTP sessions (see Section 2.4), a single SSRC identifier space SHOULD be used across the sessions of all layers and the core (base) layer SHOULD be used for SSRC identifier allocation and collision resolution. When a source discovers that it has collided, it transmits an RTCP BYE packet on only the base layer but changes the SSRC identifier to the new value in all layers.

9. Security

Lower layer protocols may eventually provide all the security services that may be desired for applications of RTP, including authentication, integrity, and confidentiality. These services have been specified for IP in [27]. Since the initial audio and video applications using RTP needed a confidentiality service before such services were available for the IP layer, the confidentiality service described in the next section was defined for use with RTP and RTCP. That description is included here to codify existing practice. New applications of RTP MAY implement this RTP-specific confidentiality service for backward compatibility, and/or they MAY implement alternative security services. The overhead on the RTP protocol for this confidentiality service is low, so the penalty will be minimal if this service is obsoleted by other services in the future.

   Alternatively, other services, other implementations of services and
   other algorithms may be defined for RTP in the future.  In
   particular, an RTP profile called Secure Real-time Transport
   Protocol (SRTP) [28] is being developed to provide confidentiality
   of the RTP payload while leaving the RTP header in the clear so that
   link-level header compression algorithms can still operate.  It is
   expected that SRTP will be the correct choice for many applications.
   SRTP is based on the Advanced Encryption Standard (AES) and provides
   stronger security than the service described here.  No claim is made
   that the methods presented here are appropriate for a particular
   security need.  A profile may specify which services and algorithms
   should be offered by applications, and may provide guidance as to
   their appropriate use.

   Key distribution and certificates are outside the scope of this
   document.

9.1 Confidentiality

Confidentiality means that only the intended receiver(s) can decode the received packets; for others, the packet contains no useful information. Confidentiality of the content is achieved by encryption. When it is desired to encrypt RTP or RTCP according to the method specified in this section, all the octets that will be encapsulated for transmission in a single lower-layer packet are encrypted as a unit. For RTCP, a 32-bit random number redrawn for each unit MUST be prepended to the unit before encryption. For RTP, no prefix is prepended; instead, the sequence number and timestamp fields are initialized with random offsets. This is considered to be a weak
   initialization vector (IV) because of poor randomness properties.  In
   addition, if the subsequent field, the SSRC, can be manipulated by an
   enemy, there is further weakness of the encryption method.

   For RTCP, an implementation MAY segregate the individual RTCP packets
   in a compound RTCP packet into two separate compound RTCP packets,
   one to be encrypted and one to be sent in the clear.  For example,
   SDES information might be encrypted while reception reports were sent
   in the clear to accommodate third-party monitors that are not privy
   to the encryption key.  In this example, depicted in Fig. 4, the SDES
   information MUST be appended to an RR packet with no reports (and the
   random number) to satisfy the requirement that all compound RTCP
   packets begin with an SR or RR packet.  The SDES CNAME item is
   required in either the encrypted or unencrypted packet, but not both.
   The same SDES information SHOULD NOT be carried in both packets as
   this may compromise the encryption.

             UDP packet                     UDP packet
   -----------------------------  ------------------------------
   [random][RR][SDES #CNAME ...]  [SR #senderinfo #site1 #site2]
   -----------------------------  ------------------------------
             encrypted                     not encrypted

   #: SSRC identifier

       Figure 4: Encrypted and non-encrypted RTCP packets

   The presence of encryption and the use of the correct key are
   confirmed by the receiver through header or payload validity checks.
   Examples of such validity checks for RTP and RTCP headers are given
   in Appendices A.1 and A.2.

   To be consistent with existing implementations of the initial
   specification of RTP in RFC 1889, the default encryption algorithm is
   the Data Encryption Standard (DES) algorithm in cipher block chaining
   (CBC) mode, as described in Section 1.1 of RFC 1423 [29], except that
   padding to a multiple of 8 octets is indicated as described for the P
   bit in Section 5.1.  The initialization vector is zero because random
   values are supplied in the RTP header or by the random prefix for
   compound RTCP packets.  For details on the use of CBC initialization
   vectors, see [30].
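
The padding rule referenced above (the P bit of Section 5.1, where the last padding octet carries the count of padding octets including itself) can be sketched as follows. This is a minimal illustration only: the function names and the 8-octet block constant are assumptions introduced here, the input is assumed to be an unpadded RTP packet, and no encryption is performed.

```python
BLOCK = 8          # DES block size in octets
RTP_FIXED_HDR = 12 # length of the fixed RTP header in octets
P_BIT = 0x20       # padding bit in the first octet of the RTP header


def pad_rtp_packet(packet: bytes, block: int = BLOCK) -> bytes:
    """Pad an unpadded RTP packet to a multiple of `block` octets."""
    if len(packet) < RTP_FIXED_HDR:
        raise ValueError("RTP packet shorter than the fixed header")
    pad = (-len(packet)) % block
    if pad == 0:
        return packet  # already aligned; returned unchanged
    # Zero-filled padding whose last octet is the padding count.
    padding = bytes(pad - 1) + bytes([pad])
    return bytes([packet[0] | P_BIT]) + packet[1:] + padding


def strip_rtp_padding(packet: bytes) -> bytes:
    """Remove padding from a decrypted RTP packet if the P bit is set."""
    if not packet[0] & P_BIT:
        return packet
    count = packet[-1]
    if count == 0 or count > len(packet) - RTP_FIXED_HDR:
        raise ValueError("invalid padding count")
    return bytes([packet[0] & ~P_BIT & 0xFF]) + packet[1:-count]
```

A 17-octet packet, for example, gains 7 padding octets so that the cipher sees exactly three 8-octet blocks.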

   Implementations that support the encryption method specified here
   SHOULD always support the DES algorithm in CBC mode as the default
   cipher for this method to maximize interoperability.  This method was
   chosen because it has been demonstrated to be easy and practical to
   use in experimental audio and video tools in operation on the
   Internet.  However, DES has since been found to be too easily broken.
   It is RECOMMENDED that stronger encryption algorithms such as
   Triple-DES be used in place of the default algorithm.  Furthermore,
   secure CBC mode requires that the first block of each packet be XORed
   with a random, independent IV of the same size as the cipher's block
   size.  For RTCP, this is (partially) achieved by prepending each
   packet with a 32-bit random number, independently chosen for each
   packet.  For RTP, the timestamp and sequence number start from random
   values, but consecutive packets will not be independently randomized.
   It should be noted that the randomness in both cases (RTP and RTCP)
   is limited.  High-security applications SHOULD consider other, more
   conventional, protection means.  Other encryption algorithms MAY be
   specified dynamically for a session by non-RTP means.  In particular,
   the SRTP profile [28] based on AES is being developed to take into
   account known plaintext and CBC plaintext manipulation concerns, and
   will be the correct choice in the future.

   As an alternative to encryption at the IP level or at the RTP level
   as described above, profiles MAY define additional payload types for
   encrypted encodings.  Those encodings MUST specify how padding and
   other aspects of the encryption are to be handled.  This method
   allows encrypting only the data while leaving the headers in the
   clear for applications where that is desired.  It may be particularly
   useful for hardware devices that will handle both decryption and
   decoding.  It is also valuable for applications where link-level
   compression of RTP and lower-layer headers is desired and
   confidentiality of the payload (but not addresses) is sufficient
   since encryption of the headers precludes compression.

9.2 Authentication and Message Integrity

Authentication and message integrity services are not defined at the RTP level since these services would not be directly feasible without a key management infrastructure. It is expected that authentication and integrity services will be provided by lower layer protocols.

10. Congestion Control

All transport protocols used on the Internet need to address congestion control in some way [31]. RTP is not an exception, but because the data transported over RTP is often inelastic (generated at a fixed or controlled rate), the means to control congestion in RTP may be quite different from those for other transport protocols such as TCP. In one sense, inelasticity reduces the risk of congestion because the RTP stream will not expand to consume all available bandwidth as a TCP stream can. However, inelasticity also means that the RTP stream cannot arbitrarily reduce its load on the network to eliminate congestion when it occurs.
   Since RTP may be used for a wide variety of applications in many
   different contexts, there is no single congestion control mechanism
   that will work for all.  Therefore, congestion control SHOULD be
   defined in each RTP profile as appropriate.  For some profiles, it
   may be sufficient to include an applicability statement restricting
   the use of that profile to environments where congestion is avoided
   by engineering.  For other profiles, specific methods such as data
   rate adaptation based on RTCP feedback may be required.

11. RTP over Network and Transport Protocols

This section describes issues specific to carrying RTP packets within particular network and transport protocols. The following rules apply unless superseded by protocol-specific definitions outside this specification.

RTP relies on the underlying protocol(s) to provide demultiplexing of RTP data and RTCP control streams. For UDP and similar protocols, RTP SHOULD use an even destination port number and the corresponding RTCP stream SHOULD use the next higher (odd) destination port number. For applications that take a single port number as a parameter and derive the RTP and RTCP port pair from that number, if an odd number is supplied then the application SHOULD replace that number with the next lower (even) number to use as the base of the port pair. For applications in which the RTP and RTCP destination port numbers are specified via explicit, separate parameters (using a signaling protocol or other means), the application MAY disregard the restrictions that the port numbers be even/odd and consecutive although the use of an even/odd port pair is still encouraged. The RTP and RTCP port numbers MUST NOT be the same since RTP relies on the port numbers to demultiplex the RTP data and RTCP control streams.

In a unicast session, both participants need to identify a port pair for receiving RTP and RTCP packets. Both participants MAY use the same port pair. A participant MUST NOT assume that the source port of the incoming RTP or RTCP packet can be used as the destination port for outgoing RTP or RTCP packets. When RTP data packets are being sent in both directions, each participant's RTCP SR packets MUST be sent to the port that the other participant has specified for reception of RTCP. The RTCP SR packets combine sender information for the outgoing data plus reception report information for the incoming data. If a side is not actively sending data (see Section 6.4), an RTCP RR packet is sent instead.
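
For an application that takes a single port number as a parameter, the derivation of the RTP/RTCP port pair described above might look like the sketch below. The function name and the range check are illustrative assumptions, not part of this specification.

```python
from typing import Tuple


def rtp_port_pair(port: int) -> Tuple[int, int]:
    """Derive (RTP port, RTCP port) from a single configured port number.

    An odd number is replaced by the next lower even number, which
    becomes the RTP port; RTCP uses the next higher (odd) port.
    """
    if not 2 <= port <= 65535:
        raise ValueError("port out of range for an even/odd pair")
    rtp = port & ~1  # clear the low bit: next lower even number
    return rtp, rtp + 1
```

With this sketch, both `rtp_port_pair(5004)` and `rtp_port_pair(5005)` yield the pair (5004, 5005).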
It is RECOMMENDED that layered encoding applications (see Section 2.4) use a set of contiguous port numbers. The port numbers MUST be distinct because of a widespread deficiency in existing operating
   systems that prevents use of the same port with multiple multicast
   addresses, and for unicast, there is only one permissible address.
   Thus for layer n, the data port is P + 2n, and the control port is P
   + 2n + 1.  When IP multicast is used, the addresses MUST also be
   distinct because multicast routing and group membership are managed
   on an address granularity.  However, allocation of contiguous IP
   multicast addresses cannot be assumed because some groups may require
   different scopes and may therefore be allocated from different
   address ranges.
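
The per-layer port assignment above (data port P + 2n, control port P + 2n + 1) can be written out as a short sketch. The function name is hypothetical, and the check that the base port P is even is an assumption carried over from the even/odd port pair convention.

```python
from typing import Tuple


def layer_ports(base_port: int, layer: int) -> Tuple[int, int]:
    """Return (data port, control port) for layer n of a layered encoding."""
    if base_port % 2 != 0:
        raise ValueError("base port P is assumed to be even")
    if layer < 0:
        raise ValueError("layer index must be non-negative")
    data = base_port + 2 * layer
    control = data + 1
    if control > 65535:
        raise ValueError("port pair exceeds the valid port range")
    return data, control
```

For P = 5004, layers 0, 1 and 2 map to (5004, 5005), (5006, 5007) and (5008, 5009), giving the contiguous set of ports that is recommended.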

   The previous paragraph conflicts with the SDP specification, RFC 2327
   [15], which says that it is illegal for both multiple addresses and
   multiple ports to be specified in the same session description
   because the association of addresses with ports could be ambiguous.
   It is intended that this restriction will be relaxed in a revision of
   RFC 2327 to allow an equal number of addresses and ports to be
   specified with a one-to-one mapping implied.

   RTP data packets contain no length field or other delineation,
   therefore RTP relies on the underlying protocol(s) to provide a
   length indication.  The maximum length of RTP packets is limited only
   by the underlying protocols.

   If RTP packets are to be carried in an underlying protocol that
   provides the abstraction of a continuous octet stream rather than
   messages (packets), an encapsulation of the RTP packets MUST be
   defined to provide a framing mechanism.  Framing is also needed if
   the underlying protocol may contain padding so that the extent of the
   RTP payload cannot be determined.  The framing mechanism is not
   defined here.
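
Since this document leaves the framing mechanism undefined, the following is only one possible illustration: each packet is preceded by a 16-bit length in network byte order, which is the approach later taken by RFC 4571 for connection-oriented transports. The function names are assumptions made for this sketch.

```python
import struct
from typing import List, Tuple


def frame(packet: bytes) -> bytes:
    """Prefix one RTP or RTCP packet with its length as a 16-bit integer."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too long for a 16-bit length prefix")
    return struct.pack("!H", len(packet)) + packet


def deframe(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a byte stream into complete packets plus unconsumed bytes."""
    packets = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # the rest of this packet has not arrived yet
        packets.append(buffer[offset + 2:offset + 2 + length])
        offset += 2 + length
    return packets, buffer[offset:]
```

A receiver would append newly read bytes to the leftover returned by `deframe` and call it again, so packets that straddle read boundaries are reassembled.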

   A profile MAY specify a framing method to be used even when RTP is
   carried in protocols that do provide framing in order to allow
   carrying several RTP packets in one lower-layer protocol data unit,
   such as a UDP packet.  Carrying several RTP packets in one network or
   transport packet reduces header overhead and may simplify
   synchronization between different streams.

12. Summary of Protocol Constants

This section contains a summary listing of the constants defined in this specification. The RTP payload type (PT) constants are defined in profiles rather than this document. However, the octet of the RTP header which contains the marker bit(s) and payload type MUST avoid the reserved values 200 and 201 (decimal) to distinguish RTP packets from the RTCP SR and RR packet types for the header validation procedure described
   in Appendix A.1.  For the standard definition of one marker bit and a
   7-bit payload type field as shown in this specification, this
   restriction means that payload types 72 and 73 are reserved.
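
The arithmetic behind the reserved payload types can be checked with a short sketch. It assumes the standard layout of one marker bit followed by a 7-bit payload type; the names are illustrative.

```python
from typing import Tuple

RTCP_SR = 200  # RTCP sender report packet type
RTCP_RR = 201  # RTCP receiver report packet type


def split_marker_and_pt(octet: int) -> Tuple[int, int]:
    """Split the second RTP header octet into (marker bit, payload type)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("not an octet")
    return octet >> 7, octet & 0x7F


# Octet values 200 and 201 both have the marker bit set; their low
# seven bits are the payload types that must be avoided.
RESERVED_PAYLOAD_TYPES = sorted(
    split_marker_and_pt(value)[1] for value in (RTCP_SR, RTCP_RR)
)
```

Here 200 is 11001000 in binary (marker 1, payload type 72) and 201 is 11001001 (marker 1, payload type 73).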

12.1 RTCP Packet Types

   abbrev.  name                 value
   SR       sender report          200
   RR       receiver report        201
   SDES     source description     202
   BYE      goodbye                203
   APP      application-defined    204

These type values were chosen in the range 200-204 for improved header validity checking of RTCP packets compared to RTP packets or other unrelated packets. When the RTCP packet type field is compared to the corresponding octet of the RTP header, this range corresponds to the marker bit being 1 (which it usually is not in data packets) and to the high bit of the standard payload type field being 1 (since the static payload types are typically defined in the low half). This range was also chosen to be some distance numerically from 0 and 255 since all-zeros and all-ones are common data patterns.

Since all compound RTCP packets MUST begin with SR or RR, these codes were chosen as an even/odd pair to allow the RTCP validity check to test the maximum number of bits with mask and value.

Additional RTCP packet types may be registered through IANA (see Section 15).
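
The mask-and-value test mentioned above can be sketched as follows, in the spirit of the validity check in Appendix A.2. The enumeration and function name are assumptions for illustration; the check covers only the first 16 bits of the first packet in a compound RTCP packet.

```python
import struct
from enum import IntEnum


class RtcpType(IntEnum):
    SR = 200    # sender report
    RR = 201    # receiver report
    SDES = 202  # source description
    BYE = 203   # goodbye
    APP = 204   # application-defined


RTP_VERSION = 2
# Version bits (0xC000), padding bit (0x2000), and the packet type with
# its lowest bit ignored (0xFE) so that SR and RR both match.
RTCP_VALID_MASK = 0xC000 | 0x2000 | 0xFE
RTCP_VALID_VALUE = (RTP_VERSION << 14) | RtcpType.SR


def first_rtcp_header_ok(packet: bytes) -> bool:
    """Check version 2, no padding, and type SR or RR in one comparison."""
    if len(packet) < 4:
        return False
    (word,) = struct.unpack_from("!H", packet, 0)
    return (word & RTCP_VALID_MASK) == RTCP_VALID_VALUE
```

Because 200 and 201 differ only in the lowest bit, a single masked comparison accepts both SR and RR while rejecting SDES, BYE and APP in the first position.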

12.2 SDES Types

   abbrev.  name                            value
   END      end of SDES list                    0
   CNAME    canonical name                      1
   NAME     user name                           2
   EMAIL    user's electronic mail address      3
   PHONE    user's phone number                 4
   LOC      geographic user location            5
   TOOL     name of application or tool         6
   NOTE     notice about the source             7
   PRIV     private extensions                  8

Additional SDES types may be registered through IANA (see Section 15).

13. RTP Profiles and Payload Format Specifications

A complete specification of RTP for a particular application will require one or more companion documents of two types described here: profiles, and payload format specifications.

RTP may be used for a variety of applications with somewhat differing requirements. The flexibility to adapt to those requirements is provided by allowing multiple choices in the main protocol specification, then selecting the appropriate choices or defining extensions for a particular environment and class of applications in a separate profile document. Typically an application will operate under only one profile in a particular RTP session, so there is no explicit indication within the RTP protocol itself as to which profile is in use. A profile for audio and video applications may be found in the companion RFC 3551. Profiles are typically titled "RTP Profile for ...".

The second type of companion document is a payload format specification, which defines how a particular kind of payload data, such as H.261 encoded video, should be carried in RTP. These documents are typically titled "RTP Payload Format for XYZ Audio/Video Encoding". Payload formats may be useful under multiple profiles and may therefore be defined independently of any particular profile. The profile documents are then responsible for assigning a default mapping of that format to a payload type value if needed.

Within this specification, the following items have been identified for possible definition within a profile, but this list is not meant to be exhaustive:

   RTP data header: The octet in the RTP data header that contains
      the marker bit and payload type field MAY be redefined by a
      profile to suit different requirements, for example with more or
      fewer marker bits (Section 5.3, p. 18).

   Payload types: Assuming that a payload type field is included,
      the profile will usually define a set of payload formats (e.g.,
      media encodings) and a default static mapping of those formats to
      payload type values.  Some of the payload formats may be defined
      by reference to separate payload format specifications.  For each
      payload type defined, the profile MUST specify the RTP timestamp
      clock rate to be used (Section 5.1, p. 14).

   RTP data header additions: Additional fields MAY be appended to
      the fixed RTP data header if some additional functionality is
      required across the profile's class of applications independent of
      payload type (Section 5.3, p. 18).

   RTP data header extensions: The contents of the first 16 bits of
      the RTP data header extension structure MUST be defined if use of
      that mechanism is to be allowed under the profile for
      implementation-specific extensions (Section 5.3.1, p. 18).

   RTCP packet types: New application-class-specific RTCP packet
      types MAY be defined and registered with IANA.

   RTCP report interval: A profile SHOULD specify that the values
      suggested in Section 6.2 for the constants employed in the
      calculation of the RTCP report interval will be used.  Those are
      the RTCP fraction of session bandwidth, the minimum report
      interval, and the bandwidth split between senders and receivers.
      A profile MAY specify alternate values if they have been
      demonstrated to work in a scalable manner.

   SR/RR extension: An extension section MAY be defined for the
      RTCP SR and RR packets if there is additional information that
      should be reported regularly about the sender or receivers
      (Section 6.4.3, p. 42 and 43).

   SDES use: The profile MAY specify the relative priorities for
      RTCP SDES items to be transmitted or excluded entirely (Section
      6.3.9); an alternate syntax or semantics for the CNAME item
      (Section 6.5.1); the format of the LOC item (Section 6.5.5); the
      semantics and use of the NOTE item (Section 6.5.7); or new SDES
      item types to be registered with IANA.

   Security: A profile MAY specify which security services and
      algorithms should be offered by applications, and MAY provide
      guidance as to their appropriate use (Section 9, p. 65).

   String-to-key mapping: A profile MAY specify how a user-provided
      password or pass phrase is mapped into an encryption key.

   Congestion: A profile SHOULD specify the congestion control
      behavior appropriate for that profile.

   Underlying protocol: Use of a particular underlying network or
      transport layer protocol to carry RTP packets MAY be required.

   Transport mapping: A mapping of RTP and RTCP to transport-level
      addresses, e.g., UDP ports, other than the standard mapping
      defined in Section 11, p. 68 may be specified.
   Encapsulation: An encapsulation of RTP packets may be defined to
      allow multiple RTP data packets to be carried in one lower-layer
      packet or to provide framing over underlying protocols that do not
      already do so (Section 11, p. 69).

   It is not expected that a new profile will be required for every
   application.  Within one application class, it would be better to
   extend an existing profile rather than make a new one in order to
   facilitate interoperation among the applications since each will
   typically run under only one profile.  Simple extensions such as the
   definition of additional payload type values or RTCP packet types may
   be accomplished by registering them through IANA and publishing their
   descriptions in an addendum to the profile or in a payload format
   specification.

14. Security Considerations

RTP suffers from the same security liabilities as the underlying protocols. For example, an impostor can fake source or destination network addresses, or change the header or payload. Within RTCP, the CNAME and NAME information may be used to impersonate another participant. In addition, RTP may be sent via IP multicast, which provides no direct means for a sender to know all the receivers of the data sent and therefore no measure of privacy.

Rightly or not, users may be more sensitive to privacy concerns with audio and video communication than they have been with more traditional forms of network communication [33]. Therefore, the use of security mechanisms with RTP is important. These mechanisms are discussed in Section 9.

RTP-level translators or mixers may be used to allow RTP traffic to reach hosts behind firewalls. Appropriate firewall security principles and practices, which are beyond the scope of this document, should be followed in the design and installation of these devices and in the admission of RTP applications for use behind the firewall.

15. IANA Considerations

Additional RTCP packet types and SDES item types may be registered through the Internet Assigned Numbers Authority (IANA). Since these number spaces are small, allowing unconstrained registration of new values would not be prudent. To facilitate review of requests and to promote shared use of new types among multiple applications, requests for registration of new values must be documented in an RFC or other permanent and readily available reference such as the product of another cooperative standards body (e.g., ITU-T). Other requests may also be accepted, under the advice of a "designated expert."
   (Contact the IANA for the contact information of the current expert.)

   RTP profile specifications SHOULD register with IANA a name for the
   profile in the form "RTP/xxx", where xxx is a short abbreviation of
   the profile title.  These names are for use by higher-level control
   protocols, such as the Session Description Protocol (SDP), RFC 2327
   [15], to refer to transport methods.

16. Intellectual Property Rights Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

17. Acknowledgments

This memorandum is based on discussions within the IETF Audio/Video Transport working group chaired by Stephen Casner and Colin Perkins. The current protocol has its origins in the Network Voice Protocol and the Packet Video Protocol (Danny Cohen and Randy Cole) and the protocol implemented by the vat application (Van Jacobson and Steve McCanne). Christian Huitema provided ideas for the random identifier generator. Extensive analysis and simulation of the timer reconsideration algorithm was done by Jonathan Rosenberg. The additions for layered encodings were specified by Michael Speer and Steve McCanne.