RFC 7604

Comparison of Different NAT Traversal Techniques for Media Controlled by the Real-Time Streaming Protocol (RTSP)

Pages: 46
Informational

3.  Requirements on Solutions

   This section considers the set of requirements for the evaluation of
   RTSP NAT traversal solutions.

   RTSP is a client-server protocol.  Typically, service providers
   deploy RTSP servers on the Internet or in an otherwise reachable
   address realm.  However, there are use cases where the reverse is
   true: RTSP clients connect from any address realm to RTSP servers
   behind NATs, e.g., in a home.  This is the case, for instance, when
   home surveillance cameras running as RTSP servers stream video to
   cell phone users in the public address realm through a home NAT.  In
   terms of requirements, the primary issue to solve is the RTSP NAT
   traversal problem for RTSP servers deployed in a network where the
   server is on the external side of any NAT, i.e., the server is not
   behind a NAT.  Support for a server behind a NAT is desirable but of
   much lower priority.

   Important considerations for any NAT traversal technique are whether
   any protocol modifications are needed and where the implementation
   burden resides (e.g., server, client, or middlebox).  If the
   incentive to get RTSP to work over a NAT is sufficient, it will
   motivate the owner of the server, client, or middlebox to update,
   configure, or otherwise perform changes to the device and its
   software in order to support NAT traversal.  Thus, the questions of
   who this burden falls on and how big it is are highly relevant.

   The list of feature requirements for RTSP NAT solutions is given
   below:

   1.  Must work for all flavors of NATs, including NATs with Address
       and Port-Dependent Filtering.

   2.  Must work for firewalls (subject to pertinent firewall
       administrative policies), including those with ALGs.

   3.  Should have minimal impact on clients that are not behind NATs
       and are not dual hosted.  RTSP dual hosting means that the RTSP
       signaling protocol and the media protocol (e.g., RTP) are
       implemented on different computers with different IP addresses.

       *  For instance, no extra protocol RTT before arrival of media.

   4.  Should be simple to use/implement/administer so people actually
       turn them on.

       *  Discovery of the address(es) assigned by NAT should happen
          automatically, if possible.

   5.  Should authenticate a dual-hosted client's media transport
       receiver to prevent usage of RTSP servers for DDoS attacks.

   The last requirement addresses the Distributed Denial-of-Service
   (DDoS) threat, which relates to NAT traversal as explained below.

   During NAT traversal, when the RTSP server determines the media
   destination (address and port) for the client, the result may be that
   the IP address of the RTP receiver host is different from the IP
   address of the RTSP client host.  This poses a DDoS threat that has
   significant amplification potential because the RTP media streams in
   general consist of a large number of IP packets.  DDoS attacks can
   occur if the attacker can fake the messages in the NAT traversal
   mechanism to trick the RTSP server into believing that the client's
   RTP receiver is located on a host to be attacked.  For example, user
   A may use his RTSP client to direct the RTSP server to send video RTP
   streams to target.example.com in order to degrade the services
   provided by target.example.com.

   Note that a simple mitigation is for the RTSP server to disallow the
   cases where the client's RTP receiver has a different IP address than
   that of the RTSP client.  This is recommended behavior in RTSP 2.0
   unless other solutions to prevent this attack are present; see
   Section 21.2.1 in [RTSP].  With the increased deployment of NAT
   middleboxes by operators, i.e., CGN, the reuse of an IP address on
   the NAT's external side by many customers reduces the protection
   provided.  Also in some applications (e.g., centralized
   conferencing), dual-hosted RTSP/RTP clients have valid use cases.
   The key is how to authenticate the messages exchanged during the NAT
   traversal process.
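
   The simple mitigation described above can be sketched as follows.
   This is a minimal illustration, not a normative algorithm: the
   function name and the "trusted" flag (standing in for whatever
   authentication or destination-challenge mechanism a server might
   apply) are assumptions made for this example.

```python
import ipaddress


def media_destination_allowed(rtsp_peer_ip, requested_dest_ip, trusted=False):
    """Return True if the server may send media to requested_dest_ip.

    rtsp_peer_ip is the source address of the RTSP connection;
    requested_dest_ip is the address asked for in the Transport header.
    `trusted` is a hypothetical flag for an authenticated exception.
    """
    peer = ipaddress.ip_address(rtsp_peer_ip)
    dest = ipaddress.ip_address(requested_dest_ip)
    # Default policy: media may only go back to the RTSP client's address.
    return trusted or peer == dest
```

   Note that behind a CGN, many customers share the same external
   address, so an equality check of this kind gives weaker protection
   than it does for directly addressed clients.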

4.  NAT-Traversal Techniques

   A number of potential NAT traversal techniques can be used to allow
   RTSP to traverse NATs.  They have different features and are
   applicable to different topologies; their costs are also different.
   They also vary in security levels.  In the following
   sections, each technique is outlined with discussions on the
   corresponding advantages and disadvantages.

   The survey of traversal techniques was done prior to 2007 and is
   based on what was available then.  This section includes NAT
   traversal techniques that have not been formally specified anywhere
   else.  This document may be the only publicly available specification
   of some of the NAT traversal techniques.  However, that is not a real
   barrier against doing an evaluation of the NAT traversal techniques.
   Some techniques used as part of some of the traversal solutions have
   been recommended against or are no longer possible due to the outcome
   of standardization work or their failure to progress within IETF
   after the initial evaluation in this document.  For example, RTP
   No-Op [RTP-NO-OP] was a proposed RTP payload format that failed to be
   specified; thus, it is not available for use today.  In each such
   case, the missing parts will be noted and some basic reasons will be
   given.

4.1.  Stand-Alone STUN

4.1.1.  Introduction

   Session Traversal Utilities for NAT (STUN) [RFC5389] is a
   standardized protocol that allows a client to use secure means to
   discover the presence of a NAT between itself and the STUN server.
   The client uses the STUN server to discover the address and port
   mappings assigned by the NAT.  Then using the knowledge of these NAT
   mappings, it uses the external addresses to directly connect to the
   independent RTSP server.  However, this is only possible if the NAT
   address and port mapping behavior is such that the STUN server and
   RTSP server will see the same external address and port for the same
   internal address and port.

   STUN is a client-server protocol.  The STUN client sends a request to
   a STUN server and the server returns a response.  There are two types
   of STUN messages -- Binding Requests and Indications.  Binding
   Requests are used when determining a client's external address and
   soliciting a response from the STUN server with the seen address.
   Indications are used by the client for keep-alive messages towards
   the server and require no response from the server.
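
   As an illustration of the message exchange, the sketch below builds
   an attribute-less STUN message and extracts the address seen by the
   server from a Binding success response.  The constants follow the
   RFC 5389 wire format (20-byte header, Magic Cookie 0x2112A442,
   XOR-MAPPED-ADDRESS attribute 0x0020); the function names are
   illustrative, only IPv4 is handled, and message authentication is
   left out for brevity.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_INDICATION = 0x0011
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_stun_message(msg_type, transaction_id=None):
    """Build a STUN message consisting of the 20-byte header only."""
    tid = os.urandom(12) if transaction_id is None else transaction_id
    if len(tid) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Type, length of attributes (0 here), Magic Cookie, transaction ID.
    return struct.pack("!HHI", msg_type, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(response):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS, or None."""
    if len(response) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, min(20 + length, len(response))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are 4-byte aligned
    return None
```

   A client would send build_stun_message(BINDING_REQUEST) from the
   local UDP port it intends to use for media and pass the reply to
   parse_xor_mapped_address() to learn the external address and port.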

   The first version of STUN [RFC3489] included categorization and
   parameterization of NATs.  This was abandoned in the updated version
   [RFC5389] due to it being unreliable and brittle.  This particular
   traversal method uses the removed functionality described in RFC 3489
   to detect the NAT type to give an early failure indication when the
   NAT is showing the behavior that this method can't support.  This
   method also suggests using the RTP No-Op payload format [RTP-NO-OP]
   for keep-alives of the RTP traffic in the client-to-server direction.
   This can be replaced with another form of UDP packet as will be
   further discussed below.

4.1.2.  Using STUN to Traverse NAT without Server Modifications

   This section describes how a client can use STUN to traverse NATs to
   RTSP servers without requiring server modifications.  Note that this
   method has limited applicability and requires the server to be
   available in the external/public address realm relative to the
   client located behind one or more NATs.

   Limitations:

   o  The server must be located in either a public address realm or the
      next-hop external address realm relative to the client.

   o  The client may only be located behind NATs that perform Endpoint-
      Independent or Address-Dependent Mappings (the STUN server and
      RTSP server on the same IP address).  Clients behind NATs that do
      Address and Port-Dependent Mappings cannot use this method.  See
      [RFC4787] for the full definition of these terms.

   o  The method is based on the discontinued middlebox classification
      of the replaced STUN specification [RFC3489]; thus, it is brittle
      and unreliable.
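
   The mapping-behavior limitation above amounts to a small decision
   rule, sketched here.  The enumeration and function names are
   assumptions made for this example; the mapping categories are the
   ones defined in [RFC4787].

```python
from enum import Enum


class Mapping(Enum):
    ENDPOINT_INDEPENDENT = 1
    ADDRESS_DEPENDENT = 2
    ADDRESS_AND_PORT_DEPENDENT = 3


def stand_alone_stun_usable(mapping, stun_and_rtsp_share_ip):
    """Return True if stand-alone STUN can work for this NAT mapping."""
    if mapping is Mapping.ENDPOINT_INDEPENDENT:
        return True
    if mapping is Mapping.ADDRESS_DEPENDENT:
        # Only usable when the STUN and RTSP servers share an IP address,
        # so that both see the same external mapping.
        return stun_and_rtsp_share_ip
    return False  # Address and Port-Dependent Mapping is not supported
```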

   Method:

   An RTSP client using RTP transport over UDP can use STUN to traverse
   one or more NATs in the following way:

   1.  Use STUN to try to discover the type of NAT and the timeout
       period for any UDP mapping on the NAT.  This is recommended to be
       performed in the background as soon as IP connectivity is
       established.  If this is performed prior to establishing a
       streaming session, the delays in the session establishment will
       be reduced.  If no NAT is detected, normal SETUP should be used.

   2.  The RTSP client determines the number of UDP ports needed by
       counting the number of needed media transport protocol sessions
       in the multimedia presentation.  This information is available in
       the media description protocol, e.g., SDP [RFC4566].  For
       example, each RTP session will in general require two UDP ports:
       one for RTP, and one for RTCP.

   3.  For each UDP port required, establish a mapping and discover the
       public/external IP address and port number with the help of the
       STUN server.  A successful mapping looks like: client's local
       address/port <-> public address/port.

   4.  Perform the RTSP SETUP for each media.  In the Transport header,
       the following parameter should be included with the given values:
       "dest_addr" [RTSP] or "destination" + "client_port" [RFC2326]
       with the public/external IP address and port pair for both RTP
       and RTCP.  To be certain that this works, servers must allow a
       client to set up the RTP stream on any port, not only even ports
       and with non-contiguous port numbers for RTP and RTCP.  This
       requires the new feature provided in RTSP 2.0 [RTSP].  The server
       should respond with a Transport header containing the "src_addr"
       or "source" + "server_port" parameters with the RTP and RTCP
       source IP address and port of the media stream.

   5.  To keep the mappings alive, the client should periodically send
       UDP traffic over all mappings needed for the session.  For the
       mapping carrying RTCP traffic, the periodic RTCP traffic is
       likely enough.  For mappings carrying RTP traffic and for
       mappings carrying RTCP packets at too low of a frequency, keep-
       alive messages should be sent.

   If a UDP mapping is lost, the above discovery process must be
   repeated.  The media stream also needs to be SETUP again to change
   the transport parameters to the new ones.  This will cause a glitch
   in media playback.
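
   Steps 2 and 4 of the method can be sketched as below.  This is an
   illustration under simplifying assumptions: every "m=" line whose
   transport protocol starts with "RTP/" is taken to need two UDP ports,
   the function names are invented for this example, and the header
   strings follow the general shape of the "dest_addr" parameter in
   [RTSP] and the "destination" + "client_port" parameters in
   [RFC2326].  The RFC 2326 form can only express an adjacent RTP/RTCP
   port pair on one address, which STUN-discovered mappings do not
   guarantee.

```python
def udp_ports_needed(sdp_text):
    """Count UDP ports: two (RTP and RTCP) per RTP-based media line."""
    count = 0
    for line in sdp_text.splitlines():
        if line.startswith("m="):
            fields = line[2:].split()  # media, port, proto, formats...
            if len(fields) >= 3 and fields[2].startswith("RTP/"):
                count += 2
    return count


def transport_rtsp2(rtp, rtcp, profile="RTP/AVP/UDP"):
    """Build an RTSP 2.0 style Transport header from (ip, port) pairs."""
    return (f'Transport: {profile};unicast;'
            f'dest_addr="{rtp[0]}:{rtp[1]}"/"{rtcp[0]}:{rtcp[1]}"')


def transport_rfc2326(rtp, rtcp):
    """Build an RFC 2326 style header, or None if it cannot be expressed."""
    if rtp[0] != rtcp[0] or rtcp[1] != rtp[1] + 1:
        return None  # client_port is a range, so ports must be adjacent
    return (f"Transport: RTP/AVP;unicast;destination={rtp[0]};"
            f"client_port={rtp[1]}-{rtcp[1]}")
```

   For example, external mappings 203.0.113.5:40000 (RTP) and
   203.0.113.5:40007 (RTCP) can be signaled with transport_rtsp2() but
   yield None from transport_rfc2326().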

   To allow UDP packets to arrive from the server to a client behind an
   Address-Dependent or Address and Port-Dependent Filtering NAT, the
   client must first send a UDP packet to establish the filtering state
   in the NAT.  The client, before sending an RTSP PLAY request, must
   send a so-called hole-punching packet on each mapping to the IP
   address and port given as the server's source address and port.  For
   a NAT that performs only Address-Dependent Filtering, the
   hole-punching packet could be sent to the server's discard port
   (port number 9).
   For Address and Port-Dependent Filtering NATs, the hole-punching
   packet must go to the port used for sending UDP packets to the
   client.  To be able to do that, the server needs to include the
   "src_addr" in the Transport header (which is the "source" transport
   parameter in RFC2326).  Since UDP packets are inherently unreliable,
   to ensure that at least one UDP message passes the NAT, hole-punching
   packets should be retransmitted a reasonable number of times.
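
   A minimal sketch of the hole-punching step follows.  The number of
   attempts, the pacing interval, and the "filtering" label are
   assumptions chosen for illustration; the text above only asks for "a
   reasonable number" of retransmissions.  The socket passed in must be
   the same local socket whose mapping was discovered with STUN.

```python
import socket
import time

DISCARD_PORT = 9  # the discard port mentioned in the text


def hole_punch_target(server_ip, server_src_port, filtering):
    """Choose the destination of the hole-punching packet.

    `filtering` is a hypothetical label: "address" means the NAT only
    does Address-Dependent Filtering; any other value is treated as
    Address and Port-Dependent Filtering.
    """
    if filtering == "address":
        return (server_ip, DISCARD_PORT)
    return (server_ip, server_src_port)


def punch_hole(sock, target, attempts=3, interval=0.05, payload=b""):
    """Send `attempts` datagrams to `target`; return the number sent."""
    for _ in range(attempts):
        sock.sendto(payload, target)  # UDP is unreliable, so repeat
        time.sleep(interval)
    return attempts
```

   The client would call punch_hole() once per mapping (RTP and RTCP)
   after receiving the SETUP response and before sending PLAY.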

   One could have used RTP No-Op packets [RTP-NO-OP] as hole-punching
   and keep-alive messages had they been defined.  That would have
   ensured that the traffic would look like RTP and thus would likely
   have the least risk of being dropped by any firewall.  The drawback
   of using RTP No-Op is that the payload type number must be
   dynamically assigned through RTSP first.  Other options are STUN, an
   RTP packet without any payload, or a UDP packet without any payload.
   For RTCP it is most suitable to use correctly generated RTCP packets.
   In general, sending unsolicited traffic to the RTSP server may
   trigger security functions resulting in the blocking of the keep-
   alive messages or termination of the RTSP session itself.
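
   One of the options listed above, an RTP packet without any payload,
   can be sketched as follows.  The helper name is illustrative, and the
   choice of payload type, sequence number, timestamp, and SSRC is left
   to the caller; whether a given server or firewall accepts such
   packets is not guaranteed.

```python
import struct


def empty_rtp_packet(payload_type, seq, timestamp, ssrc):
    """Build a 12-byte RTP header with no payload (version 2)."""
    first = 2 << 6                  # V=2, P=0, X=0, CC=0
    second = payload_type & 0x7F    # M=0 plus the 7-bit payload type
    return struct.pack("!BBHII", first, second, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
```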

   This method is also brittle because it doesn't support Address and
   Port-Dependent Mappings.  It therefore proposes to use the old STUN
   methods to classify the NAT behavior, thus enabling early error
   indication.  The classification is not strictly required, but
   omitting it will lead to failures during setup when the NAT has the
   wrong behavior.  This failure can also occur if the NAT changes the
   properties of the existing mapping and filtering state, or changes
   its behavior between the classification message exchange and the
   actual RTSP session setup, for example, due to load.

4.1.3.  ALG Considerations

   If a NAT includes an RTSP ALG (Application Level Gateway) that is
   not aware of the STUN traversal option, service failure may happen,
   because a client discovers its NAT external IP address and port
   numbers and inserts them in its SETUP requests.  When the RTSP ALG
   processes the SETUP request, it may change the destination and port
   number, resulting in unpredictable behavior.  An ALG should not
   update address fields that contain addresses outside the NAT's
   internal address domain.  In cases where the ALG modifies fields
   unnecessarily, two alternatives exist:

   1.  Use Transport Layer Security (TLS) to encrypt the data over the
       RTSP TCP connection to prevent the ALG from reading and modifying
       the RTSP messages.

   2.  Turn off the STUN-based NAT traversal mechanism.

   As it may be difficult to determine why the failure occurs, the usage
   of TLS-protected RTSP message exchange at all times would avoid this
   issue.

4.1.4.  Deployment Considerations

   For the stand-alone usage of STUN, the following applies:

   Advantages:

   o  STUN is a solution first used by applications based on SIP
      [RFC3261] (see Sections 1 and 2 of [RFC5389]).  As shown above,
      with little or no changes, the RTSP application can reuse STUN as
      a NAT traversal solution, avoiding the pitfall of solving a
      problem twice.

   o  Using STUN does not require RTSP server modifications, assuming it
      is a server that is compliant with RTSP 2.0; it only affects the
      client implementation.

   Disadvantages:

   o  Requires a STUN server deployed in the same address domain as the
      server.

   o  Only works with NATs that perform Endpoint-Independent or
      Address-Dependent Mappings.  Address and Port-Dependent Filtering
      NATs create some issues.

   o  Brittle to NATs changing the properties of the NAT mapping and
      filtering.

   o  Does not work with Address and Port-Dependent Mapping NATs without
      server modifications.

   o  Will not work if a NAT uses multiple IP addresses, since RTSP
      servers generally require all media streams to use the same IP as
      used in the RTSP connection to prevent becoming a DDoS tool.

   o  Interaction problems exist when an RTSP-aware ALG interferes with
      the use of STUN for NAT traversal unless TLS-secured RTSP message
      exchange is used.

   o  Using STUN requires that RTSP servers and clients support the
      updated RTSP specification [RTSP], because it is no longer
      possible to guarantee that RTP and RTCP ports are adjacent to each
      other, as required by the "client_port" and "server_port"
      parameters in RFC 2326.

   Transition:

   The usage of STUN can be phased out gradually, as the first step of a
   STUN-capable server or client should be to check for the presence of
   NATs.  The removal of STUN capability in the client implementations
   will have to wait until there is absolutely no need to use STUN.

4.1.5.  Security Considerations

   To prevent the RTSP server from being used as a Denial-of-Service
   (DoS) attack tool, the RTSP Transport header parameters "destination"
   and "dest_addr" are generally not allowed to point to any IP address
   other than the one the RTSP message originates from.  The RTSP server
   is only prepared to make an exception to this rule when the client is
   trusted (e.g., through the use of a secure authentication process or
   through some secure method of challenging the destination to verify
   its willingness to accept the RTP traffic).  Such a restriction means
   that STUN in general does not work for use cases where RTSP and media
   transport go to different addresses.

   STUN combined with RTSP that is restricted by destination address has
   the same security properties as the core RTSP.  It is protected from
   being used as a DoS attack tool unless the attacker has the ability
   to spoof the TCP connection carrying RTSP messages.

   Using STUN's support for message authentication and the secure
   transport of RTSP messages, attackers cannot modify STUN responses or
   RTSP messages (TLS) to change the media destination.  This protects
   against hijacking; however, as a client can be the initiator of an
   attack, these mechanisms cannot securely prevent RTSP servers from
   being used as DoS attack tools.

4.2.  Server Embedded STUN

4.2.1.  Introduction

   This section describes an alternative to the stand-alone STUN usage
   in the previous section that behaves quite differently.

4.2.2.  Embedding STUN in RTSP

   This section outlines the adaptation and embedding of STUN within
   RTSP.  This enables STUN to be used to traverse any type of NAT,
   including Address and Port-Dependent Mapping NATs.  This would
   require RTSP-level protocol changes.

   This NAT traversal solution has limitations:

   1.  It does not work if both the RTSP client and RTSP server are
       behind separate NATs.

   2.  The RTSP server may, for security reasons, refuse to send media
       streams to an IP that is different from the IP in the client RTSP
       requests.

   Deviations from STUN as defined in RFC 5389:

   1.  The RTSP application must provision the client with an identity
       and shared secret to use in the STUN authentication;

   2.  We require the STUN server to be co-located on the RTSP server's
       media source ports.

   If the STUN server is co-located with the RTSP server's media source
   port, an RTSP client using RTP transport over UDP can use STUN to
   traverse ALL types of NATs.  In the case of Address and Port-
   Dependent Mapping NATs, the party on the inside of the NAT must
   initiate UDP traffic.  The STUN Binding Request, being a UDP packet
   itself, can serve as the traffic initiating packet.  Subsequently,
   both the STUN Binding Response packets and the RTP/RTCP packets can
   traverse the NAT, regardless of whether the RTSP server or the RTSP
   client is behind NAT (however, only one of them can be behind a NAT).

   Likewise, if an RTSP server is behind a NAT, then an embedded STUN
   server must be co-located on the RTSP client's RTCP port.  Also, it
   is then the client rather than the server that needs to disclose its
   destination address, so the server can correctly determine its NAT
   external source address for the media streams.  In this case, we
   assume that the client has some means of establishing a TCP
   connection to the RTSP server behind NAT so as to exchange RTSP
   messages with the RTSP server, potentially using a proxy or static
   rules.

   To minimize delay, we require that an RTSP server supporting this
   option inform the client about the RTP and RTCP ports from which
   the server will send out RTP and RTCP packets, respectively.  This
   can be done by using the "server_port" parameter in RFC 2326 and the
   "src_addr" parameter in [RTSP].  Both are in the RTSP Transport
   header.  But in general, this strategy will require that the client
   first send one SETUP request per media to learn the server ports,
   then perform the STUN checks, and finally send a subsequent SETUP to
   change the client port and destination address to what was learned
   during the STUN checks.

   To be certain that RTCP works correctly, the RTSP endpoint (server or
   client) will be required to send and receive RTCP packets from the
   same port.

4.2.3.  Discussion on Co-located STUN Server

   In order to use STUN to traverse Address and Port-Dependent Filtering
   or Mapping NATs, the STUN server needs to be co-located with the
   streaming server media output ports.  This creates a demultiplexing
   problem: we must be able to differentiate a STUN packet from a media
   packet.  This will be done based on heuristics.  The existing STUN
   heuristic uses the first byte in the packet and the Magic Cookie
   field (added in RFC 5389), which works fine between STUN and RTP or
   RTCP where the first byte happens to be different.  Thanks to the
   Magic Cookie field, it is unlikely that other protocols would be
   mistaken for a STUN packet, but this is not assured.  For more
   discussion of this, please see Section 5.1.2 of [RFC5764].
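
   The demultiplexing heuristic can be sketched as below.  The function
   name and return labels are illustrative.  The check relies on two
   wire-format facts: a STUN message starts with two zero bits and
   carries the Magic Cookie 0x2112A442 in bytes 4 to 7, while RTP and
   RTCP packets start with the version bits "10" (version 2).  As noted
   above, this is a heuristic and not a guarantee.

```python
import struct

MAGIC_COOKIE = 0x2112A442


def classify_packet(data):
    """Classify a datagram received on a media port.

    Returns "stun", "media" (RTP or RTCP), or "unknown".
    """
    if len(data) >= 20 and data[0] >> 6 == 0:
        (cookie,) = struct.unpack("!I", data[4:8])
        if cookie == MAGIC_COOKIE:
            return "stun"
    if len(data) >= 8 and data[0] >> 6 == 2:
        return "media"  # version 2 in the top two bits: RTP or RTCP
    return "unknown"
```

   A server would hand "stun" packets to the embedded STUN server and
   "media" packets (e.g., RTCP receiver reports) to the streaming
   server.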

4.2.4.  ALG Considerations

   The same ALG traversal considerations as for stand-alone STUN apply
   (Section 4.1.3).

4.2.5.  Deployment Considerations

   For the "Embedded STUN" method the following applies:

   Advantages:

   o  STUN is a solution first used by SIP applications.  As shown
      above, with little or no changes, the RTSP application can reuse
      STUN as a NAT traversal solution, avoiding the pitfall of solving
      a problem twice.

   o  STUN has built-in message authentication features, which make it
      more secure against hijacking attacks.  See the next section for
      an in-depth security discussion.

   o  This solution works as long as there is only one RTSP endpoint in
      the private address realm, regardless of the NAT's type.  There
      may even be multiple NATs (see Figure 1 in [RFC5389]).

   o  Compared to other UDP-based NAT traversal methods in this
      document, STUN requires little new protocol development (since
      STUN is already an IETF standard), and most likely less
      implementation effort, since open source STUN server and client
      implementations are available [STUN-IMPL] [PJNATH].

   Disadvantages:

   o  Some extensions to the RTSP core protocol, likely signaled by RTSP
      feature tags, must be introduced.

   o  Requires an embedded STUN server to be co-located on each of the
      RTSP server's media protocol's ports (e.g., RTP and RTCP ports),
      which means more processing is required to demultiplex STUN
      packets from media packets.  For example, the demultiplexer must
      be able to differentiate an RTCP RR packet from a STUN packet and
      forward the former to the streaming server and the latter to the
      STUN server.

   o  Does not support use cases that require the RTSP connection and
      the media reception to happen at different addresses, unless the
      server's security policy is relaxed.

   o  Interaction problems exist when an RTSP ALG is not aware of STUN
      unless TLS is used to protect the RTSP messages.

   o  Using STUN requires that RTSP servers and clients support the
      updated RTSP specification [RTSP], and they both agree to support
      the NAT traversal feature.

   o  Increases the setup delay by at least the amount of time it
      takes to perform STUN message exchanges.  Most likely an extra
      SETUP sequence will be required.

   Transition:

   The usage of STUN can be phased out gradually, as the first step of a
   STUN-capable machine can be to check for the presence of NATs on
   the presently used network connection.  The removal of STUN
   capability in the client implementations will have to wait until
   there is absolutely no need to use STUN, i.e., no NATs or firewalls.

4.2.6.  Security Considerations

   See Stand-Alone STUN (Section 4.1.5).

4.3.  ICE

4.3.1.  Introduction

   Interactive Connectivity Establishment (ICE) [RFC5245] is a
   methodology for NAT traversal that has been developed for SIP using
   SDP offer/answer.  The basic idea is to try, in a staggered parallel
   fashion, all possible connection addresses in which an endpoint may
   be reached.  This allows the endpoint to use the best available UDP
   "connection" (meaning two UDP endpoints capable of reaching each
   other).  The methodology has very nice properties in that basically
   all NAT topologies are possible to traverse.

   Here is how ICE works at a high level.  Endpoint A collects all
   possible addresses that can be used, including local IP addresses,
   STUN-derived addresses, Traversal Using Relays around NAT (TURN)
   addresses, etc.  On each local port that any of these address and
   port pairs lead to, a STUN server is installed.  This STUN server
   only accepts STUN requests using the correct authentication through
   the use of a username and password.

   Endpoint A then sends a request to establish connectivity with
   endpoint B, which includes all possible "destinations" [RFC5245] to
   get the media through to A.  Note that each of A's local address/port
   pairs (host candidates and server reflexive base) has a co-located
   STUN server.  B in turn provides A with all its possible destinations
   for the different media streams.  A and B then use a STUN client to
   try to reach all the address and port pairs specified by A from its
   corresponding destination ports.  The destinations for which the STUN
   requests successfully complete are then indicated and one is
   selected.

   If B fails to get any STUN response from A, all hope is not lost.
   Certain NAT topologies require multiple tries from both ends before
   successful connectivity is accomplished; therefore, requests are
   retransmitted multiple times.  The STUN requests may also result in
   more connectivity alternatives (destinations) being discovered and
   conveyed in the STUN responses.

4.3.2.  Using ICE in RTSP

   The usage of ICE for RTSP requires that both client and server be
   updated to include the ICE functionality.  If both parties implement
   the necessary functionality, the following steps could provide ICE
   support for RTSP.

   This assumes that it is possible to establish a TCP connection for
   the RTSP messages between the client and the server.  This is not
   trivial in scenarios where the server is located behind a NAT, and
   may require some TCP ports be opened, or proxies are deployed, etc.

   The negotiation of ICE in RTSP will of necessity work differently
   than in SIP with SDP offer/answer.  The protocol interactions are
   different, and thus the possibilities for transfer of states are also
   somewhat different.  The goal is also to avoid introducing extra
   delay in the setup process, at least when the server is not behind a
   NAT relative to the client and the client either has an address in
   the same address domain or is behind one or more NATs that can
   address the address domain of the server.  This process is only
   intended to support PLAY mode, i.e., media traffic flows from server
   to client.

   1.  ICE usage begins in the SDP.  The SDP for the service indicates
       that ICE is supported at the server.  No candidates can be given
       here as that would not work with on-demand content, DNS load
       balancing, etc., where the SDP indicates a resource on a server
       farm rather than a specific machine.

   2.  The client gathers addresses and puts together its candidates for
       each media stream indicated in the session description.

   3.  In each SETUP request, the client includes its candidates in an
       ICE-specific transport specification.  For the server, this
       indicates the ICE support by the client.  One candidate is the
       most prioritized candidate and here the prioritization for this
       address should be somewhat different compared to SIP.  High-
       performance candidates are recommended rather than candidates
       with the highest likelihood of success, as it is more likely that
       a server is not behind a NAT compared to a SIP user agent.

   4.  The server responds to the SETUP (200 OK) for each media stream
       with its candidates.  A server not behind a NAT usually only
       provides a single ICE candidate.  Also, here one candidate is the
       server primary address.

   5.  The connectivity checks are performed.  For the server, the
       connectivity checks from the server to the clients have an
       additional usage.  They verify that there is someone willing to
       receive the media, thus preventing the server from unknowingly
       performing a DoS attack.

   6.  If the connectivity checks from the client promoting a candidate
       pair were successful, no further SETUP requests are necessary
       and processing can proceed with step 7.  If an address other than
       the primary has been verified by the client to work, that address
       may then be promoted for usage in a SETUP request (go to step 7).
       If the checks for the available candidates failed and if further
       candidates have been derived during the connectivity checks, then
       those can be signaled in new candidate lines in a SETUP request
       updating the list (go to step 5).

   7.  Client issues the PLAY request.  If the server also has completed
       its connectivity checks for the promoted candidate pair (based on
       the username as it may be derived addresses if the client was
       behind NAT), then it can directly answer 200 OK (go to step 8).
       If the connectivity check has not yet completed, it responds with
       a 1xx code to indicate that it is verifying the connectivity.  If
       that fails within the set timeout, an error is reported back.
       The client needs to go back to step 6.

   8.  Process completed and media can be delivered.  ICE candidates not
       used may be released.
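
   The candidate prioritization referred to in step 3 can be illustrated
   with the priority formula and recommended type preferences from
   [RFC5245].  This is a sketch of the baseline ICE computation only;
   the dictionary layout and function names are assumptions for this
   example, and an RTSP adaptation that favors high-performance
   candidates, as suggested in step 3, might choose different
   preference values.

```python
# Recommended type preferences from the ICE specification.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component)."""
    if not 0 <= local_pref <= 65535:
        raise ValueError("local preference must fit in 16 bits")
    if not 1 <= component_id <= 256:
        raise ValueError("component ID must be in 1..256")
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_pref << 8)
            + (256 - component_id))


def order_candidates(candidates):
    """Sort candidate dicts (keys: "type", "component") by priority."""
    return sorted(
        candidates,
        key=lambda c: candidate_priority(c["type"], component_id=c["component"]),
        reverse=True,
    )
```

   With these values, a host candidate for the RTP component (component
   ID 1) gets priority 2130706431 and therefore sorts ahead of server
   reflexive and relayed candidates.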

   To keep media paths alive, the client needs to periodically send data
   to the server.  This will be realized with STUN.  RTCP sent by the
   client should be able to keep RTCP open, but STUN will also be used
   for SIP based on the same motivations as for ICE.

4.3.3.  Implementation Burden of ICE

   The usage of ICE will require that a number of new protocols and new
   RTSP/SDP features be implemented.  This makes ICE the solution that
   has the largest impact on client and server implementations among all
   the NAT/firewall traversal methods in this document.

   RTSP server implementation requirements are:

   o  STUN server features

   o  Limited STUN client features

   o  SDP generation with more parameters

   o  RTSP error code for ICE extension

   RTSP client implementation requirements are:

   o  Limited STUN server features

   o  Limited STUN client features

   o  RTSP error code and ICE extension

4.3.4.  ALG Considerations

   If there is an RTSP ALG that doesn't support the NAT traversal
   method, it may interfere with the NAT traversal.  As the usage of ICE
   for the traversal manifests itself in the RTSP message primarily as a
   new transport specification, an ALG that passes unknown transport
   specifications through will not prevent the traversal.  An ALG that
   discards unknown specifications will, however, prevent the NAT
   traversal.  These issues can be avoided by preventing the ALG from
   interfering with the signaling by using TLS for the RTSP message
   transport.

   An ALG that supports this traversal method can, on the most basic
   level, just pass the transport specifications through.  ALGs in NATs
   and firewalls could use the ICE candidates to establish a filtering
   state that would allow incoming STUN messages prior to any outgoing
   hole-punching packets, and in that way it could speed up the
   connectivity checks and reduce the risk of failures.

4.3.5.  Deployment Considerations

   Advantages:

   o  Solves NAT connectivity discovery for basically all cases as long
      as a TCP connection between the client and server can be
      established.  This includes servers behind NATs.  (Note that a
      proxy between address domains may be required to get TCP through.)

   o  Improves defenses against DDoS attacks, since a media-receiving
      client requires authentication via STUN on its media reception
      ports.

   Disadvantages:

   o  Increases the setup delay by at least the amount of time it
      takes for the server to perform its STUN requests.

   o  Assumes that it is possible to demultiplex between the packets of
      the media protocol and STUN packets.  This is possible for RTP as
      discussed, for example, in Section 5.1.2 of [RFC5764].

   o  Has a fairly high implementation burden put on both the RTSP
      server and client.  However, several open source ICE
      implementations do exist, such as [NICE] and [PJNATH].
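
   The demultiplexing assumption above can be illustrated with the
   first-byte ranges given in Section 5.1.2 of [RFC5764].  The
   following is a minimal sketch; the function name classify_packet
   and its string labels are hypothetical.

```python
def classify_packet(data: bytes) -> str:
    """Classify a UDP payload by its first byte (RFC 5764, Section 5.1.2).

    0..1     -> STUN
    20..63   -> DTLS
    128..191 -> RTP or RTCP
    Anything else (or an empty payload) is reported as "unknown".
    """
    if not data:
        return "unknown"
    first = data[0]
    if first <= 1:
        return "stun"
    if 20 <= first <= 63:
        return "dtls"
    if 128 <= first <= 191:
        return "rtp"
    return "unknown"
```

   An RTP packet always starts with the version bits set to 2, which
   places its first byte in the 128 to 191 range, whereas a STUN
   message starts with two zero bits.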

4.3.6.  Security Considerations

   One should review the Security Considerations sections of ICE and
   STUN to understand the potential issues that ICE contains.  However,
   these can be avoided by correctly using ICE in RTSP.  An important
   factor is to secure the signaling, i.e., use TLS between the RTSP
   client and server.  In fact, ICE substantially helps avoid the DDoS
   attack issue with RTSP, as it limits the possibility of a DDoS using
   RTSP servers to attackers that are on path between the RTSP server
   and the target and that are capable of intercepting the STUN
   connectivity check packets and correctly sending a response to the
   server.  The ICE connectivity checks, with their random transaction
   IDs, from the server to the client serve as a return-routability
   check and prevent off-path attackers from succeeding with address
   spoofing.  This is similar to Mobile IPv6's return routability
   procedure (Section 5.2.5 of [RFC6275]).
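
   The return-routability idea can be sketched as follows.  This is a
   simplified illustration, not the ICE procedure itself: the class
   ReturnRoutabilityCheck is hypothetical, and real STUN connectivity
   checks additionally carry message integrity and usernames.  The
   96-bit transaction ID size is the one used by STUN.

```python
import secrets


class ReturnRoutabilityCheck:
    """Tracks outstanding checks keyed by a random transaction ID."""

    def __init__(self):
        self._pending = {}  # transaction ID -> address the check was sent to

    def start(self, candidate):
        """Create a check toward `candidate` (an (ip, port) tuple)."""
        transaction_id = secrets.token_bytes(12)  # 96 random bits
        self._pending[transaction_id] = candidate
        return transaction_id

    def on_response(self, transaction_id, source):
        """Accept only a response that echoes a pending transaction ID
        and arrives from the address the check was sent to."""
        expected = self._pending.get(transaction_id)
        if expected is None or expected != source:
            return False
        del self._pending[transaction_id]
        return True
```

   An off-path attacker that spoofs a target address never sees the
   transaction ID sent to that address and so cannot produce a
   matching response.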

4.4.  Latching

4.4.1.  Introduction

   Latching is a NAT traversal solution that is based on requiring RTSP
   clients to send UDP packets to the server's media output ports.
   Conventionally, RTSP servers send RTP packets in one direction: from
   server to client.  Latching is similar to connection-oriented
   traffic, where one side (e.g., the RTSP client) first "connects" by
   sending an RTP packet to the other side's RTP port; the recipient
   then replies to the originating IP and Port.  This method is also
   referred to as "late binding".  It requires that all RTP/RTCP
   transport be done symmetrically.  This in effect requires Symmetric
   RTP [RFC4961].  Refer to [RFC7362] for a description of the Latching
   of SIP-negotiated media streams in Session Border Controllers.

   Specifically, when the RTSP server receives the Latching packet
   (a.k.a. hole-punching packet, since it is used to punch a hole in the
   firewall/NAT) from its client, it copies the source IP and Port
   number and uses them as the delivery address for media packets.  By
   having the server send media traffic back the same way as the
   client's packets are sent to the server, address and port mappings
   will be honored.  Therefore, this technique works for all types of
   NATs, given that the server is not behind a NAT.  However, it does
   require server modifications.  The format of the Latching packet will
   have to be defined.
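
   The core of the server behavior described above can be sketched
   over the loopback interface.  This is a minimal sketch under stated
   assumptions: the payloads b"latch" and b"media" are placeholders
   (the Latching packet format is undefined), and the helper names
   latch and demo are hypothetical.

```python
import socket


def latch(server_sock: socket.socket) -> tuple:
    """Wait for one Latching packet and return its source (IP, port)."""
    _data, source = server_sock.recvfrom(2048)
    return source


def demo() -> bool:
    """Run one latch-then-deliver exchange on 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))   # stands in for the media output port
        client.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        client.settimeout(2.0)
        # Client "connects" by sending a packet to the server's media port.
        client.sendto(b"latch", server.getsockname())
        # Server copies the observed source address and sends media there.
        destination = latch(server)
        server.sendto(b"media", destination)
        data, _ = client.recvfrom(2048)
        return data == b"media" and destination == client.getsockname()
    finally:
        server.close()
        client.close()
```

   With a NAT in the path, the address returned by latch() would be
   the NAT's external mapping rather than the client's own address,
   which is what makes the return traffic traverse the NAT.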

   Latching is very vulnerable to both hijacking and becoming a tool in
   DDoS attacks (see Security Considerations in [RFC7362]) because
   attackers can simply forge the source IP and Port of the Latching
   packet.  The rule for restricting IP addresses to one of the
   signaling connections will need to be applied here also.  However,
   that does not protect against hijacking from another client behind
   the same NAT.  This can become a serious issue in deployments with
   CGNs.

4.4.2.  Necessary RTSP Extensions

   To support Latching, RTSP signaling must be extended to allow the
   RTSP client to indicate that it will use Latching.  The client also
   needs to be able to signal its RTP SSRC to the server in its SETUP
   request.  The RTP SSRC is used to establish some basic level of
   security against hijacking attacks or simply to avoid mis-association
   when multiple clients are behind the same NAT.  Care must be taken in
   choosing clients' RTP SSRC.  First, it must be unique within all the
   RTP sessions belonging to the same RTSP session.  Second, if the RTSP
   server is sending out media packets to multiple clients from the same
   send port, the RTP SSRC needs to be unique among those clients' RTP
   sessions.  Recognizing that there is a potential that RTP SSRC
   collisions may occur, the RTSP server must be able to signal to a
   client that a collision has occurred and that it wants the client to
   use a different RTP SSRC carried in the SETUP response or use unique
   ports per RTSP session.  Using unique ports limits an RTSP server in
   the number of sessions it can simultaneously handle per interface IP
   address.
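
   The two uniqueness rules for the client's RTP SSRC can be
   illustrated with a small server-side registry.  This is a sketch
   only; the class SsrcRegistry and its parameters are hypothetical,
   and how a collision is signaled back to the client is left out.

```python
class SsrcRegistry:
    """Detects client SSRC collisions under the two rules above."""

    def __init__(self):
        self._by_session = {}  # RTSP session -> {ssrc: stream}
        self._by_port = {}     # server send port -> {ssrc: (session, stream)}

    def register(self, session, stream, send_port, ssrc) -> bool:
        """Return True if `ssrc` is acceptable, False on a collision."""
        owner = (session, stream)
        in_session = self._by_session.setdefault(session, {})
        on_port = self._by_port.setdefault(send_port, {})
        # Rule 1: unique among the RTP sessions of one RTSP session.
        if in_session.get(ssrc, stream) != stream:
            return False
        # Rule 2: unique among clients served from the same send port.
        if on_port.get(ssrc, owner) != owner:
            return False
        in_session[ssrc] = stream
        on_port[ssrc] = owner
        return True
```

   A False result corresponds to the case where the server would ask
   the client to use a different SSRC or would fall back to unique
   ports per RTSP session.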

   The Latching packet as discussed above should have a field that can
   contain a client and RTP session identifier to correctly associate
   the Latching packet with the correct context.  If an RTP packet is to
   be used, there would be a benefit to using a well-defined RTP payload
   format for this purpose, such as the proposed No-Op payload format
   [RTP-NO-OP].  However, in the absence of such a specification, an RTP
   packet without a payload could be used.  Using SSRC is beneficial
   because RTP and RTCP both would work as is.  However, other packet
   formats could be used that carry the necessary identification of the
   context, and such a solution is discussed in Section 4.5.
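
   An RTP packet without a payload consists of just the 12-byte fixed
   RTP header.  The sketch below builds and parses such a packet; the
   function names are hypothetical, and the payload type value 96 is
   an assumption chosen from the dynamic range, not something this
   document defines.

```python
import struct


def build_latching_rtp(ssrc: int, seq: int = 0, timestamp: int = 0,
                       payload_type: int = 96) -> bytes:
    """Build a payload-less RTP packet carrying the client's SSRC."""
    first = 2 << 6                 # V=2, P=0, X=0, CC=0
    second = payload_type & 0x7F   # M=0, 7-bit payload type (assumed 96)
    return struct.pack("!BBHII", first, second,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def parse_ssrc(packet: bytes):
    """Return the SSRC of an RTP packet, or None if it is not RTP v2."""
    if len(packet) < 12 or packet[0] >> 6 != 2:
        return None
    return struct.unpack("!I", packet[8:12])[0]
```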

4.4.3.  ALG Considerations

   An RTSP ALG not supporting this method could interfere with the
   methods used to indicate that Latching is to be done, as well as the
   SSRC signaling, thus preventing the method from working.  However, if
   the RTSP ALG instead opens the corresponding pinholes and creates the
   necessary mapping in the NAT, traversal will still work.  Securing
   the RTSP message transport using TLS will avoid this issue.

   An RTSP ALG that supports this traversal method can, for basic
   functionality, simply pass the related signaling parameters
   transparently.  Due to the security considerations for Latching,
   there might be a benefit in letting an RTSP ALG that enables NAT
   traversal negotiate along the path and turn off the Latching
   procedures when the ALG handles the traversal itself.  However, this
   opens up failure modes when there are multiple levels of NAT and
   only one of them has an RTSP ALG.

4.4.4.  Deployment Considerations

   Advantages:

   o  Works for all types of client-facing NATs (requirement 1 in
      Section 3).

   o  Has few interaction problems with any RTSP ALG changing the
      client's information in the Transport header.

   Disadvantages:

   o  Requires modifications to both the RTSP server and client.

   o  Limited to working with servers that are not behind a NAT.

   o  The format of the packet for "connection setup" (a.k.a. Latching
      packet) is not defined.

   o  Requires SSRC management if RTP is used for Latching, due to the
      risk of mis-association of clients to RTSP sessions at the server
      if an SSRC collision occurs.

   o  Has significant security considerations (see Section 4.4.5) due
      to the lack of a strong authentication mechanism, and will need
      to use address restrictions.

4.4.5.  Security Considerations

   Latching's major security issue is that RTP streams can be hijacked
   and directed towards any target that the attacker desires unless
   address restrictions are used.  In the case of NATs with multiple
   clients on the inside of them, hijacking can still occur.  This
   becomes a significant threat in the context of CGNs.

   The most serious security problem is the deliberate attack with the
   use of an RTSP client and Latching.  The attacker uses RTSP to set up
   a media session.  Then it uses Latching with a spoofed source address
   of the intended target of the attack.  There is no defense against
   this attack other than restricting the possible address a Latching
   packet can come from to the same address as the RTSP TCP connection
   is from.  This prevents Latching from being used in use cases that
   require different addresses for media destination and signaling.  Even
   allowing only a limited address range containing the signaling
   address from where Latching is allowed opens up a significant
   vulnerability as it is difficult to determine the address usage for
   the network the client connects from.
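
   The strict address restriction can be sketched as a simple
   comparison between the source address of the RTSP TCP connection
   and that of the Latching packet.  The function name
   latching_source_allowed is hypothetical.

```python
import ipaddress


def latching_source_allowed(signaling_ip: str, latching_ip: str) -> bool:
    """Accept a Latching packet only if it comes from the same IP
    address as the RTSP TCP connection.

    Parsing with `ipaddress` makes differently written forms of the
    same address (e.g., compressed IPv6) compare as equal.
    """
    return (ipaddress.ip_address(signaling_ip)
            == ipaddress.ip_address(latching_ip))
```

   Note that only the IP address is compared; the source port of the
   Latching packet necessarily differs from that of the TCP
   connection.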

   A hijack attack can also be performed in various ways.  The basic
   attack is based on the ability to read the RTSP signaling packets in
   order to learn the address and port the server will send from and
   also the SSRC the client will use.  Having this information, the
   attacker can send its own Latching packets containing the correct RTP
   SSRC to the correct address and port on the server.  The RTSP server
   will then use the source IP and Port from the Latching packet as the
   destination for the media packets it sends.

   Another variation of this attack is for a man in the middle to modify
   the RTP Latching packet being sent by a client to the server by
   simply changing the source IP and Port to the target one desires to
   attack.

   One can fend off the snooping-based attack by applying encryption to
   the RTSP signaling transport.  However, if the attacker is a man in
   the middle modifying Latching packets, the attack is impossible to
   defend against other than through address restrictions.  Because a
   NAT rewrites the source IP and (possibly) port, these fields cannot
   be authenticated, yet authentication is required in order to protect
   against this type of DoS attack.

   Yet another issue is that these attacks can also be used to
   completely deny the client the service it desires from the RTSP
   server.  The attacker modifies or originates its own Latching
   packets with a port other than the one the legitimate Latching
   packets use, which results in the media server sending the RTP/RTCP
   traffic to ports on which the client isn't listening for RTP/RTCP.

   The amount of random non-guessable material in the Latching packet
   determines how well Latching can fend off stream hijacking performed
   by parties that are off the client-to-server network path, i.e.,
   parties that lack the capability to see the client's Latching
   packets.  The proposal above uses the 32-bit RTP SSRC field to this
   effect.  Therefore, it is important that this field is derived with
   a non-predictable random number generator.  It should not be
   possible, by knowing the algorithm used and a couple of basic facts,
   to derive what random number a certain client will use.
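
   As a minimal sketch of this requirement, a client could draw its
   SSRC from the operating system's cryptographically secure random
   source.  The function name new_ssrc is hypothetical.

```python
import secrets


def new_ssrc() -> int:
    """Return an unpredictable 32-bit value for use as an RTP SSRC.

    `secrets` draws from a cryptographically secure source, unlike the
    seedable generator in the `random` module.
    """
    return secrets.randbits(32)
```

   Even with an ideal generator, the field offers at most 32 bits of
   unpredictability, so an off-path attacker needs on the order of
   2^31 guesses on average to hit a given value.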

   An attacker not knowing the SSRC but aware of which port numbers a
   server sends from can deploy a brute-force attack on the server by
   testing a lot of different SSRCs until it finds a matching one.
   Therefore, a server could implement functionality that blocks packets
   to ports or from sources that receive or send multiple Latching
   packets with different invalid SSRCs, especially when they are coming
   from the same IP and Port.  Note that this mitigation in itself opens
   up a new avenue for DoS attacks against legitimate users trying to
   latch.
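
   The blocking behavior described above can be sketched as a
   per-source failure counter.  The class LatchGuard and its default
   threshold are hypothetical; a real server would also need to expire
   entries over time.

```python
from collections import defaultdict


class LatchGuard:
    """Blocks a source after too many Latching packets with bad SSRCs."""

    def __init__(self, max_failures: int = 5):
        self.max_failures = max_failures          # assumed threshold
        self._failures = defaultdict(int)         # (ip, port) -> bad guesses

    def accept(self, source, ssrc: int, expected_ssrc: int) -> bool:
        """Return True only for a matching SSRC from an unblocked source."""
        if self._failures[source] >= self.max_failures:
            return False                          # source is blocked
        if ssrc == expected_ssrc:
            return True
        self._failures[source] += 1
        return False
```

   Once a source is blocked, even a packet with the correct SSRC is
   rejected.  An attacker able to spoof a legitimate client's source
   address could exploit exactly this to lock that client out.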

   To improve the security against attackers, the amount of random
   material could be increased.  To achieve a longer random tag while
   still using RTP and RTCP, it will be necessary to develop RTP and
   RTCP payload formats for carrying the random material.

4.5.  A Variation to Latching

4.5.1.  Introduction

   Latching as described above requires the usage of a valid RTP format
   as the Latching packet, i.e., the first packet that the client sends
   to the server to establish a bidirectional transport flow for RTP
   streams.  There is currently no appropriate RTP packet format for
   this purpose, although the RTP No-Op format was a proposal to fix the
   problem [RTP-NO-OP]; however, that work was abandoned.  [RFC6263]
   discusses the implication of different types of packets as keep-
   alives for RTP, and its findings are very relevant to the format of
   the Latching packet.

   Meanwhile, there have been NAT/firewall traversal techniques deployed
   in the wireless streaming market place that use non-RTP messages as
   Latching packets.  This section describes a variant based on a subset
   of those solutions that alters the previously described Latching
   solution.

4.5.2.  Necessary RTSP Extensions

   In this variation of Latching, the Latching packet is a small UDP
   packet that does not contain an RTP header.  In response to the
   client's Latching packet, the RTSP server sends back a similar
   Latching packet as a confirmation so the client can stop the so-
   called "connection phase" of this NAT traversal technique.
   Afterwards, the client only has to periodically send Latching packets
   as keep-alive messages for the NAT mappings.

   The server listens on its RTP-media output port and tries to decode
   any received UDP packet as the Latching packet.  This is valid since
   an RTSP server is not expecting RTP traffic from the RTSP client.
   Then, it can correlate the Latching packet with the RTSP client's
   session ID or the client's SSRC and record the NAT bindings
   accordingly.  The server then sends a Latching packet as the response
   to the client.

   The Latching packet can contain the SSRC to identify the RTP stream,
   and care must be taken if the packet is bigger than 12 bytes,
   ensuring that it is distinctively different from RTP packets, whose
   header size is 12 bytes.
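
   One way to keep such a packet distinct from RTP is to make sure its
   first byte can never carry the RTP version bits.  The layout below
   is purely hypothetical (this document does not define one): a
   4-byte magic value, the 32-bit SSRC, and a 16-byte random token.

```python
import struct

# Hypothetical layout: magic (4 bytes) | SSRC (4 bytes) | token (16 bytes).
# The first magic byte is 0x4C, whose top two bits are 01, so the packet
# can never be mistaken for RTP version 2 (top two bits 10).
MAGIC = b"LTCH"
_FORMAT = "!4sI16s"


def build_latching_packet(ssrc: int, token: bytes) -> bytes:
    """Serialize a non-RTP Latching packet (24 bytes)."""
    return struct.pack(_FORMAT, MAGIC, ssrc, token)


def parse_latching_packet(packet: bytes):
    """Return (ssrc, token), or None if this is not a Latching packet."""
    if len(packet) != struct.calcsize(_FORMAT):
        return None
    magic, ssrc, token = struct.unpack(_FORMAT, packet)
    if magic != MAGIC:
        return None
    return ssrc, token


def looks_like_rtp(packet: bytes) -> bool:
    """True if the packet is long enough for RTP and has version 2."""
    return len(packet) >= 12 and packet[0] >> 6 == 2
```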

   RTSP signaling can be added to do the following:

   1.  Enable or disable such Latching message exchanges.  When the
       firewall/NAT has an RTSP-aware ALG, it is possible to disable
       Latching message exchange and let the ALG work out the address
       and port mappings.

   2.  Configure the number of retries and the retry interval of the
       Latching message exchanges.
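
   The client side of the "connection phase", with a configurable
   number of retries and retry interval, can be sketched as follows.
   The function run_connection_phase and its callback parameters are
   hypothetical; the transport is abstracted away so that the retry
   logic stands on its own.

```python
def run_connection_phase(send, wait_for_confirmation,
                         retries: int, interval: float):
    """Send Latching packets until the server confirms.

    send():                     transmits one Latching packet.
    wait_for_confirmation(t):   waits up to t seconds and returns True
                                if the server's confirmation arrived.
    Returns the attempt number that succeeded, or None if every
    attempt timed out.
    """
    for attempt in range(1, retries + 1):
        send()
        if wait_for_confirmation(interval):
            return attempt
    return None
```

   After a successful return, the client would switch to sending
   Latching packets only periodically as keep-alives.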

4.5.3.  ALG Considerations

   See Latching ALG considerations in Section 4.4.3.

4.5.4.  Deployment Considerations

   This approach has the following advantages when compared with the
   Latching approach (Section 4.4):

   1.  There is no need to define an RTP payload format for firewall
       traversal; therefore, it is simpler to use, implement, and
       administer (requirement 4 in Section 3) than a Latching protocol
       based on RTP, for which such a format must be defined.

   2.  When properly defined, this kind of Latching packet exchange can
       also authenticate RTP receivers, to prevent hijacking attacks.

   This approach has the following disadvantage when compared with the
   Latching approach:

   1.  The server's sender SSRC for the RTP stream or other session
       identity information must be signaled in the Transport header of
       the RTSP SETUP response.

4.5.5.  Security Considerations

   Compared to the security properties of Latching, this variant is
   slightly improved.  First, it allows for a larger random field in
   the Latching packets, which makes it less likely for an off-path
   attacker to succeed in a hijack attack.  Second, the confirmation
   allows the client to know when Latching works and when it doesn't and
   thus when to restart the Latching process by updating the SSRC.

   Still, the main remaining security issue is that the RTSP server
   can't know whether the source address in the Latching packet came
   from an RTSP client wanting to receive media or from one that wants
   to direct the media traffic to a DoS target.
