4.6. Three-Way Latching
Three-Way Latching attempts to resolve the most significant security
issues of both previously discussed variants of Latching. By adding
a server request/response exchange directly after the initial
Latching, the server can verify that the target address present in
the Latching packet is an active listener and confirm its desire to
establish a media flow.
4.6.2. Necessary RTSP Extensions
This variant uses the same RTSP extensions as the Alternative
Latching method (Section 4.5). The extensions for this variant are
only in the format and transmission of the Latching packets.
The client-to-server Latching packet is similar to the Alternative
Latching (Section 4.5), i.e., a UDP packet with some session
identifiers and a random value. When the server responds to the
Latching packet with a Latching confirmation, it includes a random
value (nonce) of its own in addition to echoing back the one the
client sent. Then a third message is added to the exchange. The
client acknowledges the reception of the Latching confirmation
message and echoes back the server's nonce, thus confirming that the
Latched address goes to an RTSP client that initiated the Latching
and is actually present at that address. The RTSP server will refuse
to send any media until the Latching Acknowledgement has been
received with a valid nonce.
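
The exchange described above can be sketched as a small piece of
server-side state. This is a minimal illustration, not a wire format:
the class name, the method names, the 16-byte nonce size, and the rule
that the acknowledgement must arrive from the same source address as
the Latching packet are all assumptions made for the example.

```python
import hmac
import secrets


class LatchingServer:
    """Hypothetical server-side state for Three-Way Latching."""

    def __init__(self):
        self.expected_client_nonce = {}  # session_id -> nonce from signaling
        self.pending = {}                # session_id -> (server_nonce, addr)
        self.latched = {}                # session_id -> verified (ip, port)

    def register_session(self, session_id, client_nonce):
        """Record the client nonce learned through RTSP signaling."""
        self.expected_client_nonce[session_id] = client_nonce

    def on_latching_packet(self, session_id, client_nonce, source_addr):
        """Message 1: check the client nonce, build the confirmation."""
        expected = self.expected_client_nonce.get(session_id)
        if expected is None or not hmac.compare_digest(expected, client_nonce):
            return None  # unknown session or wrong nonce: send nothing
        server_nonce = secrets.token_bytes(16)
        self.pending[session_id] = (server_nonce, source_addr)
        # Message 2: echo the client nonce and add the server nonce.
        return (client_nonce, server_nonce)

    def on_latching_ack(self, session_id, echoed_server_nonce, source_addr):
        """Message 3: only a host that received message 2 can echo the nonce."""
        entry = self.pending.get(session_id)
        if entry is None:
            return False
        server_nonce, candidate_addr = entry
        if source_addr != candidate_addr:
            return False  # assumption: the ack must come from the same address
        if not hmac.compare_digest(server_nonce, echoed_server_nonce):
            return False
        del self.pending[session_id]
        self.latched[session_id] = candidate_addr
        return True

    def may_send_media(self, session_id):
        """Media is refused until a valid acknowledgement has been received."""
        return session_id in self.latched
```

A client that spoofs the source address of its Latching packet never
sees the server nonce sent to that address, so on_latching_ack()
cannot succeed for it and may_send_media() stays false.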
4.6.3. ALG Considerations
See Latching ALG considerations in Section 4.4.3.
4.6.4. Deployment Considerations
A solution with a three-way handshake and its own Latching packets
can be compared with the ICE-based solution (Section 4.3) and has
the following differences:
o Only works for servers that are not behind a NAT.
o May be simpler to implement due to the avoidance of the ICE
prioritization and check-board mechanisms.
However, a Three-Way Latching protocol is very similar to using STUN
in both directions as a Latching and verification protocol. Using
STUN would remove the need for implementing a new protocol.
4.6.5. Security Considerations
Three-Way Latching is significantly more secure than its simpler
versions discussed above. The client-to-server nonce, which is
included in signaling and also can be bigger than the 32 bits of
random data that the SSRC field supports, makes it very difficult for
an off-path attacker to perform a DoS attack by diverting the media.
The client-to-server nonce and its echoing back do not protect
against on-path attackers, including malicious clients. However, the
server-to-client nonce and its echoing back prevent malicious
clients from diverting the media stream by spoofing the source
address and port, as they can't echo back the nonce in these cases.
This is similar to the Mobile IPv6 return routability procedure
(Section 5.2.5 of [RFC6275]).
Three-Way Latching is really only vulnerable to an on-path attacker
that is quite capable. First, the attacker can learn the client-
to-server nonce either by intercepting the signaling or by modifying
the source information (target destination) of a client's Latching
packet. Second, it is also on-path between the server and target
destination and can generate a response using the server's nonce. An
adversary that has these capabilities is commonly capable of causing
significantly worse damage than this using other methods.
Three-Way Latching results in the server-to-client packet being
bigger than the client-to-server packet, due to the inclusion of the
server-to-client nonce in addition to the client-to-server nonce.
Thus, an amplification effect does exist; however, to achieve this
amplification effect, the attacker has to create a session state on
the RTSP server. The RTSP server can also limit the number of
responses it will generate before considering the Latching to be
4.7. Application Level Gateways
An ALG reads the application level messages and performs necessary
changes to allow the protocol to work through the middlebox.
However, this behavior has some problems with regard to RTSP:
1. It does not work when RTSP is used with end-to-end security. As
the ALG can't inspect and change the application level messages,
the protocol will fail due to the middlebox.
2. ALGs need to be updated if extensions to the protocol are added.
Due to deployment issues with changing ALGs, this may also break
the end-to-end functionality of RTSP.
Due to the above reasons, it is not recommended to use an RTSP ALG in
NATs. This is especially important for NATs targeted to home users
and small office environments, since it is very hard to upgrade NATs
deployed in SOHO environments.
4.7.2. Outline on How ALGs for RTSP Work
In this section, we provide a step-by-step outline on how one could
go about writing an ALG to enable RTSP to traverse a NAT.
1. Detect any SETUP request.
2. Try to detect the usage of any of the NAT traversal methods that
replace the address and port of the Transport header parameters
"destination" or "dest_addr". If any of these methods are used,
then the ALG should not change the address. Ways to detect that
these methods are used are:
* For embedded STUN, it would be to watch for a feature tag,
like "nat.stun", and to see if any of those exist in the
"supported", "proxy-require", or "require" headers of the RTSP
* For stand-alone STUN and TURN-based solutions: This can be
detected by inspecting the "destination" or "dest_addr"
parameter. If it contains either one of the NAT's external IP
addresses or a public IP address, then such a solution is in
use. However, if multiple NATs are used, this detection may
fail. Remapping should only be done for addresses belonging
to the NAT's own private address space.
Otherwise, continue to the next step.
3. Create UDP mappings (client-given IP and port <-> external IP and
port) where needed for all possible transport specifications in
the Transport header of the request found in step 1. Enter the
external address and port(s) of these mappings in the Transport
header. Mappings shall be created with consecutive external port
numbers starting on an even number for RTP for each media stream.
Mappings should also be given a long timeout period, at least 5
4. When the SETUP response is received from the server, the ALG may
remove the unused UDP mappings, i.e., the ones not present in the
Transport header. The session ID should also be bound to the UDP
mappings that are part of that session.
5. If the SETUP response settles on RTP over TCP or RTP over RTSP as
lower transport, do nothing: let TCP tunneling take care of NAT
traversal. Otherwise, go to the next step.
6. The ALG should keep the UDP mappings belonging to the RTSP
session as long as: an RTSP message with the session's ID has
been sent in the last timeout interval, or a UDP message has been
sent on any of the UDP mappings during the last timeout interval.
7. The ALG may remove a mapping as soon as a TEARDOWN response has
been received for that media stream.
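
Two of the steps above lend themselves to a short sketch: the check
in step 2 that an address belongs to the NAT's own private address
space, and the allocation in step 3 of consecutive external ports
starting on an even number. The function and class names, the port
range, and the mapping limit are hypothetical, and released ports are
not reused in this simplified version.

```python
import ipaddress


def should_remap(dest_addr, internal_network):
    """Step 2: only rewrite addresses inside the NAT's private realm."""
    return ipaddress.ip_address(dest_addr) in ipaddress.ip_network(internal_network)


class PortPairAllocator:
    """Step 3: hand out (RTP, RTCP) external port pairs, RTP on an even port."""

    def __init__(self, first_port=50000, last_port=50999, max_pairs=100):
        # Round up so that every RTP port handed out is even.
        self.next_even = first_port + (first_port % 2)
        self.last_port = last_port
        self.max_pairs = max_pairs  # cap on the number of mappings created
        self.mappings = {}          # (client_ip, client_rtp_port) -> pair

    def allocate(self, client_ip, client_rtp_port):
        key = (client_ip, client_rtp_port)
        if key in self.mappings:
            return self.mappings[key]
        if len(self.mappings) >= self.max_pairs:
            raise RuntimeError("mapping limit reached")
        if self.next_even + 1 > self.last_port:
            raise RuntimeError("external port range exhausted")
        pair = (self.next_even, self.next_even + 1)
        self.next_even += 2
        self.mappings[key] = pair
        return pair

    def release(self, client_ip, client_rtp_port):
        """Steps 4 and 7: drop a mapping that is unused or torn down."""
        self.mappings.pop((client_ip, client_rtp_port), None)
```

For example, with an internal network of 192.168.0.0/24, a
"dest_addr" of 192.168.0.10 would be remapped, whereas a public
address such as 203.0.113.5 would be left untouched.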
4.7.3. Deployment Considerations
o No impact on either client or server.
o Can work for any type of NATs.
o When deployed, they are hard to update to reflect protocol
modifications and extensions. If not updated, they will break the
o When end-to-end security is used, the ALG functionality will fail.
o Can interfere with other types of traversal mechanisms, such as
An RTSP ALG will not be phased out in any automatic way. It must be
removed, probably through the removal or update of the NAT it is
4.7.4. Security Considerations
An ALG will not work with deployment of end-to-end RTSP signaling
security; however, it will work with the hop-by-hop security method
defined in Section 19.3 of RTSP 2.0 [RTSP]. Therefore, deployment of
ALG may result in clients located behind NATs not using end-to-end
security, or more likely the selection of a NAT traversal solution
that allows for security.
The creation of a UDP mapping based on the signaling message has some
potential security implications. First of all, if the RTSP client
releases its ports and another application is assigned these instead,
it could receive RTP media as long as the mappings exist and the RTSP
server has failed to be signaled or notice the lack of client
A NAT with RTSP ALG that assigns mappings based on SETUP requests
could potentially become the victim of a resource exhaustion attack.
If an attacker creates a lot of RTSP sessions, even without starting
media transmission, this could exhaust the pool of available UDP
ports on the NAT. Thus, only a limited number of UDP mappings should
be allowed to be created by the RTSP ALG.
4.8. TCP Tunneling
A TCP connection established from the client to the server, i.e.,
opened from the private domain, ensures that the server can send data
back to the client. Sending data originally intended to be
transported over UDP requires the TCP connection to support some type
of framing of the media data packets. Using TCP also results in the
client having to accept that real-time performance can be impacted.
TCP's problem of ensuring timely delivery was one of the reasons why
RTP was developed. Problems that arise with TCP are: head-of-line
blocking, delay introduced by retransmissions, and a highly varying
rate due to the congestion control algorithm. If a sufficient amount
of buffering (several seconds) in the receiving client can be
tolerated, then TCP will clearly work.
4.8.2. Usage of TCP Tunneling in RTSP
The RTSP core specification [RTSP] supports interleaving of media
data on the TCP connection that carries RTSP signaling. See
Section 14 in [RTSP] for how to perform this type of TCP tunneling.
There also exists another way of transporting RTP over TCP, which is
defined in Appendix C.2 in [RTSP]. For signaling and rules on how to
establish the TCP connection in lieu of UDP, see Appendix C.2 in
[RTSP]. This is based on the framing of RTP over the TCP connection
as described in [RFC4571].
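
The two framing styles mentioned above can be illustrated as follows.
This is a sketch only: the function names are hypothetical. The
interleaved form prefixes each packet with "$", a one-byte channel
identifier, and a 16-bit length; the [RFC4571] form prefixes each
packet with a 16-bit length only. Both lengths are in network byte
order.

```python
import struct


def frame_interleaved(channel, packet):
    """Frame a packet for interleaving on the RTSP connection."""
    return b"$" + struct.pack("!BH", channel, len(packet)) + packet


def frame_rfc4571(packet):
    """Frame a packet with the 16-bit length prefix of RFC 4571."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length field")
    return struct.pack("!H", len(packet)) + packet


def split_rfc4571(buffer):
    """Return (complete packets, leftover bytes) from a received byte run."""
    packets = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # the rest of this packet has not arrived yet
        packets.append(buffer[offset + 2:offset + 2 + length])
        offset += 2 + length
    return packets, buffer[offset:]
```

Because TCP delivers a byte stream, the receiver has to keep the
leftover bytes and prepend them to the next read, which is why
split_rfc4571() returns them alongside the complete packets.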
4.8.3. ALG Considerations
An RTSP ALG will face a different issue with TCP tunneling, at least
the interleaved version. Now the full data stream can end up flowing
through the ALG implementation. Thus, it is important that the ALG
is efficient in dealing with the interleaved media data frames to
avoid consuming too many resources and thus creating performance
The RTSP ALG can also affect the transport specifications that
indicate that TCP tunneling can be done and its prioritization,
including removing the transport specification, thus preventing TCP
4.8.4. Deployment Considerations
o Works through all types of NATs where the RTSP server is not NATed
or is at least reachable as if it were not.
o Functionality needs to be implemented on both server and client.
o Will not always meet a multimedia stream's real-time requirements.
The tunneling over RTSP's TCP connection is not planned to be phased
out. It is intended to be a fallback mechanism and for usage when
total media reliability is desired, even at the potential price of
loss of real-time properties.
4.8.5. Security Considerations
The TCP tunneling of RTP has no known significant security problems
besides those already presented in the RTSP specification. It is
difficult to get any amplification effect for DoS attacks due to
TCP's flow control. The RTSP server's TCP socket, regardless of
whether it is used for media tunneling or only for RTSP messages,
can be used for a redirected SYN attack. By spoofing the source
address of any TCP
init packets, the TCP SYNs from the server can be directed towards a
A possible security consideration, when session media data is
interleaved with RTSP, would be the performance bottleneck when RTSP
encryption is applied, since all session media data also needs to be
4.9. Traversal Using Relays around NAT (TURN)
TURN [RFC5766] is a protocol for setting up traffic relays that allow
clients behind NATs and firewalls to receive incoming traffic for
both UDP and TCP. These relays are controlled and have limited
resources. They need to be allocated before usage. TURN allows a
client to temporarily bind an address/port pair on the relay (TURN
server) to its local source address/port pair, which is used to
contact the TURN server. The TURN server will then forward packets
between the two sides of the relay.
To prevent DoS attacks on either recipient, the packets forwarded are
restricted to the specific source address. On the client side, it is
restricted to the source setting up the allocation. On the external
side, it is limited to the source address/port pairs that have been
given permission by the TURN client creating the allocation. Packets
from any other source on this address will be discarded.
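
The filtering rule described above can be modeled in a few lines.
This is a toy model of the rule as stated in this section, not an
implementation of the TURN protocol: the class and method names are
hypothetical, and the exact granularity and lifetime of permissions
are defined in [RFC5766] rather than by this sketch.

```python
class RelayAllocation:
    """Toy model of one relay allocation and its forwarding filter."""

    def __init__(self, client_addr):
        self.client_addr = client_addr  # (ip, port) that set up the allocation
        self.permitted_peers = set()    # external sources allowed to send

    def add_permission(self, requester_addr, peer_addr):
        """Only the client that created the allocation may grant permission."""
        if requester_addr != self.client_addr:
            raise PermissionError("not the owner of this allocation")
        self.permitted_peers.add(peer_addr)

    def forward_from_peer(self, source_addr, payload):
        """External side: relay to the client only from permitted sources."""
        if source_addr not in self.permitted_peers:
            return None  # discarded
        return (self.client_addr, payload)

    def forward_from_client(self, source_addr, peer_addr, payload):
        """Client side: relay only traffic from the allocation's owner."""
        if source_addr != self.client_addr:
            return None
        if peer_addr not in self.permitted_peers:
            return None
        return (peer_addr, payload)
```

In the RTSP case, the permitted peer would be the media source
address that the RTSP server reports in its SETUP response.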
Using a TURN server makes it possible for an RTSP client to receive
media streams from even an unmodified RTSP server. However, the
problem is that such RTSP servers most likely restrict media
destinations to no other IP address than the one the RTSP message
arrives from. This means that TURN could only be used if the server
knows and accepts that the IP belongs to a TURN server, and the TURN
server can't be targeted at an unknown address. Alternatively, both
the RTSP TCP connection and the RTP media are relayed through the
same TURN server.
4.9.2. Usage of TURN with RTSP
To use a TURN server for NAT traversal, the following steps should be
1. The RTSP client connects with the RTSP server. The client
retrieves the session description to determine the number of
media streams. To avoid the issue of having the RTSP connection
and media traffic from different addresses, the TCP connection
must also be done through the same TURN server as the one in the
next step. This will require the usage of TURN for TCP
2. The client establishes the necessary bindings on the TURN server.
It must choose the local RTP and RTCP ports that it desires to
receive media packets. TURN supports requesting bindings of even
port numbers and contiguous ranges.
3. The RTSP client uses the acquired address and port allocations in
the RTSP SETUP request using the destination header.
4. The RTSP server sends the SETUP reply, which must include the
Transport header's "src_addr" parameter (source and port in RTSP
1.0). Note that the server is required to have a mechanism to
verify that it is allowed to send media traffic to the given
address unless TCP relaying of the RTSP messages also is
5. The RTSP client uses the RTSP server's response to create TURN
permissions for the server's media traffic.
6. The client requests that the server starts playing. The server
starts sending media packets to the given destination address and
7. Media packets arrive at the TURN server on the external port; if
the packets match an established permission, the TURN server
forwards the media packets to the RTSP client.
8. If the client pauses and media is not sent for about 75% of the
mapping timeout, the client should use TURN to refresh the
4.9.3. ALG Considerations
As the RTSP client inserts the address information of the TURN
relay's external allocations in the SETUP messages, the ALG that
replaces the address, without considering that the address does not
belong to the internal address realm of the NAT, will prevent this
mechanism from working. This can be prevented by securing the RTSP
4.9.4. Deployment Considerations
o Does not require any server modifications, provided that the
server includes the "src_addr" parameter in the SETUP response.
o Works for any type of NAT as long as the RTSP server has a
reachable IP address that is not behind a NAT.
o Requires another network element, namely the TURN server.
o A TURN server for RTSP may not scale since the number of sessions
it must forward is proportional to the number of client media
o The TURN server becomes a single point of failure.
o Since TURN forwards media packets, as a necessity it introduces
o An RTSP ALG may change the necessary destination parameter. This
will cause the media traffic to be sent to the wrong address.
TURN is not intended to be phased out completely; see Section 19 of
[RFC5766]. However, the usage of TURN could be reduced when the
demand for having NAT traversal is reduced.
4.9.5. Security Considerations
The TURN server can become part of a DoS attack towards any victim.
To perform this attack, the attacker must be able to eavesdrop on the
packets from the TURN server towards a target for the DoS attack.
The attacker uses the TURN server to set up an RTSP session with
media flows going through the TURN server. The attacker is in fact
creating TURN mappings towards a target by spoofing the source
address of TURN requests. As the attacker will need the address of
these mappings, he must be able to eavesdrop or intercept the TURN
responses going from the TURN server to the target. Having these
addresses, he can set up an RTSP session and start delivery of the
media. The attacker must be able to create these mappings. The
attacker in this case may be traced by the TURN username in the
This attack requires that the attacker has access to a user account
on the TURN server to be able to set up the TURN mappings. To
prevent this attack, the RTSP server needs to verify that the
ultimate target destination accepts this media stream, which would
require something like ICE's connectivity checks being run between
the RTSP server and the RTSP client.
Firewalls exist for the purpose of protecting a network from traffic
not desired by the firewall owner. Therefore, it is a policy
decision if a firewall will let RTSP and its media streams through or
not. RTSP is designed to be firewall friendly in that it should be
easy to design firewall policies to permit passage of RTSP traffic
and its media streams.
The firewall will need to allow the media streams associated with an
RTSP session to pass through it. Therefore, the firewall will need
an ALG that reads RTSP SETUP and TEARDOWN messages. By reading the
SETUP message, the firewall can determine what type of transport and
from where the media stream packets will be sent. Commonly, there
will be the need to open UDP ports for RTP/RTCP. By looking at the
source and destination addresses and ports, the opening in the
firewall can be minimized to the least necessary. The opening in the
firewall can be closed after a TEARDOWN message for that session or
the session itself times out.
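
The pinhole handling described above might look roughly like the
following sketch. It assumes an RTSP 1.0 style Transport header with
"client_port" and "server_port" parameters and handles a single
transport specification only; the function names, the class name, and
the 60-second idle timeout are hypothetical.

```python
import time


def parse_transport(value):
    """Split one transport specification into (protocol, parameters)."""
    parts = [p.strip() for p in value.split(";") if p.strip()]
    params = {}
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        params[name] = val if sep else True
    return parts[0], params


def port_range(text):
    """Turn "4588-4589" (or "4588") into a (first, last) tuple."""
    first, _, last = text.partition("-")
    return int(first), int(last or first)


class PinholeTable:
    """Opens a pinhole per session on SETUP, closes it on TEARDOWN or timeout."""

    def __init__(self, idle_timeout=60.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.holes = {}  # session_id -> (pinhole description, time opened)

    def on_setup_response(self, session_id, transport_value):
        _, params = parse_transport(transport_value)
        if "client_port" not in params:
            return None  # e.g., interleaved over TCP: no UDP pinhole needed
        hole = {"client_ports": port_range(params["client_port"])}
        if "server_port" in params:
            hole["server_ports"] = port_range(params["server_port"])
        self.holes[session_id] = (hole, self.clock())
        return hole

    def on_teardown(self, session_id):
        self.holes.pop(session_id, None)

    def expire(self):
        """Close pinholes whose session has passed the idle timeout."""
        now = self.clock()
        stale = [s for s, (_, t) in self.holes.items()
                 if now - t > self.idle_timeout]
        for session_id in stale:
            del self.holes[session_id]
```

A real firewall would also refresh the timestamp whenever RTSP or
media traffic for the session is seen; that is omitted here for
brevity.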
The above possibilities for firewalls to inspect and respond to the
signaling are prevented if end-to-end confidentiality protection is
used for the RTSP signaling, e.g., using the specified RTSP over TLS.
As a result, firewalls can't be actively opening pinholes for the
media streams based on the signaling. To enable an RTSP ALG in the
firewall to correctly function, the hop-by-hop signaling security in
RTSP 2.0 can be used (see Section 19.3 of [RTSP]). If not, other
methods have to be used to enable the transport flows for the media.
Simpler firewalls do allow a client to receive media as long as it
has sent packets to the target. Depending on the security level,
this can have the same behavior as a NAT. The only difference is
that no address translation is done. To use such a firewall, a
client would need to implement one of the above described NAT
traversal methods that include sending packets to the server to
create the necessary filtering state.
6. Comparison of NAT Traversal Techniques
This section evaluates the techniques described above against the
requirements listed in Section 3.
In the following table, the columns correspond to the numbered
requirements. For instance, the column under R1 corresponds to the
first requirement in Section 3: must work for all flavors of NATs.
The rows represent the different NAT/firewall traversal techniques.
Latch is short for Latching, "V. Latch" is short for "variation of
Latching" as described in Section 4.5, and "3-W Latch" is short for
the Three-Way Latching described in Section 4.6.
The requirements are summarized as follows:
R1: Work for all flavors of NATs
R2: Must work with firewalls, including those with ALGs
R3: Should have minimal impact on clients not behind NATs, counted in
minimal number of additional RTTs
R4: Should be simple to use, implement, and administer
R5: Should provide mitigation against DDoS attacks
The following considerations are also added to the requirements:
C1: Will the solution support both clients and servers behind NAT?
C2: Is the solution robust as NAT behaviors change?
------------+-------+-----+-----+-------+-----+-------+-----+
            | R1    | R2  | R3  | R4    | R5  | C1    | C2  |
------------+-------+-----+-----+-------+-----+-------+-----+
STUN        | No    | Yes | 1   | Maybe | No  | No    | No  |
Emb. STUN   | Yes   | Yes | 2   | Maybe | No  | No    | Yes |
ICE         | Yes   | Yes | 2.5 | No    | Yes | Yes   | Yes |
Latch       | Yes   | Yes | 1   | Maybe | No  | No    | Yes |
V. Latch    | Yes   | Yes | 1   | Yes   | No  | No    | Yes |
3-W Latch   | Yes   | Yes | 1.5 | Maybe | Yes | No    | Yes |
ALG         | (Yes) | Yes | 0   | No    | Yes | No    | Yes |
TCP Tunnel  | Yes   | Yes | 1.5 | Yes   | Yes | No    | Yes |
TURN        | Yes   | Yes | 1   | No    | Yes | (Yes) | Yes |
------------+-------+-----+-----+-------+-----+-------+-----+
Figure 1: Comparison of Fulfillment of Requirements
Looking at Figure 1, one would draw the conclusion that TCP
Tunneling and Three-Way Latching are the solutions that best fulfill
the requirements. The different techniques were discussed in the
MMUSIC WG. It was established that the WG would pursue an ICE-based
solution due to its generality and capability of also handling
servers delivering media from behind NATs. TCP Tunneling is likely
to be available as an alternative, due to its specification in the
main RTSP specification. Thus, it can be used if desired, and the
potential downsides of using TCP are acceptable in particular
deployments. Three-Way Latching is a very competitive technique as
long as support for RTSP servers behind NATs is not needed. There
was some discussion in the WG about whether the increased
implementation burden of ICE is sufficiently motivated, compared to
the Three-Way Latching solution, by this generality.
In the end, the authors believed that the reuse of ICE, greater
flexibility, and any way needed to deploy a new solution were the
The ICE-based RTSP NAT traversal solution is specified in "A Network
Address Translator (NAT) Traversal mechanism for media controlled by
Real-Time Streaming Protocol (RTSP)" [RTSP-NAT].
7. Security Considerations
In the preceding sections, we have discussed security merits of the
different NAT/firewall traversal methods for RTSP. In summary, the
presence of NAT(s) is a security risk, as a client cannot perform
source authentication of its IP address. This prevents the
deployment of any future RTSP extensions providing security against
the hijacking of sessions by a man in the middle.
Each of the proposed solutions has security implications. Using STUN
will provide the same level of security as RTSP without transport-
level security and source authentications, as long as the server does
not allow media to be sent to a different IP address than the RTSP
client request was sent from.
Using Latching will have a higher risk of session hijacking or DoS
than normal RTSP. The reason is that there exists a probability that
an attacker is able to guess the random bits that the client uses to
prove its identity when creating the address bindings. This can be
solved in the variation of Latching (Section 4.5) with authentication
features. Still, both those variants of Latching are vulnerable to
a deliberate attack from the RTSP client to redirect the requested
media stream to any target, assuming it can spoof the source
address. This security vulnerability is solved by performing a
Three-Way Latching procedure as discussed in Section 4.6.
ICE resolves the binding vulnerability of Latching by using signed
STUN messages, as well as requiring that both sides perform
connectivity checks to verify that the target IP address in the
candidate pair is both reachable and willing to respond. ICE can,
however, create a significant amount of traffic if the number of
candidate pairs is large. Thus, pacing is required and
implementations should attempt to limit their number of candidates to
reduce the number of packets.
If the signaling between the ICE peers (RTSP client and server) is
not confidentiality and integrity protected, ICE is vulnerable to
attacks where the candidate list is manipulated. The lack of
signaling security will also simplify spoofing of STUN binding
messages by revealing the secret used in signing.
The usage of an RTSP ALG does not in itself increase the risk for
session hijacking. However, the deployment of ALGs as the sole
mechanism for RTSP NAT traversal will prevent deployment of end-
to-end encrypted RTSP signaling.
The usage of TCP tunneling has no known security problems. However,
it might provide a bottleneck when it comes to end-to-end RTSP
signaling security if TCP tunneling is used on an interleaved RTSP
The usage of TURN carries a severe risk of DoS attacks against a
client. The TURN server can also be used as a redirect point in a
DDoS attack
unless the server has strict enough rules for who may create
Since Latching and the variants of Latching have such big security
issues, they should not be used at all. Three-Way Latching as well
as ICE mitigate these security issues and perform the important
return-routability checks that prevent spoofed source addresses, and
they should be recommended for that reason. RTSP ALGs are a security
risk as they can create a disincentive to use secure RTSP
signaling. That can be avoided as ALGs require trust in the
middlebox, and that trust becomes explicit if one uses the hop-by-hop
security solution as specified in Section 19.3 of RTSP 2.0 [RTSP].
The remaining methods can be considered safe enough, assuming that
the appropriate security mechanisms are used and not ignored.
8. Informative References
[NICE] Libnice, "The GLib ICE implementation", June 2015,
[PJNATH] "PJNATH - Open Source ICE, STUN, and TURN Library", May
[RFC768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
DOI 10.17487/RFC0768, August 1980,
[RFC793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, DOI 10.17487/RFC0793, September 1981,
[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
Streaming Protocol (RTSP)", RFC 2326,
DOI 10.17487/RFC2326, April 1998,
[RFC2588] Finlayson, R., "IP Multicast and Firewalls", RFC 2588,
DOI 10.17487/RFC2588, May 1999,
[RFC7362] Ivov, E., Kaplan, H., and D. Wing, "Latching: Hosted NAT
Traversal (HNT) for Media in Real-Time Communication",
RFC 7362, DOI 10.17487/RFC7362, September 2014,
[RTP-NO-OP] Andreasen, F., "A No-Op Payload Format for RTP", Work in
Progress, draft-ietf-avt-rtp-no-op-04, May 2007.
[RTSP] Schulzrinne, H., Rao, A., Lanphier, R., Westerlund, M.,
and M. Stiemerling, "Real Time Streaming Protocol 2.0
(RTSP)", Work in Progress,
draft-ietf-mmusic-rfc2326bis-40, February 2014.
[RTSP-NAT] Goldberg, J., Westerlund, M., and T. Zeng, "A Network
Address Translator (NAT) Traversal Mechanism for Media
Controlled by Real-Time Streaming Protocol (RTSP)", Work
in Progress, draft-ietf-mmusic-rtsp-nat-22, July 2014.
[STUN-IMPL] "Open Source STUN Client and Server", May 2013,
The authors would also like to thank all persons on the MMUSIC
working group's mailing list that have commented on this document.
Persons having contributed to this protocol, in no special order,
are: Jonathan Rosenberg, Philippe Gentric, Tom Marshall, David Yon,
Amir Wolf, Anders Klemets, Flemming Andreasen, Ari Keranen, Bill
Atwood, Alissa Cooper, Colin Perkins, Sarah Banks, David Black, and
Alvaro Retana. Thomas Zeng would also like to give special thanks to
Greg Sherwood of PacketVideo for his input into this memo.
Section 1.1 contains text originally written for RFC 4787 by Francois
Audet and Cullen Jennings.
Stockholm SE-164 80
Phone: +46 8 719 0000