Internet Engineering Task Force (IETF) C. Boulton Request for Comments: 6314 NS-Technologies Category: Informational J. Rosenberg ISSN: 2070-1721 Skype G. Camarillo Ericsson F. Audet Skype July 2011 NAT Traversal Practices for Client-Server SIP
AbstractTraversal of the Session Initiation Protocol (SIP) and the sessions it establishes through Network Address Translators (NATs) is a complex problem. Currently, there are many deployment scenarios and traversal mechanisms for media traffic. This document provides concrete recommendations and a unified method for NAT traversal as well as documents corresponding flows. Status of This Memo This document is not an Internet Standards Track specification; it is published for informational purposes. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6314. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 4 4. Solution Technology Outline Description . . . . . . . . . . . 8 4.1. SIP Signaling . . . . . . . . . . . . . . . . . . . . . . 8 4.1.1. Symmetric Response . . . . . . . . . . . . . . . . . . 8 4.1.2. Client-Initiated Connections . . . . . . . . . . . . . 9 4.2. Media Traversal . . . . . . . . . . . . . . . . . . . . . 10 4.2.1. Symmetric RTP/RTCP . . . . . . . . . . . . . . . . . . 10 4.2.2. RTCP . . . . . . . . . . . . . . . . . . . . . . . . . 10 4.2.3. STUN/TURN/ICE . . . . . . . . . . . . . . . . . . . . 11 5. NAT Traversal Scenarios . . . . . . . . . . . . . . . . . . . 12 5.1. Basic NAT SIP Signaling Traversal . . . . . . . . . . . . 12 5.1.1. Registration (Registrar/Edge Proxy Co-Located) . . . . 12 5.1.2. Registration(Registrar/Edge Proxy Not Co-Located) . . 16 5.1.3. Initiating a Session . . . . . . . . . . . . . . . . . 19 5.1.4. Receiving an Invitation to a Session . . . . . . . . . 22 5.2. Basic NAT Media Traversal . . . . . . . . . . . . . . . . 27 5.2.1. Endpoint-Independent NAT . . . . . . . . . . . . . . . 28 5.2.2. Address/Port-Dependent NAT . . . . . . . . . . . . . . 48 6. IPv4-IPv6 Transition . . . . . . . . . . . . . . . . . . . . . 57 6.1. IPv4-IPv6 Transition for SIP Signaling . . . . . . . . . . 57 7. Security Considerations . . . . . . . . . . . . . . . . . . . 57 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 57 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 58 9.1. Normative References . . . . . . . . . . . . . . . . . . . 58 9.2. Informative References . . . . . . . . . . . . . . . . . . 59
RFC3261] and its associated media such as the Real-time Transport Protocol (RTP) [RFC3550]. The problem is exacerbated by the variety of NATs that are available in the marketplace today and the large number of potential deployment scenarios. Details of different NATs behavior can be found in "NAT Behavioral Requirements for Unicast UDP" [RFC4787]. The IETF has been active on many specifications for the traversal of NATs, including Session Traversal Utilities for NAT (STUN) [RFC5389], Interactive Connectivity Establishment (ICE) [RFC5245], symmetric response [RFC3581], symmetric RTP [RFC4961], Traversal Using Relay NAT (TURN) [RFC5766], SIP Outbound [RFC5626], the Session Description Protocol (SDP) attribute for RTP Control Protocol (RTCP) [RFC3605], "Multiplexing RTP Data and Control Packets on a Single Port" [RFC5761], and others. Each of these represents a part of the solution, but none of them gives the overall context for how the NAT traversal problem is decomposed and solved through this collection of specifications. This document serves to meet that need. It should be noted that this document intentionally does not invoke 'Best Current Practice' machinery as defined in RFC 2026 [RFC2026]. The document is split into two distinct sections as follows: o Section 4 provides a definitive set of best common practices to demonstrate the traversal of SIP and its associated media through NAT devices. o Section 5 provides non-normative examples representing interactions of SIP using various NAT type deployments. The document does not propose any new functionality but does draw on existing solutions for both core SIP signaling and media traversal (as defined in Section 4). The best practices described in this document are for traditional "client-server"-style SIP. This term refers to the traditional use of the SIP protocol where User Agents talk to a series of intermediaries on a path to connect to a remote User Agent. It seems likely that other groups using SIP, for example, peer-to-peer SIP (P2PSIP), will recommend these same practices between a P2PSIP client and a P2PSIP peer, but will recommend different practices for use between peers in a peer-to-peer network.
RFC2119]. It should be noted that the use of the term 'Endpoint-Independent NAT' in this document refers to a NAT that is both Endpoint- Independent Filtering and Endpoint-Independent Mapping per the definitions in RFC 4787 [RFC4787]. RFC4787] Section 7, RFC 3424 [RFC3424], and [RFC5245] Section 18.6), and experience shows they can have an adverse impact on the functionality of SIP. This includes problems such as requiring the media and signaling to traverse the same device and not working with encrypted signaling and/or payload. The use of non-TURN-based media intermediaries is not considered in this document. More information can be obtained from [RFC5853] and [MIDDLEBOXES]. The core SIP signaling has a number of issues when traversing through NATs. SIP response routing over UDP as defined in RFC 3261 [RFC3261] without extensions causes the response to be delivered to the source IP address specified in the topmost Via header, or the 'received' parameter of the topmost 'Via' header. The port is extracted from the SIP 'Via' header to complete the IP address/port combination for returning the SIP response. While the destination for the response is correct, the port contained in the SIP 'Via' header represents the listening port of the originating client and not the port representing the open pinhole on the NAT. This results in responses being sent back to the NAT but to a port that is likely not open for SIP traffic. The SIP response will then be dropped at the NAT. This is illustrated in Figure 1, which depicts a SIP response being returned to port 5060.
Private NAT Public Network | Network | | -------- SIP Request |open port 10923 -------- | |-------------------->--->-----------------------| | | | | | | | Client | |port 5060 SIP Response | Proxy | | | x<------------------------| | | | | | | -------- | -------- | | | Figure 1: Failed Response Secondly, there are two cases where new requests reuse existing connections. The first is when using a reliable, connection-oriented transport protocol such as TCP, SIP has an inherent mechanism that results in SIP responses reusing the connection that was created/used for the corresponding transactional request. The SIP protocol does not provide a mechanism that allows new requests generated in the reverse direction of the originating client to use, for example, the existing TCP connection created between the client and the server during registration. This results in the registered contact address not being bound to the "connection" in the case of TCP. Requests are then blocked at the NAT, as illustrated in Figure 2. The second case is when using an unreliable transport protocol such as UDP where external NAT mappings need to be reused to reach a SIP entity on the private side of the network. Private NAT Public Network | Network | | -------- (UAC 8023) REGISTER/Response (UAS 5060) -------- | |-------------------->---<-----------------------| | | | | | | | Client | |5060 INVITE (UAC 8015)| Proxy | | | x<------------------------| | | | | | | -------- | -------- | | | Figure 2: Failed Request
In Figure 2, the original REGISTER request is sent from the client on port 8023 and received by the proxy on port 5060, establishing a connection and opening a pinhole in the NAT. The generation of a new request from the proxy results in a request destined for the registered entity (contact IP address) that is not reachable from the public network. This results in the new SIP request attempting to create a connection to a private network address. This problem would be solved if the original connection were reused. While this problem has been discussed in the context of connection-oriented protocols such as TCP, the problem exists for SIP signaling using any transport protocol. The impact of connection reuse of connection-oriented transports (TCP, TLS, etc.) is discussed in more detail in the connection reuse specification [RFC5923]. The approach proposed for this problem in Section 4 of this document is relevant for all SIP signaling in conjunction with connection reuse, regardless of the transport protocol. NAT policy can dictate that connections should be closed after a period of inactivity. This period of inactivity may vary from a number of seconds to hours. SIP signaling cannot be relied upon to keep connections alive for the following two reasons. Firstly, SIP entities can sometimes have no signaling traffic for long periods of time, which has the potential to exceed the inactivity timer, and this can lead to problems where endpoints are not available to receive incoming requests as the connection has been closed. Secondly, if a low inactivity timer is specified, SIP signaling is not appropriate as a keep-alive mechanism as it has the potential to add a large amount of traffic to the network, which uses up valuable resources and also requires processing at a SIP stack, which is also a waste of processing resources. Media associated with SIP calls also has problems traversing NAT. RTP [RFC3550] runs over UDP and is one of the most common media transport types used in SIP signaling. Negotiation of RTP occurs with a SIP session establishment using the Session Description Protocol (SDP) [RFC4566] and a SIP offer/answer exchange [RFC3264]. During a SIP offer/answer exchange, an IP address and port combination are specified by each client in a session as a means of receiving media such as RTP. The problem arises when a client advertises its address to receive media and it exists in a private network that is not accessible from outside the NAT. Figure 3 illustrates this problem.
NAT Public Network NAT | | | | | | -------- | SIP Signaling Session | -------- | |---------------------->Proxy<-------------------| | | | | | | | | Client | | | | Client | | A |>=====>RTP>==Unknown Address==>X | | B | | | | X<==Unknown Address==<RTP<===<| | -------- | | -------- | | | | | | Figure 3: Failed Media The connection addresses of the clients behind the NATs will nominally contain a private IPv4 address that is not routable across the public Internet. Exacerbating matters even more would be the tendency of Client A to send media to a destination address it received in the signaling confirmation message -- an address that may actually correspond to a host within the private network who is suddenly faced with incoming RTP packets (likewise, Client B may send media to a host within its private network who did not solicit these packets). Finally, to complicate the problem even further, a number of different NAT topologies with different default behaviors increases the difficulty of arriving at a unified approach. This problem exists for all media transport protocols that might be NATted (e.g., TCP, UDP, the Stream Control Transmission Protocol (SCTP), the Datagram Congestion Control Protocol (DCCP)). In general, the problems associated with NAT traversal can be categorized as follows. For signaling: o Responses do not reuse the NAT mapping and filtering entries created by the request. o Inbound requests are filtered out by the NAT because there is no long-term connection between the client and the proxy.
For media: o Each endpoint has a variety of addresses that can be used to reach it (e.g., native interface address, public NATted address). In different situations, a different pair of (local endpoint, remote endpoint) addresses should be used, and it is not clear when to use which pair. o Many NATs filter inbound packets if the local endpoint has not recently sent an outbound packet to the sender. o Classic RTCP usage is to run RTCP on the next highest port. However, NATs do not necessarily preserve port adjacency. o Classic RTP and RTCP usage is to use different 5-tuples for traffic in each direction. Though not really a problem, doing this through NATs is more work than using the same 5-tuple in both directions. Section 3 of this document. The remaining sub-sections describe appropriate solutions that result in SIP signaling traversal through NATs, regardless of transport protocol. It is advised that SIP-compliant entities follow the guidelines presented in this section to enable traversal of SIP signaling through NATs. Section 3 of this document, when using an unreliable transport protocol such as UDP, SIP responses are sent to the IP address and port combination contained in the SIP 'Via' header field (or default port for the appropriate transport protocol if not present). Figure 4 illustrates the response traversal through the open pinhole using Symmetric techniques defined in RFC 3581 [RFC3581].
Private NAT Public Network | Network | | -------- | -------- | | | | | | |send request---------------------------------->| | | Client |<---------------------------------send response| SIP | | A | | | Proxy | | | | | | -------- | -------- | | | Figure 4: Symmetric Response The outgoing request from Client A opens a pinhole in the NAT. The SIP Proxy would normally respond to the port available in the SIP 'Via' header, as illustrated in Figure 1. The SIP Proxy honors the 'rport' parameter in the SIP 'Via' header and routes the response to the port from which it was sent. The exact functionality for this method of response traversal is called 'Symmetric Response', and the details are documented in RFC 3581 [RFC3581]. Additional requirements are imposed on SIP entities in RFC 3581 [RFC3581] such as listening and sending SIP requests/responses from the same port. Section 3 and illustrated in Figure 2, is to allow incoming requests to be properly routed. Guidelines for devices such as User Agents that can only generate outbound connections through NATs are documented in "Managing Client- Initiated Connections in the Session Initiation Protocol (SIP)" [RFC5626]. The document provides techniques that use a unique User Agent instance identifier (instance-id) in association with a flow identifier (reg-id). The combination of the two identifiers provides a key to a particular connection (both UDP and TCP) that is stored in association with registration bindings. On receiving an incoming request to a SIP Address-Of-Record (AOR), a proxy/registrar routes to the associated flow created by the registration and thus a route through NATs. It also provides a keep-alive mechanism for clients to keep NAT bindings alive. This is achieved by multiplexing a ping- pong mechanism over the SIP signaling connection (STUN for UDP and
CRLF/operating system keepalive for reliable transports like TCP). Usage of [RFC5626] is RECOMMENDED. This mechanism is not transport specific and should be used for any transport protocol. Even if the SIP Outbound mechanism is not used, clients generating SIP requests SHOULD use the same IP address and port (i.e., socket) for both transmission and receipt of SIP messages. Doing so allows for the vast majority of industry provided solutions to properly function (e.g., NAT traversal that is Session Border Control (SBC) hosted). Deployments should also consider the mechanism described in the Connection Reuse [RFC5923] specification for routing bidirectional messages securely between trusted SIP Proxy servers. Section 3 of this document is that internal IP address/port combinations cannot be reached from the public side of NATs. In the case of media such as RTP, this will result in no audio traversing NATs (as illustrated in Figure 3). To overcome this problem, a technique called 'Symmetric RTP/RTCP' [RFC4961] can be used. This involves a SIP endpoint both sending and receiving RTP/RTCP traffic from the same IP address/port combination. When operating behind a NAT and using the 'latching' technique described in [MIDDLEBOXES], SIP User Agents MUST implement Symmetric RTP/RTCP. This allows traversal of RTP across the NAT. RFC3550] is for consecutive-order numbering (i.e., select an incremented port for RTCP from that used for RTP). This assumption causes RTCP traffic to break when traversing certain types of NATs due to various reasons (e.g., already allocated port, randomized port allocation). To combat this problem, a specific address and port need to be specified in the SDP rather than relying on such assumptions. RFC 3605 [RFC3605] defines an SDP attribute that is included to explicitly specify transport connection information for RTCP so a separate, explicit NAT binding can be set up for the purpose. The address details can be obtained using any appropriate method including those detailed in this section (e.g., STUN, TURN, ICE).
A further enhancement to RFC 3605 [RFC3605] is defined in [RFC5761], which specifies 'muxing' both RTP and RTCP on the same IP/PORT combination. RFC5389]. STUN is a lightweight tool kit and protocol that provides details of the external IP address/port combination used by the NAT device to represent the internal entity on the public facing side of NATs. On learning of such an external representation, a client can use it accordingly as the connection address in SDP to provide NAT traversal. Using terminology defined in "NAT Behavioral Requirements for Unicast UDP" [RFC4787], STUN does work with Endpoint-Independent Mapping but does not work with either Address-Dependent Mapping or Address and Port-Dependent Mapping type NATs. Using STUN with either of the previous two NAT mappings to probe for the external IP address/port representation will provide a different result to that required for traversal by an alternative SIP entity. The IP address/ port combination deduced for the STUN server would be blocked for RTP packets from the remote SIP User Agent. As mentioned in Section 4.1.2, STUN is also used as a client-to- server keep-alive mechanism to refresh NAT bindings. Section 18.104.22.168, the STUN protocol does not work for UDP traversal through certain identified NAT mappings. 'Traversal Using Relays around NAT' is a usage of the STUN protocol for deriving (from a TURN server) an address that will be used to relay packets towards a client. TURN provides an external address (globally routable) at a TURN server that will act as a media relay that attempts to allow traffic to reach the associated internal address. The full details of the TURN specification are defined in [RFC5766]. A TURN service will almost always provide media traffic to a SIP entity, but it is RECOMMENDED that this method would only be used as a last resort and not as a general mechanism for NAT traversal. This is because using TURN has high performance costs when relaying media traffic and can lead to unwanted latency.
RFC3424] to provide a unified solution. This is achieved by obtaining as many representative IP address/port combinations as possible using technologies such as STUN/TURN (note: an ICE endpoint can also use other mechanisms (e.g., the NAT Port Mapping Protocol [NAT-PMP], Universal Plug and Play Internet Gateway Device [UPnP-IGD]) to learn public IP addresses and ports, and populate a=candidate lines with that information). Once the addresses are accumulated, they are all included in the SDP exchange in a new media attribute called 'candidate'. Each candidate SDP attribute entry has detailed connection information including a media address, priority, and transport protocol. The appropriate IP address/port combinations are used in the order specified by the priority. A client compliant to the ICE specification will then locally run STUN servers on all addresses being advertised using ICE. Each instance will undertake connectivity checks to ensure that a client can successfully receive media on the advertised address. Only connections that pass the relevant connectivity checks are used for media exchange. The full details of the ICE methodology are in [RFC5245].