Internet Engineering Task Force (IETF) R. Bush, Ed. Request for Comments: 6346 Internet Initiative Japan Category: Experimental August 2011 ISSN: 2070-1721 The Address plus Port (A+P) Approach to the IPv4 Address Shortage
AbstractWe are facing the exhaustion of the IANA IPv4 free IP address pool. Unfortunately, IPv6 is not yet deployed widely enough to fully replace IPv4, and it is unrealistic to expect that this is going to change before the depletion of IPv4 addresses. Letting hosts seamlessly communicate in an IPv4 world without assigning a unique globally routable IPv4 address to each of them is a challenging problem. This document proposes an IPv4 address sharing scheme, treating some of the port number bits as part of an extended IPv4 address (Address plus Port, or A+P). Instead of assigning a single IPv4 address to a single customer device, we propose to extend the address field by using bits from the port number range in the TCP/UDP header as additional endpoint identifiers, thus leaving a reduced range of ports available to applications. This means assigning the same IPv4 address to multiple clients (e.g., Customer Premises Equipment (CPE), mobile phones), each with its assigned port range. In the face of IPv4 address exhaustion, the need for addresses is stronger than the need to be able to address thousands of applications on a single host. If address translation is needed, the end-user should be in control of the translation process -- not some smart boxes in the core.
Status of This Memo This document is not an Internet Standards Track specification; it is published for examination, experimental implementation, and evaluation. This document defines an Experimental Protocol for the Internet community. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6346. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1. Problems with Carrier Grade NATs . . . . . . . . . . . . . 4 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Design Constraints and Functions . . . . . . . . . . . . . . . 6 3.1. Design Constraints . . . . . . . . . . . . . . . . . . . . 6 3.2. A+P Functions . . . . . . . . . . . . . . . . . . . . . . 7 3.3. Overview of the A+P Solution . . . . . . . . . . . . . . . 8 3.3.1. Signaling . . . . . . . . . . . . . . . . . . . . . . 9 3.3.2. Address Realm . . . . . . . . . . . . . . . . . . . . 11 3.3.3. Reasons for Allowing Multiple A+P Gateways . . . . . . 15 3.3.4. Overall A+P Architecture . . . . . . . . . . . . . . . 16 3.4. A+P Experiments . . . . . . . . . . . . . . . . . . . . . 17 4. Stateless A+P Mapping Function . . . . . . . . . . . . . . . . 18 4.1. Stateless A+P Mapping (SMAP) Gateway Function Description . . . . . . . . . . . . . . . . . . . . . . . 18 4.2. Implementation Mode . . . . . . . . . . . . . . . . . . . 20 4.3. Towards IPv6-Only Networks . . . . . . . . . . . . . . . . 22 4.4. PRR: On Stateless and Binding Table Modes . . . . . . . . 22 4.5. General Recommendations on SMAP . . . . . . . . . . . . . 23 5. Deployment Scenarios . . . . . . . . . . . . . . . . . . . . . 24 5.1. A+P Deployment Models . . . . . . . . . . . . . . . . . . 24 5.1.1. A+P for Broadband Providers . . . . . . . . . . . . . 24 5.1.2. A+P for Mobile Providers . . . . . . . . . . . . . . . 24 5.1.3. A+P from the Provider Network Perspective . . . . . . 25 5.2. Dynamic Allocation of Port Ranges . . . . . . . . . . . . 27 5.3. Example of A+P-Forwarded Packets . . . . . . . . . . . . . 28 5.3.1. Forwarding of Standard Packets . . . . . . . . . . . . 32 5.3.2. Handling ICMP . . . . . . . . . . . . . . . . . . . . 32 5.3.3. Fragmentation . . . . . . . . . . . . . . . . . . . . 33 5.3.4. Limitations of the A+P Approach . . . . . . . . . . . 33 5.3.5. Port Allocation Strategy Agnostic . . . . . . . . . . 34 6. Security Considerations . . . . . . . . . . . . . . . . . . . 34 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 35 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 35 8.1. Normative References . . . . . . . . . . . . . . . . . . . 35 8.2. Informative References . . . . . . . . . . . . . . . . . . 35 9. Contributing Authors . . . . . . . . . . . . . . . . . . . . . 37
Today's NATs are typically mitigated by offering the customers limited control over them, e.g., port forwarding, Universal Plug and Play or the NAT Port Mapping Protocol (UPnP/NAT-PMP). However, this is not expected to work with CGNs. CGN proposals -- other than DS-Lite [RFC6333] with A+P or the Port Control Protocol (PCP) [PCP-BASE] -- admit that it is not expected that applications that require specific port assignment or port mapping from the NAT box will keep working. Another issue with CGNs is the trade-off between session state and network placement. The farther from the edge the CGN is placed, the more session state needs to be kept due to larger subscriber aggregation and the more disruption that occurs in the case of a failure. In order to reduce the state, CGNs would end up somewhere closer to the edge. Thus, the CGN trades scalability for the amount of state that needs to be kept, which makes optimally placing a CGN a hard engineering problem. In some deployment scenarios, a CGN can be seen as the single point of failure, and therefore the availability of delivered services can be impacted by a single CGN device. Means to ensure state synchronization and failover would be required to allow for service continuity whenever a failure occurs. Intra-domain paths may not be optimal for communications between two nodes connected to the same domain deploying CGNs; they may lead to path stretches. RFC2119].
Private Realm: This realm contains IPv4 addresses that are not globally routed. They may be taken from the [RFC1918] range. However, this document does not make such an assumption. We regard as private address space any IPv4 address that needs to be translated in order to gain global connectivity, irrespective of whether or not it falls in [RFC1918] space. Port-Range Router (PRR): A device that forwards A+P packets. Customer Premises Equipment (CPE): cable or DSL modem. Provider Edge (PE) Router: Customer aggregation router. Provider Border Router (BR): Provider's edge to other providers. Network Core Routers (Core): Provider routers that are not at the edges.
4) Efficiency versus complexity: Operators should have the flexibility to trade off port multiplexing efficiency and scalability and end-to-end transparency. 5) "Double-NAT" should be avoided: Multiple gateway devices might be present in a path, and once one has done some translation, those packets should not be retranslated. 6) Legal traceability: ISPs must be able to provide the identity of a customer from the knowledge of the IPv4 public address and the port. This should have as low an impact as is reasonable on storage by the ISP. We assume that NATs on customer premises do not pose much of a problem, while provider NATs need to keep additional logs. 7) IPv6 deployment should be encouraged. NAT444 strongly biases the users to the deployment of RFC 1918 addressing. Constraint 5 is important: while many techniques have been deployed to allow applications to work through a NAT, traversing cascaded NATs is crucial if NATs are being deployed in the core of a provider network. BCP38]). Based on the result of such a check, the packet MAY be forwarded untranslated, MAY be discarded, or MAY be NATed. In this document, we refer to a device that provides this encaps/decaps functionality as a Port-Range Router (PRR). Network Address Translation (NAT) function: is used to connect legacy end-hosts. Unless upgraded, end-hosts or end-systems are not aware of A+P restrictions and therefore assume a full IP address. The NAT function performs any address or port translation, including Application Level Gateways (ALGs) whenever required. The state that has to be kept to implement this function is the mapping for which external addresses and ports have been mapped to which internal addresses and ports, just as in CPEs embedding NAT today. A subtle,
but very important, difference should be noted here: the customer has control over the NATing process or might choose to "bypass" the NAT. If this is done, we call the NAT a Large-Scale NAT (LSN). However, if the NAT does NOT allow the customer to control the translation process, we call it a CGN. Signaling function: is used to allow A+P-aware devices to get to know which ports are assigned to be passed through untranslated and what will happen to packets outside the assigned port range (e.g., could be NATed or discarded). Signaling may also be used to learn the encapsulation method and any endpoint information needed. In addition, the signaling function may be used to dynamically assign the requested port range. Figure 1) is that the translator might only be able to use a restricted range of ports when mapping multiple internal addresses onto an external one, e.g., the external address realm might be port- restricted. "internal-side" "external-side" +-----+ internal | N | external address <---| A |---> address realm | T | realm +-----+ Figure 1: Traditional NAT The encaps/decaps function, on the other hand, is the ability to establish a tunnel with another endpoint providing the same function. This implies some form of signaling to establish a tunnel. Such signaling can be viewed as integrated with DHCP or as a separate service. Section 3.3.1 discusses the constraints of this signaling function. The tunnel can be an IPv6 or IPv4 encapsulation, a layer-2 tunnel, or some other form of softwire. Note that the presence of a tunnel allows unmodified, naive, or even legacy devices between the two endpoints.
Two or more devices that provide the encaps/decaps function are linked by tunnels to form an A+P subsystem. The function of each gateway is to encapsulate and decapsulate, respectively. Figure 2 depicts the simplest possible A+P subsystem, that is, two devices providing the encaps/decaps function. +------------------------------------+ Private | +----------+ tunnel +----------+ | Public address --|-| gateway |==========| gateway |-|-- address realm | +----------+ +----------+ | realm +------------------------------------+ A+P subsystem Figure 2: A Simple A+P Subsystem Within an A+P subsystem, the public address realm is extended by using bits from the port number when forwarding packets. Each device is assigned one address from the external realm and a range of port numbers. Hence, devices that are part of an A+P subsystem can communicate with the public realm without the need for address translation (i.e., preserving end-to-end packet integrity): an A+P packet originated from within the A+P subsystem can be simply forwarded over tunnels up to the endpoint, where it gets decapsulated and routed in the external realm. PR-ADDR-ASSIGN], [SHARED-ADDR-OPT], [PORT-RANGE-OPT], or [PCP-BASE]. The information that needs to be shared is the following: o a set of public IPv4 addresses, o for each IPv4 address, a starting point for the allocated port range, o the number of delegated ports, o the optional key that enables partial or full preservation of entropy in port randomization -- see [PR-ADDR-ASSIGN], o the lifetime for each IPv4 address and set of allocated ports, o the tunneling technology to be used (e.g., "IPv6-encapsulation"),
o addresses of the tunnel endpoints (e.g., IPv6 address of tunnel endpoints), o whether or not NAT function is provided by the gateway, o a device identification number and some authentication mechanisms, and o a version number and some reserved bits for future use. Note that the functions of encapsulation and decapsulation have been separated from the NAT function. However, to accommodate legacy hosts, NATing is likely to be provided at some point in the path; therefore, the availability or absence of NATing MUST be communicated in signaling, as A+P is agnostic about NAT placement. The port ranges can be allocated in two different ways: o If applications or end-hosts behind the CPE are not UPnPv2/ NAT-PMP-aware, then the CPE SHOULD request ports via mechanisms, e.g., as described in [PR-ADDR-ASSIGN] and [PORT-RANGE-OPT]. Note that different port ranges can have different lifetimes, and the CPE is not entitled to use them after they expire -- unless it refreshes those ranges. It is up to the ISP to put mechanisms in place (to prevent denial-of-service attacks) that determine what percentage of already allocated port ranges should be exhausted before a CPE may request additional ranges, how often the CPE can request additional ranges, and so on. o If applications behind the CPE are UPnPv2/NAT-PMP-aware, additional ports MAY be requested through that mechanism. In this case, the CPE should forward those requests to the LSN, and the LSN should reply reporting if the requested ports are available or not (and if they are not available, some alternatives should be offered). Here again, to prevent potential denial-of-service attacks, mechanisms should be in place to prevent UPnPv2/NAT-PMP packet storms and fast port allocation. A detailed description of this mechanism, called PCP, is in [PCP-BASE]. Whatever signaling mechanism is used inside the tunnels -- DHCP, IP Control Protocol (IPCP), or PCP based, synchronization between the signaling server and PRR must be established in both directions. For example, if we use DHCP as the signaling mechanism, the PRR must communicate to the DHCP server at least its IP range. The DHCP server then starts to allocate IP addresses and port ranges to CPEs and communicates back to the PRR which IP and port range have been allocated to which CPE, so the PRR knows to which tunnel to redirect incoming traffic. In addition, DHCP MUST also communicate lifetimes
of port ranges assigned to CPE via the PRR. DHCP server may be co- located with the PRR function to ease address management and also to avoid the need to introduce a communication protocol between the PRR and DHCP. If UPnPv2/NAT-PMP is used as the dynamic port allocation mechanism, the PRR must also communicate to the DHCP (or IPCP) server to avoid those ports. The PRR must somehow (e.g., using DHCP or IPCP options) communicate back to CPE that the allocation of ports was successful, so CPE adds those ports to existing port ranges. Note that operation can be even simplified if a fixed length of port ranges is assigned to all customers and no differentiation is implemented based on port-range length. In such case, the binding table maintained by the PRR can be dynamically built upon the receipt of a first packet from a port-restricted device.
delegate a portion of its port range to B. In addition, A will also negotiate the encaps/decaps function with B (e.g., let B know the address of the decaps device at the endpoint of the tunnel). This could be implemented, for example, via a NAT-PMP- or DHCP-like solution. In general, the following rule applies: a sub-portion of the managed A+P address space is delegated as long as devices below ask for it; otherwise, private IPv4 is provided to support legacy hosts. The following examples use an IPv4 address from the blocks reserved for documentation as defined in [RFC5737]. private +-----+ +-----+ public address ---| B |==========| A |--- Internet realm +-----+ +-----+ Address space realm of A: public IPv4 address = 192.0.2.1 port range = 0-65535 Address space realm of B: public IPv4 address = 192.0.2.1 port range = 2560-3071 Figure 3: Configuration Example Figure 3 illustrates a sample configuration. Note that A might actually consist of three different devices: one that handles signaling requests from B; one that performs encapsulation and decapsulation; and, if provided, one device that performs the NATing function (e.g., an LSN). Packet forwarding is assumed to be as follows: in the "outbound" case, a packet arrives from the private address realm to B. As stated above, B has two options: it can either apply or not apply the NAT function. The decision depends upon the specific configuration and/or the capabilities of A and B. Note that NAT functionality is required to support legacy hosts; however, this can be done at either of the two devices A or B. The term "NAT" refers to translating the packet into the managed A+P address (B has address 192.0.2.1 and ports 2560-3071 in the example above). We then have two options: 1) B NATs the packet. The translated packet is then tunneled to A. A recognizes that the packet has already been translated because the source address and port match the delegated space. A decapsulates the packet and releases it in the public Internet.
2) B does not NAT the packet. The untranslated packet is then tunneled to A. A recognizes that the packet has not been translated, so A forwards the packet to a co-located NATing device, which translates the packet and routes it in the public Internet. This device, e.g., an LSN, has to store the mapping between the source port used to NAT and the tunnel where the packet came from, in order to correctly route the reply. Note that A cannot use a port number from the range that has been delegated to B. As a consequence, A has to assign a part of its non-delegated address space to the NATing function. "Inbound" packets are handled in the following way: a packet from the public realm arrives at A. A analyzes the destination port number to understand whether or not the packet needs to be NATed. 1) If the destination port number belongs to the range that A delegated to B, then A tunnels the packet to B. B NATs the packet using its stored mapping and forwards the translated packet to the private domain. 2) If the destination port number is from the address space of the LSN, then A passes the packet on to the co-located LSN, which uses its stored mapping to NAT the packet into the private address realm of B. The appropriate tunnel is stored as well in the mapping of the initial NAT. The LSN then encapsulates the packet to B, which decapsulates it and normally routes it within its private realm. 3) Finally, if the destination port number falls in neither a delegated range nor the address range of the LSN, A discards the packet. If the packet is passed to the LSN, but no mapping can be found, the LSN discards the packet. Observe that A must be able to receive all IPv4 packets destined to the public IPv4 address (192.0.2.1 in the example), so that it can make routing decisions according to the port number. On the other hand, B receives IPv4 packets destined to the public IPv4 address only via the established tunnel with A. In other words, B uses the public IPv4 address just for translation purposes, but it is not used to make routing decisions. This allows us to keep the routing logic at B as simple as described above, while enabling seamless communication between A+P devices sharing the same public IPv4 address.
private +-----+ +-----+ public address ---| B |==========| A |--- Internet realm 1 +-----+ +-----+ | private +-----+ | address ---| C |============/ realm 2 +-----+ Address space realm of A: public IPv4 address = 192.0.2.1 port range = 0-65535 Address space realm of B: public IPv4 address = 192.0.2.1 port range = 2560-3071 Address space realm of C: public IPv4 address = 192.0.2.1 port range = 0-2559 Figure 4: Hierarchical A+P Consider the example shown in Figure 4. Here, both B and C use the encaps/decaps function to establish a tunnel with A, and they are assigned the same public IPv4 address with different, non-overlapping port ranges. Assume that a host in B's private realm sends a packet destined to address 192.0.2.1 and port 2000, and that B has been instructed to NAT all packets destined to 192.0.2.1. Under these assumptions, B receives the packet and NATs it using its own public IPv4 address (192.0.2.1) and a port selected from its configured port range (e.g., 3000). B then tunnels the translated packet to A. When A receives the packet via the tunnel, it looks at the destination address and port, recognizes C's delegated range, and then tunnels the packet to C. Observe that, apart from stripping the tunnel header, A handles the packet as if it came from the public Internet. When C receives the packet, it NATs the destination address into one address chosen from its private address realm, while keeping the source address (192.0.2.1) and port (3000) untranslated. Return traffic is handled the same way. Such a mechanism allows hosts behind A+P devices to communicate seamlessly even when they share the same public IPv4 address. Please refer to Section 4 for a discussion of an alternative A+P mechanism that does not incur path-stretch penalties for intra-domain communication.
Figure 5 shows an example where three devices are connected in a chain. In other words, A+P signaling can be used to extend end-to-end connectivity to the devices that are in an A+P subsystem. This allows A+P-aware applications (or OSes) running on end-hosts to enter an A+P subsystem and exploit untranslated connectivity. There are two modes for end-hosts to gain fine-grained control of end-to-end connectivity. The first is where actual end-hosts perform the NAT function and the encaps/decaps function that is required to join the A+P subsystem. This option works in a similar way to the NAT-in-the-host trick employed by virtualization software such as VMware, where the guest operating system is connected via a NAT to the host operating system. The second mode is when applications autonomously ask for an A+P address and use it to join the A+P subsystem. This capability is necessary for some applications that require end-to-end connectivity (e.g., applications that need to be contacted from outside). +---------+ +---------+ +---------+ internal | gateway | | gateway | | gateway | external realm --| 1 |======| 2 |======| 3 |-- realm +---------+ +---------+ +---------+ Figure 5: An A+P Subsystem with Multiple Devices Whatever the reasons might be, the Internet was built on a paradigm that end-to-end connectivity is important. A+P makes this still possible in a time where address shortage forces ISPs to use NATs at various levels. In that sense, A+P can be regarded as a way to bypass NATs.
+---+ (customer2) |A+P|-. +---+ +---+ \ NAT|A+P|-. \ +---+ | \ | forward if in range +---+ \+---+ +---+ / |A+P|------|A+P|----|A+P|---- +---+ /+---+ +---+ \ / NAT if necessary / (cust1) (prov. (e.g., provider NAT) +---+ / router) |A+P|-' +---+ Figure 6: A Complex A+P Subsystem Figure 6 depicts a complex scenario, where the A+P subsystem is composed of multiple devices organized in a hierarchy. Each A+P gateway decapsulates the packet and then re-encapsulates it again to the next tunnel. A packet can be NATed either when it enters the A+P subsystem, at intermediate devices, or when it exits the A+P subsystem. This could be, for example, a gateway installed within the provider's network, together with an LSN. Then, each customer operates its own CPE. However, behind the CPE, applications might also be A+P-aware and run their own A+P-gateways; this enables them to have end-to-end connectivity. One limitation applies when "delayed translation" is used (e.g., translation at the LSN instead of the CPE). If devices using "delayed translation" want to talk to each other, they SHOULD use A+P addresses or out-of-band addressing. A+P architecture IPv4 Full-A+P AFTR CGN | | | | <-- Full IPv4 ---- Port range ---- Port range ---- Provider ---> allocated & dynamic & LSN NAT ONLY allocation (NAT on CPE (No mechanism) (no NAT) (NAT on CPE) and on LSN) for customer to bypass CGN) Figure 7: A+P Overall Architecture
The A+P architecture defines various deployment options within an ISP. Figure 7 shows the spectrum of deployment options. On the far left is the common deployment method for broadband subscribers today, an IPv4 address unrestricted with full port range. Full-A+P refers to a port-range allocation from the ISP. The customer must operate an A+P-aware CPE device, and no NATing functionality is provided by the ISP. The Address Family Transition Router (AFTR), such as DS-Lite [RFC6333], is a hybrid. There is NAT present in the core (in this document, referred to as LSN), but the user has the option to "bypass" that NAT in one form or an other, for example, via A+P, NAT-PMP, etc. Finally, a service provider that only deploys CGN will place a NAT in the provider's core and does not allow the customer to "bypass" the translation process or modify ALGs on the NAT. The customer is provider-locked. Notice that all options (besides full IPv4) require some form of tunneling mechanism (e.g., 4in6) and a signaling mechanism (see Section 3.3.1). A+P-EXPERIMENTS]. As seen in that experiment, most tested applications are unaffected. There are problems with torrent protocol and applications, as the listening port is out of A+P port range and some UPnP may be required to make it work with A+P. Problems with BitTorrent have already been experienced in the wild by users trapped behind a non-UPnP-capable CPE. The current workaround for the end-user is to statically map ports, which can be done in the A+P scenario as well. BitTorrent tests and experiments in shared IP and port-range environments are well described in [BITTORRENT-ADDR-SHARING]. Conclusions in that document tell us that two limitations were experienced. The first occurred when two clients sharing the same IP address tried to simultaneously retrieve the SAME file located in a SINGLE remote peer. The second limitation occurred when a client tried to download a file located on several seeders, when those seeders shared the same IP address. Mutual file sharing between hosts having the same IP address has been checked. Indeed, machines having the same IP address can share files with no alteration compared to current IP architectures. Working implementations of A+P can be found in: o Internet Systems Consortium AFTR (http://www.isc.org/software/aftr),
o FT Orange opensource A+P (http://opensourceaplusp.weebly.com/) developed by Xiaoyu Zhao, Xiaohong Deng, Tao Zheng, and o 4rd (IPv4 Residual Deployment) from ipinfusion.com, which is stateless A+P. Section 4.1 of [RFC6052], the suffix part may enclose the port. The Stateless A+P Mapping (SMAP) gateway consists in two basic functions as described in Figure 8. 1. SMAP encapsulates an IPv4 packet, destined to a shared IPv4 address, in an IPv6 one. The IPv6 source address is constructed using an IPv4-embedded IPv6 address [RFC6052] from the IPv4 source address and port number plus the IPv6 prefix that has been provisioned to the node performing the SMAP function. The destination IPv6 address is constructed using the shared IPv4 destination address and port number plus the IPv6 prefix that has been provisioned to the SMAP function and that is dedicated to IPv4 destination addresses. 2. SMAP extracts IPv4 incoming packets from IPv6 incoming ones that have IPv6 source addresses belonging to the prefix of the node performing the SMAP function. Extracted IPv4 packets are then forwarded to the point identified by the IPv4 destination address and port number.
+-------------------+ | |----IPv6---\ ----IPv4---\| |----IPv4---\\ -----------/| |-----------// | |-----------/ | SMAP | | | /--IPv6----- /---IPv4----| |//---IPv4---- \-----------| |\\----------- | | \----------- +-------------------+ Figure 8: Stateless A+P Mapping Gateway Function An SMAP-enabled node will perform the stateless 6/4 mapping function for all public shared IPv4 addresses for which it was designated as a stateless 6/4 mapping gateway. To perform the stateless 6/4 mapping function, an SMAP gateway must: o be provided with an IPv6 prefix (i.e., Pref6). The SMAP gateway uses this prefix to construct IPv6 source addresses for all IPv4 shared addresses for which it was designated as an SMAP gateway. The IPv6 prefix may be provisioned statically or dynamically (e.g., DHCP). o be able to know the IPv6 prefix of the node serving as another SMAP gateway for IPv4 destination addresses. This prefix may be known in various ways: * Default or Well-Known Prefix (i.e., 64:ff9b::/96) that was provisioned statically or dynamically; * Retained at the reception of incoming IPv4-in-IPv6 encapsulated packets; * Discovered at the start of communication, thanks to mechanisms such as DNS resolution, for example. When the SMAP-enabled node receives IPv4 packets with IPv4 source addresses for which it was not designated as an smap gateway, it will not perform stateless 6/4 mapping function for those packets. Those packets will be handled in a classical way (i.e., forwarded, dropped, or locally processed).
When the SMAP-enabled node receives IPv6 packets with IPv6 addresses that do not match with its IPv6 prefix, it will not perform the stateless 6/4 mapping function for those packets. Those packets will be handled in a classical way (i.e., forwarded, dropped, or locally processed). +------+ +------+ | |----IPv6---\ | | ----IPv4---\| |----IPv4---\\| |----IPv4---\ -----------/| |-----------//| |-----------/ | |-----------/ | | | SMAP | | SMAP | | | /----IPv6---| | /---IPv4----| |//---IPv4----| |/---IPv4---- \-----------| |\\-----------| |\----------- | | \-----------| | +------+ +------+ node A node B Figure 9 Several deployment scenarios of the SMAP function may be envisaged in the context of port-range-based solutions: o An SMAP function is embedded in a port-restricted device. Other SMAP-enabled nodes are deployed in the boundaries between IPv6- enabled realms and IPv4 ones. This scenario may be deployed particularly for intra-domain communications so as to interconnect heterogeneous realms (i.e., IPv6/IPv4) within the same Autonomous System (AS).
o An SMAP function is embedded in a port-restricted device. Other SMAP-enabled nodes are deployed in the interconnection segment (with adjacent IPv4-only ones) of a given AS. This deployment scenario is more suitable for service providers targeting the deployment of IPv6 since it eases the migration to full IPv6. Core nodes are not required to continue to activate both IPv4 and IPv6 transfer capabilities. Other considerations regarding the interconnection of SMAP-enabled domains should be elaborated. The following provides a non- exhaustive list of interconnection schemes. o The interconnection of two domains implementing the SMAP function may be deployed via IPv4 Internet (Figure 10): this means that IPv4 packets encapsulated in IPv6 packets are transferred using IPv6 until reaching the first SMAP-enabled node. Then, these packets are decapsulated and are forwarded using IPv4 transfer capabilities. A remote SMAP-enabled node will receive those packets and proceed to an IPv4-in-IPv6 encapsulation. These packets are then routed normally until reaching the port- restricted devices that decapsulate the packets. +------+ +------+ +--------+ +------+ +------+ | |--IPv6--\ | | | | | |---IPv6--\ | | | |--IPv4--\\| |---|-IPv4---|--\| |---IPv4--\\| | | |--------//| |---|--------|--/| |---------//| | | |--------/ | | |Internet| | |---------/ | | | SMAP | | SMAP | | IPv4 | | SMAP | | SMAP | | | /--IPv6--| | | | | | /---IPv6--| | | |//--IPv4--| |/--|-IPv4---|---| |//--IPv4---| | | |\\--------| |\--|--------|---| |\\---------| | | | \--------| | | | | | \---------| | +------+ +------+ +--------+ +------+ +------+ Source node A node B Destination Figure 10: Interconnection Scenario 1 o A second scheme is to use IPv6 to interconnect two realms that implement the SMAP function (Figure 11). An IPv6 prefix (i.e., Pref6) assigned by IANA is used for this service. If appropriate routing configurations have been enforced, then the IPv6- encapsulated packets will be routed until the final destination. In order to implement this model, IPv4-inferred IPv6 prefixes are required to be injected in the IPv6 inter-domain routing tables.
+------+ +------------+ +------+ | | | | | | | |----IPv6-----|----IPv6----|----IPv6----\ | | | |----IPv4-----|------------|----IPv4----\\| | | |-------------|------------|------------//| | | |-------------|------------|------------/ | | | SMAP | | Internet v6| | SMAP | | | /-----IPv6--|------------|-----IPv6-----| | | |//---IPv4----|------------|-------IPv4---| | | |\\-----------|------------|--------------| | | | \-----------|------------|--------------| | | | | | | | +------+ +------------+ +------+ Source Destination Figure 11: Interconnection Scenario 2 Section 4) discusses two modes: the binding and the stateless modes. Dynamic port allocation is not a feature of the stateless mode, but it is supported in the binding mode. In the binding mode, distinct external IPv4 addresses may be used, but this is not recommended. o Stateless Mode Complete stateless mapping implies that the IPv4 address and the significant bits coding the port range are reflected inside the IPv6 prefix assigned to the port-restricted device. This can be achieved either by embedding the full IPv4 address and the significant bits in the IPv6 prefix or by applying an algorithmic approach. Two alternatives are offered when such a stateless mapping is to be enabled: - use the IPv6 prefix already used for native IPv6 traffic, or
- provide two prefixes to the port-restricted device: one for the native IPv6 traffic and one for the IPv4 traffic. Note that: - Providing two IPv6 prefixes has the advantages of allowing a /64 prefix for the port-restricted device along with another prefix (e.g., a /56 or /64) for native IPv6 traffic. This alternative allows the service provider to relate the native IPv6 traffic addressing plan to the IPv4 addressing plan. The drawback is having to allocate two prefixes to each port- restricted device and to route them. In addition, an address selection issue may be encountered. - Providing one prefix for both needs (e.g., a /56 or a /64) allows the service provider to handle two types of IPv6 prefix for the port-restricted device and in routing tables. But the drawback is that it strongly links the IPv4 addressing plan to the allocated IPv6 prefixes. As mentioned in Section 4.1 of [RFC6052], the suffix part may enclose the port. o Binding Table Mode Another alternative is to assign a "normal" IPv6 prefix to the port-restricted device and to use a binding table, which can be hosted by a service node to correlate the (shared IPv4 address, port range) with an IPv6 address part of the assigned IPv6 prefix. For scalability reasons, this table should be instantiated within PRR-enabled nodes that are close to the port-restricted devices. The number of required entries if hosted at the interconnection segment would be equal to the amount of subscribed users (one per port-restricted device). 4rd].
4. devices behind the phone, e.g., laptop computer connecting via phone to Internet We expect that the gateway has a pool of IPv4 addresses and is always in the data-path of the packets. Transport between the gateway and phone devices is assumed to be an end-to-end layer-2 tunnel. We assume that the phone as well as gateway can be upgraded to support A+P. However, some applications running on the phone or devices behind the phone (such as laptop computers connecting via the phone) are not expected to be upgraded. Again, while we do not expect that devices behind the phone will be A+P-aware or upgraded, we also do not want to hinder their evolution. In this sense, the mobile phone would be comparable to the CPE in the broadband provider case; it would be the gateway to the PRR/LSN box in the network of the broadband provider. +---------------------------+ private | +------+ A+P-in +-----+ | dual-stacked (RFC 1918) --|-| CPE |==-IPv6-==| PRR |-|-- network space | +------+ tunnel +-----+ | (public addresses) | ^ +-----+ | | | IPv6-only | LSN | | | | network +-----+ | +----+----------------- ^ --+ | | on customer within provider premises and control network Figure 12: A Simple A+P Subsystem Example Consider the deployment scenario in Figure 12, where an A+P subsystem is formed by the CPE and a PRR within the ISP core network and preferably is close to the customer edge. Inside the subsystem, packets are forwarded based on address and port. The provider MAY deploy an LSN co-located with the PRR to handle packets that have not been translated by the CPE. In such a configuration, the ISP allows the customer to freely decide whether the translation is done at the
CPE or at the LSN. In order to establish the A+P subsystem, the CPE will be configured automatically (e.g., via a signaling protocol that conforms to the requirements stated above). Note that the CPE in the example above is provisioned with only an IPv6 address on the external interface. +------------ IPv6-only transport ------------+ | +---------------+ | | | | |A+P-application| | +--------+ | +-----+ | dual-stacked | | on end-host |=|==| CPE w/ |==|==| PRR |-|-- network | +---------------+ | +--------+ | +-----+ | (public addresses) +---------------+ | +--------+ | +-----+ | private IPv4 <-*--+->| NAT | | | LSN | | address space \ | +--------+ | +-----+ | for legacy +|--------------|----------+ hosts | | | | end-host with | CPE device | provider upgraded | on customer | network application | premises | Figure 13: An Extended A+P Subsystem with End-Host Running A+P-Aware Applications Figure 13 shows an example of how an upgraded application running on a legacy end-host can connect to another host in the public realm. The legacy host is provisioned with a private IPv4 address allocated by the CPE. Any packet sent from the legacy host will be NATed either at the CPE (if configured to do so) or at the LSN (if available). An A+P-aware application running on the end-host MAY use the signaling described in Section 3.3.1 to connect to the A+P subsystem. In this case, the application will be delegated some space in the A+P address realm, and will be able to contact the public realm (i.e., the public Internet) without the need for translation. Note that part of A+P signaling is that the NATs are optional. However, if neither the CPE nor the PRR provides NATing functionality, then it will not be possible to connect legacy end- hosts. To enable packet forwarding with A+P, the ISP MUST install at its A+P border a PRR that encaps/decaps packets. However, to achieve a higher address space compression ratio and/or to support CPEs without NATing functionality, the ISP MAY decide to provide an LSN as well. If no LSN is installed in some part of the ISP's topology, all CPEs
in that part of the topology MUST support NAT functionality. For reasons of scalability, it is assumed that the PRR is located within the access portion of the network. The CPE would be configured automatically (e.g., via an extended DHCP or NAT-PMP, which has the signaling requirements stated above) with the address of the PRR and of the LSN (if one is being provided). Figure 12 illustrates a possible deployment scenario.
There is likely no minimum request size. This is because A+P-aware applications running on end-hosts MAY request a single port (or a few ports) for the CPE to be contacted on (e.g., Voice over IP (VoIP) clients register a public IP and a single delegated port from the CPE, and accept incoming calls on that port). The implementation on the CPE or PRR will dictate how to handle such requests for smaller blocks: for example, half of available blocks might be used for "block-allocations", 1/6 for single port requests, and the rest for NATing. Another possible mechanism to allocate additional ports is UPnP/ NAT-PMP (as defined in Section 3.3.1), if applications behind CPE support it. In the case of the LSN implementation (DS-Lite), as described in Section 3.3.4 about the A+P overall architecture, signaling packets are simply forwarded by the CPE to the LSN and back to the host running the application that requested the ports, and the PRR allocates the requested port to the appropriate CPE. The same behavior may be chosen with AFTR, if requested ports are outside of the static initial port allocation. If a full A+P implementation is selected, then UPnPv2/NAT-PMP packets are accepted by the CPE, processed, and the requested port number is communicated through the normal signaling mechanism between CPE and PRR tunnel endpoints (PCP). Figure 14 illustrates how the CPE receives an IPv4 packet from the end-user device. We first describe the case where the CPE has been configured to provide the NAT functionality (e.g., by the customer through interaction with a website or by automatic signaling). In the following, we call a packet that is translated at the CPE an "A+P-forwarded packet", an analogy with the port-forwarding function employed in today's CPEs. Upon receiving a packet from the internal interface, the CPE translates, encapsulates, and forwards it to the PRR. The NAT on the CPE is assumed to have a default route to the public realm through its tunnel interface. When the PRR receives the A+P-forwarded packet, it decapsulates the inner IPv4 packet and checks the source address. If the source address does match the range assigned to A+P-enabled CPEs, then the PRR simply forwards the decapsulated packet onward. This is always the case for A+P-forwarded packets. Otherwise, the PRR assumes that the packet is not A+P-forwarded and passes it to the LSN function,
which in turn translates and forwards the packet based on the destination address. Figure 14 shows the packet flow for an outgoing A+P-forwarded packet. +-----------+ | Host | +-----+-----+ | | 198.51.100.2 IPv4 datagram 1 | | | | v | 198.51.100.1 +---------|---------+ |CPE | | +--------|||--------+ | ||| 2001:db8::2 | ||| 192.0.2.3 (100-200) IPv6 datagram 2| ||| | |||<-IPv4-in-IPv6 | ||| -----|-|||------- / | ||| \ | ISP access network | \ | ||| / -----|-|||------- | ||| v ||| 2001:db8::1 +--------|||--------+ |PRR ||| | +---------|---------+ | | 192.0.2.1 IPv4 datagram 3 | | -----|--|-------- / | | \ | ISP network / | \ Internet / -----|--|-------- | | v | 203.0.113.1 +-----+-----+ | IPv4 Host | +-----------+ Figure 14: Forwarding of Outgoing A+P-Forwarded Packets
+-----------------+--------------+-----------------------------+ | Datagram | Header field | Contents | +-----------------+--------------+-----------------------------+ | IPv4 datagram 1 | IPv4 Dst | 203.0.113.1 | | | IPv4 Src | 198.51.100.2 | | | TCP Dst | 80 | | | TCP Src | 8000 | | --------------- | ------------ | --------------------------- | | IPv6 datagram 2 | IPv6 Dst | 2001:db8::1 | | | IPv6 Src | 2001:db8::2 | | | IPv4 Dst | 203.0.113.1 | | | IPv4 Src | 192.0.2.3 | | | TCP Dst | 80 | | | TCP Src | 100 | | --------------- | ------------ | --------------------------- | | IPv4 datagram 3 | IPv4 Dst | 203.0.113.1 | | | IPv4 Src | 192.0.2.3 | | | TCP Dst | 80 | | | TCP Src | 100 | +-----------------+--------------+-----------------------------+ Datagram Header Contents An incoming packet undergoes the reverse process. When the PRR receives an IPv4 packet on an external interface, it first checks whether or not the destination address falls within the A+P CPE delegated range. If the address space was delegated, then the PRR encapsulates the incoming packet and forwards it through the appropriate tunnel for that IP/port range. If the address space was not delegated, the packet would be handed to the LSN to check if a mapping is available. Figure 15 shows how an incoming packet is forwarded, under the assumption that the port number matches the port range that was delegated to the CPE.
+-----------+ | Host | +-----+-----+ ^ | 198.51.100.2 IPv4 datagram 3 | | | | | | 198.51.100.1 +---------|---------+ |CPE | | +--------|||--------+ ^ ||| 2001:db8::2 | ||| 192.0.2.3 (100-200) IPv6 datagram 2| ||| | |||<-IPv4-in-IPv6 | ||| -----|-|||------- / | ||| \ | ISP access network | \ | ||| / -----|-|||------- | ||| | ||| 2001:db8::1 +--------|||--------+ |PRR ||| | +---------|---------+ ^ | 192.0.2.1 IPv4 datagram 1 | | -----|--|-------- / | | \ | ISP network / | \ Internet / -----|--|-------- | | | | 203.0.113.1 +-----+-----+ | IPv4 Host | +-----------+ Figure 15: Forwarding of Incoming A+P-Forwarded Packets
+-----------------+--------------+-----------------------------+ | Datagram | Header field | Contents | +-----------------+--------------+-----------------------------+ | IPv4 datagram 1 | IPv4 Dst | 198.51.100.3 | | | IPv4 Src | 203.0.113.1 | | | TCP Dst | 100 | | | TCP Src | 80 | | --------------- | ------------ | --------------------------- | | IPv6 datagram 2 | IPv6 Dst | 2001:db8::2 | | | IPv6 Src | 2001:db8::1 | | | IPv4 Dst | 198.51.100.3 | | | IP Src | 203.0.113.1 | | | TCP Dst | 100 | | | TCP Src | 80 | | --------------- | ------------ | --------------------------- | | IPv4 datagram 3 | IPv4 Dst | 198.51.100.2 | | | IPv4 Src | 203.0.113.1 | | | TCP Dst | 8000 | | | TCP Src | 80 | +-----------------+--------------+-----------------------------+ Datagram Header Contents Note that datagram 1 travels untranslated up to the CPE; thus, the customer has the same control over the translation as he has today -- a home gateway with customizable port-forwarding. RFC6333] for network diagrams that illustrate the packet flow in this case.
ECHO and ECHO replies are more problematic. For ECHO, the A+P gateway device must rewrite the "Identifier" and perhaps "Sequence Number" fields in the ICMP request, treating them as if they were port numbers. This way, the PRR can build the correct A+P address for the returning ECHO replies, so they can be correctly routed back to the appropriate host in the same way as TCP/UDP packets. Pings originated from the public realm (Internet) towards an A+P device are not supported. RFC6146], Section 3.5, in the sense that it should reassemble the fragments in order to look at the destination port number. Note that it is recommended to use a Path MTU Discovery (PMTUD) mechanism (e.g., [RFC1191]). Security issues related to fragmentation are out of scope of this document. For more details, refer to [RFC1858]. RFC3715]) can be employed.
A+P, as all other port-sharing solutions, suffers from the issues documented in [RFC6269], but that's something we'll have to live with. For the host-based A+P, issues related to application conflicts when trying to bind to an out-of-range port are to be further assessed. Note that extensions to the host-based model have been proposed in the past (e.g., the Port-Enhanced Address Resolution Protocol (ARP) extension documented in http://software.merit.edu/pe-arp/). PR-IP-ISSUES] have been analyzed in [STATELESS-4v6]. As seen in that document, most of the issues apply to host-based port-sharing solutions. A+P is not intended to be a host-based port-sharing solution. The conclusion of [STATELESS-4v6] is that the set of issues specifically attributed to A+P either do not apply to CPE-based flavors or can be mitigated. The A+P solution represents a reasonable trade-off compared to alternatives in areas such as binding logging (for data storage purposes) and ease of deployment and operations, all of which are actually facilitated by such a solution.
The other extreme is A+P NAT on the customer premises. Such a node would be no different than today's NAT boxes, which do no such logging. We thus conclude that A+P is no worse than today's situation, while being considerably better than CGNs. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [4rd] Despres, R., Matsushima, S., Murakami, T., and O. Troan, "IPv4 Residual Deployment across IPv6-Service networks (4rd) ISP-NAT's made optional", Work in Progress, March 2011. [A+P-EXPERIMENTS] Deng, X., Boucadair, M., and F. Telecom, "Implementing A+P in the provider's IPv6-only network", Work in Progress, March 2011. [BCP38] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, May 2000. [BITTORRENT-ADDR-SHARING] Boucadair, M., Grimault, J., Levis, P., and A. Villefranque, "Behavior of BitTorrent service in an IP Shared Address Environment", Work in Progress, March 2011. [PCP-BASE] Wing, D., Cheshire, S., Boucadair, M., Penno, R., and P. Selkirk, "Port Control Protocol (PCP)", Work in Progress, July 2011.
[PORT-RANGE-OPT] Boucadair, M., Levis, P., Bajko, G., Savolainen, T., and T. ZOU), "Huawei Port Range Configuration Options for PPP IPCP", Work in Progress, June 2011. [PR-ADDR-ASSIGN] Bajko, G., Savolainen, T., Boucadair, M., and P. Levis, "Port Restricted IP Address Assignment", Work in Progress, September 2010. [PR-IP-ISSUES] Thaler, D., "Issues With Port-Restricted IP Addresses", Work in Progress, February 2010. [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, November 1990. [RFC1858] Ziemba, G., Reed, D., and P. Traina, "Security Considerations for IP Fragment Filtering", RFC 1858, October 1995. [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996. [RFC3715] Aboba, B. and W. Dixon, "IPsec-Network Address Translation (NAT) Compatibility Requirements", RFC 3715, March 2004. [RFC5737] Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks Reserved for Documentation", RFC 5737, January 2010. [RFC6052] Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X. Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052, October 2010. [RFC6146] Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful NAT64: Network Address and Protocol Translation from IPv6 Clients to IPv4 Servers", RFC 6146, April 2011. [RFC6269] Ford, M., Boucadair, M., Durand, A., Levis, P., and P. Roberts, "Issues with IP Address Sharing", RFC 6269, June 2011. [RFC6333] Durand, A., Droms, R., Woodyatt, J., and Y. Lee, "Dual- Stack Lite Broadband Deployments Following IPv4 Exhaustion", RFC 6333, August 2011.
[SHARED-ADDR-OPT] Boucadair, M., Levis, P., Grimault, J., Savolainen, T., and G. Bajko, "Dynamic Host Configuration Protocol (DHCPv6) Options for Shared IP Addresses Solutions", Work in Progress, December 2009. [STATELESS-4v6] Dec, W., Asati, R., Bao, C., and H. Deng, "Stateless 4Via6 Address Sharing", Work in Progress, July 2011.
Luca Cittadini Universita' Roma Tre via della Vasca Navale, 79 Rome, 00146 Italy Phone: +39 06 5733 3215 EMail: email@example.com Olaf Maennel Loughborough University Department of Computer Science - N.2.03 Loughborough United Kingdom Phone: +44 115 714 0042 EMail: firstname.lastname@example.org Reinaldo Penno Juniper Networks 1194 North Mathilda Avenue Sunnyvale, California 94089 US EMail: email@example.com Teemu Savolainen Nokia Hermiankatu 12 D TAMPERE, FI-33720 Finland EMail: firstname.lastname@example.org Jan Zorz Go6 Institute Slovenia Frankovo naselje 165 Skofja Loka, 4220 Slovenia EMail: email@example.com