8. Layered Mapping System (LMS)
8.1.1. Key Ideas
The layered mapping system proposal builds a hierarchical mapping
system to support scalability, analyzes the design constraints,
presents an explicit system structure, designs a two-cache mechanism
on ingress tunneling router (ITR) to gain low request delay, and
facilitates data validation. Tunneling and mapping are done at the
core, and no change is needed on edge networks. The mapping system
is run by interest groups independent of any ISP, which conforms to
an economical model and can be voluntarily adopted by various
networks. Mapping systems can also be constructed stepwise,
especially in the IPv6 scenario.
A. Distributed storage of mapping data avoids central storage of
massive amounts of data and restricts updates to local scopes.
B. The cache mechanism in an ITR reasonably reduces the request
loads on the mapping system.
A. No changes to edge systems; only tunneling in core routers and
new devices in core networks.
B. The mapping system can be constructed stepwise: a mapping
node needn't be constructed if none of its responsible ELOCs
is allocated. This makes sense especially for IPv6.
C. Conforms to a viable economic model: the mapping system
operators can profit from their services; core routers and
edge networks are willing to join the circle either to avoid
router upgrades or to realize traffic engineering. Benefits
from joining are independent of the scheme's implementation.
3. Low request delay: The low number of layers in the mapping
structure and the two-stage cache help achieve low request delay.
4. Data consistency: The two-stage cache enables an ITR to update
data in the map cache conveniently.
5. Traffic engineering support: Edge networks inform the mapping
system of their prioritized mappings with all upstream routers,
thus giving the edge networks control over their ingress flows.
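As a rough illustration of the two-stage cache idea (all names and data here are invented for the sketch, not taken from the proposal), an ITR might consult both cache stages before querying the layered mapping system:

```python
class MappingSystem:
    """Stand-in for the hierarchical (layered) mapping nodes."""
    def __init__(self, table):
        self.table = table
        self.queries = 0

    def resolve(self, edge_addr):
        self.queries += 1
        return self.table[edge_addr]

class ITR:
    def __init__(self, mapping_system):
        self.map_cache = {}      # stage 1: mappings used on the fast path
        self.request_cache = {}  # stage 2: fetched entries, updated conveniently
        self.mapping_system = mapping_system

    def lookup(self, edge_addr):
        if edge_addr in self.map_cache:
            return self.map_cache[edge_addr]
        if edge_addr not in self.request_cache:
            # Miss in both stages: query the layered mapping system.
            self.request_cache[edge_addr] = self.mapping_system.resolve(edge_addr)
        self.map_cache[edge_addr] = self.request_cache[edge_addr]
        return self.map_cache[edge_addr]
```

Repeated lookups for the same edge address then hit the caches, which is the mechanism behind the low-request-delay and reduced-load claims.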
1. Deployment of LMS needs to be further discussed.
2. The structure of the mapping system needs to be refined according
to practical circumstances.
LMS is a mapping mechanism based on Core-Edge Separation. In fact,
any proposal that needs a global mapping system whose keys have
properties similar to those of an "edge address" in a Core-Edge
Separation scenario can use such a mechanism. This means that those
keys are globally unique (by authorization or just statistically), at
the disposal of edge users, and may have several satisfying mappings
(with possibly different weights). A proposal to address routing
scalability that needs mapping but doesn't specify the mapping
mechanism can use LMS to strengthen its infrastructure.
The key idea of LMS is similar to that of LISP+ALT: that the mapping
system should be hierarchically organized to gain scalability for
storage and updates and to achieve quick indexing for lookups.
However, LMS advocates an ISP-independent mapping system, and ETRs
are not the authorities of mapping data. ETRs or edge-sites report
their mapping data to related mapping servers.
LMS assumes that mapping servers can be incrementally deployed in
that a server may not be constructed if none of its administered edge
addresses are allocated, and that mapping servers can charge for
their services, which provides the economic incentive for their
existence. How this brand-new system would be constructed remains
unclear. Explicit layering is only an ideal state, and the proposal
analyzes the layering limits and feasibility rather than providing a
practical path to deployment.
The drawbacks of LMS's feasibility analysis also include that it 1)
is based on current PC power and may not represent future
circumstances (especially for IPv6), and 2) does not consider the
variability of address utilization. Some IP address spaces may be
effectively allocated and used while some may not, causing some
mapping servers to be overloaded while others are poorly utilized.
More thought is needed regarding the flexibility of the layer design.
LMS does not fit mobility well. It does not solve the problem of
hosts moving faster than mapping updates can propagate between the
relevant mapping servers. On the other hand, mobile hosts'
moving across ASes and changing their attachment points (core
addresses) is less frequent than hosts' moving within an AS.
Separation needs two planes: Core-Edge Separation (which is to gain
routing table scalability) and identity/location separation (which is
to achieve mobility). The Global Locator, Local Locator, and
Identifier (GLI) scheme clarifies this well, and in that case, LMS
can be used to provide identity-to-core-address mapping. Of course,
other schemes may be competent, and LMS can be
incorporated with them if the scheme has global keys and needs to map
them to other namespaces.
No rebuttal was submitted for this proposal.
9. Two-Phased Mapping
1. A mapping from prefixes to ETRs is an M:M mapping. Any change of
a (prefix, ETR) pair should be updated in a timely manner, which
can be a heavy burden to any mapping system if the relation is
unstable.
2. A prefix<->ETR mapping system cannot be deployed efficiently if
it is overwhelmed by worldwide dynamics. Therefore, the mapping
itself is not scalable with this direct mapping scheme.
9.1.2. Basics of a Two-Phased Mapping
1. Introduce an AS number in the middle of the mapping, the phase I
mapping is prefix<->AS#, phase II mapping is AS#<->ETRs. This
creates a M:1:M mapping model.
2. It is fair to assume that all ASes know their local prefixes (in
the IGP) better than other ASes and that it is most likely that
local prefixes can be aggregated when they are mapped to the AS
number, which will reduce the number of mapping entries. ASes
also clearly know their ETRs on the border between core and
edge. So, all mapping information can be collected easily.
3. A registry system will take care of the phase I mapping
information. Each AS should have a registration agent to notify
the registry of the local range of IP address space. This system
can be organized as a hierarchical infrastructure like DNS, or
alternatively, as a centralized registry like "whois" in each
RIR. Phase II mapping information can be distributed between
xTRs as a BGP extension.
4. The basic forwarding procedure is that the ITR first gets the
destination AS number from the phase I mapper (or from cache)
when the packet is entering the "core". Then, it will extract
the closest ETR for the destination AS number. This is local,
since phase II mapping information has been "pushed" to the ITR
through BGP updates. Finally, the ITR tunnels the packet to the
selected ETR.
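The forwarding procedure above can be sketched as follows; the prefixes, AS number, and ETR names are illustrative, and closest-ETR selection is reduced to picking the first entry:

```python
# Phase I mapping (prefix -> AS#): held by the registry, pulled and
# cached by ITRs.  Phase II mapping (AS# -> ETRs): pushed to all xTRs
# via the BGP extension, so the second lookup is purely local.
phase1 = {"192.0.2.0/24": 64500}
phase2 = {64500: ["ETR-1", "ETR-2"]}

def forward(dst_prefix):
    asn = phase1[dst_prefix]   # phase I lookup (or map-cache hit)
    etrs = phase2[asn]         # phase II: local table at the ITR
    return etrs[0]             # stand-in for "closest ETR"; then tunnel
```

The M:1:M structure is visible here: many prefixes can point at one AS number, which in turn points at many ETRs, so prefix churn inside an AS never touches phase II and ETR churn never touches phase I.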
1. Any prefix reconfiguration (aggregation/deaggregation) within an
AS will not be reflected in the mapping system.
2. Local prefixes can be aggregated to a high degree.
3. Both phase I and phase II mappings can be stable.
4. A stable mapping system will reduce the update overhead
introduced by topology changes and/or routing policy dynamics.
1. The two-phased mapping scheme introduces an AS number between the
mapping prefixes and ETRs.
2. The decoupling of direct mapping makes highly dynamic updates
stable; therefore, it can be more scalable than any direct
mapping scheme.
3. The two-phased mapping scheme is adaptable to any proposals based
on the core/edge split.
No references were submitted.
This is a simple idea on how to scale mapping. However, this design
is too incomplete to be considered a serious input to RRG. Take the
following two issues as examples:
First, in this two-phase scheme, an AS is essentially the unit of
destinations (i.e., sending ITRs find out destination AS D, then send
data to one of D's ETRs). This does not offer much choice for
traffic engineering.
Second, there is no consideration whatsoever of failure detection
and recovery.
No rebuttal was submitted for this proposal.
10. Global Locator, Local Locator, and Identifier Split (GLI-Split)
10.1.1. Key Idea
GLI-Split implements a separation between global routing (in the
global Internet outside edge networks) and local routing (inside edge
networks) using global and local locators (GLs and LLs). In
addition, a separate static identifier (ID) is used to identify
communication endpoints (e.g., nodes or services) independently of
any routing information. Locators and IDs are encoded in IPv6
addresses to enable backwards-compatibility with the IPv6 Internet.
The higher-order bits store either a GL or an LL, while the lower-
order bits contain the ID. A local mapping system maps IDs to LLs,
and a global mapping system maps IDs to GLs. The full GLI-mode
requires nodes with upgraded networking stacks and special GLI-
gateways. The GLI-gateways perform stateless locator rewriting in
IPv6 addresses with the help of the local and global mapping system.
Non-upgraded IPv6 nodes can also be accommodated in GLI-domains since
an enhanced DHCP service and GLI-gateways compensate for their
missing GLI-functionality. This is an important feature for
incremental deployment.
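The address encoding and the gateway's stateless locator rewriting can be illustrated as follows; the split at bit 64 follows the description above, while the helper names are assumptions of this sketch:

```python
import ipaddress

def encode(locator: int, identifier: int) -> ipaddress.IPv6Address:
    """Pack a GL or LL (upper 64 bits) and an ID (lower 64 bits)
    into one IPv6 address, as GLI-Split does."""
    assert locator < 2**64 and identifier < 2**64
    return ipaddress.IPv6Address((locator << 64) | identifier)

def decode(addr: ipaddress.IPv6Address):
    value = int(addr)
    return value >> 64, value & (2**64 - 1)

def rewrite_locator(addr, new_locator):
    # Stateless rewriting as performed by a GLI-gateway: swap the
    # locator bits, leave the static identifier untouched.
    _, ident = decode(addr)
    return encode(new_locator, ident)
```

Because the identifier survives every rewrite, upper layers never see locator changes; this is why no per-flow state is needed at the gateway.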
The benefits of GLI-Split are:
o Hierarchical aggregation of routing information in the global
Internet through separation of edge and core routing
o Provider changes not visible to nodes inside GLI-domains
(renumbering not needed)
o Rearrangement of subnetworks within edge networks not visible to
the outside world (better support of large edge networks)
o Transport connections survive both types of changes
o Improved traffic engineering for incoming and outgoing traffic
o Multipath routing and load balancing for hosts
o Improved resilience
o Improved mobility support without home agents and triangle routing
o Interworking with the classic Internet
* without triangle routing over proxy routers
* without stateful NAT
These benefits are available for upgraded GLI-nodes, but non-upgraded
nodes in GLI-domains partially benefit from these advanced features,
too. This offers multiple incentives for early adopters, and they
have the option to migrate their nodes gradually from non-GLI-stacks
to GLI-stacks.
o Local and global mapping system
o Modified DHCP or similar mechanism
o GLI-gateways with stateless locator rewriting in IPv6 addresses
o Upgraded stacks (only for full GLI-mode)
GLI-Split makes a clear distinction between two separation planes:
the separation between identifier and locator (which is to meet end-
users' needs including mobility) and the separation between local and
global locator (which makes the global routing table scalable). The
distinction is needed since ISPs and hosts have different
requirements, with both needing to make the changes inside and
outside GLI-domains invisible to their opposites.
A main drawback of GLI-Split is that it puts a burden on hosts.
Before routing a packet received from upper layers, network stacks in
hosts first need to resolve the DNS name to an IP address; if the IP
address is GLI-formed, it may look up the map from the identifier
extracted from the IP address to the local locator. If the
communication is between different GLI-domains, hosts may further
look up the mapping from the identifier to the global locator.
Having the local mapping system forward requests to the global
mapping system for hosts is just an option. Though host lookup may
ease the burden on intermediate nodes, which would otherwise have to
perform the mapping lookup, the three lookups by hosts in the worst
case may lead to large delays unless a very efficient mapping
mechanism is devised. The work may also become impractical for low-
powered hosts. On one hand, GLI-Split can provide backward
compatibility where classic and upgraded IPv6 hosts can communicate.
This is its big virtue. On the other hand, the need to upgrade may
work against hosts' enthusiasm to change. This is offset against the
benefits they would gain.
GLI-Split provides additional features to improve TE and resilience,
e.g., by exploiting multipath routing. However, the cost is
that more burdens are placed on hosts, e.g., they may need more
lookup actions and route selections. That said, tradeoffs of this
kind between costs and gains exist in most proposals.
One improvement of GLI-Split is its support for mobility by updating
DNS data as GLI-hosts move across GLI-domains. Through this, the
GLI-corresponding-node can query DNS to get a valid global locator of
the GLI-mobile-node and need not query the global mapping system
(unless it wants to do multipath routing), giving more incentives for
nodes to become GLI-enabled. The merits of GLI-Split, including
simplified-mobility-handover provision, compensate for the costs of
its deployment.
GLI-Split claims to use rewriting instead of tunneling for
conversions between local and global locators when packets span GLI-
domains. The major advantage is that this kind of rewriting needs no
extra state, since local and global locators need not map to each
other. Many other rewriting mechanisms instead need to maintain
extra state. It also avoids the MTU problem faced by the tunneling
methods. However, GLI-Split achieves this only by compressing the
namespace size of each attribute (identifier and local/global
locator). GLI-Split encodes two namespaces (identifier and local/
global locator) into an IPv6 address (each has a size of 2^64 or
less), while map-and-encap proposals assume that identifier and
locator each occupy a 128-bit space.
The arguments in the GLI-Split critique are correct. There are only
two points that should be clarified here. First, it is not a
drawback that hosts perform the mapping lookups. Second, the
critique proposed an improvement to the mobility mechanism, which is
of a general nature and not specific to GLI-Split.
1. The additional burden on the hosts is actually a benefit,
compared to having the same burden on the gateways. If the
gateway performed the lookups and packets addressed to uncached
EIDs arrived, a lookup in the mapping system would have to be
initiated. Until the mapping reply returned, packets would have
to be dropped, cached, or sent over the mapping system to the
destination. None of these options is optimal; each has its
drawbacks. To avoid these problems in GLI-Split, the hosts
perform the lookup. The short additional delay is not a big
issue in the hosts because it happens before the first packets
are sent. So, no packets are lost or have to be cached. GLI-
Split could also easily be adapted to special GLI-hosts (e.g.,
low-power sensor nodes) that do not have to do any lookup and
simply let the gateway do all the work. This functionality is
included anyway for backward compatibility with regular IPv6
hosts inside the GLI-domain.
2. The critique proposes a DNS-based mobility mechanism as an
improvement to GLI-Split. However, this improvement is an
alternative mobility approach that can be applied to any routing
architecture (including GLI-Split) and also raises some concerns,
e.g., the update speed of DNS. Therefore, we prefer to keep this
issue out of the discussion.
11. Tunneled Inter-Domain Routing (TIDR)
11.1.1. Key Idea
TIDR provides a method for locator/identifier separation using
tunnels between routers on the edge of the Internet transit
infrastructure.
It enriches the BGP protocol for distributing the identifier-to-
locator mapping. Using new BGP attributes, "identifier prefixes" are
assigned inter-domain routing locators so that they will not be
installed in the RIB and will be moved to a new table called the
Tunnel Information Base (TIB). Afterwards, when routing a packet to
an "identifier prefix", first the TIB will be searched to perform
tunneling, and secondly the RIB will be searched for actual routing.
After the edge router performs tunneling, all routers in the middle
will route this packet until the packet reaches the router at the
tail-end of the tunnel.
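The TIB-before-RIB lookup order can be sketched as follows; the tables, addresses, and the naive /24 matcher (standing in for a real longest-prefix match) are illustrative only:

```python
tib = {"203.0.113.0/24": "198.51.100.1"}   # identifier prefix -> locator (tunnel tail-end)
rib = {"198.51.100.0/24": "if-core-1"}     # locator prefix -> next hop

def prefix_of(ip, table):
    # Naive /24 match, standing in for a longest-prefix match.
    p = ".".join(ip.split(".")[:3]) + ".0/24"
    return p if p in table else None

def route(dst):
    ident_prefix = prefix_of(dst, tib)
    if ident_prefix is not None:           # TIB searched first
        dst = tib[ident_prefix]            # encapsulate: outer dst = locator
    return rib[prefix_of(dst, rib)]        # then the RIB for actual routing
```

Note that the identifier prefix never appears in the RIB: only the locator is routed, which is how the global RIB shrinks.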
o Smooth deployment
o Size reduction of the global RIB
o Deterministic customer traffic engineering for incoming traffic
o Numerous forwarding decisions for a particular address prefix
o Stops AS number space depletion
o Improved BGP convergence
o Protection of the inter-domain routing infrastructure
o Easy separation of control traffic and transit traffic
o Different layer-2 protocol IDs for transit and non-transit traffic
o Multihoming resilience
o New address families and tunneling techniques
o Support for IPv4 or IPv6, and migration to IPv6
o Scalability, stability, and reliability
o Faster inter-domain routing
o Routers on the edge of the inter-domain infrastructure will need
to be upgraded to hold the mapping database (i.e., the TIB).
o "Mapping updates" will need to be treated differently from usual
BGP "routing updates".
[TIDR] [TIDR_identifiers] [TIDR_and_LISP] [TIDR_AS_forwarding]
TIDR is a Core-Edge Separation architecture from late 2006 that
distributes its mapping information via BGP messages that are passed
between DFZ routers.
This means that TIDR cannot solve the most important goal of scalable
routing -- to accommodate much larger numbers of end-user network
prefixes (millions or billions) without each such prefix directly
burdening every DFZ router. Messages advertising routes for TIDR-
managed prefixes may be handled with lower priority, but this would
only marginally reduce the workload for each DFZ router compared to
handling an advertisement of a conventional PI prefix.
Therefore, TIDR cannot be considered for RRG recommendation as a
solution to the routing scaling problem.
For a TIDR-using network to receive packets sent from any host, every
BR of all ISPs must be upgraded to have the new ITR-like
functionality. Furthermore, all DFZ routers would need to be altered
so they accepted and correctly propagated the routes for end-user
network address space, with the new LOCATOR attribute, which contains
the ETR address and a REMOTE-PREFERENCE value. Firstly, if they
received two such advertisements with different LOCATORs, they would
advertise a single route to this prefix containing both. Secondly,
for end-user address space (for IPv4) to be more finely divided, the
DFZ routers must propagate LOCATOR-containing advertisements for
prefixes longer than /24.
TIDR's ITR-like routers store the full mapping database -- so there
would be no delay in obtaining mapping, and therefore no significant
delay in tunneling traffic packets.
[TIDR] is written as if traffic packets are classified by reference
to the RIB, but routers use the FIB for this purpose, and "FIB" does
not appear in [TIDR].
TIDR does not specify a tunneling technique, leaving this to be
chosen by the ETR-like function of BRs and specified as part of a
second kind of new BGP route advertised by that ETR-like BR. There
is no provision for solving the PMTUD problems inherent in
tunneling.
ITR functions must be performed by already busy routers of ISPs,
rather than being distributed to other routers or to sending hosts.
There is no practical support for mobility. The mapping in each end-
user route advertisement includes a REMOTE-PREFERENCE for each ETR-
like BR, but this is used by the ITR-like functions of BRs to always
select the LOCATOR with the highest value. As currently described,
TIDR does not provide inbound load-splitting TE.
Multihoming service restoration is achieved initially by the ETR-like
function of the BR at the ISP (whose link to the end-user network has
just failed). It looks up the mapping to find the next preferred
ETR-like BR's address. The first ETR-like router tunnels the packets
to the second ETR-like router in the other ISP. However, if the
failure was caused by the first ISP itself being unreachable, then
connectivity would not be restored until a revised mapping (with
higher REMOTE-PREFERENCE) from the reachable ETR-like BR of the
second ISP propagated across the DFZ to all ITR-like routers, or the
withdrawn advertisement for the first one reaches the ITR-like
routers.
No rebuttal was submitted for this proposal.
12. Identifier-Locator Network Protocol (ILNP)
12.1.1. Key Ideas
o Provides crisp separation of Identifiers from Locators.
o Identifiers name nodes, not interfaces.
o Locators name subnetworks, rather than interfaces, so they are
equivalent to an IP routing prefix.
o Identifiers are never used for network-layer routing, whilst
Locators are never used for Node Identity.
o Transport-layer sessions (e.g., TCP session state) use only
Identifiers, never Locators, meaning that changes in location have
no adverse impact on an IP session.
o The underlying protocol mechanisms support fully scalable site
multihoming, node multihoming, site mobility, and node mobility.
o ILNP enables topological aggregation of location information while
providing stable and topology-independent identities for nodes.
o In turn, this topological aggregation reduces both the routing
prefix "churn" rate and the overall size of the Internet's global
routing table, by eliminating the value and need for more-specific
routing state currently carried throughout the global (default-
free) zone of the routing system.
o ILNP enables improved traffic engineering capabilities without
adding any state to the global routing system. TE capabilities
include both provider-driven TE and also end-site-controlled TE.
o ILNP's mobility approach:
* eliminates the need for special-purpose routers (e.g., home
agent and/or foreign agent now required by Mobile IP and NEMO).
* eliminates "triangle routing" in all cases.
* supports both "make before break" and "break before make"
handoffs.
o ILNP improves resilience and network availability while reducing
the global routing state (as compared with the currently deployed
Internet).
o ILNP is incrementally deployable:
* No changes are required to existing IPv6 (or IPv4) routers.
* Upgraded nodes gain benefits immediately ("day one"); those
benefits gain in value as more nodes are upgraded (this follows
the network effect).
* The incremental deployment approach is documented.
o ILNP is backwards compatible:
* ILNPv6 is fully backwards compatible with IPv6 (ILNPv4 is fully
backwards compatible with IPv4).
* Reuses existing known-to-scale DNS mechanisms to provide
identifier/locator mappings.
* Existing DNS security mechanisms are reused without change.
* Existing IP Security mechanisms are reused with one minor
change (IPsec Security Associations replace the current use of
IP addresses with the use of Identifier values). NB: IPsec is
also backwards compatible.
* The backwards compatibility approach is documented.
o No new or additional overhead is required to determine or to
maintain locator/path liveness.
o ILNP does not require locator rewriting (NAT); ILNP permits and
tolerates NAT, should that be desirable in some deployment(s).
o Changes to upstream network providers do not require node or
subnetwork renumbering within end-sites.
o ILNP is compatible with and can facilitate the transition from
current single-path TCP to multipath TCP.
o ILNP can be implemented such that existing applications (e.g.,
applications using the BSD Sockets API) do NOT need any changes or
modifications to use ILNP.
o End systems need to be enhanced incrementally to support ILNP in
addition to IPv6 (or IPv4 or both).
o DNS servers supporting upgraded end systems also should be
upgraded to support new DNS resource records for ILNP. (The DNS
protocol and DNS security do not need any changes.)
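The central invariant of the key ideas above, that transport state is keyed by Identifiers and untouched by Locator changes, can be illustrated with a minimal sketch (class and value names are invented here, not part of ILNP):

```python
class Session:
    """Toy transport session following the ILNP invariant: the session
    key uses Identifiers only; Locators are routing state on the side."""
    def __init__(self, local_id, remote_id):
        self.key = (local_id, remote_id)  # Identifiers only, never Locators
        self.remote_locators = []         # current Locator set (routing only)

    def update_locators(self, locators):
        # e.g., driven by an ILNP ICMP Locator Update from the peer
        self.remote_locators = list(locators)

s = Session("I-node-a", "I-node-b")
s.update_locators(["L-subnet-1"])
key_before = s.key
s.update_locators(["L-subnet-2"])  # node moved to a new subnetwork
assert s.key == key_before         # session state unaffected by the move
```

Because nothing above the network layer references a Locator, mobility and multihoming reduce to updating the Locator set, with no home agent and no renumbering of session state.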
[ILNP_Site] [MobiArch1] [MobiArch2] [MILCOM1] [MILCOM2] [DNSnBIND]
[Referral_Obj] [ILNP_Intro] [ILNP_Nonce] [ILNP_DNS] [ILNP_ICMP]
[JSAC_Arch] [RFC4033] [RFC4034] [RFC4035] [RFC5534] [RFC5902]
The primary issue for ILNP is how the deployment incentives and
benefits line up with the RRG goal of reducing the rate of growth of
entries and churn in the core routing table. If a site is currently
using PI space, it can only stop advertising that space when the
entire site is ILNP capable. This needs (at least) clear elucidation
of the incentives for ILNP which are not related to routing scaling,
in order for there to be a path for this to address the RRG needs.
Similarly, the incentives for upgrading hosts need to align with the
value for those hosts.
A closely related question is whether this mechanism actually
addresses the site's need for PI addresses. Assuming ILNP is
deployed, the site does achieve flexible, resilient, communication
using all of its Internet connections. While the proposal addresses
the host updates when the host learns of provider changes, there are
other aspects of provider change that are not addressed. This
includes renumbering routers, subnets, and certain servers. (It is
presumed that most servers, once the entire site has moved to ILNP,
will not be concerned if their locator changes. However, some
servers must have known locators, such as the DNS server.) The
issues described in [RFC5887] will be ameliorated, but not resolved.
To be able to adopt this proposal, and have sites use it, we need to
address these issues. When a site changes points of attachment, only
a small amount of DNS provisioning should be required. The LP
resource record type is apparently intended to help with this. It is
also likely that the use of dynamic DNS will help this.
The ILNP mechanism is described as being suitable for use in
conjunction with mobility. This raises the question of race
conditions. To the degree that mobility concerns are valid at this
time, it is worth asking how communication can be established if a
node is sufficiently mobile that it is moving faster than the DNS
update and DNS fetch cycle can effectively propagate changes.
This proposal does presume that all communication using this
mechanism is tied to DNS names. While it is true that most
communication does start from a DNS name, it is not the case that all
exchanges have this property. Some communication initiation and
referral can be done with an explicit identifier/locator pair. This
does appear to require some extensions to the existing mechanism (for
both sides to add locators). In general, some additional clarity on
the assumptions regarding DNS, particularly for low-end devices,
would seem appropriate.
One issue that this proposal shares with many others is the question
of how to determine which locator pairs (local and remote) are
actually functional. This is an issue both for initial
communications establishment and for robustly maintaining
communication. It is likely that a combination of monitoring of
traffic (in the host, where this is tractable), coupled with other
active measures, can address this. ICMP is clearly insufficient.
ILNP eliminates the perceived need for PI addressing and encourages
increased DFZ aggregation. Many enterprise users view DFZ scaling
issues as too abstruse, so ILNP creates more user-visible incentives
to upgrade deployed systems.
ILNP mobility eliminates Duplicate Address Detection (DAD), reducing
the layer-3 handoff time significantly when compared to IETF standard
Mobile IP, as shown in [MobiArch1] and [MobiArch2]. ICMP location
updates separately reduce the layer-3 handoff latency.
Also, ILNP enables both host multihoming and site multihoming.
Current BGP approaches cannot support host multihoming. Host
multihoming is valuable in reducing the site's set of externally
advertised prefixes.
Improved mobility support is very important. This is shown by the
research literature and also appears in discussions with vendors of
mobile devices (smartphones, MP3 players). Several operating system
vendors push "updates" with major networking software changes in
maintenance releases today. Security concerns mean most hosts
receive vendor updates more quickly these days.
ILNP enables a site to hide exterior connectivity changes from
interior nodes, using various approaches. One approach deploys
unique local address (ULA) prefixes within the site, and has the site
border router(s) rewrite the Locator values. The usual NAT issues
don't arise because the Locator value is not used above the network-
layer. [MILCOM1] [MILCOM2]
[RFC5902] makes clear that many users desire IPv6 NAT, with site
interior obfuscation as a major driver. This makes global-scope PI
addressing much less desirable for end sites than formerly.
ILNP-capable nodes can talk existing IP with legacy IP-only nodes,
with no loss of current IP capability. So, ILNP-capable nodes will
never be worse off.
Secure Dynamic DNS Update is standard and widely supported in
deployed hosts and DNS servers. [DNSnBIND] says many sites have
deployed this technology without realizing it (e.g., by enabling both
the DHCP server and Active Directory of the MS-Windows Server).
If a node is as mobile as the critique says, then existing IETF
Mobile IP standards also will fail. They also use location updates
(e.g., MN -> home agent, MN -> foreign agent).
ILNP also enables new approaches to security that eliminate
dependence upon location-dependent Access Control Lists (ACLs)
without packet authentication. Instead, security appliances track
flows using Identifier values and validate the identifier/locator
relationship cryptographically [RFC4033] [RFC4034] [RFC4035] or non-
cryptographically by reading the nonce [ILNP_Nonce].
The DNS LP record has a more detailed explanation now. LP records
enable a site to change its upstream connectivity by changing the L
resource records of a single FQDN covering the whole site, thereby
avoiding renumbering within the site.
DNS-based server load balancing works well with ILNP by using DNS SRV
records. DNS SRV records are not new, are widely available in DNS
clients and servers, and are widely used today in the IPv4 Internet
for server load balancing.
Recent ILNP documents discuss referrals in more detail. A node with
a binary referral can find the FQDN using DNS PTR records, which can
be authenticated [RFC4033] [RFC4034] [RFC4035]. Approaches such as
[Referral_Obj] improve user experience and user capability, so are
likely to self-deploy.
Selection from multiple Locators is identical to an IPv4 system
selecting from multiple A records for its correspondent. Deployed IP
nodes can track reachability via existing host mechanisms or by using
the SHIM6 method. [RFC5534]
13. Enhanced Efficiency of Mapping Distribution Protocols in
Map-and-Encap Schemes (EEMDP)
We present some architectural principles pertaining to the mapping
distribution protocols, especially applicable to the map-and-encap
(e.g., LISP) type of protocols. These principles enhance the
efficiency of the map-and-encap protocols in terms of (1) better
utilization of resources (e.g., processing and memory) at Ingress
Tunnel Routers (ITRs) and mapping servers, and consequently, (2)
reduction of response time (e.g., first-packet delay). We consider
how Egress Tunnel Routers (ETRs) can perform aggregation of endpoint
ID (EID) address space belonging to their downstream delivery
networks, in spite of migration/re-homing of some subprefixes to
other ETRs. This aggregation may be useful for reducing the
processing load and memory consumption associated with map messages,
especially at some resource-constrained ITRs and subsystems of the
mapping distribution system. We also consider another architectural
concept where the ETRs are organized in a hierarchical manner for the
potential benefit of aggregation of their EID address spaces. The
two key architectural ideas are discussed in some more detail below.
A more complete description can be found in [EEMDP_Considerations].
It will be helpful to refer to Figures 1, 2, and 3 in
[EEMDP_Considerations] for some of the discussions that follow here.
13.1.2. Management of Mapping Distribution of Subprefixes Spread
across Multiple ETRs
To assist in this discussion, we start with the high level
architecture of a map-and-encap approach (it would be helpful to see
Figure 1 in [EEMDP_Considerations]). In this architecture, we have
the usual ITRs, ETRs, delivery networks, etc. In addition, we have
the ID-Locator Mapping (ILM) servers, which are repositories for
complete mapping information, while the ILM-Regional (ILM-R) servers
can contain partial and/or regionally relevant mapping information.
While a large endpoint address space contained in a prefix may be
mostly associated with the delivery networks served by one ETR, some
fragments (subprefixes) of that address space may be located
elsewhere at other ETRs. Let a/20 denote a prefix that is
conceptually viewed as composed of 16 subnets of /24 size that are
denoted as a1/24, a2/24, ..., a16/24. For example, a/20 is mostly at
ETR1, while only two of its subprefixes a8/24 and a15/24 are
elsewhere at ETR3 and ETR2, respectively (see Figure 2
[EEMDP_Considerations]). From the point of view of efficiency of the
mapping distribution protocol, it may be beneficial for ETR1 to
announce a map for the entire space a/20 (rather than fragment it
into a multitude of more-specific prefixes), and provide the
necessary exceptions in the map information. Thus, the map message
could be in the form of Map:(a/20, ETR1; Exceptions: a8/24, a15/24).
In addition, ETR2 and ETR3 announce the maps for a15/24 and a8/24,
respectively, and so the ILMs know where the exception EID addresses
are located. Now consider a host associated with ITR1 initiating a
packet destined for an address a7(1), which is in a7/24 that is not
in the exception portion of a/20. A question then arises as to which
of the following approaches would be the best choice:
1. ILM-R provides the complete mapping information for a/20 to ITR1
including all maps for relevant exception subprefixes.
2. ILM-R provides only the directly relevant map to ITR1, which in
this case is (a/20, ETR1).
In the first approach, the advantage is that ITR1 would have the
complete mapping for a/20 (including exception subnets), and it would
not have to generate queries for subsequent first packets that are
destined to any address in a/20, including a8/24 and a15/24.
However, the disadvantage is that if there is a significant number of
exception subprefixes, then the very first packet destined for a/20
will experience a long delay, and also the processors at ITR1 and
ILM-R can experience overload. In addition, the memory usage at ITR1
can be very inefficient. The advantage of the second approach above
is that the ILM-R does not overload resources at ITR1, in terms of
either processing or memory usage, but it needs an enhanced map
response of the form Map:(a/20, ETR1, MS=1), where the MS (More
Specific) indicator is set to 1 to indicate to ITR1 that not all
subnets in a/20 map to ETR1. The key idea is that aggregation is
beneficial, and subnet exceptions must be handled with additional
messages or indicators in the maps.
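The contrast between the two response styles can be sketched in a few
lines of Python. The fragment below is illustrative only: the
concrete addresses (10.0.0.0/20 standing in for a/20, with its 8th
and 15th /24 subnets as a8/24 and a15/24) and the dictionary-shaped
responses are assumptions, not an encoding defined by the proposal.

```python
import ipaddress

# Hypothetical mapping state for the example in the text: a/20 is
# mostly at ETR1, with exceptions a8/24 (at ETR3) and a15/24 (at ETR2).
AGGREGATE = ipaddress.ip_network("10.0.0.0/20")
EXCEPTIONS = {
    ipaddress.ip_network("10.0.7.0/24"): "ETR3",   # stands in for a8/24
    ipaddress.ip_network("10.0.14.0/24"): "ETR2",  # stands in for a15/24
}

def approach_1(eid):
    """Approach 1: ILM-R returns the complete mapping for the aggregate,
    including all exception subprefixes, in one response."""
    return {"map": (str(AGGREGATE), "ETR1"),
            "exceptions": {str(p): etr for p, etr in EXCEPTIONS.items()}}

def approach_2(eid):
    """Approach 2: ILM-R returns only the directly relevant map; MS=1
    signals that not every subnet of the aggregate maps to ETR1."""
    addr = ipaddress.ip_address(eid)
    for prefix, etr in EXCEPTIONS.items():
        if addr in prefix:
            return {"map": (str(prefix), etr)}
    return {"map": (str(AGGREGATE), "ETR1"), "MS": 1}
```

A query for an address in a7/24 (not an exception) then yields the
single aggregate map plus the MS flag under Approach 2, while
Approach 1 ships the two exception maps as well.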
13.1.3. Management of Mapping Distribution for Scenarios with Hierarchy
of ETRs and Multihoming
Now we highlight another architectural concept related to mapping
management (please refer to Figure 3 in [EEMDP_Considerations]).
Here we consider the possibility that ETRs may be organized in a
hierarchical manner. For instance, ETR7 is higher in the hierarchy
relative to ETR1, ETR2, and ETR3, and likewise ETR8 is higher
relative to ETR4, ETR5, and ETR6. For instance, ETRs 1 through 3 can
relegate the locator role to ETR7 for their EID address space. In
essence, they can allow ETR7 to act as the locator for the delivery
networks in their purview. ETR7 keeps a local mapping table for
mapping the appropriate EID address space to specific ETRs that are
hierarchically associated with it in the level below. In this
situation, ETR7 can perform EID address space aggregation across ETRs
1 through 3 and can also include its own immediate EID address space
for the purpose of that aggregation. The details of this approach,
including special circumstances involving multihoming of subnets, are
discussed further in [EEMDP_Considerations]. The
hierarchical organization of ETRs and delivery networks should help
in the future growth and scalability of ETRs and mapping distribution
networks. This is essentially recursive map-and-encap, and some of
the mapping distribution and management functionality will remain
local to topologically neighboring delivery networks that are
hierarchically underneath ETRs.
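The role of the higher-level ETR's local mapping table can be
illustrated as follows. The addresses and table contents below are
hypothetical, not taken from [EEMDP_Considerations]: ETR7 resolves an
EID to one of its lower-level ETRs while announcing only the
aggregate of their EID space upward to the ILM.

```python
import ipaddress

# ETR7's local mapping table: the ILM maps the whole aggregate to
# ETR7, and ETR7 resolves the final hop itself (recursive map-and-encap).
LOCAL_TABLE = {
    ipaddress.ip_network("10.1.0.0/22"): "ETR1",
    ipaddress.ip_network("10.1.4.0/22"): "ETR2",
    ipaddress.ip_network("10.1.8.0/22"): "ETR3",
    ipaddress.ip_network("10.1.12.0/22"): "ETR7-local",
}

def etr7_forward(eid):
    """Longest-prefix match in ETR7's local table; returns the
    lower-level ETR (or local delivery network) for the EID."""
    addr = ipaddress.ip_address(eid)
    best = None
    for prefix, etr in LOCAL_TABLE.items():
        if addr in prefix and (best is None
                               or prefix.prefixlen > best[0].prefixlen):
            best = (prefix, etr)
    return best[1] if best else None

# ETR7 announces only the aggregate of its table to the ILM:
aggregate = list(ipaddress.collapse_addresses(LOCAL_TABLE))[0]
```

Here four /22 entries collapse into a single /20 announcement, so the
mapping system above ETR7 never sees the lower-level structure.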
References: [EEMDP_Considerations], [EEMDP_Presentation],
[FIBAggregatability]
The scheme described in [EEMDP_Considerations] represents one
approach to mapping overhead reduction, and it is a general idea that
is applicable to any proposal that includes prefix or EID
aggregation. A somewhat similar idea is also used in Level-3
aggregation in the FIB aggregation proposal [FIBAggregatability].
There can be cases where deaggregation of EID prefixes occurs in such
a way that the bulk of an EID prefix P would be attached to one
locator (say, ETR1) while a few subprefixes under P would be attached
to other locators elsewhere (say, ETR2, ETR3, etc.). Ideally, such
cases should not happen; however, in reality they do, as RIRs'
address allocations are imperfect. In addition, as new IP
address allocations become harder to get, an IPv4 prefix owner might
split previously unused subprefixes of that prefix and allocate them
to remote sites (homed to other ETRs). Assuming these situations
could arise in practice, the nature of the solution would be that the
response from the mapping server for the coarser site would include
information about the more specifics. The solution as presented is
examined below.
The proposal mentions that in Approach 1, the ID-Locator Mapping
(ILM) system provides the complete mapping information for an
aggregate EID prefix to a querying ITR, including all the maps for
the relevant exception subprefixes. The sheer number of such more-
specifics can be worrisome, for example, in LISP. What if a
company's mobile-node EIDs came out of their corporate EID prefix?
Approach 2 is far better, but there may still be too many entries
for a regional ILM to store. In Approach 2, the ILM communicates that
there are more specifics but does not communicate their mask-length.
A suggested improvement would be to indicate the mask-lengths of the
more specifics, rather than merely noting their existence. There can
be multiple mask lengths. This number should be pretty small for
IPv4 but can be large for IPv6.
Later in the proposal, a different problem is addressed, involving a
hierarchy of ETRs and how aggregation of EID prefixes from lower-
level ETRs can be performed at a higher-level ETR. The various
scenarios here are well illustrated and described. This seems like a
good idea, and a solution like LISP can support this as specified.
As any optimization scheme inevitably adds some complexity, the
proposed scheme for enhancing mapping efficiency comes with some of
its own overhead. The gain depends on the details of specific EID
blocks, i.e., how frequently the situations (such as an ETR that has
a bigger EID block with a few holes) arise.
There are two main points in the critique that are addressed here:
(1) The gain depends on the details of specific EID blocks, i.e., how
frequently the situations arise such as an ETR having a bigger EID
block with a few holes, and (2) Approach 2 is lacking an added
feature of conveying just the mask-length of the more specifics that
exist as part of the current map response.
Regarding comment (1) above, there are multiple ways in which
allocations can end up with holes in them. An example of one of
these possibilities is as
follows. Org-A has historically received multiple /20s, /22s, and
/24s over the course of time that are adjacent to each other. At the
present time, these prefixes would all aggregate to a /16 but for the
fact that just a few of the underlying /24s have been allocated
elsewhere historically to other organizations by an RIR or ISPs. An
example of a second possibility is that Org-A has an allocation of a
/16. It has suballocated a /22 to one of its subsidiaries, and
subsequently sold the subsidiary to another Org-B. For ease of
keeping the /22 subnet up and running without service disruption, the
/22 subprefix is allowed to be transferred in the acquisition
process. Now the /22 subprefix originates from a different AS and is
serviced by a different ETR (as compared to the parent /16 prefix).
We are in the process of performing an analysis of RIR allocation
data and are aware of other studies (notably at UCLA) that are also
performing similar analysis to quantify the frequency of occurrence
of the holes. We feel that the problem that has been addressed is a
realistic one, and the proposed scheme would help reduce the
overheads associated with the mapping distribution system.
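The overhead that such holes impose without aggregation can be
quantified with Python's ipaddress module; the prefixes below are
invented for illustration. Excluding just two /24 holes from a /16
already leaves 14 more-specific announcements, versus one aggregate
map with two exceptions.

```python
import ipaddress

# Org-A's would-be aggregate, and the /24 "holes" allocated elsewhere
# (hypothetical prefixes for illustration).
parent = ipaddress.ip_network("192.0.0.0/16")
holes = [ipaddress.ip_network("192.0.42.0/24"),
         ipaddress.ip_network("192.0.200.0/24")]

# Without aggregate-plus-exceptions maps, Org-A must announce the
# parent minus the holes as many more-specific prefixes:
remainder = [parent]
for hole in holes:
    remainder = [n for pfx in remainder
                 for n in (pfx.address_exclude(hole)
                           if hole.subnet_of(pfx) else [pfx])]

# Two holes force fourteen more-specific prefixes in place of one /16:
# len(remainder) == 14
```

With the aggregate-plus-exceptions scheme, the same reachability is
conveyed by one map for the /16 and two exception entries.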
Regarding comment (2) above, the suggested modification to Approach 2
would definitely be beneficial. In fact, we feel that it would be
fairly straightforward to dynamically use Approach 1 or Approach 2
(with the suggested modification), depending on whether there are
only a few (e.g., <=5) or many (e.g., >5) more specifics,
respectively. The suggested modification of notifying the mask-
length of the more specifics in the map response is indeed very
helpful because then the ITR would not have to resend a map-query for
EID addresses that match the EID address in the previous query up to
at least mask-length bit positions. There can be a two-bit field in
the map response that would be interpreted as follows.
(a) value 00: there are no more specifics
(b) value 01: there are more specifics and their exact information
follows in additional map-responses
(c) value 10: there are more-specifics and the mask-length of the
next more-specific is indicated in the current map-response.
An additional field will be included that will be used to specify the
mask-length of the next more-specific in the case of value 10 (case
(c) above).
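A minimal sketch of how the two-bit field and the optional
mask-length field might be encoded follows; the one-byte-per-field
layout is an assumption made for illustration, not a wire format
defined by the proposal.

```python
# The three defined values of the proposed two-bit MS field:
NO_MORE_SPECIFICS = 0b00  # (a) there are no more specifics
DETAILS_FOLLOW    = 0b01  # (b) exact maps follow in additional responses
MASK_LEN_PRESENT  = 0b10  # (c) mask-length of next more-specific included

def encode(ms_flag, mask_len=None):
    """Pack the flag and, for case (c), the mask-length into two bytes."""
    if ms_flag == MASK_LEN_PRESENT:
        assert mask_len is not None, "case 10 requires a mask-length"
        return bytes([ms_flag, mask_len])
    return bytes([ms_flag, 0])

def decode(data):
    """Return (ms_flag, mask_len); mask_len is None except in case (c)."""
    ms_flag, mask_len = data[0], data[1]
    return (ms_flag, mask_len if ms_flag == MASK_LEN_PRESENT else None)
```

With case (c), an ITR that already holds (a/20, ETR1, mask-length 24)
knows it need not re-query for addresses matching the previous query
in at least the first 24 bit positions.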
As the Internet continues its rapid growth, router memory size and
CPU cycle requirements are outpacing feasible hardware upgrade
schedules. We propose to solve this problem by applying aggregation
with increasing scopes to gradually evolve the routing system towards
a scalable structure. At each evolutionary step, our solution is
able to interoperate with the existing system and provide immediate
benefits to adopters to enable deployment. This document summarizes
the need for an evolutionary design, the relationship between our
proposal and other revolutionary proposals, and the steps of
aggregation with increasing scopes. Our detailed proposal can be
found in [Evolution].
14.1.1. Need for Evolution
Multiple different views exist regarding the routing scalability
problem. Networks differ vastly in goals, behavior, and resources,
giving each a different view of the severity and imminence of the
scalability problem. Therefore, we believe that, for any solution to
be adopted, it will start with one or a few early adopters and may
not ever reach the entire Internet. The evolutionary approach
recognizes that changes to the Internet can only be a gradual process
with multiple stages. At each stage, adopters are driven by and
rewarded with solving an immediate problem. Each solution must be
deployable by individual networks who deem it necessary at a time
they deem it necessary, without requiring coordination from other
networks, and the solution has to bring immediate relief to a single
first-mover.
14.1.2. Relation to Other RRG Proposals
Most proposals take a revolutionary approach that expects the entire
Internet to eventually move to some new design whose main benefits
would not materialize until the vast majority of the system has been
upgraded; their incremental deployment plan simply ensures
interoperation between upgraded and legacy parts of the system. In
contrast, the evolutionary approach depicts a system where changes
may happen here and there as needed, but there is no dependency on
the system as a whole making a change. Whoever takes a step forward
gains the benefit of solving their own problem, without depending on
others to take action. Thus, deployability includes not only
interoperability, but also the alignment of costs and gains.
The main differences between our approach and more revolutionary map-
and-encap proposals are: (a) we do not start with a pre-defined
boundary between edge and core; and (b) each step brings immediate
benefits to individual first-movers. Note that our proposal neither
interferes nor prevents any revolutionary host-based solutions such
as ILNP from being rolled out. However, host-based solutions do not
bring useful impact until a large portion of hosts have been
upgraded. Thus, even if a host-based solution is rolled out in the
long run, an evolutionary solution is still needed for the near term.
14.1.3. Aggregation with Increasing Scopes
Aggregating many routing entries to a fewer number is a basic
approach to improving routing scalability. Aggregation can take
different forms and be done within different scopes. In our design,
the aggregation scope starts from a single router, then expands to a
single network and neighbor networks. The order of the following
steps is not fixed but is merely a suggestion; it is at each
individual network's discretion which steps they choose to take based
on their evaluation of the severity of the problems and the
affordability of the solutions.
1. FIB Aggregation (FA) in a single router. A router
algorithmically aggregates its FIB entries without changing its
RIB or its routing announcements. No coordination among routers
is needed, nor any change to existing protocols. This brings
scalability relief to individual routers with only a software
upgrade.
2. Enabling 'best external' on Provider Edge routers (PEs),
Autonomous System Border Routers (ASBRs), and Route Reflectors
(RRs), and turning on next-hop-self on RRs. For hierarchical
networks, the RRs in each Point of Presence (PoP) can serve as a
default gateway for nodes in the PoP, thus allowing the non-RR
nodes in each PoP to maintain smaller routing tables that only
include paths that egress that PoP. This is known as 'topology-
based mode' Virtual Aggregation, and can be done with existing
hardware and configuration changes only. Please see
[Evolution_Grow_Presentation] for details.
3. Virtual Aggregation (VA) in a single network. Within an AS, some
fraction of existing routers are designated as Aggregation Point
Routers (APRs). These routers, either individually or collectively,
maintain the full FIB table. Other routers may
suppress entries from their FIBs, instead forwarding packets to
APRs, which will then tunnel the packets to the correct egress
routers. VA can be viewed as an intra-domain map-and-encap
system to provide the operators with a control mechanism for the
FIB size in their routers.
4. VA across neighbor networks. When adjacent networks have VA
deployed, they can go one step further by piggybacking egress
router information on existing BGP announcements, so that packets
can be tunneled directly to a neighbor network's egress router.
This improves packet delivery performance by performing the
encapsulation/decapsulation only once across these neighbor
networks, as well as reducing the stretch of the path.
5. Reducing RIB Size by separating the control plane from the data
plane. Although a router's FIB can be reduced by FA or VA, it
usually still needs to maintain the full RIB to produce complete
routing announcements to its neighbors. To reduce the RIB size,
a network can set up special boxes, which we call controllers, to
take over the External BGP (eBGP) sessions from border routers.
The controllers receive eBGP announcements, make routing
decisions, and then inform other routers in the same network of
how to forward packets, while the regular routers just focus on
the job of forwarding packets. The controllers, not being part
of the data path, can be scaled using commodity hardware.
6. Insulating forwarding routers from routing churn. For routers
with a smaller RIB, the rate of routing churn is naturally
reduced. Further reduction can be achieved by not announcing
failures of customer prefixes into the core, but handling these
failures in a data-driven fashion, e.g., a link failure to an
edge network is not reported unless and until there are data
packets that are heading towards the failed link.
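Step 1 above (FIB Aggregation) can be sketched as follows. This is a
simplified merge of same-next-hop prefixes over hypothetical FIB
entries, not a full FA algorithm; real FA schemes also exploit
covering prefixes and unallocated space at the higher levels.

```python
import ipaddress

# FIB entries: prefix -> next hop. Adjacent prefixes that share a next
# hop can be collapsed without changing forwarding behavior.
fib = {
    ipaddress.ip_network("10.2.0.0/24"): "nh-A",
    ipaddress.ip_network("10.2.1.0/24"): "nh-A",
    ipaddress.ip_network("10.2.2.0/24"): "nh-B",
    ipaddress.ip_network("10.2.3.0/24"): "nh-A",
}

def aggregate_fib(fib):
    """Group prefixes by next hop and collapse each group; the RIB and
    routing announcements are untouched, as in FA."""
    by_nh = {}
    for prefix, nh in fib.items():
        by_nh.setdefault(nh, []).append(prefix)
    out = {}
    for nh, prefixes in by_nh.items():
        for agg in ipaddress.collapse_addresses(prefixes):
            out[agg] = nh
    return out
```

Here 10.2.0.0/24 and 10.2.1.0/24 collapse into 10.2.0.0/23, shrinking
the FIB from four entries to three while every address still resolves
to the same next hop.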
All of the RRG proposals that scale the routing architecture share
one fundamental approach, route aggregation, in different forms,
e.g., LISP removes "edge prefixes" using encapsulation at ITRs, and
ILNP achieves the goal by locator rewrite. In this evolutionary path
proposal, each stage of the evolution applies aggregation with
increasing scopes to solve a specific scalability problem, and
eventually the path leads towards global routing scalability. For
example, it uses FIB aggregation at the single router level, virtual
aggregation at the network level, and then between neighboring
networks at the inter-domain level.
Compared to other proposals, this proposal has the lowest hurdle to
deployment, because it does not require that all networks move to use
a global mapping system or upgrade all hosts, and it is designed for
each individual network to get immediate benefits after its own
deployment.
Criticisms of this proposal fall into two types. The first type
concerns several potential issues in the technical design, as listed
below:
1. FIB aggregation, at level-3 and level-4, may introduce extra
routable space. Concerns have been raised about the potential
routing loops resulting from forwarding otherwise non-routable
packets, and the potential impact on Reverse Path Forwarding
(RPF) checking. These concerns can be addressed by choosing a
lower level of aggregation and by adding null routes to minimize
the extra space, at the cost of reduced aggregation gain.
2. Virtual Aggregation changes the traffic paths in an ISP network,
thereby introducing stretch. Changing the traffic path may also
impact the reverse path checking practice used to filter out
packets from spoofed sources. More analysis is needed to identify
the potential side-effects of VA and to address these issues.
3. The current Virtual Aggregation description is difficult to
understand, due to its multiple options for encapsulation and
popular prefix configurations, which makes the mechanism look
overly complicated. More thought is needed to simplify the
design and description.
4. FIB Aggregation and Virtual Aggregation may require additional
operational cost. There may be new design trade-offs that the
operators need to understand in order to select the best option
for their networks. More analysis is needed to identify and
quantify all potential operational costs.
5. In contrast to a number of other proposals, this solution does
not provide mobility support. It remains an open question as to
whether the routing system should handle mobility.
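The extra-routable-space concern in item 1 can be made concrete with
a small sketch (hypothetical prefixes): covering two non-adjacent
same-next-hop /24s with a /22 makes two additional /24s routable, and
null routes for exactly that space restore the original behavior at
the cost of reduced aggregation gain.

```python
import ipaddress

# Two same-next-hop /24s that are not adjacent; an aggressive
# (level-3/4-style) aggregation step may cover them with one /22,
# which also covers two /24s the router had no route for.
covered = [ipaddress.ip_network("10.3.0.0/24"),
           ipaddress.ip_network("10.3.3.0/24")]
aggregate = ipaddress.ip_network("10.3.0.0/22")

# The newly routable ("extra") space is the aggregate minus the
# originally covered prefixes:
extra = [aggregate]
for pfx in covered:
    extra = [n for cur in extra
             for n in (cur.address_exclude(pfx)
                       if pfx.subnet_of(cur) else [cur])]

# Installing null routes for each prefix in 'extra' suppresses the
# extra space, restoring original forwarding behavior.
```

In this example the extra space is 10.3.1.0/24 and 10.3.2.0/24, so
two null routes are needed, partially offsetting the two entries
saved by the aggregation.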
The second criticism is whether deploying quick fixes like FIB
aggregation would alleviate scalability problems in the short term
and reduce the incentives for deploying a new architecture; and
whether an evolutionary approach would end up adding more and more
patches to the old architecture, and not lead to a fundamentally
new architecture as the proposal had expected. Though this solution
may get rolled out more easily and quickly, a new architecture, if/
once deployed, could solve more problems with cleaner solutions.
No rebuttal was submitted for this proposal.