4.3. VPN Tunneling
VPN solutions use tunneling in order to transport VPN packets across
the VPN backbone, from one VPN edge device to another. There are
different types of tunneling protocols, different ways of
establishing and maintaining tunnels, and different ways to associate
tunnels with VPNs (e.g., shared versus dedicated per-VPN tunnels).
Sections 4.3.1 through 4.3.5 discuss some common characteristics
shared by all forms of tunneling, and some common problems to which
tunnels provide a solution. Section 4.3.6 provides a survey of
available tunneling techniques. Note that tunneling protocol issues
are generally independent of the mechanisms used for VPN membership
and VPN routing.
One motivation for the use of tunneling is that the packet addressing
used in a VPN may have no relation to the packet addressing used
between the VPN edge devices. For example the customer VPN traffic
could use non-unique or private IP addressing [RFC1918]. Also an
IPv6 VPN could be implemented across an IPv4 provider backbone. As
such the packet forwarding between the VPN edge devices must use
information other than that contained in the VPN packets themselves.
A tunneling protocol adds additional information, such as an extra
header or label, to a VPN packet, and this additional information is
then used for forwarding the packet between the VPN edge devices.
Another capability optionally provided by tunneling is that of
isolation between different VPN traffic flows. The QoS and security
requirements for these traffic flows may differ, and can be met by
using different tunnels with the appropriate characteristics. This
allows a provider to offer different service characteristics for
traffic in different VPNs, or to subsets of traffic flows within a
VPN.
The specific tunneling protocols considered in this section are GRE,
IP-in-IP, IPsec, and MPLS, as these are the most suitable for
carrying VPN traffic across the VPN backbone. Other tunneling
protocols, such as L2TP [RFC2661], may be used as access tunnels,
carrying traffic between a PE and a CE. As backbone tunneling is
independent of and orthogonal to access tunneling, protocols for the
latter are not discussed here.
4.3.1. Tunnel Encapsulations
All tunneling protocols use an encapsulation that adds additional
information to the encapsulated packet; this information is used for
forwarding across the VPN backbone. Examples are provided in section
One characteristic of a tunneling protocol is whether per-tunnel
state is needed in the SP network in order to forward the
encapsulated packets. For IP tunneling schemes (GRE, IP-in-IP, and
IPsec) per-tunnel state is completely confined to the VPN edge
devices. Other routers are unaware of the tunnels, and forward
according to the IP header. For MPLS, per-tunnel state is needed,
since the top label in the label stack must be examined and swapped
by intermediate LSRs. The amount of state required can be minimized
by hierarchical multiplexing, and by use of multi-point to point
tunnels, as discussed below.
Another characteristic is the tunneling overhead introduced. With
IPsec the overhead may be considerable as it may include, for
example, an ESP header, ESP trailer and an additional IP header. The
other mechanisms listed use less overhead, with MPLS being the most
lightweight. The overhead inherent in any tunneling mechanism may
result in additional IP packet fragmentation, if the resulting packet
is too large to be carried by the underlying link layer. As such it
is important to report any reduced MTU sizes via mechanisms such as
path MTU discovery in order to avoid fragmentation wherever possible.
Yet another characteristic is something we might call "transparency
to the Internet". IP-based encapsulation can be used to carry
a packet anywhere in the Internet. MPLS encapsulation can only be
used to carry a packet on IP networks that support MPLS. If an
MPLS-encapsulated packet must cross the networks of multiple SPs, the
adjacent SPs must have bilateral agreements to accept MPLS packets from
each other. If only a portion of the path across the backbone lacks
MPLS support, then an MPLS-in-IP encapsulation can be used to move
the MPLS packets across that part of the backbone. However, this
does add complexity. On the other hand, MPLS has efficiency
advantages, particularly in environments where encapsulations may
need to be nested.
Transparency to the Internet is sometimes a requirement, but
sometimes not. This depends on the sort of service which an SP is
offering to its customer.
4.3.2. Tunnel Multiplexing
When a tunneled packet arrives at the tunnel egress, it must be
possible to infer the packet's VPN from its encapsulation header. In
MPLS encapsulations, this must be inferred from the packet's label
stack. In IP-based encapsulations, this can be inferred from some
combination of the IP source address, the IP destination address, and
a "multiplexing field" in the encapsulation header. The multiplexing
field might be one which was explicitly designed for multiplexing, or
one that wasn't originally designed for this but can be pushed into
service as a multiplexing field. For example:
o GRE: Packets associated to VPN by source IP address, destination IP
address, and Key field, although the key field was originally
intended for authentication.
o IP-in-IP: Packets associated to VPN by IP destination address in
o IPsec: Packets associated to VPN by IP source address, IP
destination address, and SPI field.
o MPLS: Packets associated to VPN by label stack.
Note that IP-in-IP tunneling does not have a real multiplexing field,
so a different IP destination address must be used for every VPN
supported by a given PE. In the other IP-based encapsulations, a
given PE need have only a single IP address, and the multiplexing
field is used to distinguish the different VPNs supported by a PE.
Thus the IP-in-IP solution has the significant disadvantage that it
requires the allocation and assignment of a potentially large number
of IP addresses, all of which have to be reachable via backbone
routing.
In the following, we will use the term "multiplexing field" to refer
to whichever field in the encapsulation header is used to
distinguish different VPNs at a given PE. In the IP-in-IP
encapsulation, this is the destination IP address field, in the other
encapsulations it is a true multiplexing field.
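The egress lookup described in this section can be sketched as a table keyed on the fields each encapsulation uses. This is a minimal illustration, not a prescribed implementation: the names demux_key and lookup_vpn, the VPN names, and the addresses are hypothetical, and for MPLS the label stack is reduced to a single label value for brevity.

```python
from typing import Dict, Optional, Tuple

# (encapsulation, outer source, outer destination, multiplexing value)
DemuxKey = Tuple[str, Optional[str], Optional[str], Optional[int]]


def demux_key(encap: str, src: str, dst: str,
              mux: Optional[int] = None) -> DemuxKey:
    """Reduce a received encapsulation header to the fields that identify the VPN."""
    if encap in ("gre", "ipsec"):
        # GRE: source, destination, Key field. IPsec: source, destination, SPI.
        return (encap, src, dst, mux)
    if encap == "ipinip":
        # No true multiplexing field: only the outer destination address is used.
        return (encap, None, dst, None)
    if encap == "mpls":
        # The VPN is inferred from the label stack (one label value here).
        return (encap, None, None, mux)
    raise ValueError(f"unknown encapsulation: {encap}")


def lookup_vpn(table: Dict[DemuxKey, str], encap: str, src: str, dst: str,
               mux: Optional[int] = None) -> Optional[str]:
    """Return the VPN for a tunneled packet, or None if nothing matches."""
    return table.get(demux_key(encap, src, dst, mux))


# Hypothetical state at one egress PE. With GRE a single PE address serves
# both VPNs; with IP-in-IP each VPN needs its own destination address.
table = {
    demux_key("gre", "192.0.2.1", "192.0.2.9", 100): "vpn-a",
    demux_key("gre", "192.0.2.1", "192.0.2.9", 200): "vpn-b",
    demux_key("ipinip", "192.0.2.1", "192.0.2.10"): "vpn-a",
    demux_key("ipinip", "192.0.2.1", "192.0.2.11"): "vpn-b",
}
```

The IP-in-IP entries show the address cost noted above: one reachable destination address per VPN per PE.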
4.3.3. Tunnel Establishment
When tunnels are established, the tunnel endpoints must agree on the
multiplexing field values which are to be used to indicate that
particular packets are in particular VPNs. The use of "well known"
or explicitly provisioned values would not scale well as the number
of VPNs increases. So it is necessary to have some sort of protocol
interaction in which the tunnel endpoints agree on the multiplexing
field values.
For some tunneling protocols, setting up a tunnel requires an
explicit exchange of signaling messages. Generally the multiplexing
field values would be agreed upon as part of this exchange. For
example, if an IPsec encapsulation is used, the SPI field plays the
role of the multiplexing field, and IKE signaling is used to
distribute the SPI values; if an MPLS encapsulation is used, LDP,
CR-LDP or RSVP-TE can be used to distribute the MPLS label value used
as the multiplexing field. Information about the identity of the VPN
with which the tunnel is to be associated needs to be exchanged as
part of the signaling protocol (e.g., a VPN-ID can be carried in the
signaling protocol). An advantage of this approach is that
per-tunnel security, QoS and other characteristics may also be
negotiable via the signaling protocol. A disadvantage is that the
signaling imposes overhead, which may then lead to scalability
considerations, discussed further below.
For some tunneling protocols, there is no explicit protocol
interaction that sets up the tunnel, and the multiplexing field
values must be exchanged in some other way. For example, for MPLS
tunnels, MPLS labels can be piggybacked on the protocols used to
distribute VPN routes or VPN membership information. GRE and
IP-in-IP have no associated signaling protocol, and thus by necessity
the multiplexing values are distributed via some other mechanism,
such as via configuration, control protocol, or piggybacked in some
manner on a VPN membership protocol.
The resources used by the different tunneling establishment
mechanisms may vary. With a full mesh VPN topology, and explicit
signaling, each VPN edge device has to establish a tunnel to all the
other VPN edge devices in each VPN. The resources needed for
this on a VPN edge device may be significant, and issues such as the
time needed to recover following a device failure may need to be
taken into account, as the time to recovery includes the time needed
to reestablish a large number of tunnels.
4.3.4. Scaling and Hierarchical Tunnels
If tunnels require state to be maintained in the core of the network,
it may not be feasible to set up per-VPN tunnels between all
devices that are adjacent in some VPN topology. This would violate
the principle that there is no per-VPN state in the core of the
network, and would make the core scale poorly as the number of VPNs
increases. For example, MPLS tunnels require that core network
devices maintain state for the topmost label in the label stack. If
every core router had to maintain one or more labels for every VPN,
scaling would be very poor.
There are also scaling considerations related to the use of explicit
signaling for tunnel establishment. Even if the tunneling protocol
does not maintain per tunnel state in the core, the number of tunnels
that a single VPN edge device needs to handle may be large, as this
grows according to the number of VPNs and the number of neighbors per
VPN. One way to reduce the number of tunnels in a network is to use
a VPN topology other than a full mesh. However this may not always
be desirable, and even with hub and spoke topologies the hub VPN
edge devices may still need to handle large numbers of tunnels.
If the core routers need to maintain any per-tunnel state at all,
scaling can be greatly improved by using hierarchical tunnels. One
tunnel can be established between each pair of VPN edge devices, and
multiple VPN-specific tunnels can then be carried through the single
"outer" tunnel. Now the amount of state is dependent only on the
number of VPN edge devices, not on the number of VPNs. Scaling can
be further improved by having the outer tunnels be
multipoint-to-point "merging" tunnels. Now the amount of state to be
maintained in the core is on the order of the number of VPN edge
devices, not on the order of the square of that number. That is, the
amount of tunnel state is roughly equivalent to the amount of state
needed to maintain IP routes to the VPN edge devices. This is almost
(if not quite) as good as using tunnels which do not require any
state to be maintained in the core.
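The scaling argument above can be made concrete with a small worked calculation. This is only a sketch under stated assumptions, none of which come from the text: every VPN is present on every VPN edge device, tunnels are unidirectional and fully meshed, and the count is the worst case for a core router that lies on the path of every tunnel. The function name core_tunnel_state is hypothetical.

```python
def core_tunnel_state(num_edge: int, num_vpn: int, scheme: str) -> int:
    """Worst-case number of tunnel state entries held by one core router."""
    ordered_pairs = num_edge * (num_edge - 1)  # one tunnel per direction
    if scheme == "per_vpn":
        # A separate tunnel per VPN between every pair of edge devices.
        return num_vpn * ordered_pairs
    if scheme == "hierarchical":
        # One outer tunnel per pair; per-VPN tunnels are invisible to the core.
        return ordered_pairs
    if scheme == "merged":
        # Multipoint-to-point outer tunnels: one entry per egress edge device.
        return num_edge
    raise ValueError(f"unknown scheme: {scheme}")
```

With 100 edge devices and 50 VPNs, these assumptions give 495000 entries for per-VPN tunnels, 9900 for hierarchical point-to-point tunnels, and 100 for merging tunnels.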
Using hierarchical tunnels may also reduce the amount of state to be
maintained in the VPN edge devices, particularly if maintaining the
outer tunnels requires more state than maintaining the per-VPN
tunnels that run inside the outer tunnels.
There are other factors relevant to determining the number of VPN
edge to VPN edge "outer" tunnels to use. While using a single such
tunnel has the best scaling properties, using more than one may allow
different QoS capabilities or different security characteristics to
be used for different traffic flows (from the same or from different
VPNs).
When tunnels are used hierarchically, the tunnels in the hierarchy
may all be of the same type (e.g., an MPLS label stack) or they may
be of different types (e.g., a GRE tunnel carried inside an IPsec
tunnel).
One example using hierarchical tunnels is the establishment of a
number of different IPsec security associations, providing different
levels of security between a given pair of VPN edge devices. Per-VPN
GRE tunnels can then be grouped together and then carried over the
appropriate IPsec tunnel, rather than having a separate IPsec tunnel
per-VPN. Another example is the use of an MPLS label stack. A
single PE-PE LSP is used to carry all the per-VPN LSPs. The
mechanisms used for label establishment are typically different. The
PE-PE LSP could be established using LDP, as part of normal backbone
operation, with the per-VPN LSP labels established by piggybacking on
VPN routing (e.g., using BGP) discussed in sections 184.108.40.206 and 4.1.
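The MPLS label stack example can be illustrated with a toy forwarding model. It is a sketch only: packets are modeled as a (label stack, payload) pair with the top label first, the label values and table contents are made up, and penultimate hop popping is ignored.

```python
from typing import Dict, List, Tuple

Packet = Tuple[List[int], str]  # (label stack, top label first; payload)


def ingress_push(payload: str, vpn_label: int, tunnel_label: int) -> Packet:
    """Ingress PE: push the per-VPN label, then the PE-PE tunnel label."""
    return ([tunnel_label, vpn_label], payload)


def core_swap(packet: Packet, swap_table: Dict[int, int]) -> Packet:
    """Core LSR: examine and swap only the top label."""
    stack, payload = packet
    return ([swap_table[stack[0]]] + stack[1:], payload)


def egress_pop(packet: Packet, vpn_by_label: Dict[int, str]) -> Tuple[str, str]:
    """Egress PE: discard the outer label and map the inner label to a VPN."""
    stack, payload = packet
    return (vpn_by_label[stack[1]], payload)
```

In this model the core swap table needs one entry per PE-PE LSP no matter how many per-VPN labels are carried beneath it, which is the scaling property described in this section.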
4.3.5. Tunnel Maintenance
Once a tunnel is established it is necessary to know that the tunnel
is operational. Mechanisms are needed to detect tunnel failures, and
to respond appropriately to restore service.
There is a potential issue regarding propagation of failures when
multiple tunnels are multiplexed hierarchically. Suppose that
multiple VPN-specific tunnels are multiplexed inside a single PE to
PE tunnel. In this case, suppose that routing for the VPN is done
over the VPN-specific tunnels (as may be the case for CE-based and VR
approaches). Suppose that the PE to PE tunnel fails. In this case
multiple VPN-specific tunnels may fail, and layer 3 routing may
simultaneously respond for each VPN using the failed tunnel. If the
PE to PE tunnel is subsequently restored, there may then be multiple
VPN-specific tunnels and multiple routing protocol instances which
also need to recover. Each of these could potentially require some
exchange of control traffic.
When a tunnel fails, if the tunnel can be restored quickly, it might
therefore be preferable to restore the tunnel without any response by
higher levels (such as other tunnels which were multiplexed inside the
failed tunnel). Having higher levels delay their response to a lower
level tunnel failure may limit the amount of control traffic
needed to completely restore correct service. However, if the failed
tunnel cannot be quickly restored, then it is necessary for the
tunnels or routing instances multiplexed over the failed tunnel to
respond, and preferable for them to respond quickly and without
explicit action by network operators.
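One way to picture the delayed response described above is a hold-down check applied by the higher level. This is a hypothetical sketch: the function name, the use of seconds, and the idea of a single fixed hold-down interval are assumptions for illustration, not a mechanism defined here.

```python
def upper_level_should_react(failed_at: float, now: float,
                             restored: bool, hold_down: float) -> bool:
    """Decide whether tunnels or routing multiplexed over a failed
    lower-level tunnel should start their own recovery.

    failed_at, now, and hold_down are in seconds.
    """
    if restored:
        # The lower level recovered in time; no control traffic is needed.
        return False
    # React automatically once the hold-down interval has expired.
    return (now - failed_at) >= hold_down
```

A short hold-down favors fast recovery when the lower-level tunnel stays down, while a longer one suppresses more control traffic when it comes back quickly.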
With most layer 3 provider-provisioned CE-based VPNs and the VR
scheme, a per-VPN instance of routing is running over the tunnel,
thus any loss of connectivity between the tunnel endpoints will be
detected by the VPN routing instance. This allows rapid detection of
tunnel failure. Careful adjustment of timers might be needed to
avoid failure propagation as discussed above. With the
aggregated routing scheme, there isn't a per-VPN instance of routing
running over the tunnel, and therefore some other scheme to detect
loss of connectivity is needed in the event that the tunnel cannot be
quickly restored.
Failure of connectivity in a tunnel can be very difficult to detect
reliably. Among the mechanisms that can be used to detect failure
are loss of the underlying connectivity to the remote endpoint (as
indicated, e.g., by "no IP route to host" or no MPLS label), timeout
of higher layer "hello" mechanisms (e.g., IGP hellos, when the tunnel
is an adjacency in some IGP), and timeout of keep alive mechanisms in
the tunnel establishment protocols (if any). However, none of these
techniques provides completely reliable detection of all failure
modes. Additional monitoring techniques may also be necessary.
With hierarchical tunnels it may suffice to only monitor the
outermost tunnel for loss of connectivity. However there may be
failure modes in a device where the outermost tunnel is up but one of
the inner tunnels is down.
4.3.6. Survey of Tunneling Techniques
Tunneling mechanisms provide isolated communication between two CE-PE
devices. Available tunneling mechanisms include (but are not limited
to): GRE [RFC2784] [RFC2890], IP-in-IP encapsulation [RFC2003]
[RFC2473], IPsec [RFC2401] [RFC2402], and MPLS [RFC3031] [RFC3035].
Note that the following subsections address tunnel overhead to
clarify the risk of fragmentation. Some SP networks contain layer 2
switches that enforce the standard/default MTU of 1500 bytes. In
this case, any encapsulation whatsoever creates a significant risk of
fragmentation. However, layer 2 switch vendors are in general aware
of IP tunneling as well as stacked VLAN overhead, thus many switches
practically allow an MTU of approximately 1512 bytes now. In this
case, up to 12 bytes of encapsulation can be used before there is any
risk of fragmentation. Furthermore, to improve TCP and NFS
performance, switches that support 9K bytes "jumbo frames" are also
on the market. In this case, there is no risk of fragmentation.
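The three MTU cases above can be checked with simple arithmetic. The sketch below assumes a full-size 1500-byte customer packet and uses overhead figures quoted in this section (four bytes per MPLS label, 20 bytes for IP-in-IP over IPv4, at least 28 bytes for IPsec tunnel mode). The names are hypothetical.

```python
# Minimum per-packet overhead in bytes, taken from the figures in this section.
OVERHEAD = {
    "mpls_two_labels": 2 * 4,
    "ipinip_ipv4": 20,
    "ipsec_tunnel_min": 28,
}


def fits_without_fragmentation(payload_bytes: int, overhead_bytes: int,
                               link_mtu: int) -> bool:
    """True if the encapsulated packet can cross the link unfragmented."""
    return payload_bytes + overhead_bytes <= link_mtu


def survivors(payload_bytes: int, link_mtu: int) -> list:
    """Encapsulations that carry the payload over the link without fragmenting."""
    return sorted(name for name, extra in OVERHEAD.items()
                  if fits_without_fragmentation(payload_bytes, extra, link_mtu))
```

Under these assumptions a 1500-byte packet survives no encapsulation at an MTU of 1500, only the two-label MPLS stack at 1512, and all three at 9000.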
4.3.6.1. GRE [RFC2784] [RFC2890]
Generic Routing Encapsulation (GRE) specifies a protocol for
encapsulating an arbitrary payload protocol over an arbitrary
delivery protocol [RFC2784]. In particular, it can be used where
both the payload and the delivery protocol are IP as is the case in
layer 3 VPNs. A GRE tunnel is a tunnel whose packets are
encapsulated by GRE.
The GRE specification [RFC2784] does not explicitly support
multiplexing. But the key field extension to GRE is specified in
[RFC2890] and it may be used as a multiplexing field.
GRE itself does not have intrinsic QoS/SLA capabilities, but it
inherits whatever capabilities exist in the delivery protocol (IP).
Additional mechanisms, such as Diffserv or RSVP extensions
[RFC2746], can be applied.
o Tunnel setup and maintenance
There is no standard signaling protocol for setting up and
maintaining GRE tunnels.
o Large MTUs and minimization of tunnel overhead
When GRE encapsulation is used, the resulting packet consists of a
delivery protocol header, followed by a GRE header, followed by the
payload packet. When the delivery protocol is IPv4, and if the key
field is not present, GRE encapsulation adds at least 28 bytes of
overhead (36 bytes if key field extension is used.)
GRE encapsulation does not provide any significant security. The
optional key field can be used as a clear text password to aid in
the detection of misconfigurations, but it does not provide
integrity or authentication. An SP network which supports VPNs
must do extensive IP address filtering at its borders to prevent
spoofed packets from penetrating the VPNs. If multi-provider VPNs
are being supported, it may be difficult to set up these filters.
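The use of the key field as a multiplexing field can be sketched by building and parsing a GRE header. The layout follows [RFC2784] and [RFC2890] (a 16-bit flags and version word, a 16-bit protocol type, then optional 32-bit fields); the function names are hypothetical, and the sketch ignores the sequence number extension.

```python
import struct
from typing import Optional

GRE_CHECKSUM_PRESENT = 0x8000  # C bit
GRE_KEY_PRESENT = 0x2000       # K bit
PROTOCOL_TYPE_IPV4 = 0x0800    # payload is an IPv4 packet


def build_gre_header(key: int) -> bytes:
    """GRE header with only the Key field present (version 0)."""
    return struct.pack("!HHI", GRE_KEY_PRESENT, PROTOCOL_TYPE_IPV4, key)


def parse_gre_key(header: bytes) -> Optional[int]:
    """Return the Key field, or None if the K bit is not set."""
    flags, _protocol = struct.unpack("!HH", header[:4])
    if not flags & GRE_KEY_PRESENT:
        return None
    offset = 4
    if flags & GRE_CHECKSUM_PRESENT:
        offset += 4  # skip Checksum and Reserved1, which precede the Key
    (key,) = struct.unpack("!I", header[offset:offset + 4])
    return key
```

An egress PE could use the value returned by parse_gre_key, together with the outer source and destination addresses, to select the VPN.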
4.3.6.2. IP-in-IP Encapsulation [RFC2003] [RFC2473]
IP-in-IP specifies the format and procedures for IP-in-IP
encapsulation. This allows an IP datagram to be encapsulated within
another IP datagram. That is, the resulting packet consists of an
outer IP header, followed immediately by the payload packet. There
is no intermediate header as in GRE. [RFC2003] and [RFC2473] specify
IPv4 and IPv6 encapsulations respectively. Once the encapsulated
datagram arrives at the intermediate destination (as specified in the
outer IP header), it is decapsulated, yielding the original IP
datagram, which is then delivered to the destination indicated by the
original destination address field.
The IP-in-IP specifications don't explicitly support multiplexing.
But if a different IP address is used for every VPN then the IP
address field can be used for this purpose. (See section 4.3.2 for
details.)
IP-in-IP itself does not have intrinsic QoS/SLA capabilities, but
of course it inherits whatever capabilities exist for IP.
Additional mechanisms, such as RSVP extensions [RFC2746] or
DiffServ extensions [RFC2983], may be used with it.
o Tunnel setup and maintenance
There is no standard setup and maintenance protocol for IP-in-IP.
o Large MTUs and minimization of tunnel overhead
When the delivery protocol is IPv4, IP-in-IP adds at least 20 bytes
of overhead.
IP-in-IP encapsulation does not provide any significant security.
An SP network which supports VPNs must do extensive IP address
filtering at its borders to prevent spoofed packets from
penetrating the VPNs. If multi-provider VPNs are being supported,
it may be difficult to set up these filters.
4.3.6.3. IPsec [RFC2401] [RFC2402] [RFC2406] [RFC2409]
IP Security (IPsec) provides security services at the IP layer
[RFC2401]. It comprises authentication header (AH) protocol
[RFC2402], encapsulating security payload (ESP) protocol [RFC2406],
and Internet key exchange (IKE) protocol [RFC2409]. AH protocol
provides data integrity, data origin authentication, and an
anti-replay service. ESP protocol provides data confidentiality and
limited traffic flow confidentiality. It may also provide data
integrity, data origin authentication, and an anti-replay service.
AH and ESP may be used in combination.
IPsec may be employed in either transport or tunnel mode. In
transport mode, either an AH or ESP header is inserted immediately
after the payload packet's IP header. In tunnel mode, an IP packet
is encapsulated with an outer IP packet header. Either an AH or ESP
header is inserted between them. AH and ESP establish a
unidirectional secure communication path between two endpoints, which
is called a security association. In tunnel mode, a PE-PE tunnel (or a
CE-CE tunnel) consists of a pair of unidirectional security
associations. The IPsec and IKE protocols are used for setting up
IPsec tunnels.
The SPI field of AH and ESP is used to multiplex security
associations (or tunnels) between two peer devices.
IPsec itself does not have intrinsic QoS/SLA capabilities, but it
inherits whatever mechanisms exist for IP. Other mechanisms such
as "RSVP Extensions for IPsec Data Flows" [RFC2207] or DiffServ
extensions [RFC2983] may be used with it.
o Tunnel setup and maintenance
The IPsec and IKE protocols are used for the setup and maintenance
of tunnels.
o Large MTUs and minimization of tunnel overhead
IPsec transport mode adds minimal overhead, at least 8 bytes. IPsec
tunnel mode adds at least 28 bytes of overhead. In PE-based PPVPNs,
the processing
overhead of IPsec (due to its cryptography) may limit the PE's
performance, especially if privacy is being provided; this is not
generally an issue in CE-based PPVPNs.
When IPsec tunneling is used in conjunction with IPsec's
cryptographic capabilities, excellent authentication and integrity
functions can be provided. Privacy can also be optionally
provided.
4.3.6.4. MPLS [RFC3031] [RFC3032] [RFC3035]
Multiprotocol Label Switching (MPLS) is a method for forwarding
packets through a network. Routers at the edge of a network apply
simple labels to packets. A label may be inserted between the data
link and network headers, or may be carried in the data link header
(e.g., the VPI/VCI field in an ATM header). Routers in the network
switch packets according to the labels, with minimal lookup overhead.
A path, or a tunnel in the PPVPN, is called a "label switched path
(LSP)".
LSPs may be multiplexed within other LSPs.
MPLS does not have intrinsic QoS or SLA management mechanisms, but
bandwidth may be allocated to LSPs, and their routing may be
explicitly controlled. Additional techniques such as DiffServ and
DiffServ aware traffic engineering may be used with it [RFC3270]
[MPLS-DIFF-TE]. QoS capabilities from IP may be inherited.
o Tunnel setup and maintenance
LSPs are set up and maintained by LDP (Label Distribution
Protocol), RSVP (Resource Reservation Protocol) [RFC3209], or BGP.
o Large MTUs and minimization of tunnel overhead.
MPLS encapsulation adds four bytes per label. VPN-2547BIS's
[VPN-2547BIS] approach uses at least two labels for encapsulation
and adds minimal overhead.
MPLS packets may optionally be encapsulated in IP or GRE, for cases
where it is desirable to carry MPLS packets over an IP-only
infrastructure.
MPLS encapsulation does not provide any significant security. An
SP which is providing VPN service can refuse to accept MPLS packets
from outside its borders. This provides the same level of
assurance as would be obtained via IP address filtering when
IP-based encapsulations are used. If a VPN is jointly provided by
multiple SPs, care should be taken to ensure that a labeled packet
is accepted from a neighboring router in another SP only if its top
label is one which was actually distributed to that router.
MPLS is the only one of the encapsulation techniques that cannot be
guaranteed to run over any IP network. Hence it would not be
applicable when transparency to the Internet is a requirement.
If the VPN backbone consists of several cooperating SP networks
which support MPLS, then the adjacent networks may support MPLS at
their interconnects. If two cooperating SP networks which support
MPLS are separated by a third which does not support MPLS, then
MPLS-in-IP or MPLS-in-IPsec tunneling may be done between them.
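The "four bytes per label" figure follows from the label stack entry format of [RFC3032]: a 20-bit label, 3 experimental bits, a bottom-of-stack bit, and an 8-bit TTL. A minimal encoder, with a hypothetical function name and illustrative default values, might look like this:

```python
import struct
from typing import Sequence


def encode_label_stack(labels: Sequence[int], ttl: int = 64, exp: int = 0) -> bytes:
    """Encode labels (top of stack first) as 32-bit label stack entries."""
    out = b""
    for index, label in enumerate(labels):
        # The S bit is set only on the last (bottom) entry.
        bottom = 1 if index == len(labels) - 1 else 0
        entry = (label << 12) | (exp << 9) | (bottom << 8) | ttl
        out += struct.pack("!I", entry)
    return out
```

A two-label stack, as used when a per-VPN label is carried under a PE-PE tunnel label, therefore encodes to eight bytes.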
4.4. PE-PE Distribution of VPN Routing Information
In layer 3 PE-based VPNs, PE devices examine the IP headers of
packets they receive from the customer networks. Forwarding is based
on routing information received from the customer network. This
implies that the PE devices need to participate in some manner in
routing for the customer network. Section 3.3 discussed how routing
would be done in the customer network, including the customer
interface. In this section, we discuss ways in which the routing
information from a particular VPN may be passed, over the shared VPN
backbone, among the set of PEs attaching to that VPN.
The PEs need to distribute two types of routing information to each
other: (i) Public Routing: routing information which specifies how to
reach addresses on the VPN backbone (i.e., "public addresses"); call
this "public routing information"; (ii) VPN Routing: routing
information obtained from the CEs, which specifies how to reach
addresses ("private addresses") that are in the VPNs.
The way in which routing information in the first category is
distributed is outside the scope of this document; we discuss only
the distribution of routing information in the second category. Of
course, one of the requirements for distributing VPN routing
information is that it be kept separate and distinct from the public
information. Another requirement is that the distribution of VPN
routing information not destabilize or otherwise interfere with the
distribution of public routing information.
Similarly, distribution of VPN routing information associated with
one VPN should not destabilize or otherwise interfere with the
operation of other VPNs. These requirements are, for example,
relevant in the case that a private network might be suffering from
instability or other problems with its internal routing, which might
be propagated to the VPN used to support that private network.
Note that this issue does not arise in CE-based VPNs, as in CE-based
VPNs, the PE devices do not see packets from the VPN until after the
packets have been encapsulated in an outer header that has only
public addresses.
4.4.1. Options for VPN Routing in the SP
The following technologies can be used for exchanging VPN routing
information discussed in sections 188.8.131.52 and 4.1.
o Static routing
o RIP [RFC2453]
o OSPF [RFC2328]
o BGP-4 [RFC1771]
4.4.2. VPN Forwarding Instances (VFIs)
In layer 3 PE-based VPNs, the PE devices receive unencapsulated IP
packets from the CE devices, and the PE devices use the IP
destination addresses in these packets to help make their forwarding
decisions. In order to do this properly, the PE devices must obtain
routing information from the customer networks. This implies that
the PE device participates in some manner in the customer network's
routing.
In layer 3 PE-based VPNs, a single PE device may be connected to
several CE devices that are in the same VPN, and it may also be
connected to CE devices of different VPNs. The route which the PE
chooses for a
given IP destination address in a given packet will depend on the VPN
from which the packet was received. A PE device must therefore have
a separate forwarding table for each VPN to which it is attached. We
refer to these forwarding tables as "VPN Forwarding Instances"
(VFIs), as defined in section 2.1.
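The need for one forwarding table per VPN can be illustrated with a small model in which two VPNs use the same private prefix. This is a sketch under assumptions: the class name, the next-hop strings, and the linear longest-prefix search are illustrative choices, not a description of any particular PE implementation.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]  # (prefix, next hop)


class ProviderEdge:
    """Toy PE holding one forwarding table (VFI) per attached VPN."""

    def __init__(self) -> None:
        self.vfis: Dict[str, List[Route]] = {}

    def add_route(self, vpn: str, prefix: str, next_hop: str) -> None:
        self.vfis.setdefault(vpn, []).append(
            (ipaddress.ip_network(prefix), next_hop))

    def lookup(self, vpn: str, destination: str) -> Optional[str]:
        """Longest-prefix match within the VFI of the VPN the packet came from."""
        address = ipaddress.ip_address(destination)
        matches = [(net, hop) for net, hop in self.vfis.get(vpn, [])
                   if address in net]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0].prefixlen)[1]


pe = ProviderEdge()
pe.add_route("vpn-a", "10.1.0.0/16", "local-ce-a1")
pe.add_route("vpn-a", "10.1.2.0/24", "local-ce-a2")
pe.add_route("vpn-b", "10.1.0.0/16", "tunnel-to-remote-pe")
```

The same destination address, 10.1.2.5, resolves to different next hops depending on which VPN the packet arrived from, which a single shared table could not express.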
A VFI contains routes to locally attached VPN sites, as well as
routes to remote VPN sites. Section 4.4 discusses the way in which
routes to remote sites are obtained.
Routes to local sites may be obtained in several ways. One way is to
explicitly configure static routes into the VFI. This can be useful
in simple deployments, but it requires that one or more devices in
the customer's network be configured with static routes (perhaps just
a default route), so that traffic will be directed from the site to
the PE device.
Another way is to have the PE device be a routing peer of the CE
device, in a routing algorithm such as RIP, OSPF, or BGP. Depending
on the deployment scenario, the PE might need to advertise a large
number of routes to each CE (e.g., all the routes which the PE
obtained from remote sites in the CE's VPN), or it might just need to
advertise a single default route to the CE.
A PE device uses some resources in proportion to the number of VFIs
that it has, particularly if a distinct dynamic routing protocol
instance is associated with each VFI. A PE device also uses some
resources in proportion to the total number of routes it supports,
where the total number of routes includes all the routes in all its
VFIs, and all the public routes. These scaling factors will limit
the number of VPNs which a single PE device can support.
When dynamic routing is used between a PE and a CE, it is not
necessarily the case that each VFI is associated with a single
routing protocol instance. A single routing protocol instance may
provide routing information for multiple VFIs, and/or multiple
routing protocol instances might provide information for a single
VFI. See sections 4.4.3, 4.4.4, 3.3.1, and 184.108.40.206 for details.
There are several options for how VPN routes are carried between the
PEs, as discussed below.
4.4.3. Per-VPN Routing
One option is to operate separate instances of routing protocols
between the PEs, one instance for each VPN. When this is done,
routing protocol packets for each customer network need to be
tunneled between PEs. This uses the same tunneling method, and
optionally the same tunnels, as is used for transporting VPN user
data traffic between PEs.
With per-VPN routing, a distinct routing instance corresponding to
each VPN exists within the corresponding PE device. VPN-specific
tunnels are set up between PE devices (using the control mechanisms
that were discussed in sections 3 and 4). Logically these tunnels
are between the VFIs which are within the PE devices. The tunnels
are then used as if they were normal links between normal routers.
Routing protocols for each VPN operate between VFIs and the routers
within the customer network.
This approach establishes, for each VPN, a distinct "control plane"
operating across the VPN backbone. There is no sharing of control
plane by any two VPNs, nor is there any sharing of control plane by
the VPN routing and the public routing. With this approach each PE
device can logically be thought of as consisting of multiple
independent routers.
The multiple routing instances within the PE device may be separate
processes, or may be in the same process with different data
structures. Similarly, there may be mechanisms internal to the PE
devices to partition memory and other resources between routing
instances. The mechanisms for implementing multiple routing
instances within a single physical PE are outside of the scope of
this framework document, and are also outside of the scope of other
This approach tends to minimize the explicit interactions between
different VPNs, as well as between VPN routing and public routing.
However, as long as the independent logical routers share the same
hardware, there is some sharing of resources, and interactions are
still possible. Also, each independent control plane has its
associated overheads, and this can raise issues of scale. For
example, the PE device must run a potentially large number of
independent routing "decision processes," and must also maintain a
potentially very large number of routing adjacencies.
4.4.4. Aggregated Routing Model
Another option is to use one single instance of a routing protocol
for carrying VPN routing information between the PEs. In this
method, the routing information for multiple different VPNs is
aggregated into a single routing protocol.
This approach greatly reduces the number of routing adjacencies which
the PEs must maintain, since there is no longer any need to maintain
more than one such adjacency between a given pair of PEs. If the
single routing protocol supports a hierarchical route distribution
mechanism (such as BGP's "route reflectors"), the PE-PE adjacencies
can be completely eliminated, and the number of backbone adjacencies
can be made into a small constant which is independent of the number
of PE devices. This improves the scaling properties.
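As a rough illustration of the scaling argument above, the following
sketch compares adjacency counts under three arrangements. It is not
taken from this framework: the function names are hypothetical, and
the per-VPN case assumes, as a simplification, a full mesh of
adjacencies among the PEs that attach to each VPN.

```python
def per_vpn_adjacencies(pes_per_vpn):
    """Total PE-PE adjacencies when each VPN runs its own routing
    instance over a full mesh among its PEs (assumed topology)."""
    return sum(n * (n - 1) // 2 for n in pes_per_vpn)


def aggregated_full_mesh(num_pes):
    """One adjacency per PE pair, however many VPNs the pair shares."""
    return num_pes * (num_pes - 1) // 2


def aggregated_with_reflectors(num_pes, num_reflectors):
    """Each PE peers only with the reflectors; the reflectors are
    assumed to be fully meshed among themselves."""
    reflector_mesh = num_reflectors * (num_reflectors - 1) // 2
    return num_pes * num_reflectors + reflector_mesh


def sessions_per_pe(num_reflectors):
    """With reflectors, the per-PE session count is a small constant
    that does not depend on the number of PE devices."""
    return num_reflectors
```

For example, 100 VPNs each present on 10 PEs give 4500 adjacencies in
the per-VPN case, while 50 PEs need 1225 adjacencies in an aggregated
full mesh and 101 when two route reflectors are used.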
Additional routing instances may still be needed to support the
exchange of routing information between the PE and its locally
attached CEs. These can be eliminated, with a consequent further
improvement in scalability, by using static routing on the PE-CE
interfaces, or possibly by having the PE-CE routing interaction use
the same protocol instance that is used to distribute VPN routes
across the VPN backbone (see section 4.4.4.2 for a way to do this).
With this approach, the number of routing protocol instances in a PE
device does not depend on the number of CEs supported by the PE
device, if the routing between PE and CE devices is static or BGP-4.
However, if the CE and PE devices in a VPN exchange route information
using a routing protocol other than BGP-4, the number of routing
protocol entities in a PE device depends on the number of CEs
supported by the PE device.
In principle it is possible for routing to be aggregated using either
BGP or an IGP.
4.4.4.1. Aggregated Routing with OSPF or IS-IS
When supporting VPNs, it is likely that there can be a large number
of VPNs supported within any given SP network. In general only a
small number of PE devices will be interested in the operation of any
one VPN. Thus while the total amount of routing information related
to the various customer networks will be very large, any one PE needs
to know about only a small number of such networks.
Generally SP networks use OSPF or IS-IS for interior routing within
the SP network. There are very good reasons for this choice, which
are outside of the scope of this document.
Both OSPF and IS-IS are link state routing protocols. In link state
routing, routing information is distributed via a flooding protocol.
The set of routing peers is in general not fully meshed, but there is
a path from any router in the set to any other. Flooding ensures
that routing information from any one router reaches all the others.
This requires all routers in the set to maintain the same routing
information. One couldn't withhold any routing information from a
particular peer unless it is known that none of the peers further
downstream will need that information, and in general this cannot be
As a result, if one tried to do aggregated routing by using OSPF,
with all the PEs in the set of routing peers, all the PEs would end
up with the exact same routing information; there is no way to
constrain the distribution of routing information to a subset of the
PEs. Given the potential magnitude of the total routing information
required for supporting a large number of VPNs, this would have
unfortunate scaling implications.
In some cases VPNs may span multiple areas within a provider, or span
multiple providers. If VPN routing information were aggregated into
the IGP used within the provider, then some method would need to be
used to extend the reach of IGP routing information between areas and
4.4.4.2. Aggregated Routing with BGP
In order to use BGP for aggregated routing, the VPN routing
information must be clearly distinguished from the public Internet
routing information. This is typically done by making use of BGP's
capability of handling multiple address families, and treating the
VPN routes as being in a different address family than the public
Internet routes. Typically a VPN route also carries attributes which
depend on the particular VPN or VPNs to which that route belongs.
When BGP is used for carrying VPN information, the total amount of
information carried in BGP (including the Internet routes and VPN
routes) may be quite large. As noted above, there may be a large
number of VPNs which are supported by any particular provider, and
the total amount of routing information associated with all VPNs may
be quite large. However, any one PE will in general only need to be
aware of a small number of VPNs. This implies that where VPN routing
information is aggregated into BGP, it is desirable to be able to
limit which VPN information is distributed to which PEs.
In "Interior BGP" (IBGP), routing information is not flooded; it is
sent directly, over a TCP connection, to the peer routers (or to a
route reflector). These peer routers (unless they are route
reflectors) are then not even allowed to redistribute the information
to each other. BGP also has a comprehensive set of mechanisms for
constraining the routing information that any one peer sends to
another, based on policies established by the network administration.
Thus IBGP satisfies one of the requirements for aggregated routing
within a single SP network - it makes it possible to ensure that
routing information relevant to a particular VPN is processed only by
the PE devices that attach to that VPN. All that is necessary is
that each VPN route be distributed with one or more attributes which
identify the distribution policies. Then distribution can be
constrained by filtering against these attributes.
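The filtering step described above can be sketched as follows. This
is a minimal illustration, not a BGP implementation: the VpnRoute
type, the policy_tags field, and the tag values are hypothetical
stand-ins for whatever attributes carry the distribution policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    prefix: str
    # Hypothetical attribute set identifying the distribution policy.
    policy_tags: frozenset


def routes_for_pe(routes, pe_import_tags):
    """Return only the routes that share at least one policy tag with
    the set of tags the PE is configured to accept."""
    return [r for r in routes if r.policy_tags & pe_import_tags]


EXAMPLE_ROUTES = [
    VpnRoute("10.1.0.0/16", frozenset({"vpn-a"})),
    VpnRoute("10.1.0.0/16", frozenset({"vpn-b"})),
    VpnRoute("10.2.0.0/16", frozenset({"vpn-a", "vpn-c"})),
]
```

Note that the two routes for 10.1.0.0/16 remain distinct because they
carry different tags, so a PE attached only to "vpn-a" never
processes the "vpn-b" route.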
In "Exterior BGP" (EBGP), routing peers do redistribute routing
information to each other. However, it is very common to constrain
the distribution of particular items of routing information so that
they only go to those exterior peers who have a "need to know,"
although this does require a priori knowledge of which paths may
validly lead to which addresses. In the case of VPN routing, if a
VPN is provided by a small set of cooperating SPs, such constraints
can be applied to ensure that the routing information relevant to
that VPN does not get distributed anywhere it doesn't need to be. To
the extent that a particular VPN is supported by a small number of
cooperating SPs with private peering arrangements, this is
particularly straightforward, as the set of EBGP neighbors which need
to know the routing information from a particular VPN is easier to
BGP also has mechanisms (such as "Outbound Route Filtering," ORF)
which enable the proper set of VPN routing distribution constraints
to be dynamically distributed. This reduces the management burden of
setting up the constraints, and hence improves scalability.
Within a single routing domain (in the layer 3 VPN context, this
typically means within a single SP's network), it is common to have
the IBGP routers peer directly with one or two route reflectors,
rather than having them peer directly with each other. This greatly
reduces the number of IBGP adjacencies which any one router must
support. Further, a route reflector does not merely redistribute
routing information, it "digests" the information first, by running
its own decision processes. Only routes which survive the decision
process are redistributed.
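The "digest, then redistribute" behavior can be illustrated with a
heavily simplified decision process. The sketch below assumes only
two tie-breakers (highest local preference, then shortest AS path
length); a real BGP decision process has more steps, and the tuple
layout and function name are hypothetical.

```python
def reflect_best(received):
    """Pick one best route per prefix and return only those.

    received: iterable of (prefix, local_pref, as_path_len, next_hop).
    Returns a dict mapping each prefix to the next hop of its best
    route; all other candidate routes are not redistributed.
    """
    best = {}
    for prefix, local_pref, as_path_len, next_hop in received:
        # Smaller key wins: higher local_pref first, then shorter path.
        key = (-local_pref, as_path_len)
        if prefix not in best or key < best[prefix][0]:
            best[prefix] = (key, next_hop)
    return {prefix: entry[1] for prefix, entry in best.items()}
```

If four routes are received for two prefixes, only two are passed on,
which is the reduction in routing information that the text
attributes to route reflectors.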
As a result, when route reflectors are used, the amount of routing
information carried around the network, and in particular, the amount
of routing information which any given router must receive and
process, is greatly reduced. This greatly increases the scalability
of the routing distribution system.
It has already been stated that a given PE has VPN routing
information only for those VPNs to which it is directly attached. It
is similarly important, for scalability, to ensure that no single
route reflector should have to have all the routing information for
all VPNs. It is after all possible for the total number of VPN
routes (across all VPNs supported by an SP) to exceed the number
which can be supported by a single route reflector. Therefore, the
VPN routes may themselves be partitioned, with some route reflectors
carrying one subset of the VPN routes and other route reflectors
carrying a different subset. The route reflectors which carry the
public Internet routes can also be completely separate from the route
reflectors that carry the VPN routes.
The use of outbound route filters allows any one PE and any one route
reflector to exchange information about only those VPNs which the PE
and route reflector are both interested in. This in turn ensures
that each PE and each route reflector receives routing information
only about the VPNs which it is directly supporting. Large SPs which
support a large number of VPNs therefore can partition the
information which is required for support of those VPNs.
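The effect of such filtering, combined with partitioning VPN routes
among route reflectors, can be sketched as a set intersection. This
does not model the ORF message exchange itself; the function name,
reflector names, and VPN identifiers are hypothetical.

```python
def exchange_plan(pe_vpns, reflectors):
    """Map each reflector to the VPNs it should exchange with one PE.

    pe_vpns: set of VPN identifiers the PE attaches to.
    reflectors: dict of reflector name -> set of VPNs it carries.
    Reflectors with no VPN in common with the PE are omitted.
    """
    plan = {}
    for name, carried in reflectors.items():
        shared = set(pe_vpns) & set(carried)
        if shared:
            plan[name] = shared
    return plan
```

With reflectors carrying {a, b}, {c, d}, and {e}, a PE attached to
VPNs b and c would exchange routes for b with the first reflector and
for c with the second, and nothing with the third.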
Generally a PE device will be restricted in the total number of
routes it can support, whether those are public Internet routes or
VPN routes. As a result, a PE device may be able to be attached to a
larger number of VPNs if it does not also need to support Internet
The way in which VPN routes are partitioned among PEs and/or route
reflectors is a deployment issue. With suitable deployment
procedures, the limited capacity of these devices will not limit the
number of VPNs that can be supported.
Similarly, whether a given PE and/or route reflector contains
Internet routes as well as VPN routes is a deployment issue. If the
customer networks served by a particular PE do not need Internet
access, then that PE does not need to be aware of the Internet
routes. If some or all of the VPNs served by a particular PE do need
Internet access, but the PE does not contain Internet routes, then
the PE can maintain a default route that routes all Internet traffic
from that PE to a different router within the SP network, where that
other router holds the full Internet routing table.
With this approach the PE device needs only a single default route
for all the Internet routes.
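A minimal sketch of how a single default route can stand in for the
full Internet table, assuming ordinary longest-prefix-match
forwarding. The table contents and next-hop names below are
hypothetical.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop of the longest matching prefix, or None.

    table: list of (ip_network, next_hop) pairs.
    destination: address given as a string.
    """
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # The most specific (longest) prefix wins over the default route.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# One VPN route plus a default route toward a router that holds the
# full Internet routing table (names are illustrative only).
EXAMPLE_TABLE = [
    (ipaddress.ip_network("10.1.0.0/16"), "remote-pe"),
    (ipaddress.ip_network("0.0.0.0/0"), "internet-router"),
]
```

A destination inside 10.1.0.0/16 resolves to the VPN next hop, while
any other destination falls through to the default route.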
For the reasons given above, the BGP protocol seems to be a
reasonable protocol to use for distributing VPN routing information.
Additional reasons for the use of BGP are:
o BGP has been proven to be useful for distributing very large
amounts of routing information; there isn't any routing
distribution protocol which is known to scale any better.
o The same BGP instance that is used for PE-PE distribution of VPN
routes can be used for PE-CE route distribution, if CE-PE routing
is static or BGP. PEs and CEs are really parts of distinct
Autonomous Systems, and BGP is particularly well-suited for
carrying routing information between Autonomous Systems.
On the other hand, BGP is also used for distributing public Internet
routes, and it is crucially important that VPN route distribution not
compromise the distribution of public Internet routes in any way.
This issue is discussed in the following section.
4.4.5. Scalability and Stability of Routing with Layer 3 PE-based VPNs
For layer 3 PE-based VPNs, there are likely to be cases where a
service provider supports Internet access over the same link that is
used for VPN service. Thus, a particular CE to PE link may carry
both private network IP packets (for transmission between sites of
the private network using VPN services) as well as public Internet
traffic (for transmission from the private site to the Internet, and
for transmission to the private site from the Internet). This
section looks at the scalability and stability of routing in this
case. It is worth noting that this sort of issue may be applicable
where per-VPN routing is used, as well as where aggregated routing is
For layer 3 PE-based VPNs, it is necessary for the PE devices to be
able to forward IP packets using the address spaces of the
supported private networks, as well as using the full Internet
address space. This implies that PE devices might in some cases
participate in routing for the private networks, as well as for the
In some cases the routing demand on the PE might be low enough, and
the capabilities of the PE might be great enough, that it is
reasonable for the PE to participate fully in routing for both
private networks and the public Internet. For example, the PE device
might participate in normal operation of BGP as part of the global
Internet. The PE device might also operate routing protocols (or in
some cases use static routing) to exchange routes with CE devices.
For large installations, or where PE capabilities are more limited,
it may be undesirable for the PE to fully participate in routing for
both VPNs and the public Internet. For example, suppose that
the total volume of routes and routing instances supported by one PE
across multiple VPNs is very large. Suppose furthermore that one or
more of the private networks suffers from routing instabilities, for
example resulting in a large number of routing updates being
transmitted to the PE device. In this case it is important to
prevent such routing from causing any instability in the routing used
in the global Internet.
In these cases it may be necessary to partition routing, so that the
PE does not need to maintain as large a collection of routes, and so
that the PE is not able to adversely affect Internet routing. Also,
given that the total number of route prefixes and the total number of
routing instances which the PE needs to maintain might be very large,
it may be desirable to limit the participation in Internet routing
for those PEs which are supporting a large number of VPNs or which
are supporting large VPNs.
Consider a case where a PE is supporting a very large number of VPNs,
some of which have a large number of sites. To pick a VERY large
example, let's suppose 1000 VPNs, with an average of 100 sites each,
plus 10 prefixes per site on average. Consider that the PE also
needs to be able to route traffic to the Internet in general. In
this example the PE might need to support approximately 1,000,000
prefixes for the VPNs, plus more than 100,000 prefixes for the
Internet. If augmented and aggregated routing is used, then this
implies a large number of routes which may be advertised in a single
routing protocol (most likely BGP). If the VR approach is used, then
there are also 100,000 neighbor adjacencies in the various per-VPN
routing protocol instances. In some cases this number of routing
prefixes and/or this number of adjacencies might be difficult to
support in one device.
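The figures in this example follow from simple multiplication, as the
sketch below shows. The function names are hypothetical, and the
adjacency count assumes one routing adjacency per site per VPN, which
is what the numbers in the text imply.

```python
def vpn_prefix_count(vpns, sites_per_vpn, prefixes_per_site):
    """Total VPN prefixes the PE would need to hold."""
    return vpns * sites_per_vpn * prefixes_per_site


def vr_adjacency_count(vpns, sites_per_vpn):
    """Neighbor adjacencies across all per-VPN routing instances,
    assuming one adjacency per site in each VPN."""
    return vpns * sites_per_vpn


VPN_PREFIXES = vpn_prefix_count(1000, 100, 10)   # 1,000,000
VR_ADJACENCIES = vr_adjacency_count(1000, 100)   # 100,000
```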
In this case, an alternate approach is to limit the PE's
participation in Internet routing to the absolute minimum required:
Specifically the PE will need to know which Internet address prefixes
are reachable via directly attached CE devices. All other Internet
routes may be summarized into a single default route pointing to one
or more P routers. In many cases the P routers to which the default
routes are directed may be the P routers to which the PE device is
directly attached (which are the ones which it needs to use for
forwarding most Internet traffic). Thus if there are M CE devices
directly connected to the PE, and if these M CE devices are the next
hop for a total of N globally addressable Internet address prefixes,
then the PE device would maintain N+1 routes corresponding to
globally routable Internet addresses.
In this example, those PE devices which provide VPN service run
routing to compute routes for the VPNs, but don't operate Internet
routing, and instead use only a default route to route traffic to all
Internet destinations (not counting the addresses which are reachable
via directly attached CE devices). The P routers need to maintain
Internet routes, and therefore take part in Internet routing
protocols. However, the P routers don't know anything about the VPN
In some cases the maximum number of routes and/or routing instances
supportable via a single PE device may limit the number of VPNs which
can be supported by that PE. For example, in some cases this might
require that two different PE devices be used to support VPN services
for a set of multiple CEs, even if one PE might have had sufficient
throughput to handle the data traffic from the full set of CEs.
Similarly, the amount of resources which any one VPN is permitted to
use in a single PE might be restricted.
There will be cases where it is not necessary to partition the
routing, since the PEs will be able to maintain all VPN routes and
all Internet routes without a problem. However, it is important that
VPN approaches allow partitioning to be used where needed in order to
prevent future scaling problems. Again, making the system scalable
is a matter of proper deployment.
It may be wondered whether it is ever desirable to have both Internet
routing and VPN routing running in a single PE device or route
reflector. In fact, if there is even a single system running both
Internet routing and VPN routing, doesn't that raise the possibility
that a disruption within the VPN routing system will cause a
disruption within the Internet routing system?
Certainly this possibility exists in theory. To minimize that
possibility, BGP implementations which support multiple address
families should be organized so as to minimize the degree to which
the processing and distribution of one address family affects the
processing and distribution of another. This could be done, for
example, by suitable partitioning of resources. This partitioning
may be helpful both to protect Internet routing from VPN routing, and
to protect well behaved VPN customers from "mis-behaving" VPNs. Or
one could try to protect the Internet routing system from the VPN
routing system by giving preference to the Internet routing. Such
implementation issues are outside the scope of this document. If one
has inadequate confidence in an implementation, deployment procedures
can be used, as explained above, to separate the Internet routing
from the VPN routing.
4.5. Quality of Service, SLAs, and IP Differentiated Services
The following technologies for QoS/SLA may be applicable to PPVPNs.
4.5.1. IntServ/RSVP [RFC2205] [RFC2208] [RFC2210] [RFC2211] [RFC2212]
Integrated services, or IntServ for short, is a mechanism for
providing QoS/SLA by admission control. RSVP is used to reserve
network resources. The network needs to maintain a state for each
reservation. The number of states in the network increases in
proportion to the number of concurrent reservations.
In some cases, IntServ on the edge of a network (e.g., over the
customer interface) may be mapped to DiffServ in the SP network.
4.5.2. DiffServ [RFC2474] [RFC2475]
IP differentiated service, or DiffServ for short, is a mechanism for
providing QoS/SLA by differentiating traffic. Traffic entering a
network is classified into several behavior aggregates at the network
edge and each is assigned a corresponding DiffServ codepoint. Within
the network, traffic is treated according to its DiffServ codepoint.
Some behavior aggregates have already been defined. Expedited
forwarding behavior [RFC3246] guarantees the QoS, whereas assured
forwarding behavior [RFC2597] differentiates traffic packet
When DiffServ is used, network provisioning is done on a
per-traffic-class basis. This ensures a specific class of service
can be achieved for a class (assuming that the traffic load is
controlled). All packets within a class are then treated equally
within an SP network. Policing is done at input to prevent any one
user from exceeding their allocation and therefore defeating the
provisioning for the class as a whole. If a user exceeds their
traffic contract, then the excess packets may optionally be
discarded, or may be marked as "over contract". Routers throughout
the network can then preferentially discard over-contract packets in
response to congestion, in order to ensure that such packets do not
defeat the service guarantees intended for in-contract traffic.
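One common way to implement such input policing is a token bucket.
The sketch below is an illustration under that assumption; DiffServ
does not mandate this algorithm, and the class name, parameters, and
labels are hypothetical. Timestamps are passed in explicitly so that
the behavior is deterministic.

```python
class Policer:
    """Single-rate token bucket that marks packets rather than
    dropping them (dropping is left to routers under congestion)."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes   # bucket starts full
        self.last = 0.0             # time of the previous packet

    def classify(self, now, size_bytes):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "in-contract"
        # Excess traffic consumes no tokens and is only marked.
        return "over-contract"
```

With a rate of 1000 bytes per second and a burst of 1500 bytes, two
back-to-back 1000-byte packets are marked in-contract and
over-contract respectively, and a third packet one second later is
in-contract again.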
4.6. Concurrent Access to VPNs and the Internet
In some scenarios, customers will need to concurrently have access to
their VPN network and to the public Internet.
Two potential problems are identified in this scenario: the use of
private addresses and the potential security threats.
o The use of private addresses
The IP addresses used in the customer's sites will possibly belong
to a private routing realm, and as such be unusable in the public
Internet. This means that a network address translation function
(e.g., NAT) will need to be implemented to allow VPN customers to
access the public Internet.
In the case of layer 3 PE-based VPNs, this translation function
will be implemented in the PE to which the CE device is connected.
In the case of layer 3 provider-provisioned CE-based VPNs, this
translation function will be implemented on the CE device itself.
o Potential security threat
As portions of the traffic that flow to and from the public
Internet are not necessarily under the SP's or the customer's
control, some traffic analyzing function (e.g., a firewall
function) will be implemented to control the traffic entering and
leaving the VPN.
In the case of layer 3 PE-based VPNs, this traffic analyzing
function will be implemented in the PE device (or in the VFI
supporting a specific VPN), while in the case of layer 3
provider-provisioned CE-based VPNs, this function will be implemented
in the
o Handling of a customer IP packet destined for the Internet
In the case of layer 3 PE-based VPNs, an IP packet coming from a
customer site will be handled in the corresponding VFI. If the IP
destination address in the packet's IP header belongs to the
Internet, multiple scenarios are possible, based on the adopted
policy. As a first possibility, when Internet access is not
allowed, the packet will be dropped. As a second possibility, when
(controlled) Internet access is allowed, the IP packet will go
through the translation function and eventually through the traffic
analyzing function before further processing in the PE's global
Internet forwarding table.
Note that different implementation choices are possible. One can
choose to implement the translation and/or the traffic analyzing
function in every VFI (or CE device in the context of layer 3
provider-provisioned CE-based VPNs), or alternatively in a subset or
even in only one VPN network element. This would mean that the
traffic to/from the Internet from/to any VPN site needs to be routed
through that single network element (this is what happens in a hub
and spoke topology for example).
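The handling of a customer packet in a VFI, as described above, can
be sketched as a small decision function. This is an illustration
only: the function name and return labels are hypothetical, and a
destination is treated as "belonging to the Internet" whenever it
matches none of the VPN's own prefixes, which is a simplifying
assumption.

```python
import ipaddress


def handle_packet(dst, vpn_prefixes, internet_access_allowed):
    """Decide what a VFI does with a packet from a customer site.

    dst: destination address as a string.
    vpn_prefixes: list of ip_network objects reachable inside the VPN.
    internet_access_allowed: policy flag for this VPN.
    """
    addr = ipaddress.ip_address(dst)
    if any(addr in net for net in vpn_prefixes):
        return "forward-in-vpn"
    if not internet_access_allowed:
        # First possibility: Internet access is not allowed.
        return "drop"
    # Second possibility: translation, then traffic analysis, then
    # lookup in the PE's global Internet forwarding table.
    return "translate-analyze-global-lookup"
```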
4.7. Network and Customer Management of VPNs
4.7.1. Network and Customer Management
Network and customer management systems responsible for managing VPN
networks have several challenges depending on the type of VPN network
or networks they are required to manage.
For any type of provider-provisioned VPN it is useful to have one
place where the VPN can be viewed and optionally managed as a whole.
The NMS may therefore be a place where the collective instances of a
VPN are brought together into a cohesive picture to form a VPN. To
be more precise, the instances of a VPN on their own do not form the
VPN; rather, the collection of disparate VPN sites together forms the
VPN. This is important because VPNs are typically configured at the
edges of the network (i.e., PEs) either through manual configuration
or auto-configuration. This results in no state information being
kept within the "core" of the network. Sometimes little or no
information about other PEs is configured at any particular PE.
Support of any one VPN may span a wide range of network equipment,
potentially including equipment from multiple implementors. Allowing
a unified network management view of the VPN therefore is simplified
through use of standard management interfaces and models. This will
also facilitate customer self-managed (monitored) network devices or
In cases where significant configuration is required whenever a new
service is provisioned, it is important for scalability reasons that
the NMS provide a largely automated mechanism for this operation.
Manual configuration of VPN services (i.e., new sites, or
re-provisioning existing ones), could lead to scalability issues, and
should be avoided. It is thus important for network operators to
maintain visibility of the complete picture of the VPN through the
NMS. This must be achieved using standard protocols such as
SNMP, XML, or LDAP. Use of proprietary command-line interfaces has
the disadvantage that proprietary interfaces do not lend themselves
to standard representations of managed objects.
To achieve the goals outlined above for network and customer
management, device implementors should employ standard management
interfaces to expose the information required to manage VPNs. To
this end, devices should utilize standards-based mechanisms such as
SNMP, XML, or LDAP.
4.7.2. Segregated Access of VPN Information
Segregated access of VPN information is important in that customers
sometimes require access to information in several ways. First, it
is important for some customers (or operators) to access PEs, CEs or
P devices within the context of a particular VPN on a per-VPN-basis
in order to access statistics, configuration or status information.
This can either be under the guise of general management,
operator-initiated provisioning, or SLA verification (SP, customer or
Where users outside of the SP have access to information from PE or P
devices, managed objects within the managed devices must be
accessible on a per-VPN basis in order to provide the customer, the
SP or the third party SLA verification agent with a high degree of
security and convenience.
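Per-VPN access to managed objects can be pictured as a filter applied
before any object is returned to a requester. The sketch below is a
hypothetical illustration; the object layout (a dict with "vpn" and
"name" keys) and the function name are assumptions, not part of any
management standard.

```python
def visible_objects(objects, principal_vpns):
    """Return only the managed objects that belong to a VPN the
    requesting principal is authorized to see.

    objects: list of dicts, each with at least a "vpn" key.
    principal_vpns: set of VPN identifiers granted to the requester.
    """
    return [obj for obj in objects if obj["vpn"] in principal_vpns]


EXAMPLE_OBJECTS = [
    {"vpn": "vpn-a", "name": "ifInOctets.1"},
    {"vpn": "vpn-b", "name": "ifInOctets.2"},
    {"vpn": "vpn-a", "name": "routeCount"},
]
```

A customer granted only "vpn-a" sees two of the three objects, while
an SP operator granted both identifiers sees all of them.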
Security may require authentication or encryption of network
management commands and information. Information hiding may use
encryption or may isolate information through a mechanism that
provides per-VPN access. Authentication or encryption of both
requests and responses for managed objects within a device may be
employed. Examples of how this can be achieved include IPsec
tunnels, SNMPv3 encryption for SNMP-based management, or encrypted
telnet sessions for CLI-based management.
In the case of information isolation, any one customer should only be
able to view information pertaining to its own VPN or VPNs.
Information isolation can also be used to partition the space of
managed objects on a device in such a way as to make it more
convenient for the SP to manage the device. In certain deployments,
it is also important for the SP to have access to information
pertaining to all VPNs, thus it may be important for the SP to create
virtual VPNs within the management domain which overlap across
If the user is allowed to change the configuration of their VPN, then
in some cases customers may make unanticipated changes or even
mistakes, thereby causing their VPN to mis-behave. This in turn may
require an audit trail to allow determination of what went wrong and
some way to inform the carrier of the cause.
The segregation and secure access of information on a per-VPN basis
is also important when the carrier of carriers paradigm is employed.
In this case it may be desirable for customers (i.e., sub-carriers or
VPN wholesalers) to manage and provision services within their VPNs
on their respective devices in order to reduce the management
overhead cost to the carrier of carriers SP. In this case, it is
important to observe the guidelines detailed above with regard to
information hiding, isolation and encryption. It should be noted
that there may be many flavors of information hiding and isolation
employed by the carrier of carriers SP. If the carrier of carriers
SP does not want to grant the sub-carrier open access to all of the
managed objects within their PEs or P routers, it is necessary for
devices to provide network operators with secure and scalable per-VPN
network management access to their devices. For the reasons outlined
above, it therefore is desirable to provide standard mechanisms for
achieving these goals.