3. Concepts and Framework
3.1. PE-CE Multicast Routing
Support of multicast in BGP/MPLS IP VPNs is modeled closely after the
support of unicast in BGP/MPLS IP VPNs. That is, a multicast routing
protocol will be run on the PE-CE interfaces, such that PE and CE are
multicast routing adjacencies on that interface. CEs at different
sites do not become multicast routing adjacencies of each other.
If a PE attaches to n VPNs for which multicast support is provided
(i.e., to n "MVPNs"), the PE will run n independent instances of a
multicast routing protocol. We will refer to these multicast routing
instances as "VPN-specific multicast routing instances", or more
briefly as "multicast C-instances". The notion of a "VRF" (VPN
Routing and Forwarding Table), defined in [RFC4364], is extended to
include multicast routing entries as well as unicast routing entries.
Each multicast routing entry is thus associated with a particular
VRF.
Whether a particular VRF belongs to an MVPN or not is determined by
configuration.
In this document, we do not attempt to provide support for every
possible multicast routing protocol that could possibly run on the
PE-CE link. Rather, we consider multicast C-instances only for the
following multicast routing protocols:
- PIM Sparse Mode (PIM-SM), supporting the ASM service model
- PIM Sparse Mode, supporting the SSM service model
- PIM Bidirectional Mode (BIDIR-PIM), which uses bidirectional
C-trees to support the ASM service model.
In order to support the "Carrier's Carrier" model of [RFC4364], mLDP
may also be supported on the PE-CE interface. The use of mLDP on the
PE-CE interface is described in [MVPN-BGP].
The use of BGP on the PE-CE interface is not within the scope of this
document.
As the only multicast C-instances discussed by this document are PIM-
based C-instances, we will generally use the term "PIM C-instances"
to refer to the multicast C-instances.
A PE router may also be running a "provider-wide" instance of PIM (a
"PIM P-instance"), in which it has a PIM adjacency with, e.g., each
of its IGP neighbors (i.e., with P routers), but NOT with any CE
routers, and not with other PE routers (unless another PE router
happens to be an IGP adjacency). In this case, P routers would also
run the P-instance of PIM but NOT a C-instance. If there is a PIM
P-instance, it may or may not have a role to play in the support of
VPN multicast; this is discussed in later sections. However, in no
case will the PIM P-instance contain VPN-specific multicast routing
information.
In order to help clarify when we are speaking of the PIM P-instance
and when we are speaking of a PIM C-instance, we will also apply the
prefixes "P-" and "C-", respectively, to control messages, addresses,
etc. Thus, a P-Join would be a PIM Join that is processed by the PIM
P-instance, and a C-Join would be a PIM Join that is processed by a
C-instance. A P-group address would be a group address in the SP's
address space, and a C-group address would be a group address in a
VPN's address space. A C-tree is a multicast distribution tree
constructed and maintained by the PIM C-instances. A C-flow is a
stream of multicast packets with a common C-source address and a
common C-group address. We will use the notation "(C-S,C-G)" to
identify specific C-flows. If a particular C-tree is a shared tree
(whether unidirectional or bidirectional) rather than a source-
specific tree, we will sometimes speak of the entire set of flows
traveling that tree, identifying the set as "(C-*,C-G)".
3.2. P-Multicast Service Interfaces (PMSIs)
A PE must have the ability to forward multicast data packets received
from a CE to one or more of the other PEs in the same MVPN for
delivery to one or more other CEs.
We define the notion of a "P-Multicast Service Interface" (PMSI). If
a particular MVPN is supported by a particular set of PE routers,
then there will be one or more PMSIs connecting those PE routers
and/or subsets thereof. A PMSI is a conceptual "overlay" on the
P-network with the following property: a PE in a given MVPN can give
a packet to the PMSI, and the packet will be delivered to some or all
of the other PEs in the MVPN, such that any PE receiving the packet
will be able to determine the MVPN to which the packet belongs.
As we discuss below, a PMSI may be instantiated by a number of
different transport mechanisms, depending on the particular
requirements of the MVPN and of the SP. We will refer to these
transport mechanisms as "P-tunnels".
For each MVPN, there are one or more PMSIs that are used for
transmitting the MVPN's multicast data from one PE to others. We
will use the term "PMSI" such that a single PMSI belongs to a single
MVPN. However, the transport mechanism that is used to instantiate a
PMSI may allow a single P-tunnel to carry the data of multiple PMSIs.
In this document, we make a clear distinction between the multicast
service (the PMSI) and its instantiation. This allows us to separate
the discussion of different services from the discussion of different
instantiations of each service. The term "P-tunnel" is used to refer
to the transport mechanism that instantiates a service.
PMSIs are used to carry C-multicast data traffic. The C-multicast
data traffic travels along a C-tree, but in the SP backbone all
C-trees are tunneled through P-tunnels. Thus, we will sometimes talk
of a P-tunnel carrying one or more C-trees.
Some of the options for passing multicast control traffic among the
PEs do so by sending the control traffic through a PMSI; other
options do not send control traffic through a PMSI.
3.2.1. Inclusive and Selective PMSIs
We will distinguish between three different kinds of PMSIs:
- "Multidirectional Inclusive" PMSI (MI-PMSI)
A Multidirectional Inclusive PMSI is one that enables ANY PE
attaching to a particular MVPN to transmit a message such that it
will be received by EVERY other PE attaching to that MVPN.
There is, at most, one MI-PMSI per MVPN. (Though the P-tunnel or
P-tunnels that instantiate an MI-PMSI may actually carry the data
of more than one PMSI.)
An MI-PMSI can be thought of as an overlay broadcast network
connecting the set of PEs supporting a particular MVPN.
- "Unidirectional Inclusive" PMSI (UI-PMSI)
A Unidirectional Inclusive PMSI is one that enables a particular
PE, attached to a particular MVPN, to transmit a message such
that it will be received by all the other PEs attaching to that
MVPN. There is, at most, one UI-PMSI per PE per MVPN, though the
P-tunnel that instantiates a UI-PMSI may, in fact, carry the data
of more than one PMSI.
- "Selective" PMSI (S-PMSI).
A Selective PMSI is one that provides a mechanism wherein a
particular PE in an MVPN can multicast messages so that they will
be received by a subset of the other PEs of that MVPN. There may
be an arbitrary number of S-PMSIs per PE per MVPN. The P-tunnel
that instantiates a given S-PMSI may carry data from multiple
S-PMSIs.
In later sections, we describe the role played by these different
kinds of PMSIs. We will use the term "I-PMSI" when we are not
distinguishing between "MI-PMSIs" and "UI-PMSIs".
3.2.2. P-Tunnels Instantiating PMSIs
The P-tunnels that are used to instantiate PMSIs will be referred to
as "P-tunnels". A number of different tunnel setup techniques can be
used to create the P-tunnels that instantiate the PMSIs. Among these
are the following:
- PIM
A PMSI can be instantiated as (a set of) Multicast Distribution
trees created by the PIM P-instance ("P-trees").
The multicast distribution trees that instantiate I-PMSIs may be
either shared trees or source-specific trees.
This document (along with [MVPN-BGP]) specifies procedures for
identifying a particular (C-S,C-G) flow and assigning it to a
particular S-PMSI. Such an S-PMSI is most naturally instantiated
as a source-specific tree.
The use of shared trees (including bidirectional trees) to
instantiate S-PMSIs is outside the scope of this document.
The use of PIM-DM to create P-tunnels is not supported.
P-tunnels may be shared by multiple MVPNs (i.e., a given P-tunnel
may be the instantiation of multiple PMSIs), as long as the
tunnel encapsulation provides some means of demultiplexing the
data traffic by MVPN.
- mLDP
mLDP Point-to-Multipoint (P2MP) LSPs or Multipoint-to-Multipoint
(MP2MP) LSPs can be used to instantiate I-PMSIs.
An S-PMSI or a UI-PMSI could be instantiated as a single mLDP
P2MP LSP, whereas an MI-PMSI would have to be instantiated as a
set of such LSPs (each PE in the MVPN being the root of one such
LSP) or as a single MP2MP LSP.
Procedures for sharing MP2MP LSPs across multiple MVPNs are
outside the scope of this document.
The use of MP2MP LSPs to instantiate S-PMSIs is outside the scope
of this document.
Section 11.2.3 discusses a way of using a partial mesh of MP2MP
LSPs to instantiate a PMSI. However, a full specification of the
necessary procedures is outside the scope of this document.
- RSVP-TE
A PMSI may be instantiated as one or more RSVP-TE Point-to-
Multipoint (P2MP) LSPs. An S-PMSI or a UI-PMSI would be
instantiated as a single RSVP-TE P2MP LSP, whereas a
Multidirectional Inclusive PMSI would be instantiated as a set of
such LSPs, one for each PE in the MVPN. RSVP-TE P2MP LSPs can be
shared across multiple MVPNs.
- A Mesh of Unicast P-Tunnels.
If a PMSI is implemented as a mesh of unicast P-tunnels, a PE
wishing to transmit a packet through the PMSI would replicate the
packet and send a copy to each of the other PEs.
An MI-PMSI for a given MVPN can be instantiated as a full mesh of
unicast P-tunnels among that MVPN's PEs. A UI-PMSI or an S-PMSI
can be instantiated as a partial mesh.
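As a rough illustration of replication over a mesh of unicast P-tunnels, the following Python sketch returns one copy of a packet per remote PE. The function name, the string PE identifiers, and the list-of-copies return value are assumptions made for illustration only; this document does not define them.

```python
from typing import Iterable, List, Tuple


def replicate_over_mesh(packet: bytes, local_pe: str,
                        remote_pes: Iterable[str]) -> List[Tuple[str, bytes]]:
    """Return one (destination PE, packet copy) pair per unicast P-tunnel."""
    copies = []
    for pe in remote_pes:
        if pe == local_pe:
            continue  # a PE never sends a copy to itself
        copies.append((pe, packet))
    return copies


# Full mesh (MI-PMSI): every other PE of the MVPN receives a copy.
mi_copies = replicate_over_mesh(b"data", "PE1", ["PE1", "PE2", "PE3"])
# Partial mesh (e.g., an S-PMSI): only the listed PEs receive a copy.
s_copies = replicate_over_mesh(b"data", "PE1", ["PE3"])
```

The sketch makes the cost of this method visible: the transmitting PE does all of the replication work itself, so the number of copies grows with the number of receiving PEs.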
It can be seen that each method of implementing PMSIs has its own
area of applicability. Therefore, this specification allows for the
use of any of these methods. At first glance, this may seem like an
overabundance of options. However, the history of multicast
development and deployment should make it clear that there is no one
option that is always acceptable. The use of segmented inter-AS
trees does allow each SP to select the option that it finds most
applicable in its own environment, without causing any other SP to
choose that same option.
SPECIFYING THE CONDITIONS UNDER WHICH A PARTICULAR TREE-BUILDING
METHOD IS APPLICABLE IS OUTSIDE THE SCOPE OF THIS DOCUMENT.
The choice of the tunnel technique belongs to the sender router and
is a local policy decision of that router. The procedures defined
throughout this document do not mandate that the same tunnel
technique be used for all P-tunnels going through a given provider
backbone. However, it is expected that any tunnel technique that can
be used by a PE for a particular MVPN is also supported by all the
other PEs having VRFs for the MVPN. Moreover, the use of ingress
replication by any PE for an MVPN implies that all other PEs MUST use
ingress replication for this MVPN.
3.3. Use of PMSIs for Carrying Multicast Data
Each PE supporting a particular MVPN must have a way of discovering
the following information:
- The set of other PEs in its AS that are attached to sites of that
MVPN, and the set of other ASes that have PEs attached to sites
of that MVPN. However, if non-segmented inter-AS trees are used
(see Section 8.1), then each PE needs to know the entire set of
PEs attached to sites of that MVPN.
- If segmented inter-AS trees are to be used, the set of border
routers in its AS that support inter-AS connectivity for that
MVPN.
- If the MVPN is configured to use an MI-PMSI, the information
needed to set up and to use the P-tunnels instantiating the
MI-PMSI.
- For each other PE, whether the PE supports Aggregate Trees for
the MVPN, and if so, the demultiplexing information that must be
provided so that the other PE can determine whether a packet that
it received on an Aggregate Tree belongs to this MVPN.
In some cases, the information above is provided by means of the BGP-
based auto-discovery procedures discussed in Section 4 of this
document and in Section 9 of [MVPN-BGP]. In other cases, this
information is provided after discovery is complete, by means of
procedures discussed in Section 7.4. In either case, the information
that is provided must be sufficient to enable the PMSI to be bound to
the identified P-tunnel, to enable the P-tunnel to be created if it
does not already exist, and to enable the different PMSIs that may
travel on the same P-tunnel to be properly demultiplexed.
If an MVPN uses an MI-PMSI, then the information needed to identify
the P-tunnels that instantiate the MI-PMSI has to be known to the PEs
attached to the MVPN before any data can be transmitted on the
MI-PMSI. This information is either statically configured or auto-
discovered (see Section 4). The actual process of constructing the
P-tunnels (e.g., via PIM, RSVP-TE, or mLDP) SHOULD occur as soon as
this information is known.
When MI-PMSIs are used, they may serve as the default method of
carrying C-multicast data traffic. When we say that an MI-PMSI is
the "default" method of carrying C-multicast data traffic for a
particular MVPN, we mean that it is not necessary to use any special
control procedures to bind a particular C-flow to the MI-PMSI; any
C-flows that have not been bound to other PMSIs will be assumed to
travel through the MI-PMSI.
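The "default" behavior described above amounts to a lookup with a fallback. The Python sketch below is one possible way to picture it; the names `pmsi_for_flow` and `bindings`, and the use of plain strings to name PMSIs, are hypothetical and are not part of any procedure specified here.

```python
from typing import Dict, Tuple

CFlow = Tuple[str, str]  # (C-S, C-G), both written as address strings


def pmsi_for_flow(flow: CFlow, bindings: Dict[CFlow, str],
                  default_pmsi: str = "MI-PMSI") -> str:
    """Return the PMSI carrying a C-flow, falling back to the default MI-PMSI."""
    # Only flows that were explicitly bound elsewhere appear in `bindings`;
    # no control procedure is needed to place a flow on the default.
    return bindings.get(flow, default_pmsi)


bindings = {("192.0.2.1", "232.1.1.1"): "S-PMSI-1"}
bound = pmsi_for_flow(("192.0.2.1", "232.1.1.1"), bindings)
unbound = pmsi_for_flow(("192.0.2.1", "232.1.1.2"), bindings)
```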
There is no requirement to use MI-PMSIs as the default method of
carrying C-flows. It is possible to adopt a policy in which all
C-flows are carried on UI-PMSIs or S-PMSIs. In this case, if an
MI-PMSI is not used for carrying routing information, it is not
needed at all.
Even when an MI-PMSI is used as the default method of carrying an
MVPN's C-flows, if a particular C-flow has certain characteristics,
it may be desirable to migrate it from the MI-PMSI to an S-PMSI.
These characteristics, as well as the procedures for migrating a
C-flow from an MI-PMSI to an S-PMSI, are discussed in Section 7.
Sometimes a set of C-flows are traveling the same, shared, C-tree
(e.g., either unidirectional or bidirectional), and it may be
desirable to move the whole set of C-flows as a unit to an S-PMSI.
Procedures for doing this are outside the scope of this
document.
Some of the procedures for transmitting C-multicast routing
information among the PEs require that the routing information be
sent over an MI-PMSI. Other procedures do not use an MI-PMSI to
transmit the C-multicast routing information.
For a given MVPN, whether an MI-PMSI is used to carry C-multicast
routing information is independent from whether an MI-PMSI is used as
the default method of carrying the C-multicast data traffic.
As previously stated, it is possible to send all C-flows on a set of
S-PMSIs, omitting any usage of I-PMSIs. This prevents PEs from
receiving data that they don't need, at the cost of requiring
additional P-tunnels, and additional signaling to bind the C-flows to
P-tunnels. Cost-effective instantiation of S-PMSIs is likely to
require Aggregate P-trees, which, in turn, makes it necessary for the
transmitting PE to know which PEs need to receive which multicast
streams. This is known as "explicit tracking", and the procedures to
enable explicit tracking may themselves impose a cost. This is
further discussed in Section 7.4.
3.4. PE-PE Transmission of C-Multicast Routing
As a PE attached to a given MVPN receives C-Join/Prune messages from
its CEs in that MVPN, it must convey the information contained in
those messages to other PEs that are attached to the same MVPN.
There are several different methods for doing this. As these methods
are not interoperable, the method to be used for a particular MVPN
must be either configured or discovered as part of the auto-discovery
process.
3.4.1. PIM Peering
3.4.1.1. Full per-MVPN PIM Peering across an MI-PMSI
If the set of PEs attached to a given MVPN are connected via an
MI-PMSI, the PEs can form "normal" PIM adjacencies with each other.
Since the MI-PMSI functions as a broadcast network, the standard PIM
procedures for forming and maintaining adjacencies over a LAN can be
applied.
As a result, the C-Join/Prune messages that a PE receives from a CE
can be multicast to all the other PEs of the MVPN. PIM "Join
suppression" can be enabled and the PEs can send Asserts as needed.
This procedure is fully specified in Section 5.2.
3.4.1.2. Lightweight PIM Peering across an MI-PMSI
The procedure of the previous section has the following
disadvantages:
- Periodic Hello messages must be sent by all PEs.
Standard PIM procedures require that each PE in a particular MVPN
periodically multicast a Hello to all the other PEs in that MVPN.
If the number of MVPNs becomes very large, sending and receiving
these Hellos can become a substantial overhead for the PE
routers.
- Periodic retransmission of C-Join/Prune messages.
PIM is a "soft-state" protocol, in which reliability is assured
through frequent retransmissions (refresh) of control messages.
This too can begin to impose a large overhead on the PE routers
as the number of MVPNs grows.
The first of these disadvantages is easily remedied. The reason for
the periodic PIM Hellos is to ensure that each PIM speaker on a LAN
knows who all the other PIM speakers on the LAN are. However, in the
context of MVPN, PEs in a given MVPN can learn the identities of all
the other PEs in the MVPN by means of the BGP-based auto-discovery
procedure of Section 4. In that case, the periodic Hellos would
serve no function and could simply be eliminated. (Of course, this
does imply a change to the standard PIM procedures.)
When Hellos are suppressed, we may speak of "lightweight PIM
peering".
The periodic refresh of the C-Join/Prune messages is not as simple to
eliminate. If and when "refresh reduction" procedures are specified
for PIM, it may be useful to incorporate them, so as to make the
lightweight PIM peering procedures even more lightweight.
Lightweight PIM peering is not specified in this document.
3.4.1.3. Unicasting of PIM C-Join/Prune Messages
PIM does not require that the C-Join/Prune messages that a PE
receives from a CE be multicast to all the other PEs; it allows
them to be unicast to a single PE, the one that is upstream on the
path to the root of the multicast tree mentioned in the Join/Prune
message. Note that when the C-Join/Prune messages are unicast, there
is no such thing as "Join suppression". Therefore, PIM Refresh
Reduction may be considered to be a prerequisite for the procedure of
unicasting the C-Join/Prune messages.
When the C-Join/Prune messages are unicast, they are not transmitted
on a PMSI at all. Note that the procedure of unicasting the
C-Join/Prune messages is different from the procedure of transmitting
the C-Join/Prune messages on an MI-PMSI that is instantiated as a
mesh of unicast P-tunnels.
If there are multiple PEs that can be used to reach a given C-source,
procedures described in Sections 5.1 and 9 MUST be used to ensure
that duplicate packets do not get delivered.
Procedures for unicasting the PIM control messages are not further
specified in this document.
3.4.2. Using BGP to Carry C-Multicast Routing
It is possible to use BGP to carry C-multicast routing information
from PE to PE, dispensing entirely with the transmission of
C-Join/Prune messages from PE to PE. This is discussed in Section
5.3 and fully specified in [MVPN-BGP].
4. BGP-Based Auto-Discovery of MVPN Membership
BGP-based auto-discovery is done by means of a new address family,
the MCAST-VPN address family. (This address family also has other
uses, as will be seen later.) Any PE that attaches to an MVPN must
issue a BGP Update message containing an NLRI ("Network Layer
Reachability Information" element) in this address family, along with
a specific set of attributes. In this document, we specify the
information that must be contained in these BGP Updates in order to
provide auto-discovery. The encoding details, along with the
complete set of detailed procedures, are specified in a separate
document [MVPN-BGP].
This section specifies the intra-AS BGP-based auto-discovery
procedures. When segmented inter-AS trees are used, additional
procedures are needed, as specified in [MVPN-BGP]. (When segmented
inter-AS trees are not used, the inter-AS procedures are almost
identical to the intra-AS procedures.)
BGP-based auto-discovery uses a particular kind of MCAST-VPN route
known as an "auto-discovery route", or "A-D route". In particular,
it uses two kinds of "A-D routes": the "Intra-AS I-PMSI A-D route"
and the "Inter-AS I-PMSI A-D route". (There are also additional
kinds of A-D routes, such as the Source Active A-D routes, which are
used for purposes that go beyond auto-discovery. These are discussed
in subsequent sections.)
The Inter-AS I-PMSI A-D route is used only when segmented inter-AS
P-tunnels are used, as specified in [MVPN-BGP].
The "Intra-AS I-PMSI A-D route" is originated by the PEs that are
(directly) connected to the site(s) of an MVPN. It is distributed to
other PEs that attach to sites of the MVPN. If segmented inter-AS
P-tunnels are used, then the Intra-AS I-PMSI A-D routes are not
distributed outside the AS where they originate; if segmented inter-
AS P-tunnels are not used, then the Intra-AS I-PMSI A-D routes are,
despite their name, distributed to all PEs attached to the VPN, no
matter what AS the PEs are in.
The NLRI of an Intra-AS I-PMSI A-D route must contain the following
information:
- The route type (i.e., Intra-AS I-PMSI A-D route).
- The IP address of the originating PE.
- An RD ("Route Distinguisher", [RFC4364]) configured locally for
the MVPN. This is an RD that can be prepended to that IP address
to form a globally unique VPN-IP address of the PE.
Intra-AS I-PMSI A-D routes carry the following attributes:
- Route Target Extended Communities attribute.
One or more of these MUST be carried by each Intra-AS I-PMSI A-D
route. If any other PE has one of these Route Targets configured
for import into a VRF, it treats the advertising PE as a member
in the MVPN to which the VRF belongs. This allows each PE to
discover the PEs that belong to a given MVPN. More specifically,
it allows a PE in the Receiver Sites set to discover the PEs in
the Sender Sites set of the MVPN, and the PEs in the Sender Sites
set of the MVPN to discover the PEs in the Receiver Sites set of
the MVPN. The PEs in the Receiver Sites set would be configured
to import the Route Targets advertised in the BGP A-D routes by
PEs in the Sender Sites set. The PEs in the Sender Sites set
would be configured to import the Route Targets advertised in the
BGP A-D routes by PEs in the Receiver Sites set.
- PMSI Tunnel attribute.
This attribute is present whenever the MVPN uses an MI-PMSI or
when it uses a UI-PMSI rooted at the originating router. It
contains the following information:
* tunnel technology, which may be one of the following:
+ Bidirectional multicast tree created by BIDIR-PIM,
+ Source-specific multicast tree created by PIM-SM,
supporting the SSM service model,
+ Set of trees (one shared tree and a set of source trees)
created by PIM-SM using the ASM service model,
+ Point-to-multipoint LSP created by RSVP-TE,
+ Point-to-multipoint LSP created by mLDP,
+ multipoint-to-multipoint LSP created by mLDP
+ unicast tunnel
* P-tunnel identifier
Before a P-tunnel can be constructed to instantiate the
I-PMSI, the PE must be able to create a unique identifier for
the tunnel. The syntax of this identifier depends on the
tunnel technology used.
Each PE attaching to a given MVPN must be configured with
information specifying the allowable encapsulations to use
for that MVPN, as well as the particular one of those
encapsulations that the PE is to identify in the PMSI Tunnel
attribute of the Intra-AS I-PMSI A-D routes that it
originates.
* Multi-VPN aggregation capability and demultiplexor value.
This specifies whether the P-tunnel is capable of aggregating
I-PMSIs from multiple MVPNs. This will affect the
encapsulation used. If aggregation is to be used, a
demultiplexor value to be carried by packets for this
particular MVPN must also be specified. The demultiplexing
mechanism and signaling procedures are described in Section
- PE Distinguisher Labels Attribute
Sometimes it is necessary for one PE to advertise an upstream-
assigned MPLS label that identifies another PE. Under certain
circumstances to be discussed later, a PE that is the root of a
multicast P-tunnel will bind an MPLS label value to one or more
of the PEs that belong to the P-tunnel, and it will distribute
these label bindings using Intra-AS I-PMSI A-D routes.
Specification of when this must be done is provided in Sections
6.4.4 and 11.2.2. We refer to these as "PE Distinguisher
Labels".
Note that, as specified in [MPLS-UPSTREAM-LABEL], PE
Distinguisher Label values are unique only in the context of the
IP address identifying the root of the P-tunnel; they are not
necessarily unique per tunnel.
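The Route Target import rule described above for Intra-AS I-PMSI A-D routes can be pictured with a short Python sketch. The class `IntraAsAdRoute`, its field names, and the string form of the Route Targets are assumptions chosen for illustration; the actual encodings are those of [MVPN-BGP] and are not reproduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class IntraAsAdRoute:
    originating_pe: str            # IP address of the originating PE (NLRI)
    rd: str                        # Route Distinguisher (NLRI)
    route_targets: FrozenSet[str]  # Route Target Extended Communities


def discover_mvpn_members(import_rts: Set[str],
                          routes: Iterable[IntraAsAdRoute]) -> Set[str]:
    """Return the PEs whose A-D routes carry at least one imported Route Target."""
    return {r.originating_pe for r in routes if r.route_targets & import_rts}


routes = [
    IntraAsAdRoute("10.0.0.1", "65000:1", frozenset({"target:65000:100"})),
    IntraAsAdRoute("10.0.0.2", "65000:2", frozenset({"target:65000:200"})),
]
members = discover_mvpn_members({"target:65000:100"}, routes)
```

A single matching Route Target is enough for the advertising PE to be treated as a member of the MVPN to which the importing VRF belongs.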
5. PE-PE Transmission of C-Multicast Routing
As a PE attached to a given MVPN receives C-Join/Prune messages from
its CEs in that MVPN, it must convey the information contained in
those messages to other PEs that are attached to the same MVPN. This
is known as the "PE-PE transmission of C-multicast routing
information".
This section specifies the procedures used for PE-PE transmission of
C-multicast routing information. Not every procedure mentioned in
Section 3.4 is specified here. Rather, this section focuses on two
particular procedures:
- Full PIM Peering.
This procedure is fully specified herein.
- Use of BGP to distribute C-multicast routing
This procedure is described herein, but the full specification
appears in [MVPN-BGP].
Those aspects of the procedures that apply to both of the above are
also specified fully herein.
Specification of other procedures is outside the scope of this
document.
5.1. Selecting the Upstream Multicast Hop (UMH)
When a PE receives a C-Join/Prune message from a CE, the message
identifies a particular multicast flow as belonging either to a
source-specific tree (S,G) or to a shared tree (*,G). Throughout
this section, we use the term "C-root" to refer to S, in the case of
a source-specific tree, or to the Rendezvous Point (RP) for G, in the
case of (*,G). If the route to the C-root is across the VPN
backbone, then the PE needs to find the "Upstream Multicast Hop"
(UMH) for the (S,G) or (*,G) flow. The UMH is either the PE at which
(S,G) or (*,G) data packets enter the VPN backbone or the Autonomous
System Border Router (ASBR) at which those data packets enter the
local AS when traveling through the VPN backbone. The process of
finding the upstream multicast hop for a given C-root is known as
"upstream multicast hop selection".
5.1.1. Eligible Routes for UMH Selection
In the simplest case, the PE does the upstream hop selection by
looking up the C-root in the unicast VRF associated with the PE-CE
interface over which the C-Join/Prune message was received. The
route that matches the C-root will contain the information needed to
select the UMH.
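For the simplest case, the lookup of the C-root in the VRF can be sketched in Python as follows. The sketch assumes that "the route that matches the C-root" is found by longest-prefix match and models the VRF as a plain list of prefix strings; both are simplifications for illustration, and the function name `best_match` is hypothetical.

```python
import ipaddress
from typing import Iterable, Optional


def best_match(vrf_prefixes: Iterable[str],
               c_root: str) -> Optional[ipaddress.IPv4Network]:
    """Return the longest VRF prefix covering the C-root, or None if none does."""
    addr = ipaddress.ip_address(c_root)
    networks = (ipaddress.ip_network(p) for p in vrf_prefixes)
    matches = [n for n in networks if addr in n]
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda n: n.prefixlen, default=None)


vrf = ["10.0.0.0/8", "10.1.0.0/16", "192.0.2.0/24"]
route = best_match(vrf, "10.1.2.3")
```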
However, in some cases, the CEs may be distributing to the PEs a
special set of routes that are to be used exclusively for the purpose
of upstream multicast hop selection, and not used for unicast routing
at all. For example, when BGP is the CE-PE unicast routing protocol,
the CEs may be using Subsequent Address Family Identifier 2 (SAFI 2)
to distribute a special set of routes that are to be used for, and
only for, upstream multicast hop selection. When OSPF [OSPF] is the
CE-PE routing protocol, the CE may use an MT-ID (Multi-Topology
Identifier) [OSPF-MT] of 1 to distribute a special set of routes that
are to be used for, and only for, upstream multicast hop selection.
When a CE uses one of these mechanisms to distribute to a PE a
special set of routes to be used exclusively for upstream multicast
hop selection, these routes are distributed among the PEs using SAFI
129, as described in [MVPN-BGP]. Whether the routes used for
upstream multicast hop selection are (a) the "ordinary" unicast
routes or (b) a special set of routes that are used exclusively for
upstream multicast hop selection is a matter of policy. How that
policy is chosen, deployed, or implemented is outside the scope of
this document. In the following, we will simply refer to the set of
routes that are used for upstream multicast hop selection, the
"Eligible UMH routes", with no presumptions about the policy by which
this set of routes was chosen.
5.1.2. Information Carried by Eligible UMH Routes
Every route that is eligible for UMH selection SHOULD carry a VRF
Route Import Extended Community [MVPN-BGP]. However, if BGP is used
to distribute C-multicast routing information, or if the route is
from a VRF that belongs to a multi-AS VPN as described in option b of
Section 10 of [RFC4364], then the route MUST carry a VRF Route Import
Extended Community. This attribute identifies the PE that originated
the route.
If BGP is used for carrying C-multicast routes, OR if "Segmented
inter-AS Tunnels" are used, then every UMH route MUST also carry a
Source AS Extended Community [MVPN-BGP].
These two attributes are used in the upstream multicast hop selection
procedures described below.
5.1.3. Selecting the Upstream PE
The first step in selecting the upstream multicast hop for a given
C-root is to select the Upstream PE router for that C-root.
The PE that received the C-Join message from a CE looks in the VRF
corresponding to the interfaces over which the C-Join was received.
It finds the Eligible UMH route that is the best match for the C-root
specified in that C-Join. Call this the "Installed UMH Route".
Note that the outgoing interface of the Installed UMH Route may be
one of the interfaces associated with the VRF, in which case the
upstream multicast hop is a CE and the route to the C-root is not
across the VPN backbone.
Consider the set of all VPN-IP routes that (a) are eligible to be
imported into the VRF (as determined by their Route Targets), (b) are
eligible to be used for upstream multicast hop selection, and (c)
have exactly the same IP prefix (not necessarily the same RD) as the
installed UMH route.
For each route in this set, determine the corresponding Upstream PE
and Upstream RD. If a route has a VRF Route Import Extended
Community, the route's Upstream PE is determined from it. If a route
does not have a VRF Route Import Extended Community, the route's
Upstream PE is determined from the route's BGP Next Hop. In either
case, the Upstream RD is taken from the route's NLRI.
This results in a set of triples of <route, Upstream PE, Upstream
RD>.
Call this the "UMH Route Candidate Set". Then, the PE MUST select a
single route from the set to be the "Selected UMH Route". The
corresponding Upstream PE is known as the "Selected Upstream PE", and
the corresponding Upstream RD is known as the "Selected Upstream RD".
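The construction of the UMH Route Candidate Set described above can be sketched in Python. The `VpnIpRoute` class and its field names are hypothetical stand-ins for a VPN-IP route; in particular, `vrf_route_import` holds only the PE address that would be taken from the VRF Route Import Extended Community, and `umh_eligible` stands for whatever policy marks a route as an Eligible UMH route.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class VpnIpRoute:
    rd: str                                  # Route Distinguisher from the NLRI
    prefix: str                              # IP prefix from the NLRI
    next_hop: str                            # BGP Next Hop
    route_targets: FrozenSet[str]
    vrf_route_import: Optional[str] = None   # PE address, if the community is present
    umh_eligible: bool = True


def umh_candidate_set(routes: Iterable[VpnIpRoute], import_rts: Set[str],
                      installed_prefix: str) -> List[Tuple[VpnIpRoute, str, str]]:
    """Build the <route, Upstream PE, Upstream RD> triples."""
    triples = []
    for r in routes:
        if not (r.route_targets & import_rts):
            continue  # (a) not importable into this VRF
        if not r.umh_eligible:
            continue  # (b) not eligible for UMH selection
        if r.prefix != installed_prefix:
            continue  # (c) prefix differs from the installed UMH route
        # Prefer the VRF Route Import community; otherwise use the BGP Next Hop.
        upstream_pe = r.vrf_route_import if r.vrf_route_import is not None else r.next_hop
        triples.append((r, upstream_pe, r.rd))
    return triples
```

Note that routes with different RDs but the same IP prefix all remain in the set, which is what allows more than one Upstream PE to be considered for a multihomed C-root.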
There are several possible procedures that can be used by a PE to
select a single route from the candidate set.
The default procedure, which MUST be implemented, is to select the
route whose corresponding Upstream PE address is numerically highest,
where a 32-bit IP address is treated as a 32-bit unsigned integer.
Call this the "default Upstream PE selection". For a given C-root,
provided that the routing information used to create the candidate
set is stable, all PEs will have the same default Upstream PE
selection. (Though different default Upstream PE selections may be
chosen during a routing transient.)
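The default Upstream PE selection can be illustrated with a minimal Python sketch. It assumes IPv4 Upstream PE addresses given as dotted-quad strings; the function name `default_upstream_pe` is hypothetical.

```python
import ipaddress
from typing import Iterable


def default_upstream_pe(upstream_pes: Iterable[str]) -> str:
    """Pick the Upstream PE whose address is numerically highest."""
    # Each address is compared as a 32-bit unsigned integer, not as text.
    return max(upstream_pes, key=lambda pe: int(ipaddress.IPv4Address(pe)))


selected = default_upstream_pe(["10.0.0.9", "10.0.0.10", "10.0.0.2"])
```

The comparison has to be numeric: a textual comparison of the strings above would rank "10.0.0.9" above "10.0.0.10", which is not what the default procedure specifies.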
An alternative procedure that MUST be implemented, but which is
disabled by default, is the following. This procedure ensures that,
except during a routing transient, each PE chooses the same Upstream
PE for a given combination of C-root and C-G.
1. The PEs in the candidate set are numbered from lowest to
highest IP address, starting from 0.
2. The following hash is performed:
- A bytewise exclusive-or of all the bytes in the C-root
address and the C-G address is performed.
- The result is taken modulo n, where n is the number of PEs
in the candidate set. Call this result N.
The Selected Upstream PE is then the one that appears in position N
in the list of step 1.
Other hashing algorithms are allowed as well, but not required.
The alternative procedure allows a form of "equal cost load
balancing". Suppose, for example, that from egress PEs PE3 and PE4,
source C-S can be reached, at equal cost, via ingress PE PE1 or
ingress PE PE2. The load balancing procedure makes it possible for
PE1 to be the ingress PE for (C-S,C-G1) data traffic while PE2 is the
ingress PE for (C-S,C-G2) data traffic.
Another procedure, which SHOULD be implemented, is to use the
Installed UMH Route as the Selected UMH Route. If this procedure is
used, the result is likely to be that a given PE will choose the
Upstream PE that is closest to it, according to the routing in the SP
backbone. As a result, for a given C-root, different PEs may choose
different Upstream PEs. This is useful if the C-root is an anycast
address, and can also be useful if the C-root is in a multihomed site
(i.e., a site that is attached to multiple PEs). However, this
procedure is more likely to lead to steady state duplication of
traffic unless (a) PEs discard data traffic that arrives from the
"wrong" Upstream PE or (b) data traffic is carried only in non-
aggregated S-PMSIs. This issue is discussed at length in Section 9.
General policy-based procedures for selecting the UMH route are
allowed but not required, and they are not further discussed in this document.
5.1.4. Selecting the Upstream Multicast Hop
In certain cases, the Selected Upstream Multicast Hop is the same as
the Selected Upstream PE. In other cases, the Selected Upstream
Multicast Hop is the ASBR that is the BGP Next Hop of the Selected UMH Route.
If the Selected Upstream PE is in the local AS, then the Selected
Upstream PE is also the Selected Upstream Multicast Hop. This is the
case if any of the following conditions holds:
- The Selected UMH Route has a Source AS Extended Community, and
the Source AS is the same as the local AS,
- The Selected UMH Route does not have a Source AS Extended
Community, but the route's BGP Next Hop is the same as the Upstream PE.
Otherwise, the Selected Upstream Multicast Hop is an ASBR. The
method of determining just which ASBR it is depends on the particular
inter-AS signaling method being used (PIM or BGP) and on whether
segmented or non-segmented inter-AS tunnels are used. These details
are presented in later sections.
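The local-AS test above can be sketched roughly as follows. The `SelectedUmhRoute` type and its field names are hypothetical, and the second condition is taken to mean that the route's BGP Next Hop equals the Upstream PE; treat both as assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelectedUmhRoute:
    """Hypothetical view of the attributes this check needs."""
    upstream_pe: str                 # Selected Upstream PE address
    bgp_next_hop: str                # BGP Next Hop of the route
    source_as: Optional[int] = None  # from the Source AS Extended Community, if any


def upstream_pe_is_upstream_multicast_hop(route: SelectedUmhRoute, local_as: int) -> bool:
    """Return True if the Selected Upstream PE is also the Selected UMH.

    A False result means the Selected UMH is an ASBR; choosing that ASBR
    depends on the inter-AS method in use and is not modeled here.
    """
    if route.source_as is not None:
        # Source AS Extended Community present: compare with the local AS.
        return route.source_as == local_as
    # No Source AS Extended Community: fall back to the BGP Next Hop test.
    return route.bgp_next_hop == route.upstream_pe
```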
5.2. Details of Per-MVPN Full PIM Peering over MI-PMSI
When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat
the MI-PMSI as a LAN interface and form full PIM adjacencies with
each other over that LAN interface.
The use of PIM when an MI-PMSI is not in use is outside the scope of this document.
To form full PIM adjacencies, the PEs execute the standard PIM
procedures on the LAN interface, including the generation and
processing of PIM Hello, Join/Prune, Assert, DF (Designated
Forwarder) election, and other PIM control messages. These are
executed independently for each C-instance. PIM "Join suppression"
SHOULD be enabled.
5.2.1. PIM C-Instance Control Packets
All IPv4 PIM C-instance control packets of a particular MVPN are
addressed to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address
and transmitted over the MI-PMSI of that MVPN. While in transit in
the P-network, the packets are encapsulated as required for the
particular kind of P-tunnel that is being used to instantiate the
MI-PMSI. Thus, the C-instance control packets are not processed by
the P routers, and MVPN-specific PIM routes can be extended from site
to site without appearing in the P routers.
The handling of IPv6 PIM C-instance control packets will be specified
in a follow-on document.
As specified in Section 5.1.2, when PIM is being used to distribute
C-multicast routing information, any PE distributing VPN-IP routes
that are eligible for use as UMH routes SHOULD include a VRF Route
Import Extended Community with each route. For a given VRF, the
Global Administrator field of the VRF Route Import Extended Community
MUST be set to the same IP address that the PE places in the IP
source address field of the PE-PE PIM control messages it originates
from that VRF.
Note that BSR (Bootstrap Router Mechanism for PIM) [BSR] messages are
treated the same as PIM C-instance control packets, and BSR
processing is regarded as an integral part of the PIM C-instance processing.
5.2.2. PIM C-Instance Reverse Path Forwarding (RPF) Determination
Although the MI-PMSI is treated by PIM as a LAN interface, unicast
routing is NOT run over it, and there are no unicast routing
adjacencies over it. Therefore, it is necessary to specify special
procedures for determining when the MI-PMSI is to be regarded as the
"RPF Interface" for a particular C-address.
The PE follows the procedures of Section 5.1 to determine the
Selected UMH Route. If that route is NOT a VPN-IP route learned from
BGP as described in [RFC4364], or if that route's outgoing interface
is one of the interfaces associated with the VRF, then ordinary PIM
procedures for determining the RPF interface apply.
However, if the Selected UMH Route is a VPN-IP route whose outgoing
interface is not one of the interfaces associated with the VRF, then
PIM will consider the RPF interface to be the MI-PMSI associated with
the VPN-specific PIM instance.
Once PIM has determined that the RPF interface for a particular
C-root is the MI-PMSI, it is necessary for PIM to determine the "RPF
neighbor" for that C-root. This will be one of the other PEs that is
a PIM adjacency over the MI-PMSI. In particular, it will be the
"Selected Upstream PE", as defined in Section 5.1.
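The RPF determination described in this section can be summarized in a short sketch. The `UmhRoute` type, the `MI_PMSI` placeholder, and the convention of returning None when ordinary PIM procedures apply are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Collection, Optional, Tuple

MI_PMSI = "MI-PMSI"  # placeholder name for the MVPN's MI-PMSI interface


@dataclass
class UmhRoute:
    """Hypothetical view of the Selected UMH Route."""
    is_vpn_ip: bool          # learned from BGP as a VPN-IP route
    outgoing_interface: str  # interface the route resolves to
    upstream_pe: str         # Selected Upstream PE for this route


def mi_pmsi_rpf(route: UmhRoute,
                vrf_interfaces: Collection[str]) -> Optional[Tuple[str, str]]:
    """Return (RPF interface, RPF neighbor) when the MI-PMSI is the RPF interface.

    Returns None when ordinary PIM RPF procedures apply instead.
    """
    if not route.is_vpn_ip or route.outgoing_interface in vrf_interfaces:
        return None
    # The RPF neighbor over the MI-PMSI is the Selected Upstream PE.
    return MI_PMSI, route.upstream_pe
```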
5.3. Use of BGP for Carrying C-Multicast Routing
It is possible to use BGP to carry C-multicast routing information
from PE to PE, dispensing entirely with the transmission of
C-Join/Prune messages from PE to PE. This section describes the
procedures for carrying intra-AS multicast routing information.
Inter-AS procedures are described in Section 8. The complete
specification of both sets of procedures and of the encodings can be
found in [MVPN-BGP].
5.3.1. Sending BGP Updates
The MCAST-VPN address family is used for this purpose. MCAST-VPN
routes used for the purpose of carrying C-multicast routing
information are distinguished from those used for the purpose of
carrying auto-discovery information by means of a "route type" field
that is encoded into the NLRI. The following information is required
in BGP to advertise the MVPN routing information. The NLRI contains
- The type of C-multicast route
There are two types:
* source tree join
* shared tree join
- The C-group address
- The C-source address (In the case of a shared tree join, this is
the address of the C-RP.)
- The Selected Upstream RD corresponding to the C-root address
(determined by the procedures of Section 5.1).
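The information items listed above can be pictured as a simple record. This sketch models only the logical content; it does not represent the wire encoding, which is specified in [MVPN-BGP], and the type and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CMulticastRouteType(Enum):
    """The two C-multicast route types; values here are not wire codes."""
    SOURCE_TREE_JOIN = auto()
    SHARED_TREE_JOIN = auto()


@dataclass(frozen=True)
class CMulticastRouteInfo:
    """Logical content of a C-multicast route's NLRI (illustrative only)."""
    route_type: CMulticastRouteType
    c_group: str      # C-group address
    c_source: str     # C-source address, or the C-RP for a shared tree join
    upstream_rd: str  # Selected Upstream RD for the C-root
```

Because the record carries no PE-specific field, two PEs that join the same C-tree through the same Upstream RD produce equal records in this model.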
Whenever a C-multicast route is sent, it must also carry the Selected
Upstream Multicast Hop corresponding to the C-root address
(determined by the procedures of Section 5.1). The Selected Upstream
Multicast Hop must be encoded as part of a Route Target Extended
Community to facilitate the optional use of filters that can prevent
the distribution of the update to BGP speakers other than the
Upstream Multicast Hop. See Section 10.1.3 of [MVPN-BGP] for the details.
There is no C-multicast route corresponding to the PIM function of
pruning a source off the shared tree when a PE switches from a
(C-*,C-G) tree to a (C-S,C-G) tree. Section 9 of this document
specifies a mandatory procedure that ensures that if any PE joins a
(C-S,C-G) source tree, all other PEs that have joined or will join
the (C-*,C-G) shared tree will also join the (C-S,C-G) source tree.
This eliminates the need for a C-multicast route that prunes C-S off
the (C-*,C-G) shared tree when switching from (C-*,C-G) to (C-S,C-G) tree.
5.3.2. Explicit Tracking
Note that the upstream multicast hop is NOT part of the NLRI in the
C-multicast BGP routes. This means that if several PEs join the same
C-tree, the BGP routes they distribute to do so are regarded by BGP
as comparable routes, and only one will be installed. If a route
reflector is being used, this further means that the PE that is used
to reach the C-source will know only that one or more of the other
PEs have joined the tree, but it won't know which one. That is, this
BGP update mechanism does not provide "explicit tracking". Explicit
tracking is not provided by default because it increases the amount
of state needed and thus decreases scalability. Also, as
constructing the C-PIM messages to send "upstream" for a given tree
does not depend on knowing all the PEs that are downstream on that
tree, there is no reason for the C-multicast route type updates to
provide explicit tracking.
There are some cases in which explicit tracking is necessary in order
for the PEs to set up certain kinds of P-trees. There are other
cases in which explicit tracking is desirable in order to determine
how to optimally aggregate multicast flows onto a given aggregate
tree. As these functions have to do with the setting up of
infrastructure in the P-network, rather than with the dissemination
of C-multicast routing information, any explicit tracking that is
necessary is handled by sending a particular type of A-D route known
as "Leaf A-D routes".
Whenever a PE sends an A-D route with a PMSI Tunnel attribute, it can
set a bit in the PMSI Tunnel attribute indicating "Leaf Information
Required". A PE that installs such an A-D route MUST respond by
generating a Leaf A-D route, indicating that it needs to join (or be
joined to) the specified PMSI Tunnel. Details can be found in
5.3.3. Withdrawing BGP Updates
A PE removes itself from a C-multicast tree (shared or source) by
withdrawing the corresponding BGP Update.
If a PE has pruned a C-source from a shared C-multicast tree, and it
needs to "unprune" that source from that tree, it does so by
withdrawing the route that pruned the source from the tree.
5.3.4. BSR
BGP does not provide a method for carrying the control information of
BSR packets received by a PE from a CE. BSR is supported by
transmitting the BSR control messages from one PE in an MVPN to all
the other PEs in that MVPN.
When a PE needs to transmit a BSR message for a particular MVPN to
other PEs, it must put its own IP address into the BSR message as the
IP source address. As specified in Section 5.1.2, when a PE
distributes VPN-IP routes that are eligible for use as UMH routes,
the PE MUST include a VRF Route Import Extended Community with each
route. For a given MVPN, a single such IP address MUST be used, and
that same IP address MUST be used as the source address in all BSR
packets that the PE transmits to other PEs.
The BSR message may be transmitted over any PMSI that will deliver
the message to all the other PEs in the MVPN. If no such PMSI has
been instantiated yet, then an appropriate P-tunnel must be
advertised, and the C-flow whose C-source address is the address of
the PE itself, and whose multicast group is ALL-PIM-ROUTERS
(224.0.0.13), must be bound to it. This can be done using the
procedures described in Sections 7.3 and 7.4. Note that this is NOT
meant to imply that the other PIM control packets from the PIM
C-instance are to be transmitted to the other PEs.
When a PE receives a BSR message for a particular MVPN from some
other PE, the PE accepts the message only if the IP source address in
that message is the Selected Upstream PE (see Section 5.1.3) for the
IP address of the Bootstrap router. Otherwise, the PE simply
discards the packet. If the PE accepts the packet, it does normal
BSR processing on it, and it may forward a BSR message to one or more
CEs as a result.
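The acceptance rule for BSR messages received from another PE can be sketched as below. The `selected_upstream_pe_for` callback is a hypothetical stand-in for the Section 5.1.3 selection applied to the Bootstrap router's address, and IPv4 addressing is assumed.

```python
import ipaddress
from typing import Callable


def accept_bsr_from_pe(ip_source: str,
                       bootstrap_router: str,
                       selected_upstream_pe_for: Callable[[str], str]) -> bool:
    """Accept a PE-originated BSR message only from the Selected Upstream PE.

    selected_upstream_pe_for is a hypothetical lookup returning the
    Selected Upstream PE for a given C-address.
    """
    expected = ipaddress.IPv4Address(selected_upstream_pe_for(bootstrap_router))
    # Any other source address means the packet is simply discarded.
    return ipaddress.IPv4Address(ip_source) == expected
```

A True result means that normal BSR processing follows, which may lead to a BSR message being forwarded to one or more CEs.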