RFC 6513

Multicast in MPLS/BGP IP VPNs

Pages: 88
Proposed Standard
Updated by: 7582 7900 7988

Part 2 of 5 – Pages 13 to 33

RFC6513 - Page 13

3.  Concepts and Framework

3.1.  PE-CE Multicast Routing

   Support of multicast in BGP/MPLS IP VPNs is modeled closely after the
   support of unicast in BGP/MPLS IP VPNs.  That is, a multicast routing
   protocol will be run on the PE-CE interfaces, such that PE and CE are
   multicast routing adjacencies on that interface.  CEs at different
   sites do not become multicast routing adjacencies of each other.

   If a PE attaches to n VPNs for which multicast support is provided
   (i.e., to n "MVPNs"), the PE will run n independent instances of a
   multicast routing protocol.  We will refer to these multicast routing
   instances as "VPN-specific multicast routing instances", or more
   briefly as "multicast C-instances".  The notion of a "VRF" (VPN
   Routing and Forwarding Table), defined in [RFC4364], is extended to
   include multicast routing entries as well as unicast routing entries.
   Each multicast routing entry is thus associated with a particular
   VRF.

   Whether a particular VRF belongs to an MVPN or not is determined by
   configuration.

   In this document, we do not attempt to provide support for every
   multicast routing protocol that could possibly run on the
   PE-CE link.  Rather, we consider multicast C-instances only for the
   following multicast routing protocols:

     - PIM Sparse Mode (PIM-SM), supporting the ASM service model

     - PIM Sparse Mode, supporting the SSM service model

     - PIM Bidirectional Mode (BIDIR-PIM), which uses bidirectional
       C-trees to support the ASM service model.

   In order to support the "Carrier's Carrier" model of [RFC4364], mLDP
   may also be supported on the PE-CE interface.  The use of mLDP on the
   PE-CE interface is described in [MVPN-BGP].

   The use of BGP on the PE-CE interface is not within the scope of this
   document.

   As the only multicast C-instances discussed by this document are PIM-
   based C-instances, we will generally use the term "PIM C-instances"
   to refer to the multicast C-instances.

   A PE router may also be running a "provider-wide" instance of PIM (a
   "PIM P-instance"), in which it has a PIM adjacency with, e.g., each
   of its IGP neighbors (i.e., with P routers), but NOT with any CE
   routers, and not with other PE routers (unless another PE router
   happens to be an IGP adjacency).  In this case, P routers would also
   run the P-instance of PIM but NOT a C-instance.  If there is a PIM
   P-instance, it may or may not have a role to play in the support of
   VPN multicast; this is discussed in later sections.  However, in no
   case will the PIM P-instance contain VPN-specific multicast routing
   information.

   In order to help clarify when we are speaking of the PIM P-instance
   and when we are speaking of a PIM C-instance, we will also apply the
   prefixes "P-" and "C-", respectively, to control messages, addresses,
   etc.  Thus, a P-Join would be a PIM Join that is processed by the PIM
   P-instance, and a C-Join would be a PIM Join that is processed by a
   C-instance.  A P-group address would be a group address in the SP's
   address space, and a C-group address would be a group address in a
   VPN's address space.  A C-tree is a multicast distribution tree
   constructed and maintained by the PIM C-instances.  A C-flow is a
   stream of multicast packets with a common C-source address and a
   common C-group address.  We will use the notation "(C-S,C-G)" to
   identify specific C-flows.  If a particular C-tree is a shared tree
   (whether unidirectional or bidirectional) rather than a source-
   specific tree, we will sometimes speak of the entire set of flows
   traveling that tree, identifying the set as "(C-*,C-G)".
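
   As an illustration only, the (C-S,C-G) and (C-*,C-G) notation can be
   modeled as a small key type.  The class name CFlow, its field names,
   and the use of None to stand for the wildcard are assumptions of
   this sketch; none of them is defined by this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CFlow:
    """Key for a C-flow, or for the set of flows on a shared C-tree."""
    c_group: str                      # C-G, a group address in the VPN's space
    c_source: Optional[str] = None    # C-S; None is this sketch's wildcard

    def notation(self) -> str:
        """Render the key in the (C-S,C-G) or (*,C-G) style."""
        source = "*" if self.c_source is None else self.c_source
        return f"({source},{self.c_group})"

    def covers(self, packet_source: str, packet_group: str) -> bool:
        """True if a packet with this source and group falls under the key."""
        if packet_group != self.c_group:
            return False
        return self.c_source is None or self.c_source == packet_source
```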

3.2.  P-Multicast Service Interfaces (PMSIs)

   A PE must have the ability to forward multicast data packets received
   from a CE to one or more of the other PEs in the same MVPN for
   delivery to one or more other CEs.

   We define the notion of a "P-Multicast Service Interface" (PMSI).  If
   a particular MVPN is supported by a particular set of PE routers,
   then there will be one or more PMSIs connecting those PE routers
   and/or subsets thereof.  A PMSI is a conceptual "overlay" on the
   P-network with the following property: a PE in a given MVPN can give
   a packet to the PMSI, and the packet will be delivered to some or all
   of the other PEs in the MVPN, such that any PE receiving the packet
   will be able to determine the MVPN to which the packet belongs.

   As we discuss below, a PMSI may be instantiated by a number of
   different transport mechanisms, depending on the particular
   requirements of the MVPN and of the SP.  We will refer to these
   transport mechanisms as "P-tunnels".

   For each MVPN, there are one or more PMSIs that are used for
   transmitting the MVPN's multicast data from one PE to others.  We
   will use the term "PMSI" such that a single PMSI belongs to a single
   MVPN.  However, the transport mechanism that is used to instantiate a
   PMSI may allow a single P-tunnel to carry the data of multiple PMSIs.

   In this document, we make a clear distinction between the multicast
   service (the PMSI) and its instantiation.  This allows us to separate
   the discussion of different services from the discussion of different
   instantiations of each service.  The term "P-tunnel" is used to refer
   to the transport mechanism that instantiates a service.

   PMSIs are used to carry C-multicast data traffic.  The C-multicast
   data traffic travels along a C-tree, but in the SP backbone all
   C-trees are tunneled through P-tunnels.  Thus, we will sometimes talk
   of a P-tunnel carrying one or more C-trees.

   Some of the options for passing multicast control traffic among the
   PEs do so by sending the control traffic through a PMSI; other
   options do not send control traffic through a PMSI.

3.2.1.  Inclusive and Selective PMSIs

   We will distinguish between three different kinds of PMSIs:

     - "Multidirectional Inclusive" PMSI (MI-PMSI)

       A Multidirectional Inclusive PMSI is one that enables ANY PE
       attaching to a particular MVPN to transmit a message such that it
       will be received by EVERY other PE attaching to that MVPN.

       There is, at most, one MI-PMSI per MVPN.  (Though the P-tunnel or
       P-tunnels that instantiate an MI-PMSI may actually carry the data
       of more than one PMSI.)

       An MI-PMSI can be thought of as an overlay broadcast network
       connecting the set of PEs supporting a particular MVPN.

     - "Unidirectional Inclusive" PMSI (UI-PMSI)

       A Unidirectional Inclusive PMSI is one that enables a particular
       PE, attached to a particular MVPN, to transmit a message such
       that it will be received by all the other PEs attaching to that
       MVPN.  There is, at most, one UI-PMSI per PE per MVPN, though the
       P-tunnel that instantiates a UI-PMSI may, in fact, carry the data
       of more than one PMSI.

     - "Selective" PMSI (S-PMSI).

       A Selective PMSI is one that provides a mechanism wherein a
       particular PE in an MVPN can multicast messages so that they will
       be received by a subset of the other PEs of that MVPN.  There may
       be an arbitrary number of S-PMSIs per PE per MVPN.  The P-tunnel
       that instantiates a given S-PMSI may carry data from multiple
       S-PMSIs.

   In later sections, we describe the role played by these different
   kinds of PMSIs.  We will use the term "I-PMSI" when we are not
   distinguishing between "MI-PMSIs" and "UI-PMSIs".
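
   The difference between the three kinds of PMSI lies in who may
   transmit and who receives.  The following sketch expresses that
   difference; the names PmsiKind, may_transmit, and receivers, and the
   representation of PEs as strings, are assumptions made for
   illustration and are not part of this specification.

```python
from enum import Enum
from typing import AbstractSet, Set


class PmsiKind(Enum):
    MI_PMSI = "Multidirectional Inclusive"
    UI_PMSI = "Unidirectional Inclusive"
    S_PMSI = "Selective"


def may_transmit(kind: PmsiKind, pe: str, root_pe: str) -> bool:
    """Any PE of the MVPN may send on an MI-PMSI; a UI-PMSI or an S-PMSI
    is used for transmission only by the particular PE it belongs to."""
    return kind is PmsiKind.MI_PMSI or pe == root_pe


def receivers(kind: PmsiKind,
              mvpn_pes: AbstractSet[str],
              sender: str,
              selected: AbstractSet[str] = frozenset()) -> Set[str]:
    """PEs that receive a message sent by `sender` on a PMSI of this kind."""
    others = set(mvpn_pes) - {sender}
    if kind is PmsiKind.S_PMSI:
        # Only a subset of the other PEs receives traffic on an S-PMSI.
        return others & set(selected)
    # Both inclusive kinds deliver to every other PE of the MVPN.
    return others
```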

3.2.2.  P-Tunnels Instantiating PMSIs

   The transport mechanisms that are used to instantiate PMSIs are
   referred to as "P-tunnels".  A number of different tunnel setup
   techniques can be used to create the P-tunnels that instantiate the
   PMSIs.  Among these are the following:

     - PIM

       A PMSI can be instantiated as (a set of) Multicast Distribution
       trees created by the PIM P-instance ("P-trees").

       The multicast distribution trees that instantiate I-PMSIs may be
       either shared trees or source-specific trees.

       This document (along with [MVPN-BGP]) specifies procedures for
       identifying a particular (C-S,C-G) flow and assigning it to a
       particular S-PMSI.  Such an S-PMSI is most naturally instantiated
       as a source-specific tree.

       The use of shared trees (including bidirectional trees) to
       instantiate S-PMSIs is outside the scope of this document.

       The use of PIM-DM to create P-tunnels is not supported.

       P-tunnels may be shared by multiple MVPNs (i.e., a given P-tunnel
       may be the instantiation of multiple PMSIs), as long as the
       tunnel encapsulation provides some means of demultiplexing the
       data traffic by MVPN.

     - mLDP

       mLDP Point-to-Multipoint (P2MP) LSPs or Multipoint-to-Multipoint
       (MP2MP) LSPs can be used to instantiate I-PMSIs.

       An S-PMSI or a UI-PMSI could be instantiated as a single mLDP
       P2MP LSP, whereas an MI-PMSI would have to be instantiated as a
       set of such LSPs (each PE in the MVPN being the root of one such
       LSP) or as a single MP2MP LSP.

       Procedures for sharing MP2MP LSPs across multiple MVPNs are
       outside the scope of this document.

       The use of MP2MP LSPs to instantiate S-PMSIs is outside the scope
       of this document.

       Section 11.2.3 discusses a way of using a partial mesh of MP2MP
       LSPs to instantiate a PMSI.  However, a full specification of the
       necessary procedures is outside the scope of this document.

     - RSVP-TE

       A PMSI may be instantiated as one or more RSVP-TE Point-to-
       Multipoint (P2MP) LSPs.  An S-PMSI or a UI-PMSI would be
       instantiated as a single RSVP-TE P2MP LSP, whereas a
       Multidirectional Inclusive PMSI would be instantiated as a set of
       such LSPs, one for each PE in the MVPN.  RSVP-TE P2MP LSPs can be
       shared across multiple MVPNs.

     - A Mesh of Unicast P-Tunnels.

       If a PMSI is implemented as a mesh of unicast P-tunnels, a PE
       wishing to transmit a packet through the PMSI would replicate the
       packet and send a copy to each of the other PEs.

       An MI-PMSI for a given MVPN can be instantiated as a full mesh of
       unicast P-tunnels among that MVPN's PEs.  A UI-PMSI or an S-PMSI
       can be instantiated as a partial mesh.
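
   Two of the instantiations listed above lend themselves to a short
   sketch: an MI-PMSI built from one P2MP LSP rooted at each PE, and a
   PMSI built from unicast P-tunnels, where the sending PE replicates
   the packet.  The function names and the use of strings for PEs are
   assumptions of this example, which models only the fan-out and not
   any signaling.

```python
from typing import AbstractSet, FrozenSet, List, Tuple


def p2mp_lsps_for_mi_pmsi(
        mvpn_pes: AbstractSet[str]) -> List[Tuple[str, FrozenSet[str]]]:
    """One P2MP LSP per PE: each PE is the root, all other PEs are leaves."""
    return [(root, frozenset(set(mvpn_pes) - {root}))
            for root in sorted(mvpn_pes)]


def ingress_replicate(packet: bytes,
                      sender: str,
                      pmsi_pes: AbstractSet[str]) -> List[Tuple[str, bytes]]:
    """Mesh of unicast P-tunnels: one copy of the packet per other PE."""
    return [(pe, packet) for pe in sorted(pmsi_pes) if pe != sender]
```

   With three PEs, the first function yields three LSPs, and the second
   yields two copies of each packet that a PE transmits.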

   It can be seen that each method of implementing PMSIs has its own
   area of applicability.  Therefore, this specification allows for the
   use of any of these methods.  At first glance, this may seem like an
   overabundance of options.  However, the history of multicast
   development and deployment should make it clear that there is no one
   option that is always acceptable.  The use of segmented inter-AS
   trees does allow each SP to select the option that it finds most
   applicable in its own environment, without causing any other SP to
   choose that same option.

   SPECIFYING THE CONDITIONS UNDER WHICH A PARTICULAR TREE-BUILDING
   METHOD IS APPLICABLE IS OUTSIDE THE SCOPE OF THIS DOCUMENT.

   The choice of the tunnel technique belongs to the sender router and
   is a local policy decision of that router.  The procedures defined
   throughout this document do not mandate that the same tunnel
   technique be used for all P-tunnels going through a given provider
   backbone.  However, it is expected that any tunnel technique that can
   be used by a PE for a particular MVPN is also supported by all the
   other PEs having VRFs for the MVPN.  Moreover, the use of ingress
   replication by any PE for an MVPN implies that all other PEs MUST use
   ingress replication for this MVPN.
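
   The ingress replication constraint stated above can be checked
   mechanically.  In the sketch below, the mapping from PE to tunnel
   technique and the string "ingress-replication" are hypothetical
   representations chosen for the example.

```python
from typing import Mapping


def ingress_replication_consistent(technique_by_pe: Mapping[str, str]) -> bool:
    """True unless some PEs of the MVPN use ingress replication and others
    use a different tunnel technique."""
    techniques = set(technique_by_pe.values())
    if "ingress-replication" not in techniques:
        return True
    return techniques == {"ingress-replication"}
```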

3.3.  Use of PMSIs for Carrying Multicast Data

   Each PE supporting a particular MVPN must have a way of discovering
   the following information:

     - The set of other PEs in its AS that are attached to sites of that
       MVPN, and the set of other ASes that have PEs attached to sites
       of that MVPN.  However, if non-segmented inter-AS trees are used
       (see Section 8.1), then each PE needs to know the entire set of
       PEs attached to sites of that MVPN.

     - If segmented inter-AS trees are to be used, the set of border
       routers in its AS that support inter-AS connectivity for that
       MVPN.

     - If the MVPN is configured to use an MI-PMSI, the information
       needed to set up and to use the P-tunnels instantiating the
       MI-PMSI.

     - For each other PE, whether the PE supports Aggregate Trees for
       the MVPN, and if so, the demultiplexing information that must be
       provided so that the other PE can determine whether a packet that
       it received on an Aggregate Tree belongs to this MVPN.

   In some cases, the information above is provided by means of the BGP-
   based auto-discovery procedures discussed in Section 4 of this
   document and in Section 9 of [MVPN-BGP].  In other cases, this
   information is provided after discovery is complete, by means of
   procedures discussed in Section 7.4.  In either case, the information
   that is provided must be sufficient to enable the PMSI to be bound to
   the identified P-tunnel, to enable the P-tunnel to be created if it
   does not already exist, and to enable the different PMSIs that may
   travel on the same P-tunnel to be properly demultiplexed.

   If an MVPN uses an MI-PMSI, then the information needed to identify
   the P-tunnels that instantiate the MI-PMSI has to be known to the PEs
   attached to the MVPN before any data can be transmitted on the
   MI-PMSI.  This information is either statically configured or auto-
   discovered (see Section 4).  The actual process of constructing the
   P-tunnels (e.g., via PIM, RSVP-TE, or mLDP) SHOULD occur as soon as
   this information is known.

   When MI-PMSIs are used, they may serve as the default method of
   carrying C-multicast data traffic.  When we say that an MI-PMSI is
   the "default" method of carrying C-multicast data traffic for a
   particular MVPN, we mean that it is not necessary to use any special
   control procedures to bind a particular C-flow to the MI-PMSI; any
   C-flows that have not been bound to other PMSIs will be assumed to
   travel through the MI-PMSI.
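
   The "default" behavior amounts to a lookup with a fallback.  The
   sketch below assumes that explicit bindings are held in a dictionary
   keyed by (C-S, C-G) and that PMSIs are identified by strings; both
   are conventions of the example only.

```python
from typing import Mapping, Optional, Tuple

CFlowKey = Tuple[str, str]  # (C-S, C-G)


def pmsi_for_flow(flow: CFlowKey,
                  bindings: Mapping[CFlowKey, str],
                  default_mi_pmsi: Optional[str] = None) -> Optional[str]:
    """Return the PMSI explicitly bound to the C-flow; if there is no such
    binding, fall back to the MI-PMSI when one serves as the default."""
    return bindings.get(flow, default_mi_pmsi)
```

   When no MI-PMSI is used as the default, the fallback is absent and
   every C-flow needs an explicit binding to a UI-PMSI or an S-PMSI.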

   There is no requirement to use MI-PMSIs as the default method of
   carrying C-flows.  It is possible to adopt a policy in which all
   C-flows are carried on UI-PMSIs or S-PMSIs.  In this case, if an
   MI-PMSI is not used for carrying routing information, it is not
   needed at all.

   Even when an MI-PMSI is used as the default method of carrying an
   MVPN's C-flows, if a particular C-flow has certain characteristics,
   it may be desirable to migrate it from the MI-PMSI to an S-PMSI.
   These characteristics, as well as the procedures for migrating a
   C-flow from an MI-PMSI to an S-PMSI, are discussed in Section 7.

   Sometimes a set of C-flows are traveling the same, shared, C-tree
   (e.g., either unidirectional or bidirectional), and it may be
   desirable to move the whole set of C-flows as a unit to an S-PMSI.
   Procedures for doing this are outside the scope of this
   specification.

   Some of the procedures for transmitting C-multicast routing
   information among the PEs require that the routing information be
   sent over an MI-PMSI.  Other procedures do not use an MI-PMSI to
   transmit the C-multicast routing information.

   For a given MVPN, whether an MI-PMSI is used to carry C-multicast
   routing information is independent of whether an MI-PMSI is used as
   the default method of carrying the C-multicast data traffic.

   As previously stated, it is possible to send all C-flows on a set of
   S-PMSIs, omitting any usage of I-PMSIs.  This prevents PEs from
   receiving data that they don't need, at the cost of requiring
   additional P-tunnels, and additional signaling to bind the C-flows to
   P-tunnels.  Cost-effective instantiation of S-PMSIs is likely to
   require Aggregate P-trees, which, in turn, makes it necessary for the
   transmitting PE to know which PEs need to receive which multicast
   streams.  This is known as "explicit tracking", and the procedures to
   enable explicit tracking may themselves impose a cost.  This is
   further discussed in Section 7.4.1.2.

3.4.  PE-PE Transmission of C-Multicast Routing

   As a PE attached to a given MVPN receives C-Join/Prune messages from
   its CEs in that MVPN, it must convey the information contained in
   those messages to other PEs that are attached to the same MVPN.

   There are several different methods for doing this.  As these methods
   are not interoperable, the method to be used for a particular MVPN
   must be either configured or discovered as part of the auto-discovery
   process.

3.4.1.  PIM Peering

3.4.1.1.  Full per-MVPN PIM Peering across an MI-PMSI

   If the set of PEs attached to a given MVPN are connected via an
   MI-PMSI, the PEs can form "normal" PIM adjacencies with each other.
   Since the MI-PMSI functions as a broadcast network, the standard PIM
   procedures for forming and maintaining adjacencies over a LAN can be
   applied.

   As a result, the C-Join/Prune messages that a PE receives from a CE
   can be multicast to all the other PEs of the MVPN.  PIM "Join
   suppression" can be enabled and the PEs can send Asserts as needed.

   This procedure is fully specified in Section 5.2.

3.4.1.2.  Lightweight PIM Peering across an MI-PMSI

   The procedure of the previous section has the following
   disadvantages:

     - Periodic Hello messages must be sent by all PEs.

       Standard PIM procedures require that each PE in a particular MVPN
       periodically multicast a Hello to all the other PEs in that MVPN.
       If the number of MVPNs becomes very large, sending and receiving
       these Hellos can become a substantial overhead for the PE
       routers.

     - Periodic retransmission of C-Join/Prune messages.

       PIM is a "soft-state" protocol, in which reliability is assured
       through frequent retransmissions (refresh) of control messages.
       This too can begin to impose a large overhead on the PE routers
       as the number of MVPNs grows.

   The first of these disadvantages is easily remedied.  The reason for
   the periodic PIM Hellos is to ensure that each PIM speaker on a LAN
   knows who all the other PIM speakers on the LAN are.  However, in the
   context of MVPN, PEs in a given MVPN can learn the identities of all
   the other PEs in the MVPN by means of the BGP-based auto-discovery
   procedure of Section 4.  In that case, the periodic Hellos would
   serve no function and could simply be eliminated.  (Of course, this
   does imply a change to the standard PIM procedures.)

   When Hellos are suppressed, we may speak of "lightweight PIM
   peering".

   The periodic refresh of the C-Join/Prune messages is not as simple to
   eliminate.  If and when "refresh reduction" procedures are specified
   for PIM, it may be useful to incorporate them, so as to make the
   lightweight PIM peering procedures even more lightweight.

   Lightweight PIM peering is not specified in this document.

3.4.1.3.  Unicasting of PIM C-Join/Prune Messages

   PIM does not require that the C-Join/Prune messages that a PE
   receives from a CE be multicast to all the other PEs; it allows
   them to be unicast to a single PE, the one that is upstream on the
   path to the root of the multicast tree mentioned in the Join/Prune
   message.  Note that when the C-Join/Prune messages are unicast, there
   is no such thing as "Join suppression".  Therefore, PIM Refresh
   Reduction may be considered to be a prerequisite for the procedure of
   unicasting the C-Join/Prune messages.

   When the C-Join/Prune messages are unicast, they are not transmitted
   on a PMSI at all.  Note that the procedure of unicasting the
   C-Join/Prune messages is different from the procedure of transmitting
   the C-Join/Prune messages on an MI-PMSI that is instantiated as a
   mesh of unicast P-tunnels.

   If there are multiple PEs that can be used to reach a given C-source,
   procedures described in Sections 5.1 and 9 MUST be used to ensure
   that duplicate packets do not get delivered.

   Procedures for unicasting the PIM control messages are not further
   specified in this document.

3.4.2.  Using BGP to Carry C-Multicast Routing

   It is possible to use BGP to carry C-multicast routing information
   from PE to PE, dispensing entirely with the transmission of
   C-Join/Prune messages from PE to PE.  This is discussed in Section
   5.3 and fully specified in [MVPN-BGP].

4.  BGP-Based Auto-Discovery of MVPN Membership

   BGP-based auto-discovery is done by means of a new address family,
   the MCAST-VPN address family.  (This address family also has other
   uses, as will be seen later.)  Any PE that attaches to an MVPN must
   issue a BGP Update message containing an NLRI ("Network Layer
   Reachability Information" element) in this address family, along with
   a specific set of attributes.  In this document, we specify the
   information that must be contained in these BGP Updates in order to
   provide auto-discovery.  The encoding details, along with the
   complete set of detailed procedures, are specified in a separate
   document [MVPN-BGP].

   This section specifies the intra-AS BGP-based auto-discovery
   procedures.  When segmented inter-AS trees are used, additional
   procedures are needed, as specified in [MVPN-BGP].  (When segmented
   inter-AS trees are not used, the inter-AS procedures are almost
   identical to the intra-AS procedures.)

   BGP-based auto-discovery uses a particular kind of MCAST-VPN route
   known as an "auto-discovery route", or "A-D route".  In particular,
   it uses two kinds of "A-D routes": the "Intra-AS I-PMSI A-D route"
   and the "Inter-AS I-PMSI A-D route".  (There are also additional
   kinds of A-D routes, such as the Source Active A-D routes, which are
   used for purposes that go beyond auto-discovery.  These are discussed
   in subsequent sections.)

   The Inter-AS I-PMSI A-D route is used only when segmented inter-AS
   P-tunnels are used, as specified in [MVPN-BGP].

   The "Intra-AS I-PMSI A-D route" is originated by the PEs that are
   (directly) connected to the site(s) of an MVPN.  It is distributed to
   other PEs that attach to sites of the MVPN.  If segmented inter-AS
   P-tunnels are used, then the Intra-AS I-PMSI A-D routes are not
   distributed outside the AS where they originate; if segmented inter-
   AS P-tunnels are not used, then the Intra-AS I-PMSI A-D routes are,
   despite their name, distributed to all PEs attached to the VPN, no
   matter what AS the PEs are in.

   The NLRI of an Intra-AS I-PMSI A-D route must contain the following
   information:

     - The route type (i.e., Intra-AS I-PMSI A-D route).

     - The IP address of the originating PE.

     - An RD ("Route Distinguisher", [RFC4364]) configured locally for
       the MVPN.  This is an RD that can be prepended to that IP address
       to form a globally unique VPN-IP address of the PE.
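
   The three items above can be pictured as a record.  The sketch below
   is not the wire encoding, which is specified in [MVPN-BGP]; the
   class name, the field names, and the textual "RD:address" rendering
   are assumptions made only to show how the RD makes the PE's address
   globally unique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntraAsIPmsiAdNlri:
    """Abstract view of the NLRI contents listed above (not the encoding)."""
    route_type: str       # identifies the route as an Intra-AS I-PMSI A-D route
    rd: str               # Route Distinguisher configured locally for the MVPN
    originating_pe: str   # IP address of the originating PE

    def vpn_ip_address(self) -> str:
        """Illustrative rendering of the RD prepended to the PE address."""
        return f"{self.rd}:{self.originating_pe}"
```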

   Intra-AS I-PMSI A-D routes carry the following attributes:

     - Route Target Extended Communities attribute.

       One or more of these MUST be carried by each Intra-AS I-PMSI A-D
       route.  If any other PE has one of these Route Targets configured
       for import into a VRF, it treats the advertising PE as a member
       of the MVPN to which the VRF belongs.  This allows each PE to
       discover the PEs that belong to a given MVPN.  More specifically,
       it allows a PE in the Receiver Sites set to discover the PEs in
       the Sender Sites set of the MVPN, and the PEs in the Sender Sites
       set of the MVPN to discover the PEs in the Receiver Sites set of
       the MVPN.  The PEs in the Receiver Sites set would be configured
       to import the Route Targets advertised in the BGP A-D routes by
       PEs in the Sender Sites set.  The PEs in the Sender Sites set
       would be configured to import the Route Targets advertised in the
       BGP A-D routes by PEs in the Receiver Sites set.

     - PMSI Tunnel attribute.

       This attribute is present whenever the MVPN uses an MI-PMSI or
       when it uses a UI-PMSI rooted at the originating router.  It
       contains the following information:

         * tunnel technology, which may be one of the following:

             + Bidirectional multicast tree created by BIDIR-PIM,

             + Source-specific multicast tree created by PIM-SM,
               supporting the SSM service model,

             + Set of trees (one shared tree and a set of source trees)
               created by PIM-SM using the ASM service model,

             + Point-to-multipoint LSP created by RSVP-TE,

             + Point-to-multipoint LSP created by mLDP,

             + Multipoint-to-multipoint LSP created by mLDP,

             + Unicast tunnel.

         * P-tunnel identifier

           Before a P-tunnel can be constructed to instantiate the
           I-PMSI, the PE must be able to create a unique identifier for
           the tunnel.  The syntax of this identifier depends on the
           tunnel technology used.

           Each PE attaching to a given MVPN must be configured with
           information specifying the allowable encapsulations to use
           for that MVPN, as well as the particular one of those
           encapsulations that the PE is to identify in the PMSI Tunnel
           attribute of the Intra-AS I-PMSI A-D routes that it
           originates.

         * Multi-VPN aggregation capability and demultiplexor value.

           This specifies whether the P-tunnel is capable of aggregating
           I-PMSIs from multiple MVPNs.  This will affect the
           encapsulation used.  If aggregation is to be used, a
           demultiplexor value to be carried by packets for this
           particular MVPN must also be specified.  The demultiplexing
           mechanism and signaling procedures are described in Section
           6.

     - PE Distinguisher Labels Attribute

       Sometimes it is necessary for one PE to advertise an upstream-
       assigned MPLS label that identifies another PE.  Under certain
       circumstances to be discussed later, a PE that is the root of a
       multicast P-tunnel will bind an MPLS label value to one or more
       of the PEs that belong to the P-tunnel, and it will distribute
       these label bindings using Intra-AS I-PMSI A-D routes.

       Specification of when this must be done is provided in Sections
       6.4.4 and 11.2.2.  We refer to the labels distributed in this way
       as "PE Distinguisher Labels".

       Note that, as specified in [MPLS-UPSTREAM-LABEL], PE
       Distinguisher Label values are unique only in the context of the
       IP address identifying the root of the P-tunnel; they are not
       necessarily unique per tunnel.
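
   The way Route Targets drive membership discovery, and the contents
   of the PMSI Tunnel attribute, can be sketched as follows.  All
   class, field, and method names are hypothetical, Route Targets and
   tunnel identifiers are shown as plain strings, and nothing here
   reflects the BGP encoding defined in [MVPN-BGP].

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, Optional, Set


class TunnelTechnology(Enum):
    BIDIR_PIM_TREE = "bidirectional tree created by BIDIR-PIM"
    PIM_SSM_TREE = "source-specific tree created by PIM-SM (SSM)"
    PIM_SM_ASM_TREES = "shared tree plus source trees created by PIM-SM (ASM)"
    RSVP_TE_P2MP_LSP = "P2MP LSP created by RSVP-TE"
    MLDP_P2MP_LSP = "P2MP LSP created by mLDP"
    MLDP_MP2MP_LSP = "MP2MP LSP created by mLDP"
    UNICAST_TUNNEL = "unicast tunnel"


@dataclass(frozen=True)
class PmsiTunnel:
    technology: TunnelTechnology
    tunnel_id: str                     # syntax depends on the technology
    demux_value: Optional[int] = None  # present only if the tunnel aggregates


@dataclass(frozen=True)
class AdRoute:
    originating_pe: str
    route_targets: FrozenSet[str]
    pmsi_tunnel: Optional[PmsiTunnel] = None


@dataclass
class Vrf:
    import_rts: FrozenSet[str]
    mvpn_members: Set[str] = field(default_factory=set)

    def receive(self, route: AdRoute) -> bool:
        """Treat the advertising PE as a member of this VRF's MVPN if the
        route carries at least one Route Target configured for import."""
        if self.import_rts & route.route_targets:
            self.mvpn_members.add(route.originating_pe)
            return True
        return False
```

   In this model, a PE in the Receiver Sites set would configure its
   VRF to import the Route Targets advertised by PEs in the Sender
   Sites set, and vice versa.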

5.  PE-PE Transmission of C-Multicast Routing

   As a PE attached to a given MVPN receives C-Join/Prune messages from
   its CEs in that MVPN, it must convey the information contained in
   those messages to other PEs that are attached to the same MVPN.  This
   is known as the "PE-PE transmission of C-multicast routing
   information".

   This section specifies the procedures used for PE-PE transmission of
   C-multicast routing information.  Not every procedure mentioned in
   Section 3.4 is specified here.  Rather, this section focuses on two
   particular procedures:

     - Full PIM Peering.

       This procedure is fully specified herein.

     - Use of BGP to distribute C-multicast routing

       This procedure is described herein, but the full specification
       appears in [MVPN-BGP].

   Those aspects of the procedures that apply to both of the above are
   also specified fully herein.

   Specification of other procedures is outside the scope of this
   document.

5.1.  Selecting the Upstream Multicast Hop (UMH)

   When a PE receives a C-Join/Prune message from a CE, the message
   identifies a particular multicast flow as belonging either to a
   source-specific tree (S,G) or to a shared tree (*,G).  Throughout
   this section, we use the term "C-root" to refer to S, in the case of
   a source-specific tree, or to the Rendezvous Point (RP) for G, in the
   case of (*,G).  If the route to the C-root is across the VPN
   backbone, then the PE needs to find the "Upstream Multicast Hop"
   (UMH) for the (S,G) or (*,G) flow.  The UMH is either the PE at which
   (S,G) or (*,G) data packets enter the VPN backbone or the Autonomous
   System Border Router (ASBR) at which those data packets enter the
   local AS when traveling through the VPN backbone.  The process of
   finding the upstream multicast hop for a given C-root is known as
   "upstream multicast hop selection".

5.1.1.  Eligible Routes for UMH Selection

   In the simplest case, the PE does the upstream hop selection by
   looking up the C-root in the unicast VRF associated with the PE-CE
   interface over which the C-Join/Prune message was received.  The
   route that matches the C-root will contain the information needed to
   select the UMH.
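
   The simplest case can be sketched in two steps: determine the C-root
   (S for a source-specific tree, the RP for G for a shared tree), then
   find the best-matching route for it.  The sketch assumes that the
   VRF is a dictionary keyed by prefix strings and that "best match"
   means longest-prefix match; the function names and the RP mapping
   are illustrative.

```python
import ipaddress
from typing import Mapping, Optional


def c_root(group: str,
           source: Optional[str] = None,
           rp_for_group: Optional[Mapping[str, str]] = None) -> str:
    """S for an (S,G) tree; the Rendezvous Point for G for a (*,G) tree."""
    if source is not None:
        return source
    if rp_for_group is None or group not in rp_for_group:
        raise LookupError(f"no RP known for group {group}")
    return rp_for_group[group]


def best_match(vrf_routes: Mapping[str, object], address: str) -> Optional[str]:
    """Return the longest prefix in the VRF that contains the address."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix in vrf_routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return str(best) if best is not None else None
```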

   However, in some cases, the CEs may be distributing to the PEs a
   special set of routes that are to be used exclusively for the purpose
   of upstream multicast hop selection, and not used for unicast routing
   at all.  For example, when BGP is the CE-PE unicast routing protocol,
   the CEs may be using Subsequent Address Family Identifier 2 (SAFI 2)
   to distribute a special set of routes that are to be used for, and
   only for, upstream multicast hop selection.  When OSPF [OSPF] is the
   CE-PE routing protocol, the CE may use an MT-ID (Multi-Topology
   Identifier) [OSPF-MT] of 1 to distribute a special set of routes that
   are to be used for, and only for, upstream multicast hop selection.
   When a CE uses one of these mechanisms to distribute to a PE a
   special set of routes to be used exclusively for upstream multicast
   hop selection, these routes are distributed among the PEs using SAFI
   129, as described in [MVPN-BGP].  Whether the routes used for
   upstream multicast hop selection are (a) the "ordinary" unicast
   routes or (b) a special set of routes that are used exclusively for
   upstream multicast hop selection is a matter of policy.  How that
   policy is chosen, deployed, or implemented is outside the scope of
   this document.  In the following, we will simply refer to the set of
   routes that are used for upstream multicast hop selection, the
   "Eligible UMH routes", with no presumptions about the policy by which
   this set of routes was chosen.

5.1.2.  Information Carried by Eligible UMH Routes

   Every route that is eligible for UMH selection SHOULD carry a VRF
   Route Import Extended Community [MVPN-BGP].  However, if BGP is used
   to distribute C-multicast routing information, or if the route is
   from a VRF that belongs to a multi-AS VPN as described in option b of
   Section 10 of [RFC4364], then the route MUST carry a VRF Route Import
   Extended Community.  This attribute identifies the PE that originated
   the route.

   If BGP is used for carrying C-multicast routes, OR if "Segmented
   inter-AS Tunnels" are used, then every UMH route MUST also carry a
   Source AS Extended Community [MVPN-BGP].

   These two attributes are used in the upstream multicast hop selection
   procedures described below.
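
   As an illustration only, the two requirements above can be expressed
   as a small check.  The sketch below is not part of this
   specification; the names UmhRouteAttrs and missing_mandatory_attrs
   and the boolean parameters are hypothetical, and the SHOULD-level
   requirement is deliberately not reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UmhRouteAttrs:
    """Hypothetical summary of the attributes on an Eligible UMH route."""
    has_vrf_route_import: bool  # VRF Route Import Extended Community
    has_source_as: bool         # Source AS Extended Community


def missing_mandatory_attrs(route, bgp_c_multicast, inter_as_option_b,
                            segmented_inter_as):
    """Return the attributes that MUST be present but are absent."""
    missing = []
    # VRF Route Import is a MUST when BGP distributes C-multicast routing
    # or the VRF belongs to a multi-AS VPN using option b.
    if (bgp_c_multicast or inter_as_option_b) \
            and not route.has_vrf_route_import:
        missing.append("VRF Route Import")
    # Source AS is a MUST when BGP carries C-multicast routes or
    # segmented inter-AS tunnels are used.
    if (bgp_c_multicast or segmented_inter_as) and not route.has_source_as:
        missing.append("Source AS")
    return missing
```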

5.1.3.  Selecting the Upstream PE

   The first step in selecting the upstream multicast hop for a given
   C-root is to select the Upstream PE router for that C-root.

   The PE that received the C-Join message from a CE looks in the VRF
   corresponding to the interface over which the C-Join was received.
   It finds the Eligible UMH route that is the best match for the C-root
   specified in that C-Join.  Call this the "Installed UMH Route".

   Note that the outgoing interface of the Installed UMH Route may be
   one of the interfaces associated with the VRF, in which case the
   upstream multicast hop is a CE and the route to the C-root is not
   across the VPN backbone.

   Consider the set of all VPN-IP routes that (a) are eligible to be
   imported into the VRF (as determined by their Route Targets), (b) are
   eligible to be used for upstream multicast hop selection, and (c)
   have exactly the same IP prefix (not necessarily the same RD) as the
   installed UMH route.

   For each route in this set, determine the corresponding Upstream PE
   and Upstream RD.  If a route has a VRF Route Import Extended
   Community, the route's Upstream PE is determined from it.  If a route
   does not have a VRF Route Import Extended Community, the route's
   Upstream PE is determined from the route's BGP Next Hop.  In either
   case, the Upstream RD is taken from the route's NLRI.

   This results in a set of triples of <route, Upstream PE, Upstream
   RD>.

   Call this the "UMH Route Candidate Set".  Then, the PE MUST select a
   single route from the set to be the "Selected UMH Route".  The
   corresponding Upstream PE is known as the "Selected Upstream PE", and
   the corresponding Upstream RD is known as the "Selected Upstream RD".
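
   The construction of the UMH Route Candidate Set can be sketched as
   follows.  This is a minimal illustration, not a normative procedure:
   the VpnIpRoute record and its field names are hypothetical, and the
   Route Target and eligibility checks are assumed to have been
   evaluated already and stored as booleans.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VpnIpRoute:
    """Hypothetical, simplified view of a received VPN-IP route."""
    rd: str                         # Route Distinguisher from the NLRI
    prefix: str                     # IP prefix from the NLRI
    next_hop: str                   # BGP Next Hop
    route_import_pe: Optional[str]  # PE named by VRF Route Import EC
    importable: bool                # Route Targets permit import
    umh_eligible: bool              # route is an Eligible UMH route


def umh_route_candidate_set(routes, installed_prefix):
    """Return the <route, Upstream PE, Upstream RD> triples."""
    wanted = ipaddress.ip_network(installed_prefix)
    triples = []
    for route in routes:
        # Conditions (a) and (b): importable and eligible for UMH use.
        if not (route.importable and route.umh_eligible):
            continue
        # Condition (c): exactly the same IP prefix; the RD may differ.
        if ipaddress.ip_network(route.prefix) != wanted:
            continue
        # Upstream PE comes from the VRF Route Import EC if present,
        # otherwise from the BGP Next Hop.
        if route.route_import_pe is not None:
            upstream_pe = route.route_import_pe
        else:
            upstream_pe = route.next_hop
        # The Upstream RD is always taken from the NLRI.
        triples.append((route, upstream_pe, route.rd))
    return triples
```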

   There are several possible procedures that can be used by a PE to
   select a single route from the candidate set.

   The default procedure, which MUST be implemented, is to select the
   route whose corresponding Upstream PE address is numerically highest,
   where a 32-bit IP address is treated as a 32-bit unsigned integer.
   Call this the "default Upstream PE selection".  For a given C-root,
   provided that the routing information used to create the candidate
   set is stable, all PEs will have the same default Upstream PE
   selection.  (Different default Upstream PE selections may, however,
   be chosen during a routing transient.)
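
   A minimal sketch of the default procedure follows, assuming IPv4
   Upstream PE addresses in dotted-quad form and a candidate set held as
   (route, Upstream PE, Upstream RD) tuples.  The function name is
   hypothetical.  Note that the comparison is on the 32-bit integer
   value, not on the text of the address.

```python
import ipaddress


def default_upstream_pe_selection(candidates):
    """Pick the triple whose Upstream PE address is numerically highest.

    `candidates` is a non-empty iterable of
    (route, upstream_pe, upstream_rd) triples.
    """
    # Treat each 32-bit IPv4 address as an unsigned integer.
    return max(candidates,
               key=lambda t: int(ipaddress.IPv4Address(t[1])))
```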

   An alternative procedure that MUST be implemented, but which is
   disabled by default, is the following.  This procedure ensures that,
   except during a routing transient, each PE chooses the same Upstream
   PE for a given combination of C-root and C-G.

      1. The PEs in the candidate set are numbered from lowest to
         highest IP address, starting from 0.

      2. The following hash is performed:

           - A bytewise exclusive-or of all the bytes in the C-root
             address and the C-G address is performed.

           - The result is taken modulo n, where n is the number of PEs
             in the candidate set.  Call this result N.

   The Selected Upstream PE is then the one that appears in position N
   in the list of step 1.

   Other hashing algorithms are allowed as well, but not required.
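
   The hash-based procedure can be sketched as shown below.  This is an
   illustration under stated assumptions: the function name is
   hypothetical, the candidate PEs are assumed to be distinct addresses
   of a single address family, and duplicates (several routes naming the
   same PE) are collapsed before numbering.

```python
import ipaddress
from functools import reduce


def hashed_upstream_pe(candidate_pes, c_root, c_group):
    """Select an Upstream PE from the bytes of C-root and C-G."""
    # Step 1: number the PEs from lowest to highest address, from 0.
    ordered = sorted({ipaddress.ip_address(pe) for pe in candidate_pes})
    # Step 2: bytewise exclusive-or of all bytes of both addresses ...
    data = (ipaddress.ip_address(c_root).packed
            + ipaddress.ip_address(c_group).packed)
    xor = reduce(lambda a, b: a ^ b, data, 0)
    # ... taken modulo n, the number of PEs, to give position N.
    return str(ordered[xor % len(ordered)])
```

   With candidate PEs 192.0.2.1 and 192.0.2.2 and C-root 10.1.1.1, this
   sketch selects 192.0.2.1 for C-G 232.1.1.1 and 192.0.2.2 for C-G
   232.1.1.2, so the two flows enter the backbone at different PEs.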

   The alternative procedure allows a form of "equal cost load
   balancing".  Suppose, for example, that from egress PEs PE3 and PE4,
   source C-S can be reached, at equal cost, via ingress PE PE1 or
   ingress PE PE2.  The load balancing procedure makes it possible for
   PE1 to be the ingress PE for (C-S,C-G1) data traffic while PE2 is the
   ingress PE for (C-S,C-G2) data traffic.

   Another procedure, which SHOULD be implemented, is to use the
   Installed UMH Route as the Selected UMH Route.  If this procedure is
   used, the result is likely to be that a given PE will choose the
   Upstream PE that is closest to it, according to the routing in the SP
   backbone.  As a result, for a given C-root, different PEs may choose
   different Upstream PEs.  This is useful if the C-root is an anycast
   address, and can also be useful if the C-root is in a multihomed site
   (i.e., a site that is attached to multiple PEs).  However, this
   procedure is more likely to lead to steady state duplication of
   traffic unless (a) PEs discard data traffic that arrives from the
   "wrong" Upstream PE or (b) data traffic is carried only in non-
   aggregated S-PMSIs.  This issue is discussed at length in Section 9.

   General policy-based procedures for selecting the UMH route are
   allowed but not required, and they are not further discussed in this
   specification.

5.1.4.  Selecting the Upstream Multicast Hop

   In certain cases, the Selected Upstream Multicast Hop is the same as
   the Selected Upstream PE.  In other cases, the Selected Upstream
   Multicast Hop is the ASBR that is the BGP Next Hop of the Selected
   UMH Route.

   If the Selected Upstream PE is in the local AS, then the Selected
   Upstream PE is also the Selected Upstream Multicast Hop.  This is the
   case if any of the following conditions holds:

     - The Selected UMH Route has a Source AS Extended Community, and
       the Source AS is the same as the local AS,

     - The Selected UMH Route does not have a Source AS Extended
       Community, but the route's BGP Next Hop is the same as the
       Upstream PE.

   Otherwise, the Selected Upstream Multicast Hop is an ASBR.  The
   method of determining just which ASBR it is depends on the particular
   inter-AS signaling method being used (PIM or BGP) and on whether
   segmented or non-segmented inter-AS tunnels are used.  These details
   are presented in later sections.
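
   The test for whether the Selected Upstream PE is also the Selected
   Upstream Multicast Hop can be sketched as follows.  The function and
   parameter names are hypothetical; addresses are assumed to be given
   in a normalized textual form so that string equality is meaningful,
   and the choice of ASBR in the remaining case is not modeled.

```python
from typing import Optional


def upstream_pe_is_umh(source_as: Optional[int], local_as: int,
                       bgp_next_hop: str, upstream_pe: str) -> bool:
    """Return True if the Selected Upstream PE is in the local AS.

    `source_as` is the AS number from the Source AS Extended Community
    of the Selected UMH Route, or None if the route does not carry one.
    """
    if source_as is not None:
        # First condition: Source AS is the same as the local AS.
        return source_as == local_as
    # Second condition: no Source AS Extended Community, but the BGP
    # Next Hop is the same as the Upstream PE.
    return bgp_next_hop == upstream_pe
```

   When the function returns False, the Selected Upstream Multicast Hop
   is an ASBR.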

5.2.  Details of Per-MVPN Full PIM Peering over MI-PMSI

   When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat
   the MI-PMSI as a LAN interface and form full PIM adjacencies with
   each other over that LAN interface.

   The use of PIM when an MI-PMSI is not in use is outside the scope of
   this document.

   To form full PIM adjacencies, the PEs execute the standard PIM
   procedures on the LAN interface, including the generation and
   processing of PIM Hello, Join/Prune, Assert, DF (Designated
   Forwarder) election, and other PIM control messages.  These are
   executed independently for each C-instance.  PIM "Join suppression"
   SHOULD be enabled.

5.2.1.  PIM C-Instance Control Packets

   All IPv4 PIM C-instance control packets of a particular MVPN are
   addressed to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address
   and transmitted over the MI-PMSI of that MVPN.  While in transit in
   the P-network, the packets are encapsulated as required for the
   particular kind of P-tunnel that is being used to instantiate the
   MI-PMSI.  Thus, the C-instance control packets are not processed by
   the P routers, and MVPN-specific PIM routes can be extended from site
   to site without appearing in the P routers.

   The handling of IPv6 PIM C-instance control packets will be specified
   in a follow-on document.

   As specified in Section 5.1.2, when PIM is being used to distribute
   C-multicast routing information, any PE distributing VPN-IP routes
   that are eligible for use as UMH routes SHOULD include a VRF Route
   Import Extended Community with each route.  For a given VRF, the
   Global Administrator field of the VRF Route Import Extended Community
   MUST be set to the same IP address that the PE places in the IP
   source address field of the PE-PE PIM control messages it originates
   from that VRF.

   Note that BSR (Bootstrap Router Mechanism for PIM) [BSR] messages are
   treated the same as PIM C-instance control packets, and BSR
   processing is regarded as an integral part of the PIM C-instance
   processing.

5.2.2.  PIM C-Instance Reverse Path Forwarding (RPF) Determination

   Although the MI-PMSI is treated by PIM as a LAN interface, unicast
   routing is NOT run over it, and there are no unicast routing
   adjacencies over it.  Therefore, it is necessary to specify special
   procedures for determining when the MI-PMSI is to be regarded as the
   "RPF Interface" for a particular C-address.

   The PE follows the procedures of Section 5.1 to determine the
   Selected UMH Route.  If that route is NOT a VPN-IP route learned from
   BGP as described in [RFC4364], or if that route's outgoing interface
   is one of the interfaces associated with the VRF, then ordinary PIM
   procedures for determining the RPF interface apply.

   However, if the Selected UMH Route is a VPN-IP route whose outgoing
   interface is not one of the interfaces associated with the VRF, then
   PIM will consider the RPF interface to be the MI-PMSI associated with
   the VPN-specific PIM instance.

   Once PIM has determined that the RPF interface for a particular
   C-root is the MI-PMSI, it is necessary for PIM to determine the "RPF
   neighbor" for that C-root.  This will be one of the other PEs that is
   a PIM adjacency over the MI-PMSI.  In particular, it will be the
   "Selected Upstream PE", as defined in Section 5.1.

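   The RPF determination described in this section can be sketched as
   follows.  This is an illustration only: the SelectedUmhRoute record,
   the MI_PMSI label, and the ordinary_rpf callable (standing in for the
   ordinary PIM procedures) are all hypothetical.

```python
from dataclasses import dataclass

MI_PMSI = "MI-PMSI"  # hypothetical label for the MI-PMSI interface


@dataclass(frozen=True)
class SelectedUmhRoute:
    """Hypothetical, simplified view of the Selected UMH Route."""
    is_vpn_ip: bool          # VPN-IP route learned from BGP [RFC4364]
    outgoing_interface: str  # name of the route's outgoing interface


def rpf_lookup(route, vrf_interfaces, selected_upstream_pe, ordinary_rpf):
    """Return (RPF interface, RPF neighbor) for a C-root.

    `ordinary_rpf` is a caller-supplied callable that applies the
    ordinary PIM procedures and returns the same kind of pair.
    """
    if not route.is_vpn_ip or route.outgoing_interface in vrf_interfaces:
        # Not learned across the backbone, or reached via a VRF
        # interface: ordinary PIM procedures apply.
        return ordinary_rpf(route)
    # Otherwise the MI-PMSI is the RPF interface and the Selected
    # Upstream PE is the RPF neighbor.
    return (MI_PMSI, selected_upstream_pe)
```
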
5.3.  Use of BGP for Carrying C-Multicast Routing

   It is possible to use BGP to carry C-multicast routing information
   from PE to PE, dispensing entirely with the transmission of
   C-Join/Prune messages from PE to PE.  This section describes the
   procedures for carrying intra-AS multicast routing information.
   Inter-AS procedures are described in Section 8.  The complete
   specification of both sets of procedures and of the encodings can be
   found in [MVPN-BGP].

5.3.1.  Sending BGP Updates

   The MCAST-VPN address family is used for this purpose.  MCAST-VPN
   routes used for the purpose of carrying C-multicast routing
   information are distinguished from those used for the purpose of
   carrying auto-discovery information by means of a "route type" field
   that is encoded into the NLRI.  The following information is required
   in BGP to advertise the MVPN routing information.  The NLRI contains
   the following:

     - The type of C-multicast route

       There are two types:

         * source tree join

         * shared tree join

     - The C-group address

     - The C-source address (In the case of a shared tree join, this is
       the address of the C-RP.)

     - The Selected Upstream RD corresponding to the C-root address
       (determined by the procedures of Section 5.1).

   Whenever a C-multicast route is sent, it must also carry the Selected
   Upstream Multicast Hop corresponding to the C-root address
   (determined by the procedures of Section 5.1).  The Selected Upstream
   Multicast Hop must be encoded as part of a Route Target Extended
   Community to facilitate the optional use of filters that can prevent
   the distribution of the update to BGP speakers other than the
   Upstream Multicast Hop.  See Section 10.1.3 of [MVPN-BGP] for the
   details.
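
   The information listed above can be pictured with the following
   sketch.  It models only which items belong to the NLRI and which are
   carried alongside it; it does not reproduce the wire encoding, which
   is specified in [MVPN-BGP].  All class, field, and function names are
   hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class JoinType(Enum):
    SOURCE_TREE = "source tree join"
    SHARED_TREE = "shared tree join"


@dataclass(frozen=True)
class CMulticastNlri:
    join_type: JoinType
    c_group: str
    c_source: str     # the C-RP address for a shared tree join
    upstream_rd: str  # Selected Upstream RD for the C-root


@dataclass(frozen=True)
class CMulticastRoute:
    nlri: CMulticastNlri
    # The Selected Upstream Multicast Hop travels in a Route Target
    # Extended Community, not in the NLRI.
    umh_route_target: str


def build_c_multicast_route(c_group, upstream_rd, selected_umh,
                            c_source=None, c_rp=None):
    """Build a source tree join if c_source is given, else a shared one."""
    if c_source is not None:
        nlri = CMulticastNlri(JoinType.SOURCE_TREE, c_group,
                              c_source, upstream_rd)
    elif c_rp is not None:
        nlri = CMulticastNlri(JoinType.SHARED_TREE, c_group,
                              c_rp, upstream_rd)
    else:
        raise ValueError("either c_source or c_rp is required")
    return CMulticastRoute(nlri, selected_umh)
```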

   There is no C-multicast route corresponding to the PIM function of
   pruning a source off the shared tree when a PE switches from a
   (C-*,C-G) tree to a (C-S,C-G) tree.  Section 9 of this document
   specifies a mandatory procedure that ensures that if any PE joins a
   (C-S,C-G) source tree, all other PEs that have joined or will join
   the (C-*,C-G) shared tree will also join the (C-S,C-G) source tree.

   This eliminates the need for a C-multicast route that prunes C-S off
   the (C-*,C-G) shared tree when switching from (C-*,C-G) to (C-S,C-G)
   tree.

5.3.2.  Explicit Tracking

   Note that the upstream multicast hop is NOT part of the NLRI in the
   C-multicast BGP routes.  This means that if several PEs join the same
   C-tree, the BGP routes they distribute to do so are regarded by BGP
   as comparable routes, and only one will be installed.  If a route
   reflector is being used, this further means that the PE that is used
   to reach the C-source will know only that one or more of the other
   PEs have joined the tree, but it will not know which ones.  That is,
   this BGP update mechanism does not provide "explicit tracking".
   Explicit tracking is not provided by default because it increases the
   amount of state needed and thus decreases scalability.  Also, as
   constructing the C-PIM messages to send "upstream" for a given tree
   does not depend on knowing all the PEs that are downstream on that
   tree, there is no reason for the C-multicast route type updates to
   provide explicit tracking.
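
   The loss of per-PE information can be illustrated with a toy model.
   The sketch below assumes a route reflector that keeps one path per
   NLRI; keeping the first path seen is a stand-in for BGP best-path
   selection, not a description of it, and the function name is
   hypothetical.

```python
def reflected_routes(advertisements):
    """Collapse (originating_pe, nlri) pairs to one path per NLRI.

    `nlri` may be any hashable value; because the upstream multicast
    hop is not part of it, joins from different PEs for the same
    C-tree carry equal NLRI.
    """
    best = {}
    for originating_pe, nlri in advertisements:
        # Keep the first path seen for each NLRI.
        best.setdefault(nlri, originating_pe)
    return best
```

   If PE3 and PE4 both join the same C-tree, the result holds a single
   entry for that tree, so the PE that reaches the C-source learns that
   the tree has at least one downstream PE but not the full set.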

   There are some cases in which explicit tracking is necessary in order
   for the PEs to set up certain kinds of P-trees.  There are other
   cases in which explicit tracking is desirable in order to determine
   how to optimally aggregate multicast flows onto a given aggregate
   tree.  As these functions have to do with the setting up of
   infrastructure in the P-network, rather than with the dissemination
   of C-multicast routing information, any explicit tracking that is
   necessary is handled by sending a particular type of A-D route known
   as "Leaf A-D routes".

   Whenever a PE sends an A-D route with a PMSI Tunnel attribute, it can
   set a bit in the PMSI Tunnel attribute indicating "Leaf Information
   Required".  A PE that installs such an A-D route MUST respond by
   generating a Leaf A-D route, indicating that it needs to join (or be
   joined to) the specified PMSI Tunnel.  Details can be found in
   [MVPN-BGP].

5.3.3.  Withdrawing BGP Updates

   A PE removes itself from a C-multicast tree (shared or source) by
   withdrawing the corresponding BGP Update.

   If a PE has pruned a C-source from a shared C-multicast tree, and it
   needs to "unprune" that source from that tree, it does so by
   withdrawing the route that pruned the source from the tree.

5.3.4.  BSR

   BGP does not provide a method for carrying the control information of
   BSR packets received by a PE from a CE.  BSR is supported by
   transmitting the BSR control messages from one PE in an MVPN to all
   the other PEs in that MVPN.

   When a PE needs to transmit a BSR message for a particular MVPN to
   other PEs, it must put its own IP address into the BSR message as the
   IP source address.  As specified in Section 5.1.2, when a PE
   distributes VPN-IP routes that are eligible for use as UMH routes,
   the PE MUST include a VRF Route Import Extended Community with each
   route.  For a given MVPN, a single such IP address MUST be used, and
   that same IP address MUST be used as the source address in all BSR
   packets that the PE transmits to other PEs.

   The BSR message may be transmitted over any PMSI that will deliver
   the message to all the other PEs in the MVPN.  If no such PMSI has
   been instantiated yet, then an appropriate P-tunnel must be
   advertised, and the C-flow whose C-source address is the address of
   the PE itself, and whose multicast group is ALL-PIM-ROUTERS
   (224.0.0.13), must be bound to it.  This can be done using the
   procedures described in Sections 7.3 and 7.4.  Note that this is NOT
   meant to imply that the other PIM control packets from the PIM
   C-instance are to be transmitted to the other PEs.

   When a PE receives a BSR message for a particular MVPN from some
   other PE, the PE accepts the message only if the IP source address in
   that message is the Selected Upstream PE (see Section 5.1.3) for the
   IP address of the Bootstrap router.  Otherwise, the PE simply
   discards the packet.  If the PE accepts the packet, it does normal
   BSR processing on it, and it may forward a BSR message to one or more
   CEs as a result.
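
   The acceptance check for a BSR message received from another PE can
   be sketched as follows.  The function name and the
   selected_upstream_pe callable (standing in for the procedures of
   Section 5.1.3) are hypothetical, and treating "no Selected Upstream
   PE" as a reason to discard is an assumption of this sketch.

```python
import ipaddress


def accept_bsr_message(ip_source, bootstrap_router, selected_upstream_pe):
    """Decide whether a BSR message from another PE is accepted.

    `selected_upstream_pe` is a caller-supplied callable that maps a
    C-address to the Selected Upstream PE address, or None if no
    Upstream PE has been selected for it.
    """
    upstream = selected_upstream_pe(bootstrap_router)
    if upstream is None:
        # Assumption: with no Selected Upstream PE, discard the packet.
        return False
    # Accept only if the sender is the Selected Upstream PE for the
    # address of the Bootstrap router.
    return (ipaddress.ip_address(ip_source)
            == ipaddress.ip_address(upstream))
```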
