3. Concepts and Framework 3.1. PE-CE Multicast Routing Support of multicast in BGP/MPLS IP VPNs is modeled closely after the support of unicast in BGP/MPLS IP VPNs. That is, a multicast routing protocol will be run on the PE-CE interfaces, such that PE and CE are multicast routing adjacencies on that interface. CEs at different sites do not become multicast routing adjacencies of each other. If a PE attaches to n VPNs for which multicast support is provided (i.e., to n "MVPNs"), the PE will run n independent instances of a multicast routing protocol. We will refer to these multicast routing instances as "VPN-specific multicast routing instances", or more briefly as "multicast C-instances". The notion of a "VRF" (VPN Routing and Forwarding Table), defined in [RFC4364], is extended to include multicast routing entries as well as unicast routing entries. Each multicast routing entry is thus associated with a particular VRF. Whether a particular VRF belongs to an MVPN or not is determined by configuration. In this document, we do not attempt to provide support for every possible multicast routing protocol that could possibly run on the PE-CE link. Rather, we consider multicast C-instances only for the following multicast routing protocols: - PIM Sparse Mode (PIM-SM), supporting the ASM service model - PIM Sparse Mode, supporting the SSM service model - PIM Bidirectional Mode (BIDIR-PIM), which uses bidirectional C-trees to support the ASM service model. In order to support the "Carrier's Carrier" model of [RFC4364], mLDP may also be supported on the PE-CE interface. The use of mLDP on the PE-CE interface is described in [MVPN-BGP].
The use of BGP on the PE-CE interface is not within the scope of this document.

As the only multicast C-instances discussed by this document are PIM-based C-instances, we will generally use the term "PIM C-instances" to refer to the multicast C-instances.

A PE router may also be running a "provider-wide" instance of PIM (a "PIM P-instance"), in which it has a PIM adjacency with, e.g., each of its IGP neighbors (i.e., with P routers), but NOT with any CE routers, and not with other PE routers (unless another PE router happens to be an IGP adjacency). In this case, P routers would also run the P-instance of PIM but NOT a C-instance. If there is a PIM P-instance, it may or may not have a role to play in the support of VPN multicast; this is discussed in later sections. However, in no case will the PIM P-instance contain VPN-specific multicast routing information.

In order to help clarify when we are speaking of the PIM P-instance and when we are speaking of a PIM C-instance, we will also apply the prefixes "P-" and "C-", respectively, to control messages, addresses, etc. Thus, a P-Join would be a PIM Join that is processed by the PIM P-instance, and a C-Join would be a PIM Join that is processed by a C-instance. A P-group address would be a group address in the SP's address space, and a C-group address would be a group address in a VPN's address space. A C-tree is a multicast distribution tree constructed and maintained by the PIM C-instances. A C-flow is a stream of multicast packets with a common C-source address and a common C-group address. We will use the notation "(C-S,C-G)" to identify specific C-flows. If a particular C-tree is a shared tree (whether unidirectional or bidirectional) rather than a source-specific tree, we will sometimes speak of the entire set of flows traveling that tree, identifying the set as "(C-*,C-G)".

3.2. P-Multicast Service Interfaces (PMSIs)

A PE must have the ability to forward multicast data packets received from a CE to one or more of the other PEs in the same MVPN for delivery to one or more other CEs.

We define the notion of a "P-Multicast Service Interface" (PMSI). If a particular MVPN is supported by a particular set of PE routers, then there will be one or more PMSIs connecting those PE routers and/or subsets thereof. A PMSI is a conceptual "overlay" on the P-network with the following property: a PE in a given MVPN can give a packet to the PMSI, and the packet will be delivered to some or all of the other PEs in the MVPN, such that any PE receiving the packet will be able to determine the MVPN to which the packet belongs.
As we discuss below, a PMSI may be instantiated by a number of different transport mechanisms, depending on the particular requirements of the MVPN and of the SP. We will refer to these transport mechanisms as "P-tunnels". For each MVPN, there are one or more PMSIs that are used for transmitting the MVPN's multicast data from one PE to others. We will use the term "PMSI" such that a single PMSI belongs to a single MVPN. However, the transport mechanism that is used to instantiate a PMSI may allow a single P-tunnel to carry the data of multiple PMSIs. In this document, we make a clear distinction between the multicast service (the PMSI) and its instantiation. This allows us to separate the discussion of different services from the discussion of different instantiations of each service. The term "P-tunnel" is used to refer to the transport mechanism that instantiates a service. PMSIs are used to carry C-multicast data traffic. The C-multicast data traffic travels along a C-tree, but in the SP backbone all C-trees are tunneled through P-tunnels. Thus, we will sometimes talk of a P-tunnel carrying one or more C-trees. Some of the options for passing multicast control traffic among the PEs do so by sending the control traffic through a PMSI; other options do not send control traffic through a PMSI. 3.2.1. Inclusive and Selective PMSIs We will distinguish between three different kinds of PMSIs: - "Multidirectional Inclusive" PMSI (MI-PMSI) A Multidirectional Inclusive PMSI is one that enables ANY PE attaching to a particular MVPN to transmit a message such that it will be received by EVERY other PE attaching to that MVPN. There is, at most, one MI-PMSI per MVPN. (Though the P-tunnel or P-tunnels that instantiate an MI-PMSI may actually carry the data of more than one PMSI.) An MI-PMSI can be thought of as an overlay broadcast network connecting the set of PEs supporting a particular MVPN. 
- "Unidirectional Inclusive" PMSI (UI-PMSI) A Unidirectional Inclusive PMSI is one that enables a particular PE, attached to a particular MVPN, to transmit a message such that it will be received by all the other PEs attaching to that
MVPN. There is, at most, one UI-PMSI per PE per MVPN, though the P-tunnel that instantiates a UI-PMSI may, in fact, carry the data of more than one PMSI. - "Selective" PMSI (S-PMSI). A Selective PMSI is one that provides a mechanism wherein a particular PE in an MVPN can multicast messages so that they will be received by a subset of the other PEs of that MVPN. There may be an arbitrary number of S-PMSIs per PE per MVPN. The P-tunnel that instantiates a given S-PMSI may carry data from multiple S-PMSIs. In later sections, we describe the role played by these different kinds of PMSIs. We will use the term "I-PMSI" when we are not distinguishing between "MI-PMSIs" and "UI-PMSIs". 3.2.2. P-Tunnels Instantiating PMSIs The P-tunnels that are used to instantiate PMSIs will be referred to as "P-tunnels". A number of different tunnel setup techniques can be used to create the P-tunnels that instantiate the PMSIs. Among these are the following: - PIM A PMSI can be instantiated as (a set of) Multicast Distribution trees created by the PIM P-instance ("P-trees"). The multicast distribution trees that instantiate I-PMSIs may be either shared trees or source-specific trees. This document (along with [MVPN-BGP]) specifies procedures for identifying a particular (C-S,C-G) flow and assigning it to a particular S-PMSI. Such an S-PMSI is most naturally instantiated as a source-specific tree. The use of shared trees (including bidirectional trees) to instantiate S-PMSIs is outside the scope of this document. The use of PIM-DM to create P-tunnels is not supported. P-tunnels may be shared by multiple MVPNs (i.e., a given P-tunnel may be the instantiation of multiple PMSIs), as long as the tunnel encapsulation provides some means of demultiplexing the data traffic by MVPN.
- mLDP mLDP Point-to-Multipoint (P2MP) LSPs or Multipoint-to-Multipoint (MP2MP) LSPs can be used to instantiate I-PMSIs. An S-PMSI or a UI-PMSI could be instantiated as a single mLDP P2MP LSP, whereas an MI-PMSI would have to be instantiated as a set of such LSPs (each PE in the MVPN being the root of one such LSP) or as a single MP2MP LSP. Procedures for sharing MP2MP LSPs across multiple MVPNs are outside the scope of this document. The use of MP2MP LSPs to instantiate S-PMSIs is outside the scope of this document. Section 11.2.3 discusses a way of using a partial mesh of MP2MP LSPs to instantiate a PMSI. However, a full specification of the necessary procedures is outside the scope of this document. - RSVP-TE A PMSI may be instantiated as one or more RSVP-TE Point-to- Multipoint (P2MP) LSPs. An S-PMSI or a UI-PMSI would be instantiated as a single RSVP-TE P2MP LSP, whereas a Multidirectional Inclusive PMSI would be instantiated as a set of such LSPs, one for each PE in the MVPN. RSVP-TE P2MP LSPs can be shared across multiple MVPNs. - A Mesh of Unicast P-Tunnels. If a PMSI is implemented as a mesh of unicast P-tunnels, a PE wishing to transmit a packet through the PMSI would replicate the packet and send a copy to each of the other PEs. An MI-PMSI for a given MVPN can be instantiated as a full mesh of unicast P-tunnels among that MVPN's PEs. A UI-PMSI or an S-PMSI can be instantiated as a partial mesh. It can be seen that each method of implementing PMSIs has its own area of applicability. Therefore, this specification allows for the use of any of these methods. At first glance, this may seem like an overabundance of options. However, the history of multicast development and deployment should make it clear that there is no one option that is always acceptable. The use of segmented inter-AS trees does allow each SP to select the option that it finds most applicable in its own environment, without causing any other SP to choose that same option.
SPECIFYING THE CONDITIONS UNDER WHICH A PARTICULAR TREE-BUILDING METHOD IS APPLICABLE IS OUTSIDE THE SCOPE OF THIS DOCUMENT. The choice of the tunnel technique belongs to the sender router and is a local policy decision of that router. The procedures defined throughout this document do not mandate that the same tunnel technique be used for all P-tunnels going through a given provider backbone. However, it is expected that any tunnel technique that can be used by a PE for a particular MVPN is also supported by all the other PEs having VRFs for the MVPN. Moreover, the use of ingress replication by any PE for an MVPN implies that all other PEs MUST use ingress replication for this MVPN. 3.3. Use of PMSIs for Carrying Multicast Data Each PE supporting a particular MVPN must have a way of discovering the following information: - The set of other PEs in its AS that are attached to sites of that MVPN, and the set of other ASes that have PEs attached to sites of that MVPN. However, if non-segmented inter-AS trees are used (see Section 8.1), then each PE needs to know the entire set of PEs attached to sites of that MVPN. - If segmented inter-AS trees are to be used, the set of border routers in its AS that support inter-AS connectivity for that MVPN. - If the MVPN is configured to use an MI-PMSI, the information needed to set up and to use the P-tunnels instantiating the MI-PMSI. - For each other PE, whether the PE supports Aggregate Trees for the MVPN, and if so, the demultiplexing information that must be provided so that the other PE can determine whether a packet that it received on an Aggregate Tree belongs to this MVPN. In some cases, the information above is provided by means of the BGP- based auto-discovery procedures discussed in Section 4 of this document and in Section 9 of [MVPN-BGP]. In other cases, this information is provided after discovery is complete, by means of procedures discussed in Section 7.4. 
In either case, the information that is provided must be sufficient to enable the PMSI to be bound to the identified P-tunnel, to enable the P-tunnel to be created if it does not already exist, and to enable the different PMSIs that may travel on the same P-tunnel to be properly demultiplexed.
If an MVPN uses an MI-PMSI, then the information needed to identify the P-tunnels that instantiate the MI-PMSI has to be known to the PEs attached to the MVPN before any data can be transmitted on the MI-PMSI. This information is either statically configured or auto- discovered (see Section 4). The actual process of constructing the P-tunnels (e.g., via PIM, RSVP-TE, or mLDP) SHOULD occur as soon as this information is known. When MI-PMSIs are used, they may serve as the default method of carrying C-multicast data traffic. When we say that an MI-PMSI is the "default" method of carrying C-multicast data traffic for a particular MVPN, we mean that it is not necessary to use any special control procedures to bind a particular C-flow to the MI-PMSI; any C-flows that have not been bound to other PMSIs will be assumed to travel through the MI-PMSI. There is no requirement to use MI-PMSIs as the default method of carrying C-flows. It is possible to adopt a policy in which all C-flows are carried on UI-PMSIs or S-PMSIs. In this case, if an MI-PMSI is not used for carrying routing information, it is not needed at all. Even when an MI-PMSI is used as the default method of carrying an MVPN's C-flows, if a particular C-flow has certain characteristics, it may be desirable to migrate it from the MI-PMSI to an S-PMSI. These characteristics, as well as the procedures for migrating a C-flow from an MI-PMSI to an S-PMSI, are discussed in Section 7. Sometimes a set of C-flows are traveling the same, shared, C-tree (e.g., either unidirectional or bidirectional), and it may be desirable to move the whole set of C-flows as a unit to an S-PMSI. Procedures for doing this are outside the scope of this specification. Some of the procedures for transmitting C-multicast routing information among the PEs require that the routing information be sent over an MI-PMSI. Other procedures do not use an MI-PMSI to transmit the C-multicast routing information. 
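The "default" role of the MI-PMSI described earlier in this section can be pictured as a lookup with a fallback. The following is a minimal sketch only; the class name FlowBindings, the PMSI labels, and the addresses are hypothetical and are not defined by this specification.

```python
from typing import Dict, Tuple

CFlow = Tuple[str, str]  # (C-S, C-G), both given as address strings


class FlowBindings:
    """Hypothetical per-MVPN table that maps C-flows to PMSIs."""

    def __init__(self, default_pmsi: str = "MI-PMSI") -> None:
        self.default_pmsi = default_pmsi
        self._explicit: Dict[CFlow, str] = {}

    def bind(self, flow: CFlow, pmsi: str) -> None:
        # Explicit binding, e.g., after a C-flow is migrated to an S-PMSI.
        self._explicit[flow] = pmsi

    def pmsi_for(self, flow: CFlow) -> str:
        # A C-flow with no explicit binding is assumed to use the default PMSI.
        return self._explicit.get(flow, self.default_pmsi)


bindings = FlowBindings()
bindings.bind(("10.0.0.1", "232.1.1.1"), "S-PMSI-1")
```

In this sketch, only the flow that was explicitly bound resolves to "S-PMSI-1"; every other C-flow of the MVPN resolves to the MI-PMSI without any per-flow control procedure.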
For a given MVPN, whether an MI-PMSI is used to carry C-multicast routing information is independent from whether an MI-PMSI is used as the default method of carrying the C-multicast data traffic. As previously stated, it is possible to send all C-flows on a set of S-PMSIs, omitting any usage of I-PMSIs. This prevents PEs from receiving data that they don't need, at the cost of requiring additional P-tunnels, and additional signaling to bind the C-flows to P-tunnels. Cost-effective instantiation of S-PMSIs is likely to
require Aggregate P-trees, which, in turn, makes it necessary for the transmitting PE to know which PEs need to receive which multicast streams. This is known as "explicit tracking", and the procedures to enable explicit tracking may themselves impose a cost. This is further discussed in Section 7.4.1.1.

3.4. PE-PE Transmission of C-Multicast Routing

As a PE attached to a given MVPN receives C-Join/Prune messages from its CEs in that MVPN, it must convey the information contained in those messages to other PEs that are attached to the same MVPN.

There are several different methods for doing this. As these methods are not interoperable, the method to be used for a particular MVPN must be either configured or discovered as part of the auto-discovery process.

3.4.1. PIM Peering

3.4.1.1. Full per-MVPN PIM Peering across an MI-PMSI

If the set of PEs attached to a given MVPN are connected via an MI-PMSI, the PEs can form "normal" PIM adjacencies with each other. Since the MI-PMSI functions as a broadcast network, the standard PIM procedures for forming and maintaining adjacencies over a LAN can be applied.

As a result, the C-Join/Prune messages that a PE receives from a CE can be multicast to all the other PEs of the MVPN. PIM "Join suppression" can be enabled and the PEs can send Asserts as needed.

This procedure is fully specified in Section 5.2.

3.4.1.2. Lightweight PIM Peering across an MI-PMSI

The procedure of the previous section has the following disadvantages:

- Periodic Hello messages must be sent by all PEs.

  Standard PIM procedures require that each PE in a particular MVPN periodically multicast a Hello to all the other PEs in that MVPN. If the number of MVPNs becomes very large, sending and receiving these Hellos can become a substantial overhead for the PE routers.
- Periodic retransmission of C-Join/Prune messages.

  PIM is a "soft-state" protocol, in which reliability is assured through frequent retransmissions (refresh) of control messages. This too can begin to impose a large overhead on the PE routers as the number of MVPNs grows.

The first of these disadvantages is easily remedied. The reason for the periodic PIM Hellos is to ensure that each PIM speaker on a LAN knows who all the other PIM speakers on the LAN are. However, in the context of MVPN, PEs in a given MVPN can learn the identities of all the other PEs in the MVPN by means of the BGP-based auto-discovery procedure of Section 4. In that case, the periodic Hellos would serve no function and could simply be eliminated. (Of course, this does imply a change to the standard PIM procedures.)

When Hellos are suppressed, we may speak of "lightweight PIM peering".

The periodic refresh of the C-Join/Prune messages is not as simple to eliminate. If and when "refresh reduction" procedures are specified for PIM, it may be useful to incorporate them, so as to make the lightweight PIM peering procedures even more lightweight.

Lightweight PIM peering is not specified in this document.

3.4.1.3. Unicasting of PIM C-Join/Prune Messages

PIM does not require that the C-Join/Prune messages that a PE receives from a CE be multicast to all the other PEs; it allows them to be unicast to a single PE, the one that is upstream on the path to the root of the multicast tree mentioned in the Join/Prune message. Note that when the C-Join/Prune messages are unicast, there is no such thing as "Join suppression". Therefore, PIM Refresh Reduction may be considered to be a prerequisite for the procedure of unicasting the C-Join/Prune messages.

When the C-Join/Prune messages are unicast, they are not transmitted on a PMSI at all.
Note that the procedure of unicasting the C-Join/Prune messages is different from the procedure of transmitting the C-Join/Prune messages on an MI-PMSI that is instantiated as a mesh of unicast P-tunnels.

If there are multiple PEs that can be used to reach a given C-source, procedures described in Sections 5.1 and 9 MUST be used to ensure that duplicate packets do not get delivered.
Procedures for unicasting the PIM control messages are not further specified in this document. 3.4.2. Using BGP to Carry C-Multicast Routing It is possible to use BGP to carry C-multicast routing information from PE to PE, dispensing entirely with the transmission of C-Join/Prune messages from PE to PE. This is discussed in Section 5.3 and fully specified in [MVPN-BGP]. 4. BGP-Based Auto-Discovery of MVPN Membership BGP-based auto-discovery is done by means of a new address family, the MCAST-VPN address family. (This address family also has other uses, as will be seen later.) Any PE that attaches to an MVPN must issue a BGP Update message containing an NLRI ("Network Layer Reachability Information" element) in this address family, along with a specific set of attributes. In this document, we specify the information that must be contained in these BGP Updates in order to provide auto-discovery. The encoding details, along with the complete set of detailed procedures, are specified in a separate document [MVPN-BGP]. This section specifies the intra-AS BGP-based auto-discovery procedures. When segmented inter-AS trees are used, additional procedures are needed, as specified in [MVPN-BGP]. (When segmented inter-AS trees are not used, the inter-AS procedures are almost identical to the intra-AS procedures.) BGP-based auto-discovery uses a particular kind of MCAST-VPN route known as an "auto-discovery route", or "A-D route". In particular, it uses two kinds of "A-D routes": the "Intra-AS I-PMSI A-D route" and the "Inter-AS I-PMSI A-D route". (There are also additional kinds of A-D routes, such as the Source Active A-D routes, which are used for purposes that go beyond auto-discovery. These are discussed in subsequent sections.) The Inter-AS I-PMSI A-D route is used only when segmented inter-AS P-tunnels are used, as specified in [MVPN-BGP]. The "Intra-AS I-PMSI A-D route" is originated by the PEs that are (directly) connected to the site(s) of an MVPN. 
It is distributed to other PEs that attach to sites of the MVPN. If segmented inter-AS P-tunnels are used, then the Intra-AS I-PMSI A-D routes are not distributed outside the AS where they originate; if segmented inter-AS P-tunnels are not used, then the Intra-AS I-PMSI A-D routes are, despite their name, distributed to all PEs attached to the VPN, no matter what AS the PEs are in.
The NLRI of an Intra-AS I-PMSI A-D route must contain the following information:

- The route type (i.e., Intra-AS I-PMSI A-D route).

- The IP address of the originating PE.

- An RD ("Route Distinguisher", [RFC4364]) configured locally for the MVPN. This is an RD that can be prepended to that IP address to form a globally unique VPN-IP address of the PE.

Intra-AS I-PMSI A-D routes carry the following attributes:

- Route Target Extended Communities attribute.

  One or more of these MUST be carried by each Intra-AS I-PMSI A-D route. If any other PE has one of these Route Targets configured for import into a VRF, it treats the advertising PE as a member in the MVPN to which the VRF belongs. This allows each PE to discover the PEs that belong to a given MVPN. More specifically, it allows a PE in the Receiver Sites set to discover the PEs in the Sender Sites set of the MVPN, and the PEs in the Sender Sites set of the MVPN to discover the PEs in the Receiver Sites set of the MVPN. The PEs in the Receiver Sites set would be configured to import the Route Targets advertised in the BGP A-D routes by PEs in the Sender Sites set. The PEs in the Sender Sites set would be configured to import the Route Targets advertised in the BGP A-D routes by PEs in the Receiver Sites set.

- PMSI Tunnel attribute.

  This attribute is present whenever the MVPN uses an MI-PMSI or when it uses a UI-PMSI rooted at the originating router. It contains the following information:

  * tunnel technology, which may be one of the following:

    + Bidirectional multicast tree created by BIDIR-PIM,

    + Source-specific multicast tree created by PIM-SM, supporting the SSM service model,

    + Set of trees (one shared tree and a set of source trees) created by PIM-SM using the ASM service model,

    + Point-to-multipoint LSP created by RSVP-TE,

    + Point-to-multipoint LSP created by mLDP,

    + multipoint-to-multipoint LSP created by mLDP

    + unicast tunnel

  * P-tunnel identifier

    Before a P-tunnel can be constructed to instantiate the I-PMSI, the PE must be able to create a unique identifier for the tunnel. The syntax of this identifier depends on the tunnel technology used.

    Each PE attaching to a given MVPN must be configured with information specifying the allowable encapsulations to use for that MVPN, as well as the particular one of those encapsulations that the PE is to identify in the PMSI Tunnel attribute of the Intra-AS I-PMSI A-D routes that it originates.

  * Multi-VPN aggregation capability and demultiplexor value.

    This specifies whether the P-tunnel is capable of aggregating I-PMSIs from multiple MVPNs. This will affect the encapsulation used. If aggregation is to be used, a demultiplexor value to be carried by packets for this particular MVPN must also be specified. The demultiplexing mechanism and signaling procedures are described in Section 6.

- PE Distinguisher Labels Attribute

  Sometimes it is necessary for one PE to advertise an upstream-assigned MPLS label that identifies another PE. Under certain circumstances to be discussed later, a PE that is the root of a multicast P-tunnel will bind an MPLS label value to one or more of the PEs that belong to the P-tunnel, and it will distribute these label bindings using Intra-AS I-PMSI A-D routes. Specification of when this must be done is provided in Sections 6.4.4 and 11.2.2. We refer to these as "PE Distinguisher Labels".

  Note that, as specified in [MPLS-UPSTREAM-LABEL], PE Distinguisher Label values are unique only in the context of the IP address identifying the root of the P-tunnel; they are not necessarily unique per tunnel.
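The Route Target based membership discovery described in this section can be sketched as a set intersection between the Route Targets a VRF imports and the Route Targets carried by each received Intra-AS I-PMSI A-D route. This is an illustrative sketch only: the class IntraAsIPmsiAdRoute, its field names, and the string forms of the RD and Route Targets are assumptions, and the actual encodings are those of [MVPN-BGP].

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class IntraAsIPmsiAdRoute:
    """Hypothetical, simplified view of an Intra-AS I-PMSI A-D route."""
    originating_pe: str                 # IP address of the originating PE (NLRI)
    rd: str                             # Route Distinguisher (NLRI)
    route_targets: FrozenSet[str]       # Route Target Extended Communities
    # (tunnel technology, P-tunnel identifier); absent if no I-PMSI is advertised
    pmsi_tunnel: Optional[Tuple[str, str]] = None


def discover_members(import_rts: Set[str],
                     routes: Iterable[IntraAsIPmsiAdRoute]) -> Set[str]:
    # A PE is treated as a member of the MVPN if at least one of the
    # Route Targets on its A-D route is configured for import into the VRF.
    return {r.originating_pe for r in routes if r.route_targets & import_rts}


received = [
    IntraAsIPmsiAdRoute("192.0.2.1", "65000:1", frozenset({"target:65000:100"})),
    IntraAsIPmsiAdRoute("192.0.2.2", "65000:2", frozenset({"target:65000:200"})),
    IntraAsIPmsiAdRoute("192.0.2.3", "65000:3",
                        frozenset({"target:65000:100", "target:65000:300"}),
                        pmsi_tunnel=("RSVP-TE P2MP LSP", "tunnel-7")),
]
members = discover_members({"target:65000:100"}, received)
```

With these made-up values, a VRF importing "target:65000:100" learns 192.0.2.1 and 192.0.2.3 as members of its MVPN, while 192.0.2.2 is ignored.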
5. PE-PE Transmission of C-Multicast Routing As a PE attached to a given MVPN receives C-Join/Prune messages from its CEs in that MVPN, it must convey the information contained in those messages to other PEs that are attached to the same MVPN. This is known as the "PE-PE transmission of C-multicast routing information". This section specifies the procedures used for PE-PE transmission of C-multicast routing information. Not every procedure mentioned in Section 3.4 is specified here. Rather, this section focuses on two particular procedures: - Full PIM Peering. This procedure is fully specified herein. - Use of BGP to distribute C-multicast routing This procedure is described herein, but the full specification appears in [MVPN-BGP]. Those aspects of the procedures that apply to both of the above are also specified fully herein. Specification of other procedures is outside the scope of this document. 5.1. Selecting the Upstream Multicast Hop (UMH) When a PE receives a C-Join/Prune message from a CE, the message identifies a particular multicast flow as belonging either to a source-specific tree (S,G) or to a shared tree (*,G). Throughout this section, we use the term "C-root" to refer to S, in the case of a source-specific tree, or to the Rendezvous Point (RP) for G, in the case of (*,G). If the route to the C-root is across the VPN backbone, then the PE needs to find the "Upstream Multicast Hop" (UMH) for the (S,G) or (*,G) flow. The UMH is either the PE at which (S,G) or (*,G) data packets enter the VPN backbone or the Autonomous System Border Router (ASBR) at which those data packets enter the local AS when traveling through the VPN backbone. The process of finding the upstream multicast hop for a given C-root is known as "upstream multicast hop selection".
5.1.1. Eligible Routes for UMH Selection In the simplest case, the PE does the upstream hop selection by looking up the C-root in the unicast VRF associated with the PE-CE interface over which the C-Join/Prune message was received. The route that matches the C-root will contain the information needed to select the UMH. However, in some cases, the CEs may be distributing to the PEs a special set of routes that are to be used exclusively for the purpose of upstream multicast hop selection, and not used for unicast routing at all. For example, when BGP is the CE-PE unicast routing protocol, the CEs may be using Subsequent Address Family Identifier 2 (SAFI 2) to distribute a special set of routes that are to be used for, and only for, upstream multicast hop selection. When OSPF [OSPF] is the CE-PE routing protocol, the CE may use an MT-ID (Multi-Topology Identifier) [OSPF-MT] of 1 to distribute a special set of routes that are to be used for, and only for, upstream multicast hop selection. When a CE uses one of these mechanisms to distribute to a PE a special set of routes to be used exclusively for upstream multicast hop selection, these routes are distributed among the PEs using SAFI 129, as described in [MVPN-BGP]. Whether the routes used for upstream multicast hop selection are (a) the "ordinary" unicast routes or (b) a special set of routes that are used exclusively for upstream multicast hop selection is a matter of policy. How that policy is chosen, deployed, or implemented is outside the scope of this document. In the following, we will simply refer to the set of routes that are used for upstream multicast hop selection, the "Eligible UMH routes", with no presumptions about the policy by which this set of routes was chosen. 5.1.2. Information Carried by Eligible UMH Routes Every route that is eligible for UMH selection SHOULD carry a VRF Route Import Extended Community [MVPN-BGP]. 
However, if BGP is used to distribute C-multicast routing information, or if the route is from a VRF that belongs to a multi-AS VPN as described in option b of Section 10 of [RFC4364], then the route MUST carry a VRF Route Import Extended Community. This attribute identifies the PE that originated the route. If BGP is used for carrying C-multicast routes, OR if "Segmented inter-AS Tunnels" are used, then every UMH route MUST also carry a Source AS Extended Community [MVPN-BGP]. These two attributes are used in the upstream multicast hop selection procedures described below.
5.1.3. Selecting the Upstream PE The first step in selecting the upstream multicast hop for a given C-root is to select the Upstream PE router for that C-root. The PE that received the C-Join message from a CE looks in the VRF corresponding to the interfaces over which the C-Join was received. It finds the Eligible UMH route that is the best match for the C-root specified in that C-Join. Call this the "Installed UMH Route". Note that the outgoing interface of the Installed UMH Route may be one of the interfaces associated with the VRF, in which case the upstream multicast hop is a CE and the route to the C-root is not across the VPN backbone. Consider the set of all VPN-IP routes that (a) are eligible to be imported into the VRF (as determined by their Route Targets), (b) are eligible to be used for upstream multicast hop selection, and (c) have exactly the same IP prefix (not necessarily the same RD) as the installed UMH route. For each route in this set, determine the corresponding Upstream PE and Upstream RD. If a route has a VRF Route Import Extended Community, the route's Upstream PE is determined from it. If a route does not have a VRF Route Import Extended Community, the route's Upstream PE is determined from the route's BGP Next Hop. In either case, the Upstream RD is taken from the route's NLRI. This results in a set of triples of <route, Upstream PE, Upstream RD>. Call this the "UMH Route Candidate Set". Then, the PE MUST select a single route from the set to be the "Selected UMH Route". The corresponding Upstream PE is known as the "Selected Upstream PE", and the corresponding Upstream RD is known as the "Selected Upstream RD". There are several possible procedures that can be used by a PE to select a single route from the candidate set. The default procedure, which MUST be implemented, is to select the route whose corresponding Upstream PE address is numerically highest, where a 32-bit IP address is treated as a 32-bit unsigned integer. 
Call this the "default Upstream PE selection". For a given C-root, provided that the routing information used to create the candidate set is stable, all PEs will have the same default Upstream PE selection. (Though different default Upstream PE selections may be chosen during a routing transient.)
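The construction of the UMH Route Candidate Set and the default Upstream PE selection can be sketched as follows. This is a minimal sketch under stated assumptions: the VpnIpRoute class and its field names are hypothetical, the input routes are assumed to have already been filtered by Route Target and by UMH eligibility, and the addresses are documentation examples.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class VpnIpRoute:
    """Hypothetical, simplified VPN-IP route eligible for UMH selection."""
    rd: str                                   # Upstream RD, taken from the NLRI
    prefix: str                               # IP prefix from the NLRI
    bgp_next_hop: IPv4Address
    # PE address from the VRF Route Import Extended Community, if present
    vrf_route_import_pe: Optional[IPv4Address] = None


Triple = Tuple[VpnIpRoute, IPv4Address, str]  # <route, Upstream PE, Upstream RD>


def upstream_pe(route: VpnIpRoute) -> IPv4Address:
    # Use the VRF Route Import Extended Community when present;
    # otherwise fall back to the BGP Next Hop.
    if route.vrf_route_import_pe is not None:
        return route.vrf_route_import_pe
    return route.bgp_next_hop


def candidate_set(routes: Iterable[VpnIpRoute], installed_prefix: str) -> List[Triple]:
    # Keep only routes with exactly the same prefix as the Installed UMH Route.
    return [(r, upstream_pe(r), r.rd) for r in routes if r.prefix == installed_prefix]


def default_selection(candidates: List[Triple]) -> Triple:
    # Numerically highest Upstream PE address, treated as a 32-bit unsigned integer.
    return max(candidates, key=lambda triple: int(triple[1]))


routes = [
    VpnIpRoute("65000:1", "10.1.0.0/16", IPv4Address("192.0.2.1"), IPv4Address("192.0.2.1")),
    VpnIpRoute("65000:2", "10.1.0.0/16", IPv4Address("192.0.2.9")),
    VpnIpRoute("65000:3", "10.2.0.0/16", IPv4Address("192.0.2.200")),
]
selected_route, selected_pe, selected_rd = default_selection(
    candidate_set(routes, "10.1.0.0/16"))
```

Here the third route is excluded because its prefix differs from that of the Installed UMH Route, and 192.0.2.9 is chosen over 192.0.2.1 because it is numerically higher. Every PE that sees the same candidate set makes the same choice.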
An alternative procedure that MUST be implemented, but which is disabled by default, is the following. This procedure ensures that, except during a routing transient, each PE chooses the same Upstream PE for a given combination of C-root and C-G.

1. The PEs in the candidate set are numbered from lowest to highest IP address, starting from 0.

2. The following hash is performed:

   - A bytewise exclusive-or of all the bytes in the C-root address and the C-G address is performed.

   - The result is taken modulo n, where n is the number of PEs in the candidate set. Call this result N.

The Selected Upstream PE is then the one that appears in position N in the list of step 1.

Other hashing algorithms are allowed as well, but not required.

The alternative procedure allows a form of "equal cost load balancing". Suppose, for example, that from egress PEs PE3 and PE4, source C-S can be reached, at equal cost, via ingress PE PE1 or ingress PE PE2. The load balancing procedure makes it possible for PE1 to be the ingress PE for (C-S,C-G1) data traffic while PE2 is the ingress PE for (C-S,C-G2) data traffic.

Another procedure, which SHOULD be implemented, is to use the Installed UMH Route as the Selected UMH Route. If this procedure is used, the result is likely to be that a given PE will choose the Upstream PE that is closest to it, according to the routing in the SP backbone. As a result, for a given C-root, different PEs may choose different Upstream PEs. This is useful if the C-root is an anycast address, and can also be useful if the C-root is in a multihomed site (i.e., a site that is attached to multiple PEs). However, this procedure is more likely to lead to steady state duplication of traffic unless (a) PEs discard data traffic that arrives from the "wrong" Upstream PE or (b) data traffic is carried only in non-aggregated S-PMSIs. This issue is discussed at length in Section 9.
General policy-based procedures for selecting the UMH route are allowed but not required, and they are not further discussed in this specification.
5.1.4. Selecting the Upstream Multicast Hop

In certain cases, the Selected Upstream Multicast Hop is the same as the Selected Upstream PE. In other cases, the Selected Upstream Multicast Hop is the ASBR that is the BGP Next Hop of the Selected UMH Route.

If the Selected Upstream PE is in the local AS, then the Selected Upstream PE is also the Selected Upstream Multicast Hop. This is the case if any of the following conditions holds:

- The Selected UMH Route has a Source AS Extended Community, and the Source AS is the same as the local AS,

- The Selected UMH Route does not have a Source AS Extended Community, but the route's BGP Next Hop is the same as the Upstream PE.

Otherwise, the Selected Upstream Multicast Hop is an ASBR. The method of determining just which ASBR it is depends on the particular inter-AS signaling method being used (PIM or BGP) and on whether segmented or non-segmented inter-AS tunnels are used. These details are presented in later sections.

5.2. Details of Per-MVPN Full PIM Peering over MI-PMSI

When an MVPN uses an MI-PMSI, the C-instances of that MVPN can treat the MI-PMSI as a LAN interface and form full PIM adjacencies with each other over that LAN interface. The use of PIM when an MI-PMSI is not in use is outside the scope of this document.

To form full PIM adjacencies, the PEs execute the standard PIM procedures on the LAN interface, including the generation and processing of PIM Hello, Join/Prune, Assert, DF (Designated Forwarder) election, and other PIM control messages. These are executed independently for each C-instance. PIM "Join suppression" SHOULD be enabled.

5.2.1. PIM C-Instance Control Packets

All IPv4 PIM C-instance control packets of a particular MVPN are addressed to the ALL-PIM-ROUTERS (224.0.0.13) IP destination address and transmitted over the MI-PMSI of that MVPN. While in transit in the P-network, the packets are encapsulated as required for the particular kind of P-tunnel that is being used to instantiate the
MI-PMSI. Thus, the C-instance control packets are not processed by the P routers, and MVPN-specific PIM routes can be extended from site to site without appearing in the P routers.

The handling of IPv6 PIM C-instance control packets will be specified in a follow-on document.

As specified in Section 5.1.2, when PIM is being used to distribute C-multicast routing information, any PE distributing VPN-IP routes that are eligible for use as UMH routes MUST include a VRF Route Import Extended Community with each route. For a given VRF, the Global Administrator field of the VRF Route Import Extended Community MUST be set to the same IP address that the PE places in the IP source address field of the PE-PE PIM control messages it originates from that VRF.

Note that BSR (Bootstrap Router Mechanism for PIM) [BSR] messages are treated the same as PIM C-instance control packets, and BSR processing is regarded as an integral part of the PIM C-instance processing.

5.2.2. PIM C-Instance Reverse Path Forwarding (RPF) Determination

Although the MI-PMSI is treated by PIM as a LAN interface, unicast routing is NOT run over it, and there are no unicast routing adjacencies over it. Therefore, it is necessary to specify special procedures for determining when the MI-PMSI is to be regarded as the "RPF Interface" for a particular C-address.

The PE follows the procedures of Section 5.1 to determine the Selected UMH Route. If that route is NOT a VPN-IP route learned from BGP as described in [RFC4364], or if that route's outgoing interface is one of the interfaces associated with the VRF, then ordinary PIM procedures for determining the RPF interface apply.

However, if the Selected UMH Route is a VPN-IP route whose outgoing interface is not one of the interfaces associated with the VRF, then PIM will consider the RPF interface to be the MI-PMSI associated with the VPN-specific PIM instance.
Once PIM has determined that the RPF interface for a particular C-root is the MI-PMSI, it is necessary for PIM to determine the "RPF neighbor" for that C-root. This will be one of the other PEs that is a PIM adjacency over the MI-PMSI. In particular, it will be the "Selected Upstream PE", as defined in Section 5.1.
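The RPF determination described in this section can be sketched as follows. This is a simplified illustration, not a definitive implementation: the `UmhRoute` type, its fields, the `MI_PMSI` marker, and the function `rpf_lookup` are hypothetical names introduced here, and the "ordinary PIM procedures" case is reduced to returning the route's outgoing interface with no neighbor.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

MI_PMSI = "mi-pmsi"  # hypothetical marker for the MVPN's MI-PMSI pseudo-interface


@dataclass(frozen=True)
class UmhRoute:
    """Hypothetical view of the Selected UMH Route (Section 5.1)."""
    is_vpn_ip_from_bgp: bool            # VPN-IP route learned from BGP [RFC4364]?
    outgoing_interface: str             # interface the route resolves over
    upstream_pe: Optional[str] = None   # Selected Upstream PE, if applicable


def rpf_lookup(route: UmhRoute, vrf_interfaces: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (RPF interface, RPF neighbor) for a C-root."""
    # Not a BGP-learned VPN-IP route, or it points out a VRF interface:
    # ordinary PIM procedures apply (simplified here; neighbor left as None).
    if not route.is_vpn_ip_from_bgp or route.outgoing_interface in vrf_interfaces:
        return (route.outgoing_interface, None)
    # Otherwise the RPF interface is the MI-PMSI, and the RPF neighbor is
    # the Selected Upstream PE.
    return (MI_PMSI, route.upstream_pe)


vrf_ifs = {"ge-0/0/1"}
print(rpf_lookup(UmhRoute(True, "core0", "192.0.2.1"), vrf_ifs))
print(rpf_lookup(UmhRoute(False, "ge-0/0/1"), vrf_ifs))
```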
5.3. Use of BGP for Carrying C-Multicast Routing

It is possible to use BGP to carry C-multicast routing information from PE to PE, dispensing entirely with the transmission of C-Join/Prune messages from PE to PE. This section describes the procedures for carrying intra-AS multicast routing information. Inter-AS procedures are described in Section 8. The complete specification of both sets of procedures and of the encodings can be found in [MVPN-BGP].

5.3.1. Sending BGP Updates

The MCAST-VPN address family is used for this purpose. MCAST-VPN routes used for the purpose of carrying C-multicast routing information are distinguished from those used for the purpose of carrying auto-discovery information by means of a "route type" field that is encoded into the NLRI. The following information is required in BGP to advertise the MVPN routing information. The NLRI contains the following:

- The type of C-multicast route

  There are two types:

  * source tree join

  * shared tree join

- The C-group address

- The C-source address (In the case of a shared tree join, this is the address of the C-RP.)

- The Selected Upstream RD corresponding to the C-root address (determined by the procedures of Section 5.1).

Whenever a C-multicast route is sent, it must also carry the Selected Upstream Multicast Hop corresponding to the C-root address (determined by the procedures of Section 5.1). The Selected Upstream Multicast Hop must be encoded as part of a Route Target Extended Community to facilitate the optional use of filters that can prevent the distribution of the update to BGP speakers other than the Upstream Multicast Hop. See Section 10.1.3 of [MVPN-BGP] for the details.

There is no C-multicast route corresponding to the PIM function of pruning a source off the shared tree when a PE switches from a (C-*,C-G) tree to a (C-S,C-G) tree. Section 9 of this document
specifies a mandatory procedure that ensures that if any PE joins a (C-S,C-G) source tree, all other PEs that have joined or will join the (C-*,C-G) shared tree will also join the (C-S,C-G) source tree. This eliminates the need for a C-multicast route that prunes C-S off the (C-*,C-G) shared tree when switching from (C-*,C-G) to (C-S,C-G) tree.

5.3.2. Explicit Tracking

Note that the upstream multicast hop is NOT part of the NLRI in the C-multicast BGP routes. This means that if several PEs join the same C-tree, the BGP routes they distribute to do so are regarded by BGP as comparable routes, and only one will be installed. If a route reflector is being used, this further means that the PE that is used to reach the C-source will know only that one or more of the other PEs have joined the tree, but it won't know which one. That is, this BGP update mechanism does not provide "explicit tracking". Explicit tracking is not provided by default because it increases the amount of state needed and thus decreases scalability. Also, as constructing the C-PIM messages to send "upstream" for a given tree does not depend on knowing all the PEs that are downstream on that tree, there is no reason for the C-multicast route type updates to provide explicit tracking.

There are some cases in which explicit tracking is necessary in order for the PEs to set up certain kinds of P-trees. There are other cases in which explicit tracking is desirable in order to determine how to optimally aggregate multicast flows onto a given aggregate tree. As these functions have to do with the setting up of infrastructure in the P-network, rather than with the dissemination of C-multicast routing information, any explicit tracking that is necessary is handled by sending a particular type of A-D route known as "Leaf A-D routes".

Whenever a PE sends an A-D route with a PMSI Tunnel attribute, it can set a bit in the PMSI Tunnel attribute indicating "Leaf Information Required". A PE that installs such an A-D route MUST respond by generating a Leaf A-D route, indicating that it needs to join (or be joined to) the specified PMSI Tunnel. Details can be found in [MVPN-BGP].

5.3.3. Withdrawing BGP Updates

A PE removes itself from a C-multicast tree (shared or source) by withdrawing the corresponding BGP Update.
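The interaction between Sections 5.3.1 through 5.3.3 can be illustrated with a toy model. The names `CMulticastNlri` and `RouteReflectorView` are hypothetical, and the model is an assumption-laden sketch: it ignores the actual encodings of [MVPN-BGP] and shows only that routes are keyed by NLRI, that the NLRI omits the originating PE and the upstream multicast hop, and that a PE leaves a tree by withdrawing its route.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMulticastNlri:
    """NLRI fields listed in Section 5.3.1 (illustrative, not wire format)."""
    route_type: str    # "source-tree-join" or "shared-tree-join"
    upstream_rd: str   # Selected Upstream RD for the C-root
    c_source: str      # C-source, or the C-RP for a shared tree join
    c_group: str


class RouteReflectorView:
    """Toy route reflector: many PEs may originate the same NLRI, but
    only the NLRI itself is passed on toward the upstream PE."""

    def __init__(self):
        self._originators = {}  # NLRI -> set of PEs currently advertising it

    def advertise(self, pe, nlri):
        # A PE joins a C-tree by advertising the C-multicast route.
        self._originators.setdefault(nlri, set()).add(pe)

    def withdraw(self, pe, nlri):
        # A PE leaves a C-tree by withdrawing the corresponding route.
        pes = self._originators.get(nlri, set())
        pes.discard(pe)
        if not pes:
            self._originators.pop(nlri, None)

    def visible_to_upstream(self):
        # The upstream PE learns which trees are joined, not by whom.
        return set(self._originators)


rr = RouteReflectorView()
join = CMulticastNlri("source-tree-join", "65000:1", "10.1.1.1", "232.1.1.1")
rr.advertise("PE3", join)
rr.advertise("PE4", join)
print(rr.visible_to_upstream() == {join})   # one route, two joiners
rr.withdraw("PE3", join)
print(rr.visible_to_upstream() == {join})   # PE4 is still joined
```

In this model the upstream PE keeps forwarding for as long as at least one route for the NLRI remains, which is why it does not need to know the individual downstream PEs.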
If a PE has pruned a C-source from a shared C-multicast tree, and it needs to "unprune" that source from that tree, it does so by withdrawing the route that pruned the source from the tree.

5.3.4. BSR

BGP does not provide a method for carrying the control information of BSR packets received by a PE from a CE. BSR is supported by transmitting the BSR control messages from one PE in an MVPN to all the other PEs in that MVPN.

When a PE needs to transmit a BSR message for a particular MVPN to other PEs, it must put its own IP address into the BSR message as the IP source address. As specified in Section 5.1.2, when a PE distributes VPN-IP routes that are eligible for use as UMH routes, the PE MUST include a VRF Route Import Extended Community with each route. For a given MVPN, a single IP address MUST be used in the Global Administrator field of that Extended Community, and that same IP address MUST be used as the source address in all BSR packets that the PE transmits to other PEs.

The BSR message may be transmitted over any PMSI that will deliver the message to all the other PEs in the MVPN. If no such PMSI has been instantiated yet, then an appropriate P-tunnel must be advertised, and the C-flow whose C-source address is the address of the PE itself, and whose multicast group is ALL-PIM-ROUTERS (224.0.0.13), must be bound to it. This can be done using the procedures described in Sections 7.3 and 7.4. Note that this is NOT meant to imply that the other PIM control packets from the PIM C-instance are to be transmitted to the other PEs.

When a PE receives a BSR message for a particular MVPN from some other PE, the PE accepts the message only if the IP source address in that message is the Selected Upstream PE (see Section 5.1.3) for the IP address of the Bootstrap router. Otherwise, the PE simply discards the packet. If the PE accepts the packet, it does normal BSR processing on it, and it may forward a BSR message to one or more CEs as a result.
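The acceptance rule for BSR messages received from another PE can be sketched as a single check. The function `accept_bsr_from_pe` and the callback `selected_upstream_pe_for` are hypothetical; the callback stands in for the Upstream PE selection of Section 5.1.3 and is assumed to return an IP address string, or None if no Upstream PE can be selected.

```python
import ipaddress


def accept_bsr_from_pe(ip_source, bootstrap_router, selected_upstream_pe_for):
    """Return True if a BSR message received from another PE is accepted.

    ip_source:                IP source address of the received BSR message.
    bootstrap_router:         address of the Bootstrap router in the message.
    selected_upstream_pe_for: hypothetical callback mapping a C-address to
                              the Selected Upstream PE (Section 5.1.3).
    """
    upstream_pe = selected_upstream_pe_for(bootstrap_router)
    if upstream_pe is None:
        return False  # no Upstream PE selected: discard the packet
    # Accept only if the sender is the Selected Upstream PE for the BSR address.
    return ipaddress.ip_address(ip_source) == ipaddress.ip_address(upstream_pe)


# Example with a fixed, made-up selection table.
table = {"10.2.2.2": "192.0.2.1"}
print(accept_bsr_from_pe("192.0.2.1", "10.2.2.2", table.get))
print(accept_bsr_from_pe("192.0.2.2", "10.2.2.2", table.get))
```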