RFC 6513

Multicast in MPLS/BGP IP VPNs

6.  PMSI Instantiation

   This section provides the procedures for using P-tunnels to
   instantiate a PMSI.  It describes the procedures for setting up and
   maintaining the P-tunnels as well as for sending and receiving C-data
   and/or C-control messages on the P-tunnels.  However, procedures for
   binding particular C-flows to particular P-tunnels are discussed in
   Section 7.

   PMSIs can be instantiated either by P-multicast trees or by PE-PE
   unicast tunnels.  In the latter case, the PMSI is said to be
   instantiated by "ingress replication".

   This specification supports a number of different methods for setting
   up P-multicast trees: these are detailed below.  A P-tunnel may
   support a single VPN (a non-aggregated P-multicast tree) or multiple
   VPNs (an aggregated P-multicast tree).

6.1.  Use of the Intra-AS I-PMSI A-D Route

6.1.1.  Sending Intra-AS I-PMSI A-D Routes

   When a PE is provisioned to have one or more VRFs that provide MVPN
   support, the PE announces its MVPN membership information using
   Intra-AS I-PMSI A-D routes, as discussed in Section 4 and detailed in
   Section 9.1.1 of [MVPN-BGP].  (Under certain conditions, detailed in
   [MVPN-BGP], the Intra-AS I-PMSI A-D route may be omitted.)

   Generally, the Intra-AS I-PMSI A-D route will have a PMSI Tunnel
   attribute that identifies a P-tunnel that is being used to
   instantiate the I-PMSI.  Section 9.1.1 of [MVPN-BGP] details certain
   conditions under which the PMSI Tunnel attribute may be omitted (or
   in which a PMSI Tunnel attribute with the "no tunnel information
   present" bit may be sent).

   As a special case, when (a) C-PIM control messages are to be sent
   through an MI-PMSI and (b) the MI-PMSI is instantiated by a P-tunnel
   technique for which each PE needs to know only a single P-tunnel
   identifier per VPN, then the use of the Intra-AS I-PMSI A-D routes
   MAY be omitted, and static configuration of the tunnel identifier
   used instead.  However, this is not recommended for long-term use,
   and in all other cases, the Intra-AS I-PMSI A-D routes MUST be used.

   The PMSI Tunnel attribute MAY contain an upstream-assigned MPLS
   label, assigned by the PE originating the Intra-AS I-PMSI A-D route.
   If this label is present, the P-tunnel can be carrying data from
   several MVPNs.  The label is used on the data packets traveling
   through the tunnel to identify the MVPN to which those data packets
   belong.  (The specified label identifies the packet as belonging to
   the MVPN that is identified by the RTs of the Intra-AS I-PMSI A-D
   route.)

   See Section 12.2 for details on how to place the label in the
   packet's label stack.

   The Intra-AS I-PMSI A-D route may contain a "PE Distinguisher Labels"
   attribute.  This contains a set of bindings between upstream-assigned
   labels and PE addresses.  The PE that originated the route may use
   this to bind an upstream-assigned label to one or more of the other
   PEs that belong to the same MVPN.  The way in which PE Distinguisher
   Labels are used is discussed in Sections 6.4.1, 6.4.3, 11.2.2, and
   12.3.  Other uses of the PE Distinguisher Labels attribute are
   outside the scope of this document.

6.1.2.  Receiving Intra-AS I-PMSI A-D Routes

   The action to be taken when a PE receives an Intra-AS I-PMSI A-D
   route for a particular MVPN depends on the particular P-tunnel
   technology that is being used by that MVPN.  If the P-tunnel
   technology requires tunnels to be built by means of receiver-
   initiated joins, the PE SHOULD join the tunnel immediately.

6.2.  When C-flows Are Specifically Bound to P-Tunnels

   This situation is discussed in Section 7.

6.3.  Aggregating Multiple MVPNs on a Single P-Tunnel

   When a P-multicast tree is shared across multiple MVPNs, it is termed
   an "Aggregate Tree".  The procedures described in this document allow
   a single SP multicast tree to be shared across multiple MVPNs.
   Unless otherwise specified, P-multicast tree technology supports
   aggregation.

   All procedures that are specific to multi-MVPN aggregation are
   OPTIONAL and are explicitly pointed out.

   Aggregate Trees allow a single P-multicast tree to be used across
   multiple MVPNs so that state in the SP core grows per set of MVPNs
   and not per MVPN.  Depending on the congruence of the aggregated
   MVPNs, this may result in trading off optimality of multicast
   routing.

   An Aggregate Tree can be used by a PE to provide a UI-PMSI or MI-PMSI
   service for more than one MVPN.  When this is the case, the Aggregate
   Tree is said to have an inclusive mapping.

6.3.1.  Aggregate Tree Leaf Discovery

   BGP MVPN membership discovery (Section 4) allows a PE to determine
   the different Aggregate Trees that it should create and the MVPNs
   that should be mapped onto each such tree.  The leaves of an
   Aggregate Tree are determined by the PEs, supporting aggregation,
   that belong to all the MVPNs that are mapped onto the tree.

   If an Aggregate Tree is used to instantiate one or more S-PMSIs, then
   it may be desirable for the PE at the root of the tree to know which
   PEs (in its MVPN) are receivers on that tree.  This enables the PE to
   decide when to aggregate two S-PMSIs, based on congruence (as
   discussed in the next section).  Thus, explicit tracking may be
   required.  Since the procedures for disseminating C-multicast routes
   do not provide explicit tracking, a type of A-D route known as a
   "Leaf A-D route" is used.  The PE that wants to assign a particular
   C-multicast flow to a particular Aggregate Tree can send an A-D
   route, which elicits Leaf A-D routes from the PEs that need to
   receive that C-multicast flow.  This provides the explicit tracking
   information needed to support the aggregation methodology discussed
   in the next section.  For more details on Leaf A-D routes, please
   refer to [MVPN-BGP].

6.3.2.  Aggregation Methodology

   This document does not specify the mandatory implementation of any
   particular set of rules for determining whether or not the PMSIs of
   two particular MVPNs are to be instantiated by the same Aggregate
   Tree.  This determination can be made by implementation-specific
   heuristics, by configuration, or even perhaps by the use of offline
   tools.

   It is the intention of this document that the control procedures will
   always result in all the PEs of an MVPN agreeing on the PMSIs that
   are to be used and on the tunnels used to instantiate those PMSIs.

   This section discusses potential methodologies with respect to
   aggregation.

   The "congruence" of aggregation is defined by the amount of overlap
   in the leaves of the customer trees that are aggregated on an SP
   tree.  For Aggregate Trees with an inclusive mapping, the congruence
   depends on the overlap in the membership of the MVPNs that are
   aggregated on the tree.  If there is complete overlap, i.e., all
   MVPNs have exactly the same sites, aggregation is perfectly
   congruent.  As the overlap between the MVPNs that are aggregated
   reduces, i.e., the number of sites that are common across all the
   MVPNs reduces, the congruence reduces.

   If aggregation is done such that it is not perfectly congruent, a PE
   may receive traffic for MVPNs to which it doesn't belong.  As the
   amount of multicast traffic in these unwanted MVPNs increases,
   aggregation becomes less optimal with respect to delivered traffic.
   Hence, there is a trade-off between reducing state and delivering
   unwanted traffic.

   An implementation should provide knobs to control the congruence of
   aggregation.  These knobs are implementation dependent.  Configuring
   the percentage of sites that MVPNs must have in common to be
   aggregated is an example of such a knob.  This will allow an SP to
   deploy aggregation depending on the MVPN membership and traffic
   profiles in its network.  If different PEs or servers are setting up
   Aggregate Trees, this will also allow a service provider to engineer
   the maximum number of unwanted MVPNs for which a particular PE may
   receive traffic.
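
   As an illustration only, the "percentage of sites in common" knob
   mentioned above could be sketched as follows.  The ratio used here
   (PEs common to all MVPNs divided by PEs in any of them) is an
   assumption of this sketch, since this document does not define a
   formula; the names congruence, may_aggregate, and
   min_common_fraction are hypothetical.

```python
from typing import Iterable, Set


def congruence(memberships: Iterable[Set[str]]) -> float:
    """Return |PEs common to all MVPNs| / |PEs in any of the MVPNs|.

    1.0 means perfectly congruent. The ratio itself is an assumption made
    for this sketch; the text does not define a formula.
    """
    sets = [set(m) for m in memberships]
    if not sets:
        raise ValueError("at least one MVPN membership set is required")
    union = set().union(*sets)
    if not union:
        return 1.0  # no PEs at all: trivially congruent
    common = set.intersection(*sets)
    return len(common) / len(union)


def may_aggregate(memberships: Iterable[Set[str]],
                  min_common_fraction: float) -> bool:
    """Hypothetical knob: aggregate only if congruence meets the threshold."""
    return congruence(memberships) >= min_common_fraction
```

   With a threshold of 0.5, two MVPNs whose PE sets are {PE1, PE2, PE3}
   and {PE1, PE2} would be eligible for aggregation under this sketch,
   while two MVPNs with disjoint PE sets would not.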

6.3.3.  Demultiplexing C-Multicast Traffic

   If a P-multicast tree is associated with only one MVPN, determining
   the P-multicast tree on which a packet was received is sufficient to
   determine the packet's MVPN.  All that the egress PE needs to know is
   the MVPN with which the P-multicast tree is associated.

   When multiple MVPNs are aggregated onto one P-multicast tree,
   determining the tree over which the packet is received is not
   sufficient to determine the MVPN to which the packet belongs.  The
   packet must also carry some demultiplexing information to allow the
   egress PEs to determine the MVPN to which the packet belongs.  Since
   the packet has been multicast through the P-network, any given
   demultiplexing value must have the same meaning to all the egress
   PEs.  The demultiplexing value is an MPLS label that corresponds to
   the multicast VRF to which the packet belongs.  This label is placed
   by the ingress PE immediately beneath the P-multicast tree header.
   Each of the egress PEs must be able to associate this MPLS label with
   the same MVPN.  If downstream-assigned labels were used, this would
   require all the egress PEs in the MVPN to agree on a common label for
   the MVPN.  Instead, the MPLS label is upstream-assigned
   [MPLS-UPSTREAM-LABEL].  The label bindings are advertised via BGP
   Updates originated by the ingress PEs.

   This procedure requires each egress PE to support a separate label
   space for every other PE.  The egress PEs create a forwarding entry
   for the upstream-assigned MPLS label, allocated by the ingress PE, in
   this label space.  Hence, when the egress PE receives a packet over
   an Aggregate Tree, it first determines the tree over which the packet
   was received.  The tree identifier determines the label space in
   which the upstream-assigned MPLS label lookup has to be performed.

   The same label space may be used for all P-multicast trees rooted at
   the same ingress PE or an implementation may decide to use a separate
   label space for every P-multicast tree.
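
   The two-step lookup described above (tree identifier selects a label
   space, then the upstream-assigned label is looked up in that space)
   might be sketched as below.  This is a minimal model, not an
   implementation: the class EgressDemux, its method names, and the
   choice of one label space per ingress PE are assumptions made for
   illustration.

```python
from typing import Dict, Optional, Tuple


class EgressDemux:
    """Illustrative egress-PE lookup using per-ingress-PE label spaces."""

    def __init__(self) -> None:
        # P-multicast tree identifier -> label space (keyed here by root PE).
        self._space_of_tree: Dict[str, str] = {}
        # (label space, upstream-assigned label) -> local VRF name.
        self._vrf_of_label: Dict[Tuple[str, int], str] = {}

    def bind_tree(self, tree_id: str, ingress_pe: str) -> None:
        """Record which ingress PE's label space a tree maps to."""
        self._space_of_tree[tree_id] = ingress_pe

    def install(self, ingress_pe: str, label: int, vrf: str) -> None:
        """Install a forwarding entry for a label assigned by ingress_pe."""
        self._vrf_of_label[(ingress_pe, label)] = vrf

    def lookup(self, tree_id: str, label: int) -> Optional[str]:
        """Step 1: tree -> label space. Step 2: label -> VRF in that space."""
        space = self._space_of_tree.get(tree_id)
        if space is None:
            return None  # unknown tree: no context for the label
        return self._vrf_of_label.get((space, label))
```

   Because each label is interpreted only within the space selected by
   the tree, two ingress PEs can assign the same label value to
   different MVPNs without conflict.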

   A full specification of the procedures to support aggregation on
   shared trees or on MP2MP LSPs is outside the scope of this document.

   The encapsulation format is either MPLS or MPLS-in-something (e.g.,
   MPLS-in-GRE [MPLS-IP]).  When MPLS is used, this label will appear
   immediately below the label that identifies the P-multicast tree.
   When MPLS-in-GRE is used, this label will be the top MPLS label that
   appears when the GRE header is stripped off.

   When IP encapsulation is used for the P-multicast tree, whatever
   information that particular encapsulation format uses for identifying
   a particular tunnel is used to determine the label space in which the
   MPLS label is looked up.

   If the P-multicast tree uses MPLS encapsulation, the P-multicast tree
   is itself identified by an MPLS label.  The egress PE MUST NOT
   advertise IMPLICIT NULL or EXPLICIT NULL for that tree.  Once the
   label representing the tree is popped off the MPLS label stack, the
   next label is the demultiplexing information that allows the proper
   MVPN to be determined.

   This specification requires that, to support this sort of
   aggregation, there be at least one upstream-assigned label per MVPN.
   It does not require that there be only one.  For example, an ingress
   PE could assign a unique label to each (C-S,C-G).  (This could be
   done using the same technique that is used to assign a particular
   (C-S,C-G) to an S-PMSI, see Section 7.4.)

   When an egress PE receives a C-multicast data packet over a
   P-multicast tree, it needs to forward the packet to the CEs that have
   receivers in the packet's C-multicast group.  In order to do this,
   the egress PE needs to determine the P-tunnel on which the packet was
   received.  The PE can then determine the MVPN that the packet belongs
   to and, if needed, do any further lookups that are needed to forward
   the packet.

6.4.  Considerations for Specific Tunnel Technologies

   While it is believed that the architecture specified in this document
   places no limitations on the protocols used for setting up and
   maintaining P-tunnels, the only protocols that have been explicitly
   considered are PIM-SM (both the SSM and ASM service models are
   considered, as are bidirectional trees), RSVP-TE, mLDP, and BGP.
   (BGP's role in the setup and maintenance of P-tunnels is to "stitch"
   together the intra-AS segments of a segmented inter-AS P-tunnel.)

6.4.1.  RSVP-TE P2MP LSPs

   If an I-PMSI is to be instantiated as one or more non-segmented
   P-tunnels, where the P-tunnels are RSVP-TE P2MP LSPs, then only the
   PEs that are at the head ends of those LSPs will ever include the
   PMSI Tunnel attribute in their Intra-AS I-PMSI A-D routes.  (These
   will be the PEs in the "Sender Sites set".)

   If an I-PMSI is to be instantiated as one or more segmented
   P-tunnels, where some of the intra-AS segments of these tunnels are
   RSVP-TE P2MP LSPs, then only a PE or ASBR that is at the head end of
   one of these LSPs will ever include the PMSI Tunnel attribute in its
   Inter-AS I-PMSI A-D route.

   Other PEs send Intra-AS I-PMSI A-D routes without PMSI Tunnel
   attributes.  (These will be the PEs that are in the "Receiver Sites
   set" but not in the "Sender Sites set".)  As each "Sender Site" PE
   receives an Intra-AS I-PMSI A-D route from a PE in the Receiver Sites
   set, it adds the PE originating that Intra-AS I-PMSI A-D route to the
   set of receiving PEs for the P2MP LSP.  The PE at the head end MUST
   then use RSVP-TE [RSVP-P2MP] signaling to add the receiver PEs to the
   P-tunnel.
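
   The head-end behavior described above could be modeled roughly as
   follows.  The class P2mpHeadEnd and its method are hypothetical; the
   sketch only tracks the leaf set and reports when new RSVP-TE
   signaling would be needed, and it does not model route withdrawal or
   the signaling itself.

```python
from typing import Set


class P2mpHeadEnd:
    """Illustrative leaf tracking at the head end of an RSVP-TE P2MP LSP."""

    def __init__(self, local_pe: str) -> None:
        self.local_pe = local_pe
        self.leaves: Set[str] = set()

    def on_intra_as_ipmsi_ad_route(self, originator: str) -> bool:
        """Add the route's originator as a leaf.

        Returns True when the originator is a new leaf, i.e., when the
        head end would need to signal it into the P2MP LSP.
        """
        if originator == self.local_pe or originator in self.leaves:
            return False
        self.leaves.add(originator)
        return True
```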

   When RSVP-TE P2MP LSPs are used to instantiate S-PMSIs, and a
   particular C-flow is to be bound to the LSP, it is necessary to use
   explicit tracking so that the head end of the LSP knows which PEs
   need to receive data from the specified C-flow.  If the binding is
   done using S-PMSI A-D routes (see Section 7.4.1), the "Leaf
   Information Required" bit MUST be set in the PMSI Tunnel attribute.

   RSVP-TE P2MP LSPs can optionally support aggregation of multiple
   MVPNs.

   If an RSVP-TE P2MP LSP Tunnel is used for only a single MVPN, the
   mapping between the LSP and the MVPN can either be configured or be
   deduced from the procedures used to announce the LSP (e.g., from the
   RTs in the A-D route that announced the LSP).  If the LSP is used for
   multiple MVPNs, the set of MVPNs using it (and the corresponding MPLS
   labels) is inferred from the PMSI Tunnel attributes that specify the
   LSP.

   If an RSVP-TE P2MP LSP is being used to carry a set of C-flows
   traveling along a bidirectional C-tree, using the procedures of
   Section 11.2, the head end MUST include the PE Distinguisher Labels
   attribute in its Intra-AS I-PMSI A-D route or S-PMSI A-D route, and
   it MUST provide an upstream-assigned label for each PE that it has
   selected as the Upstream PE for the C-tree's RPA (Rendezvous Point
   Address).  See Section 11.2 for details.

   A PMSI Tunnel attribute specifying an RSVP-TE P2MP LSP contains the
   following information:

     - The type of the tunnel is set to RSVP-TE P2MP Tunnel.

     - The RSVP-TE P2MP Tunnel's SESSION Object.

     - Optionally, the RSVP-TE P2MP LSP's SENDER_TEMPLATE Object.  This
       object is included when it is desired to identify a particular
       P2MP TE LSP.

   Demultiplexing the C-multicast data packets at the egress PE follows
   procedures described in Section 6.3.3.  As specified in Section
   6.3.3, an egress PE MUST NOT advertise IMPLICIT NULL or EXPLICIT NULL
   for an RSVP-TE P2MP LSP that is carrying traffic for one or more
   MVPNs.

   If (and only if) a particular RSVP-TE P2MP LSP is possibly carrying
   data from multiple MVPNs, the following special procedures apply:

     - A packet in a particular MVPN, when transmitted into the LSP,
       must carry the MPLS label specified in the PMSI Tunnel attribute
       that announced that LSP as a P-tunnel for that MVPN.

     - Demultiplexing the C-multicast data packets at the egress PE is
       done by means of the MPLS label that rises to the top of the
       stack after the label corresponding to the P2MP LSP is popped
       off.

   It is possible that at the time a PE learns, via an A-D route with a
   PMSI Tunnel attribute, that it needs to receive traffic on a
   particular RSVP-TE P2MP LSP, the signaling to set up the LSP will not
   have been completed.  In this case, the PE needs to wait for the
   RSVP-TE signaling to take place before it can modify its forwarding
   tables as directed by the A-D route.

   It is also possible that the signaling to set up an RSVP-TE P2MP LSP
   will be completed before a given PE learns, via a PMSI Tunnel
   attribute, of the use to which that LSP will be put.  The PE MUST
   discard any traffic received on that LSP until that time.

   In order for the egress PE to be able to discard such traffic, it
   needs to know that the LSP is associated with an MVPN and that the
   A-D route that binds the LSP to an MVPN or to a particular C-flow
   has not yet been received.  This is provided by extending [RSVP-P2MP]
   with [RSVP-OOB].
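
   The two orderings discussed above (A-D route first, or RSVP-TE
   signaling first) can be summarized by a small state sketch.  The
   class P2mpLspAtEgress and the string results are assumptions made
   for illustration; they are not defined by [RSVP-P2MP] or [RSVP-OOB].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class P2mpLspAtEgress:
    """Illustrative egress-PE view of one RSVP-TE P2MP LSP."""

    signaling_complete: bool = False   # RSVP-TE setup has finished
    bound_mvpn: Optional[str] = None   # learned from an A-D route, if any

    def disposition(self) -> str:
        """What the egress PE can do with this LSP in its current state."""
        if not self.signaling_complete:
            # A-D route may be known, but forwarding state must wait.
            return "wait-for-signaling"
        if self.bound_mvpn is None:
            # LSP is up, but its use is not yet known: discard traffic.
            return "discard"
        return "forward"
```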

6.4.2.  PIM Trees

   When the P-tunnels are PIM trees, the PMSI Tunnel attribute contains
   enough information to allow each other PE in the same MVPN to use
   P-PIM signaling to join the P-tunnel.

   If an I-PMSI is to be instantiated as one or more PIM trees, then the
   PE that is at the root of a given PIM tree sends an Intra-AS I-PMSI
   A-D route containing a PMSI Tunnel attribute that contains all the
   information needed for other PEs to join the tree.

   If PIM trees are to be used to instantiate an MI-PMSI, each PE in the
   MVPN must send an Intra-AS I-PMSI A-D route containing such a PMSI
   Tunnel attribute.

   If a PMSI is to be instantiated via a shared tree, the PMSI Tunnel
   attribute identifies the P-group address.  The RP or RPA
   corresponding to the P-group address is not specified.  It must, of
   course, be known to all the PEs.  It is presupposed that the PEs use
   one of the methods for automatically learning the RP-to-group
   correspondences (e.g., Bootstrap Router Protocol [BSR]), or else that
   the correspondence is configured.

   If a PMSI is to be instantiated via a source-specific tree, the PMSI
   Tunnel attribute identifies the PE router that is the root of the
   tree, as well as a P-group address.  The PMSI Tunnel attribute always
   specifies whether the PIM tree is to be a unidirectional shared tree,
   a bidirectional shared tree, or a source-specific tree.

   If PIM trees are being used to instantiate S-PMSIs, the above
   procedures assume that each PE router has a set of group P-addresses
   that it can use for setting up the PIM-trees.  Each PE must be
   configured with this set of P-addresses.  If the P-tunnels are
   source-specific trees, then the PEs may be configured with
   overlapping sets of group P-addresses.  If the trees are not source-
   specific, then each PE must be configured with a unique set of group
   P-addresses (i.e., having no overlap with the set configured at any
   other PE router).  The management of this set of addresses is thus
   greatly simplified when source-specific trees are used, so the use of
   source-specific trees is strongly recommended whenever unidirectional
   trees are desired.
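
   The configuration constraint above lends itself to a simple
   consistency check.  The following is a sketch under assumed inputs:
   a mapping from PE name to its configured set of group P-addresses.
   The function names and the example addresses are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def find_overlaps(groups_by_pe: Dict[str, Set[str]]
                  ) -> List[Tuple[str, str, Set[str]]]:
    """Return (pe_a, pe_b, shared_groups) for every pair of PEs whose
    configured group P-address sets intersect."""
    overlaps = []
    items = sorted(groups_by_pe.items(), key=lambda kv: kv[0])
    for (pe_a, groups_a), (pe_b, groups_b) in combinations(items, 2):
        shared = groups_a & groups_b
        if shared:
            overlaps.append((pe_a, pe_b, shared))
    return overlaps


def config_is_valid(groups_by_pe: Dict[str, Set[str]],
                    source_specific: bool) -> bool:
    """Overlap is acceptable for source-specific trees, since the root
    address disambiguates; otherwise the sets must be disjoint."""
    return source_specific or not find_overlaps(groups_by_pe)
```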

   Specification of the full set of procedures for using bidirectional
   PIM trees to instantiate S-PMSIs is outside the scope of this
   document.

   Details for constructing the PMSI Tunnel attribute identifying a PIM
   tree can be found in [MVPN-BGP].

6.4.3.  mLDP P2MP LSPs

   When the P-tunnels are mLDP P2MP trees, each Intra-AS I-PMSI A-D
   route has a PMSI Tunnel attribute containing enough information to
   allow each other PE in the same MVPN to use mLDP signaling to join
   the P-tunnel.  The tunnel identifier consists of a P2MP Forwarding
   Equivalence Class (FEC) Element [mLDP].

   An mLDP P2MP LSP may be used to carry the traffic of multiple VPNs,
   if the PMSI Tunnel attribute specifying it contains a non-zero MPLS
   label.

   If an mLDP P2MP LSP is being used to carry the set of flows traveling
   along a particular bidirectional C-tree, using the procedures of
   Section 11.2, the root of the LSP MUST include the PE Distinguisher
   Labels attribute in its Intra-AS I-PMSI A-D route or S-PMSI A-D
   route, and it MUST provide an upstream-assigned label for the PE that
   it has selected to be the Upstream PE for the C-tree's RPA.  See
   Section 11.2 for details.

6.4.4.  mLDP MP2MP LSPs

   The specification of the procedures for assigning C-flows to mLDP
   MP2MP LSPs that serve as P-tunnels is outside the scope of this
   document.

6.4.5.  Ingress Replication

   As described in Section 3, a PMSI can be instantiated using Unicast
   Tunnels between the PEs that are participating in the MVPN.  In this
   mechanism, the ingress PE replicates a C-multicast data packet
   belonging to a particular MVPN and sends a copy to all or a subset of
   the PEs that belong to the MVPN.  A copy of the packet is tunneled to
   a remote PE over a Unicast Tunnel to the remote PE.  IP/GRE Tunnels
   or MPLS LSPs are examples of unicast tunnels that may be used.  The
   same Unicast Tunnel can be used to transport packets belonging to
   different MVPNs.

   In order for a PE to use Unicast P-tunnels to send a C-multicast data
   packet for a particular MVPN to a set of remote PEs, the remote PEs
   must be able to correctly decapsulate such packets and to assign each
   one to the proper MVPN.  This requires that the encapsulation used
   for sending packets through the P-tunnel have demultiplexing
   information that the receiver can associate with a particular MVPN.

   If ingress replication is being used to instantiate the PMSIs for an
   MVPN, the PEs announce this as part of the BGP-based MVPN membership
   auto-discovery process, described in Section 4.  The PMSI Tunnel
   attribute specifies ingress replication; it also specifies a
   downstream-assigned MPLS label.  This label will be used to identify
   that a particular packet belongs to the MVPN that the Intra-AS I-PMSI
   A-D route belongs to (as inferred from its RTs).  If PE1 specifies a
   particular label value for a particular MVPN, then any other PE
   sending PE1 a packet for that MVPN through a unicast P-tunnel must
   put that label on the packet's label stack.  PE1 then treats that
   label as the demultiplexor value identifying the MVPN in question.
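
   A minimal sketch of the label handling described above follows.  It
   assumes the ingress PE has already collected, per MVPN, the
   downstream-assigned label advertised by each remote PE; the function
   name, its parameters, and the tuple representation of a tunneled
   copy are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def ingress_replicate(payload: bytes,
                      receivers: Iterable[str],
                      label_from: Dict[str, int]
                      ) -> List[Tuple[str, int, bytes]]:
    """Produce one (egress PE, MVPN label, payload) copy per receiver.

    label_from maps each remote PE to the label that PE advertised for
    this MVPN. Each copy carries the label chosen by its own receiver,
    because the labels are downstream-assigned.
    """
    copies = []
    for pe in receivers:
        if pe not in label_from:
            raise KeyError(f"{pe} has not advertised a label for this MVPN")
        copies.append((pe, label_from[pe], payload))
    return copies
```

   Note that, unlike the upstream-assigned case of Section 6.3.3, the
   label placed on each copy may differ from one receiver to the next.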

   Ingress replication may be used to instantiate any kind of PMSI.
   When ingress replication is done, it is RECOMMENDED, except in the
   one particular case mentioned in the next paragraph, that explicit
   tracking be done and that the data packets of a particular C-flow
   only get sent to those PEs that need to see the packets of that
   C-flow.  There is never any need to use the procedures of Section 7.4
   for binding particular C-flows to particular P-tunnels.

   The particular case in which there is no need for explicit tracking
   is the case where ingress replication is being used to create a
   one-hop ASBR-ASBR inter-AS segment of a segmented inter-AS P-tunnel.

   Section 9.1 specifies three different methods that can be used to
   prevent duplication of multicast data packets.  Any given deployment
   must use at least one of those methods.  Note that the method
   described in Section 9.1.1 ("Discarding Packets from Wrong PE")
   presupposes that the egress PE of a P-tunnel can, upon receiving a
   packet from the P-tunnel, determine the identity of the PE that
   transmitted the packet into the P-tunnel.  SPs that use ingress
   replication to instantiate their PMSIs are cautioned against using,
   for this purpose, unicast P-tunnel technologies that do not allow
   the egress PE to identify the ingress PE (e.g., MP2P LSPs for which
   penultimate-hop-popping is done).  Deployment of ingress replication
   with such P-tunnel technology MUST NOT be done unless it is known
   that the deployment relies entirely on the procedures of Sections
   9.1.2 or 9.1.3 for duplicate prevention.

7.  Binding Specific C-Flows to Specific P-Tunnels

   As discussed previously, Intra-AS I-PMSI A-D routes may (or may not)
   have PMSI Tunnel attributes, identifying P-tunnels that can be used
   as the default P-tunnels for carrying C-multicast traffic, i.e., for
   carrying C-multicast traffic that has not been specifically bound to
   another P-tunnel.

   If none of the Intra-AS I-PMSI A-D routes originated by a particular
   PE for a particular MVPN carry PMSI Tunnel attributes at all (or if
   the only PMSI Tunnel attributes they carry have type "No tunnel
   information present"), then there are no default P-tunnels for that
   PE to use when transmitting C-multicast traffic in that MVPN to other
   PEs.  In that case, all such C-flows must be assigned to specific
   P-tunnels using one of the mechanisms specified in Section 7.4.  That
   is, all such C-flows are carried on P-tunnels that instantiate
   S-PMSIs.
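
   The relationship between default P-tunnels and specifically bound
   C-flows might be pictured as a two-level lookup.  This is only a
   sketch: the function p_tunnel_for, the Flow alias, and the string
   tunnel names are assumptions, and real bindings are established by
   the mechanisms of Section 7.4.

```python
from typing import Dict, Optional, Tuple

Flow = Tuple[str, str]  # (C-S, C-G)


def p_tunnel_for(flow: Flow,
                 specific_bindings: Dict[Flow, str],
                 default_tunnel: Optional[str] = None) -> str:
    """Prefer a specific binding; fall back to the default P-tunnel.

    With no default P-tunnel, every C-flow must have a specific binding,
    so a missing binding is reported as an error.
    """
    tunnel = specific_bindings.get(flow, default_tunnel)
    if tunnel is None:
        raise LookupError(f"no P-tunnel for {flow}: a specific binding is required")
    return tunnel
```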

   There are other cases where it may be either necessary or desirable
   to use the mechanisms of Section 7.4 to identify specific C-flows and
   bind them to or unbind them from specific P-tunnels.  Some possible
   cases are as follows:

     - The policy for a particular MVPN is to send all C-data on
       S-PMSIs, even if the Intra-AS I-PMSI A-D routes carry PMSI Tunnel
       attributes.  (This is another case where all C-data is carried on
       S-PMSIs; presumably, the I-PMSIs are used for control
       information.)

     - It is desired to optimize the routing of the particular C-flow,
       which may already be traveling on an I-PMSI, by sending it
       instead on an S-PMSI.

     - If a particular C-flow is traveling on an S-PMSI, it may be
       considered desirable to move it to an I-PMSI (i.e., optimization
       of the routing for that flow may no longer be considered
       desirable).

     - It is desired to change the encapsulation used to carry the
       C-flow, e.g., because one now wants to aggregate it on a P-tunnel
       with flows from other MVPNs.

   Note that if Full PIM Peering over an MI-PMSI (Section 5.2) is being
   used, then from the perspective of the PIM state machine, the
   "interface" connecting the PEs to each other is the MI-PMSI, even if
   some or all of the C-flows are being sent on S-PMSIs.  That is, from
   the perspective of the C-PIM state machine, when a C-flow is being
   sent or received on an S-PMSI, the output or input interface
   (respectively) is considered to be the MI-PMSI.

   Section 7.1 discusses certain general considerations that apply
   whenever a specified C-flow is bound to a specified P-tunnel using
   the mechanisms of Section 7.4.  This includes the case where the
   C-flow is moved from one P-tunnel to another as well as the case
   where the C-flow is initially bound to an S-PMSI P-tunnel.

   Section 7.2 discusses the specific case of using the mechanisms of
   Section 7.4 as a way of optimizing multicast routing by switching
   specific flows from one P-tunnel to another.

   Section 7.3 discusses the case where the mechanisms of Section 7.4
   are used to announce the presence of "unsolicited flooded data" and
   to assign such data to a particular P-tunnel.

   Section 7.4 specifies the protocols for assigning specific C-flows to
   specific P-tunnels.  These protocols may be used to assign a C-flow
   to a P-tunnel initially or to switch a flow from one P-tunnel to
   another.

   Procedures for binding to a specified P-tunnel the set of C-flows
   traveling along a specified C-tree (or for so binding a set of
   C-flows that share some relevant characteristic), without identifying
   each flow individually, are outside the scope of this document.

7.1.  General Considerations

7.1.1.  At the PE Transmitting the C-Flow on the P-Tunnel

   The decision to bind a particular C-flow (designated as (C-S,C-G)) to
   a particular P-tunnel, or to switch a particular C-flow to a
   particular P-tunnel, is always made by the PE that is to transmit the
   C-flow onto the P-tunnel.

   Whenever a PE moves a particular C-flow from one P-tunnel, say P1, to
   another, say P2, care must be taken to ensure that there is no
   steady-state duplication of traffic.  At any given time, the PE transmits
   the C-flow either on P1 or on P2, but not on both.

   When a particular PE, say PE1, decides to bind a particular C-flow to
   a particular P-tunnel, say P2, the following procedures MUST be
   applied:

     - PE1 must issue the required control plane information to signal
       that the specified C-flow is now bound to P-tunnel P2 (see
       Section 7.4).

     - If P-tunnel P2 needs to be constructed from the root downwards,
       PE1 must initiate the signaling to construct P2.  This is only
       required if P2 is an RSVP-TE P2MP LSP.

     - If the specified C-flow is currently bound to a different
       P-tunnel, say P1, then:

         * PE1 MUST wait for a "switch-over" delay before sending
           traffic of the C-flow on P-tunnel P2.  It is RECOMMENDED to
           allow this delay to be configurable.

         * Once the "switch-over" delay has elapsed, PE1 MUST send
           traffic for the C-flow on P2 and MUST NOT send it on P1.  In
           no case is any C-flow packet sent on both P-tunnels.

   When a C-flow is switched from one P-tunnel to another, the purpose
   of running a switch-over timer is to minimize packet loss without
   introducing packet duplication.  However, jitter may be introduced
   due to the difference in transit delays between the old and new
   P-tunnels.

   For best effect, the switch-over timer should be configured to a
   value that is "just long enough" (a) to allow all the PEs to learn
   about the new binding of C-flow to P-tunnel and (b) to allow the PEs
   to construct the P-tunnel, if it doesn't already exist.
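
   The switch-over behavior at the transmitting PE can be sketched as a
   small state holder.  This is an illustrative model only: the class
   FlowBinding, its method names, and the use of caller-supplied
   timestamps are assumptions, and the control-plane announcement of
   the new binding is not modeled.

```python
from typing import Optional


class FlowBinding:
    """Illustrative per-C-flow binding with a configurable switch-over delay."""

    def __init__(self, switchover_delay: float,
                 tunnel: Optional[str] = None) -> None:
        self.switchover_delay = switchover_delay
        self.active: Optional[str] = tunnel    # tunnel currently carrying the flow
        self.pending: Optional[str] = None     # tunnel announced but not yet used
        self.switch_at: Optional[float] = None

    def bind(self, new_tunnel: str, now: float) -> None:
        """Bind the flow to new_tunnel (the announcement is assumed sent here)."""
        if self.active is None:
            # Initial binding: no old tunnel, so no delay is needed.
            self.active = new_tunnel
            return
        self.pending = new_tunnel
        self.switch_at = now + self.switchover_delay

    def tunnel_for_packet(self, now: float) -> Optional[str]:
        """Return the single tunnel on which a packet sent at 'now' goes."""
        if self.pending is not None and now >= self.switch_at:
            self.active, self.pending, self.switch_at = self.pending, None, None
        return self.active
```

   Because tunnel_for_packet returns exactly one tunnel, no packet is
   ever sent on both the old and the new P-tunnel in this sketch.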

   If, after such a switch, the "old" P-tunnel P1 is no longer needed,
   it SHOULD be torn down and the resources supporting it freed.  The
   procedures for "tearing down" a P-tunnel are specific to the P-tunnel
   technology.

   Procedures for binding sets of C-flows traveling along specified
   C-trees (or sets of C-flows sharing any other characteristic) to a
   specified P-tunnel (or for moving them from one P-tunnel to another)
   are outside the scope of this document.

7.1.2.  At the PE Receiving the C-flow from the P-Tunnel

   Suppose that a particular PE, say PE1, learns, via the procedures of
   Section 7.4, that some other PE, say PE2, has bound a particular
   C-flow, designated as (C-S,C-G), to a particular P-tunnel, say P2.
   Then, PE1 must determine whether it needs to receive (C-S,C-G)
   traffic from PE2.

   If BGP is being used to distribute C-multicast routing information
   from PE to PE, the conditions under which PE1 needs to receive
   (C-S,C-G) traffic from PE2 are specified in Section 12.3 of
   [MVPN-BGP].

   If PIM over an MI-PMSI is being used to distribute C-multicast
   routing from PE to PE, PE1 needs to receive (C-S,C-G) traffic from
   PE2 if one or more of the following conditions holds:

     - PE1 has (C-S,C-G) state such that PE2 is PE1's Upstream PE for
       (C-S,C-G), and PE1 has downstream neighbors ("non-null olist")
       for the (C-S,C-G) state.

     - PE1 has (C-*,C-G) state with an Upstream PE (not necessarily PE2)
       and with downstream neighbors ("non-null olist"), but PE1 does
       not have (C-S,C-G) state.

     - Native PIM methods are being used to prevent steady-state packet
       duplication, and PE1 has either (C-*,C-G) or (C-S,C-G) state such
       that the MI-PMSI is one of the downstream interfaces.  Note that
       this includes the case where PE1 is itself sending (C-S,C-G)
       traffic on an S-PMSI.  (In this case, PE1 needs to receive the
       (C-S,C-G) traffic from PE2 in order to allow the PIM Assert
       mechanism to function properly.)
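
   The three conditions above can be read as a single predicate.  The
   following is a minimal sketch under stated assumptions: the CState
   record and the function name are hypothetical, and a real PIM
   implementation holds considerably more state than is modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CState:
    """Hypothetical summary of one (C-S,C-G) or (C-*,C-G) state entry."""
    upstream_pe: Optional[str]      # selected Upstream PE, if any
    has_downstream: bool            # True if the olist is non-null
    mi_pmsi_in_olist: bool = False  # MI-PMSI is a downstream interface


def needs_traffic_from(pe2: str,
                       sg: Optional[CState],
                       star_g: Optional[CState],
                       native_pim_dedup: bool) -> bool:
    """Return True if PE1 needs (C-S,C-G) traffic from pe2 (PIM over MI-PMSI)."""
    # Condition 1: (C-S,C-G) state with pe2 as Upstream PE and a non-null olist.
    cond1 = sg is not None and sg.upstream_pe == pe2 and sg.has_downstream
    # Condition 2: only (C-*,C-G) state, any Upstream PE, non-null olist.
    cond2 = (sg is None and star_g is not None
             and star_g.upstream_pe is not None and star_g.has_downstream)
    # Condition 3: native PIM duplicate prevention, MI-PMSI is downstream.
    cond3 = native_pim_dedup and any(
        s is not None and s.mi_pmsi_in_olist for s in (sg, star_g))
    return cond1 or cond2 or cond3
```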

   Irrespective of whether BGP or PIM is being used to distribute
   C-multicast routing information, once PE1 determines that it needs to
   receive (C-S,C-G) traffic from PE2, the following procedures MUST be
   applied:

     - PE1 MUST take all necessary steps to be able to receive the
       (C-S,C-G) traffic on P2.

         * If P2 is a PIM tunnel or an mLDP LSP, PE1 will need to use
           PIM or mLDP (respectively) to join P2 (unless it is already
           joined to P2).

         * PE1 may need to modify the forwarding state for (C-S,C-G) to
           indicate that (C-S,C-G) traffic is to be accepted on P2.  If
           P2 is an Aggregate Tree, this also implies setting up the
           demultiplexing forwarding entries based on the inner label as
           described in Section 6.3.3.

     - If PE1 was previously receiving the (C-S,C-G) C-flow on another
       P-tunnel, say P1, then:

         * PE1 MAY run a switch-over timer, and until it expires, SHOULD
           accept traffic for the given C-flow on both P1 and P2.

         * If, after such a switch, the "old" P-tunnel P1 is no longer
           needed, it SHOULD be torn down and the resources supporting
           it freed.  The procedures for "tearing down" a P-tunnel are
           specific to the P-tunnel technology.

     - If PE1 later determines that it no longer needs to receive any of
       the C-multicast data that is being sent on a particular P-tunnel,
       it may initiate signaling (specific to the P-tunnel technology)
       to remove itself from that tunnel.

7.2.  Optimizing Multicast Distribution via S-PMSIs

   Whenever a particular multicast stream is being sent on an I-PMSI, it
   is likely that the data of that stream is being sent to PEs that do
   not require it.  If a particular stream has a significant amount of
   traffic, it may be beneficial to move it to an S-PMSI that includes
   only those PEs that are transmitters and/or receivers (or at least
   includes fewer PEs that are neither).

   If explicit tracking is being done, S-PMSI creation can also be
   triggered on other criteria.  For instance, there could be a "pseudo-
   wasted bandwidth" criterion: switching to an S-PMSI would be done if
   the bandwidth multiplied by the number of uninterested PEs (PEs that
   are receiving the stream but have no receivers) is above a specified
   threshold.  The motivation is that (a) the total bandwidth wasted by
   many sparsely subscribed low-bandwidth groups may be large and (b)
   there is no point in moving a high-bandwidth group to an S-PMSI if all
   the PEs have receivers for it.
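
   The "pseudo-wasted bandwidth" criterion amounts to a one-line
   comparison.  The sketch below is illustrative only; the function
   name, the parameter names, and the choice of bits per second as the
   unit are assumptions, not part of this specification.

```python
def should_switch_to_s_pmsi(bandwidth_bps: float,
                            receiving_pes: set,
                            interested_pes: set,
                            threshold_bps: float) -> bool:
    """Apply the pseudo-wasted-bandwidth test for one multicast stream.

    receiving_pes: PEs to which the stream is currently delivered.
    interested_pes: PEs known (via explicit tracking) to have receivers.
    """
    uninterested = receiving_pes - interested_pes
    # Wasted bandwidth = stream bandwidth times number of uninterested PEs.
    return bandwidth_bps * len(uninterested) > threshold_bps
```

   With this test, a high-bandwidth stream that every PE wants yields
   zero wasted bandwidth and stays where it is, while a low-bandwidth
   stream delivered to many uninterested PEs can still cross the
   threshold.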

   Switching a (C-S,C-G) stream to an S-PMSI may require the root of the
   S-PMSI to determine the egress PEs that need to receive the (C-S,C-G)
   traffic.  This is true in the following cases:

     - If the P-tunnel is a source-initiated tree, such as an RSVP-TE
       P2MP Tunnel, the PE needs to know the leaves of the tree before
       it can instantiate the S-PMSI.

     - If a PE instantiates multiple S-PMSIs, belonging to different
       MVPNs, using one P-multicast tree, such a tree is termed an
       Aggregate Tree with a selective mapping.  The setting up of such
       an Aggregate Tree requires the ingress PE to know all the other
       PEs that have receivers for multicast groups that are mapped onto
       the tree.

   The above two cases require that explicit tracking be done for the
   (C-S,C-G) stream.  The root of the S-PMSI MAY decide to do explicit
   tracking of this stream only after it has determined to move the
   stream to an S-PMSI, or it MAY have been doing explicit tracking all
   along.

   If the S-PMSI is instantiated by a P-multicast tree, the PE at the
   root of the tree must signal the leaves of the tree that the
   (C-S,C-G) stream is now bound to the S-PMSI.  Note that the PE could
   create the identity of the P-multicast tree prior to the actual
   instantiation of the P-tunnel.

   If the S-PMSI is instantiated by a source-initiated P-multicast tree
   (e.g., an RSVP-TE P2MP tunnel), the PE at the root of the tree must
   establish the source-initiated P-multicast tree to the leaves.  This
   tree MAY have been established before the leaves receive the S-PMSI
   binding, or it MAY be established after the leaves receive the
   binding.  The leaves MUST NOT switch to the S-PMSI until they receive
   both the binding and the tree signaling message.
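
   The ordering rule for leaves can be sketched as a two-flag gate.
   This is an illustration under assumed names (LeafSwitchState and its
   methods are hypothetical); either event may arrive first.

```python
from dataclasses import dataclass


@dataclass
class LeafSwitchState:
    """Hypothetical per-flow state at a leaf of a source-initiated tree."""
    binding_received: bool = False  # S-PMSI binding has been received
    tree_signaled: bool = False     # tree signaling message has been received

    def on_binding(self) -> None:
        self.binding_received = True

    def on_tree_signaling(self) -> None:
        self.tree_signaled = True

    def may_switch(self) -> bool:
        """The leaf may switch to the S-PMSI only once both have arrived."""
        return self.binding_received and self.tree_signaled
```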

7.3.  Announcing the Presence of Unsolicited Flooded Data

   A PE may receive "unsolicited" data from a CE, where the data is
   intended to be flooded to the other PEs of the same MVPN and then on
   to other CEs.  By "unsolicited", we mean that the data is to be
   delivered to all the other PEs of the MVPN, even though those PEs may
   not have sent any control information indicating that they need to
   receive that data.

   For example, if the BSR [BSR] is being used within the MVPN, BSR
   control messages may be received by a PE from a CE.  These need to be
   forwarded to other PEs, even though no PE ever issues any kind of
   explicit signal saying that it wants to receive BSR messages.

   If a PE receives a BSR message from a CE, and if the CE's MVPN has an
   MI-PMSI, then the PE can just send BSR messages on the appropriate
   P-tunnel.  Otherwise, the PE MUST announce the binding of a
   particular C-flow to a particular P-tunnel, using the procedures of
   Section 7.4.  The particular C-flow in this case would be
   (C-IPaddress_of_PE, ALL-PIM-ROUTERS).  The P-tunnel identified by the
   procedures of Section 7.4 may or may not be one that was previously
   identified in the PMSI Tunnel attribute of an I-PMSI A-D route.
   Further procedures for handling BSR may be found in Sections 5.2.1
   and 5.3.4.

   Analogous procedures may be used for announcing the presence of other
   sorts of unsolicited flooded data, e.g., dense mode data or data from
   proprietary protocols that presume messages can be flooded.  However,
   a full specification of the procedures for traffic other than BSR
   traffic is outside the scope of this document.

7.4.  Protocols for Binding C-Flows to P-Tunnels

   We describe two protocols for binding C-flows to P-tunnels.

   These protocols can be used for moving C-flows from I-PMSIs to
   S-PMSIs, as long as the S-PMSI is instantiated by a P-multicast tree.
   (If the S-PMSI is instantiated by means of ingress replication, the
   procedures of Section 6.4.5 suffice.)

   These protocols can also be used for other cases in which it is
   necessary to bind specific C-flows to specific P-tunnels.

7.4.1.  Using BGP S-PMSI A-D Routes

   Notwithstanding the name of the mechanism "S-PMSI A-D routes", the
   mechanism to be specified in this section may be used any time it is
   necessary to advertise a binding of a C-flow to a particular
   P-tunnel.

7.4.1.1.  Advertising C-Flow Binding to P-Tunnel

   The ingress PE informs all the PEs that are on the path to receivers
   of the (C-S,C-G) that the (C-S,C-G) has been bound to the P-tunnel.
   The BGP announcement is done by sending an update for the MCAST-VPN
   address family.  An S-PMSI A-D route is used, containing the
   following information:

      1. The IP address of the originating PE.

      2. The RD configured locally for the MVPN.  This is required to
         uniquely identify the (C-S,C-G) as the addresses could overlap
         between different MVPNs.  This is the same RD value used in the
         auto-discovery process.

      3. The C-S address.

      4. The C-G address.

      5. A PE MAY use a single P-tunnel to aggregate two or more
         S-PMSIs.  If the PE already advertised unaggregated S-PMSI A-D
         routes for these S-PMSIs, then a decision to aggregate them
         requires the PE to re-advertise these routes.  The
         re-advertised routes MUST be the same as the original ones, except
         for the PMSI Tunnel attribute.  If the PE has not previously
         advertised S-PMSI A-D routes for these S-PMSIs, then the
         aggregation requires the PE to advertise (new) S-PMSI A-D
         routes for these S-PMSIs.  The PMSI Tunnel attribute in the
         newly advertised/re-advertised routes MUST carry the identity
         of the P-tunnel that aggregates the S-PMSIs.

         If all these aggregated S-PMSIs belong to the same MVPN, and
         this MVPN uses PIM as its C-multicast routing protocol, then
         the corresponding S-PMSI A-D routes MAY carry an MPLS upstream-
         assigned label [MPLS-UPSTREAM-LABEL].  Moreover, in this case,
         the labels MUST be distinct on a per-MVPN basis, and MAY be
         distinct on a per-route basis.

         If all these aggregated S-PMSIs belong to the MVPN(s) that use
         mLDP as their C-multicast routing protocol, then the
         corresponding S-PMSI A-D routes MUST carry an MPLS upstream-
         assigned label [MPLS-UPSTREAM-LABEL], and these labels MUST be
         distinct on a per-route (per-mLDP-FEC) basis, irrespective of
         whether the aggregated S-PMSIs belong to the same or different
         MVPNs.

   When a PE distributes this information via BGP, it must include the
   following:

      1. An identifier for the particular P-tunnel to which the stream
         is to be bound.  This identifier is a structured field that
         includes the following information:

           * The type of tunnel

           * An identifier for the tunnel.  The form of the identifier
             will depend upon the tunnel type.  The combination of
             tunnel identifier and tunnel type should contain enough
             information to enable all the PEs to "join" the tunnel and
             receive messages from it.

      2. Route Target Extended Communities attribute.  This is used as
         described in Section 4.

7.4.1.2.  Explicit Tracking

   If the PE wants to enable explicit tracking for the specified flow,
   it also indicates this in the A-D route it uses to bind the flow to a
   particular P-tunnel.  Then, any PE that receives the A-D route will
   respond with a "Leaf A-D route" in which it identifies itself as a
   receiver of the specified flow.  The Leaf A-D route will be withdrawn
   when the PE is no longer a receiver for the flow.

   If the PE needs to enable explicit tracking for a flow without at the
   same time binding the flow to a specific P-tunnel, it can do so by
   sending an S-PMSI A-D route whose NLRI identifies the flow and whose
   PMSI Tunnel attribute has its tunnel type value set to "no tunnel
   information present" and its "leaf information required" bit set to
   1.  This will elicit the Leaf A-D routes.  This is useful when the PE
   needs to know the receivers before selecting a P-tunnel.
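
   From the point of view of the PE that requested explicit tracking,
   the Leaf A-D routes form a per-flow set of receivers.  The sketch
   below is illustrative only; LeafTracker and its method names are
   hypothetical, and BGP route processing itself is not modeled.

```python
class LeafTracker:
    """Hypothetical per-flow record of PEs that originated Leaf A-D routes."""

    def __init__(self) -> None:
        self._leaves = {}  # maps (C-S, C-G) -> set of leaf PE identifiers

    def on_leaf_ad_route(self, flow, pe: str) -> None:
        """A PE has identified itself as a receiver of the flow."""
        self._leaves.setdefault(flow, set()).add(pe)

    def on_leaf_ad_withdraw(self, flow, pe: str) -> None:
        """The PE is no longer a receiver for the flow."""
        self._leaves.get(flow, set()).discard(pe)

    def leaves(self, flow) -> frozenset:
        """Return the receivers currently known for the flow."""
        return frozenset(self._leaves.get(flow, ()))
```

   A root that must know the leaves before it selects a P-tunnel could
   consult leaves() once the expected Leaf A-D routes have arrived.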

7.4.2.  UDP-Based Protocol

   This procedure carries its control messages in UDP and requires that
   the MVPN have an MI-PMSI that can be used to carry the control
   messages.

7.4.2.1.  Advertising C-Flow Binding to P-Tunnel

   In order for a given PE to move a particular C-flow to a particular
   P-tunnel, an "S-PMSI Join message" is sent periodically on the
   MI-PMSI.  (Notwithstanding the name of the mechanism, the mechanism
   may be used to bind a flow to any P-tunnel.)  The S-PMSI Join message
   is a UDP-encapsulated message whose destination address is ALL-PIM-
   ROUTERS (224.0.0.13) and whose destination port is 3232.

   The S-PMSI Join message contains the following information:

     - An identifier for the particular multicast stream that is to be
       bound to the P-tunnel.  This can be represented as an (S,G) pair.

     - An identifier for the particular P-tunnel to which the stream is
       to be bound.  This identifier is a structured field that includes
       the following information:

         * The type of tunnel used to instantiate the S-PMSI.

         * An identifier for the tunnel.  The form of the identifier
           will depend upon the tunnel type.  The combination of tunnel
           identifier and tunnel type should contain enough information
           to enable all the PEs to "join" the tunnel and receive
           messages from it.

         * If (and only if) the identified P-tunnel is aggregating
           several S-PMSIs, any demultiplexing information needed by the
           tunnel encapsulation protocol to identify a particular
           S-PMSI.

   If the policy for the MVPN is that traffic is sent/received by
   default over an MI-PMSI, then traffic for a particular C-flow can be
   switched back to the MI-PMSI simply by ceasing to send S-PMSI Joins
   for that C-flow.

   Note that an S-PMSI Join that is not received over a PMSI (e.g., one
   that is received directly from a CE) is an illegal packet that MUST
   be discarded.

7.4.2.2.  Packet Formats and Constants

   The S-PMSI Join message is encapsulated within UDP and has the
   following type/length/value (TLV) encoding:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     Type      |            Length             |     Value     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                               .                               |
       |                               .                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type (8 bits)

   Length (16 bits): the total number of octets in the Type, Length, and
   Value fields combined

   Value (variable length)
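
   Because the Length field counts the Type and Length fields as well
   as the Value, a parser must subtract the 3-octet header to find the
   Value.  The following is a minimal sketch; the function name is
   hypothetical, and network byte order for the Length field is an
   assumption of the sketch.

```python
import struct

TLV_HEADER_LEN = 3  # 1 octet of Type plus 2 octets of Length


def parse_tlv(data: bytes):
    """Split one TLV into (type, value); Length covers the whole TLV."""
    if len(data) < TLV_HEADER_LEN:
        raise ValueError("truncated TLV header")
    tlv_type, length = struct.unpack_from("!BH", data, 0)
    if length < TLV_HEADER_LEN or length > len(data):
        raise ValueError("invalid TLV length")
    # The Value is everything after the header, up to the stated Length.
    return tlv_type, data[TLV_HEADER_LEN:length]
```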

   In this specification, only one type of S-PMSI Join is defined.  A
   Type 1 S-PMSI Join is used when the S-PMSI tunnel is a PIM tunnel
   that is used to carry a single multicast stream, where the packets of
   that stream have IPv4 source and destination IP addresses.

   The S-PMSI Join format to use when the C-source and C-group are IPv6
   addresses will be defined in a follow-on document.

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     Type      |            Length             |   Reserved    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           C-source                            |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           C-group                             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                           P-group                             |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type (8 bits): 1

   Length (16 bits): 16

   Reserved (8 bits): This field SHOULD be zero when transmitted, and
   MUST be ignored when received.

   C-source (32 bits): the IPv4 address of the traffic source in the
   VPN.

   C-group (32 bits): the IPv4 address of the multicast traffic
   destination address in the VPN.

   P-group (32 bits): the IPv4 group address that the PE router is going
   to use to encapsulate the flow (C-source, C-group).

   The P-group identifies the S-PMSI P-tunnel, and the (C-S,C-G)
   identifies the multicast flow that is carried in the P-tunnel.
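
   The Type 1 layout above is 16 octets with no padding.  The sketch
   below shows one way to build and check such a message; the function
   names are hypothetical, network byte order is assumed, and sending
   the result in UDP is left out.

```python
import ipaddress
import struct

# Type (1), Length (2), Reserved (1), C-source (4), C-group (4), P-group (4)
_TYPE1_FMT = "!BHB4s4s4s"
_TYPE1_LEN = 16


def encode_type1_join(c_source: str, c_group: str, p_group: str) -> bytes:
    """Build a Type 1 S-PMSI Join from dotted-quad IPv4 addresses."""
    return struct.pack(
        _TYPE1_FMT, 1, _TYPE1_LEN, 0,  # Reserved is sent as zero
        ipaddress.IPv4Address(c_source).packed,
        ipaddress.IPv4Address(c_group).packed,
        ipaddress.IPv4Address(p_group).packed)


def decode_type1_join(data: bytes):
    """Return (C-source, C-group, P-group); the Reserved field is ignored."""
    if len(data) != _TYPE1_LEN:
        raise ValueError("Type 1 S-PMSI Join must be 16 octets")
    tlv_type, length, _reserved, s, g, p = struct.unpack(_TYPE1_FMT, data)
    if tlv_type != 1 or length != _TYPE1_LEN:
        raise ValueError("not a Type 1 S-PMSI Join")
    return tuple(str(ipaddress.IPv4Address(x)) for x in (s, g, p))
```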

   The protocol uses the following constants.

   [S-PMSI_DELAY]:

       Once an S-PMSI Join message has been sent, the PE router that is
       to transmit onto the S-PMSI will delay this amount of time before
       it begins using the S-PMSI.  The default value is 3 seconds.

   [S-PMSI_TIMEOUT]:

       If a PE (other than the transmitter) does not receive any packets
       over the S-PMSI P-tunnel for this amount of time, the PE will
       prune itself from the S-PMSI P-tunnel, and will expect (C-S,C-G)
       packets to arrive on an I-PMSI.  The default value is 3 minutes.

       This value must be consistent among PE routers.

   [S-PMSI_HOLDOWN]:

       If the PE that transmits onto the S-PMSI does not see any
       (C-S,C-G) packets for this amount of time, it will resume sending
       (C-S,C-G) packets on an I-PMSI.

       This is used to avoid oscillation when traffic is bursty.  The
       default value is 1 minute.

   [S-PMSI_INTERVAL]:

       The interval the transmitting PE router uses to periodically send
       the S-PMSI Join message.  The default value is 60 seconds.
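
   The four constants and their default values can be collected in one
   place, with the timer checks expressed as simple comparisons.  This
   is a sketch only: the class, field, and function names are
   hypothetical, and all times are expressed in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPmsiTimers:
    """Default values of the protocol constants, in seconds."""
    delay: float = 3.0      # S-PMSI_DELAY: 3 seconds
    timeout: float = 180.0  # S-PMSI_TIMEOUT: 3 minutes
    holdown: float = 60.0   # S-PMSI_HOLDOWN: 1 minute
    interval: float = 60.0  # S-PMSI_INTERVAL: 60 seconds


def sender_may_use_s_pmsi(join_sent_at: float, now: float,
                          timers: SPmsiTimers = SPmsiTimers()) -> bool:
    """True once S-PMSI_DELAY has passed since the S-PMSI Join was sent."""
    return now - join_sent_at >= timers.delay


def receiver_should_prune(last_packet_at: float, now: float,
                          timers: SPmsiTimers = SPmsiTimers()) -> bool:
    """True if no packet arrived on the S-PMSI P-tunnel for S-PMSI_TIMEOUT."""
    return now - last_packet_at >= timers.timeout
```

   With the defaults, a receiving PE tolerates two consecutive missed
   S-PMSI Join intervals before S-PMSI_TIMEOUT is reached.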

7.4.3.  Aggregation

   S-PMSIs can be aggregated on a P-multicast tree.  The S-PMSI to
   (C-S,C-G) binding advertisement supports aggregation.  Furthermore,
   the aggregation procedures of Section 6.3 apply.  It is also possible
   to aggregate both S-PMSIs and I-PMSIs on the same P-multicast tree.


