RFC 8365

A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)

Pages: 33
Proposed Standard

Internet Engineering Task Force (IETF)                   A. Sajassi, Ed.
Request for Comments: 8365                                         Cisco
Category: Standards Track                                  J. Drake, Ed.
ISSN: 2070-1721                                                  Juniper
                                                                N. Bitar
                                                                   Nokia
                                                              R. Shekhar
                                                                 Juniper
                                                               J. Uttaro
                                                                    AT&T
                                                           W. Henderickx
                                                                   Nokia
                                                              March 2018


  A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)

Abstract

   This document specifies how Ethernet VPN (EVPN) can be used as a
   Network Virtualization Overlay (NVO) solution and explores the
   various tunnel encapsulation options over IP and their impact on the
   EVPN control plane and procedures.  In particular, the following
   encapsulation options are analyzed: Virtual Extensible LAN (VXLAN),
   Network Virtualization using Generic Routing Encapsulation (NVGRE),
   and MPLS over GRE.  This specification is also applicable to Generic
   Network Virtualization Encapsulation (GENEVE); however, some
   incremental work is required, which will be covered in a separate
   document.  This document also specifies new multihoming procedures
   for split-horizon filtering and mass withdrawal.  It also specifies
   EVPN route constructions for VXLAN/NVGRE encapsulations and
   Autonomous System Border Router (ASBR) procedures for multihoming of
   Network Virtualization Edge (NVE) devices.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   https://www.rfc-editor.org/info/rfc8365.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Notation and Conventions
   3. Terminology
   4. EVPN Features
   5. Encapsulation Options for EVPN Overlays
      5.1. VXLAN/NVGRE Encapsulation
           5.1.1. Virtual Identifiers Scope
           5.1.2. Virtual Identifiers to EVI Mapping
           5.1.3. Constructing EVPN BGP Routes
      5.2. MPLS over GRE
   6. EVPN with Multiple Data-Plane Encapsulations
   7. Single-Homing NVEs - NVE Residing in Hypervisor
      7.1. Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE
      7.2. Impact on EVPN Procedures for VXLAN/NVGRE Encapsulations
   8. Multihoming NVEs - NVE Residing in ToR Switch
      8.1. EVPN Multihoming Features
           8.1.1. Multihomed ES Auto-Discovery
           8.1.2. Fast Convergence and Mass Withdrawal
           8.1.3. Split-Horizon
           8.1.4. Aliasing and Backup Path
           8.1.5. DF Election
      8.2. Impact on EVPN BGP Routes and Attributes
      8.3. Impact on EVPN Procedures
           8.3.1. Split Horizon
           8.3.2. Aliasing and Backup Path
           8.3.3. Unknown Unicast Traffic Designation
   9. Support for Multicast
   10. Data-Center Interconnections (DCIs)
      10.1. DCI Using GWs
      10.2. DCI Using ASBRs
           10.2.1. ASBR Functionality with Single-Homing NVEs
           10.2.2. ASBR Functionality with Multihoming NVEs
   11. Security Considerations
   12. IANA Considerations
   13. References
      13.1. Normative References
      13.2. Informative References
   Acknowledgements
   Contributors
   Authors' Addresses

1.  Introduction

   This document specifies how Ethernet VPN (EVPN) [RFC7432] can be used
   as a Network Virtualization Overlay (NVO) solution and explores the
   various tunnel encapsulation options over IP and their impact on the
   EVPN control plane and procedures.  In particular, the following
   encapsulation options are analyzed: Virtual Extensible LAN (VXLAN)
   [RFC7348], Network Virtualization using Generic Routing Encapsulation
   (NVGRE) [RFC7637], and MPLS over Generic Routing Encapsulation (GRE)
   [RFC4023].  This specification is also applicable to Generic Network
   Virtualization Encapsulation (GENEVE) [GENEVE]; however, some
   incremental work is required, which will be covered in a separate
   document [EVPN-GENEVE].  This document also specifies new multihoming
   procedures for split-horizon filtering and mass withdrawal.  It also
   specifies EVPN route constructions for VXLAN/NVGRE encapsulations and
   Autonomous System Border Router (ASBR) procedures for multihoming of
   Network Virtualization Edge (NVE) devices.

   In the context of this document, an NVO is a solution to address the
   requirements of a multi-tenant data center, especially one with
   virtualized hosts, e.g., Virtual Machines (VMs) or virtual workloads.
   The key requirements of such a solution, as described in [RFC7364],
   are the following:

   -  Isolation of network traffic per tenant

   -  Support for a large number of tenants (tens or hundreds of
      thousands)

   -  Extension of Layer 2 (L2) connectivity among different VMs
      belonging to a given tenant segment (subnet) across different
      Points of Delivery (PoDs) within a data center or between
      different data centers

   -  Allowing a given VM to move between different physical points of
      attachment within a given L2 segment

   The underlay network for NVO solutions is assumed to provide IP
   connectivity between NVO endpoints.

   This document describes how EVPN can be used as an NVO solution and
   explores applicability of EVPN functions and procedures.  In
   particular, it describes the various tunnel encapsulation options for
   EVPN over IP and their impact on the EVPN control plane as well as
   procedures for two main scenarios:

   (a)  single-homing NVEs - when an NVE resides in the hypervisor, and

   (b)  multihoming NVEs - when an NVE resides in a Top-of-Rack (ToR)
        device.

   The possible encapsulation options for EVPN overlays that are
   analyzed in this document are:

   -  VXLAN and NVGRE

   -  MPLS over GRE

   Before getting into the description of the different encapsulation
   options for EVPN over IP, it is important to highlight the EVPN
   solution's main features, how those features are currently supported,
   and any impact that the encapsulation has on those features.

2.  Requirements Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Terminology

   Most of the terminology used in this document comes from [RFC7432]
   and [RFC7365].

   VXLAN:  Virtual Extensible LAN

   GRE:  Generic Routing Encapsulation

   NVGRE:  Network Virtualization using Generic Routing Encapsulation

   GENEVE:  Generic Network Virtualization Encapsulation

   PoD:  Point of Delivery

   NV:  Network Virtualization

   NVO:  Network Virtualization Overlay

   NVE:  Network Virtualization Edge

   VNI:  VXLAN Network Identifier

   VSID:  Virtual Subnet Identifier (for NVGRE)

   I-SID:  Service Instance Identifier

   EVPN:  Ethernet VPN

   EVI:  EVPN Instance.  An EVPN instance spanning the Provider Edge
      (PE) devices participating in that EVPN

   MAC-VRF:  A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE

   IP-VRF:  A Virtual Routing and Forwarding table for Internet Protocol
      (IP) addresses on a PE

   ES:  Ethernet Segment.  When a customer site (device or network) is
      connected to one or more PEs via a set of Ethernet links, then
      that set of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI):  A unique non-zero identifier that
      identifies an Ethernet segment is called an 'Ethernet Segment
      Identifier'.

   Ethernet Tag:  An Ethernet tag identifies a particular broadcast
      domain, e.g., a VLAN.  An EVPN instance consists of one or more
      broadcast domains.

   PE:  Provider Edge

   Single-Active Redundancy Mode:  When only a single PE, among all the
      PEs attached to an ES, is allowed to forward traffic to/from that
      ES for a given VLAN, then the Ethernet segment is defined to be
      operating in Single-Active redundancy mode.

   All-Active Redundancy Mode:  When all PEs attached to an Ethernet
      segment are allowed to forward known unicast traffic to/from that
      ES for a given VLAN, then the ES is defined to be operating in
      All-Active redundancy mode.

   PIM-SM:  Protocol Independent Multicast - Sparse-Mode

   PIM-SSM:  Protocol Independent Multicast - Source-Specific Multicast

   BIDIR-PIM:  Bidirectional PIM

4.  EVPN Features

   EVPN [RFC7432] was originally designed to support the requirements
   detailed in [RFC7209] and therefore has the following attributes
   which directly address control-plane scaling and ease of deployment
   issues.

   1.   Control-plane information is distributed with BGP, and broadcast
        and multicast traffic is sent using a shared multicast tree or
        with ingress replication.

   2.   Control-plane learning is used for MAC (and IP) addresses
        instead of data-plane learning.  The latter requires the
        flooding of unknown unicast and Address Resolution Protocol
        (ARP) frames, whereas the former does not require any flooding.

   3.   Route Reflector (RR) is used to reduce a full mesh of BGP
        sessions among PE devices to a single BGP session between a PE
        and the RR.  Furthermore, RR hierarchy can be leveraged to scale
        the number of BGP routes on the RR.

   4.   Auto-discovery via BGP is used to discover PE devices
        participating in a given VPN, PE devices participating in a
        given redundancy group, tunnel encapsulation types, multicast
        tunnel types, multicast members, etc.

   5.   All-Active multihoming is used.  This allows a given Customer
        Edge (CE) device to have multiple links to multiple PEs, and
        traffic to/from that CE fully utilizes all of these links.

   6.   When a link between a CE and a PE fails, the PEs for that EVI
        are notified of the failure via the withdrawal of a single EVPN
        route.  This allows those PEs to remove the withdrawing PE as a
        next hop for every MAC address associated with the failed link.
        This is termed "mass withdrawal".

   7.   BGP route filtering and constrained route distribution are
        leveraged to ensure that the control-plane traffic for a given
        EVI is only distributed to the PEs in that EVI.

   8.   When an IEEE 802.1Q [IEEE.802.1Q] interface is used between a CE
        and a PE, each of the VLAN IDs (VIDs) on that interface can be
        mapped onto a bridge table (for up to 4094 such bridge tables).
        All these bridge tables may be mapped onto a single MAC-VRF (in
        case of VLAN-aware bundle service).

   9.   VM Mobility mechanisms ensure that all PEs in a given EVI know
        the ES with which a given VM, as identified by its MAC and IP
        addresses, is currently associated.

   10.  Route Targets (RTs) are used to allow the operator (or
        customer) to define a spectrum of logical network topologies
        including mesh, hub and spoke, and extranets (e.g., a VPN whose
        sites are owned by different enterprises), without the need for
        proprietary software or the aid of other virtual or physical
        devices.
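
   The mass withdrawal described in item 6 can be sketched roughly as
   follows.  This Python fragment is an illustration only.  It assumes
   that a PE stores each remote MAC address with a reference to its
   Ethernet segment rather than to a specific PE, so that withdrawing
   one per-segment route changes the next hops for every MAC on that
   segment.  The class, method, and segment names are hypothetical.

```python
from collections import defaultdict


class RemoteMacTable:
    """Hypothetical view of remote MACs held by one PE for one EVI."""

    def __init__(self):
        self.mac_to_esi = {}                 # MAC address -> ESI
        self.esi_to_pes = defaultdict(set)   # ESI -> PEs currently attached

    def add_es_route(self, esi, pe):
        self.esi_to_pes[esi].add(pe)

    def withdraw_es_route(self, esi, pe):
        # A single withdrawal; no per-MAC update is needed.
        self.esi_to_pes[esi].discard(pe)

    def learn_mac(self, mac, esi):
        self.mac_to_esi[mac] = esi

    def next_hops(self, mac):
        # Next hops are resolved through the Ethernet segment.
        return set(self.esi_to_pes[self.mac_to_esi[mac]])


table = RemoteMacTable()
table.add_es_route("ES-1", "PE1")
table.add_es_route("ES-1", "PE2")
table.learn_mac("00:00:5e:00:53:01", "ES-1")
table.learn_mac("00:00:5e:00:53:02", "ES-1")
table.withdraw_es_route("ES-1", "PE1")   # link between the CE and PE1 fails
```

   After the single withdrawal, both MAC addresses resolve to PE2 only,
   without any per-MAC route update.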

   Because the design goal for NVO is millions of instances per common
   physical infrastructure, the scaling properties of the control plane
   for NVO are extremely important.  EVPN and the extensions described
   herein are designed with this level of scalability in mind.

5.  Encapsulation Options for EVPN Overlays

5.1.  VXLAN/NVGRE Encapsulation

   Both VXLAN and NVGRE are examples of technologies that provide a
   data-plane encapsulation that is used to transport a packet over the
   common physical IP infrastructure between Network Virtualization
   Edges (NVEs), e.g., VXLAN Tunnel End Points (VTEPs) in a VXLAN
   network.  Both of these technologies include the identifier of the
   specific NVO instance, VNI in VXLAN and VSID in NVGRE, in each
   packet.  In the remainder of this document, we use VNI to represent
   the NVO instance identifier, with the understanding that VSID can
   equally be used if the encapsulation is NVGRE, unless stated
   otherwise.

   Note that a PE is equivalent to an NVE/VTEP.

   VXLAN encapsulation is based on UDP, with an 8-byte header following
   the UDP header.  VXLAN provides a 24-bit VNI, which typically
   provides a one-to-one mapping to the tenant VID, as described in
   [RFC7348].  In this scenario, the ingress VTEP does not include an
   inner VLAN tag on the encapsulated frame, and the egress VTEP
   discards the frames with an inner VLAN tag.  This mode of operation
   in [RFC7348] maps to VLAN-Based Service in [RFC7432], where a tenant
   VID gets mapped to an EVI.
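
   As a rough illustration of the 8-byte header mentioned above, the
   following Python sketch packs and unpacks a VXLAN header.  It assumes
   the layout given in [RFC7348]: a flags octet with the I bit set, 24
   reserved bits, the 24-bit VNI, and 8 reserved bits.  The function
   names are hypothetical and exist only for this example.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag in RFC 7348: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header that follows the UDP header."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags (8 bits) + reserved (24 bits).
    # Second 32-bit word: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vxlan_vni(header: bytes) -> int:
    """Return the VNI carried in an 8-byte VXLAN header."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag not set; VNI is not valid")
    return word2 >> 8
```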

   VXLAN also provides an option of including an inner VLAN tag in the
   encapsulated frame, if explicitly configured at the VTEP.  This mode
   of operation can map to VLAN Bundle Service in [RFC7432] because all
   the tenant's tagged frames map to a single bridge table / MAC-VRF,
   and the inner VLAN tag is not used for lookup by the disposition PE
   when performing VXLAN decapsulation as described in Section 6 of
   [RFC7348].

   NVGRE encapsulation [RFC7637] is based on GRE encapsulation, and it
   mandates the inclusion of the optional GRE Key field, which carries
   the VSID.  There is a one-to-one mapping between the VSID and the
   tenant VID, as described in [RFC7637].  The inclusion of an inner
   VLAN tag is prohibited.  This mode of operation in [RFC7637] maps to
   VLAN-Based Service in [RFC7432].
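
   The use of the GRE Key field to carry the VSID can be sketched as
   follows.  This Python fragment assumes the [RFC7637] layout, in which
   the 32-bit Key field holds the 24-bit VSID followed by an 8-bit
   FlowID, the K bit is set, and the protocol type is 0x6558
   (Transparent Ethernet Bridging).  Function names are hypothetical.

```python
import struct

GRE_FLAG_KEY = 0x2000    # K bit: the Key field is present
ETHERTYPE_TEB = 0x6558   # Transparent Ethernet Bridging


def pack_nvgre_header(vsid: int, flow_id: int = 0) -> bytes:
    """Build an 8-byte GRE header whose Key field carries the VSID."""
    if not 0 <= vsid < 2**24:
        raise ValueError("VSID must fit in 24 bits")
    if not 0 <= flow_id < 2**8:
        raise ValueError("FlowID must fit in 8 bits")
    key = (vsid << 8) | flow_id
    # C and S bits stay zero, so no Checksum or Sequence Number follows.
    return struct.pack("!HHI", GRE_FLAG_KEY, ETHERTYPE_TEB, key)


def unpack_nvgre_vsid(header: bytes) -> int:
    """Return the VSID from the Key field of an NVGRE header."""
    flags, _proto, key = struct.unpack("!HHI", header[:8])
    if not flags & GRE_FLAG_KEY:
        raise ValueError("GRE Key field is absent")
    return key >> 8
```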

   As described in the next section, there is no change to the encoding
   of EVPN routes to support VXLAN or NVGRE encapsulation, except for
   the use of the BGP Encapsulation Extended Community to indicate the
   encapsulation type (e.g., VXLAN or NVGRE).  However, there is
   potential impact to the EVPN procedures depending on where the NVE is
   located (i.e., in hypervisor or ToR) and whether multihoming
   capabilities are required.

5.1.1.  Virtual Identifiers Scope

   Although VNIs are defined as 24-bit globally unique values, there are
   scenarios in which it is desirable to use a locally significant value
   for the VNI, especially in the context of a data-center interconnect.

5.1.1.1.  Data-Center Interconnect with Gateway

   In the case where NVEs in different data centers need to be
   interconnected, and the NVEs need to use VNIs as globally unique
   identifiers within a data center, then a Gateway (GW) needs to be
   employed at the edge of the data-center network (DCN).  This is
   because the Gateway will provide the functionality of translating the
   VNI when crossing network boundaries, which may align with operator
   span-of-control boundaries.  As an example, consider the network of
   Figure 1.  Assume there are three network operators: one for each of
   the DC1, DC2, and WAN networks.  The Gateways at the edge of the data
   centers are responsible for translating the VNIs between the values
   used in each of the DCNs and the values used in the WAN.

                             +--------------+
                             |              |
           +---------+       |     WAN      |       +---------+
   +----+  |        +---+  +----+        +----+  +---+        |  +----+
   |NVE1|--|        |   |  |WAN |        |WAN |  |   |        |--|NVE3|
   +----+  |IP      |GW |--|Edge|        |Edge|--|GW | IP     |  +----+
   +----+  |Fabric  +---+  +----+        +----+  +---+ Fabric |  +----+
   |NVE2|--|         |       |              |       |         |--|NVE4|
   +----+  +---------+       +--------------+       +---------+  +----+

   |<------ DC 1 ------>                          <------ DC2  ------>|

              Figure 1: Data-Center Interconnect with Gateway
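
   The translation role of the Gateways in Figure 1 can be sketched as a
   pair of lookup tables per Gateway.  The Python fragment below is a
   minimal illustration, not a prescribed implementation.  The class
   name and all VNI values are hypothetical.

```python
class VniGateway:
    """Translates VNIs between one data-center domain and the WAN domain."""

    def __init__(self, dc_to_wan: dict) -> None:
        if len(set(dc_to_wan.values())) != len(dc_to_wan):
            raise ValueError("each WAN VNI must map back to exactly one DC VNI")
        self._dc_to_wan = dict(dc_to_wan)
        self._wan_to_dc = {wan: dc for dc, wan in dc_to_wan.items()}

    def to_wan(self, dc_vni: int) -> int:
        return self._dc_to_wan[dc_vni]

    def to_dc(self, wan_vni: int) -> int:
        return self._wan_to_dc[wan_vni]


# Hypothetical values: one tenant segment, numbered differently per domain.
gw_dc1 = VniGateway({10010: 70001})
gw_dc2 = VniGateway({20020: 70001})

wan_vni = gw_dc1.to_wan(10010)    # DC1 value -> WAN value
dc2_vni = gw_dc2.to_dc(wan_vni)   # WAN value -> DC2 value
```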

5.1.1.2.  Data-Center Interconnect without Gateway

   In the case where NVEs in different data centers need to be
   interconnected, and the NVEs need to use locally assigned VNIs (e.g.,
   similar to MPLS labels), there may be no need to employ Gateways at
   the edge of the DCN.  More specifically, the VNI value that is used
   by the transmitting NVE is allocated by the NVE that is receiving the
   traffic (in other words, this is similar to a "downstream-assigned"
   MPLS label).  This allows the VNI space to be decoupled between
   different DCNs without the need for a dedicated Gateway at the edge
   of the data centers.  This topic is covered in Section 10.2.

                              +--------------+
                              |              |
              +---------+     |     WAN      |    +---------+
      +----+  |         |   +----+        +----+  |         |  +----+
      |NVE1|--|         |   |ASBR|        |ASBR|  |         |--|NVE3|
      +----+  |IP Fabric|---|    |        |    |--|IP Fabric|  +----+
      +----+  |         |   +----+        +----+  |         |  +----+
      |NVE2|--|         |     |              |    |         |--|NVE4|
      +----+  +---------+     +--------------+    +---------+  +----+

      |<------ DC 1 ----->                        <---- DC2  ------>|

               Figure 2: Data-Center Interconnect with ASBR
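
   The "downstream-assigned" behavior described above can be sketched as
   follows.  In this Python illustration, each NVE allocates its own
   locally significant VNI for a segment and advertises it, and a remote
   NVE uses the advertised value when sending to that NVE.  The class,
   the allocation scheme, and the numeric ranges are all hypothetical.

```python
class Nve:
    """Hypothetical NVE that hands out locally significant VNIs."""

    def __init__(self, name, first_local_vni):
        self.name = name
        self._next_vni = first_local_vni
        self._local = {}    # segment -> VNI this NVE expects to receive
        self._remote = {}   # (remote NVE name, segment) -> VNI to send with

    def advertise(self, segment):
        """Allocate (once) and return the VNI for a segment on this NVE."""
        if segment not in self._local:
            self._local[segment] = self._next_vni
            self._next_vni += 1
        return self._local[segment]

    def learn(self, remote_name, segment, vni):
        """Record the VNI that a remote NVE advertised for a segment."""
        self._remote[(remote_name, segment)] = vni

    def vni_for_sending(self, remote_name, segment):
        return self._remote[(remote_name, segment)]


nve1 = Nve("NVE1", first_local_vni=1000)
nve3 = Nve("NVE3", first_local_vni=5000)
nve1.learn("NVE3", "tenant-a", nve3.advertise("tenant-a"))
nve3.learn("NVE1", "tenant-a", nve1.advertise("tenant-a"))
```

   Each direction uses the value chosen by the receiver, so the two
   data centers do not need to coordinate their VNI spaces.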

5.1.2.  Virtual Identifiers to EVI Mapping

   Just like in [RFC7432], where two options existed for mapping
   broadcast domains (represented by VLAN IDs) to an EVI, when the EVPN
   control plane is used in conjunction with VXLAN (or NVGRE
   encapsulation), there are also two options for mapping broadcast
   domains represented by VXLAN VNIs (or NVGRE VSIDs) to an EVI:

      Option 1: A Single Broadcast Domain per EVI

   In this option, a single Ethernet broadcast domain (e.g., subnet)
   represented by a VNI is mapped to a unique EVI.  This corresponds to
   the VLAN-Based Service in [RFC7432], where a tenant-facing interface,
   logical interface (e.g., represented by a VID), or physical interface
   gets mapped to an EVI.  As such, a BGP Route Distinguisher (RD) and
   Route Target (RT) are needed per VNI on every NVE.  The advantage of
   this model is that it allows the BGP RT constraint mechanisms to be
   used in order to limit the propagation and import of routes to only
   the NVEs that are interested in a given VNI.  The disadvantage of
   this model may be the provisioning overhead if the RD and RT are not
   derived automatically from the VNI.

   In this option, the MAC-VRF table is identified by the RT in the
   control plane and by the VNI in the data plane.  The specific MAC-VRF
   table corresponds to only a single bridge table.

      Option 2: Multiple Broadcast Domains per EVI

   In this option, multiple subnets, each represented by a unique VNI,
   are mapped to a single EVI.  For example, if a tenant has multiple
   segments/subnets each represented by a VNI, then all the VNIs for
   that tenant are mapped to a single EVI; that is, the EVI in this
   case represents the tenant and not a subnet.  This corresponds to the
   VLAN-aware bundle service in [RFC7432].  The advantage of this model
   is that it doesn't require the provisioning of an RD/RT per VNI.
   However, this is a moot point when compared to Option 1 where
   auto-derivation is used.  The disadvantage of this model is that
   routes would be imported by NVEs that may not be interested in a
   given VNI.

   In this option, the MAC-VRF table is identified by the RT in the
   control plane; a specific bridge table for that MAC-VRF is identified
   by the <RT, Ethernet Tag ID> in the control plane.  In this option,
   the VNI in the data plane is sufficient to identify a specific bridge
   table.
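
   The two mapping options can be contrasted with a small data-structure
   sketch.  The Python fragment below is illustrative only.  It models a
   MAC-VRF as a set of bridge tables keyed by Ethernet Tag ID, found via
   the RT in the control plane and via the VNI in the data plane.  The
   class names, RT strings, and VNI values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BridgeTable:
    vni: int
    macs: dict = field(default_factory=dict)   # MAC address -> next hop


@dataclass
class MacVrf:
    route_target: str
    bridge_tables: dict = field(default_factory=dict)  # Ethernet Tag ID -> BridgeTable


class Nve:
    def __init__(self):
        self.by_rt = {}    # control plane: RT -> MacVrf
        self.by_vni = {}   # data plane: VNI -> BridgeTable

    def add_bridge_table(self, route_target, ethernet_tag, vni):
        vrf = self.by_rt.setdefault(route_target, MacVrf(route_target))
        table = BridgeTable(vni)
        vrf.bridge_tables[ethernet_tag] = table
        self.by_vni[vni] = table
        return table

    def control_plane_lookup(self, route_target, ethernet_tag=0):
        return self.by_rt[route_target].bridge_tables[ethernet_tag]

    def data_plane_lookup(self, vni):
        return self.by_vni[vni]


nve = Nve()
# Option 1: one VNI per EVI, so the MAC-VRF holds a single bridge table.
opt1 = nve.add_bridge_table("65000:10010", 0, 10010)
# Option 2: several VNIs share one EVI; <RT, Ethernet Tag ID> picks the table.
opt2_a = nve.add_bridge_table("65000:1", 20001, 20001)
opt2_b = nve.add_bridge_table("65000:1", 20002, 20002)
```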

5.1.2.1.  Auto-Derivation of RT

   In order to simplify configuration, when the option of a single VNI
   per EVI is used, the RT used for EVPN can be auto-derived.  The RD
   can be auto-generated as described in [RFC7432], and the RT can be
   auto-derived as described next.

   Since a Gateway PE as depicted in Figure 1 participates in both the
   DCN and WAN BGP sessions, it is important that, when RT values are
   auto-derived from VNIs, there be no conflict in RT spaces between
   DCNs and WANs, assuming that both are operating within the same
   Autonomous System (AS).  Also, there can be scenarios where both
   VXLAN and NVGRE encapsulations may be needed within the same DCN, and
   their corresponding VNIs are administered independently, which means
   VNI spaces can overlap.  In order to avoid conflict in RT spaces, the
   6-byte RT values with 2-octet AS number for DCNs can be auto-derived
   as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Global Administrator      |    Local Administrator        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Local Administrator (Cont.)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Global Administrator      |A| TYPE| D-ID  | Service ID    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Service ID (Cont.)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The 6-octet RT field consists of two sub-fields:

   -  Global Administrator sub-field: 2 octets.  This sub-field contains
      an AS number assigned by IANA <https://www.iana.org/assignments/
      as-numbers/>.

   -  Local Administrator sub-field: 4 octets

      *  A: A single-bit field indicating if this RT is auto-derived

            0: auto-derived
            1: manually derived

      *  Type: A 3-bit field that identifies the space in which the
         other 3 bytes are defined.  The following spaces are defined:

            0 : VID (802.1Q VLAN ID)
            1 : VXLAN
            2 : NVGRE
            3 : I-SID
            4 : EVI
            5 : dual-VID (QinQ VLAN ID)

      *  D-ID: A 4-bit field that identifies domain-id.  The default
         value of domain-id is zero, indicating that only a single
         numbering space exists for a given technology.  However, if
         more than one number space exists for a given technology (e.g.,
         overlapping VXLAN spaces), then each of the number spaces needs
         to be identified by its corresponding domain-id starting from
         1.

      *  Service ID: This 3-octet field is set to VNI, VSID, I-SID, or
         VID.

   It should be noted that RT auto-derivation is applicable for 2-octet
   AS numbers.  For 4-octet AS numbers, the RT needs to be manually
   configured because a 3-octet VNI field cannot fit within the
   2-octet local administrator field.
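
   The auto-derivation described in this section can be sketched in
   Python as follows.  The fragment packs the 6-octet RT value (Global
   Administrator plus Local Administrator) from a 2-octet AS number, the
   Type, the D-ID, and the Service ID, with the A bit set to 0 for
   "auto-derived".  It does not build the extended community type and
   sub-type octets that precede this value.  The function and constant
   names are hypothetical, and the AS number and VNI are example values.

```python
import struct

# Type values for the numbering space, as listed in this section.
(RT_TYPE_VID, RT_TYPE_VXLAN, RT_TYPE_NVGRE,
 RT_TYPE_ISID, RT_TYPE_EVI, RT_TYPE_DUAL_VID) = range(6)


def auto_derive_rt(asn: int, space_type: int, service_id: int,
                   domain_id: int = 0) -> bytes:
    """Return the 6-octet auto-derived RT value."""
    if not 0 <= asn < 2**16:
        raise ValueError("auto-derivation applies to 2-octet AS numbers only")
    if not 0 <= space_type < 2**3:
        raise ValueError("Type must fit in 3 bits")
    if not 0 <= domain_id < 2**4:
        raise ValueError("D-ID must fit in 4 bits")
    if not 0 <= service_id < 2**24:
        raise ValueError("Service ID must fit in 3 octets")
    a_bit = 0  # 0 means auto-derived
    local_admin = ((a_bit << 31) | (space_type << 28)
                   | (domain_id << 24) | service_id)
    return struct.pack("!HI", asn, local_admin)


rt = auto_derive_rt(65000, RT_TYPE_VXLAN, 10010)
```

   With overlapping VXLAN and NVGRE number spaces, the same Service ID
   yields different RT values because the Type field differs.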

5.1.3.  Constructing EVPN BGP Routes

   In EVPN, an MPLS label identifying, for instance, the forwarding
   table is distributed by the egress PE via the EVPN control plane and
   is placed in the MPLS header of a given packet by the ingress PE.
   This label is used upon receipt of that packet by the egress PE for
   disposition of that packet.  This is very similar to the use of the
   VNI by the egress NVE, with the difference being that an MPLS label
   has local significance while a VNI typically has global significance.
   Accordingly, and specifically to support the option of locally
   assigned VNIs, the MPLS Label1 field in the MAC/IP Advertisement
   route, the MPLS label field in the Ethernet A-D per EVI route, and
   the MPLS label field in the P-Multicast Service Interface (PMSI)
   Tunnel attribute of the Inclusive Multicast Ethernet Tag (IMET) route
   are used to carry the VNI.  For the balance of this memo, the above
   MPLS label fields will be referred to as the VNI field.  The VNI
   field is used for both local and global VNIs; for either case, the
   entire 24-bit field is used to encode the VNI value.
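
   The difference between carrying a VNI and carrying an MPLS label in
   the same 3-octet field can be sketched as follows.  The VNI encoding
   follows the text above (the whole 24-bit field).  The MPLS helper is
   shown only for contrast and assumes the [RFC7432] convention that
   the high-order 20 bits of the field hold the label.  Function names
   are hypothetical.

```python
def encode_vni_field(vni: int) -> bytes:
    """Encode a local or global VNI using the entire 24-bit field."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return vni.to_bytes(3, "big")


def decode_vni_field(field: bytes) -> int:
    """Decode a VNI from the 3-octet field."""
    if len(field) != 3:
        raise ValueError("the field is exactly 3 octets")
    return int.from_bytes(field, "big")


def encode_mpls_label_field(label: int) -> bytes:
    """For contrast (assumption): a 20-bit label in the high-order 20 bits."""
    if not 0 <= label < 2**20:
        raise ValueError("MPLS label must fit in 20 bits")
    return (label << 4).to_bytes(3, "big")
```

   A receiver therefore needs the Encapsulation Extended Community to
   know which interpretation of the field applies.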

   For the VLAN-Based Service (a single VNI per MAC-VRF), the Ethernet
   Tag field in the MAC/IP Advertisement, Ethernet A-D per EVI, and IMET
   route MUST be set to zero just as in the VLAN-Based Service in
   [RFC7432].

   For the VLAN-Aware Bundle Service (multiple VNIs per MAC-VRF with
   each VNI associated with its own bridge table), the Ethernet Tag
   field in the MAC/IP Advertisement, Ethernet A-D per EVI, and IMET
   route MUST identify a bridge table within a MAC-VRF; the set of
   Ethernet Tags for that EVI needs to be configured consistently on all
   PEs within that EVI.  For locally assigned VNIs, the value advertised
   in the Ethernet Tag field MUST be set to a VID just as in the
   VLAN-aware bundle service in [RFC7432].  This setting must be done
   consistently on all PE devices participating in that EVI within a
   given domain.
   For global VNIs, the value advertised in the Ethernet Tag field
   SHOULD be set to a VNI as long as it matches the existing semantics
   of the Ethernet Tag, i.e., it identifies a bridge table within a
   MAC-VRF and the set of VNIs are configured consistently on each PE in
   that EVI.
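
   The rules above for the Ethernet Tag field can be summarized in a
   short Python sketch.  It is an illustration of the selection logic
   only.  The function name and the string labels for the service type
   and VNI scope are hypothetical.

```python
def ethernet_tag_id(service: str, vni_scope: str = "global",
                    vni: int = 0, vid: int = 0) -> int:
    """Pick the Ethernet Tag value to advertise in an EVPN route."""
    if service == "vlan-based":
        # A single VNI per MAC-VRF: the field MUST be zero.
        return 0
    if service == "vlan-aware-bundle":
        if vni_scope == "local":
            # Locally assigned VNIs: the field MUST carry a VID.
            return vid
        if vni_scope == "global":
            # Global VNIs: the field SHOULD carry the VNI itself.
            return vni
        raise ValueError(f"unknown VNI scope: {vni_scope}")
    raise ValueError(f"unknown service type: {service}")
```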

   In order to indicate which type of data-plane encapsulation (i.e.,
   VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP
   Encapsulation Extended Community defined in [RFC5512] is included
   with all EVPN routes (i.e., MAC/IP Advertisement, Ethernet A-D per
   EVI, Ethernet A-D per ESI, IMET, and Ethernet Segment) advertised by
   an egress PE.  Five new values have been assigned by IANA to extend
   the list of encapsulation types defined in [RFC5512]; they are listed
   in Section 12.

   The MPLS encapsulation tunnel type, listed in Section 12, is needed
   in order to distinguish between an advertising node that only
   supports non-MPLS encapsulations and one that supports MPLS and
   non-MPLS encapsulations.  An advertising node that only supports MPLS
   encapsulation does not need to advertise any encapsulation tunnel
   types; i.e., if the BGP Encapsulation Extended Community is not
   present, then either MPLS encapsulation or a statically configured
   encapsulation is assumed.

   The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
   be set to the IPv4 or IPv6 address of the NVE.  The remaining fields
   in each route are set as per [RFC7432].

   Note that the procedure defined here -- to use the MPLS Label field
   to carry the VNI in the presence of a Tunnel Encapsulation Extended
   Community specifying the use of a VNI -- is aligned with the
   procedures described in Section 8.2.2.2 of [TUNNEL-ENCAP] ("When a
   Valid VNI has not been Signaled").

5.2.  MPLS over GRE

   The EVPN data plane is modeled as an EVPN MPLS client layer sitting
   over an MPLS PSN tunnel server layer.  Some of the EVPN functions
   (split-horizon, Aliasing, and Backup Path) are tied to the MPLS
   client layer.  If MPLS over GRE encapsulation is used, then the EVPN
   MPLS client layer can be carried over an IP PSN tunnel transparently.
   Therefore, there is no impact to the EVPN procedures and associated
   data-plane operation.

   [RFC4023] defines the standard for using MPLS over GRE encapsulation,
   which can be used for this purpose.  However, when MPLS over GRE is
   used in conjunction with EVPN, it is recommended that the GRE key
   field be present and be used to provide a 32-bit entropy value only
   if the P nodes can perform Equal-Cost Multipath (ECMP) hashing based
   on the GRE key; otherwise, the GRE header SHOULD NOT include the GRE
   key field.  The Checksum and Sequence Number fields MUST NOT be
   included, and the corresponding C and S bits in the GRE header MUST
   be set to zero.  A PE capable of supporting this encapsulation SHOULD
   advertise its EVPN routes along with the Tunnel Encapsulation
   Extended Community indicating MPLS over GRE encapsulation as
   described in the previous section.
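
   The GRE header constraints above can be sketched in Python as
   follows.  The fragment builds a GRE header for an MPLS payload with
   the C and S bits always zero and with the Key field present only
   when an entropy value is supplied.  It assumes the protocol type
   0x8847 (MPLS unicast) used with [RFC4023].  The function name is
   hypothetical.

```python
import struct
from typing import Optional

GRE_FLAG_KEY = 0x2000            # K bit: the Key field is present
ETHERTYPE_MPLS_UNICAST = 0x8847


def pack_mpls_gre_header(entropy_key: Optional[int] = None) -> bytes:
    """Build a GRE header for MPLS over GRE; C and S bits are never set."""
    if entropy_key is None:
        # P nodes cannot hash on the key: omit the Key field entirely.
        return struct.pack("!HH", 0, ETHERTYPE_MPLS_UNICAST)
    if not 0 <= entropy_key < 2**32:
        raise ValueError("entropy value must fit in 32 bits")
    return struct.pack("!HHI", GRE_FLAG_KEY, ETHERTYPE_MPLS_UNICAST,
                       entropy_key)
```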

6.  EVPN with Multiple Data-Plane Encapsulations

   The use of the BGP Encapsulation Extended Community per [RFC5512]
   allows each NVE in a given EVI to know each of the encapsulations
   supported by each of the other NVEs in that EVI.  That is, each of
   the NVEs in a given EVI may support multiple data-plane
   encapsulations.  An ingress NVE can send a frame to an egress NVE
   only if the set of encapsulations advertised by the egress NVE forms
   a non-empty intersection with the set of encapsulations supported by
   the ingress NVE; it is at the discretion of the ingress NVE which
   encapsulation to choose from this intersection.  (As noted in
   Section 5.1.3, if the BGP Encapsulation extended community is not
   present, then the default MPLS encapsulation or a locally configured
   encapsulation is assumed.)
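
   The selection rule above can be sketched as follows.  This is an
   illustration only: the encapsulation names are plain strings, the
   ingress preference order is an assumed local policy, and the
   function name and its "default" parameter are hypothetical.

```python
def choose_encapsulation(ingress_preference, egress_advertised,
                         default="MPLS"):
    """Pick an encapsulation for traffic toward one egress NVE.

    ingress_preference: encapsulations the ingress NVE supports, in
        locally preferred order (assumed local policy).
    egress_advertised: encapsulations signaled by the egress NVE; an
        empty collection models a route without the extended community.
    Returns the chosen encapsulation, or None if the ingress NVE
    cannot send to this egress NVE.
    """
    advertised = set(egress_advertised)
    if not advertised:
        # No community present: fall back to the default (or locally
        # configured) encapsulation.
        advertised = {default}
    for encap in ingress_preference:
        if encap in advertised:
            return encap
    return None
```

   For example, an ingress NVE that prefers VXLAN over NVGRE and
   receives a route advertising only NVGRE would select NVGRE.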

   When a PE advertises multiple supported encapsulations, it MUST
   advertise encapsulations that use the same EVPN procedures including
   procedures associated with split-horizon filtering described in
   Section 8.3.1.  For example, VXLAN and NVGRE (or MPLS and MPLS over
   GRE) encapsulations use the same EVPN procedures; thus, a PE can
   advertise both of them and can support either of them or both of them
   simultaneously.  However, a PE MUST NOT advertise VXLAN and MPLS
   encapsulations together because (a) the MPLS field of EVPN routes is
   set to either an MPLS label or a VNI, but not both, and (b) some EVPN
   procedures (such as split-horizon filtering) are different for VXLAN/
   NVGRE and MPLS encapsulations.
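
   A PE implementation could check this constraint before advertising.
   The sketch below groups the encapsulations named in this section
   into two procedure families; the names, the grouping table, and the
   function are hypothetical and cover only the encapsulations
   discussed here.

```python
# Encapsulations that share the same EVPN procedures are placed in the
# same family (grouping taken from the text above).
PROCEDURE_FAMILY = {
    "VXLAN": "vni-based",
    "NVGRE": "vni-based",
    "MPLS": "label-based",
    "MPLS-over-GRE": "label-based",
}


def advertisement_is_consistent(encapsulations):
    """Return True if all encapsulations use the same EVPN procedures."""
    families = set()
    for encap in encapsulations:
        if encap not in PROCEDURE_FAMILY:
            raise ValueError(f"unknown encapsulation: {encap}")
        families.add(PROCEDURE_FAMILY[encap])
    return len(families) <= 1
```

   With this table, {"VXLAN", "NVGRE"} and {"MPLS", "MPLS-over-GRE"}
   pass the check, while {"VXLAN", "MPLS"} does not.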

   An ingress node that uses shared multicast trees for sending
   broadcast or multicast frames MAY maintain distinct trees for each
   different encapsulation type.

   It is the responsibility of the operator of a given EVI to ensure
   that all of the NVEs in that EVI support at least one common
   encapsulation.  If this condition is violated, it could result in
   service disruption or failure.  The use of the BGP Encapsulation
   Extended Community provides a method to detect when this condition is
   violated, but the actions to be taken are at the discretion of the
   operator and are outside the scope of this document.
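
   One way an operator tool might detect a violation of this condition
   is to intersect the encapsulation sets learned for every NVE in the
   EVI.  The sketch below assumes those sets have already been
   collected from the received routes; the function name and the input
   mapping are hypothetical.

```python
def common_encapsulations(evi_members):
    """Return the encapsulations supported by every NVE in an EVI.

    evi_members: mapping of NVE identifier -> iterable of the
        encapsulations that NVE advertises (assumed to be gathered
        elsewhere).
    An empty result for a non-empty EVI signals the condition that the
    operator must resolve.
    """
    sets = [set(encaps) for encaps in evi_members.values()]
    if not sets:
        return set()
    return set.intersection(*sets)
```

   What to do when the result is empty (for example, raising an alarm)
   remains an operator decision.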

7.  Single-Homing NVEs - NVE Residing in Hypervisor

   When an NVE and its hosts/VMs are co-located in the same physical
   device, e.g., when they reside in a server, the links between them
   are virtual and they typically share fate.  That is, the subject
   hosts/VMs are typically not multihomed or, if they are multihomed,
   the multihoming is a purely local matter to the server hosting the VM
   and the NVEs, and it need not be "visible" to any other NVEs residing
   on other servers.  Thus, it does not require any specific protocol
   mechanisms.  The most common case of this is when the NVE resides on
   the hypervisor.

   In the subsections that follow, we will discuss the impact on EVPN
   procedures for the case when the NVE resides on the hypervisor and
   the VXLAN (or NVGRE) encapsulation is used.

7.1.  Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE
      Encapsulations

   In scenarios where different groups of data centers are under
   different administrative domains, and these data centers are
   connected via one or more backbone core providers as described in
   [RFC7365], the RD must be a unique value per EVI or per NVE as
   described in [RFC7432].  In other words, whenever there is more than
   one administrative domain for global VNI, a unique RD must be used;
   or, whenever the VNI value has local significance, a unique RD must
   be used.  Therefore, it is recommended to use a unique RD as
   described in [RFC7432] at all times.

   When the NVEs reside on the hypervisor, the EVPN BGP routes and
   attributes associated with multihoming are no longer required.  This
   reduces the required routes and attributes to the following subset of
   four out of the total of eight listed in Section 7 of [RFC7432]:

   -  MAC/IP Advertisement Route

   -  Inclusive Multicast Ethernet Tag Route

   -  MAC Mobility Extended Community

   -  Default Gateway Extended Community

   However, as noted in Section 8.6 of [RFC7432], in order to enable a
   single-homing ingress NVE to take advantage of fast convergence,
   Aliasing, and Backup Path when interacting with multihomed egress
   NVEs attached to a given ES, the single-homing ingress NVE should be
   able to receive and process routes that are Ethernet A-D per ES and
   Ethernet A-D per EVI.
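
   The resulting route handling on a single-homing NVE can be sketched
   as a simple filter.  The route type numbers are those assigned in
   [RFC7432]; the split into "originated" and "processed" sets and the
   choice to ignore Ethernet Segment routes are assumptions made for
   this illustration.

```python
from enum import IntEnum


class RouteType(IntEnum):
    """EVPN route types defined in RFC 7432."""
    ETHERNET_AD = 1
    MAC_IP_ADVERTISEMENT = 2
    INCLUSIVE_MULTICAST_ETHERNET_TAG = 3
    ETHERNET_SEGMENT = 4


# Routes a single-homing NVE originates itself.
ORIGINATED = frozenset({
    RouteType.MAC_IP_ADVERTISEMENT,
    RouteType.INCLUSIVE_MULTICAST_ETHERNET_TAG,
})

# Routes it should also accept from multihomed egress NVEs.
PROCESSED = ORIGINATED | {RouteType.ETHERNET_AD}


def should_process(route_type):
    """Return True if a received route of this type is processed."""
    try:
        return RouteType(route_type) in PROCESSED
    except ValueError:
        # Route types outside this sketch are ignored.
        return False
```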

7.2.  Impact on EVPN Procedures for VXLAN/NVGRE Encapsulations

   When the NVEs reside on the hypervisors, the EVPN procedures
   associated with multihoming are no longer required.  This limits the
   procedures on the NVE to the following subset.

   1.  Local learning of MAC addresses received from the VMs per
       Section 9.1 of [RFC7432].

   2.  Advertising locally learned MAC addresses in BGP using the MAC/IP
       Advertisement routes.

   3.  Performing remote learning using BGP per Section 9.2 of
       [RFC7432].

   4.  Discovering other NVEs and constructing the multicast tunnels
       using the IMET routes.

   5.  Handling MAC address mobility events per the procedures of
       Section 15 in [RFC7432].

   However, as noted in Section 8.6 of [RFC7432], in order to enable a
   single-homing ingress NVE to take advantage of fast convergence,
   Aliasing, and Backup Path when interacting with multihomed egress
   NVEs attached to a given ES, a single-homing ingress NVE should
   implement the ingress node processing of routes that are Ethernet A-D
   per ES and Ethernet A-D per EVI as defined in Sections 8.2 ("Fast
   Convergence") and 8.4 ("Aliasing and Backup Path") of [RFC7432].
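
   The remote-learning and MAC mobility steps listed earlier in this
   section can be illustrated with a small table keyed by MAC address.
   The sketch assumes the selection rule of Section 15 of [RFC7432]
   (highest sequence number wins, with the lowest advertising IP
   address as the tie-breaker); the class and method names are
   hypothetical, and route withdrawal is not modeled.

```python
import ipaddress


class RemoteMacTable:
    """Tracks which remote NVE currently owns each MAC address."""

    def __init__(self):
        self._entries = {}  # mac -> (sequence number, NVE address)

    def learn_remote(self, mac, nve_ip, seq=0):
        """Apply a received MAC/IP Advertisement route.

        Returns True if the table was updated, False if the existing
        entry is at least as good as the new route.
        """
        candidate = (seq, ipaddress.ip_address(nve_ip))
        current = self._entries.get(mac)
        if current is None or self._is_better(candidate, current):
            self._entries[mac] = candidate
            return True
        return False

    @staticmethod
    def _is_better(candidate, current):
        # Higher sequence number wins; on a tie, lower address wins.
        if candidate[0] != current[0]:
            return candidate[0] > current[0]
        return candidate[1] < current[1]

    def lookup(self, mac):
        """Return the owning NVE address as a string, or None."""
        entry = self._entries.get(mac)
        return str(entry[1]) if entry else None
```

   A route with a lower sequence number than the installed entry is
   treated as stale and leaves the table unchanged.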
