
RFC 7609

IBM's Shared Memory Communications over RDMA (SMC-R) Protocol

Pages: 143
Informational
Part 5 of 6 – Pages 92 to 129

Top   ToC   RFC7609 - Page 92   prevText

Appendix A. Formats

A.1. TCP Option

   The SMC-R TCP option is formatted in accordance with [RFC6994]
   ("Shared Use of Experimental TCP Options").  The ExID value is
   IBM-1047 (EBCDIC) encoding for "SMCR".

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  Kind = 254   |  Length = 6   |     x'E2'     |     x'D4'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 24: SMC-R TCP Option Format
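   As an illustration of Figure 24, the following minimal sketch
   assembles the six option bytes.  The function and constant names are
   hypothetical and are not part of this specification; the byte values
   are taken from the figure.

```python
# Sketch only: names are illustrative, values come from Figure 24.
SMCR_EXID = bytes([0xE2, 0xD4, 0xC3, 0xD9])  # "SMCR" in IBM-1047 (EBCDIC)
TCP_OPT_KIND = 254                           # Kind value shown in Figure 24


def build_smcr_tcp_option() -> bytes:
    """Return Kind, Length, then the 4-byte ExID."""
    length = 2 + len(SMCR_EXID)  # Kind byte + Length byte + ExID
    return bytes([TCP_OPT_KIND, length]) + SMCR_EXID


def is_smcr_tcp_option(option: bytes) -> bool:
    """True if `option` is exactly the SMC-R experimental option."""
    return option == build_smcr_tcp_option()


print(build_smcr_tcp_option().hex())  # fe06e2d4c3d9
```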

A.2. CLC Messages

   The following rules apply to all CLC messages:

   General rules on formats:

   o  Reserved fields must be set to zero and not validated.

   o  Each message has an eye catcher at the start and another eye
      catcher at the end.  These must both be validated by the
      receiver.

   o  SMC version indicator: The only SMC-R version defined in this
      architecture is version 1.  In the future, if peers have a
      mismatch of versions, the lowest common version number is used.

A.2.1. Peer ID Format

   All CLC messages contain a peer ID that uniquely identifies an
   instance of a TCP/IP stack.  This peer ID is required to be
   universally unique across TCP/IP stacks and instances (including
   restarts) of TCP/IP stacks.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |          Instance ID          |   RoCE MAC (first 2 bytes)    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    RoCE MAC (last 4 bytes)                    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 25: Peer ID Format

   Instance ID

      A 2-byte instance count that ensures that if the same RNIC MAC is
      later used in the peer ID for a different TCP/IP stack -- for
      example, if an RNIC is redeployed to another stack -- the values
      are unique.  It also ensures that if a TCP/IP stack is restarted,
      the instance ID changes.  The value is implementation defined,
      with one suggestion being 2 bytes of the system clock.

   RoCE MAC

      The RoCE MAC address for one of the peer's RNICs.  Note that in a
      virtualized environment this will be the virtual MAC of one of the
      peer's RNICs.
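   The 8-byte layout of Figure 25 can be sketched as follows.  The
   helper names are hypothetical, and storing the instance ID in
   network byte order is an assumption made for the example; the text
   above only states that the value is implementation defined.

```python
import struct

# Sketch only: 2-byte instance ID followed by the 6-byte RoCE MAC.
# Network byte order for the instance ID is an assumption.
PEER_ID = struct.Struct("!H6s")


def pack_peer_id(instance_id, roce_mac):
    """Build the 8-byte peer ID from an instance count and a MAC."""
    if len(roce_mac) != 6:
        raise ValueError("RoCE MAC must be exactly 6 bytes")
    return PEER_ID.pack(instance_id & 0xFFFF, roce_mac)


def unpack_peer_id(peer_id):
    """Split an 8-byte peer ID into (instance_id, roce_mac)."""
    if len(peer_id) != PEER_ID.size:
        raise ValueError("peer ID must be exactly 8 bytes")
    return PEER_ID.unpack(peer_id)


mac = bytes.fromhex("02aabbccddee")  # made-up example MAC address
peer_id = pack_peer_id(0x1234, mac)
print(peer_id.hex())  # 123402aabbccddee
```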

A.2.2. SMC Proposal CLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 1    |            Length             |Version| Rsrvd |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                      Client's Peer ID                       -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                   Client's preferred GID                    -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    Client's preferred RoCE                    |
     +-         MAC address          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |Offset to mask/prefix area (0) |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     .                                                               .
     .                    Area for future growth                     .
     .                                                               .
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                       IPv4 Subnet Mask                        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | IPv4 Mask Lgth|           Reserved            |Num IPv6 prfx  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     :                                                               :
     :           Array of IPv6 prefixes (variable length)            :
     :                                                               :
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 26: SMC Proposal CLC Message Format

   The fields present in the SMC Proposal CLC message are:

   Eye catchers

      Like all CLC messages, the SMC Proposal has beginning and ending
      eye catchers to aid with verification and parsing.  The hex digits
      spell "SMCR" in IBM-1047 (EBCDIC).

   Type

      CLC message Type 1 indicates SMC Proposal.

   Length

      The length of this CLC message.  If this is an IPv4 flow, this
      value is 52.  Otherwise, it is variable, depending upon how many
      prefixes are listed.

   Version

      Version of the SMC-R protocol.  Version 1 is the only currently
      defined value.

   Client's Peer ID

      As described in Appendix A.2.1 above.

   Client's preferred RoCE GID

      The IPv6 address of the client's preferred RNIC on the RoCE
      fabric.

   Client's preferred RoCE MAC address

      The MAC address of the client's preferred RNIC on the RoCE fabric.
      It is required, as some operating systems do not have neighbor
      discovery or ARP support for RoCE RNICs.

   Offset to mask/prefix area

      Provides the number of bytes that must be skipped after this
      field, to access the IPv4 Subnet Mask field and the fields that
      follow it.  Allows for future growth of this signal.  In this
      version of the architecture, this value is always zero.

   Area for future growth

      In this version of the architecture, this field does not exist.
      This indicates where additional information may be inserted into
      the signal in the future.  The "Offset to mask/prefix area" field
      must be used to skip over this area.

   IPv4 Subnet Mask

      If this message is flowing over an IPv4 TCP connection, the value
      of the subnet mask associated with the interface over which the
      client sent this message.  If this is an IPv6 flow, this field is
      all zeros.

      This field, along with all fields that follow it in this signal,
      must be accessed by skipping the number of bytes listed in the
      "Offset to mask/prefix area" field after the end of that field.

   IPv4 Mask Lgth

      If this message is flowing over an IPv4 TCP connection, the number
      of significant bits in the IPv4 Subnet Mask field.  If this is an
      IPv6 flow, this field is zero.

   Num IPv6 prfx

      If this message is flowing over an IPv6 TCP connection, the number
      of IPv6 prefixes that follow, with a maximum value of 8.  If this
      is an IPv4 flow, this field is zero and is immediately followed by
      the ending eye catcher.

   Array of IPv6 prefixes

      For IPv6 TCP connections, a list of the IPv6 prefixes associated
      with the network over which the client sent this message, up to a
      maximum of eight prefixes.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +                                                               +
     |                                                               |
     +                  IPv6 prefix value                            +
     |                                                               |
     +                                                               +
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Prefix Length |
     +-+-+-+-+-+-+-+-+

              Figure 27: Format for IPv6 Prefix Array Element
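   To make the IPv4 case of Figure 26 concrete, the following sketch
   builds a 52-byte SMC Proposal.  The function name and argument order
   are hypothetical.  The sketch assumes a contiguous subnet mask (so
   that counting one bits gives the number of significant bits) and
   does not cover the IPv6 prefix array.

```python
import ipaddress
import struct

EYE_CATCHER = bytes([0xE2, 0xD4, 0xC3, 0xD9])  # "SMCR" in IBM-1047
PROPOSAL_TYPE = 1
SMC_VERSION = 1


def build_ipv4_proposal(peer_id, gid, mac, subnet_mask):
    """Sketch of an IPv4-flow SMC Proposal (no IPv6 prefixes)."""
    if (len(peer_id), len(gid), len(mac)) != (8, 16, 6):
        raise ValueError("peer_id/gid/mac must be 8/16/6 bytes")
    mask = ipaddress.IPv4Address(subnet_mask)
    mask_len = bin(int(mask)).count("1")  # assumes a contiguous mask
    body = b"".join([
        peer_id,
        gid,
        mac,
        struct.pack("!H", 0),               # offset to mask/prefix area
        mask.packed,                        # IPv4 Subnet Mask
        struct.pack("!B2xB", mask_len, 0),  # mask length, rsvd, 0 prefixes
    ])
    length = len(EYE_CATCHER) + 4 + len(body) + len(EYE_CATCHER)
    # Version occupies the high-order 4 bits of the byte after Length.
    header = struct.pack("!BHB", PROPOSAL_TYPE, length, SMC_VERSION << 4)
    return EYE_CATCHER + header + body + EYE_CATCHER


msg = build_ipv4_proposal(bytes(8), bytes(16), bytes(6), "255.255.255.0")
print(len(msg))  # 52
```

   Computing the Length field from the assembled parts, rather than
   hard-coding 52, keeps the sketch consistent with the figure if a
   field size is mistyped.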

A.2.3. SMC Accept CLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 2    |          Length = 68          |Version|F|Rsrvd|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                      Server's Peer ID                       -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                      Server's RoCE GID                      -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         Server's RoCE                         |
     +-         MAC address          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |     Server QP (bytes 1-2)     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Srvr QP byte 3 |          Server RMB RKey (bytes 1-3)          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Srvr RMB byte 4|Server RMB indx| Srvr RMB alert tkn (bytes 1-2)|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Srvr RMB alert tkn (bytes 3-4)| Bsize |  MTU  |   Reserved    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                Server's RMB virtual address                 -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Reserved    |    Server's initial packet sequence number    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 28: SMC Accept CLC Message Format

   The fields present in the SMC Accept CLC message are:

   Eye catchers

      Like all CLC messages, the SMC Accept has beginning and ending
      eye catchers to aid with verification and parsing.  The hex digits
      spell "SMCR" in IBM-1047 (EBCDIC).

   Type

      CLC message Type 2 indicates SMC Accept.

   Length

      The SMC Accept CLC message is 68 bytes long.

   Version

      Version of the SMC-R protocol.  Version 1 is the only currently
      defined value.

   F-bit

      First contact flag: A 1-bit flag that indicates that the server
      believes this TCP connection is the first SMC-R contact for this
      link group.

   Server's Peer ID

      As described in Appendix A.2.1 above.

   Server's RoCE GID

      The IPv6 address of the RNIC that the server chose for this SMC-R
      link.

   Server's RoCE MAC address

      The MAC address of the server's RNIC for the SMC-R link.  It is
      required, as some operating systems do not have neighbor discovery
      or ARP support for RoCE RNICs.

   Server's QP number

      The number for the reliably connected queue pair that the server
      created for this SMC-R link.

   Server's RMB RKey

      The RDMA RKey for the RMB that the server created or chose for
      this TCP connection.

   Server's RMB element index

      Indexes which element within the server's RMB will represent this
      TCP connection.

   Server's RMB element alert token

      A platform-defined, architecturally opaque token that identifies
      this TCP connection.  Added by the client as immediate data on
      RDMA writes from the client to the server to inform the server
      that there is data for this connection to retrieve from the
      RMB element.

   Bsize:

      Server's RMB element buffer size in 4-bit compressed notation:
      x = 4 bits.  Actual buffer size value is (2^(x + 4)) * 1K.
      Smallest possible value is 16K.  Largest size supported by this
      architecture is 512K.

   MTU

      An enumerated value indicating this peer's QP MTU size.  The two
      peers exchange their MTU values, and whichever value is smaller
      will be used for the QP.  This field should only be validated in
      the first contact exchange.

      The enumerated MTU values are:

         0:  reserved

         1:  256

         2:  512

         3:  1024

         4:  2048

         5:  4096

         6-15: reserved

   Server's RMB virtual address

      The virtual address of the server's RMB as assigned by the
      server's RNIC.

   Server's initial packet sequence number

      The starting packet sequence number that this peer will use when
      sending to the other peer, so that the other peer can prepare its
      QP for the sequence number to expect.
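   The Bsize and MTU fields described above share one byte of the SMC
   Accept message.  The following sketch decodes both and picks the
   smaller MTU.  All names are hypothetical.  Rejecting Bsize codes
   above 5 is an assumption drawn from the stated 512K maximum, not a
   rule stated in this appendix.

```python
# Enumerated MTU codes from the list above; all other codes are reserved.
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
MAX_BSIZE_CODE = 5  # (2^(5 + 4)) * 1K = 512K, the largest size supported


def split_bsize_mtu(octet):
    """Bsize is the high-order nibble, MTU the low-order nibble."""
    return (octet >> 4) & 0x0F, octet & 0x0F


def decode_bsize(x):
    """Expand the compressed size: (2^(x + 4)) * 1K bytes."""
    if not 0 <= x <= MAX_BSIZE_CODE:  # assumption: larger codes rejected
        raise ValueError("Bsize code outside the supported range")
    return (2 ** (x + 4)) * 1024


def decode_mtu(code):
    """Map an enumerated MTU code to a size in bytes."""
    if code not in MTU_BYTES:
        raise ValueError("reserved MTU code")
    return MTU_BYTES[code]


def negotiated_mtu(local_code, peer_code):
    """The smaller of the two exchanged MTU values is used for the QP."""
    return min(decode_mtu(local_code), decode_mtu(peer_code))


bsize_code, mtu_code = split_bsize_mtu(0x53)
print(decode_bsize(bsize_code), decode_mtu(mtu_code))  # 524288 1024
```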

A.2.4. SMC Confirm CLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 3    |          Length = 68          |Version| Rsrvd |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                      Client's Peer ID                       -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                      Client's RoCE GID                      -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         Client's RoCE                         |
     +-         MAC address          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |     Client QP (bytes 1-2)     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Clnt QP byte 3 |          Client RMB RKey (bytes 1-3)          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Clnt RMB byte 4|Client RMB indx| Clnt RMB alert tkn (bytes 1-2)|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Clnt RMB alert tkn (bytes 3-4)| Bsize |  MTU  |   Reserved    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                Client's RMB Virtual Address                 -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Reserved    |    Client's initial packet sequence number    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 29: SMC Confirm CLC Message Format

   The SMC Confirm CLC message is nearly identical to the SMC Accept,
   except that it contains client information and lacks a first contact
   flag.

   The fields present in the SMC Confirm CLC message are:

   Eye catchers

      Like all CLC messages, the SMC Confirm has beginning and ending
      eye catchers to aid with verification and parsing.  The hex digits
      spell "SMCR" in IBM-1047 (EBCDIC).

   Type

      CLC message Type 3 indicates SMC Confirm.

   Length

      The SMC Confirm CLC message is 68 bytes long.

   Version

      Version of the SMC-R protocol.  Version 1 is the only currently
      defined value.

   Client's Peer ID

      As described in Appendix A.2.1 above.

   Client's RoCE GID

      The IPv6 address of the RNIC that the client chose for this SMC-R
      link.

   Client's RoCE MAC address

      The MAC address of the client's RNIC for the SMC-R link.  It is
      required, as some operating systems do not have neighbor discovery
      or ARP support for RoCE RNICs.

   Client's QP number

      The number for the reliably connected queue pair that the client
      created for this SMC-R link.

   Client's RMB RKey

      The RDMA RKey for the RMB that the client created or chose for
      this TCP connection.

   Client's RMB element index

      Indexes which element within the client's RMB will represent this
      TCP connection.

   Client's RMB element alert token

      A platform-defined, architecturally opaque token that identifies
      this TCP connection.  Added by the server as immediate data on
      RDMA writes from the server to the client to inform the client
      that there is data for this connection to retrieve from the
      RMB element.

   Bsize:

      Client's RMB element buffer size in 4-bit compressed notation:
      x = 4 bits.  Actual buffer size value is (2^(x + 4)) * 1K.
      Smallest possible value is 16K.  Largest size supported by this
      architecture is 512K.

   MTU

      An enumerated value indicating this peer's QP MTU size.  The two
      peers exchange their MTU values, and whichever value is smaller
      will be used for the QP.  The values are enumerated in
      Appendix A.2.3.  This value should only be validated in the first
      contact exchange.

   Client's RMB Virtual Address

      The virtual address of the client's RMB as assigned by the
      client's RNIC.

   Client's initial packet sequence number

      The starting packet sequence number that this peer will use when
      sending to the other peer, so that the other peer can prepare its
      QP for the sequence number to expect.

A.2.5. SMC Decline CLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 4    |          Length = 28          |Version|S|Rsrvd|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                      Sender's Peer ID                       -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                  Peer Diagnosis Information                   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     x'E2'     |     x'D4'     |     x'C3'     |     x'D9'     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 30: SMC Decline CLC Message Format

   The fields present in the SMC Decline CLC message are:

   Eye catchers

      Like all CLC messages, the SMC Decline has beginning and ending
      eye catchers to aid with verification and parsing.  The hex digits
      spell "SMCR" in IBM-1047 (EBCDIC).

   Type

      CLC message Type 4 indicates SMC Decline.

   Length

      The SMC Decline CLC message is 28 bytes long.

   Version

      Version of the SMC-R protocol.  Version 1 is the only currently
      defined value.

   S-bit

      Sync Bit.  Indicates that the link group is out of sync and the
      receiving peer must clean up its representation of the link group.

   Sender's Peer ID

      As described in Appendix A.2.1 above.

   Peer Diagnosis Information

      4 bytes of diagnosis information provided by the peer.  These
      values are defined by the individual peers, and it is necessary to
      consult the peer's system documentation to interpret the results.
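   A minimal sketch of building and validating the 28-byte SMC Decline
   follows.  The names are hypothetical.  The diagnosis value is kept
   as four opaque bytes because its meaning is peer defined.  Reserved
   bytes are written as zero and are not checked on receipt, in line
   with the general rules in Appendix A.2.

```python
import struct

EYE_CATCHER = bytes([0xE2, 0xD4, 0xC3, 0xD9])  # "SMCR" in IBM-1047
DECLINE_TYPE = 4
DECLINE_LENGTH = 28
SYNC_BIT = 0x08  # S-bit: first bit after the 4-bit Version field
# Type, Length, Version/S/Rsrvd, Peer ID, diagnosis, 4 reserved bytes
BODY = struct.Struct("!BHB8s4s4x")


def build_decline(peer_id, diagnosis, out_of_sync=False, version=1):
    if len(peer_id) != 8 or len(diagnosis) != 4:
        raise ValueError("peer_id must be 8 bytes, diagnosis 4 bytes")
    flags = (version << 4) | (SYNC_BIT if out_of_sync else 0)
    body = BODY.pack(DECLINE_TYPE, DECLINE_LENGTH, flags,
                     peer_id, diagnosis)
    return EYE_CATCHER + body + EYE_CATCHER


def parse_decline(msg):
    """Validate both eye catchers, then decode the fixed fields."""
    if len(msg) != DECLINE_LENGTH:
        raise ValueError("SMC Decline must be 28 bytes long")
    if msg[:4] != EYE_CATCHER or msg[-4:] != EYE_CATCHER:
        raise ValueError("eye catcher mismatch")
    msg_type, length, flags, peer_id, diagnosis = BODY.unpack(msg[4:-4])
    if msg_type != DECLINE_TYPE or length != DECLINE_LENGTH:
        raise ValueError("not an SMC Decline message")
    return {
        "version": flags >> 4,
        "out_of_sync": bool(flags & SYNC_BIT),
        "peer_id": peer_id,
        "diagnosis": diagnosis,
    }
```

   A possible use is `parse_decline(build_decline(peer_id, diag,
   out_of_sync=True))`, which returns the same peer ID and diagnosis
   bytes with `out_of_sync` set.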

A.3. LLC Messages

   LLC messages are sent over an existing SMC-R link using RoCE SendMsg
   and are always 44 bytes long so that they fit into the space
   available in a single WQE without requiring the receiver to post
   receive buffers.  If all 44 bytes are not needed, they are padded out
   with zeros.

   LLC messages are in a request/response format.  The message type is
   the same for request and response, and a flag indicates whether a
   message is flowing as a request or a response.

   The two high-order bits of an LLC message opcode indicate how it is
   to be handled by a peer that does not support the opcode.  If the
   high-order bits of the opcode are b'00', then the peer must support
   the LLC message and indicate a protocol error if it does not.  If the
   high-order bits of the opcode are b'10', then the peer must silently
   discard the LLC message if it does not support the opcode.  This
   requirement is included to allow for toleration of advanced, but
   optional, functionality.  High-order bits of b'11' indicate a
   Connection Data Control (CDC) message as described in Appendix A.4.
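   The opcode handling rule above can be sketched as a small
   classifier.  The function names and the returned strings are
   hypothetical, and the sketch assumes the opcode is the first byte
   (the Type field) of the message.  The b'01' pattern is not described
   in this appendix, so the sketch reports it as undefined rather than
   guessing at a behavior.

```python
def classify_llc_opcode(opcode):
    """Classify a one-byte opcode by its two high-order bits."""
    top_bits = (opcode >> 6) & 0b11
    if top_bits == 0b00:
        return "mandatory"  # peer must support it
    if top_bits == 0b10:
        return "optional"   # peer may silently discard it
    if top_bits == 0b11:
        return "cdc"        # Connection Data Control message
    return "undefined"      # b'01' is not covered by the text


def action_for_unsupported(opcode):
    """What a receiver does with an opcode it does not implement."""
    kind = classify_llc_opcode(opcode)
    if kind == "mandatory":
        return "protocol-error"
    if kind == "optional":
        return "discard"
    return "not-applicable"


print(action_for_unsupported(0x05))  # protocol-error
print(action_for_unsupported(0x85))  # discard
```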

A.3.1. CONFIRM LINK LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 1    |  Length = 44  |   Reserved    |R|  Reserved   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         Sender's RoCE                         |
     +-         MAC address          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
     |                                                               |
     +-                                                             -+
     |                       Sender's RoCE GID                       |
     +-                                                             -+
     |                                                               |
     +-                              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               | Sender's QP number, bytes 1-2 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Sender QP byte3|  Link number  |Sender's link userID, bytes 1-2|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Sender's link userID, bytes 3-4|   Max links   |   Reserved    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                          Reserved                           -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 31: CONFIRM LINK LLC Message Format

   The CONFIRM LINK LLC message is required to be exchanged between the
   server and client over a newly created SMC-R link to complete the
   setup of an SMC-R link.  Its purpose is to confirm that the RoCE path
   is actually usable.

   On first contact, this message flows after the server receives the
   SMC Confirm CLC message from the client over the IP connection.  For
   additional links added to an SMC-R link group, it flows after the
   ADD LINK and ADD LINK CONTINUATION exchange.  This flow provides
   confirmation that the queue pair is in fact usable.  Each peer echoes
   its RoCE information back to the other.

   The contents of the CONFIRM LINK LLC message are:

   Type

      Type 1 indicates CONFIRM LINK.

   Length

      The CONFIRM LINK LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is a CONFIRM LINK
      reply.

   Sender's RoCE MAC address

      The MAC address of the sender's RNIC for the SMC-R link.  It is
      required, as some operating systems do not have neighbor discovery
      or ARP support for RoCE RNICs.

   Sender's RoCE GID

      The IPv6 address of the RNIC that the sender is using for this
      SMC-R link.

   Sender's QP number

      The number for the reliably connected queue pair that the sender
      created for this SMC-R link.

   Link number

      An identifier assigned by the server that uniquely identifies the
      link within the link group.  This identifier is ONLY unique within
      a link group.  Provided by the server and echoed back by the
      client.

   Link user ID

      An opaque, implementation-defined identifier assigned by the
      sender and provided to the receiver solely for purposes of
      display, diagnosis, network management, etc.  The link user ID
      should be unique across the sender's entire software space,
      including all other link groups.

   Max links

      The maximum number of links the sender can support in a link
      group.  The maximum for this link group is the smaller of the
      values provided by the two peers.
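   The 44-byte layout of Figure 31 maps directly onto a fixed-size
   record.  The sketch below packs one and applies the "smaller of the
   two values" rule for Max links.  The names are hypothetical, and
   encoding the 3-byte QP number in network byte order is an assumption
   made for the example.

```python
import struct

# Type, Length, reserved byte, flag byte, MAC, GID, QP number,
# link number, link user ID, max links, 9 reserved bytes.
CONFIRM_LINK = struct.Struct("!BBxB6s16s3sB4sB9x")
CONFIRM_LINK_TYPE = 1
REPLY_FLAG = 0x80  # R is the high-order bit of the flag byte


def build_confirm_link(mac, gid, qp_num, link_num, user_id, max_links,
                       reply=False):
    if (len(mac), len(gid), len(user_id)) != (6, 16, 4):
        raise ValueError("mac/gid/user_id must be 6/16/4 bytes")
    return CONFIRM_LINK.pack(
        CONFIRM_LINK_TYPE,
        CONFIRM_LINK.size,            # always 44
        REPLY_FLAG if reply else 0,
        mac,
        gid,
        qp_num.to_bytes(3, "big"),    # assumption: network byte order
        link_num,
        user_id,
        max_links,
    )


def negotiated_max_links(local_max, peer_max):
    """The link group uses the smaller of the two advertised values."""
    return min(local_max, peer_max)


print(CONFIRM_LINK.size)  # 44
```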

A.3.2. ADD LINK LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 2    |  Length = 44  | Rsrvd |RsnCode|R|Z| Reserved  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         Sender's RoCE                         |
     +-         MAC address          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
     |                                                               |
     +-                                                             -+
     |                       Sender's RoCE GID                       |
     +-                                                             -+
     |                                                               |
     +-                              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               | Sender's QP number, bytes 1-2 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Sender QP byte3|  Link number  | Rsrvd |  MTU  |  Initial PSN  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Initial PSN (continued)    |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                              -+
     |                           Reserved                            |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 32: ADD LINK LLC Message Format

   The ADD LINK LLC message is sent over an existing link in the link
   group when a peer wishes to add an SMC-R link to an existing SMC-R
   link group.  It is sent by the server to add a new SMC-R link to the
   group, or by the client to request that the server add a new link --
   for example, when a new RNIC becomes active.  When sent from the
   client to the server, it represents a request that the server
   initiate an ADD LINK exchange.

   This message is sent immediately after the initial SMC-R link in the
   group completes, as described in Section 3.5.1 ("First Contact").  It
   can also be sent over an existing SMC-R link group at any time as new
   RNICs are added and become available.  Therefore, there can be as few
   as one new RMB RToken to be communicated, or several.  RTokens will
   be communicated using ADD LINK CONTINUATION messages.

   The contents of the ADD LINK LLC message are:

   Type

      Type 2 indicates ADD LINK.

   Length

      The ADD LINK LLC message is 44 bytes long.

   RsnCode

      If the Z (rejection) flag is set, this field provides the reason
      code.  Values can be:

         X'1' - no alternate path available: set when the server
                provides the same MAC/GID as an existing SMC-R link in
                the group, and the client does not have any additional
                RNICs available (i.e., the server is attempting to set
                up an asymmetric link but none is available).

         X'2' - Invalid MTU value specified.

   R

      Reply flag.  When set, indicates that this is an ADD LINK reply.

   Z

      Rejection flag.  When set on reply, indicates that the server's
      ADD LINK was rejected by the client.  When this flag is set, the
      reason code will also be set.

   Sender's RoCE MAC address

      The MAC address of the sender's RNIC for the new SMC-R link.  It
      is required, as some operating systems do not have neighbor
      discovery or ARP support for RoCE RNICs.

   Sender's RoCE GID

      The IPv6 address of the RNIC that the sender is using for the new
      SMC-R link.

   Sender's QP number

      The number for the reliably connected queue pair that the sender
      created for the new SMC-R link.

   Link number

      An identifier for the new SMC-R link.  This is assigned by the
      server and uniquely identifies the link within the link group.
      This identifier is ONLY unique within a link group.  Provided by
      the server and echoed back by the client.

   MTU

      An enumerated value indicating this peer's QP MTU size.  The two
      peers exchange their MTU values, and whichever value is smaller
      will be used for the QP.  The values are enumerated in
      Appendix A.2.3.

   Initial PSN

      The starting packet sequence number (PSN) that this peer will use
      when sending to the other peer, so that the other peer can prepare
      its QP for the sequence number to expect.

A.3.3. ADD LINK CONTINUATION LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 3    |  Length = 44  |   Reserved    |R|  Reserved   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Linknum    |  NumRTokens   |           Reserved            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                      RKey/RToken pair                       -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                  RKey/RToken pair or zeros                  -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                           Reserved                            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 33: ADD LINK CONTINUATION LLC Message Format

   When a new SMC-R link is added to an SMC-R link group, it is
   necessary to communicate the new link's RTokens for the RMBs that the
   SMC-R link group can access.  This message follows the ADD LINK and
   provides the RTokens.

   The server kicks off this exchange by sending the first ADD LINK
   CONTINUATION LLC message, and the server controls the exchange as
   described below.

   o  If the client and the server require the same number of ADD LINK
      CONTINUATION messages to communicate their RTokens, the server
      starts the exchange by sending the first ADD LINK CONTINUATION
      request to the client with its (the server's) RTokens.  The client
      then responds with an ADD LINK CONTINUATION response with its
      RTokens, and so on until the exchange is completed.

   o  If the server requires more ADD LINK CONTINUATION messages than
      the client, then after the client has communicated all of its
      RTokens, the server continues to send ADD LINK CONTINUATION
      request messages to the client.  The client continues to respond,
      using empty (number of RTokens to be communicated = 0) ADD LINK
      CONTINUATION response messages.

   o  If the client requires more ADD LINK CONTINUATION messages than
      the server, then after communicating all of its RTokens, the
      server will continue to send empty ADD LINK CONTINUATION messages
      to the client to solicit replies with the client's RTokens, until
      all have been communicated.

   The contents of the ADD LINK CONTINUATION LLC message are:

   Type

      Type 3 indicates ADD LINK CONTINUATION.

   Length

      The ADD LINK CONTINUATION LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is an ADD LINK
      CONTINUATION reply.

   LinkNum

      The link number of the new link within the SMC-R link group for
      which RKeys are being communicated.

   NumRTokens

      Number of RTokens remaining to be communicated (including the ones
      in this message).  If the value is less than or equal to 2, this
      is the last message.  If it is greater than 2, another
      continuation message will be required, and its value will be the
      value in this message minus 2, and so on until all RKeys are
      communicated.  The maximum value for this field is 255.

   RKey/RToken pairs (two or less)

      These consist of an RKey for an RMB that is known on the SMC-R
      link over which this message was sent (the reference RKey), paired
      with the same RMB's RToken over the new SMC-R link.  A full RToken
      is not required for the reference, because it is only being used
      to distinguish which RMB it applies to, not address it.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         Reference RKey                        |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                            New RKey                           |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                       New Virtual Address                   -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 34: RKey/RToken Pair Format

   The contents of the RKey/RToken pair are:

   Reference RKey

      The RKey of the RMB as it is already known on the SMC-R link over
      which this message is being sent.  Required so that the peer knows
      with which RMB to associate the new RToken.

   New RKey

      The RKey of this RMB as it is known over the new SMC-R link.

   New Virtual Address

      The virtual address of this RMB as it is known over the new
      SMC-R link.
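   The NumRTokens countdown and the length of the request/response
   exchange can be worked through with a short sketch.  The function
   names are hypothetical.  The round count assumes each peer has at
   least one RToken to communicate, since the text describes "as few as
   one" and does not cover the case of none.

```python
TOKENS_PER_MESSAGE = 2  # each message carries at most two pairs


def numrtokens_sequence(total):
    """NumRTokens value carried by each successive message of one peer."""
    if not 0 <= total <= 255:
        raise ValueError("NumRTokens is a one-byte field (maximum 255)")
    values = []
    remaining = total
    while True:
        values.append(remaining)
        if remaining <= TOKENS_PER_MESSAGE:  # this is the last message
            return values
        remaining -= TOKENS_PER_MESSAGE


def exchange_rounds(server_tokens, client_tokens):
    """Request/response rounds needed; the longer side sets the count.

    Assumes both peers have at least one RToken to send.
    """
    server_msgs = len(numrtokens_sequence(server_tokens))
    client_msgs = len(numrtokens_sequence(client_tokens))
    return max(server_msgs, client_msgs)


print(numrtokens_sequence(5))  # [5, 3, 1]
print(exchange_rounds(5, 1))   # 3
```

   In the second call the client finishes after one response, so its
   remaining two responses would be the empty (zero RToken) messages
   described in the list above.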

A.3.4. DELETE LINK LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 4    |  Length = 44  |   Reserved    |R|A|O|  Rsrvd  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Linknum    |            reason code (bytes 1-3)            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |RsnCode byte 4 |                                               |
     +-+-+-+-+-+-+-+-+                                              -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                          Reserved                           -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 35: DELETE LINK LLC Message Format

   When the client or server detects that a QP or SMC-R link goes down
   or needs to come down, it sends this message over one of the other
   links in the link group.

   When the DELETE LINK is sent from the client, it only serves as a
   notification, and the client expects the server to respond by sending
   a DELETE LINK request.  To avoid races, only the server will initiate
   the actual DELETE LINK request and response sequence that results
   from notification from the client.

   The server can also initiate the DELETE LINK without notification
   from the client if it detects an error or if orderly link termination
   was initiated.

   The client may also request termination of the entire link group, and
   the server may terminate the entire link group using this message.

   The contents of the DELETE LINK LLC message are:

   Type

      Type 4 indicates DELETE LINK.

   Length

      The DELETE LINK LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is a DELETE LINK reply.

   A

      "All" flag.  When set, indicates that all links in the link group
      are to be terminated.  This terminates the link group.

   O

      Orderly flag.  Indicates orderly termination.  Orderly termination
      is generally caused by an operator command rather than an error on
      the link.  When the client requests orderly termination, the
      server may wait to complete other work before terminating.

   LinkNum

      The link number of the link to be terminated.  If the A flag is
      set, this field has no meaning and is set to 0.

   RsnCode

      The termination reason code.  Currently defined reason codes are:

      Request reason codes:

         X'00010000' = Lost path

         X'00020000' = Operator initiated termination

         X'00030000' = Program initiated termination (link inactivity)

         X'00040000' = LLC protocol violation

         X'00050000' = Asymmetric link no longer needed

      Response reason code:

         X'00100000' = Unknown link ID (no link)
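
   The DELETE LINK layout above can be sketched with Python's struct
   module.  This is an illustrative sketch, not a reference
   implementation: the function names are hypothetical, and big-endian
   (network) byte order for the reason code is an assumption.  Flag
   masks follow Figure 35, where bit 0 is the most significant bit.

```python
import struct

LLC_LEN = 44
DELETE_LINK = 4
# Flag bits in header byte 3; bit 0 of the diagram is the most significant.
FLAG_R, FLAG_A, FLAG_O = 0x80, 0x40, 0x20

# Type, Length, reserved byte, flags, LinkNum, RsnCode, 35 reserved bytes.
# Big-endian byte order is an assumption of this sketch.
_DELETE_LINK_LAYOUT = struct.Struct(">BBxBBI35x")

REASONS = {
    0x00010000: "Lost path",
    0x00020000: "Operator initiated termination",
    0x00030000: "Program initiated termination (link inactivity)",
    0x00040000: "LLC protocol violation",
    0x00050000: "Asymmetric link no longer needed",
    0x00100000: "Unknown link ID (no link)",
}


def build_delete_link(link_num, rsn_code, reply=False, all_links=False,
                      orderly=False):
    """Return a 44-byte DELETE LINK message (hypothetical helper)."""
    flags = ((FLAG_R if reply else 0) | (FLAG_A if all_links else 0)
             | (FLAG_O if orderly else 0))
    # LinkNum has no meaning when the A flag is set and is sent as 0.
    return _DELETE_LINK_LAYOUT.pack(DELETE_LINK, LLC_LEN, flags,
                                    0 if all_links else link_num, rsn_code)


def parse_delete_link(msg):
    """Decode a DELETE LINK message into a dict (hypothetical helper)."""
    if len(msg) != LLC_LEN:
        raise ValueError("LLC messages are 44 bytes long")
    msg_type, length, flags, link_num, rsn_code = \
        _DELETE_LINK_LAYOUT.unpack(msg)
    if msg_type != DELETE_LINK or length != LLC_LEN:
        raise ValueError("not a DELETE LINK message")
    return {
        "reply": bool(flags & FLAG_R),
        "all": bool(flags & FLAG_A),
        "orderly": bool(flags & FLAG_O),
        "link_num": link_num,
        "reason": REASONS.get(rsn_code, hex(rsn_code)),
    }
```

   Reserved bytes are written as zeros and ignored on receipt, in line
   with the general rule that reserved fields are not validated.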

A.3.5. CONFIRM RKEY LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 6    |   Length = 44 |    Reserved   |R|0|Z|C| Rsrvd |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    NumTkns    |    New RMB RKey for this link (bytes 1-3)     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |ThisLink byte 4|                                               |
     +-+-+-+-+-+-+-+-+                                              -+
     |             New RMB virtual address for this link             |
     +-              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |               |                                               |
     +-+-+-+-+-+-+-+-+                                              -+
     |                                                               |
     +-            Other link RMB specification or zeros            -+
     |                                                               |
     +-                              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                              -+
     |                                                               |
     +-                                                             -+
     |             Other link RMB specification or zeros             |
     +-                                              +-+-+-+-+-+-+-+-+
     |                                               |   Reserved    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 36: CONFIRM RKEY LLC Message Format

   The CONFIRM RKEY flow can be sent at any time from either the client
   or the server, to inform the peer that an RMB has been created or
   deleted.  The creator of a new RMB must inform its peer of the new
   RMB's RToken for all SMC-R links in the SMC-R link group.

   For RMB creation, the creator sends this message over the SMC-R link
   that the first TCP connection that uses the new RMB is using.  This
   message contains the new RMB RToken for the SMC-R link over which the
   message is sent.  It then lists the sender's SMC-R links in the link
   group paired with the new RToken for the new RMB for that link.  This
   message can communicate the new RTokens for three QPs: the QP for the
   link over which this message is sent, and two others.  If there are
   more than three links in the SMC-R link group, a CONFIRM RKEY
   CONTINUATION will be required.

   The peer responds by simply echoing the message with the response
   flag set.  If the response is a negative response, the sender must
   recalculate the RToken set and start a new CONFIRM RKEY exchange from
   the beginning.  The timing of this retry is controlled by the C flag,
   as described below.

   The contents of the CONFIRM RKEY LLC message are:

   Type

      Type 6 indicates CONFIRM RKEY.

   Length

      The CONFIRM RKEY LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is a CONFIRM RKEY
      reply.

   0

      Reserved bit.

   Z

      Negative response flag.

   C

      Configuration Retry bit.  If this is a negative response and this
      flag is set, the originator should recalculate the RKey set and
      retry this exchange as soon as the current configuration change is
      completed.  If this flag is not set on a negative response, the
      originator must wait for the next natural stimulus (for example, a
      new TCP connection started that requires a new RMB) before
      retrying.

   NumTkns

      The number of other link/RToken pairs, including those provided in
      this message, to be communicated.  Note that this value does not
      include the RToken for the link on which this message was sent
      (i.e., the maximum value is 2).  If this value is 3 or less, this
      is the only message in the exchange.  If this value is greater
      than 3, a CONFIRM RKEY CONTINUATION message will be required.

      Note: In this version of the architecture, eight is the maximum
      number of links supported in a link group.

   New RMB RKey for this link

      The new RMB's RKey as assigned on the link over which this message
      is being sent.

   New RMB virtual address for this link

      The new RMB's virtual address as assigned on the link over which
      this message is being sent.

   Other link RMB specification

      The new RMB's specification on the other links in the link group,
      as shown in Figure 37.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | Link number   | RMB's RKey for the specified link (bytes 1-3) |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |New RKey byte 4|                                               |
     +-+-+-+-+-+-+-+-+                                              -+
     |           RMB's virtual address for the specified link        |
     +-              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |               |
     +-+-+-+-+-+-+-+-+

                Figure 37: Format of Link Number/RKey Pairs

   Link number

      The link number for a link in the link group.

   RMB's RKey for the specified link

      The RKey used to reach the RMB over the link whose number was
      specified in the Link number field.

   RMB's virtual address for the specified link

      The virtual address used to reach the RMB over the link whose
      number was specified in the Link number field.
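
   As an illustration of Figures 36 and 37, the sketch below packs a
   CONFIRM RKEY request that carries the RToken for the sending link
   plus up to two link number/RKey pairs.  The function name is
   hypothetical, big-endian byte order is an assumption, and the sketch
   deliberately rejects inputs that would need a CONFIRM RKEY
   CONTINUATION message.

```python
import struct

CONFIRM_RKEY = 6
LLC_LEN = 44
FLAG_R, FLAG_Z, FLAG_C = 0x80, 0x20, 0x10  # bit 0 is the most significant


def build_confirm_rkey(this_rkey, this_vaddr, others=()):
    """Pack a CONFIRM RKEY request (hypothetical helper).

    others: iterable of (link_num, rkey, vaddr) tuples for other links.
    """
    others = list(others)
    if len(others) > 2:
        raise ValueError("more than two other links: "
                         "CONFIRM RKEY CONTINUATION would be required")
    # Type, Length, reserved, flags, NumTkns, RKey (4), virtual address (8).
    msg = struct.pack(">BBxBBIQ", CONFIRM_RKEY, LLC_LEN, 0,
                      len(others), this_rkey, this_vaddr)
    # Each "other link RMB specification" is 13 bytes (Figure 37).
    for link_num, rkey, vaddr in others:
        msg += struct.pack(">BIQ", link_num, rkey, vaddr)
    # Unused specifications and the trailing reserved byte are zeros.
    return msg.ljust(LLC_LEN, b"\x00")
```

   With one other link, the specification occupies bytes 17 through 29
   of the message, and the remaining 14 bytes are zero.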

A.3.6. CONFIRM RKEY CONTINUATION LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 8    |   Length = 44 |    Reserved   |R|0|Z|  Rsrvd  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  NumTknsLeft  |                                               |
     +-+-+-+-+-+-+-+-+                                              -+
     |                                                               |
     +-                Other link RMB specification                 -+
     |                                                               |
     +-              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |               |                                               |
     +-+-+-+-+-+-+-+-+                                              -+
     |                                                               |
     +-            Other link RMB specification or zeros            -+
     |                                                               |
     +-                              +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                               |                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                              -+
     |                                                               |
     +-                                                             -+
     |             Other link RMB specification or zeros             |
     +-                                              +-+-+-+-+-+-+-+-+
     |                                               |   Reserved    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 38: CONFIRM RKEY CONTINUATION LLC Message Format

   The CONFIRM RKEY CONTINUATION LLC message is used to communicate any
   additional RMB RTokens that did not fit into the CONFIRM RKEY
   message.  Each of these messages can hold up to three RMB RTokens.
   The NumTknsLeft field indicates how many RMB RTokens are to be
   communicated, including the ones in this message.  If the value is 3
   or less, this is the last message of the group.  If the value is 4 or
   higher, additional CONFIRM RKEY CONTINUATION messages will follow,
   and the NumTknsLeft value will be a countdown until all are
   communicated.

   Like the CONFIRM RKEY message, the peer responds by echoing the
   message back with the reply flag set.

   The contents of the CONFIRM RKEY CONTINUATION LLC message are:

   Type

      Type 8 indicates CONFIRM RKEY CONTINUATION.

   Length

      The CONFIRM RKEY CONTINUATION LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is a CONFIRM RKEY
      CONTINUATION reply.

   0

      Reserved bit.

   Z

      Negative response flag.

   NumTknsLeft

      The number of link/RToken pairs, including those provided in this
      message, that are remaining to be communicated.  If this value is
      3 or less, this is the last message in the exchange.  If this
      value is greater than 3, another CONFIRM RKEY CONTINUATION message
      will be required.  Note that in this version of the architecture,
      eight is the maximum number of links supported in a link group.

   Other link RMB specification

      The new RMB's specification on other links in the link group, as
      shown in Figure 37.
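
   The NumTknsLeft countdown can be sketched as follows.  The function
   name is hypothetical; the only rule it encodes is that each
   CONFIRM RKEY CONTINUATION message carries up to three link/RToken
   pairs and reports how many remain, including its own.

```python
TOKENS_PER_CONTINUATION = 3


def num_tkns_left_sequence(tokens_left):
    """Return the NumTknsLeft value for each continuation message.

    tokens_left: link/RToken pairs still to be sent when the first
    CONFIRM RKEY CONTINUATION message is built.
    """
    values = []
    while tokens_left > 0:
        # The field counts the pairs in this message as well.
        values.append(tokens_left)
        tokens_left -= min(TOKENS_PER_CONTINUATION, tokens_left)
    return values
```

   For example, five remaining pairs need two continuation messages,
   carrying NumTknsLeft values of 5 and then 2.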

A.3.7. DELETE RKEY LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 9    |   Length = 44 |    Reserved   |R|0|Z|  Rsrvd  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Count     |   Error Mask  |           Reserved            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                      First deleted RKey                       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                 Second deleted RKey or zeros                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                  Third deleted RKey or zeros                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                 Fourth deleted RKey or zeros                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                  Fifth deleted RKey or zeros                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                  Sixth deleted RKey or zeros                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                 Seventh deleted RKey or zeros                 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                 Eighth deleted RKey or zeros                  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                           Reserved                            |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 39: DELETE RKEY LLC Message Format

   The DELETE RKEY flow can be sent at any time from either the client
   or the server, to inform the peer that one or more RMBs have been
   deleted.  Because the peer already knows every RMB's RKey on each
   link in the link group, this message only specifies one RKey for each
   RMB being deleted.  The RKey provided for each deleted RMB will be
   its RKey as known on the SMC-R link over which this message is sent.
   It is not necessary to provide the entire RToken.  The RKey alone is
   sufficient for identifying an existing RMB.

   The peer responds by simply echoing the message with the response
   flag set.  If the peer did not recognize an RKey, a negative response
   flag will be set; however, no aggressive recovery action beyond
   logging the error will be taken.

   The contents of the DELETE RKEY LLC message are:

   Type

      Type 9 indicates DELETE RKEY.

   Length

      The DELETE RKEY LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is a DELETE RKEY reply.

   0

      Reserved bit.

   Z

      Negative response flag.

   Count

      Number of RMBs being deleted by this message.  Maximum value is 8.

   Error Mask

      If this is a negative response, indicates which RMBs were not
      successfully deleted.  Each bit corresponds to a listed RMB; for
      example, b'01010000' indicates that the second and fourth RKeys
      weren't successfully deleted.

   Deleted RKeys

      A list of Count RKeys.  Provided on the request flow and echoed
      back on the response flow.  Each RKey is valid on the link over
      which this message is sent and represents a deleted RMB.  Up to
      eight RMBs can be deleted in this message.
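
   The Error Mask can be decoded with a short helper.  This is a sketch
   with a hypothetical function name; it assumes, following the
   b'01010000' example above, that the most significant bit of the mask
   corresponds to the first listed RKey.

```python
def failed_rkeys(error_mask, rkeys):
    """Return the RKeys whose Error Mask bit is set.

    error_mask: the 1-byte Error Mask from a negative DELETE RKEY reply.
    rkeys: the Count RKeys listed in the message, in order.
    """
    # 0x80 >> i selects the bit for the i-th listed RKey (0-based).
    return [rkey for i, rkey in enumerate(rkeys)
            if error_mask & (0x80 >> i)]
```

   With a mask of b'01010000' and four listed RKeys, the helper returns
   the second and fourth entries.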

A.3.8. TEST LINK LLC Message Format

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |   Type = 7    |   Length = 44 |    Reserved   |R|  Reserved   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                          User Data                          -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                           Reserved                            |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-                                                             -+
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 40: TEST LINK LLC Message Format

   The TEST LINK request can be sent from either peer to the other on an
   existing SMC-R link at any time to test that the SMC-R link is active
   and healthy at the software level.  A peer that receives a TEST LINK
   LLC message immediately sends back a TEST LINK reply, echoing back
   the user data.  Refer also to Section 4.5.3 ("TCP Keepalive
   Processing").

   The contents of the TEST LINK LLC message are:

   Type

      Type 7 indicates TEST LINK.

   Length

      The TEST LINK LLC message is 44 bytes long.

   R

      Reply flag.  When set, indicates that this is a TEST LINK reply.

   User Data

      The receiver of this message echoes the sender's data back in a
      TEST LINK response LLC message.
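
   A TEST LINK responder can be sketched as an echo with the reply flag
   set.  The function name is hypothetical; the offsets follow
   Figure 40, where the flag byte is byte 3 and the User Data occupies
   bytes 4 through 19.

```python
TEST_LINK = 7
LLC_LEN = 44
FLAG_R = 0x80  # reply flag: most significant bit of header byte 3


def make_test_link_reply(request):
    """Build the TEST LINK reply for a received TEST LINK request."""
    if len(request) != LLC_LEN or request[0] != TEST_LINK:
        raise ValueError("not a TEST LINK message")
    if request[3] & FLAG_R:
        raise ValueError("message is already a reply")
    reply = bytearray(request)  # echoes the User Data unchanged
    reply[3] |= FLAG_R
    return bytes(reply)
```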

A.4. Connection Data Control (CDC) Message Format

   The RMBE control data is communicated using Connection Data Control
   (CDC) messages, which use RoCE SendMsg, similar to LLC messages.
   Also, as with LLC messages, CDC messages are 44 bytes long to ensure
   that they can fit into private data areas of receive WQEs without
   requiring the receiver to post receive buffers.  Unlike LLC messages,
   this data is integral to the data path, so its processing must be
   prioritized and optimized similarly to other data path processing.
   While LLC messages may be processed on a slower path than data, these
   messages cannot be.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   0  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Type = x'FE'  | Length = 44   |      Sequence number          |
   4  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       SMC-R alert token                       |
   8  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Reserved              | Producer cursor wrap seqno    |
   12 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Producer Cursor                         |
   16 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Reserved              | Consumer cursor wrap seqno    |
   20 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Consumer Cursor                         |
   24 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |B|P|U|R|F|Rsrvd|D|C|A|             Reserved                    |
   28 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
   32 +-                                                             -+
      |                                                               |
   36 +-                         Reserved                            -+
      |                                                               |
   40 +-                                                             -+
      |                                                               |
   44 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 41: Connection Data Control (CDC) Message Format

   Type = x'FE'

      This type number has the two high-order bits turned on to enable
      processing to quickly distinguish it from an LLC message.

   Length = 44

      The length of inline data that does not require the posting of a
      receive buffer.

   Sequence number

      A 2-byte unsigned integer that represents a wrapping sequence
      number.  The initial value is 1, and this value can wrap to 0.
      Incremented with every control message sent, except for the
      failover data validation message, and used to guard against
      processing an old control message out of sequence.  Also used in
      failover data validation.  In normal usage, if this number is less
      than the last received value, discard this message.  If greater,
      process this message.  Old control messages can be lost with no
      ill effect but cannot be processed after newer ones.

      If this is a failover validation CDC message (F flag set), then
      the receiver must verify that it has received and fully processed
      the RDMA write that was described by the CDC message with the
      sequence number in this message.  If not, the TCP connection must
      be reset to guard against data loss.  Details of this processing
      are provided in Section 4.6.1.

   SMC-R alert token

      The endpoint-assigned alert token that identifies to which TCP
      connection on the link group this control message refers.

   Producer cursor wrap seqno

      A 2-byte unsigned integer that represents a wrapping counter
      incremented by the producer whenever the data written into this
      RMBE receive buffer causes a wrap (i.e., the producer cursor
      wraps).  This is used by the receiver to determine when new data
      is available even though the cursors appear unchanged, such as
      when a full window size write is completed (producer cursor of
      this RMBE sent by peer = local consumer cursor) or in scenarios
      where the producer cursor sent for this RMBE < local consumer
      cursor.

   Producer Cursor

      A 4-byte unsigned integer that is a wrapping offset into the RMBE
      data area.  Points to the next byte of data to be written by the
      sender.  Can advance up to the receiver's consumer cursor as known
      by the sender.  When the urgent data present indicator is on,
      points 1 byte beyond the last byte of urgent data.  When computing
      this cursor, the presence of the eye catcher in the RMBE data area
      must be accounted for.  The first writable data location in the
      RMBE is at offset 4, so this cursor begins at 4 and wraps to 4.
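
   The interaction between the producer cursor and its wrap sequence
   number can be sketched as below.  This is an illustration under
   stated assumptions, not the normative algorithm: the function name is
   hypothetical, rmbe_size is assumed to be the full RMBE length
   including the 4-byte eye catcher, and a write that ends exactly at
   the end of the RMBE is assumed to count as a wrap.

```python
EYE_CATCHER_LEN = 4  # first writable offset in the RMBE


def advance_producer(cursor, wrap_seqno, nbytes, rmbe_size):
    """Return (new_cursor, new_wrap_seqno) after writing nbytes."""
    data_len = rmbe_size - EYE_CATCHER_LEN
    if not EYE_CATCHER_LEN <= cursor < rmbe_size:
        raise ValueError("cursor outside the RMBE data area")
    if not 0 <= nbytes <= data_len:
        raise ValueError("write larger than the RMBE data area")
    pos = (cursor - EYE_CATCHER_LEN) + nbytes
    if pos >= data_len:
        # The cursor wraps back to offset 4; bump the 2-byte counter.
        pos -= data_len
        wrap_seqno = (wrap_seqno + 1) & 0xFFFF
    return EYE_CATCHER_LEN + pos, wrap_seqno
```

   A full-window write leaves the cursor value unchanged and only
   increments the wrap sequence number, which is the case the wrap
   counter exists to detect.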

   Consumer cursor wrap seqno

      A 2-byte unsigned integer that mirrors the value of the producer
      cursor wrap sequence number when the last read from this RMBE
      occurred.  Used as an indicator of how far along the consumer is
      in reading data (i.e., processed last wrap point or not).  The
      producer side can use this indicator to detect whether or not more
      data can be written to the partner in full window write scenarios
      (where the producer cursor = consumer cursor as known on the
      remote RMBE).  In this scenario, if the consumer sequence number
      equals the local producer sequence number, the producer knows that
      more data can be written.

   Consumer Cursor

      A 4-byte unsigned integer that is a wrapping offset into the
      sender's RMBE data area.  Points to the offset of the next byte of
      data to be consumed by the peer in its own RMBE.  When computing
      this cursor, the presence of the eye catcher in the RMBE data area
      must be accounted for.  The first writable data location in the
      RMBE is at offset 4, so this cursor begins at 4 and wraps to 4.
      The sender cannot write beyond this cursor into the peer's RMBE
      without causing data loss.

   B-bit

      Writer blocked indicator: Sender is blocked for writing.  If this
      bit is set, sender will require explicit notification when receive
      buffer space is available.

   P-bit

      Urgent data pending: Sender has urgent data pending for this
      connection.

   U-bit

      Urgent data present: Indicates that urgent data is present in the
      RMBE data area, and the producer cursor points to 1 byte beyond
      the last byte of urgent data.

   R-bit

      Request for consumer cursor update: Indicates that an immediate
      consumer cursor update is requested, regardless of whether or not
      one is warranted according to the window size optimization
      algorithm described in Section 4.5.1.

   F-bit

      Failover validation indicator: Sent by a peer to guard against
      data loss during failover when the TCP connection is being moved
      to another SMC-R link in the link group.  When this bit is set,
      the only other fields in the CDC message that are significant are
      the Type, Length, SMC-R alert token, and Sequence number fields.
      The receiver must validate that it has fully processed the RDMA
      write described by the previous CDC message bearing the same
      sequence number as this validation message.  If it has, no further
      action is required.  If it has not, the TCP connection must be
      reset.  This processing is described in detail in Section 4.6.1.

   D-bit

      Sending done indicator: Sent by a peer when it is done writing new
      data into the receiver's RMBE data area.

   C-bit

      PeerConnectionClosed indicator: Sent by a peer when it is
      completely done with this connection and will no longer be making
      any updates to the receiver's RMBE or sending any more control
      messages.

   A-bit

      Abnormal close indicator: Sent by a peer when the connection is
      abnormally terminated (for example, the TCP connection was reset).
      When sent, it indicates that the peer is completely done with this
      connection and will no longer be making any updates to this RMBE
      or sending any more control messages.  It also indicates that the
      RMBE owner must flush any remaining data on this connection and
      generate an error return code to any outstanding socket APIs on
      this connection (same processing as receiving a RST segment on a
      TCP connection).
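
   The CDC layout in Figure 41 can be decoded as sketched below.  The
   names are hypothetical and big-endian byte order is an assumption.
   Flag masks follow the figure, with bit 0 as the most significant bit:
   B, P, U, R, and F sit in byte 24, and D, C, and A in byte 25.

```python
import struct

CDC_TYPE = 0xFE
CDC_LEN = 44

# Type, Length, Sequence number, alert token, 2 reserved bytes,
# producer wrap seqno, producer cursor, 2 reserved bytes,
# consumer wrap seqno, consumer cursor, two flag bytes, 18 reserved bytes.
_CDC = struct.Struct(">BBHI2xHI2xHIBB18x")

_BYTE24_FLAGS = (("B", 0x80), ("P", 0x40), ("U", 0x20),
                 ("R", 0x10), ("F", 0x08))
_BYTE25_FLAGS = (("D", 0x80), ("C", 0x40), ("A", 0x20))


def is_cdc(msg):
    """True if the two high-order bits of the Type byte are set."""
    return len(msg) == CDC_LEN and (msg[0] & 0xC0) == 0xC0


def parse_cdc(msg):
    """Decode a CDC message into a dict (hypothetical helper)."""
    if not is_cdc(msg):
        raise ValueError("not a CDC message")
    (msg_type, length, seqno, token, prod_wrap, prod_cursor,
     cons_wrap, cons_cursor, byte24, byte25) = _CDC.unpack(msg)
    if msg_type != CDC_TYPE or length != CDC_LEN:
        raise ValueError("unexpected Type or Length")
    flags = {name for name, bit in _BYTE24_FLAGS if byte24 & bit}
    flags |= {name for name, bit in _BYTE25_FLAGS if byte25 & bit}
    return {
        "seqno": seqno,
        "alert_token": token,
        "producer_wrap_seqno": prod_wrap,
        "producer_cursor": prod_cursor,
        "consumer_wrap_seqno": cons_wrap,
        "consumer_cursor": cons_cursor,
        "flags": flags,
    }
```

   When the F flag is present in the result, only the Type, Length,
   SMC-R alert token, and Sequence number fields are significant, so a
   caller would ignore the cursor values in that case.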


