RFC 5046

Internet Small Computer System Interface (iSCSI) Extensions for Remote Direct Memory Access (RDMA)

Part 4 of 4, p. 58 to 85

9.  iSER Control and Data Transfer

   For iSCSI data-type PDUs (see Section 7.1), the iSER layer uses RDMA
   Read and RDMA Write operations to transfer the solicited data.  For
   iSCSI control-type PDUs (see Section 7.2), the iSER layer uses Send
   Message Types of RCaP.

9.1.  iSER Header Format

   An iSER header MUST be present in every Send Message Type of RCaP.
   The iSER header is located in the first 12 bytes of the message
   payload of the Send Message Type of RCaP, as shown in Figure 2.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Opcode|                  Opcode Specific Fields               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    Opcode Specific Fields                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                    Opcode Specific Fields                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 2.  iSER Header Format

   Opcode - Operation Code: 4 bits

        The Opcode field identifies the type of iSER Message:

           0001b = iSCSI control-type PDU

           0010b = iSER Hello Message

           0011b = iSER HelloReply Message

           All other opcodes are reserved.
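
   As an illustration only, the opcode can be read from the top 4 bits
   of the first header byte.  The sketch below is not part of the
   specification; the names IserOpcode and read_opcode are hypothetical,
   and treating a reserved opcode as an exception is an assumption of
   this example.

```python
from enum import IntEnum

ISER_HEADER_LEN = 12  # the iSER header is the first 12 bytes of the payload


class IserOpcode(IntEnum):
    """Opcode values listed in Section 9.1 (member names are illustrative)."""
    CONTROL_PDU = 0b0001
    HELLO = 0b0010
    HELLO_REPLY = 0b0011


def read_opcode(payload: bytes) -> IserOpcode:
    """Return the opcode held in the top 4 bits of the first header byte."""
    if len(payload) < ISER_HEADER_LEN:
        raise ValueError("payload shorter than the 12-byte iSER header")
    raw = payload[0] >> 4
    try:
        return IserOpcode(raw)
    except ValueError:
        # All other opcodes are reserved.
        raise ValueError(f"reserved iSER opcode {raw:#06b}") from None
```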

9.2.  iSER Header Format for the iSCSI Control-Type PDU

   The iSER layer uses Send Message Types of RCaP to transfer iSCSI
   control-type PDUs (see Section 7.2).  The message payload of each of
   the Send Message Types of RCaP used for transferring an iSER Message
   contains an iSER Header followed by an iSCSI control-type PDU.

   The iSER header in a Send Message Type of RCaP carrying an iSCSI
   control-type PDU MUST have the format as described in Figure 3.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |       |W|R|                                                   |
      | 0001b |S|S|                  Reserved                         |
      |       |V|V|                                                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Write STag (or N/A)                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Read STag (or N/A)                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 3.  iSER Header Format for iSCSI Control-Type PDU

   WSV - Write STag Valid flag: 1 bit

       This flag indicates the validity of the Write STag field of the
       iSER Header.  If set to one, the Write STag field in this iSER
       Header is valid.  If set to zero, the Write STag field in this
       iSER Header MUST be ignored at the receiver.  The Write STag
       Valid flag is set to one when there is solicited data to be
       transferred for a SCSI write or bidirectional command, or when
       there are non-immediate unsolicited and solicited data to be
       transferred for the referenced task specified in a Task
       Management Function Request with the TASK REASSIGN function.

   RSV - Read STag Valid flag: 1 bit

       This flag indicates the validity of the Read STag field of the
       iSER Header.  If set to one, the Read STag field in this iSER
       Header is valid.  If set to zero, the Read STag field in this
       iSER Header MUST be ignored at the receiver.  The Read STag Valid
       flag is set to one for a SCSI read or bidirectional command, or
       for a Task Management Function Request with the TASK REASSIGN
       function.

   Write STag - Write Steering Tag: 32 bits

       This field contains the Write STag when the Write STag Valid flag
       is set to one.  For a SCSI write or bidirectional command, the
       Write STag is used to Advertise the initiator's I/O Buffer
       containing the solicited data.  For a Task Management Function
       Request with the TASK REASSIGN function, the Write STag is used
       to Advertise the initiator's I/O Buffer containing the non-
       immediate unsolicited data and solicited data.  This Write STag
       is used as the Data Source STag in the resultant RDMA Read
       operation(s).  When the Write STag Valid flag is set to zero,
       this field MUST be set to zero.

   Read STag - Read Steering Tag: 32 bits

       This field contains the Read STag when the Read STag Valid flag
       is set to one.  The Read STag is used to Advertise the
       initiator's Read I/O Buffer of a SCSI read or bidirectional
       command, or of a Task Management Function Request with the TASK
       REASSIGN function.  This Read STag is used as the Data Sink STag
       in the resultant RDMA Write operation(s).  When the Read STag
       Valid flag is zero, this field MUST be set to zero.

   Reserved:

       Reserved fields MUST be set to zero on transmit and MUST be
       ignored on reception.
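
   The layout in Figure 3 can be sketched as a pair of encode/decode
   helpers.  This is a minimal, non-normative example: the function and
   type names are hypothetical, multi-byte fields are assumed to be in
   network byte order, and an absent STag is modeled as None.

```python
import struct
from typing import NamedTuple, Optional

OPCODE_CONTROL = 0b0001
WSV_BIT = 0x08  # bit 4 of the first byte (Write STag Valid)
RSV_BIT = 0x04  # bit 5 of the first byte (Read STag Valid)


class ControlHeader(NamedTuple):
    write_stag: Optional[int]  # None when WSV is zero
    read_stag: Optional[int]   # None when RSV is zero


def pack_control_header(write_stag: Optional[int] = None,
                        read_stag: Optional[int] = None) -> bytes:
    """Build the 12-byte iSER header for an iSCSI control-type PDU."""
    first = OPCODE_CONTROL << 4
    if write_stag is not None:
        first |= WSV_BIT
    if read_stag is not None:
        first |= RSV_BIT
    # Reserved bits and STag fields without a valid flag are sent as zero.
    return struct.pack("!B3xII", first, write_stag or 0, read_stag or 0)


def unpack_control_header(header: bytes) -> ControlHeader:
    """Decode the header, ignoring any STag whose valid flag is zero."""
    first, write_stag, read_stag = struct.unpack("!B3xII", header[:12])
    if first >> 4 != OPCODE_CONTROL:
        raise ValueError("not an iSCSI control-type iSER header")
    return ControlHeader(
        write_stag if first & WSV_BIT else None,
        read_stag if first & RSV_BIT else None,
    )
```

   Returning None for an invalid STag keeps a receiver from acting on a
   field that the flag says is to be ignored.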

9.3.  iSER Header Format for the iSER Hello Message

   An iSER Hello Message MUST only contain the iSER header, which MUST
   have the format as described in Figure 4.  The iSER Hello Message is
   the first iSER Message sent on the RCaP Stream from the iSER layer at
   the initiator to the iSER layer at the target.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |       |       |       |       |                               |
      | 0010b | Rsvd  | MaxVer| MinVer|           iSER-IRD            |
      |       |       |       |       |                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           Reserved                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           Reserved                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 4.  iSER Header Format for iSER Hello Message

   MaxVer - Maximum Version: 4 bits

       This field specifies the maximum version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   MinVer - Minimum Version: 4 bits

       This field specifies the minimum version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   iSER-IRD: 16 bits

       This field contains the value of the iSER-IRD at the initiator.

   Reserved (Rsvd):

       Reserved fields MUST be set to zero on transmit, and MUST be
       ignored on reception.
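
   For illustration, the Hello header in Figure 4 could be built and
   parsed as follows.  The helper names are hypothetical, network byte
   order is assumed for the 16-bit iSER-IRD field, and rejecting any
   message that is not exactly 12 bytes is this sketch's reading of
   "MUST only contain the iSER header".

```python
import struct

OPCODE_HELLO = 0b0010
ISER_VERSION = 1  # MaxVer and MinVer are both one in this specification


def pack_hello(iser_ird: int) -> bytes:
    """Build a 12-byte iSER Hello Message carrying the initiator's iSER-IRD."""
    if not 0 <= iser_ird <= 0xFFFF:
        raise ValueError("iSER-IRD must fit in 16 bits")
    byte0 = OPCODE_HELLO << 4                   # opcode, then 4 reserved bits
    byte1 = (ISER_VERSION << 4) | ISER_VERSION  # MaxVer | MinVer
    return struct.pack("!BBH8x", byte0, byte1, iser_ird)


def unpack_hello(message: bytes) -> dict:
    """Parse an iSER Hello Message; reserved fields are ignored."""
    if len(message) != 12:
        raise ValueError("an iSER Hello Message is exactly the 12-byte header")
    byte0, byte1, iser_ird = struct.unpack("!BBH8x", message)
    if byte0 >> 4 != OPCODE_HELLO:
        raise ValueError("opcode is not iSER Hello")
    return {"max_ver": byte1 >> 4, "min_ver": byte1 & 0x0F,
            "iser_ird": iser_ird}
```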

9.4.  iSER Header Format for the iSER HelloReply Message

   An iSER HelloReply Message MUST only contain the iSER header which
   MUST have the format as described in Figure 5.  The iSER HelloReply
   Message is the first iSER Message sent on the RCaP Stream from the
   iSER layer at the target to the iSER layer at the initiator.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |       |     |R|       |       |                               |
      | 0011b |Rsvd |E| MaxVer| CurVer|           iSER-ORD            |
      |       |     |J|       |       |                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           Reserved                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           Reserved                            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 5.  iSER Header Format for iSER HelloReply Message

   REJ - Reject flag: 1 bit

       This flag indicates whether the target is rejecting this
       connection.  If set to one, the target is rejecting the
       connection.

   MaxVer - Maximum Version: 4 bits

       This field specifies the maximum version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   CurVer - Current Version: 4 bits

       This field specifies the current version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   iSER-ORD: 16 bits

       This field contains the value of the iSER-ORD at the target.

   Reserved (Rsvd):

       Reserved fields MUST be set to zero on transmit, and MUST be
       ignored on reception.
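
   A comparable sketch for the HelloReply header in Figure 5 is shown
   below.  As before, the names are hypothetical and network byte order
   is assumed; the REJ flag is taken to be the last bit of the first
   byte, following the figure.

```python
import struct

OPCODE_HELLO_REPLY = 0b0011
ISER_VERSION = 1  # MaxVer and CurVer are both one in this specification
REJ_BIT = 0x01    # bit 7 of the first byte


def pack_hello_reply(iser_ord: int, reject: bool = False) -> bytes:
    """Build a 12-byte iSER HelloReply Message carrying the target's iSER-ORD."""
    if not 0 <= iser_ord <= 0xFFFF:
        raise ValueError("iSER-ORD must fit in 16 bits")
    byte0 = (OPCODE_HELLO_REPLY << 4) | (REJ_BIT if reject else 0)
    byte1 = (ISER_VERSION << 4) | ISER_VERSION  # MaxVer | CurVer
    return struct.pack("!BBH8x", byte0, byte1, iser_ord)


def unpack_hello_reply(message: bytes) -> dict:
    """Parse an iSER HelloReply Message; reserved fields are ignored."""
    if len(message) != 12:
        raise ValueError("an iSER HelloReply Message is exactly 12 bytes")
    byte0, byte1, iser_ord = struct.unpack("!BBH8x", message)
    if byte0 >> 4 != OPCODE_HELLO_REPLY:
        raise ValueError("opcode is not iSER HelloReply")
    return {"reject": bool(byte0 & REJ_BIT), "max_ver": byte1 >> 4,
            "cur_ver": byte1 & 0x0F, "iser_ord": iser_ord}
```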

9.5.  SCSI Data Transfer Operations

   The iSER layer at the initiator and the iSER layer at the target
   handle each SCSI Write, SCSI Read, and bidirectional operation as
   described below.

9.5.1.  SCSI Write Operation

   The iSCSI layer at the initiator MUST invoke the Send_Control
   Operational Primitive to request that the iSER layer at the initiator
   send the SCSI write command.  The iSER layer at the initiator MUST
   request that the RCaP layer transmit a SendSE Message with the
   message payload consisting of the iSER header followed by the SCSI
   Command PDU and immediate data (if any).  If there is solicited data,
   the iSER layer MUST Advertise the Write STag in the iSER header of
   the SendSE Message, as described in Section 9.2.  Upon receiving the
   SendSE Message, the iSER layer at the target MUST notify the iSCSI
   layer at the target by invoking the Control_Notify Operational
   Primitive qualified with the SCSI Command PDU.  See Section 7.3.1 for
   details on the handling of the SCSI write command.

   For the non-immediate unsolicited data, the iSCSI layer at the
   initiator MUST invoke a Send_Control Operational Primitive qualified
   with the SCSI Data-out PDU.  Upon receiving each Send or SendSE
   Message containing the non-immediate unsolicited data, the iSER layer
   at the target MUST notify the iSCSI layer at the target by invoking
   the Control_Notify Operational Primitive qualified with the SCSI
   Data-out PDU.  See Section 7.3.4 for details on the handling of the
   SCSI Data-out PDU.

   For the solicited data, when the iSCSI layer at the target has an I/O
   Buffer available, it MUST invoke the Get_Data Operational Primitive
   qualified with the R2T PDU.  See Section 7.3.6 for details on the
   handling of the R2T PDU.

   When the data transfer associated with this SCSI Write operation is
   complete, the iSCSI layer at the target MUST invoke the Send_Control
   Operational Primitive when it is ready to send the SCSI Response PDU.
   Upon receiving a SendSE or SendInvSE Message containing the SCSI
   Response PDU, the iSER layer at the initiator MUST notify the iSCSI
   layer at the initiator by invoking the Control_Notify Operational
   Primitive qualified with the SCSI Response PDU.  See Section 7.3.2
   for details on the handling of the SCSI Response PDU.

9.5.2.  SCSI Read Operation

   The iSCSI layer at the initiator MUST invoke the Send_Control
   Operational Primitive to request that the iSER layer at the initiator
   send the SCSI read command.  The iSER layer at the initiator MUST
   request that the RCaP layer transmit a SendSE Message with the
   message payload consisting of the iSER header followed by the SCSI
   Command PDU.  The iSER layer at the initiator MUST Advertise the Read
   STag in the iSER header of the SendSE Message, as described in
   Section 9.2.  Upon receiving the SendSE Message, the iSER layer at
   the target MUST notify the iSCSI layer at the target by invoking the
   Control_Notify Operational Primitive qualified with the SCSI Command
   PDU.  See Section 7.3.1 for details on the handling of the SCSI read
   command.

   When the requested SCSI data is available in the I/O Buffer, the
   iSCSI layer at the target MUST invoke the Put_Data Operational
   Primitive qualified with the SCSI Data-in PDU.  See Section 7.3.5 for
   details on the handling of the SCSI Data-in PDU.

   When the data transfer associated with this SCSI Read operation is
   complete, the iSCSI layer at the target MUST invoke the Send_Control
   Operational Primitive when it is ready to send the SCSI Response PDU.
   Upon receiving the SendInvSE Message containing the SCSI Response
   PDU, the iSER layer at the initiator MUST notify the iSCSI layer at
   the initiator by invoking the Control_Notify Operational Primitive
   qualified with the SCSI Response PDU.  See Section 7.3.2 for details
   on the handling of the SCSI Response PDU.

9.5.3.  Bidirectional Operation

   The initiator and the target handle the SCSI Write and the SCSI Read
   portions of this bidirectional operation the same as described in
   Sections 9.5.1 and 9.5.2, respectively.
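
   The choice of which STags the initiator Advertises with a SCSI
   Command PDU, per Sections 9.2 and 9.5, can be summarized in a small
   helper.  This is a non-normative sketch; the function name and its
   boolean parameters are hypothetical, and Task Management Function
   Requests with the TASK REASSIGN function are not covered.

```python
from typing import Tuple


def stag_valid_flags(is_write: bool, is_read: bool,
                     has_solicited_data: bool) -> Tuple[bool, bool]:
    """Return (WSV, RSV) for the iSER header of a SCSI Command PDU.

    A bidirectional command is modeled as is_write and is_read both True.
    """
    # The Write STag is Advertised only when solicited data will follow.
    wsv = is_write and has_solicited_data
    # The Read STag is Advertised for a read or bidirectional command.
    rsv = is_read
    return wsv, rsv
```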

10.  iSER Error Handling and Recovery

   RCaP provides the iSER layer with reliable in-order delivery.
   Therefore, the error management needs of an iSER-assisted connection
   are somewhat different from those of a Traditional iSCSI connection.

10.1.  Error Handling

   iSER error handling is described in the following sections,
   classified loosely based on the sources of errors:

   1.  Those originating at the transport layer (e.g., TCP).

   2.  Those originating at the RCaP layer.

   3.  Those originating at the iSER layer.

   4.  Those originating at the iSCSI layer.

10.1.1.  Errors in the Transport Layer

   If the transport layer is TCP, then TCP packets with detected errors
   are silently dropped by the TCP layer and result in retransmission at
   the TCP layer.  This has no impact on the iSER layer.  However,
   connection loss (e.g., link failure) and unexpected termination
   (e.g., TCP graceful or abnormal close without the iSCSI Logout
   exchanges) at the transport layer will cause the iSCSI/iSER
   connection to be terminated as well.

10.1.1.1.  Failure in the Transport Layer before RCaP Mode Is Enabled

   If the connection is lost or terminated before the iSCSI layer
   invokes the Allocate_Connection_Resources Operational Primitive, the
   login process is terminated and no further action is required.

   If the connection is lost or terminated after the iSCSI layer has
   invoked the Allocate_Connection_Resources Operational Primitive, then
   the iSCSI layer MUST request that the iSER layer deallocate all
   connection resources by invoking the Deallocate_Connection_Resources
   Operational Primitive.

10.1.1.2.  Failure in the Transport Layer after RCaP Mode Is Enabled

   If the connection is lost or terminated after the iSCSI layer has
   invoked the Enable_Datamover Operational Primitive, the iSER layer
   MUST notify the iSCSI layer of the connection loss by invoking the
   Connection_Terminate_Notify Operational Primitive.  Prior to invoking
   the Connection_Terminate_Notify Operational Primitive, the iSER layer
   MUST perform the actions described in Section 5.2.3.2.
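
   The handling of a transport failure in Sections 10.1.1.1 and 10.1.1.2
   depends on how far connection setup has progressed.  The following
   sketch only tabulates that dependency; the SetupStage enumeration and
   the returned action strings are hypothetical labels, not an API
   defined by this document.

```python
from enum import Enum, auto
from typing import List


class SetupStage(Enum):
    BEFORE_ALLOCATE = auto()    # Allocate_Connection_Resources not yet invoked
    ALLOCATED = auto()          # allocated, Enable_Datamover not yet invoked
    DATAMOVER_ENABLED = auto()  # Enable_Datamover has been invoked


def transport_failure_actions(stage: SetupStage) -> List[str]:
    """List, in order, the actions required after a transport failure."""
    if stage is SetupStage.BEFORE_ALLOCATE:
        # The login process is terminated; nothing further is required.
        return []
    if stage is SetupStage.ALLOCATED:
        # Invoked by the iSCSI layer.
        return ["Deallocate_Connection_Resources"]
    # Performed by the iSER layer, cleanup first, then the notification.
    return ["Section 5.2.3.2 actions", "Connection_Terminate_Notify"]
```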

10.1.2.  Errors in the RCaP Layer

   The RCaP layer does not have error recovery operations built in.  If
   errors are detected at the RCaP layer, the RCaP layer will terminate
   the RCaP Stream and the associated connection.

10.1.2.1.  Errors Detected in the Local RCaP Layer

   If an error is encountered at the local RCaP layer, the RCaP layer
   MAY send a Terminate Message to the Remote Peer to report the error
   if possible.  (For iWARP, see [RDMAP] for the list of errors where a
   Terminate Message is sent.)  The RCaP layer is responsible for
   terminating the connection.  After the RCaP layer notifies the iSER
   layer that the connection is terminated, the iSER layer MUST notify
   the iSCSI layer by invoking the Connection_Terminate_Notify
   Operational Primitive.  Prior to invoking the
   Connection_Terminate_Notify Operational Primitive, the iSER layer
   MUST perform the actions described in Section 5.2.3.2.

10.1.2.2.  Errors Detected in the RCaP Layer at the Remote Peer

   If an error is encountered at the RCaP layer at the Remote Peer, the
   RCaP layer at the Remote Peer may send a Terminate Message to report
   the error if possible.  If it is unable to send the Terminate
   Message, the connection is terminated.  This is treated the same as a
   failure in the transport layer after RCaP mode is enabled, as
   described in Section 10.1.1.2.

   If an error is encountered at the RCaP layer at the Remote Peer and
   it is able to send a Terminate Message, the RCaP layer at the Remote
   Peer is responsible for terminating the connection.  After the local
   RCaP layer notifies the iSER layer that the connection is terminated,
   the iSER layer MUST notify the iSCSI layer by invoking the
   Connection_Terminate_Notify Operational Primitive.  Prior to invoking
   the Connection_Terminate_Notify Operational Primitive, the iSER layer
   MUST perform the actions described in Section 5.2.3.2.

10.1.3.  Errors in the iSER Layer

   The error handling due to errors at the iSER layer is described in
   the following sections.

10.1.3.1.  Insufficient Connection Resources to Support RCaP at
           Connection Setup

   After the iSCSI layer at the initiator invokes the
   Allocate_Connection_Resources Operational Primitive during the iSCSI
   Login Negotiation Phase, if the iSER layer at the initiator fails to
   allocate the connection resources necessary to support RCaP, it MUST
   return a status of failure to the iSCSI layer at the initiator.  The
   iSCSI layer at the initiator MUST terminate the connection as
   described in Section 5.2.3.1.

   After the iSCSI layer at the target invokes the
   Allocate_Connection_Resources Operational Primitive during the iSCSI
   Login Negotiation Phase, if the iSER layer at the target fails to
   allocate the connection resources necessary to support RCaP, it MUST
   return a status of failure to the iSCSI layer at the target.  The
   iSCSI layer at the target MUST send a Login Response with a status
   class of 3 (Target Error), and a status code of "0302" (Out of
   Resources).  The iSCSI layers at the initiator and the target MUST
   terminate the connection as described in Section 5.2.3.1.

10.1.3.2.  iSER Negotiation Failures

   If the RCaP or iSER related parameters declared by the initiator in
   the iSER Hello Message are unacceptable to the iSER layer at the
   target, the iSER layer at the target MUST set the Reject (REJ) flag,
   as described in Section 9.4, in the iSER HelloReply Message.  The
   following are the cases when the iSER layer MUST set the REJ flag to
   one in the HelloReply Message:

   *  The initiator-declared iSER-IRD value is greater than 0 and the
      target-declared iSER-ORD value is 0.

   *  The initiator-supported and the target-supported iSER protocol
      versions do not overlap.

   After requesting that the RCaP layer send the iSER HelloReply
   Message, the handling of the error situation is the same as that for
   iSER format errors as described in Section 10.1.3.3.
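
   The two listed rejection cases can be expressed as a predicate.  This
   is a sketch under stated assumptions: the function name is
   hypothetical, version support is modeled as an inclusive (min, max)
   pair on each side, and a target may find other declared parameters
   unacceptable for reasons not modeled here.

```python
from typing import Tuple


def must_set_rej(initiator_ird: int, target_ord: int,
                 initiator_versions: Tuple[int, int],
                 target_versions: Tuple[int, int]) -> bool:
    """Return True when the target MUST set REJ in the HelloReply."""
    # Case 1: the initiator declared iSER-IRD > 0 but the target's
    # iSER-ORD is 0.
    if initiator_ird > 0 and target_ord == 0:
        return True
    # Case 2: the supported version ranges do not overlap.
    init_min, init_max = initiator_versions
    tgt_min, tgt_max = target_versions
    return init_max < tgt_min or tgt_max < init_min
```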

10.1.3.3.  iSER Format Errors

   The following types of errors in an iSER header are considered format
   errors:

   *  Illegal contents of any iSER header field

   *  Inconsistent field contents in an iSER header

   *  Length error for an iSER Hello or HelloReply Message (see Sections
      9.3 and 9.4)

   When a format error is detected, the following events MUST occur in
   the specified sequence:

   1.  The iSER layer MUST request that the RCaP layer terminate the
       RCaP Stream.  The RCaP layer MUST terminate the associated
       connection.

   2.  The iSER layer MUST notify the iSCSI layer of the connection
       termination by invoking the Connection_Terminate_Notify
       Operational Primitive.  Prior to invoking the
       Connection_Terminate_Notify Operational Primitive, the iSER layer
       MUST perform the actions described in Section 5.2.3.2.
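
   The required ordering of these events can be sketched with three
   callables standing in for the real layer interactions.  All three
   parameter names are hypothetical; how an implementation actually
   terminates the RCaP Stream or performs the Section 5.2.3.2 actions is
   outside the scope of this example.

```python
from typing import Callable


def handle_iser_format_error(
    terminate_rcap_stream: Callable[[], None],
    run_section_5_2_3_2_actions: Callable[[], None],
    connection_terminate_notify: Callable[[], None],
) -> None:
    """Run the format-error steps in the order the text requires."""
    # Step 1: ask the RCaP layer to terminate the RCaP Stream.
    terminate_rcap_stream()
    # Step 2: perform the Section 5.2.3.2 actions first, then notify
    # the iSCSI layer of the connection termination.
    run_section_5_2_3_2_actions()
    connection_terminate_notify()
```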

10.1.3.4.  iSER Protocol Errors

   The first iSER Message sent by the iSER layer at the initiator after
   transitioning into iSER-assisted mode MUST be the iSER Hello Message
   (see Section 9.3).  Likewise, the first iSER Message sent by the iSER
   layer at the target after transitioning into iSER-assisted mode MUST
   be the iSER HelloReply Message (see Section 9.4).  Failure to send
   the iSER Hello or HelloReply Message, as indicated by the wrong
   Opcode in the iSER header, is a protocol error.  The handling of this
   error situation is the same as that for iSER format errors as
   described in Section 10.1.3.3.

   If the sending side of an iSER-enabled connection acts in a manner
   not permitted by the negotiated or declared login/text operational
   key values as described in Section 6, this is a protocol error, and
   the receiving side MAY handle this the same as for iSER format errors
   as described in Section 10.1.3.3.
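
   A receiver could detect the first kind of protocol error by checking
   the opcode of the first iSER Message it receives.  The sketch below
   is illustrative only; the function name and the receiver_is_target
   parameter are hypothetical.

```python
OPCODE_HELLO = 0b0010
OPCODE_HELLO_REPLY = 0b0011


def first_message_is_valid(payload: bytes, receiver_is_target: bool) -> bool:
    """Check the opcode of the first iSER Message on the RCaP Stream.

    The target expects a Hello from the initiator; the initiator expects
    a HelloReply from the target.
    """
    expected = OPCODE_HELLO if receiver_is_target else OPCODE_HELLO_REPLY
    return len(payload) >= 1 and payload[0] >> 4 == expected
```

   When this check fails, the error is handled in the same way as an
   iSER format error.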

10.1.4.  Errors in the iSCSI Layer

   The error handling due to errors at the iSCSI layer is described in
   the following sections.  For error recovery, see Section 10.2.

10.1.4.1.  iSCSI Format Errors

   When an iSCSI format error is detected, the iSCSI layer MUST request
   that the iSER layer terminate the RCaP Stream by invoking the
   Connection_Terminate Operational Primitive.  For more details on the
   connection termination, see Section 5.2.3.1.

10.1.4.2.  iSCSI Digest Errors

   In the iSER-assisted mode, the iSCSI layer will not see any digest
   error because both the HeaderDigest and the DataDigest keys are
   negotiated to "None".

10.1.4.3.  iSCSI Sequence Errors

   For Traditional iSCSI, sequence errors are caused by dropped PDUs due
   to header or data digest errors.  Since digests are not used in
   iSER-assisted mode and the RCaP layer will deliver all messages in
   the order they were sent, sequence errors will not occur in iSER-
   assisted mode.

10.1.4.4.  iSCSI Protocol Error

   When the iSCSI layer handles certain protocol errors by dropping the
   connection, the error handling is the same as that for iSCSI format
   errors as described in Section 10.1.4.1.

   When the iSCSI layer uses the iSCSI Reject PDU and response codes to
   handle certain other protocol errors, no special handling at the iSER
   layer is required.

10.1.4.5.  SCSI Timeouts and Session Errors

   SCSI Timeouts and Session Errors are handled at the iSCSI layer and
   no special handling at the iSER layer is required.

10.1.4.6.  iSCSI Negotiation Failures

   For negotiation failures that happen during the Login Phase at the
   initiator after the iSCSI layer has invoked the
   Allocate_Connection_Resources Operational Primitive and before the
   Enable_Datamover Operational Primitive has been invoked, the iSCSI
   layer MUST request that the iSER layer deallocate all connection
   resources by invoking the Deallocate_Connection_Resources Operational
   Primitive.  The iSCSI layer at the initiator MUST terminate the
   connection.

   For negotiation failures during the Login Phase at the target, the
   iSCSI layer can use a Login Response with a status class other than 0
   (success) to terminate the Login Phase.  If the iSCSI layer has
   invoked the Allocate_Connection_Resources Operational Primitive
   before the Enable_Datamover Operational Primitive has been invoked,
   the iSCSI layer at the target MUST request that the iSER layer at the
   target deallocate all connection resources by invoking the
   Deallocate_Connection_Resources Operational Primitive.  The iSCSI
   layer at both the initiator and the target MUST terminate the
   connection.

   During the iSCSI Login Phase, if the iSCSI layer at the initiator
   receives a Login Response from the target with a status class other
   than 0 (Success) after the iSCSI layer at the initiator has invoked
   the Allocate_Connection_Resources Operational Primitive, the iSCSI
   layer MUST request the iSER layer to deallocate all connection
   resources by invoking the Deallocate_Connection_Resources Operational
   Primitive.  The iSCSI layer MUST terminate the connection in this
   case.

   For negotiation failures during the Full Feature Phase, the error
   handling is left to the iSCSI layer and no special handling at the
   iSER layer is required.

10.2.  Error Recovery

   Error recovery requirements of iSCSI/iSER are the same as those of
   Traditional iSCSI.  All three ErrorRecoveryLevels as defined in
   [RFC3720] are supported in iSCSI/iSER.

   *  For ErrorRecoveryLevel 0, session recovery is handled by iSCSI and
      no special handling by the iSER layer is required.

   *  For ErrorRecoveryLevel 1, see Section 10.2.1 on PDU Recovery.

   *  For ErrorRecoveryLevel 2, see Section 10.2.2 on Connection
      Recovery.

   The iSCSI layer may invoke the Notice_Key_Values Operational
   Primitive during connection setup to request that the iSER layer take
   note of the value of the operational ErrorRecoveryLevel, as described
   in Sections 5.1.1 and 5.1.2.

10.2.1.  PDU Recovery

   As described in Sections 10.1.4.2 and 10.1.4.3, digest and sequence
   errors will not occur in the iSER-assisted mode.  If the RCaP layer
   detects an error, it will close the iSCSI/iSER connection, as
   described in Section 10.1.2.  Therefore, PDU recovery is not useful
   in the iSER-assisted mode.

   The iSCSI layer at the initiator SHOULD disable iSCSI timeout-driven
   PDU retransmissions.

10.2.2.  Connection Recovery

   The iSCSI layer at the initiator MAY reassign connection allegiance
   for non-immediate commands that are still in progress and are
   associated with the failed connection by using a Task Management
   Function Request with the TASK REASSIGN function.  See Section 7.3.3
   for more details.

   When the iSCSI layer at the initiator does a task reassignment for a
   SCSI write command, it MUST qualify the Send_Control Operational
   Primitive invocation with DataDescriptorOut, which defines the I/O
   Buffer for both the non-immediate unsolicited data and the solicited
   data.  This allows the iSCSI layer at the target to use recovery R2Ts
   to request data originally sent as unsolicited and solicited from the
   initiator.

   When the iSCSI layer at the target accepts a reassignment request for
   a SCSI read command, it MUST request that the iSER layer process SCSI
   Data-in for all unacknowledged data by invoking the Put_Data
   Operational Primitive.  See Section 7.3.5 on the handling of SCSI
   Data-in.

   When the iSCSI layer at the target accepts a reassignment request for
   a SCSI write command, it MUST request that the iSER layer process a
   recovery R2T for any non-immediate unsolicited data and any solicited
   data sequences that have not been received by invoking the Get_Data
   Operational Primitive.  See Section 7.3.6 on the handling of Ready To
   Transfer (R2T).

   The iSCSI layer at the target MUST NOT issue recovery R2Ts on an
   iSCSI/iSER connection for a task for which the connection allegiance
   was never reassigned.  The iSER layer at the target MAY reject such a
   recovery R2T received via the Get_Data Operational Primitive
   invocation from the iSCSI layer at the target, with an appropriate
   error code.

   The iSER layer at the target will process the requests invoked by the
   Put_Data and Get_Data Operational Primitives for a reassigned task in
   the same way as for the original commands.

11.  Security Considerations

   When iSER is layered on top of an RCaP layer and provides the RDMA
   extensions to the iSCSI protocol, the security considerations of iSER
   are the same as those of the underlying RCaP layer.  For iWARP, this
   is described in [RDMAP] and [RDDPSEC].

   Since the iSER-assisted iSCSI protocol is still functionally iSCSI
   from a security considerations perspective, all of the iSCSI security
   requirements as described in [RFC3720] and [RFC3723] apply.  If the
   IPsec [IPSEC] mechanism is used, then it MUST be established before
   the connection transitions to the iSER-assisted mode.  If iSER is
   layered on top of a non-IP based RCaP layer, all the security
   protocol mechanisms applicable to that RCaP layer are also applicable
   to an iSCSI/iSER connection.  If iSER is layered on top of a non-IP
   protocol, the IPsec mechanism as specified in [RFC3720] MUST be
   implemented at any point where the iSER protocol enters the IP
   network (e.g., via gateways), and the non-IP protocol SHOULD
   implement (optional to use) a packet-by-packet security protocol
   equal in strength to the IPsec mechanism specified by [RFC3720].

   To minimize the potential for a denial-of-service attack, the iSCSI
   layer MUST NOT request that the iSER layer allocate the connection
   resources necessary to support RCaP until the iSCSI layer is
   sufficiently far along in the iSCSI Login Phase that it is reasonably
   certain that the peer side is not an attacker, as described in
   Sections 5.1.1 and 5.1.2.

   Note that the IPsec requirements for this document are based on the
   version of IPsec specified in RFC 2401 [IPSEC] and related RFCs, as
   profiled by RFC 3723 [RFC3723], despite the existence of a newer
   version of IPsec specified in RFC 4301 [RFC4301] and related RFCs.

12.  References

12.1.  Normative References

   [RFC3720] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and
             E. Zeidner, "Internet Small Computer Systems Interface
             (iSCSI)", RFC 3720, April 2004.

   [RFC3723] Aboba, B., Tseng, J., Walker, J., Rangan, V., and F.
             Travostino, "Securing Block Storage Protocols over IP", RFC
             3723, April 2004.

   [RDMAP]   Recio, R., Culley, P., Garcia, D., Hilland, J., and B.
             Metzler, "A Remote Direct Memory Access Protocol
             Specification", RFC 5040, October 2007.

   [DDP]     Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct
             Data Placement over Reliable Transports", RFC 5041, October
             2007.

   [IPSEC]   Kent, S. and R. Atkinson, "Security Architecture for the
             Internet Protocol", RFC 2401, November 1998.

   [MPA]     Culley, P., Elzur, U., Recio, R., Bailey, S., and J.
             Carrier, "Marker PDU Aligned Framing for TCP
             Specification", RFC 5044, October 2007.

   [RDDPSEC] Pinkerton, J. and E. Deleganes, "Direct Data Placement
             Protocol (DDP) / Remote Direct Memory Access Protocol
             (RDMAP) Security", RFC 5042, October 2007.

   [TCP]     Postel, J., "Transmission Control Protocol", STD 7, RFC
             793, September 1981.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2.  Informative References

   [SAM2]    T10/1157D, "SCSI Architecture Model - 2 (SAM-2)".

   [DA]      Chadalapaka, M., Hufferd, J., Satran, J., and H. Shah, "DA:
             Datamover Architecture for the Internet Small Computer
             System Interface (iSCSI)", RFC 5047, October 2007.

   [IB]      InfiniBand Architecture Specification Volume 1 Release 1.2,
             October 2004.

   [IPoIB]   Chu, J. and V. Kashyap, "Transmission of IP over InfiniBand
             (IPoIB)", RFC 4391, April 2006.

   [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
             Internet Protocol", RFC 4301, December 2005.

Appendix A.  iWARP Message Format for iSER

   This section is for information only and is NOT part of the standard.
   It simply depicts the iWARP Message format for the various iSER
   Messages when the transport layer is TCP.

A.1.  iWARP Message Format for iSER Hello Message

   The following figure depicts an iSER Hello Message encapsulated in an
   iWARP SendSE Message.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Reserved                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       (Send) Queue Number                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 (Send) Message Sequence Number                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      (Send) Message Offset                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | 0010b | Zeros | 0001b | 0001b |           iSER-IRD            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           All Zeros                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           All Zeros                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 6.  SendSE Message Containing an iSER Hello Message
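
   As an informal illustration of the iSER header in Figure 6, the
   following Python sketch packs the 12-byte header of a Hello Message.
   The function and parameter names are assumptions of this sketch.  The
   two 0001b nibbles in the figure are treated as the maximum and
   minimum supported iSER versions.  The MPA, DDP, and RDMAP fields
   around the header are produced by the lower layers and are not
   modeled here.

```python
import struct

ISER_OPCODE_HELLO = 0b0010  # opcode value listed in Section 9.1


def pack_iser_hello(iser_ird, max_ver=1, min_ver=1):
    """Return the 12-byte iSER header of a Hello Message (hypothetical helper)."""
    if not 0 <= iser_ird <= 0xFFFF:
        raise ValueError("iSER-IRD must fit in 16 bits")
    # Byte 0: opcode in the high nibble, low nibble all zeros.
    byte0 = ISER_OPCODE_HELLO << 4
    # Byte 1: the two 4-bit version fields.
    byte1 = ((max_ver & 0xF) << 4) | (min_ver & 0xF)
    # Network byte order: 2 bytes, 16-bit iSER-IRD, then 8 zero bytes.
    return struct.pack(">BBH8x", byte0, byte1, iser_ird)


header = pack_iser_hello(iser_ird=16)
print(header.hex())  # 201100100000000000000000
```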

A.2.  iWARP Message Format for iSER HelloReply Message

   The following figure depicts an iSER HelloReply Message encapsulated
   in an iWARP SendSE Message.  The Reject (REJ) flag is set to 0.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Reserved                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       (Send) Queue Number                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 (Send) Message Sequence Number                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      (Send) Message Offset                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | 0011b |Zeros|0| 0001b | 0001b |           iSER-ORD            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           All Zeros                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           All Zeros                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 7.  SendSE Message Containing an iSER HelloReply Message
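
   The receiving side of Figure 7 can be sketched in the same informal
   way.  The following Python fragment decodes the first four bytes of
   a HelloReply iSER header.  The class name, the function name, and the
   labels max_ver and cur_ver for the two version nibbles are
   assumptions of this sketch, since the figure shows only the field
   values.

```python
import struct
from typing import NamedTuple


class HelloReply(NamedTuple):
    reject: bool   # the REJ flag (bit 7 of the first word in Figure 7)
    max_ver: int   # first version nibble (label assumed)
    cur_ver: int   # second version nibble (label assumed)
    iser_ord: int  # 16-bit iSER-ORD


def parse_iser_hello_reply(header: bytes) -> HelloReply:
    """Decode a 12-byte HelloReply iSER header (hypothetical helper)."""
    if len(header) < 12:
        raise ValueError("iSER header is 12 bytes long")
    byte0, byte1, iser_ord = struct.unpack_from(">BBH", header)
    if byte0 >> 4 != 0b0011:
        raise ValueError("not an iSER HelloReply Message")
    # REJ is the least significant bit of the first byte.
    return HelloReply(bool(byte0 & 0x01), byte1 >> 4, byte1 & 0xF, iser_ord)
```

   A caller would typically check the reject field before using the
   negotiated iSER-ORD value.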

A.3.  iWARP Message Format for SCSI Read Command PDU

   The following figure depicts a SCSI Read Command PDU embedded in an
   iSER Message encapsulated in an iWARP SendSE Message.  For this
   particular example, in the iSER header, the Write STag Valid flag is
   set to zero, the Read STag Valid flag is set to one, the Write STag
   field is set to all zeros, and the Read STag field contains a valid
   Read STag.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Reserved                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       (Send) Queue Number                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 (Send) Message Sequence Number                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      (Send) Message Offset                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | 0001b |0|1|                  All zeros                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         All Zeros                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         Read STag                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       SCSI Read Command PDU                   |
      //                                                             //
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 8.  SendSE Message Containing a SCSI Read Command PDU
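
   As an informal illustration of the iSER header in Figure 8, the
   following Python sketch builds the 12-byte header that precedes a
   control-type PDU.  The function name and the convention of passing
   None for an unused STag are assumptions of this sketch, not part of
   the specification.

```python
import struct

ISER_OPCODE_CONTROL = 0b0001  # iSCSI control-type PDU (Section 9.1)


def pack_iser_control_header(write_stag=None, read_stag=None):
    """Return the 12-byte iSER header for a control-type PDU."""
    w_valid = write_stag is not None
    r_valid = read_stag is not None
    # Byte 0: opcode nibble, then the Write STag Valid and
    # Read STag Valid flags, then two reserved zero bits.
    byte0 = (ISER_OPCODE_CONTROL << 4) | (w_valid << 3) | (r_valid << 2)
    # Three reserved bytes follow, then the two 32-bit STag fields.
    # An unused STag field is sent as all zeros.
    return struct.pack(">B3xII", byte0, write_stag or 0, read_stag or 0)


# SCSI Read Command as in Figure 8: only the Read STag is valid.
print(pack_iser_control_header(read_stag=0x12345678).hex())
# 140000000000000012345678
```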

A.4.  iWARP Message Format for SCSI Read Data

   The following figure depicts an iWARP RDMA Write Message carrying
   SCSI Read data in the payload:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |   DDP Control | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Data Sink STag                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                   Data Sink Tagged Offset                     |
      +                                                               +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      SCSI Read data                           |
      //                                                             //
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Figure 9.  RDMA Write Message Containing SCSI Read Data
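
   The tagged fields of Figure 9 (a 32-bit Data Sink STag followed by a
   64-bit Data Sink Tagged Offset) can be sketched informally as shown
   below.  The sketch assumes that a large transfer is carried in
   several segments, each repeating the STag and advancing the Tagged
   Offset by the number of bytes already sent.  The function name and
   the max_payload parameter are hypothetical, and the MPA header,
   control bytes, and CRC are left to the lower layers.

```python
import struct


def tagged_segments(sink_stag, sink_to, data, max_payload):
    """Yield (tagged_header, chunk) pairs for one tagged transfer."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    for off in range(0, len(data), max_payload):
        chunk = data[off:off + max_payload]
        # 32-bit STag and 64-bit Tagged Offset in network byte order.
        yield struct.pack(">IQ", sink_stag, sink_to + off), chunk
```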

A.5.  iWARP Message Format for SCSI Write Command PDU

   The following figure depicts a SCSI Write Command PDU embedded in an
   iSER Message encapsulated in an iWARP SendSE Message.  For this
   particular example, in the iSER header, the Write STag Valid flag is
   set to one, the Read STag Valid flag is set to zero, the Write STag
   field contains a valid Write STag, and the Read STag field is set to
   all zeros since it is not used.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Reserved                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       (Send) Queue Number                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 (Send) Message Sequence Number                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      (Send) Message Offset                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | 0001b |1|0|                  All zeros                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                        Write STag                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         All Zeros                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       SCSI Write Command PDU                  |
      //                                                             //
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 10.  SendSE Message Containing a SCSI Write Command PDU
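
   On the receiving side, the payload of a Send Message such as the one
   in Figure 10 can be split into the iSER header and the iSCSI PDU
   that follows it.  The following Python fragment is an informal
   sketch of that step.  The function name and the use of None for an
   STag whose Valid flag is clear are assumptions of this sketch.

```python
import struct


def split_iser_control_message(payload: bytes):
    """Return (write_stag, read_stag, iscsi_pdu) from a Send payload."""
    if len(payload) < 12:
        raise ValueError("payload shorter than the 12-byte iSER header")
    byte0, write_stag, read_stag = struct.unpack_from(">B3xII", payload)
    if byte0 >> 4 != 0b0001:
        raise ValueError("not an iSCSI control-type PDU")
    w_valid = bool(byte0 & 0x08)  # Write STag Valid flag
    r_valid = bool(byte0 & 0x04)  # Read STag Valid flag
    # Report an STag only when its Valid flag is set.
    return (write_stag if w_valid else None,
            read_stag if r_valid else None,
            payload[12:])
```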

A.6.  iWARP Message Format for RDMA Read Request

   An iSCSI R2T is transformed into an iWARP RDMA Read Request Message.
   The following figure depicts an iWARP RDMA Read Request Message:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Reserved (Not Used)                      |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              DDP (RDMA Read Request) Queue Number             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |        DDP (RDMA Read Request) Message Sequence Number        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |             DDP (RDMA Read Request) Message Offset            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Data Sink STag (SinkSTag)                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                  Data Sink Tagged Offset (SinkTO)             +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  RDMA Read Message Size (RDMARDSZ)            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Data Source STag (SrcSTag)                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +                 Data Source Tagged Offset (SrcTO)             +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 11.  RDMA Read Request Message
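
   The five fields that follow the DDP header in Figure 11 occupy 28
   bytes.  The following Python sketch packs them in network byte
   order.  The function and parameter names are assumptions of this
   sketch.  The MPA header, the control bytes, the DDP Queue Number,
   Message Sequence Number, and Message Offset words, and the CRC are
   not modeled.

```python
import struct

# SinkSTag (32), SinkTO (64), RDMARDSZ (32), SrcSTag (32), SrcTO (64)
RDMA_READ_REQUEST_FIELDS = ">IQIIQ"


def pack_rdma_read_request(sink_stag, sink_to, size, src_stag, src_to):
    """Return the 28-byte body of an RDMA Read Request Message."""
    return struct.pack(RDMA_READ_REQUEST_FIELDS,
                       sink_stag, sink_to, size, src_stag, src_to)
```

   For solicited SCSI Write data, the Data Source fields would describe
   the buffer that the initiator advertised, and the Data Sink fields
   would describe a local buffer at the target.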

A.7.  iWARP Message Format for Solicited SCSI Write Data

   The following figure depicts an iWARP RDMA Read Response Message
   carrying the solicited SCSI Write data in the payload:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       Data Sink STag                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                   Data Sink Tagged Offset                     |
      +                                                               +
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       SCSI Write Data                         |
      //                                                             //
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure 12.  RDMA Read Response Message Containing SCSI Write Data

A.8.  iWARP Message Format for SCSI Response PDU

   The following figure depicts a SCSI Response PDU embedded in an iSER
   Message encapsulated in an iWARP SendInvSE Message:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         MPA Header            |  DDP Control  | RDMA Control  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      Invalidate STag                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       (Send) Queue Number                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 (Send) Message Sequence Number                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                      (Send) Message Offset                    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | 0001b |0|0|                  All Zeros                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           All Zeros                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           All Zeros                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       SCSI Response PDU                       |
      //                                                             //
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           MPA CRC                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 13.  SendInvSE Message Containing SCSI Response PDU

Appendix B.  Architectural Discussion of iSER over InfiniBand

   This section explains how an InfiniBand network (with Gateways) would
   be structured.  It is informational only and is intended to provide
   insight on how iSER is used in an InfiniBand environment.

B.1.  The Host Side of the iSCSI and iSER Connections in InfiniBand

   Figure 14 shows the topologies in which iSCSI and iSER can operate
   on an InfiniBand network.

   +---------+ +---------+ +---------+ +---------+ +---------+
   |  Host   | |  Host   | |   Host  | |   Host  | |   Host  |
   |         | |         | |         | |         | |         |
   +---+-+---+ +---+-+---+ +---+-+---+ +---+-+---+ +---+-+---+
   |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA|
   +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+
     |----+------|-----+-----|-----+-----|-----+-----|-----+---> To IB
   IB|        IB |        IB |        IB |        IB |    SubNet2 SWTCH
   +-v-----------v-----------v-----------v-----------v---------+
   |                  InfiniBand Switch for Subnet1            |
   +---+-----+--------+-----+--------+-----+------------v------+
       | TCA |        | TCA |        | TCA |            |
       +-----+        +-----+        +-----+            | IB
      /  IB   \      /  IB   \      /       \     +--+--v--+--+
     |  iSER   |    |  iSER   |    |  IPoIB  |    |  | TCA |  |
     | Gateway |    | Gateway |    | Gateway |    |  +-----+  |
     |   to    |    |   to    |    |   to    |    | Storage   |
     |  iSCSI  |    |  iSER   |    |   IP    |    | Controller|
     |   TCP   |    |  iWARP  |    |Ethernet |    +-----+-----+
     +---v-----+    +---v-----+    +----v----+
         | EN           | EN            | EN
         +--------------+---------------+----> to IP based storage
           Ethernet links that carry iSCSI or iWARP

                   Figure 14.  iSCSI and iSER on IB

   In Figure 14, the Host systems are connected via the InfiniBand Host
   Channel Adapters (HCAs) to the InfiniBand links.  With the use of IB
   switch(es), the InfiniBand links connect the HCA to InfiniBand Target
   Channel Adapters (TCAs) located in gateways or Storage Controllers.
   An iSER-capable IB-IP Gateway converts the iSER Messages encapsulated
   in IB protocols to either standard iSCSI, or iSER Messages for iWARP.
   An [IPoIB] Gateway converts the InfiniBand [IPoIB] protocol to IP
   protocol, and in the iSCSI case, permits iSCSI to be operated on an
   IB Network between the Hosts and the [IPoIB] Gateway.

B.2.  The Storage Side of the iSCSI and iSER Mixed Network Environment

   Figure 15 shows a storage controller that has three different portal
   groups: one supporting only iSCSI (TPG-4), one supporting iSER/iWARP
   or iSCSI (TPG-2), and one supporting iSER/IB (TPG-1).

                  |                |                |
                  |                |                |
            +--+--v--+----------+--v--+----------+--v--+--+
            |  | IB  |          |iWARP|          | EN  |  |
            |  |     |          | TCP |          | NIC |  |
            |  |(TCA)|          | RNIC|          |     |  |
            |  +-----+          +-----+          +-----+  |
            |   TPG-1            TPG-2            TPG-4   |
            |  9.1.3.3          9.1.2.4          9.1.2.6  |
            |                                             |
            |                  Storage Controller         |
            |                                             |
            +---------------------------------------------+

   Figure 15.  Storage Controller with TCP, iWARP, and IB Connections

   The normal iSCSI portal group advertising processes (via the Service
   Location Protocol (SLP), the Internet Storage Name Service (iSNS), or
   SendTargets) are available to a Storage Controller.

B.3.  Discovery Processes for an InfiniBand Host

   An InfiniBand Host system can gather portal group IP addresses from
   SLP, iSNS, or the SendTargets discovery processes by using TCP/IP via
   [IPoIB].  After obtaining one or more remote portal IP addresses, the
   Initiator uses the standard IP mechanisms to resolve the IP address
   to a local outgoing interface and the destination hardware address
   (Ethernet MAC or IB GID of the target or a gateway leading to the
   target).  If the resolved interface is an [IPoIB] network interface,
   then the target portal can be reached through an InfiniBand fabric.
   In this case, the Initiator can establish an iSCSI/TCP or iSCSI/iSER
   session with the Target over that InfiniBand interface, using the
   Hardware Address (InfiniBand GID) obtained through the standard
   Address Resolution (ARP) processes.

   If more than one IP address is obtained through the discovery
   process, the Initiator should select a Target IP address that is on
   the same IP subnet as the Initiator, if one exists.  This will avoid
   a potential overhead of going through a gateway when a direct path
   exists.

   In addition, a user can configure manual static IP route entries if a
   particular path to the target is preferred.
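
   The subnet preference described above can be sketched informally in
   Python.  The function name, the argument format, and the fallback to
   the first discovered address are assumptions of this sketch.  The
   example addresses are taken from Figure 15, and the /24 prefix
   length is assumed for illustration only.

```python
import ipaddress


def choose_target_address(local_interface, candidates):
    """Prefer a discovered address on the Initiator's own IP subnet."""
    iface = ipaddress.ip_interface(local_interface)
    for addr in candidates:
        # Same subnet: no gateway hop is needed to reach the target.
        if ipaddress.ip_address(addr) in iface.network:
            return addr
    # No direct path found; fall back to the first discovered address.
    return candidates[0] if candidates else None


print(choose_target_address("9.1.3.10/24", ["9.1.2.4", "9.1.3.3"]))
# 9.1.3.3
```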

B.4.  IBTA Connection Specifications

   The InfiniBand Trade Association (IBTA) connection specifications are
   outside the scope of this document, but it is expected that the IBTA
   has defined or will define:

   *  The iSER ServiceID.

   *  A means for permitting a Host to establish a connection with a
      peer InfiniBand end-node, and to fall back to iSCSI/TCP over
      [IPoIB] if that peer indicates iSER is not supported.

   *  A means for permitting the Host to establish connections with IB
      iSER connections on storage controllers or IB iSER connected
      Gateways in preference to [IPoIB] connected Gateways/Bridges or
      connections to Target Storage Controllers that also accept iSCSI
      via [IPoIB].

   *  A means for combining the IB ServiceID for iSER and the IP port
      number such that the IB Host can use normal IB connection
      processes, yet ensure that the iSER target peer can actually
      connect to the required IP port number.

Acknowledgments

   This protocol was developed by a design team that, in addition to the
   authors, included Dwight Barron (HP), John Carrier (formerly from
   Adaptec), Ted Compton (EMC), Paul R. Culley (HP), Yaron Haviv
   (Voltaire), Jeff Hilland (HP), Mike Krause (HP), Alex Nezhinsky
   (Voltaire), Jim Pinkerton (Microsoft), Renato J. Recio (IBM), Julian
   Satran (IBM), Tom Talpey (Network Appliance), and Jim Wendt (HP).
   Special thanks to David Black (EMC) for his extensive review
   comments.

Authors' Addresses

   Mallikarjun Chadalapaka
   Hewlett-Packard Company
   8000 Foothills Blvd.
   Roseville, CA 95747-5668, USA
   Phone: +1-916-785-5621
   EMail: cbm@rose.hp.com

   Uri Elzur
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617, USA
   Phone: +1-949-926-6432
   EMail: Uri@Broadcom.com

   John Hufferd
   Brocade Communications Systems, Inc.
   1745 Technology Drive
   San Jose, CA 95110, USA
   Phone: +1-408-333-5244
   EMail: jhufferd@brocade.com

   Mike Ko
   IBM Corp.
   650 Harry Rd.
   San Jose, CA 95120, USA
   Phone: +1-408-927-2085
   EMail: mako@us.ibm.com

   Hemal Shah
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617, USA
   Phone: +1-949-926-6941
   EMail: hemal@broadcom.com

   Patricia Thaler
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617, USA
   Phone: +1-916-570-2707
   EMail: pthaler@broadcom.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.