
RFC 5046

Internet Small Computer System Interface (iSCSI) Extensions for Remote Direct Memory Access (RDMA)

Pages: 85
Obsoleted by:  7145
Updated by:  7146
Part 4 of 4 – Pages 58 to 85

9. iSER Control and Data Transfer

For iSCSI data-type PDUs (see Section 7.1), the iSER layer uses RDMA Read and RDMA Write operations to transfer the solicited data. For iSCSI control-type PDUs (see Section 7.2), the iSER layer uses Send Message Types of RCaP.

9.1. iSER Header Format

An iSER header MUST be present in every Send Message Type of RCaP. The iSER header is located in the first 12 bytes of the message payload of the Send Message Type of RCaP, as shown in Figure 2.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Opcode|                Opcode Specific Fields                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Opcode Specific Fields                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Opcode Specific Fields                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2.  iSER Header Format

   Opcode - Operation Code: 4 bits

       The Opcode field identifies the type of iSER Messages:

           0001b = iSCSI control-type PDU
           0010b = iSER Hello Message
           0011b = iSER HelloReply Message

       All other opcodes are reserved.
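
As a non-normative illustration, the Opcode can be read from the high-order four bits of the first payload byte, since bit 0 in Figure 2 is the most significant bit. The Python sketch below assumes the payload is available as a byte string; the names `read_opcode`, `describe_opcode`, and `OPCODES` are hypothetical and are not defined by this specification.

```python
ISER_HEADER_LEN = 12  # the iSER header occupies the first 12 bytes of the payload

# Opcode values listed in Section 9.1; every other value is reserved.
OPCODES = {
    0b0001: "iSCSI control-type PDU",
    0b0010: "iSER Hello Message",
    0b0011: "iSER HelloReply Message",
}


def read_opcode(payload: bytes) -> int:
    """Return the 4-bit Opcode from a Send Message payload."""
    if len(payload) < ISER_HEADER_LEN:
        raise ValueError("payload is shorter than the 12-byte iSER header")
    # Bit 0 of the figure is the most significant bit of the first byte,
    # so the Opcode is the high nibble.
    return payload[0] >> 4


def describe_opcode(opcode: int) -> str:
    """Map an Opcode to its message type, or 'reserved' if unassigned."""
    return OPCODES.get(opcode, "reserved")
```

For example, a payload whose first byte is 0x20 carries Opcode 0010b, an iSER Hello Message.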

9.2. iSER Header Format for the iSCSI Control-Type PDU

The iSER layer uses Send Message Types of RCaP to transfer iSCSI control-type PDUs (see Section 7.2). The message payload of each of the Send Message Types of RCaP used for transferring an iSER Message contains an iSER Header followed by an iSCSI control-type PDU.

The iSER header in a Send Message Type of RCaP carrying an iSCSI control-type PDU MUST have the format as described in Figure 3.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       |W|R|                                                   |
   | 0001b |S|S|                  Reserved                         |
   |       |V|V|                                                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Write STag (or N/A)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Read STag (or N/A)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 3.  iSER Header Format for iSCSI Control-Type PDU

   WSV - Write STag Valid flag: 1 bit

       This flag indicates the validity of the Write STag field of the
       iSER Header.  If set to one, the Write STag field in this iSER
       Header is valid.  If set to zero, the Write STag field in this
       iSER Header MUST be ignored at the receiver.  The Write STag
       Valid flag is set to one when there is solicited data to be
       transferred for a SCSI write or bidirectional command, or when
       there are non-immediate unsolicited and solicited data to be
       transferred for the referenced task specified in a Task
       Management Function Request with the TASK REASSIGN function.

   RSV - Read STag Valid flag: 1 bit

       This flag indicates the validity of the Read STag field of the
       iSER Header.  If set to one, the Read STag field in this iSER
       Header is valid.  If set to zero, the Read STag field in this
       iSER Header MUST be ignored at the receiver.  The Read STag
       Valid flag is set to one for a SCSI read or bidirectional
       command, or for a Task Management Function Request with the TASK
       REASSIGN function.

   Write STag - Write Steering Tag: 32 bits

       This field contains the Write STag when the Write STag Valid flag
       is set to one.  For a SCSI write or bidirectional command, the
       Write STag is used to Advertise the initiator's I/O Buffer
       containing the solicited data.  For a Task Management Function
       Request with the TASK REASSIGN function, the Write STag is used
       to Advertise the initiator's I/O Buffer containing the non-
       immediate unsolicited data and solicited data.  This Write STag
       is used as the Data Source STag in the resultant RDMA Read
       operation(s).  When the Write STag Valid flag is set to zero,
       this field MUST be set to zero.

   Read STag - Read Steering Tag: 32 bits

       This field contains the Read STag when the Read STag Valid flag
       is set to one.  The Read STag is used to Advertise the
       initiator's Read I/O Buffer of a SCSI read or bidirectional
       command, or of a Task Management Function Request with the TASK
       REASSIGN function.  This Read STag is used as the Data Sink STag
       in the resultant RDMA Write operation(s).  When the Read STag
       Valid flag is zero, this field MUST be set to zero.

   Reserved:

       Reserved fields MUST be set to zero on transmit and MUST be
       ignored on reception.
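
The field layout above can be sketched, non-normatively, as a pair of pack and unpack helpers. This Python sketch assumes the multi-byte fields are carried in network byte order (big-endian), which this section does not state explicitly; the function names are hypothetical, and an absent STag is modeled as `None`.

```python
import struct

OPCODE_CONTROL = 0b0001
# One flag byte, three reserved bytes, then two 32-bit STags: 12 bytes total.
CONTROL_HEADER = struct.Struct("!B3xII")


def pack_control_header(write_stag=None, read_stag=None) -> bytes:
    """Build the iSER header for an iSCSI control-type PDU."""
    wsv = write_stag is not None
    rsv = read_stag is not None
    # Opcode in bits 0-3, WSV in bit 4, RSV in bit 5, remaining bits reserved.
    first = (OPCODE_CONTROL << 4) | (wsv << 3) | (rsv << 2)
    # An STag whose valid flag is zero MUST be sent as zero.
    return CONTROL_HEADER.pack(first, write_stag or 0, read_stag or 0)


def unpack_control_header(header: bytes):
    """Return (write_stag, read_stag); an invalid STag is reported as None."""
    first, write_stag, read_stag = CONTROL_HEADER.unpack(header[:12])
    if first >> 4 != OPCODE_CONTROL:
        raise ValueError("not an iSCSI control-type iSER header")
    # A receiver ignores an STag field whose valid flag is zero.
    return (write_stag if first & 0x08 else None,
            read_stag if first & 0x04 else None)
```

A SCSI read command would typically call `pack_control_header(read_stag=...)`, which sets RSV to one and leaves WSV and the Write STag field at zero.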

9.3. iSER Header Format for the iSER Hello Message

An iSER Hello Message MUST only contain the iSER header, which MUST have the format as described in Figure 4. The iSER Hello Message is the first iSER Message sent on the RCaP Stream from the iSER layer at the initiator to the iSER layer at the target.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       |       |       |       |                               |
   | 0010b | Rsvd  | MaxVer| MinVer|           iSER-IRD            |
   |       |       |       |       |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 4.  iSER Header Format for iSER Hello Message

   MaxVer - Maximum Version: 4 bits

       This field specifies the maximum version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   MinVer - Minimum Version: 4 bits

       This field specifies the minimum version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   iSER-IRD: 16 bits

       This field contains the value of the iSER-IRD at the initiator.

   Reserved (Rsvd):

       Reserved fields MUST be set to zero on transmit, and MUST be
       ignored on reception.
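
A non-normative sketch of building the 12-byte iSER Hello Message follows. It assumes network byte order for the 16-bit iSER-IRD field, which this section does not state explicitly; `pack_hello` is a hypothetical helper name.

```python
import struct

OPCODE_HELLO = 0b0010
ISER_VERSION = 1  # MaxVer and MinVer MUST both be one for this specification


def pack_hello(iser_ird: int) -> bytes:
    """Build an iSER Hello Message carrying the initiator's iSER-IRD."""
    if not 0 <= iser_ird <= 0xFFFF:
        raise ValueError("iSER-IRD must fit in 16 bits")
    byte0 = OPCODE_HELLO << 4                    # low nibble is reserved (zero)
    byte1 = (ISER_VERSION << 4) | ISER_VERSION   # MaxVer, then MinVer
    # The trailing 8 pad bytes are the two reserved 32-bit words.
    return struct.pack("!BBH8x", byte0, byte1, iser_ird)
```

With an iSER-IRD of 8, the first four bytes of the message are 0x20 0x11 0x00 0x08, and the remaining eight bytes are zero.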

9.4. iSER Header Format for the iSER HelloReply Message

An iSER HelloReply Message MUST only contain the iSER header, which MUST have the format as described in Figure 5. The iSER HelloReply Message is the first iSER Message sent on the RCaP Stream from the iSER layer at the target to the iSER layer at the initiator.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       |     |R|       |       |                               |
   | 0011b |Rsvd |E| MaxVer| CurVer|           iSER-ORD            |
   |       |     |J|       |       |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 5.  iSER Header Format for iSER HelloReply Message

   REJ - Reject flag: 1 bit

       This flag indicates whether the target is rejecting this
       connection.  If set to one, the target is rejecting the
       connection.

   MaxVer - Maximum Version: 4 bits

       This field specifies the maximum version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   CurVer - Current Version: 4 bits

       This field specifies the current version of the iSER protocol
       supported.  It MUST be set to one to indicate the version of the
       specification described in this document.

   iSER-ORD: 16 bits

       This field contains the value of the iSER-ORD at the target.

   Reserved (Rsvd):

       Reserved fields MUST be set to zero on transmit, and MUST be
       ignored on reception.
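
The HelloReply layout can be decoded as in the following non-normative sketch, which an initiator might use to learn whether the target rejected the connection. It assumes network byte order for iSER-ORD; the `HelloReply` tuple and `parse_hello_reply` are hypothetical names.

```python
import struct
from typing import NamedTuple

OPCODE_HELLO_REPLY = 0b0011


class HelloReply(NamedTuple):
    rejected: bool   # REJ flag
    max_ver: int
    cur_ver: int
    iser_ord: int


def parse_hello_reply(message: bytes) -> HelloReply:
    """Decode an iSER HelloReply Message; reserved fields are ignored."""
    if len(message) != 12:
        raise ValueError("a HelloReply Message contains only the 12-byte header")
    byte0, byte1, iser_ord = struct.unpack("!BBH8x", message)
    if byte0 >> 4 != OPCODE_HELLO_REPLY:
        raise ValueError("Opcode is not 0011b")
    # REJ is bit 7 of the figure, i.e. the least significant bit of byte 0.
    return HelloReply(bool(byte0 & 0x01), byte1 >> 4, byte1 & 0x0F, iser_ord)
```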

9.5. SCSI Data Transfer Operations

The iSER layer at the initiator and the iSER layer at the target handle each SCSI Write, SCSI Read, and bidirectional operation as described below.

9.5.1. SCSI Write Operation

The iSCSI layer at the initiator MUST invoke the Send_Control Operational Primitive to request that the iSER layer at the initiator send the SCSI write command. The iSER layer at the initiator MUST request that the RCaP layer transmit a SendSE Message with the message payload consisting of the iSER header followed by the SCSI Command PDU and immediate data (if any). If there is solicited data, the iSER layer MUST Advertise the Write STag in the iSER header of the SendSE Message, as described in Section 9.2. Upon receiving the SendSE Message, the iSER layer at the target MUST notify the iSCSI layer at the target by invoking the Control_Notify Operational Primitive qualified with the SCSI Command PDU. See Section 7.3.1 for details on the handling of the SCSI write command.

For the non-immediate unsolicited data, the iSCSI layer at the initiator MUST invoke a Send_Control Operational Primitive qualified with the SCSI Data-out PDU. Upon receiving each Send or SendSE Message containing the non-immediate unsolicited data, the iSER layer at the target MUST notify the iSCSI layer at the target by invoking the Control_Notify Operational Primitive qualified with the SCSI Data-out PDU. See Section 7.3.4 for details on the handling of the SCSI Data-out PDU.

   For the solicited data, when the iSCSI layer at the target has an I/O
   Buffer available, it MUST invoke the Get_Data Operational Primitive
   qualified with the R2T PDU.  See Section 7.3.6 for details on the
   handling of the R2T PDU.

   When the data transfer associated with this SCSI Write operation is
   complete, the iSCSI layer at the target MUST invoke the Send_Control
   Operational Primitive when it is ready to send the SCSI Response PDU.
   Upon receiving a SendSE or SendInvSE Message containing the SCSI
   Response PDU, the iSER layer at the initiator MUST notify the iSCSI
   layer at the initiator by invoking the Control_Notify Operational
   Primitive qualified with the SCSI Response PDU.  See Section 7.3.2
   for details on the handling of the SCSI Response PDU.

9.5.2. SCSI Read Operation

The iSCSI layer at the initiator MUST invoke the Send_Control Operational Primitive to request that the iSER layer at the initiator send the SCSI read command. The iSER layer at the initiator MUST request that the RCaP layer transmit a SendSE Message with the message payload consisting of the iSER header followed by the SCSI Command PDU. The iSER layer at the initiator MUST Advertise the Read STag in the iSER header of the SendSE Message, as described in Section 9.2. Upon receiving the SendSE Message, the iSER layer at the target MUST notify the iSCSI layer at the target by invoking the Control_Notify Operational Primitive qualified with the SCSI Command PDU. See Section 7.3.1 for details on the handling of the SCSI read command.

When the requested SCSI data is available in the I/O Buffer, the iSCSI layer at the target MUST invoke the Put_Data Operational Primitive qualified with the SCSI Data-in PDU. See Section 7.3.5 for details on the handling of the SCSI Data-in PDU.

When the data transfer associated with this SCSI Read operation is complete, the iSCSI layer at the target MUST invoke the Send_Control Operational Primitive when it is ready to send the SCSI Response PDU. Upon receiving the SendInvSE Message containing the SCSI Response PDU, the iSER layer at the initiator MUST notify the iSCSI layer at the initiator by invoking the Control_Notify Operational Primitive qualified with the SCSI Response PDU. See Section 7.3.2 for details on the handling of the SCSI Response PDU.

9.5.3. Bidirectional Operation

The initiator and the target handle the SCSI Write and the SCSI Read portions of this bidirectional operation the same as described in Sections 9.5.1 and 9.5.2, respectively.
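
The exchanges of Sections 9.5.1 through 9.5.3 can be summarized by a small, non-normative trace generator. The Python sketch below lists one plausible ordering of the RCaP operations for a single command; the function name, its parameters, and the descriptive strings are hypothetical. Treating the response to a bidirectional command like the response to a read is an assumption of this sketch, not a statement of this specification.

```python
def iser_message_flow(solicited_write=False, unsolicited_pdus=0, read=False):
    """List (sender, RCaP operation, content) tuples for one SCSI command."""
    stags = [name for name, used in (("Write STag", solicited_write),
                                     ("Read STag", read)) if used]
    advertised = " advertising " + " and ".join(stags) if stags else ""
    # The command always travels in a SendSE Message behind the iSER header.
    flow = [("initiator", "SendSE",
             "iSER header" + advertised + " + SCSI Command PDU")]
    # Non-immediate unsolicited data travels as SCSI Data-out PDUs.
    flow += [("initiator", "Send or SendSE",
              "iSER header + SCSI Data-out PDU")] * unsolicited_pdus
    if solicited_write:
        # Get_Data (R2T PDU) results in the target reading the initiator's buffer.
        flow.append(("target", "RDMA Read", "solicited write data"))
    if read:
        # Put_Data (SCSI Data-in PDU) results in the target writing the buffer.
        flow.append(("target", "RDMA Write", "read data"))
    # Section 9.5.2 names SendInvSE for reads; Section 9.5.1 allows either type.
    response_op = "SendInvSE" if read else "SendSE or SendInvSE"
    flow.append(("target", response_op, "iSER header + SCSI Response PDU"))
    return flow
```

Calling `iser_message_flow(read=True)` yields three steps: the SendSE carrying the command, an RDMA Write carrying the data, and the SendInvSE carrying the SCSI Response PDU.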

10. iSER Error Handling and Recovery

RCaP provides the iSER layer with reliable in-order delivery. Therefore, the error management needs of an iSER-assisted connection are somewhat different from those of a Traditional iSCSI connection.

10.1. Error Handling

iSER error handling is described in the following sections, classified loosely based on the sources of errors:

   1. Those originating at the transport layer (e.g., TCP).

   2. Those originating at the RCaP layer.

   3. Those originating at the iSER layer.

   4. Those originating at the iSCSI layer.

10.1.1. Errors in the Transport Layer

If the transport layer is TCP, then TCP packets with detected errors are silently dropped by the TCP layer and result in retransmission at the TCP layer. This has no impact on the iSER layer. However, connection loss (e.g., link failure) and unexpected termination (e.g., TCP graceful or abnormal close without the iSCSI Logout exchanges) at the transport layer will cause the iSCSI/iSER connection to be terminated as well.

10.1.1.1. Failure in the Transport Layer before RCaP Mode Is Enabled

If the connection is lost or terminated before the iSCSI layer invokes the Allocate_Connection_Resources Operational Primitive, the login process is terminated and no further action is required.

If the connection is lost or terminated after the iSCSI layer has invoked the Allocate_Connection_Resources Operational Primitive, then the iSCSI layer MUST request that the iSER layer deallocate all connection resources by invoking the Deallocate_Connection_Resources Operational Primitive.

10.1.1.2. Failure in the Transport Layer after RCaP Mode Is Enabled

If the connection is lost or terminated after the iSCSI layer has invoked the Enable_Datamover Operational Primitive, the iSER layer MUST notify the iSCSI layer of the connection loss by invoking the Connection_Terminate_Notify Operational Primitive. Prior to invoking the Connection_Terminate_Notify Operational Primitive, the iSER layer MUST perform the actions described in Section 5.2.3.2.
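
The three cases of Sections 10.1.1.1 and 10.1.1.2 depend only on how far connection setup had progressed when the transport failed. A non-normative Python sketch follows; the `ConnState` enumeration and the function name are hypothetical, and the returned strings merely name the layer and the action it takes.

```python
from enum import Enum


class ConnState(Enum):
    NO_RESOURCES = 1         # Allocate_Connection_Resources not yet invoked
    RESOURCES_ALLOCATED = 2  # allocated, Enable_Datamover not yet invoked
    DATAMOVER_ENABLED = 3    # Enable_Datamover has been invoked


def on_transport_failure(state: ConnState):
    """Return the ordered (layer, action) steps required after a failure."""
    if state is ConnState.NO_RESOURCES:
        return []  # login simply terminates; nothing to release
    if state is ConnState.RESOURCES_ALLOCATED:
        return [("iSCSI", "Deallocate_Connection_Resources")]
    # After RCaP mode is enabled the iSER layer cleans up first, then notifies.
    return [("iSER", "actions of Section 5.2.3.2"),
            ("iSER", "Connection_Terminate_Notify")]
```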

10.1.2. Errors in the RCaP Layer

The RCaP layer does not have error recovery operations built in. If errors are detected at the RCaP layer, the RCaP layer will terminate the RCaP Stream and the associated connection.

10.1.2.1. Errors Detected in the Local RCaP Layer

If an error is encountered at the local RCaP layer, the RCaP layer MAY send a Terminate Message to the Remote Peer to report the error if possible. (For iWARP, see [RDMAP] for the list of errors where a Terminate Message is sent.) The RCaP layer is responsible for terminating the connection. After the RCaP layer notifies the iSER layer that the connection is terminated, the iSER layer MUST notify the iSCSI layer by invoking the Connection_Terminate_Notify Operational Primitive. Prior to invoking the Connection_Terminate_Notify Operational Primitive, the iSER layer MUST perform the actions described in Section 5.2.3.2.

10.1.2.2. Errors Detected in the RCaP Layer at the Remote Peer

If an error is encountered at the RCaP layer at the Remote Peer, the RCaP layer at the Remote Peer may send a Terminate Message to report the error if possible. If it is unable to send the Terminate Message, the connection is terminated. This is treated the same as a failure in the transport layer after RDMA is enabled as described in Section 10.1.1.2.

If an error is encountered at the RCaP layer at the Remote Peer and it is able to send a Terminate Message, the RCaP layer at the Remote Peer is responsible for terminating the connection. After the local RCaP layer notifies the iSER layer that the connection is terminated, the iSER layer MUST notify the iSCSI layer by invoking the Connection_Terminate_Notify Operational Primitive. Prior to invoking the Connection_Terminate_Notify Operational Primitive, the iSER layer MUST perform the actions described in Section 5.2.3.2.

10.1.3. Errors in the iSER Layer

The error handling due to errors at the iSER layer is described in the following sections.

10.1.3.1. Insufficient Connection Resources to Support RCaP at Connection Setup

After the iSCSI layer at the initiator invokes the Allocate_Connection_Resources Operational Primitive during the iSCSI Login Negotiation Phase, if the iSER layer at the initiator fails to allocate the connection resources necessary to support RCaP, it MUST return a status of failure to the iSCSI layer at the initiator. The iSCSI layer at the initiator MUST terminate the connection as described in Section 5.2.3.1.

After the iSCSI layer at the target invokes the Allocate_Connection_Resources Operational Primitive during the iSCSI Login Negotiation Phase, if the iSER layer at the target fails to allocate the connection resources necessary to support RCaP, it MUST return a status of failure to the iSCSI layer at the target. The iSCSI layer at the target MUST send a Login Response with a status class of 3 (Target Error), and a status code of "0302" (Out of Resources). The iSCSI layers at the initiator and the target MUST terminate the connection as described in Section 5.2.3.1.

10.1.3.2. iSER Negotiation Failures

If the RCaP or iSER related parameters declared by the initiator in the iSER Hello Message are unacceptable to the iSER layer at the target, the iSER layer at the target MUST set the Reject (REJ) flag, as described in Section 9.4, in the iSER HelloReply Message. The following are the cases when the iSER layer MUST set the REJ flag to one in the HelloReply Message:

   * The initiator-declared iSER-IRD value is greater than 0 and the target-declared iSER-ORD value is 0.

   * The initiator-supported and the target-supported iSER protocol versions do not overlap.

After requesting that the RCaP layer send the iSER HelloReply Message, the handling of the error situation is the same as that for iSER format errors as described in Section 10.1.3.3.
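
The two mandatory rejection cases can be expressed as a single predicate. The following Python sketch is non-normative; `must_reject` and its parameters are hypothetical names, and the target's supported version range is assumed to be local knowledge of the target, since the HelloReply Message itself carries only MaxVer and CurVer.

```python
def must_reject(initiator_ird: int, target_ord: int,
                initiator_min_ver: int, initiator_max_ver: int,
                target_min_ver: int, target_max_ver: int) -> bool:
    """Return True when the target MUST set REJ to one in the HelloReply."""
    # Case 1: the initiator declares a non-zero iSER-IRD but the target's
    # iSER-ORD is zero.
    ird_unsatisfiable = initiator_ird > 0 and target_ord == 0
    # Case 2: the two closed version ranges share no common version.
    no_version_overlap = (initiator_max_ver < target_min_ver
                          or target_max_ver < initiator_min_ver)
    return ird_unsatisfiable or no_version_overlap
```

A target may find other declared parameters unacceptable as well; this predicate covers only the two cases listed above.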

10.1.3.3. iSER Format Errors

The following types of errors in an iSER header are considered format errors:

   * Illegal contents of any iSER header field

   * Inconsistent field contents in an iSER header

   * Length error for an iSER Hello or HelloReply Message (see Sections 9.3 and 9.4)

When a format error is detected, the following events MUST occur in the specified sequence:

   1. The iSER layer MUST request that the RCaP layer terminate the RCaP Stream. The RCaP layer MUST terminate the associated connection.

   2. The iSER layer MUST notify the iSCSI layer of the connection termination by invoking the Connection_Terminate_Notify Operational Primitive. Prior to invoking the Connection_Terminate_Notify Operational Primitive, the iSER layer MUST perform the actions described in Section 5.2.3.2.
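
A receiver-side check for the format errors listed above might look like the following non-normative Python sketch. The function name and the returned strings are hypothetical. Two of the checks are assumptions of this sketch rather than cases spelled out in the text: treating a payload shorter than 12 bytes as a format error, and treating a Hello Message whose MaxVer is lower than its MinVer as "inconsistent field contents". Reserved fields are deliberately not examined, since they MUST be ignored on reception.

```python
ISER_HEADER_LEN = 12
OP_CONTROL, OP_HELLO, OP_HELLO_REPLY = 0b0001, 0b0010, 0b0011


def find_format_error(message: bytes):
    """Return a description of the first format error found, or None."""
    if len(message) < ISER_HEADER_LEN:
        return "message too short to hold an iSER header"  # assumed case
    opcode = message[0] >> 4
    if opcode not in (OP_CONTROL, OP_HELLO, OP_HELLO_REPLY):
        return "reserved Opcode"
    # Hello and HelloReply Messages contain only the 12-byte header.
    if opcode != OP_CONTROL and len(message) != ISER_HEADER_LEN:
        return "length error in Hello or HelloReply Message"
    # Assumed example of inconsistent contents: MaxVer below MinVer.
    if opcode == OP_HELLO and (message[1] >> 4) < (message[1] & 0x0F):
        return "MaxVer is lower than MinVer"
    return None
```

When this function returns a description, the two-step termination sequence above applies.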

10.1.3.4. iSER Protocol Errors

The first iSER Message sent by the iSER layer at the initiator after transitioning into iSER-assisted mode MUST be the iSER Hello Message (see Section 9.3). Likewise, the first iSER Message sent by the iSER layer at the target after transitioning into iSER-assisted mode MUST be the iSER HelloReply Message (see Section 9.4).

Failure to send the iSER Hello or HelloReply Message, as indicated by the wrong Opcode in the iSER header, is a protocol error. The handling of this error situation is the same as that for iSER format errors as described in Section 10.1.3.3.

If the sending side of an iSER-enabled connection acts in a manner not permitted by the negotiated or declared login/text operational key values as described in Section 6, this is a protocol error, and the receiving side MAY handle this the same as for iSER format errors as described in Section 10.1.3.3.
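
The first-message rule reduces to an Opcode comparison at the receiver. A minimal, non-normative Python sketch follows; the function name and the role strings are hypothetical, and treating an empty payload as an error is an assumption of this sketch.

```python
# Opcode each side must use for its first iSER Message (Sections 9.3 and 9.4).
FIRST_OPCODE = {"initiator": 0b0010, "target": 0b0011}


def first_message_is_protocol_error(sender: str, message: bytes) -> bool:
    """Check the first iSER Message received from the given sender role."""
    if not message:
        return True  # assumed: nothing to inspect counts as an error
    return message[0] >> 4 != FIRST_OPCODE[sender]
```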

10.1.4. Errors in the iSCSI Layer

The error handling due to errors at the iSCSI layer is described in the following sections. For error recovery, see Section 10.2.

10.1.4.1. iSCSI Format Errors

When an iSCSI format error is detected, the iSCSI layer MUST request that the iSER layer terminate the RCaP Stream by invoking the Connection_Terminate Operational Primitive. For more details on the connection termination, see Section 5.2.3.1.

10.1.4.2. iSCSI Digest Errors

In the iSER-assisted mode, the iSCSI layer will not see any digest error because both the HeaderDigest and the DataDigest keys are negotiated to "None".

10.1.4.3. iSCSI Sequence Errors

For Traditional iSCSI, sequence errors are caused by dropped PDUs due to header or data digest errors. Since digests are not used in iSER-assisted mode and the RCaP layer will deliver all messages in the order they were sent, sequence errors will not occur in iSER-assisted mode.

10.1.4.4. iSCSI Protocol Error

When the iSCSI layer handles certain protocol errors by dropping the connection, the error handling is the same as that for iSCSI format errors as described in Section 10.1.4.1.

When the iSCSI layer uses the iSCSI Reject PDU and response codes to handle certain other protocol errors, no special handling at the iSER layer is required.

10.1.4.5. SCSI Timeouts and Session Errors

SCSI Timeouts and Session Errors are handled at the iSCSI layer and no special handling at the iSER layer is required.

10.1.4.6. iSCSI Negotiation Failures

For negotiation failures that happen during the Login Phase at the initiator after the iSCSI layer has invoked the Allocate_Connection_Resources Operational Primitive and before the Enable_Datamover Operational Primitive has been invoked, the iSCSI layer MUST request that the iSER layer deallocate all connection resources by invoking the Deallocate_Connection_Resources Operational Primitive. The iSCSI layer at the initiator MUST terminate the connection.

   For negotiation failures during the Login Phase at the target, the
   iSCSI layer can use a Login Response with a status class other than 0
   (success) to terminate the Login Phase.  If the iSCSI layer has
   invoked the Allocate_Connection_Resources Operational Primitive
   before the Enable_Datamover Operational Primitive has been invoked,
   the iSCSI layer at the target MUST request that the iSER layer at the
   target deallocate all connection resources by invoking the
   Deallocate_Connection_Resources Operational Primitive.  The iSCSI
   layer at both the initiator and the target MUST terminate the
   connection.

   During the iSCSI Login Phase, if the iSCSI layer at the initiator
   receives a Login Response from the target with a status class other
   than 0 (Success) after the iSCSI layer at the initiator has invoked
   the Allocate_Connection_Resources Operational Primitive, the iSCSI
   layer MUST request the iSER layer to deallocate all connection
   resources by invoking the Deallocate_Connection_Resources Operational
   Primitive.  The iSCSI layer MUST terminate the connection in this
   case.

   For negotiation failures during the Full Feature Phase, the error
   handling is left to the iSCSI layer and no special handling at the
   iSER layer is required.

10.2. Error Recovery

Error recovery requirements of iSCSI/iSER are the same as those of Traditional iSCSI. All three ErrorRecoveryLevels as defined in [RFC3720] are supported in iSCSI/iSER.

   * For ErrorRecoveryLevel 0, session recovery is handled by iSCSI and no special handling by the iSER layer is required.

   * For ErrorRecoveryLevel 1, see Section 10.2.1 on PDU Recovery.

   * For ErrorRecoveryLevel 2, see Section 10.2.2 on Connection Recovery.

The iSCSI layer may invoke the Notice_Key_Values Operational Primitive during connection setup to request that the iSER layer take note of the value of the operational ErrorRecoveryLevel, as described in Sections 5.1.1 and 5.1.2.

10.2.1. PDU Recovery

As described in Sections 10.1.4.2 and 10.1.4.3, digest and sequence errors will not occur in the iSER-assisted mode. If the RCaP layer detects an error, it will close the iSCSI/iSER connection, as described in Section 10.1.2. Therefore, PDU recovery is not useful in the iSER-assisted mode.

   The iSCSI layer at the initiator SHOULD disable iSCSI timeout-driven
   PDU retransmissions.

10.2.2. Connection Recovery

The iSCSI layer at the initiator MAY reassign connection allegiance for non-immediate commands that are still in progress and are associated with the failed connection by using a Task Management Function Request with the TASK REASSIGN function. See Section 7.3.3 for more details.

When the iSCSI layer at the initiator does a task reassignment for a SCSI write command, it MUST qualify the Send_Control Operational Primitive invocation with DataDescriptorOut, which defines the I/O Buffer for both the non-immediate unsolicited data and the solicited data. This allows the iSCSI layer at the target to use recovery R2Ts to request data originally sent as unsolicited and solicited from the initiator.

When the iSCSI layer at the target accepts a reassignment request for a SCSI read command, it MUST request that the iSER layer process SCSI Data-in for all unacknowledged data by invoking the Put_Data Operational Primitive. See Section 7.3.5 on the handling of SCSI Data-in.

When the iSCSI layer at the target accepts a reassignment request for a SCSI write command, it MUST request that the iSER layer process a recovery R2T for any non-immediate unsolicited data and any solicited data sequences that have not been received by invoking the Get_Data Operational Primitive. See Section 7.3.6 on the handling of Ready To Transfer (R2T).

The iSCSI layer at the target MUST NOT issue recovery R2Ts on an iSCSI/iSER connection for a task for which the connection allegiance was never reassigned. The iSER layer at the target MAY reject such a recovery R2T received via the Get_Data Operational Primitive invocation from the iSCSI layer at the target, with an appropriate error code.

The iSER layer at the target will process the requests invoked by the Put_Data and Get_Data Operational Primitives for a reassigned task in the same way as for the original commands.
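
The target-side rules for reassigned tasks can be sketched, non-normatively, as two small helpers. The `Task` record and both function names are hypothetical; only the primitive names come from this specification.

```python
from dataclasses import dataclass


@dataclass
class Task:
    command: str                         # "read" or "write"
    allegiance_reassigned: bool = False  # set once TASK REASSIGN is accepted


def may_issue_recovery_r2t(task: Task) -> bool:
    """Recovery R2Ts are allowed only for reassigned write tasks."""
    return task.command == "write" and task.allegiance_reassigned


def primitive_after_reassign(task: Task) -> str:
    """Primitive the target iSCSI layer invokes after accepting reassignment."""
    if not task.allegiance_reassigned:
        raise ValueError("connection allegiance was never reassigned")
    # Reads resend unacknowledged Data-in; writes solicit the missing data.
    return "Put_Data" if task.command == "read" else "Get_Data"
```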

11. Security Considerations

When iSER is layered on top of an RCaP layer and provides the RDMA extensions to the iSCSI protocol, the security considerations of iSER are the same as those of the underlying RCaP layer. For iWARP, this is described in [RDMAP] and [RDDPSEC].

Since the iSER-assisted iSCSI protocol is still functionally iSCSI from a security considerations perspective, all of the iSCSI security requirements as described in [RFC3720] and [RFC3723] apply. If the IPsec [IPSEC] mechanism is used, then it MUST be established before the connection transitions to the iSER-assisted mode.

If iSER is layered on top of a non-IP based RCaP layer, all the security protocol mechanisms applicable to that RCaP layer are also applicable to an iSCSI/iSER connection. If iSER is layered on top of a non-IP protocol, the IPsec mechanism as specified in [RFC3720] MUST be implemented at any point where the iSER protocol enters the IP network (e.g., via gateways), and the non-IP protocol SHOULD implement (optional to use) a packet-by-packet security protocol equal in strength to the IPsec mechanism specified by [RFC3720].

To minimize the potential for a denial-of-service attack, the iSCSI layer MUST NOT request that the iSER layer allocate the connection resources necessary to support RCaP until the iSCSI layer is sufficiently far along in the iSCSI Login Phase that it is reasonably certain that the peer side is not an attacker, as described in Sections 5.1.1 and 5.1.2.

Note that the IPsec requirements for this document are based on the version of IPsec specified in RFC 2401 [IPSEC] and related RFCs, as profiled by RFC 3723 [RFC3723], despite the existence of a newer version of IPsec specified in RFC 4301 [RFC4301] and related RFCs.

12. References

12.1. Normative References

   [RFC3720] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and
             E. Zeidner, "Internet Small Computer Systems Interface
             (iSCSI)", RFC 3720, April 2004.

   [RFC3723] Aboba, B., Tseng, J., Walker, J., Rangan, V., and F.
             Travostino, "Securing Block Storage Protocols over IP", RFC
             3723, April 2004.

   [RDMAP]   Recio, R., Culley, P., Garcia, D., Hilland, J., and B.
             Metzler, "A Remote Direct Memory Access Protocol
             Specification", RFC 5040, October 2007.

   [DDP]     Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct
             Data Placement over Reliable Transports", RFC 5041, October
             2007.

   [IPSEC]   Kent, S. and R. Atkinson, "Security Architecture for the
             Internet Protocol", RFC 2401, November 1998.

   [MPA]     Culley, P., Elzur, U., Recio, R., Bailey, S., and J.
             Carrier, "Marker PDU Aligned Framing for TCP
             Specification", RFC 5044, October 2007.

   [RDDPSEC] Pinkerton, J. and E. Deleganes, "Direct Data Placement
             Protocol (DDP) / Remote Direct Memory Access Protocol
             (RDMAP) Security", RFC 5042, October 2007.

   [TCP]     Postel, J., "Transmission Control Protocol", STD 7, RFC
             793, September 1981.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

12.2. Informative References

   [SAM2]    T10/1157D, SCSI Architecture Model - 2 (SAM-2)

   [DA]      Chadalapaka, M., Hufferd, J., Satran, J., and H. Shah, "DA:
             Datamover Architecture for the Internet Small Computer
             System Interface (iSCSI)", RFC 5047, October 2007.

   [IB]      InfiniBand Architecture Specification Volume 1 Release 1.2,
             October 2004

   [IPoIB]   Chu, J. and V. Kashyap, "Transmission of IP over InfiniBand
             (IPoIB)", RFC 4391, April 2006.

   [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
             Internet Protocol", RFC 4301, December 2005.

Appendix A. iWARP Message Format for iSER

This section is for information only and is NOT part of the standard. It simply depicts the iWARP Message format for the various iSER Messages when the transport layer is TCP.

A.1. iWARP Message Format for iSER Hello Message

The following figure depicts an iSER Hello Message encapsulated in an iWARP SendSE Message.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         MPA Header            |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      (Send) Queue Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                (Send) Message Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     (Send) Message Offset                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | 0010b | Zeros | 0001b | 0001b |           iSER-IRD            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           All Zeros                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           All Zeros                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 6.  SendSE Message Containing an iSER Hello Message

A.2. iWARP Message Format for iSER HelloReply Message

The following figure depicts an iSER HelloReply Message encapsulated in an iWARP SendSE Message. The Reject (REJ) flag is set to 0. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MPA Header | DDP Control | RDMA Control | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | (Send) Queue Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | (Send) Message Sequence Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | (Send) Message Offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 0011b |Zeros|0| 0001b | 0001b | iSER-ORD | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | All Zeros | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | All Zeros | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | MPA CRC | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 7. SendSE Message Containing an iSER HelloReply Message

A.3. iWARP Message Format for SCSI Read Command PDU

The following figure depicts a SCSI Read Command PDU embedded in an iSER Message encapsulated in an iWARP SendSE Message. For this particular example, in the iSER header, the Write STag Valid flag is set to zero, the Read STag Valid flag is set to one, the Write STag field is set to all zeros, and the Read STag field contains a valid Read STag.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MPA Header           |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      (Send) Queue Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                (Send) Message Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     (Send) Message Offset                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | 0001b |0|1|                     All zeros                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           All Zeros                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Read STag                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     SCSI Read Command PDU                     |
   //                                                             //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure 8. SendSE Message Containing a SCSI Read Command PDU
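The flag and STag settings described for this example can be expressed as a small encoder for the 12-byte iSER header that precedes the SCSI Command PDU. The sketch below is hedged: `pack_iser_control_header` is a hypothetical helper, and the convention of passing `None` for an STag that is not advertised is an assumption of this example rather than anything the specification defines.

```python
import struct

ISER_OPCODE_CONTROL = 0b0001  # iSCSI control-type PDU


def pack_iser_control_header(write_stag=None, read_stag=None):
    """Build the iSER header for an iSCSI control-type PDU (illustrative sketch)."""
    wsv = 1 if write_stag is not None else 0   # Write STag Valid flag
    rsv = 1 if read_stag is not None else 0    # Read STag Valid flag
    # First byte: 4-bit opcode, then WSV, then RSV, then two reserved bits.
    byte0 = (ISER_OPCODE_CONTROL << 4) | (wsv << 3) | (rsv << 2)
    # "3x" zero-fills the remaining 24 reserved bits of the first word;
    # an STag that is not advertised is sent as all zeros.
    return struct.pack("!B3xII", byte0, write_stag or 0, read_stag or 0)


# SCSI Read Command as in Figure 8: WSV=0, RSV=1, Write STag all zeros.
read_header = pack_iser_control_header(read_stag=0x12345678)
print(read_header.hex())
```

The STag value 0x12345678 is a placeholder; a real STag comes from registering the I/O buffer with the RNIC.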

A.4. iWARP Message Format for SCSI Read Data

The following figure depicts an iWARP RDMA Write Message carrying SCSI Read data in the payload:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MPA Header           |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Data Sink STag                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Data Sink Tagged Offset                    |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        SCSI Read data                         |
   //                                                             //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 9. RDMA Write Message Containing SCSI Read Data

A.5. iWARP Message Format for SCSI Write Command PDU

The following figure depicts a SCSI Write Command PDU embedded in an iSER Message encapsulated in an iWARP SendSE Message. For this particular example, in the iSER header, the Write STag Valid flag is set to one, the Read STag Valid flag is set to zero, the Write STag field contains a valid Write STag, and the Read STag field is set to all zeros since it is not used.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MPA Header           |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Reserved                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      (Send) Queue Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                (Send) Message Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     (Send) Message Offset                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | 0001b |1|0|                     All zeros                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Write STag                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           All Zeros                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    SCSI Write Command PDU                     |
   //                                                             //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

     Figure 10. SendSE Message Containing a SCSI Write Command PDU
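A receiver has to honor the valid flags rather than the STag fields themselves, since a field whose flag is zero must be ignored. The following sketch shows one way that rule might be applied when decoding the header in Figure 10; `parse_iser_control_header` and the use of `None` for an ignored STag are assumptions made for this illustration.

```python
import struct

ISER_OPCODE_CONTROL = 0b0001  # iSCSI control-type PDU
WSV_MASK = 0x08               # Write STag Valid: bit 4 of the first byte
RSV_MASK = 0x04               # Read STag Valid: bit 5 of the first byte


def parse_iser_control_header(header):
    """Return (write_stag, read_stag); None marks an STag that must be ignored."""
    if len(header) < 12:
        raise ValueError("an iSER header occupies 12 bytes")
    byte0, write_stag, read_stag = struct.unpack("!B3xII", header[:12])
    if byte0 >> 4 != ISER_OPCODE_CONTROL:
        raise ValueError("not an iSCSI control-type iSER header")
    # Field contents are only meaningful when the matching flag is set.
    return (
        write_stag if byte0 & WSV_MASK else None,
        read_stag if byte0 & RSV_MASK else None,
    )


# WSV=1, RSV=0: the Read STag word is ignored whatever it contains.
print(parse_iser_control_header(bytes.fromhex("180000000000abcdffffffff")))
```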

A.6. iWARP Message Format for RDMA Read Request

An iSCSI R2T is transformed into an iWARP RDMA Read Request Message. The following figure depicts an iWARP RDMA Read Request Message:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MPA Header           |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Reserved (Not Used)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             DDP (RDMA Read Request) Queue Number              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        DDP (RDMA Read Request) Message Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            DDP (RDMA Read Request) Message Offset             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Data Sink STag (SinkSTag)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +               Data Sink Tagged Offset (SinkTO)                +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               RDMA Read Message Size (RDMARDSZ)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Data Source STag (SrcSTag)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +               Data Source Tagged Offset (SrcTO)               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 11. RDMA Read Request Message
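To make the field widths in Figure 11 concrete, the five fields that follow the DDP Queue Number, Message Sequence Number, and Message Offset words can be modeled as a fixed 28-byte record. The sketch below is illustrative only: the `RdmaReadRequest` tuple and both helper functions are hypothetical names, the widths (32-bit STags and size, 64-bit tagged offsets) are read off the figure, and in practice an RNIC builds this message rather than host software.

```python
import struct
from collections import namedtuple

# Field names follow the abbreviations in Figure 11 (lowercased).
RdmaReadRequest = namedtuple(
    "RdmaReadRequest", "sink_stag sink_to rdma_rd_sz src_stag src_to"
)

# I = 32-bit field, Q = 64-bit tagged offset; "!" = network byte order.
_READ_REQUEST_FORMAT = "!IQIIQ"


def pack_read_request_fields(req):
    """Serialize SinkSTag, SinkTO, RDMARDSZ, SrcSTag, SrcTO in figure order."""
    return struct.pack(_READ_REQUEST_FORMAT, *req)


def unpack_read_request_fields(buf):
    """Inverse of pack_read_request_fields for a buffer starting at SinkSTag."""
    size = struct.calcsize(_READ_REQUEST_FORMAT)
    return RdmaReadRequest(*struct.unpack(_READ_REQUEST_FORMAT, buf[:size]))


# Placeholder values: read 8192 bytes from offset 0 of the source buffer.
req = RdmaReadRequest(sink_stag=0x11, sink_to=0, rdma_rd_sz=8192,
                      src_stag=0x22, src_to=0)
wire = pack_read_request_fields(req)
print(len(wire), unpack_read_request_fields(wire) == req)
```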

A.7. iWARP Message Format for Solicited SCSI Write Data

The following figure depicts an iWARP RDMA Read Response Message carrying the solicited SCSI Write data in the payload:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MPA Header           |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Data Sink STag                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Data Sink Tagged Offset                    |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        SCSI Write Data                        |
   //                                                             //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Figure 12. RDMA Read Response Message Containing SCSI Write Data

A.8. iWARP Message Format for SCSI Response PDU

The following figure depicts a SCSI Response PDU embedded in an iSER Message encapsulated in an iWARP SendInvSE Message:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          MPA Header           |  DDP Control  | RDMA Control  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Invalidate STag                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      (Send) Queue Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                (Send) Message Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     (Send) Message Offset                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | 0001b |0|0|                     All Zeros                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           All Zeros                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           All Zeros                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       SCSI Response PDU                       |
   //                                                             //
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            MPA CRC                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Figure 13. SendInvSE Message Containing SCSI Response PDU

Appendix B. Architectural Discussion of iSER over InfiniBand

This section explains how an InfiniBand network (with Gateways) would be structured. It is informational only and is intended to provide insight on how iSER is used in an InfiniBand environment.

B.1. The Host Side of the iSCSI and iSER Connections in InfiniBand

Figure 14 defines the topologies in which iSCSI and iSER will be able to operate on an InfiniBand Network.

   +---------+ +---------+ +---------+ +---------+ +---------+
   |  Host   | |  Host   | |  Host   | |  Host   | |  Host   |
   |         | |         | |         | |         | |         |
   +---+-+---+ +---+-+---+ +---+-+---+ +---+-+---+ +---+-+---+
   |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA|
   +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+
     |----+------|-----+-----|-----+-----|-----+-----|-----+---> To IB
   IB|        IB |        IB |        IB |        IB |    SubNet2 SWTCH
   +-v-----------v-----------v-----------v-----------v---------+
   |               InfiniBand Switch for Subnet1               |
   +---+-----+--------+-----+--------+-----+------------v------+
       | TCA |        | TCA |        | TCA |            |
       +-----+        +-----+        +-----+            | IB
      /  IB   \      /  IB   \      /       \     +--+--v--+--+
     |  iSER   |    |  iSER   |    |  IPoIB  |    |  | TCA |  |
     | Gateway |    | Gateway |    | Gateway |    |  +-----+  |
     |   to    |    |   to    |    |   to    |    |  Storage  |
     |  iSCSI  |    |  iSER   |    |   IP    |    | Controller|
     |   TCP   |    |  iWARP  |    |Ethernet |    +-----+-----+
     +---v-----+    +---v-----+    +----v----+
         | EN           | EN            | EN
         +--------------+---------------+----> to IP based storage
           Ethernet links that carry iSCSI or iWARP

                   Figure 14. iSCSI and iSER on IB

In Figure 14, the Host systems are connected via the InfiniBand Host Channel Adapters (HCAs) to the InfiniBand links. With the use of IB switch(es), the InfiniBand links connect the HCA to InfiniBand Target Channel Adapters (TCAs) located in gateways or Storage Controllers.

An iSER-capable IB-IP Gateway converts the iSER Messages encapsulated in IB protocols to either standard iSCSI, or iSER Messages for iWARP. An [IPoIB] Gateway converts the InfiniBand [IPoIB] protocol to IP protocol, and in the iSCSI case, permits iSCSI to be operated on an IB Network between the Hosts and the [IPoIB] Gateway.

B.2. The Storage Side of the iSCSI and iSER Mixed Network Environment

Figure 15 shows a storage controller that has three different portal groups: one supporting only iSCSI (TPG-4), one supporting iSER/iWARP or iSCSI (TPG-2), and one supporting iSER/IB (TPG-1).

         |                |                |
         |                |                |
   +--+--v--+----------+--v--+----------+--v--+--+
   |  | IB  |          |iWARP|          | EN  |  |
   |  |     |          | TCP |          | NIC |  |
   |  |(TCA)|          | RNIC|          |     |  |
   |  +-----+          +-----+          +-----+  |
   |   TPG-1            TPG-2            TPG-4   |
   |  9.1.3.3          9.1.2.4          9.1.2.6  |
   |                                             |
   |             Storage Controller              |
   |                                             |
   +---------------------------------------------+

   Figure 15. Storage Controller with TCP, iWARP, and IB Connections

The normal iSCSI portal group advertising processes (via the Service Location Protocol (SLP), the Internet Storage Name Service (iSNS), or SendTargets) are available to a Storage Controller.

B.3. Discovery Processes for an InfiniBand Host

An InfiniBand Host system can gather portal group IP addresses from SLP, iSNS, or the SendTargets discovery processes by using TCP/IP via [IPoIB]. After obtaining one or more remote portal IP addresses, the Initiator uses the standard IP mechanisms to resolve the IP address to a local outgoing interface and the destination hardware address (Ethernet MAC or IB GID of the target or a gateway leading to the target). If the resolved interface is an [IPoIB] network interface, then the target portal can be reached through an InfiniBand fabric. In this case, the Initiator can establish an iSCSI/TCP or iSCSI/iSER session with the Target over that InfiniBand interface, using the Hardware Address (InfiniBand GID) obtained through the standard Address Resolution (ARP) processes.

If more than one IP address is obtained through the discovery process, the Initiator should select a Target IP address that is on the same IP subnet as the Initiator, if one exists. This will avoid a potential overhead of going through a gateway when a direct path exists.

In addition, a user can configure manual static IP route entries if a particular path to the target is preferred.
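The same-subnet preference described above can be sketched with Python's standard `ipaddress` module. This is a hedged illustration, not part of the specification: the function name `select_target_ip`, the /24 prefix length on the initiator interface, and the fallback of taking the first discovered address when no on-subnet address exists are all assumptions of this example. The addresses are borrowed from Figure 15.

```python
import ipaddress


def select_target_ip(initiator_interface, target_ips):
    """Prefer a target address on the initiator's own IP subnet (sketch)."""
    iface = ipaddress.ip_interface(initiator_interface)
    for candidate in target_ips:
        addr = ipaddress.ip_address(candidate)
        # Only compare addresses of the same IP version as the interface.
        if addr.version == iface.version and addr in iface.network:
            return candidate
    # Assumed fallback: no on-subnet portal, so use the first one discovered
    # and let normal IP routing (possibly via a gateway) reach it.
    return target_ips[0] if target_ips else None


print(select_target_ip("9.1.2.10/24", ["9.1.3.3", "9.1.2.4", "9.1.2.6"]))
```

A statically configured route, as mentioned above, would override this kind of automatic choice.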

B.4. IBTA Connection Specifications

The InfiniBand Trade Association (IBTA) connection specifications are outside the scope of this document, but it is expected that the IBTA has defined or will define:

   * The iSER ServiceID.

   * A means for permitting a Host to establish a connection with a
     peer InfiniBand end-node, and to fall back to iSCSI/TCP over
     [IPoIB] if that peer indicates iSER is not supported.

   * A means for permitting the Host to establish connections with IB
     iSER connections on storage controllers or IB iSER connected
     Gateways in preference to [IPoIB] connected Gateways/Bridges or
     connections to Target Storage Controllers that also accept iSCSI
     via [IPoIB].

   * A means for combining the IB ServiceID for iSER and the IP port
     number such that the IB Host can use normal IB connection
     processes, yet ensure that the iSER target peer can actually
     connect to the required IP port number.

Acknowledgments

This protocol was developed by a design team that, in addition to the authors, included Dwight Barron (HP), John Carrier (formerly from Adaptec), Ted Compton (EMC), Paul R. Culley (HP), Yaron Haviv (Voltaire), Jeff Hilland (HP), Mike Krause (HP), Alex Nezhinsky (Voltaire), Jim Pinkerton (Microsoft), Renato J. Recio (IBM), Julian Satran (IBM), Tom Talpey (Network Appliance), and Jim Wendt (HP). Special thanks to David Black (EMC) for his extensive review comments.

Authors' Addresses

   Mallikarjun Chadalapaka
   Hewlett-Packard Company
   8000 Foothills Blvd.
   Roseville, CA 95747-5668, USA
   Phone: +1-916-785-5621
   EMail: cbm@rose.hp.com

   Uri Elzur
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617, USA
   Phone: +1-949-926-6432
   EMail: Uri@Broadcom.com

   John Hufferd
   Brocade Communications Systems, Inc.
   1745 Technology Drive
   San Jose, CA 95110, USA
   Phone: +1-408-333-5244
   EMail: jhufferd@brocade.com

   Mike Ko
   IBM Corp.
   650 Harry Rd.
   San Jose, CA 95120, USA
   Phone: +1-408-927-2085
   EMail: mako@us.ibm.com

   Hemal Shah
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617, USA
   Phone: +1-949-926-6941
   EMail: hemal@broadcom.com

   Patricia Thaler
   Broadcom Corporation
   5300 California Avenue
   Irvine, CA 92617, USA
   Phone: +1-916-570-2707
   EMail: pthaler@broadcom.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.