
RFC 7145

Internet Small Computer System Interface (iSCSI) Extensions for the Remote Direct Memory Access (RDMA) Specification

Pages: 91
Proposed Standard
Obsoletes:  5046
Part 3 of 5 – Pages 44 to 63

7. iSCSI PDU Considerations

When a connection is in the iSER-assisted mode, two types of message transfers are allowed between the iSCSI layer (at the initiator) and the iSCSI layer (at the target). These are known as the iSCSI data-type PDUs and the iSCSI control-type PDUs, and these terms are described in the following sections.

7.1. iSCSI Data-Type PDU

   An iSCSI data-type PDU is defined as an iSCSI PDU that causes data
   transfer, transparent to the remote iSCSI layer, to take place
   between the peer iSCSI nodes in the Full Feature Phase of an
   iSCSI/iSER connection.  An iSCSI data-type PDU, when requested for
   transmission by the iSCSI layer in the sending node, results in the
   data's transfer without the participation of the iSCSI layers at the
   sending and the receiving nodes.  This is due to the fact that the
   PDU itself is not delivered as-is to the iSCSI layer in the receiving
   node.  Instead, the data transfer operations are transformed into the
   appropriate RDMA operations, which are handled by the RDMA-Capable
   Controller.  The set of iSCSI data-type PDUs consists of SCSI Data-In
   PDUs and R2T PDUs.

   If the invocation of the Operational Primitive by the iSCSI layer to
   request the iSER layer to process an iSCSI data-type PDU is qualified
   with Notify_Enable set, then upon completing the RDMA operation, the
   iSER layer at the target MUST notify the iSCSI layer at the target by
   invoking the Data_Completion_Notify Operational Primitive qualified
   with the ITT and SN.  There is no data completion notification at the
   initiator since the RDMA operations are completely handled by the
   RDMA-Capable Controller at the initiator and the iSER layer at the
   initiator is not involved with the data transfer associated with
   iSCSI data-type PDUs.

   If the invocation of the Operational Primitive by the iSCSI layer to
   request the iSER layer to process an iSCSI data-type PDU is qualified
   with Notify_Enable cleared, then upon completing the RDMA operation,
   the iSER layer at the target MUST NOT notify the iSCSI layer at the
   target and MUST NOT invoke the Data_Completion_Notify Operational
   Primitive.

   If an operation associated with an iSCSI data-type PDU fails for any
   reason, the contents of the Data Sink buffers associated with the
   operation are considered indeterminate.

7.2. iSCSI Control-Type PDU

   Any iSCSI PDU that is not an iSCSI data-type PDU and also not a SCSI
   Data-Out PDU carrying solicited data is defined as an iSCSI
   control-type PDU.  The iSCSI layer invokes the Send_Control
   Operational Primitive to request the iSER layer to process an iSCSI
   control-type PDU.  iSCSI control-type PDUs are transferred using Send
   Messages of RCaP.  Note in particular that SCSI Data-Out PDUs
   carrying unsolicited data are defined as iSCSI control-type PDUs.
   See Section 7.3.4 on the treatment of SCSI Data-Out PDUs.

   When the iSER layer receives an iSCSI control-type PDU, it MUST
   notify the iSCSI layer by invoking the Control_Notify Operational
   Primitive qualified with the iSCSI control-type PDU.
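
   As an informal illustration (not part of this specification), the
   classification given in Sections 7.1 and 7.2 can be sketched as a
   small lookup.  The function name classify_pdu, the PDU name strings,
   and the returned labels are hypothetical and chosen only for this
   example.

```python
from typing import Optional

# PDU types that Section 7.1 lists as iSCSI data-type PDUs.
DATA_TYPE_PDUS = {"SCSI Data-In", "R2T"}


def classify_pdu(pdu_name: str, solicited: bool = False) -> Optional[str]:
    """Return "data-type", "control-type", or None.

    None is returned for a SCSI Data-Out PDU carrying solicited data,
    which falls into neither category.
    """
    if pdu_name in DATA_TYPE_PDUS:
        return "data-type"
    if pdu_name == "SCSI Data-Out" and solicited:
        return None
    return "control-type"
```

   In this sketch, a SCSI Data-Out PDU with unsolicited data is
   classified as control-type, matching the note above.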

7.3. iSCSI PDUs

This section describes the handling of each of the iSCSI PDU types by the iSER layer. The iSCSI layer requests the iSER layer to process the iSCSI PDU by invoking the appropriate Operational Primitive. A Connection_Handle MUST qualify each of these invocations. In addition, the BHS and the optional AHS of the iSCSI PDU as defined in [iSCSI] MUST qualify each of the invocations. The qualifying Connection_Handle, the BHS, and the AHS are not explicitly listed in the subsequent sections.

7.3.1. SCSI Command

   Type:  control-type PDU

   PDU-specific qualifiers (for SCSI Write or bidirectional command):
   ImmediateDataSize, UnsolicitedDataSize, DataDescriptorOut

   PDU-specific qualifiers (for SCSI Read or bidirectional command):
   DataDescriptorIn

   The iSER layer at the initiator MUST send the SCSI command in a Send
   Message to the target.  The SendSE Message should be used if
   supported by the RCaP layer (e.g., iWARP).

   For a SCSI Write or bidirectional command, the iSCSI layer at the
   initiator MUST invoke the Send_Control Operational Primitive as
   follows:

   *  If there is immediate data to be transferred for the SCSI write or
      bidirectional command, the qualifier ImmediateDataSize MUST be
      used to define the number of bytes of immediate unsolicited data
      to be sent with the write or bidirectional command, and the
      qualifier DataDescriptorOut MUST be used to define the initiator's
      I/O Buffer containing the SCSI Write data.

   *  If there is unsolicited data to be transferred for the SCSI Write
      or bidirectional command, the qualifier UnsolicitedDataSize MUST
      be used to define the number of bytes of immediate and non-
      immediate unsolicited data for the command.  The iSCSI layer will
      issue one or more SCSI Data-Out PDUs for the non-immediate
      unsolicited data.  See Section 7.3.4 on SCSI Data-Out.

   *  If there is solicited data to be transferred for the SCSI Write or
      bidirectional command, as indicated when the Expected Data
      Transfer Length in the SCSI Command PDU exceeds the value of
      UnsolicitedDataSize, the iSER layer at the initiator MUST do the
      following:

      a. It MUST allocate a Write STag for the I/O Buffer defined by the
         qualifier DataDescriptorOut.  DataDescriptorOut describes the
         I/O buffer starting with the immediate unsolicited data (if
         any), followed by the non-immediate unsolicited data (if any)
         and solicited data.  When TaggedBufferForSolicitedDataOnly is
         negotiated to No, the Base Offset is associated with this I/O
         Buffer.  When TaggedBufferForSolicitedDataOnly is negotiated to
         Yes, the Base Offset is associated with an I/O Buffer that
         contains only solicited data.

      b. It MUST establish a Local Mapping that associates the Initiator
         Task Tag (ITT) to the Write STag.

      c. It MUST Advertise the Write STag and the Base Offset to the
         target by sending them in the iSER header of the iSER Message
         (the payload of the Send Message of RCaP) containing the SCSI
         Write or bidirectional command PDU.  The SendSE Message should
         be used if supported by the RCaP layer (e.g., iWARP).  See
         Section 9.2 on iSER Header Format for iSCSI Control-Type PDU.

   For a SCSI Read or bidirectional command, the iSCSI layer at the
   initiator MUST invoke the Send_Control Operational Primitive
   qualified with DataDescriptorIn, which defines the initiator's I/O
   Buffer for receiving the SCSI Read data.  The iSER layer at the
   initiator MUST do the following:

      a. It MUST allocate a Read STag for the I/O Buffer and note the
         Base Offset for this I/O Buffer.

      b. It MUST establish a Local Mapping that associates the Initiator
         Task Tag (ITT) to the Read STag.

      c. It MUST Advertise the Read STag and the Base Offset to the
         target by sending them in the iSER header of the iSER Message
         (the payload of the Send Message of RCaP) containing the SCSI
         Read or bidirectional command PDU.  The SendSE Message should
         be used if supported by the RCaP layer (e.g., iWARP).  See
         Section 9.2 on iSER Header Format for iSCSI Control-Type PDU.

   If the amount of unsolicited data to be transferred in a SCSI Command
   exceeds TargetRecvDataSegmentLength, then the iSCSI layer at the
   initiator MUST segment the data into multiple iSCSI control-type
   PDUs, with the data segment length in all generated PDUs (except the
   last one) having exactly the size TargetRecvDataSegmentLength.  The
   data segment length of the last iSCSI control-type PDU carrying the
   unsolicited data can be up to TargetRecvDataSegmentLength.
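
   The segmentation rule above amounts to simple integer arithmetic.
   The following is a minimal sketch only; the function name
   segment_unsolicited and its parameters are hypothetical, and the
   sketch deliberately ignores which PDU carries each segment as well
   as any other negotiated limits.

```python
from typing import List


def segment_unsolicited(total_len: int, target_recv_dsl: int) -> List[int]:
    """Split total_len bytes into data segment lengths.

    Every segment except the last is exactly target_recv_dsl bytes;
    the last one carries the remainder (at most target_recv_dsl bytes).
    """
    if total_len < 0 or target_recv_dsl <= 0:
        raise ValueError("total_len must be >= 0 and target_recv_dsl > 0")
    full, rest = divmod(total_len, target_recv_dsl)
    segments = [target_recv_dsl] * full
    if rest:
        segments.append(rest)
    return segments
```

   For example, 10000 bytes with a TargetRecvDataSegmentLength of 4096
   yields two full segments of 4096 bytes and a final segment of 1808
   bytes.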

   When the iSER layer at the target receives the SCSI Command, it MUST
   establish a Remote Mapping that associates the ITT to the Base
   Offset(s) and the Advertised STag(s) in the iSER header.  The Write
   STag is used by the iSER layer at the target in handling the data
   transfer associated with the R2T PDU(s) as described in Section
   7.3.6.  The Read STag is used in handling the SCSI Data-In PDU(s)
   from the iSCSI layer at the target as described in Section 7.3.5.
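
   The bookkeeping described in this section can be pictured with two
   dictionaries keyed by ITT.  This is an illustrative sketch under
   stated assumptions: the class names, the Advertisement record, and
   the counter used in place of real STag allocation are all
   hypothetical, and the caller is assumed to pass write_base_offset
   only when solicited data is expected.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Advertisement:
    """STags and Base Offsets carried in the iSER header (illustrative)."""
    write_stag: Optional[int] = None
    write_base_offset: Optional[int] = None
    read_stag: Optional[int] = None
    read_base_offset: Optional[int] = None


class InitiatorIser:
    def __init__(self) -> None:
        # Stand-in for STag allocation by the RDMA-Capable Controller.
        self._stags = itertools.count(0x100)
        self.local_mapping: Dict[int, Advertisement] = {}

    def send_scsi_command(self, itt: int,
                          write_base_offset: Optional[int] = None,
                          read_base_offset: Optional[int] = None
                          ) -> Advertisement:
        adv = Advertisement()
        if write_base_offset is not None:   # solicited write data expected
            adv.write_stag = next(self._stags)
            adv.write_base_offset = write_base_offset
        if read_base_offset is not None:    # read data expected
            adv.read_stag = next(self._stags)
            adv.read_base_offset = read_base_offset
        self.local_mapping[itt] = adv       # Local Mapping: ITT -> STag(s)
        return adv                          # Advertised in the iSER header


class TargetIser:
    def __init__(self) -> None:
        self.remote_mapping: Dict[int, Advertisement] = {}

    def on_scsi_command(self, itt: int, adv: Advertisement) -> None:
        # Remote Mapping: ITT -> Advertised STag(s) and Base Offset(s).
        self.remote_mapping[itt] = adv
```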

7.3.2. SCSI Response

   Type:  control-type PDU

   PDU-specific qualifiers:  DataDescriptorStatus

   The iSCSI layer at the target MUST invoke the Send_Control
   Operational Primitive qualified with DataDescriptorStatus, which
   defines the buffer containing the sense and response information.
   The iSCSI layer at the target MUST always return the SCSI status for
   a SCSI command in a separate SCSI Response PDU.  "Phase collapse" for
   transferring SCSI status in a SCSI Data-In PDU MUST NOT be used.  The
   iSER layer at the target sends the SCSI Response PDU according to the
   following rules:

   *  If no STags were Advertised by the initiator in the iSER Message
      containing the SCSI command PDU, then the iSER layer at the target
      MUST send a Send Message containing the SCSI Response PDU.  The
      SendSE Message should be used if supported by the RCaP layer
      (e.g., iWARP).

   *  If the initiator Advertised a Read STag in the iSER Message
      containing the SCSI Command PDU, then the iSER layer at the target
      MUST send a Send Message containing the SCSI Response PDU.  The
      header of the Send Message MUST carry the Read STag to be
      invalidated at the initiator.  The Send with Invalidate Message,
      if supported by the RCaP layer (e.g., iWARP), can be used for the
      automatic invalidation of the STag.

   *  If the initiator Advertised only the Write STag in the iSER
      Message containing the SCSI command PDU, then the iSER layer at
      the target MUST send a Send Message containing the SCSI Response
      PDU.  The header of the Send Message MUST carry the Write STag to
      be invalidated at the initiator.  The Send with Invalidate
      Message, if supported by the RCaP layer (e.g., iWARP), can be used
      for the automatic invalidation of the STag.

   When the iSCSI layer at the target invokes the Send_Control
   Operational Primitive to send the SCSI Response PDU, the iSER layer
   at the target MUST invalidate the Remote Mapping before transferring
   the SCSI Response PDU to the initiator.

   Upon receiving a Send Message containing the SCSI Response PDU from
   the target, the iSER layer at the initiator MUST invalidate the
   STag(s) specified in the header.  (If a Send with Invalidate Message
   is supported by the RCaP layer (e.g., iWARP) and is used to carry the
   SCSI Response PDU, the RCaP layer at the initiator will invalidate
   the STag.  The iSER layer at the initiator MUST ensure that the
   correct STag is invalidated.  If both the Read and the Write STags
   were Advertised earlier by the initiator, then the iSER layer at the
   initiator MUST explicitly invalidate the Write STag upon receiving
   the Send with Invalidate Message because the header of the Send with
   Invalidate Message can only carry one STag (in this case, the Read
   STag) to be invalidated.)

   The iSER layer at the initiator MUST ensure the invalidation of the
   STag(s) used in a command before notifying the iSCSI layer at the
   initiator by invoking the Control_Notify Operational Primitive
   qualified with the SCSI Response.  This precludes the possibility of
   using the STag(s) after the completion of the command; such use would
   cause data corruption.

   When the iSER layer at the initiator receives a Send Message
   containing the SCSI Response PDU, it SHOULD invalidate the Local
   Mapping.  The iSER layer MUST ensure that all local STag(s)
   associated with the ITT are invalidated before notifying the iSCSI
   layer of the SCSI Response PDU by invoking the Control_Notify
   Operational Primitive qualified with the SCSI Response PDU.
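
   The STag selection rules in this section can be summarized in a few
   lines.  The sketch below is illustrative only; both function names
   are hypothetical, and STags are modeled as plain integers with None
   meaning "not Advertised".

```python
from typing import List, Optional


def stag_in_response_header(read_stag: Optional[int],
                            write_stag: Optional[int]) -> Optional[int]:
    """STag the target carries in the Send Message header, if any.

    The Read STag takes precedence; the Write STag is carried only
    when it is the only STag that was Advertised.
    """
    if read_stag is not None:
        return read_stag
    return write_stag


def stags_to_invalidate_explicitly(read_stag: Optional[int],
                                   write_stag: Optional[int],
                                   send_with_invalidate: bool) -> List[int]:
    """STags the iSER layer at the initiator must invalidate itself."""
    advertised = [s for s in (read_stag, write_stag) if s is not None]
    if not send_with_invalidate:
        return advertised
    # The RCaP layer invalidates the single STag named in the header.
    automatic = stag_in_response_header(read_stag, write_stag)
    return [s for s in advertised if s != automatic]
```

   With both STags Advertised and a Send with Invalidate Message in
   use, only the Write STag remains for explicit invalidation, as the
   parenthetical note above describes.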

7.3.3. Task Management Function Request/Response

   Type:  control-type PDU

   PDU-specific qualifiers (for TMF Request):  DataDescriptorOut,
   DataDescriptorIn

   The iSER layer MUST use a Send Message to send the Task Management
   Function Request/Response PDU.  The SendSE Message should be used if
   supported by the RCaP layer (e.g., iWARP).

   For the Task Management Function Request with the TASK REASSIGN
   function, the iSER layer at the initiator MUST do the following:

   *  It MUST use the ITT as specified in the Referenced Task Tag from
      the Task Management Function Request PDU to locate the existing
      STags (if any) in the Local Mappings.

   *  It MUST invalidate the existing STags (if any) and the Local
      Mappings.

   *  It MUST allocate a Read STag for the I/O Buffer and note the Base
      Offset associated with the I/O Buffer as defined by the qualifier
      DataDescriptorIn if the Send_Control Operational Primitive
      invocation is qualified with DataDescriptorIn.

   *  It MUST allocate a Write STag for the I/O Buffer and note the Base
      Offset associated with the I/O Buffer as defined by the qualifier
      DataDescriptorOut if the Send_Control Operational Primitive
      invocation is qualified with DataDescriptorOut.

   *  If STags are allocated, it MUST establish new Local Mapping(s)
      that associate the ITT to the allocated STag(s).

   *  It MUST Advertise the STags and the Base Offsets, if allocated, to
      the target in the iSER header of the Send Message carrying the
      iSCSI PDU, as described in Section 9.2.  The SendSE Message should
      be used if supported by the RCaP layer (e.g., iWARP).

   For the Task Management Function Request with the TASK REASSIGN
   function for a SCSI Read or bidirectional command, the iSCSI layer at
   the initiator MUST set ExpDataSN to zero since the data transfer and
   acknowledgements happen transparently to the iSCSI layer at the
   initiator.  This provides the flexibility to the iSCSI layer at the
   target to request transmission of only the unacknowledged data as
   specified in [iSCSI].

   When the iSER layer at the target receives the Task Management
   Function Request with the TASK REASSIGN function, it MUST do the
   following:

   *  It MUST use the ITT as specified in the Referenced Task Tag from
      the Task Management Function Request PDU to locate the Local and
      Remote Mappings (if any).

   *  It MUST invalidate the local STags (if any) associated with the
      ITT.

   *  It MUST replace the Base Offset(s) and the Advertised STag(s) in
      the Remote Mapping with the Base Offset(s) and the Advertised
      STag(s) in the iSER header.  The Write STag is used in the
      handling of the R2T PDU(s) from the iSCSI layer at the target as
      described in Section 7.3.6.  The Read STag is used in the handling
      of the SCSI Data-In PDU(s) from the iSCSI layer at the target as
      described in Section 7.3.5.

7.3.4. SCSI Data-Out

   Type:  control-type PDU

   PDU-specific qualifiers:  DataDescriptorOut

   The iSCSI layer at the initiator MUST invoke the Send_Control
   Operational Primitive qualified with DataDescriptorOut, which defines
   the initiator's I/O Buffer containing unsolicited SCSI Write data.

   If the amount of unsolicited data to be transferred as SCSI Data-Out
   exceeds TargetRecvDataSegmentLength, then the iSCSI layer at the
   initiator MUST segment the data into multiple iSCSI control-type
   PDUs, where the DataSegmentLength has the value of
   TargetRecvDataSegmentLength in all generated PDUs except the last
   one.  The DataSegmentLength of the last iSCSI control-type PDU
   carrying the unsolicited data can be up to
   TargetRecvDataSegmentLength.  The iSCSI layer at the target MUST
   perform the reassembly function for the unsolicited data.

   For unsolicited data, the iSER layer at the initiator MUST use a Send
   Message to send the SCSI Data-Out PDU.  If the F bit is set to 1, the
   SendSE Message should be used if supported by the RCaP layer (e.g.,
   iWARP).

   Note that for solicited data, the SCSI Data-Out PDUs are not used
   since R2T PDUs are not delivered to the iSCSI layer at the initiator;
   instead, R2T PDUs are transformed by the iSER layer at the target
   into RDMA Read operations.  (See Section 7.3.6.)

7.3.5. SCSI Data-In

   Type:  data-type PDU

   PDU-specific qualifiers:  DataDescriptorIn

   When the iSCSI layer at the target is ready to return the SCSI Read
   data to the initiator, it MUST invoke the Put_Data Operational
   Primitive qualified with DataDescriptorIn, which defines the SCSI
   Data-In buffer.  See Section 7.1 on the general requirement on the
   handling of iSCSI data-type PDUs.  SCSI Data-In PDU(s) are used in
   SCSI Read data transfer as described in Section 9.5.2.

   The iSER layer at the target MUST do the following for each
   invocation of the Put_Data Operational Primitive:

   1. It MUST use the ITT in the SCSI Data-In PDU to locate the remote
      Read STag and the Base Offset in the Remote Mapping.  The Remote
      Mapping was established earlier by the iSER layer at the target
      when the SCSI Read Command was received from the initiator.

   2. It MUST generate and send an RDMA Write Message containing the
      read data to the initiator.

      a. It MUST use the remote Read STag as the Data Sink STag of the
         RDMA Write Message.

      b. It MUST add the Buffer Offset from the SCSI Data-In PDU to the
         Base Offset from the Remote Mapping as the Data Sink Tagged
         Offset of the RDMA Write Message.

      c. It MUST use DataSegmentLength from the SCSI Data-In PDU to
         determine the amount of data to be sent in the RDMA Write
         Message.

   3. It MUST associate the DataSN and ITT from the SCSI Data-In PDU
      with the RDMA Write operation.  If the Put_Data Operational
      Primitive invocation was qualified with Notify_Enable set, then
      when the iSER layer at the target receives a completion from the
      RCaP layer for the RDMA Write Message, the iSER layer at the
      target MUST notify the iSCSI layer by invoking the
      Data_Completion_Notify Operational Primitive qualified with the
      DataSN and ITT.  Conversely, if the Put_Data Operational Primitive
      invocation was qualified with Notify_Enable cleared, then the iSER
      layer at the target MUST NOT notify the iSCSI layer on completion
      and MUST NOT invoke the Data_Completion_Notify Operational
      Primitive.
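
   Step 2 of the procedure above reduces to filling in three fields of
   the RDMA Write Message.  The following sketch is illustrative only;
   the RdmaWrite record and the function name data_in_to_rdma_write
   are hypothetical and do not correspond to any RCaP API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdmaWrite:
    data_sink_stag: int
    data_sink_tagged_offset: int
    length: int


def data_in_to_rdma_write(read_stag: int, base_offset: int,
                          buffer_offset: int,
                          data_segment_length: int) -> RdmaWrite:
    """Derive RDMA Write fields from a SCSI Data-In PDU.

    read_stag and base_offset come from the Remote Mapping (looked up
    by ITT); buffer_offset and data_segment_length come from the PDU.
    """
    return RdmaWrite(
        data_sink_stag=read_stag,                             # step 2a
        data_sink_tagged_offset=base_offset + buffer_offset,  # step 2b
        length=data_segment_length,                           # step 2c
    )
```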

   When the A-bit is set to one in the SCSI Data-In PDU, the iSER layer
   at the target MUST notify the iSCSI layer at the target when the data
   transfer is complete at the initiator.  To perform this additional
   function, the iSER layer at the target can take advantage of the
   operational ErrorRecoveryLevel if previously disclosed by the iSCSI
   layer via an earlier invocation of the Notice_Key_Values Operational
   Primitive.  There are two approaches that can be taken:

   1. If the iSER layer at the target knows that the operational
      ErrorRecoveryLevel is 2, or if the iSER layer at the target does
      not know the operational ErrorRecoveryLevel, then the iSER layer
      at the target MUST issue a zero-length RDMA Read Request Message
      following the RDMA Write Message.  When the iSER layer at the
      target receives a completion for the RDMA Read Request Message
      from the RCaP layer, implying that the RDMA-Capable Controller at
      the initiator has completed processing the RDMA Write Message due
      to the completion ordering semantics of RCaP, the iSER layer at
      the target MUST notify the iSCSI layer at the target by invoking
      the Data_ACK_Notify Operational Primitive qualified with ITT and
      DataSN (see Section 3.2.3).

   2. If the iSER layer at the target knows that the operational
      ErrorRecoveryLevel is 1, then the iSER layer at the target MUST do
      one of the following:

      a. It MUST notify the iSCSI layer at the target by invoking the
         Data_ACK_Notify Operational Primitive qualified with ITT and
         DataSN (see Section 3.2.3) when it receives the local
         completion from the RCaP layer for the RDMA Write Message.
         This is allowed since digest errors do not occur in iSER (see
         Section 10.1.4.2) and a CRC error will cause the connection to
         be terminated and the task to be terminated anyway.  The local
         RDMA Write completion from the RCaP layer guarantees that the
         RCaP layer will not access the I/O Buffer again to transfer the
         data associated with that RDMA Write operation.

      b. Alternatively, it MUST use the same procedure for handling the
         data transfer completion at the initiator as for
         ErrorRecoveryLevel 2.

   It should be noted that the iSCSI layer at the target cannot set the
   A-bit to 1 if the ErrorRecoveryLevel=0.
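
   The choice between the two approaches for handling the A-bit can be
   expressed as a small decision function.  This is a sketch under
   stated assumptions: the function name ack_trigger, the string
   labels, and the prefer_local_completion flag (which selects between
   options 2a and 2b) are hypothetical.

```python
from typing import Optional


def ack_trigger(error_recovery_level: Optional[int],
                prefer_local_completion: bool = True) -> str:
    """Name the completion that triggers Data_ACK_Notify.

    error_recovery_level is None when the iSER layer at the target
    does not know the operational ErrorRecoveryLevel.
    """
    if error_recovery_level == 0:
        raise ValueError("A-bit cannot be set at ErrorRecoveryLevel 0")
    if error_recovery_level == 1 and prefer_local_completion:
        # Approach 2a: local completion of the RDMA Write Message.
        return "rdma_write_completion"
    # Approach 1 (level 2 or unknown) or approach 2b (level 1).
    return "zero_length_rdma_read_completion"
```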

   SCSI status MUST always be returned in a separate SCSI Response PDU.
   The S bit in the SCSI Data-In PDU MUST always be set to zero.  There
   MUST NOT be a "phase collapse" in the SCSI Data-In PDU.

   Since the RDMA Write Message only transfers the data portion of the
   SCSI Data-In PDU but not the control information in the header, such
   as ExpCmdSN, if timely updates of such information are crucial, the
   iSCSI layer at the initiator MAY issue NOP-Out PDUs to request the
   iSCSI layer at the target to respond with the information using
   NOP-In PDUs.

7.3.6. Ready To Transfer (R2T)

   Type:  data-type PDU

   PDU-specific qualifiers:  DataDescriptorOut

   In order to send an R2T PDU, the iSCSI layer at the target MUST
   invoke the Get_Data Operational Primitive qualified with
   DataDescriptorOut, which defines the I/O Buffer for receiving the
   SCSI Write data from the initiator.  See Section 7.1 on the general
   requirements on the handling of iSCSI data-type PDUs.

   The iSER layer at the target MUST do the following for each
   invocation of the Get_Data Operational Primitive:

   1. It MUST ensure a valid local STag for the I/O Buffer and a valid
      Local Mapping.  This may involve allocating a valid local STag and
      establishing a Local Mapping.

   2. It MUST use the ITT in the R2T to locate the remote Write STag and
      the Base Offset in the Remote Mapping.  The Remote Mapping was
      established earlier by the iSER layer at the target when the iSER
      Message containing the Advertised Write STag, the Base Offset, and
      the SCSI Command PDU for a SCSI Write or bidirectional command was
      received from the initiator.

   3. If the iSER-ORD value at the target is set to zero, the iSER layer
      at the target MUST terminate the connection and free up the
      resources associated with the connection (as described in Section
      5.2.3) if it received the R2T PDU from the iSCSI layer at the
      target.  Upon termination of the connection, the iSER layer at the
      target MUST notify the iSCSI layer at the target by invoking the
      Connection_Terminate_Notify Operational Primitive.

   4. If the iSER-ORD value at the target is set to greater than 0, the
      iSER layer at the target MUST transform the R2T PDU into an RDMA
      Read Request Message.  While transforming the R2T PDU, the iSER
      layer at the target MUST ensure that the number of outstanding
      RDMA Read Request Messages does not exceed the iSER-ORD value.  To
      transform the R2T PDU, the iSER layer at the target:

      a. MUST derive the local STag and local Tagged Offset from the
         DataDescriptorOut that qualified the Get_Data invocation.

      b. MUST use the local STag as the Data Sink STag of the RDMA Read
         Request Message.

      c. MUST use the local Tagged Offset as the Data Sink Tagged Offset
         of the RDMA Read Request Message.

      d. MUST use the Desired Data Transfer Length from the R2T PDU as
         the RDMA Read Message Size of the RDMA Read Request Message.

      e. MUST use the remote Write STag as the Data Source STag of the
         RDMA Read Request Message.

      f. MUST add the Buffer Offset from the R2T PDU to the Base Offset
         from the Remote Mapping as the Data Source Tagged Offset of the
         RDMA Read Request Message.

   5. It MUST associate the R2TSN and ITT from the R2T PDU with the RDMA
      Read operation.  If the Get_Data Operational Primitive invocation
      was qualified with Notify_Enable set, then when the iSER layer at
      the target receives a completion from the RCaP layer for the RDMA
      Read operation, the iSER layer at the target MUST notify the iSCSI
      layer by invoking the Data_Completion_Notify Operational Primitive
      qualified with the R2TSN and ITT.  Conversely, if the Get_Data
      Operational Primitive invocation was qualified with Notify_Enable
      cleared, then the iSER layer at the target MUST NOT notify the
      iSCSI layer on completion and MUST NOT invoke the
      Data_Completion_Notify Operational Primitive.
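
   Steps 3 and 4 above can be sketched as a single transformation.
   The code below is illustrative only: the RdmaReadRequest record and
   the function name r2t_to_rdma_read are hypothetical, raising an
   exception stands in for connection termination, and returning None
   (so that the caller defers the request) is just one assumed way of
   keeping the number of outstanding requests within iSER-ORD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RdmaReadRequest:
    data_sink_stag: int
    data_sink_tagged_offset: int
    rdma_read_message_size: int
    data_source_stag: int
    data_source_tagged_offset: int


def r2t_to_rdma_read(local_stag: int, local_tagged_offset: int,
                     desired_len: int, r2t_buffer_offset: int,
                     write_stag: int, base_offset: int,
                     iser_ord: int,
                     outstanding_reads: int) -> Optional[RdmaReadRequest]:
    """Transform an R2T PDU into RDMA Read Request fields."""
    if iser_ord == 0:
        # Step 3: the connection must be terminated.
        raise ConnectionError("iSER-ORD is zero; R2T cannot be processed")
    if outstanding_reads >= iser_ord:
        return None  # assumption: caller defers until a slot frees up
    return RdmaReadRequest(
        data_sink_stag=local_stag,                              # 4b
        data_sink_tagged_offset=local_tagged_offset,            # 4c
        rdma_read_message_size=desired_len,                     # 4d
        data_source_stag=write_stag,                            # 4e
        data_source_tagged_offset=base_offset + r2t_buffer_offset,  # 4f
    )
```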

   When the RCaP layer at the initiator receives a valid RDMA Read
   Request Message, it will return an RDMA Read Response Message
   containing the solicited write data to the target.  When the RCaP
   layer at the target receives the RDMA Read Response Message from the
   initiator, it will place the solicited data in the I/O Buffer
   referenced by the Data Sink STag in the RDMA Read Response Message.

   Since the RDMA Read Request Message from the target does not transfer
   the control information in the R2T PDU such as ExpCmdSN, if timely
   updates of such information are crucial, the iSCSI layer at the
   initiator MAY issue NOP-Out PDUs to request the iSCSI layer at the
   target to respond with the information using NOP-In PDUs.

   Similarly, since the RDMA Read Response Message from the initiator
   only transfers the data but not the control information normally
   found in the SCSI Data-Out PDU, such as ExpStatSN, if timely updates
   of such information are crucial, the iSCSI layer at the target MAY
   issue NOP-In PDUs to request the iSCSI layer at the initiator to
   respond with the information using NOP-Out PDUs.

7.3.7. Asynchronous Message

   Type:  control-type PDU

   PDU-specific qualifiers:  DataDescriptorSense

   The iSCSI layer MUST invoke the Send_Control Operational Primitive
   qualified with DataDescriptorSense, which defines the buffer
   containing the sense and iSCSI event information.

   The iSER layer MUST use a Send Message to send the Asynchronous
   Message PDU.  The SendSE Message should be used if supported by the
   RCaP layer (e.g., iWARP).

7.3.8. Text Request and Text Response

   Type:  control-type PDU

   PDU-specific qualifiers:  DataDescriptorTextOut (for Text Request),
   DataDescriptorIn (for Text Response)

   The iSCSI layer MUST invoke the Send_Control Operational Primitive
   qualified with DataDescriptorTextOut (or DataDescriptorIn), which
   defines the Text Request (or Text Response) buffer.

   The iSER layer MUST use Send Messages to send the Text Request (or
   Text Response) PDUs.  The SendSE Message should be used if supported
   by the RCaP layer (e.g., iWARP).

7.3.9. Login Request and Login Response

   During the login negotiation, the iSCSI layer interacts with the
   transport layer directly, and the iSER layer is not involved.  See
   Section 5.1 on iSCSI/iSER Connection Setup.  If the underlying
   transport is TCP, the Login Request PDUs and the Login Response PDUs
   are exchanged when the connection between the initiator and the
   target is still in the byte stream mode.

   The iSCSI layer MUST NOT send a Login Request (or a Login Response)
   PDU during the Full Feature Phase.  A Login Request (or a Login
   Response) PDU, if used, MUST be treated as an iSCSI protocol error.
   The iSER layer MAY reject such a PDU from the iSCSI layer with an
   appropriate error code.  If a Login Request PDU is received by the
   iSCSI layer at the target, it MUST respond with a Reject PDU with a
   reason code of "protocol error".

7.3.10. Logout Request and Logout Response

   Type:  control-type PDU

   PDU-specific qualifiers:  None

   The iSER layer MUST use a Send Message to send the Logout Request or
   Logout Response PDU.  The SendSE Message should be used if supported
   by the RCaP layer (e.g., iWARP).

   Sections 5.2.1 and 5.2.2 describe the handling of the Logout Request
   and the Logout Response at the initiator and the target and the
   interactions between the initiator and the target to terminate a
   connection.

7.3.11. SNACK Request

Since HeaderDigest and DataDigest must be negotiated to "None", there are no digest errors when the connection is in iSER-assisted mode. Also, since RCaP delivers all messages in the order they were sent, there are no sequence errors when the connection is in iSER-assisted mode. Therefore, the iSCSI layer MUST NOT send SNACK Request PDUs. A SNACK Request PDU, if used, MUST be treated as an iSCSI protocol error. The iSER layer MAY reject such a PDU from the iSCSI layer with an appropriate error code. If a SNACK Request PDU is received by the iSCSI layer at the target, it MUST respond with a Reject PDU with a reason code of "protocol error".

7.3.12. Reject

   Type:  control-type PDU

   PDU-specific qualifiers:  DataDescriptorReject

   The iSCSI layer MUST invoke the Send_Control Operational Primitive
   qualified with DataDescriptorReject, which defines the Reject buffer.

   The iSER layer MUST use a Send Message to send the Reject PDU.  The
   SendSE Message should be used if supported by the RCaP layer (e.g.,
   iWARP).

7.3.13. NOP-Out and NOP-In

   Type:  control-type PDU

   PDU-specific qualifiers:  DataDescriptorNOPOut (for NOP-Out),
   DataDescriptorNOPIn (for NOP-In)

   The iSCSI layer MUST invoke the Send_Control Operational Primitive
   qualified with DataDescriptorNOPOut (or DataDescriptorNOPIn), which
   defines the Ping (or Return Ping) data buffer.

   The iSER layer MUST use Send Messages to send the NOP-Out (or NOP-In)
   PDU.  The SendSE Message should be used if supported by the RCaP
   layer (e.g., iWARP).

8. Flow Control and STag Management

8.1. Flow Control for RDMA Send Messages

   Send Messages in RCaP are used by the iSER layer to transfer iSCSI
   control-type PDUs.  Each Send Message in RCaP consumes an Untagged
   Buffer at the Data Sink.  However, neither the RCaP layer nor the
   iSER layer provides an explicit flow control mechanism for the Send
   Messages.  Therefore, the iSER layer SHOULD provision enough Untagged
   buffers for handling incoming Send Messages to prevent buffer
   exhaustion at the RCaP layer.  If buffer exhaustion occurs, it may
   result in the termination of the connection.

   An implementation may choose to satisfy the buffer requirement by
   using a common buffer pool shared across multiple connections, with
   usage limits on a per-connection basis and usage limits on the buffer
   pool itself.  In such an implementation, exceeding the buffer usage
   limit for a connection or the buffer pool itself may trigger
   interventions from the iSER layer to replenish the buffer pool and/or
   to isolate the connection causing the problem.

   iSER also provides the MaxOutstandingUnexpectedPDUs key to be used by
   the initiator and the target to declare the maximum number of
   outstanding "unexpected" control-type PDUs that it can receive.  It
   is intended to allow the receiving side to determine the amount of
   buffer resources needed beyond the normal flow control mechanism
   available in iSCSI.

   The buffer resources required at both the initiator and the target as
   a result of control-type PDUs sent by the initiator are described in
   Section 8.1.1.  The buffer resources required at both the initiator
   and target as a result of control-type PDUs sent by the target are
   described in Section 8.1.2.
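
   The shared buffer pool option mentioned above could be accounted
   for along the following lines.  This is a minimal sketch of one
   possible bookkeeping scheme, not a mechanism defined by this
   specification; the class name SharedBufferPool, its methods, and
   the policy of simply refusing a buffer when a limit is reached are
   all assumptions made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable


class SharedBufferPool:
    """Counts Untagged Buffers in use, per connection and in total."""

    def __init__(self, pool_limit: int, per_conn_limit: int) -> None:
        self.pool_limit = pool_limit
        self.per_conn_limit = per_conn_limit
        self.in_use: DefaultDict[Hashable, int] = defaultdict(int)

    def total_in_use(self) -> int:
        return sum(self.in_use.values())

    def acquire(self, conn_id: Hashable) -> bool:
        """Take one buffer for conn_id; False if a limit is reached."""
        if self.in_use[conn_id] >= self.per_conn_limit:
            return False  # per-connection usage limit reached
        if self.total_in_use() >= self.pool_limit:
            return False  # pool-wide usage limit reached
        self.in_use[conn_id] += 1
        return True

    def release(self, conn_id: Hashable) -> None:
        """Return one buffer previously taken for conn_id."""
        if self.in_use[conn_id] > 0:
            self.in_use[conn_id] -= 1
```

   A False return from acquire is the point at which an implementation
   might replenish the pool or isolate the offending connection.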

8.1.1. Flow Control for Control-Type PDUs from the Initiator

The control-type PDUs that can be sent by an initiator to a target can be grouped into the following categories:

1. Regulated: Control-type PDUs in this category are regulated by the iSCSI CmdSN window mechanism, and the immediate flag is not set.

2. Unregulated but Expected: Control-type PDUs in this category are not regulated by the iSCSI CmdSN window mechanism but are expected by the target.

3. Unregulated and Unexpected: Control-type PDUs in this category are not regulated by the iSCSI CmdSN window mechanism and are "unexpected" by the target.

8.1.1.1. Control-Type PDUs from the Initiator in the Regulated Category

Control-type PDUs that can be sent by the initiator in this category are regulated by the iSCSI CmdSN window mechanism, and the immediate flag is not set.

The queuing capacity required of the iSCSI layer at the target is described in Section 4.2.2.1 of [iSCSI]. For each of the control-type PDUs that can be sent by the initiator in this category, the initiator MUST provision for the buffer resources required for the corresponding control-type PDU sent as a response from the target. The following is a list of the PDUs that can be sent by the initiator and the PDUs that are sent by the target in response:

a. When an initiator sends a SCSI Command PDU, it expects a SCSI Response PDU from the target.

b. When the initiator sends a Task Management Function Request PDU, it expects a Task Management Function Response PDU from the target.

c. When the initiator sends a Text Request PDU, it expects a Text Response PDU from the target.

d. When the initiator sends a Logout Request PDU, it expects a Logout Response PDU from the target.

e. When the initiator sends a NOP-Out PDU as a ping request with ITT != 0xffffffff and TTT = 0xffffffff, it expects a NOP-In PDU from the target with the same ITT and TTT as in the ping request.

   The response from the target for any of the PDUs enumerated here may
   alternatively be in the form of a Reject PDU sent before the task is
   active, as described in Section 7.3 of [iSCSI].
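The request/response pairing listed above can be captured in a small lookup table, which an initiator might consult when provisioning receive buffers. The sketch below is illustrative only: the table keys, the function names, and the assumption that each expected response occupies exactly one receive buffer (based on the statement in Section 8.1 that each Send Message consumes an Untagged Buffer) are choices made for this example.

```python
# Regulated-category request PDU -> response PDU expected from the target.
EXPECTED_RESPONSE = {
    "SCSI Command": "SCSI Response",
    "Task Management Function Request": "Task Management Function Response",
    "Text Request": "Text Response",
    "Logout Request": "Logout Response",
    "NOP-Out (ping request)": "NOP-In",
}


def acceptable_responses(request_type):
    """Responses the initiator should be ready for: the paired one, or a Reject."""
    return {EXPECTED_RESPONSE[request_type], "Reject"}


def response_buffers_needed(outstanding_requests):
    """Assumed rule: one receive buffer per outstanding regulated request."""
    for request_type in outstanding_requests:
        if request_type not in EXPECTED_RESPONSE:
            raise KeyError(request_type)
    return len(outstanding_requests)
```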

8.1.1.2. Control-Type PDUs from the Initiator in the Unregulated but Expected Category

For the control-type PDUs in the Unregulated but Expected category, the amount of buffering resources required at the target can be predetermined. The following is a list of the PDUs in this category:

a. SCSI Data-Out PDUs are used by the initiator to send unsolicited data. The amount of buffer resources required by the target can be determined using FirstBurstLength. Note that SCSI Data-Out PDUs are not used for solicited data since the R2T PDU, which is used for solicitation, is transformed into RDMA Read operations by the iSER layer at the target. See Section 7.3.4.

b. A NOP-Out PDU with TTT != 0xffffffff is sent as a ping response by the initiator to the NOP-In PDU sent as a ping request by the target.

8.1.1.3. Control-Type PDUs from the Initiator in the Unregulated and Unexpected Category

PDUs in the Unregulated and Unexpected category are PDUs with the immediate flag set. The number of PDUs that are in this category and can be sent by an initiator is controlled by the value of MaxOutstandingUnexpectedPDUs declared by the target. (See Section 6.7.) After a PDU in this category is sent by the initiator, it is outstanding until it is retired. At any time, the number of outstanding unexpected PDUs MUST NOT exceed the value of MaxOutstandingUnexpectedPDUs declared by the target.

The target uses the value of MaxOutstandingUnexpectedPDUs that it declared to determine the amount of buffer resources required for control-type PDUs in this category that can be sent by an initiator.

For the initiator, for each of the control-type PDUs that can be sent in this category, the initiator MUST provision for the buffer resources if required for the corresponding control-type PDU that can be sent as a response from the target.

An outstanding PDU in this category is retired as follows. If the CmdSN of the PDU sent by the initiator in this category is x, the PDU is outstanding until the initiator sends a non-immediate control-type PDU on the same connection with CmdSN = y (where y is at least x) and the target responds with a control-type PDU on any connection where ExpCmdSN is at least y+1.

   When the number of outstanding unexpected control-type PDUs equals
   MaxOutstandingUnexpectedPDUs, the iSCSI layer at the initiator MUST
   NOT generate any unexpected PDUs, which otherwise it would have
   generated, even if the unexpected PDU is intended for immediate
   delivery.
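The retirement rule for unexpected PDUs sent by the initiator can be sketched as a small per-connection tracker. This is a hedged illustration rather than a reference implementation: the class and method names are hypothetical, the limit is assumed to be a positive number, and sequence numbers are compared with 32-bit serial number arithmetic, which is an assumption carried over from the base iSCSI specification and not restated in this section.

```python
MOD = 1 << 32  # assumed 32-bit sequence number space


def sn_ge(a, b):
    """True if a is at least b under 32-bit serial number arithmetic."""
    return ((a - b) % MOD) < (1 << 31)


class InitiatorUnexpectedTracker:
    """Hypothetical tracker for immediate PDUs sent on one connection."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # value declared by the target
        self.waiting = []   # CmdSN x of PDUs not yet followed by a non-immediate PDU
        self.covered = []   # CmdSN y of the non-immediate PDU covering each such PDU

    def outstanding(self):
        return len(self.waiting) + len(self.covered)

    def can_send_immediate(self):
        return self.outstanding() < self.max_outstanding

    def sent_immediate(self, cmd_sn):
        if not self.can_send_immediate():
            raise RuntimeError("MaxOutstandingUnexpectedPDUs would be exceeded")
        self.waiting.append(cmd_sn)

    def sent_non_immediate(self, cmd_sn):
        """A non-immediate PDU with CmdSN = y covers every waiting x with y >= x."""
        still_waiting = []
        for x in self.waiting:
            if sn_ge(cmd_sn, x):
                self.covered.append(cmd_sn)
            else:
                still_waiting.append(x)
        self.waiting = still_waiting

    def received_exp_cmd_sn(self, exp_cmd_sn):
        """Retire covered PDUs once ExpCmdSN is at least y+1 (any connection)."""
        self.covered = [y for y in self.covered
                        if not sn_ge(exp_cmd_sn, (y + 1) % MOD)]
```

Note that both conditions must hold before a PDU is retired: a covering non-immediate PDU must have been sent, and the target must have acknowledged it through ExpCmdSN.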

8.1.2. Flow Control for Control-Type PDUs from the Target

Control-type PDUs that can be sent by a target and are expected by the initiator are listed in the Regulated category. (See Section 8.1.1.1.)

For the control-type PDUs that can be sent by a target and are unexpected by the initiator, the number is controlled by MaxOutstandingUnexpectedPDUs declared by the initiator. (See Section 6.7.) After a PDU in this category is sent by a target, it is outstanding until it is retired. At any time, the number of outstanding unexpected PDUs MUST NOT exceed the value of MaxOutstandingUnexpectedPDUs declared by the initiator. The initiator uses the value of MaxOutstandingUnexpectedPDUs that it declared to determine the amount of buffer resources required for control-type PDUs in this category that can be sent by a target. The following is a list of the PDUs in this category and the conditions for retiring the outstanding PDU:

a. For an Asynchronous Message PDU with StatSN = x, the PDU is outstanding until the initiator sends a control-type PDU with ExpStatSN set to at least x+1.

b. For a Reject PDU with StatSN = x, which is sent after a task is active, the PDU is outstanding until the initiator sends a control-type PDU with ExpStatSN set to at least x+1.

c. For a NOP-In PDU with ITT = 0xffffffff and StatSN = x, the PDU is outstanding until the initiator responds with a control-type PDU on the same connection where ExpStatSN is at least x+1. But if the NOP-In PDU is sent as a ping request with TTT != 0xffffffff, the PDU can also be retired when the initiator sends a NOP-Out PDU with the same ITT and TTT as in the ping request. Note that when a target sends a NOP-In PDU as a ping request, it must provision a buffer for the NOP-Out PDU sent as a ping response from the initiator.

   When the number of outstanding unexpected control-type PDUs equals
   MaxOutstandingUnexpectedPDUs, the iSCSI layer at the target MUST NOT
   generate any unexpected PDUs, which otherwise it would have
   generated, even if its intent is to indicate an iSCSI error condition
   (e.g., Asynchronous Message, Reject).  Task timeouts, as in the
   initiator's waiting for a command completion or other connection and
   session-level exceptions, will ensure that correct operational
   behavior will result in these cases despite not generating the PDU.
   This rule overrides any other requirements elsewhere that require
   that a Reject PDU MUST be sent.

   (Implementation note:  SCSI task timeout and recovery can be a
   lengthy process and hence SHOULD be avoided by proper provisioning of
   resources.)

   (Implementation note:  To ensure that the initiator has a means to
   inform the target that outstanding PDUs have been retired, the target
   should reserve the last unexpected control-type PDU allowable by the
   value of MaxOutstandingUnexpectedPDUs declared by the initiator for
   sending a NOP-In ping request with TTT != 0xffffffff to allow the
   initiator to return the NOP-Out ping response with the current
   ExpStatSN.)
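The target-side accounting, including the suggestion to reserve the last allowable unexpected PDU for a NOP-In ping request, might look like the following. The sketch is illustrative only: all names are hypothetical, the limit declared by the initiator is assumed to be a positive number, and StatSN comparisons assume 32-bit serial number arithmetic as in the base iSCSI specification.

```python
MOD = 1 << 32  # assumed 32-bit sequence number space


def sn_ge(a, b):
    """True if a is at least b under 32-bit serial number arithmetic."""
    return ((a - b) % MOD) < (1 << 31)


class TargetUnexpectedTracker:
    """Hypothetical tracker for unexpected PDUs a target sends on one connection."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # value declared by the initiator
        self.outstanding = []                   # entries of (StatSN, ping tags or None)

    def can_send(self, is_ping_request=False):
        free = self.max_outstanding - len(self.outstanding)
        # Keep the last slot for a NOP-In ping request, as the note suggests.
        return free >= 1 if is_ping_request else free >= 2

    def sent(self, stat_sn, ping=None):
        """Record a sent PDU; ping is an (ITT, TTT) pair for a NOP-In ping request."""
        if not self.can_send(is_ping_request=ping is not None):
            raise RuntimeError("no unexpected-PDU slot available")
        self.outstanding.append((stat_sn, ping))

    def received_exp_stat_sn(self, exp_stat_sn):
        """Retire every PDU with StatSN = x once ExpStatSN is at least x+1."""
        self.outstanding = [(x, ping) for (x, ping) in self.outstanding
                            if not sn_ge(exp_stat_sn, (x + 1) % MOD)]

    def received_nop_out(self, itt, ttt, exp_stat_sn):
        """A NOP-Out ping response retires the matching ping and carries ExpStatSN."""
        self.outstanding = [(x, ping) for (x, ping) in self.outstanding
                            if ping != (itt, ttt)]
        self.received_exp_stat_sn(exp_stat_sn)
```

With a limit of 2, for example, one Asynchronous Message may be outstanding while the second slot stays available for the ping whose response reports the current ExpStatSN.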

8.2. Flow Control for RDMA Read Resources

If iSERHelloRequired is negotiated to "Yes", then the total number of RDMA Read operations that can be active simultaneously on an iSCSI/iSER connection depends on the amount of resources allocated as declared in the iSER Hello exchange described in Section 5.1.3. Exceeding the number of RDMA Read operations allowed on a connection will result in the connection being terminated by the RCaP layer. The iSER layer at the target maintains the iSER-ORD to keep track of the maximum number of RDMA Read Requests that can be issued by the iSER layer on a particular RCaP Stream.

During connection setup (see Section 5.1), iSER-IRD is known at the initiator and iSER-ORD is known at the target after the iSER layers at the initiator and the target have respectively allocated the connection resources necessary to support RCaP, as directed by the Allocate_Connection_Resources Operational Primitive from the iSCSI layer before the end of the iSCSI Login Phase. In the Full Feature Phase, if iSERHelloRequired is negotiated to "Yes", then the first message sent by the initiator is the iSER Hello Message (see Section 9.3), which contains the value of iSER-IRD. In response to the iSER Hello Message, the target sends the iSER HelloReply Message (see Section 9.4), which contains the value of iSER-ORD. The iSER layer at both the initiator and the target MAY adjust (lower) the resources associated with iSER-IRD and iSER-ORD, respectively, to match the iSER-ORD value declared in the HelloReply Message. The iSER layer at the target MUST control the flow of the RDMA Read Request Messages so that it does not exceed the iSER-ORD value at the target.

   If iSERHelloRequired is negotiated to "No", then the maximum number
   of RDMA Read operations that can be active is negotiated via other
   means outside the scope of this document.  For example, in
   InfiniBand, iSER connection setup uses InfiniBand Connection Manager
   (CM) Management Datagrams (MADs), with additional iSER information
   exchanged in the private data.
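The requirement that the target keep the number of active RDMA Read Requests within iSER-ORD can be illustrated with a simple counting scheduler. This is a minimal sketch under stated assumptions: the class and its methods are hypothetical, the iSER-ORD value is taken as already established (by the Hello exchange or by other means), and requests are represented by opaque objects rather than real RCaP operations.

```python
from collections import deque


class RdmaReadScheduler:
    """Hypothetical target-side limiter for RDMA Read Requests on one RCaP Stream."""

    def __init__(self, iser_ord):
        self.iser_ord = iser_ord   # maximum number of simultaneously active reads
        self.active = 0
        self.pending = deque()     # requests held back until a slot frees up

    def submit(self, request):
        """Issue the request now if a slot is free; otherwise queue it.

        Returns True when the request is issued immediately.
        """
        if self.active < self.iser_ord:
            self.active += 1
            return True
        self.pending.append(request)
        return False

    def complete(self):
        """Account for one finished read and issue the next queued one, if any.

        Returns the request that was issued from the queue, or None.
        """
        if self.active == 0:
            raise RuntimeError("no RDMA Read is active")
        self.active -= 1
        if self.pending:
            self.active += 1
            return self.pending.popleft()
        return None
```

Queuing excess requests instead of issuing them avoids the connection termination that results from exceeding the allowed number of RDMA Read operations.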

8.3. STag Management

An STag is an identifier of a Tagged Buffer used in an RDMA operation. If the STags are exposed on the wire by being Advertised in the iSER header or declared in the header of an RCaP Message, then the allocation and the subsequent invalidation of the STags are as specified in this document.

8.3.1. Allocation of STags

When the iSCSI layer at the initiator invokes the Send_Control Operational Primitive to request the iSER layer at the initiator to process a SCSI Command, zero, one, or two STags may be allocated by the iSER layer. See Section 7.3.1 for details. The number of STags allocated depends on whether the command is unidirectional or bidirectional and whether or not solicited write data transfer is involved.

When the iSCSI layer at the initiator invokes the Send_Control Operational Primitive to request the iSER layer at the initiator to process a Task Management Function Request with the TASK REASSIGN function, besides allocating zero, one, or two STags, the iSER layer MUST invalidate the existing STags (if any) associated with the ITT. See Section 7.3.3 for details.

The iSER layer at the target allocates a local Data Sink STag when the iSCSI layer at the target invokes the Get_Data Operational Primitive to request the iSER layer to process an R2T PDU. See Section 7.3.6 for details.
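The "zero, one, or two STags" rule for a SCSI Command at the initiator can be expressed as a short function. The sketch assumes one STag for the read direction and one for solicited write data, which is a reading of the dependencies stated above; the function and parameter names are hypothetical, and Section 7.3.1 remains the authoritative description.

```python
def stags_for_command(reads_data, writes_data, has_solicited_write_data):
    """Number of STags the initiator's iSER layer would allocate (0, 1, or 2)."""
    count = 0
    if reads_data:
        # Assumed: one STag for the buffer that receives read data.
        count += 1
    if writes_data and has_solicited_write_data:
        # Assumed: one STag for the buffer holding solicited write data.
        count += 1
    return count
```

A write command whose data is entirely unsolicited therefore needs no STag, while a bidirectional command with solicited write data needs two.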

8.3.2. Invalidation of STags

The invalidation of the STags at the initiator at the completion of a unidirectional or bidirectional command when the associated SCSI Response PDU is sent by the target is described in Section 7.3.2.

   When a unidirectional or bidirectional command concludes without the
   associated SCSI Response PDU being sent by the target, the iSCSI
   layer at the initiator MUST request the iSER layer at the initiator
   to invalidate the STags by invoking the Deallocate_Task_Resources
   Operational Primitive qualified with ITT.  In response, the iSER
   layer at the initiator MUST locate the STags (if any) in the Local
   Mapping.  The iSER layer at the initiator MUST invalidate the STags
   (if any) and the Local Mapping.

   For an RDMA Read operation used to realize a SCSI Write data
   transfer, the iSER layer at the target SHOULD invalidate the Data
   Sink STag at the conclusion of the RDMA Read operation referencing
   the Data Sink STag (to permit the immediate reuse of buffer
   resources).

   For an RDMA Write operation used to realize a SCSI Read data
   transfer, the Data Source STag at the target is not declared to the
   initiator and is not exposed on the wire.  Invalidation of the STag
   is thus not specified.

   When a unidirectional or bidirectional command concludes without the
   associated SCSI Response PDU being sent by the target, the iSCSI
   layer at the target MUST request the iSER layer at the target to
   invalidate the STags by invoking the Deallocate_Task_Resources
   Operational Primitive qualified with ITT.  In response, the iSER
   layer at the target MUST locate the local STags (if any) in the Local
   Mapping.  The iSER layer at the target MUST invalidate the local
   STags (if any) and the Local Mapping.
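The Deallocate_Task_Resources handling described above (locate the STags for the ITT in the Local Mapping, then invalidate both the STags and the mapping) can be sketched as follows. The class, its methods, and the invalidation callback are hypothetical stand-ins for this example; how an STag is actually invalidated depends on the RCaP layer in use.

```python
class LocalMapping:
    """Hypothetical Local Mapping from an ITT to the STags allocated for the task."""

    def __init__(self, invalidate_stag):
        self._by_itt = {}
        # Hypothetical callback that asks the RCaP layer to invalidate one STag.
        self._invalidate_stag = invalidate_stag

    def associate(self, itt, stags):
        """Record the STags allocated for the task identified by itt."""
        self._by_itt[itt] = list(stags)

    def deallocate_task_resources(self, itt):
        """Invalidate the STags (if any) for itt and drop the mapping entry.

        Returns the list of STags that were invalidated.
        """
        stags = self._by_itt.pop(itt, [])
        for stag in stags:
            self._invalidate_stag(stag)
        return stags
```

The "if any" wording in the text corresponds to the empty-list default: a task with no STags, or an unknown ITT, results in no invalidation calls.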

