6. iSCSI Error Handling and Recovery
6.1. Overview
6.1.1. Background
The following two considerations prompted the design of much of the
error recovery functionality in iSCSI:
i) An iSCSI PDU may fail the digest check and be dropped, despite
being received by the TCP layer. The iSCSI layer must optionally
be allowed to recover such dropped PDUs.
ii) A TCP connection may fail at any time during the data
transfer. All the active tasks must optionally be allowed to
continue on a different TCP connection within the same session.
Implementations have considerable flexibility in deciding what degree
of error recovery to support, when to use it and by which mechanisms
to achieve the required behavior. Only the externally visible
actions of the error recovery mechanisms must be standardized to
ensure interoperability.
This chapter describes a general model for recovery in support of
interoperability. See Appendix E. - Algorithmic Presentation of
Error Recovery Classes - for further detail on how the described
model may be implemented. Compliant implementations do not have to
match the implementation details of this model as presented, but the
external behavior of such implementations must correspond to the
externally observable characteristics of the presented model.
6.1.2. Goals
The major design goals of the iSCSI error recovery scheme are as
follows:
a) Allow iSCSI implementations to meet different requirements by
defining a collection of error recovery mechanisms that
implementations may choose from.
b) Ensure interoperability between any two implementations
supporting different sets of error recovery capabilities.
c) Define the error recovery mechanisms to ensure command ordering
even in the face of errors, for initiators that demand ordering.
d) Do not make additions in the fast path, but allow moderate
complexity in the error recovery path.
e) Prevent both the initiator and target from attempting to
recover the same set of PDUs at the same time. For example,
there must be a clear "error recovery functionality
distribution" between the initiator and target.
6.1.3. Protocol Features and State Expectations
The initiator mechanisms defined in connection with error recovery
are:
a) NOP-OUT to probe sequence numbers of the target (section
10.18)
b) Command retry (section 6.2.1)
c) Recovery R2T support (section 6.7)
d) Requesting retransmission of status/data/R2T using the SNACK
facility (section 10.16)
e) Acknowledging the receipt of the data (section 10.16)
f) Reassigning the connection allegiance of a task to a different
TCP connection (section 6.2.2)
g) Terminating the entire iSCSI session to start afresh (section
6.1.4.4)
The target mechanisms defined in connection with error recovery are:
a) NOP-IN to probe sequence numbers of the initiator (section
10.19)
b) Requesting retransmission of data using the recovery R2T
feature (section 6.7)
c) SNACK support (section 10.16)
d) Requesting that parts of read data be acknowledged (section
10.7.2)
e) Allegiance reassignment support (section 6.2.2)
f) Terminating the entire iSCSI session to force the initiator to
start over (section 6.1.4.4)
For any outstanding SCSI command, it is assumed that iSCSI, in
conjunction with SCSI at the initiator, is able to keep enough
information to be able to rebuild the command PDU, and that outgoing
data is available (in host memory) for retransmission while the
command is outstanding. It is also assumed that at the target,
incoming data (read data) MAY be kept for recovery or it can be
reread from a device server.
It is further assumed that a target will keep the "status & sense"
for a command it has executed if it supports status retransmission.
A target that agrees to support data retransmission is expected to be
prepared to retransmit the outgoing data (i.e., Data-In) on request
until either the status for the completed command is acknowledged, or
the data in question has been separately acknowledged.
6.1.4. Recovery Classes
iSCSI enables the following classes of recovery (in the order of
increasing scope of affected iSCSI tasks):
- Within a command (i.e., without requiring command restart).
- Within a connection (i.e., without requiring the connection to be
rebuilt, but perhaps requiring command restart).
- Connection recovery (i.e., perhaps requiring connections to be
rebuilt and commands to be reissued).
- Session recovery.
The recovery scenarios detailed in the rest of this section are
representative rather than exclusive. In every case, they detail the
lowest class recovery that MAY be attempted. The implementer is left
to decide under which circumstances to escalate to the next recovery
class and/or what recovery classes to implement. Both the iSCSI
target and initiator MAY escalate the error handling to an error
recovery class, which impacts a larger number of iSCSI tasks in any
of the cases identified in the following discussion.
In all classes, the implementer has the choice of deferring errors to
the SCSI initiator (with an appropriate response code), in which case
the task, if any, has to be removed from the target and all the side
effects, such as ACA, must be considered.
Use of within-connection and within-command recovery classes MUST NOT
be attempted before the connection is in Full Feature Phase.
In the detailed description of the recovery classes, the mandating
terms (MUST, SHOULD, MAY, etc.) indicate normative actions to be
executed if the recovery class is supported and used.
6.1.4.1. Recovery Within-command
At the target, the following cases lend themselves to within-command
recovery:
- Lost data PDU - realized through one of the following:
a) Data digest error - dealt with as specified in Section 6.7
Digest Errors, using the option of a recovery R2T.
b) Sequence reception timeout (no data or
partial-data-and-no-F-bit) - considered an implicit sequence
error and dealt with as specified in Section 6.8 Sequence
Errors, using the option of a recovery R2T.
c) Header digest error, which manifests as a sequence reception
timeout or a sequence error - dealt with as specified in
Section 6.8 Sequence Errors, using the option of a recovery
R2T.
At the initiator, the following cases lend themselves to
within-command recovery:
- Lost data PDU or lost R2T - realized through one of the
following:
a) Data digest error - dealt with as specified in Section 6.7
Digest Errors, using the option of a SNACK.
b) Sequence reception timeout (no status) or response reception
timeout - dealt with as specified in Section 6.8 Sequence
Errors, using the option of a SNACK.
c) Header digest error, which manifests as a sequence reception
timeout or a sequence error - dealt with as specified in
Section 6.8 Sequence Errors, using the option of a SNACK.
To avoid a race with the target, which may already have a recovery
R2T or a termination response on its way, an initiator SHOULD NOT
originate a SNACK for an R2T based on its internal timeouts (if any).
Recovery in this case is better left to the target.
The timeout values used by the initiator and target are outside the
scope of this document. Sequence reception timeout is generally a
large enough value to allow the data sequence transfer to be
complete.
6.1.4.2. Recovery Within-connection
At the initiator, the following cases lend themselves to
within-connection recovery:
- Requests not acknowledged for a long time. Requests are
acknowledged explicitly through ExpCmdSN or implicitly by
receiving data and/or status. The initiator MAY retry
non-acknowledged commands as specified in Section 6.2 Retry and
Reassign in Recovery.
- Lost iSCSI numbered Response. It is recognized by either
identifying a data digest error on a Response PDU or a Data-In
PDU carrying the status, or by receiving a Response PDU with a
higher StatSN than expected. In the first case, digest error
handling is done as specified in Section 6.7 Digest Errors using
the option of a SNACK. In the second case, sequence error
handling is done as specified in Section 6.8 Sequence Errors,
using the option of a SNACK.
At the target, the following cases lend themselves to
within-connection recovery:
- Status/Response not acknowledged for a long time. The target MAY
issue a NOP-IN (with a valid Target Transfer Tag or otherwise)
that carries the next status sequence number it is going to use
in the StatSN field. This helps the initiator detect any missing
StatSN(s) and issue a SNACK for the status.
The timeout values used by the initiator and the target are outside
the scope of this document.
6.1.4.3. Connection Recovery
At an iSCSI initiator, the following cases lend themselves to
connection recovery:
- TCP connection failure: The initiator MUST close the connection.
It then MUST either implicitly or explicitly logout the failed
connection with the reason code "remove the connection for
recovery" and reassign connection allegiance for all commands
still in progress associated with the failed connection on one or
more connections (some or all of which MAY be newly established
connections) using the "Task reassign" task management function
(see Section 10.5.1 Function). For an initiator, a command is in
progress as long as it has not received a response or a Data-In
PDU including status.
Note: The logout function is mandatory. However, a new connection
establishment is only mandatory if the failed connection was the
last or only connection in the session.
- Receiving an Asynchronous Message that indicates one or all
connections in a session has been dropped. The initiator MUST
handle it as a TCP connection failure for the connection(s)
referred to in the Message.
At an iSCSI target, the following cases lend themselves to connection
recovery:
- TCP connection failure. The target MUST close the connection and,
if more than one connection is available, the target SHOULD send
an Asynchronous Message that indicates it has dropped the
connection. Then, the target will wait for the initiator to
continue recovery.
6.1.4.4. Session Recovery
Session recovery should be performed when all other recovery attempts
have failed. Very simple initiators and targets MAY perform session
recovery on all iSCSI errors and rely on recovery on the SCSI layer
and above.
Session recovery implies the closing of all TCP connections,
internally aborting all executing and queued tasks for the given
initiator at the target, terminating all outstanding SCSI commands
with an appropriate SCSI service response at the initiator, and
restarting a session on a new set of connection(s) (TCP connection
establishment and login on all new connections).
For possible clearing effects of session recovery on SCSI and iSCSI
objects, refer to Appendix F. - Clearing Effects of Various Events on
Targets -.
6.1.5. Error Recovery Hierarchy
The error recovery classes described so far are organized into a
hierarchy for ease in understanding and to limit the implementation
complexity. With few and well defined recovery levels
interoperability is easier to achieve. The attributes of this
hierarchy are as follows:
a) Each level is a superset of the capabilities of the previous
level. For example, Level 1 support implies supporting all
capabilities of Level 0 and more.
b) As a corollary, supporting a higher error recovery level means
increased sophistication and possibly an increase in resource
requirements.
c) Supporting error recovery level "n" is advertised and
negotiated by each iSCSI entity by exchanging the text key
"ErrorRecoveryLevel=n". The lower of the two exchanged values
is the operational ErrorRecoveryLevel for the session.
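
Attribute c) can be illustrated with a minimal sketch. The function
name operational_erl and the range check are illustrative assumptions
and not part of the protocol definition; only the "lower of the two
values" rule comes from the text above.

```python
def operational_erl(proposed: int, offered: int) -> int:
    """Return the operational ErrorRecoveryLevel for a session.

    Hypothetical helper: each argument is the value one side sent in
    its "ErrorRecoveryLevel=n" text key.
    """
    for level in (proposed, offered):
        if level not in (0, 1, 2):
            raise ValueError(f"undefined ErrorRecoveryLevel: {level}")
    # The lower of the two exchanged values becomes operational.
    return min(proposed, offered)
```

For example, an initiator proposing level 2 to a target that answers
with level 1 would operate at level 1 under this sketch.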
The following diagram represents the error recovery hierarchy.
       +
      / \
     / 2 \       <-- Connection recovery
    +-----+
   /   1   \     <-- Digest failure recovery
  +---------+
 /     0     \   <-- Session failure recovery
+-------------+
The following table lists the error recovery capabilities expected
from the implementations that support each error recovery level.
+-------------------+--------------------------------------------+
|ErrorRecoveryLevel | Associated Error recovery capabilities |
+-------------------+--------------------------------------------+
| 0 | Session recovery class |
| | (Section 6.1.4.4 Session Recovery) |
+-------------------+--------------------------------------------+
| 1 | Digest failure recovery (See Note below.) |
| | plus the capabilities of ER Level 0 |
+-------------------+--------------------------------------------+
| 2 | Connection recovery class |
| | (Section 6.1.4.3 Connection Recovery) |
| | plus the capabilities of ER Level 1 |
+-------------------+--------------------------------------------+
Note: Digest failure recovery is comprised of two recovery classes:
Within-Connection recovery class (Section 6.1.4.2 Recovery Within-
connection) and Within-Command recovery class (Section 6.1.4.1
Recovery Within-command).
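
The superset relationship between levels can be sketched as follows.
This is an illustration only: the class labels and the helper
recovery_classes are hypothetical names chosen for this example, and
the mapping simply restates the table and the note above.

```python
# Recovery classes introduced at each level, restating the table above.
_ADDED_AT_LEVEL = {
    0: {"session"},
    1: {"within-command", "within-connection"},  # digest failure recovery
    2: {"connection"},
}


def recovery_classes(level: int) -> set:
    """Return every recovery class implied by an ErrorRecoveryLevel."""
    if level not in _ADDED_AT_LEVEL:
        raise ValueError(f"undefined ErrorRecoveryLevel: {level}")
    classes = set()
    # Each level includes everything from the levels below it.
    for n in range(level + 1):
        classes |= _ADDED_AT_LEVEL[n]
    return classes
```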
When a defined value of ErrorRecoveryLevel is proposed by an
originator in a text negotiation, the originator MUST support the
functionality defined for the proposed value and additionally, the
functionality corresponding to any defined value numerically less
than the proposed. When a defined value of ErrorRecoveryLevel is
returned by a responder in a text negotiation, the responder MUST
support the functionality corresponding to the ErrorRecoveryLevel it
is accepting.
When either party attempts to use error recovery functionality beyond
what is negotiated, the recovery attempts MAY fail unless an a priori
agreement outside the scope of this document exists between the two
parties to provide such support.
Implementations MUST support error recovery level "0", while the rest
are OPTIONAL to implement. In implementation terms, the above
striation means that the following incremental sophistication with
each level is required.
+-------------------+---------------------------------------------+
|Level transition   |  Incremental requirement                    |
+-------------------+---------------------------------------------+
|       0->1        | PDU retransmissions on the same connection  |
+-------------------+---------------------------------------------+
|       1->2        | Retransmission across connections and       |
|                   | allegiance reassignment                     |
+-------------------+---------------------------------------------+
6.2. Retry and Reassign in Recovery
This section summarizes two important and somewhat related iSCSI
protocol features used in error recovery.
6.2.1. Usage of Retry
By resending the same iSCSI command PDU ("retry") in the absence of a
command acknowledgement (by way of an ExpCmdSN update) or a response,
an initiator attempts to "plug" (what it thinks are) the
discontinuities in CmdSN ordering on the target end. Discarded
command PDUs, due to digest errors, may have created these
discontinuities.
Retry MUST NOT be used for reasons other than plugging command
sequence gaps, and in particular, cannot be used for requesting PDU
retransmissions from a target. Any such PDU retransmission requests
for a currently allegiant command in progress may be made using the
SNACK mechanism described in section 10.16, although the usage of
SNACK is OPTIONAL.
If initiators, as part of plugging command sequence gaps as described
above, inadvertently issue retries for allegiant commands already in
progress (i.e., targets did not see the discontinuities in CmdSN
ordering), the duplicate commands are silently ignored by targets as
specified in section 3.2.2.1.
When an iSCSI command is retried, the command PDU MUST carry the
original Initiator Task Tag and the original operational attributes
(e.g., flags, function names, LUN, CDB etc.) as well as the original
CmdSN. The command being retried MUST be sent on the same connection
as the original command unless the original connection was already
successfully logged out.
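
The retry invariants above can be sketched in a few lines. The
CommandPDU type and the build_retry helper are hypothetical and
heavily simplified; in particular, connection_id is not a PDU field
but stands in here for "the connection the PDU is sent on".

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CommandPDU:
    # Hypothetical, simplified view of a command PDU.
    initiator_task_tag: int
    cmd_sn: int
    lun: int
    cdb: bytes
    flags: int
    connection_id: int  # assumption: the connection used to send it


def build_retry(original: CommandPDU,
                original_conn_logged_out: bool,
                new_connection_id: Optional[int] = None) -> CommandPDU:
    """Build a retry that keeps the original tag, attributes and CmdSN."""
    if not original_conn_logged_out:
        if new_connection_id not in (None, original.connection_id):
            raise ValueError("retry must use the original connection")
        return original  # identical PDU on the same connection
    if new_connection_id is None:
        raise ValueError("a connection is needed after logout")
    # Only the carrying connection changes; everything else is kept.
    return replace(original, connection_id=new_connection_id)
```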
6.2.2. Allegiance Reassignment
By issuing a "task reassign" task management request (Section 10.5.1
Function), the initiator signals its intent to continue an already
active command (but with no current connection allegiance) as part of
connection recovery. This means that a new connection allegiance is
requested for the command, which seeks to associate it to the
connection on which the task management request is being issued.
Before the allegiance reassignment is attempted for a task, an
implicit or explicit Logout with the reason code "remove the
connection for recovery" (see section 10.14) MUST be successfully
completed for the previous connection to which the task was
allegiant.
In reassigning connection allegiance for a command, the targets
SHOULD continue the command from its current state. For example,
when reassigning read commands, the target SHOULD take advantage of
the ExpDataSN field provided by the Task Management function request
(which must be set to zero if there was no data transfer) and bring
the read command to completion by sending the remaining data and
sending (or resending) the status. ExpDataSN acknowledges all data
sent up to, but not including, the Data-In PDU and/or R2T with DataSN
(or R2TSN) equal to ExpDataSN. However, targets may choose to
send/receive all unacknowledged data or all of the data on a
reassignment of connection allegiance if unable to recover or
maintain an accurate state. Initiators MUST NOT subsequently request
data retransmission through Data SNACK for PDUs numbered less than
ExpDataSN (i.e., prior to the acknowledged sequence number). For all
types of commands, a reassignment request implies that the task is
still considered in progress by the initiator and the target must
conclude the task appropriately if the target returns the "Function
Complete" response to the reassignment request. This might possibly
involve retransmission of data/R2T/status PDUs as necessary, but MUST
involve the (re)transmission of the status PDU.
It is OPTIONAL for targets to support the allegiance reassignment. This capability is negotiated via the ErrorRecoveryLevel text key during the login time. When a target does not support allegiance reassignment, it MUST respond with a Task Management response code of "Allegiance reassignment not supported". If allegiance reassignment is supported by the target, but the task is still allegiant to a different connection, or a successful recovery Logout of the previously allegiant connection was not performed, the target MUST respond with a Task Management response code of "Task still allegiant".
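
A target's choice of response to a reassignment request might be
sketched as below. This is a simplification under stated assumptions:
the function name and its boolean inputs are hypothetical, support is
assumed to be equivalent to an operational ErrorRecoveryLevel of 2,
and other Task Management response codes are not modeled.

```python
def reassign_response(erl: int,
                      allegiant_elsewhere: bool,
                      recovery_logout_done: bool) -> str:
    """Pick the Task Management response to a "task reassign" request."""
    # Assumption: allegiance reassignment is supported only at level 2.
    if erl < 2:
        return "Allegiance reassignment not supported"
    # The task is still bound to another connection, or the previous
    # connection was never successfully logged out for recovery.
    if allegiant_elsewhere or not recovery_logout_done:
        return "Task still allegiant"
    return "Function Complete"
```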
If allegiance reassignment is supported by the target, the Task
Management response to the reassignment request MUST be issued before
the reassignment becomes effective.
If a SCSI Command that involves data input is reassigned, any SNACK
Tag it holds for a final response from the original connection is
deleted and the default value of 0 MUST be used instead.
6.3. Usage Of Reject PDU in Recovery
Targets MUST NOT implicitly terminate an active task by sending a
Reject PDU for any PDU exchanged during the life of the task. If the
target decides to terminate the task, a Response PDU (SCSI, Text,
Task, etc.) must be returned by the target to conclude the task. If
the task had never been active before the Reject (i.e., the Reject is
on the command PDU), targets should not send any further responses
because the command itself is being discarded.
The above rule means that the initiator can eventually expect a
response on receiving Rejects, if the received Reject is for a PDU
other than the command PDU itself. The non-command Rejects only have
diagnostic value in logging the errors, and they can be used for
retransmission decisions by the initiators.
The CmdSN of the rejected command PDU (if it is a non-immediate
command) MUST NOT be considered received by the target (i.e., a
command sequence gap must be assumed for the CmdSN), even though the
CmdSN of the rejected command PDU may be reliably ascertained. Upon
receiving the Reject, the initiator MUST plug the CmdSN gap in order
to continue to use the session. The gap may be plugged either by
transmitting a command PDU with the same CmdSN, or by aborting the
task (see section 6.9 on how an abort may plug a CmdSN gap).
When a data PDU is rejected and its DataSN can be ascertained, a
target MUST advance ExpDataSN for the current data burst if a
recovery R2T is being generated. The target MAY advance its
ExpDataSN if it does not attempt to recover the lost data PDU.
6.4. Connection Timeout Management
iSCSI defines two session-global timeout values (in seconds) -
Time2Wait and Time2Retain - that are applicable when an iSCSI Full
Feature Phase connection is taken out of service either intentionally
or by an exception. Time2Wait is the initial "respite time" before
attempting an explicit/implicit Logout for the CID in question or
task reassignment for the affected tasks (if any). Time2Retain is
the maximum time after the initial respite interval that the task
and/or connection state(s) is/are guaranteed to be maintained on the
target to cater to a possible recovery attempt. Recovery attempts
for the connection and/or task(s) SHOULD NOT be made before Time2Wait
seconds, but MUST be completed within Time2Retain seconds after that
initial Time2Wait waiting period.
6.4.1. Timeouts on Transport Exception Events
A transport connection shutdown or a transport reset without any
preceding iSCSI protocol interactions informing the end-points of the
fact causes a Full Feature Phase iSCSI connection to be abruptly
terminated. The timeout values to be used in this case are the
negotiated values of DefaultTime2Wait (Section 12.15
DefaultTime2Wait) and DefaultTime2Retain (Section 12.16
DefaultTime2Retain) text keys for the session.
6.4.2. Timeouts on Planned Decommissioning
Any planned decommissioning of a Full Feature Phase iSCSI connection
is preceded by either a Logout Response PDU, or an Async Message PDU.
The Time2Wait and Time2Retain field values (section 10.15) in a
Logout Response PDU, and the Parameter2 and Parameter3 fields of an
Async Message (AsyncEvent types "drop the connection" or "drop all
the connections"; section 10.9.1) specify the timeout values to be
used in each of these cases.
These timeout values are only applicable for the affected connection,
and the tasks active on that connection. These timeout values have
no bearing on initiator timers (if any) that are already running on
connections or tasks associated with that session.
6.5. Implicit Termination of Tasks
A target implicitly terminates the active tasks due to iSCSI protocol
dynamics in the following cases:
a) When a connection is implicitly or explicitly logged out with
the reason code of "Close the connection" and there are active
tasks allegiant to that connection.
b) When a connection fails and the connection state eventually
times out (state transition M1 in Section 7.2.2 State Transition
Descriptions for Initiators and Targets) and there are active
tasks allegiant to that connection.
c) When a successful Logout with the reason code of "remove the
connection for recovery" is performed while there are active
tasks allegiant to that connection, and those tasks eventually
time out after the Time2Wait and Time2Retain periods without
allegiance reassignment.
d) When a connection is implicitly or explicitly logged out with
the reason code of "Close the session" and there are active
tasks in that session.
If the tasks terminated in the above cases a), b), c), and d) are SCSI
tasks, they must be internally terminated as if with CHECK CONDITION
status. This status is only meaningful for appropriately handling
the internal SCSI state and SCSI side effects with respect to
ordering because this status is never communicated back as a
terminating status to the initiator. However additional actions may
have to be taken at SCSI level depending on the SCSI context as
defined by the SCSI standards (e.g., queued commands and ACA, in
cases a), b), and c), after the tasks are terminated, the target MUST
report a Unit Attention condition on the next command processed on
any connection for each affected I_T_L nexus with the status of CHECK
CONDITION, and the ASC/ASCQ value of 47h/7Fh - "SOME COMMANDS CLEARED
BY ISCSI PROTOCOL EVENT" , etc. - see [SAM2] and [SPC3]).
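
Case c) above depends on the recovery window defined in Section 6.4.
The arithmetic can be sketched as follows; the function names and the
use of a single time base in seconds are assumptions made for this
illustration.

```python
def recovery_window(out_of_service_at: float,
                    time2wait: float,
                    time2retain: float) -> tuple:
    """Return (earliest, deadline) for recovery attempts, in seconds.

    Recovery SHOULD NOT start before `earliest` and MUST complete by
    `deadline`, measured on the same clock as `out_of_service_at`.
    """
    earliest = out_of_service_at + time2wait
    deadline = earliest + time2retain
    return earliest, deadline


def tasks_timed_out(now: float, out_of_service_at: float,
                    time2wait: float, time2retain: float) -> bool:
    """True once task state is no longer guaranteed to be retained."""
    _, deadline = recovery_window(out_of_service_at, time2wait, time2retain)
    return now > deadline
```

With a connection taken out of service at t=100, Time2Wait=2 and
Time2Retain=20, this sketch gives a window from t=102 to t=122.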
6.6. Format Errors
The following two explicit violations of PDU layout rules are format
errors:
a) Illegal contents of any PDU header field except the Opcode
(legal values are specified in Section 10 iSCSI PDU Formats).
b) Inconsistent field contents (consistent field contents are
specified in Section 10 iSCSI PDU Formats).
Format errors indicate a major implementation flaw in one of the
parties.
When a target or an initiator receives an iSCSI PDU with a format
error, it MUST immediately terminate all transport connections in the
session either with a connection close or with a connection reset and
escalate the format error to session recovery (see Section 6.1.4.4
Session Recovery).
6.7. Digest Errors
The discussion of the legal choices in handling digest errors below
excludes session recovery as an explicit option, but either party
detecting a digest error may choose to escalate the error to session
recovery.
When a target or an initiator receives any iSCSI PDU, with a header
digest error, it MUST either discard the header and all data up to
the beginning of a later PDU or close the connection. Because the
digest error indicates that the length field of the header may have
been corrupted, the location of the beginning of a later PDU needs to
be reliably ascertained by other means such as the operation of a
sync and steering layer.
When a target receives any iSCSI PDU with a payload digest error, it
MUST answer with a Reject PDU with a reason code of
Data-Digest-Error and discard the PDU.
- If the discarded PDU is a solicited or unsolicited iSCSI data
PDU (for immediate data in a command PDU, non-data PDU rule
below applies), the target MUST do one of the following:
a) Request retransmission with a recovery R2T.
b) Terminate the task with a response PDU with a CHECK
CONDITION Status and an iSCSI Condition of "protocol service
CRC error" (Section 10.4.7.2 Sense Data). If the target
chooses to implement this option, it MUST wait to receive
all the data (signaled by a Data PDU with the final bit set
for all outstanding R2Ts) before sending the response PDU.
A task management command (such as an abort task) from the
initiator during this wait may also conclude the task.
- No further action is necessary for targets if the discarded PDU
is a non-data PDU. In case of immediate data being present on
a discarded command, the immediate data is implicitly recovered
when the task is retried (see section 6.2.1), followed by the
entire data transfer for the task.
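
The target-side rules above can be summarized in a short sketch. The
function name, its parameters, and the action strings are
illustrative assumptions; the ordering of actions follows the text.

```python
def target_on_payload_digest_error(is_data_pdu: bool,
                                   use_recovery_r2t: bool) -> list:
    """List the target's actions for a PDU with a payload digest error.

    `is_data_pdu` is False for non-data PDUs, which per the text above
    includes a command PDU carrying immediate data.
    """
    actions = ["send Reject (Data-Digest-Error)", "discard PDU"]
    if not is_data_pdu:
        return actions  # no further action is necessary
    if use_recovery_r2t:
        actions.append("send recovery R2T")
    else:
        # The response must wait until all outstanding data has arrived.
        actions.append("wait for final Data PDU of all outstanding R2Ts")
        actions.append('respond CHECK CONDITION, "protocol service CRC error"')
    return actions
```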
When an initiator receives any iSCSI PDU with a payload digest error,
it MUST discard the PDU.
- If the discarded PDU is an iSCSI data PDU, the initiator MUST do
one of the following:
a) Request the desired data PDU through SNACK. In response to the
SNACK, the target MUST either resend the data PDU or reject the
SNACK with a Reject PDU with a reason code of "SNACK reject" in
which case:
i) If the status has not already been sent for the command,
the target MUST terminate the command with a CHECK
CONDITION Status and an iSCSI Condition of "SNACK rejected"
(Section 10.4.7.2 Sense Data).
ii) If the status was already sent, no further action is
necessary for the target. The initiator in this case MUST
wait for the status to be received and then discard it, so
as to internally signal the completion with CHECK CONDITION
Status and an iSCSI Condition of "protocol service CRC
error" (Section 10.4.7.2 Sense Data).
b) Abort the task and terminate the command with an error.
- If the discarded PDU is a response PDU, the initiator MUST do one
of the following:
a) Request PDU retransmission with a status SNACK.
b) Logout the connection for recovery and continue the tasks on a
different connection instance as described in Section 6.2 Retry
and Reassign in Recovery.
c) Logout to close the connection (abort all the commands
associated with the connection).
- No further action is necessary for initiators if the discarded PDU
is an unsolicited PDU (e.g., Async, Reject). Task timeouts as in
the initiator waiting for a command completion, or process
timeouts, as in the target waiting for a Logout, will ensure that
the correct operational behavior will result in these cases
despite the discarded PDU.
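
The initiator's choices after discarding a PDU can be tabulated as in
the sketch below. The names are hypothetical, and the minimum
ErrorRecoveryLevel attached to each option is an assumption inferred
from the hierarchy in Section 6.1.5, not a rule stated in this
section.

```python
from enum import Enum


class PduKind(Enum):
    DATA = "data"
    RESPONSE = "response"
    UNSOLICITED = "unsolicited"  # e.g., Async, Reject


# (option, assumed minimum ErrorRecoveryLevel needed to use it)
_OPTIONS = {
    PduKind.DATA: [
        ("request the data PDU through SNACK", 1),
        ("abort the task and terminate the command with an error", 0),
    ],
    PduKind.RESPONSE: [
        ("request retransmission with a status SNACK", 1),
        ("logout for recovery and continue on another connection", 2),
        ("logout to close the connection", 0),
    ],
    PduKind.UNSOLICITED: [],  # no further action is necessary
}


def initiator_options(kind: PduKind, erl: int) -> list:
    """Return the options usable at the operational level `erl`."""
    return [text for text, min_erl in _OPTIONS[kind] if erl >= min_erl]
```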
6.8. Sequence Errors
When an initiator receives an iSCSI R2T/data PDU with an out of order
R2TSN/DataSN or a SCSI response PDU with an ExpDataSN that implies
missing data PDU(s), it means that the initiator must have detected a
header or payload digest error on one or more earlier R2T/data PDUs.
The initiator MUST address these implied digest errors as described
in Section 6.7 Digest Errors. When a target receives a data PDU with
an out of order DataSN, it means that the target must have hit a
header or payload digest error on at least one of the earlier data
PDUs. The target MUST address these implied digest errors as
described in Section 6.7 Digest Errors.
When an initiator receives an iSCSI status PDU with an out of order
StatSN that implies missing responses, it MUST address the one or
more missing status PDUs as described in Section 6.7 Digest Errors.
As a side effect of receiving the missing responses, the initiator
may discover missing data PDUs. If the initiator wants to recover
the missing data for a command, it MUST NOT acknowledge the received
responses that start from the StatSN of the relevant command, until
it has completed receiving all the data PDUs of the command.
When an initiator receives duplicate R2TSNs (due to proactive
retransmission of R2Ts by the target) or duplicate DataSNs (due to
proactive SNACKs by the initiator), it MUST discard the duplicates.
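
A receiver's handling of sequence numbers can be sketched as below.
This is a simplified illustration: classify_sn is a hypothetical
helper, and it compares plain integers, ignoring the 32-bit
wraparound that a real implementation would have to handle.

```python
def classify_sn(expected: int, received: int) -> tuple:
    """Classify a received DataSN/R2TSN against the expected value.

    Returns (kind, missing), where `missing` lists the sequence
    numbers implied to be lost when a gap is detected.
    """
    if received == expected:
        return ("in-order", [])
    if received < expected:
        # Already seen, e.g. after a proactive retransmission.
        return ("duplicate", [])
    # Numbers in [expected, received) were never delivered.
    return ("gap", list(range(expected, received)))
```

A "duplicate" result corresponds to a PDU that is discarded, while a
"gap" result identifies the PDUs to treat as implied digest errors.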
6.9. SCSI Timeouts
An iSCSI initiator MAY attempt to plug a command sequence gap on the
target end (in the absence of an acknowledgement of the command by
way of ExpCmdSN) before the ULP timeout by retrying the
unacknowledged command, as described in Section 6.2 Retry and
Reassign in Recovery.
On a ULP timeout for a command (that carried a CmdSN of n), if the
iSCSI initiator intends to continue the session, it MUST abort the
command by either using an appropriate Task Management function
request for the specific command, or a "close the connection" Logout.
When using an ABORT TASK, if the ExpCmdSN is still less than (n+1),
the target may see the abort request while missing the original
command itself due to one of the following reasons:
- Original command was dropped due to digest error.
- Connection on which the original command was sent was
successfully logged out. Upon logout, the unacknowledged commands
issued on the connection being logged out are discarded.
If the abort request is received and the original command is missing,
targets MUST consider the original command with that RefCmdSN to be
received and issue a Task Management response with the response code:
"Function Complete". This response concludes the task on both ends.
If the abort request is received and the target can determine (based
on the Referenced Task Tag) that the command was received and
executed and also that the response was sent prior to the abort, then
the target MUST respond with the response code of "Task Does Not
Exist".
6.10. Negotiation Failures
Text request and response sequences, when used to set/negotiate
operational parameters, constitute the negotiation/parameter setting.
A negotiation failure is considered to be one or more of the
following:
- None of the choices, or the stated value, is acceptable to one of
the sides in the negotiation.
- The text request timed out and possibly terminated.
- The text request was answered with a Reject PDU.
The following two rules should be used to address negotiation
failures:
- During Login, any failure in negotiation MUST be considered a
login process failure and the Login Phase must be terminated,
and with it, the connection. If the target detects the
failure, it must terminate the login with the appropriate Login
Response code.
- A failure in negotiation, while in the Full Feature Phase, will
terminate the entire negotiation sequence that may consist of a
series of text requests that use the same Initiator Task Tag.
The operational parameters of the session or the connection
MUST continue to be the values agreed upon during an earlier
successful negotiation (i.e., any partial results of this
unsuccessful negotiation MUST NOT take effect and MUST be
discarded).
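The second rule gives a Full Feature Phase negotiation all-or-nothing
semantics. The following is a minimal sketch of that behavior, not a
prescribed implementation: the Negotiation class and its stage, fail,
and commit methods are hypothetical names introduced for illustration,
and the parameter values are arbitrary examples.

```python
class Negotiation:
    """Stages parameter changes; applies them only if the sequence succeeds."""

    def __init__(self, operational):
        # Values agreed upon during an earlier successful negotiation.
        self.operational = operational
        # Partial results of the negotiation sequence in progress.
        self._staged = {}

    def stage(self, key, value):
        """Record a tentatively agreed value without applying it."""
        self._staged[key] = value

    def fail(self):
        """Negotiation failure: discard all partial results."""
        self._staged.clear()

    def commit(self):
        """Negotiation success: the staged values take effect together."""
        self.operational.update(self._staged)
        self._staged.clear()


# Example values are assumptions chosen for illustration only.
params = {"MaxRecvDataSegmentLength": 8192}
neg = Negotiation(params)
neg.stage("MaxRecvDataSegmentLength", 65536)
neg.fail()  # e.g., the text request was answered with a Reject PDU
assert params["MaxRecvDataSegmentLength"] == 8192  # earlier value still applies
```

Keeping the staged values separate from the operational ones means a
failure at any point in the sequence needs no rollback logic.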
6.11. Protocol Errors
Mapping framed messages over a "stream" connection, such as TCP,
makes the proposed mechanisms vulnerable to simple software framing
errors. On the other hand, the introduction of framing mechanisms to
limit the effects of these errors may be onerous on performance for
simple implementations. Command Sequence Numbers and the above
mechanisms for connection drop and reestablishment help handle this
type of mapping error.
All violations of iSCSI PDU exchange sequences specified in this
document are also protocol errors. This category of errors can only
be addressed by fixing the implementations; iSCSI defines Reject and
response codes to enable this.
6.12. Connection Failures
iSCSI can keep a session in operation if it is able to
keep/establish at least one TCP connection between the initiator and
the target in a timely fashion. Targets and/or initiators may
recognize a failing connection by either transport level means (TCP),
a gap in the command sequence number, a response stream that is not
filled for a long time, or by a failing iSCSI NOP (acting as a ping).
The latter MAY be used periodically to increase the speed and
likelihood of detecting connection failures. Initiators and targets
MAY also use the keep-alive option on the TCP connection to enable
early link failure detection on otherwise idle links.
On connection failure, the initiator and target MUST do one of the
following:
- Attempt connection recovery within the session (Section 6.1.4.3
Connection Recovery).
- Logout the connection with the reason code "close the
connection" (Section 10.14.5 Implicit termination of tasks),
re-issue missing commands, and implicitly terminate all active
commands. This option requires support for the
within-connection recovery class (Section 6.1.4.2 Recovery
Within-connection).
- Perform session recovery (Section 6.1.4.4 Session Recovery).
Either side may choose to escalate to session recovery (via the
initiator dropping all the connections, or via an Async Message that
announces the similar intent from a target), and the other side MUST
give it precedence. On a connection failure, a target MUST terminate
and/or discard all of the active immediate commands regardless of
which of the above options is used (i.e., immediate commands are not
recoverable across connection failures).
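The handling of immediate commands on a connection failure can be
sketched as a simple partition of the active tasks. This is an
illustrative sketch only: the Task record and the
split_on_connection_failure function are hypothetical names, and a
real target tracks considerably more state per task.

```python
from dataclasses import dataclass


@dataclass
class Task:
    itt: int          # Initiator Task Tag
    immediate: bool   # True if the command was sent as an immediate command


def split_on_connection_failure(active_tasks):
    """Return (recovery_candidates, discarded) for a failed connection.

    Immediate commands are always discarded, whichever recovery option
    is chosen; the remaining tasks are left to the chosen option.
    """
    discarded = [t for t in active_tasks if t.immediate]
    candidates = [t for t in active_tasks if not t.immediate]
    return candidates, discarded


tasks = [Task(itt=1, immediate=False), Task(itt=2, immediate=True)]
candidates, discarded = split_on_connection_failure(tasks)
assert [t.itt for t in candidates] == [1]
assert [t.itt for t in discarded] == [2]
```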
6.13. Session Errors
If all of the connections of a session fail and cannot be
reestablished in a short time, or if initiators detect protocol
errors repeatedly, an initiator may choose to terminate a session and
establish a new session.
In this case, the initiator takes the following actions:
- Resets or closes all the transport connections.
- Terminates all outstanding requests with an appropriate
response before initiating a new session. If the same I_T
nexus is intended to be reestablished, the initiator MUST
employ session reinstatement (see section 5.3.5).
When the session timeout (the connection state timeout for the last
failed connection) happens on the target, it takes the following
actions:
- Resets or closes the TCP connections (closes the session).
- Terminates all active tasks that were allegiant to the
connection(s) that constituted the session.
A target MUST also be prepared to handle a session reinstatement
request from the initiator, that may be addressing session errors.
7. State Transitions
iSCSI connections and iSCSI sessions go through several well-defined
states from the time they are created to the time they are cleared.
The connection state transitions are described in two separate but
dependent state diagrams for ease in understanding. The first
diagram, "standard connection state diagram", describes the
connection state transitions when the iSCSI connection is not waiting
for, or undergoing, a cleanup by way of an explicit or implicit
Logout. The second diagram, "connection cleanup state diagram",
describes the connection state transitions while performing the iSCSI
connection cleanup.
The "session state diagram" describes the state transitions an iSCSI
session would go through during its lifetime, and it depends on the
states of possibly multiple iSCSI connections that participate in the
session.
States and state transitions are described in the text, tables and
diagrams. The diagrams are used for illustration. The text and the
tables are the governing specification.
7.1. Standard Connection State Diagrams
7.1.1. State Descriptions for Initiators and Targets
State descriptions for the standard connection state diagram are as
follows:
-S1: FREE
-initiator: State on instantiation, or after successful
connection closure.
-target: State on instantiation, or after successful connection
closure.
-S2: XPT_WAIT
-initiator: Waiting for a response to its transport connection
establishment request.
-target: Illegal
-S3: XPT_UP
-initiator: Illegal
-target: Waiting for the Login process to commence.
-S4: IN_LOGIN
-initiator: Waiting for the Login process to conclude, possibly
involving several PDU exchanges.
-target: Waiting for the Login process to conclude, possibly
involving several PDU exchanges.
-S5: LOGGED_IN
-initiator: In Full Feature Phase, waiting for all internal,
iSCSI, and transport events.
-target: In Full Feature Phase, waiting for all internal, iSCSI,
and transport events.
-S6: IN_LOGOUT
-initiator: Waiting for a Logout response.
-target: Waiting for an internal event signaling completion of
logout processing.
-S7: LOGOUT_REQUESTED
-initiator: Waiting for an internal event signaling readiness to
proceed with Logout.
-target: Waiting for the Logout process to start after having
requested a Logout via an Async Message.
-S8: CLEANUP_WAIT
-initiator: Waiting for the context and/or resources to initiate
the cleanup processing for this CSM.
-target: Waiting for the cleanup process to start for this CSM.
7.1.2. State Transition Descriptions for Initiators and Targets
-T1:
-initiator: Transport connect request was made (e.g., TCP SYN
sent).
-target: Illegal
-T2:
-initiator: Transport connection request timed out, a transport
reset was received, or an internal event of receiving a
Logout response (success) on another connection for a
"close the session" Logout request was received.
-target: Illegal
-T3:
-initiator: Illegal
-target: Received a valid transport connection request that
establishes the transport connection.
-T4:
-initiator: Transport connection established, thus prompting the
initiator to start the iSCSI Login.
-target: Initial iSCSI Login Request was received.
-T5:
-initiator: The final iSCSI Login Response with a Status-Class
of zero was received.
-target: The final iSCSI Login Request to conclude the Login
Phase was received, thus prompting the target to send the
final iSCSI Login Response with a Status-Class of zero.
-T6:
-initiator: Illegal
-target: Timed out waiting for an iSCSI Login, transport
disconnect indication was received, transport reset was
received, or an internal event indicating a transport
timeout was received. In all these cases, the connection is
to be closed.
-T7:
-initiator - one of the following events caused the transition:
- The final iSCSI Login Response was received with a
non-zero Status-Class.
- Login timed out.
- A transport disconnect indication was received.
- A transport reset was received.
- An internal event was received indicating a transport
timeout.
- An internal event of receiving a Logout response (success)
on another connection for a "close the session" Logout
request was received.
In all these cases, the transport connection is closed.
-target - one of the following events caused the transition:
- The final iSCSI Login Request to conclude the Login Phase
was received, prompting the target to send the final iSCSI
Login Response with a non-zero Status-Class.
- Login timed out.
- Transport disconnect indication was received.
- Transport reset was received.
- An internal event indicating a transport timeout was
received.
- On another connection a "close the session" Logout request
was received.
In all these cases, the connection is to be closed.
-T8:
-initiator: An internal event of receiving a Logout response
(success) on another connection for a "close the session"
Logout request was received, thus closing this connection
requiring no further cleanup.
-target: An internal event of sending a Logout response
(success) on another connection for a "close the session"
Logout request was received, or an internal event of a
successful connection/session reinstatement is received,
thus prompting the target to close this connection cleanly.
-T9, T10:
-initiator: An internal event that indicates the readiness to
start the Logout process was received, thus prompting an
iSCSI Logout to be sent by the initiator.
-target: An iSCSI Logout request was received.
-T11, T12:
-initiator: Async PDU with AsyncEvent "Request Logout" was
received.
-target: An internal event that requires the decommissioning of
the connection is received, thus causing an Async PDU with
an AsyncEvent "Request Logout" to be sent.
-T13:
-initiator: An iSCSI Logout response (success) was received, or
an internal event of receiving a Logout response (success)
on another connection for a "close the session" Logout
request was received.
-target: An internal event was received that indicates
successful processing of the Logout, which prompts an iSCSI
Logout response (success) to be sent; an internal event of
sending a Logout response (success) on another connection
for a "close the session" Logout request was received; or an
internal event of a successful connection/session
reinstatement is received. In all these cases, the
transport connection is closed.
-T14:
-initiator: Async PDU with AsyncEvent "Request Logout" was
received again.
-target: Illegal
-T15, T16:
-initiator: One or more of the following events caused this
transition:
-Internal event that indicates a transport connection
timeout was received thus prompting transport RESET or
transport connection closure.
-A transport RESET.
-A transport disconnect indication.
-Async PDU with AsyncEvent "Drop connection" (for this CID).
-Async PDU with AsyncEvent "Drop all connections".
-target: One or more of the following events caused this
transition:
-Internal event that indicates a transport connection
timeout was received, thus prompting transport RESET or
transport connection closure.
-An internal event of a failed connection/session
reinstatement is received.
-A transport RESET.
-A transport disconnect indication.
-Internal emergency cleanup event was received which prompts
an Async PDU with AsyncEvent "Drop connection" (for this
CID), or event "Drop all connections".
-T17:
-initiator: One or more of the following events caused this
transition:
-Logout response (failure, i.e., a non-zero status) was
received, or Logout timed out.
-Any of the events specified for T15 and T16.
-target: One or more of the following events caused this
transition:
-Internal event that indicates a failure of the Logout
processing was received, which prompts a Logout response
(failure, i.e., a non-zero status) to be sent.
-Any of the events specified for T15 and T16.
-T18:
-initiator: An internal event of receiving a Logout response
(success) on another connection for a "close the session"
Logout request was received.
-target: An internal event of sending a Logout response
(success) on another connection for a "close the session"
Logout request was received, or an internal event of a
successful connection/session reinstatement is received. In
both these cases, the connection is closed.
The CLEANUP_WAIT state (S8) implies that there are possible iSCSI
tasks that have not reached conclusion and are still considered busy.
7.1.3. Standard Connection State Diagram for an Initiator
Symbolic names for States:
S1: FREE
S2: XPT_WAIT
S4: IN_LOGIN
S5: LOGGED_IN
S6: IN_LOGOUT
S7: LOGOUT_REQUESTED
S8: CLEANUP_WAIT
States S5, S6, and S7 constitute the Full Feature Phase operation of
the connection.
The state diagram is as follows:
               -------<-------------+
   +--------->/ S1    \<----+       |
T13|       +->\       /<-+   \      |
   |      /    ---+---    \   \     |
   |     /        |     T2 \   |    |
   |  T8 |        |T1       |  |    |
   |     |        |        /   |T7  |
   |     |        |       /    |    |
   |     |        |      /     |    |
   |     |        V     /     /     |
   |     |     ------- /     /      |
   |     |    / S2    \     /       |
   |     |    \       /    /        |
   |     |     ---+---    /         |
   |     |        |T4    /          |
   |     |        V     /           | T18
   |     |     ------- /            |
   |     |    / S4    \             |
   |     |    \       /             |
   |     |     ---+---              |         T15
   |     |        |T5      +--------+---------+
   |     |        |       /T16+-----+------+  |
   |     |        |      /   -+-----+--+   |  |
   |     |        |     /   /  S7   \  |T12|  |
   |     |        |    / +->\       /<-+   V  V
   |     |        |   / /    -+-----       -------
   |     |        |  / /T11   |T10        /  S8   \
   |     |        V / /       V  +----+   \       /
   |     |      ---+-+-      ----+--  |    -------
   |     |     / S5    \T9  / S6    \<+    ^
   |     +-----\       /--->\       / T14  |
   |            -------      --+----+------+T17
   +---------------------------+
The following state transition table represents the above diagram.
Each row represents the starting state for a given transition, which
after taking a transition marked in a table cell would end in the
state represented by the column of the cell. For example, from state
S1, the connection takes the T1 transition to arrive at state S2.
The fields marked "-" correspond to undefined transitions.
     +----+---+---+---+---+----+---+
     |S1  |S2 |S4 |S5 |S6 |S7  |S8 |
  ---+----+---+---+---+---+----+---+
   S1| -  |T1 | - | - | - | -  | - |
  ---+----+---+---+---+---+----+---+
   S2|T2  | - |T4 | - | - | -  | - |
  ---+----+---+---+---+---+----+---+
   S4|T7  | - | - |T5 | - | -  | - |
  ---+----+---+---+---+---+----+---+
   S5|T8  | - | - | - |T9 |T11 |T15|
  ---+----+---+---+---+---+----+---+
   S6|T13 | - | - | - |T14| -  |T17|
  ---+----+---+---+---+---+----+---+
   S7|T18 | - | - | - |T10|T12 |T16|
  ---+----+---+---+---+---+----+---+
   S8| -  | - | - | - | - | -  | - |
  ---+----+---+---+---+---+----+---+
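The initiator transition table above maps directly onto a lookup
structure. The sketch below encodes it as a Python dictionary keyed by
(state, transition); the names INITIATOR_TRANSITIONS and step are
hypothetical, and the events that trigger each transition are not
modeled here.

```python
# (current state, transition) -> next state, per the initiator table.
INITIATOR_TRANSITIONS = {
    ("S1", "T1"): "S2",
    ("S2", "T2"): "S1",
    ("S2", "T4"): "S4",
    ("S4", "T7"): "S1",
    ("S4", "T5"): "S5",
    ("S5", "T8"): "S1",
    ("S5", "T9"): "S6",
    ("S5", "T11"): "S7",
    ("S5", "T15"): "S8",
    ("S6", "T13"): "S1",
    ("S6", "T14"): "S6",
    ("S6", "T17"): "S8",
    ("S7", "T18"): "S1",
    ("S7", "T10"): "S6",
    ("S7", "T12"): "S7",
    ("S7", "T16"): "S8",
}


def step(state, transition):
    """Apply one transition; undefined ("-") table cells raise ValueError."""
    try:
        return INITIATOR_TRANSITIONS[(state, transition)]
    except KeyError:
        raise ValueError(f"{transition} is undefined in state {state}") from None


# Connect, log in, log out: FREE -> ... -> LOGGED_IN -> IN_LOGOUT -> FREE.
state = "S1"
for t in ("T1", "T4", "T5", "T9", "T13"):
    state = step(state, t)
assert state == "S1"
```

S8 (CLEANUP_WAIT) has no outgoing entries because its exits are
governed by the connection cleanup state diagram rather than by this
table.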
7.1.4. Standard Connection State Diagram for a Target
Symbolic names for States:
S1: FREE
S3: XPT_UP
S4: IN_LOGIN
S5: LOGGED_IN
S6: IN_LOGOUT
S7: LOGOUT_REQUESTED
S8: CLEANUP_WAIT
States S5, S6, and S7 constitute the Full Feature Phase operation of
the connection.
The state diagram is as follows:
               -------<-------------+
   +--------->/ S1    \<----+       |
T13|       +->\       /<-+   \      |
   |      /    ---+---    \   \     |
   |     /        |     T6 \   |    |
   |  T8 |        |T3       |  |    |
   |     |        |        /   |T7  |
   |     |        |       /    |    |
   |     |        |      /     |    |
   |     |        V     /     /     |
   |     |     ------- /     /      |
   |     |    / S3    \     /       |
   |     |    \       /    /        | T18
   |     |     ---+---    /         |
   |     |        |T4    /          |
   |     |        V     /           |
   |     |     ------- /            |
   |     |    / S4    \             |
   |     |    \       /             |
   |     |     ---+---         T15  |
   |     |        |T5      +--------+---------+
   |     |        |       /T16+-----+------+  |
   |     |        |      /   -+-----+--+   |  |
   |     |        |     /   /  S7   \  |T12|  |
   |     |        |    / +->\       /<-+   V  V
   |     |        |   / /    -+-----       -------
   |     |        |  / /T11   |T10        /  S8   \
   |     |        V / /       V           \       /
   |     |      ---+-+-      -------       -------
   |     |     / S5    \T9  / S6    \        ^
   |     +-----\       /--->\       /        |
   |            -------      --+----+--------+T17
   +---------------------------+
The following state transition table represents the above diagram,
and follows the conventions described for the initiator diagram.
     +----+---+---+---+---+----+---+
     |S1  |S3 |S4 |S5 |S6 |S7  |S8 |
  ---+----+---+---+---+---+----+---+
   S1| -  |T3 | - | - | - | -  | - |
  ---+----+---+---+---+---+----+---+
   S3|T6  | - |T4 | - | - | -  | - |
  ---+----+---+---+---+---+----+---+
   S4|T7  | - | - |T5 | - | -  | - |
  ---+----+---+---+---+---+----+---+
   S5|T8  | - | - | - |T9 |T11 |T15|
  ---+----+---+---+---+---+----+---+
   S6|T13 | - | - | - | - | -  |T17|
  ---+----+---+---+---+---+----+---+
   S7|T18 | - | - | - |T10|T12 |T16|
  ---+----+---+---+---+---+----+---+
   S8| -  | - | - | - | - | -  | - |
  ---+----+---+---+---+---+----+---+
7.2. Connection Cleanup State Diagram for Initiators and Targets
Symbolic names for states:
R1: CLEANUP_WAIT (same as S8)
R2: IN_CLEANUP
R3: FREE (same as S1)
Whenever a connection state machine (e.g., CSM-C) enters the
CLEANUP_WAIT state (S8), it must go through the state transitions
described in the connection cleanup state diagram either a) using a
separate full-feature phase connection (let's call it CSM-E) in the
LOGGED_IN state in the same session, or b) using a new transport
connection (let's call it CSM-I) in the FREE state that is to be
added to the same session. In the CSM-E case, an explicit logout for
the CID that corresponds to CSM-C (either as a connection or session
logout) needs to be performed to complete the cleanup. In the CSM-I
case, an implicit logout for the CID that corresponds to CSM-C needs
to be performed by way of connection reinstatement (section 5.3.4)
for that CID. In either case, the protocol exchanges on CSM-E or
CSM-I determine the state transitions for CSM-C. Therefore, this
cleanup state diagram is only applicable to the instance of the
connection in cleanup (i.e., CSM-C). In the case of an implicit
logout for example, CSM-C reaches FREE (R3) at the time CSM-I reaches
LOGGED_IN. In the case of an explicit logout, CSM-C reaches FREE
(R3) when CSM-E receives a successful logout response while
continuing to be in the LOGGED_IN state.
An initiator must initiate an explicit or implicit connection logout
for a connection in the CLEANUP_WAIT state, if the initiator intends
to continue using the associated iSCSI session.
The following state diagram applies to both initiators and targets.
          -------
         / R1    \
      +--\       /<-+
     /    ---+---
    /        |        \ M3
M1 |         |M2       |
   |         |        /
   |         |       /
   |         |      /
   |         V     /
   |      ------- /
   |     / R2    \
   |     \       /
   |      -------
   |         |
   |         |M4
   |         |
   |         |
   |         |
   |         V
   |      -------
   |     / R3    \
   +---->\       /
          -------
The following state transition table represents the above diagram,
and follows the same conventions as in earlier sections.
       +----+----+----+
       |R1  |R2  |R3  |
  -----+----+----+----+
   R1  | -  |M2  |M1  |
  -----+----+----+----+
   R2  |M3  | -  |M4  |
  -----+----+----+----+
   R3  | -  | -  | -  |
  -----+----+----+----+
7.2.1. State Descriptions for Initiators and Targets
-R1: CLEANUP_WAIT (Same as S8)
-initiator: Waiting for the internal event to initiate the
cleanup processing for CSM-C.
-target: Waiting for the cleanup process to start for CSM-C.
-R2: IN_CLEANUP
-initiator: Waiting for the connection cleanup process to
conclude for CSM-C.
-target: Waiting for the connection cleanup process to conclude
for CSM-C.
-R3: FREE (Same as S1)
-initiator: End state for CSM-C.
-target: End state for CSM-C.
7.2.2. State Transition Descriptions for Initiators and Targets
-M1: One or more of the following events was received:
-initiator:
-An internal event that indicates connection state timeout.
-An internal event of receiving a successful Logout response
on a different connection for a "close the session" Logout.
-target:
-An internal event that indicates connection state timeout.
-An internal event of sending a Logout response (success) on
a different connection for a "close the session" Logout
request.
-M2: An implicit/explicit logout process was initiated by the
initiator.
-In CSM-I usage:
-initiator: An internal event requesting the connection (or
session) reinstatement was received, thus prompting a
connection (or session) reinstatement Login to be sent
transitioning CSM-I to state IN_LOGIN.
-target: A connection/session reinstatement Login was
received while in state XPT_UP.
-In CSM-E usage:
-initiator: An internal event that indicates that an explicit
logout was sent for this CID in state LOGGED_IN.
-target: An explicit logout was received for this CID in
state LOGGED_IN.
-M3: Logout failure detected
-In CSM-I usage:
-initiator: CSM-I failed to reach LOGGED_IN and arrived into
FREE instead.
-target: CSM-I failed to reach LOGGED_IN and arrived into
FREE instead.
-In CSM-E usage:
-initiator: CSM-E either moved out of LOGGED_IN, or Logout
timed out and/or aborted, or Logout response (failure)
was received.
-target: CSM-E either moved out of LOGGED_IN, Logout timed
out and/or aborted, or an internal event that indicates a
failed Logout processing was received. A Logout response
(failure) was sent in the last case.
-M4: Successful implicit/explicit logout was performed.
-In CSM-I usage:
-initiator: CSM-I reached state LOGGED_IN, or an internal
event of receiving a Logout response (success) on another
connection for a "close the session" Logout request was
received.
-target: CSM-I reached state LOGGED_IN, or an internal event
of sending a Logout response (success) on a different
connection for a "close the session" Logout request was
received.
-In CSM-E usage:
-initiator: CSM-E stayed in LOGGED_IN and received a Logout
response (success), or an internal event of receiving a
Logout response (success) on another connection for a
"close the session" Logout request was received.
-target: CSM-E stayed in LOGGED_IN and an internal event
indicating a successful Logout processing was received,
or an internal event of sending a Logout response
(success) on a different connection for a "close the
session" Logout request was received.
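The cleanup transitions M1 through M4 for CSM-C can be sketched as a
small table-driven state machine. This is a minimal illustration under
stated assumptions: the names CLEANUP_TRANSITIONS and run_cleanup are
hypothetical, and each event is reduced to its transition label rather
than the CSM-E or CSM-I protocol exchange that causes it.

```python
# (current state, transition) -> next state for the connection in cleanup.
CLEANUP_TRANSITIONS = {
    ("R1", "M1"): "R3",  # state timeout or "close the session" Logout
    ("R1", "M2"): "R2",  # implicit/explicit logout initiated
    ("R2", "M3"): "R1",  # logout failure detected
    ("R2", "M4"): "R3",  # logout succeeded
}


def run_cleanup(events, state="R1"):
    """Apply a sequence of transitions starting from CLEANUP_WAIT (R1)."""
    for event in events:
        state = CLEANUP_TRANSITIONS[(state, event)]
    return state


# A failed logout attempt returns to CLEANUP_WAIT; a retry then succeeds.
assert run_cleanup(["M2", "M3", "M2", "M4"]) == "R3"
# A connection state timeout frees the connection without any logout.
assert run_cleanup(["M1"]) == "R3"
```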
7.3. Session State Diagrams
7.3.1. Session State Diagram for an Initiator
Symbolic Names for States:
Q1: FREE
Q3: LOGGED_IN
Q4: FAILED
State Q3 represents the Full Feature Phase operation of the session.
The state diagram is as follows:
                -------
               / Q1    \
       +------>\       /<-+
      /         ---+---   |
     /             |      |N3
 N6 |              |N1    |
    |              |      |
    |     N4       |      |
    |  +--------+  |     /
    |  |        |  |    /
    |  |        |  |   /
    |  |        V  V  /
   -+--+--      -----+-
  / Q4    \ N5 / Q3    \
  \       /<---\       /
   -------      -------
The state transition table is as follows:
       +----+----+----+
       |Q1  |Q3  |Q4  |
  -----+----+----+----+
   Q1  | -  |N1  | -  |
  -----+----+----+----+
   Q3  |N3  | -  |N5  |
  -----+----+----+----+
   Q4  |N6  |N4  | -  |
  -----+----+----+----+
7.3.2. Session State Diagram for a Target
Symbolic Names for States:
Q1: FREE
Q2: ACTIVE
Q3: LOGGED_IN
Q4: FAILED
Q5: IN_CONTINUE
State Q3 represents the Full Feature Phase operation of the session.
The state diagram is as follows:
                          -------
     +------------------>/ Q1    \
    /    +-------------->\       /<-+
    |    |                ---+---   |
    |    |                ^  |      |N3
 N6 |    |N11           N9|  V N1   |
    |    |                +------   |
    |    |               / Q2    \  |
    |    |               \       /  |
    |  --+----            +--+---   |
    | / Q5    \              |      |
    | \       / N10          |      |
    |  +-+---+------------+  |N2   /
    |  ^ |                |  |    /
    |N7| |N8              |  |   /
    |  | |                |  V  /
   -+--+-V                V----+-
  / Q4    \     N5       / Q3    \
  \       /<-------------\       /
   -------                -------
The state transition table is as follows:
       +----+----+----+----+----+
       |Q1  |Q2  |Q3  |Q4  |Q5  |
  -----+----+----+----+----+----+
   Q1  | -  |N1  | -  | -  | -  |
  -----+----+----+----+----+----+
   Q2  |N9  | -  |N2  | -  | -  |
  -----+----+----+----+----+----+
   Q3  |N3  | -  | -  |N5  | -  |
  -----+----+----+----+----+----+
   Q4  |N6  | -  | -  | -  |N7  |
  -----+----+----+----+----+----+
   Q5  |N11 | -  |N10 |N8  | -  |
  -----+----+----+----+----+----+
7.3.3. State Descriptions for Initiators and Targets
-Q1: FREE
-initiator: State on instantiation or after cleanup.
-target: State on instantiation or after cleanup.
-Q2: ACTIVE
-initiator: Illegal.
-target: The first iSCSI connection in the session transitioned
to IN_LOGIN, waiting for it to complete the login process.
-Q3: LOGGED_IN
-initiator: Waiting for all session events.
-target: Waiting for all session events.
-Q4: FAILED
-initiator: Waiting for session recovery or session
continuation.
-target: Waiting for session recovery or session continuation.
-Q5: IN_CONTINUE
-initiator: Illegal.
-target: Waiting for session continuation attempt to reach a
conclusion.
7.3.4. State Transition Descriptions for Initiators and Targets
-N1:
-initiator: At least one transport connection reached the
LOGGED_IN state.
-target: The first iSCSI connection in the session had reached
the IN_LOGIN state.
-N2:
-initiator: Illegal.
-target: At least one iSCSI connection reached the LOGGED_IN
state.
-N3:
-initiator: Graceful closing of the session via session closure
(Section 5.3.6 Session Continuation and Failure).
-target: Graceful closing of the session via session closure
(Section 5.3.6 Session Continuation and Failure) or a
successful session reinstatement cleanly closed the session.
-N4:
-initiator: A session continuation attempt succeeded.
-target: Illegal.
-N5:
-initiator: Session failure (Section 5.3.6 Session Continuation
and Failure) occurred.
-target: Session failure (Section 5.3.6 Session Continuation and
Failure) occurred.
-N6:
-initiator: Session state timeout occurred, or a session
reinstatement cleared this session instance. This results
in the freeing of all associated resources and the session
state is discarded.
-target: Session state timeout occurred, or a session
reinstatement cleared this session instance. This results
in the freeing of all associated resources and the session
state is discarded.
-N7:
-initiator: Illegal.
-target: A session continuation attempt is initiated.
-N8:
-initiator: Illegal.
-target: The last session continuation attempt failed.
-N9:
-initiator: Illegal.
-target: Login attempt on the leading connection failed.
-N10:
-initiator: Illegal.
-target: A session continuation attempt succeeded.
-N11:
-initiator: Illegal.
-target: A successful session reinstatement cleanly closed the
session.
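One way to sanity-check an encoding of the target session state table
is a reachability pass over it. The sketch below is illustrative only:
TARGET_SESSION and reachable are hypothetical names, and the graph
contains just the transitions that are legal for a target as listed
above.

```python
from collections import deque

# state -> {transition: next state}, per the target session state table.
TARGET_SESSION = {
    "Q1": {"N1": "Q2"},
    "Q2": {"N9": "Q1", "N2": "Q3"},
    "Q3": {"N3": "Q1", "N5": "Q4"},
    "Q4": {"N6": "Q1", "N7": "Q5"},
    "Q5": {"N11": "Q1", "N10": "Q3", "N8": "Q4"},
}


def reachable(start):
    """Return the set of states reachable from start (breadth-first)."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in TARGET_SESSION[queue.popleft()].values():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Every state is reachable from FREE, and FREE is reachable from every state.
assert reachable("Q1") == {"Q1", "Q2", "Q3", "Q4", "Q5"}
assert all("Q1" in reachable(s) for s in TARGET_SESSION)
```

The second assertion reflects that every target session state has some
path back to FREE.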