RFC 7530

Network File System (NFS) Version 4 Protocol

Pages: 323
Proposed Standard
Obsoletes: 3530
Updated by: 7931 8587

Part 13 of 14 – Pages 265 to 305

RFC7530 - Page 265

16.22.  Operation 24: PUTROOTFH - Set Root Filehandle

16.22.1.  SYNOPSIS

     - -> (cfh)

16.22.2.  ARGUMENT

     void;

16.22.3.  RESULT

   struct PUTROOTFH4res {
           /* CURRENT_FH: root fh */
           nfsstat4        status;
   };

16.22.4.  DESCRIPTION

   PUTROOTFH replaces the current filehandle with the filehandle that
   represents the root of the server's namespace.  From this filehandle,
   a LOOKUP operation can locate any other filehandle on the server.
   This filehandle may be different from the public filehandle, which
   may be associated with some other directory on the server.

   See Section 15.2.4.1 for more details on the current filehandle.

16.22.5.  IMPLEMENTATION

   PUTROOTFH is commonly used as the first operator in an NFS request to
   set the context for operations that follow it.

16.23.  Operation 25: READ - Read from File

16.23.1.  SYNOPSIS

     (cfh), stateid, offset, count -> eof, data

16.23.2.  ARGUMENT

   struct READ4args {
           /* CURRENT_FH: file */
           stateid4        stateid;
           offset4         offset;
           count4          count;
   };

16.23.3.  RESULT

   struct READ4resok {
           bool            eof;
           opaque          data<>;
   };

   union READ4res switch (nfsstat4 status) {
    case NFS4_OK:
            READ4resok     resok4;
    default:
            void;
   };

16.23.4.  DESCRIPTION

   The READ operation reads data from the regular file identified by the
   current filehandle.

   The client provides an offset of where the READ is to start and a
   count of how many bytes are to be read.  An offset of 0 (zero) means
   to read data starting at the beginning of the file.  If the offset is
   greater than or equal to the size of the file, the status, NFS4_OK,
   is returned with a data length set to 0 (zero), and eof is set to
   TRUE.  The READ is subject to access permissions checking.

   If the client specifies a count value of 0 (zero), the READ succeeds
   and returns 0 (zero) bytes of data (subject to access permissions
   checking).  The server may choose to return fewer bytes than
   specified by the client.  The client needs to check for this
   condition and handle it appropriately.

   The stateid value for a READ request represents a value returned from
   a previous byte-range lock or share reservation request, or the
   stateid associated with a delegation.  The stateid is used by the
   server to verify that the associated share reservation and any
   byte-range locks are still valid and to update lease timeouts for the
   client.

   If the READ ended at the end-of-file (formally, in a correctly formed
   READ request, if offset + count is equal to the size of the file), or
   the READ request extends beyond the size of the file (if offset +
   count is greater than the size of the file), eof is returned as TRUE;
   otherwise, it is FALSE.  A successful READ of an empty file will
   always return eof as TRUE.
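
   As a minimal sketch of the offset, count, and eof rules above, the
   following models a file as an in-memory byte string.  The function
   name read_result is hypothetical, and the sketch assumes the server
   returns every available byte in the requested range, which a real
   server is not required to do.

```python
def read_result(file_data: bytes, offset: int, count: int):
    """Return (eof, data) for a READ of `count` bytes at `offset`.

    Assumption: every available byte in the range is returned;
    a real server may choose to return fewer.
    """
    size = len(file_data)
    # Slicing past the end yields an empty result, matching the
    # "offset >= size" case (NFS4_OK, zero-length data).
    data = file_data[offset:offset + count]
    # eof is TRUE when the request reaches or extends past end-of-file.
    eof = offset + count >= size
    return eof, data
```

   Under these assumptions, a zero-count READ inside a non-empty file
   yields no data with eof FALSE, while any READ of an empty file
   yields eof TRUE.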

   If the current filehandle is not a regular file, an error will be
   returned to the client.  In the case where the current filehandle
   represents a directory, NFS4ERR_ISDIR is returned; otherwise,
   NFS4ERR_INVAL is returned.

   For a READ using the special anonymous stateid, the server MAY allow
   the READ to be serviced subject to mandatory file locks or the
   current share_deny modes for the file.  For a READ using the special
   READ bypass stateid, the server MAY allow READ operations to bypass
   locking checks at the server.

   On success, the current filehandle retains its value.

16.23.5.  IMPLEMENTATION

   If the server returns a "short read" (i.e., less data than requested
   and eof is set to FALSE), the client should send another READ to get
   the remaining data.  A server may return less data than requested
   under several circumstances.  The file may have been truncated by
   another client or perhaps on the server itself, changing the file
   size from what the requesting client believes to be the case.  This
   would reduce the actual amount of data available to the client.  It
   is possible that the server reduces the transfer size and so returns
   a short read result.  Server resource exhaustion may also result in a
   short read.
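
   The client-side handling of short reads described above might be
   sketched as follows.  The callable read_op is a hypothetical
   stand-in for sending a READ and decoding its (eof, data) result;
   make_fake_read is an illustrative in-memory server that caps each
   reply at max_xfer bytes.  Neither name comes from the protocol.

```python
def read_fully(read_op, offset, count):
    """Collect up to `count` bytes from `offset`, re-sending READ
    after each short read until eof or the request is satisfied."""
    chunks = []
    remaining = count
    while remaining > 0:
        eof, data = read_op(offset, remaining)
        chunks.append(data)
        offset += len(data)
        remaining -= len(data)
        if eof:
            break
        if not data:
            # Guard against looping forever on an unexpected reply.
            raise OSError("READ returned no data and eof is FALSE")
    return b"".join(chunks)


def make_fake_read(file_data, max_xfer):
    """Illustrative server: never returns more than max_xfer bytes."""
    def read_op(offset, count):
        data = file_data[offset:offset + min(count, max_xfer)]
        # This fake reports eof only when the returned data reaches
        # end-of-file, so capped replies look like short reads.
        return offset + len(data) >= len(file_data), data
    return read_op
```

   The guard on an empty reply is a design choice of this sketch, not
   a protocol requirement; it simply keeps the loop from spinning if a
   server misbehaves.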

   If mandatory byte-range locking is in effect for the file, and if the
   byte range corresponding to the data to be read from the file is
   WRITE_LT locked by an owner not associated with the stateid, the
   server will return the NFS4ERR_LOCKED error.  The client should try
   to get the appropriate READ_LT via the LOCK operation before
   re-attempting the READ.  When the READ completes, the client should
   release the byte-range lock via LOCKU.

   If another client has an OPEN_DELEGATE_WRITE delegation for the file
   being read, the delegation must be recalled, and the operation cannot
   proceed until that delegation is returned or revoked.  Except where
   this happens very quickly, one or more NFS4ERR_DELAY errors will be
   returned to requests made while the delegation remains outstanding.
   Normally, delegations will not be recalled as a result of a READ
   operation, since the recall will occur as a result of an earlier
   OPEN.  However, since it is possible for a READ to be done with a
   special stateid, the server needs to check for this case even though
   the client should have done an OPEN previously.

16.24.  Operation 26: READDIR - Read Directory

16.24.1.  SYNOPSIS

     (cfh), cookie, cookieverf, dircount, maxcount, attr_request ->
     cookieverf { cookie, name, attrs }

16.24.2.  ARGUMENT

   struct READDIR4args {
           /* CURRENT_FH: directory */
           nfs_cookie4     cookie;
           verifier4       cookieverf;
           count4          dircount;
           count4          maxcount;
           bitmap4         attr_request;
   };

16.24.3.  RESULT

   struct entry4 {
           nfs_cookie4     cookie;
           component4      name;
           fattr4          attrs;
           entry4          *nextentry;
   };

   struct dirlist4 {
           entry4          *entries;
           bool            eof;
   };

   struct READDIR4resok {
           verifier4       cookieverf;
           dirlist4        reply;
   };

   union READDIR4res switch (nfsstat4 status) {
    case NFS4_OK:
            READDIR4resok  resok4;
    default:
            void;
   };

16.24.4.  DESCRIPTION

   The READDIR operation retrieves a variable number of entries from a
   file system directory and for each entry returns attributes that were
   requested by the client, along with information to allow the client
   to request additional directory entries in a subsequent READDIR.

   The arguments contain a cookie value that represents where the
   READDIR should start within the directory.  A value of 0 (zero) for
   the cookie is used to start reading at the beginning of the
   directory.  For subsequent READDIR requests, the client specifies a
   cookie value that was provided by the server in the response to a
   previous READDIR request.

   The cookieverf value should be set to 0 (zero) when the cookie value
   is 0 (zero) (first directory read).  On subsequent requests, it
   should be a cookieverf as returned by the server.  The cookieverf
   must match that returned by the READDIR in which the cookie was
   acquired.  If the server determines that the cookieverf is no longer
   valid for the directory, the error NFS4ERR_NOT_SAME must be returned.

   The dircount portion of the argument is a hint of the maximum number
   of bytes of directory information that should be returned.  This
   value represents the length of the names of the directory entries and
   the cookie value for these entries.  This length represents the XDR
   encoding of the data (names and cookies) and not the length in the
   native format of the server.

   The maxcount value of the argument is the maximum number of bytes for
   the result.  This maximum size represents all of the data being
   returned within the READDIR4resok structure and includes the XDR
   overhead.  The server may return less data.  If the server is unable
   to return a single directory entry within the maxcount limit, the
   error NFS4ERR_TOOSMALL will be returned to the client.

   Finally, attr_request represents the list of attributes to be
   returned for each directory entry supplied by the server.

   On successful return, the server's response will provide a list of
   directory entries.  Each of these entries contains the name of the
   directory entry, a cookie value for that entry, and the associated
   attributes as requested.  The "eof" flag has a value of TRUE if there
   are no more entries in the directory.
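
   The cookie, cookieverf, and eof handling described above can be
   sketched as a client loop.  The callable readdir_op is a
   hypothetical stand-in for sending READDIR and decoding the reply as
   (cookieverf, [(cookie, name), ...], eof); attributes, dircount, and
   maxcount are omitted.  make_fake_readdir is an illustrative
   in-memory server whose cookie scheme (index + 3) is an assumption
   of this sketch only.

```python
ZERO_VERF = b"\x00" * 8  # verifier4 is an 8-byte opaque value


def list_directory(readdir_op):
    """Read a whole directory by following cookies until eof."""
    names = []
    cookie, cookieverf = 0, ZERO_VERF  # first read: both are zero
    while True:
        cookieverf, entries, eof = readdir_op(cookie, cookieverf)
        for entry_cookie, name in entries:
            names.append(name)
            cookie = entry_cookie  # resume after the last entry seen
        if eof:
            return names
        if not entries:
            raise OSError("READDIR returned no entries and eof is FALSE")


def make_fake_readdir(all_names, per_reply):
    """Illustrative server returning at most per_reply entries."""
    verf = b"\x01" * 8

    def readdir_op(cookie, cookieverf):
        expected = ZERO_VERF if cookie == 0 else verf
        if cookieverf != expected:
            raise OSError("NFS4ERR_NOT_SAME")
        # Entry i carries cookie i + 3, so the next index is cookie - 2.
        start = 0 if cookie == 0 else cookie - 2
        chunk = all_names[start:start + per_reply]
        entries = [(start + i + 3, n) for i, n in enumerate(chunk)]
        return verf, entries, start + len(chunk) >= len(all_names)

    return readdir_op
```

   The client treats each cookie as an opaque bookmark: it never
   computes one itself and only echoes back the last value received
   together with the verifier from the same reply.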

   The cookie value is only meaningful to the server and is used as a
   "bookmark" for the directory entry.  As mentioned, this cookie is
   used by the client for subsequent READDIR operations so that it may
   continue reading a directory.  The cookie is similar in concept to a
   READ offset but should not be interpreted as such by the client.  The
   server SHOULD try to accept cookie values issued with READDIR
   responses even if the directory has been modified between the READDIR
   calls but MAY return NFS4ERR_NOT_VALID if this is not possible, as
   might be the case if the server has rebooted in the interim.

   In some cases, the server may encounter an error while obtaining the
   attributes for a directory entry.  Instead of returning an error for
   the entire READDIR operation, the server can instead return the
   attribute 'fattr4_rdattr_error'.  With this, the server is able to
   communicate the failure to the client and not fail the entire
   operation in the instance of what might be a transient failure.
   Obviously, the client must request the fattr4_rdattr_error attribute
   for this method to work properly.  If the client does not request the
   attribute, the server has no choice but to return failure for the
   entire READDIR operation.

   For some file system environments, the directory entries "." and ".."
   have special meaning, and in other environments, they may not.  If
   the server supports these special entries within a directory, they
   should not be returned to the client as part of the READDIR response.
   To enable some client environments, the cookie values of 0, 1, and 2
   are to be considered reserved.  Note that the UNIX client will use
   these values when combining the server's response and local
   representations to enable a fully formed UNIX directory presentation
   to the application.

   For READDIR arguments, cookie values of 1 and 2 SHOULD NOT be used,
   and for READDIR results, cookie values of 0, 1, and 2 MUST NOT be
   returned.

   On success, the current filehandle retains its value.

16.24.5.  IMPLEMENTATION

   The server's file system directory representations can differ
   greatly.  A client's programming interfaces may also be bound to the
   local operating environment in a way that does not translate well
   into the NFS protocol.  Therefore, the dircount and maxcount fields
   are provided to allow the client the ability to provide guidelines to
   the server.  If the client is aggressive about attribute collection
   during a READDIR, the server has an idea of how to limit the encoded
   response.  The dircount field provides a hint on the number of
   entries based solely on the names of the directory entries.  Since it
   is a hint, it may be possible that a dircount value is zero.  In this
   case, the server is free to ignore the dircount value and return
   directory information based on the specified maxcount value.

   As there is no way for the client to indicate that a cookie value,
   once received, will not be subsequently used, server implementations
   should avoid schemes that allocate memory corresponding to a returned
   cookie.  Such allocation can be avoided if the server bases cookie
   values on a value such as the offset within the directory where the
   scan is to be resumed.

   Cookies generated by such techniques should be designed to remain
   valid despite modification of the associated directory.  If a server
   were to invalidate a cookie because of a directory modification,
   READDIRs of large directories might never finish.
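
   One way to picture a cookie scheme that needs no per-cookie server
   memory and survives directory modification is sketched below.  It
   assumes each entry has a stable, unique position for as long as it
   exists (for example, a slot offset in the directory file); that
   assumption, and the names cookie_for and resume, are illustrative
   and not taken from this specification.

```python
RESERVED_COOKIES = 3  # cookie values 0, 1, and 2 are reserved


def cookie_for(position: int) -> int:
    """Derive a cookie from a stable entry position (>= 0)."""
    return position + RESERVED_COOKIES


def resume(entries, cookie, limit):
    """Return (page, eof) for a scan resuming after `cookie`.

    entries maps name -> stable position.  The scan resumes at the
    first entry whose cookie is greater than the one supplied, so the
    entry that originally produced the cookie need not still exist.
    """
    ordered = sorted((cookie_for(pos), name) for name, pos in entries.items())
    later = [item for item in ordered if item[0] > cookie]
    return later[:limit], len(later) <= limit
```

   Because the resume point is found by comparison rather than by
   looking up the exact entry, a cookie remains usable even after the
   entry it came from has been removed.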

   If a directory is deleted after the client has carried out one or
   more READDIR operations on the directory, the cookies returned will
   become invalid; however, the server does not need to be concerned, as
   the directory filehandle used previously would have become stale and
   would be reported as such on subsequent READDIR operations.  The
   server would not need to check the cookie verifier in this case.

   However, certain reorganization operations on a directory (including
   directory compaction) may invalidate READDIR cookies previously given
   out.  When such a situation occurs, the server should modify the
   cookie verifier so as to disallow the use of cookies that would
   otherwise no longer be valid.

   The cookieverf may be used by the server to help manage cookie values
   that may become stale.  It should be a rare occurrence that a server
   is unable to continue properly reading a directory with the provided
   cookie/cookieverf pair.  The server should make every effort to avoid
   this condition since the application at the client may not be able to
   properly handle this type of failure.

   The use of the cookieverf will also protect the client from using
   READDIR cookie values that may be stale.  For example, if the file
   system has been migrated, the server may or may not be able to use
   the same cookie values to service READDIR as the previous server
   used.  With the client providing the cookieverf, the server is able
   to provide the appropriate response to the client.  This prevents the
   case where the server may accept a cookie value but the underlying
   directory has changed and the response is invalid from the client's
   context of its previous READDIR.

   Since some servers will not be returning "." and ".." entries as has
   been done with previous versions of the NFS protocol, the client that
   requires these entries be present in READDIR responses must fabricate
   them.

16.25.  Operation 27: READLINK - Read Symbolic Link

16.25.1.  SYNOPSIS

     (cfh) -> linktext

16.25.2.  ARGUMENT

     /* CURRENT_FH: symlink */
     void;

16.25.3.  RESULT

   struct READLINK4resok {
           linktext4       link;
   };

   union READLINK4res switch (nfsstat4 status) {
    case NFS4_OK:
            READLINK4resok resok4;
    default:
            void;
   };

16.25.4.  DESCRIPTION

   READLINK reads the data associated with a symbolic link.  The data is
   a UTF-8 string that is opaque to the server.  That is, whether
   created by an NFS client or created locally on the server, the data
   in a symbolic link is not interpreted when created but is simply
   stored.

   On success, the current filehandle retains its value.

16.25.5.  IMPLEMENTATION

   A symbolic link is nominally a pointer to another file.  The data is
   not necessarily interpreted by the server; it is just stored in the
   file.  It is possible for a client implementation to store a pathname
   that is not meaningful to the server operating system in a symbolic
   link.  A READLINK operation returns the data to the client for
   interpretation.  If different implementations want to share access to
   symbolic links, then they must agree on the interpretation of the
   data in the symbolic link.

   The READLINK operation is only allowed on objects of type NF4LNK.
   The server should return the error NFS4ERR_INVAL if the object is not
   of type NF4LNK.

16.26.  Operation 28: REMOVE - Remove File System Object

16.26.1.  SYNOPSIS

     (cfh), filename -> change_info

16.26.2.  ARGUMENT

   struct REMOVE4args {
           /* CURRENT_FH: directory */
           component4      target;
   };

16.26.3.  RESULT

   struct REMOVE4resok {
           change_info4    cinfo;
   };

   union REMOVE4res switch (nfsstat4 status) {
    case NFS4_OK:
            REMOVE4resok   resok4;
    default:
            void;
   };

16.26.4.  DESCRIPTION

   The REMOVE operation removes (deletes) a directory entry named by
   filename from the directory corresponding to the current filehandle.
   If the entry in the directory was the last reference to the
   corresponding file system object, the object may be destroyed.

   For the directory where the filename was removed, the server returns
   change_info4 information in cinfo.  With the atomic field of the
   change_info4 struct, the server will indicate if the before and after
   change attributes were obtained atomically with respect to the
   removal.

   If the target is of zero length, NFS4ERR_INVAL will be returned.  The
   target is also subject to the normal UTF-8, character support, and
   name checks.  See Section 12.7 for further discussion.

   On success, the current filehandle retains its value.

16.26.5.  IMPLEMENTATION

   NFSv3 required a different operator -- RMDIR -- for directory
   removal, and REMOVE for non-directory removal.  This allowed clients
   to skip checking the file type when being passed a non-directory
   delete system call (e.g., unlink() [unlink] in POSIX) to remove a
   directory, as well as the converse (e.g., a rmdir() on a
   non-directory), because they knew the server would check the file
   type.  NFSv4 REMOVE can be used to delete any directory entry,
   independent of its file type.  The implementer of an NFSv4 client's
   entry points from the unlink() and rmdir() system calls should first
   check the file type against the types the system call is allowed to
   remove before issuing a REMOVE.  Alternatively, the implementer can
   produce a COMPOUND call that includes a LOOKUP/VERIFY sequence to
   verify the file type before a REMOVE operation in the same COMPOUND
   call.
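
   The client-side type check described above might look like the
   following sketch.  The function name precheck_remove is
   hypothetical, the file type is assumed to have been fetched
   beforehand (for example, via GETATTR), and the errno values chosen
   follow a common UNIX convention rather than anything mandated here.

```python
import errno

NF4REG = 1  # nfs_ftype4: regular file
NF4DIR = 2  # nfs_ftype4: directory


def precheck_remove(syscall: str, ftype: int) -> int:
    """Return 0 if a REMOVE may be sent, else an errno value.

    syscall is "unlink" or "rmdir"; ftype is the object's nfs_ftype4.
    """
    is_dir = ftype == NF4DIR
    if syscall == "unlink" and is_dir:
        return errno.EISDIR   # unlink() must not remove a directory
    if syscall == "rmdir" and not is_dir:
        return errno.ENOTDIR  # rmdir() must not remove a non-directory
    return 0
```

   A check done this way is not atomic with the REMOVE itself; the
   LOOKUP/VERIFY sequence in a single COMPOUND narrows that window.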

   The concept of last reference is server specific.  However, if the
   numlinks field in the previous attributes of the object had the value
   1, the client should not rely on referring to the object via a
   filehandle.  Likewise, the client should not rely on the resources
   (disk space, directory entry, and so on) formerly associated with the
   object becoming immediately available.  Thus, if a client needs to be
   able to continue to access a file after using REMOVE to remove it,
   the client should take steps to make sure that the file will still be
   accessible.  The usual mechanism used is to RENAME the file from its
   old name to a new hidden name.

   If the server finds that the file is still open when the REMOVE
   arrives:

   o  The server SHOULD NOT delete the file's directory entry if the
      file was opened with OPEN4_SHARE_DENY_WRITE or
      OPEN4_SHARE_DENY_BOTH.

   o  If the file was not opened with OPEN4_SHARE_DENY_WRITE or
      OPEN4_SHARE_DENY_BOTH, the server SHOULD delete the file's
      directory entry.  However, until the last CLOSE of the file, the
      server MAY continue to allow access to the file via its
      filehandle.

16.27.  Operation 29: RENAME - Rename Directory Entry

16.27.1.  SYNOPSIS

     (sfh), oldname, (cfh), newname -> source_cinfo, target_cinfo

16.27.2.  ARGUMENT

   struct RENAME4args {
           /* SAVED_FH: source directory */
           component4      oldname;
           /* CURRENT_FH: target directory */
           component4      newname;
   };

16.27.3.  RESULT

   struct RENAME4resok {
           change_info4    source_cinfo;
           change_info4    target_cinfo;
   };

   union RENAME4res switch (nfsstat4 status) {
    case NFS4_OK:
            RENAME4resok    resok4;
    default:
            void;
   };

16.27.4.  DESCRIPTION

   The RENAME operation renames the object identified by oldname in the
   source directory corresponding to the saved filehandle, as set by the
   SAVEFH operation, to newname in the target directory corresponding to
   the current filehandle.  The operation is required to be atomic to
   the client.  Source and target directories must reside on the same
   file system on the server.  On success, the current filehandle will
   continue to be the target directory.

   If the target directory already contains an entry with the name
   newname, the source object must be compatible with the target: either
   both are non-directories, or both are directories, and the target
   must be empty.  If compatible, the existing target is removed before
   the rename occurs (see Section 16.26 for client and server actions
   whenever a target is removed).  If they are not compatible or if the
   target is a directory but not empty, the server will return the error
   NFS4ERR_EXIST.

   If oldname and newname both refer to the same file (they might be
   hard links of each other), then RENAME should perform no action and
   return success.
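
   The rules above for an existing target and for identical source
   and target can be summarized in a small decision function.  This
   is a sketch only: the function name, its arguments, and the
   descriptive result strings are illustrative, and other error
   checks (names, file systems, permissions) are left out.

```python
def rename_outcome(source_is_dir, target=None, same_object=False):
    """Classify a RENAME per the existing-target rules.

    target is None when newname does not exist, otherwise a tuple
    (target_is_dir, target_is_empty).
    """
    if same_object:
        return "NFS4_OK (no action)"
    if target is None:
        return "NFS4_OK"
    target_is_dir, target_is_empty = target
    if source_is_dir != target_is_dir:
        return "NFS4ERR_EXIST"  # directory vs. non-directory
    if target_is_dir and not target_is_empty:
        return "NFS4ERR_EXIST"  # target directory is not empty
    return "NFS4_OK (existing target removed)"
```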

   For both directories involved in the RENAME, the server returns
   change_info4 information.  With the atomic field of the change_info4
   struct, the server will indicate if the before and after change
   attributes were obtained atomically with respect to the rename.

   If the oldname refers to a named attribute and the saved and current
   filehandles refer to the named attribute directories of different
   file system objects, the server will return NFS4ERR_XDEV, just as if
   the saved and current filehandles represented directories on
   different file systems.

   If the oldname or newname is of zero length, NFS4ERR_INVAL will be
   returned.  The oldname and newname are also subject to the normal
   UTF-8, character support, and name checks.  See Section 12.7 for
   further discussion.

16.27.5.  IMPLEMENTATION

   The RENAME operation must be atomic to the client.  The statement
   "source and target directories must reside on the same file system on
   the server" means that the fsid fields in the attributes for the
   directories are the same.  If they reside on different file systems,
   the error NFS4ERR_XDEV is returned.

   Based on the value of the fh_expire_type attribute for the object,
   the filehandle may or may not expire on a RENAME.  However, server
   implementers are strongly encouraged to attempt to keep filehandles
   from expiring in this fashion.

   On some servers, the filenames "." and ".." are illegal as either
   oldname or newname and will result in the error NFS4ERR_BADNAME.  In
   addition, many servers check for the case where oldname or newname
   is an alias for the source directory and return the error
   NFS4ERR_INVAL in that case.

   If either the source or the target filehandle is not a directory,
   the server will return NFS4ERR_NOTDIR.

16.28.  Operation 30: RENEW - Renew a Lease

16.28.1.  SYNOPSIS

     clientid -> ()

16.28.2.  ARGUMENT

   struct RENEW4args {
           clientid4       clientid;
   };

16.28.3.  RESULT

   struct RENEW4res {
           nfsstat4        status;
   };

16.28.4.  DESCRIPTION

   The RENEW operation is used by the client to renew leases that it
   currently holds at a server.  In processing the RENEW request, the
   server renews all leases associated with the client.  The associated
   leases are determined by the clientid provided via the SETCLIENTID
   operation.

16.28.5.  IMPLEMENTATION

   When the client holds delegations, it needs to use RENEW to detect
   when the server has determined that the callback path is down.  When
   the server has made such a determination, only the RENEW operation
   will renew the lease on delegations.  If the server determines the
   callback path is down, it returns NFS4ERR_CB_PATH_DOWN.  Even though
   it returns NFS4ERR_CB_PATH_DOWN, the server MUST renew the lease on
   the byte-range locks and share reservations that the client has
   established on the server.  If for some reason the lock and share
   reservation lease cannot be renewed, then the server MUST return an
   error other than NFS4ERR_CB_PATH_DOWN, even if the callback path is
   also down.  In the event that the server has conditions such that it
   could return either NFS4ERR_CB_PATH_DOWN or NFS4ERR_LEASE_MOVED,
   NFS4ERR_LEASE_MOVED MUST be handled first.

   The client that issues RENEW MUST choose the principal, RPC security
   flavor, and, if applicable, GSS-API mechanism and service via one of
   the following algorithms:

   o  The client uses the same principal, RPC security flavor, and -- if
      the flavor was RPCSEC_GSS -- the same mechanism and service that
      were used when the client ID was established via
      SETCLIENTID_CONFIRM.

   o  The client uses any principal, RPC security flavor, mechanism, and
      service combination that currently has an OPEN file on the server.
      That is, the same principal had a successful OPEN operation; the
      file is still open by that principal; and the flavor, mechanism,
      and service of RENEW match that of the previous OPEN.

   The server MUST reject a RENEW that does not use one of the
   aforementioned algorithms, with the error NFS4ERR_ACCESS.

16.29.  Operation 31: RESTOREFH - Restore Saved Filehandle

16.29.1.  SYNOPSIS

     (sfh) -> (cfh)

16.29.2.  ARGUMENT

     /* SAVED_FH: */
     void;

16.29.3.  RESULT

   struct RESTOREFH4res {
           /* CURRENT_FH: value of saved fh */
           nfsstat4        status;
   };

16.29.4.  DESCRIPTION

   Set the current filehandle to the value in the saved filehandle.  If
   there is no saved filehandle, then return the error
   NFS4ERR_RESTOREFH.

16.29.5.  IMPLEMENTATION

   Operations like OPEN and LOOKUP use the current filehandle to
   represent a directory and replace it with a new filehandle.  Assuming
   that the previous filehandle was saved with a SAVEFH operator, the
   previous filehandle can be restored as the current filehandle.  This
   is commonly used to obtain post-operation attributes for the
   directory, e.g.,

     PUTFH (directory filehandle)
     SAVEFH
     GETATTR attrbits     (pre-op dir attrs)
     CREATE optbits "foo" attrs
     GETATTR attrbits     (file attributes)
     RESTOREFH
     GETATTR attrbits     (post-op dir attrs)
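
   The interplay of the current and saved filehandles in a sequence
   like the one above can be modeled with a small state object.  The
   class and method names are hypothetical, filehandles are plain
   strings, and the NFS4ERR_NOFILEHANDLE result for a SAVEFH without a
   current filehandle is an assumption drawn from outside this
   section.

```python
class FilehandleState:
    """Toy model of the per-COMPOUND current and saved filehandles."""

    def __init__(self):
        self.current = None
        self.saved = None

    def putfh(self, fh):
        self.current = fh
        return "NFS4_OK"

    def savefh(self):
        if self.current is None:
            return "NFS4ERR_NOFILEHANDLE"  # assumption, see lead-in
        self.saved = self.current
        return "NFS4_OK"

    def restorefh(self):
        if self.saved is None:
            return "NFS4ERR_RESTOREFH"  # nothing was saved
        self.current = self.saved
        return "NFS4_OK"
```

   In this model, an operation that replaces the current filehandle
   is simulated by another putfh call; restorefh then brings back the
   directory filehandle so that post-operation attributes can be
   fetched.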

16.30.  Operation 32: SAVEFH - Save Current Filehandle

16.30.1.  SYNOPSIS

     (cfh) -> (sfh)

16.30.2.  ARGUMENT

     /* CURRENT_FH: */
     void;

16.30.3.  RESULT

   struct SAVEFH4res {
           /* SAVED_FH: value of current fh */
           nfsstat4        status;
   };

16.30.4.  DESCRIPTION

   Save the current filehandle.  If a previous filehandle was saved,
   then it is no longer accessible.  The saved filehandle can be
   restored as the current filehandle with the RESTOREFH operator.

   On success, the current filehandle retains its value.

16.30.5.  IMPLEMENTATION

16.31.  Operation 33: SECINFO - Obtain Available Security

16.31.1.  SYNOPSIS

     (cfh), name -> { secinfo }

16.31.2.  ARGUMENT

   struct SECINFO4args {
           /* CURRENT_FH: directory */
           component4      name;
   };

16.31.3.  RESULT

   /*
    * From RFC 2203
    */
   enum rpc_gss_svc_t {
           RPC_GSS_SVC_NONE        = 1,
           RPC_GSS_SVC_INTEGRITY   = 2,
           RPC_GSS_SVC_PRIVACY     = 3
   };

   struct rpcsec_gss_info {
           sec_oid4        oid;
           qop4            qop;
           rpc_gss_svc_t   service;
   };

   /* RPCSEC_GSS has a value of '6'.  See RFC 2203 */
   union secinfo4 switch (uint32_t flavor) {
    case RPCSEC_GSS:
            rpcsec_gss_info        flavor_info;
    default:
            void;
   };

   typedef secinfo4 SECINFO4resok<>;

   union SECINFO4res switch (nfsstat4 status) {
    case NFS4_OK:
            SECINFO4resok resok4;
    default:
            void;
   };

16.31.4.  DESCRIPTION

   The SECINFO operation is used by the client to obtain a list of valid
   RPC authentication flavors for a specific directory filehandle,
   filename pair.  SECINFO should apply the same access methodology used
   for LOOKUP when evaluating the name.  Therefore, if the requester
   does not have the appropriate access to perform a LOOKUP for the
   name, then SECINFO must behave the same way and return
   NFS4ERR_ACCESS.

   The result will contain an array that represents the security
   mechanisms available, with an order corresponding to the server's
   preferences, the most preferred being first in the array.  The client
   is free to pick whatever security mechanism it both desires and
   supports, or to pick -- in the server's preference order -- the first
   one it supports.  The array entries are represented by the secinfo4
   structure.  The field 'flavor' will contain a value of AUTH_NONE,
   AUTH_SYS (as defined in [RFC5531]), or RPCSEC_GSS (as defined in
   [RFC2203]).

   For the flavors AUTH_NONE and AUTH_SYS, no additional security
   information is returned.  For a return value of RPCSEC_GSS, a
   security triple is returned that contains the mechanism object id (as
   defined in [RFC2743]), the quality of protection (as defined in
   [RFC2743]), and the service type (as defined in [RFC2203]).  It is
   possible for SECINFO to return multiple entries with flavor equal to
   RPCSEC_GSS, with different security triple values.
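
   A client that follows the server's preference order might select
   from a SECINFO result as sketched below.  Representing each entry
   as a tuple, (flavor,) for AUTH_NONE and AUTH_SYS or
   (RPCSEC_GSS, oid, qop, service) for a security triple, is a choice
   made for this sketch, as is the function name pick_security.

```python
AUTH_NONE = 0
AUTH_SYS = 1
RPCSEC_GSS = 6


def pick_security(secinfo, supported):
    """Return the first entry of secinfo that the client supports.

    secinfo is the server's list, most preferred first; supported is
    the set of entries the client is able and willing to use.
    Returns None if there is no overlap.
    """
    for entry in secinfo:
        if entry in supported:
            return entry
    return None
```

   For example, with the Kerberos V5 mechanism OID written as the
   string "1.2.840.113554.1.2.2", a client that supports only the
   integrity service and AUTH_SYS would skip a more preferred privacy
   entry and settle on the integrity triple.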

   On success, the current filehandle retains its value.

   If the name has a length of 0 (zero), or if the name does not obey
   the UTF-8 definition, the error NFS4ERR_INVAL will be returned.

16.31.5.  IMPLEMENTATION

   The SECINFO operation is expected to be used by the NFS client when
   the error value of NFS4ERR_WRONGSEC is returned from another NFS
   operation.  This signifies to the client that the server's security
   policy is different from what the client is currently using.  At this
   point, the client is expected to obtain a list of possible security
   flavors and choose what best suits its policies.

   As mentioned, the server's security policies will determine when a
   client request receives NFS4ERR_WRONGSEC.  The operations that may
   receive this error are LINK, LOOKUP, LOOKUPP, OPEN, PUTFH, PUTPUBFH,
   PUTROOTFH, RENAME, RESTOREFH, and, indirectly, READDIR.  LINK and
   RENAME will only receive this error if the security used for the
   operation is inappropriate for the saved filehandle.  With the
   exception of READDIR, these operations represent the point at which
   the client can instantiate a filehandle into the current filehandle
   at the server.  The filehandle is either provided by the client
   (PUTFH, PUTPUBFH, PUTROOTFH) or generated as a result of a name-to-
   filehandle translation (LOOKUP and OPEN).  RESTOREFH is different
   because the filehandle is a result of a previous SAVEFH.  Even though
   the filehandle, for RESTOREFH, might have previously passed the
   server's inspection for a security match, the server will check it
   again on RESTOREFH to ensure that the security policy has not
   changed.

   If the client wants to resolve an error return of NFS4ERR_WRONGSEC,
   the following will occur:

   o  For LOOKUP and OPEN, the client will use SECINFO with the same
      current filehandle and name as provided in the original LOOKUP or
      OPEN to enumerate the available security triples.

   o  For LINK, PUTFH, RENAME, and RESTOREFH, the client will use
      SECINFO and provide the parent directory filehandle and the object
      name that corresponds to the filehandle originally provided by the
      PUTFH or RESTOREFH, or, for LINK and RENAME, the SAVEFH.

   o  For LOOKUPP, PUTROOTFH, and PUTPUBFH, the client will be unable to
      use the SECINFO operation since SECINFO requires a current
      filehandle and none exist for these three operations.  Therefore,
      the client must iterate through the security triples available at
      the client and re-attempt the PUTROOTFH or PUTPUBFH operation.  In
      the unfortunate event that none of the MANDATORY security triples
      are supported by the client and server, the client SHOULD try
      using others that support integrity.  Failing that, the client can
      try using AUTH_NONE, but because such forms lack integrity checks,
      this puts the client at risk.  Nonetheless, the server SHOULD
      allow the client to use whatever security form the client requests
      and the server supports, since the risks of doing so are on the
      client.
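
   The recovery choices listed above can be sketched as a small
   dispatch table.  This is only an illustration: the function names,
   the string labels, and the "mandatory"/"integrity" flags on each
   candidate triple are assumptions made for the example, not part of
   the protocol.

```python
def recovery_plan(op):
    """Return a label for the recovery step that applies to the
    operation that failed with NFS4ERR_WRONGSEC (labels are made up)."""
    if op in ("LOOKUP", "OPEN"):
        # SECINFO with the same current filehandle and name
        return "secinfo-same-args"
    if op in ("LINK", "PUTFH", "RENAME", "RESTOREFH"):
        # SECINFO on the parent directory filehandle and object name
        return "secinfo-parent-and-name"
    if op in ("LOOKUPP", "PUTROOTFH", "PUTPUBFH"):
        # SECINFO cannot be used; try the client's own triples in turn
        return "iterate-client-triples"
    raise ValueError("no WRONGSEC recovery step modeled for " + op)


def attempt_order(triples):
    """Order candidate triples: MANDATORY ones first, then others that
    support integrity, then the rest (for example, AUTH_NONE)."""
    def rank(triple):
        if triple["mandatory"]:
            return 0
        if triple["integrity"]:
            return 1
        return 2
    return sorted(triples, key=rank)  # sorted() is stable
```

   A client following this sketch would call attempt_order() only for
   the third group of operations and would retry the failed operation
   once per triple until one succeeds.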

   The READDIR operation will not directly return the NFS4ERR_WRONGSEC
   error.  However, if the READDIR request included a request for
   attributes, it is possible that the READDIR request's security triple
   does not match that of a directory entry.  If this is the case and
   the client has requested the rdattr_error attribute, the server will
   return the NFS4ERR_WRONGSEC error in rdattr_error for the entry.

   Note that a server MAY use the AUTH_NONE flavor to signify that the
   client is allowed to attempt to use authentication flavors that are
   not explicitly listed in the SECINFO results.  Instead of using a
   listed flavor, the client might then, for instance, opt to use an
   otherwise unlisted RPCSEC_GSS mechanism instead of AUTH_NONE.  It may
   wish to do so in order to meet an application requirement for data
   integrity or privacy.  In choosing to use an unlisted flavor, the
   client SHOULD always be prepared to handle a failure by falling back
   to using AUTH_NONE or another listed flavor.  It cannot assume that
   identity mapping is supported and should be prepared for the fact
   that its identity is squashed.

   See Section 19 for a discussion on the recommendations for security
   flavors used by SECINFO.

16.32.  Operation 34: SETATTR - Set Attributes

16.32.1.  SYNOPSIS

     (cfh), stateid, attrmask, attr_vals -> attrsset

16.32.2.  ARGUMENT

   struct SETATTR4args {
           /* CURRENT_FH: target object */
           stateid4        stateid;
           fattr4          obj_attributes;
   };

16.32.3.  RESULT

   struct SETATTR4res {
           nfsstat4        status;
           bitmap4         attrsset;
   };

16.32.4.  DESCRIPTION

   The SETATTR operation changes one or more of the attributes of a file
   system object.  The new attributes are specified with a bitmap and
   the attributes that follow the bitmap in bit order.

   The stateid argument for SETATTR is used to provide byte-range
   locking context that is necessary for SETATTR requests that set the
   size attribute.  Since setting the size attribute modifies the file's
   data, it has the same locking requirements as a corresponding WRITE.
   Any SETATTR that sets the size attribute is incompatible with a share
   reservation that specifies OPEN4_SHARE_DENY_WRITE.  The area between
   the old end-of-file and the new end-of-file is considered to be
   modified just as would have been the case had the area in question
   been specified as the target of WRITE, for the purpose of checking
   conflicts with byte-range locks, for those cases in which a server is
   implementing mandatory byte-range locking behavior.  A valid stateid
   SHOULD always be specified.  When the file size attribute is not set,
   the special anonymous stateid MAY be passed.

   On either success or failure of the operation, the server will return
   the attrsset bitmask to represent what (if any) attributes were
   successfully set.  The attrsset in the response is a subset of the
   bitmap4 that is part of the obj_attributes in the argument.

   On success, the current filehandle retains its value.

16.32.5.  IMPLEMENTATION

   If the request specifies the owner attribute to be set, the server
   SHOULD allow the operation to succeed if the current owner of the
   object matches the value specified in the request.  Some servers may
   be implemented in such a way as to prohibit the setting of the owner
   attribute unless the requester has the privilege to do so.  If the
   server is lenient in this one case of matching owner values, the
   client implementation may be simplified in cases of creation of an
   object (e.g., an exclusive create via OPEN) followed by a SETATTR.

   The file size attribute is used to request changes to the size of a
   file.  A value of zero causes the file to be truncated, a value less
   than the current size of the file causes data from the new size to
   the end of the file to be discarded, and a size greater than the
   current size of the file causes logically zeroed data bytes to be
   added to the end of the file.  Servers are free to implement this
   using holes or actual zero data bytes.  Clients should not make any
   assumptions regarding a server's implementation of this feature,
   beyond that the bytes returned will be zeroed.  Servers MUST support
   extending the file size via SETATTR.
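
   The effect of the size attribute on file contents can be modeled
   on an in-memory byte string.  The helper below is a toy model with
   an assumed name; a real server may represent the extension as a
   hole rather than as stored zero bytes.

```python
def apply_size(data: bytes, new_size: int) -> bytes:
    """Model SETATTR of the size attribute on a file's contents."""
    if new_size <= len(data):
        # Zero truncates the file; a smaller size discards the tail.
        return data[:new_size]
    # A larger size appends logically zeroed bytes.
    return data + bytes(new_size - len(data))
```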

   SETATTR is not guaranteed to be atomic.  A failed SETATTR may
   partially change a file's attributes, which is why the reply always
   includes the status and the list of attributes that were set.

   If the object whose attributes are being changed has a file
   delegation that is held by a client other than the one doing the
   SETATTR, the delegation(s) must be recalled, and the operation cannot
   proceed to actually change an attribute until each such delegation is
   returned or revoked.  In all cases in which delegations are recalled,
   the server is likely to return one or more NFS4ERR_DELAY errors while
   the delegation(s) remain outstanding, although it might not do so
   if the delegations are returned quickly.

   Changing the size of a file with SETATTR indirectly changes the
   time_modify and change attributes.  A client must account for this,
   as size changes can result in data deletion.

   The attributes time_access_set and time_modify_set are write-only
   attributes constructed as a switched union so the client can direct
   the server in setting the time values.  If the switched union
   specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4 to
   be used for the operation.  If the switched union does not specify
   SET_TO_CLIENT_TIME4, the server is to use its current time for the
   SETATTR operation.

   If server and client times differ, programs that compare client times
   to file times can break.  A time maintenance protocol should be used
   to limit client/server time skew.

   Use of a COMPOUND containing a VERIFY operation specifying only the
   change attribute, immediately followed by a SETATTR, provides a means
   whereby a client may specify a request that emulates the
   functionality of the SETATTR guard mechanism of NFSv3.  Since the
   function of the guard mechanism is to avoid changes to the file
   attributes based on stale information, delays between checking of the
   guard condition and the setting of the attributes have the potential
   to compromise this function, as would the corresponding delay in the
   NFSv4 emulation.  Therefore, NFSv4 servers should take care to avoid
   such delays, to the degree possible, when executing such a request.
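
   The guard emulation can be illustrated with a toy model of a
   two-operation COMPOUND.  The dictionary-based object, the function
   name, and the use of strings for status codes are assumptions made
   for the example; the model relies only on the fact that COMPOUND
   processing stops at the first operation that fails.

```python
def guarded_setattr(obj, expected_change, new_attrs):
    """Model VERIFY(change) immediately followed by SETATTR.

    obj is a dict of attributes that includes a 'change' counter.
    Returns the per-operation results, as a COMPOUND reply would."""
    results = []
    if obj["change"] != expected_change:
        # The guard failed: the SETATTR is never evaluated.
        results.append(("VERIFY", "NFS4ERR_NOT_SAME"))
        return results
    results.append(("VERIFY", "NFS4_OK"))
    obj.update(new_attrs)
    obj["change"] += 1  # toy stand-in for the change attribute moving
    results.append(("SETATTR", "NFS4_OK"))
    return results
```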

   If the server does not support an attribute as requested by the
   client, the server should return NFS4ERR_ATTRNOTSUPP.

   A mask of the attributes actually set is returned by SETATTR in all
   cases.  That mask MUST NOT include attribute bits not requested to be
   set by the client.  If the attribute masks in the request and reply
   are equal, the status field in the reply MUST be NFS4_OK.
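
   A client can check the two mask rules above mechanically.  The
   sketch below represents each bitmap4 as a Python integer, which is
   an assumption made for brevity; the function names are likewise
   illustrative.

```python
NFS4_OK = 0


def setattr_reply_is_consistent(requested, attrsset, status):
    """Check a SETATTR reply against the mask rules."""
    if attrsset & ~requested:
        return False  # reply names a bit the client never asked for
    if attrsset == requested and status != NFS4_OK:
        return False  # everything was set, so status must be NFS4_OK
    return True


def attrs_not_set(requested, attrsset):
    """Bits the client asked for that the server did not set."""
    return requested & ~attrsset
```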

16.33.  Operation 35: SETCLIENTID - Negotiate Client ID

16.33.1.  SYNOPSIS

     client, callback, callback_ident -> clientid, setclientid_confirm

16.33.2.  ARGUMENT

   struct SETCLIENTID4args {
           nfs_client_id4  client;
           cb_client4      callback;
           uint32_t        callback_ident;
   };

16.33.3.  RESULT

   struct SETCLIENTID4resok {
           clientid4       clientid;
           verifier4       setclientid_confirm;
   };

   union SETCLIENTID4res switch (nfsstat4 status) {
    case NFS4_OK:
            SETCLIENTID4resok      resok4;
    case NFS4ERR_CLID_INUSE:
            clientaddr4    client_using;
    default:
            void;
   };

16.33.4.  DESCRIPTION

   The client uses the SETCLIENTID operation to notify the server of its
   intention to use a particular client identifier, callback, and
   callback_ident for subsequent requests that entail creating lock,
   share reservation, and delegation state on the server.  Upon
   successful completion the server will return a shorthand client ID
   that, if confirmed via a separate step, will be used in subsequent
   file locking and file open requests.  Confirmation of the client ID
   must be done via the SETCLIENTID_CONFIRM operation, which returns the
   client ID and setclientid_confirm values, as verifiers, to the
   server.  Two verifiers are necessary because it is possible to use
   SETCLIENTID and SETCLIENTID_CONFIRM to modify the callback and
   callback_ident information but not the shorthand client ID.  In that
   event, the setclientid_confirm value is effectively the only
   verifier.

   The callback information provided in this operation will be used if
   the client is provided an open delegation at a future point.
   Therefore, the client must correctly reflect the program and port
   numbers for the callback program at the time SETCLIENTID is used.

   The callback_ident value is used by the server on the callback.  The
   client can leverage the callback_ident to eliminate the need for more
   than one callback RPC program number, while still being able to
   determine which server is initiating the callback.

16.33.5.  IMPLEMENTATION

   To describe the implementation of SETCLIENTID, we use the following
   notation.  Let:

   x  be the value of the client.id subfield of the SETCLIENTID4args
      structure.

   v  be the value of the client.verifier subfield of the
      SETCLIENTID4args structure.

   c  be the value of the client ID field returned in the
      SETCLIENTID4resok structure.

   k  represent the value combination of the callback and callback_ident
      fields of the SETCLIENTID4args structure.

   s  be the setclientid_confirm value returned in the SETCLIENTID4resok
      structure.

   { v, x, c, k, s }  be a quintuple for a client record.  A client
      record is confirmed if there has been a SETCLIENTID_CONFIRM
      operation to confirm it.  Otherwise, it is unconfirmed.  An
      unconfirmed record is established by a SETCLIENTID call.

   Since SETCLIENTID is a non-idempotent operation, let us assume that
   the server is implementing the duplicate request cache (DRC).

   When the server gets a SETCLIENTID { v, x, k } request, it processes
   it in the following manner.

   o  It first looks up the request in the DRC.  If there is a hit, it
      returns the result cached in the DRC.  The server does NOT remove
      client state (locks, shares, delegations), nor does it modify any
      recorded callback and callback_ident information for client { x }.

      For any DRC miss, the server takes the client ID string x, and
      searches for client records for x that the server may have
      recorded from previous SETCLIENTID calls.  For any confirmed
      record with the same id string x, if the recorded principal does
      not match that of the SETCLIENTID call, then the server returns an
      NFS4ERR_CLID_INUSE error.

      For brevity of discussion, the remaining description of the
      processing assumes that there was a DRC miss, and that where the
      server has previously recorded a confirmed record for client x,
      the aforementioned principal check has successfully passed.

   o  The server checks if it has recorded a confirmed record for { v,
      x, c, l, s }, where l may or may not equal k.  If so, and since
      the id verifier v of the request matches that which is confirmed
      and recorded, the server treats this as a probable callback
      information update and records an unconfirmed { v, x, c, k, t }
      and leaves the confirmed { v, x, c, l, s } in place, such that
      t != s.  It does not matter whether k equals l or not.  Any
      pre-existing unconfirmed { v, x, c, *, * } is removed.

      The server returns { c, t }.  It is indeed returning the old
      clientid4 value c, because the client apparently only wants to
      update callback value l to value k.  It's possible this request is
      one from a Byzantine router that has stale callback information,
      but this is not a problem.  The callback information update is
      only confirmed if followed up by a SETCLIENTID_CONFIRM { c, t }.

      The server awaits confirmation of k via SETCLIENTID_CONFIRM
      { c, t }.

      The server does NOT remove client (lock/share/delegation) state
      for x.

   o  The server has previously recorded a confirmed { u, x, c, l, s }
      record such that v != u, l may or may not equal k, and has not
      recorded any unconfirmed { *, x, *, *, * } record for x.  The
      server records an unconfirmed { v, x, d, k, t } (d != c, t != s).

      The server returns { d, t }.

      The server awaits confirmation of { d, k } via SETCLIENTID_CONFIRM
      { d, t }.

      The server does NOT remove client (lock/share/delegation) state
      for x.

   o  The server has previously recorded a confirmed { u, x, c, l, s }
      record such that v != u, l may or may not equal k, and recorded an
      unconfirmed { w, x, d, m, t } record such that c != d, t != s, m
      may or may not equal k, m may or may not equal l, and k may or may
      not equal l.  Whether w == v or w != v makes no difference.  The
      server simply removes the unconfirmed { w, x, d, m, t } record and
      replaces it with an unconfirmed { v, x, e, k, r } record, such
      that e != d, e != c, r != t, r != s.

      The server returns { e, r }.

      The server awaits confirmation of { e, k } via SETCLIENTID_CONFIRM
      { e, r }.

      The server does NOT remove client (lock/share/delegation) state
      for x.

   o  The server has no confirmed { *, x, *, *, * } for x.  It may or
      may not have recorded an unconfirmed { u, x, c, l, s }, where l
      may or may not equal k, and u may or may not equal v.  Any
      unconfirmed record { u, x, c, l, * }, regardless of whether u == v
      or l == k, is replaced with an unconfirmed record { v, x, d, k, t
      } where d != c, t != s.

      The server returns { d, t }.

      The server awaits confirmation of { d, k } via SETCLIENTID_CONFIRM
      { d, t }.  The server does NOT remove client (lock/share/
      delegation) state for x.

   The server generates the clientid and setclientid_confirm values and
   must take care to ensure that these values are extremely unlikely to
   ever be regenerated.
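
   The SETCLIENTID cases above can be condensed into a short model.
   This is a sketch under several assumptions: the DRC lookup is
   omitted (every call is treated as a DRC miss), the server keeps at
   most one unconfirmed record per id string x, principals are plain
   strings, and process-local counters stand in for values that are
   "extremely unlikely to ever be regenerated".  The class and field
   names are illustrative.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Record:
    v: bytes         # client.verifier
    x: bytes         # client.id
    c: int           # shorthand client ID
    k: tuple         # callback and callback_ident
    s: int           # setclientid_confirm
    principal: str


class Server:
    def __init__(self):
        self.confirmed = {}    # x -> Record
        self.unconfirmed = {}  # x -> Record (one slot per x, simplified)
        self._ids = itertools.count(1)
        self._verfs = itertools.count(1)

    def setclientid(self, v, x, k, principal):
        conf = self.confirmed.get(x)
        if conf is not None and conf.principal != principal:
            return ("NFS4ERR_CLID_INUSE", None)
        if conf is not None and conf.v == v:
            # Same verifier as the confirmed record: probable callback
            # update, so the old client ID c is returned.
            c = conf.c
        else:
            # Verifier differs, or nothing is confirmed: new client ID.
            c = next(self._ids)
        t = next(self._verfs)  # always differs from earlier verifiers
        # Any earlier unconfirmed record for x is replaced.  Confirmed
        # records and lock/share/delegation state are left untouched.
        self.unconfirmed[x] = Record(v, x, c, k, t, principal)
        return ("NFS4_OK", (c, t))
```

   Note that the last three cases in the list collapse into one branch
   here: each of them records a fresh unconfirmed record with a new
   client ID and a new setclientid_confirm value.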

16.34.  Operation 36: SETCLIENTID_CONFIRM - Confirm Client ID

16.34.1.  SYNOPSIS

     clientid, setclientid_confirm -> -

16.34.2.  ARGUMENT

   struct SETCLIENTID_CONFIRM4args {
           clientid4       clientid;
           verifier4       setclientid_confirm;
   };

16.34.3.  RESULT

   struct SETCLIENTID_CONFIRM4res {
           nfsstat4        status;
   };

16.34.4.  DESCRIPTION

   This operation is used by the client to confirm the results from a
   previous call to SETCLIENTID.  The client provides the client ID and
   setclientid_confirm verifier that the server supplied in the
   SETCLIENTID response.  The server responds with a simple status of
   success or failure.

16.34.5.  IMPLEMENTATION

   The client must use the SETCLIENTID_CONFIRM operation to confirm the
   following two distinct cases:

   o  The client's use of a new shorthand client identifier (as returned
      from the server in the response to SETCLIENTID), a new callback
      value (as specified in the arguments to SETCLIENTID), and a new
      callback_ident value (as specified in the arguments to
      SETCLIENTID).  The client's use of SETCLIENTID_CONFIRM in this
      case also confirms the removal of any of the client's previous
      relevant leased state.  Relevant leased client state includes
      byte-range locks, share reservations, and -- where the server does
      not support the CLAIM_DELEGATE_PREV claim type -- delegations.  If
      the server supports CLAIM_DELEGATE_PREV, then SETCLIENTID_CONFIRM
      MUST NOT remove delegations for this client; relevant leased
      client state would then just include byte-range locks and share
      reservations.

   o  The client's reuse of an old, previously confirmed shorthand
      client identifier; a new callback value; and a new callback_ident
      value.  The client's use of SETCLIENTID_CONFIRM in this case MUST
      NOT result in the removal of any previous leased state (locks,
      share reservations, and delegations).

   We use the same notation and definitions for v, x, c, k, s, and
   unconfirmed and confirmed client records as introduced in the
   description of the SETCLIENTID operation.  The arguments to
   SETCLIENTID_CONFIRM are indicated by the notation { c, s }, where c
   is a value of type clientid4, and s is a value of type verifier4
   corresponding to the setclientid_confirm field.

   As with SETCLIENTID, SETCLIENTID_CONFIRM is a non-idempotent
   operation, and we assume that the server is implementing the
   duplicate request cache (DRC).

   When the server gets a SETCLIENTID_CONFIRM { c, s } request, it
   processes it in the following manner.

   o  It first looks up the request in the DRC.  If there is a hit, it
      returns the result cached in the DRC.  The server does not remove
      any relevant leased client state, nor does it modify any recorded
      callback and callback_ident information for client { x } as
      represented by the shorthand value c.

   For a DRC miss, the server checks for client records that match the
   shorthand value c.  The processing cases are as follows:

   o  The server has recorded an unconfirmed { v, x, c, k, s } record
      and a confirmed { v, x, c, l, t } record, such that s != t.  If
      the principals of the records do not match that of the
      SETCLIENTID_CONFIRM, the server returns NFS4ERR_CLID_INUSE, and no
      relevant leased client state is removed and no recorded callback
      and callback_ident information for client { x } is changed.
      Otherwise, the confirmed { v, x, c, l, t } record is removed and
      the unconfirmed { v, x, c, k, s } is marked as confirmed, thereby
      modifying recorded and confirmed callback and callback_ident
      information for client { x }.

      The server does not remove any relevant leased client state.

      The server returns NFS4_OK.

   o  The server has not recorded an unconfirmed { v, x, c, *, * } and
      has recorded a confirmed { v, x, c, *, s }.  If the principals of
      the record and of SETCLIENTID_CONFIRM do not match, the server
      returns NFS4ERR_CLID_INUSE without removing any relevant leased
      client state, and without changing recorded callback and
      callback_ident values for client { x }.

      If the principals match, then what has likely happened is that the
      client never got the response from the SETCLIENTID_CONFIRM, and
      the DRC entry has been purged.  Whatever the scenario, since the
      principals match, as well as { c, s } matching a confirmed record,
      the server leaves client x's relevant leased client state intact,
      leaves its callback and callback_ident values unmodified, and
      returns NFS4_OK.

   o  The server has not recorded a confirmed { *, *, c, *, * } and has
      recorded an unconfirmed { *, x, c, k, s }.  This may be a retry
      from the client, but if so, the client's first
      SETCLIENTID_CONFIRM attempt was not received by the server.  The
      server cannot tell whether the request is a retry, so it
      processes it as if it were a first try.  If the principal of the
      unconfirmed { *, x, c, k, s } record does not match that of the
      SETCLIENTID_CONFIRM request, the server returns
      NFS4ERR_CLID_INUSE without removing any relevant leased client
      state.

      Otherwise, the server records a confirmed { *, x, c, k, s }.  If
      there is also a confirmed { *, x, d, *, t }, the server MUST
      remove client x's relevant leased client state and overwrite the
      callback state with k.  The confirmed record { *, x, d, *, t } is
      removed.

      The server returns NFS4_OK.

   o  The server has no record of a confirmed or unconfirmed { *, *, c,
      *, s }.  The server returns NFS4ERR_STALE_CLIENTID.  The server
      does not remove any relevant leased client state, nor does it
      modify any recorded callback and callback_ident information for
      any client.
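
   The four DRC-miss cases can be sketched as a single function.  The
   data layout is an assumption made for the example: confirmed and
   unconfirmed records are held in dictionaries keyed by the id string
   x, each record is a dictionary with the keys c, k, s, and
   principal, and "leased" maps x to that client's leased state.  The
   delegation handling tied to CLAIM_DELEGATE_PREV is not modeled.

```python
def setclientid_confirm(confirmed, unconfirmed, leased, c, s, principal):
    """Process SETCLIENTID_CONFIRM { c, s } after a DRC miss."""
    # Unconfirmed record matching { c, s }, and confirmed record for c.
    u_x = next((x for x, r in unconfirmed.items()
                if r["c"] == c and r["s"] == s), None)
    f_x = next((x for x, r in confirmed.items() if r["c"] == c), None)

    if u_x is not None:
        if unconfirmed[u_x]["principal"] != principal:
            return "NFS4ERR_CLID_INUSE"
        if f_x is not None and confirmed[f_x]["principal"] != principal:
            return "NFS4ERR_CLID_INUSE"
        if f_x is None and u_x in confirmed:
            # A confirmed { *, x, d, *, t } with d != c exists: this is
            # a new client instance, so the old leased state goes.
            leased.pop(u_x, None)
        # Otherwise this is a callback update; leased state is kept.
        confirmed[u_x] = unconfirmed.pop(u_x)
        return "NFS4_OK"

    if f_x is not None and confirmed[f_x]["s"] == s:
        # Likely a retransmission whose DRC entry was purged.
        if confirmed[f_x]["principal"] != principal:
            return "NFS4ERR_CLID_INUSE"
        return "NFS4_OK"

    return "NFS4ERR_STALE_CLIENTID"
```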

   The server needs to cache unconfirmed { v, x, c, k, s } client
   records and await their confirmation for some time.  As should be
   clear from the discussions of record processing for SETCLIENTID and
   SETCLIENTID_CONFIRM, there are cases where the server does not
   deterministically remove unconfirmed client records.  To avoid
   running out of resources, the server is not required to hold
   unconfirmed records indefinitely.  One strategy the server might use
   is to set a limit on how many unconfirmed client records it will
   maintain and then, when the limit would be exceeded, remove the
   oldest record.  Another strategy might be to remove an unconfirmed
   record when some amount of time has elapsed.  The choice of the
   amount of time is fairly arbitrary, but it is surely no higher than
   the server's lease time period.  Consider that leases need to be
   renewed before the lease time expires via an operation from the
   client.  If the client cannot issue a SETCLIENTID_CONFIRM after a
   SETCLIENTID before a period of time equal to a lease expiration time,
   then the client is unlikely to be able to maintain state on the
   server during steady-state operation.
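
   Both strategies, a cap on the number of unconfirmed records and an
   age limit, can be combined in one small cache.  The class below is
   a sketch only; its name, its parameters, and the choice to pass
   the current time in explicitly are assumptions made for the
   example.

```python
from collections import OrderedDict


class UnconfirmedCache:
    def __init__(self, max_records, max_age):
        self.max_records = max_records
        self.max_age = max_age          # seconds; at most the lease time
        self._records = OrderedDict()   # key -> (created_at, record)

    def add(self, key, record, now):
        self._records.pop(key, None)    # re-adding moves key to the end
        self._records[key] = (now, record)
        while len(self._records) > self.max_records:
            self._records.popitem(last=False)  # remove the oldest record

    def get(self, key, now):
        entry = self._records.get(key)
        if entry is None:
            return None
        created_at, record = entry
        if now - created_at > self.max_age:
            del self._records[key]      # too old: treat as removed
            return None
        return record
```

   A lookup that returns None here corresponds to the server replying
   with NFS4ERR_STALE_CLIENTID to a late SETCLIENTID_CONFIRM.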

   If the client does send a SETCLIENTID_CONFIRM for an unconfirmed
   record that the server has already deleted, the client will get
   NFS4ERR_STALE_CLIENTID back.  If so, the client should then start
   over, and send SETCLIENTID to re-establish an unconfirmed client
   record and get back an unconfirmed client ID and setclientid_confirm
   verifier.  The client should then send the SETCLIENTID_CONFIRM to
   confirm the client ID.

   SETCLIENTID_CONFIRM does not establish or renew a lease.  However, if
   SETCLIENTID_CONFIRM removes relevant leased client state, and that
   state does not include existing delegations, the server MUST allow
   the client a period of time no less than the value of the lease_time
   attribute, to reclaim (via the CLAIM_DELEGATE_PREV claim type of the
   OPEN operation) its delegations before removing unreclaimed
   delegations.

16.35.  Operation 37: VERIFY - Verify Same Attributes

16.35.1.  SYNOPSIS

     (cfh), fattr -> -

16.35.2.  ARGUMENT

   struct VERIFY4args {
           /* CURRENT_FH: object */
           fattr4          obj_attributes;
   };

16.35.3.  RESULT

   struct VERIFY4res {
           nfsstat4        status;
   };

16.35.4.  DESCRIPTION

   The VERIFY operation is used to verify that attributes have a value
   assumed by the client before proceeding with subsequent operations in
   the COMPOUND request.  If any of the attributes do not match, then
   the error NFS4ERR_NOT_SAME must be returned.  The current filehandle
   retains its value after successful completion of the operation.

16.35.5.  IMPLEMENTATION

   One possible use of the VERIFY operation is the following COMPOUND
   sequence.  With this, the client is attempting to verify that the
   file being removed will match what the client expects to be removed.
   This sequence can help prevent the unintended deletion of a file.

     PUTFH (directory filehandle)
     LOOKUP (filename)
     VERIFY (filehandle == fh)
     PUTFH (directory filehandle)
     REMOVE (filename)

   This sequence does not prevent a second client from removing and
   creating a new file in the middle of this sequence, but it does help
   avoid the unintended result.

   In the case that a RECOMMENDED attribute is specified in the VERIFY
   operation and the server does not support that attribute for the file
   system object, the error NFS4ERR_ATTRNOTSUPP is returned to the
   client.

   When the attribute rdattr_error or any write-only attribute (e.g.,
   time_modify_set) is specified, the error NFS4ERR_INVAL is returned to
   the client.

16.36.  Operation 38: WRITE - Write to File

16.36.1.  SYNOPSIS

     (cfh), stateid, offset, stable, data -> count, committed, writeverf

16.36.2.  ARGUMENT

   enum stable_how4 {
           UNSTABLE4       = 0,
           DATA_SYNC4      = 1,
           FILE_SYNC4      = 2
   };

   struct WRITE4args {
           /* CURRENT_FH: file */
           stateid4        stateid;
           offset4         offset;
           stable_how4     stable;
           opaque          data<>;
   };

16.36.3.  RESULT

   struct WRITE4resok {
           count4          count;
           stable_how4     committed;
           verifier4       writeverf;
   };

   union WRITE4res switch (nfsstat4 status) {
    case NFS4_OK:
            WRITE4resok    resok4;
    default:
            void;
   };

16.36.4.  DESCRIPTION

   The WRITE operation is used to write data to a regular file.  The
   target file is specified by the current filehandle.  The offset
   specifies the offset where the data should be written.  An offset of
   0 (zero) specifies that the write should start at the beginning of
   the file.  The count, as encoded as part of the opaque data
   parameter, represents the number of bytes of data that are to be
   written.  If the count is 0 (zero), the WRITE will succeed and return
   a count of 0 (zero) subject to permissions checking.  The server may
   choose to write fewer bytes than requested by the client.

   Part of the WRITE request is a specification of how the WRITE is to
   be performed.  The client specifies with the stable parameter the
   method of how the data is to be processed by the server.  If stable
   is FILE_SYNC4, the server must commit the data written plus all file
   system metadata to stable storage before returning results.  This
   corresponds to the NFSv2 protocol semantics.  Any other behavior
   constitutes a protocol violation.  If stable is DATA_SYNC4, then the
   server must commit all of the data to stable storage and enough of
   the metadata to retrieve the data before returning.  The server
   implementer is free to implement DATA_SYNC4 in the same fashion as
   FILE_SYNC4, but with a possible performance drop.  If stable is
   UNSTABLE4, the server is free to commit any part of the data and the
   metadata to stable storage, including all or none, before returning a
   reply to the client.  There is no guarantee whether or when any
   uncommitted data will subsequently be committed to stable storage.
   The only guarantees made by the server are that it will not destroy
   any data without changing the value of writeverf and that it will
   not commit the data and metadata at a level less than that requested
   by the client.

   The stateid value for a WRITE request represents a value returned
   from a previous byte-range lock or share reservation request or the
   stateid associated with a delegation.  The stateid is used by the
   server to verify that the associated share reservation and any
   byte-range locks are still valid and to update lease timeouts for the
   client.

   Upon successful completion, the following results are returned.  The
   count result is the number of bytes of data written to the file.  The
   server may write fewer bytes than requested.  If so, the actual
   number of bytes written, starting at the given offset, is returned.

   The server also returns an indication of the level of commitment of
   the data and metadata via committed.  If the server committed all
   data and metadata to stable storage, committed should be set to
   FILE_SYNC4.  If the level of commitment was at least as strong as
   DATA_SYNC4, then committed should be set to DATA_SYNC4.  Otherwise,
   committed must be returned as UNSTABLE4.  If stable was FILE_SYNC4,
   then committed must also be FILE_SYNC4: anything else constitutes a
   protocol violation.  If stable was DATA_SYNC4, then committed may be
   FILE_SYNC4 or DATA_SYNC4: anything else constitutes a protocol
   violation.  If stable was UNSTABLE4, then committed may be any of
   FILE_SYNC4, DATA_SYNC4, or UNSTABLE4.
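
   Because the stable_how4 values are ordered by strength, the rules
   relating stable and committed reduce to a single comparison.  The
   function name below is an assumption; the constants are those of
   the stable_how4 enum.

```python
UNSTABLE4, DATA_SYNC4, FILE_SYNC4 = 0, 1, 2


def committed_is_valid(stable, committed):
    """True if a reply's committed level is allowed for the request's
    stable level: the server may commit more strongly, never less."""
    levels = (UNSTABLE4, DATA_SYNC4, FILE_SYNC4)
    return stable in levels and committed in levels and committed >= stable
```

   A client could treat a False result as a protocol violation by the
   server.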

   The final portion of the result is the write verifier.  The write
   verifier is a cookie that the client can use to determine whether the
   server has changed instance (boot) state between a call to WRITE and
   a subsequent call to either WRITE or COMMIT.  This cookie must be
   consistent during a single instance of the NFSv4 protocol service and
   must be unique between instances of the NFSv4 protocol server, where
   uncommitted data may be lost.

   If a client writes data to the server with the stable argument set to
   UNSTABLE4 and the reply yields a committed response of DATA_SYNC4 or
   UNSTABLE4, the client will follow up at some time in the future with
   a COMMIT operation to synchronize outstanding asynchronous data and
   metadata with the server's stable storage, barring client error.  It
   is possible that due to client crash or other error a subsequent
   COMMIT will not be received by the server.

   For a WRITE using the special anonymous stateid, the server MAY allow
   the WRITE to be serviced subject to mandatory file locks or the
   current share deny modes for the file.  For a WRITE using the special
   READ bypass stateid, the server MUST NOT allow the WRITE operation to
   bypass locking checks at the server, and the WRITE is treated exactly
   the same as if the anonymous stateid were used.

   On success, the current filehandle retains its value.

16.36.5.  IMPLEMENTATION

   It is possible for the server to write fewer bytes of data than
   requested by the client.  In this case, the server should not return
   an error unless no data was written at all.  If the server writes
   less than the number of bytes specified, the client should issue
   another WRITE to write the remaining data.
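
   A client-side loop that handles short writes might look like the
   following sketch.  The write argument is a hypothetical callable
   standing in for one WRITE operation; it takes an offset and the
   data and returns the count from the reply.

```python
def write_all(write, offset, data):
    """Issue WRITEs until all of data has been accepted.

    write(offset, chunk) models one WRITE and returns the number of
    bytes the server reported as written."""
    done = 0
    while done < len(data):
        count = write(offset + done, data[done:])
        if count <= 0:
            # Guard against looping forever on a misbehaving server.
            raise IOError("WRITE made no progress")
        done += count
    return done
```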

   It is assumed that the act of writing data to a file will cause the
   time_modify attribute of the file to be updated.  However, the
   time_modify attribute of the file should not be changed unless the
   contents of the file are changed.  Thus, a WRITE request with count
   set to 0 should not cause the time_modify attribute of the file to be
   updated.

   The definition of stable storage has historically been a point of
   contention.  The following expected properties of stable storage may
   help in resolving design issues in the implementation.  Stable
   storage is persistent storage that survives:

   1.  Repeated power failures.

   2.  Hardware failures (of any board, power supply, etc.).

   3.  Repeated software crashes, including reboot cycle.

   This definition does not address failure of the stable storage module
   itself.

   The verifier is defined to allow a client to detect different
   instances of an NFSv4 protocol server over which cached, uncommitted
   data may be lost.  In the most likely case, the verifier allows the
   client to detect server reboots.  This information is required so
   that the client can safely determine whether the server could have
   lost cached data.  If the server fails unexpectedly and the client
   has uncommitted data from previous WRITE requests (done with the
   stable argument set to UNSTABLE4 and in which the result committed
   was returned as UNSTABLE4 as well), the server may not have flushed
   that cached data to stable storage.  The burden of recovery is on the
   client, and the client will need to retransmit the data to the
   server.
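
   One way a client might track uncommitted data against the verifier
   is sketched below.  UncommittedTracker and its methods are
   hypothetical names; the sketch assumes that only data acknowledged
   as UNSTABLE4 needs to be retained, and that the caller retransmits
   whatever is returned as lost.

```python
UNSTABLE4, DATA_SYNC4, FILE_SYNC4 = 0, 1, 2  # stable_how4 values


class UncommittedTracker:
    """Hypothetical client-side record of data not yet committed."""

    def __init__(self) -> None:
        self.verifier = None
        self.pending = {}  # offset -> data acknowledged as UNSTABLE4

    def _check_verifier(self, verifier: bytes) -> list:
        """Return pending data presumed lost if the verifier changed."""
        lost = []
        if self.verifier is not None and verifier != self.verifier:
            lost = sorted(self.pending.items())
            self.pending.clear()
        self.verifier = verifier
        return lost

    def on_write_reply(self, offset: int, data: bytes,
                       committed: int, verifier: bytes) -> list:
        """Record a WRITE reply; return (offset, data) pairs to resend."""
        lost = self._check_verifier(verifier)
        if committed == UNSTABLE4:
            self.pending[offset] = data
        return lost

    def on_commit_reply(self, verifier: bytes) -> list:
        """Record a COMMIT reply; return (offset, data) pairs to resend."""
        lost = self._check_verifier(verifier)
        self.pending.clear()  # same verifier: the data is now stable
        return lost
```

   A changed verifier on any reply causes the earlier UNSTABLE4 data to
   be handed back for retransmission.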

   One suggested way to use the verifier would be to use the time that
   the server was booted or the time the server was last started (if
   restarting the server without a reboot results in lost buffers).
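
   A minimal sketch of deriving such a verifier from the server start
   time follows.  The packing of seconds and microseconds into the
   8-byte verifier is an assumption made for illustration; any encoding
   that changes across server instances would serve.

```python
import struct

NFS4_VERIFIER_SIZE = 8  # size of verifier4 in bytes


def make_write_verifier(start_time: float) -> bytes:
    """Pack a start time (seconds since the epoch) into 8 bytes.

    The seconds/microseconds layout is an illustrative assumption.
    """
    seconds = int(start_time)
    micros = int((start_time - seconds) * 1_000_000)
    return struct.pack(">II", seconds & 0xFFFFFFFF, micros)
```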

   The committed field in the results allows the client to do more
   effective caching.  If the server is committing all WRITE requests to
   stable storage, then it should return with committed set to
   FILE_SYNC4, regardless of the value of the stable field in the
   arguments.  A server that uses an NVRAM accelerator may choose to
   implement this policy.  The client can use this to increase the
   effectiveness of the cache by discarding cached data that has already
   been committed on the server.
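
   The following sketch illustrates both sides of this policy.  The
   names reply_committed and WriteCache are hypothetical.  The server
   function returns the requested level when it does not commit
   everything, which is only one permissible choice.  The client cache
   conservatively discards data only on FILE_SYNC4.

```python
UNSTABLE4, DATA_SYNC4, FILE_SYNC4 = 0, 1, 2  # stable_how4 values


def reply_committed(requested: int, commits_all_writes: bool) -> int:
    """Server side: choose the committed value for a WRITE reply."""
    if commits_all_writes:
        return FILE_SYNC4  # regardless of the stable argument
    return requested


class WriteCache:
    """Hypothetical client cache of written data, keyed by offset."""

    def __init__(self) -> None:
        self.blocks = {}

    def store(self, offset: int, data: bytes) -> None:
        self.blocks[offset] = data

    def on_write_reply(self, offset: int, committed: int) -> None:
        # Data already committed on the server need not be retained.
        if committed == FILE_SYNC4:
            self.blocks.pop(offset, None)
```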

   Some implementations may return NFS4ERR_NOSPC instead of
   NFS4ERR_DQUOT when a user's quota is exceeded.  If the current
   filehandle is a directory, the server will return NFS4ERR_ISDIR.  If
   the current filehandle is neither a regular file nor a directory, the
   server will return NFS4ERR_INVAL.
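
   These checks might be sketched as follows.  The function names are
   hypothetical.  The client helper reflects the assumption that a
   client treats NFS4ERR_NOSPC and NFS4ERR_DQUOT alike, since it cannot
   rely on the server to distinguish them.

```python
NFS4_OK = 0
NFS4ERR_ISDIR = 21
NFS4ERR_INVAL = 22
NFS4ERR_NOSPC = 28
NFS4ERR_DQUOT = 69

NF4REG, NF4DIR, NF4LNK = 1, 2, 5  # a subset of nfs_ftype4


def check_write_target(ftype: int) -> int:
    """Server side: status for a WRITE based on the object type."""
    if ftype == NF4REG:
        return NFS4_OK
    if ftype == NF4DIR:
        return NFS4ERR_ISDIR
    return NFS4ERR_INVAL


def is_out_of_space(status: int) -> bool:
    """Client side: either status may indicate an exceeded quota."""
    return status in (NFS4ERR_NOSPC, NFS4ERR_DQUOT)
```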

   If mandatory file locking is on for the file, and a byte range
   corresponding to the data to be written to the file is read or write
   locked by an owner that is not associated with the stateid, the
   server will return NFS4ERR_LOCKED.  If so, the client must check
   whether the owner corresponding to the stateid used with the WRITE
   operation has a conflicting read lock that overlaps with the region
   that was to be written.  If the stateid's owner has no conflicting
   read lock, then the client should try to get the appropriate write
   byte-range lock via the LOCK operation before re-attempting the
   WRITE.  When the WRITE completes, the client should release the
   byte-range lock via LOCKU.

   If the stateid's owner had a conflicting read lock, then the client
   has no choice but to return an error to the application that
   attempted the WRITE.  The reason is that since the stateid's owner
   had a read lock, the server either (1) attempted to temporarily
   effectively upgrade this read lock to a write lock or (2) has no
   upgrade capability.  If the server attempted to upgrade the read lock
   and failed, it is pointless for the client to re-attempt the upgrade
   via the LOCK operation, because there might be another client also
   trying to upgrade.  If two clients are blocked trying to upgrade the
   same lock, the clients deadlock.  If the server has no upgrade
   capability, then it is pointless to try a LOCK operation to upgrade.
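
   The client's decision after receiving NFS4ERR_LOCKED can be
   sketched as below.  The function names and the string results are
   hypothetical.  Ranges are (offset, length) pairs with a non-zero
   length, and the special all-ones length meaning "to end of file" is
   not handled in this simplified sketch.

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # (offset, length), with length > 0


def ranges_overlap(a: Range, b: Range) -> bool:
    """True if two byte ranges share at least one byte."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def after_locked_error(owner_read_locks: Iterable[Range],
                       write_range: Range) -> str:
    """Decide what to do after a WRITE fails with NFS4ERR_LOCKED.

    owner_read_locks are the read locks held by the owner of the
    stateid that was used with the WRITE.
    """
    for lock in owner_read_locks:
        if ranges_overlap(lock, write_range):
            # The owner holds a conflicting read lock: report the
            # error to the application rather than attempt an upgrade.
            return "fail"
    # No conflicting read lock: LOCK, retry the WRITE, then LOCKU.
    return "lock_then_retry"
```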

16.37.  Operation 39: RELEASE_LOCKOWNER - Release Lock-Owner State

16.37.1.  SYNOPSIS

     lock-owner -> ()

16.37.2.  ARGUMENT

   struct RELEASE_LOCKOWNER4args {
           lock_owner4     lock_owner;
   };

16.37.3.  RESULT

   struct RELEASE_LOCKOWNER4res {
           nfsstat4        status;
   };

16.37.4.  DESCRIPTION

   This operation is used to notify the server that the lock_owner is no
   longer in use by the client and that future client requests will not
   reference this lock_owner.  This allows the server to release cached
   state related to the specified lock_owner.  If file locks associated
   with the lock_owner are held at the server, the error
   NFS4ERR_LOCKS_HELD will be returned and no further action will be
   taken.

16.37.5.  IMPLEMENTATION

   The client may choose to use this operation to reduce the amount of
   server state that is held.  Information that can be released when a
   RELEASE_LOCKOWNER is done includes the specified lock-owner string,
   the seqid associated with the lock-owner, any saved reply for the
   lock-owner, and any lock stateids associated with that lock-owner.

   Depending on the behavior of applications at the client, it may be
   important for the client to use this operation since the server
   has certain obligations with respect to holding a reference to
   lock-owner-associated state as long as an associated file is open.
   Therefore, if the client knows for certain that the lock_owner will
   no longer be used to either reference existing lock stateids
   associated with the lock-owner or create new ones, it should use
   RELEASE_LOCKOWNER.
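
   A server-side sketch of RELEASE_LOCKOWNER processing is given
   below.  LockOwnerTable and its fields are hypothetical.  Returning
   NFS4_OK for a lock-owner the server does not know is an assumption
   of this sketch.

```python
NFS4_OK = 0
NFS4ERR_LOCKS_HELD = 10037


class LockOwnerTable:
    """Hypothetical per-server table of lock-owner state."""

    def __init__(self) -> None:
        self.owners = {}  # (clientid, owner) -> state dict

    def add(self, clientid: int, owner: bytes, held_locks=()) -> None:
        self.owners[(clientid, owner)] = {
            "seqid": 0,
            "saved_reply": None,
            "stateids": [],
            "held_locks": list(held_locks),
        }

    def release_lockowner(self, clientid: int, owner: bytes) -> int:
        key = (clientid, owner)
        entry = self.owners.get(key)
        if entry is None:
            return NFS4_OK  # assumption: nothing to release
        if entry["held_locks"]:
            # Locks are still held: take no further action.
            return NFS4ERR_LOCKS_HELD
        # Drop the owner string, seqid, saved reply, and lock stateids.
        del self.owners[key]
        return NFS4_OK
```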

16.38.  Operation 10044: ILLEGAL - Illegal Operation

16.38.1.  SYNOPSIS

     <null> -> ()

16.38.2.  ARGUMENT

     void;

16.38.3.  RESULT

   struct ILLEGAL4res {
           nfsstat4        status;
   };

16.38.4.  DESCRIPTION

   This operation is a placeholder for encoding a result to handle the
   case of the client sending an operation code within COMPOUND that is
   not supported.  See Section 15.2.4 for more details.

   The status field of ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL.

16.38.5.  IMPLEMENTATION

   A client will probably not send an operation with code OP_ILLEGAL,
   but if it does, the response will be ILLEGAL4res, just as it would be
   with any other invalid operation code.  Note that if the server gets
   an illegal operation code that is not OP_ILLEGAL, and if the server
   checks for legal operation codes during the XDR decode phase, then
   the ILLEGAL4res would not be returned.
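
   As a sketch of the first case, a server that checks operation codes
   after XDR decoding might build the result as follows.  The
   SUPPORTED_OPS table is deliberately partial and, like the function
   name, is hypothetical.

```python
OP_ILLEGAL = 10044
NFS4ERR_OP_ILLEGAL = 10044

# A deliberately partial table of operation codes, for illustration.
SUPPORTED_OPS = {
    24: "PUTROOTFH",
    25: "READ",
    38: "WRITE",
    39: "RELEASE_LOCKOWNER",
}


def illegal_result_for(opcode: int):
    """Return (result opcode, status) for an unsupported opcode.

    Returns None when the opcode is supported and should be dispatched
    to its normal handler.
    """
    if opcode in SUPPORTED_OPS:
        return None
    # The result is encoded as ILLEGAL4res, whether the client sent
    # OP_ILLEGAL itself or any other invalid operation code.
    return (OP_ILLEGAL, NFS4ERR_OP_ILLEGAL)
```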
