
RFC 5661

Network File System (NFS) Version 4 Minor Version 1 Protocol

Pages: 617
Obsoleted by:  8881
Updated by:  8178, 8434

1. Introduction

1.1. The NFS Version 4 Minor Version 1 Protocol

The NFS version 4 minor version 1 (NFSv4.1) protocol is the second minor version of the NFS version 4 (NFSv4) protocol. The first minor version, NFSv4.0, is described in [30]. NFSv4.1 generally follows the guidelines for minor versioning that are listed in Section 10 of RFC 3530. However, it diverges from guidelines 11 ("a client and server that support minor version X must support minor versions 0 through X-1") and 12 ("no new features may be introduced as mandatory in a minor version"). These divergences are due to the introduction of the sessions model for managing non-idempotent operations and the RECLAIM_COMPLETE operation. These two new features are infrastructural in nature and simplify implementation of existing and other new features. Making them anything but REQUIRED would add undue complexity to protocol definition and implementation. NFSv4.1 accordingly updates the minor versioning guidelines (Section 2.7).

As a minor version, NFSv4.1 is consistent with the overall goals for NFSv4, but extends the protocol so as to better meet those goals, based on experiences with NFSv4.0. In addition, NFSv4.1 has adopted some additional goals, which motivate some of the major extensions in NFSv4.1.

1.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [1].

1.3. Scope of This Document

This document describes the NFSv4.1 protocol. With respect to NFSv4.0, this document does not:

   o  describe the NFSv4.0 protocol, except where needed to contrast
      with NFSv4.1.

   o  modify the specification of the NFSv4.0 protocol.

   o  clarify the NFSv4.0 protocol.

1.4. NFSv4 Goals

The NFSv4 protocol is a further revision of the NFS protocol defined already by NFSv3 [31]. It retains the essential characteristics of previous versions: easy recovery; independence of transport protocols, operating systems, and file systems; simplicity; and good performance. NFSv4 has the following goals:

   o  Improved access and good performance on the Internet

      The protocol is designed to transit firewalls easily, perform well
      where latency is high and bandwidth is low, and scale to very
      large numbers of clients per server.

   o  Strong security with negotiation built into the protocol

      The protocol builds on the work of the ONCRPC working group in
      supporting the RPCSEC_GSS protocol.  Additionally, the NFSv4.1
      protocol provides a mechanism to allow clients and servers the
      ability to negotiate security and require clients and servers to
      support a minimal set of security schemes.

   o  Good cross-platform interoperability

      The protocol features a file system model that provides a useful,
      common set of features that does not unduly favor one file system
      or operating system over another.

   o  Designed for protocol extensions

      The protocol is designed to accept standard extensions within a
      framework that enables and encourages backward compatibility.

1.5. NFSv4.1 Goals

NFSv4.1 has the following goals, within the framework established by the overall NFSv4 goals.

   o  To correct significant structural weaknesses and oversights
      discovered in the base protocol.

   o  To add clarity and specificity to areas left unaddressed or not
      addressed in sufficient detail in the base protocol.  However, as
      stated in Section 1.3, it is not a goal to clarify the NFSv4.0
      protocol in the NFSv4.1 specification.

   o  To add specific features based on experience with the existing
      protocol and recent industry developments.

   o  To provide protocol support to take advantage of clustered server
      deployments including the ability to provide scalable parallel
      access to files distributed among multiple servers.

1.6. General Definitions

The following definitions provide an appropriate context for the reader.

   Byte:  In this document, a byte is an octet, i.e., a datum exactly 8
      bits in length.

   Client:  The client is the entity that accesses the NFS server's
      resources.  The client may be an application that contains the
      logic to access the NFS server directly.  The client may also be
      the traditional operating system client that provides remote file
      system services for a set of applications.

      A client is uniquely identified by a client owner.

      With reference to byte-range locking, the client is also the
      entity that maintains a set of locks on behalf of one or more
      applications.  This client is responsible for crash or failure
      recovery for those locks it manages.

      Note that multiple clients may share the same transport and
      connection and multiple clients may exist on the same network
      node.

   Client ID:  The client ID is a 64-bit quantity used as a unique,
      short-hand reference to a client-supplied verifier and client
      owner.  The server is responsible for supplying the client ID.

   Client Owner:  The client owner is a unique string, opaque to the
      server, that identifies a client.  Multiple network connections
      and source network addresses originating from those connections
      may share a client owner.  The server is expected to treat
      requests from connections with the same client owner as coming
      from the same client.

   File System:  The file system is the collection of objects on a
      server (as identified by the major identifier of a server owner,
      which is defined later in this section) that share the same fsid
      attribute (see Section 5.8.1.9).

   Lease:  A lease is an interval of time defined by the server for
      which the client is irrevocably granted locks.  At the end of a
      lease period, locks may be revoked if the lease has not been
      extended.  A lock must be revoked if a conflicting lock has been
      granted after the lease interval.

      A server grants a client a single lease for all state.

   Lock:  The term "lock" is used to refer to byte-range (in UNIX
      environments, also known as record) locks, share reservations,
      delegations, or layouts unless specifically stated otherwise.

   Secret State Verifier (SSV):  The SSV is a unique secret key shared
      between a client and server.  The SSV serves as the secret key for
      an internal (that is, internal to NFSv4.1) Generic Security
      Services (GSS) mechanism (the SSV GSS mechanism; see
      Section 2.10.9).  The SSV GSS mechanism uses the SSV to compute
      message integrity code (MIC) and Wrap tokens.  See
      Section 2.10.8.3 for more details on how NFSv4.1 uses the SSV and
      the SSV GSS mechanism.

   Server:  The Server is the entity responsible for coordinating client
      access to a set of file systems and is identified by a server
      owner.  A server can span multiple network addresses.

   Server Owner:  The server owner identifies the server to the client.
      The server owner consists of a major identifier and a minor
      identifier.  When the client has two connections each to a peer
      with the same major identifier, the client assumes that both peers
      are the same server (the server namespace is the same via each
      connection) and that lock state is sharable across both
      connections.  When each peer has both the same major and minor
      identifiers, the client assumes that each connection might be
      associable with the same session.

   Stable Storage:  Stable storage is storage from which data stored by
      an NFSv4.1 server can be recovered without data loss from multiple
      power failures (including cascading power failures, that is,
      several power failures in quick succession), operating system
      failures, and/or hardware failure of components other than the
      storage medium itself (such as disk, nonvolatile RAM, flash
      memory, etc.).

      Some examples of stable storage that are allowable for an NFS
      server include:
      1.  Media commit of data; that is, the modified data has been
          successfully written to the disk media, for example, the disk
          platter.

      2.  An immediate reply disk drive with battery-backed, on-drive
          intermediate storage or uninterruptible power system (UPS).

      3.  Server commit of data with battery-backed intermediate storage
          and recovery software.

      4.  Cache commit with uninterruptible power system (UPS) and
          recovery software.

   Stateid:  A stateid is a 128-bit quantity returned by a server that
      uniquely defines the open and locking states provided by the
      server for a specific open-owner or lock-owner/open-owner pair for
      a specific file and type of lock.

   Verifier:  A verifier is a 64-bit quantity generated by the client
      that the server can use to determine if the client has restarted
      and lost all previous lock state.
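
The Server Owner definition in this section implies a simple decision rule for a client that holds connections to two peers. The following is a minimal sketch of that rule, not part of the protocol: the `ServerOwner` class, its field types, and the `classify_peers` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOwner:
    """Illustrative stand-in for a server owner (field types are assumed)."""
    major_id: bytes  # major identifier
    minor_id: int    # minor identifier


def classify_peers(a: ServerOwner, b: ServerOwner) -> str:
    """Return what a client may assume about two peers it is connected to."""
    if a.major_id != b.major_id:
        # Different major identifiers: no basis to treat the peers as one server.
        return "different servers"
    if a.minor_id != b.minor_id:
        # Same major identifier only: same server, lock state is sharable.
        return "same server; lock state sharable"
    # Same major and minor identifiers: a shared session might be possible.
    return "same server; connections might share a session"
```

Note that the last case only says the connections might be associable with the same session; the sketch does not model how a client would confirm that.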

1.7. Overview of NFSv4.1 Features

The major features of the NFSv4.1 protocol will be reviewed in brief. This will be done to provide an appropriate context for both the reader who is familiar with the previous versions of the NFS protocol and the reader who is new to the NFS protocols.

For the reader new to the NFS protocols, there is still a set of fundamental knowledge that is expected. The reader should be familiar with the External Data Representation (XDR) and Remote Procedure Call (RPC) protocols as described in [2] and [3]. A basic knowledge of file systems and distributed file systems is expected as well.

In general, this specification of NFSv4.1 will not distinguish those features added in minor version 1 from those present in the base protocol but will treat NFSv4.1 as a unified whole. See Section 1.8 for a summary of the differences between NFSv4.0 and NFSv4.1.

1.7.1. RPC and Security

As with previous versions of NFS, the External Data Representation (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFSv4.1 protocol are those defined in [2] and [3]. To meet end-to-end security requirements, the RPCSEC_GSS framework [4] is used to extend the basic RPC security. With the use of RPCSEC_GSS, various mechanisms can be provided to offer authentication, integrity, and
   privacy to the NFSv4 protocol.  Kerberos V5 is used as described in
   [5] to provide one security framework.  With the use of RPCSEC_GSS,
   other mechanisms may also be specified and used for NFSv4.1 security.

   To enable in-band security negotiation, the NFSv4.1 protocol has
   operations that provide the client a method of querying the server
   about its policies regarding which security mechanisms must be used
   for access to the server's file system resources.  With this, the
   client can securely match the security mechanism that meets the
   policies specified at both the client and server.

   NFSv4.1 introduces parallel access (see Section 1.7.2.2), which is
   called pNFS.  The security framework described in this section is
   significantly modified by the introduction of pNFS (see
   Section 12.9), because data access is sometimes not over RPC.  The
   level of significance varies with the storage protocol (see
   Section 12.2.5) and can be as low as zero impact (see Section 13.12).

1.7.2. Protocol Structure

1.7.2.1. Core Protocol
Unlike NFSv3, which used a series of ancillary protocols (e.g., NLM, NSM (Network Status Monitor), MOUNT), within all minor versions of NFSv4 a single RPC protocol is used to make requests to the server. Facilities that had been separate protocols, such as locking, are now integrated within a single unified protocol.
1.7.2.2. Parallel Access
Minor version 1 supports high-performance data access to a clustered server implementation by enabling a separation of metadata access and data access, with the latter done to multiple servers in parallel. Such parallel data access is controlled by recallable objects known as "layouts", which are integrated into the protocol locking model. Clients direct requests for data access to a set of data servers specified by the layout via a data storage protocol which may be NFSv4.1 or may be another protocol. Because the protocols used for parallel data access are not necessarily RPC-based, the RPC-based security model (Section 1.7.1) is obviously impacted (see Section 12.9). The degree of impact varies with the storage protocol (see Section 12.2.5) used for data access, and can be as low as zero (see Section 13.12).

1.7.3. File System Model

The general file system model used for the NFSv4.1 protocol is the same as previous versions. The server file system is hierarchical with the regular files contained within being treated as opaque byte streams. In a slight departure, file and directory names are encoded with UTF-8 to deal with the basics of internationalization.

The NFSv4.1 protocol does not require a separate protocol to provide for the initial mapping between path name and filehandle. All file systems exported by a server are presented as a tree so that all file systems are reachable from a special per-server global root filehandle. This allows LOOKUP operations to be used to perform functions previously provided by the MOUNT protocol. The server provides any necessary pseudo file systems to bridge unexported gaps between exported file systems.
1.7.3.1. Filehandles
As in previous versions of the NFS protocol, opaque filehandles are used to identify individual files and directories. Lookup-type and create operations translate file and directory names to filehandles, which are then used to identify objects in subsequent operations. The NFSv4.1 protocol provides support for persistent filehandles, guaranteed to be valid for the lifetime of the file system object designated. In addition, it provides support to servers to provide filehandles with more limited validity guarantees, called volatile filehandles.
1.7.3.2. File Attributes
The NFSv4.1 protocol has a rich and extensible file object attribute structure, which is divided into REQUIRED, RECOMMENDED, and named attributes (see Section 5).

Several (but not all) of the REQUIRED attributes are derived from the attributes of NFSv3 (see the definition of the fattr3 data type in [31]). An example of a REQUIRED attribute is the file object's type (Section 5.8.1.2) so that regular files can be distinguished from directories (also known as folders in some operating environments) and other types of objects. REQUIRED attributes are discussed in Section 5.1.

Three examples of RECOMMENDED attributes are acl, sacl, and dacl. These attributes define an Access Control List (ACL) on a file object (Section 6). An ACL provides directory and file access control beyond the model used in NFSv3. The ACL definition allows for
   specification of specific sets of permissions for individual users
   and groups.  In addition, ACL inheritance allows propagation of
   access permissions and restrictions down a directory tree as file
   system objects are created.  RECOMMENDED attributes are discussed in
   Section 5.2.

   A named attribute is an opaque byte stream that is associated with a
   directory or file and referred to by a string name.  Named attributes
   are meant to be used by client applications as a method to associate
   application-specific data with a regular file or directory.  NFSv4.1
   modifies named attributes relative to NFSv4.0 by tightening the
   allowed operations in order to prevent the development of non-
   interoperable implementations.  Named attributes are discussed in
   Section 5.3.

1.7.3.3. Multi-Server Namespace
NFSv4.1 contains a number of features to allow implementation of namespaces that cross server boundaries and that allow and facilitate a non-disruptive transfer of support for individual file systems between servers. They are all based upon attributes that allow one file system to specify alternate or new locations for that file system.

These attributes may be used together with the concept of absent file systems, which provide specifications for additional locations but no actual file system content. This allows a number of important facilities:

   o  Location attributes may be used with absent file systems to
      implement referrals whereby one server may direct the client to a
      file system provided by another server.  This allows extensive
      multi-server namespaces to be constructed.

   o  Location attributes may be provided for present file systems to
      provide the locations of alternate file system instances or
      replicas to be used in the event that the current file system
      instance becomes unavailable.

   o  Location attributes may be provided when a previously present file
      system becomes absent.  This allows non-disruptive migration of
      file systems to alternate servers.

1.7.4. Locking Facilities

As mentioned previously, NFSv4.1 is a single protocol that includes locking facilities. These locking facilities include support for many types of locks including a number of sorts of recallable locks.
   Recallable locks such as delegations allow the client to be assured
   that certain events will not occur so long as that lock is held.
   When circumstances change, the lock is recalled via a callback
   request.  The assurances provided by delegations allow more extensive
   caching to be done safely when circumstances allow it.

   The types of locks are:

   o  Share reservations as established by OPEN operations.

   o  Byte-range locks.

   o  File delegations, which are recallable locks that assure the
      holder that inconsistent opens and file changes cannot occur so
      long as the delegation is held.

   o  Directory delegations, which are recallable locks that assure the
      holder that inconsistent directory modifications cannot occur so
      long as the delegation is held.

   o  Layouts, which are recallable objects that assure the holder that
      direct access to the file data may be performed directly by the
      client and that no change to the data's location that is
      inconsistent with that access may be made so long as the layout is
      held.

   All locks for a given client are tied together under a single client-
   wide lease.  All requests made on sessions associated with the client
   renew that lease.  When the client's lease is not promptly renewed,
   the client's locks are subject to revocation.  In the event of server
   restart, clients have the opportunity to safely reclaim their locks
   within a special grace period.
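
The single client-wide lease described above can be pictured with a small server-side sketch. It is an illustration under assumptions, not the protocol's mechanism: the `ClientLease` class, its fields, and the use of plain floating-point seconds for time are all hypothetical. It captures only three points from the text: every lock of a client hangs off one lease, any request renews that lease, and locks become subject to revocation once the lease has lapsed.

```python
from dataclasses import dataclass, field


@dataclass
class ClientLease:
    """Hypothetical per-client lease record kept by a server."""
    lease_time: float                 # lease interval in seconds, set by the server
    last_renewal: float = 0.0         # time of the client's most recent request
    locks: set = field(default_factory=set)  # all lock state shares this one lease

    def renew(self, now: float) -> None:
        # Any request on a session associated with the client renews the lease.
        self.last_renewal = now

    def expired(self, now: float) -> bool:
        return now - self.last_renewal > self.lease_time

    def revocable_locks(self, now: float) -> set:
        # After expiry the locks MAY be revoked; the sketch only reports them.
        return set(self.locks) if self.expired(now) else set()
```

The sketch reports which locks are eligible for revocation rather than revoking them, since the text says only that such locks are subject to revocation. Grace-period reclaim after a server restart is not modeled.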

1.8. Differences from NFSv4.0

The following summarizes the major differences between minor version 1 and the base protocol:

   o  Implementation of the sessions model (Section 2.10).

   o  Parallel access to data (Section 12).

   o  Addition of the RECLAIM_COMPLETE operation to better structure the
      lock reclamation process (Section 18.51).

   o  Enhanced delegation support as follows.

      *  Delegations on directories and other file types in addition to
         regular files (Section 18.39, Section 18.49).

      *  Operations to optimize acquisition of recalled or denied
         delegations (Section 18.49, Section 20.5, Section 20.7).

      *  Notifications of changes to files and directories
         (Section 18.39, Section 20.4).

      *  A method to allow a server to indicate that it is recalling one
         or more delegations for resource management reasons, and thus a
         method to allow the client to pick which delegations to return
         (Section 20.6).

   o  Attributes can be set atomically during exclusive file create via
      the OPEN operation (see the new EXCLUSIVE4_1 creation method in
      Section 18.16).

   o  Open files can be preserved if removed and the hard link count
      ("hard link" is defined in an Open Group [6] standard) goes to
      zero, thus obviating the need for clients to rename deleted files
      to partially hidden names -- colloquially called "silly rename"
      (see the new OPEN4_RESULT_PRESERVE_UNLINKED reply flag in
      Section 18.16).

   o  Improved compatibility with Microsoft Windows for Access Control
      Lists (Section 6.2.3, Section 6.2.2, Section 6.4.3.2).

   o  Data retention (Section 5.13).

   o  Identification of the implementation of the NFS client and server
      (Section 18.35).

   o  Support for notification of the availability of byte-range locks
      (see the new OPEN4_RESULT_MAY_NOTIFY_LOCK reply flag in
      Section 18.16 and see Section 20.11).

   o  In NFSv4.1, LIPKEY and SPKM-3 are not required security mechanisms
      [32].

2. Core Infrastructure

2.1. Introduction

NFSv4.1 relies on core infrastructure common to nearly every operation. This core infrastructure is described in the remainder of this section.

2.2. RPC and XDR

The NFSv4.1 protocol is a Remote Procedure Call (RPC) application that uses RPC version 2 and the corresponding eXternal Data Representation (XDR) as defined in [3] and [2].

2.2.1. RPC-Based Security

Previous NFS versions have been thought of as having a host-based authentication model, where the NFS server authenticates the NFS client, and trusts the client to authenticate all users. Actually, NFS has always depended on RPC for authentication. One of the first forms of RPC authentication, AUTH_SYS, had no strong authentication and required a host-based authentication approach. NFSv4.1 also depends on RPC for basic security services and mandates RPC support for a user-based authentication model. The user-based authentication model has user principals authenticated by a server, and in turn the server authenticated by user principals. RPC provides some basic security services that are used by NFSv4.1.
2.2.1.1. RPC Security Flavors
As described in Section 7.2 ("Authentication") of [3], RPC security is encapsulated in the RPC header, via a security or authentication flavor, and information specific to the specified security flavor. Every RPC header conveys information used to identify and authenticate a client and server. As discussed in Section 2.2.1.1.1, some security flavors provide additional security services. NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This requirement to implement is not a requirement to use.) Other flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well.
2.2.1.1.1. RPCSEC_GSS and Security Services
RPCSEC_GSS [4] uses the functionality of GSS-API [7]. This allows for the use of various security mechanisms by the RPC layer without the additional implementation overhead of adding RPC security flavors.

2.2.1.1.1.1.  Identification, Authentication, Integrity, Privacy

Via the GSS-API, RPCSEC_GSS can be used to identify and authenticate users on clients to servers, and servers to users. It can also perform integrity checking on the entire RPC message, including the RPC header, and on the arguments or results. Finally, privacy, usually via encryption, is a service available with RPCSEC_GSS. Privacy is performed on the arguments and results. Note that if
   privacy is selected, integrity, authentication, and identification
   are enabled.  If privacy is not selected, but integrity is selected,
   authentication and identification are enabled.  If integrity and
   privacy are not selected, but authentication is enabled,
   identification is enabled.  RPCSEC_GSS does not provide
   identification as a separate service.
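
The way the services imply one another in the paragraph above can be written out as a short sketch. The function name `enabled_services` and the string labels are hypothetical, chosen only to mirror the wording of the text.

```python
def enabled_services(privacy: bool, integrity: bool, authentication: bool) -> set:
    """Return every service that ends up enabled for a given selection."""
    services = set()
    if privacy:
        services.add("privacy")
        integrity = True        # privacy implies integrity
    if integrity:
        services.add("integrity")
        authentication = True   # integrity implies authentication
    if authentication:
        # Identification is never offered alone; it accompanies authentication.
        services.update({"authentication", "identification"})
    return services
```

For example, selecting only integrity yields integrity, authentication, and identification, while selecting nothing yields an empty set.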

   Although GSS-API has an authentication service distinct from its
   privacy and integrity services, GSS-API's authentication service is
   not used for RPCSEC_GSS's authentication service.  Instead, each RPC
   request and response header is integrity protected with the GSS-API
   integrity service, and this allows RPCSEC_GSS to offer per-RPC
   authentication and identity.  See [4] for more information.

   NFSv4.1 clients and servers MUST support RPCSEC_GSS's integrity and
   authentication service.  NFSv4.1 servers MUST support RPCSEC_GSS's
   privacy service.  NFSv4.1 clients SHOULD support RPCSEC_GSS's privacy
   service.

2.2.1.1.1.2.  Security Mechanisms for NFSv4.1

   RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide
   security services.  Therefore, NFSv4.1 clients and servers MUST
   support the Kerberos V5 security mechanism.

   The use of RPCSEC_GSS requires selection of mechanism, quality of
   protection (QOP), and service (authentication, integrity, privacy).
   For the mandated security mechanisms, NFSv4.1 specifies that a QOP of
   zero is used, leaving it up to the mechanism or the mechanism's
   configuration to map QOP zero to an appropriate level of protection.
   Each mandated mechanism specifies a minimum set of cryptographic
   algorithms for implementing integrity and privacy.  NFSv4.1 clients
   and servers MUST be implemented on operating environments that comply
   with the REQUIRED cryptographic algorithms of each REQUIRED
   mechanism.

2.2.1.1.1.2.1.  Kerberos V5

   The Kerberos V5 GSS-API mechanism as described in [5] MUST be
   implemented with the RPCSEC_GSS services as specified in the
   following table:
      column descriptions:
      1 == number of pseudo flavor
      2 == name of pseudo flavor
      3 == mechanism's OID
      4 == RPCSEC_GSS service
      5 == NFSv4.1 clients MUST support
      6 == NFSv4.1 servers MUST support

      1      2        3                    4                     5   6
      ------------------------------------------------------------------
      390003 krb5     1.2.840.113554.1.2.2 rpc_gss_svc_none      yes yes
      390004 krb5i    1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
      390005 krb5p    1.2.840.113554.1.2.2 rpc_gss_svc_privacy    no yes

   Note that the number and name of the pseudo flavor are presented here
   as a mapping aid to the implementor.  Because the NFSv4.1 protocol
   includes a method to negotiate security and it understands the GSS-
   API mechanism, the pseudo flavor is not needed.  The pseudo flavor is
   needed for NFSv3 since the security negotiation is done via the
   MOUNT protocol as described in [33].
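
As a mapping aid in the same spirit, the Kerberos V5 table can be carried in an implementation as a small lookup structure. The values below are copied from the table; the `PseudoFlavor` type, the dictionary name, and the helper function are hypothetical and only show one possible representation.

```python
from typing import NamedTuple


class PseudoFlavor(NamedTuple):
    name: str                   # column 2: name of pseudo flavor
    oid: str                    # column 3: mechanism's OID
    service: str                # column 4: RPCSEC_GSS service
    client_must_support: bool   # column 5
    server_must_support: bool   # column 6


KRB5_OID = "1.2.840.113554.1.2.2"

# Keyed by column 1, the pseudo flavor number.
KRB5_PSEUDO_FLAVORS = {
    390003: PseudoFlavor("krb5", KRB5_OID, "rpc_gss_svc_none", True, True),
    390004: PseudoFlavor("krb5i", KRB5_OID, "rpc_gss_svc_integrity", True, True),
    390005: PseudoFlavor("krb5p", KRB5_OID, "rpc_gss_svc_privacy", False, True),
}


def services_required_of_clients() -> list:
    """List the RPCSEC_GSS services that the table marks as client MUSTs."""
    return sorted(
        f.service for f in KRB5_PSEUDO_FLAVORS.values() if f.client_must_support
    )
```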

   At the time NFSv4.1 was specified, the Advanced Encryption Standard
   (AES) with HMAC-SHA1 was a REQUIRED algorithm set for Kerberos V5.
   In contrast, when NFSv4.0 was specified, weaker algorithm sets were
   REQUIRED for Kerberos V5, and were REQUIRED in the NFSv4.0
   specification, because the Kerberos V5 specification at the time did
   not specify stronger algorithms.  The NFSv4.1 specification does not
   specify REQUIRED algorithms for Kerberos V5, and instead, the
   implementor is expected to track the evolution of the Kerberos V5
   standard if and when stronger algorithms are specified.

2.2.1.1.1.2.1.1.  Security Considerations for Cryptographic Algorithms
                  in Kerberos V5

   When deploying NFSv4.1, the strength of the security achieved depends
   on the existing Kerberos V5 infrastructure.  The algorithms of
   Kerberos V5 are not directly exposed to or selectable by the client
   or server, so there is some due diligence required by the user of
   NFSv4.1 to ensure that security is acceptable where needed.

2.2.1.1.1.3.  GSS Server Principal

   Regardless of what security mechanism under RPCSEC_GSS is being used,
   the NFS server MUST identify itself in GSS-API via a
   GSS_C_NT_HOSTBASED_SERVICE name type.  GSS_C_NT_HOSTBASED_SERVICE
   names are of the form:

        service@hostname
   For NFS, the "service" element is

        nfs

   Implementations of security mechanisms will convert nfs@hostname to
   various different forms.  For Kerberos V5, the following form is
   RECOMMENDED:

        nfs/hostname
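
The conversion from the host-based service form to the RECOMMENDED Kerberos V5 form is a plain string rewrite, sketched below. The function name `hostbased_to_krb5` is hypothetical. The sketch handles only the service and hostname parts shown above and leaves out any realm handling that a Kerberos deployment would add.

```python
def hostbased_to_krb5(name: str) -> str:
    """Rewrite 'service@hostname' as 'service/hostname'."""
    service, sep, hostname = name.partition("@")
    if not sep or not service or not hostname:
        raise ValueError(f"not a service@hostname name: {name!r}")
    return f"{service}/{hostname}"
```

With this sketch, the name `nfs@server.example.com` (an example hostname) becomes `nfs/server.example.com`.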

2.3. COMPOUND and CB_COMPOUND

A significant departure from the versions of the NFS protocol before NFSv4 is the introduction of the COMPOUND procedure. For the NFSv4 protocol, in all minor versions, there are exactly two RPC procedures, NULL and COMPOUND. The COMPOUND procedure is defined as a series of individual operations and these operations perform the sorts of functions performed by traditional NFS procedures.

The operations combined within a COMPOUND request are evaluated in order by the server, without any atomicity guarantees. A limited set of facilities exist to pass results from one operation to another. Once an operation returns a failing result, the evaluation ends and the results of all evaluated operations are returned to the client.

With the use of the COMPOUND procedure, the client is able to build simple or complex requests. These COMPOUND requests allow for a reduction in the number of RPCs needed for logical file system operations. For example, multi-component look up requests can be constructed by combining multiple LOOKUP operations. Those can be further combined with operations such as GETATTR, READDIR, or OPEN plus READ to do more complicated sets of operation without incurring additional latency.

NFSv4.1 also contains a considerable set of callback operations in which the server makes an RPC directed at the client. Callback RPCs have a similar structure to that of the normal server requests. In all minor versions of the NFSv4 protocol, there are two callback RPC procedures: CB_NULL and CB_COMPOUND. The CB_COMPOUND procedure is defined in an analogous fashion to that of COMPOUND with its own set of callback operations.

The addition of new server and callback operations within the COMPOUND and CB_COMPOUND request framework provides a means of extending the protocol in subsequent minor versions.

   Except for a small number of operations needed for session creation,
   server requests and callback requests are performed within the
   context of a session.  Sessions provide a client context for every
   request and support robust reply protection for non-idempotent
   requests.
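
The COMPOUND evaluation rule described in this section (operations run in order, evaluation stops at the first failure, and the results of every evaluated operation are returned) can be sketched as follows. This is a toy model, not a server implementation: `evaluate_compound` is a hypothetical name, and each operation is represented as a Python callable that returns a status code. Sessions, filehandle passing between operations, and result bodies are not modeled.

```python
from typing import Callable, List, Tuple

NFS4_OK = 0  # status value for success


def evaluate_compound(
    ops: List[Tuple[str, Callable[[], int]]]
) -> List[Tuple[str, int]]:
    """Evaluate operations in order and stop after the first failing one."""
    results = []
    for name, op in ops:
        status = op()
        results.append((name, status))  # the failing result is returned too
        if status != NFS4_OK:
            break                       # later operations are never evaluated
    return results
```

Because there is no atomicity guarantee, the effects of operations that succeeded before the failure are not rolled back in this model.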

2.4. Client Identifiers and Client Owners

For each operation that obtains or depends on locking state, the specific client needs to be identifiable by the server. Each distinct client instance is represented by a client ID. A client ID is a 64-bit identifier representing a specific client at a given time. The client ID is changed whenever the client re-initializes, and may change when the server re-initializes. Client IDs are used to support lock identification and crash recovery.

During steady state operation, the client ID associated with each operation is derived from the session (see Section 2.10) on which the operation is sent. A session is associated with a client ID when the session is created.

Unlike NFSv4.0, the only NFSv4.1 operations possible before a client ID is established are those needed to establish the client ID.

A sequence of an EXCHANGE_ID operation followed by a CREATE_SESSION operation using that client ID (eir_clientid as returned from EXCHANGE_ID) is required to establish and confirm the client ID on the server. Establishment of identification by a new incarnation of the client also has the effect of immediately releasing any locking state that a previous incarnation of that same client might have had on the server. Such released state would include all byte-range lock, share reservation, layout state, and -- where the server supports neither the CLAIM_DELEGATE_PREV nor CLAIM_DELEG_CUR_FH claim types -- all delegation state associated with the same client with the same identity. For discussion of delegation state recovery, see Section 10.2.1. For discussion of layout state recovery, see Section 12.7.1.

Releasing such state requires that the server be able to determine that one client instance is the successor of another. Where this cannot be done, for any of a number of reasons, the locking state will remain for a time subject to lease expiration (see Section 8.3) and the new client will need to wait for such state to be removed, if it makes conflicting lock requests.

Client identification is encapsulated in the following client owner data type:

   struct client_owner4 {
           verifier4       co_verifier;
           opaque          co_ownerid<NFS4_OPAQUE_LIMIT>;
   };

   The first field, co_verifier, is a client incarnation verifier.  The
   server will start the process of canceling the client's leased state
   if co_verifier differs from what the server has previously recorded
   for the identified client (as specified in the co_ownerid field).

   The second field, co_ownerid, is a variable length string that
   uniquely defines the client so that subsequent instances of the same
   client bear the same co_ownerid with a different verifier.

   There are several considerations for how the client generates the
   co_ownerid string:

   o  The string should be unique so that multiple clients do not
      present the same string.  The consequences of two clients
      presenting the same string range from one client getting an error
      to one client having its leased state abruptly and unexpectedly
      cancelled.

   o  The string should be selected so that subsequent incarnations
      (e.g., restarts) of the same client cause the client to present
      the same string.  The implementor is cautioned against an approach
      that requires the string to be recorded in a local file because
      this precludes the use of the implementation in an environment
      where there is no local disk and all file access is from an
      NFSv4.1 server.

   o  The string should be the same for each server network address that
      the client accesses.  This way, if a server has multiple
      interfaces, the client can trunk traffic over multiple network
      paths as described in Section 2.10.5.  (Note: the precise opposite
      was advised in the NFSv4.0 specification [30].)

   o  The algorithm for generating the string should not assume that the
      client's network address will not change, unless the client
      implementation knows it is using statically assigned network
      addresses.  This includes changes between client incarnations and
      even changes while the client is still running in its current
      incarnation.  Thus, with dynamic address assignment, if the client
      includes just the client's network address in the co_ownerid
      string, there is a real risk that after the client gives up the
      network address, another client, using a similar algorithm for
      generating the co_ownerid string, would generate a conflicting
      co_ownerid string.

   Given the above considerations, an example of a well-generated
   co_ownerid string is one that includes:

   o  If applicable, the client's statically assigned network address.

   o  Additional information that tends to be unique, such as one or
      more of:

      *  The client machine's serial number (for privacy reasons, it is
         best to perform some one-way function on the serial number).

      *  A Media Access Control (MAC) address (again, a one-way function
         should be performed).

      *  The timestamp of when the NFSv4.1 software was first installed
         on the client (though this is subject to the previously
         mentioned caution about using information that is stored in a
         file, because the file might only be accessible over NFSv4.1).

      *  A true random number.  However, since this number ought to be
         the same between client incarnations, this shares the same
         problem as that of using the timestamp of the software
         installation.

   o  For a user-level NFSv4.1 client, it should contain additional
      information to distinguish the client from other user-level
      clients running on the same host, such as a process identifier or
      other unique sequence.
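
   The considerations above can be illustrated with a minimal sketch.
   It is not part of the protocol: the function names, the field
   labels, the separator, and the choice of SHA-256 as the one-way
   function are all assumptions made for illustration.  The server's
   network address is deliberately left out so that the same string is
   presented to every server address.

```python
import hashlib
from typing import Optional

# Limit on opaque fields; 1024 is the value this sketch assumes for
# NFS4_OPAQUE_LIMIT in the protocol's XDR description.
NFS4_OPAQUE_LIMIT = 1024


def one_way(value: str) -> str:
    """Hide a hardware identifier behind a one-way function (SHA-256 here)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def make_co_ownerid(serial_number: str,
                    mac_address: str,
                    static_address: Optional[str] = None,
                    user_level_tag: Optional[str] = None) -> bytes:
    """Build a co_ownerid that is stable across client restarts.

    static_address is passed only when the client knows its address is
    statically assigned; user_level_tag distinguishes user-level clients
    running on the same host.
    """
    parts = []
    if static_address is not None:
        parts.append("addr=" + static_address)
    parts.append("serial=" + one_way(serial_number))
    parts.append("mac=" + one_way(mac_address))
    if user_level_tag is not None:
        parts.append("user=" + user_level_tag)
    ownerid = "/".join(parts).encode("ascii")
    if len(ownerid) > NFS4_OPAQUE_LIMIT:
        raise ValueError("co_ownerid exceeds NFS4_OPAQUE_LIMIT")
    return ownerid
```

   Because every input is derived from the machine rather than from a
   stored file, the same string is produced after a restart, while the
   raw serial number and MAC address never appear in it.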

   The client ID is assigned by the server (the eir_clientid result from
   EXCHANGE_ID) and should be chosen so that it will not conflict with a
   client ID previously assigned by the server.  This applies across
   server restarts.

   In the event of a server restart, a client may find out that its
   current client ID is no longer valid when it receives an
   NFS4ERR_STALE_CLIENTID error.  The precise circumstances depend on
   the characteristics of the sessions involved, specifically whether
   the session is persistent (see Section 2.10.6.5), but in each case
   the client will receive this error when it attempts to establish a
   new session with the existing client ID.  The error indicates that a
   new client ID needs to be obtained via EXCHANGE_ID and the new
   session established with that client ID.

   When a session is not persistent, the client will find out that it
   needs to create a new session as a result of getting an
   NFS4ERR_BADSESSION, since the session in question was lost as part of
   a server restart.  When the existing client ID is presented to a
   server as part of creating a session and that client ID is not
   recognized, as would happen after a server restart, the server will
   reject the request with the error NFS4ERR_STALE_CLIENTID.

   In the case of the session being persistent, the client will
   re-establish communication using the existing session after the
   restart.  This session will be associated with the existing client ID
   but may only be used to retransmit operations that the client
   previously transmitted and did not see replies to.  Replies to
   operations that the server previously performed will come from the
   reply cache; otherwise, NFS4ERR_DEADSESSION will be returned.  Hence,
   such a session is referred to as "dead".  In this situation, in order
   to perform new operations, the client needs to establish a new
   session.  If an attempt is made to establish this new session with
   the existing client ID, the server will reject the request with
   NFS4ERR_STALE_CLIENTID.

   When NFS4ERR_STALE_CLIENTID is received in either of these
   situations, the client needs to obtain a new client ID by use of the
   EXCHANGE_ID operation, then use that client ID as the basis of a new
   session, and then proceed to any other necessary recovery for the
   server restart case (see Section 8.4.2).

   See the descriptions of EXCHANGE_ID (Section 18.35) and
   CREATE_SESSION (Section 18.36) for a complete specification of these
   operations.
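
   The recovery path after a server restart can be traced with a small
   sketch.  It is a simulation under stated assumptions, not client
   code: the Status enum, the function name, and the step labels are
   hypothetical, and the sketch assumes the server really did restart,
   so that CREATE_SESSION with the old client ID is rejected.

```python
from enum import Enum
from typing import List


class Status(Enum):
    NFS4_OK = "NFS4_OK"
    NFS4ERR_BADSESSION = "NFS4ERR_BADSESSION"
    NFS4ERR_DEADSESSION = "NFS4ERR_DEADSESSION"
    NFS4ERR_STALE_CLIENTID = "NFS4ERR_STALE_CLIENTID"


def recovery_trace(first_error: Status) -> List[str]:
    """Return the operations a client sends after seeing first_error."""
    trace: List[str] = []
    status = first_error
    while status is not Status.NFS4_OK:
        if status in (Status.NFS4ERR_BADSESSION, Status.NFS4ERR_DEADSESSION):
            # Non-persistent session lost, or persistent session "dead":
            # try a new session with the existing client ID first.
            trace.append("CREATE_SESSION(existing client ID)")
            # Assumption: the restarted server no longer knows the client ID.
            status = Status.NFS4ERR_STALE_CLIENTID
        elif status is Status.NFS4ERR_STALE_CLIENTID:
            trace.append("EXCHANGE_ID")
            trace.append("CREATE_SESSION(new client ID)")
            trace.append("restart recovery (Section 8.4.2)")
            status = Status.NFS4_OK
        else:
            raise ValueError("status not covered by this sketch")
    return trace
```

   Both the persistent and the non-persistent case converge on the same
   three final steps; they differ only in the error that first tells
   the client that its session is unusable.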

2.4.1. Upgrade from NFSv4.0 to NFSv4.1

   To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a
   value of data type client_owner4 in an EXCHANGE_ID with a value of
   data type nfs_client_id4 that was established using the SETCLIENTID
   operation of NFSv4.0.  A server that does so will allow an upgraded
   client to avoid waiting until the lease (i.e., the lease established
   by the NFSv4.0 client instance) expires.  This requires that the
   value of data type client_owner4 be constructed the same way as the
   value of data type nfs_client_id4.  If the latter's contents included
   the server's network address (per the recommendations of the NFSv4.0
   specification [30]), and the NFSv4.1 client does not wish to use a
   client ID that prevents trunking, it should send two EXCHANGE_ID
   operations.  The first EXCHANGE_ID will have a client_owner4 equal to
   the nfs_client_id4.  This will clear the state created by the NFSv4.0
   client.  The second EXCHANGE_ID will not have the server's network
   address.  The state created for the second EXCHANGE_ID will not have
   to wait for lease expiration, because there will be no state to
   expire.

2.4.2. Server Release of Client ID

   NFSv4.1 introduces a new operation called DESTROY_CLIENTID
   (Section 18.50), which the client SHOULD use to destroy a client ID
   it no longer needs.  This permits graceful, bilateral release of a
   client ID.  The operation cannot be used if there are sessions
   associated with the client ID, or state with an unexpired lease.

   If the server determines that the client holds no associated state
   for its client ID (associated state includes unrevoked sessions,
   opens, locks, delegations, layouts, and wants), the server MAY choose
   to unilaterally release the client ID in order to conserve resources.
   If the client contacts the server after this release, the server MUST
   ensure that the client receives the appropriate error so that it will
   use the EXCHANGE_ID/CREATE_SESSION sequence to establish a new client
   ID.  The server ought to be very hesitant to release a client ID
   since the resulting work on the client to recover from such an event
   will be the same burden as if the server had failed and restarted.
   Typically, a server would not release a client ID unless there had
   been no activity from that client for many minutes.  As long as there
   are sessions, opens, locks, delegations, layouts, or wants, the
   server MUST NOT release the client ID.  See Section 2.10.13.1.4 for
   discussion on releasing inactive sessions.
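
   The unilateral release rule can be written as a simple predicate.
   This is a sketch only: the ClientIdState record, its counters, and
   the 600-second idle threshold are hypothetical.  The text says only
   "many minutes", so the threshold is a local policy choice.

```python
from dataclasses import dataclass

# Hypothetical policy value standing in for "many minutes".
IDLE_THRESHOLD_SECONDS = 600.0


@dataclass
class ClientIdState:
    sessions: int = 0       # unrevoked sessions
    opens: int = 0
    locks: int = 0
    delegations: int = 0
    layouts: int = 0
    wants: int = 0
    idle_seconds: float = 0.0


def may_release_client_id(state: ClientIdState,
                          idle_threshold: float = IDLE_THRESHOLD_SECONDS) -> bool:
    """True only if the server MAY release the client ID unilaterally."""
    holds_state = any((state.sessions, state.opens, state.locks,
                       state.delegations, state.layouts, state.wants))
    if holds_state:
        # MUST NOT release while any associated state exists.
        return False
    # No state: release is permitted, but only after a long idle period.
    return state.idle_seconds >= idle_threshold
```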

2.4.3. Resolving Client Owner Conflicts

   When the server gets an EXCHANGE_ID for a client owner that currently
   has no state, or that has state but the lease has expired, the server
   MUST allow the EXCHANGE_ID and confirm the new client ID if followed
   by the appropriate CREATE_SESSION.

   When the server gets an EXCHANGE_ID for a new incarnation of a client
   owner that currently has an old incarnation with state and an
   unexpired lease, the server is allowed to dispose of the state of the
   previous incarnation of the client owner if one of the following is
   true:

   o  The principal that created the client ID for the client owner is
      the same as the principal that is sending the EXCHANGE_ID
      operation.  Note that if the client ID was created with
      SP4_MACH_CRED state protection (Section 18.35), the principal MUST
      be based on RPCSEC_GSS authentication, the RPCSEC_GSS service used
      MUST be integrity or privacy, and the same GSS mechanism and
      principal MUST be used as that used when the client ID was
      created.

   o  The client ID was established with SP4_SSV protection
      (Section 18.35, Section 2.10.8.3) and the client sends the
      EXCHANGE_ID with the security flavor set to RPCSEC_GSS using the
      GSS SSV mechanism (Section 2.10.9).

   o  The client ID was established with SP4_SSV protection, and under
      the conditions described herein, the EXCHANGE_ID was sent with
      SP4_MACH_CRED state protection.  Because the SSV might not persist
      across client and server restart, and because the first time a
      client sends EXCHANGE_ID to a server it does not have an SSV, the
      client MAY send the subsequent EXCHANGE_ID without an SSV
      RPCSEC_GSS handle.  Instead, as with SP4_MACH_CRED protection, the
      principal MUST be based on RPCSEC_GSS authentication, the
      RPCSEC_GSS service used MUST be integrity or privacy, and the same
      GSS mechanism and principal MUST be used as that used when the
      client ID was created.

   If none of the above situations apply, the server MUST return
   NFS4ERR_CLID_INUSE.

   If the server accepts the principal and co_ownerid as matching that
   which created the client ID, and the co_verifier in the EXCHANGE_ID
   differs from the co_verifier used when the client ID was created,
   then after the server receives a CREATE_SESSION that confirms the
   client ID, the server deletes state.  If the co_verifier values are
   the same (e.g., the client either is updating properties of the
   client ID (Section 18.35) or is attempting trunking
   (Section 2.10.5)), the server MUST NOT delete state.
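
   The decision procedure of this section can be sketched as follows.
   The record types, the string labels for flavors and GSS mechanisms
   ("krb5", "ssv"), and the returned action names are assumptions made
   for illustration; a real server tracks considerably more state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Protect(Enum):
    SP4_NONE = 0
    SP4_MACH_CRED = 1
    SP4_SSV = 2


@dataclass(frozen=True)
class Cred:
    principal: str
    flavor: str                         # e.g. "AUTH_SYS" or "RPCSEC_GSS"
    gss_mech: Optional[str] = None      # hypothetical labels: "krb5", "ssv"
    gss_service: Optional[str] = None   # "none", "integrity", or "privacy"


@dataclass
class ClientRecord:
    co_verifier: bytes
    created_by: Cred        # credential used when the client ID was created
    protection: Protect
    has_state: bool
    lease_expired: bool


def machine_cred_matches(old: Cred, new: Cred) -> bool:
    """RPCSEC_GSS with integrity or privacy, same mechanism and principal."""
    return (new.flavor == "RPCSEC_GSS"
            and new.gss_service in ("integrity", "privacy")
            and new.gss_mech == old.gss_mech
            and new.principal == old.principal)


def principal_ok(rec: ClientRecord, cred: Cred, requested: Protect) -> bool:
    if rec.protection is Protect.SP4_SSV:
        # Second bullet: request sent with the GSS SSV mechanism.
        if cred.flavor == "RPCSEC_GSS" and cred.gss_mech == "ssv":
            return True
        # Third bullet: no SSV available, fall back to machine credentials.
        return (requested is Protect.SP4_MACH_CRED
                and machine_cred_matches(rec.created_by, cred))
    if rec.protection is Protect.SP4_MACH_CRED:
        return machine_cred_matches(rec.created_by, cred)
    # First bullet: same principal as the one that created the client ID.
    return cred.principal == rec.created_by.principal


def resolve_exchange_id(rec: Optional[ClientRecord], co_verifier: bytes,
                        cred: Cred, requested: Protect) -> str:
    if rec is None or not rec.has_state or rec.lease_expired:
        return "ALLOW"
    if not principal_ok(rec, cred, requested):
        return "NFS4ERR_CLID_INUSE"
    if co_verifier != rec.co_verifier:
        # New incarnation: state goes once CREATE_SESSION confirms the ID.
        return "DELETE_STATE_AFTER_CREATE_SESSION"
    return "KEEP_STATE"
```

   Note that state is never deleted by EXCHANGE_ID itself in this
   sketch; the deletion is deferred until the confirming
   CREATE_SESSION, as the text requires.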

2.5. Server Owners

   The server owner is similar to a client owner (Section 2.4), but
   unlike the client owner, there is no shorthand server ID.  The server
   owner is defined in the following data type:

   struct server_owner4 {
            uint64_t       so_minor_id;
            opaque         so_major_id<NFS4_OPAQUE_LIMIT>;
   };

   The server owner is returned from EXCHANGE_ID.  When the so_major_id
   fields are the same in two EXCHANGE_ID results, the connections that
   each EXCHANGE_ID was sent over can be assumed to address the same
   server (as defined in Section 1.6).  If the so_minor_id fields are
   also the same, then not only do both connections connect to the same
   server, but the session can be shared across both connections.  The
   reader is cautioned that multiple servers may deliberately or
   accidentally claim to have the same so_major_id or so_major_id/
   so_minor_id; the reader should examine Sections 2.10.5 and 18.35 in
   order to avoid acting on falsely matching server owner values.

   The considerations for generating a so_major_id are similar to those
   for generating a co_ownerid string (see Section 2.4).  The
   consequences of two servers generating conflicting so_major_id values
   are less dire than they are for co_ownerid conflicts because the
   client can use RPCSEC_GSS to compare the authenticity of each server
   (see Section 2.10.5).
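
   The comparison of two server owners can be sketched as below.  The
   class and function names and the returned labels are hypothetical.
   The result is only a candidate classification: as cautioned above,
   matching values still have to be verified as described in
   Sections 2.10.5 and 18.35 before the client acts on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOwner4:
    so_minor_id: int
    so_major_id: bytes


def compare_server_owners(a: ServerOwner4, b: ServerOwner4) -> str:
    """Classify two EXCHANGE_ID results by their server owner fields."""
    if a.so_major_id != b.so_major_id:
        return "different servers"
    if a.so_minor_id != b.so_minor_id:
        # Same server, but a session cannot span both connections.
        return "same server"
    return "same server, session can be shared"
```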

2.6. Security Service Negotiation

   With the NFSv4.1 server potentially offering multiple security
   mechanisms, the client needs a method to determine or negotiate which
   mechanism is to be used for its communication with the server.  The
   NFS server may have multiple points within its file system namespace
   that are available for use by NFS clients.  These points can be
   considered security policy boundaries, and, in some NFS
   implementations, are tied to NFS export points.  In turn, the NFS
   server may be configured such that each of these security policy
   boundaries may have different or multiple security mechanisms in use.

   The security negotiation between client and server SHOULD be done
   with a secure channel to eliminate the possibility of a third party
   intercepting the negotiation sequence and forcing the client and
   server to choose a lower level of security than required or desired.
   See Section 21 for further discussion.

2.6.1. NFSv4.1 Security Tuples

An NFS server can assign one or more "security tuples" to each security policy boundary in its namespace. Each security tuple consists of a security flavor (see Section 2.2.1.1) and, if the flavor is RPCSEC_GSS, a GSS-API mechanism Object Identifier (OID), a GSS-API quality of protection, and an RPCSEC_GSS service.

2.6.2. SECINFO and SECINFO_NO_NAME

   The SECINFO and SECINFO_NO_NAME operations allow the client to
   determine, on a per-filehandle basis, what security tuple is to be
   used for server access.  In general, the client will not have to use
   either operation except during initial communication with the server
   or when the client crosses security policy boundaries at the server.
   However, the server's policies may also change at any time and force
   the client to negotiate a new security tuple.

   Where the use of different security tuples would affect the type of
   access (e.g., read-only vs. read-write) that would be allowed if a
   request was sent over the same connection used for the SECINFO or
   SECINFO_NO_NAME operation, security tuples that allow greater access
   should be presented first.  Where the general level of access
   is the same and different security flavors limit the range of
   principals whose privileges are recognized (e.g., allowing or
   disallowing root access), flavors supporting the greatest range of
   principals should be listed first.
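
   One way to picture the ordering guidance is as a two-key sort.  The
   sketch below is illustrative only: the RankedTuple type and its
   numeric ranks are hypothetical, since the protocol does not define
   any numeric scale for access level or principal range.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RankedTuple:
    name: str              # illustrative label for a security tuple
    access_level: int      # hypothetical rank: higher means more access
    principal_range: int   # hypothetical rank: higher means more principals


def order_for_secinfo(tuples: List[RankedTuple]) -> List[RankedTuple]:
    """Greater access first; ties broken by wider principal range."""
    return sorted(tuples,
                  key=lambda t: (-t.access_level, -t.principal_range))
```

   For example, a tuple granting read-write access sorts ahead of one
   granting read-only access, and among tuples with equal access, one
   that recognizes root sorts ahead of one that does not.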

2.6.3. Security Error

   Based on the assumption that each NFSv4.1 client and server MUST
   support a minimum set of security (i.e., Kerberos V5 under
   RPCSEC_GSS), the NFS client will initiate file access to the server
   with one of the minimal security tuples.  During communication with
   the server, the client may receive an NFS error of NFS4ERR_WRONGSEC.
   This error allows the server to notify the client that the security
   tuple currently being used contravenes the server's security policy.
   The client is then responsible for determining (see Section 2.6.3.1)
   what security tuples are available at the server and choosing one
   that is appropriate for the client.

2.6.3.1. Using NFS4ERR_WRONGSEC, SECINFO, and SECINFO_NO_NAME

   This section explains the mechanics of NFSv4.1 security negotiation.

2.6.3.1.1. Put Filehandle Operations

   The term "put filehandle operation" refers to PUTROOTFH, PUTPUBFH,
   PUTFH, and RESTOREFH.  Each of the subsections herein describes how
   the server handles a subseries of operations that starts with a put
   filehandle operation.

2.6.3.1.1.1.  Put Filehandle Operation + SAVEFH

   The client is saving a filehandle for a future RESTOREFH, LINK, or
   RENAME.  SAVEFH MUST NOT return NFS4ERR_WRONGSEC.  To determine
   whether or not the put filehandle operation returns NFS4ERR_WRONGSEC,
   the server implementation pretends SAVEFH is not in the series of
   operations and examines which of the situations described in the
   other subsections of Section 2.6.3.1.1 apply.

2.6.3.1.1.2.  Two or More Put Filehandle Operations

   For a series of N put filehandle operations, the server MUST NOT
   return NFS4ERR_WRONGSEC to the first N-1 put filehandle operations.
   The Nth put filehandle operation is handled as if it is the first in
   a subseries of operations.  For example, if the server received a
   COMPOUND request with this series of operations -- PUTFH, PUTROOTFH,
   LOOKUP -- then the PUTFH operation is ignored for NFS4ERR_WRONGSEC
   purposes, and the PUTROOTFH, LOOKUP subseries is processed according
   to Section 2.6.3.1.1.3.

2.6.3.1.1.3.  Put Filehandle Operation + LOOKUP (or OPEN of an Existing
              Name)

   This situation also applies to a put filehandle operation followed by
   a LOOKUP or an OPEN operation that specifies an existing component
   name.

   In this situation, the client is potentially crossing a security
   policy boundary, and the set of security tuples the parent directory
   supports may differ from those of the child.  The server
   implementation may decide whether to impose any restrictions on
   security policy administration.  There are at least three approaches
   (sec_policy_child is the tuple set of the child export,
   sec_policy_parent is that of the parent).

   (a)  sec_policy_child <= sec_policy_parent (<= for subset).  This
        means that the set of security tuples specified on the security
        policy of a child directory is always a subset of its parent
        directory.

   (b)  sec_policy_child ^ sec_policy_parent != {} (^ for intersection,
        {} for the empty set).  This means that the set of security
        tuples specified on the security policy of a child directory
        always has a non-empty intersection with that of the parent.

   (c)  sec_policy_child ^ sec_policy_parent == {}.  This means that the
        set of security tuples specified on the security policy of a
        child directory may not intersect with that of the parent.  In
        other words, there are no restrictions on how the system
        administrator may set up these tuples.

   In order for a server to support approaches (b) (for the case when a
   client chooses a flavor that is not a member of sec_policy_parent)
   and (c), the put filehandle operation cannot return NFS4ERR_WRONGSEC
   when there is a security tuple mismatch.  Instead, it should be
   returned from the LOOKUP (or OPEN by existing component name) that
   follows.

   Since the above guideline does not contradict approach (a), it should
   be followed in general.  Even if approach (a) is implemented, it is
   possible for the security tuple used to be acceptable for the target
   of LOOKUP but not for the filehandles used in the put filehandle
   operation.  The put filehandle operation could be a PUTROOTFH or
   PUTPUBFH, where the client cannot know the security tuples for the
   root or public filehandle.  Or the security policy for the filehandle
   used by the put filehandle operation could have changed since the
   time the filehandle was obtained.

   Therefore, an NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC in
   response to the put filehandle operation if the operation is
   immediately followed by a LOOKUP or an OPEN by component name.
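
   The three approaches map directly onto set operations, and the rule
   of this section reduces to checking only the child's policy.  The
   sketch below uses short strings as stand-ins for security tuples and
   a hypothetical function name; "NFS4_OK" here means only that no
   security error is raised, not that the operation cannot fail for
   other reasons.

```python
from typing import FrozenSet, Tuple


def put_fh_then_lookup(tuple_used: str,
                       sec_policy_parent: FrozenSet[str],
                       sec_policy_child: FrozenSet[str]) -> Tuple[str, str]:
    """Security outcome of a put filehandle operation followed by LOOKUP."""
    # The put filehandle operation MUST NOT return NFS4ERR_WRONGSEC here,
    # so sec_policy_parent is deliberately not consulted.
    put_status = "NFS4_OK"
    lookup_status = ("NFS4_OK" if tuple_used in sec_policy_child
                     else "NFS4ERR_WRONGSEC")
    return put_status, lookup_status


# Illustrative policies (the tuple labels are assumptions).
parent = frozenset({"krb5", "krb5i"})
child_a = frozenset({"krb5i"})            # approach (a): subset of parent
child_b = frozenset({"krb5i", "krb5p"})   # approach (b): overlaps parent
child_c = frozenset({"sys"})              # approach (c): may be disjoint

approach_a_holds = child_a <= parent
approach_b_holds = bool(child_b & parent)
approach_c_disjoint = not (child_c & parent)
```

   Under any of the three approaches, a mismatch is reported by the
   LOOKUP, which lets the client follow up with SECINFO on the parent
   directory and the component name.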

2.6.3.1.1.4.  Put Filehandle Operation + LOOKUPP

   Since SECINFO only works its way down, there is no way LOOKUPP can
   return NFS4ERR_WRONGSEC without SECINFO_NO_NAME.  SECINFO_NO_NAME
   solves this issue via style SECINFO_STYLE4_PARENT, which works in the
   opposite direction from SECINFO.  As with Section 2.6.3.1.1.3, a put
   filehandle operation that is followed by a LOOKUPP MUST NOT return
   NFS4ERR_WRONGSEC.  If the server does not support SECINFO_NO_NAME,
   the client's only recourse is to send the put filehandle operation,
   LOOKUPP, GETFH sequence of operations with every security tuple it
   supports.

   Regardless of whether SECINFO_NO_NAME is supported, an NFSv4.1 server
   MUST NOT return NFS4ERR_WRONGSEC in response to a put filehandle
   operation if the operation is immediately followed by a LOOKUPP.

2.6.3.1.1.5.  Put Filehandle Operation + SECINFO/SECINFO_NO_NAME

   A security-sensitive client is allowed to choose a strong security
   tuple when querying a server to determine a file object's permitted
   security tuples.  The security tuple chosen by the client does not
   have to be included in the tuple list of the security policy of
   either the parent directory indicated in the put filehandle operation
   or the child file object indicated in SECINFO (or any parent
   directory indicated in SECINFO_NO_NAME).  Of course, the server has
   to be configured for whatever security tuple the client selects;
   otherwise, the request will fail at the RPC layer with an appropriate
   authentication error.

   In theory, there is no connection between the security flavor used by
   SECINFO or SECINFO_NO_NAME and those supported by the security
   policy.  But in practice, the client may start looking for strong
   flavors from those supported by the security policy, followed by
   those in the REQUIRED set.

   The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC to a put
   filehandle operation that is immediately followed by SECINFO or
   SECINFO_NO_NAME.  The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC
   from SECINFO or SECINFO_NO_NAME.

2.6.3.1.1.6.  Put Filehandle Operation + Nothing

   The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC.

2.6.3.1.1.7.  Put Filehandle Operation + Anything Else

   "Anything Else" includes OPEN by filehandle.

   The security policy enforcement applies to the filehandle specified
   in the put filehandle operation.  Therefore, the put filehandle
   operation MUST return NFS4ERR_WRONGSEC when there is a security tuple
   mismatch.  This avoids the complexity of adding NFS4ERR_WRONGSEC as
   an allowable error to every other operation.

   A COMPOUND containing the series put filehandle operation +
   SECINFO_NO_NAME (style SECINFO_STYLE4_CURRENT_FH) is an efficient way
   for the client to recover from NFS4ERR_WRONGSEC.

   The NFSv4.1 server MUST NOT return NFS4ERR_WRONGSEC to any operation
   other than a put filehandle operation, LOOKUP, LOOKUPP, and OPEN (by
   component name).
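
   The rules of Sections 2.6.3.1.1.1 through 2.6.3.1.1.7 can be
   condensed into one lookahead function.  This is a sketch, not server
   code: operations are represented by name only, and "OPEN_BY_NAME"
   and "OPEN_BY_FH" are hypothetical labels used to separate the two
   forms of OPEN.

```python
from typing import List

PUT_FH_OPS = {"PUTROOTFH", "PUTPUBFH", "PUTFH", "RESTOREFH"}

# Operations that take over the security check (or are exempt from it),
# so the preceding put filehandle operation MUST NOT return the error.
DEFERRING_OPS = {"LOOKUP", "OPEN_BY_NAME", "LOOKUPP",
                 "SECINFO", "SECINFO_NO_NAME"}


def put_fh_may_return_wrongsec(ops: List[str], index: int) -> bool:
    """Whether the put filehandle operation at ops[index] is the one that
    reports NFS4ERR_WRONGSEC when the security tuple does not match."""
    if ops[index] not in PUT_FH_OPS:
        raise ValueError("ops[index] is not a put filehandle operation")
    nxt = index + 1
    # The server pretends SAVEFH is not in the series.
    while nxt < len(ops) and ops[nxt] == "SAVEFH":
        nxt += 1
    if nxt == len(ops):
        return False      # put filehandle operation + nothing
    following = ops[nxt]
    if following in PUT_FH_OPS:
        return False      # only the Nth of N put operations is considered
    if following in DEFERRING_OPS:
        return False
    return True           # anything else, including OPEN by filehandle
```

   For instance, in SEQUENCE, PUTFH, SAVEFH, PUTFH, RENAME the first
   PUTFH is exempt because another put filehandle operation follows
   once SAVEFH is skipped, while the second PUTFH is where a mismatch
   is reported.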

2.6.3.1.1.8.  Operations after SECINFO and SECINFO_NO_NAME

   Suppose a client sends a COMPOUND procedure containing the series
   SEQUENCE, PUTFH, SECINFO_NO_NAME, READ, and suppose the security tuple
   used does not match that required for the target file.  By rule (see
   Section 2.6.3.1.1.5), neither PUTFH nor SECINFO_NO_NAME can return
   NFS4ERR_WRONGSEC.  By rule (see Section 2.6.3.1.1.7), READ cannot
   return NFS4ERR_WRONGSEC.  The issue is resolved by the fact that
   SECINFO and SECINFO_NO_NAME consume the current filehandle (note that
   this is a change from NFSv4.0).  This leaves no current filehandle
   for READ to use, and READ returns NFS4ERR_NOFILEHANDLE.

2.6.3.1.2. LINK and RENAME

   The LINK and RENAME operations use both the current and saved
   filehandles.  Technically, the server MAY return NFS4ERR_WRONGSEC
   from LINK or RENAME if the security policy of the saved filehandle
   rejects the security flavor used in the COMPOUND request's
   credentials.  If the server does so, then if there is no intersection
   between the security policies of saved and current filehandles, this
   means that it will be impossible for the client to perform the
   intended LINK or RENAME operation.

   For example, suppose the client sends this COMPOUND request:
   SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH, RENAME "c" "d", where
   filehandles bFH and aFH refer to different directories.  Suppose no
   common security tuple exists between the security policies of aFH and
   bFH.  If the client sends the request using credentials acceptable to
   bFH's security policy but not aFH's policy, then the PUTFH aFH
   operation will fail with NFS4ERR_WRONGSEC.  After a SECINFO_NO_NAME
   request, the client sends SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH,
   RENAME "c" "d", using credentials acceptable to aFH's security policy
   but not bFH's policy.  The server returns NFS4ERR_WRONGSEC on the
   RENAME operation.

   To prevent a client from looping endlessly between a request
   containing LINK or RENAME and a request containing SECINFO_NO_NAME or
   SECINFO, the server MUST detect when the security policies of the
   current and saved filehandles have no mutually acceptable security
   tuple, and MUST NOT return NFS4ERR_WRONGSEC from LINK or RENAME in
   that situation.  Instead the server MUST do one of two things:

   o  The server can return NFS4ERR_XDEV.

   o  The server can allow the security policy of the current filehandle
      to override that of the saved filehandle, and so return NFS4_OK.
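
   The server-side check for LINK and RENAME can be sketched as
   follows.  The function name, the string tuple labels, and the
   override flag are hypothetical.  The sketch assumes the security
   tuple in use has already been accepted for the current filehandle by
   the preceding put filehandle operation, and "NFS4_OK" means only
   that no security error is raised.

```python
from typing import FrozenSet


def link_rename_security_status(tuple_used: str,
                                saved_policy: FrozenSet[str],
                                current_policy: FrozenSet[str],
                                override_saved: bool = False) -> str:
    """Security outcome for LINK or RENAME under the rules above."""
    if not (saved_policy & current_policy):
        # No mutually acceptable tuple: NFS4ERR_WRONGSEC is forbidden,
        # because no choice of tuple could ever satisfy both policies.
        return "NFS4_OK" if override_saved else "NFS4ERR_XDEV"
    if tuple_used not in saved_policy:
        # The server MAY return this; the client can then recover by
        # picking a tuple from the intersection.
        return "NFS4ERR_WRONGSEC"
    return "NFS4_OK"
```

   In the RENAME example above, the policies of aFH and bFH are
   disjoint, so a server following this sketch would answer the second
   request with NFS4ERR_XDEV (or NFS4_OK if it lets the current
   filehandle's policy override) rather than NFS4ERR_WRONGSEC.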


