
RFC 7862

Network File System (NFS) Version 4 Minor Version 2 Protocol

Pages: 104
Proposed Standard
Updated by:  8178


4. Server-Side Copy

The server-side copy features provide mechanisms that allow an NFS client to copy file data on a server or between two servers without the data being transmitted back and forth over the network through the NFS client. Without these features, an NFS client would copy data from one location to another by reading the data from the source server over the network and then writing the data back over the network to the destination server.

If the source object and destination object are on different file servers, the file servers will communicate with one another to perform the COPY operation. The server-to-server protocol by which this is accomplished is not defined in this document.

The copy feature allows the server to perform the copying either synchronously or asynchronously. The client can request synchronous copying, but the server may not be able to honor this request. If the server intends to perform asynchronous copying, it supplies the client with a request identifier that the client can use to monitor the progress of the copying and, if appropriate, cancel a request in progress. The request identifier is a stateid representing the internal state held by the server while the copying is performed. Multiple asynchronous copies of all or part of a file may be in progress in parallel on a server; the stateid request identifier allows monitoring and canceling to be applied to the correct request.

4.1. Protocol Overview

The server-side copy offload operations support both intra-server and inter-server file copies. An intra-server copy is a copy in which the source file and destination file reside on the same server. In an inter-server copy, the source file and destination file are on different servers. In both cases, the copy may be performed synchronously or asynchronously.

In addition, the CLONE operation provides COPY-like functionality in the intra-server case, which is both synchronous and atomic in that other operations may not see the target file in any state between the state before the CLONE operation and the state after it.

Throughout the rest of this document, the NFS server containing the source file is referred to as the "source server" and the NFS server to which the file is transferred as the "destination server". In the case of an intra-server copy, the source server and destination server are the same server. Therefore, in the context of an intra-server copy, the terms "source server" and "destination server" refer to the single server performing the copy.

   The new operations are designed to copy files or regions within them.
   Other file system objects can be copied by building on these
   operations or using other techniques.  For example, if a user wishes
   to copy a directory, the client can synthesize a directory COPY
   operation by first creating the destination directory and the
   individual (empty) files within it and then copying the contents of
   the source directory's files to files in the new destination
   directory.
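
The directory synthesis described above can be sketched as a planning step on the client. This is a minimal illustration, not part of the protocol: the function name `plan_directory_copy` and the step labels `MKDIR`, `CREATE_EMPTY`, and `COPY` are hypothetical, and the sketch walks a locally visible tree rather than issuing NFS operations.

```python
import os
from typing import List, Tuple


def plan_directory_copy(src_root: str, dst_root: str) -> List[Tuple[str, str, str]]:
    """Return an ordered list of (step, source, destination) tuples.

    Step labels are illustrative only; they are not NFSv4.2 operation names.
    Directories come first, then empty destination files, then file copies.
    """
    mkdirs, creates, copies = [], [], []
    # os.walk is top-down by default, so parents are listed before children.
    for dirpath, dirnames, filenames in os.walk(src_root):
        dirnames.sort()  # deterministic traversal order
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = dst_root if rel == "." else os.path.join(dst_root, rel)
        mkdirs.append(("MKDIR", dirpath, dst_dir))
        for name in sorted(filenames):
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            creates.append(("CREATE_EMPTY", src, dst))
            copies.append(("COPY", src, dst))
    return mkdirs + creates + copies
```

Each `COPY` step in the returned plan would correspond to one server-side COPY of a regular file; everything else is ordinary client-side bookkeeping.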

   For the inter-server copy, the operations are defined to be
   compatible with the traditional copy authorization approach.  The
   client and user are authorized at the source for reading.  Then, they
   are authorized at the destination for writing.

4.1.1. COPY Operations

   CLONE:  Used by the client to request a synchronous atomic COPY-like
      operation.  (Section 15.13)

   COPY_NOTIFY:  Used by the client to request the source server to
      authorize a future file copy that will be made by a given
      destination server on behalf of the given user.  (Section 15.3)

   COPY:  Used by the client to request a file copy.  (Section 15.2)

   OFFLOAD_CANCEL:  Used by the client to terminate an asynchronous file
      copy.  (Section 15.8)

   OFFLOAD_STATUS:  Used by the client to poll the status of an
      asynchronous file copy.  (Section 15.9)

   CB_OFFLOAD:  Used by the destination server to report the results of
      an asynchronous file copy to the client.  (Section 16.1)

4.1.2. Requirements for Operations

Inter-server copy, intra-server copy, and intra-server clone are each OPTIONAL features in the context of server-side copy. A server may choose independently to implement any of them. A server implementing any of these features may be REQUIRED to implement certain operations. Other operations are OPTIONAL in the context of a particular feature (see Table 5 in Section 13) but may become REQUIRED, depending on server behavior. Clients need to use these operations to successfully copy a file.
   For a client to do an intra-server file copy, it needs to use either
   the COPY or the CLONE operation.  If COPY is used, the client MUST
   support the CB_OFFLOAD operation.  If COPY is used and it returns a
   stateid, then the client MAY use the OFFLOAD_CANCEL and
   OFFLOAD_STATUS operations.

   For a client to do an inter-server file copy, it needs to use the
   COPY and COPY_NOTIFY operations and MUST support the CB_OFFLOAD
   operation.  If COPY returns a stateid, then the client MAY use the
   OFFLOAD_CANCEL and OFFLOAD_STATUS operations.

   If a server supports the intra-server COPY feature, then the server
   MUST support the COPY operation.  If a server's COPY operation
   returns a stateid, then the server MUST also support these
   operations: CB_OFFLOAD, OFFLOAD_CANCEL, and OFFLOAD_STATUS.

   If a server supports the CLONE feature, then it MUST support the
   CLONE operation and the clone_blksize attribute on any file system on
   which CLONE is supported (as either source or destination file).

   If a source server supports the inter-server COPY feature, then it
   MUST support the COPY_NOTIFY and OFFLOAD_CANCEL operations.  If a
   destination server supports the inter-server COPY feature, then it
   MUST support the COPY operation.  If a destination server's COPY
   operation returns a stateid, then the destination server MUST also
   support these operations: CB_OFFLOAD, OFFLOAD_CANCEL, COPY_NOTIFY,
   and OFFLOAD_STATUS.
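
The server-side rules in the preceding paragraphs can be summarized as a small lookup. This is a sketch under stated assumptions: the function `required_server_ops` and its boolean parameters are hypothetical names introduced here, and the result lists operations only (the clone_blksize attribute required for CLONE is noted in a comment).

```python
from typing import Set


def required_server_ops(intra_copy: bool = False,
                        clone: bool = False,
                        inter_source: bool = False,
                        inter_destination: bool = False,
                        copy_returns_stateid: bool = False) -> Set[str]:
    """Operations a server MUST support for the features it implements."""
    ops: Set[str] = set()
    async_ops = {"CB_OFFLOAD", "OFFLOAD_CANCEL", "OFFLOAD_STATUS"}
    if intra_copy:
        ops.add("COPY")
        if copy_returns_stateid:
            ops |= async_ops
    if clone:
        # The clone_blksize attribute is also required; it is not an operation.
        ops.add("CLONE")
    if inter_source:
        ops |= {"COPY_NOTIFY", "OFFLOAD_CANCEL"}
    if inter_destination:
        ops.add("COPY")
        if copy_returns_stateid:
            ops |= async_ops | {"COPY_NOTIFY"}
    return ops
```

For example, `required_server_ops(inter_source=True)` yields only COPY_NOTIFY and OFFLOAD_CANCEL, which reflects that a pure source server never executes COPY itself.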

   Each operation is performed in the context of the user identified by
   the Open Network Computing (ONC) RPC credential in the RPC request
   containing the COMPOUND or CB_COMPOUND request.  For example, an
   OFFLOAD_CANCEL operation issued by a given user indicates that a
   specified COPY operation initiated by the same user is to be
   canceled.  Therefore, an OFFLOAD_CANCEL MUST NOT interfere with a
   copy of the same file initiated by another user.
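
One way a server might enforce the per-user rule above is to record the initiating user with each copy stateid. The sketch below is illustrative only: `CopyTable` and its methods are hypothetical, and a real server would key on its own internal state rather than on plain byte strings.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CopyTable:
    """Hypothetical in-memory table of asynchronous copies, keyed by stateid."""
    owners: Dict[bytes, str] = field(default_factory=dict)

    def start(self, stateid: bytes, user: str) -> None:
        """Record that `user` initiated the copy identified by `stateid`."""
        self.owners[stateid] = user

    def cancel(self, stateid: bytes, user: str) -> bool:
        """Cancel only a copy started by `user`; return True if canceled."""
        if self.owners.get(stateid) != user:
            return False  # unknown stateid, or a copy owned by another user
        del self.owners[stateid]
        return True
```

With this shape, a cancel request from one user cannot affect another user's copy of the same file, because the two copies are tracked under different stateids with different owners.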

   An NFS server MAY allow an administrative user to monitor or cancel
   COPY operations using an implementation-specific interface.

4.2. Requirements for Inter-Server Copy

The specification of the inter-server copy is driven by several requirements:

   o  The specification MUST NOT mandate the server-to-server protocol.

   o  The specification MUST provide guidance for using NFSv4.x as a
      copy protocol.  For those source and destination servers willing
      to use NFSv4.x, there are specific security considerations that
      the specification MUST address.

   o  The specification MUST NOT mandate preconfiguration between the
      source and destination servers.  Requiring that the source and
      destination servers first have a "copying relationship" increases
      the administrative burden.  However, the specification MUST NOT
      preclude implementations that require preconfiguration.

   o  The specification MUST NOT mandate a trust relationship between
      the source and destination servers.  The NFSv4 security model
      requires mutual authentication between a principal on an NFS
      client and a principal on an NFS server.  This model MUST continue
      with the introduction of COPY.

4.3. Implementation Considerations

4.3.1. Locking the Files

Both the source file and the destination file may need to be locked to protect the content during the COPY operations. A client can achieve this by a combination of OPEN and LOCK operations. That is, either share locks or byte-range locks might be desired. Note that when the client establishes a lock stateid on the source, the context of that stateid is for the client and not the destination. As such, there might already be an outstanding stateid, issued to the destination as the client of the source, with the same value as that provided for the lock stateid. The source MUST interpret the lock stateid as that of the client, i.e., when the destination presents it in the context of an inter-server copy, it is on behalf of the client.

4.3.2. Client Caches

In a traditional copy, if the client is in the process of writing to the file before the copy (and perhaps with a write delegation), it will be straightforward to update the destination server. With an inter-server copy, the source has no insight into the changes cached on the client. The client SHOULD write the data back to the source. If it does not do so, it is possible that the destination will receive a corrupt copy of the file.

4.4. Intra-Server Copy

To copy a file on a single server, the client uses a COPY operation. The server may respond to the COPY operation with the final results of the copy, or it may perform the copy asynchronously and deliver the results using a CB_OFFLOAD callback operation. If the copy is performed asynchronously, the client may poll the status of the copy using OFFLOAD_STATUS or cancel the copy using OFFLOAD_CANCEL.

A synchronous intra-server copy is shown in Figure 1. In this example, the NFS server chooses to perform the copy synchronously. The COPY operation is completed, either successfully or unsuccessfully, before the server replies to the client's request. The server's reply contains the final result of the operation.

     Client                                  Server
        +                                      +
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the source file
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the destination file
        |                                      |
        |--- COPY ---------------------------->| Client requests
        |<------------------------------------/| a file copy
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the destination file
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the source file
        |                                      |
        |                                      |

                 Figure 1: A Synchronous Intra-Server Copy

   An asynchronous intra-server copy is shown in Figure 2.  In this
   example, the NFS server performs the copy asynchronously.  The
   server's reply to the copy request indicates that the COPY operation
   was initiated and the final result will be delivered at a later time.
   The server's reply also contains a copy stateid.  The client may use
   this copy stateid to poll for status information (as shown) or to
   cancel the copy using an OFFLOAD_CANCEL.  When the server completes
   the copy, the server performs a callback to the client and reports
   the results.

     Client                                  Server
        +                                      +
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the source file
        |                                      |
        |--- OPEN ---------------------------->| Client opens
        |<------------------------------------/| the destination file
        |                                      |
        |--- COPY ---------------------------->| Client requests
        |<------------------------------------/| a file copy
        |                                      |
        |                                      |
        |--- OFFLOAD_STATUS ------------------>| Client may poll
        |<------------------------------------/| for status
        |                                      |
        |                  .                   | Multiple OFFLOAD_STATUS
        |                  .                   | operations may be sent
        |                  .                   |
        |                                      |
        |<-- CB_OFFLOAD -----------------------| Server reports results
        |\------------------------------------>|
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the destination file
        |                                      |
        |--- CLOSE --------------------------->| Client closes
        |<------------------------------------/| the source file
        |                                      |
        |                                      |

                Figure 2: An Asynchronous Intra-Server Copy
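
The difference between Figures 1 and 2 can be sketched from the client's point of view. Everything here is a toy model: `CopyReply`, `FakeServer`, and `client_copy` are hypothetical names, the fake server advances only when polled, and a real client would normally learn of completion from the CB_OFFLOAD callback rather than from polling alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CopyReply:
    """Hypothetical reply: a stateid is present only for asynchronous copies."""
    bytes_copied: int
    stateid: Optional[bytes] = None


class FakeServer:
    """Toy stand-in for an NFS server; copies `chunk` bytes per status poll."""

    def __init__(self, total: int, chunk: int, asynchronous: bool):
        if chunk <= 0:
            raise ValueError("chunk must be positive")
        self.total, self.chunk, self.asynchronous = total, chunk, asynchronous
        self.done = 0

    def copy(self) -> CopyReply:
        if not self.asynchronous:
            self.done = self.total          # finished before replying
            return CopyReply(self.total)
        return CopyReply(0, stateid=b"copy-stateid")

    def offload_status(self, stateid: bytes) -> int:
        self.done = min(self.total, self.done + self.chunk)
        return self.done


def client_copy(server: FakeServer) -> List[str]:
    """Return the trace of operations seen by the client."""
    trace = ["OPEN src", "OPEN dst", "COPY"]
    reply = server.copy()
    if reply.stateid is not None:           # asynchronous path (Figure 2)
        while True:
            trace.append("OFFLOAD_STATUS")
            if server.offload_status(reply.stateid) >= server.total:
                break
        trace.append("CB_OFFLOAD")          # server reports the final result
    trace += ["CLOSE dst", "CLOSE src"]
    return trace
```

The only branch point is whether the COPY reply carries a stateid; the OPEN and CLOSE bracketing is identical in both cases.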

4.5. Inter-Server Copy

A copy may also be performed between two servers. The copy protocol is designed to accommodate a variety of network topologies. As shown in Figure 3, the client and servers may be connected by multiple networks. In particular, the servers may be connected by a specialized, high-speed network (network 192.0.2.0/24 in the diagram) that does not include the client. The protocol allows the client to set up the copy between the servers (over network 203.0.113.0/24 in the diagram) and for the servers to communicate on the high-speed network if they choose to do so.

                             192.0.2.0/24
                +-------------------------------------+
                |                                     |
                |                                     |
                | 192.0.2.18                          | 192.0.2.56
        +-------+------+                       +------+------+
        |     Source   |                       | Destination |
        +-------+------+                       +------+------+
                | 203.0.113.18                        | 203.0.113.56
                |                                     |
                |                                     |
                |             203.0.113.0/24          |
                +------------------+------------------+
                                   |
                                   |
                                   | 203.0.113.243
                             +-----+-----+
                             |   Client  |
                             +-----------+

            Figure 3: An Example Inter-Server Network Topology

For an inter-server copy, the client notifies the source server that a file will be copied by the destination server using a COPY_NOTIFY operation. The client then initiates the copy by sending the COPY operation to the destination server. The destination server may perform the copy synchronously or asynchronously.

   A synchronous inter-server copy is shown in Figure 4.  In this case,
   the destination server chooses to perform the copy before responding
   to the client's COPY request.

     Client                Source         Destination
        +                    +                 +
        |                    |                 |
        |--- OPEN        --->|                 | Returns
        |<------------------/|                 | open state os1
        |                    |                 |
        |--- COPY_NOTIFY --->|                 |
        |<------------------/|                 |
        |                    |                 |
        |--- OPEN ---------------------------->| Returns
        |<------------------------------------/| open state os2
        |                    |                 |
        |--- COPY ---------------------------->|
        |                    |                 |
        |                    |                 |
        |                    |<----- READ -----|
        |                    |\--------------->|
        |                    |                 |
        |                    |        .        | Multiple READs may
        |                    |        .        | be necessary
        |                    |        .        |
        |                    |                 |
        |                    |                 |
        |<------------------------------------/| Destination replies
        |                    |                 | to COPY
        |                    |                 |
        |--- CLOSE --------------------------->| Release os2
        |<------------------------------------/|
        |                    |                 |
        |--- CLOSE       --->|                 | Release os1
        |<------------------/|                 |

                 Figure 4: A Synchronous Inter-Server Copy

   An asynchronous inter-server copy is shown in Figure 5.  In this
   case, the destination server chooses to respond to the client's COPY
   request immediately and then perform the copy asynchronously.

     Client                Source         Destination
       +                    +                 +
       |                    |                 |
       |--- OPEN        --->|                 | Returns
       |<------------------/|                 | open state os1
       |                    |                 |
       |--- LOCK        --->|                 | Optional; could be done
       |<------------------/|                 | with a share lock
       |                    |                 |
       |--- COPY_NOTIFY --->|                 | Need to pass in
       |<------------------/|                 | os1 or lock state
       |                    |                 |
       |                    |                 |
       |                    |                 |
       |--- OPEN ---------------------------->| Returns
       |<------------------------------------/| open state os2
       |                    |                 |
       |--- LOCK ---------------------------->| Optional ...
       |<------------------------------------/|
       |                    |                 |
       |--- COPY ---------------------------->| Need to pass in
       |<------------------------------------/| os2 or lock state
       |                    |                 |
       |                    |                 |
       |                    |<----- READ -----|
       |                    |\--------------->|
       |                    |                 |
       |                    |        .        | Multiple READs may
       |                    |        .        | be necessary
       |                    |        .        |
       |                    |                 |
       |                    |                 |
       |--- OFFLOAD_STATUS ------------------>| Client may poll
       |<------------------------------------/| for status
       |                    |                 |
       |                    |        .        | Multiple OFFLOAD_STATUS
       |                    |        .        | operations may be sent
       |                    |        .        |
       |                    |                 |
       |                    |                 |
       |                    |                 |
       |<-- CB_OFFLOAD -----------------------| Destination reports
       |\------------------------------------>| results
       |                    |                 |
       |--- LOCKU --------------------------->| Only if LOCK was done
       |<------------------------------------/|
       |                    |                 |
       |--- CLOSE --------------------------->| Release os2
       |<------------------------------------/|
       |                    |                 |
       |--- LOCKU       --->|                 | Only if LOCK was done
       |<------------------/|                 |
       |                    |                 |
       |--- CLOSE       --->|                 | Release os1
       |<------------------/|                 |
       |                    |                 |

                Figure 5: An Asynchronous Inter-Server Copy

4.6. Server-to-Server Copy Protocol

The choice of what protocol to use in an inter-server copy is ultimately the destination server's decision. However, the destination server has to be cognizant that it is working on behalf of the client.

4.6.1. Considerations on Selecting a Copy Protocol

The client can have requirements over both the size of transactions and error recovery semantics. It may want to split the copy up such that each chunk is synchronously transferred. It may want the copy protocol to copy the bytes in consecutive order such that upon an error the client can restart the copy at the last known good offset. If the destination server cannot meet these requirements, the client may prefer the traditional copy mechanism such that it can meet those requirements.
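
The restart behavior described above can be sketched as a loop that copies consecutive chunks and remembers the last known good offset. This is an illustration of the idea, not a protocol definition: `copy_in_chunks` and the `copy_range` callable are hypothetical, and `copy_range(offset, count)` is assumed to copy up to `count` bytes synchronously, return the number copied, and raise `OSError` on failure.

```python
from typing import Callable


def copy_in_chunks(length: int, chunk: int,
                   copy_range: Callable[[int, int], int],
                   start: int = 0) -> int:
    """Copy [start, length) in consecutive chunks.

    Returns the last known good offset, which equals `length` on success.
    """
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    offset = start
    while offset < length:
        try:
            done = copy_range(offset, min(chunk, length - offset))
        except OSError:
            break            # caller may retry from the returned offset
        if done <= 0:
            break            # no progress; stop rather than loop forever
        offset += done
    return offset
```

Because bytes are copied in increasing offset order, a caller that receives a return value smaller than `length` can call the function again with `start` set to that value and lose no completed work.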

4.6.2. Using NFSv4.x as the Copy Protocol

The destination server MAY use standard NFSv4.x (where x >= 1) operations to read the data from the source server. If NFSv4.x is used for the server-to-server copy protocol, the destination server can use the source filehandle and ca_src_stateid provided in the COPY request with standard NFSv4.x operations to read data from the source server. Note that the ca_src_stateid MUST be the cnr_stateid returned from the source via the COPY_NOTIFY (Section 15.3).

4.6.3. Using an Alternative Copy Protocol

In a homogeneous environment, the source and destination servers might be able to perform the file copy extremely efficiently using specialized protocols. For example, the source and destination servers might be two nodes sharing a common file system format for the source and destination file systems. Thus, the source and destination are in an ideal position to efficiently render the image of the source file to the destination file by replicating the file system formats at the block level. Another possibility is that the source and destination might be two nodes sharing a common storage area network, and thus there is no need to copy any data at all; instead, ownership of the file and its contents might simply be reassigned to the destination.

To allow for these possibilities, the destination server is allowed to use a server-to-server copy protocol of its choice.

In a heterogeneous environment, using a protocol other than NFSv4.x (e.g., HTTP [RFC7230] or FTP [RFC959]) presents some challenges. In particular, the destination server is presented with the challenge of accessing the source file given only an NFSv4.x filehandle.

One option for protocols that identify source files with pathnames is to use an ASCII hexadecimal representation of the source filehandle as the filename.

Another option for the source server is to use URLs to direct the destination server to a specialized service. For example, the response to COPY_NOTIFY could include the URL <ftp://s1.example.com:9999/_FH/0x12345>, where 0x12345 is the ASCII hexadecimal representation of the source filehandle. When the destination server receives the source server's URL, it would use "_FH/0x12345" as the filename to pass to the FTP server listening on port 9999 of s1.example.com. On port 9999 there would be a special instance of the FTP service that understands how to convert NFS filehandles to an open file descriptor (in many operating systems, this would require a new system call, one that is the inverse of the makefh() function that the pre-NFSv4 MOUNT service needs).

Authenticating and identifying the destination server to the source server is also a challenge. One solution would be to construct unique URLs for each destination server.
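
The URL convention in the example above can be sketched as a pair of helper functions. This is only an illustration: the names `filehandle_to_url` and `url_to_filehandle` are hypothetical, the "_FH/0x" prefix is taken from the example URL, and the sketch emits two hexadecimal digits per filehandle byte (so the bytes 01 23 45 appear as 0x012345).

```python
from urllib.parse import urlsplit

FH_PREFIX = "_FH/0x"  # path prefix taken from the example URL in the text


def filehandle_to_url(filehandle: bytes, host: str, port: int) -> str:
    """Build an FTP URL whose path encodes the filehandle in hexadecimal."""
    return f"ftp://{host}:{port}/{FH_PREFIX}{filehandle.hex()}"


def url_to_filehandle(url: str) -> bytes:
    """Recover the filehandle bytes from a URL built by filehandle_to_url."""
    path = urlsplit(url).path.lstrip("/")
    if not path.startswith(FH_PREFIX):
        raise ValueError("not a filehandle URL")
    return bytes.fromhex(path[len(FH_PREFIX):])
```

A source server could return the first function's result in its COPY_NOTIFY response, and the specialized service on the source would apply the second to the filename it receives.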

4.7. netloc4 - Network Locations

The server-side COPY operations specify network locations using the netloc4 data type shown below (see [RFC7863]):

   <CODE BEGINS>

   enum netloc_type4 {
           NL4_NAME        = 1,
           NL4_URL         = 2,
           NL4_NETADDR     = 3
   };

   union netloc4 switch (netloc_type4 nl_type) {
           case NL4_NAME:          utf8str_cis nl_name;
           case NL4_URL:           utf8str_cis nl_url;
           case NL4_NETADDR:       netaddr4    nl_addr;
   };

   <CODE ENDS>

If the netloc4 is of type NL4_NAME, the nl_name field MUST be specified as a UTF-8 string. The nl_name is expected to be resolved to a network address via DNS, the Lightweight Directory Access Protocol (LDAP), the Network Information Service (NIS), /etc/hosts, or some other means.

If the netloc4 is of type NL4_URL, a server URL [RFC3986] appropriate for the server-to-server COPY operation is specified as a UTF-8 string.

If the netloc4 is of type NL4_NETADDR, the nl_addr field MUST contain a valid netaddr4 as defined in Section 3.3.9 of [RFC5661].

When netloc4 values are used for an inter-server copy as shown in Figure 3, their values may be evaluated on the source server, destination server, and client. The network environment in which these systems operate should be configured so that the netloc4 values are interpreted as intended on each system.
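
To make the wire shape of the netloc4 union concrete, the following sketch encodes it by hand using standard XDR rules: a 4-byte big-endian discriminant, and strings as a 4-byte length followed by the data padded to a multiple of 4 bytes. The helper names `xdr_string` and `encode_netloc4` are hypothetical, and netaddr4 is assumed to be a pair of strings (network id and universal address) as in [RFC5661].

```python
import struct

NL4_NAME, NL4_URL, NL4_NETADDR = 1, 2, 3


def xdr_string(value: str) -> bytes:
    """XDR string: 4-byte big-endian length, data, zero padding to 4 bytes."""
    data = value.encode("utf-8")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_netloc4(nl_type: int, value) -> bytes:
    """Encode a netloc4; `value` is a str, or (netid, addr) for NL4_NETADDR."""
    head = struct.pack(">I", nl_type)      # union discriminant
    if nl_type in (NL4_NAME, NL4_URL):
        return head + xdr_string(value)
    if nl_type == NL4_NETADDR:
        netid, addr = value
        return head + xdr_string(netid) + xdr_string(addr)
    raise ValueError(f"unknown netloc_type4: {nl_type}")
```

For instance, `encode_netloc4(NL4_NETADDR, ("tcp", "192.0.2.18.8.1"))` encodes the source server from Figure 3 with port 2049 expressed in universal-address form.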

4.8. Copy Offload Stateids

A server may perform a copy offload operation asynchronously. An asynchronous copy is tracked using a copy offload stateid. Copy offload stateids are included in the COPY, OFFLOAD_CANCEL, OFFLOAD_STATUS, and CB_OFFLOAD operations. A copy offload stateid will be valid until either (A) the client or server restarts or (B) the client returns the resource by issuing an OFFLOAD_CANCEL operation or the client replies to a CB_OFFLOAD operation.
   A copy offload stateid's seqid MUST NOT be zero.  In the context of a
   copy offload operation, it is inappropriate to indicate "the most
   recent copy offload operation" using a stateid with a seqid of zero
   (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
   stateid refers to internal state in the server and there may be
   several asynchronous COPY operations being performed in parallel on
   the same file by the server.  Therefore, a copy offload stateid with
   a seqid of zero MUST be considered invalid.
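
The seqid rule above amounts to a simple validity check on receipt. The sketch below assumes the usual stateid4 layout of a 4-byte seqid followed by 12 opaque bytes; the function name `parse_copy_stateid` and the use of `ValueError` are illustrative choices, not protocol-defined behavior.

```python
import struct
from typing import Tuple


def parse_copy_stateid(raw: bytes) -> Tuple[int, bytes]:
    """Split a 16-byte stateid into (seqid, other); reject a seqid of zero."""
    if len(raw) != 16:
        raise ValueError("expected 4 bytes of seqid plus 12 bytes of 'other'")
    (seqid,) = struct.unpack(">I", raw[:4])
    if seqid == 0:
        # Zero would mean "most recent", which is ambiguous when several
        # asynchronous copies of the same file run in parallel.
        raise ValueError("copy offload stateid with seqid 0 is invalid")
    return seqid, raw[4:]
```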

4.9. Security Considerations for Server-Side Copy

All security considerations pertaining to NFSv4.1 [RFC5661] apply to this section; as such, the standard security mechanisms used by the protocol can be used to secure the server-to-server operations. NFSv4 clients and servers supporting the inter-server COPY operations described in this section are REQUIRED to implement the mechanism described in Section 4.9.1.1 and to support rejecting COPY_NOTIFY requests that do not use the RPC security protocol (RPCSEC_GSS) [RFC7861] with privacy. If the server-to-server copy protocol is based on ONC RPC, the servers are also REQUIRED to implement [RFC7861], including the RPCSEC_GSSv3 "copy_to_auth", "copy_from_auth", and "copy_confirm_auth" structured privileges. This requirement to implement is not a requirement to use; for example, a server may, depending on configuration, also allow COPY_NOTIFY requests that use only AUTH_SYS. If a server requires the use of an RPCSEC_GSSv3 copy_to_auth, copy_from_auth, or copy_confirm_auth privilege and it is not used, the server will reject the request with NFS4ERR_PARTNER_NO_AUTH.

4.9.1. Inter-Server Copy Security

4.9.1.1. Inter-Server Copy via ONC RPC with RPCSEC_GSSv3

When the client sends a COPY_NOTIFY to the source server to expect the destination to attempt to copy data from the source server, it is expected that this copy is being done on behalf of the principal (called the "user principal") that sent the RPC request that encloses the COMPOUND procedure that contains the COPY_NOTIFY operation. The user principal is identified by the RPC credentials. A mechanism that allows the user principal to authorize the destination server to perform the copy, lets the source server properly authenticate the destination's copy, and does not allow the destination server to exceed this authorization is necessary.
   An approach that sends delegated credentials of the client's user
   principal to the destination server is not used for the following
   reason.  If the client's user delegated its credentials, the
   destination would authenticate as the user principal.  If the
   destination were using the NFSv4 protocol to perform the copy, then
   the source server would authenticate the destination server as the
   user principal, and the file copy would securely proceed.  However,
   this approach would allow the destination server to copy other files.
   The user principal would have to trust the destination server to not
   do so.  This is counter to the requirements and therefore is not
   considered.

   Instead, a feature of the RPCSEC_GSSv3 protocol [RFC7861] can be
   used: RPC-application-defined structured privilege assertion.  This
   feature allows the destination server to authenticate to the source
   server as acting on behalf of the user principal and to authorize the
   destination server to perform READs of the file to be copied from the
   source on behalf of the user principal.  Once the copy is complete,
   the client can destroy the RPCSEC_GSSv3 handles to end the
   authorization of both the source and destination servers to copy.

   For each structured privilege assertion defined by an RPC
   application, RPCSEC_GSSv3 requires the application to define a name
   string and a data structure that will be encoded and passed between
   client and server as opaque data.  For NFSv4, the data structures
   specified below MUST be serialized using XDR.

   Three RPCSEC_GSSv3 structured privilege assertions that work together
   to authorize the copy are defined here.  For each of the assertions,
   the description starts with the name string passed in the rp_name
   field of the rgss3_privs structure defined in Section 2.7.1.4 of
   [RFC7861] and specifies the XDR encoding of the associated structured
   data passed via the rp_privilege field of the structure.

   copy_from_auth:  A user principal is authorizing a source principal
      ("nfs@<source>") to allow a destination principal
      ("nfs@<destination>") to set up the copy_confirm_auth privilege
      required to copy a file from the source to the destination on
      behalf of the user principal.  This privilege is established on
      the source server before the user principal sends a COPY_NOTIFY
      operation to the source server, and the resultant RPCSEC_GSSv3
      context is used to secure the COPY_NOTIFY operation.

      <CODE BEGINS>

   struct copy_from_auth_priv {
           secret4             cfap_shared_secret;
           netloc4             cfap_destination;
           /* the NFSv4 user name that the user principal maps to */
           utf8str_mixed       cfap_username;
   };

      <CODE ENDS>

      cfap_shared_secret is an automatically generated random number
      secret value.

   copy_to_auth:  A user principal is authorizing a destination
      principal ("nfs@<destination>") to set up a copy_confirm_auth
      privilege with a source principal ("nfs@<source>") to allow it to
      copy a file from the source to the destination on behalf of the
      user principal.  This privilege is established on the destination
      server before the user principal sends a COPY operation to the
      destination server, and the resultant RPCSEC_GSSv3 context is used
      to secure the COPY operation.

      <CODE BEGINS>

   struct copy_to_auth_priv {
           /* equal to cfap_shared_secret */
           secret4              ctap_shared_secret;
           netloc4              ctap_source<>;
           /* the NFSv4 user name that the user principal maps to */
           utf8str_mixed        ctap_username;
   };

      <CODE ENDS>

      ctap_shared_secret is the automatically generated secret value
      used to establish the copy_from_auth privilege with the source
      principal.  See Section 4.9.1.1.1.

   copy_confirm_auth:  A destination principal ("nfs@<destination>") is
      confirming with the source principal ("nfs@<source>") that it is
      authorized to copy data from the source.  This privilege is
      established on the destination server before the file is copied
      from the source to the destination.  The resultant RPCSEC_GSSv3
      context is used to secure the READ operations from the source to
      the destination server.

      <CODE BEGINS>

   struct copy_confirm_auth_priv {
           /* equal to GSS_GetMIC() of cfap_shared_secret */
           opaque              ccap_shared_secret_mic<>;
           /* the NFSv4 user name that the user principal maps to */
           utf8str_mixed       ccap_username;
   };

      <CODE ENDS>

4.9.1.1.1. Establishing a Security Context
   When the user principal wants to copy a file between two servers, if
   it has not established copy_from_auth and copy_to_auth privileges on
   the servers, it establishes them as follows:

   o  As noted in [RFC7861], the client uses an existing RPCSEC_GSSv3
      context termed the "parent" handle to establish and protect
      RPCSEC_GSSv3 structured privilege assertion exchanges.  The
      copy_from_auth privilege will use the context established between
      the user principal and the source server used to OPEN the source
      file as the RPCSEC_GSSv3 parent handle.  The copy_to_auth
      privilege will use the context established between the user
      principal and the destination server used to OPEN the destination
      file as the RPCSEC_GSSv3 parent handle.

   o  A random number is generated to use as a secret to be shared
      between the two servers.  Note that the random number SHOULD NOT
      be reused between establishing different security contexts.  The
      resulting shared secret will be placed in the copy_from_auth_priv
      cfap_shared_secret field and the copy_to_auth_priv
      ctap_shared_secret field.  Because of this shared_secret, the
      RPCSEC_GSS3_CREATE control messages for copy_from_auth and
      copy_to_auth MUST use a Quality of Protection (QoP) of
      rpc_gss_svc_privacy.

   o  An instance of copy_from_auth_priv is filled in with the shared
      secret, the destination server, and the NFSv4 user id of the user
      principal and is placed in rpc_gss3_create_args
      assertions[0].privs.privilege.  The string "copy_from_auth" is
      placed in assertions[0].privs.name.  The source server unwraps the
      rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and verifies that
      the NFSv4 user id being asserted matches the source server's
      mapping of the user principal.  If it does, the privilege is
      established on the source server as <copy_from_auth, user id,
      destination>.  The field "handle" in a successful reply is the
      RPCSEC_GSSv3 copy_from_auth "child" handle that the client will
      use in COPY_NOTIFY requests to the source server.

   o  An instance of copy_to_auth_priv is filled in with the shared
      secret, the cnr_source_server list returned by COPY_NOTIFY, and
      the NFSv4 user id of the user principal.  The copy_to_auth_priv
      instance is placed in rpc_gss3_create_args
      assertions[0].privs.privilege.  The string "copy_to_auth" is
      placed in assertions[0].privs.name.  The destination server
      unwraps the rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload and
      verifies that the NFSv4 user id being asserted matches the
      destination server's mapping of the user principal.  If it does,
      the privilege is established on the destination server as
      <copy_to_auth, user id, source list>.  The field "handle" in a
      successful reply is the RPCSEC_GSSv3 copy_to_auth child handle
      that the client will use in COPY requests to the destination
      server involving the source server.

   As noted in Section 2.7.1 of [RFC7861] ("New Control Procedure -
   RPCSEC_GSS_CREATE"), both the client and the source server should
   associate the RPCSEC_GSSv3 child handle with the parent RPCSEC_GSSv3
   handle used to create the RPCSEC_GSSv3 child handle.
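
   The secret generation and the two privilege bodies described above
   can be sketched in Python as follows.  This is a minimal
   illustration under stated assumptions, not an implementation: the
   dataclasses only mirror the XDR structs, netloc4 is simplified to a
   plain string, the 32-byte secret length is an assumption (the text
   only requires a random number), and the helper names are
   hypothetical.  The RPCSEC_GSS3_CREATE exchange itself is not shown.

```python
import secrets
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class CopyFromAuthPriv:
    cfap_shared_secret: bytes
    cfap_destination: str          # netloc4 simplified to a string (assumption)
    cfap_username: str             # NFSv4 user name, "user@domain" form


@dataclass(frozen=True)
class CopyToAuthPriv:
    ctap_shared_secret: bytes      # equal to cfap_shared_secret
    ctap_source: Tuple[str, ...]   # netloc4 list simplified to strings
    ctap_username: str


def new_shared_secret(length: int = 32) -> bytes:
    """Generate a fresh secret; it SHOULD NOT be reused across contexts."""
    return secrets.token_bytes(length)


def make_copy_from_auth(secret: bytes, destination: str, username: str):
    """Return the (name, privilege) pair for assertions[0].privs."""
    return "copy_from_auth", CopyFromAuthPriv(secret, destination, username)


def make_copy_to_auth(secret: bytes, cnr_source_server: Sequence[str],
                      username: str):
    """Built after COPY_NOTIFY, which supplies the cnr_source_server list."""
    return "copy_to_auth", CopyToAuthPriv(secret, tuple(cnr_source_server),
                                          username)
```

   Keeping the secret generation separate from the two builders
   reflects the ordering in the text: the copy_to_auth body cannot be
   completed until COPY_NOTIFY has returned the source server list.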

4.9.1.1.2. Starting a Secure Inter-Server Copy
When the client sends a COPY_NOTIFY request to the source server, it uses the privileged copy_from_auth RPCSEC_GSSv3 handle. cna_destination_server in the COPY_NOTIFY MUST be the same as cfap_destination specified in copy_from_auth_priv. Otherwise, the COPY_NOTIFY will fail with NFS4ERR_ACCESS. The source server verifies that the privilege <copy_from_auth, user id, destination> exists and annotates it with the source filehandle, if the user principal has read access to the source file and if administrative policies give the user principal and the NFS client read access to the source file (i.e., if the ACCESS operation would grant read access). Otherwise, the COPY_NOTIFY will fail with NFS4ERR_ACCESS.
   When the client sends a COPY request to the destination server, it
   uses the privileged copy_to_auth RPCSEC_GSSv3 handle.
   ca_source_server list in the COPY MUST be the same as ctap_source
   list specified in copy_to_auth_priv.  Otherwise, the COPY will fail
   with NFS4ERR_ACCESS.  The destination server verifies that the
   privilege <copy_to_auth, user id, source list> exists and annotates
   it with the source and destination filehandles.  If the COPY returns
   a wr_callback_id, then this is an asynchronous copy and the
   wr_callback_id must also be annotated to the copy_to_auth
   privilege.  If the client has failed to establish the copy_to_auth
   privilege, the destination server will reject the request with
   NFS4ERR_PARTNER_NO_AUTH.

   If either the COPY_NOTIFY operation or the COPY operations fail, the
   associated copy_from_auth and copy_to_auth RPCSEC_GSSv3 handles MUST
   be destroyed.
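
   The checks that the two servers apply in this section can be
   sketched as follows.  This is an illustrative model only: the
   privilege records are plain dicts, the function and key names are
   hypothetical, and the read-access decision is passed in as a boolean
   rather than derived from ACLs or administrative policy.

```python
from enum import Enum


class Status(Enum):
    NFS4_OK = "NFS4_OK"
    NFS4ERR_ACCESS = "NFS4ERR_ACCESS"
    NFS4ERR_PARTNER_NO_AUTH = "NFS4ERR_PARTNER_NO_AUTH"


def check_copy_notify(priv, cna_destination_server, source_fh, may_read):
    """Source server: validate COPY_NOTIFY against <copy_from_auth, ...>."""
    # cna_destination_server MUST equal cfap_destination.
    if cna_destination_server != priv["destination"]:
        return Status.NFS4ERR_ACCESS
    # The user principal needs read access to the source file.
    if not may_read:
        return Status.NFS4ERR_ACCESS
    # Annotate the privilege with the source filehandle.
    priv["source_fh"] = source_fh
    return Status.NFS4_OK


def check_copy(priv, ca_source_server, source_fh, dest_fh,
               wr_callback_id=None):
    """Destination server: validate COPY against <copy_to_auth, ...>."""
    # No copy_to_auth privilege was established by the client.
    if priv is None:
        return Status.NFS4ERR_PARTNER_NO_AUTH
    # ca_source_server MUST equal the ctap_source list.
    if list(ca_source_server) != list(priv["source_list"]):
        return Status.NFS4ERR_ACCESS
    priv["source_fh"] = source_fh
    priv["dest_fh"] = dest_fh
    # An asynchronous copy also records its wr_callback_id.
    if wr_callback_id is not None:
        priv["wr_callback_id"] = wr_callback_id
    return Status.NFS4_OK
```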

4.9.1.1.3. Securing ONC RPC Server-to-Server Copy Protocols
   After a destination server has a copy_to_auth privilege established
   on it and it receives a COPY request, if it knows it will use an ONC
   RPC protocol to copy data, it will establish a copy_confirm_auth
   privilege on the source server prior to responding to the COPY
   operation, as follows:

   o  Before establishing an RPCSEC_GSSv3 context, a parent context
      needs to exist between nfs@<destination> as the initiator
      principal and nfs@<source> as the target principal.  If NFS is to
      be used as the copy protocol, this means that the destination
      server must mount the source server using RPCSEC_GSSv3.

   o  An instance of copy_confirm_auth_priv is filled in with
      information from the established copy_to_auth privilege.  The
      value of the ccap_shared_secret_mic field is a GSS_GetMIC() of the
      ctap_shared_secret in the copy_to_auth privilege using the parent
      handle context.  The ccap_username field is the mapping of the
      user principal to an NFSv4 user name ("user"@"domain" form) and
      MUST be the same as the ctap_username in the copy_to_auth
      privilege.  The copy_confirm_auth_priv instance is placed in
      rpc_gss3_create_args assertions[0].privs.privilege.  The string
      "copy_confirm_auth" is placed in assertions[0].privs.name.

   o  The RPCSEC_GSS3_CREATE copy_confirm_auth message is sent to the
      source server with a QoP of rpc_gss_svc_privacy.  The source
      server unwraps the rpc_gss_svc_privacy RPCSEC_GSS3_CREATE payload
      and verifies the ccap_shared_secret_mic by calling GSS_VerifyMIC()
      using the parent context on the cfap_shared_secret from the
      established copy_from_auth privilege, and verifies that the
      ccap_username equals the cfap_username.

   o  If all verifications succeed, the copy_confirm_auth privilege is
      established on the source server as <copy_confirm_auth,
      shared_secret_mic, user id>.  Because the shared secret has been
      verified, the resultant copy_confirm_auth RPCSEC_GSSv3 child
      handle is noted to be acting on behalf of the user principal.

   o  If the source server fails to verify the copy_from_auth privilege,
      the COPY_NOTIFY operation will be rejected with
      NFS4ERR_PARTNER_NO_AUTH.

   o  If the destination server fails to verify the copy_to_auth or
      copy_confirm_auth privilege, the COPY will be rejected with
      NFS4ERR_PARTNER_NO_AUTH, causing the client to destroy the
      associated copy_from_auth and copy_to_auth RPCSEC_GSSv3 structured
      privilege assertion handles.

   o  All subsequent ONC RPC READ requests sent from the destination to
      copy data from the source to the destination will use the
      RPCSEC_GSSv3 copy_confirm_auth child handle.

   Note that the use of the copy_confirm_auth privilege accomplishes the
   following:

   o  If a protocol like NFS is being used with export policies, the
      export policies can be overridden if the destination server is not
      authorized to act as an NFS client.

   o  Manual configuration to allow a copy relationship between the
      source and destination is not needed.
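
   The data flow of the shared-secret confirmation can be sketched as
   follows.  GSS_GetMIC() and GSS_VerifyMIC() require a live GSS-API
   security context, so this sketch substitutes HMAC-SHA256 keyed with
   a stand-in "parent context key".  That substitution is an
   assumption made purely for illustration and is not the mechanism a
   real server would use; the function names and dict layout are
   likewise hypothetical.

```python
import hashlib
import hmac


def get_mic(context_key: bytes, message: bytes) -> bytes:
    """Stand-in for GSS_GetMIC(); NOT a real GSS-API mechanism."""
    return hmac.new(context_key, message, hashlib.sha256).digest()


def verify_mic(context_key: bytes, message: bytes, mic: bytes) -> bool:
    """Stand-in for GSS_VerifyMIC(), using a constant-time comparison."""
    return hmac.compare_digest(get_mic(context_key, message), mic)


def make_copy_confirm_auth(context_key, ctap_shared_secret, ctap_username):
    """Destination server: derive the body from its copy_to_auth data."""
    return {
        "ccap_shared_secret_mic": get_mic(context_key, ctap_shared_secret),
        "ccap_username": ctap_username,
    }


def source_accepts(context_key, confirm, cfap_shared_secret, cfap_username):
    """Source server: check the MIC against its own copy of the secret
    and check that the asserted user name matches copy_from_auth."""
    mic_ok = verify_mic(context_key, cfap_shared_secret,
                        confirm["ccap_shared_secret_mic"])
    return mic_ok and confirm["ccap_username"] == cfap_username
```

   The point of the exchange is that the secret itself never travels
   between the servers: the source server recomputes the check over
   the cfap_shared_secret it received from the client.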

4.9.1.1.4. Maintaining a Secure Inter-Server Copy
If the client determines that either the copy_from_auth or the copy_to_auth handle becomes invalid during a copy, then the copy MUST be aborted by the client sending an OFFLOAD_CANCEL to both the source and destination servers and destroying the respective copy-related context handles as described in Section 4.9.1.1.5.
4.9.1.1.5. Finishing or Stopping a Secure Inter-Server Copy
Under normal operation, the client MUST destroy the copy_from_auth and the copy_to_auth RPCSEC_GSSv3 handles once the COPY operation returns for a synchronous inter-server copy or a CB_OFFLOAD reports the result of an asynchronous copy.

   The copy_confirm_auth privilege is constructed from information held
   by the copy_to_auth privilege and MUST be destroyed by the
   destination server (via an RPCSEC_GSS3_DESTROY call) when the
   copy_to_auth RPCSEC_GSSv3 handle is destroyed.

   The copy_confirm_auth RPCSEC_GSSv3 handle is associated with a
   copy_from_auth RPCSEC_GSSv3 handle on the source server via the
   shared secret and MUST be locally destroyed (there is no
   RPCSEC_GSS3_DESTROY, as the source server is not the initiator) when
   the copy_from_auth RPCSEC_GSSv3 handle is destroyed.

   If the client sends an OFFLOAD_CANCEL to the source server to rescind
   the destination server's synchronous copy privilege, it uses the
   privileged copy_from_auth RPCSEC_GSSv3 handle, and the
   cra_destination_server in the OFFLOAD_CANCEL MUST be the same as the
   name of the destination server specified in copy_from_auth_priv.  The
   source server will then delete the <copy_from_auth, user id,
   destination> privilege and fail any subsequent copy requests sent
   under the auspices of this privilege from the destination server.
   The client MUST destroy both the copy_from_auth and the copy_to_auth
   RPCSEC_GSSv3 handles.

   If the client sends an OFFLOAD_STATUS to the destination server to
   check on the status of an asynchronous copy, it uses the privileged
   copy_to_auth RPCSEC_GSSv3 handle, and the osa_stateid in the
   OFFLOAD_STATUS MUST be the same as the wr_callback_id specified in
   the copy_to_auth privilege stored on the destination server.

   If the client sends an OFFLOAD_CANCEL to the destination server to
   cancel an asynchronous copy, it uses the privileged copy_to_auth
   RPCSEC_GSSv3 handle, and the oaa_stateid in the OFFLOAD_CANCEL MUST
   be the same as the wr_callback_id specified in the copy_to_auth
   privilege stored on the destination server.  The destination server
   will then delete the <copy_to_auth, user id, source list> privilege
   and the associated copy_confirm_auth RPCSEC_GSSv3 handle.  The client
   MUST destroy both the copy_to_auth and copy_from_auth RPCSEC_GSSv3
   handles.
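
   The destination-server bookkeeping for OFFLOAD_STATUS and
   OFFLOAD_CANCEL can be sketched as a small state object.  The class
   and attribute names are hypothetical, and returning False on a
   mismatched stateid is an assumption: this section does not name the
   error a server returns in that case.

```python
class DestinationCopyState:
    """Hypothetical per-copy state held by the destination server."""

    def __init__(self, wr_callback_id: bytes):
        # Stateid annotated on the copy_to_auth privilege by COPY.
        self.wr_callback_id = wr_callback_id
        self.copy_to_auth_valid = True
        self.copy_confirm_auth_valid = True

    def offload_status(self, stateid: bytes) -> bool:
        # The presented stateid must match the annotated wr_callback_id.
        return self.copy_to_auth_valid and stateid == self.wr_callback_id

    def offload_cancel(self, stateid: bytes) -> bool:
        if not self.offload_status(stateid):
            return False
        # Deleting copy_to_auth also tears down the associated
        # copy_confirm_auth handle.
        self.copy_to_auth_valid = False
        self.copy_confirm_auth_valid = False
        return True
```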

4.9.1.2. Inter-Server Copy via ONC RPC without RPCSEC_GSS
ONC RPC security flavors other than RPCSEC_GSS MAY be used with the server-side copy offload operations described in this section. In particular, host-based ONC RPC security flavors such as AUTH_NONE and AUTH_SYS MAY be used. If a host-based security flavor is used, a minimal level of protection for the server-to-server copy protocol is possible.
   The biggest issue is that there is a lack of a strong security method
   to allow the source server and destination server to identify
   themselves to each other.  A further complication is that in a
   multihomed environment the destination server might not contact the
   source server from the same network address specified by the client
   in the COPY_NOTIFY.  The cnr_stateid returned from the COPY_NOTIFY
   can be used to uniquely identify the destination server to the source
   server.  The use of the cnr_stateid provides initial authentication
   of the destination server but cannot defend against man-in-the-middle
   attacks after authentication or against an eavesdropper that observes
   the opaque stateid on the wire.  Other secure communication
   techniques (e.g., IPsec) are necessary to block these attacks.

   Servers SHOULD reject COPY_NOTIFY requests that do not use RPCSEC_GSS
   with privacy, thus ensuring that the cnr_stateid in the COPY_NOTIFY
   reply is encrypted.  For the same reason, clients SHOULD send COPY
   requests to the destination using RPCSEC_GSS with privacy.

5. Support for Application I/O Hints

Applications can issue client I/O hints via posix_fadvise() [posix_fadvise] to the NFS client. While this can help the NFS client optimize I/O and caching for a file, it does not allow the NFS server and its exported file system to do likewise. The IO_ADVISE procedure (Section 15.5) is used to communicate the client file access patterns to the NFS server. The NFS server, upon receiving an IO_ADVISE operation, MAY choose to alter its I/O and caching behavior but is under no obligation to do so. Application-specific NFS clients such as those used by hypervisors and databases can also leverage application hints to communicate their specialized requirements.
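
As an illustration of the application side, the following Python sketch issues a sequential-access hint through os.posix_fadvise(), which wraps the posix_fadvise() call on platforms that provide it. The helper name advise_sequential is hypothetical. Whether and how an NFS client turns such a hint into an IO_ADVISE operation is up to the client implementation and is not shown here.

```python
import os


def advise_sequential(path: str) -> bool:
    """Hint that the whole file will be read sequentially.

    Returns True if the hint was accepted, False if posix_fadvise()
    is unavailable on this platform or the call was rejected.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # Offset 0 with length 0 covers the file through its end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    except OSError:
        return False
    finally:
        os.close(fd)
    return True
```

The hint is advisory on the client in the same way that IO_ADVISE is advisory on the server: neither side is obliged to change its behavior.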

6. Sparse Files

A sparse file is a common way of representing a large file without having to utilize all of the disk space for it. Consequently, a sparse file uses less physical space than its size indicates. This means the file contains "holes", byte ranges within the file that contain no data. Most modern file systems support sparse files, including most UNIX file systems and Microsoft's New Technology File System (NTFS); however, it should be noted that Apple's Hierarchical File System Plus (HFS+) does not. Common examples of sparse files include Virtual Machine (VM) OS/disk images, database files, log files, and even checkpoint recovery files most commonly used by the High-Performance Computing (HPC) community.
   In addition, many modern file systems support the concept of
   "unwritten" or "uninitialized" blocks, which have uninitialized space
   allocated to them on disk but will return zeros until data is written
   to them.  Such functionality is already present in the data model of
   the pNFS block/volume layout (see [RFC5663]).  Uninitialized blocks
   can be thought of as holes inside a space reservation window.

   If an application reads a hole in a sparse file, the file system must
   return all zeros to the application.  For local data access there is
   little penalty, but with NFS these zeros must be transferred back to
   the client.  If an application uses the NFS client to read data into
   memory, this wastes time and bandwidth as the application waits for
   the zeros to be transferred.

   A sparse file is typically created by initializing the file to be all
   zeros.  Nothing is written to the data in the file; instead, the hole
   is recorded in the metadata for the file.  So, an 8G disk image might
   be represented initially by a few hundred bits in the metadata (on
   UNIX file systems, the inode) and nothing on the disk.  If the VM
   then writes 100M to a file in the middle of the image, there would
   now be two holes represented in the metadata and 100M in the data.

   No new operation is needed to allow the creation of a sparsely
   populated file; when a file is created and a write occurs past the
   current size of the file, the non-allocated region will either be a
   hole or be filled with zeros.  The choice of behavior is dictated by
   the underlying file system and is transparent to the application.
   However, the abilities to read sparse files and to punch holes to
   reinitialize the contents of a file are needed.
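
   The creation path described above can be observed locally with a
   short Python sketch.  The helper names are hypothetical, and whether
   the skipped region becomes a true hole or is filled with zeros
   depends on the underlying file system, so the sketch only relies on
   the behavior that is common to both cases: the region reads back as
   zeros.

```python
import os


def make_sparse(path: str, hole_size: int, payload: bytes) -> os.stat_result:
    """Create a file whose first hole_size bytes are never written."""
    with open(path, "wb") as f:
        f.seek(hole_size)   # move past the current end of the file
        f.write(payload)    # the skipped region is a hole or zeros
    return os.stat(path)


def read_range(path: str, offset: int, length: int) -> bytes:
    """Read length bytes starting at offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

   On file systems that support holes, the allocated space reported for
   such a file is typically much smaller than its st_size.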

   Two new operations -- DEALLOCATE (Section 15.4) and READ_PLUS
   (Section 15.10) -- are introduced.  DEALLOCATE allows for the hole
   punching, where an application might want to reset the allocation and
   reservation status of a range of the file.  READ_PLUS supports all
   the features of READ but includes an extension to support sparse
   files.  READ_PLUS is guaranteed to perform no worse than READ and can
   dramatically improve performance with sparse files.  READ_PLUS does
   not depend on pNFS protocol features but can be used by pNFS to
   support sparse files.

6.1. Terminology

   Regular file:  An object of file type NF4REG or NF4NAMEDATTR.

   Sparse file:  A regular file that contains one or more holes.

   Hole:  A byte range within a sparse file that contains all zeros.  A
      hole might or might not have space allocated or reserved to it.

6.2. New Operations

6.2.1. READ_PLUS

READ_PLUS is a new variant of the NFSv4.1 READ operation [RFC5661]. Besides being able to support all of the data semantics of the READ operation, it can also be used by the client and server to efficiently transfer holes. Because the client does not know in advance whether a hole is present or not, if the client supports READ_PLUS and so does the server, then it should always use the READ_PLUS operation in preference to the READ operation.

READ_PLUS extends the response with a new arm representing holes to avoid returning data for portions of the file that are initialized to zero and may or may not contain a backing store. Returning actual data blocks corresponding to holes wastes computational and network resources, thus reducing performance.

When a client sends a READ operation, it is not prepared to accept a READ_PLUS-style response providing a compact encoding of the scope of holes. If a READ occurs on a sparse file, then the server must expand such data to be raw bytes. If a READ occurs in the middle of a hole, the server can only send back bytes starting from that offset. By contrast, if a READ_PLUS occurs in the middle of a hole, the server can send back a range that starts before the offset and extends past the requested length.
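
The difference between the two response styles can be sketched with a toy model in Python. The model is an assumption made for illustration: a file is a sorted, non-overlapping list of (start, bytes) data extents, everything else is a hole, and segments are plain tuples rather than the XDR result arms. For simplicity the sketch clips hole segments to the requested range, whereas a server is allowed to report a hole that starts before the offset and extends past the requested length.

```python
def read_plus(extents, file_size, offset, count):
    """Return [("data", off, bytes) | ("hole", off, length), ...]."""
    end = min(offset + count, file_size)
    segments = []
    pos = offset
    for start, data in extents:
        stop = start + len(data)
        if stop <= pos:
            continue                 # extent lies before the request
        if start >= end:
            break                    # extent lies after the request
        if start > pos:
            segments.append(("hole", pos, start - pos))
            pos = start
        chunk = data[pos - start:end - start]
        segments.append(("data", pos, chunk))
        pos += len(chunk)
    if pos < end:
        segments.append(("hole", pos, end - pos))
    return segments


def read(extents, file_size, offset, count):
    """READ-style result: every hole is expanded into raw zero bytes."""
    out = bytearray()
    for kind, _, body in read_plus(extents, file_size, offset, count):
        out += bytes(body) if kind == "hole" else body
    return bytes(out)
```

For a 1000-byte file holding four data bytes at offset 100, read_plus() describes the whole file in three short segments, while read() has to return all 1000 bytes.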

6.2.2. DEALLOCATE

The client can use the DEALLOCATE operation on a range of a file as a hole punch, which allows the client to avoid the transfer of a repetitive pattern of zeros across the network. This hole punch is a result of the unreserved space returning all zeros until overwritten.
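
The effect of a hole punch can be sketched with a toy extent model in Python. This models only the visible result, not the operation's arguments or any server implementation: a file is a list of (start, bytes) data extents, and the function names are hypothetical.

```python
def deallocate(extents, offset, length):
    """Drop the data in [offset, offset + length) from the extent list."""
    end = offset + length
    result = []
    for start, data in extents:
        stop = start + len(data)
        if stop <= offset or start >= end:
            result.append((start, data))            # untouched extent
            continue
        if start < offset:
            result.append((start, data[:offset - start]))   # keep head
        if stop > end:
            result.append((end, data[end - start:]))        # keep tail
    return result


def read_back(extents, size):
    """Materialize the file; ranges without an extent read as zeros."""
    buf = bytearray(size)
    for start, data in extents:
        buf[start:start + len(data)] = data
    return bytes(buf)
```

Punching bytes 2 through 4 out of an 8-byte extent leaves two smaller extents, and the punched range reads back as zeros without any zeros having been written or sent. For local files on Linux, fallocate(2) with FALLOC_FL_PUNCH_HOLE provides a comparable facility.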

