Internet Research Task Force (IRTF)                               Y. Nir
Request for Comments: 7539                                   Check Point
Category: Informational                                       A. Langley
ISSN: 2070-1721                                             Google, Inc.
                                                                May 2015


                ChaCha20 and Poly1305 for IETF Protocols

Abstract

   This document defines the ChaCha20 stream cipher as well as the use
   of the Poly1305 authenticator, both as stand-alone algorithms and as
   a "combined mode", or Authenticated Encryption with Associated Data
   (AEAD) algorithm.

   This document does not introduce any new crypto, but is meant to
   serve as a stable reference and an implementation guide.  It is a
   product of the Crypto Forum Research Group (CFRG).

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Research Task Force
   (IRTF).  The IRTF publishes the results of Internet-related research
   and development activities.  These results might not be suitable for
   deployment.  This RFC represents the consensus of the Crypto Forum
   Research Group of the Internet Research Task Force (IRTF).  Documents
   approved for publication by the IRSG are not a candidate for any
   level of Internet Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7539.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Table of Contents

   1. Introduction
      1.1. Conventions Used in This Document
   2. The Algorithms
      2.1. The ChaCha Quarter Round
           2.1.1. Test Vector for the ChaCha Quarter Round
      2.2. A Quarter Round on the ChaCha State
           2.2.1. Test Vector for the Quarter Round on the ChaCha State
      2.3. The ChaCha20 Block Function
           2.3.1. The ChaCha20 Block Function in Pseudocode
           2.3.2. Test Vector for the ChaCha20 Block Function
      2.4. The ChaCha20 Encryption Algorithm
           2.4.1. The ChaCha20 Encryption Algorithm in Pseudocode
           2.4.2. Example and Test Vector for the ChaCha20 Cipher
      2.5. The Poly1305 Algorithm
           2.5.1. The Poly1305 Algorithms in Pseudocode
           2.5.2. Poly1305 Example and Test Vector
      2.6. Generating the Poly1305 Key Using ChaCha20
           2.6.1. Poly1305 Key Generation in Pseudocode
           2.6.2. Poly1305 Key Generation Test Vector
      2.7. A Pseudorandom Function for Crypto Suites based on
           ChaCha/Poly1305
      2.8. AEAD Construction
           2.8.1. Pseudocode for the AEAD Construction
           2.8.2. Example and Test Vector for AEAD_CHACHA20_POLY1305
   3. Implementation Advice
   4. Security Considerations
   5. IANA Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   Appendix A. Additional Test Vectors
     A.1. The ChaCha20 Block Functions
     A.2. ChaCha20 Encryption
     A.3. Poly1305 Message Authentication Code
     A.4. Poly1305 Key Generation Using ChaCha20
     A.5. ChaCha20-Poly1305 AEAD Decryption
   Appendix B. Performance Measurements of ChaCha20
   Acknowledgements
   Authors' Addresses

1.  Introduction

   The Advanced Encryption Standard (AES -- [FIPS-197]) has become the
   gold standard in encryption.  Its efficient design, widespread
   implementation, and hardware support allow for high performance in
   many areas.  On most modern platforms, AES is anywhere from four to
   ten times as fast as the previous most-used cipher, Triple Data
   Encryption Standard (3DES -- [SP800-67]), which makes it not only the
   best choice, but the only practical choice.

   There are several problems with this.  If future advances in
   cryptanalysis reveal a weakness in AES, users will be in an
   unenviable position.  With the only other widely supported cipher
   being the much slower 3DES, it is not feasible to reconfigure
   deployments to use 3DES.  [Standby-Cipher] describes this issue and
   the need for a standby cipher in greater detail.  Another problem is
   that while AES is very fast on dedicated hardware, its performance on
   platforms that lack such hardware is considerably lower.  Yet another
   problem is that many AES implementations are vulnerable to cache-
   collision timing attacks ([Cache-Collisions]).

   This document provides a definition and implementation guide for
   three algorithms:

   1.  The ChaCha20 cipher.  This is a high-speed cipher first described
       in [ChaCha].  It is considerably faster than AES in software-only
       implementations, making it around three times as fast on
       platforms that lack specialized AES hardware.  See Appendix B for
       some hard numbers.  ChaCha20 is also not sensitive to timing
       attacks (see the security considerations in Section 4).  This
       algorithm is described in Section 2.4.

   2.  The Poly1305 authenticator.  This is a high-speed message
       authentication code.  Implementation is also straightforward and
       easy to get right.  The algorithm is described in Section 2.5.

   3.  The CHACHA20-POLY1305 Authenticated Encryption with Associated
       Data (AEAD) construction, described in Section 2.8.

   This document does not introduce new algorithms.  The algorithms
   described here were defined in scientific papers by D. J. Bernstein,
   which are referenced by this document.  The purpose of this document
   is to serve as a stable reference for IETF documents making use of
   these algorithms.

   These algorithms have undergone rigorous analysis.  Several papers
   discuss the security of Salsa and ChaCha ([LatinDances],
   [LatinDances2], [Zhenqing2012]).

   This document represents the consensus of the Crypto Forum Research
   Group (CFRG).

1.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The description of the ChaCha algorithm will at various times refer to
   the ChaCha state as a "vector" or as a "matrix".  This follows the
   use of these terms in Professor Bernstein's paper.  The matrix
   notation is more visually convenient and gives a better notion as to
   why some rounds are called "column rounds" while others are called
   "diagonal rounds".  Here's a diagram of how the matrices relate to
   vectors (using the C language convention of zero being the index
   origin).

      0   1   2   3
      4   5   6   7
      8   9  10  11
     12  13  14  15

   The elements in this vector or matrix are 32-bit unsigned integers.

   The algorithm name is "ChaCha".  "ChaCha20" is a specific instance
   where 20 "rounds" (or 80 quarter rounds -- see Section 2.1) are used.
   Other variations are defined, with 8 or 12 rounds, but in this
   document we only describe the 20-round ChaCha, so the names "ChaCha"
   and "ChaCha20" will be used interchangeably.

2.  The Algorithms

   The subsections below describe the algorithms used and the AEAD
   construction.

2.1.  The ChaCha Quarter Round

   The basic operation of the ChaCha algorithm is the quarter round.  It
   operates on four 32-bit unsigned integers, denoted a, b, c, and d.
   The operation is as follows (in C-like notation):

   1.  a += b; d ^= a; d <<<= 16;
   2.  c += d; b ^= c; b <<<= 12;
   3.  a += b; d ^= a; d <<<= 8;
   4.  c += d; b ^= c; b <<<= 7;

   Where "+" denotes integer addition modulo 2^32, "^" denotes a bitwise
   Exclusive OR (XOR), and "<<< n" denotes an n-bit left rotation
   (towards the high bits).

   For example, let's see the add, XOR, and rotate operations from the
   fourth line with sample numbers:

   o  a = 0x11111111
   o  b = 0x01020304
   o  c = 0x77777777
   o  d = 0x01234567
   o  c = c + d = 0x77777777 + 0x01234567 = 0x789abcde
   o  b = b ^ c = 0x01020304 ^ 0x789abcde = 0x7998bfda
   o  b = b <<< 7 = 0x7998bfda <<< 7 = 0xcc5fed3c
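   The arithmetic in this example can be reproduced with a few lines
   of Python.  This is only an illustrative sketch: the helper name
   rotl32 and the constant MASK32 are not defined by this document and
   are introduced here for convenience.

```python
MASK32 = 0xFFFFFFFF  # all ChaCha arithmetic is on 32-bit unsigned words


def rotl32(x: int, n: int) -> int:
    """Rotate the 32-bit word x left by n bits (0 < n < 32)."""
    return ((x << n) & MASK32) | (x >> (32 - n))


# Fourth line of the quarter round: c += d; b ^= c; b <<<= 7
b, c, d = 0x01020304, 0x77777777, 0x01234567
c = (c + d) & MASK32   # addition modulo 2^32
b ^= c                 # bitwise XOR
b = rotl32(b, 7)       # 7-bit left rotation

print(f"c = {c:#010x}, b = {b:#010x}")
```

   Python integers are unbounded, so the explicit mask after each
   addition and shift is what keeps the values within 32 bits.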

2.1.1.  Test Vector for the ChaCha Quarter Round

   For a test vector, we will use the same numbers as in the example,
   adding something random for c.

   o  a = 0x11111111
   o  b = 0x01020304
   o  c = 0x9b8d6f43
   o  d = 0x01234567

   After running a Quarter Round on these four numbers, we get these:

   o  a = 0xea2a92f4
   o  b = 0xcb1cf8ce
   o  c = 0x4581472e
   o  d = 0x5881c4bb
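   A minimal Python sketch of the complete quarter round, checked
   against the numbers above, might look as follows.  The function
   names are assumptions chosen for this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate the 32-bit word x left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a: int, b: int, c: int, d: int):
    """Apply the four lines of the ChaCha quarter round to (a, b, c, d)."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


result = quarter_round(0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567)
print([f"{word:08x}" for word in result])
```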

2.2.  A Quarter Round on the ChaCha State

   The ChaCha state does not have four integer numbers: it has 16.  So
   the quarter-round operation works on only four of them -- hence the
   name.  Each quarter round operates on four predetermined numbers in
   the ChaCha state.  We will denote by QUARTERROUND(x,y,z,w) a quarter-
   round operation on the numbers at indices x, y, z, and w of the
   ChaCha state when viewed as a vector.  For example, if we apply
   QUARTERROUND(1,5,9,13) to a state, this means running the quarter-
   round operation on the elements marked with an asterisk, while
   leaving the others alone:

      0  *a   2   3
      4  *b   6   7
      8  *c  10  11
     12  *d  14  15

   Note that this quarter-round operation is part of what is called a
   "column round".

2.2.1.  Test Vector for the Quarter Round on the ChaCha State

   For a test vector, we will use a ChaCha state that was generated
   randomly:

   Sample ChaCha State

       879531e0  c5ecf37d  516461b1  c9a62f8a
       44c20ef3  3390af7f  d9fc690b  2a5f714c
       53372767  b00a5631  974c541a  359e9963
       5c971061  3d631689  2098d9d6  91dbd320

   We will apply the QUARTERROUND(2,7,8,13) operation to this state.
   For obvious reasons, this one is part of what is called a "diagonal
   round":

   After applying QUARTERROUND(2,7,8,13)

       879531e0  c5ecf37d *bdb886dc  c9a62f8a
       44c20ef3  3390af7f  d9fc690b *cfacafd2
      *e46bea80  b00a5631  974c541a  359e9963
       5c971061 *ccc07c79  2098d9d6  91dbd320

   Note that only the numbers in positions 2, 7, 8, and 13 changed.
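   The following Python sketch applies a quarter round to four
   positions of a 16-word state held in a list, using the sample state
   above.  The function name quarter_round_on_state is an assumption
   made for this illustration.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round_on_state(s: list, x: int, y: int, z: int, w: int) -> None:
    """Run QUARTERROUND(x,y,z,w) in place on the 16-word state s."""
    s[x] = (s[x] + s[y]) & MASK32
    s[w] = rotl32(s[w] ^ s[x], 16)
    s[z] = (s[z] + s[w]) & MASK32
    s[y] = rotl32(s[y] ^ s[z], 12)
    s[x] = (s[x] + s[y]) & MASK32
    s[w] = rotl32(s[w] ^ s[x], 8)
    s[z] = (s[z] + s[w]) & MASK32
    s[y] = rotl32(s[y] ^ s[z], 7)


before = [
    0x879531e0, 0xc5ecf37d, 0x516461b1, 0xc9a62f8a,
    0x44c20ef3, 0x3390af7f, 0xd9fc690b, 0x2a5f714c,
    0x53372767, 0xb00a5631, 0x974c541a, 0x359e9963,
    0x5c971061, 0x3d631689, 0x2098d9d6, 0x91dbd320,
]
after = list(before)
quarter_round_on_state(after, 2, 7, 8, 13)

# Indices whose value differs between the two states
changed = [i for i in range(16) if before[i] != after[i]]
```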

2.3.  The ChaCha20 Block Function

   The ChaCha block function transforms a ChaCha state by running
   multiple quarter rounds.

   The inputs to ChaCha20 are:

   o  A 256-bit key, treated as a concatenation of eight 32-bit little-
      endian integers.

   o  A 96-bit nonce, treated as a concatenation of three 32-bit little-
      endian integers.

   o  A 32-bit block count parameter, treated as a 32-bit little-endian
      integer.

   The output is 64 random-looking bytes.

   The ChaCha algorithm described here uses a 256-bit key.  The original
   algorithm also specified 128-bit keys and 8- and 12-round variants,
   but these are out of scope for this document.  In this section, we
   describe the ChaCha block function.

   Note also that the original ChaCha had a 64-bit nonce and 64-bit
   block count.  We have modified this here to be more consistent with
   recommendations in Section 3.2 of [RFC5116].  This limits the use of
   a single (key,nonce) combination to 2^32 blocks, or 256 GB, but that
   is enough for most uses.  In cases where a single key is used by
   multiple senders, it is important to make sure that they don't use
   the same nonces.  This can be assured by partitioning the nonce space
   so that the first 32 bits are unique per sender, while the other 64
   bits come from a counter.
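   One way to partition the nonce space as described above is sketched
   below.  The function name make_nonce and the little-endian packing
   of both fields are assumptions made for illustration; this document
   only requires that the first 32 bits be unique per sender and that
   the remaining 64 bits come from a counter.

```python
import struct


def make_nonce(sender_id: int, counter: int) -> bytes:
    """Build a 96-bit nonce: a 32-bit sender ID, then a 64-bit counter."""
    if not 0 <= sender_id < 2**32:
        raise ValueError("sender_id must fit in 32 bits")
    if not 0 <= counter < 2**64:
        raise ValueError("counter must fit in 64 bits")
    # "<" selects little-endian with no padding: 4 + 8 = 12 bytes
    return struct.pack("<IQ", sender_id, counter)


nonce_a = make_nonce(sender_id=1, counter=0)
nonce_b = make_nonce(sender_id=2, counter=0)
```

   Two senders that start their counters at the same value still
   produce different nonces, because the sender ID occupies the first
   32 bits.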

   The ChaCha20 state is initialized as follows:

   o  The first four words (0-3) are constants: 0x61707865, 0x3320646e,
      0x79622d32, 0x6b206574.

   o  The next eight words (4-11) are taken from the 256-bit key by
      reading the bytes in little-endian order, in 4-byte chunks.

   o  Word 12 is a block counter.  Since each block is 64 bytes, a
      32-bit word is enough for 256 gigabytes of data.

   o  Words 13-15 are a nonce, which should not be repeated for the same
      key.  Word 13 is the first 32 bits of the input nonce taken as a
      little-endian integer, while word 15 is the last 32 bits.

       cccccccc  cccccccc  cccccccc  cccccccc
       kkkkkkkk  kkkkkkkk  kkkkkkkk  kkkkkkkk
       kkkkkkkk  kkkkkkkk  kkkkkkkk  kkkkkkkk
       bbbbbbbb  nnnnnnnn  nnnnnnnn  nnnnnnnn

   c=constant k=key b=blockcount n=nonce
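   The state layout above can be sketched in Python as follows.  The
   function name chacha20_init_state is an assumption for this example;
   the standard struct module performs the little-endian conversions.

```python
import struct

# Words 0-3 of the state
CONSTANTS = (0x61707865, 0x3320646e, 0x79622d32, 0x6b206574)


def chacha20_init_state(key: bytes, counter: int, nonce: bytes) -> list:
    """Return the initial 16-word ChaCha20 state as a list of integers."""
    if len(key) != 32:
        raise ValueError("key must be 256 bits (32 bytes)")
    if len(nonce) != 12:
        raise ValueError("nonce must be 96 bits (12 bytes)")
    return [
        *CONSTANTS,                      # words 0-3: constants
        *struct.unpack("<8I", key),      # words 4-11: key, little-endian
        counter & 0xFFFFFFFF,            # word 12: block counter
        *struct.unpack("<3I", nonce),    # words 13-15: nonce, little-endian
    ]


state = chacha20_init_state(
    bytes(range(32)), 1, bytes.fromhex("000000090000004a00000000"))
for row in range(4):
    print("  ".join(f"{w:08x}" for w in state[4 * row:4 * row + 4]))
```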

   ChaCha20 runs 20 rounds, alternating between "column rounds" and
   "diagonal rounds".  Each round consists of four quarter-rounds, and
   they are run as follows.  Quarter rounds 1-4 are part of a "column"
   round, while 5-8 are part of a "diagonal" round:

   1.  QUARTERROUND ( 0, 4, 8,12)
   2.  QUARTERROUND ( 1, 5, 9,13)
   3.  QUARTERROUND ( 2, 6,10,14)
   4.  QUARTERROUND ( 3, 7,11,15)
   5.  QUARTERROUND ( 0, 5,10,15)
   6.  QUARTERROUND ( 1, 6,11,12)
   7.  QUARTERROUND ( 2, 7, 8,13)
   8.  QUARTERROUND ( 3, 4, 9,14)
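   The index tuples in the list above follow directly from the matrix
   layout in Section 1.1.  The short Python sketch below derives them;
   the names COLUMNS and DIAGONALS are assumptions for this example.

```python
# Column c of the 4x4 matrix holds vector indices c, c+4, c+8, c+12.
COLUMNS = [(c, c + 4, c + 8, c + 12) for c in range(4)]

# Diagonal c starts at column c in row 0 and moves one column to the
# right (wrapping around) in each following row.
DIAGONALS = [
    (c, 4 + (c + 1) % 4, 8 + (c + 2) % 4, 12 + (c + 3) % 4)
    for c in range(4)
]

# One iteration of the list: a column round, then a diagonal round
DOUBLE_ROUND = COLUMNS + DIAGONALS
```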

   At the end of 20 rounds (or 10 iterations of the above list), we add
   the original input words to the output words, and serialize the
   result by sequencing the words one-by-one in little-endian order.

   Note: "addition" in the above paragraph is done modulo 2^32.  In some
   machine languages, this is called carryless addition on a 32-bit
   word.
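   The final addition and serialization step can be sketched in Python
   as shown below.  The function name add_and_serialize is an
   assumption for this example; the mask implements addition modulo
   2^32, and the "<16I" format writes sixteen 32-bit words in
   little-endian order.

```python
import struct


def add_and_serialize(initial: list, working: list) -> bytes:
    """Add two 16-word states word by word (mod 2^32) and serialize."""
    if len(initial) != 16 or len(working) != 16:
        raise ValueError("both states must have 16 words")
    words = [(i + w) & 0xFFFFFFFF for i, w in zip(initial, working)]
    return struct.pack("<16I", *words)


# A sum that overflows 32 bits wraps around: the carry is dropped.
wrapped = add_and_serialize([0xFFFFFFFF] + [0] * 15, [2] + [0] * 15)
```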

2.3.1.  The ChaCha20 Block Function in Pseudocode

   Note: This section and a few others contain pseudocode for the
   algorithm explained in a previous section.  Every effort was made for
   the pseudocode to accurately reflect the algorithm as described in
   the preceding section.  If a conflict is still present, the textual
   explanation and the test vectors are normative.

      inner_block (state):
         Qround(state, 0, 4, 8,12)
         Qround(state, 1, 5, 9,13)
         Qround(state, 2, 6,10,14)
         Qround(state, 3, 7,11,15)
         Qround(state, 0, 5,10,15)
         Qround(state, 1, 6,11,12)
         Qround(state, 2, 7, 8,13)
         Qround(state, 3, 4, 9,14)
         end

      chacha20_block(key, counter, nonce):
         state = constants | key | counter | nonce
         working_state = state
         for i=1 upto 10
            inner_block(working_state)
            end
         state += working_state
         return serialize(state)
         end

2.3.2.  Test Vector for the ChaCha20 Block Function

   For a test vector, we will use the following inputs to the ChaCha20
   block function:

   o  Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
      14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.  The key is a sequence of
      octets with no particular structure before we copy it into the
      ChaCha state.

   o  Nonce = (00:00:00:09:00:00:00:4a:00:00:00:00)

   o  Block Count = 1.

   After setting up the ChaCha state, it looks like this:

   ChaCha state with the key setup.

       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000001  09000000  4a000000  00000000

   After running 20 rounds (10 column rounds interleaved with 10
   "diagonal rounds"), the ChaCha state looks like this:

   ChaCha state after 20 rounds

       837778ab  e238d763  a67ae21e  5950bb2f
       c4f2d0c7  fc62bb2f  8fa018fc  3f5ec7b7
       335271c2  f29489f3  eabda8fc  82e46ebd
       d19c12b4  b04e16de  9e83d0cb  4e3c50a2

   Finally, we add the original state to the result (simple vector or
   matrix addition), giving this:

   ChaCha state at the end of the ChaCha20 operation

       e4e7f110  15593bd1  1fdd0f50  c47120a3
       c7f4d1c7  0368c033  9aaa2204  4e6cd4c3
       466482d2  09aa9f07  05d7c214  a2028bd9
       d19c12b5  b94e16de  e883d0cb  4e3c50a2

   After we serialize the state, we get this:

  Serialized Block:
  000  10 f1 e7 e4 d1 3b 59 15 50 0f dd 1f a3 20 71 c4  .....;Y.P.... q.
  016  c7 d1 f4 c7 33 c0 68 03 04 22 aa 9a c3 d4 6c 4e  ....3.h.."....lN
  032  d2 82 64 46 07 9f aa 09 14 c2 d7 05 d9 8b 02 a2  ..dF............
  048  b5 12 9c d1 de 16 4e b9 cb d0 83 e8 a2 50 3c 4e  ......N......P<N
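   For readers who want to check the test vector above, a compact
   Python sketch of the block function follows.  It mirrors the
   pseudocode in Section 2.3.1 but is not a reference implementation:
   the names are assumptions, and no attempt is made at constant-time
   behavior or performance.

```python
import struct

MASK32 = 0xFFFFFFFF
CONSTANTS = (0x61707865, 0x3320646e, 0x79622d32, 0x6b206574)
# Four column quarter rounds followed by four diagonal quarter rounds
ROUND_INDICES = (
    (0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
    (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14),
)


def rotl32(x: int, n: int) -> int:
    return ((x << n) & MASK32) | (x >> (32 - n))


def qround(s: list, a: int, b: int, c: int, d: int) -> None:
    """In-place quarter round on state indices a, b, c, d."""
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 7)


def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    """Return one 64-byte ChaCha20 block for (key, counter, nonce)."""
    if len(key) != 32 or len(nonce) != 12:
        raise ValueError("need a 32-byte key and a 12-byte nonce")
    state = [*CONSTANTS, *struct.unpack("<8I", key),
             counter & MASK32, *struct.unpack("<3I", nonce)]
    working = list(state)
    for _ in range(10):              # 10 iterations = 20 rounds
        for indices in ROUND_INDICES:
            qround(working, *indices)
    # Add the original state (mod 2^32), then serialize little-endian
    summed = [(x + y) & MASK32 for x, y in zip(state, working)]
    return struct.pack("<16I", *summed)


block = chacha20_block(
    bytes(range(32)), 1, bytes.fromhex("000000090000004a00000000"))
print(block.hex())
```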

2.4.  The ChaCha20 Encryption Algorithm

   ChaCha20 is a stream cipher designed by D. J. Bernstein.  It is a
   refinement of the Salsa20 algorithm, and it uses a 256-bit key.

   ChaCha20 successively calls the ChaCha20 block function, with the
   same key and nonce, and with successively increasing block counter
   parameters.  ChaCha20 then serializes the resulting state by writing
   the numbers in little-endian order, creating a keystream block.

   Concatenating the keystream blocks from the successive blocks forms a
   keystream.  The ChaCha20 function then performs an XOR of this
   keystream with the plaintext.  Alternatively, each keystream block
   can be XORed with a plaintext block before proceeding to create the
   next block, saving some memory.  There is no requirement for the
   plaintext to be an integral multiple of 512 bits.  If there is extra
   keystream from the last block, it is discarded.  Specific protocols
   MAY require that the plaintext and ciphertext have a certain length.
   Such protocols need to specify how the plaintext is padded and how
   much padding it receives.

   The inputs to ChaCha20 are:

   o  A 256-bit key

   o  A 32-bit initial counter.  This can be set to any number, but will
      usually be zero or one.  It makes sense to use one if we use the
      zero block for something else, such as generating a one-time
      authenticator key as part of an AEAD algorithm.

   o  A 96-bit nonce.  In some protocols, this is known as the
      Initialization Vector.

   o  An arbitrary-length plaintext

   The output is an encrypted message, or "ciphertext", of the same
   length.

   Decryption is done in the same way.  The ChaCha20 block function is
   used to expand the key into a keystream, which is XORed with the
   ciphertext giving back the plaintext.

2.4.1.  The ChaCha20 Encryption Algorithm in Pseudocode

     chacha20_encrypt(key, counter, nonce, plaintext):
        for j = 0 upto floor(len(plaintext)/64)-1
           key_stream = chacha20_block(key, counter+j, nonce)
           block = plaintext[(j*64)..(j*64+63)]
           encrypted_message +=  block ^ key_stream
           end
        if ((len(plaintext) % 64) != 0)
           j = floor(len(plaintext)/64)
           key_stream = chacha20_block(key, counter+j, nonce)
           block = plaintext[(j*64)..len(plaintext)-1]
           encrypted_message += (block^key_stream)[0..(len(plaintext)%64)-1]
           end
        return encrypted_message
        end

2.4.2.  Example and Test Vector for the ChaCha20 Cipher

   For a test vector, we will use the following inputs to the ChaCha20
   encryption function:

   o  Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
      14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.

   o  Nonce = (00:00:00:00:00:00:00:4a:00:00:00:00).

   o  Initial Counter = 1.

   We use the following for the plaintext.  It was chosen to be long
   enough to require more than one block, but not so long that it would
   make this example cumbersome (so, less than 3 blocks):

  Plaintext Sunscreen:
  000  4c 61 64 69 65 73 20 61 6e 64 20 47 65 6e 74 6c  Ladies and Gentl
  016  65 6d 65 6e 20 6f 66 20 74 68 65 20 63 6c 61 73  emen of the clas
  032  73 20 6f 66 20 27 39 39 3a 20 49 66 20 49 20 63  s of '99: If I c
  048  6f 75 6c 64 20 6f 66 66 65 72 20 79 6f 75 20 6f  ould offer you o
  064  6e 6c 79 20 6f 6e 65 20 74 69 70 20 66 6f 72 20  nly one tip for
  080  74 68 65 20 66 75 74 75 72 65 2c 20 73 75 6e 73  the future, suns
  096  63 72 65 65 6e 20 77 6f 75 6c 64 20 62 65 20 69  creen would be i
  112  74 2e                                            t.

   The following figure shows four ChaCha state matrices:

   1.  First block as it is set up.

   2.  Second block as it is set up.  Note that these blocks are only
       two bits apart -- only the counter in position 12 is different.

   3.  Third block is the first block after the ChaCha20 block
       operation.

   4.  Final block is the second block after the ChaCha20 block
       operation was applied.

   After that, we show the keystream.

   First block setup:
       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000001  00000000  4a000000  00000000

   Second block setup:
       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000002  00000000  4a000000  00000000

   First block after block operation:
       f3514f22  e1d91b40  6f27de2f  ed1d63b8
       821f138c  e2062c3d  ecca4f7e  78cff39e
       a30a3b8a  920a6072  cd7479b5  34932bed
       40ba4c79  cd343ec6  4c2c21ea  b7417df0

   Second block after block operation:
       9f74a669  410f633f  28feca22  7ec44dec
       6d34d426  738cb970  3ac5e9f3  45590cc4
       da6e8b39  892c831a  cdea67c1  2b7e1d90
       037463f3  a11a2073  e8bcfb88  edc49139

   Keystream:
   22:4f:51:f3:40:1b:d9:e1:2f:de:27:6f:b8:63:1d:ed:8c:13:1f:82:3d:2c:06
   e2:7e:4f:ca:ec:9e:f3:cf:78:8a:3b:0a:a3:72:60:0a:92:b5:79:74:cd:ed:2b
   93:34:79:4c:ba:40:c6:3e:34:cd:ea:21:2c:4c:f0:7d:41:b7:69:a6:74:9f:3f
   63:0f:41:22:ca:fe:28:ec:4d:c4:7e:26:d4:34:6d:70:b9:8c:73:f3:e9:c5:3a
   c4:0c:59:45:39:8b:6e:da:1a:83:2c:89:c1:67:ea:cd:90:1d:7e:2b:f3:63

   Finally, we XOR the keystream with the plaintext, yielding the
   ciphertext:

  Ciphertext Sunscreen:
  000  6e 2e 35 9a 25 68 f9 80 41 ba 07 28 dd 0d 69 81  n.5.%h..A..(..i.
  016  e9 7e 7a ec 1d 43 60 c2 0a 27 af cc fd 9f ae 0b  .~z..C`..'......
  032  f9 1b 65 c5 52 47 33 ab 8f 59 3d ab cd 62 b3 57  ..e.RG3..Y=..b.W
  048  16 39 d6 24 e6 51 52 ab 8f 53 0c 35 9f 08 61 d8  .9.$.QR..S.5..a.
  064  07 ca 0d bf 50 0d 6a 61 56 a3 8e 08 8a 22 b6 5e  ....P.jaV....".^
  080  52 bc 51 4d 16 cc f8 06 81 8c e9 1a b7 79 37 36  R.QM.........y76
  096  5a f9 0b bf 74 a3 5b e6 b4 0b 8e ed f2 78 5e 42  Z...t.[......x^B
  112  87 4d                                            .M
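   The encryption procedure and the example above can be exercised
   with the Python sketch below.  It is a minimal illustration under
   the same assumptions as the pseudocode in Section 2.4.1; the helper
   names are not defined by this document, and the code is not meant
   for production use.

```python
import struct

MASK32 = 0xFFFFFFFF
CONSTANTS = (0x61707865, 0x3320646e, 0x79622d32, 0x6b206574)
ROUND_INDICES = (
    (0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
    (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14),
)


def rotl32(x: int, n: int) -> int:
    return ((x << n) & MASK32) | (x >> (32 - n))


def qround(s: list, a: int, b: int, c: int, d: int) -> None:
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 7)


def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    state = [*CONSTANTS, *struct.unpack("<8I", key),
             counter & MASK32, *struct.unpack("<3I", nonce)]
    working = list(state)
    for _ in range(10):
        for indices in ROUND_INDICES:
            qround(working, *indices)
    summed = [(x + y) & MASK32 for x, y in zip(state, working)]
    return struct.pack("<16I", *summed)


def chacha20_encrypt(key: bytes, counter: int, nonce: bytes,
                     plaintext: bytes) -> bytes:
    """XOR plaintext with the ChaCha20 keystream, 64 bytes at a time."""
    out = bytearray()
    for offset in range(0, len(plaintext), 64):
        keystream = chacha20_block(key, counter + offset // 64, nonce)
        chunk = plaintext[offset:offset + 64]
        # zip() stops at the end of a short final chunk, so any unused
        # keystream bytes are discarded.
        out.extend(p ^ k for p, k in zip(chunk, keystream))
    return bytes(out)


key = bytes(range(32))
nonce = bytes.fromhex("000000000000004a00000000")
plaintext = (b"Ladies and Gentlemen of the class of '99: If I could offer "
             b"you only one tip for the future, sunscreen would be it.")
ciphertext = chacha20_encrypt(key, 1, nonce, plaintext)
# Decryption is the same operation applied to the ciphertext.
recovered = chacha20_encrypt(key, 1, nonce, ciphertext)
```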

2.5.  The Poly1305 Algorithm

   Poly1305 is a one-time authenticator designed by D. J. Bernstein.
   Poly1305 takes a 32-byte one-time key and a message and produces a
   16-byte tag.  This tag is used to authenticate the message.

   The original article ([Poly1305]) is titled "The Poly1305-AES
   message-authentication code", and the MAC function there requires a
   128-bit AES key, a 128-bit "additional key", and a 128-bit (non-
   secret) nonce.  AES is used there for encrypting the nonce, so as to
   get a unique (and secret) 128-bit string, but as the paper states,
   "There is nothing special about AES here.  One can replace AES with
   an arbitrary keyed function from an arbitrary set of nonces to
   16-byte strings."

   Regardless of how the key is generated, the key is partitioned into
   two parts, called "r" and "s".  The pair (r,s) should be unique, and
   MUST be unpredictable for each invocation (that is why it was
   originally obtained by encrypting a nonce), while "r" MAY be
   constant.  Before being used, "r" needs to be modified as follows
   ("r" is treated as a 16-octet little-endian number):

   o  r[3], r[7], r[11], and r[15] are required to have their top four
      bits clear (be smaller than 16)

   o  r[4], r[8], and r[12] are required to have their bottom two bits
      clear (be divisible by 4)

   The following sample code clamps "r" to be appropriate:

   /*
   Adapted from poly1305aes_test_clamp.c version 20050207
   D. J. Bernstein
   Public domain.
   */

   #include "poly1305aes_test.h"

   void poly1305aes_test_clamp(unsigned char r[16])
   {
     r[3] &= 15;
     r[7] &= 15;
     r[11] &= 15;
     r[15] &= 15;
     r[4] &= 252;
     r[8] &= 252;
     r[12] &= 252;
   }
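   When "r" is handled as a single integer rather than as an array of
   octets, the same clamping can be expressed with one mask, as in the
   pseudocode in Section 2.5.1.  The Python sketch below assumes this
   integer representation; the function name clamp_r is chosen for
   this example only.

```python
# Clears the top four bits of octets 3, 7, 11, 15 and the bottom two
# bits of octets 4, 8, 12 of the little-endian number r.
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff


def clamp_r(r_bytes: bytes) -> int:
    """Interpret 16 octets as a little-endian number and clamp it."""
    if len(r_bytes) != 16:
        raise ValueError("r must be 16 octets")
    return int.from_bytes(r_bytes, "little") & CLAMP_MASK


r = clamp_r(bytes.fromhex("85d6be7857556d337f4452fe42d506a8"))
print(f"{r:x}")
```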

   The "s" should be unpredictable, but it is perfectly acceptable to
   generate both "r" and "s" uniquely each time.  Because each of them
   is 128 bits, pseudorandomly generating them (see Section 2.6) is also
   acceptable.

   The inputs to Poly1305 are:

   o  A 256-bit one-time key

   o  An arbitrary-length message

   The output is a 128-bit tag.

   First, the "r" value should be clamped.

   Next, set the constant prime "P" to be 2^130-5, which is
   0x3fffffffffffffffffffffffffffffffb in hexadecimal.  Also set a
   variable "accumulator" to zero.

   Next, divide the message into 16-byte blocks.  The last one might be
   shorter:

   o  Read the block as a little-endian number.

   o  Add one bit beyond the number of octets.  For a 16-byte block,
      this is equivalent to adding 2^128 to the number.  For the shorter
      block, it can be 2^120, 2^112, or any power of two whose exponent
      is a multiple of 8, all the way down to 2^8.

   o  If the block is not 17 bytes long (the last block), pad it with
      zeros.  This is meaningless if you are treating the blocks as
      numbers.

   o  Add this number to the accumulator.

   o  Multiply by "r".

   o  Set the accumulator to the result modulo P.  To summarize: Acc =
      ((Acc+block)*r) % P.

   Finally, the value of the secret key "s" is added to the accumulator,
   and the 128 least significant bits are serialized in little-endian
   order to form the tag.
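   The per-block steps listed above can be sketched in Python, whose
   integers have arbitrary precision.  The function names
   block_to_number and absorb are assumptions for this illustration;
   "r" is expected to be already clamped.

```python
P = (1 << 130) - 5  # the constant prime


def block_to_number(block: bytes) -> int:
    """Read a 1..16 byte block as a little-endian number and add one
    bit just beyond its last octet."""
    if not 1 <= len(block) <= 16:
        raise ValueError("block must be 1 to 16 bytes long")
    return int.from_bytes(block, "little") + (1 << (8 * len(block)))


def absorb(acc: int, block: bytes, r: int) -> int:
    """One accumulator update: Acc = ((Acc + block) * r) % P."""
    return ((acc + block_to_number(block)) * r) % P


full = block_to_number(b"Cryptographic Fo")   # 16-byte block: adds 2^128
short = block_to_number(b"up")                # 2-byte block: adds 2^16
```

   Because the blocks are treated as numbers, the zero padding
   mentioned in the list has no effect here and is omitted.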

2.5.1.  The Poly1305 Algorithms in Pseudocode

      clamp(r): r &= 0x0ffffffc0ffffffc0ffffffc0fffffff
      poly1305_mac(msg, key):
         r = le_bytes_to_num(key[0..15])
         clamp(r)
         s = le_bytes_to_num(key[16..31])
         a = 0  /* a is the accumulator */
         p = (1<<130)-5
         for i=1 upto ceil(msg length in bytes / 16)
            n = le_bytes_to_num(msg[((i-1)*16)..(i*16)-1] | [0x01])
            a += n
            a = (r * a) % p
            end
         a += s
         return num_to_16_le_bytes(a)
         end

2.5.2.  Poly1305 Example and Test Vector

   For our example, we will dispense with generating the one-time key
   using AES, and assume that we got the following keying material:

   o  Key Material: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8:
      01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b

   o  s as an octet string:
      01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b

   o  s as a 128-bit number: 1bf54941aff6bf4afdb20dfb8a800301

   o  r before clamping: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8

   o  Clamped r as a number: 806d5400e52447c036d555408bed685

   For our message, we'll use a short text:

  Message to be Authenticated:
  000  43 72 79 70 74 6f 67 72 61 70 68 69 63 20 46 6f  Cryptographic Fo
  016  72 75 6d 20 52 65 73 65 61 72 63 68 20 47 72 6f  rum Research Gro
  032  75 70                                            up

   Since Poly1305 works in 16-byte chunks, the 34-byte message divides
   into three blocks.  In the following calculation, "Acc" denotes the
   accumulator and "Block" the current block:

   Block #1

   Acc = 00
   Block = 6f4620636968706172676f7470797243
   Block with 0x01 byte = 016f4620636968706172676f7470797243
   Acc + block = 016f4620636968706172676f7470797243
   (Acc+Block) * r =
        b83fe991ca66800489155dcd69e8426ba2779453994ac90ed284034da565ecf
   Acc = ((Acc+Block)*r) % P = 2c88c77849d64ae9147ddeb88e69c83fc

   Block #2

   Acc = 2c88c77849d64ae9147ddeb88e69c83fc
   Block = 6f7247206863726165736552206d7572
   Block with 0x01 byte = 016f7247206863726165736552206d7572
   Acc + block = 437febea505c820f2ad5150db0709f96e
   (Acc+Block) * r =
        21dcc992d0c659ba4036f65bb7f88562ae59b32c2b3b8f7efc8b00f78e548a26
   Acc = ((Acc+Block)*r) % P = 2d8adaf23b0337fa7cccfb4ea344b30de

   Last Block

   Acc = 2d8adaf23b0337fa7cccfb4ea344b30de
   Block = 7075
   Block with 0x01 byte = 017075
   Acc + block = 2d8adaf23b0337fa7cccfb4ea344ca153
   (Acc + Block) * r =
        16d8e08a0f3fe1de4fe4a15486aca7a270a29f1e6c849221e4a6798b8e45321f
   ((Acc + Block) * r) % P = 28d31b7caff946c77c8844335369d03a7

   Adding s, we get this number, and serialize it to get the tag:

   Acc + s = 2a927010caf8b2bc2c6365130c11d06a8

   Tag: a8:06:1d:c1:30:51:36:c6:c2:2b:8b:af:0c:01:27:a9
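   The whole calculation above can be reproduced with the following
   Python sketch of the MAC function.  It follows the pseudocode in
   Section 2.5.1 and relies on Python's arbitrary-precision integers;
   it is an illustration only and makes no claim to constant-time
   behavior.

```python
P = (1 << 130) - 5
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff


def poly1305_mac(msg: bytes, key: bytes) -> bytes:
    """Compute the 16-byte Poly1305 tag of msg under a 32-byte key."""
    if len(key) != 32:
        raise ValueError("key must be 256 bits (32 bytes)")
    r = int.from_bytes(key[:16], "little") & CLAMP_MASK  # clamped r
    s = int.from_bytes(key[16:], "little")
    acc = 0
    for offset in range(0, len(msg), 16):
        block = msg[offset:offset + 16]      # the last one may be shorter
        n = int.from_bytes(block, "little") + (1 << (8 * len(block)))
        acc = ((acc + n) * r) % P
    acc += s
    # Keep only the 128 least significant bits, little-endian
    return (acc & ((1 << 128) - 1)).to_bytes(16, "little")


key = bytes.fromhex(
    "85d6be7857556d337f4452fe42d506a8"
    "0103808afb0db2fd4abff6af4149f51b")
tag = poly1305_mac(b"Cryptographic Forum Research Group", key)
print(":".join(f"{octet:02x}" for octet in tag))
```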

2.6.  Generating the Poly1305 Key Using ChaCha20

   As said in Section 2.5, it is acceptable to generate the one-time
   Poly1305 key pseudorandomly.  This section defines such a method.

   To generate such a key pair (r,s), we will use the ChaCha20 block
   function described in Section 2.3.  This assumes that we have a
   256-bit session key for the Message Authentication Code (MAC)
   function, such as SK_ai and SK_ar in Internet Key Exchange Protocol
   version 2 (IKEv2) ([RFC7296]), the integrity key in the Encapsulating
   Security Payload (ESP) and Authentication Header (AH), or the
   client_write_MAC_key and server_write_MAC_key in TLS.  Any document
   that specifies the use of Poly1305 as a MAC algorithm for some
   protocol must specify that 256 bits are allocated for the integrity
   key.  Note that in the AEAD construction defined in Section 2.8, the
   same key is used for encryption and key generation, so the use of
   SK_a* or *_write_MAC_key is only for stand-alone Poly1305.

   The method is to call the block function with the following
   parameters:

   o  The 256-bit session integrity key is used as the ChaCha20 key.

   o  The block counter is set to zero.

   o  The protocol will specify a 96-bit or 64-bit nonce.  This MUST be
      unique per invocation with the same key, so it MUST NOT be
      randomly generated.  A counter is a good way to implement this,
      but other methods, such as a Linear Feedback Shift Register (LFSR)
      are also acceptable.  ChaCha20 as specified here requires a 96-bit
      nonce.  So if the provided nonce is only 64-bit, then the first 32
      bits of the nonce will be set to a constant number.  This will
      usually be zero, but for protocols with multiple senders it may be
      different for each sender, but should be the same for all
      invocations of the function with the same key by a particular
      sender.
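   The 64-bit case can be sketched as follows.  This is a minimal
   illustration, assuming the constant is placed before the protocol
   nonce as in the pseudocode of Section 2.8.1; build_nonce is a
   hypothetical helper name, and the sample values are the ones used in
   Section 2.8.2.

```python
import struct

def build_nonce(constant: bytes, iv: bytes) -> bytes:
    """Prefix a 64-bit protocol nonce with a 32-bit constant (hypothetical helper)."""
    if len(constant) != 4 or len(iv) != 8:
        raise ValueError("expected a 4-byte constant and an 8-byte nonce")
    return constant + iv

# Sample values from Section 2.8.2 (sender id = 7).
nonce = build_nonce(bytes.fromhex("07000000"),
                    bytes.fromhex("4041424344454647"))

# The 96-bit nonce fills state words 13..15 as little-endian integers.
nonce_words = struct.unpack("<3I", nonce)
print([f"{w:08x}" for w in nonce_words])
```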

   After running the block function, we have a 512-bit state.  We take
   the first 256 bits of the serialized state, and use those as the one-
   time Poly1305 key: the first 128 bits are clamped and form "r", while
   the next 128 bits become "s".  The other 256 bits are discarded.
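   A minimal sketch of this split, assuming the clamping mask from
   Section 2.5 and using the one-time key bytes from the example in
   Section 2.8.2.  The name split_poly1305_key is hypothetical, and the
   zero bytes appended below merely stand in for the discarded half of
   the block.

```python
# Clamping mask for r, as given in Section 2.5.
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff

def split_poly1305_key(block: bytes):
    """Derive (r, s) from a 64-byte serialized ChaCha20 block (hypothetical helper)."""
    otk = block[:32]                                   # last 32 bytes are discarded
    r = int.from_bytes(otk[:16], "little") & CLAMP     # first 128 bits, clamped
    s = int.from_bytes(otk[16:], "little")             # next 128 bits
    return r, s

# First 32 bytes of the block from Section 2.8.2; the remaining 32 bytes
# are irrelevant here, so zeros are used as a stand-in.
otk = bytes.fromhex("7bac2b252db447af09b67a55a4e95584"
                    "0ae1d6731075d9eb2a9375783ed553ff")
r, s = split_poly1305_key(otk + bytes(32))
print(f"r = {r:032x}")
print(f"s = {s:032x}")
```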

   Note that while many protocols have provisions for a nonce for
   encryption algorithms (often called Initialization Vectors, or IVs),
   they usually don't have such a provision for the MAC function.  In
   that case, the per-invocation nonce will have to come from somewhere
   else, such as a message counter.

2.6.1.  Poly1305 Key Generation in Pseudocode

      poly1305_key_gen(key,nonce):
         counter = 0
         block = chacha20_block(key,counter,nonce)
         return block[0..31]
         end

2.6.2.  Poly1305 Key Generation Test Vector

   For this example, we'll set:

  Key:
  000  80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f  ................
  016  90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f  ................

   Nonce:
   000  00 00 00 00 00 01 02 03 04 05 06 07              ............

   The ChaCha state setup with key, nonce, and block counter zero:
         61707865  3320646e  79622d32  6b206574
         83828180  87868584  8b8a8988  8f8e8d8c
         93929190  97969594  9b9a9998  9f9e9d9c
         00000000  00000000  03020100  07060504

   The ChaCha state after 20 rounds:
         8ba0d58a  cc815f90  27405081  7194b24a
         37b633a8  a50dfde3  e2b8db08  46a6d1fd
         7da03782  9183a233  148ad271  b46773d1
         3cc1875a  8607def1  ca5c3086  7085eb87

  Output bytes:
  000  8a d5 a0 8b 90 5f 81 cc 81 50 40 27 4a b2 94 71  ....._...P@'J..q
  016  a8 33 b6 37 e3 fd 0d a5 08 db b8 e2 fd d1 a6 46  .3.7...........F

   And that output is also the 32-byte one-time key used for Poly1305.
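   The vector above can be reproduced with a short, unoptimized Python
   sketch of the ChaCha20 block function from Section 2.3 and the key
   generation pseudocode from Section 2.6.1.  This is an illustration
   under those definitions, not a reference implementation; the helper
   names rotl32 and quarter_round are chosen here, and the code makes no
   attempt to run in constant time.

```python
import struct

def rotl32(v: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((v << n) & 0xffffffff) | (v >> (32 - n))

def quarter_round(s: list, a: int, b: int, c: int, d: int) -> None:
    """Apply the ChaCha quarter round to four words of the state, in place."""
    s[a] = (s[a] + s[b]) & 0xffffffff; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & 0xffffffff; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & 0xffffffff; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & 0xffffffff; s[b] = rotl32(s[b] ^ s[c], 7)

def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    """Return one 64-byte ChaCha20 block for a 32-byte key and 12-byte nonce."""
    state = [0x61707865, 0x3320646e, 0x79622d32, 0x6b206574]
    state += struct.unpack("<8I", key)
    state.append(counter & 0xffffffff)
    state += struct.unpack("<3I", nonce)
    working = list(state)
    for _ in range(10):                      # 10 double rounds = 20 rounds
        quarter_round(working, 0, 4, 8, 12)  # column rounds
        quarter_round(working, 1, 5, 9, 13)
        quarter_round(working, 2, 6, 10, 14)
        quarter_round(working, 3, 7, 11, 15)
        quarter_round(working, 0, 5, 10, 15)  # diagonal rounds
        quarter_round(working, 1, 6, 11, 12)
        quarter_round(working, 2, 7, 8, 13)
        quarter_round(working, 3, 4, 9, 14)
    # Add the original state back in, then serialize little-endian.
    out = [(w + s) & 0xffffffff for w, s in zip(working, state)]
    return struct.pack("<16I", *out)

def poly1305_key_gen(key: bytes, nonce: bytes) -> bytes:
    """Derive the 32-byte one-time Poly1305 key using block counter zero."""
    return chacha20_block(key, 0, nonce)[:32]

key = bytes(range(0x80, 0xa0))
nonce = bytes.fromhex("000000000001020304050607")
otk = poly1305_key_gen(key, nonce)
print(otk.hex())
```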

2.7.  A Pseudorandom Function for Crypto Suites based on ChaCha/Poly1305

   Some protocols, such as IKEv2 ([RFC7296]), require a Pseudorandom
   Function (PRF), mostly for key derivation.  In the IKEv2 definition,
   a PRF is a function that accepts a variable-length key and a
   variable-length input, and returns a fixed-length output.  Most
   commonly, Hashed MAC (HMAC) constructions are used for this purpose,
   and often the same function is used for both message authentication
   and PRF.

   Poly1305 is not a suitable choice for a PRF.  Poly1305 prohibits
   using the same key twice, whereas the PRF in IKEv2 is used multiple
   times with the same key.  Additionally, unlike HMAC, Poly1305 is
   biased, so using it for key derivation would reduce the security of
   the symmetric encryption.

   ChaCha20 could be used as a key-derivation function, by generating an
   arbitrarily long keystream.  However, that is not what protocols such
   as IKEv2 require.

   For this reason, this document does not specify a PRF and recommends
   that crypto suites use some other PRF such as PRF_HMAC_SHA2_256 (see
   Section 2.1.2 of [RFC4868]).

2.8.  AEAD Construction

   AEAD_CHACHA20_POLY1305 is an authenticated encryption with associated
   data algorithm.  The inputs to AEAD_CHACHA20_POLY1305 are:

   o  A 256-bit key

   o  A 96-bit nonce -- different for each invocation with the same key

   o  An arbitrary length plaintext

   o  Arbitrary length additional authenticated data (AAD)

   Some protocols may have unique per-invocation inputs that are not 96
   bits in length.  For example, IPsec may specify a 64-bit nonce.  In
   such a case, it is up to the protocol document to define how to
   transform the protocol nonce into a 96-bit nonce, for example, by
   concatenating a constant value.

   The ChaCha20 and Poly1305 primitives are combined into an AEAD that
   takes a 256-bit key and 96-bit nonce as follows:

   o  First, a Poly1305 one-time key is generated from the 256-bit key
      and nonce using the procedure described in Section 2.6.

   o  Next, the ChaCha20 encryption function is called to encrypt the
      plaintext, using the same key and nonce, and with the initial
      counter set to 1.

   o  Finally, the Poly1305 function is called with the Poly1305 key
      calculated above, and a message constructed as a concatenation of
      the following:

      *  The AAD

      *  padding1 -- the padding is up to 15 zero bytes, and it brings
         the total length so far to an integral multiple of 16.  If the
         length of the AAD was already an integral multiple of 16 bytes,
         this field is zero-length.

      *  The ciphertext

      *  padding2 -- the padding is up to 15 zero bytes, and it brings
         the total length so far to an integral multiple of 16.  If the
         length of the ciphertext was already an integral multiple of 16
         bytes, this field is zero-length.

      *  The length of the additional data in octets (as a 64-bit
         little-endian integer).

      *  The length of the ciphertext in octets (as a 64-bit little-
         endian integer).
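   The message layout listed above can be sketched as follows.  This is
   an illustration only: build_mac_data is a hypothetical helper name,
   the AAD is the one from Section 2.8.2, and the ciphertext is a
   114-byte stand-in (all 0xff) rather than real ChaCha20 output, so
   that the zero padding is easy to see.

```python
import struct

def pad16(x: bytes) -> bytes:
    """Zero bytes needed to bring len(x) up to a multiple of 16 (possibly none)."""
    return bytes(-len(x) % 16)

def build_mac_data(aad: bytes, ciphertext: bytes) -> bytes:
    """Assemble the Poly1305 input for the AEAD (hypothetical helper)."""
    return (aad + pad16(aad)
            + ciphertext + pad16(ciphertext)
            + struct.pack("<Q", len(aad))          # 64-bit little-endian
            + struct.pack("<Q", len(ciphertext)))  # 64-bit little-endian

aad = bytes.fromhex("50515253c0c1c2c3c4c5c6c7")
ciphertext = b"\xff" * 114          # stand-in, not real ciphertext
mac_data = build_mac_data(aad, ciphertext)
print(len(mac_data), mac_data[-16:].hex())
```

   With a 12-byte AAD and a 114-byte ciphertext, padding1 is 4 bytes
   and padding2 is 14 bytes, so the whole message is 160 bytes long.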

   The output from the AEAD is twofold:

   o  A ciphertext of the same length as the plaintext.

   o  A 128-bit tag, which is the output of the Poly1305 function.

   Decryption is similar with the following differences:

   o  The roles of ciphertext and plaintext are reversed, so the
      ChaCha20 encryption function is applied to the ciphertext,
      producing the plaintext.

   o  The Poly1305 function is still run on the AAD and the ciphertext,
      not the plaintext.

   o  The calculated tag is bitwise compared to the received tag.  The
      message is authenticated if and only if the tags match.
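   The tag comparison in the last step is commonly done in constant
   time, so that the time taken does not reveal how many leading bytes
   matched.  A minimal sketch, assuming Python's standard
   hmac.compare_digest is an acceptable comparison primitive; tags_match
   is a hypothetical name.

```python
import hmac

def tags_match(calculated_tag: bytes, received_tag: bytes) -> bool:
    """Compare two 16-byte tags without an early exit on the first mismatch."""
    return hmac.compare_digest(calculated_tag, received_tag)

# Tag from the example in Section 2.8.2, and a copy with one bit flipped.
good = bytes.fromhex("1ae10b594f09e26a7e902ecbd0600691")
bad = bytes([good[0] ^ 0x01]) + good[1:]

print(tags_match(good, good), tags_match(good, bad))
```

   If the tags do not match, the decrypted plaintext should be
   discarded rather than returned to the caller.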

   A few notes about this design:

   1.  The amount of encrypted data possible in a single invocation is
       2^32-1 blocks of 64 bytes each, because of the size of the block
       counter field in the ChaCha20 block function.  This gives a total
       of 274,877,906,880 bytes, or nearly 256 GB.  This should be
       enough for traffic protocols such as IPsec and TLS, but may be
       too small for file and/or disk encryption.  For such uses, we can
       return to the original design, reduce the nonce to 64 bits, and
       use the integer at position 13 as the top 32 bits of a 64-bit
       block counter, increasing the total message size to over a
       million petabytes (1,180,591,620,717,411,303,360 bytes to be
       exact).

   2.  Despite the previous item, the ciphertext length field in the
       construction of the buffer on which Poly1305 runs limits the
       ciphertext (and hence, the plaintext) size to 2^64 bytes, or
       sixteen thousand petabytes (18,446,744,073,709,551,616 bytes to
       be exact).
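   The size limits discussed in these notes follow directly from the
   widths of the block counter and of the length field.  As a quick
   arithmetic check (the constant names are chosen here for
   illustration):

```python
BLOCK_BYTES = 64

# 32-bit block counter: 2^32 - 1 blocks are available for encryption.
max_bytes_32bit_counter = (2**32 - 1) * BLOCK_BYTES

# 64-bit block counter (original design with a 64-bit nonce).
max_bytes_64bit_counter = (2**64 - 1) * BLOCK_BYTES

# Bound implied by the 64-bit ciphertext length field, as stated in note 2.
length_field_bound = 2**64

print(f"{max_bytes_32bit_counter:,}")
print(f"{max_bytes_64bit_counter:,}")
print(f"{length_field_bound:,}")
```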

   The AEAD construction in this section is a novel composition of
   ChaCha20 and Poly1305.  A security analysis of this composition is
   given in [Procter].

   Here is a list of the parameters for this construction as defined in
   Section 4 of RFC 5116:

   o  K_LEN (key length) is 32 octets.

   o  P_MAX (maximum size of the plaintext) is 274,877,906,880 bytes, or
      nearly 256 GB.

   o  A_MAX (maximum size of the associated data) is set to 2^64-1
      octets by the length field for associated data.

   o  N_MIN = N_MAX = 12 octets.

   o  C_MAX = P_MAX + tag length = 274,877,906,896 octets.

   Distinct AAD inputs (as described in Section 3.3 of RFC 5116) shall
   be concatenated into a single input to AEAD_CHACHA20_POLY1305.  It is
   up to the application to create a structure in the AAD input if it is
   needed.

2.8.1.  Pseudocode for the AEAD Construction

      pad16(x):
         if (len(x) % 16)==0
            then return NULL
            else return copies(0, 16-(len(x)%16))
         end

      chacha20_aead_encrypt(aad, key, iv, constant, plaintext):
         nonce = constant | iv
         otk = poly1305_key_gen(key, nonce)
         ciphertext = chacha20_encrypt(key, 1, nonce, plaintext)
         mac_data = aad | pad16(aad)
         mac_data |= ciphertext | pad16(ciphertext)
         mac_data |= num_to_8_le_bytes(aad.length)
         mac_data |= num_to_8_le_bytes(ciphertext.length)
         tag = poly1305_mac(mac_data, otk)
         return (ciphertext, tag)

2.8.2.  Example and Test Vector for AEAD_CHACHA20_POLY1305

   For a test vector, we will use the following inputs to the
   AEAD_CHACHA20_POLY1305 function:

  Plaintext:
  000  4c 61 64 69 65 73 20 61 6e 64 20 47 65 6e 74 6c  Ladies and Gentl
  016  65 6d 65 6e 20 6f 66 20 74 68 65 20 63 6c 61 73  emen of the clas
  032  73 20 6f 66 20 27 39 39 3a 20 49 66 20 49 20 63  s of '99: If I c
  048  6f 75 6c 64 20 6f 66 66 65 72 20 79 6f 75 20 6f  ould offer you o
  064  6e 6c 79 20 6f 6e 65 20 74 69 70 20 66 6f 72 20  nly one tip for
  080  74 68 65 20 66 75 74 75 72 65 2c 20 73 75 6e 73  the future, suns
  096  63 72 65 65 6e 20 77 6f 75 6c 64 20 62 65 20 69  creen would be i
  112  74 2e                                            t.

   AAD:
   000  50 51 52 53 c0 c1 c2 c3 c4 c5 c6 c7              PQRS........

  Key:
  000  80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f  ................
  016  90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f  ................

   IV:
   000  40 41 42 43 44 45 46 47                          @ABCDEFG

   32-bit fixed-common part:
   000  07 00 00 00                                      ....

   Setup for generating Poly1305 one-time key (sender id=7):
       61707865  3320646e  79622d32  6b206574
       83828180  87868584  8b8a8988  8f8e8d8c
       93929190  97969594  9b9a9998  9f9e9d9c
       00000000  00000007  43424140  47464544

   After generating Poly1305 one-time key:
       252bac7b  af47b42d  557ab609  8455e9a4
       73d6e10a  ebd97510  7875932a  ff53d53e
       decc7ea2  b44ddbad  e49c17d1  d8430bc9
       8c94b7bc  8b7d4b4b  3927f67d  1669a432

  Poly1305 Key:
  000  7b ac 2b 25 2d b4 47 af 09 b6 7a 55 a4 e9 55 84  {.+%-.G...zU..U.
  016  0a e1 d6 73 10 75 d9 eb 2a 93 75 78 3e d5 53 ff  ...s.u..*.ux>.S.

  Poly1305 r =  455e9a4057ab6080f47b42c052bac7b
  Poly1305 s = ff53d53e7875932aebd9751073d6e10a

   keystream bytes:
   9f:7b:e9:5d:01:fd:40:ba:15:e2:8f:fb:36:81:0a:ae:
   c1:c0:88:3f:09:01:6e:de:dd:8a:d0:87:55:82:03:a5:
   4e:9e:cb:38:ac:8e:5e:2b:b8:da:b2:0f:fa:db:52:e8:
   75:04:b2:6e:be:69:6d:4f:60:a4:85:cf:11:b8:1b:59:
   fc:b1:c4:5f:42:19:ee:ac:ec:6a:de:c3:4e:66:69:78:
   8e:db:41:c4:9c:a3:01:e1:27:e0:ac:ab:3b:44:b9:cf:
   5c:86:bb:95:e0:6b:0d:f2:90:1a:b6:45:e4:ab:e6:22:
   15:38

  Ciphertext:
  000  d3 1a 8d 34 64 8e 60 db 7b 86 af bc 53 ef 7e c2  ...4d.`.{...S.~.
  016  a4 ad ed 51 29 6e 08 fe a9 e2 b5 a7 36 ee 62 d6  ...Q)n......6.b.
  032  3d be a4 5e 8c a9 67 12 82 fa fb 69 da 92 72 8b  =..^..g....i..r.
  048  1a 71 de 0a 9e 06 0b 29 05 d6 a5 b6 7e cd 3b 36  .q.....)....~.;6
  064  92 dd bd 7f 2d 77 8b 8c 98 03 ae e3 28 09 1b 58  ....-w......(..X
  080  fa b3 24 e4 fa d6 75 94 55 85 80 8b 48 31 d7 bc  ..$...u.U...H1..
  096  3f f4 de f0 8e 4b 7a 9d e5 76 d2 65 86 ce c6 4b  ?....Kz..v.e...K
  112  61 16                                            a.

  AEAD Construction for Poly1305:
  000  50 51 52 53 c0 c1 c2 c3 c4 c5 c6 c7 00 00 00 00  PQRS............
  016  d3 1a 8d 34 64 8e 60 db 7b 86 af bc 53 ef 7e c2  ...4d.`.{...S.~.
  032  a4 ad ed 51 29 6e 08 fe a9 e2 b5 a7 36 ee 62 d6  ...Q)n......6.b.
  048  3d be a4 5e 8c a9 67 12 82 fa fb 69 da 92 72 8b  =..^..g....i..r.
  064  1a 71 de 0a 9e 06 0b 29 05 d6 a5 b6 7e cd 3b 36  .q.....)....~.;6
  080  92 dd bd 7f 2d 77 8b 8c 98 03 ae e3 28 09 1b 58  ....-w......(..X
  096  fa b3 24 e4 fa d6 75 94 55 85 80 8b 48 31 d7 bc  ..$...u.U...H1..
  112  3f f4 de f0 8e 4b 7a 9d e5 76 d2 65 86 ce c6 4b  ?....Kz..v.e...K
  128  61 16 00 00 00 00 00 00 00 00 00 00 00 00 00 00  a...............
  144  0c 00 00 00 00 00 00 00 72 00 00 00 00 00 00 00  ........r.......

   Note the four zero bytes in line 000 and the 14 zero bytes in line
   128.

   Tag:
   1a:e1:0b:59:4f:09:e2:6a:7e:90:2e:cb:d0:60:06:91
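   For readers who want to reproduce this vector, the following is a
   compact, unoptimized Python sketch of the whole construction,
   following the pseudocode in Sections 2.6.1 and 2.8.1 and the
   primitives defined in Sections 2.3 to 2.5.  It is an illustration
   only: the helper names rotl32 and qr are chosen here, the code is not
   constant time, and it is not intended for production use.

```python
import struct

def rotl32(v, n):
    return ((v << n) & 0xffffffff) | (v >> (32 - n))

def qr(s, a, b, c, d):
    # ChaCha quarter round on four words of the state, in place.
    s[a] = (s[a] + s[b]) & 0xffffffff; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & 0xffffffff; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & 0xffffffff; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & 0xffffffff; s[b] = rotl32(s[b] ^ s[c], 7)

def chacha20_block(key, counter, nonce):
    state = [0x61707865, 0x3320646e, 0x79622d32, 0x6b206574]
    state += struct.unpack("<8I", key)
    state.append(counter & 0xffffffff)
    state += struct.unpack("<3I", nonce)
    w = list(state)
    for _ in range(10):  # 10 double rounds: 4 column + 4 diagonal each
        qr(w, 0, 4, 8, 12); qr(w, 1, 5, 9, 13)
        qr(w, 2, 6, 10, 14); qr(w, 3, 7, 11, 15)
        qr(w, 0, 5, 10, 15); qr(w, 1, 6, 11, 12)
        qr(w, 2, 7, 8, 13); qr(w, 3, 4, 9, 14)
    return struct.pack("<16I", *[(x + y) & 0xffffffff for x, y in zip(w, state)])

def chacha20_encrypt(key, counter, nonce, data):
    out = bytearray()
    for i in range(0, len(data), 64):
        ks = chacha20_block(key, counter + i // 64, nonce)
        out += bytes(a ^ b for a, b in zip(data[i:i + 64], ks))
    return bytes(out)

def poly1305_mac(msg, key):
    r = int.from_bytes(key[:16], "little") & 0x0ffffffc0ffffffc0ffffffc0fffffff
    s = int.from_bytes(key[16:32], "little")
    p = (1 << 130) - 5
    acc = 0
    for i in range(0, len(msg), 16):
        n = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + n) * r % p
    return ((acc + s) & ((1 << 128) - 1)).to_bytes(16, "little")

def pad16(x):
    return bytes(-len(x) % 16)

def chacha20_aead_encrypt(aad, key, iv, constant, plaintext):
    nonce = constant + iv
    otk = chacha20_block(key, 0, nonce)[:32]       # Section 2.6
    ciphertext = chacha20_encrypt(key, 1, nonce, plaintext)
    mac_data = (aad + pad16(aad) + ciphertext + pad16(ciphertext)
                + struct.pack("<QQ", len(aad), len(ciphertext)))
    return ciphertext, poly1305_mac(mac_data, otk)

plaintext = (b"Ladies and Gentlemen of the class of '99: If I could offer "
             b"you only one tip for the future, sunscreen would be it.")
ciphertext, tag = chacha20_aead_encrypt(
    bytes.fromhex("50515253c0c1c2c3c4c5c6c7"),   # AAD
    bytes(range(0x80, 0xa0)),                    # key
    bytes.fromhex("4041424344454647"),           # IV
    bytes.fromhex("07000000"),                   # 32-bit fixed-common part
    plaintext)
print(ciphertext.hex())
print(tag.hex())
```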


