RFC 7541

HPACK: Header Compression for HTTP/2

Pages: 55
Proposed Standard

RFC7541 - Page 1

Internet Engineering Task Force (IETF)                           R. Peon
Request for Comments: 7541                                   Google, Inc
Category: Standards Track                                     H. Ruellan
ISSN: 2070-1721                                                Canon CRF
                                                                May 2015


                  HPACK: Header Compression for HTTP/2

Abstract

   This specification defines HPACK, a compression format for
   efficiently representing HTTP header fields, to be used in HTTP/2.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7541.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction ....................................................4
      1.1. Overview ...................................................4
      1.2. Conventions ................................................5
      1.3. Terminology ................................................5
   2. Compression Process Overview ....................................6
      2.1. Header List Ordering .......................................6
      2.2. Encoding and Decoding Contexts .............................6
      2.3. Indexing Tables ............................................6
           2.3.1. Static Table ........................................6
           2.3.2. Dynamic Table .......................................6
           2.3.3. Index Address Space .................................7
      2.4. Header Field Representation ................................8
   3. Header Block Decoding ...........................................8
      3.1. Header Block Processing ....................................8
      3.2. Header Field Representation Processing .....................9
   4. Dynamic Table Management ........................................9
      4.1. Calculating Table Size ....................................10
      4.2. Maximum Table Size ........................................10
      4.3. Entry Eviction When Dynamic Table Size Changes ............11
      4.4. Entry Eviction When Adding New Entries ....................11
   5. Primitive Type Representations .................................11
      5.1. Integer Representation ....................................11
      5.2. String Literal Representation .............................13
   6. Binary Format ..................................................14
      6.1. Indexed Header Field Representation .......................14
      6.2. Literal Header Field Representation .......................15
           6.2.1. Literal Header Field with Incremental Indexing .....15
           6.2.2. Literal Header Field without Indexing ..............16
           6.2.3. Literal Header Field Never Indexed .................17
      6.3. Dynamic Table Size Update .................................18
   7. Security Considerations ........................................19
      7.1. Probing Dynamic Table State ...............................19
           7.1.1. Applicability to HPACK and HTTP ....................20
           7.1.2. Mitigation .........................................20
           7.1.3. Never-Indexed Literals .............................21
      7.2. Static Huffman Encoding ...................................22
      7.3. Memory Consumption ........................................22
      7.4. Implementation Limits .....................................23
   8. References .....................................................23
      8.1. Normative References ......................................23
      8.2. Informative References ....................................24
   Appendix A. Static Table Definition ...............................25
   Appendix B. Huffman Code ..........................................27
   Appendix C. Examples ..............................................33
     C.1. Integer Representation Examples ............................33
       C.1.1. Example 1: Encoding 10 Using a 5-Bit Prefix ............33
       C.1.2. Example 2: Encoding 1337 Using a 5-Bit Prefix ..........33
       C.1.3. Example 3: Encoding 42 Starting at an Octet Boundary ...34
     C.2. Header Field Representation Examples .......................34
       C.2.1. Literal Header Field with Indexing .....................34
       C.2.2. Literal Header Field without Indexing ..................35
       C.2.3. Literal Header Field Never Indexed .....................36
       C.2.4. Indexed Header Field ...................................37
     C.3. Request Examples without Huffman Coding ....................37
       C.3.1. First Request ..........................................37
       C.3.2. Second Request .........................................38
       C.3.3. Third Request ..........................................39
     C.4. Request Examples with Huffman Coding .......................41
       C.4.1. First Request ..........................................41
       C.4.2. Second Request .........................................42
       C.4.3. Third Request ..........................................43
     C.5. Response Examples without Huffman Coding ...................45
       C.5.1. First Response .........................................45
       C.5.2. Second Response ........................................46
       C.5.3. Third Response .........................................47
     C.6. Response Examples with Huffman Coding ......................49
       C.6.1. First Response .........................................49
       C.6.2. Second Response ........................................51
       C.6.3. Third Response .........................................52
   Acknowledgments ...................................................55
   Authors' Addresses ................................................55

1.  Introduction

   In HTTP/1.1 (see [RFC7230]), header fields are not compressed.  As
   web pages have grown to require dozens to hundreds of requests, the
   redundant header fields in these requests unnecessarily consume
   bandwidth, measurably increasing latency.

   SPDY [SPDY] initially addressed this redundancy by compressing header
   fields using the DEFLATE [DEFLATE] format, which proved very
   effective at efficiently representing the redundant header fields.
   However, that approach exposed a security risk as demonstrated by the
   CRIME (Compression Ratio Info-leak Made Easy) attack (see [CRIME]).

   This specification defines HPACK, a new compressor that eliminates
   redundant header fields, limits vulnerability to known security
   attacks, and has a bounded memory requirement for use in constrained
   environments.  Potential security concerns for HPACK are described in
   Section 7.

   The HPACK format is intentionally simple and inflexible.  Both
   characteristics reduce the risk of interoperability or security
   issues due to implementation error.  No extensibility mechanisms are
   defined; changes to the format are only possible by defining a
   complete replacement.

1.1.  Overview

   The format defined in this specification treats a list of header
   fields as an ordered collection of name-value pairs that can include
   duplicate pairs.  Names and values are considered to be opaque
   sequences of octets, and the order of header fields is preserved
   after being compressed and decompressed.
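
   As a small illustration of this data model (a sketch only; the
   variable names and sample fields are hypothetical), a header list
   can be held as an ordered list of name-value pairs rather than as a
   mapping, since a mapping would discard duplicates and their
   positions:

```python
# A header list modeled as an ordered list of (name, value) octet strings.
header_list = [
    (b"x-example", b"a=1"),
    (b"accept", b"*/*"),
    (b"x-example", b"a=1"),  # exact duplicate pair, still a separate field
]

# A mapping keyed by name would collapse the duplicate and lose its position.
collapsed = dict(header_list)
```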

   Encoding is informed by header field tables that map header fields to
   indexed values.  These header field tables can be incrementally
   updated as new header fields are encoded or decoded.

   In the encoded form, a header field is represented either literally
   or as a reference to a header field in one of the header field
   tables.  Therefore, a list of header fields can be encoded using a
   mixture of references and literal values.

   Literal values are either encoded directly or use a static Huffman
   code.

   The encoder is responsible for deciding which header fields to insert
   as new entries in the header field tables.  The decoder executes the
   modifications to the header field tables prescribed by the encoder,
   reconstructing the list of header fields in the process.  This
   enables decoders to remain simple and interoperate with a wide
   variety of encoders.

   Examples illustrating the use of these different mechanisms to
   represent header fields are available in Appendix C.

1.2.  Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   All numeric values are in network byte order.  Values are unsigned
   unless otherwise indicated.  Literal values are provided in decimal
   or hexadecimal as appropriate.

1.3.  Terminology

   This specification uses the following terms:

   Header Field:  A name-value pair.  Both the name and value are
      treated as opaque sequences of octets.

   Dynamic Table:  The dynamic table (see Section 2.3.2) is a table that
      associates stored header fields with index values.  This table is
      dynamic and specific to an encoding or decoding context.

   Static Table:  The static table (see Section 2.3.1) is a table that
      statically associates header fields that occur frequently with
      index values.  This table is ordered, read-only, always
      accessible, and it may be shared amongst all encoding or decoding
      contexts.

   Header List:  A header list is an ordered collection of header fields
      that are encoded jointly and can contain duplicate header fields.
      A complete list of header fields contained in an HTTP/2 header
      block is a header list.

   Header Field Representation:  A header field can be represented in
      encoded form either as a literal or as an index (see Section 2.4).

   Header Block:  An ordered list of header field representations,
      which, when decoded, yields a complete header list.

2.  Compression Process Overview

   This specification does not describe a specific algorithm for an
   encoder.  Instead, it defines precisely how a decoder is expected to
   operate, allowing encoders to produce any encoding that this
   definition permits.

2.1.  Header List Ordering

   HPACK preserves the ordering of header fields inside the header list.
   An encoder MUST order header field representations in the header
   block according to their ordering in the original header list.  A
   decoder MUST order header fields in the decoded header list according
   to their ordering in the header block.

2.2.  Encoding and Decoding Contexts

   To decompress header blocks, a decoder only needs to maintain a
   dynamic table (see Section 2.3.2) as a decoding context.  No other
   dynamic state is needed.

   When used for bidirectional communication, such as in HTTP, the
   encoding and decoding dynamic tables maintained by an endpoint are
   completely independent, i.e., the request and response dynamic tables
   are separate.

2.3.  Indexing Tables

   HPACK uses two tables for associating header fields to indexes.  The
   static table (see Section 2.3.1) is predefined and contains common
   header fields (most of them with an empty value).  The dynamic table
   (see Section 2.3.2) changes as header blocks are processed and can
   be used by the encoder to index header fields that repeat across the
   encoded header lists.

   These two tables are combined into a single address space for
   defining index values (see Section 2.3.3).

2.3.1.  Static Table

   The static table consists of a predefined static list of header
   fields.  Its entries are defined in Appendix A.

2.3.2.  Dynamic Table

   The dynamic table consists of a list of header fields maintained in
   first-in, first-out order.  The first and newest entry in a dynamic
   table is at the lowest index, and the oldest entry of a dynamic table
   is at the highest index.

   The dynamic table is initially empty.  Entries are added as each
   header block is decompressed.

   The dynamic table can contain duplicate entries (i.e., entries with
   the same name and same value).  Therefore, duplicate entries MUST NOT
   be treated as an error by a decoder.

   The encoder decides how to update the dynamic table and as such can
   control how much memory is used by the dynamic table.  To limit the
   memory requirements of the decoder, the dynamic table size is
   strictly bounded (see Section 4.2).

   The decoder updates the dynamic table during the processing of a list
   of header field representations (see Section 3.2).

2.3.3.  Index Address Space

   The static table and the dynamic table are combined into a single
   index address space.

   Indices between 1 and the length of the static table (inclusive)
   refer to elements in the static table (see Section 2.3.1).

   Indices strictly greater than the length of the static table refer to
   elements in the dynamic table (see Section 2.3.2).  The length of the
   static table is subtracted to find the index into the dynamic table.

   Indices strictly greater than the sum of the lengths of both tables
   MUST be treated as a decoding error.

   For a static table of s entries and a dynamic table of k entries,
   the following diagram shows the entire valid index address space.

           <----------  Index Address Space ---------->
           <-- Static  Table -->  <-- Dynamic Table -->
           +---+-----------+---+  +---+-----------+---+
           | 1 |    ...    | s |  |s+1|    ...    |s+k|
           +---+-----------+---+  +---+-----------+---+
                                  ^                   |
                                  |                   V
                           Insertion Point      Dropping Point

                       Figure 1: Index Address Space
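
   The mapping in Figure 1 could be sketched in Python as follows.
   This is an illustration under stated assumptions, not a reference
   implementation: the names lookup, DecodingError, and STATIC_TABLE
   are hypothetical, the static table shown is a truncated stand-in for
   Appendix A, and the dynamic table is assumed to be a list whose
   first element is the newest entry.

```python
# Illustrative sketch only; all names here are hypothetical.
STATIC_TABLE = [
    (b":authority", b""),
    (b":method", b"GET"),
    (b":method", b"POST"),
]  # truncated stand-in for the full table in Appendix A


class DecodingError(Exception):
    """Raised when an index falls outside the valid address space."""


def lookup(index, static_table, dynamic_table):
    """Resolve a 1-based index against the combined address space.

    dynamic_table is a list whose element 0 is the newest entry.
    """
    s = len(static_table)
    k = len(dynamic_table)
    if index < 1 or index > s + k:
        raise DecodingError(f"index {index} outside 1..{s + k}")
    if index <= s:
        return static_table[index - 1]
    # Subtract the static table length to index into the dynamic table.
    return dynamic_table[index - s - 1]
```

   With a dynamic table holding two entries, index s+1 resolves to the
   most recently inserted entry and index s+2 to the older one; any
   larger index raises the error.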

2.4.  Header Field Representation

   An encoded header field can be represented either as an index or as a
   literal.

   An indexed representation defines a header field as a reference to an
   entry in either the static table or the dynamic table (see
   Section 6.1).

   A literal representation defines a header field by specifying its
   name and value.  The header field name can be represented literally
   or as a reference to an entry in either the static table or the
   dynamic table.  The header field value is represented literally.

   Three different literal representations are defined:

   o  A literal representation that adds the header field as a new entry
      at the beginning of the dynamic table (see Section 6.2.1).

   o  A literal representation that does not add the header field to the
      dynamic table (see Section 6.2.2).

   o  A literal representation that does not add the header field to the
      dynamic table, with the additional stipulation that this header
      field always use a literal representation, in particular when re-
      encoded by an intermediary (see Section 6.2.3).  This
      representation is intended for protecting header field values that
      are not to be put at risk by compressing them (see Section 7.1.3
      for more details).

   The selection of one of these literal representations can be guided
   by security considerations, in order to protect sensitive header
   field values (see Section 7.1).

   The literal representation of a header field name or of a header
   field value can encode the sequence of octets either directly or
   using a static Huffman code (see Section 5.2).

3.  Header Block Decoding

3.1.  Header Block Processing

   A decoder processes a header block sequentially to reconstruct the
   original header list.

   A header block is the concatenation of header field representations.
   The different possible header field representations are described in
   Section 6.

   Once a header field is decoded and added to the reconstructed header
   list, the header field cannot be removed.  A header field added to
   the header list can be safely passed to the application.

   By passing the resulting header fields to the application, a decoder
   can be implemented with minimal transitory memory commitment in
   addition to the memory required for the dynamic table.

3.2.  Header Field Representation Processing

   The processing of a header block to obtain a header list is defined
   in this section.  To ensure that the decoding will successfully
   produce a header list, a decoder MUST obey the following rules.

   All the header field representations contained in a header block are
   processed in the order in which they appear, as specified below.
   Details on the formatting of the various header field representations
   and some additional processing instructions are found in Section 6.

   An _indexed representation_ entails the following actions:

   o  The header field corresponding to the referenced entry in either
      the static table or dynamic table is appended to the decoded
      header list.

   A _literal representation_ that is _not added_ to the dynamic table
   entails the following action:

   o  The header field is appended to the decoded header list.

   A _literal representation_ that is _added_ to the dynamic table
   entails the following actions:

   o  The header field is appended to the decoded header list.

   o  The header field is inserted at the beginning of the dynamic
      table.  This insertion could result in the eviction of previous
      entries in the dynamic table (see Section 4.4).
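
   The rules above can be sketched as a single loop.  The sketch
   assumes the representations have already been parsed from their
   binary form (Section 6) into hypothetical tuples, and it leaves out
   size-based eviction (Section 4.4) to keep the control flow visible.
   The function name and tuple layout are assumptions made for
   illustration.

```python
def decode_block(representations, static_table, dynamic_table):
    """Apply already-parsed representations in the order they appear.

    Each representation is a hypothetical tuple:
      ("indexed", index)
      ("literal", name, value, add_to_table)
    dynamic_table is a list with the newest entry at position 0.
    Size-based eviction is deliberately omitted from this sketch.
    """
    headers = []
    s = len(static_table)
    for rep in representations:
        if rep[0] == "indexed":
            index = rep[1]
            if not 1 <= index <= s + len(dynamic_table):
                raise ValueError("decoding error: index out of range")
            if index <= s:
                field = static_table[index - 1]
            else:
                field = dynamic_table[index - s - 1]
            headers.append(field)
        elif rep[0] == "literal":
            _, name, value, add_to_table = rep
            headers.append((name, value))
            if add_to_table:
                # New entries go at the beginning of the dynamic table.
                dynamic_table.insert(0, (name, value))
        else:
            raise ValueError(f"unknown representation {rep[0]!r}")
    return headers
```

   Note that a literal added to the dynamic table becomes addressable
   by the very next representation in the same header block.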

4.  Dynamic Table Management

   To limit the memory requirements on the decoder side, the dynamic
   table is constrained in size.

4.1.  Calculating Table Size

   The size of the dynamic table is the sum of the size of its entries.

   The size of an entry is the sum of its name's length in octets (as
   defined in Section 5.2), its value's length in octets, and 32.

   The size of an entry is calculated using the length of its name and
   value without any Huffman encoding applied.

      Note: The additional 32 octets account for an estimated overhead
      associated with an entry.  For example, an entry structure using
      two 64-bit pointers to reference the name and the value of the
      entry and two 64-bit integers for counting the number of
      references to the name and value would have 32 octets of overhead.
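
   As a worked illustration of this calculation (the function names
   are hypothetical), the size of an entry and of a table can be
   computed from the unencoded octet lengths:

```python
ENTRY_OVERHEAD = 32  # fixed per-entry overhead, in octets


def entry_size(name: bytes, value: bytes) -> int:
    """Size of one entry: name octets + value octets + 32."""
    return len(name) + len(value) + ENTRY_OVERHEAD


def table_size(entries) -> int:
    """Size of a table: the sum of the sizes of its entries."""
    return sum(entry_size(name, value) for name, value in entries)
```

   For example, an entry with the 10-octet name "custom-key" and the
   13-octet value "custom-header" has a size of 10 + 13 + 32 = 55
   octets, whether or not Huffman encoding was used on the wire.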

4.2.  Maximum Table Size

   Protocols that use HPACK determine the maximum size that the encoder
   is permitted to use for the dynamic table.  In HTTP/2, this value is
   determined by the SETTINGS_HEADER_TABLE_SIZE setting (see
   Section 6.5.2 of [HTTP2]).

   An encoder can choose to use less capacity than this maximum size
   (see Section 6.3), but the chosen size MUST stay lower than or equal
   to the maximum set by the protocol.

   A change in the maximum size of the dynamic table is signaled via a
   dynamic table size update (see Section 6.3).  This dynamic table size
   update MUST occur at the beginning of the first header block
   following the change to the dynamic table size.  In HTTP/2, this
   follows a settings acknowledgment (see Section 6.5.3 of [HTTP2]).

   Multiple updates to the maximum table size can occur between the
   transmission of two header blocks.  In the case that this size is
   changed more than once in this interval, the smallest maximum table
   size that occurs in that interval MUST be signaled in a dynamic table
   size update.  The final maximum size is always signaled, resulting in
   at most two dynamic table size updates.  This ensures that the
   decoder is able to perform eviction based on reductions in dynamic
   table size (see Section 4.3).
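
   One possible reading of this rule is sketched below.  The function
   name and its input (the maximum sizes set since the last header
   block, in the order they were set) are assumptions made for
   illustration; the sketch emits a single update when the smallest
   and final sizes coincide.

```python
def updates_to_signal(sizes_since_last_block):
    """Return the size updates to emit at the start of the next block.

    sizes_since_last_block lists every maximum size set since the
    previous header block, oldest first (hypothetical input format).
    """
    if not sizes_since_last_block:
        return []  # no change, nothing to signal
    smallest = min(sizes_since_last_block)
    final = sizes_since_last_block[-1]
    if smallest < final:
        # Signal the low point first so the decoder evicts, then the final size.
        return [smallest, final]
    return [final]
```

   For instance, a maximum size set to 0 and then restored to 4096
   between two header blocks yields the two updates 0 and 4096, which
   clears the table before the larger limit takes effect.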

   This mechanism can be used to completely clear entries from the
   dynamic table by setting a maximum size of 0, which can subsequently
   be restored.

4.3.  Entry Eviction When Dynamic Table Size Changes

   Whenever the maximum size for the dynamic table is reduced, entries
   are evicted from the end of the dynamic table until the size of the
   dynamic table is less than or equal to the maximum size.
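
   A minimal sketch of this eviction, assuming the dynamic table is a
   list whose last element is the oldest entry (the function names are
   hypothetical):

```python
ENTRY_OVERHEAD = 32  # per-entry overhead from Section 4.1


def table_size(dynamic_table):
    return sum(len(n) + len(v) + ENTRY_OVERHEAD for n, v in dynamic_table)


def apply_max_size(dynamic_table, new_max_size):
    """Evict from the end (oldest entries) until the table fits."""
    while dynamic_table and table_size(dynamic_table) > new_max_size:
        dynamic_table.pop()  # the last element is the oldest entry
    return dynamic_table
```

   Recomputing the table size on every iteration keeps the sketch
   short; an implementation would more likely maintain a running total.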

4.4.  Entry Eviction When Adding New Entries

   Before a new entry is added to the dynamic table, entries are evicted
   from the end of the dynamic table until the size of the dynamic table
   is less than or equal to (maximum size - new entry size) or until the
   table is empty.

   If the size of the new entry is less than or equal to the maximum
   size, that entry is added to the table.  It is not an error to
   attempt to add an entry that is larger than the maximum size; an
   attempt to add an entry larger than the maximum size causes the table
   to be emptied of all existing entries and results in an empty table.

   A new entry can reference the name of an entry in the dynamic table
   that will be evicted when adding this new entry into the dynamic
   table.  Implementations are cautioned to avoid deleting the
   referenced name if the referenced entry is evicted from the dynamic
   table prior to inserting the new entry.
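
   The insertion procedure above might look as follows.  This is a
   sketch under the assumption that the dynamic table is a list with
   the newest entry first; the names add_entry and entry_size are
   hypothetical.

```python
ENTRY_OVERHEAD = 32  # per-entry overhead from Section 4.1


def entry_size(name, value):
    return len(name) + len(value) + ENTRY_OVERHEAD


def add_entry(dynamic_table, max_size, name, value):
    """Evict as needed, then insert the new entry if it can fit."""
    new_size = entry_size(name, value)
    current = sum(entry_size(n, v) for n, v in dynamic_table)
    # Evict oldest entries (end of the list) until the new entry would
    # fit or nothing is left to evict.
    while dynamic_table and current > max_size - new_size:
        old_name, old_value = dynamic_table.pop()
        current -= entry_size(old_name, old_value)
    # An oversized entry is not an error; it just leaves the table empty.
    if new_size <= max_size:
        dynamic_table.insert(0, (name, value))
    return dynamic_table
```

   Because the caller passes name as its own object, evicting the
   entry that the name was taken from does not invalidate it in this
   sketch.  Implementations that share buffers between the table and
   the entry being inserted need the care described above.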
