Tech-invite3GPPspaceIETFspace
959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 6386

VP8 Data Format and Decoding Guide

Pages: 304
Informational
Errata
Part 5 of 11 – Pages 108 to 132
First   Prev   Next

Top   ToC   RFC6386 - Page 108   prevText

17. Motion Vector Decoding

As discussed above, motion vectors appear in two places in the VP8 datastream: applied to whole macroblocks in NEWMV mode and applied to subsets of macroblocks in NEW4x4 mode. The format of the vectors is identical in both cases. Each vector has two pieces: a vertical component (row) followed by a horizontal component (column). The row and column use separate coding probabilities but are otherwise represented identically.

17.1. Coding of Each Component

Each component is a signed integer V representing a vertical or horizontal luma displacement of V quarter-pixels (and a chroma displacement of V eighth-pixels). The absolute value of V, if non-zero, is followed by a boolean sign. V may take any value between -1023 and +1023, inclusive.
Top   ToC   RFC6386 - Page 109
   The absolute value A is coded in one of two different ways according
   to its size.  For 0 <= A <= 7, A is tree-coded, and for 8 <= A <=
   1023, the bits in the binary expansion of A are coded using
   independent boolean probabilities.  The coding of A begins with a
   bool specifying which range is in effect.

   Decoding a motion vector component then requires a 19-position
   probability table, whose offsets, along with the procedure used to
   decode components, are as follows:

   ---- Begin code block --------------------------------------

   typedef enum
   {
       mvpis_short,         /* short (<= 7) vs long (>= 8) */
       MVPsign,             /* sign for non-zero */
       MVPshort,            /* 8 short values = 7-position tree */

       MVPbits = MVPshort + 7,      /* 8 long value bits
                                       w/independent probs */

       MVPcount = MVPbits + 10      /* 19 probabilities in total */
   }
   MVPindices;

   typedef Prob MV_CONTEXT [MVPcount];    /* Decoding spec for
                                             a single component */

   /* Tree used for small absolute values (has expected
      correspondence). */

   const tree_index small_mvtree [2 * (8 - 1)] =
   {
    2, 8,          /* "0" subtree, "1" subtree */
     4, 6,         /* "00" subtree, "01" subtree */
      -0, -1,      /* 0 = "000", 1 = "001" */
      -2, -3,      /* 2 = "010", 3 = "011" */
     10, 12,       /* "10" subtree, "11" subtree */
      -4, -5,      /* 4 = "100", 5 = "101" */
      -6, -7       /* 6 = "110", 7 = "111" */
   };

   /* Read MV component at current decoder position, using
      supplied probs. */

   int read_mvcomponent(bool_decoder *d, const MV_CONTEXT *mvc)
   {
       const Prob * const p = (const Prob *) mvc;
Top   ToC   RFC6386 - Page 110
       int A = 0;

       if (read_bool(d, p [mvpis_short]))    /* 8 <= A <= 1023 */
       {
           /* Read bits 0, 1, 2 */

           int i = 0;
           do { A += read_bool(d, p [MVPbits + i]) << i;}
             while (++i < 3);

           /* Read bits 9, 8, 7, 6, 5, 4 */

           i = 9;
           do { A += read_bool(d, p [MVPbits + i]) << i;}
             while (--i > 3);

           /* We know that A >= 8 because it is coded long,
              so if A <= 15, bit 3 is one and is not
              explicitly coded. */

           if (!(A & 0xfff0)  ||  read_bool(d, p [MVPbits + 3]))
               A += 8;
       }
       else    /* 0 <= A <= 7 */
           A = treed_read(d, small_mvtree, p + MVPshort);

       return A && read_bool(r, p [MVPsign]) ?  -A : A;
   }

   ---- End code block ----------------------------------------

17.2. Probability Updates

The decoder should maintain an array of two MV_CONTEXTs for decoding row and column components, respectively. These MV_CONTEXTs should be set to their defaults every key frame. Each individual probability may be updated every interframe (by field J of the frame header) using a constant table of update probabilities. Each optional update is of the form B? P(7), that is, a bool followed by a 7-bit probability specification if true. As with other dynamic probabilities used by VP8, the updates remain in effect until the next key frame or until replaced via another update.
Top   ToC   RFC6386 - Page 111
   In detail, the probabilities should then be managed as follows.

   ---- Begin code block --------------------------------------

   /* Never-changing table of update probabilities for each
      individual probability used in decoding motion vectors. */

   const MV_CONTEXT vp8_mv_update_probs[2] =
   {
     {
       237,
       246,
       253, 253, 254, 254, 254, 254, 254,
       254, 254, 254, 254, 254, 250, 250, 252, 254, 254
     },
     {
       231,
       243,
       245, 253, 254, 254, 254, 254, 254,
       254, 254, 254, 254, 254, 251, 251, 254, 254, 254
     }
   };

   /* Default MV decoding probabilities. */

   const MV_CONTEXT default_mv_context[2] =
   {
     {                       // row
       162,                    // is short
       128,                    // sign
         225, 146, 172, 147, 214,  39, 156,      // short tree
       128, 129, 132,  75, 145, 178, 206, 239, 254, 254 // long bits
     },

     {                       // same for column
       164,                    // is short
       128,
       204, 170, 119, 235, 140, 230, 228,
       128, 130, 130,  74, 148, 180, 203, 236, 254, 254 // long bits

     }
   };

   /* Current MV decoding probabilities, set to above defaults
      every key frame. */

   MV_CONTEXT mvc [2];     /* always row, then column */
Top   ToC   RFC6386 - Page 112
   /* Procedure for decoding a complete motion vector. */

   typedef struct { int16 row, col;}  MV;  /* as in previous section */

   MV read_mv(bool_decoder *d)
   {
       MV v;
       v.row = (int16) read_mvcomponent(d, mvc);
       v.col = (int16) read_mvcomponent(d, mvc + 1);
       return v;
   }

   /* Procedure for updating MV decoding probabilities, called
      every interframe with "d" at the appropriate position in
      the frame header. */

   void update_mvcontexts(bool_decoder *d)
   {
       int i = 0;
       do {                      /* component = row, then column */
           const Prob *up = mv_update_probs[i];    /* update probs
                                                      for component */
           Prob *p = mvc[i];                  /* start decode tbl "" */
           Prob * const pstop = p + MVPcount; /* end decode tbl "" */
           do {
               if (read_bool(d, *up++))     /* update this position */
               {
                   const Prob x = read_literal(d, 7);

                   *p = x? x<<1 : 1;
               }
           } while (++p < pstop);              /* next position */
       } while (++i < 2);                      /* next component */
   }

   ---- End code block ----------------------------------------

   This completes the description of the motion-vector decoding
   procedure and, with it, the procedure for decoding interframe
   macroblock prediction records.
Top   ToC   RFC6386 - Page 113

18. Interframe Prediction

Given an inter-prediction specification for the current macroblock, that is, a reference frame together with a motion vector for each of the sixteen Y subblocks, we describe the calculation of the prediction buffer for the macroblock. Frame reconstruction is then completed via the previously described processes of residue summation (Section 14) and loop filtering (Section 15). The management of inter-predicted subblocks and sub-pixel interpolation may be found in the reference decoder file predict.c (Section 20.14).

18.1. Bounds on, and Adjustment of, Motion Vectors

Since each motion vector is differentially encoded from a neighboring block or macroblock and the only clamp is to ensure that the referenced motion vector represents a valid location inside a reference frame buffer, it is technically possible within the VP8 format for a block or macroblock to have arbitrarily large motion vectors, up to the size of the input image plus the extended border areas. For practical reasons, VP8 imposes a motion vector size range limit of -4096 to 4095 full pixels, regardless of image size (VP8 defines 14 raw bits for width and height; 16383x16383 is the maximum possible image size). Bitstream-compliant encoders and decoders shall enforce this limit. Because the motion vectors applied to the chroma subblocks have 1/8-pixel resolution, the synthetic pixel calculation, outlined in Section 5 and detailed below, uses this resolution for the luma subblocks as well. In accordance, the stored luma motion vectors are all doubled, each component of each luma vector becoming an even integer in the range -2046 to +2046, inclusive. The vector applied to each chroma subblock is calculated by averaging the vectors for the 4 luma subblocks occupying the same visible area as the chroma subblock in the usual correspondence; that is, the vector for U and V block 0 is the average of the vectors for the Y subblocks { 0, 1, 4, 5}, chroma block 1 corresponds to Y blocks { 2, 3, 6, 7}, chroma block 2 to Y blocks { 8, 9, 12, 13}, and chroma block 3 to Y blocks { 10, 11, 14, 15}.
Top   ToC   RFC6386 - Page 114
   In detail, each of the two components of the vectors for each of the
   chroma subblocks is calculated from the corresponding luma vector
   components as follows:

   ---- Begin code block --------------------------------------

   int avg(int c1, int c2, int c3, int c4)
   {
       int s = c1 + c2 + c3 + c4;

       /* The shift divides by 8 (not 4) because chroma pixels
          have twice the diameter of luma pixels.  The handling
          of negative motion vector components is slightly
          cumbersome because, strictly speaking, right shifts
          of negative numbers are not well-defined in C. */

       return s >= 0 ?  (s + 4) >> 3 : -((-s + 4) >> 3);
   }

   ---- End code block ----------------------------------------

   Furthermore, if the version number in the frame tag specifies only
   full-pel chroma motion vectors, then the fractional parts of both
   components of the vector are truncated to zero, as illustrated in the
   following pseudocode (assuming 3 bits of fraction for both luma and
   chroma vectors):

   ---- Begin code block --------------------------------------

       x = x & (~7);
       y = y & (~7);

   ---- End code block ----------------------------------------

   Earlier in this document we described the vp8_clamp_mv() function to
   limit "nearest" and "near" motion vector predictors inside specified
   margins within the frame boundaries.  Additional clamping is
   performed for NEWMV macroblocks, for which the final motion vector is
   clamped again after combining the "best" predictor and the
   differential vector decoded from the stream.

   However, the secondary clamping is not performed for SPLITMV
   macroblocks, meaning that any subblock's motion vector within the
   SPLITMV macroblock may point outside the clamping zone.  These
   non-clamped vectors are also used when determining the decoding tree
   context for subsequent subblocks' modes in the vp8_mvCont() function.
Top   ToC   RFC6386 - Page 115

18.2. Prediction Subblocks

The prediction calculation for each subblock is then as follows. Temporarily disregarding the fractional part of the motion vector (that is, rounding "up" or "left" by right-shifting each component 3 bits with sign propagation) and adding the origin (upper left position) of the (16x16 luma or 8x8 chroma) current macroblock gives us an origin in the Y, U, or V plane of the predictor frame (either the golden frame or previous frame). Considering that origin to be the upper left corner of a (luma or chroma) macroblock, we need to specify the relative positions of the pixels associated to that subblock, that is, any pixels that might be involved in the sub-pixel interpolation processes for the subblock.

18.3. Sub-Pixel Interpolation

The sub-pixel interpolation is effected via two one-dimensional convolutions. These convolutions may be thought of as operating on a two-dimensional array of pixels whose origin is the subblock origin, that is the origin of the prediction macroblock described above plus the offset to the subblock. Because motion vectors are arbitrary, so are these "prediction subblock origins". The integer part of the motion vector is subsumed in the origin of the prediction subblock; the 16 (synthetic) pixels we need to construct are given by 16 offsets from the origin. The integer part of each of these offsets is the offset of the corresponding pixel from the subblock origin (using the vertical stride). To these integer parts is added a constant fractional part, which is simply the difference between the actual motion vector and its integer truncation used to calculate the origins of the prediction macroblock and subblock. Each component of this fractional part is an integer between 0 and 7, representing a forward displacement in eighths of a pixel. It is these fractional displacements that determine the filtering process. If they both happen to be zero (that is, we had a "whole pixel" motion vector), the prediction subblock is simply copied into the corresponding piece of the current macroblock's prediction buffer. As discussed in Section 14, the layout of the macroblock's prediction buffer can depend on the specifics of the reconstruction implementation chosen. Of course, the vertical displacement between lines of the prediction subblock is given by the stride, as are all vertical displacements used here.
Top   ToC   RFC6386 - Page 116
   Otherwise, at least one of the fractional displacements is non-zero.
   We then synthesize the missing pixels via a horizontal, followed by a
   vertical, one-dimensional interpolation.

   The two interpolations are essentially identical.  Each uses a (at
   most) six-tap filter (the choice of which of course depends on the
   one-dimensional offset).  Thus, every calculated pixel references at
   most three pixels before (above or to the left of) it and at most
   three pixels after (below or to the right of) it.  The horizontal
   interpolation must calculate two extra rows above and three extra
   rows below the 4x4 block, to provide enough samples for the vertical
   interpolation to proceed.

   Depending on the reconstruction filter type given in the version
   number field in the frame tag, either a bicubic or a bilinear tap set
   is used.

   The exact implementation of subsampling is as follows.

   ---- Begin code block --------------------------------------

   /* Filter taps taken to 7-bit precision.
      Because DC is always passed, taps always sum to 128. */

   const int BilinearFilters[8][6] =
   {
       { 0, 0, 128,   0, 0, 0 },
       { 0, 0, 112,  16, 0, 0 },
       { 0, 0,  96,  32, 0, 0 },
       { 0, 0,  80,  48, 0, 0 },
       { 0, 0,  64,  64, 0, 0 },
       { 0, 0,  48,  80, 0, 0 },
       { 0, 0,  32,  96, 0, 0 },
       { 0, 0,  16, 112, 0, 0 }
   };

   const int filters [8] [6] = {        /* indexed by displacement */
       { 0,  0,  128,    0,   0,  0 },  /* degenerate whole-pixel */
       { 0, -6,  123,   12,  -1,  0 },  /* 1/8 */
       { 2, -11, 108,   36,  -8,  1 },  /* 1/4 */
       { 0, -9,   93,   50,  -6,  0 },  /* 3/8 */
       { 3, -16,  77,   77, -16,  3 },  /* 1/2 is symmetric */
       { 0, -6,   50,   93,  -9,  0 },  /* 5/8 = reverse of 3/8 */
       { 1, -8,   36,  108, -11,  2 },  /* 3/4 = reverse of 1/4 */
       { 0, -1,   12,  123,  -6,  0 }   /* 7/8 = reverse of 1/8 */
   };
Top   ToC   RFC6386 - Page 117
   /* One-dimensional synthesis of a single sample.
      Filter is determined by fractional displacement */

   Pixel interp(
       const int fil[6],   /* filter to apply */
       const Pixel *p,     /* origin (rounded "before") in
                              prediction area */
       const int s         /* size of one forward step "" */
   ) {
       int32 a = 0;
       int i = 0;
       p -= s + s;         /* move back two positions */

       do {
           a += *p * fil[i];
           p += s;
       }  while (++i < 6);

       return clamp255((a + 64) >> 7);    /* round to nearest
                                              8-bit value */
   }


   /* First do horizontal interpolation, producing intermediate
      buffer. */

   void Hinterp(
       Pixel temp[9][4],   /* 9 rows of 4 (intermediate)
                              destination values */
       const Pixel *p,     /* subblock origin in prediction
                              frame */
       int s,              /* vertical stride to be used in
                              prediction frame */
       uint hfrac,         /* 0 <= horizontal displacement <= 7 */
       uint bicubic        /* 1=bicubic filter, 0=bilinear */
   ) {
       const int * const fil = bicubic ? filters [hfrac] :
         BilinearFilters[hfrac];

       int r = 0;  do              /* for each row */
       {
           int c = 0;  do          /* for each destination sample */
           {
               /* Pixel separation = one horizontal step = 1 */

               temp[r][c] = interp(fil, p + c, 1);
           }
Top   ToC   RFC6386 - Page 118
           while (++c < 4);
       }
       while (p += s, ++r < 9);    /* advance p to next row */
   }

   /* Finish with vertical interpolation, producing final results.
      Input array "temp" is of course that computed above. */

   void Vinterp(
       Pixel final[4][4],  /* 4 rows of 4 (final) destination values */
       const Pixel temp[9][4],
       uint vfrac,         /* 0 <= vertical displacement <= 7 */
       uint bicubic        /* 1=bicubic filter, 0=bilinear */
   ) {
       const int * const fil = bicubic ? filters [vfrac] :
         BilinearFilters[vfrac];

       int r = 0;  do              /* for each row */
       {
           int c = 0;  do          /* for each destination sample */
           {
               /* Pixel separation = one vertical step = width
                  of array = 4 */

               final[r][c] = interp(fil, temp[r] + c, 4);
           }
           while (++c < 4);
       }
       while (++r < 4);
   }

   ---- End code block ----------------------------------------

18.4. Filter Properties

We discuss briefly the rationale behind the choice of filters. Our approach is necessarily cursory; a genuinely accurate discussion would require a couple of books. Readers unfamiliar with signal processing may or may not wish to skip this. All digital signals are of course sampled in some fashion. The case where the inter-sample spacing (say in time for audio samples, or space for pixels) is uniform, that is, the same at all positions, is particularly common and amenable to analysis. Many aspects of the treatment of such signals are best-understood in the frequency domain via Fourier Analysis, particularly those aspects of the signal that are not changed by shifts in position, especially when those positional shifts are not given by a whole number of samples.
Top   ToC   RFC6386 - Page 119
   Non-integral translates of a sampled signal are a textbook example of
   the foregoing.  In our case of non-integral motion vectors, we wish
   to say what the underlying image "really is" at these pixels;
   although we don't have values for them, we feel that it makes sense
   to talk about them.  The correctness of this feeling is predicated on
   the underlying signal being band-limited, that is, not containing any
   energy in spatial frequencies that cannot be faithfully rendered at
   the pixel resolution at our disposal.  In one dimension, this range
   of "OK" frequencies is called the Nyquist band; in our two-
   dimensional case of integer-grid samples, this range might be termed
   a Nyquist rectangle.  The finer the grid, the more we know about the
   image, and the wider the Nyquist rectangle.

   It turns out that, for such band-limited signals, there is indeed an
   exact mathematical formula to produce the correct sample value at an
   arbitrary point.  Unfortunately, this calculation requires the
   consideration of every single sample in the image, as well as needing
   to operate at infinite precision.  Also, strictly speaking, all band-
   limited signals have infinite spatial (or temporal) extent, so
   everything we are discussing is really some sort of approximation.

   It is true that the theoretically correct subsampling procedure, as
   well as any approximation thereof, is always given by a translation-
   invariant weighted sum (or filter) similar to that used by VP8.  It
   is also true that the reconstruction error made by such a filter can
   be simply represented as a multiplier in the frequency domain; that
   is, such filters simply multiply the Fourier transform of any signal
   to which they are applied by a fixed function associated to the
   filter.  This fixed function is usually called the frequency response
   (or transfer function); the ideal subsampling filter has a frequency
   response equal to one in the Nyquist rectangle and zero everywhere
   else.

   Another basic fact about approximations to "truly correct"
   subsampling is that the wider the subrectangle (within the Nyquist
   rectangle) of spatial frequencies one wishes to "pass" (that is,
   correctly render) or, put more accurately, the closer one wishes to
   approximate the ideal transfer function, the more samples of the
   original signal must be considered by the subsampling, and the wider
   the calculation precision necessitated.

   The filters chosen by VP8 were chosen, within the constraints of 4 or
   6 taps and 7-bit precision, to do the best possible job of handling
   the low spatial frequencies near the 0th DC frequency along with
   introducing no resonances (places where the absolute value of the
   frequency response exceeds one).
Top   ToC   RFC6386 - Page 120
   The justification for the foregoing has two parts.  First, resonances
   can produce extremely objectionable visible artifacts when, as often
   happens in actual compressed video streams, filters are applied
   repeatedly.  Second, the vast majority of energy in real-world images
   lies near DC and not at the high end.

   To get slightly more specific, the filters chosen by VP8 are the best
   resonance-free 4- or 6-tap filters possible, where "best" describes
   the frequency response near the origin: The response at 0 is required
   to be 1, and the graph of the response at 0 is as flat as possible.

   To provide an intuitively more obvious point of reference, the "best"
   2-tap filter is given by simple linear interpolation between the
   surrounding actual pixels.

   Finally, it should be noted that, because of the way motion vectors
   are calculated, the (shorter) 4-tap filters (used for odd fractional
   displacements) are applied in the chroma plane only.  Human color
   perception is notoriously poor, especially where higher spatial
   frequencies are involved.  The shorter filters are easier to
   understand mathematically, and the difference between them and a
   theoretically slightly better 6-tap filter is negligible where chroma
   is concerned.

19. Annex A: Bitstream Syntax

This annex presents the bitstream syntax in a tabular form. All the information elements have been introduced and explained in the previous sections but are collected here for a quick reference. Each syntax element is briefly described after the tabular representation along with a reference to the corresponding paragraph in the main document. The meaning of each syntax element value is not repeated here. The top-level hierarchy of the bitstream is introduced in Section 4. Definition of syntax element coding types can be found in Section 8. The types used in the representation in this annex are: o f(n), n-bit value from stream (n successive bits, not boolean encoded) o L(n), n-bit number encoded as n booleans (with equal probability of being 0 or 1) o B(p), bool with probability p of being 0 o T, tree-encoded value
Top   ToC   RFC6386 - Page 121

19.1. Uncompressed Data Chunk

| Frame Tag | Type | | ------------------------------------------------- | ----- | | frame_tag | f(24) | | if (key_frame) { | | | start_code | f(24) | | horizontal_size_code | f(16) | | vertical_size_code | f(16) | | } | | The 3-byte frame tag can be parsed as follows: ---- Begin code block -------------------------------------- unsigned char *c = pbi->source; unsigned int tmp; tmp = (c[2] << 16) | (c[1] << 8) | c[0]; key_frame = tmp & 0x1; version = (tmp >> 1) & 0x7; show_frame = (tmp >> 4) & 0x1; first_part_size = (tmp >> 5) & 0x7FFFF; ---- End code block ---------------------------------------- Where: o key_frame indicates whether the current frame is a key frame or not. o version determines the bitstream version. o show_frame indicates whether the current frame is meant to be displayed or not. o first_part_size determines the size of the first partition (control partition), excluding the uncompressed data chunk.
Top   ToC   RFC6386 - Page 122
   The start_code is a constant 3-byte pattern having value 0x9d012a.
   The latter part of the uncompressed chunk (after the start_code) can
   be parsed as follows:

   ---- Begin code block --------------------------------------

   unsigned char *c = pbi->source + 6;
   unsigned int tmp;

   tmp = (c[1] << 8) | c[0];

   width = tmp & 0x3FFF;
   horizontal_scale = tmp >> 14;

   tmp = (c[3] << 8) | c[2];

   height = tmp & 0x3FFF;
   vertical_scale = tmp >> 14;

   ---- End code block ----------------------------------------

19.2. Frame Header

| Frame Header | Type | | ------------------------------------------------- | ----- | | if (key_frame) { | | | color_space | L(1) | | clamping_type | L(1) | | } | | | segmentation_enabled | L(1) | | if (segmentation_enabled) | | | update_segmentation() | | | filter_type | L(1) | | loop_filter_level | L(6) | | sharpness_level | L(3) | | mb_lf_adjustments() | | | log2_nbr_of_dct_partitions | L(2) | | quant_indices() | | | if (key_frame) | | | refresh_entropy_probs | L(1) |
Top   ToC   RFC6386 - Page 123
   | else {                                            |       |
   |   refresh_golden_frame                            | L(1)  |
   |   refresh_alternate_frame                         | L(1)  |
   |   if (!refresh_golden_frame)                      |       |
   |     copy_buffer_to_golden                         | L(2)  |
   |   if (!refresh_alternate_frame)                   |       |
   |     copy_buffer_to_alternate                      | L(2)  |
   |   sign_bias_golden                                | L(1)  |
   |   sign_bias_alternate                             | L(1)  |
   |   refresh_entropy_probs                           | L(1)  |
   |   refresh_last                                    | L(1)  |
   | }                                                 |       |
   | token_prob_update()                               |       |
   | mb_no_skip_coeff                                  | L(1)  |
   | if (mb_no_skip_coeff)                             |       |
   |   prob_skip_false                                 | L(8)  |
   | if (!key_frame) {                                 |       |
   |   prob_intra                                      | L(8)  |
   |   prob_last                                       | L(8)  |
   |   prob_gf                                         | L(8)  |
   |   intra_16x16_prob_update_flag                    | L(1)  |
   |   if (intra_16x16_prob_update_flag) {             |       |
   |     for (i = 0; i < 4; i++)                       |       |
   |       intra_16x16_prob                            | L(8)  |
   |   }                                               |       |
   |   intra_chroma prob_update_flag                   | L(1)  |
   |   if (intra_chroma_prob_update_flag) {            |       |
   |     for (i = 0; i < 3; i++)                       |       |
   |       intra_chroma_prob                           | L(8)  |
   |   }                                               |       |
   |   mv_prob_update()                                |       |
   | }                                                 |       |

   o  color_space defines the YUV color space of the sequence
      (Section 9.2)

   o  clamping_type specifies if the decoder is required to clamp the
      reconstructed pixel values (Section 9.2)

   o  segmentation_enabled enables the segmentation feature for the
      current frame (Section 9.3)

   o  filter_type determines whether the normal or the simple loop
      filter is used (Sections 9.4, 15)

   o  loop_filter_level controls the deblocking filter
      (Sections 9.4, 15)
Top   ToC   RFC6386 - Page 124
   o  sharpness_level controls the deblocking filter (Sections 9.4, 15)

   o  log2_nbr_of_dct_partitions determines the number of separate
      partitions containing the DCT coefficients of the macroblocks
      (Section 9.5)

   o  refresh_entropy_probs determines whether updated token
      probabilities are used only for this frame or until further update

   o  refresh_golden_frame determines if the current decoded frame
      refreshes the golden frame (Section 9.7)

   o  refresh_alternate_frame determines if the current decoded frame
      refreshes the alternate reference frame (Section 9.7)

   o  copy_buffer_to_golden determines if the golden reference is
      replaced by another reference (Section 9.7)

   o  copy_buffer_to_alternate determines if the alternate reference is
      replaced by another reference (Section 9.7)

   o  sign_bias_golden controls the sign of motion vectors when the
      golden frame is referenced (Section 9.7)

   o  sign_bias_alternate controls the sign of motion vectors when the
      alternate frame is referenced (Section 9.7)

   o  refresh_last determines if the current decoded frame refreshes the
      last frame reference buffer (Section 9.8)

   o  mb_no_skip_coeff enables or disables the skipping of macroblocks
      containing no non-zero coefficients (Section 9.10)

   o  prob_skip_false indicates the probability that the macroblock is
      not skipped (flag indicating skipped macroblock is false)
      (Section 9.10)

   o  prob_intra indicates the probability of an intra macroblock
      (Section 9.10)

   o  prob_last indicates the probability that the last reference frame
      is used for inter-prediction (Section 9.10)

   o  prob_gf indicates the probability that the golden reference frame
      is used for inter-prediction (Section 9.10)
Top   ToC   RFC6386 - Page 125
   o  intra_16x16_prob_update_flag indicates if the branch probabilities
      used in the decoding of the luma intra-prediction mode are updated
      (Section 9.10)

   o  intra_16x16_prob indicates the branch probabilities of the luma
      intra-prediction mode decoding tree

   o  intra_chroma_prob_update_flag indicates if the branch
      probabilities used in the decoding of the chroma intra-prediction
      mode are updated (Section 9.10)

   o  intra_chroma_prob indicates the branch probabilities of the chroma
      intra-prediction mode decoding tree

   | update_segmentation()                             | Type  |
   | ------------------------------------------------- | ----- |
   | update_mb_segmentation_map                        | L(1)  |
   | update_segment_feature_data                       | L(1)  |
   | if (update_segment_feature_data) {                |       |
   |   segment_feature_mode                            | L(1)  |
   |   for (i = 0; i < 4; i++) {                       |       |
   |     quantizer_update                              | L(1)  |
   |     if (quantizer_update) {                       |       |
   |       quantizer_update_value                      | L(7)  |
   |       quantizer_update_sign                       | L(1)  |
   |     }                                             |       |
   |   }                                               |       |
   |   for (i = 0; i < 4; i++) {                       |       |
   |     loop_filter_update                            | L(1)  |
   |     if (loop_filter_update) {                     |       |
   |       lf_update_value                             | L(6)  |
   |       lf_update_sign                              | L(1)  |
   |     }                                             |       |
   |   }                                               |       |
   | }                                                 |       |
   | if (update_mb_segmentation_map) {                 |       |
   |   for (i = 0; i < 3; i++) {                       |       |
   |     segment_prob_update                           | L(1)  |
   |     if (segment_prob_update)                      |       |
   |       segment_prob                                | L(8)  |
   |   }                                               |       |
   | }                                                 |       |

   o  update_mb_segmentation_map determines if the MB segmentation map
      is updated in the current frame (Section 9.3)

   o  update_segment_feature_data indicates if the segment feature data
      is updated in the current frame (Section 9.3)
Top   ToC   RFC6386 - Page 126
   o  segment_feature_mode indicates the feature data update mode, 0 for
      delta and 1 for the absolute value (Section 9.3)

   o  quantizer_update indicates if the quantizer value is updated for
      the i^(th) segment (Section 9.3)

   o  quantizer_update_value indicates the update value for the segment
      quantizer (Section 9.3)

   o  quantizer_update_sign indicates the update sign for the segment
      quantizer (Section 9.3)

   o  loop_filter_update indicates if the loop filter level value is
      updated for the i^(th) segment (Section 9.3)

   o  lf_update_value indicates the update value for the loop filter
      level (Section 9.3)

   o  lf_update_sign indicates the update sign for the loop filter level
      (Section 9.3)

   o  segment_prob_update indicates whether the branch probabilities
      used to decode the segment_id in the MB header are decoded from
      the stream or use the default value of 255 (Section 9.3)

   o  segment_prob indicates the branch probabilities of the segment_id
      decoding tree (Section 9.3)
Top   ToC   RFC6386 - Page 127
   | mb_lf_adjustments()                               | Type  |
   | ------------------------------------------------- | ----- |
   | loop_filter_adj_enable                            | L(1)  |
   | if (loop_filter_adj_enable) {                     |       |
   |   mode_ref_lf_delta_update                        | L(1)  |
   |   if (mode_ref_lf_delta_update) {                 |       |
   |     for (i = 0; i < 4; i++) {                     |       |
   |       ref_frame_delta_update_flag                 | L(1)  |
   |       if (ref_frame_delta_update_flag) {          |       |
   |         delta_magnitude                           | L(6)  |
   |         delta_sign                                | L(1)  |
   |       }                                           |       |
   |     }                                             |       |
   |     for (i = 0; i < 4; i++) {                     |       |
   |       mb_mode_delta_update_flag                   | L(1)  |
   |       if (mb_mode_delta_update_flag) {            |       |
   |         delta_magnitude                           | L(6)  |
   |         delta_sign                                | L(1)  |
   |       }                                           |       |
   |     }                                             |       |
   |   }                                               |       |
   | }                                                 |       |

   o  loop_filter_adj_enable indicates if the MB-level loop filter
      adjustment (based on the used reference frame and coding mode) is
      on for the current frame (Section 9.4)

   o  mode_ref_lf_delta_update indicates if the delta values used in an
      adjustment are updated in the current frame (Section 9.4)

   o  ref_frame_delta_update_flag indicates if the adjustment delta
      value corresponding to a certain used reference frame is updated
      (Section 9.4)

   o  delta_magnitude is the absolute value of the delta value

   o  delta_sign is the sign of the delta value

   o  mb_mode_delta_update_flag indicates if the adjustment delta value
      corresponding to a certain MB prediction mode is updated
      (Section 9.4)
Top   ToC   RFC6386 - Page 128
   | quant_indices()                                   | Type  |
   | ------------------------------------------------- | ----- |
   | y_ac_qi                                           | L(7)  |
   | y_dc_delta_present                                | L(1)  |
   | if (y_dc_delta_present) {                         |       |
   |   y_dc_delta_magnitude                            | L(4)  |
   |   y_dc_delta_sign                                 | L(1)  |
   | }                                                 |       |
   | y2_dc_delta_present                               | L(1)  |
   | if (y2_dc_delta_present) {                        |       |
   |   y2_dc_delta_magnitude                           | L(4)  |
   |   y2_dc_delta_sign                                | L(1)  |
   | }                                                 |       |
   | y2_ac_delta_present                               | L(1)  |
   | if (y2_ac_delta_present) {                        |       |
   |   y2_ac_delta_magnitude                           | L(4)  |
   |   y2_ac_delta_sign                                | L(1)  |
   | }                                                 |       |
   | uv_dc_delta_present                               | L(1)  |
   | if (uv_dc_delta_present) {                        |       |
   |   uv_dc_delta_magnitude                           | L(4)  |
   |   uv_dc_delta_sign                                | L(1)  |
   | }                                                 |       |
   | uv_ac_delta_present                               | L(1)  |
   | if (uv_ac_delta_present) {                        |       |
   |   uv_ac_delta_magnitude                           | L(4)  |
   |   uv_ac_delta_sign                                | L(1)  |
   | }                                                 |       |

   o  y_ac_qi is the dequantization table index used for the luma AC
      coefficients (and other coefficient groups if no delta value is
      present) (Section 9.6)

   o  y_dc_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the luma DC
      coefficient dequantization index (Section 9.6)

   o  y_dc_delta_magnitude is the magnitude of the delta value
      (Section 9.6)

   o  y_dc_delta_sign is the sign of the delta value (Section 9.6)

   o  y2_dc_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the Y2 block DC
      coefficient dequantization index (Section 9.6)
Top   ToC   RFC6386 - Page 129
   o  y2_ac_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the Y2 block AC
      coefficient dequantization index (Section 9.6)

   o  uv_dc_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the chroma DC
      coefficient dequantization index (Section 9.6)

   o  uv_ac_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the chroma AC
      coefficient dequantization index (Section 9.6)

   | token_prob_update()                               | Type  |
   | ------------------------------------------------- | ----- |
   | for (i = 0; i < 4; i++) {                         |       |
   |   for (j = 0; j < 8; j++) {                       |       |
   |     for (k = 0; k < 3; k++) {                     |       |
   |       for (l = 0; l < 11; l++) {                  |       |
   |         coeff_prob_update_flag                    | L(1)  |
   |         if (coeff_prob_update_flag)               |       |
   |           coeff_prob                              | L(8)  |
   |       }                                           |       |
   |     }                                             |       |
   |   }                                               |       |
   | }                                                 |       |

   o  coeff_prob_update_flag indicates if the corresponding branch
      probability is updated in the current frame (Section 13.4)

   o  coeff_prob is the new branch probability (Section 13.4)

   | mv_prob_update()                                  | Type  |
   | ------------------------------------------------- | ----- |
   | for (i = 0; i < 2; i++) {                         |       |
   |   for (j = 0; j < 19; j++) {                      |       |
   |     mv_prob_update_flag                           | L(1)  |
   |     if (mv_prob_update_flag)                      |       |
   |       prob                                        | L(7)  |
   |   }                                               |       |
   | }                                                 |       |

   o  mv_prob_update_flag indicates if the corresponding MV decoding
      probability is updated in the current frame (Section 17.2)

   o  prob is the updated probability (Section 17.2)
Top   ToC   RFC6386 - Page 130

19.3. Macroblock Data

| Macroblock Data | Type | | ------------------------------------------------- | ----- | | macroblock_header() | | | residual_data() | | | macroblock_header() | Type | | ------------------------------------------------- | ----- | | if (update_mb_segmentation_map) | | | segment_id | T | | if (mb_no_skip_coeff) | | | mb_skip_coeff | B(p) | | if (!key_frame) | | | is_inter_mb | B(p) | | if (is_inter_mb) { | | | mb_ref_frame_sel1 | B(p) | | if (mb_ref_frame_sel1) | | | mb_ref_frame_sel2 | B(p) | | mv_mode | T | | if (mv_mode == SPLITMV) { | | | mv_split_mode | T | | for (i = 0; i < numMvs; i++) { | | | sub_mv_mode | T | | if (sub_mv_mode == NEWMV4x4) { | | | read_mvcomponent() | | | read_mvcomponent() | | | } | | | } | | | } else if (mv_mode == NEWMV) { | | | read_mvcomponent() | | | read_mvcomponent() | | | } | | | } else { /* intra mb */ | | | intra_y_mode | T | | if (intra_y_mode == B_PRED) { | | | for (i = 0; i < 16; i++) | | | intra_b_mode | T | | } | | | intra_uv_mode | T | | } | | o segment_id indicates to which segment the macroblock belongs (Section 10) o mb_skip_coeff indicates whether the macroblock contains any coded coefficients or not (Section 11.1)
Top   ToC   RFC6386 - Page 131
   o  is_inter_mb indicates whether the macroblock is intra- or inter-
      coded (Section 16)

   o  mb_ref_frame_sel1 selects the reference frame to be used; last
      frame (0), golden/alternate (1) (Section 16.2)

   o  mb_ref_frame_sel2 selects whether the golden (0) or alternate
      reference frame (1) is used (Section 16.2)

   o  mv_mode determines the macroblock motion vector mode
      (Section 16.2)

   o  mv_split_mode gives the macroblock partitioning specification and
      determines the number of motion vectors used (numMvs)
      (Section 16.2)

   o  sub_mv_mode determines the sub-macroblock motion vector mode for
      macroblocks coded using the SPLITMV motion vector mode
      (Section 16.2)

   o  intra_y_mode selects the luminance intra-prediction mode
      (Section 16.1)

   o  intra_b_mode selects the sub-macroblock luminance prediction mode
      for macroblocks coded using B_PRED mode (Section 16.1)

   o  intra_uv_mode selects the chrominance intra-prediction mode
      (Section 16.1)

   | residual_data()                                   | Type  |
   | ------------------------------------------------- | ----- |
   | if (!mb_skip_coeff) {                             |       |
   |   if ( (is_inter_mb && mv_mode != SPLITMV) ||     |       |
   |        (!is_inter_mb && intra_y_mode != B_PRED) ) |       |
   |     residual_block() /* Y2 */                     |       |
   |   for (i = 0; i < 24; i++)                        |       |
   |     residual_block() /* 16 Y, 4 U, 4 V */         |       |
   | }                                                 |       |
Top   ToC   RFC6386 - Page 132
   | residual_block()                                  | Type  |
   | ------------------------------------------------- | ----- |
   | for (i = firstCoeff; i < 16; i++) {               |       |
   |   token                                           | T     |
   |   if (token == EOB) break;                        |       |
   |   if (token_has_extra_bits)                       |       |
   |     extra_bits                                    | L(n)  |
   |   if (coefficient != 0)                           |       |
   |     sign                                          | L(1)  |
   | }                                                 |       |

   o  firstCoeff is 1 for luma blocks of macroblocks containing Y2
      subblock; otherwise 0

   o  token defines the value of the coefficient, the value range of the
      coefficient, or the end of block (Section 13.2)

   o  extra_bits determines the value of the coefficient within the
      value range defined by the token (Section 13.2)

   o  sign indicates the sign of the coefficient (Section 13.2)


(next page on part 6)

Next Section