
RFC 6716

Definition of the Opus Audio Codec

Pages: 326
Proposed Standard
Updated by:  8251

4.2. SILK Decoder

The decoder's LP layer uses a modified version of the SILK codec (herein simply called "SILK"), which runs a decoded excitation signal through adaptive long-term and short-term prediction synthesis filters. It runs at NB, MB, and WB sample rates internally. When used in a SWB or FB Hybrid frame, the LP layer itself still only runs in WB.

4.2.1. SILK Decoder Modules

   An overview of the decoder is given in Figure 14.

        +---------+    +------------+
     -->| Range   |--->| Decode     |---------------------------+
      1 | Decoder | 2  | Parameters |----------+       5        |
        +---------+    +------------+     4    |                |
                            3 |                |                |
                             \/               \/               \/
                       +------------+   +------------+   +------------+
                       | Generate   |-->| LTP        |-->| LPC        |
                       | Excitation |   | Synthesis  |   | Synthesis  |
                       +------------+   +------------+   +------------+
                                               ^                |
                                               |                |
                           +-------------------+----------------+
                           |                                      6
                           |   +------------+   +-------------+
                           +-->| Stereo     |-->| Sample Rate |-->
                               | Unmixing   | 7 | Conversion  | 8
                               +------------+   +-------------+

     1: Range encoded bitstream
     2: Coded parameters
     3: Pulses, LSBs, and signs
     4: Pitch lags, Long-Term Prediction (LTP) coefficients
     5: Linear Predictive Coding (LPC) coefficients and gains
     6: Decoded signal (mono or mid-side stereo)
     7: Unmixed signal (mono or left-right stereo)
     8: Resampled signal

                          Figure 14: SILK Decoder

   The decoder feeds the bitstream (1) to the range decoder from
   Section 4.1 and then decodes the parameters in it (2) using the
   procedures detailed in Sections 4.2.3 through 4.2.7.8.5.  These
   parameters (3, 4, 5) are used to generate an excitation signal (see
   Section 4.2.7.8.6), which is fed to an optional Long-Term Prediction
   (LTP) filter (voiced frames only, see Section 4.2.7.9.1) and then a
   short-term prediction filter (see Section 4.2.7.9.2), producing the
   decoded signal (6).  For stereo streams, the mid-side representation
   is converted to separate left and right channels (7).  The result is
   finally resampled to the desired output sample rate (e.g., 48 kHz) so
   that the resampled signal (8) can be mixed with the CELT layer.

4.2.2. LP Layer Organization

   Internally, the LP layer of a single Opus frame is composed of either
   a single 10 ms regular SILK frame or between one and three 20 ms
   regular SILK frames.  A stereo Opus frame may double the number of
   regular SILK frames (up to a total of six), since it includes
   separate frames for a mid channel and, optionally, a side channel.
   Optional Low Bit-Rate Redundancy (LBRR) frames, which are
   reduced-bitrate encodings of previous SILK frames, may be included to
   aid in recovery from packet loss.  If present, these appear before
   the regular SILK frames.  They are, in most respects, identical to
   regular, active SILK frames, except that they are usually encoded
   with a lower bitrate.  This document uses "SILK frame" to refer to
   either one and "regular SILK frame" if it needs to draw a distinction
   between the two.

   Logically, each SILK frame is, in turn, composed of either two or
   four 5 ms subframes.  Various parameters, such as the quantization
   gain of the excitation and the pitch lag and filter coefficients can
   vary on a subframe-by-subframe basis.  Physically, the parameters for
   each subframe are interleaved in the bitstream, as described in the
   relevant sections for each parameter.

   All of these frames and subframes are decoded from the same range
   coder, with no padding between them.  Thus, packing multiple SILK
   frames in a single Opus frame saves, on average, half a byte per SILK
   frame.  It also allows some parameters to be predicted from prior
   SILK frames in the same Opus frame, since this does not degrade
   packet loss robustness (beyond any penalty for merely using fewer,
   larger packets to store multiple frames).

   Stereo support in SILK uses a variant of mid-side coding, allowing a
   mono decoder to simply decode the mid channel.  However, the data for
   the two channels is interleaved, so a mono decoder must still unpack
   the data for the side channel.  It would be required to do so anyway
   for Hybrid Opus frames or to support decoding individual 20 ms
   frames.
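
   As a rough illustration of the grouping described above, the
   following C sketch derives the number of regular SILK frames per
   channel and the number of 5 ms subframes per SILK frame from the Opus
   frame duration.  The type and function names are hypothetical and do
   not appear in the reference implementation.

```c
#include <stddef.h>

/* Hypothetical helper type; not taken from the reference decoder. */
typedef struct {
    int frames_per_channel;  /* regular SILK frames per coded channel */
    int subframes_per_frame; /* 5 ms subframes in each SILK frame */
} silk_layout;

/* Fills *out for an Opus frame of opus_frame_ms milliseconds.
   Returns 0 on success, or -1 if the duration is not one the LP layer
   can carry (10, 20, 40, or 60 ms). */
int silk_layout_for(int opus_frame_ms, silk_layout *out)
{
    if (out == NULL)
        return -1;
    switch (opus_frame_ms) {
    case 10:
        /* A single 10 ms SILK frame holds two subframes. */
        out->frames_per_channel = 1;
        out->subframes_per_frame = 2;
        return 0;
    case 20:
    case 40:
    case 60:
        /* One to three 20 ms SILK frames, four subframes each. */
        out->frames_per_channel = opus_frame_ms / 20;
        out->subframes_per_frame = 4;
        return 0;
    default:
        return -1;
    }
}
```

   A stereo Opus frame uses the same layout for the mid channel and,
   when coded, for the side channel.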

   Table 3 summarizes the overall grouping of the contents of the LP
   layer.  Figures 15 and 16 illustrate the ordering of the various SILK
   frames for a 60 ms Opus frame, for mono and stereo, respectively.

   +-----------------------------------+---------------+---------------+
   |             Symbol(s)             |     PDF(s)    |   Condition   |
   +-----------------------------------+---------------+---------------+
   |   Voice Activity Detection (VAD)  |    {1, 1}/2   |               |
   |               Flags               |               |               |
   |                                   |               |               |
   |             LBRR Flag             |    {1, 1}/2   |               |
   |                                   |               |               |
   |        Per-Frame LBRR Flags       |    Table 4    | Section 4.2.4 |
   |                                   |               |               |
   |           LBRR Frame(s)           | Section 4.2.7 | Section 4.2.4 |
   |                                   |               |               |
   |       Regular SILK Frame(s)       | Section 4.2.7 |               |
   +-----------------------------------+---------------+---------------+

         Table 3: Organization of the SILK layer of an Opus Frame


                    +---------------------------------+
                    |            VAD Flags            |
                    +---------------------------------+
                    |            LBRR Flag            |
                    +---------------------------------+
                    | Per-Frame LBRR Flags (Optional) |
                    +---------------------------------+
                    |     LBRR Frame 1 (Optional)     |
                    +---------------------------------+
                    |     LBRR Frame 2 (Optional)     |
                    +---------------------------------+
                    |     LBRR Frame 3 (Optional)     |
                    +---------------------------------+
                    |      Regular SILK Frame 1       |
                    +---------------------------------+
                    |      Regular SILK Frame 2       |
                    +---------------------------------+
                    |      Regular SILK Frame 3       |
                    +---------------------------------+

                       Figure 15: A 60 ms Mono Frame

                 +---------------------------------------+
                 |             Mid VAD Flags             |
                 +---------------------------------------+
                 |             Mid LBRR Flag             |
                 +---------------------------------------+
                 |             Side VAD Flags            |
                 +---------------------------------------+
                 |             Side LBRR Flag            |
                 +---------------------------------------+
                 |  Mid Per-Frame LBRR Flags (Optional)  |
                 +---------------------------------------+
                 | Side Per-Frame LBRR Flags (Optional)  |
                 +---------------------------------------+
                 |     Mid LBRR Frame 1 (Optional)       |
                 +---------------------------------------+
                 |     Side LBRR Frame 1 (Optional)      |
                 +---------------------------------------+
                 |     Mid LBRR Frame 2 (Optional)       |
                 +---------------------------------------+
                 |     Side LBRR Frame 2 (Optional)      |
                 +---------------------------------------+
                 |     Mid LBRR Frame 3 (Optional)       |
                 +---------------------------------------+
                 |     Side LBRR Frame 3 (Optional)      |
                 +---------------------------------------+
                 |      Mid Regular SILK Frame 1         |
                 +---------------------------------------+
                 | Side Regular SILK Frame 1 (Optional)  |
                 +---------------------------------------+
                 |      Mid Regular SILK Frame 2         |
                 +---------------------------------------+
                 | Side Regular SILK Frame 2 (Optional)  |
                 +---------------------------------------+
                 |      Mid Regular SILK Frame 3         |
                 +---------------------------------------+
                 | Side Regular SILK Frame 3 (Optional)  |
                 +---------------------------------------+

                      Figure 16: A 60 ms Stereo Frame

4.2.3. Header Bits

   The LP layer begins with two to eight header bits, decoded in
   silk_Decode() (dec_API.c).  These consist of one Voice Activity
   Detection (VAD) bit per frame (up to 3), followed by a single flag
   indicating the presence of LBRR frames.  For a stereo packet, these
   first flags correspond to the mid channel, and a second set of flags
   is included for the side channel.

   Because these are the first symbols decoded by the range coder and
   because they are coded as binary values with uniform probability,
   they can be extracted directly from the most significant bits of the
   first byte of compressed data.  Thus, a receiver can determine if an
   Opus frame contains any active SILK frames without the overhead of
   using the range decoder.
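
   The direct extraction described above might be sketched as follows.
   This is only an illustration: the structure and function names are
   hypothetical, and first_byte is assumed to be the first byte of the
   range-coded LP layer data of the Opus frame (that is, after the TOC
   byte and any frame length bytes).

```c
#include <stddef.h>

/* Hypothetical container for the header flags; names are illustrative. */
typedef struct {
    int vad[2][3]; /* vad[channel][frame]; channel 0 is mid (or mono) */
    int lbrr[2];   /* lbrr[channel]: global LBRR flag for that channel */
} silk_header_flags;

/* Reads the header flags from the most significant bits of first_byte
   without running the range decoder.
   channels: 1 (mono) or 2 (stereo).
   frames:   SILK frames per channel, 1 to 3.
   Returns the number of header bits consumed (2 to 8), or -1 on
   invalid arguments. */
int silk_peek_header(unsigned char first_byte, int channels, int frames,
                     silk_header_flags *out)
{
    int bit = 7; /* index of the next unread bit, MSB first */
    int c, f;

    if (out == NULL || channels < 1 || channels > 2 ||
        frames < 1 || frames > 3)
        return -1;

    for (c = 0; c < 2; c++) {
        out->lbrr[c] = 0;
        for (f = 0; f < 3; f++)
            out->vad[c][f] = 0;
    }

    /* Per channel: one VAD bit per frame, then the LBRR flag. */
    for (c = 0; c < channels; c++) {
        for (f = 0; f < frames; f++)
            out->vad[c][f] = (first_byte >> bit--) & 1;
        out->lbrr[c] = (first_byte >> bit--) & 1;
    }
    return 7 - bit;
}
```

   For example, with a mono 60 ms frame whose first byte is 0xB0, the
   sketch reports VAD flags 1, 0, 1 and a set LBRR flag, consuming four
   bits.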

4.2.4. Per-Frame LBRR Flags

   For Opus frames longer than 20 ms, a set of LBRR flags is decoded for
   each channel that has its LBRR flag set.  Each set contains one flag
   per 20 ms SILK frame. 40 ms Opus frames use the 2-frame LBRR flag PDF
   from Table 4, and 60 ms Opus frames use the 3-frame LBRR flag PDF.
   For each channel, the resulting 2- or 3-bit integer contains the
   corresponding LBRR flag for each frame, packed in order from the LSB
   to the MSB.

           +------------+-------------------------------------+
           | Frame Size | PDF                                 |
           +------------+-------------------------------------+
           | 40 ms      | {0, 53, 53, 150}/256                |
           |            |                                     |
           | 60 ms      | {0, 41, 20, 29, 41, 15, 28, 82}/256 |
           +------------+-------------------------------------+

                          Table 4: LBRR Flag PDFs

   A 10 or 20 ms Opus frame does not contain any per-frame LBRR flags,
   as there may be at most one LBRR frame per channel.  The global LBRR
   flag in the header bits (see Section 4.2.3) is already sufficient to
   indicate the presence of that single LBRR frame.
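
   The LSB-to-MSB packing of the per-frame flags can be illustrated with
   a small C sketch.  The function name is hypothetical, and the symbol
   is assumed to have already been decoded with the appropriate PDF from
   Table 4.  Note that both PDFs assign zero probability to the value 0,
   so at least one flag is always set.

```c
/* Splits a decoded per-frame LBRR symbol into individual flags.
   symbol:  value decoded with the 40 ms or 60 ms PDF from Table 4.
   nframes: 2 for a 40 ms Opus frame, 3 for a 60 ms Opus frame.
   flags[i] receives the LBRR flag of the i'th 20 ms SILK frame;
   unused entries are cleared. */
void silk_unpack_lbrr_flags(int symbol, int nframes, int flags[3])
{
    int i;
    for (i = 0; i < 3; i++)
        flags[i] = (i < nframes) ? ((symbol >> i) & 1) : 0;
}
```

   For instance, a decoded value of 5 in a 60 ms frame indicates LBRR
   frames for the first and third 20 ms intervals only.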

4.2.5. LBRR Frames

   The LBRR frames, if present, contain an encoded representation of the
   signal immediately prior to the current Opus frame as if it were
   encoded with the current mode, frame size, audio bandwidth, and
   channel count, even if those differ from the prior Opus frame.  When
   one of these parameters changes from one Opus frame to the next, this
   implies that the LBRR frames of the current Opus frame may not be
   simple drop-in replacements for the contents of the previous Opus
   frame.

   For example, when switching from 20 ms to 60 ms, the 60 ms Opus frame
   may contain LBRR frames covering up to three prior 20 ms Opus frames,
   even if those frames already contained LBRR frames covering some of
   the same time periods.  When switching from 20 ms to 10 ms, the 10 ms
   Opus frame can contain an LBRR frame covering at most half the prior
   20 ms Opus frame, potentially leaving a hole that needs to be
   concealed from even a single packet loss (see Section 4.4).  When
   switching from mono to stereo, the LBRR frames in the first stereo
   Opus frame MAY contain a non-trivial side channel.

   In order to properly produce LBRR frames under all conditions, an
   encoder might need to buffer up to 60 ms of audio and re-encode it
   during these transitions.  However, the reference implementation opts
   to disable LBRR frames at the transition point for simplicity.  Since
   transitions are relatively infrequent in normal usage, this does not
   have a significant impact on packet loss robustness.

   The LBRR frames immediately follow the LBRR flags, prior to any
   regular SILK frames.  Section 4.2.7 describes their exact contents.
   LBRR frames do not include their own separate VAD flags.  LBRR frames
   are only meant to be transmitted for active speech, thus all LBRR
   frames are treated as active.

   In a stereo Opus frame longer than 20 ms, although the per-frame LBRR
   flags for the mid channel are coded as a unit before the per-frame
   LBRR flags for the side channel, the LBRR frames themselves are
   interleaved.  The decoder parses an LBRR frame for the mid channel of
   a given 20 ms interval (if present) and then immediately parses the
   corresponding LBRR frame for the side channel (if present), before
   proceeding to the next 20 ms interval.

4.2.6. Regular SILK Frames

The regular SILK frame(s) follow the LBRR frames (if any). Section 4.2.7 describes their contents, as well. Unlike the LBRR frames, a regular SILK frame is coded for each time interval in an Opus frame, even if the corresponding VAD flags are unset. For stereo Opus frames longer than 20 ms, the regular mid and side SILK frames for each 20 ms interval are interleaved, just as with the LBRR frames. The side frame may be skipped by coding an appropriate flag, as detailed in Section 4.2.7.2.
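
   The ordering of LBRR and regular SILK frames described in Sections
   4.2.5 and 4.2.6 can be summarized by the following C sketch, which
   lists the frames in the order a decoder would parse them.  All names
   are hypothetical.  As a simplification, the mid-only flags are passed
   in as if already known; in an actual decoder, each one only becomes
   available while parsing the corresponding mid frame.

```c
/* Hypothetical descriptor of one SILK frame within an Opus frame. */
typedef struct {
    int is_lbrr;  /* 1 for an LBRR frame, 0 for a regular SILK frame */
    int channel;  /* 0 = mid (or mono), 1 = side */
    int interval; /* index of the 10 or 20 ms interval, 0 to 2 */
} silk_frame_slot;

/* Writes the parse order of all SILK frames into slots[], which must
   hold at least 12 entries, and returns the number written.
   channels:     1 or 2.
   nframes:      SILK frames per channel, 1 to 3.
   lbrr[c][i]:   LBRR flag of channel c for interval i (for 10 and
                 20 ms Opus frames, the global LBRR flag of channel c).
   mid_only[i]:  mid-only flag of the regular mid frame of interval i
                 (assumed known here; ignored for mono). */
int silk_frame_order(int channels, int nframes, int lbrr[2][3],
                     int mid_only[3], silk_frame_slot *slots)
{
    int n = 0;
    int pass, i, c;

    /* Pass 0 lists the LBRR frames, pass 1 the regular SILK frames. */
    for (pass = 0; pass < 2; pass++) {
        for (i = 0; i < nframes; i++) {
            /* Mid and side are interleaved within each interval. */
            for (c = 0; c < channels; c++) {
                if (pass == 0 && !lbrr[c][i])
                    continue; /* no LBRR frame coded here */
                if (pass == 1 && c == 1 && mid_only[i])
                    continue; /* side frame skipped */
                slots[n].is_lbrr = (pass == 0);
                slots[n].channel = c;
                slots[n].interval = i;
                n++;
            }
        }
    }
    return n;
}
```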

4.2.7. SILK Frame Contents

   Each SILK frame includes a set of side information that encodes

   o  The frame type and quantization type (Section 4.2.7.3),

   o  Quantization gains (Section 4.2.7.4),

   o  Short-term prediction filter coefficients (Section 4.2.7.5),

   o  A Line Spectral Frequencies (LSFs) interpolation weight
      (Section 4.2.7.5.5),

   o  LTP filter lags and gains (Section 4.2.7.6), and

   o  A Linear Congruential Generator (LCG) seed (Section 4.2.7.7).

   The quantized excitation signal (see Section 4.2.7.8) follows these
   at the end of the frame.  Table 5 details the overall organization of
   a SILK frame.

   +---------------------------+-------------------+-------------------+
   |         Symbol(s)         |       PDF(s)      |     Condition     |
   +---------------------------+-------------------+-------------------+
   | Stereo Prediction Weights |      Table 6      |  Section 4.2.7.1  |
   |                           |                   |                   |
   |       Mid-only Flag       |      Table 8      |  Section 4.2.7.2  |
   |                           |                   |                   |
   |         Frame Type        |  Section 4.2.7.3  |                   |
   |                           |                   |                   |
   |       Subframe Gains      |  Section 4.2.7.4  |                   |
   |                           |                   |                   |
   |   Normalized LSF Stage-1  |      Table 14     |                   |
   |           Index           |                   |                   |
   |                           |                   |                   |
   |   Normalized LSF Stage-2  | Section 4.2.7.5.2 |                   |
   |          Residual         |                   |                   |
   |                           |                   |                   |
   |       Normalized LSF      |      Table 26     |    20 ms frame    |
   |    Interpolation Weight   |                   |                   |
   |                           |                   |                   |
   |     Primary Pitch Lag     | Section 4.2.7.6.1 |    Voiced frame   |
   |                           |                   |                   |
   |   Subframe Pitch Contour  |      Table 32     |    Voiced frame   |
   |                           |                   |                   |
   |     Periodicity Index     |      Table 37     |    Voiced frame   |
   |                           |                   |                   |
   |         LTP Filter        |      Table 38     |    Voiced frame   |
   |                           |                   |                   |
   |        LTP Scaling        |      Table 42     | Section 4.2.7.6.3 |
   |                           |                   |                   |
   |          LCG Seed         |      Table 43     |                   |
   |                           |                   |                   |
   |   Excitation Rate Level   |      Table 45     |                   |
   |                           |                   |                   |
   |  Excitation Pulse Counts  |      Table 46     |                   |
   |                           |                   |                   |
   |      Excitation Pulse     | Section 4.2.7.8.3 |   Non-zero pulse  |
   |         Locations         |                   |       count       |
   |                           |                   |                   |
   |      Excitation LSBs      |      Table 51     | Section 4.2.7.8.2 |
   |                           |                   |                   |
   |      Excitation Signs     |      Table 52     |                   |
   +---------------------------+-------------------+-------------------+

         Table 5: Order of the Symbols in an Individual SILK Frame

4.2.7.1. Stereo Prediction Weights

   A SILK frame corresponding to the mid channel of a stereo Opus frame
   begins with a pair of side channel prediction weights, designed such
   that zeros indicate normal mid-side coupling.  Since these weights
   can change on every frame, the first portion of each frame linearly
   interpolates between the previous weights and the current ones, using
   zeros for the previous weights if none are available.  These
   prediction weights are never included in a mono Opus frame, and the
   previous weights are reset to zeros on any transition from mono to
   stereo.  They are also not included in an LBRR frame for the side
   channel, even if the LBRR flags indicate the corresponding mid
   channel was not coded.  In that case, the previous weights are used,
   again substituting in zeros if no previous weights are available
   since the last decoder reset (see Section 4.5.2).

   To summarize, these weights are coded if and only if

   o  This is a stereo Opus frame (Section 3.1), and

   o  The current SILK frame corresponds to the mid channel.

   The prediction weights are coded in three separate pieces, which are
   decoded by silk_stereo_decode_pred() (stereo_decode_pred.c).  The
   first piece jointly codes the high-order part of a table index for
   both weights.  The second piece codes the low-order part of each
   table index.  The third piece codes an offset used to linearly
   interpolate between table indices.  The details are as follows.

   Let n be an index decoded with the 25-element stage-1 PDF in Table 6.
   Then, let i0 and i1 be indices decoded with the stage-2 and stage-3
   PDFs in Table 6, respectively, and let i2 and i3 be two more indices
   decoded with the stage-2 and stage-3 PDFs, all in that order.

   +-------+-----------------------------------------------------------+
   | Stage | PDF                                                       |
   +-------+-----------------------------------------------------------+
   | Stage | {7, 2, 1, 1, 1, 10, 24, 8, 1, 1, 3, 23, 92, 23, 3, 1, 1,  |
   | 1     | 8, 24, 10, 1, 1, 1, 2, 7}/256                             |
   |       |                                                           |
   | Stage | {85, 86, 85}/256                                          |
   | 2     |                                                           |
   |       |                                                           |
   | Stage | {51, 51, 52, 51, 51}/256                                  |
   | 3     |                                                           |
   +-------+-----------------------------------------------------------+

                        Table 6: Stereo Weight PDFs

   Then, use n, i0, and i2 to form two table indices, wi0 and wi1,
   according to

                             wi0 = i0 + 3*(n/5)
                             wi1 = i2 + 3*(n%5)

   where the division is integer division.  The range of these indices
   is 0 to 14, inclusive.  Let w_Q13[i] be the i'th weight from Table 7.
   Then, the two prediction weights, w0_Q13 and w1_Q13, are

      w1_Q13 = w_Q13[wi1]
               + (((w_Q13[wi1+1] - w_Q13[wi1])*6554) >> 16)*(2*i3 + 1)

      w0_Q13 = w_Q13[wi0]
               + (((w_Q13[wi0+1] - w_Q13[wi0])*6554) >> 16)*(2*i1 + 1)
               - w1_Q13

   N.B., w1_Q13 is computed first here, because w0_Q13 depends on it.
   The constant 6554 is approximately 0.1 in Q16.  Although wi0 and wi1
   only have 15 possible values, Table 7 contains 16 entries to allow
   interpolation between entry wi0 and (wi0 + 1) (and likewise for wi1).

                         +-------+--------------+
                         | Index | Weight (Q13) |
                         +-------+--------------+
                         | 0     |       -13732 |
                         |       |              |
                         | 1     |       -10050 |
                         |       |              |
                         | 2     |        -8266 |
                         |       |              |
                         | 3     |        -7526 |
                         |       |              |
                         | 4     |        -6500 |
                         |       |              |
                         | 5     |        -5000 |
                         |       |              |
                         | 6     |        -2950 |
                         |       |              |
                         | 7     |         -820 |
                         |       |              |
                         | 8     |          820 |
                         |       |              |
                         | 9     |         2950 |
                         |       |              |
                         | 10    |         5000 |
                         |       |              |
                         | 11    |         6500 |
                         |       |              |
                         | 12    |         7526 |
                         |       |              |
                         | 13    |         8266 |
                         |       |              |
                         | 14    |        10050 |
                         |       |              |
                         | 15    |        13732 |
                         +-------+--------------+

                       Table 7: Stereo Weight Table
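
   The weight reconstruction above can be sketched in C as follows.
   This is a minimal illustration rather than the code of
   silk_stereo_decode_pred(): the function and table names are
   hypothetical, and the five indices are assumed to have already been
   decoded with the PDFs in Table 6.

```c
/* Table 7: stereo weight table, in Q13. */
static const int stereo_w_Q13[16] = {
    -13732, -10050, -8266, -7526, -6500, -5000, -2950, -820,
       820,   2950,  5000,  6500,  7526,  8266, 10050, 13732
};

/* Computes the two prediction weights from the decoded indices.
   n:      stage-1 index, 0 to 24.
   i0, i2: stage-2 indices, 0 to 2.
   i1, i3: stage-3 indices, 0 to 4. */
void stereo_weights_sketch(int n, int i0, int i1, int i2, int i3,
                           int *w0_Q13, int *w1_Q13)
{
    int wi0 = i0 + 3 * (n / 5); /* integer division */
    int wi1 = i2 + 3 * (n % 5);

    /* One tenth of the distance to the next table entry; 6554 is
       approximately 0.1 in Q16.  The table is increasing, so the
       value being shifted is never negative. */
    int step0 = ((stereo_w_Q13[wi0 + 1] - stereo_w_Q13[wi0]) * 6554) >> 16;
    int step1 = ((stereo_w_Q13[wi1 + 1] - stereo_w_Q13[wi1]) * 6554) >> 16;

    /* w1_Q13 is computed first because w0_Q13 depends on it. */
    *w1_Q13 = stereo_w_Q13[wi1] + step1 * (2 * i3 + 1);
    *w0_Q13 = stereo_w_Q13[wi0] + step0 * (2 * i1 + 1) - *w1_Q13;
}
```

   With the most probable stage-1 index, n = 12, and the middle values
   i0 = i2 = 1 and i1 = i3 = 2, both weights come out as zero, which
   corresponds to normal mid-side coupling.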

4.2.7.2. Mid-Only Flag

   A flag appears after the stereo prediction weights that indicates if
   only the mid channel is coded for this time interval.  It appears
   only when

   o  This is a stereo Opus frame (see Section 3.1),

   o  The current SILK frame corresponds to the mid channel, and

   o  Either

      *  This is a regular SILK frame where the VAD flags (see
         Section 4.2.3) indicate that the corresponding side channel is
         not active.

      *  This is an LBRR frame where the LBRR flags (see Sections 4.2.3
         and 4.2.4) indicate that the corresponding side channel is not
         coded.

   It is omitted when there are no stereo weights, for all of the same
   reasons.  It is also omitted for a regular SILK frame when the VAD
   flag of the corresponding side channel frame is set (indicating it is
   active).  The side channel must be coded in this case, making the
   mid-only flag redundant.  It is also omitted for an LBRR frame when
   the corresponding LBRR flags indicate the side channel is coded.

   When the flag is present, the decoder reads a single value using the
   PDF in Table 8, as implemented in silk_stereo_decode_mid_only()
   (stereo_decode_pred.c).  If the flag is set, then there is no
   corresponding SILK frame for the side channel, the entire decoding
   process for the side channel is skipped, and zeros are fed to the
   stereo unmixing process (see Section 4.2.8) instead.  As stated
   above, LBRR frames still include this flag when the LBRR flag
   indicates that the side channel is not coded.  In that case, if this
   flag is zero (indicating that there should be a side channel), then
   Packet Loss Concealment (PLC, see Section 4.4) SHOULD be invoked to
   recover a side channel signal.  Otherwise, the stereo image will
   collapse.

                             +---------------+
                             | PDF           |
                             +---------------+
                             | {192, 64}/256 |
                             +---------------+

                        Table 8: Mid-only Flag PDF
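
   The conditions under which the mid-only flag is present can be
   condensed into a single predicate.  The following C sketch uses
   hypothetical names; its inputs are assumed to come from the header
   bits and per-frame LBRR flags decoded earlier.

```c
/* Returns 1 if a mid-only flag is coded in the current SILK frame,
   0 otherwise.
   is_stereo:      nonzero for a stereo Opus frame.
   is_mid_channel: nonzero if the current SILK frame is a mid frame.
   is_lbrr:        nonzero for an LBRR frame, zero for a regular one.
   side_vad_flag:  VAD flag of the corresponding side channel frame
                   (used for regular SILK frames only).
   side_lbrr_flag: LBRR flag of the corresponding side channel frame
                   (used for LBRR frames only). */
int mid_only_flag_present(int is_stereo, int is_mid_channel, int is_lbrr,
                          int side_vad_flag, int side_lbrr_flag)
{
    /* Omitted whenever the stereo weights are omitted. */
    if (!is_stereo || !is_mid_channel)
        return 0;
    /* Present only if the side channel is not already known to be
       coded for this interval. */
    return is_lbrr ? !side_lbrr_flag : !side_vad_flag;
}
```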

4.2.7.3. Frame Type

   Each SILK frame contains a single "frame type" symbol that jointly
   codes the signal type and quantization offset type of the
   corresponding frame.  If the current frame is a regular SILK frame
   whose VAD bit was not set (an "inactive" frame), then the frame type
   symbol takes on a value of either 0 or 1 and is decoded using the
   first PDF in Table 9.  If the frame is an LBRR frame or a regular
   SILK frame whose VAD flag was set (an "active" frame), then the value
   of the symbol may range from 2 to 5, inclusive, and is decoded using
   the second PDF in Table 9.  Table 10 translates between the value of
   the frame type symbol and the corresponding signal type and
   quantization offset type.

                +----------+-----------------------------+
                | VAD Flag | PDF                         |
                +----------+-----------------------------+
                | Inactive | {26, 230, 0, 0, 0, 0}/256   |
                |          |                             |
                | Active   | {0, 0, 24, 74, 148, 10}/256 |
                +----------+-----------------------------+

                         Table 9: Frame Type PDFs

          +------------+-------------+--------------------------+
          | Frame Type | Signal Type | Quantization Offset Type |
          +------------+-------------+--------------------------+
          | 0          | Inactive    |                      Low |
          |            |             |                          |
          | 1          | Inactive    |                     High |
          |            |             |                          |
          | 2          | Unvoiced    |                      Low |
          |            |             |                          |
          | 3          | Unvoiced    |                     High |
          |            |             |                          |
          | 4          | Voiced      |                      Low |
          |            |             |                          |
          | 5          | Voiced      |                     High |
          +------------+-------------+--------------------------+

    Table 10: Signal Type and Quantization Offset Type from Frame Type
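
   Because of the way Table 10 is laid out, the mapping can be expressed
   with a shift and a mask.  The sketch below is illustrative only; the
   enumeration and function names are hypothetical.

```c
/* Hypothetical signal type codes, in the order used by Table 10. */
typedef enum {
    SIGNAL_INACTIVE = 0,
    SIGNAL_UNVOICED = 1,
    SIGNAL_VOICED   = 2
} signal_type_sketch;

/* Splits a decoded frame type symbol into its two components.
   frame_type: decoded symbol, 0 to 5.
   active:     nonzero for an LBRR frame or a regular SILK frame whose
               VAD flag was set.
   Returns 0 on success, or -1 if frame_type is outside the range the
   selected PDF in Table 9 can produce. */
int split_frame_type(int frame_type, int active,
                     signal_type_sketch *signal_type, int *offset_is_high)
{
    if (active ? (frame_type < 2 || frame_type > 5)
               : (frame_type < 0 || frame_type > 1))
        return -1;
    *signal_type = (signal_type_sketch)(frame_type >> 1);
    *offset_is_high = frame_type & 1; /* 0 = Low, 1 = High */
    return 0;
}
```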

4.2.7.4. Subframe Gains

   A separate quantization gain is coded for each 5 ms subframe.  These
   gains control the step size between quantization levels of the
   excitation signal and, therefore, the quality of the reconstruction.
   They are independent of and unrelated to the pitch contours coded for
   voiced frames.  The quantization gains are themselves uniformly
   quantized to 6 bits on a log scale, giving them a resolution of
   approximately 1.369 dB and a range of approximately 1.94 dB to
   88.21 dB.

   The subframe gains are either coded independently, or relative to the
   gain from the most recent coded subframe in the same channel.
   Independent coding is used if and only if

   o  This is the first subframe in the current SILK frame, and

   o  Either

      *  This is the first SILK frame of its type (LBRR or regular) for
         this channel in the current Opus frame, or

      *  The previous SILK frame of the same type (LBRR or regular) for
         this channel in the same Opus frame was not coded.

   In an independently coded subframe gain, the 3 most significant bits
   of the quantization gain are decoded using a PDF selected from
   Table 11 based on the decoded signal type (see Section 4.2.7.3).

           +-------------+------------------------------------+
           | Signal Type | PDF                                |
           +-------------+------------------------------------+
           | Inactive    | {32, 112, 68, 29, 12, 1, 1, 1}/256 |
           |             |                                    |
           | Unvoiced    | {2, 17, 45, 60, 62, 47, 19, 4}/256 |
           |             |                                    |
           | Voiced      | {1, 3, 26, 71, 94, 50, 9, 2}/256   |
           +-------------+------------------------------------+

        Table 11: PDFs for Independent Quantization Gain MSB Coding

   The 3 least significant bits are decoded using a uniform PDF:

                 +--------------------------------------+
                 | PDF                                  |
                 +--------------------------------------+
                 | {32, 32, 32, 32, 32, 32, 32, 32}/256 |
                 +--------------------------------------+

        Table 12: PDF for Independent Quantization Gain LSB Coding

   These 6 bits are combined to form a value, gain_index, between 0 and
   63.  When the gain for the previous subframe is available, then the
   current gain is limited as follows:

             log_gain = max(gain_index, previous_log_gain - 16)

   This may help some implementations limit the change in precision of
   their internal LTP history.  The indices to which this clamp applies
   cannot simply be removed from the codebook, because previous_log_gain
   will not be available after packet loss.  The clamping is skipped
   after a decoder reset, and in the side channel if the previous frame
   in the side channel was not coded, since there is no value for
   previous_log_gain available.  It MAY also be skipped after packet
   loss.
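
   A minimal C sketch of the independent gain reconstruction follows.
   The function name and the have_previous parameter are hypothetical;
   msb and lsb are assumed to be the two 3-bit values decoded with the
   PDFs in Tables 11 and 12.

```c
/* Combines the two 3-bit halves of an independently coded gain and
   applies the clamp against the previous gain when one is available.
   msb, lsb:          decoded values, each 0 to 7.
   have_previous:     nonzero if previous_log_gain is valid (zero after
                      a decoder reset, for example).
   previous_log_gain: log gain of the previous subframe, 0 to 63. */
int independent_log_gain(int msb, int lsb, int have_previous,
                         int previous_log_gain)
{
    int gain_index = (msb << 3) | lsb; /* 0 to 63 */

    if (have_previous && gain_index < previous_log_gain - 16)
        return previous_log_gain - 16;
    return gain_index;
}
```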

   For subframes that do not have an independent gain (including the
   first subframe of frames not listed as using independent coding
   above), the quantization gain is coded relative to the gain from the
   previous subframe (in the same channel).  The PDF in Table 13 yields
   a delta_gain_index value between 0 and 40, inclusive.

   +-------------------------------------------------------------------+
   | PDF                                                               |
   +-------------------------------------------------------------------+
   | {6, 5, 11, 31, 132, 21, 8, 4, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, |
   | 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,       |
   | 1}/256                                                            |
   +-------------------------------------------------------------------+

             Table 13: PDF for Delta Quantization Gain Coding

   The following formula translates this index into a quantization gain
   for the current subframe using the gain from the previous subframe:

     log_gain = clamp(0, max(2*delta_gain_index - 16,
                        previous_log_gain + delta_gain_index - 4), 63)
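
   The formula above translates directly into C.  The function name in
   this sketch is hypothetical.

```c
/* Applies a decoded delta gain index to the previous log gain.
   delta_gain_index:  value decoded with the PDF in Table 13, 0 to 40.
   previous_log_gain: log gain of the previous subframe, 0 to 63.
   Returns the new log gain, clamped to the range 0 to 63. */
int delta_log_gain(int delta_gain_index, int previous_log_gain)
{
    int a = 2 * delta_gain_index - 16;
    int b = previous_log_gain + delta_gain_index - 4;
    int g = (a > b) ? a : b;

    if (g < 0)
        g = 0;
    if (g > 63)
        g = 63;
    return g;
}
```

   The most probable index in Table 13, delta_gain_index = 4, leaves the
   gain unchanged.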

   silk_gains_dequant() (gain_quant.c) dequantizes log_gain for the k'th
   subframe and converts it into a linear Q16 scale factor via

         gain_Q16[k] = silk_log2lin((0x1D1C71*log_gain>>16) + 2090)

   The function silk_log2lin() (log2lin.c) computes an approximation of
   2**(inLog_Q7/128.0), where inLog_Q7 is its Q7 input.  Let i =
   inLog_Q7>>7 be the integer part of inLog_Q7 and f = inLog_Q7&127 be
   the fractional part.  Then,

               (1<<i) + ((-174*f*(128-f)>>16)+f)*((1<<i)>>7)

   yields the approximate exponential.  The final Q16 gain value lies
   between 81920 and 1686110208, inclusive (representing scale factors
   of 1.25 to 25728, respectively).
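
   The dequantization can be checked against the stated range with the
   following C sketch.  It is not the code of silk_gains_dequant() or
   silk_log2lin(); the function names are hypothetical, and the
   arithmetic right shift of the negative term is written as a rounded-
   up division of its magnitude so that the result does not depend on
   how the compiler shifts negative values.

```c
#include <stdint.h>

/* Approximates 2**(inLog_Q7/128.0) for a non-negative Q7 input whose
   integer part is at most 30. */
int32_t log2lin_sketch(int32_t inLog_Q7)
{
    int32_t i = inLog_Q7 >> 7;  /* integer part */
    int32_t f = inLog_Q7 & 127; /* fractional part */
    int32_t pow2 = (int32_t)1 << i;

    /* (-174*f*(128-f)) >> 16 with rounding toward minus infinity. */
    int32_t t = 174 * f * (128 - f);
    int32_t corr = -((t + 65535) >> 16);

    return pow2 + (corr + f) * (pow2 >> 7);
}

/* Converts a log gain (0 to 63) into a linear Q16 scale factor. */
int32_t gain_Q16_sketch(int32_t log_gain)
{
    return log2lin_sketch(((0x1D1C71 * log_gain) >> 16) + 2090);
}
```

   Evaluating gain_Q16_sketch() at the two ends of the log gain range,
   0 and 63, reproduces the bounds 81920 and 1686110208 given above.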


