# RaptorQ Forward Error Correction Scheme for Object Delivery

Part 2 of 4, p. 12 to 36

```5.  RaptorQ FEC Code Specification

5.1.  Background

For the purpose of the RaptorQ FEC code specification in this
section, the following definitions, symbols, and abbreviations apply.
A basic understanding of linear algebra, matrix operations, and
finite fields is assumed in this section.  In particular, matrix
multiplication and matrix inversion operations over a mixture of the
```
```   finite fields GF[2] and GF[256] are used.  A basic familiarity with
sparse linear equations, and efficient implementations of algorithms
that take advantage of sparse linear equations, is also quite
beneficial to an implementer of this specification.

5.1.1.  Definitions

o  Source block: a block of K source symbols that are considered
together for RaptorQ encoding and decoding purposes.

o  Extended Source Block: a block of K' source symbols, where K' >=
K, constructed from a source block and zero or more padding
symbols.

o  Symbol: a unit of data.  The size, in octets, of a symbol is known
as the symbol size.  The symbol size is always a positive integer.

o  Source symbol: the smallest unit of data used during the encoding
process.  All source symbols within a source block have the same
size.

o  Padding symbol: a symbol with all zero bits that is added to the
source block to form the extended source block.

o  Encoding symbol: a symbol that can be sent as part of the encoding
of a source block.  The encoding symbols of a source block consist
of the source symbols of the source block and the repair symbols
generated from the source block.  Repair symbols generated from a
source block have the same size as the source symbols of that
source block.

o  Repair symbol: the encoding symbols of a source block that are not
source symbols.  The repair symbols are generated based on the
source symbols of a source block.

o  Intermediate symbols: symbols generated from the source symbols
using an inverse encoding process based on pre-coding
relationships.  The repair symbols are then generated directly
from the intermediate symbols.  The encoding symbols do not
include the intermediate symbols, i.e., intermediate symbols are
not sent as part of the encoding of a source block.  The
intermediate symbols are partitioned into LT symbols and PI
symbols for the purposes of the encoding process.

o  LT symbols: a process similar to that described in [LTCodes] is
used to generate part of the contribution to each generated
encoding symbol from the portion of the intermediate symbols
designated as LT symbols.
```
```   o  PI symbols: a process even simpler than that described in
[LTCodes] is used to generate the other part of the contribution
to each generated encoding symbol from the portion of the
intermediate symbols designated as PI symbols.  In the decoding
algorithm suggested in Section 5.4, the PI symbols are inactivated
at the start, i.e., are placed into the matrix U at the beginning
of the first phase of the decoding algorithm.  Because the symbols
corresponding to the columns of U are sometimes called the
"inactivated" symbols, and since the PI symbols are inactivated at
the beginning, they are considered "permanently inactivated".

o  HDPC symbols: there is a small subset of the intermediate symbols
that are HDPC symbols.  Each HDPC symbol has a pre-coding
relationship with a large fraction of the other intermediate
symbols.  HDPC means "High Density Parity Check".

o  LDPC symbols: there is a moderate-sized subset of the intermediate
symbols that are LDPC symbols.  Each LDPC symbol has a pre-coding
relationship with a small fraction of the other intermediate
symbols.  LDPC means "Low Density Parity Check".

o  Systematic code: a code in which all source symbols are included
as part of the encoding symbols of a source block.  The RaptorQ
code as described herein is a systematic code.

o  Encoding Symbol ID (ESI): information that uniquely identifies
each encoding symbol associated with a source block for sending
and receiving purposes.

o  Internal Symbol ID (ISI): information that uniquely identifies
each symbol associated with an extended source block for encoding
and decoding purposes.

o  Arithmetic operations on octets and symbols and matrices: the
operations that are used to produce encoding symbols from source
symbols and vice versa.  See Section 5.7.

5.1.2.  Symbols

i, j, u, v, h, d, a, b, d1, a1, b1, m, x, y   represent values or
variables of one type or another, depending on the context.

X    denotes a non-negative integer value that is either an ISI value
or an ESI value, depending on the context.

ceil(x)  denotes the smallest integer that is greater than or equal
to x, where x is a real value.
```
```   floor(x)  denotes the largest integer that is less than or equal to
x, where x is a real value.

min(x,y)  denotes the minimum value of the values x and y, and in
general the minimum value of all the argument values.

max(x,y)  denotes the maximum value of the values x and y, and in
general the maximum value of all the argument values.

i % j  denotes i modulo j.

i + j  denotes the sum of i and j.  If i and j are octets or symbols,
this designates the arithmetic on octets or symbols,
respectively, as defined in Section 5.7.  If i and j are
integers, then it denotes the usual integer addition.

i * j  denotes the product of i and j.  If i and j are octets, this
designates the arithmetic on octets, as defined in Section 5.7.
If i is an octet and j is a symbol, this denotes the
multiplication of a symbol by an octet, as also defined in
Section 5.7.  Finally, if i and j are integers, i * j denotes
the usual product of integers.

a ^^ b  denotes the operation a raised to the power b.  If a is an
octet and b is a non-negative integer, this is understood to
mean a*a*...*a (b terms), with '*' being the octet product as
defined in Section 5.7.

u ^ v  denotes, for equal-length bit strings u and v, the bitwise
exclusive-or of u and v.

Transpose[A]  denotes the transposed matrix of matrix A.  In this
specification, all matrices have entries that are octets.

A^^-1  denotes the inverse matrix of matrix A.  In this
specification, all the matrices have octets as entries, so it is
understood that the operations of the matrix entries are to be
done as stated in Section 5.7 and A^^-1 is the matrix inverse of
A with respect to octet arithmetic.

K    denotes the number of symbols in a single source block.

K'   denotes the number of source plus padding symbols in an extended
source block.  For the majority of this specification, the
padding symbols are considered to be additional source symbols.

K'_max  denotes the maximum number of source symbols that can be in a
single source block.  Set to 56403.
```
```   L    denotes the number of intermediate symbols for a single extended
source block.

S    denotes the number of LDPC symbols for a single extended source
block.  These are LT symbols.  For each value of K' shown in
Table 2 in Section 5.6, the corresponding value of S is a prime
number.

H    denotes the number of HDPC symbols for a single extended source
block.  These are PI symbols.

B    denotes the number of intermediate symbols that are LT symbols
excluding the LDPC symbols.

W    denotes the number of intermediate symbols that are LT symbols.
For each value of K' in Table 2 shown in Section 5.6, the
corresponding value of W is a prime number.

P    denotes the number of intermediate symbols that are PI symbols.
These contain all HDPC symbols.

P1   denotes the smallest prime number greater than or equal to P.

U    denotes the number of non-HDPC intermediate symbols that are PI
symbols.

C    denotes an array of intermediate symbols, C[0], C[1], C[2], ...,
C[L-1].

C'   denotes an array of the symbols of the extended source block,
where C'[0], C'[1], C'[2], ..., C'[K-1] are the source symbols
of the source block and C'[K], C'[K+1], ..., C'[K'-1] are
padding symbols.

V0, V1, V2, V3  denote four arrays of 32-bit unsigned integers,
V0[0], V0[1], ..., V0[255]; V1[0], V1[1], ..., V1[255]; V2[0],
V2[1], ..., V2[255]; and V3[0], V3[1], ..., V3[255] as shown in
Section 5.5.

Rand[y, i, m]  denotes a pseudo-random number generator.

Deg[v]  denotes a degree generator.

Enc[K', C ,(d, a, b, d1, a1, b1)]  denotes an encoding symbol
generator.

Tuple[K', X]  denotes a tuple generator function.
```
```   T    denotes the symbol size in octets.

J(K')  denotes the systematic index associated with K'.

G    denotes any generator matrix.

I_S  denotes the S x S identity matrix.

5.2.  Overview

This section defines the systematic RaptorQ FEC code.

Symbols are the fundamental data units of the encoding and decoding
process.  For each source block, all symbols are the same size,
referred to as the symbol size T.  The atomic operations performed on
symbols for both encoding and decoding are the arithmetic operations
defined in Section 5.7.

The basic encoder is described in Section 5.3.  The encoder first
derives a block of intermediate symbols from the source symbols of a
source block.  This intermediate block has the property that both
source and repair symbols can be generated from it using the same
process.  The encoder produces repair symbols from the intermediate
block using an efficient process, where each such repair symbol is
the exclusive-or of a small number of intermediate symbols from the
block.  Source symbols can also be reproduced from the intermediate
block using the same process.  The encoding symbols are the
combination of the source and repair symbols.

An example of a decoder is described in Section 5.4.  The process for
producing source and repair symbols from the intermediate block is
designed so that the intermediate block can be recovered from any
sufficiently large set of encoding symbols, independent of the mix of
source and repair symbols in the set.  Once the intermediate block is
recovered, missing source symbols of the source block can be
recovered using the encoding process.

Requirements for a RaptorQ-compliant decoder are provided in
Section 5.8.  A number of decoding algorithms are possible to achieve
these requirements.  An efficient decoding algorithm to achieve these
requirements is provided in Section 5.4.

The construction of the intermediate and repair symbols is based in
part on a pseudo-random number generator described in Section 5.3.
This generator is based on a fixed set of 1024 random numbers that
must be available to both sender and receiver.  These numbers are
```
```   provided in Section 5.5.  Encoding and decoding operations for
RaptorQ use operations on octets.  Section 5.7 describes how to
perform these operations.

Finally, the construction of the intermediate symbols from the source
symbols is governed by "systematic indices", values of which are
provided in Section 5.6 for specific extended source block sizes
between 6 and K'_max = 56403 source symbols.  Thus, the RaptorQ code
supports source blocks with between 1 and 56403 source symbols.

5.3.  Systematic RaptorQ Encoder

5.3.1.  Introduction

For a given source block of K source symbols, for encoding and
decoding purposes, the source block is augmented with K'-K additional
padding symbols, where K' is the smallest value that is at least K in
the systematic index Table 2 of Section 5.6.  The reason for padding
out a source block to K' symbols is to enable faster encoding
and decoding and to minimize the amount of table information that
needs to be stored in the encoder and decoder.

For purposes of transmitting and receiving data, the value of K is
used to determine the number of source symbols in a source block, and
thus K needs to be known at the sender and the receiver.  In this
case, the sender and receiver can compute K' from K and the K'-K
padding symbols without any additional communication.  The encoding
symbol ID (ESI)
is used by a sender and receiver to identify the encoding symbols of
a source block, where the encoding symbols of a source block consist
of the source symbols and the repair symbols associated with the
source block.  For a source block with K source symbols, the ESIs for
the source symbols are 0, 1, 2, ..., K-1, and the ESIs for the repair
symbols are K, K+1, K+2, ....  Using the ESI for identifying encoding
symbols in transport ensures that the ESI values continue
consecutively between the source and repair symbols.

For purposes of encoding and decoding data, the value of K' derived
from K is used as the number of source symbols of the extended source
block upon which encoding and decoding operations are performed,
where the K' source symbols consist of the original K source symbols
and an additional K'-K padding symbols.  The Internal Symbol ID (ISI)
is used by the encoder and decoder to identify the symbols associated
with the extended source block, i.e., for generating encoding symbols
and for decoding.  For a source block with K original source symbols,
the ISIs for the original source symbols are 0, 1, 2, ..., K-1, the
ISIs for the K'-K padding symbols are K, K+1, K+2, ..., K'-1, and the
ISIs for the repair symbols are K', K'+1, K'+2, ....  Using the ISI
```
```   for encoding and decoding allows the padding symbols of the extended
source block to be treated the same way as other source symbols of
the extended source block.  Also, it ensures that a given prefix of
repair symbols are generated in a consistent way for a given number
K' of source symbols in the extended source block, independent of K.

The relationship between the ESIs and the ISIs is simple: the ESIs
and the ISIs for the original K source symbols are the same, the K'-K
padding symbols have an ISI but do not have a corresponding ESI
(since they are symbols that are neither sent nor received), and a
repair symbol ISI is simply the repair symbol ESI plus K'-K.  The
translation between ESIs (used to identify encoding symbols sent and
received) and the corresponding ISIs (used for encoding and
decoding), as well as determining the proper padding of the extended
source block with padding symbols (used for encoding and decoding),
is the internal responsibility of the RaptorQ encoder/decoder.

5.3.2.  Encoding Overview

The systematic RaptorQ encoder is used to generate any number of
repair symbols from a source block that consists of K source symbols
placed into an extended source block C'.  Figure 4 shows the encoding
overview.

The first step of encoding is to construct an extended source block
by adding zero or more padding symbols such that the total number of
symbols, K', is one of the values listed in Section 5.6.  Each
padding symbol consists of T octets where the value of each octet is
zero.  K' MUST be selected as the smallest value of K' from the table
of Section 5.6 that is greater than or equal to K.
```
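The selection of K' and the padding step described above can be sketched as follows. This is a minimal illustration, not a normative implementation: the function names are hypothetical, and `K_PRIME_EXCERPT` holds only what are assumed to be the first few K' values of Table 2 in Section 5.6 (which lies outside this part of the document and should be checked there). A real implementation needs the complete table.

```python
from bisect import bisect_left

# Assumption: the first few K' values of Table 2 (Section 5.6).
# Verify against the table itself; the full table is required in practice.
K_PRIME_EXCERPT = [10, 12, 18, 20, 26]


def select_k_prime(k, k_prime_values):
    """Return the smallest table value that is greater than or equal to k."""
    if k < 1:
        raise ValueError("K must be at least 1")
    idx = bisect_left(k_prime_values, k)
    if idx == len(k_prime_values):
        raise ValueError("K exceeds the largest K' in the supplied table")
    return k_prime_values[idx]


def extend_source_block(source_symbols, symbol_size, k_prime_values):
    """Append K'-K padding symbols, each consisting of T zero octets."""
    if any(len(s) != symbol_size for s in source_symbols):
        raise ValueError("every source symbol must be exactly T octets long")
    k = len(source_symbols)
    k_prime = select_k_prime(k, k_prime_values)
    padding = [bytes(symbol_size) for _ in range(k_prime - k)]
    return list(source_symbols) + padding
```

For example, under the assumed excerpt a source block of K = 11 symbols would be extended to K' = 12 symbols with one padding symbol.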
```         -----------------------------------------------------------+
            |                                                          |
            |    +-----------+    +--------------+    +-------------+  |
         C' |    |           | C' | Intermediate | C  |             |  |
        ----+--->|  Padding  |--->|    Symbol    |--->|   Encoding  |--+-->
         K  |    |           | K' |  Generation  | L  |             |  |
            |    +-----------+    +--------------+    +-------------+  |
            |           |                             (d,a,b, ^        |
            |           |                            d1,a1,b1)|        |
            |           |                              +------------+  |
            |           |              K'              |   Tuple    |  |
            |           +----------------------------->|            |  |
            |                                          | Generation |  |
            |                                          +------------+  |
            |                                                 ^        |
            +-------------------------------------------------+--------+
                                                              |
                                                            ISI X

Figure 4: Encoding Overview

Let C'[0], ..., C'[K-1] denote the K source symbols.

Let C'[K], ..., C'[K'-1] denote the K'-K padding symbols, which are
all set to zero bits.  Then, C'[0], ..., C'[K'-1] are the symbols of
the extended source block upon which encoding and decoding are
performed.

In the remainder of this description, these padding symbols will be
considered as additional source symbols and referred to as such.
However, these padding symbols are not part of the encoding symbols,
i.e., they are not sent as part of the encoding.  At a receiver, the
value of K' can be computed based on K, then the receiver can insert
K'-K padding symbols at the end of a source block of K' source
symbols and recover the remaining K source symbols of the source
block from the received encoding symbols.

The second step of encoding is to generate a number, L > K', of
intermediate symbols from the K' source symbols.  In this step, K'
source tuples (d[0], a[0], b[0], d1[0], a1[0], b1[0]), ..., (d[K'-1],
a[K'-1], b[K'-1], d1[K'-1], a1[K'-1], b1[K'-1]) are generated using
the Tuple[] generator as described in Section 5.3.5.4.  The K' source
tuples and the ISIs associated with the K' source symbols are used to
determine L intermediate symbols C[0], ..., C[L-1] from the source
symbols using an inverse encoding process.  This process can be
realized by a RaptorQ decoding process.
```
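Because padding symbols are never sent, an implementation has to translate between the ESIs used in transport and the ISIs used for encoding and decoding, following the relationship given in Section 5.3.1. The sketch below shows one way to do this; the function names are hypothetical, and returning `None` for padding symbols is an illustrative choice rather than anything mandated by this specification.

```python
def esi_to_isi(esi, k, k_prime):
    """Map a transport ESI to the ISI used internally.

    Source symbols (ESI < K) keep their index; repair symbols are
    shifted past the K'-K padding symbols.
    """
    if esi < 0:
        raise ValueError("ESI must be non-negative")
    return esi if esi < k else esi + (k_prime - k)


def isi_to_esi(isi, k, k_prime):
    """Map an ISI back to an ESI, or None for a padding symbol."""
    if isi < 0:
        raise ValueError("ISI must be non-negative")
    if isi < k:
        return isi            # original source symbol
    if isi < k_prime:
        return None           # padding symbol: has an ISI but no ESI
    return isi - (k_prime - k)  # repair symbol
```

With K = 7 and K' = 10, the first repair symbol has ESI 7 and ISI 10, while ISIs 7, 8, and 9 belong to padding symbols.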
```   Certain "pre-coding relationships" must hold within the L
intermediate symbols.  Section 5.3.3.3 describes these relationships.
Section 5.3.3.4 describes how the intermediate symbols are generated
from the source symbols.

Once the intermediate symbols have been generated, repair symbols can
be produced.  For a repair symbol with ISI X >= K', the tuple of non-
negative integers (d, a, b, d1, a1, b1) can be generated, using the
Tuple[] generator as described in Section 5.3.5.4.  Then, the (d, a,
b, d1, a1, b1) tuple and the ISI X are used to generate the
corresponding repair symbol from the intermediate symbols using the
Enc[] generator described in Section 5.3.5.3.  The corresponding ESI
for this repair symbol is then X-(K'-K).  Note that source symbols of
the extended source block can also be generated using the same
process, i.e., for any X < K', the symbol generated using this
process has the same value as C'[X].

5.3.3.  First Encoding Step: Intermediate Symbol Generation

5.3.3.1.  General

This encoding step is a pre-coding step to generate the L
intermediate symbols C[0], ..., C[L-1] from the source symbols C'[0],
..., C'[K'-1], where L > K' is defined in Section 5.3.3.3.  The
intermediate symbols are uniquely defined by two sets of constraints:

1.  The intermediate symbols are related to the source symbols by a
set of source symbol tuples and by the ISIs of the source
symbols.  The generation of the source symbol tuples is defined
in Section 5.3.3.2 using the Tuple[] generator as described in
Section 5.3.5.4.

2.  A number of pre-coding relationships hold within the intermediate
symbols themselves.  These are defined in Section 5.3.3.3.

The generation of the L intermediate symbols is then defined in
Section 5.3.3.4.

5.3.3.2.  Source Symbol Tuples

Each of the K' source symbols is associated with a source symbol
tuple (d[X], a[X], b[X], d1[X], a1[X], b1[X]) for 0 <= X < K'.  The
source symbol tuples are determined using the Tuple[] generator
defined in Section 5.3.5.4 as:

For each X, 0 <= X < K'

(d[X], a[X], b[X], d1[X], a1[X], b1[X]) = Tuple[K', X]
```
```5.3.3.3.  Pre-Coding Relationships

The pre-coding relationships amongst the L intermediate symbols are
defined by requiring that a set of S+H linear combinations of the
intermediate symbols evaluate to zero.  There are S LDPC and H HDPC
symbols, and thus L = K'+S+H.  Another partition of the L
intermediate symbols is into two sets, one set of W LT symbols and
another set of P PI symbols, and thus it is also the case that L =
W+P.  The P PI symbols are treated differently than the W LT symbols
in the encoding process.  The P PI symbols consist of the H HDPC
symbols together with a set of U = P-H of the other K' intermediate
symbols.  The W LT symbols consist of the S LDPC symbols together
with W-S of the other K' intermediate symbols.  The values of these
parameters are determined from K' as described below, where H(K'),
S(K'), and W(K') are derived from Table 2 in Section 5.6.

Let

o  S = S(K')

o  H = H(K')

o  W = W(K')

o  L = K' + S + H

o  P = L - W

o  P1 denote the smallest prime number greater than or equal to P.

o  U = P - H

o  B = W - S

o  C[0], ..., C[B-1] denote the intermediate symbols that are LT
symbols but not LDPC symbols.

o  C[B], ..., C[B+S-1] denote the S LDPC symbols that are also LT
symbols.

o  C[W], ..., C[W+U-1] denote the intermediate symbols that are PI
symbols but not HDPC symbols.

o  C[L-H], ..., C[L-1] denote the H HDPC symbols that are also PI
symbols.
```
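The derived parameters listed above follow mechanically from K' and the table values S(K'), H(K'), and W(K'). The sketch below is a hypothetical helper that computes them; the caller supplies S, H, and W, since Table 2 of Section 5.6 is not part of this section. The example row (K' = 10, S = 7, H = 10, W = 17) is an assumption recalled from that table and should be checked against it.

```python
def is_prime(n):
    """Trial-division primality test; adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def derive_parameters(k_prime, s, h, w):
    """Compute L, P, P1, U, and B from K' and the table values S, H, W."""
    l = k_prime + s + h   # total number of intermediate symbols
    p = l - w             # number of PI symbols
    p1 = p
    while not is_prime(p1):
        p1 += 1           # smallest prime >= P
    return {"L": l, "P": p, "P1": p1, "U": p - h, "B": w - s}


# Assumed Table 2 row: K' = 10, S = 7, H = 10, W = 17.
params = derive_parameters(10, 7, 10, 17)
```

For the assumed row this gives L = 27, P = 10, P1 = 11, U = 0, and B = 10, so in that case every PI symbol is an HDPC symbol.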
```   The first set of pre-coding relations, called LDPC relations, is
described below and requires that at the end of this process the set
of symbols D[0] , ..., D[S-1] are all zero:

o  Initialize the symbols D[0] = C[B], ..., D[S-1] = C[B+S-1].

o  For i = 0, ..., B-1 do

*  a = 1 + floor(i/S)

*  b = i % S

*  D[b] = D[b] + C[i]

*  b = (b + a) % S

*  D[b] = D[b] + C[i]

*  b = (b + a) % S

*  D[b] = D[b] + C[i]

o  For i = 0, ..., S-1 do

*  a = i % P

*  b = (i+1) % P

*  D[i] = D[i] + C[W+a] + C[W+b]

Recall that the addition of symbols is to be carried out as specified
in Section 5.7.

Note that the LDPC relations as defined in the algorithm above are
linear, so there exists an S x B matrix G_LDPC,1 and an S x P matrix
G_LDPC,2 such that

G_LDPC,1 * Transpose[(C[0], ..., C[B-1])] + G_LDPC,2 *
Transpose[(C[W], ..., C[W+P-1])] + Transpose[(C[B], ..., C[B+S-1])]
= 0

(The matrix G_LDPC,1 is defined by the first loop in the above
algorithm, and G_LDPC,2 can be deduced from the second loop.)

The second set of relations among the intermediate symbols C[0], ...,
C[L-1] are the HDPC relations and they are defined as follows:
```
```   Let

o  alpha denote the octet represented by integer 2 as defined in
Section 5.7.

o  MT denote an H x (K' + S) matrix of octets, where for j=0, ...,
K'+S-2, the entry MT[i,j] is the octet represented by the integer
1 if i= Rand[j+1,6,H] or i = (Rand[j+1,6,H] + Rand[j+1,7,H-1] + 1)
% H, and MT[i,j] is the zero element for all other values of i,
and for j=K'+S-1, MT[i,j] = alpha^^i for i=0, ..., H-1.

o  GAMMA denote a (K'+S) x (K'+S) matrix of octets, where

GAMMA[i,j] =

alpha ^^ (i-j) for i >= j,

0 otherwise.

Then, the relationship between the first K'+S intermediate symbols
C[0], ..., C[K'+S-1] and the H HDPC symbols C[K'+S], ..., C[K'+S+H-1]
is given by:

Transpose[C[K'+S], ..., C[K'+S+H-1]] + MT * GAMMA *
Transpose[C[0], ..., C[K'+S-1]] = 0,

where '*' represents standard matrix multiplication utilizing the
octet multiplication to define the multiplication between a matrix of
octets and a matrix of symbols (in particular, the column vector of
symbols), and '+' denotes addition over octet vectors.

5.3.3.4.  Intermediate Symbols

5.3.3.4.1.  Definition

Given the K' source symbols C'[0], C'[1], ..., C'[K'-1] the L
intermediate symbols C[0], C[1], ..., C[L-1] are the uniquely defined
symbol values that satisfy the following conditions:

1.  The K' source symbols C'[0], C'[1], ..., C'[K'-1] satisfy the K'
constraints

C'[X] = Enc[K', (C[0], ..., C[L-1]), (d[X], a[X], b[X], d1[X],
a1[X], b1[X])], for all X, 0 <= X < K',

where (d[X], a[X], b[X], d1[X], a1[X], b1[X]) = Tuple[K',X],
Tuple[] is defined in Section 5.3.5.4, and Enc[] is described in
Section 5.3.5.3.
```
```   2.  The L intermediate symbols C[0], C[1], ..., C[L-1] satisfy the
pre-coding relationships defined in Section 5.3.3.3.

5.3.3.4.2.  Example Method for Calculation of Intermediate Symbols

This section describes a possible method for calculation of the L
intermediate symbols C[0], C[1], ..., C[L-1] satisfying the
constraints in Section 5.3.3.4.1.

The L intermediate symbols can be calculated as follows:

Let

o  C denote the column vector of the L intermediate symbols, C[0],
C[1], ..., C[L-1].

o  D denote the column vector consisting of S+H zero symbols followed
by the K' source symbols C'[0], C'[1], ..., C'[K'-1].

Then, the above constraints define an L x L matrix A of octets such
that:

A*C = D

The matrix A can be constructed as follows:

Let

o  G_LDPC,1 and G_LDPC,2 be S x B and S x P matrices as defined in
Section 5.3.3.3.

o  G_HDPC be the H x (K'+S) matrix such that

G_HDPC * Transpose[(C[0], ..., C[K'+S-1])] = Transpose[(C[K'+S],
..., C[L-1])],

i.e., G_HDPC = MT*GAMMA

o  I_S be the S x S identity matrix

o  I_H be the H x H identity matrix

o  G_ENC be the K' x L matrix such that

G_ENC * Transpose[(C[0], ..., C[L-1])] =
Transpose[(C'[0],C'[1], ...,C'[K'-1])],
```
```         i.e., G_ENC[i,j] = 1 if and only if C[j] is included in the
symbols that are summed to produce Enc[K', (C[0], ..., C[L-1]),
(d[i], a[i], b[i], d1[i], a1[i], b1[i])] and G_ENC[i,j] = 0
otherwise.

Then

o  The first S rows of A are equal to G_LDPC,1 | I_S | G_LDPC,2.

o  The next H rows of A are equal to G_HDPC | I_H.

o  The remaining K' rows of A are equal to G_ENC.

The matrix A is depicted in Figure 5 below:

                  B               S         U         H
      +-----------------------+-------+------------------+
      |                       |       |                  |
    S |        G_LDPC,1       |  I_S  |      G_LDPC,2    |
      |                       |       |                  |
      +-----------------------+-------+----------+-------+
      |                                          |       |
    H |                G_HDPC                    |  I_H  |
      |                                          |       |
      +------------------------------------------+-------+
      |                                                  |
      |                                                  |
   K' |                      G_ENC                       |
      |                                                  |
      |                                                  |
      +--------------------------------------------------+

Figure 5: The Matrix A

The intermediate symbols can then be calculated as:

C = (A^^-1)*D

The source tuples are generated such that, for any K', the matrix A
has full rank and is therefore invertible.  This calculation can be
realized by applying a RaptorQ decoding process to the K' source
symbols C'[0], C'[1], ..., C'[K'-1] to produce the L intermediate
symbols C[0], C[1], ..., C[L-1].

To efficiently generate the intermediate symbols from the source
symbols, it is recommended that an efficient decoder implementation
such as that described in Section 5.4 be used.
```
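As an illustration of how the first S rows of A arise, the sketch below builds the block G_LDPC,1 | I_S | G_LDPC,2 directly from the two loops of the LDPC relations in Section 5.3.3.3. It is a simplified, dense representation with hypothetical names; since these rows contain only the octets 0 and 1, entries are stored as Python integers and accumulated with exclusive-or. An efficient implementation would exploit sparsity instead.

```python
def ldpc_rows(s, b, w, p, l):
    """Return the first S rows of A as an S x L list of 0/1 entries."""
    rows = [[0] * l for _ in range(s)]

    # First loop: each C[i], 0 <= i < B, contributes to three rows
    # (this defines G_LDPC,1).
    for i in range(b):
        a = 1 + i // s
        r = i % s
        for _ in range(3):
            rows[r][i] ^= 1
            r = (r + a) % s

    # Initialization D[i] = C[B+i] gives the identity block I_S.
    for i in range(s):
        rows[i][b + i] ^= 1

    # Second loop: two PI symbols per row (this defines G_LDPC,2).
    for i in range(s):
        rows[i][w + (i % p)] ^= 1
        rows[i][w + ((i + 1) % p)] ^= 1

    return rows


# Example with S = 7, B = 10, W = 17, P = 10, L = 27 (assumed to
# correspond to K' = 10 in Table 2 of Section 5.6).
example_rows = ldpc_rows(7, 10, 17, 10, 27)
```

In this example, every one of the first B columns contains exactly three nonzero entries, and every row has exactly two nonzero entries among the PI columns.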
```5.3.4.  Second Encoding Step: Encoding

In the second encoding step, the repair symbol with ISI X (X >= K')
is generated by applying the generator Enc[K', (C[0], C[1], ...,
C[L-1]), (d, a, b, d1, a1, b1)] defined in Section 5.3.5.3 to the L
intermediate symbols C[0], C[1], ..., C[L-1] using the tuple (d, a,
b, d1, a1, b1)=Tuple[K',X].

5.3.5.  Generators

5.3.5.1.  Random Number Generator

The random number generator Rand[y, i, m] is defined as follows,
where y is a non-negative integer, i is a non-negative integer less
than 256, and m is a positive integer, and the value produced is an
integer between 0 and m-1.  Let V0, V1, V2, and V3 be the arrays
provided in Section 5.5.

Let

o  x0 = (y + i) mod 2^^8

o  x1 = (floor(y / 2^^8) + i) mod 2^^8

o  x2 = (floor(y / 2^^16) + i) mod 2^^8

o  x3 = (floor(y / 2^^24) + i) mod 2^^8

Then

Rand[y, i, m] = (V0[x0] ^ V1[x1] ^ V2[x2] ^ V3[x3]) % m

5.3.5.2.  Degree Generator

The degree generator Deg[v] is defined as follows, where v is a non-
negative integer that is less than 2^^20 = 1048576.  Given v, find
index d in Table 1 such that f[d-1] <= v < f[d], and set Deg[v] =
min(d, W-2).  Recall that W is derived from K' as described in
Section 5.3.3.3.
```
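The two generators above can be sketched as follows. The function names are hypothetical. The arrays V0 to V3 of Section 5.5 are not reproduced in this section, so `rand` takes them as arguments; the thresholds in `F` are the f[d] values of Table 1. The lookup via `bisect_right` is an implementation choice, not something the specification prescribes.

```python
from bisect import bisect_right

# f[d] values from Table 1, indexed by d = 0, ..., 30.
F = [0, 5243, 529531, 704294, 791675, 844104, 879057, 904023, 922747,
     937311, 948962, 958494, 966438, 973160, 978921, 983914, 988283,
     992138, 995565, 998631, 1001391, 1003887, 1006157, 1008229, 1010129,
     1011876, 1013490, 1014983, 1016370, 1017662, 1048576]


def rand(y, i, m, v0, v1, v2, v3):
    """Rand[y, i, m], with the four 256-entry tables supplied by the caller."""
    x0 = (y + i) % 256
    x1 = ((y >> 8) + i) % 256    # y >> 8 equals floor(y / 2^^8) for y >= 0
    x2 = ((y >> 16) + i) % 256
    x3 = ((y >> 24) + i) % 256
    return (v0[x0] ^ v1[x1] ^ v2[x2] ^ v3[x3]) % m


def deg(v, w):
    """Deg[v]: the index d with f[d-1] <= v < f[d], capped at W-2."""
    if not 0 <= v < F[-1]:
        raise ValueError("v must satisfy 0 <= v < 2^^20")
    d = bisect_right(F, v)       # first index whose threshold exceeds v
    return min(d, w - 2)
```

Note that the cap W-2 only matters for small K', where W is small; for example, with W = 17 no degree above 15 is ever produced.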
```                 +---------+---------+---------+---------+
| Index d | f[d]    | Index d | f[d]    |
+---------+---------+---------+---------+
| 0       | 0       | 1       | 5243    |
+---------+---------+---------+---------+
| 2       | 529531  | 3       | 704294  |
+---------+---------+---------+---------+
| 4       | 791675  | 5       | 844104  |
+---------+---------+---------+---------+
| 6       | 879057  | 7       | 904023  |
+---------+---------+---------+---------+
| 8       | 922747  | 9       | 937311  |
+---------+---------+---------+---------+
| 10      | 948962  | 11      | 958494  |
+---------+---------+---------+---------+
| 12      | 966438  | 13      | 973160  |
+---------+---------+---------+---------+
| 14      | 978921  | 15      | 983914  |
+---------+---------+---------+---------+
| 16      | 988283  | 17      | 992138  |
+---------+---------+---------+---------+
| 18      | 995565  | 19      | 998631  |
+---------+---------+---------+---------+
| 20      | 1001391 | 21      | 1003887 |
+---------+---------+---------+---------+
| 22      | 1006157 | 23      | 1008229 |
+---------+---------+---------+---------+
| 24      | 1010129 | 25      | 1011876 |
+---------+---------+---------+---------+
| 26      | 1013490 | 27      | 1014983 |
+---------+---------+---------+---------+
| 28      | 1016370 | 29      | 1017662 |
+---------+---------+---------+---------+
| 30      | 1048576 |         |         |
+---------+---------+---------+---------+

Table 1: Defines the Degree Distribution for Encoding Symbols

5.3.5.3.  Encoding Symbol Generator

The encoding symbol generator Enc[K', (C[0], C[1], ..., C[L-1]), (d,
a, b, d1, a1, b1)] takes the following inputs:

o  K' is the number of source symbols for the extended source block.
Let L, W, B, S, P, and P1 be derived from K' as described in
Section 5.3.3.3.
```
```   o  (C[0], C[1], ..., C[L-1]) is the array of L intermediate symbols
(sub-symbols) generated as described in Section 5.3.3.4.

o  (d, a, b, d1, a1, b1) is a source tuple determined from ISI X
using the Tuple[] generator defined in Section 5.3.5.4, whereby

*  d is a positive integer denoting an encoding symbol LT degree

*  a is a positive integer between 1 and W-1 inclusive

*  b is a non-negative integer between 0 and W-1 inclusive

*  d1 is a positive integer that has value either 2 or 3 denoting
an encoding symbol PI degree

*  a1 is a positive integer between 1 and P1-1 inclusive

*  b1 is a non-negative integer between 0 and P1-1 inclusive

The encoding symbol generator produces a single encoding symbol as
output (referred to as result), according to the following algorithm:

o  result = C[b]

o  For j = 1, ..., d-1 do

*  b = (b + a) % W

*  result = result + C[b]

o  While (b1 >= P) do b1 = (b1+a1) % P1

o  result = result + C[W+b1]

o  For j = 1, ..., d1-1 do

*  b1 = (b1 + a1) % P1

*  While (b1 >= P) do b1 = (b1+a1) % P1

*  result = result + C[W+b1]

o  Return result
```
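The algorithm above translates almost line for line into code. In the sketch below, symbols are modeled as `bytes` objects of equal length and symbol addition is taken to be the octet-wise exclusive-or, which is what Section 5.7 is understood to define; the names `enc` and `xor_symbols` are hypothetical, and W, P, and P1 are passed in directly rather than derived from K'.

```python
def xor_symbols(x, y):
    """Add two equal-length symbols (octet-wise exclusive-or)."""
    return bytes(p ^ q for p, q in zip(x, y))


def enc(c, w, p, p1, tup):
    """Generate one encoding symbol from the intermediate symbols c."""
    d, a, b, d1, a1, b1 = tup

    # LT part: d symbols taken from C[0], ..., C[W-1].
    result = c[b]
    for _ in range(d - 1):
        b = (b + a) % w
        result = xor_symbols(result, c[b])

    # PI part: d1 symbols taken from C[W], ..., C[W+P-1].
    # Indices are stepped modulo the prime P1; values >= P are skipped.
    while b1 >= p:
        b1 = (b1 + a1) % p1
    result = xor_symbols(result, c[w + b1])
    for _ in range(d1 - 1):
        b1 = (b1 + a1) % p1
        while b1 >= p:
            b1 = (b1 + a1) % p1
        result = xor_symbols(result, c[w + b1])

    return result
```

Because P1 is prime and a1 lies between 1 and P1-1, repeatedly adding a1 modulo P1 visits every residue, so the skipping loops always terminate.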
```5.3.5.4.  Tuple Generator

The tuple generator Tuple[K',X] takes the following inputs:

o  K': the number of source symbols in the extended source block

o  X: an ISI

Let

o  L be determined from K' as described in Section 5.3.3.3

o  J = J(K') be the systematic index associated with K', as defined
in Table 2 in Section 5.6

The output of the tuple generator is a tuple, (d, a, b, d1, a1, b1),
determined as follows:

o  A = 53591 + J*997

o  if (A % 2 == 0) { A = A + 1 }

o  B = 10267*(J+1)

o  y = (B + X*A) % 2^^32

o  v = Rand[y, 0, 2^^20]

o  d = Deg[v]

o  a = 1 + Rand[y, 1, W-1]

o  b = Rand[y, 2, W]

o  If (d < 4) { d1 = 2 + Rand[X, 3, 2] } else { d1 = 2 }

o  a1 = 1 + Rand[X, 4, P1-1]

o  b1 = Rand[X, 5, P1]

5.4.  Example FEC Decoder

5.4.1.  General

This section describes an efficient decoding algorithm for the
RaptorQ code introduced in this specification.  Note that each
received encoding symbol is a known linear combination of the
intermediate symbols.  So, each received encoding symbol provides a
```
```   linear equation among the intermediate symbols, which, together with
the known linear pre-coding relationships amongst the intermediate
symbols, gives a system of linear equations.  Thus, any algorithm for
solving systems of linear equations can successfully decode the
intermediate symbols and hence the source symbols.  However, the
algorithm chosen has a major effect on the computational efficiency
of the decoding.

5.4.2.  Decoding an Extended Source Block

5.4.2.1.  General

It is assumed that the decoder knows the structure of the source
block it is to decode, including the symbol size, T, and the number K
of symbols in the source block and the number K' of source symbols in
the extended source block.

From the algorithms described in Section 5.3, the RaptorQ decoder can
calculate the total number L = K'+S+H of intermediate symbols and
determine how they were generated from the extended source block to
be decoded.  In this description, it is assumed that the received
encoding symbols for the extended source block to be decoded are
passed to the decoder.  Furthermore, for each such encoding symbol,
it is assumed that the number and set of intermediate symbols whose
sum is equal to the encoding symbol are passed to the decoder.  In
the case of source symbols, including padding symbols, the source
symbol tuples described in Section 5.3.3.2 indicate the number and
set of intermediate symbols that sum to give each source symbol.

Let N >= K' be the number of received encoding symbols to be used for
decoding, including padding symbols for an extended source block, and
let M = S+H+N.  Then, with the notation of Section 5.3.3.4.2, we have
A*C = D.

Decoding an extended source block is equivalent to decoding C from
known A and D.  It is clear that C can be decoded if and only if the
rank of A is L.  Once C has been decoded, missing source symbols can
be obtained by using the source symbol tuples to determine the number
and set of intermediate symbols that must be summed to obtain each
missing source symbol.

The first step in decoding C is to form a decoding schedule.  In this
step, A is converted using Gaussian elimination (using row operations
and row and column reorderings) and after discarding M - L rows, into
the L x L identity matrix.  The decoding schedule consists of the
sequence of row operations and row and column reorderings during the
Gaussian elimination process, and it only depends on A and not on D.
```
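To illustrate that the decoding schedule depends only on A, the following Python sketch forms a schedule by plain Gaussian elimination and then replays it on D. It is a simplification under stated assumptions, not the phased algorithm of this section: A is assumed to contain only GF(2) entries (so every multiple beta is 1), no column reorderings are used, and symbols are `bytes` objects. The names `form_schedule` and `apply_schedule` are hypothetical.

```python
def xor_symbols(x, y):
    """Add two equal-length symbols (bytewise XOR)."""
    return bytes(p ^ q for p, q in zip(x, y))


def form_schedule(A):
    """Reduce a copy of the M x L binary matrix A and record the row operations.

    Returns a list of ("exchange", i, i2) and ("add", i, i2) entries, where
    ("add", i, i2) means "row i is added to row i2".
    Raises ValueError if the rank of A is less than L.
    """
    A = [row[:] for row in A]
    M, L = len(A), len(A[0])
    schedule = []
    for col in range(L):
        pivot = next((r for r in range(col, M) if A[r][col]), None)
        if pivot is None:
            raise ValueError("rank of A is less than L; decoding fails")
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            schedule.append(("exchange", col, pivot))
        for r in range(M):
            if r != col and A[r][col]:
                A[r] = [x ^ y for x, y in zip(A[r], A[col])]
                schedule.append(("add", col, r))
    return schedule


def apply_schedule(schedule, D, L):
    """Replay a schedule on the symbols D and return the L decoded symbols."""
    D = list(D)
    d = list(range(len(D)))  # d[i] is the index in D of the symbol for row i
    for op, i, i2 in schedule:
        if op == "exchange":
            d[i], d[i2] = d[i2], d[i]
        else:
            D[d[i2]] = xor_symbols(D[d[i2]], D[d[i]])
    return [D[d[j]] for j in range(L)]
```

Because `form_schedule` never looks at D, the same schedule could be reused for any set of symbols that share the same matrix A.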
```   The decoding of C from D can take place concurrently with the forming
of the decoding schedule, or the decoding can take place afterwards
based on the decoding schedule.

The correspondence between the decoding schedule and the decoding of
C is as follows.  Let c[0] = 0, c[1] = 1, ..., c[L-1] = L-1 and d[0]
= 0, d[1] = 1, ..., d[M-1] = M-1 initially.

o  Each time a multiple, beta, of row i of A is added to row i' in
the decoding schedule, then in the decoding process the symbol
beta*D[d[i]] is added to symbol D[d[i']].

o  Each time a row i of A is multiplied by an octet beta, then in the
decoding process the symbol D[d[i]] is also multiplied by beta.

o  Each time row i is exchanged with row i' in the decoding schedule,
then in the decoding process the value of d[i] is exchanged with
the value of d[i'].

o  Each time column j is exchanged with column j' in the decoding
schedule, then in the decoding process the value of c[j] is
exchanged with the value of c[j'].

From this correspondence, it is clear that the total number of
operations on symbols in the decoding of the extended source block is
the number of row operations (not exchanges) in the Gaussian
elimination.  Since A is the L x L identity matrix after the Gaussian
elimination and after discarding the last M - L rows, it is clear at
the end of successful decoding that the L symbols D[d[0]], D[d[1]],
..., D[d[L-1]] are the values of the L symbols C[c[0]], C[c[1]], ...,
C[c[L-1]].

The order in which Gaussian elimination is performed to form the
decoding schedule has no bearing on whether or not the decoding is
successful.  However, the speed of the decoding depends heavily on
the order in which Gaussian elimination is performed.  (Furthermore,
maintaining a sparse representation of A is crucial, although this is
not described here.)  The remainder of this section describes an
order in which Gaussian elimination could be performed that is
relatively efficient.

5.4.2.2.  First Phase

In the first phase of the Gaussian elimination, the matrix A is
conceptually partitioned into submatrices and, additionally, a matrix
X is created.  This matrix has as many rows and columns as A, and it
will be a lower triangular matrix throughout the first phase.  At the
beginning of this phase, the matrix A is copied into the matrix X.
```
```   The submatrix sizes are parameterized by non-negative integers i and
u, which are initialized to 0 and P, the number of PI symbols,
respectively.  The submatrices of A are:

1.  The submatrix I defined by the intersection of the first i rows
and first i columns.  This is the identity matrix at the end of
each step in the phase.

2.  The submatrix defined by the intersection of the first i rows and
all but the first i columns and last u columns.  All entries of
this submatrix are zero.

3.  The submatrix defined by the intersection of the first i columns
and all but the first i rows.  All entries of this submatrix are
zero.

4.  The submatrix U defined by the intersection of all the rows and
the last u columns.

5.  The submatrix V formed by the intersection of all but the first i
rows and all but the first i columns and last u columns.

Figure 6 illustrates the submatrices of A.  At the beginning of the
first phase, V consists of the first L-P columns of A, and U consists
of the last P columns corresponding to the PI symbols.  In each step,
a row of A is chosen.

+-----------+-----------------+---------+
|           |                 |         |
|     I     |    All Zeros    |         |
|           |                 |         |
+-----------+-----------------+    U    |
|           |                 |         |
|           |                 |         |
| All Zeros |       V         |         |
|           |                 |         |
|           |                 |         |
+-----------+-----------------+---------+

Figure 6: Submatrices of A in the First Phase

The following graph defined by the structure of V is used in
determining which row of A is chosen.  The columns that intersect V
are the nodes in the graph, and the rows that have exactly 2 nonzero
entries in V and are not HDPC rows are the edges of the graph that
connect the two columns (nodes) in the positions of the two ones.  A
component in this graph is a maximal set of nodes (columns) and edges
(rows) such that there is a path between each pair of nodes/edges in
the graph.  The size of a component is the number of nodes (columns)
in the component.

There are at most L steps in the first phase.  The phase ends
successfully when i + u = L, i.e., when V and the all zeros submatrix
above V have disappeared, and A consists of I, the all zeros
submatrix below I, and U.  The phase ends unsuccessfully in decoding
failure if at some step before V disappears there is no nonzero row
in V to choose in that step.  In each step, a row of A is chosen as
follows:

o  If all entries of V are zero, then no row is chosen and decoding
fails.

o  Let r be the minimum integer such that at least one row of A has
exactly r nonzeros in V.

*  If r != 2, then choose a row with exactly r nonzeros in V with
minimum original degree among all such rows, except that HDPC
rows should not be chosen until all non-HDPC rows have been
processed.

*  If r = 2 and there is a row with exactly 2 ones in V, then
choose any row with exactly 2 ones in V that is part of a
maximum size component in the graph described above that is
defined by V.

*  If r = 2 and there is no row with exactly 2 ones in V, then
choose any row with exactly 2 nonzeros in V.

After the row is chosen in this step, the first row of A that
intersects V is exchanged with the chosen row so that the chosen row
is the first row that intersects V.  The columns of A among those
that intersect V are reordered so that one of the r nonzeros in the
chosen row appears in the first column of V and so that the remaining
r-1 nonzeros appear in the last columns of V.  The same row and
column operations are also performed on the matrix X.  Then, an
appropriate multiple of the chosen row is added to all the other rows
of A below the chosen row that have a nonzero entry in the first
column of V.  Specifically, if a row below the chosen row has entry
beta in the first column of V, and the chosen row has entry alpha in
the first column of V, then beta/alpha multiplied by the chosen row
is added to this row to leave a zero value in the first column of V.
Finally, i is incremented by 1 and u is incremented by r-1, which
completes the step.
```
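The row choice made in each step of the first phase can be sketched in Python as follows. This is only an illustration under several assumptions: V is passed as a list of rows holding the entries (integers 0 to 255) of the rows below I restricted to the columns of V, `orig_degree` and `is_hdpc` are per-row lists, and the name `choose_row` is hypothetical. The sketch reads the HDPC exception as "ignore HDPC rows when computing r while any nonzero non-HDPC row remains", which is one possible interpretation. It does not perform the subsequent row and column reorderings or the elimination.

```python
from collections import Counter


def choose_row(V, orig_degree, is_hdpc):
    """Return the index of the row to choose, or None if V is all zeros."""
    weight = [sum(1 for x in row if x) for row in V]
    live = [k for k, w in enumerate(weight) if w > 0]
    if not live:
        return None  # all entries of V are zero: decoding fails

    # Assumed reading: HDPC rows are considered only once no non-HDPC row is left.
    pool = [k for k in live if not is_hdpc[k]] or live
    r = min(weight[k] for k in pool)
    rows_r = [k for k in pool if weight[k] == r]

    if r != 2:
        return min(rows_r, key=lambda k: orig_degree[k])

    # r = 2: look for rows whose two nonzero entries are both ones.
    two_ones = [k for k in rows_r if all(x in (0, 1) for x in V[k])]
    if not two_ones:
        return rows_r[0]

    # Graph on the columns of V; non-HDPC rows with two ones are the edges.
    edges = [k for k in two_ones if not is_hdpc[k]]
    if not edges:
        return two_ones[0]
    parent = list(range(len(V[0])))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def ends(k):
        return [j for j, x in enumerate(V[k]) if x]

    for k in edges:
        u, v = ends(k)
        parent[find(u)] = find(v)

    # Component size is the number of columns (nodes) in the component.
    size = Counter(find(j) for j in range(len(parent)))
    return max(edges, key=lambda k: size[find(ends(k)[0])])
```

A union-find structure is used here only because it keeps the sketch short; the text does not prescribe how components are tracked.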
```   Note that efficiency can be improved if the row operations identified
above are not actually performed until the affected row is itself
chosen during the decoding process.  This avoids processing of row
operations for rows that are not eventually used in the decoding
process, and in particular this avoids those rows for which beta != 1
until they are actually required.  Furthermore, the row operations
required for the HDPC rows may be performed for all such rows in one
process, by using the algorithm described in Section 5.3.3.3.

5.4.2.3.  Second Phase

At this point, all the entries of X outside the first i rows and i
columns are discarded, so that X has i rows and i columns and is in
lower triangular form.  The submatrix U is further partitioned into
the first i rows, U_upper, and the remaining M - i rows, U_lower.
Gaussian elimination is performed in the second phase on U_lower
either to determine that its rank is less than u (decoding failure)
or to convert it into a matrix where the first u rows form the
identity matrix (success of the second phase).  Call this u x u
identity matrix I_u.  The M - L rows of A that intersect U_lower - I_u
are discarded.  After this phase, A has L rows and L columns.

5.4.2.4.  Third Phase

After the second phase, the only portion of A that needs to be zeroed
out to finish converting A into the L x L identity matrix is U_upper.
The number of rows i of the submatrix U_upper is generally much
larger than the number of columns u of U_upper.  Moreover, at this
time, the matrix U_upper is typically dense, i.e., the number of
nonzero entries of this matrix is large.  To reduce this matrix to a
sparse form, the sequence of operations performed to obtain the
matrix U_lower needs to be inverted.  To this end, the matrix X is
multiplied with the submatrix of A consisting of the first i rows of
A.  After this operation, the submatrix of A consisting of the
intersection of the first i rows and columns is equal to X, whereas
the matrix U_upper is transformed to a sparse form.

5.4.2.5.  Fourth Phase

For each of the first i rows of U_upper, do the following: if the row
has a nonzero entry at position j, and if the value of that nonzero
entry is b, then add to this row b times row j of I_u.  After this
step, the submatrix of A consisting of the intersection of the first
i rows and columns is equal to X, the submatrix U_upper consists of
zeros, the submatrix consisting of the intersection of the last u
rows and the first i columns consists of zeros, and the submatrix
consisting of the last u rows and columns is the matrix I_u.
```
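The fourth phase can be sketched in Python as follows. The sketch makes several assumptions: A is an L x L list of lists of octets in the state left by the third phase, D is a list of `bytes` symbols already ordered to match the rows of A, indices are 0-based, and octet multiplication uses the GF(256) representation with reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D). The names `gf256_mul`, `add_scaled`, and `fourth_phase` are hypothetical.

```python
def gf256_mul(x, y):
    """Multiply two octets in GF(256), reducing by 0x11D (assumed polynomial)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return result


def add_scaled(dst, src, beta):
    """Return dst + beta * src for two equal-length symbols."""
    return bytes(p ^ gf256_mul(beta, q) for p, q in zip(dst, src))


def fourth_phase(A, D, i, u):
    """Zero out U_upper using the rows of I_u, updating A and D in place."""
    for row in range(i):
        for j in range(u):
            b = A[row][i + j]
            if b:
                # Row j of I_u is row i + j of A; its only nonzero entry is
                # a one in column i + j, so adding b times it clears A[row][i + j].
                D[row] = add_scaled(D[row], D[i + j], b)
                A[row][i + j] = 0
```

Since each row of I_u has a single nonzero entry, the matrix update touches one entry per operation; the cost is dominated by the symbol operations on D.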
```5.4.2.6.  Fifth Phase

For j from 1 to i, perform the following operations:

1.  If A[j,j] is not one, then divide row j of A by A[j,j].

2.  For l from 1 to j-1, if A[j,l] is nonzero, then add A[j,l]
multiplied with row l of A to row j of A.

After this phase, A is the L x L identity matrix and a complete
decoding schedule has been successfully formed.  Then, the
corresponding decoding consisting of summing known encoding symbols
can be executed to recover the intermediate symbols based on the
decoding schedule.  The tuples associated with all source symbols are
computed according to Section 5.3.3.2.  The tuples for received
source symbols are used in the decoding.  The tuples for missing
source symbols are used to determine which intermediate symbols need
to be summed to recover the missing source symbols.

```
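The two numbered steps of the fifth phase amount to forward substitution on the lower triangular block of A. A minimal Python sketch follows, under these assumptions: A is a list of lists of octets, D is a list of `bytes` symbols already ordered to match the rows of A, indices are 0-based (so the loops run from 0 to i-1), and GF(256) uses the reduction polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D). The names `gf256_mul`, `gf256_inv`, and `fifth_phase` are hypothetical.

```python
def gf256_mul(x, y):
    """Multiply two octets in GF(256), reducing by 0x11D (assumed polynomial)."""
    result = 0
    while y:
        if y & 1:
            result ^= x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D
        y >>= 1
    return result


def gf256_inv(x):
    """Inverse of a nonzero octet, computed as x^254 (slow but simple)."""
    r = 1
    for _ in range(254):
        r = gf256_mul(r, x)
    return r


def fifth_phase(A, D, i):
    """Turn the leading i x i lower triangular block of A into the identity."""
    for j in range(i):
        # Step 1: divide row j by its diagonal entry if that entry is not one.
        pivot = A[j][j]
        if pivot != 1:
            inv = gf256_inv(pivot)
            A[j] = [gf256_mul(inv, x) for x in A[j]]
            D[j] = bytes(gf256_mul(inv, x) for x in D[j])
        # Step 2: clear the entries to the left of the diagonal.
        for l in range(j):
            beta = A[j][l]
            if beta:
                A[j] = [x ^ gf256_mul(beta, y) for x, y in zip(A[j], A[l])]
                D[j] = bytes(x ^ gf256_mul(beta, y) for x, y in zip(D[j], D[l]))
```

A table-driven multiplication and inverse would normally replace the loops in `gf256_mul` and `gf256_inv`; they are written out here only to keep the sketch self-contained.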

(page 36 continued on part 3)