doc/xml/08-residue.xml

   1 <?xml version="1.0" standalone="no"?>
   2 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
   3                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
   4
   5 ]>
   6
   7 <section id="vorbis-spec-residue">
   8 <sectioninfo>
   9  <releaseinfo>
  10   $Id: 08-residue.xml,v 1.5 2003/03/11 11:02:17 xiphmont Exp $
  11   <emphasis>Last update to this document: March 11, 2003</emphasis>
  12  </releaseinfo>
  13 </sectioninfo>
  14 <title>Residue setup and decode</title>
  15
  16
  17 <section>
  18 <title>Overview</title>
  19
  20 <para>
  21 A residue vector represents the fine detail of the audio spectrum of
  22 one channel in an audio frame after the encoder subtracts the floor
  23 curve and performs any channel coupling.  A residue vector may
  24 represent spectral lines, spectral magnitude, spectral phase or
  25 hybrids as mixed by channel coupling.  The exact semantic content of
  26 the vector does not matter to the residue abstraction.</para>
  27
  28 <para>
  29 Whatever the exact qualities, the Vorbis residue abstraction codes the
  30 residue vectors into the bitstream packet, and then reconstructs the
  31 vectors during decode.  Vorbis makes use of three different encoding
  32 variants (numbered 0, 1 and 2) of the same basic vector encoding
  33 abstraction.</para>
  34
  35 </section>
  36
  37 <section>
  38 <title>Residue format</title>
  39
  40 <para>
  41 Reside format partitions each vector in the vector bundle into chunks,
  42 classifies each chunk, encodes the chunk classifications and finally
  43 encodes the chunks themselves using the the specific VQ arrangement
  44 defined for each selected selected classification.  The exact
  45 interleaving and partitioning vary by residue encoding number, however
  46 the high-level process used to classify and encode the residue vector
  47 is the same in all three variants.</para>
  48
  49 <para>
  50 A set of coded residue vectors are all of the same length.  High level
  51 coding structure, ignoring for the moment exactly how a partition is
  52 encoded and simply trusting that it is, is as follows:</para>
  53
  54 <itemizedlist>
  55 <listitem><para>Each vector is partitioned into multiple equal sized chunks
  56 according to configuration specified.  If we have a vector size of
  57 <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
  58 of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
  59 coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>.  It is
  60 important to note that the integer division truncates.  In the below
  61 example, we assume an example <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
  62
  63 <listitem><para>Each partition in each vector has a classification number that
  64 specifies which of multiple configured VQ codebook setups are used to
  65 decode that partition.  The classification numbers of each partition
  66 can be thought of as forming a vector in their own right, as in the
  67 illustration below.  Just as the residue vectors are coded in grouped
  68 partitions to increase encoding efficiency, the classification vector
  69 is also partitioned into chunks.  The integer elements of each scalar
  70 in a classification chunk are built into a single scalar that
  71 represents the classification numbers in that chunk.  In the below
  72 example, the classification codeword encodes two classification
  73 numbers.</para></listitem>
  74
  75 <listitem><para>The values in a residue vector may be encoded monolithically in a
  76 single pass through the residue vector, but more often efficient
  77 codebook design dictates that each vector is encoded as the additive
  78 sum of several passes through the residue vector using more than one
  79 VQ codebook.  Thus, each residue value potentially accumulates values
  80 from multiple decode passes.  The classification value associated with
  81 a partition is the same in each pass, thus the classification codeword
  82 is coded only in the first pass.</para></listitem>
  83
  84 </itemizedlist>
  85
  86 <mediaobject>
  87 <imageobject>
  88  <imagedata fileref="residue-pack.png" format="PNG"/>
  89 </imageobject>
  90 <textobject>
  91  <phrase>[illustration of residue vector format]</phrase>
  92 </textobject>
  93 </mediaobject>
  94
  95 </section>
  96
  97 <section><title>residue 0</title>
  98
  99 <para>
 100 Residue 0 and 1 differ only in the way the values within a residue
 101 partition are interleaved during partition encoding (visually treated
 102 as a black box--or cyan box or brown box--in the above figure).</para>
 103
 104 <para>
 105 Residue encoding 0 interleaves VQ encoding according to the
 106 dimension of the codebook used to encode a partition in a specific
 107 pass.  The dimension of the codebook need not be the same in multiple
 108 passes, however the partition size must be an even multiple of the
 109 codebook dimension.</para>
 110
 111 <para>
 112 As an example, assume a partition vector of size eight, to be encoded
 113 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
 114
 115 <programlisting>
 116
 117             original residue vector: [ 0 1 2 3 4 5 6 7 ]
 118
 119 codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
 120
 121 codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
 122
 123 codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
 124
 125 codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
 126
 127 </programlisting>
 128
 129 <para>
 130 It is worth mentioning at this point that no configurable value in the
 131 residue coding setup is restricted to a power of two.</para>
 132
 133 </section>
 134
 135 <section><title>residue 1</title>
 136
 137 <para>
 138 Residue 1 does not interleave VQ encoding.  It represents partition
 139 vector scalars in order.  As with residue 0, however, partition length
 140 must be an integer multiple of the codebook dimension, although
 141 dimension may vary from pass to pass.</para>
 142
 143 <para>
 144 As an example, assume a partition vector of size eight, to be encoded
 145 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
 146
 147 <programlisting>
 148
 149             original residue vector: [ 0 1 2 3 4 5 6 7 ]
 150
 151 codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]
 152
 153 codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
 154
 155 codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
 156
 157 codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
 158
 159 </programlisting>
 160
 161 </section>
 162
 163 <section><title>residue 2</title>
 164
 165 <para>
 166 Residue type two can be thought of as a variant of residue type 1.
 167 Rather than encoding multiple passed-in vectors as in residue type 1,
 168 the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
 169 interleaved and flattened into a single vector of length
 170 <emphasis>ch</emphasis>*<emphasis>n</emphasis>.  Encoding then proceeds as in type 1. Decoding is
 171 as in type 1 with decode interleave reversed. If operating on a single
 172 vector to begin with, residue type 1 and type 2 are equivalent.</para>
 173
 174 <mediaobject>
 175 <imageobject>
 176  <imagedata fileref="residue2.png" format="PNG"/>
 177 </imageobject>
 178 <textobject>
 179  <phrase>[illustration of residue type 2]</phrase>
 180 </textobject>
 181 </mediaobject>
 182
 183 </section>
 184
 185 <section>
 186 <title>Residue decode</title>
 187
 188 <section><title>header decode</title>
 189
 190 <para>
 191 Header decode for all three residue types is identical.</para>
 192 <programlisting>
 193   1) [residue_begin] = read 24 bits as unsigned integer
 194   2) [residue_end] = read 24 bits as unsigned integer
 195   3) [residue_partition_size] = read 24 bits as unsigned integer and add one
 196   4) [residue_classifications] = read 6 bits as unsigned integer and add one
 197   5) [residue_classbook] = read 8 bits as unsigned integer
 198 </programlisting>
 199
 200 <para>
 201 <varname>[residue_begin]</varname> and <varname>[residue_end]</varname> select the specific
 202 sub-portion of each vector that is actually coded; it implements akin
 203 to a bandpass where, for coding purposes, the vector effectively
 204 begins at element <varname>[residue_begin]</varname> and ends at
 205 <varname>[residue_end]</varname>.  Preceding and following values in the unpacked
 206 vectors are zeroed.  Note that for residue type 2, these values as
 207 well as <varname>[residue_partition_size]</varname>apply to the interleaved
 208 vector, not the individual vectors before interleave.
 209 <varname>[residue_partition_size]</varname> is as explained above,
 210 <varname>[residue_classifications]</varname> is the number of possible
 211 classification to which a partition can belong and
 212 <varname>[residue_classbook]</varname> is the codebook number used to code
 213 classification codewords.  The number of dimensions in book
 214 <varname>[residue_classbook]</varname> determines how many classification values
 215 are grouped into a single classification codeword.</para>
 216
 217 <para>
 218 Next we read a bitmap pattern that specifies which partition classes
 219 code values in which passes.</para>
 220
 221 <programlisting>
 222   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
 223
 224        2) [high_bits] = 0
 225        3) [low_bits] = read 3 bits as unsigned integer
 226        4) [bitflag] = read one bit as boolean
 227        5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
 228        6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
 229      }
 230   7) done
 231 </programlisting>
 232
 233 <para>
 234 Finally, we read in a list of book numbers, each corresponding to
 235 specific bit set in the cascade bitmap.  We loop over the possible
 236 codebook classifications and the maximum possible number of encoding
 237 stages (8 in Vorbis I, as constrained by the elements of the cascade
 238 bitmap being eight bits):</para>
 239
 240 <programlisting>
 241   1) iterate [i] over the range 0 ... [residue_classifications]-1 {
 242
 243        2) iterate [j] over the range 0 ... 7 {
 244
 245             3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
 246
 247                  4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
 248
 249                } else {
 250
 251                  5) array [residue_books] element [i][j] = unused
 252
 253                }
 254           }
 255       }
 256
 257   6) done
 258 </programlisting>
 259
 260 <para>
 261 An end-of-packet condition at any point in header decode renders the
 262 stream undecodable.  In addition, any codebook number greater than the
 263 maximum numbered codebook set up in this stream also renders the
 264 stream undecodable.</para>
 265
 266 </section>
 267
 268 <section><title>packet decode</title>
 269
 270 <para>
 271 Format 0 and 1 packet decode is identical except for specific
 272 partition interleave.  Format 2 packet decode can be built out of the
 273 format 1 decode process.  Thus we describe first the decode
 274 infrastructure identical to all three formats.</para>
 275
 276 <para>
 277 In addition to configuration information, the residue decode process
 278 is passed the number of vectors in the submap bundle and a vector of
 279 flags indicating if any of the vectors are not to be decoded.  If the
 280 passed in number of vectors is 3 and vector number 1 is marked 'do not
 281 decode', decode skips vector 1 during the decode loop.  However, even
 282 'do not decode' vectors are allocated and zeroed.</para>
 283
 284 <para>
 285 The following convenience values are conceptually useful to clarifying
 286 the decode process:</para>
 287
 288 <programlisting>
 289   1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
 290   2) [n_to_read] = [residue_end] - [residue_begin]
 291   3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
 292 </programlisting>
 293
 294 <para>
 295 Packet decode proceeds as follows, matching the description offered earlier in the document.  We assume that the number of vectors being encoded, <varname>[ch]</varname> is provided by the higher level decoding process.</para>
 296 <programlisting>
 297   1) allocate and zero all vectors that will be returned.
 298   2) iterate [pass] over the range 0 ... 7 {
 299
 300        3) [partition_count] = 0
 301
 302        4) if ([pass] is zero) {
 303
 304             5) iterate [j] over the range 0 .. [ch]-1 {
 305
 306                  6) if vector [j] is not marked 'do not decode' {
 307
 308                       7) [temp] = read from packet using codebook [residue_classbook] in scalar context
 309                       8) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
 310
 311                            9) array [classifications] element [j],([i]+[partition_count]) =
 312                               [temp] integer modulo [residue_classifications]
 313                           10) [temp] = [temp] / [residue_classifications] using integer division
 314
 315                          }
 316
 317                     }
 318
 319                }
 320
 321           }
 322
 323       11) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
 324           is also less than [partitions_to_read] {
 325
 326             12) iterate [j] over the range 0 .. [ch]-1 {
 327
 328                  13) if vector [j] is not marked 'do not decode' {
 329
 330                       14) [vqclass] = array [classifications] element [j],[partition_count]
 331                       15) [vqbook] = array [residue_books] element [vqclass],[pass]
 332                       16) if ([vqbook] is not 'unused') {
 333
 334                            17) decode partition into output vector number [j], starting at scalar
 335                            offset [residue_begin]+[partition_count]*[residue_partition_size] using
 336                            codebook number [vqbook] in VQ context
 337                      }
 338                 }
 339
 340             18) increment [partition_count] by one
 341
 342           }
 343      }
 344
 345  19) done
 346
 347 </programlisting>
 348
 349 <para>
 350 An end-of-packet condition during packet decode is to be considered a
 351 nominal occurrence.  Decode returns the result of vector decode up to
 352 that point.</para>
 353
 354 </section>
 355
 356 <section><title>format 0 specifics</title>
 357
 358 <para>
 359 Format zero decodes partitions exactly as described earlier in the
 360 'Residue Format: residue 0' section.  The following pseudocode
 361 presents the same algorithm. Assume:</para>
 362
 363 <itemizedlist>
 364 <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
 365 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
 366 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
 367 </itemizedlist>
 368
 369 <programlisting>
 370  1) [step] = [n] / [codebook_dimensions]
 371  2) iterate [i] over the range 0 ... [step]-1 {
 372
 373       3) vector [entry_temp] = read vector from packet using current codebook in VQ context
 374       4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
 375
 376            5) vector [v] element ([offset]+[i]+[j]*[step]) =
 377                 vector [v] element ([offset]+[i]+[j]*[step]) +
 378                 vector [entry_temp] element [j]
 379
 380          }
 381
 382     }
 383
 384   6) done
 385
 386 </programlisting>
 387
 388 </section>
 389
 390 <section><title>format 1 specifics</title>
 391
 392 <para>
 393 Format 1 decodes partitions exactly as described earlier in the
 394 'Residue Format: residue 1' section.  The following pseudocode
 395 presents the same algorithm. Assume:</para>
 396
 397 <itemizedlist>
 398 <listitem><simpara> <varname>[n]</varname> is the value in
 399 <varname>[residue_partition_size]</varname></simpara></listitem>
 400 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
 401 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
 402 </itemizedlist>
 403
 404 <programlisting>
 405  1) [i] = 0
 406  2) vector [entry_temp] = read vector from packet using current codebook in VQ context
 407  3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
 408
 409       5) vector [v] element ([offset]+[i]) =
 410           vector [v] element ([offset]+[i]) +
 411           vector [entry_temp] element [j]
 412       6) increment [i]
 413
 414     }
 415
 416   4) if ( [i] is less than [n] ) continue at step 2
 417   5) done
 418 </programlisting>
 419
 420 </section>
 421
 422 <section><title>format 2 specifics</title>
 423
 424 <para>
 425 Format 2 is reducible to format 1.  It may be implemented as an additional stepprior to and an additional post-decode step after a normal format 1 decode.
 426 </para>
 427
 428 <para>
 429 Format 2 handles 'do not decode' vectors differently than residue 0 or
 430 1; if all vectors are marked 'do not decode', no decode occurrs.
 431 However, if at least one vector is to be decoded, all the vectors are
 432 decoded.  We then request normal format 1 to decode a single vector
 433 representing all output channels, rather than a vector for each
 434 channel.  After decode, deinterleave the vector into independent vectors, one for each output channel.  That is:</para>
 435
 436 <orderedlist>
 437  <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname>of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
 438  <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
 439  <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each outputchannel, according to:
 440   <programlisting>
 441   1) iterate [i] over the range 0 ... [n]-1 {
 442
 443        2) iterate [j] over the range 0 ... [ch]-1 {
 444
 445             3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
 446
 447           }
 448      }
 449
 450   4) done
 451   </programlisting>
 452  </para></listitem>
 453 </orderedlist>
 454
 455 </section>
 456
 457 </section>
 458
 459 </section>
 460