1 <?xml version="1.0" standalone="no"?>
2 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
7 <section id="vorbis-spec-residue">
13 <title>Residue setup and decode</title>
17 <title>Overview</title>
20 A residue vector represents the fine detail of the audio spectrum of
21 one channel in an audio frame after the encoder subtracts the floor
22 curve and performs any channel coupling. A residue vector may
23 represent spectral lines, spectral magnitude, spectral phase or
24 hybrids as mixed by channel coupling. The exact semantic content of
25 the vector does not matter to the residue abstraction.</para>
28 Whatever the exact qualities, the Vorbis residue abstraction codes the
29 residue vectors into the bitstream packet, and then reconstructs the
30 vectors during decode. Vorbis makes use of three different encoding
31 variants (numbered 0, 1 and 2) of the same basic vector encoding
37 <title>Residue format</title>
40 Residue format partitions each vector in the vector bundle into chunks,
41 classifies each chunk, encodes the chunk classifications and finally
42 encodes the chunks themselves using the the specific VQ arrangement
43 defined for each selected classification.
44 The exact interleaving and partitioning vary by residue encoding number,
45 however the high-level process used to classify and encode the residue
46 vector is the same in all three variants.</para>
49 A set of coded residue vectors are all of the same length. High level
50 coding structure, ignoring for the moment exactly how a partition is
51 encoded and simply trusting that it is, is as follows:</para>
55 <listitem><para>Each vector is partitioned into multiple equal sized chunks
56 according to configuration specified. If we have a vector size of
57 <emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
58 of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
59 coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>. It is
60 important to note that the integer division truncates. In the below
61 example, we assume an example <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
63 <listitem><para>Each partition in each vector has a classification number that
64 specifies which of multiple configured VQ codebook setups are used to
65 decode that partition. The classification numbers of each partition
66 can be thought of as forming a vector in their own right, as in the
67 illustration below. Just as the residue vectors are coded in grouped
68 partitions to increase encoding efficiency, the classification vector
69 is also partitioned into chunks. The integer elements of each scalar
70 in a classification chunk are built into a single scalar that
71 represents the classification numbers in that chunk. In the below
72 example, the classification codeword encodes two classification
73 numbers.</para></listitem>
75 <listitem><para>The values in a residue vector may be encoded monolithically in a
76 single pass through the residue vector, but more often efficient
77 codebook design dictates that each vector is encoded as the additive
78 sum of several passes through the residue vector using more than one
79 VQ codebook. Thus, each residue value potentially accumulates values
80 from multiple decode passes. The classification value associated with
81 a partition is the same in each pass, thus the classification codeword
82 is coded only in the first pass.</para></listitem>
89 <imagedata fileref="residue-pack.png" format="PNG"/>
92 <phrase>[illustration of residue vector format]</phrase>
98 <section><title>residue 0</title>
101 Residue 0 and 1 differ only in the way the values within a residue
102 partition are interleaved during partition encoding (visually treated
103 as a black box--or cyan box or brown box--in the above figure).</para>
106 Residue encoding 0 interleaves VQ encoding according to the
107 dimension of the codebook used to encode a partition in a specific
108 pass. The dimension of the codebook need not be the same in multiple
109 passes, however the partition size must be an even multiple of the
110 codebook dimension.</para>
113 As an example, assume a partition vector of size eight, to be encoded
114 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
118 original residue vector: [ 0 1 2 3 4 5 6 7 ]
120 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
122 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
124 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
126 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
131 It is worth mentioning at this point that no configurable value in the
132 residue coding setup is restricted to a power of two.</para>
136 <section><title>residue 1</title>
139 Residue 1 does not interleave VQ encoding. It represents partition
140 vector scalars in order. As with residue 0, however, partition length
141 must be an integer multiple of the codebook dimension, although
142 dimension may vary from pass to pass.</para>
145 As an example, assume a partition vector of size eight, to be encoded
146 by residue 0 using codebook sizes of 8, 4, 2 and 1:</para>
150 original residue vector: [ 0 1 2 3 4 5 6 7 ]
152 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
154 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
156 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
158 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
164 <section><title>residue 2</title>
167 Residue type two can be thought of as a variant of residue type 1.
168 Rather than encoding multiple passed-in vectors as in residue type 1,
169 the <emphasis>ch</emphasis> passed in vectors of length <emphasis>n</emphasis> are first
170 interleaved and flattened into a single vector of length
171 <emphasis>ch</emphasis>*<emphasis>n</emphasis>. Encoding then proceeds as in type 1. Decoding is
172 as in type 1 with decode interleave reversed. If operating on a single
173 vector to begin with, residue type 1 and type 2 are equivalent.</para>
177 <imagedata fileref="residue2.png" format="PNG"/>
180 <phrase>[illustration of residue type 2]</phrase>
187 <title>Residue decode</title>
189 <section><title>header decode</title>
192 Header decode for all three residue types is identical.</para>
194 1) [residue_begin] = read 24 bits as unsigned integer
195 2) [residue_end] = read 24 bits as unsigned integer
196 3) [residue_partition_size] = read 24 bits as unsigned integer and add one
197 4) [residue_classifications] = read 6 bits as unsigned integer and add one
198 5) [residue_classbook] = read 8 bits as unsigned integer
202 <varname>[residue_begin]</varname> and
203 <varname>[residue_end]</varname> select the specific sub-portion of
204 each vector that is actually coded; it implements akin to a bandpass
205 where, for coding purposes, the vector effectively begins at element
206 <varname>[residue_begin]</varname> and ends at
207 <varname>[residue_end]</varname>. Preceding and following values in
208 the unpacked vectors are zeroed. Note that for residue type 2, these
209 values as well as <varname>[residue_partition_size]</varname>apply to
210 the interleaved vector, not the individual vectors before interleave.
211 <varname>[residue_partition_size]</varname> is as explained above,
212 <varname>[residue_classifications]</varname> is the number of possible
213 classification to which a partition can belong and
214 <varname>[residue_classbook]</varname> is the codebook number used to
215 code classification codewords. The number of dimensions in book
216 <varname>[residue_classbook]</varname> determines how many
217 classification values are grouped into a single classification
218 codeword. Note that the number of entries and dimensions in book
219 <varname>[residue_classbook]</varname>, along with
220 <varname>[residue_classifications]</varname>, overdetermines to
221 possible number of classification codewords. If
222 <varname>[residue_classifications]</varname>^<varname>[residue_classbook]</varname>.dimensions
223 exceeds <varname>[residue_classbook]</varname>.entries, the
224 bitstream should be regarded to be undecodable. </para>
227 Next we read a bitmap pattern that specifies which partition classes
228 code values in which passes.</para>
231 1) iterate [i] over the range 0 ... [residue_classifications]-1 {
234 3) [low_bits] = read 3 bits as unsigned integer
235 4) [bitflag] = read one bit as boolean
236 5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
237 6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
243 Finally, we read in a list of book numbers, each corresponding to
244 specific bit set in the cascade bitmap. We loop over the possible
245 codebook classifications and the maximum possible number of encoding
246 stages (8 in Vorbis I, as constrained by the elements of the cascade
247 bitmap being eight bits):</para>
250 1) iterate [i] over the range 0 ... [residue_classifications]-1 {
252 2) iterate [j] over the range 0 ... 7 {
254 3) if ( vector [residue_cascade] element [i] bit [j] is set ) {
256 4) array [residue_books] element [i][j] = read 8 bits as unsigned integer
260 5) array [residue_books] element [i][j] = unused
270 An end-of-packet condition at any point in header decode renders the
271 stream undecodable. In addition, any codebook number greater than the
272 maximum numbered codebook set up in this stream also renders the
273 stream undecodable.</para>
277 <section><title>packet decode</title>
280 Format 0 and 1 packet decode is identical except for specific
281 partition interleave. Format 2 packet decode can be built out of the
282 format 1 decode process. Thus we describe first the decode
283 infrastructure identical to all three formats.</para>
286 In addition to configuration information, the residue decode process
287 is passed the number of vectors in the submap bundle and a vector of
288 flags indicating if any of the vectors are not to be decoded. If the
289 passed in number of vectors is 3 and vector number 1 is marked 'do not
290 decode', decode skips vector 1 during the decode loop. However, even
291 'do not decode' vectors are allocated and zeroed.</para>
294 Depending on the values of <varname>[residue_begin]</varname> and
295 <varname>[residue_end]</varname>, it is obvious that the encoded
296 portion of a residue vector may be the entire possible residue vector
297 or some other strict subset of the actual residue vector size with
298 zero padding at either uncoded end. However, it is also possible to
299 set <varname>[residue_begin]</varname> and
300 <varname>[residue_end]</varname> to specify a range partially or
301 wholly beyond the maximum vector size. Before beginning residue
302 decode, limit <varname>[residue_begin]</varname> and
303 <varname>[residue_end]</varname> to the maximum possible vector size
304 as follows. We assume that the number of vectors being encoded,
305 <varname>[ch]</varname> is provided by the higher level decoding
309 1) [actual_size] = current blocksize/2;
310 2) if residue encoding is format 2
311 3) [actual_size] = [actual_size] * [ch];
312 4) [limit_residue_begin] = maximum of ([residue_begin],[actual_size]);
313 5) [limit_residue_end] = maximum of ([residue_end],[actual_size]);
317 The following convenience values are conceptually useful to clarifying
318 the decode process:</para>
321 1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
322 2) [n_to_read] = [limit_residue_end] - [limit_residue_begin]
323 3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
327 Packet decode proceeds as follows, matching the description offered earlier in the document. </para>
329 1) allocate and zero all vectors that will be returned.
330 2) if ([n_to_read] is zero), stop; there is no residue to decode.
331 3) iterate [pass] over the range 0 ... 7 {
333 4) [partition_count] = 0
335 5) while [partition_count] is less than [partitions_to_read]
337 6) if ([pass] is zero) {
339 7) iterate [j] over the range 0 .. [ch]-1 {
341 8) if vector [j] is not marked 'do not decode' {
343 9) [temp] = read from packet using codebook [residue_classbook] in scalar context
344 10) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
346 11) array [classifications] element [j],([i]+[partition_count]) =
347 [temp] integer modulo [residue_classifications]
348 12) [temp] = [temp] / [residue_classifications] using integer division
358 13) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
359 is also less than [partitions_to_read] {
361 14) iterate [j] over the range 0 .. [ch]-1 {
363 15) if vector [j] is not marked 'do not decode' {
365 16) [vqclass] = array [classifications] element [j],[partition_count]
366 17) [vqbook] = array [residue_books] element [vqclass],[pass]
367 18) if ([vqbook] is not 'unused') {
369 19) decode partition into output vector number [j], starting at scalar
370 offset [limit_residue_begin]+[partition_count]*[residue_partition_size] using
371 codebook number [vqbook] in VQ context
375 20) increment [partition_count] by one
386 An end-of-packet condition during packet decode is to be considered a
387 nominal occurrence. Decode returns the result of vector decode up to
392 <section><title>format 0 specifics</title>
395 Format zero decodes partitions exactly as described earlier in the
396 'Residue Format: residue 0' section. The following pseudocode
397 presents the same algorithm. Assume:</para>
401 <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
402 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
403 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
408 1) [step] = [n] / [codebook_dimensions]
409 2) iterate [i] over the range 0 ... [step]-1 {
411 3) vector [entry_temp] = read vector from packet using current codebook in VQ context
412 4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
414 5) vector [v] element ([offset]+[i]+[j]*[step]) =
415 vector [v] element ([offset]+[i]+[j]*[step]) +
416 vector [entry_temp] element [j]
428 <section><title>format 1 specifics</title>
431 Format 1 decodes partitions exactly as described earlier in the
432 'Residue Format: residue 1' section. The following pseudocode
433 presents the same algorithm. Assume:</para>
437 <listitem><simpara> <varname>[n]</varname> is the value in
438 <varname>[residue_partition_size]</varname></simpara></listitem>
439 <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
440 <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
446 2) vector [entry_temp] = read vector from packet using current codebook in VQ context
447 3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
449 4) vector [v] element ([offset]+[i]) =
450 vector [v] element ([offset]+[i]) +
451 vector [entry_temp] element [j]
456 6) if ( [i] is less than [n] ) continue at step 2
462 <section><title>format 2 specifics</title>
465 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
469 Format 2 handles 'do not decode' vectors differently than residue 0 or
470 1; if all vectors are marked 'do not decode', no decode occurrs.
471 However, if at least one vector is to be decoded, all the vectors are
472 decoded. We then request normal format 1 to decode a single vector
473 representing all output channels, rather than a vector for each
474 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:</para>
477 <listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname>of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
478 <listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>. </simpara></listitem>
479 <listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each outputchannel, according to:
481 1) iterate [i] over the range 0 ... [n]-1 {
483 2) iterate [j] over the range 0 ... [ch]-1 {
485 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])