1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
2 %!TEX root = Vorbis_I_spec.tex
3 \section{Residue setup and decode} \label{vorbis:spec:residue}
7 A residue vector represents the fine detail of the audio spectrum of
8 one channel in an audio frame after the encoder subtracts the floor
9 curve and performs any channel coupling. A residue vector may
10 represent spectral lines, spectral magnitude, spectral phase or
11 hybrids as mixed by channel coupling. The exact semantic content of
12 the vector does not matter to the residue abstraction.
14 Whatever the exact qualities, the Vorbis residue abstraction codes the
15 residue vectors into the bitstream packet, and then reconstructs the
16 vectors during decode. Vorbis makes use of three different encoding
17 variants (numbered 0, 1 and 2) of the same basic vector encoding
22 \subsection{Residue format}
24 Residue format partitions each vector in the vector bundle into chunks,
25 classifies each chunk, encodes the chunk classifications and finally
26 encodes the chunks themselves using the the specific VQ arrangement
27 defined for each selected classification.
28 The exact interleaving and partitioning vary by residue encoding number,
29 however the high-level process used to classify and encode the residue
30 vector is the same in all three variants.
32 A set of coded residue vectors are all of the same length. High level
33 coding structure, ignoring for the moment exactly how a partition is
34 encoded and simply trusting that it is, is as follows:
37 \item Each vector is partitioned into multiple equal sized chunks
38 according to configuration specified. If we have a vector size of
39 \emph{n}, a partition size \emph{residue\_partition\_size}, and a total
40 of \emph{ch} residue vectors, the total number of partitioned chunks
41 coded is \emph{n}/\emph{residue\_partition\_size}*\emph{ch}. It is
42 important to note that the integer division truncates. In the below
43 example, we assume an example \emph{residue\_partition\_size} of 8.
45 \item Each partition in each vector has a classification number that
46 specifies which of multiple configured VQ codebook setups are used to
47 decode that partition. The classification numbers of each partition
48 can be thought of as forming a vector in their own right, as in the
49 illustration below. Just as the residue vectors are coded in grouped
50 partitions to increase encoding efficiency, the classification vector
51 is also partitioned into chunks. The integer elements of each scalar
52 in a classification chunk are built into a single scalar that
53 represents the classification numbers in that chunk. In the below
54 example, the classification codeword encodes two classification
57 \item The values in a residue vector may be encoded monolithically in a
58 single pass through the residue vector, but more often efficient
59 codebook design dictates that each vector is encoded as the additive
60 sum of several passes through the residue vector using more than one
61 VQ codebook. Thus, each residue value potentially accumulates values
62 from multiple decode passes. The classification value associated with
63 a partition is the same in each pass, thus the classification codeword
64 is coded only in the first pass.
70 \includegraphics[width=\textwidth]{residue-pack}
71 \captionof{figure}{illustration of residue vector format}
76 \subsection{residue 0}
78 Residue 0 and 1 differ only in the way the values within a residue
79 partition are interleaved during partition encoding (visually treated
80 as a black box--or cyan box or brown box--in the above figure).
82 Residue encoding 0 interleaves VQ encoding according to the
83 dimension of the codebook used to encode a partition in a specific
84 pass. The dimension of the codebook need not be the same in multiple
85 passes, however the partition size must be an even multiple of the
88 As an example, assume a partition vector of size eight, to be encoded
89 by residue 0 using codebook sizes of 8, 4, 2 and 1:
91 \begin{programlisting}
93 original residue vector: [ 0 1 2 3 4 5 6 7 ]
95 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
97 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
99 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
101 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
105 It is worth mentioning at this point that no configurable value in the
106 residue coding setup is restricted to a power of two.
110 \subsection{residue 1}
112 Residue 1 does not interleave VQ encoding. It represents partition
113 vector scalars in order. As with residue 0, however, partition length
114 must be an integer multiple of the codebook dimension, although
115 dimension may vary from pass to pass.
117 As an example, assume a partition vector of size eight, to be encoded
118 by residue 0 using codebook sizes of 8, 4, 2 and 1:
120 \begin{programlisting}
122 original residue vector: [ 0 1 2 3 4 5 6 7 ]
124 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
126 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
128 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
130 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
136 \subsection{residue 2}
138 Residue type two can be thought of as a variant of residue type 1.
139 Rather than encoding multiple passed-in vectors as in residue type 1,
140 the \emph{ch} passed in vectors of length \emph{n} are first
141 interleaved and flattened into a single vector of length
142 \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is
143 as in type 1 with decode interleave reversed. If operating on a single
144 vector to begin with, residue type 1 and type 2 are equivalent.
147 \includegraphics[width=\textwidth]{residue2}
148 \captionof{figure}{illustration of residue type 2}
152 \subsection{Residue decode}
154 \subsubsection{header decode}
156 Header decode for all three residue types is identical.
157 \begin{programlisting}
158 1) [residue\_begin] = read 24 bits as unsigned integer
159 2) [residue\_end] = read 24 bits as unsigned integer
160 3) [residue\_partition\_size] = read 24 bits as unsigned integer and add one
161 4) [residue\_classifications] = read 6 bits as unsigned integer and add one
162 5) [residue\_classbook] = read 8 bits as unsigned integer
165 \varname{[residue\_begin]} and
166 \varname{[residue\_end]} select the specific sub-portion of
167 each vector that is actually coded; it implements akin to a bandpass
168 where, for coding purposes, the vector effectively begins at element
169 \varname{[residue\_begin]} and ends at
170 \varname{[residue\_end]}. Preceding and following values in
171 the unpacked vectors are zeroed. Note that for residue type 2, these
172 values as well as \varname{[residue\_partition\_size]}apply to
173 the interleaved vector, not the individual vectors before interleave.
174 \varname{[residue\_partition\_size]} is as explained above,
175 \varname{[residue\_classifications]} is the number of possible
176 classification to which a partition can belong and
177 \varname{[residue\_classbook]} is the codebook number used to
178 code classification codewords. The number of dimensions in book
179 \varname{[residue\_classbook]} determines how many
180 classification values are grouped into a single classification
181 codeword. Note that the number of entries and dimensions in book
182 \varname{[residue\_classbook]}, along with
183 \varname{[residue\_classifications]}, overdetermines to
184 possible number of classification codewords.
185 If \varname{[residue\_classifications]}\^{}\varname{[residue\_classbook]}.dimensions
186 exceeds \varname{[residue\_classbook]}.entries, the
187 bitstream should be regarded to be undecodable.
189 Next we read a bitmap pattern that specifies which partition classes
190 code values in which passes.
192 \begin{programlisting}
193 1) iterate [i] over the range 0 ... [residue\_classifications]-1 {
196 3) [low\_bits] = read 3 bits as unsigned integer
197 4) [bitflag] = read one bit as boolean
198 5) if ( [bitflag] is set ) then [high\_bits] = read five bits as unsigned integer
199 6) vector [residue\_cascade] element [i] = [high\_bits] * 8 + [low\_bits]
204 Finally, we read in a list of book numbers, each corresponding to
205 specific bit set in the cascade bitmap. We loop over the possible
206 codebook classifications and the maximum possible number of encoding
207 stages (8 in Vorbis I, as constrained by the elements of the cascade
208 bitmap being eight bits):
210 \begin{programlisting}
211 1) iterate [i] over the range 0 ... [residue\_classifications]-1 {
213 2) iterate [j] over the range 0 ... 7 {
215 3) if ( vector [residue\_cascade] element [i] bit [j] is set ) {
217 4) array [residue\_books] element [i][j] = read 8 bits as unsigned integer
221 5) array [residue\_books] element [i][j] = unused
230 An end-of-packet condition at any point in header decode renders the
231 stream undecodable. In addition, any codebook number greater than the
232 maximum numbered codebook set up in this stream also renders the
233 stream undecodable. All codebooks in array [residue\_books] are
234 required to have a value mapping. The presence of codebook in array
235 [residue\_books] without a value mapping (maptype equals zero) renders
236 the stream undecodable.
240 \subsubsection{packet decode}
242 Format 0 and 1 packet decode is identical except for specific
243 partition interleave. Format 2 packet decode can be built out of the
244 format 1 decode process. Thus we describe first the decode
245 infrastructure identical to all three formats.
247 In addition to configuration information, the residue decode process
248 is passed the number of vectors in the submap bundle and a vector of
249 flags indicating if any of the vectors are not to be decoded. If the
250 passed in number of vectors is 3 and vector number 1 is marked 'do not
251 decode', decode skips vector 1 during the decode loop. However, even
252 'do not decode' vectors are allocated and zeroed.
254 Depending on the values of \varname{[residue\_begin]} and
255 \varname{[residue\_end]}, it is obvious that the encoded
256 portion of a residue vector may be the entire possible residue vector
257 or some other strict subset of the actual residue vector size with
258 zero padding at either uncoded end. However, it is also possible to
259 set \varname{[residue\_begin]} and
260 \varname{[residue\_end]} to specify a range partially or
261 wholly beyond the maximum vector size. Before beginning residue
262 decode, limit \varname{[residue\_begin]} and
263 \varname{[residue\_end]} to the maximum possible vector size
264 as follows. We assume that the number of vectors being encoded,
265 \varname{[ch]} is provided by the higher level decoding
268 \begin{programlisting}
269 1) [actual\_size] = current blocksize/2;
270 2) if residue encoding is format 2
271 3) [actual\_size] = [actual\_size] * [ch];
272 4) [limit\_residue\_begin] = minimum of ([residue\_begin],[actual\_size]);
273 5) [limit\_residue\_end] = minimum of ([residue\_end],[actual\_size]);
276 The following convenience values are conceptually useful to clarifying
279 \begin{programlisting}
280 1) [classwords\_per\_codeword] = [codebook\_dimensions] value of codebook [residue\_classbook]
281 2) [n\_to\_read] = [limit\_residue\_end] - [limit\_residue\_begin]
282 3) [partitions\_to\_read] = [n\_to\_read] / [residue\_partition\_size]
285 Packet decode proceeds as follows, matching the description offered earlier in the document.
286 \begin{programlisting}
287 1) allocate and zero all vectors that will be returned.
288 2) if ([n\_to\_read] is zero), stop; there is no residue to decode.
289 3) iterate [pass] over the range 0 ... 7 {
291 4) [partition\_count] = 0
293 5) while [partition\_count] is less than [partitions\_to\_read]
295 6) if ([pass] is zero) {
297 7) iterate [j] over the range 0 .. [ch]-1 {
299 8) if vector [j] is not marked 'do not decode' {
301 9) [temp] = read from packet using codebook [residue\_classbook] in scalar context
302 10) iterate [i] descending over the range [classwords\_per\_codeword]-1 ... 0 {
304 11) array [classifications] element [j],([i]+[partition\_count]) =
305 [temp] integer modulo [residue\_classifications]
306 12) [temp] = [temp] / [residue\_classifications] using integer division
316 13) iterate [i] over the range 0 .. ([classwords\_per\_codeword] - 1) while [partition\_count]
317 is also less than [partitions\_to\_read] {
319 14) iterate [j] over the range 0 .. [ch]-1 {
321 15) if vector [j] is not marked 'do not decode' {
323 16) [vqclass] = array [classifications] element [j],[partition\_count]
324 17) [vqbook] = array [residue\_books] element [vqclass],[pass]
325 18) if ([vqbook] is not 'unused') {
327 19) decode partition into output vector number [j], starting at scalar
328 offset [limit\_residue\_begin]+[partition\_count]*[residue\_partition\_size] using
329 codebook number [vqbook] in VQ context
333 20) increment [partition\_count] by one
343 An end-of-packet condition during packet decode is to be considered a
344 nominal occurrence. Decode returns the result of vector decode up to
349 \subsubsection{format 0 specifics}
351 Format zero decodes partitions exactly as described earlier in the
352 'Residue Format: residue 0' section. The following pseudocode
353 presents the same algorithm. Assume:
356 \item \varname{[n]} is the value in \varname{[residue\_partition\_size]}
357 \item \varname{[v]} is the residue vector
358 \item \varname{[offset]} is the beginning read offset in [v]
362 \begin{programlisting}
363 1) [step] = [n] / [codebook\_dimensions]
364 2) iterate [i] over the range 0 ... [step]-1 {
366 3) vector [entry\_temp] = read vector from packet using current codebook in VQ context
367 4) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {
369 5) vector [v] element ([offset]+[i]+[j]*[step]) =
370 vector [v] element ([offset]+[i]+[j]*[step]) +
371 vector [entry\_temp] element [j]
383 \subsubsection{format 1 specifics}
385 Format 1 decodes partitions exactly as described earlier in the
386 'Residue Format: residue 1' section. The following pseudocode
387 presents the same algorithm. Assume:
390 \item \varname{[n]} is the value in
391 \varname{[residue\_partition\_size]}
392 \item \varname{[v]} is the residue vector
393 \item \varname{[offset]} is the beginning read offset in [v]
397 \begin{programlisting}
399 2) vector [entry\_temp] = read vector from packet using current codebook in VQ context
400 3) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {
402 4) vector [v] element ([offset]+[i]) =
403 vector [v] element ([offset]+[i]) +
404 vector [entry\_temp] element [j]
409 6) if ( [i] is less than [n] ) continue at step 2
415 \subsubsection{format 2 specifics}
417 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
420 Format 2 handles 'do not decode' vectors differently than residue 0 or
421 1; if all vectors are marked 'do not decode', no decode occurrs.
422 However, if at least one vector is to be decoded, all the vectors are
423 decoded. We then request normal format 1 to decode a single vector
424 representing all output channels, rather than a vector for each
425 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:
428 \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step.
429 \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}.
430 \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to:
431 \begin{programlisting}
432 1) iterate [i] over the range 0 ... [n]-1 {
434 2) iterate [j] over the range 0 ... [ch]-1 {
436 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])