doc/vorbis-spec-ref.html

   1 <HTML><HEAD><TITLE>xiph.org: Ogg Vorbis documentation</TITLE>
   2 <BODY bgcolor="#ffffff" text="#202020" link="#006666" vlink="#000000">
   3 <nobr><img src="white-ogg.png"><img src="vorbisword2.png"></nobr><p>
   4
   5 <h1><font color=#000070>
   6 Ogg Vorbis I format specification: codec setup and packet decode
   7 </font></h1>
   8
   9 <em>Last update to this document: October 15, 2002</em><br>
  10
  11 <h1>Overview</h1>
  12
  13 This document serves as the top-level reference document for the
  14 bit-by-bit decode specification of Vorbis I.  This document assumes a
  15 high-level understanding of the Vorbis decode process, which is
  16 provided in the document <a href="vorbis-spec-intro.html">Ogg Vorbis I
  17 format specification: introduction and description</a>.  <a
  18 href="vorbis-spec-bitpack.html">Ogg Vorbis I format specification:
  19 bitpacking convention</a> covers reading and writing bit fields from
  20 and to bitstream packets.<p>
  21
  22 <h1>Header decode and decode setup</h1>
  23
  24 A Vorbis bitstream begins with three header packets. The header
  25 packets are, in order, the identification header, the comment header,
  26 and the setup header. All are required for decode compliance.  An
  27 end-of-packet condition while decoding the first or third header
  28 packet renders the stream undecodable.  An end-of-packet condition
  29 while decoding the comment header is a non-fatal error condition.<p>
  30
  31 <h2>Common header decode</h2>
  32
  33 Each header packet begins with the same header fields:
  34
  35 <pre>
  36   1) [packet_type] : 8 bit value
  37   2) 0x76, 0x6f, 0x72, 0x62, 0x69, 0x73: the characters 'v','o','r','b','i','s' as six octets
  38 </pre>
  39
  40 Decode continues according to packet type; the identification header
  41 is type 1, the comment header type 3 and the setup header type 5
  42 (these types are all odd as a packet with a leading single bit of '0'
  43 is an audio packet).  The packets must occur in the order of
  44 identification, comment, setup.
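
As an illustration only, the common header check might be sketched in Python as follows.  The function name <tt>parse_common_header</tt> and the use of <tt>ValueError</tt> for undecodable input are assumptions of this sketch, not part of the specification.<p>

```python
VORBIS_MAGIC = b"vorbis"  # 0x76 0x6f 0x72 0x62 0x69 0x73

# Header packet types named in the text above.
HEADER_TYPES = {1: "identification", 3: "comment", 5: "setup"}


def parse_common_header(packet: bytes) -> str:
    """Return the header kind of a header packet, or raise ValueError.

    Hypothetical helper: the name and error handling are illustrative.
    """
    if len(packet) < 7:
        raise ValueError("end-of-packet inside common header")
    packet_type = packet[0]            # [packet_type], 8 bit value
    if packet[1:7] != VORBIS_MAGIC:    # the six 'vorbis' octets
        raise ValueError("missing 'vorbis' signature")
    if packet_type not in HEADER_TYPES:
        raise ValueError(f"unexpected header packet type {packet_type}")
    return HEADER_TYPES[packet_type]
```

A caller would additionally check that the three kinds arrive in the order identification, comment, setup.<p>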
  45
  46 <h2>Identification Header</h2>
  47
  48 The identification header is a short header of only a few fields used
  49 to declare the stream definitively as Vorbis, and provide a few externally
  50 relevant pieces of information about the audio stream. The
  51 identification header is coded as follows:<p>
  52
  53 <pre>
  54  1) [vorbis_version] = read 32 bits as unsigned integer
  55  2) [audio_channels] = read 8 bits as unsigned integer
  56  3) [audio_sample_rate] = read 32 bits as unsigned integer
  57  4) [bitrate_maximum] = read 32 bits as signed integer
  58  5) [bitrate_nominal] = read 32 bits as signed integer
  59  6) [bitrate_minimum] = read 32 bits as signed integer
  60  7) [blocksize_0] = 2 exponent (read 4 bits as unsigned integer)
  61  8) [blocksize_1] = 2 exponent (read 4 bits as unsigned integer)
  62  9) [framing_flag] = read one bit
  63 </pre>
  64
  65 <tt>[vorbis_version]</tt> is to read '0' in order to be compatible
  66 with this document.  Both <tt>[audio_channels]</tt> and
  67 <tt>[audio_sample_rate]</tt> must read greater than zero.  Allowed final
  68 blocksize values are 64, 128, 256, 512, 1024, 2048, 4096 and 8192 in
  69 Vorbis I.  <tt>[blocksize_0]</tt> must be less than or equal to
  70 <tt>[blocksize_1]</tt>.  The framing bit must be nonzero.  Failure to
  71 meet any of these conditions renders a stream undecodable.<p>
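
The field list and validity rules above can be sketched in Python as shown below.  This is a minimal sketch under stated assumptions: because every field here is byte-aligned, it assumes that the LSb-first packing described in the bitpacking document is equivalent to little-endian <tt>struct</tt> unpacking, with the first 4-bit field in the low nibble.  The function name and the returned dictionary are hypothetical.<p>

```python
import struct

ALLOWED_BLOCKSIZES = {64, 128, 256, 512, 1024, 2048, 4096, 8192}


def parse_identification_header(packet: bytes) -> dict:
    """Decode and validate the fields following the 7-octet common header."""
    if len(packet) < 7 + 23:
        raise ValueError("end-of-packet in identification header")
    (version, channels, rate, br_max, br_nominal, br_min,
     blocksize_bits, framing) = struct.unpack_from("<IBIiiiBB", packet, 7)
    blocksize_0 = 1 << (blocksize_bits & 0x0F)  # first 4 bits read
    blocksize_1 = 1 << (blocksize_bits >> 4)    # next 4 bits read
    if version != 0:
        raise ValueError("unsupported vorbis_version")
    if channels == 0 or rate == 0:
        raise ValueError("audio_channels and audio_sample_rate must be nonzero")
    if (blocksize_0 not in ALLOWED_BLOCKSIZES
            or blocksize_1 not in ALLOWED_BLOCKSIZES):
        raise ValueError("illegal blocksize")
    if blocksize_0 > blocksize_1:
        raise ValueError("blocksize_0 exceeds blocksize_1")
    if (framing & 1) == 0:
        raise ValueError("framing bit not set")
    return {
        "audio_channels": channels,
        "audio_sample_rate": rate,
        "bitrate_maximum": br_max,
        "bitrate_nominal": br_nominal,
        "bitrate_minimum": br_min,
        "blocksize_0": blocksize_0,
        "blocksize_1": blocksize_1,
    }
```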
  72
  73 The bitrate fields above are used only as hints. The nominal bitrate
  74 field especially may be considerably off in purely VBR streams.  The
  75 fields are meaningful only when greater than zero.<p>
  76 <ul><li>All three fields set to the same value implies a fixed rate, or tightly bounded, nearly fixed-rate bitstream
  77     <li>Only nominal set implies a VBR or ABR stream that averages the nominal bitrate
  78     <li>Maximum and/or minimum set implies a VBR bitstream that obeys the bitrate limits
  79     <li>None set indicates the encoder does not care to speculate.
  80 </ul>
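
One possible reading of the hint rules above, sketched in Python.  The function name, the returned strings and the order in which the cases are tested are assumptions of this sketch; the text does not say how to treat three set but unequal fields, and the sketch files that case under the bitrate limits rule.<p>

```python
def describe_bitrate_hints(maximum: int, nominal: int, minimum: int) -> str:
    """Classify the three bitrate hint fields; values <= 0 count as unset."""
    has_max, has_nom, has_min = maximum > 0, nominal > 0, minimum > 0
    if has_max and has_nom and has_min and maximum == nominal == minimum:
        return "fixed or nearly fixed rate"
    if has_max or has_min:
        return "VBR within the declared limits"
    if has_nom:
        return "VBR or ABR averaging the nominal bitrate"
    return "no hint"
```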
  81
  82
  83 <h2>Comment Header</h2>
  84
  85 Comment header decode and data specification is covered in <a
  86 href="v-comment.html">Ogg Vorbis I format specification: comment field
  87 and header specification</a>.
  88
  89 <h2>Setup Header</h2>
  90
  91 Vorbis codec setup is configurable to an extreme degree:<p>
  92
  93 <img src="components.png"><p>
  94
  95 The setup header contains the bulk of the codec setup information
  96 needed for decode.  The setup header contains, in order, the lists of
  97 codebook configurations, time-domain transform configurations
  98 (placeholders in Vorbis I), floor configurations, residue
  99 configurations, channel mapping configurations and mode
 100 configurations. It finishes with a framing bit of '1'.  Header decode
 101 proceeds in the following order:<p>
 102
 103 <h3>codebooks</h3>
 104
 105 <ol>
 106 <li><tt>[vorbis_codebook_count]</tt> = read eight bits as unsigned integer and add one
 107 <li>Decode <tt>[vorbis_codebook_count]</tt> codebooks in order as defined
 108 in <a href="vorbis-spec-codebook.html">the codebook specification
 109 document</a>.  Save each configuration, in order, in an array of
 110 codebook configurations <tt>[vorbis_codebook_configurations]</tt>.
 111 </ol>
 112
 113 <h3>time domain transforms</h3>
 114
 115 These hooks are placeholders in Vorbis I.  Nevertheless, the
 116 configuration placeholder values must be read to maintain bitstream
 117 sync.<p>
 118
 119 <ol>
 120 <li><tt>[vorbis_time_count]</tt> = read 6 bits as unsigned integer and add one
 121 <li>read <tt>[vorbis_time_count]</tt> 16 bit values; each value should be zero.  If any value is nonzero, this is an error condition and the stream is undecodable.
 122 </ol>
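
A minimal Python sketch of this step follows.  The <tt>BitReader</tt> class is a hypothetical stand-in for a real bitpacker; it assumes the LSb-first bit order described in the bitpacking document, and it signals end-of-packet with <tt>EOFError</tt>.<p>

```python
class BitReader:
    """Minimal LSb-first bit reader (illustrative, not optimized)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # absolute bit position within the packet

    def read(self, bits: int) -> int:
        if self.pos + bits > len(self.data) * 8:
            raise EOFError("end-of-packet")
        value = 0
        for i in range(bits):
            byte = self.data[(self.pos + i) >> 3]
            bit = (byte >> ((self.pos + i) & 7)) & 1
            value |= bit << i  # the first bit read is the least significant
        self.pos += bits
        return value


def read_time_domain_placeholders(reader: BitReader) -> int:
    """Read and verify the placeholder values; return [vorbis_time_count]."""
    count = reader.read(6) + 1
    for _ in range(count):
        if reader.read(16) != 0:
            raise ValueError("nonzero time-domain placeholder")
    return count
```

The same reader shape would serve the floor, residue, mapping and mode counts, which are all coded as a 6 bit value plus one.<p>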
 123
 124 <h3>floors</h3>
 125
 126 Vorbis uses two floor types; header decode is handed to the decode
 127 abstraction of the appropriate type.
 128
 129 <ol>
 130 <li><tt>[vorbis_floor_count]</tt> = read 6 bits as unsigned integer and add one
 131 <li>For each <tt>[i]</tt> of <tt>[vorbis_floor_count]</tt> floor numbers:
 132   <ol>
 133   <li>read the floor type: vector <tt>[vorbis_floor_types]</tt> element <tt>[i]</tt> = read 16 bits as unsigned integer
 134   <li>If the floor type is zero, decode the floor configuration as defined in <a href="vorbis-spec-floor0.html">the floor type 0 specification document</a>; save this configuration in slot <tt>[i]</tt> of the floor configuration array <tt>[vorbis_floor_configurations]</tt>.
 135   <li>If the floor type is one, decode the floor configuration as defined in <a href="vorbis-spec-floor1.html">the floor type 1 specification document</a>; save this configuration in slot <tt>[i]</tt> of the floor configuration array <tt>[vorbis_floor_configurations]</tt>.
 136   <li>If the floor type is greater than one, this stream is undecodable; ERROR CONDITION
 137   </ol>
 138 </ol>
 139
 140 <h3>residues</h3>
 141
 142 Vorbis uses three residue types; header decode of each type is identical.
 143
 144 <ol>
 145 <li><tt>[vorbis_residue_count]</tt> = read 6 bits as unsigned integer and add one
 146 <li>For each <tt>[i]</tt> of <tt>[vorbis_residue_count]</tt> residue numbers:
 147   <ol>
 148   <li>read the residue type; vector <tt>[vorbis_residue_types]</tt> element <tt>[i]</tt> = read 16 bits as unsigned integer
 149   <li>If the residue type is zero, one or two, decode the residue configuration as defined in <a href="vorbis-spec-res.html">the residue specification document</a>; save this configuration in slot <tt>[i]</tt> of the residue configuration array <tt>[vorbis_residue_configurations]</tt>.
 150   <li>If the residue type is greater than two, this stream is undecodable; ERROR CONDITION
 151   </ol>
 152 </ol>
 153
 154 <h3>mappings</h3>
 155
 156 Mappings are used to set up specific pipelines for encoding
 157 multichannel audio with varying channel mapping applications. Vorbis I
 158 uses a single mapping type (0), with implicit PCM channel mappings.<p>
 159
 160 <ol>
 161 <li><tt>[vorbis_mapping_count]</tt> = read 6 bits as unsigned integer and add one<p>
 162 <li>For each <tt>[i]</tt> of <tt>[vorbis_mapping_count]</tt> mapping numbers:<p>
 163   <ol>
 164   <li>read the mapping type: 16 bits as unsigned integer.  There's no reason to save the mapping type in Vorbis I.<p>
 165   <li>If the mapping type is nonzero, the stream is undecodable<p>
 166   <li>If the mapping type is zero:<p>
 167      <ol> <li>read 1 bit as a boolean flag<p>
 168              <ol><li>if set, <tt>[vorbis_mapping_submaps]</tt> = read 4 bits as unsigned integer and add one<p>
 169                  <li>if unset, <tt>[vorbis_mapping_submaps]</tt> = 1<p>
 170              </ol>
 171           <li>read 1 bit as a boolean flag<p>
 172              <ol><li>if set, square polar channel mapping is in use:<p>
 173                   <ol><li><tt>[vorbis_mapping_coupling_steps]</tt> = read 8 bits as unsigned integer and add one<p>
 174                       <li>for each <tt>[j]</tt> of <tt>[vorbis_mapping_coupling_steps]</tt> steps:<p>
 175                           <ol>
 176                           <li>vector <tt>[vorbis_mapping_magnitude]</tt> element <tt>[j]</tt>= read <a href="helper.html#ilog">ilog</a>([audio_channels] - 1) bits as unsigned integer<p>
 177                           <li>vector <tt>[vorbis_mapping_angle]</tt> element <tt>[j]</tt>= read <a href="helper.html#ilog">ilog</a>([audio_channels] - 1) bits as unsigned integer<p>
 178                           <li>the numbers read in the above two steps are channel numbers representing the channel to treat as magnitude and the channel to treat as angle, respectively.  If for any coupling step the angle channel number equals the magnitude channel number, the magnitude channel number is greater than <tt>[audio_channels]</tt>-1, or the angle channel is greater than <tt>[audio_channels]</tt>-1, the stream is undecodable.<p>
 179                           </ol>
 180                    </ol>
 181                <li>if unset, <tt>[vorbis_mapping_coupling_steps]</tt> = 0
 182                </ol>
 183            <li>read 2 bits (reserved field); if the value is nonzero, the stream is undecodable<p>
 184            <li>if <tt>[vorbis_mapping_submaps]</tt> is greater than one, we read channel multiplex settings.  For each <tt>[j]</tt> of <tt>[audio_channels]</tt> channels:<p>
 185                <ol><li>vector <tt>[vorbis_mapping_mux]</tt> element <tt>[j]</tt> = read 4 bits as unsigned integer<p>
 186                    <li>if the value is greater than the highest numbered submap (<tt>[vorbis_mapping_submaps]</tt> - 1), this is an error condition rendering the stream undecodable<p>
 187                </ol>
 188            <li>for each submap <tt>[j]</tt> of <tt>[vorbis_mapping_submaps]</tt> submaps, read the floor and residue numbers for use in decoding that submap:
 189               <ol><li>read and discard 8 bits (the unused time configuration placeholder)<p>
 190                   <li>read 8 bits as unsigned integer for the floor number; save in vector <tt>[vorbis_mapping_submap_floor]</tt> element <tt>[j]</tt><p>
 191                   <li>verify the floor number is not greater than the highest number floor configured for the bitstream.  If it is, the bitstream is undecodable<p>
 192                   <li>read 8 bits as unsigned integer for the residue number; save in vector <tt>[vorbis_mapping_submap_residue]</tt> element <tt>[j]</tt><p>
 193                   <li>verify the residue number is not greater than the highest number residue configured for the bitstream.  If it is, the bitstream is undecodable<p>
 194               </ol>
 195
 196
 197            <li>save this mapping configuration in slot <tt>[i]</tt> of the mapping configuration array <tt>[vorbis_mapping_configurations]</tt>.
 198
 199      </ol>
 200   </ol>
 201 </ol>
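
The coupling channel numbers above are read with a width given by <tt>ilog</tt>.  The Python sketch below assumes the usual definition from the helper document (position of the highest set bit, zero for zero or negative input); the function names <tt>coupling_field_bits</tt> and <tt>check_coupling_step</tt> are hypothetical.<p>

```python
def ilog(x: int) -> int:
    """Position of the highest set bit; 0 for zero or negative input."""
    return x.bit_length() if x > 0 else 0


def coupling_field_bits(audio_channels: int) -> int:
    """Width in bits of each magnitude/angle channel number field."""
    return ilog(audio_channels - 1)


def check_coupling_step(magnitude: int, angle: int, audio_channels: int) -> None:
    """Raise ValueError if a coupling step makes the stream undecodable."""
    if magnitude == angle:
        raise ValueError("magnitude and angle channels are identical")
    if magnitude > audio_channels - 1 or angle > audio_channels - 1:
        raise ValueError("coupling channel number out of range")
```

For a stereo stream the field is one bit wide; for six channels it is three bits wide, which is why the range check is still needed.<p>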
 202
 203 <h3>modes</h3>
 204
 205 <ol>
 206 <li><tt>[vorbis_mode_count]</tt> = read 6 bits as unsigned integer and add one<p>
 207 <li>For each <tt>[i]</tt> of <tt>[vorbis_mode_count]</tt> mode numbers:<p>
 208   <ol>
 209   <li><tt>[vorbis_mode_blockflag]</tt> = read 1 bit<p>
 210   <li><tt>[vorbis_mode_windowtype]</tt> = read 16 bits as unsigned integer<p>
 211   <li><tt>[vorbis_mode_transformtype]</tt> = read 16 bits as unsigned integer<p>
 212   <li><tt>[vorbis_mode_mapping]</tt> = read 8 bits as unsigned integer<p>
 213   <li>verify ranges; zero is the only legal value in Vorbis I for <tt>[vorbis_mode_windowtype]</tt> and <tt>[vorbis_mode_transformtype]</tt>.  <tt>[vorbis_mode_mapping]</tt> must not be greater than the highest number mapping in use.  Any illegal values render the stream undecodable.<p>
 214   <li>save this mode configuration in slot <tt>[i]</tt> of the mode configuration array <tt>[vorbis_mode_configurations]</tt>.<p>
 215
 216   </ol>
 217   <li>read 1 bit as a framing flag.  If unset, a framing error occurred and the stream is not decodable.
 218
 219 </ol><p>
 220
 221 After reading mode descriptions, setup header decode is complete.<p>
 222
 223 <h1>Audio packet decode and synthesis</h1>
 224
 225 Following the three header packets, all packets in a Vorbis I stream
 226 are audio.  The first step of audio packet decode is to read and
 227 verify the packet type; <em>a non-audio packet when audio is expected
 228 indicates stream corruption or a non-compliant stream. The decoder
 229 must ignore the packet and not attempt decoding it to audio</em>.
 230
 231 <h2>packet type, mode and window decode</h2>
 232
 233 <ol>
 234 <li>read 1 bit <tt>[packet_type]</tt>; check that packet type is 0 (audio)<p>
 235 <li>read <a href="helper.html#ilog">ilog</a>([vorbis_mode_count]-1) bits <tt>[mode_number]</tt><p>
 236 <li>decode blocksize <tt>[n]</tt> is equal to <tt>[blocksize_0]</tt> if  <tt>[vorbis_mode_blockflag]</tt> is 0, else <tt>[n]</tt> is equal to <tt>[blocksize_1]</tt><p>
 237 <li>perform window selection and setup; this window is used later by the inverse MDCT:<p>
 238    <ol><li>if this is a long window (the <tt>[vorbis_mode_blockflag]</tt> flag of this mode is set):<p>
 239        <ol>
 240          <li>read 1 bit for <tt>[previous_window_flag]</tt><p>
 241          <li>read 1 bit for <tt>[next_window_flag]</tt><p>
 242
 243          <li>if <tt>[previous_window_flag]</tt> is not set, the left half
 244          of the window will be a hybrid window for lapping with a
 245          short block.  See <a href="vorbis-spec-intro.html#window">the
 246          'Window' subheading of the specification introduction
 247          document</a> for an illustration of overlapping dissimilar
 248          windows. Else, the left half window will have normal long
 249          shape.<p>
 250
 251          <li>if <tt>[next_window_flag]</tt> is not set, the right half of
 252          the window will be a hybrid window for lapping with a short
 253          block.  See <a href="vorbis-spec-intro.html#window">the
 254          'Window' subheading of the specification introduction
 255          document</a> for an illustration of overlapping dissimilar
 256          windows. Else, the right half window will have normal long
 257          shape.<p>
 258        </ol>
 259        <li> if this is a short window, the window is always the same
 260        short-window shape.<p>
 261
 262   </ol>
 263 </ol>
 264
 265 Vorbis windows all use the slope function y=sin( .5 * PI * sin^2( (x+.5) /
 266 n * PI) ) where n is window size and x ranges 0...n-1, but dissimilar
 267 lapping requirements can affect overall shape.  Window generation
 268 proceeds as follows:<p>
 269
 270 <ol>
 271 <li> <tt>[window_center]</tt> = <tt>[n]</tt> / 2
 273 <li> if (<tt>[vorbis_mode_blockflag]</tt> is set and <tt>[previous_window_flag]</tt> is not set) then
 274     <ol><li><tt>[left_window_start]</tt> = <tt>[n]</tt>/4 - <tt>[blocksize_0]</tt>/4
 275         <li><tt>[left_window_end]</tt> = <tt>[n]</tt>/4 + <tt>[blocksize_0]</tt>/4
 276         <li><tt>[left_n]</tt> = <tt>[blocksize_0]</tt>/2
 277     </ol>
 278     else
 279     <ol><li><tt>[left_window_start]</tt> = 0
 280         <li><tt>[left_window_end]</tt> = <tt>[window_center]</tt>
 281         <li><tt>[left_n]</tt> = <tt>[n]</tt>/2
 282     </ol>
 283
 284 <li> if (<tt>[vorbis_mode_blockflag]</tt> is set and <tt>[next_window_flag]</tt> is not set) then
 285     <ol><li><tt>[right_window_start]</tt> = <tt>[n]*3</tt>/4 - <tt>[blocksize_0]</tt>/4
 286         <li><tt>[right_window_end]</tt> = <tt>[n]*3</tt>/4 + <tt>[blocksize_0]</tt>/4
 287         <li><tt>[right_n]</tt> = <tt>[blocksize_0]</tt>/2
 288     </ol>
 289     else
 290     <ol><li><tt>[right_window_start]</tt> = <tt>[window_center]</tt>
 291         <li><tt>[right_window_end]</tt> = <tt>[n]</tt>
 292         <li><tt>[right_n]</tt> = <tt>[n]</tt>/2
 293     </ol>
 294 <li> window from range 0 ... <tt>[left_window_start]</tt>-1 inclusive is zero
 295
 296 <li> for <tt>[i]</tt> in range <tt>[left_window_start]</tt> ... <tt>[left_window_end]</tt>-1, window(<tt>[i]</tt>) = sin(.5 * PI * sin^2( (<tt>[i]</tt>-<tt>[left_window_start]</tt>+.5) / <tt>[left_n]</tt> * .5 * PI) )
 297
 298
 299 <li> window from range <tt>[left_window_end]</tt> ... <tt>[right_window_start]</tt>-1 inclusive is one
 300
 301 <li> for <tt>[i]</tt> in range <tt>[right_window_start]</tt> ... <tt>[right_window_end]</tt>-1, window(<tt>[i]</tt>) = sin(.5 * PI * sin^2( (<tt>[i]</tt>-<tt>[right_window_start]</tt>+.5) / <tt>[right_n]</tt> * .5 * PI + .5 * PI) )
 302
 303 <li> window from range <tt>[right_window_end]</tt> ... <tt>[n]</tt>-1 is zero
 304
 305 </ol><p>
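
Window generation can be sketched in Python as below.  This is an illustration under stated assumptions: the right-hand slope is taken to be the mirror image of the left-hand slope (its phase runs from .5 * PI to PI), and everything from the end of the right-hand slope to <tt>[n]</tt>-1 is zero.  The function name and argument names are hypothetical.<p>

```python
import math


def vorbis_window(n, blocksize_0, long_block, prev_long=True, next_long=True):
    """Return the length-n window as a list of floats."""
    center = n // 2
    if long_block and not prev_long:   # left half laps with a short block
        left_start = n // 4 - blocksize_0 // 4
        left_end = n // 4 + blocksize_0 // 4
        left_n = blocksize_0 // 2
    else:
        left_start, left_end, left_n = 0, center, n // 2
    if long_block and not next_long:   # right half laps with a short block
        right_start = n * 3 // 4 - blocksize_0 // 4
        right_end = n * 3 // 4 + blocksize_0 // 4
        right_n = blocksize_0 // 2
    else:
        right_start, right_end, right_n = center, n, n // 2

    window = [0.0] * n                 # zero outside the slopes by default
    for i in range(left_start, left_end):
        phase = (i - left_start + 0.5) / left_n * 0.5 * math.pi
        window[i] = math.sin(0.5 * math.pi * math.sin(phase) ** 2)
    for i in range(left_end, right_start):
        window[i] = 1.0                # flat top between the two slopes
    for i in range(right_start, right_end):
        phase = (i - right_start + 0.5) / right_n * 0.5 * math.pi + 0.5 * math.pi
        window[i] = math.sin(0.5 * math.pi * math.sin(phase) ** 2)
    return window
```

With this shape the squared rising and falling slopes of equal-sized windows sum to one, which is the property the overlap/add stage relies on.<p>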
 306
 307 An end-of-packet condition up to this point should be considered an
 308 error that discards this packet from the stream.  An end of packet
 309 condition past this point is to be considered a possible nominal
 310 occurrence.<p>
 311
 312
 313 <h2>floor curve decode</h2>
 314
 315 From this point on, we assume our decode context is using mode number
 316 <tt>[mode_number]</tt> from configuration array
 317 <tt>[vorbis_mode_configurations]</tt> and the map number
 318 <tt>[vorbis_mode_mapping]</tt> (specified by the current mode) taken
 319 from the mapping configuration array
 320 <tt>[vorbis_mapping_configurations]</tt>.<p>
 321
 322 Floor curves are decoded one-by-one in channel order.<p>
 323
 324 For each floor <tt>[i]</tt> of <tt>[audio_channels]</tt>:
 325   <ol><li><tt>[submap_number]</tt> = element <tt>[i]</tt> of vector <tt>[vorbis_mapping_mux]</tt><p>
 326
 327       <li><tt>[floor_number]</tt> = element <tt>[submap_number]</tt> of vector <tt>[vorbis_mapping_submap_floor]</tt><p>
 328       <li>if the floor type of this floor (vector <tt>[vorbis_floor_types]</tt> element <tt>[floor_number]</tt>) is zero then decode the floor for channel <tt>[i]</tt> according to the <a href="vorbis-spec-floor0.html#decode">floor 0 decode algorithm</a><p>
 329       <li>if the type of this floor is one then decode the floor for channel <tt>[i]</tt> according to the <a href="vorbis-spec-floor1.html#decode">floor 1 decode algorithm</a><p>
 330       <li>save the needed decoded floor information for channel <tt>[i]</tt> for later synthesis<p>
 331       <li>if the decoded floor returned 'unused', set vector <tt>[no_residue]</tt> element <tt>[i]</tt> to true, else set vector <tt>[no_residue]</tt> element <tt>[i]</tt> to false<p>
 332 </ol>
 333
 334 An end-of-packet condition during floor decode shall result in packet
 335 decode zeroing all channel output vectors and skipping to the
 336 add/overlap output stage.<p>
 337
 338 <h2>nonzero vector propagate</h2>
 339
 340 A possible result of floor decode is that a specific vector is marked
 341 'unused' which indicates that that final output vector is all-zero
 342 values (and the floor is zero).  The residue for that vector is not
 343 coded in the stream, save for one complication.  If some vectors are
 344 used and some are not, channel coupling could result in mixing a
 345 zeroed and nonzeroed vector to produce two nonzeroed vectors.<p>
 346
 347 for each <tt>[i]</tt> from 0 ... <tt>[vorbis_mapping_coupling_steps]</tt>-1
 348
 349 <ol><li>if either the <tt>[no_residue]</tt> entry for the magnitude channel
 350 (<tt>[vorbis_mapping_magnitude]</tt> element <tt>[i]</tt>) or for the angle channel
 351 (<tt>[vorbis_mapping_angle]</tt> element <tt>[i]</tt>) is set to false, then both
 352 must be set to false.  Note that an 'unused' floor has no decoded floor
 353 information; it is important that this is remembered at floor curve
 354 synthesis time.
 355 </ol>
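
The propagation rule above is small enough to sketch directly in Python.  The function name is hypothetical; the arguments mirror the vectors named in the text, and the steps are applied in ascending order as written.<p>

```python
def propagate_nonzero(no_residue, magnitude, angle):
    """Return a copy of no_residue after applying the coupling rule.

    magnitude[i] and angle[i] are the channel numbers of coupling step i.
    """
    flags = list(no_residue)
    for mag_ch, ang_ch in zip(magnitude, angle):
        if not flags[mag_ch] or not flags[ang_ch]:
            # One of the pair carries residue, so both must be decoded.
            flags[mag_ch] = False
            flags[ang_ch] = False
    return flags
```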
 356
 357 <h2>residue decode</h2>
 358
 359 Unlike floors, which are decoded in channel order, the residue vectors
 360 are decoded in submap order.<p>
 361
 362 for each submap <tt>[i]</tt> in order from 0 ... <tt>[vorbis_mapping_submaps]</tt>-1<p>
 363 <ol><li><tt>[ch]</tt> = 0<p>
 364     <li>for each channel <tt>[j]</tt> in order from 0 ... <tt>[audio_channels]</tt>-1<p>
 365         <ol><li>if channel <tt>[j]</tt> is in submap <tt>[i]</tt> (vector <tt>[vorbis_mapping_mux]</tt> element <tt>[j]</tt> is equal to <tt>[i]</tt>)<p>
 366             <ol><li>if vector <tt>[no_residue]</tt> element <tt>[j]</tt> is true<p>
 367                 <ol><li>vector <tt>[do_not_decode_flag]</tt> element <tt>[ch]</tt> is set<p>
 368                 </ol>else<ol><li>vector <tt>[do_not_decode_flag]</tt> element <tt>[ch]</tt> is unset<p>
 369                 </ol>
 370                 <li>increment <tt>[ch]</tt><p>
 371              </ol>
 372          </ol>
 373      <li><tt>[residue_number]</tt> = vector <tt>[vorbis_mapping_submap_residue]</tt> element <tt>[i]</tt><p>
 374
 375      <li><tt>[residue_type]</tt> = vector <tt>[vorbis_residue_types]</tt> element <tt>[residue_number]</tt><p>
 376      <li>decode <tt>[ch]</tt> vectors using residue <tt>[residue_number]</tt>, according to type <tt>[residue_type]</tt>, also passing vector <tt>[do_not_decode_flag]</tt> to indicate which vectors in the bundle should not be decoded. Correct per-vector decode length is <tt>[n]</tt>/2.<p>
 377
 378     <li><tt>[ch]</tt> = 0<p>
 379     <li>for each channel <tt>[j]</tt> in order from 0 ... <tt>[audio_channels]</tt>-1<p>
 380         <ol><li>if channel <tt>[j]</tt> is in submap <tt>[i]</tt> (vector <tt>[vorbis_mapping_mux]</tt> element <tt>[j]</tt> is equal to <tt>[i]</tt>)<p>
 381             <ol><li>residue vector for channel <tt>[j]</tt> is set to decoded residue vector <tt>[ch]</tt><p>
 382                 <li>increment <tt>[ch]</tt>
 383             </ol>
 384          </ol>
 385 </ol>
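
The bundling of channels into per-submap residue decodes can be sketched in Python as follows.  The <tt>decode_residue</tt> callback is a hypothetical stand-in for the residue decoder defined in the residue specification document; its argument order is an assumption of this sketch, and channels are indexed 0 through the channel count minus one.<p>

```python
def decode_residues(mux, no_residue, submap_residue, decode_residue, n):
    """Decode residue vectors in submap order; return one vector per channel.

    mux[j]            submap number of channel j
    no_residue[j]     True if channel j carries no residue
    submap_residue[i] residue number used by submap i
    decode_residue    callable(residue_number, ch, do_not_decode, length)
                      returning a list of ch vectors (hypothetical)
    """
    channels = len(mux)
    residue = [None] * channels
    for i, residue_number in enumerate(submap_residue):
        # Channels belonging to this submap, in channel order.
        members = [j for j in range(channels) if mux[j] == i]
        do_not_decode = [no_residue[j] for j in members]
        vectors = decode_residue(residue_number, len(members),
                                 do_not_decode, n // 2)
        # Hand each decoded vector back to the channel it came from.
        for ch, j in enumerate(members):
            residue[j] = vectors[ch]
    return residue
```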
 386
 387 <h2>inverse coupling</h2>
 388
 389 for each <tt>[i]</tt> from <tt>[vorbis_mapping_coupling_steps]</tt>-1 descending to 0
 390
 391 <ol>
 392
 393 <li><tt>[magnitude_vector]</tt> = the residue vector for channel
 394 (vector <tt>[vorbis_mapping_magnitude]</tt> element <tt>[i]</tt>)
 395
 396 <li><tt>[angle_vector]</tt> = the residue vector for channel (vector
 397 <tt>[vorbis_mapping_angle]</tt> element <tt>[i]</tt>)
 398
 399 <li>for each scalar value <tt>[M]</tt> in vector <tt>[magnitude_vector]</tt> and the corresponding scalar value <tt>[A]</tt> in vector <tt>[angle_vector]</tt>:
 400    <ol><li>if (<tt>[M]</tt> is greater than zero)
 401        <ol><li>if (<tt>[A]</tt> is greater than zero)
 402            <ol>
 403                <li><tt>[new_M]</tt> = <tt>[M]</tt>
 404                <li><tt>[new_A]</tt> = <tt>[M]</tt>-<tt>[A]</tt>
 405            </ol>
 406            else
 407            <ol>
 408                <li><tt>[new_A]</tt> = <tt>[M]</tt>
 409                <li><tt>[new_M]</tt> = <tt>[M]</tt>+<tt>[A]</tt>
 410            </ol>
 411         </ol>
 412         else
 413         <ol><li>if (<tt>[A]</tt> is greater than zero)
 414            <ol>
 415                <li><tt>[new_M]</tt> = <tt>[M]</tt>
 416                <li><tt>[new_A]</tt> = <tt>[M]</tt>+<tt>[A]</tt>
 417            </ol>
 418            else
 419            <ol>
 420                <li><tt>[new_A]</tt> = <tt>[M]</tt>
 421                <li><tt>[new_M]</tt> = <tt>[M]</tt>-<tt>[A]</tt>
 422            </ol>
 423         </ol><p>
 424
 425         <li>set scalar value <tt>[M]</tt> in vector <tt>[magnitude_vector]</tt> to <tt>[new_M]</tt>
 426         <li>set scalar value <tt>[A]</tt> in vector <tt>[angle_vector]</tt> to <tt>[new_A]</tt>
 427     </ol>
 428 </ol>
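
The four-way case analysis above translates directly into Python.  The function names are hypothetical; <tt>residue</tt> is assumed to be a list of per-channel vectors that are modified in place.<p>

```python
def inverse_couple(magnitude_vector, angle_vector):
    """Undo one coupling step in place on a magnitude/angle vector pair."""
    for k, (m, a) in enumerate(zip(magnitude_vector, angle_vector)):
        if m > 0:
            if a > 0:
                new_m, new_a = m, m - a
            else:
                new_a, new_m = m, m + a
        else:
            if a > 0:
                new_m, new_a = m, m + a
            else:
                new_a, new_m = m, m - a
        magnitude_vector[k] = new_m
        angle_vector[k] = new_a


def apply_inverse_coupling(residue, magnitude, angle):
    """Run all coupling steps, from the last step down to step 0."""
    for i in reversed(range(len(magnitude))):
        inverse_couple(residue[magnitude[i]], residue[angle[i]])
```

The descending step order matters when a channel takes part in more than one coupling step, since each step consumes the output of the steps after it.<p>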
 429
 430 <h2>dot product</h2>
 431
 432 For each channel, synthesize the floor curve from the decoded floor
 433 information, according to packet type. Note that the vector synthesis
 434 length for floor computation is <tt>[n]</tt>/2.<p>
 435
 436 For each channel, multiply each element of the floor curve by the
 437 corresponding element of that channel's residue vector.  The result is
 438 the dot product of the floor and residue vectors for each channel; the
 439 produced vectors are the length <tt>[n]</tt>/2 audio spectrum for each
 440 channel.<p>
 441
 442 One point is worth mentioning about this dot product; a common mistake
 443 in a fixed point implementation might be to assume that a 32 bit
 444 fixed-point representation for floor and residue and direct
 445 multiplication of the vectors is sufficient for acceptable spectral
 446 depth in all cases because it happens to mostly work with the current
 447 Xiph.Org reference encoder. <p>
 448
 449 However, floor vector values can span ~140dB (~24 bits unsigned), and
 450 the audio spectrum vector should represent a minimum of 120dB (~21
 451 bits with sign), even when output is to a 16 bit PCM device.  For the
 452 residue vector to represent full scale if the floor is nailed to
 453 -140dB, it must be able to span 0 to +140dB.  For the residue vector
 454 to reach full scale if the floor is nailed at 0dB, it must be able to
 455 represent -140dB to +0dB.  Thus, in order to handle full range
 456 dynamics, a residue vector may span -140dB to +140dB entirely within
 457 spec.  A 280dB range is approximately 48 bits with sign; thus the
 458 residue vector must be able to represent a 48 bit range and the dot
 459 product must be able to handle an effective 48 bit times 24 bit
 460 multiplication.  This range may be achieved using large (64 bit or
 461 larger) integers, or implementing a movable binary point
 462 representation.<p>
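
The bit counts quoted above follow from roughly 6.02dB of range per bit (20 * log10(2)).  The Python sketch below reproduces that arithmetic and shows the element-by-element multiply in floating point; the function names are hypothetical, and rounding up to whole bits is an assumption of this sketch.<p>

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of dynamic range per bit


def bits_for_range(db: float, signed: bool = False) -> int:
    """Whole bits needed to span a dB range, plus one if a sign is kept."""
    return math.ceil(db / DB_PER_BIT) + (1 if signed else 0)


def spectrum_from_floor_and_residue(floor, residue):
    """Multiply the floor curve and residue vector element by element."""
    return [f * r for f, r in zip(floor, residue)]
```

Under these assumptions <tt>bits_for_range(140)</tt> gives 24, <tt>bits_for_range(120, signed=True)</tt> gives 21 and <tt>bits_for_range(280, signed=True)</tt> gives 48, matching the figures in the text.<p>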
 463
 464 <h2>inverse MDCT</h2>
 465
 466 Convert the audio spectrum vector of each channel back into time
 467 domain PCM audio via an inverse Modified Discrete Cosine Transform
 468 (MDCT).  A detailed description of the MDCT is available in the paper
 469 <a
 470 href="http://www.iocon.com/resource/docs/ps/eusipco_corrected.ps">_The
 471 use of multirate filter banks for coding of high quality digital
 472 audio_</a>, by T. Sporer, K. Brandenburg and B. Edler.  The window
 473 function used for the MDCT is the window determined earlier.<p>
 474
 475 <h2>overlap_add</h2>
 476
 477 Windowed MDCT output is overlapped and added with the right hand data
 478 of the previous window such that the 3/4 point of the previous window
 479 is aligned with the 1/4 point of the current window (as illustrated in
 480 <a href="vorbis-spec-intro.html#window">the 'Window' portion of the
 481 specification introduction document</a>).  The overlapped portion
 482 produced from overlapping the previous and current frame data is
 483 finished data to be returned by the decoder.  This data spans from the
 484 center of the previous window to the center of the current window.  In
 485 the case of same-sized windows, the amount of data to return is
 486 one-half block consisting of and only of the overlapped portions. When
 487 overlapping a short and long window, much of the returned range is not
 488 actually overlap.  This does not damage transform orthogonality.  Pay
 489 attention however to returning the correct data range; the amount of
 490 data to be returned is:<p>
 491 <tt>window_blocksize(previous_window)/4+window_blocksize(current_window)/4</tt>
 492 from the center (element windowsize/2) of the previous window to the
 493 center (element windowsize/2-1, inclusive) of the current window.<p>
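
The amount of data returned per frame can be sketched in Python as below.  The function name is hypothetical, and passing <tt>None</tt> for the previous blocksize is this sketch's way of marking the first frame, which returns no data.<p>

```python
from typing import Optional


def returned_sample_count(previous_blocksize: Optional[int],
                          current_blocksize: int) -> int:
    """Samples per channel returned after decoding the current frame."""
    if previous_blocksize is None:
        return 0  # the first frame only primes the decoder
    return previous_blocksize // 4 + current_blocksize // 4
```

For two 2048-sample windows this yields 1024 samples (one-half block); for a 256-sample window followed by a 2048-sample window it yields 576.<p>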
 494
 495 Data is not returned from the first frame; it must be used to 'prime'
 496 the decode engine.  The encoder accounts for this priming when
 497 calculating PCM offsets; after the first frame, the proper PCM output
 498 offset is '0' (as no data has been returned yet).<p>
 499
 500 <h2>output channel order</h2>
 501
 502 Vorbis I specifies only a channel mapping type 0.  In mapping type 0,
 503 channel mapping is implicitly defined as follows for standard audio
 504 applications:<p>
 505
 506 <dl>
 507 <dt>one channel:<dd> the stream is monophonic
 508 <dt>two channels:<dd> the stream is stereo.  channel order: left, right
 509 <dt>three channels:<dd> the stream is a 1d-surround encoding.  channel order: left, center, right
 510 <dt>four channels:<dd> the stream is quadraphonic surround.  channel order: front left, front right, rear left, rear right
 511 <dt>five channels:<dd> the stream is five-channel surround.  channel order: front left, front center, front right, rear left, rear right
 512 <dt>six channels:<dd> the stream is 5.1 surround.  channel order: front left, front center, front right, rear left, rear right, LFE
 513 <dt>greater than six channels:<dd> channel use and order is defined by the application
 514 </dl>
 515
 516 Applications using Vorbis for dedicated purposes may define channel
 517 mapping as seen fit.  Future channel mappings (such as three and four
 518 channel <a href="http://www.ambisonic.net">Ambisonics</a>) will make
 519 use of channel mappings other than mapping 0.<p>
 520
 521 <hr>
 522 <a href="http://www.xiph.org/">
 523 <img src="white-xifish.png" align=left border=0>
 524 </a>
 525 <font size=-2 color=#505050>
 526
 527 Ogg is a <a href="http://www.xiph.org">Xiph.org Foundation</a> effort
 528 to protect essential tenets of Internet multimedia from corporate
 529 hostage-taking; Open Source is the net's greatest tool to keep
 530 everyone honest. See <a href="http://www.xiph.org/about.html">About
 531 the Xiph.org Foundation</a> for details.
 532 <p>
 533
 534 Ogg Vorbis is the first Ogg audio CODEC.  Anyone may freely use and
 535 distribute the Ogg and Vorbis specification, whether in a private,
 536 public or corporate capacity.  However, the Xiph.org Foundation and
 537 the Ogg project (xiph.org) reserve the right to set the Ogg Vorbis
 538 specification and certify specification compliance.<p>
 539
 540 Xiph.org's Vorbis software CODEC implementation is distributed under a
 541 BSD-like license.  This does not restrict third parties from
 542 distributing independent implementations of Vorbis software under
 543 other licenses.<p>
 544
 545 Ogg, Vorbis, Xiph.org Foundation and their logos are trademarks (tm)
 546 of the <a href="http://www.xiph.org/">Xiph.org Foundation</a>.  These
 547 pages are copyright (C) 1994-2002 Xiph.org Foundation. All rights
 548 reserved.<p>
 549
 550 </body>
 551