doc/xml/07-floor1.xml

   1 <?xml version="1.0" standalone="no"?>
   2 <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
   3                 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
   4
   5 ]>
   6
   7 <section id="vorbis-spec-floor1">
   8 <sectioninfo>
   9 <releaseinfo>
  10  $Id: 07-floor1.xml,v 1.5 2003/03/11 11:02:17 xiphmont Exp $
  11  <emphasis>Last update to this document: March 11, 2003</emphasis>
  12 </releaseinfo>
  13 </sectioninfo>
  14 <title>Floor type 1 setup and decode</title>
  15
  16 <section>
  17 <title>Overview</title>
  18
  19 <para>
  20 Vorbis floor type one uses a piecewise straight-line representation to
  21 encode a spectral envelope curve. The representation plots this curve
  22 mechanically on a linear frequency axis and a logarithmic (dB)
  23 amplitude axis. The integer plotting algorithm used is similar to
  24 Bresenham's algorithm.</para>
  25
  26 </section>
  27
  28 <section>
  29 <title>Floor 1 format</title>
  30
  31 <section><title>model</title>
  32
  33 <para>
  34 Floor type one represents a spectral curve as a series of
  35 line segments.  Synthesis constructs a floor curve using iterative
  36 prediction in a process roughly equivalent to the following simplified
  37 description:</para>
  38
  39 <itemizedlist>
  40  <listitem><simpara> the first line segment (base case) is a logical line spanning
  41 from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the
  42 full range of the spectral floor to be computed.</simpara></listitem>
  43
  44 <listitem><simpara>the induction step chooses a point x_new within an existing
  45 logical line segment and produces a y_new value at that point computed
  46 from the existing line's y value at x_new (as plotted by the line) and
  47 a difference value decoded from the bitstream packet.</simpara></listitem>
  48
  49 <listitem><simpara>floor computation produces two new line segments, one running from
  50 x_0,y_0 to x_new,y_new and from x_new,y_new to x_1,y_1. This step is
  51 performed logically even if y_new represents no change to the
  52 amplitude value at x_new so that later refinement is additionally
  53 bounded at x_new.</simpara></listitem>
  54
  55 <listitem><simpara>the induction step repeats, using a list of x values specified in
  56 the codec setup header at floor 1 initialization time.  Computation
  57 is completed at the end of the x value list.</simpara></listitem>
  58
  59 </itemizedlist>
  60
  61 <para>
  62 Consider the following example, with values chosen for ease of
  63 understanding rather than representing typical configuration:</para>
  64
  65 <para>
  66 For the below example, we assume a floor setup with an [n] of 128.
  67 The list of selected X values in increasing order is
  68 0,16,32,48,64,80,96,112 and 128.  In list order, the values interleave
  69 as 0, 128, 64, 32, 96, 16, 48, 80 and 112.  The corresponding
  70 list-order Y values as decoded from an example packet are 110, 20, -5,
  71 -45, 0, -25, -10, 30 and -10.  We compute the floor in the following
  72 way, beginning with the first line:</para>
  73
  74 <mediaobject>
  75 <imageobject>
  76  <imagedata fileref="floor1-1.png" format="PNG"/>
  77 </imageobject>
  78 <textobject>
  79  <phrase>[graph of example floor]</phrase>
  80 </textobject>
  81 </mediaobject>
  82
  83 <para>
  84 We now draw new logical lines to reflect the correction to new_Y, and
  85 iterate for X positions 32 and 96:</para>
  86
  87 <mediaobject>
  88 <imageobject>
  89  <imagedata fileref="floor1-2.png" format="PNG"/>
  90  </imageobject>
  91  <textobject>
  92   <phrase>[graph of example floor]</phrase>
  93  </textobject>
  94 </mediaobject>
  95
  96 <para>
  97 Although the new Y value at X position 96 is unchanged, it is still
  98 used later as an endpoint for further refinement.  From here on, the
  99 pattern should be clear; we complete the floor computation as follows:</para>
 100
 101 <mediaobject>
 102 <imageobject>
 103  <imagedata fileref="floor1-3.png" format="PNG"/>
 104  </imageobject>
 105  <textobject>
 106   <phrase>[graph of example floor]</phrase>
 107  </textobject>
 108 </mediaobject>
 109
 110 <mediaobject>
 111 <imageobject>
 112  <imagedata fileref="floor1-4.png" format="PNG"/>
 113  </imageobject>
 114  <textobject>
 115   <phrase>[graph of example floor]</phrase>
 116  </textobject>
 117 </mediaobject>
 118
 119
 120 <para>
 121 A more efficient algorithm with carefully defined integer rounding
 122 behavior is used for actual decode, as described later.  The actual
 123 algorithm splits Y value computation and line plotting into two steps
 124 with modifications to the above algorithm to eliminate noise
 125 accumulation through integer roundoff/truncation. </para>
 126
 127 </section>
 128
 129 <section><title>header decode</title>
 130
 131 <para>
 132 A list of floor X values is stored in the packet header in interleaved
 133 format (used in list order during packet decode and synthesis).  This
 134 list is split into partitions, and each partition is assigned to a
 135 partition class.  X positions 0 and [n] are implicit and do not belong
 136 to an explicit partition or partition class.</para>
 137
 138 <para>
 139 A partition class consists of a representation vector width (the
 140 number of Y values which the partition class encodes at once), a
 141 'subclass' value representing the number of alternate entropy books
 142 the partition class may use in representing Y values, the list of
 143 [subclass] books and a master book used to encode which alternate
 144 books were chosen for representation in a given packet.  The
 145 master/subclass mechanism is meant to be used as a flexible
 146 representation cascade while still using codebooks only in a scalar
 147 context.</para>
 148
 149 <screen>
 150
 151   1) [floor1_partitions] = read 5 bits as unsigned integer
 152   2) [maximum_class] = -1
 153   3) iterate [i] over the range 0 ... [floor1_partitions]-1 {
 154
 155         4) vector [floor1_partition_class_list] element [i] = read 4 bits as unsigned integer
 156
 157      }
 158
 159   5) [maximum_class] = largest integer scalar value in vector [floor1_partition_class_list]
 160   6) iterate [i] over the range 0 ... [maximum_class] {
 161
 162         7) vector [floor1_class_dimensions] element [i] = read 3 bits as unsigned integer and add 1
 163         8) vector [floor1_class_subclasses] element [i] = read 2 bits as unsigned integer
 164         9) if ( vector [floor1_class_subclasses] element [i] is nonzero ) {
 165
 166              10) vector [floor1_class_masterbooks] element [i] = read 8 bits as unsigned integer
 167
 168            }
 169
 170        11) iterate [j] over the range 0 ... (2 exponent [floor1_class_subclasses] element [i]) - 1  {
 171
 172              12) array [floor1_subclass_books] element [i],[j] =
 173                  read 8 bits as unsigned integer and subtract one
 174            }
 175       }
 176
 177  13) [floor1_multiplier] = read 2 bits as unsigned integer and add one
 178  14) [rangebits] = read 4 bits as unsigned integer
 179  15) vector [floor1_X_list] element [0] = 0
 180  16) vector [floor1_X_list] element [1] = 2 exponent [rangebits];
 181  17) [floor1_values] = 2
 182  18) iterate [i] over the range 0 ... [floor1_partitions]-1 {
 183
 184        19) [current_class_number] = vector [floor1_partition_class_list] element [i]
 185        20) iterate [j] over the range 0 ... ([floor1_class_dimensions] element [current_class_number])-1 {
 186              21) vector [floor1_X_list] element ([j] + [floor1_values]) =
 187                  read [rangebits] bits as unsigned integer
 188              22) increment [floor1_values] by one
 189            }
 190      }
 191
 192  23) done
 193 </screen>
 194
 195 <para>
 196 An end-of-packet condition while reading any aspect of a floor 1
 197 configuration during setup renders a stream undecodable.  In
 198 addition, a <varname>[floor1_class_masterbooks]</varname> or
 199 <varname>[floor1_subclass_books]</varname> scalar element greater than the
 200 highest numbered codebook configured in this stream is an error
 201 condition that renders the stream undecodable.</para>
 202
 203 <section id="vorbis-spec-floor1-decode">
 204 <title>packet decode</title>
 205
 206 <para>
 207 Packet decode begins by checking the <varname>[nonzero]</varname> flag:</para>
 208
 209 <screen>
 210   1) [nonzero] = read 1 bit as boolean
 211 </screen>
 212
 213 <para>
 214 If <varname>[nonzero]</varname> is unset, that indicates this channel contained
 215 no audio energy in this frame.  Decode immediately returns a status
 216 indicating this floor curve (and thus this channel) is unused this
 217 frame.  (A return status of 'unused' is different from decoding a
 218 floor that has all points set to minimum representation amplitude,
 219 which happens to be approximately -140dB).
 220 </para>
 221
 222 <para>
 223 Assuming <varname>[nonzero]</varname> is set, decode proceeds as follows:</para>
 224
 225 <screen>
 226   1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1)
 227   2) vector [floor1_Y] element [0] = read <link linkend="vorbis-spec-ilog">ilog</link>([range]-1) bits as unsigned integer
 228   3) vector [floor1_Y] element [1] = read <link linkend="vorbis-spec-ilog">ilog</link>([range]-1) bits as unsigned integer
 229   4) [offset] = 2;
 230   5) iterate [i] over the range 0 ... [floor1_partitions]-1 {
 231
 232        6) [class] = vector [floor1_partition_class]  element [i]
 233        7) [cdim]  = vector [floor1_class_dimensions] element [class]
 234        8) [cbits] = vector [floor1_class_subclasses] element [class]
 235        9) [csub]  = (2 exponent [cbits])-1
 236       10) [cval]  = 0
 237       11) if ( [cbits] is greater than zero ) {
 238
 239              12) [cval] = read from packet using codebook number
 240                  (vector [floor1_class_masterbooks] element [class]) in scalar context
 241           }
 242
 243       13) iterate [j] over the range 0 ... [cdim]-1 {
 244
 245              14) [book] = array [floor1_subclass_books] element [class],([cval] bitwise AND [csub])
 246              15) [cval] = [cval] right shifted [cbits] bits
 247              16) if ( [book] is not less than zero ) {
 248
 249                    17) vector [floor1_Y] element ([j]+[offset]) = read from packet using codebook
 250                        [book] in scalar context
 251
 252                  } else [book] is less than zero {
 253
 254                    18) vector [floor1_Y] element ([j]+[offset]) = 0
 255
 256                  }
 257           }
 258
 259       19) [offset] = [offset] + [cdim]
 260
 261      }
 262
 263  20) done
 264 </screen>
 265
 266 <para>
 267 An end-of-packet condition during curve decode should be considered a
 268 nominal occurrence; if end-of-packet is reached during any read
 269 operation above, floor decode is to return 'unused' status as if the
 270 <varname>[nonzero]</varname> flag had been unset at the beginning of decode.
 271 </para>
 272
 273 <para>
 274 Vector <varname>[floor1_Y]</varname> contains the values from packet decode
 275 needed for floor 1 synthesis.</para>
 276
 277 </section>
 278
 279 <section id="vorbis-spec-floor1-synth">
 280 <title>curve computation</title>
 281
 282 <para>
 283 Curve computation is split into two logical steps; the first step
 284 derives final Y amplitude values from the encoded, wrapped difference
 285 values taken from the bitstream.  The second step plots the curve
 286 lines.  Also, although zero-difference values are used in the
 287 iterative prediction to find final Y values, these points are
 288 conditionally skipped during final line computation in step two.
 289 Skipping zero-difference values allows a smoother line fit.  </para>
 290
 291 <para>
 292 Although some aspects of the below algorithm look like inconsequential
 293 optimizations, implementors are warned to follow the details closely.
 294 Deviation from implementing a strictly equivalent algorithm can result
 295 in serious decoding errors.</para>
 296
 297 <section>
 298 <title>step 1: amplitude value synthesis</title>
 299
 300 <para>
 301 Unwrap the always-positive-or-zero values read from the packet into
 302 +/- difference values, then apply to line prediction.</para>
 303
 304 <screen>
 305   1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1)
 306   2) vector [floor1_step2_flag] element [0] = set
 307   3) vector [floor1_step2_flag] element [1] = set
 308   4) vector [floor1_final_Y] element [0] = vector [floor1_Y] element [0]
 309   5) vector [floor1_final_Y] element [1] = vector [floor1_Y] element [1]
 310   6) iterate [i] over the range 2 ... [floor1_values]-1 {
 311
 312        7) [low_neighbor_offset] = <link linkend="vorbis-spec-low_neighbor">low_neighbor</link>([floor1_X_list],[i])
 313        8) [high_neighbor_offset] = <link linkend="vorbis-spec-high_neighbor">high_neighbor</link>([floor1_X_list],[i])
 314
 315        9) [predicted] = <link linkend="vorbis-spec-render_point">render_point</link>( vector [floor1_X_list] element [low_neighbor_offset],
 316                                       vector [floor1_final_Y] element [low_neighbor_offset],
 317                                       vector [floor1_X_list] element [high_neighbor_offset],
 318                                       vector [floor1_final_Y] element [high_neighbor_offset],
 319                                       vector [floor1_X_list] element [i] )
 320
 321       10) [val] = vector [floor1_Y] element [i]
 322       11) [highroom] = [range] - [predicted]
 323       12) [lowroom]  = [predicted]
 324       13) if ( [highroom] is less than [lowroom] ) {
 325
 326             14) [room] = [highroom] * 2
 327
 328           } else [highroom] is not less than [lowroom] {
 329
 330             15) [root] = [lowroom] * 2
 331
 332           }
 333
 334       16) if ( [val] is nonzero ) {
 335
 336             17) vector [floor1_step2_flag] element [low_neighbor_offset] = set
 337             18) vector [floor1_step2_flag] element [high_neighbor_offset] = set
 338             19) vector [floor1_step2_flag] element [i] = set
 339             20) if ( [val] is greater than or equal to [room] ) {
 340
 341                   21) if ( [hiroom] is greater than [lowroom] ) {
 342
 343                         22) vector [floor1_final_Y] element [i] = [val] - [lowroom] + [predicted]
 344
 345                       } else [hiroom] is not greater than [lowroom] {
 346
 347                         23) vector [floor1_final_Y] element [i] = [predicted] - [val] + [hiroom] - 1
 348
 349                       }
 350
 351                 } else [val] is less than [room] {
 352
 353                   24) if ([val] is odd) {
 354
 355                         25) vector [floor1_final_Y] element [i] =
 356                             [predicted] - (([val] + 1) divided by  2 using integer division)
 357
 358                       } else [val] is even {
 359
 360                         26) vector [floor1_final_Y] element [i] =
 361                             [predicted] + ([val] / 2 using integer division)
 362
 363                       }
 364
 365                 }
 366
 367           } else [val] is zero {
 368
 369             27) vector [floor1_step2_flag] element [i] = unset
 370             28) vector [floor1_final_Y] element [i] = [predicted]
 371
 372           }
 373
 374      }
 375
 376  29) done
 377
 378 </screen>
 379
 380 </section>
 381
 382 <section>
 383 <title>step 2: curve synthesis</title>
 384
 385 <para>
 386 Curve synthesis generates a return vector <varname>[floor]</varname> of length
 387 <varname>[n]</varname> (where <varname>[n]</varname> is provided by the decode process
 388 calling to floor decode).  Floor 1 curve synthesis makes use of the
 389 <varname>[floor1_X_list]</varname>, <varname>[floor1_final_Y]</varname> and
 390 <varname>[floor1_step2_flag]</varname> vectors, as well as [floor1_multiplier]
 391 and [floor1_values] values.</para>
 392
 393 <para>
 394 Decode begins by sorting the scalars from vectors
 395 <varname>[floor1_X_list]</varname>, <varname>[floor1_final_Y]</varname> and
 396 <varname>[floor1_step2_flag]</varname> together into new vectors
 397 <varname>[floor1_X_list]'</varname>, <varname>[floor1_final_Y]'</varname> and
 398 <varname>[floor1_step2_flag]'</varname> according to ascending sort order of the
 399 values in <varname>[floor1_X_list]</varname>.  That is, sort the values of
 400 <varname>[floor1_X_list]</varname> and then apply the same permutation to
 401 elements of the other two vectors so that the X, Y and step2_flag
 402 values still match.</para>
 403
 404 <para>
 405 Then compute the final curve in one pass:</para>
 406
 407 <screen>
 408   1) [hx] = 0
 409   2) [lx] = 0
 410   3) [ly] = vector [floor1_final_Y]' element [0] * [floor1_multiplier]
 411   4) iterate [i] over the range 1 ... [floor1_values]-1 {
 412
 413        5) if ( [floor1_step2_flag]' is set ) {
 414
 415              6) [hy] = [floor1_final_Y]' element [i] * [floor1_multiplier]
 416              7) [hx] = [floor1_X_list]' element [i]
 417              8) <link linkend="vorbis-spec-render_line">render_line</link>( [lx], [ly], [hx], [hy], [floor] )
 418              9) [lx] = [hx]
 419             10) [ly] = [hy]
 420           }
 421      }
 422
 423  11) if ( [hx] is less than [n] ) {
 424
 425         12) <link linkend="vorbis-spec-render_line">render_line</link>( [hx], [hy], [n], [hy], [floor] )
 426
 427      }
 428
 429  13) if ( [hx] is greater than [n] ) {
 430
 431             14) truncate vector [floor] to [n] elements
 432
 433      }
 434
 435  15) for each scalar in vector [floor], perform a lookup substitution using
 436      the scalar value from [floor] as an offset into the vector <link linkend="vorbis-spec-floor1_inverse_dB_table">[floor1_inverse_dB_static_table]</link>
 437
 438  16) done
 439
 440 </screen>
 441
 442 </section>
 443
 444 </section>
 445
 446 </section>
 447 </section>
 448 </section>
 449