doc/vorbis.html

   1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\r
   2 <html>\r
   3 <head>\r
   4 \r
   5 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>\r
   6 <title>Ogg Vorbis Documentation</title>\r
   7 \r
   8 <style type="text/css">\r
   9 body {\r
  10   margin: 0 18px 0 18px;\r
  11   padding-bottom: 30px;\r
  12   font-family: Verdana, Arial, Helvetica, sans-serif;\r
  13   color: #333333;\r
  14   font-size: .8em;\r
  15 }\r
  16 \r
  17 a {\r
  18   color: #3366cc;\r
  19 }\r
  20 \r
  21 img {\r
  22   border: 0;\r
  23 }\r
  24 \r
  25 #xiphlogo {\r
  26   margin: 30px 0 16px 0;\r
  27 }\r
  28 \r
  29 #content p {\r
  30   line-height: 1.4;\r
  31 }\r
  32 \r
  33 h1, h1 a, h2, h2 a, h3, h3 a {\r
  34   font-weight: bold;\r
  35   color: #ff9900;\r
  36   margin: 1.3em 0 8px 0;\r
  37 }\r
  38 \r
  39 h1 {\r
  40   font-size: 1.3em;\r
  41 }\r
  42 \r
  43 h2 {\r
  44   font-size: 1.2em;\r
  45 }\r
  46 \r
  47 h3 {\r
  48   font-size: 1.1em;\r
  49 }\r
  50 \r
  51 li {\r
  52   line-height: 1.4;\r
  53 }\r
  54 \r
  55 #copyright {\r
  56   margin-top: 30px;\r
  57   line-height: 1.5em;\r
  58   text-align: center;\r
  59   font-size: .8em;\r
  60   color: #888888;\r
  61   clear: both;\r
  62 }\r
  63 </style>\r
  64 \r
  65 </head>\r
  66 \r
  67 <body>\r
  68 \r
  69 <div id="xiphlogo">\r
  70   <a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.org"/></a>\r
  71 </div>\r
  72 \r
  73 <h1>Ogg Vorbis encoding format documentation</h1>\r
  74 \r
   75 <p><img src="wait.png" alt="wait"/>At the time of writing, not all of the document\r
   76 links below are live. They will be populated as we complete the documents.</p>\r
  77 \r
  78 <h2>Documents</h2>\r
  79 \r
  80 <ul>\r
  81 <li><a href="packet.html">Vorbis packet structure</a></li>\r
  82 <li><a href="envelope.html">Temporal envelope shaping and blocksize</a></li>\r
  83 <li><a href="mdct.html">Time domain segmentation and MDCT transform</a></li>\r
  84 <li><a href="resolution.html">The resolution floor</a></li>\r
  85 <li><a href="residuals.html">MDCT-domain fine structure</a></li>\r
  86 </ul>\r
  87 \r
  88 <ul>\r
  89 <li><a href="probmodel.html">The Vorbis probability model</a></li>\r
  90 <li><a href="bitpack.html">The Vorbis bitpacker</a></li>\r
  91 </ul>\r
  92 \r
  93 <ul>\r
  94 <li><a href="oggstream.html">Ogg bitstream overview</a></li>\r
  95 <li><a href="framing.html">Ogg logical bitstream and framing spec</a></li>\r
   96 <li><a href="vorbis-stream.html">Vorbis packet-&gt;Ogg bitstream mapping</a></li>\r
  97 </ul>\r
  98 \r
  99 <ul>\r
 100 <li><a href="programming.html">Programming with libvorbis</a></li>\r
 101 </ul>\r
 102 \r
 103 <h2>Description</h2>\r
 104 \r
 105 <p>Ogg Vorbis is a general purpose compressed audio format\r
 106 for high quality (44.1-48.0kHz, 16+ bit, polyphonic) audio and music\r
 107 at moderate fixed and variable bitrates (40-80 kb/s/channel). This\r
 108 places Vorbis in the same class as audio representations including\r
 109 MPEG-1 audio layer 3, MPEG-4 audio (AAC and TwinVQ), and PAC.</p>\r
 110 \r
 111 <p>Vorbis is the first of a planned family of Ogg multimedia coding\r
 112 formats being developed as part of the Xiph.org Foundation's Ogg multimedia\r
 113 project. See <a href="http://www.xiph.org/">http://www.xiph.org/</a>\r
 114 for more information.</p>\r
 115 \r
 116 <h2>Vorbis technical documents</h2>\r
 117 \r
 118 <p>A Vorbis encoder takes in overlapping (but contiguous) short-time\r
 119 segments of audio data. The encoder analyzes the content of the audio\r
 120 to determine an optimal compact representation; this phase of encoding\r
 121 is known as <em>analysis</em>. For each short-time block of sound,\r
 122 the encoder then packs an efficient representation of the signal, as\r
 123 determined by analysis, into a raw packet much smaller than the size\r
 124 required by the original signal; this phase is <em>coding</em>.\r
 125 Lastly, in a streaming environment, the raw packets are then\r
 126 structured into a continuous stream of octets; this last phase is\r
 127 <em>streaming</em>. Note that the stream of octets is referred to both\r
  128 as a 'byte-' and 'bit-'stream; the latter usage is acceptable as the\r
 129 stream of octets is a physical representation of a true logical\r
 130 bit-by-bit stream.</p>\r
 131 \r
  132 <p>A Vorbis decoder performs the mirror-image process: extracting the\r
  133 original sequence of raw packets from an Ogg stream (<em>stream\r
  134 decomposition</em>), reconstructing the signal representation from the\r
  135 raw data in the packet (<em>decoding</em>), and then reconstituting an\r
  136 audio signal from the decoded representation (<em>synthesis</em>).</p>\r
 137 \r
 138 <p>The <a href="programming.html">Programming with libvorbis</a>\r
 139 documents discuss use of the reference Vorbis codec library\r
 140 (libvorbis) produced by the Xiph.org Foundation.</p>\r
 141 \r
 142 <p>The data representations and algorithms necessary at each step to\r
  143 encode and decode Ogg Vorbis bitstreams are described by the documents\r
  144 below in sufficient detail to construct a complete Vorbis codec.\r
 145 Note that at the time of writing, Vorbis is still in a 'Request For\r
 146 Comments' stage of development; despite being in advanced stages of\r
 147 development, input from the multimedia community is welcome.</p>\r
 148 \r
 149 <h3>Vorbis analysis and synthesis</h3>\r
 150 \r
  151 <p>Analysis begins by separating an input audio stream into individual,\r
 152 overlapping short-time segments of audio data. These segments are\r
 153 then transformed into an alternate representation, seeking to\r
 154 represent the original signal in a more efficient form that codes into\r
 155 a smaller number of bytes. The analysis and transformation stage is\r
 156 the most complex element of producing a Vorbis bitstream.</p>\r
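<p>The segmentation step described above can be sketched as follows. This is
only an illustration: the function name <code>overlapping_blocks</code>, the
fixed block size and the 50% overlap are assumptions made for the example,
whereas a real Vorbis encoder chooses between block sizes as described in the
envelope and MDCT documents.</p>

```python
def overlapping_blocks(samples, block_size):
    """Yield blocks of block_size samples, each overlapping the previous by half.

    Hypothetical helper for illustration; not part of libvorbis.
    """
    if block_size <= 0 or block_size % 2:
        raise ValueError("block_size must be a positive even number")
    hop = block_size // 2  # advance by half a block, so neighbours share samples
    for start in range(0, len(samples) - block_size + 1, hop):
        yield samples[start:start + block_size]


if __name__ == "__main__":
    for block in overlapping_blocks(list(range(8)), 4):
        print(block)
```

<p>Each interior sample appears in two neighbouring blocks, which is what lets
a windowed, overlapped transform reconstruct the signal without gaps at the
block boundaries.</p>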
 157 \r
 158 <p>The corresponding synthesis step in the decoder is simpler; there is\r
 159 no analysis to perform, merely a mechanical, deterministic\r
 160 reconstruction of the original audio data from the transform-domain\r
 161 representation.</p>\r
 162 \r
 163 <ul>\r
 164 <li><a href="packet.html">Vorbis packet structure</a>:\r
 165 Describes the basic analysis components necessary to produce Vorbis\r
 166 packets and the structure of the packet itself.</li>\r
 167 <li><a href="envelope.html">Temporal envelope shaping and blocksize</a>:\r
 168 Use of temporal envelope shaping and variable blocksize to minimize\r
 169 time-domain energy leakage during wide dynamic range and spectral energy\r
 170 swings. Also discusses time-related principles of psychoacoustics.</li>\r
 171 <li><a href="mdct.html">Time domain segmentation and MDCT transform</a>:\r
 172 Division of time domain data into individual overlapped, windowed\r
  173 short-time vectors and transformation using the MDCT.</li>\r
  174 <li><a href="resolution.html">The resolution floor</a>: Use of frequency\r
  175 domain psychoacoustics, and the MDCT-domain noise, masking and resolution\r
  176 floors.</li>\r
  177 <li><a href="residuals.html">MDCT-domain fine structure</a>: Production,\r
  178 quantization and massaging of MDCT-spectrum fine structure.</li>\r
 179 </ul>\r
 180 \r
 181 <h3>Vorbis coding and decoding</h3>\r
 182 \r
  183 <p>Coding and decoding convert the transform-domain representation of\r
 184 the original audio produced by analysis to and from a bitwise packed\r
 185 raw data packet. Coding and decoding consist of two logically\r
 186 orthogonal concepts, <em>back-end coding</em> and <em>bitpacking</em>.</p>\r
 187 \r
 188 <p><em>Back-end coding</em> uses a probability model to represent the raw numbers\r
 189 of the audio representation in as few physical bits as possible;\r
 190 familiar examples of back-end coding include Huffman coding and Vector\r
 191 Quantization.</p>\r
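<p>As a rough illustration of how a probability model turns into short codes,
the sketch below derives Huffman code lengths from symbol frequencies. It is a
generic textbook construction, not the Vorbis codebook format; the function
name <code>huffman_code_lengths</code> is hypothetical.</p>

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A lone symbol still needs one bit to be written at all.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, tie-breaker, symbols in this subtree).
    heap = [(n, i, [s]) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, group1 = heapq.heappop(heap)
        n2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (n1 + n2, tie, group1 + group2))
        tie += 1
    return lengths


if __name__ == "__main__":
    print(huffman_code_lengths("aaaabbc"))
```

<p>Frequent symbols end up with short codes and rare symbols with long ones,
which is the sense in which the model represents the data in as few bits as
possible.</p>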
 192 \r
  193 <p><em>Bitpacking</em> arranges the variable-sized words of the back-end\r
  194 coding into a vector of octets without wasting space. The octets\r
  195 produced by coding a single short-time audio segment form one raw Vorbis\r
  196 packet.</p>\r
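<p>A minimal sketch of the idea, assuming words are written least significant
bit first and the final octet is zero-padded. The bitpacker document is the
authority on the actual bit order; <code>pack_words</code> and
<code>unpack_words</code> are hypothetical names used only here.</p>

```python
def pack_words(words):
    """Pack (value, bit_length) pairs into octets, least significant bit first."""
    out = bytearray()
    acc = 0    # bits waiting to be written
    nbits = 0  # number of valid bits held in acc
    for value, length in words:
        if value < 0 or value >= (1 << length):
            raise ValueError("value does not fit in the given bit length")
        acc |= value << nbits
        nbits += length
        while nbits >= 8:  # flush every completed octet
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial octet, zero-padded
    return bytes(out)


def unpack_words(data, lengths):
    """Inverse of pack_words, given the bit length of each word in order."""
    acc = int.from_bytes(data, "little")
    values = []
    for length in lengths:
        values.append(acc & ((1 << length) - 1))
        acc >>= length
    return values


if __name__ == "__main__":
    words = [(5, 3), (1, 1), (300, 9), (0, 2), (77, 7)]
    packet = pack_words(words)
    print(len(packet), unpack_words(packet, [n for _, n in words]))
```

<p>The five words above total 22 bits and occupy three octets, where storing
each word in its own whole number of octets would take six.</p>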
 197 \r
 198 <ul>\r
 199 <li><a href="probmodel.html">The Vorbis probability model</a></li>\r
 200 <li><a href="bitpack.html">The Vorbis bitpacker</a>: Arrangement of \r
 201 variable bit-length words into an octet-aligned packet.</li>\r
 202 </ul>\r
 203 \r
 204 <h3>Vorbis streaming and stream decomposition</h3>\r
 205 \r
 206 <p>Vorbis packets contain the raw, bitwise-compressed representation of a\r
  207 snippet of audio. These packets carry no framing structure and cannot be\r
 208 strung together directly into a stream; for streamed transmission and\r
 209 storage, Vorbis packets are encoded into an Ogg bitstream.</p>\r
 210 \r
 211 <ul>\r
 212 <li><a href="oggstream.html">Ogg bitstream overview</a>: High-level\r
 213 description of Ogg logical bitstreams, how logical bitstreams\r
 214 (of mixed media types) can be combined into physical bitstreams, and\r
  215 restrictions on logical-to-physical mapping. Note that this document is\r
  216 not specific to Ogg Vorbis.</li>\r
 217 <li><a href="framing.html">Ogg logical bitstream and framing\r
 218 spec</a>: Low level, complete specification of Ogg logical\r
  219 bitstream pages. Note that this document is not specific to Ogg\r
  220 Vorbis.</li>\r
 221 <li><a href="vorbis-stream.html">Vorbis bitstream mapping</a>:\r
 222 Specifically describes mapping Vorbis data into an\r
 223 Ogg physical bitstream.</li>\r
 224 </ul>\r
 225 \r
 226 <div id="copyright">\r
 227   The Xiph Fish Logo is a\r
 228   trademark (&trade;) of Xiph.Org.<br/>\r
 229 \r
 230   These pages &copy; 1994 - 2005 Xiph.Org. All rights reserved.\r
 231 </div>\r
 232 \r
 233 </body>\r
 234 </html>\r