--- /dev/null
+<?xml version="1.0" ?>
+<appendix id="vorbis-over-ogg">
+<appendixinfo>
+ <releaseinfo>
+ <emphasis>Last update to this document: July 14, 2002</emphasis>
+ $Id: a1-encapsulation_ogg.xml,v 1.1 2002/10/16 22:46:44 giles Exp $
+ </releaseinfo>
+</appendixinfo>
+<title>Embedding Vorbis into an Ogg stream</title>
+
+<section>
+<title>Overview</title>
+
+<para>
+This document describes using Ogg logical and physical transport
+streams to encapsulate Vorbis compressed audio packet data into file
+form.</para>
+
+<para>
+The <xref linkend="vorbis-spec-intro"/> provides an overview of the construction
+of Vorbis audio packets.</para>
+
+<para>
+The <ulink url="oggstream.html">Ogg
+bitstream overview</ulink> and <ulink url="framing.html">Ogg logical
+bitstream and framing spec</ulink> provide detailed descriptions of Ogg
+transport streams. This specification document assumes a working
+knowledge of the concepts covered in these named backround
+documents. Please read them first.</para>
+
+<section><title>Restrictions</title>
+
+<para>
+The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
+streams use Ogg transport streams in degenerate, unmultiplexed
+form only. That is:
+
+<itemizedlist>
+ <listitem><simpara>
+ A meta-headerless Ogg file encapsulates the Vorbis I packets
+ </simpara></listitem>
+ <listitem><simpara>
+ The Ogg stream may be chained, i.e. contain multiple, contigous logical streams (links).
+ </simpara></listitem>
+ <listitem><simpara>
+ The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link)
+ </simpara></listitem>
+</itemizedlist>
+</para>
+
+<para>
+This is not to say that it is not currently possible to multiplex
+Vorbis with other media types into a multi-stream Ogg file. At the
+time this document was written, Ogg was becoming a popular container
+for low-bitrate movies consisting of DiVX video and Vorbis audio.
+However, a 'Vorbis I audio file' is taken to imply Vorbis audio
+existing alone within a degenerate Ogg stream. A compliant 'Vorbis
+audio player' is not required to implement Ogg support beyond the
+specific support of Vorbis within a degenrate ogg stream (naturally,
+application authors are encouraged to support full multiplexed Ogg
+handling).
+</para>
+
+</section>
+
+<section><title>MIME type</title>
+
+<para>
+The correct MIME type of any Ogg file is <literal>application/x-ogg</literal>.
+However, if a file is a Vorbis I audio file (which implies a
+degenerate Ogg stream including only unmultiplexed Vorbis audio), the
+mime type <literal>audio/x-vorbis</literal> is also allowed.</para>
+
+</section>
+
+</section>
+
+<section>
+<title>Encapsulation</title>
+
+<para>
+Ogg encapsulation of a Vorbis packet stream is straightforward.</para>
+
+<itemizedlist>
+
+<listitem><simpara>
+ The first Vorbis packet (the indentification header), which
+ uniquely identifies a stream as Vorbis audio, is placed alone in the
+ first page of the logical Ogg stream. This results in a first Ogg
+ page of exactly 58 bytes at the very beginning of the logical stream.
+</simpara></listitem>
+
+<listitem><simpara>
+ This first page is marked 'beginning of stream' in the page flags.
+</simpara></listitem>
+
+<listitem><simpara>
+ The second and third vorbis packets (comment and setup
+ headers) may span one or more pages beginning on the second page of
+ the logical stream. However many pages they span, the third header
+ packet finishes the page on which it ends. The next (first audio) packet
+ must begin on a fresh page.
+</simpara></listitem>
+
+<listitem><simpara>
+ The granule position of these first pages containing only headers is zero.
+</simpara></listitem>
+
+<listitem><simpara>
+ The first audio packet of the logical stream begins a fresh Ogg page.
+</simpara></listitem>
+
+<listitem><simpara>
+ Packets are placed into ogg pages in order until the end of stream.
+</simpara></listitem>
+
+<listitem><simpara>
+ The last page is marked 'end of stream' in the page flags.
+</simpara></listitem>
+
+<listitem><simpara>
+ Vorbis packets may span page boundaries.
+</simpara></listitem>
+
+<listitem><simpara>
+ The granule position of pages containing Vorbis audio is in units
+ of PCM audio samples (per channel; a stereo stream's granule position
+ does not increment at twice the speed of a mono stream).
+</simpara></listitem>
+
+<listitem><simpara>
+ The granule position of a page represents the end PCM sample
+ position of the last packet <emphasis>completed</emphasis> on that page.
+ A page that is entirely spanned by a single packet (that completes on a
+ subsequent page) has no granule position, and the granule position is
+ set to '-1'.
+</simpara></listitem>
+
+<listitem>
+ <simpara>
+ The granule (PCM) position of the first page need not indicate
+ that the stream started at position zero. Although the granule
+ position belongs to the last completed packet on the page and a
+ valid granule position must be positive, by
+ inference it may indicate that the PCM position of the beginning
+ of audio is positive or negative.
+ </simpara>
+
+ <itemizedlist>
+ <listitem><simpara>
+ A positive starting value simply indicates that this stream begins at
+ some positive time offset, potentially within a larger
+ program. This is a common case when connecting to the middle
+ of broadcast stream.
+ </simpara></listitem>
+ <listitem><simpara>
+ A negative value indicates that
+ output samples preceeding time zero should be discarded during
+ decoding; this technique is used to allow sample-granularity
+ editing of the stream start time of already-encoded Vorbis
+ streams. The number of samples to be discarded must not exceed
+ the overlap-add span of the first two audio packets.
+ </simpara></listitem>
+ </itemizedlist>
+
+ <simpara>
+ In both of these cases in which the initial audio PCM starting
+ offset is nonzero, the second finished audio packet must flush the
+ page on which it appears and the third packet begin a fresh page.
+ This allows the decoder to always be able to perform PCM position
+ adjustments before needing to return any PCM data from synthesis,
+ resulting in correct positioning information without any aditional
+ seeking logic.
+ </simpara>
+
+ <note><simpara>
+ Failure to do so should, at worst, cause a
+ decoder implementation to return incorrect positioning information
+ for seeking operations at the very beginning of the stream.
+ </simpara></note>
+</listitem>
+
+<listitem><simpara>
+ A granule position on the final page in a stream that indicates
+ less audio data than the final packet would normally return is used to
+ end the stream on other than even frame boundaries. The difference
+ between the actual available data returned and the declared amount
+ indicates how many trailing samples to discard from the decoding
+ process.
+ </simpara></listitem>
+</itemizedlist>
+
+</section>
+
+</appendix>
+
+<!-- end appendix on Vorbis encapsulation in Ogg -->