1 <HTML><HEAD><TITLE>xiph.org: Ogg Vorbis documentation</TITLE>
2 <BODY bgcolor="#ffffff" text="#202020" link="#006666" vlink="#000000">
3 <nobr><img src="white-ogg.png"><img src="vorbisword2.png"></nobr><p>
6 <h1><font color=#000070>
7 Programming with Xiphophorus <tt>libvorbis</tt>
10 <em>Last update to this document: July 22, 1999</em><br>
Libvorbis is Xiphophorus's portable Ogg Vorbis CODEC, implemented as a
programmatic library. Libvorbis provides primitives to handle framing
and manipulation of Ogg bitstreams (used by Vorbis for
streaming), a full analysis (encoding) interface, and packet
decoding and synthesis for playback. <p>
The libvorbis library does not provide any system interface; a
full-featured demonstration player included with the library
distribution provides example code for a variety of system interfaces
as well as a working example of using libvorbis in production code.
25 <h2>Encoding Overview</h2>
29 <h2>Decoding Overview</h2>
31 Decoding a bitstream with libvorbis follows roughly the following
35 <li>Frame the incoming bitstream into pages
<li>Sort the pages by logical bitstream and buffer them into logical streams
37 <li>Decompose the logical streams into raw packets
38 <li>Reconstruct segments of the original data from each packet
39 <li>Glue the reconstructed segments back into a decoded stream
44 An Ogg bitstream is logically arranged into pages, but to decode
45 the pages, we have to find them first. The raw bitstream is first fed
46 into an <tt>ogg_sync_state</tt> buffer using <tt>ogg_sync_buffer()</tt>
47 and <tt>ogg_sync_wrote()</tt>. After each block we submit to the sync
48 buffer, we should check to see if we can frame and extract a complete
49 page or pages using <tt>ogg_sync_pageout()</tt>. Extra pages are
50 buffered; allowing them to build up in the <tt>ogg_sync_state</tt>
51 buffer will eventually exhaust memory.<p>
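As a rough illustration of the framing step, the sketch below scans a
raw buffer for the start of a page. It is not the library's
implementation: <tt>find_capture</tt> is a hypothetical helper, and the
four-byte <tt>OggS</tt> capture pattern is an assumption taken from the
published Ogg framing specification. The real
<tt>ogg_sync_pageout()</tt> also verifies a page before returning it.<p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libvorbis: return the offset of the
   first "OggS" capture pattern in buf, or -1 if none is present. */
long find_capture(const unsigned char *buf, size_t len)
{
    size_t i;

    /* Stop once fewer than four bytes remain to compare. */
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "OggS", 4) == 0)
            return (long)i;
    }
    return -1;
}
```

A negative result here corresponds to "no page start seen yet"; a
caller would keep feeding data and scan again.<p>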
53 The Ogg pages returned from <tt>ogg_sync_pageout</tt> need not be
54 decoded further to be used as landmarks in seeking; seeking can be
55 either a rough process of simply jumping to approximately intuited
56 portions of the bitstream, or it can be a precise bisection process
57 that captures pages and inspects data position. When seeking,
58 however, sequential multiplexing (chaining) must be accounted for;
59 beginning play in a new logical bitstream requires initializing a
60 synthesis engine with the headers from that bitstream. Vorbis
bitstreams do not make use of concurrent multiplexing (grouping).<p>
65 The pages produced by <tt>ogg_sync_pageout</tt> are then sorted by
serial number to separate logical bitstreams. Initialize logical
bitstream buffers (<tt>ogg_stream_state</tt>) using
68 <tt>ogg_stream_init()</tt>. Pages are submitted to the matching
69 logical bitstream buffer using <tt>ogg_stream_pagein</tt>; the serial
70 number of the page and the stream buffer must match, or the page will
71 be rejected. A page submitted out of sequence will simply be noted,
72 and in the course of outputting packets, the hole will be flagged
73 (<tt>ogg_sync_pageout</tt> and <tt>ogg_stream_packetout</tt> will
74 return a negative value at positions where they had to recapture the
77 <h3>Extracting packets</h3>
After submitting one or more pages to a logical stream, read available packets
80 using <tt>ogg_stream_packetout</tt>.
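<p>
Before packets can be read, each page has to reach the stream buffer
whose serial number matches its own. The sketch below models only that
routing decision. <tt>toy_stream</tt> and <tt>route_page</tt> are
hypothetical stand-ins used for illustration; they are not libvorbis
types or calls.<p>

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-stream decoder slot. */
typedef struct {
    int serialno;   /* serial number this slot accepts */
    int pages_in;   /* pages routed here so far */
} toy_stream;

/* Route a page to the slot with a matching serial number.
   Returns the slot index, or -1 if no slot matches (page rejected). */
int route_page(toy_stream *streams, size_t nstreams, int page_serialno)
{
    size_t i;

    for (i = 0; i < nstreams; i++) {
        if (streams[i].serialno == page_serialno) {
            streams[i].pages_in++;
            return (int)i;
        }
    }
    return -1;
}
```

In a real decoder the matching slot would then be handed the page and
polled for packets until none remain.<p>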
82 <h3>Decoding packets</h3>
84 <h3>Reassembling data segments</h3>
<h2>Ogg Bitstream Manipulation Structures</h2>
89 Two of the Ogg bitstream data structures are intended to be
90 transparent to the developer; the fields should be used directly.<p>
96 unsigned char *packet;
107 <dt>packet: <dd>a pointer to the byte data of the raw packet
<dt>bytes: <dd>the size of the packet's raw data
109 <dt>b_o_s: <dd>beginning of stream; nonzero if this is the first packet of
110 the logical bitstream
111 <dt>e_o_s: <dd>end of stream; nonzero if this is the last packet of the
113 <dt>frameno: <dd>the absolute position of this packet in the original
114 uncompressed data stream.
117 <h4>encoding notes</h4> The encoder is responsible for setting all of
118 the fields of the packet to appropriate values before submission to
119 <tt>ogg_stream_packetin()</tt>; however, it is noted that the value in
120 <tt>b_o_s</tt> is ignored; the first page produced from a given
121 <tt>ogg_stream_state</tt> structure will be stamped as the initial
122 page. <tt>e_o_s</tt>, however, must be set; this is the means by
123 which the stream encoding primitives handle end of stream and cleanup.
125 <h4>decoding notes</h4><tt>ogg_stream_packetout()</tt> sets the fields
to appropriate values. Note that <tt>frameno</tt> will be &gt;= 0 only in the
case that the given packet actually represents that position (i.e., only
128 the last packet completed on any page will have a meaningful
129 <tt>frameno</tt>). Intervening frames will see <tt>frameno</tt> set
136 unsigned char *header;
144 <dt>header: <dd>pointer to the page header data
145 <dt>header_len: <dd>length of the page header in bytes
146 <dt>body: <dd>pointer to the page body
147 <dt>body_len: <dd>length of the page body
150 Note that although the <tt>header</tt> and <tt>body</tt> pointers do
151 not necessarily point into a single contiguous page vector, the page
152 body must immediately follow the header in the bitstream.<p>
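Because the two pointers may refer to separate storage, code that
emits a page writes the header first and the body second. The sketch
below shows that ordering against a memory buffer. <tt>toy_page</tt>
is a hypothetical mirror of the four fields listed above and
<tt>toy_page_flatten</tt> is an illustrative helper, not a library
call.<p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the four page fields described above. */
typedef struct {
    unsigned char *header;
    long header_len;
    unsigned char *body;
    long body_len;
} toy_page;

/* Copy the header, then the body, into out.
   Returns the bytes written, or 0 if out is too small. */
size_t toy_page_flatten(const toy_page *pg, unsigned char *out, size_t cap)
{
    size_t hl = (size_t)pg->header_len;
    size_t bl = (size_t)pg->body_len;

    if (hl + bl > cap)
        return 0;
    memcpy(out, pg->header, hl);      /* header comes first */
    memcpy(out + hl, pg->body, bl);   /* body follows immediately */
    return hl + bl;
}
```

Writing to a file or socket follows the same pattern: two writes, in
header-then-body order, with nothing in between.<p>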
<h2>Ogg Bitstream Manipulation Functions</h2>
157 int ogg_page_bos(ogg_page *og);
160 Returns the 'beginning of stream' flag for the given Ogg page. The
161 beginning of stream flag is set on the initial page of a logical
164 Zero indicates the flag is cleared (this is not the initial page of a
165 logical bitstream). Nonzero indicates the flag is set (this is the
166 initial page of a logical bitstream).<p>
169 int ogg_page_continued(ogg_page *og);
172 Returns the 'packet continued' flag for the given Ogg page. The packet
173 continued flag indicates whether or not the body data of this page
begins with a packet continued from a preceding page.<p>
Zero (unset) indicates that the body data begins with a new packet.
Nonzero (set) indicates that the first packet data on the page is a
continuation from the preceding page.
180 int ogg_page_eos(ogg_page *og);
Returns the 'end of stream' flag for a given Ogg page. The end of stream
flag is set on the last (terminal) page of a logical bitstream.<p>
186 Zero (unset) indicates that this is not the last page of a logical
187 bitstream. Nonzero (set) indicates that this is the last page of a
logical bitstream and that no additional pages belonging to this
189 bitstream may follow.<p>
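The three flag accessors above can be pictured as bit tests on a single
header byte. The sketch below is an assumption-laden illustration: the
bit values and the byte offset are taken from the published Ogg framing
specification, and the <tt>toy_</tt> names are hypothetical. Callers
should use the library accessors rather than reading the header
directly.<p>

```c
/* Assumed flag bits in header byte 5, per the published Ogg framing
   specification; the library accessors hide these details. */
#define TOY_FLAG_CONTINUED 0x01
#define TOY_FLAG_BOS       0x02
#define TOY_FLAG_EOS       0x04

/* Each function returns nonzero when its flag is set, zero otherwise. */
int toy_page_continued(const unsigned char *header)
{
    return header[5] & TOY_FLAG_CONTINUED;
}

int toy_page_bos(const unsigned char *header)
{
    return header[5] & TOY_FLAG_BOS;
}

int toy_page_eos(const unsigned char *header)
{
    return header[5] & TOY_FLAG_EOS;
}
```

As with the library calls, only zero versus nonzero is meaningful in
the return value.<p>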
192 size64 ogg_page_frameno(ogg_page *og);
195 Returns the position of this page as an absolute position within the
196 original uncompressed data. The position, as returned, is 'frames
197 encoded to date up to and including the last whole packet on this
198 page'. Partial packets begun on this page but continued to the
199 following page are not included. If no packet ends on this page, the
200 frame position value will be equal to the frame position value of the
preceding page. If none of the original uncompressed data is yet
202 represented in the logical bitstream (for example, the first page of a
203 bitstream consists only of a header packet; this packet encodes only
204 metadata), the value shall be zero.<p>
The units of the frame number are determined by the media mapping. A
Vorbis audio bitstream, for example, defines one frame to be the
channel values from a single sampling period (e.g., one frame of a 16-bit
stereo bitstream consists of two samples of two bytes each, for a total of
four bytes, so a frame would be four bytes). A video stream defines one
frame to be a single frame of video.<p>
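The arithmetic for the audio case is small enough to write out. The
helpers below are hypothetical and assume uncompressed PCM with a whole
number of bytes per sample; the sampling rate is supplied by the
caller.<p>

```c
/* Bytes occupied by one uncompressed audio frame: one sample per channel. */
long frame_bytes(int channels, int bits_per_sample)
{
    return (long)channels * (bits_per_sample / 8);
}

/* Playback time in seconds for a frame position at a given sampling rate. */
double frames_to_seconds(long frameno, long rate)
{
    return (double)frameno / (double)rate;
}
```

For the 16-bit stereo case described above, <tt>frame_bytes(2, 16)</tt>
gives four bytes per frame.<p>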
214 int ogg_page_pageno(ogg_page *og);
217 Returns the sequential page number of the given Ogg page. The first
page in a logical bitstream is numbered zero; following pages are
numbered in monotonically increasing order.<p>
222 int ogg_page_serialno(ogg_page *og);
225 Returns the serial number of the given Ogg page. The serial number is
226 used as a handle to distinguish various logical bitstreams in a
physical Ogg bitstream. Every logical bitstream within a
228 physical bitstream must use a unique (within the scope of the physical
229 bitstream) serial number, which is stamped on all bitstream pages.<p>
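For readers curious where the page number and serial number live, the
sketch below reads both from a raw page header. The byte offsets (14
for the serial number, 18 for the page sequence number) and the
little-endian encoding are assumptions taken from the published Ogg
framing specification, and the <tt>toy_</tt> functions are
hypothetical; the library accessors remain the supported interface.<p>

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from four consecutive bytes. */
static uint32_t toy_read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumed offset 14: stream serial number. */
uint32_t toy_page_serialno(const unsigned char *header)
{
    return toy_read_le32(header + 14);
}

/* Assumed offset 18: sequential page number. */
uint32_t toy_page_pageno(const unsigned char *header)
{
    return toy_read_le32(header + 18);
}
```

A demultiplexer would compare the serial number against each open
stream and use the page number to notice missing pages.<p>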
232 int ogg_page_version(ogg_page *og);
235 Returns the revision of the Ogg bitstream structure of the given page.
236 Currently, the only permitted number is zero. Later revisions of the
237 bitstream spec will increment this version should any changes be
241 int ogg_stream_clear(ogg_stream_state *os);
244 Clears and deallocates the internal storage of the given Ogg stream.
245 After clearing, the stream structure is not initialized for use;
246 <tt>ogg_stream_init</tt> must be called to reinitialize for use.
247 Use <tt>ogg_stream_reset</tt> to reset the stream state
to a fresh, initialized state.<p>
250 <tt>ogg_stream_clear</tt> does not call <tt>free()</tt> on the pointer
251 <tt>os</tt>, allowing use of this call on stream structures in static
or automatic storage. <tt>ogg_stream_destroy</tt> is a complementary
253 function that frees the pointer as well.<p>
255 Returns zero on success and non-zero on failure. This function always
259 int ogg_stream_destroy(ogg_stream_state *os);
262 Clears and deallocates the internal storage of the given Ogg stream,
263 then frees the storage associated with the pointer <tt>os</tt>.<p>
265 <tt>ogg_stream_clear</tt> does not call <tt>free()</tt> on the pointer
266 <tt>os</tt>, allowing use of that call on stream structures in static
267 or automatic storage.<p>
269 Returns zero on success and non-zero on failure. This function always
273 int ogg_stream_init(ogg_stream_state *os,int serialno);
276 Initialize the storage associated with <tt>os</tt> for use as an Ogg
277 stream. This call is used to initialize a stream for both encode and
278 decode. The given serial number is the serial number that will be
279 stamped on pages of the produced bitstream (during encode), or used as
280 a check that pages match (during decode).<p>
282 Returns zero on success, nonzero on failure.<p>
285 int ogg_stream_packetin(ogg_stream_state *os, ogg_packet *op);
288 Used during encoding to add the given raw packet to the given Ogg
289 bitstream. The contents of <tt>op</tt> are copied;
290 <tt>ogg_stream_packetin</tt> does not retain any pointers into
<tt>op</tt>'s storage. The encoding process buffers incoming packets
292 until enough packets have been assembled to form an entire page;
293 <tt>ogg_stream_pageout</tt> is used to read complete pages.<p>
295 Returns zero on success, nonzero on failure.<p>
298 int ogg_stream_packetout(ogg_stream_state *os,ogg_packet *op);
301 Used during decoding to read raw packets from the given logical
302 bitstream. <tt>ogg_stream_packetout</tt> will only return complete
303 packets for which checksumming indicates no corruption. The size and
304 contents of the packet exactly match those given in the encoding
307 Returns zero if the next packet is not ready to be read (not buffered
308 or incomplete), positive if it returned a complete packet in
309 <tt>op</tt> and negative if there is a gap, extra bytes or corruption
310 at this position in the bitstream (essentially that the bitstream had
311 to be recaptured). A negative value is not necessarily an error. It
would be a common occurrence when seeking, for example, which requires
313 recapture of the bitstream at the position decoding continued.<p>
315 Iff the return value is positive, <tt>ogg_stream_packetout</tt> placed
a packet in <tt>op</tt>. The data in <tt>op</tt> points to static
317 storage that is valid until the next call to
318 <tt>ogg_stream_pagein</tt>, <tt>ogg_stream_clear</tt>,
319 <tt>ogg_stream_reset</tt>, or <tt>ogg_stream_destroy</tt>. The
320 pointers are not invalidated by more calls to
321 <tt>ogg_stream_packetout</tt>.<p>
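The three-way return value lends itself to a small dispatch. The
sketch below is self-contained and does not call the library:
<tt>classify_packetout</tt> and <tt>drain</tt> are hypothetical names,
and the integer array stands in for the values successive calls might
return.<p>

```c
/* Hypothetical labels for the three outcomes described above. */
typedef enum { PKT_NEED_MORE, PKT_READY, PKT_HOLE } pkt_status;

pkt_status classify_packetout(int ret)
{
    if (ret > 0)
        return PKT_READY;      /* a complete packet was returned */
    if (ret < 0)
        return PKT_HOLE;       /* gap or recapture; not necessarily fatal */
    return PKT_NEED_MORE;      /* nothing ready; submit another page */
}

/* Walk a sequence of return codes the way a decode loop would: keep
   going past holes, stop at the first zero. Returns packets seen. */
int drain(const int *rets, int n, int *holes)
{
    int i, packets = 0;

    *holes = 0;
    for (i = 0; i < n; i++) {
        pkt_status st = classify_packetout(rets[i]);
        if (st == PKT_NEED_MORE)
            break;
        if (st == PKT_HOLE)
            (*holes)++;
        else
            packets++;
    }
    return packets;
}
```

The point of the loop shape is that a negative value is counted and
skipped rather than treated as a reason to stop decoding.<p>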
324 int ogg_stream_pagein(ogg_stream_state *os, ogg_page *og);
327 Used during decoding to buffer the given complete, pre-verified page
328 for decoding into raw Ogg packets. The given page must be framed,
329 normally produced by <tt>ogg_sync_pageout</tt>, and from the logical
330 bitstream associated with <tt>os</tt> (the serial numbers must match).
331 The contents of the given page are copied; <tt>ogg_stream_pagein</tt>
332 retains no pointers into <tt>og</tt> storage.<p>
334 Returns zero on success and non-zero on failure.<p>
337 int ogg_stream_pageout(ogg_stream_state *os, ogg_page *og);
340 Used during encode to read complete pages from the stream buffer. The
341 returned page is ready for sending out to the real world.<p>
343 Returns zero if there is no complete page ready for reading. Returns
344 nonzero when it has placed data for a complete page into
345 <tt>og</tt>. Note that the storage returned in og points into internal
346 storage; the pointers in <tt>og</tt> are valid until the next call to
347 <tt>ogg_stream_pageout</tt>, <tt>ogg_stream_packetin</tt>,
348 <tt>ogg_stream_reset</tt>, <tt>ogg_stream_clear</tt> or
349 <tt>ogg_stream_destroy</tt>.
352 int ogg_stream_reset(ogg_stream_state *os);
355 Resets the given stream's state to that of a blank, unused stream;
356 this may be used during encode or decode. <p>
358 Note that if used during encode, it does not alter the stream's serial
359 number. In addition, the next page produced during encoding will be
360 marked as the 'initial' page of the logical bitstream.<p>
362 When used during decode, this simply clears the data buffer of any
363 pending pages. Beginning and end of stream cues are read from the
364 bitstream and are unaffected by reset.<p>
366 Returns zero on success and non-zero on failure. This function always
370 char *ogg_sync_buffer(ogg_sync_state *oy, long size);
373 This call is used to buffer a raw bitstream for framing and
374 verification. <tt>ogg_sync_buffer</tt> handles stream capture and
375 recapture, checksumming, and division into Ogg pages (as required by
376 <tt>ogg_stream_pagein</tt>).<p>
378 <tt>ogg_sync_buffer</tt> exposes a buffer area into which the decoder
379 copies the next (up to) <tt>size</tt> bytes. We expose the buffer
(rather than taking a buffer) in order to avoid an extra copy in many
381 uses; this way, for example, <tt>read()</tt> can transfer data
382 directly into the stream buffer without first needing to place it in
383 temporary storage.<p>
385 Returns a pointer into <tt>oy</tt>'s internal bitstream sync buffer;
386 the remaining space in the sync buffer is at least <tt>size</tt>
387 bytes. The decoder need not write all of <tt>size</tt> bytes;
388 <tt>ogg_sync_wrote</tt> is used to inform the engine how many bytes
389 were actually written. Use of <tt>ogg_sync_wrote</tt> after writing
into the exposed buffer is mandatory.<p>
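The expose-then-commit protocol can be modelled with a toy buffer. The
sketch below is not the library's implementation; <tt>toy_sync</tt>,
<tt>toy_sync_buffer</tt> and <tt>toy_sync_wrote</tt> are hypothetical
names that mimic only the calling pattern: ask for space, write some or
all of it, then report how much was written.<p>

```c
#include <stdlib.h>

/* Hypothetical stand-in for a sync buffer. */
typedef struct {
    unsigned char *data;
    long storage;   /* bytes allocated */
    long fill;      /* bytes committed so far */
} toy_sync;

/* Expose at least `size` writable bytes after the committed data.
   Returns NULL if allocation fails. */
char *toy_sync_buffer(toy_sync *s, long size)
{
    if (s->fill + size > s->storage) {
        long want = s->fill + size;
        unsigned char *p = realloc(s->data, (size_t)want);
        if (p == NULL)
            return NULL;
        s->data = p;
        s->storage = want;
    }
    return (char *)s->data + s->fill;
}

/* Commit `bytes` of the exposed area. Returns 0 on success, -1 if the
   caller claims more than was exposed. */
int toy_sync_wrote(toy_sync *s, long bytes)
{
    if (bytes < 0 || s->fill + bytes > s->storage)
        return -1;
    s->fill += bytes;
    return 0;
}
```

A typical caller would pass the exposed pointer straight to its read
routine and commit the byte count that routine returns, which may be
smaller than the amount requested.<p>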
393 int ogg_sync_clear(ogg_sync_state *oy);
398 Clears and deallocates the internal storage of the given Ogg sync
399 buffer. After clearing, the sync structure is not initialized for
400 use; <tt>ogg_sync_init</tt> must be called to reinitialize for use.
401 Use <tt>ogg_sync_reset</tt> to reset the sync state and buffer to a
fresh, initialized state.<p>
404 <tt>ogg_sync_clear</tt> does not call <tt>free()</tt> on the pointer
405 <tt>oy</tt>, allowing use of this call on sync structures in static
or automatic storage. <tt>ogg_sync_destroy</tt> is a complementary
407 function that frees the pointer as well.<p>
409 Returns zero on success and non-zero on failure. This function always
413 int ogg_sync_destroy(ogg_sync_state *oy);
416 Clears and deallocates the internal storage of the given Ogg sync
417 buffer, then frees the storage associated with the pointer
420 <tt>ogg_sync_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>oy</tt>, allowing use of that call on sync structures in static
422 or automatic storage.<p>
424 Returns zero on success and non-zero on failure. This function always
428 int ogg_sync_init(ogg_sync_state *oy);
431 Initializes the sync buffer <tt>oy</tt> for use.<p>
432 Returns zero on success and non-zero on failure. This function always
436 int ogg_sync_pageout(ogg_sync_state *oy, ogg_page *og);
439 Reads complete, framed, verified Ogg pages from the sync buffer,
440 placing the page data in <tt>og</tt>.<p>
Returns zero when there are no complete pages buffered for
443 retrieval. Returns negative when a loss of sync or recapture occurred
444 (this is not necessarily an error; recapture would be required after
445 seeking, for example). Returns positive when a page is returned in
446 <tt>og</tt>. Note that the data in <tt>og</tt> points into the sync
447 buffer storage; the pointers are valid until the next call to
448 <tt>ogg_sync_buffer</tt>, <tt>ogg_sync_clear</tt>,
449 <tt>ogg_sync_destroy</tt> or <tt>ogg_sync_reset</tt>.
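<p>
The zero, negative and positive outcomes can be illustrated with a
simplified framing check. The sketch below is a toy, not the library's
algorithm: <tt>toy_page_length</tt> is a hypothetical helper, the
27-byte fixed header and the segment table layout are assumptions taken
from the published Ogg framing specification, and no checksum is
verified.<p>

```c
#include <stddef.h>
#include <string.h>

/* Returns the total page length if a whole page is present at buf,
   0 if more data is needed, -1 if buf does not start with the capture
   pattern. No checksum verification is attempted. */
long toy_page_length(const unsigned char *buf, size_t len)
{
    size_t i, nsegs, total;

    if (len < 4)
        return 0;
    if (memcmp(buf, "OggS", 4) != 0)
        return -1;                    /* out of sync at this position */
    if (len < 27)
        return 0;                     /* fixed header incomplete */
    nsegs = buf[26];
    if (len < 27 + nsegs)
        return 0;                     /* segment table incomplete */

    total = 27 + nsegs;
    for (i = 0; i < nsegs; i++)
        total += buf[27 + i];         /* body length is the sum of the table */
    return len >= total ? (long)total : 0;
}
```

A caller seeing zero would buffer more input, while a negative result
would prompt a search for the next capture pattern.<p>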
453 int ogg_sync_reset(ogg_sync_state *oy);
456 <tt>ogg_sync_reset</tt> resets the sync state in <tt>oy</tt> to a
457 clean, empty state. This is useful, for example, when seeking to a
458 new location in a bitstream.<p>
460 Returns zero on success, nonzero on failure.<p>
463 int ogg_sync_wrote(ogg_sync_state *oy, long bytes);
466 Used to inform the sync state as to how many bytes were actually
467 written into the exposed sync buffer. It must be equal to or less
468 than the size of the buffer requested.<p>
470 Returns zero on success and non-zero on failure; failure occurs only
when the number of bytes written was larger than the buffer.<p>
474 <a href="http://www.xiph.org/">
475 <img src="white-xifish.png" align=left border=0>
477 <font size=-2 color=#505050>
479 Ogg is a <a href="http://www.xiph.org">Xiphophorus</a> effort to
480 protect essential tenets of Internet multimedia from corporate
481 hostage-taking; Open Source is the net's greatest tool to keep
482 everyone honest. See <a href="http://www.xiph.org/about.html">About
483 Xiphophorus</a> for details.
486 Ogg Vorbis is the first Ogg audio CODEC. Anyone may
487 freely use and distribute the Ogg and Vorbis specification,
488 whether in a private, public or corporate capacity. However,
489 Xiphophorus and the Ogg project (xiph.org) reserve the right to set
490 the Ogg/Vorbis specification and certify specification compliance.<p>
492 Xiphophorus's Vorbis software CODEC implementation is distributed
493 under the Lesser/Library GNU Public License. This does not restrict
494 third parties from distributing independent implementations of Vorbis
495 software under other licenses.<p>
497 OggSquish, Vorbis, Xiphophorus and their logos are trademarks (tm) of
498 <a href="http://www.xiph.org/">Xiphophorus</a>. These pages are
499 copyright (C) 1994-2000 Xiphophorus. All rights reserved.<p>