--- /dev/null
+Network Working Group Jack Moffitt
+Internet-Draft Xiph.org Foundation
+Expire in six months February 2001
+
+
+ RTP Payload Format for Vorbis Encoded Audio
+
+ <draft-moffitt-vorbis-rtp-00.txt>
+
+Status of this Memo
+
+ This document is an Internet-Draft and is in full conformance
+ with all provisions of Section 10 of RFC2026.
+
+ Internet-Drafts are working documents of the Internet Engineering
+ Task Force (IETF), its areas, and its working groups. Note that
+ other groups may also distribute working documents as
+ Internet-Drafts.
+
+ Internet-Drafts are draft documents valid for a maximum of six
+ months and may be updated, replaced, or obsoleted by other
+ documents at any time. It is inappropriate to use Internet-
+ Drafts as reference material or to cite them other than as
+ "work in progress".
+
+ The list of current Internet-Drafts can be accessed at
+ http://www.ietf.org/ietf/1id-abstracts.txt
+
+ The list of Internet-Draft Shadow Directories can be accessed at
+ http://www.ietf.org/shadow.html.
+
+Abstract
+
+ This document describes a RTP payload format for transporting Vorbis
+ encoded audio.
+
+1 Introduction
+
+ This document describes how Vorbis encoded audio may be formatted for
+ use as an RTP payload type.
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [1].
+
+2 Background
+
+ The Xiph.org Foundation creates and defines codecs for use in
+ multimedia that are not encumbered by patents and thus may be freely
+ implemented by any individual or organization.
+
+ Vorbis is the general purpose multi-channel audio codec created by
+ the Xiph.org Foundation.
+
+ Vorbis encoded audio is generally found within an Ogg format
+ bitstream, which provides framing and synchronization. For the
+ purposes of RTP transport, this layer is unnecessary, and so raw
+ Vorbis packets are used in the payload.
+
+ Vorbis packets are unbounded in length currently. At some future
+ point there will likely be a practical limit placed on packet
+ length.
+
+ Typical Vorbis packet sizes are from very small (2-3 bytes) to
+ quite large (8-12 kilobytes). The reference implementation seems to
+ make every packet less than ~800 bytes, except for the codebooks
+ packet which is ~8-12 kilobytes.
+
+3 Payload Format
+
+ The standard RTP header is followed by an 8 bit payload header, and
+ the payload data.
+
+3.1 RTP Header
+
+ The following fields of the RTP header are used for Vorbis payloads:
+
+ Payload Type (PT): 7 bits
+ An RTP profile for a class of applications is expected to assign a
+ payload type for this format, or a dynamically allocated payload
+ type should be chosen which designates the payload as Vorbis.
+
+ Timestamp: 32 bits
+ A timestamp representing the sampling time of the first sample of
+ the first Vorbis packet in the RTP packet. The clock frequency
+ MUST be set to the sample rate of the encoded audio data and is
+ conveyed out-of-band.
+
+ Marker (M): 1 bit
+ Set to one if the payload contains complete packets or if it
+ contains the last fragment of a fragmented packet.
+
+3.2 Payload Header
+
+ The first byte of the payload data is the payload header:
+
+ +---+---+---+---+---+---+---+---+
+ | C | R | R | # of packets |
+ +---+---+---+---+---+---+---+---+
+
+ C: 1 bit
+ Set to one if this is a continuation of a fragmented packet.
+
+ R: 1 bit x 2
+ Reserved, must be set to zero by senders, and ignored by
+ receivers.
+
+ The last 5 bits are the number of complete packets in this payload.
+ If C is set to one, this number should be 0.
+
+3.3 Payload Data
+
+ If the payload contains a single Vorbis packet or a Vorbis packet
+ fragment, the Vorbis packet data follows the payload header.
+
+ For payloads which consist of multiple Vorbis packets, payload data
+ consists of one byte representing the packet length followed by the
+ packet data for each of the Vorbis packets in the payload.
+
+ The Vorbis packet length byte is the length minus one. A value of
+ 0 means a length of 1.
+
+3.4 Example RTP Packet
+
+ Here is an example RTP packet containing two Vorbis packets.
+
+ RTP Packet Header:
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 8 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |V=2|P|X| CC |M| PT | sequence number |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | timestamp (in sample rate units) |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | sychronization source (SSRC) identifier |
+ +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+ | contributing source (CSRC) identifiers |
+ | ... |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Payload Data:
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |0|0|0| # pks: 2| len | vorbis data ... |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | ...vorbis data... |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | ... | len | next vorbis packet data... |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+4 Frame Packetizing
+
+ Each RTP packet contains either one complete Vorbis packet, one
+ Vorbis packet fragment, or an integer number of complete Vorbis
+ packets (a max of 32 packets, since the number of packets is
+ defined by a 5 bit value).
+
+ Any Vorbis packet that is larger than 256 bytes and less than the
+ path-MTU should be placed in a RTP packet by itself.
+
+ Any Vorbis packet that is 256 bytes or less should be bundled in the
+ RTP packet with as many Vorbis packets as will fit, up to a maximum
+ of 32.
+
+ If a packet will not fit into the RTP packet, it must be fragmented.
+ A fragmented packet has a zero in the last five bits of the payload
+ header. Each fragment after the first will also set the Continued
+ (C) bit to one in the payload header. The RTP packet containing the
+ last fragment of the Vorbis packet will have the Marker (M) bit set
+ to one.
+
+5 Open Issues
+
+ To decode a Vorbis stream, a set of codebooks is required. These
+ codebooks are allowed to change for each logical bitstream (for
+ example, for each song encoded in a radio stream).
+
+ The codebooks must be completely intact and a client can not decode
+ a stream with an incomplete or corrupted set.
+
+ A client connecting to a multicast RTP Vorbis session needs to get the
+ first set of codebooks in some manner. These codebooks are typically
+ between 4 kilobytes and 8 kilobytes in size.
+
+ A final solution to how best to deliver the codebooks has not yet been
+ realized. Here are the current proposals:
+
+ - Including the first set of codebooks in the SDP description
+
+ - Broadcasting a codebook only stream as a second multicast Vorbis
+ stream
+
+ - Create some method of requesting the codebooks via RTCP
+
+ - Periodically retransmit the headers inline
+
+6 Security Considerations
+
+ RTP packets using this payload format are subject to the security
+ considerations discussed in the RTP specification [1]. This implies
+ that the confidentiality of the media stream is achieved by using
+ encryption. Becase the data compression used with this payload
+ format is applied end-to-end, encryption may be performed on the
+ compressed data.
+
+7 Acknowledgements
+
+ Thanks to the rest of the Xiph.org team, especially Monty
+ <monty@xiph.org>. Thanks also to Rob Lanphier <robla@real.com> for
+ his guidance.
+
+8 References
+
+ 1. RTP: A Transport Protocol for Real-Time Applications (RFC 1889)
+
+ 2. Xiph.org's Ogg Vorbis pages http://www.xiph.org/ogg/vorbis/
+ Vorbis documentation only currently exists as API documenation,
+ or as source code. The source can be obtained at
+ http://www.xiph.org/ogg/vorbis/download.html
+
+9 Author's Address
+
+ Jack Moffitt
+ Executive Director
+ Xiph.org Foundation
+ email: jack@xiph.org
+ WWW: http://www.xiph.org/