2 <!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>
<rfc ipr="full3667" docName="draft-ietf-avt-vorbis-rtp-01">
<title>RTP Payload Format for Vorbis Encoded Audio</title>
11 <author initials="L" surname="Barbato" fullname="Luca Barbato">
12 <organization>Xiph.Org</organization>
14 <email>lu_zero@gentoo.org</email>
15 <uri>http://www.xiph.org/</uri>
19 <date day="15" month="October" year="2005" />
22 <workgroup>AVT Working Group</workgroup>
23 <keyword>I-D</keyword>
25 <keyword>Internet-Draft</keyword>
26 <keyword>Vorbis</keyword>
27 <keyword>RTP</keyword>
This document describes an RTP payload format for transporting Vorbis encoded audio. It details the RTP encapsulation mechanism for raw Vorbis data and the delivery mechanisms for the decoder probability model, referred to as a codebook, and other setup information.
The document also includes the details necessary for the use of Vorbis with MIME and the Session Description Protocol (SDP).
<note title="Editor's Note">
42 All references to RFC XXXX are to be replaced by references to the RFC number of this memo, when published.
50 <section anchor="Introduction" title="Introduction">
53 Vorbis is a general purpose perceptual audio codec intended to allow
54 maximum encoder flexibility, thus allowing it to scale competitively
55 over an exceptionally wide range of bitrates. At the high
56 quality/bitrate end of the scale (CD or DAT rate stereo, 16/24 bits), it
57 is in the same league as MPEG-2 and MPC.
Similarly, the version 1.1 reference encoder can encode high-quality CD
and DAT rate stereo at below 48 kbit/s without resampling to a lower
rate. Vorbis is also intended for lower and higher sample rates (from
8 kHz telephony to 192 kHz digital masters) and a range of channel
62 representations (monaural, polyphonic, stereo, quadraphonic, 5.1,
63 ambisonic, or up to 255 discrete channels).
67 Vorbis encoded audio is generally encapsulated within an Ogg format bitstream <xref target="rfc3533"></xref>, which provides
68 framing and synchronization. For the purposes of RTP transport, this layer is unnecessary, and so raw Vorbis packets are used
72 <section anchor="Terminology" title="Terminology">
75 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
76 in this document are to be interpreted as described in RFC 2119 <xref target="rfc2119"></xref>.
82 <section anchor="Payload Format" title="Payload Format">
For RTP-based transport of Vorbis encoded audio, the standard RTP header is followed by a 4 octet payload header, then the payload data. The payload header associates the Vorbis data with its decoding codebooks and indicates whether the following packet contains fragmented Vorbis data and/or the number of whole Vorbis data frames. The payload data contains the raw Vorbis bitstream information.
88 <section anchor="RTP Header" title="RTP Header">
91 The format of the RTP header is specified in <xref target="rfc3550"></xref> and shown in Figure <xref target="RTP Header Figure"/>. This payload format uses the fields of the header in a manner consistent with that specification.
95 <figure anchor="RTP Header Figure" title="RTP Header">
98 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
99 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
100 |V=2|P|X| CC |M| PT | sequence number |
101 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
103 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
104 | synchronization source (SSRC) identifier |
105 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
106 | contributing source (CSRC) identifiers |
108 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
114 The RTP header begins with an octet of fields (V, P, X, and CC) to support specialized RTP uses (see <xref target="rfc3550">
115 </xref> and <xref target="rfc3551"></xref> for details). For Vorbis RTP, the following values are used.
119 Version (V): 2 bits</t>
121 This field identifies the version of RTP. The version used by this specification is two (2).
125 Padding (P): 1 bit</t>
127 Padding MAY be used with this payload format according to section 5.1 of <xref target="rfc3550"></xref>.
131 Extension (X): 1 bit</t>
133 The Extension bit is used in accordance with <xref target="rfc3550"></xref>.
137 CSRC count (CC): 4 bits</t>
139 The CSRC count is used in accordance with <xref target="rfc3550"></xref>.
143 Marker (M): 1 bit</t>
Set to zero. Audio silence suppression is not used. This conforms to section 4.1 of <xref target="vorbis-spec-ref"></xref>.
149 Payload Type (PT): 7 bits</t>
151 An RTP profile for a class of applications is expected to assign a payload type for this format, or a dynamically allocated
152 payload type SHOULD be chosen which designates the payload as Vorbis.
156 Sequence number: 16 bits</t>
158 The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and
159 to restore packet sequence. This field is detailed further in <xref target="rfc3550"></xref>.
163 Timestamp: 32 bits</t>
165 A timestamp representing the sampling time of the first sample of the first Vorbis packet in the RTP packet. The clock frequency
MUST be set to the sample rate of the encoded audio data and is conveyed out-of-band as an SDP attribute.
170 SSRC/CSRC identifiers: </t>
172 These two fields, 32 bits each with one SSRC field and a maximum of 16 CSRC fields, are as defined in <xref target="rfc3550">
178 <section anchor="Payload Header" title="Payload Header">
181 After the RTP Header section the following 4 octets are the Payload Header. This header is split into a number of bitfields detailing the format of the following payload data packets.
184 <figure anchor="Payload Header Figure" title="Payload Header">
187 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
188 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
189 | Ident | F |VDT|# pkts.|
190 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
197 This 24 bit field is used to associate the Vorbis data to a decoding Configuration.
201 Fragment type (F): 2 bits</t>
This field is set according to the following list:
205 <vspace blankLines="1" />
207 <t> 0 = Not Fragmented</t>
208 <t> 1 = Start Fragment</t>
209 <t> 2 = Continuation Fragment</t>
210 <t> 3 = End Fragment</t>
214 Vorbis Data Type (VDT): 2 bits</t>
This field sets the packet payload type for the Vorbis data. There are currently three types of Vorbis payloads.
219 <vspace blankLines="1" />
221 <t> 0 = Raw Vorbis payload</t>
222 <t> 1 = Vorbis Packed Configuration payload</t>
223 <t> 2 = Legacy Vorbis Comment payload</t>
228 The last 4 bits are the number of complete packets in this payload. This provides for a maximum number of 15 Vorbis packets in the payload. If the packet contains fragmented data the number of packets MUST be set to 0.
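<figure title="Payload Header Parsing Sketch (Informative)">
<preamble>
The bitfield layout described above can be illustrated with a short, non-normative Python sketch. The function and key names are hypothetical and chosen only for this illustration; the sketch assumes the four octets are read in network byte order, with the Ident in the first three octets and the F, VDT and packet count fields packed into the fourth.
</preamble>
<artwork><![CDATA[
```python
def parse_payload_header(header: bytes) -> dict:
    """Split the 4-octet Vorbis payload header into its bitfields."""
    if len(header) != 4:
        raise ValueError("payload header must be exactly 4 octets")
    # Assumption: fields are laid out most significant bit first.
    word = int.from_bytes(header, "big")
    return {
        "ident": word >> 8,                  # upper 24 bits
        "fragment_type": (word >> 6) & 0x3,  # F, 2 bits
        "data_type": (word >> 4) & 0x3,      # VDT, 2 bits
        "packet_count": word & 0xF,          # number of packets, 4 bits
    }


def build_payload_header(ident: int, fragment_type: int,
                         data_type: int, packet_count: int) -> bytes:
    """Inverse of parse_payload_header, with basic range checks."""
    if not 0 <= ident <= 0xFFFFFF:
        raise ValueError("Ident must fit in 24 bits")
    if not 0 <= fragment_type <= 3 or not 0 <= data_type <= 3:
        raise ValueError("F and VDT are 2-bit fields")
    if not 0 <= packet_count <= 15:
        raise ValueError("packet count is a 4-bit field")
    if fragment_type != 0 and packet_count != 0:
        raise ValueError("fragmented payloads carry a packet count of 0")
    last_octet = (fragment_type << 6) | (data_type << 4) | packet_count
    return ident.to_bytes(3, "big") + bytes([last_octet])
```
]]></artwork>
<postamble>
For instance, under these assumptions the octets 0x12 0x34 0x56 0x02 would describe an unfragmented raw Vorbis payload carrying two packets for Ident 0x123456.
</postamble>
</figure>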
233 <section anchor="Payload Data" title="Payload Data">
236 Raw Vorbis packets are unbounded in length currently, although at some future point there will likely be a practical limit placed on them. Typical Vorbis packet sizes are from very small (2-3 bytes) to quite large (8-12 kilobytes). The reference implementation <xref target="libvorbis"></xref> typically produces packets less than ~800 bytes, except for the setup header packets which are ~4-12 kilobytes. Within an RTP context the maximum packet size, including the RTP and payload headers, SHOULD be kept below the path MTU to avoid packet fragmentation.
239 <figure anchor="Payload Data Figure" title="Payload Data Header">
242 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
243 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
244 | length | vorbis packet data ..
245 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
250 Each Vorbis payload packet starts with a two octet length header, which is used to represent the size of the following data payload, followed by the raw Vorbis data.
254 For payloads which consist of multiple Vorbis packets the payload data consists of the packet length followed by the packet data for each of the Vorbis packets in the payload.
258 The Vorbis packet length header is the length of the Vorbis data block only and does not count the length field.
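<figure title="Payload Data Splitting Sketch (Informative)">
<preamble>
As a non-normative illustration of the length-prefixed layout, the following Python sketch walks the payload data and extracts each Vorbis packet. The function name is hypothetical, and the sketch assumes the two octet length is an unsigned integer in network byte order. The bounds checks reflect the buffer overflow concern raised in the Security Considerations.
</preamble>
<artwork><![CDATA[
```python
def split_vorbis_packets(payload_data: bytes, packet_count: int) -> list:
    """Extract packet_count length-prefixed Vorbis packets.

    payload_data is everything after the 4-octet payload header.
    """
    packets = []
    offset = 0
    for _ in range(packet_count):
        if offset + 2 > len(payload_data):
            raise ValueError("truncated length field")
        # Assumption: 16-bit length in network byte order. The length
        # covers the Vorbis data only, not the length field itself.
        length = int.from_bytes(payload_data[offset:offset + 2], "big")
        offset += 2
        if offset + length > len(payload_data):
            raise ValueError("length field exceeds remaining payload")
        packets.append(payload_data[offset:offset + length])
        offset += length
    return packets
```
]]></artwork>
<postamble>
A receiver following this sketch rejects a payload whose length fields point past the end of the received data instead of reading beyond the buffer.
</postamble>
</figure>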
The payload packing of the Vorbis data packets SHOULD follow the guidelines set out in <xref target="rfc3551"></xref> where the oldest packet occurs immediately after the RTP packet header.
Channel mapping of the audio is in accordance with ITU-R BS.775-1 <xref target="775itu"></xref>.
271 <section anchor="Example RTP Packet" title="Example RTP Packet">
274 Here is an example RTP packet containing two Vorbis packets.
281 <figure anchor="Example Header Packet (RTP Headers)" title="Example Packet (RTP Headers)">
284 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
285 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
286 | 2 |0|0| 0 |0| PT | sequence number |
287 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
288 | timestamp (in sample rate units) |
289 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
290 | synchronisation source (SSRC) identifier |
291 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
292 | contributing source (CSRC) identifiers |
294 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
302 <figure anchor="Example Packet (Payload Data)" title="Example Packet (Payload Data)">
305 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
306 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
307 | Ident | 0 | 0 | 2 pks |
308 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
309 | length | vorbis data ..
310 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
312 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
313 | length | next vorbis packet data ..
314 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The payload data section of the RTP packet starts with the 24 bit Ident field followed by the one octet bitfield header, which has the number of Vorbis frames set to 2. Each of the Vorbis data frames is prefixed by the two octet length field. The Vorbis Data Type and Fragment type are set to 0. The Configuration that will be used to decode the packets is the one indexed by the Ident value.
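<figure title="Example Payload Construction Sketch (Informative)">
<preamble>
The example packet can also be expressed as a short, non-normative Python sketch that assembles the octets following the RTP header. The function name is hypothetical; network byte order is assumed for the Ident and length fields.
</preamble>
<artwork><![CDATA[
```python
def build_raw_payload(ident: int, packets: list) -> bytes:
    """Build payload header plus payload data for whole Vorbis packets."""
    if not 1 <= len(packets) <= 15:
        raise ValueError("a payload carries between 1 and 15 whole packets")
    out = bytearray(ident.to_bytes(3, "big"))  # 24-bit Ident
    # F = 0 and VDT = 0, so the last header octet is just the count.
    out.append(len(packets))
    for packet in packets:
        if len(packet) > 0xFFFF:
            raise ValueError("packet does not fit a two octet length")
        out += len(packet).to_bytes(2, "big")  # length of the data only
        out += packet
    return bytes(out)
```
]]></artwork>
<postamble>
Calling build_raw_payload with two packets yields the layout shown in the figure above: Ident, a bitfield octet with value 2, then two length-prefixed data blocks.
</postamble>
</figure>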
329 <section anchor="Configuration Headers" title="Configuration Headers">
332 Unlike other mainstream audio codecs Vorbis has no statically
333 configured probability model. Instead, it packs all entropy decoding
334 configuration, VQ and Huffman models into a data block that must be
335 transmitted to the decoder along with the compressed data. A decoder
336 also requires identification information detailing the number of audio
337 channels, bitrates and other information to configure itself for a
338 particular compressed data stream. These two blocks of information are
339 often referred to collectively as the "codebooks" for a Vorbis stream,
340 and are nominally included as special "header" packets at the start
341 of the compressed data.
345 Thus these two codebook header packets must be received by the decoder
346 before any audio data can be interpreted. In addition,
347 the <xref target="vorbis-spec-ref">Vorbis I specification</xref>
requires the presence of a comment header packet which gives simple
349 metadata about the stream. This requirement poses problems in RTP,
350 which is often used over unreliable transports.
Since this information must be transmitted reliably, and since the RTP
stream may change certain configuration data mid-session, there are
different methods for delivering this configuration data to a
client, both in-band and out-of-band, which are detailed below. SDP
delivery is used to set up an initial state for the client application.
359 The changes may be due to different codebooks as well as different
360 bitrates of the stream.
The delivery vectors in use are specified by an SDP attribute that indicates the method and the optional URI where the Vorbis <xref target="Packed Configuration">Packed Configuration</xref> Packets can be fetched. Different delivery methods MAY be advertised for the same session. In-band Configuration delivery SHOULD be considered the baseline; out-of-band delivery methods that do not use RTP are not described in this document. For non-chained streams, the RECOMMENDED Configuration delivery method is to inline the <xref target="Packed Configuration">Packed Configuration</xref> in the SDP as explained in the <xref target="Mapping MIME Parameters into SDP">IANA considerations</xref> section.
368 The 24 bit Ident field is used to indicate when a change in the stream has taken place. The client application MUST have in advance the correct configuration and if the client detects a change in the Ident value and does not have this information it MUST NOT decode the raw Vorbis data.
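<figure title="Ident Lookup Sketch (Informative)">
<preamble>
One way a client might enforce this rule is sketched below in Python. This is a non-normative illustration: the exception and function names are hypothetical, and the mapping from Ident values to configurations is assumed to have been filled from SDP or from in-band Packed Configuration payloads.
</preamble>
<artwork><![CDATA[
```python
class MissingConfiguration(Exception):
    """Raised when no configuration is known for an Ident value."""


def select_configuration(known_configurations: dict, ident: int) -> bytes:
    """Return the configuration for ident, or refuse to decode."""
    try:
        return known_configurations[ident]
    except KeyError:
        # The raw Vorbis data MUST NOT be decoded in this case.
        raise MissingConfiguration(
            "no configuration for Ident 0x%06x" % ident) from None
```
]]></artwork>
<postamble>
A receiver would call select_configuration for the Ident of every payload and skip decoding whenever MissingConfiguration is raised.
</postamble>
</figure>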
371 <section anchor="In-band Header Transmission" title="In-band Header Transmission">
374 The <xref target="Packed Configuration">Packed Configuration</xref> Payload is sent in-band with the packet type bits set to match the payload type. Clients MUST be capable of dealing with fragmentation and periodic re-transmission of the configuration headers.
377 <section anchor="Packed Configuration" title="Packed Configuration">
A Vorbis Packed Configuration is indicated with the Vorbis Data Type field set to 1. Of the three headers defined in the <xref target="vorbis-spec-ref">Vorbis I specification</xref>, the identification and the setup headers are packed together, while the comment header is completely suppressed. It is up to the client to provide a minimal size comment header to the decoder if required by the implementation.
383 <figure anchor="Packed Configuration Figure" title="Packed Configuration Figure">
386 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
387 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
388 |V=2|P|X| CC |M| PT | xxxx |
389 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
391 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
392 | synchronization source (SSRC) identifier |
393 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
394 | contributing source (CSRC) identifiers |
396 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
397 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
399 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
400 | Setup length | Identification ..
401 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
403 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
405 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
407 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
409 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
411 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
413 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
<t>The Ident field is set to the value that will be used by the Raw Payload Packets to address this Configuration. The Fragment type is set to 0 since the packet bears the full Packed Configuration, and the number of packets is set to 1.</t>
423 <section anchor="Packed Headers Delivery" title="Packed Headers Delivery">
As mentioned above, the RECOMMENDED delivery vector for Vorbis configuration data is via a retrieval method that can be performed using a reliable transport protocol. As the RTP headers are not required for this method of delivery, the structure of the configuration data is slightly different. The packed header starts with a 32 bit count field, which details the number of packed headers that are contained in the bundle. Next is the Packed header payload for each chained Vorbis stream.
429 <figure anchor="Packed Headers Overview Figure" title="Packed Headers Overview">
431 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
432 | Number of packed headers |
433 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
434 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
436 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
437 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
439 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
444 Since the Configuration Ident and the Identification Header are fixed length there is only a 2 byte Setup Length tag to define the length of the Setup header.
447 <figure anchor="Packed Headers Detail Figure" title="Packed Headers Detail">
450 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
451 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
454 .. Length | Identification Header ..
455 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
456 .. Identification Header |
457 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
459 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
461 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The key difference from the in-band format is that there is no need for the payload header octet.
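<figure title="Packed Headers Bundle Parsing Sketch (Informative)">
<preamble>
A non-normative Python sketch of reading such a bundle follows. It rests on several assumptions that should be checked against the figures above: each entry is taken to consist of a 24 bit Ident, a two octet Setup Length, the Identification Header and then the Setup header; all integers are taken to be in network byte order; and the Identification Header is taken to be 30 octets, its fixed size in the Vorbis I specification. All names are hypothetical.
</preamble>
<artwork><![CDATA[
```python
# Assumption: fixed size of the Vorbis I identification header packet.
IDENTIFICATION_HEADER_SIZE = 30


def parse_packed_headers(bundle: bytes) -> dict:
    """Map each Ident to its (identification, setup) header pair."""
    if len(bundle) < 4:
        raise ValueError("bundle is shorter than the 32 bit count field")
    count = int.from_bytes(bundle[0:4], "big")
    offset = 4
    configurations = {}
    for _ in range(count):
        fixed_end = offset + 3 + 2 + IDENTIFICATION_HEADER_SIZE
        if fixed_end > len(bundle):
            raise ValueError("truncated packed header")
        ident = int.from_bytes(bundle[offset:offset + 3], "big")
        setup_length = int.from_bytes(bundle[offset + 3:offset + 5], "big")
        identification = bundle[offset + 5:fixed_end]
        setup_end = fixed_end + setup_length
        if setup_end > len(bundle):
            raise ValueError("Setup Length exceeds remaining data")
        configurations[ident] = (identification, bundle[fixed_end:setup_end])
        offset = setup_end
    return configurations
```
]]></artwork>
<postamble>
The resulting mapping can be kept by the client and consulted whenever the Ident value of the incoming stream changes.
</postamble>
</figure>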
468 <section anchor="Packed Headers IANA Considerations" title="Packed Headers IANA Considerations">
471 The following IANA considerations MUST only be applied to the packed headers.
475 MIME media type name: audio
478 MIME subtype: vorbis-config
482 Required Parameters:</t><t>
487 Optional Parameters: </t><t>
492 Encoding considerations:</t><t>
This type is only defined for transfer via non-RTP protocols as specified in RFC XXXX.
497 Security Considerations:</t><t>
498 See Section 6 of RFC 3047.
502 Interoperability considerations: none
506 Published specification:</t>
507 <t>See RFC XXXX for details.</t>
510 Applications which use this media type:</t><t>
511 Vorbis encoded audio, configuration data.
515 Additional information: none
Person &amp; email address to contact for further information:</t><t>
Luca Barbato: &lt;lu_zero@gentoo.org&gt;
524 Intended usage: COMMON
527 <t>Author/Change controller:</t>
528 <t>Author: Luca Barbato</t>
529 <t>Change controller: IETF AVT Working Group</t>
536 <section anchor="Loss of Configuration Headers" title="Loss of Configuration Headers">
539 Unlike the loss of raw Vorbis payload data, loss of a configuration header can lead to a situation where it will not be possible to successfully decode the stream.
Loss of a Configuration Packet results in the halting of stream decoding. It SHOULD be reported to the client, and a loss report SHOULD be sent via RTCP.
548 <!-- <section anchor="Mapping between Configuration and Stream" title="Mapping between Configuration and Stream">
The mapping between the stream and the configuration is explicit.
559 <section anchor="Comment Headers" title="Comment Headers">
When the Vorbis Data Type field is set to 2, the packet contains the comment metadata, such as artist name, track title and so on. This metadata is not intended to be fully descriptive but to offer basic track/song information. This packet SHOULD NOT be sent and clients MAY ignore it completely. The details of the comment format can be found in the <xref target="vorbis-spec-ref">Vorbis documentation</xref>.
564 <figure anchor="Comment Packet Figure" title="Comment Packet">
567 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
568 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
569 |V=2|P|X| CC |M| PT | xxxx |
570 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
572 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
573 | synchronization source (SSRC) identifier |
574 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
575 | contributing source (CSRC) identifiers |
577 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
578 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
580 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
581 | length | Comment ..
582 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
584 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
586 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
<t>The 2 byte length field is necessary since this packet could be fragmented.</t>
593 <section anchor="Frame Packetizing" title="Frame Packetizing">
Each RTP packet contains either one Vorbis packet fragment, or an integer number of complete Vorbis packets (up to a maximum of 15 packets, since the number of packets is defined by a 4 bit value).
600 Any Vorbis data packet that is less than path MTU SHOULD be bundled in the RTP packet with as many Vorbis packets as will fit, up to a maximum of 15. Path MTU is detailed in <xref target="rfc1063"></xref> and <xref target="rfc1981"></xref>.
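<figure title="Packet Bundling Sketch (Informative)">
<preamble>
The bundling rule can be illustrated with a simple greedy grouping, sketched here in Python as a non-normative example. The function name and the max_payload_size parameter are hypothetical; max_payload_size is assumed to be the room left for the Vorbis payload header and payload data once the lower layer and RTP headers have been subtracted from the path MTU.
</preamble>
<artwork><![CDATA[
```python
PAYLOAD_HEADER_SIZE = 4   # Ident plus bitfield octet
LENGTH_FIELD_SIZE = 2     # per-packet length prefix
MAX_PACKETS = 15          # limit of the 4 bit packet count


def bundle_packets(packets: list, max_payload_size: int) -> list:
    """Greedily group whole Vorbis packets into RTP payload bundles."""
    bundles = []
    current = []
    size = PAYLOAD_HEADER_SIZE
    for packet in packets:
        needed = LENGTH_FIELD_SIZE + len(packet)
        full = len(current) == MAX_PACKETS
        if current and (full or size + needed > max_payload_size):
            bundles.append(current)
            current = []
            size = PAYLOAD_HEADER_SIZE
        # A packet that is too large on its own ends up alone in a
        # bundle; such a packet is a candidate for fragmentation.
        current.append(packet)
        size += needed
    if current:
        bundles.append(current)
    return bundles
```
]]></artwork>
<postamble>
Each returned bundle corresponds to one RTP packet. Packet order is preserved, so the oldest packet of a bundle still follows the headers directly.
</postamble>
</figure>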
604 If a Vorbis packet, not only data but also Configuration and Comment, is larger than 65535 octets it MUST be fragmented. A fragmented packet has a zero in the last four bits of the payload header. The first fragment will set the Fragment type to 1. Each fragment after the first will set the Fragment type to 2 in the payload header. The RTP packet containing the last fragment of the Vorbis packet will have the Fragment type set to 3. To maintain the correct sequence for fragmented packet reception the timestamp field of fragmented packets MUST be the same as the first packet sent, with the sequence number incremented as normal for the subsequent RTP packets.
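<figure title="Fragmentation Sketch (Informative)">
<preamble>
The Fragment type sequence can be illustrated with the following non-normative Python sketch, which splits one Vorbis packet into payloads marked 1, 2, ..., 2, 3. The function name and the max_fragment_size parameter are hypothetical. The sketch assumes that the length field of each payload gives the size of the fragment data carried in that payload, and that fields are in network byte order.
</preamble>
<artwork><![CDATA[
```python
def fragment_packet(ident: int, data_type: int, packet: bytes,
                    max_fragment_size: int) -> list:
    """Split one Vorbis packet into fragment payloads (header + data)."""
    if max_fragment_size < 1:
        raise ValueError("max_fragment_size must be positive")
    chunks = [packet[i:i + max_fragment_size]
              for i in range(0, len(packet), max_fragment_size)]
    if len(chunks) < 2:
        raise ValueError("packet fits one payload; send it unfragmented")
    payloads = []
    for index, chunk in enumerate(chunks):
        if index == 0:
            fragment_type = 1          # Start Fragment
        elif index == len(chunks) - 1:
            fragment_type = 3          # End Fragment
        else:
            fragment_type = 2          # Continuation Fragment
        # The packet count (low 4 bits) stays 0 for fragments.
        last_octet = (fragment_type << 6) | (data_type << 4)
        header = ident.to_bytes(3, "big") + bytes([last_octet])
        payloads.append(header + len(chunk).to_bytes(2, "big") + chunk)
    return payloads
```
]]></artwork>
<postamble>
The sender would place each returned payload in its own RTP packet, reusing the timestamp of the first fragment and incrementing the sequence number by one per packet.
</postamble>
</figure>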
607 <section anchor="Example Fragmented Vorbis Packet" title="Example Fragmented Vorbis Packet">
610 Here is an example fragmented Vorbis packet split over three RTP packets. Each packet contains the standard RTP headers as well as the 4 octet Vorbis headers.
613 <figure anchor="Example Fragmented Packet (Packet 1)" title="Example Fragmented Packet (Packet 1)">
618 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
619 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
620 |V=2|P|X| CC |M| PT | 1000 |
621 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
623 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
624 | synchronization source (SSRC) identifier |
625 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
626 | contributing source (CSRC) identifiers |
628 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
629 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
631 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
632 | length | vorbis data ..
633 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
635 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
640 In this packet the initial sequence number is 1000 and the timestamp is xxxxx. The Fragment type is set to 1, the number of packets field is set to 0, and as the payload is raw Vorbis data the VDT field is set to 0.
643 <figure anchor="Example Fragmented Packet (Packet 2)" title="Example Fragmented Packet (Packet 2)">
648 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
649 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
650 |V=2|P|X| CC |M| PT | 1001 |
651 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
653 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
654 | synchronization source (SSRC) identifier |
655 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
656 | contributing source (CSRC) identifiers |
658 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
659 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
661 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
662 | length | vorbis data ..
663 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
665 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Fragment type field is set to 2 and the number of packets field is set to 0. For large Vorbis packets there can be several payload packets of this type. The maximum packet size SHOULD be no greater than the path MTU, including all RTP and payload headers. The sequence number has been incremented by one but the timestamp field remains the same as the initial packet.
673 <figure anchor="Example Fragmented Packet (Packet 3)" title="Example Fragmented Packet (Packet 3)">
678 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
679 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
680 |V=2|P|X| CC |M| PT | 1002 |
681 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
683 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
684 | synchronization source (SSRC) identifier |
685 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
686 | contributing source (CSRC) identifiers |
688 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
689 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
691 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
692 | length | vorbis data ..
693 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
695 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
700 This is the last Vorbis fragment packet. The Fragment type is set to 3 and the packet count remains set to 0. As in the previous packets the timestamp remains set to the first packet in the sequence and the sequence number has been incremented.
704 <section anchor="Packet Loss" title="Packet Loss">
707 As there is no error correction within the Vorbis stream, packet loss will result in a loss of signal. Packet loss is more of an issue for fragmented Vorbis packets as the client will have to cope with the handling of the Fragment Type. In case of loss of fragments the client MUST discard all of them. If we use the fragmented Vorbis packet example above and the first packet is lost the client MUST detect that the next packet has the packet count field set to 0 and the Fragment type 2 and MUST drop it. The next packet, which is the final fragmented packet, MUST be dropped in the same manner. Feedback reports on lost and dropped packets MUST be sent back via RTCP.
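<figure title="Fragment Reassembly Sketch (Informative)">
<preamble>
The discard rule can be illustrated with a small state machine, sketched here in Python as a non-normative example. The class and method names are hypothetical. The sketch assumes that payloads are handed over in arrival order together with their RTP sequence number, and that a gap in the sequence numbers is how the client notices a lost fragment.
</preamble>
<artwork><![CDATA[
```python
class FragmentReassembler:
    """Rebuild fragmented Vorbis packets, dropping incomplete ones."""

    def __init__(self):
        self._chunks = None    # None: no fragmented packet in progress
        self._next_seq = None  # sequence number expected next

    def push(self, seq: int, fragment_type: int, data: bytes):
        """Return a completed packet, or None if nothing is ready."""
        if fragment_type == 0:
            # Unfragmented data passes through; pending fragments
            # can no longer be completed and are discarded.
            self._chunks = None
            return data
        if fragment_type == 1:
            self._chunks = [data]
            self._next_seq = (seq + 1) % 65536
            return None
        # Continuation (2) or end (3): needs a start and no gap.
        if self._chunks is None or seq != self._next_seq:
            self._chunks = None
            return None
        self._chunks.append(data)
        self._next_seq = (seq + 1) % 65536
        if fragment_type == 3:
            packet = b"".join(self._chunks)
            self._chunks = None
            return packet
        return None
```
]]></artwork>
<postamble>
With the example above, losing packet 1000 means packets 1001 and 1002 both return None and are dropped, which matches the behaviour required of the client.
</postamble>
</figure>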
711 If a particular multicast session has a large number of participants care must be taken to prevent an RTCP feedback implosion, <xref target="rtcp-feedback"></xref>, in the event of packet loss from a large number of participants.
Loss of any of the configuration headers is dealt with in the Loss of Configuration Headers section above.
720 <section anchor="IANA Considerations" title="IANA Considerations">
723 MIME media type name: audio
delivery-method: indicates the delivery methods in use. The possible values are: inline, in_band, out_band.
736 configuration: the <xref target="rfc3548">base16</xref> (hexadecimal) representation of the <xref target="Packed Headers Delivery">Packed Headers</xref>.
740 Optional Parameters: </t><t>
configuration-uri: the URI of the configuration headers in case of out-of-band transmission, in the form proto://path/to/resource/. Depending on the specific method, a single Configuration Packet could be retrieved by its Ident number, or all of them could be aggregated in a single stream.
745 Encoding considerations:</t><t>
746 This type is only defined for transfer via RTP as specified
751 Security Considerations:</t><t>
752 See Section 6 of RFC 3047.
756 Interoperability considerations: none
760 Published specification:</t>
761 <t>See the Vorbis documentation <xref target="vorbis-spec-ref"></xref> for details.</t>
764 Applications which use this media type:</t><t>
765 Audio streaming and conferencing tools
769 Additional information: none
Person &amp; email address to contact for further information:</t><t>
Luca Barbato: &lt;lu_zero@gentoo.org&gt;
778 Intended usage: COMMON
781 <t>Author/Change controller:</t>
782 <t>Author: Luca Barbato</t>
783 <t>Change controller: IETF AVT Working Group</t>
785 <section anchor="Mapping MIME Parameters into SDP" title="Mapping MIME Parameters into SDP">
The information carried in the MIME media type specification has a specific mapping to fields in the Session Description Protocol (SDP) <xref target="rfc2327"></xref>, which is commonly used to describe RTP sessions. When SDP is used to specify sessions the mapping is as follows:
791 <vspace blankLines="1" />
792 <list style="symbols">
794 <t>The MIME type ("audio") goes in SDP "m=" as the media name.</t>
795 <vspace blankLines="1" />
797 <t>The MIME subtype ("VORBIS") goes in SDP "a=rtpmap" as the encoding name.</t>
798 <vspace blankLines="1" />
800 <t>The parameter "rate" also goes in "a=rtpmap" as clock rate.</t>
801 <vspace blankLines="1" />
803 <t>The parameter "channels" also goes in "a=rtpmap" as channel count.</t>
804 <vspace blankLines="1" />
<t>The mandated parameters "delivery-method" and "configuration" MUST be included in the SDP "a=fmtp" attribute.</t>
807 <vspace blankLines="1" />
<t>The optional parameter "configuration-uri", when present, MUST be included in the SDP "a=fmtp" attribute.</t>
814 If the stream comprises chained Vorbis files and all of them are known in advance, the Configuration Packet for each file SHOULD be passed to the client using the configuration attribute.
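<figure title="Configuration Attribute Encoding Sketch (Informative)">
<preamble>
Since the configuration attribute carries the base16 representation of the Packed Headers, a sender could produce its value as in the following non-normative Python sketch. The function names are hypothetical, and the upper case alphabet is an assumption taken from the base16 definition referenced for this parameter.
</preamble>
<artwork><![CDATA[
```python
def encode_configuration(packed_headers: bytes) -> str:
    """Return the base16 (hexadecimal) text for the SDP attribute."""
    return packed_headers.hex().upper()


def decode_configuration(value: str) -> bytes:
    """Recover the Packed Headers octets from the SDP attribute text."""
    # bytes.fromhex accepts both upper and lower case digits.
    return bytes.fromhex(value)
```
]]></artwork>
<postamble>
A client would apply decode_configuration to the attribute value before parsing the Packed Headers bundle.
</postamble>
</figure>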
818 The Vorbis configuration specified in the configuration-uri attribute MUST point to a location where all of the Configuration Packets needed for the life of the session reside.
The port value is specified by the server application bound to the address specified in the c attribute. The rate and channels values specified in the rtpmap attribute MUST match the sample rate and channel count of the Vorbis stream. An example is found below.
825 <vspace blankLines="1" />
828 <t>m=audio RTP/AVP 98</t>
829 <t>a=rtpmap:98 VORBIS/44100/2</t>
830 <t>a=delivery:out_band/http</t>
831 <t>a=fmtp:98 delivery-method:inline,out_band/http; configuration=base16string1; configuration-uri=http://path/to/the/resource</t>
835 Note that the payload format (encoding) names are commonly shown in upper case. MIME subtypes are commonly shown in lower case. These names are case-insensitive in both places. Similarly, parameter names are case-insensitive both in MIME types and in the default mapping to the SDP a=fmtp attribute. The exception regarding case sensitivity is the configuration-uri URI which MUST be regarded as being case sensitive.
839 The answer to any offer, <xref target="rfc3264"></xref>, MUST NOT change the URI specified in the configuration-uri attribute.
846 <section anchor="Congestion Control" title="Congestion Control">
849 Vorbis clients SHOULD send regular receiver reports detailing congestion. A mechanism for dynamically downgrading the stream,
850 known as bitrate peeling, will allow for a graceful backing off of the stream bitrate. This feature is not available at present
851 so an alternative would be to redirect the client to a lower bitrate stream if one is available.
855 If a particular multicast session has a large number of participants care must be taken to prevent an RTCP feedback implosion,
856 <xref target="rtcp-feedback"></xref>, in the event of congestion.
861 <section anchor="Security Considerations" title="Security Considerations">
863 RTP packets using this payload format are subject to the security considerations discussed in the RTP specification
864 <xref target="rfc3550"></xref>. This implies that the confidentiality of the media stream is achieved by using
865 encryption. Because the data compression used with this payload format is applied end-to-end, encryption may be performed on the
866 compressed data. Where the size of a data block is specified, care MUST be taken to prevent buffer overflows in client applications.
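<t>The buffer overflow concern above amounts to never trusting a declared block size beyond the data actually received. The sketch below illustrates the check on a hypothetical layout in which each block is preceded by a 16-bit length in network byte order; that layout and the read_blocks name are assumptions made for illustration and are not the payload format defined by this memo.</t>
<figure><artwork><![CDATA[
```python
import struct


def read_blocks(payload, max_block=65535):
    """Split a payload of length-prefixed blocks, rejecting bad lengths.

    Assumed layout (illustrative only): repeated
    [16-bit big-endian length][length bytes of data].
    """
    blocks = []
    offset = 0
    while offset < len(payload):
        remaining = len(payload) - offset
        if remaining < 2:
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        # Reject any declared size that exceeds the local cap or the
        # number of bytes that actually remain in the packet.
        if length > max_block or length > len(payload) - offset:
            raise ValueError("declared block size exceeds available data")
        blocks.append(payload[offset:offset + length])
        offset += length
    return blocks
```
]]></artwork></figure>
<t>In a language without bounds-checked slices the same comparison has to happen before any copy into a fixed-size buffer; here it turns a silently truncated slice into an explicit error.</t>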
871 <section anchor="Acknowledgments" title="Acknowledgments">
874 This document is a continuation of draft-moffitt-vorbis-rtp-00.txt and draft-kerr-avt-vorbis-rtp-04.txt. The MIME type section is a continuation of draft-short-avt-rtp-vorbis-mime-00.txt.
878 Thanks to the AVT and Ogg Vorbis communities and to Xiph.org, including Steve Casner, Aaron Colwell, Ross Finlayson, Fluendo, Ramon Garcia, Pascal Hennequin, Ralph Giles, Tor-Einar Jarnbjo, Colin Law, John Lazzaro, Jack Moffitt, Christopher Montgomery, Colin Perkins, Barry Short, Mike Smith, Phil Kerr, Michael Sparks, Magnus Westerlund, David Barrett, Silvia Pfeiffer, and the Politecnico di Torino (LS)³/IMG Group, in particular Federico Ridolfo, Francesco Varano, Giampaolo Mancini and Juan Carlos De Martin.
887 <references title="Normative References">
889 <reference anchor="rfc3533">
891 <title>The Ogg Encapsulation Format Version 0</title>
892 <author initials="S." surname="Pfeiffer" fullname="Silvia Pfeiffer"></author>
894 <seriesInfo name="RFC" value="3533" />
897 <reference anchor="rfc2119">
899 <title>Key words for use in RFCs to Indicate Requirement Levels</title>
900 <author initials="S." surname="Bradner" fullname="Scott Bradner"></author>
902 <seriesInfo name="RFC" value="2119" />
905 <reference anchor="rfc3550">
907 <title>RTP: A Transport Protocol for Real-Time Applications</title>
908 <author initials="H." surname="Schulzrinne" fullname="Henning Schulzrinne"></author>
909 <author initials="S." surname="Casner" fullname="Stephen Casner"></author>
910 <author initials="R." surname="Frederick" fullname="Ron Frederick"></author>
911 <author initials="V." surname="Jacobson" fullname="Van Jacobson"></author>
913 <seriesInfo name="RFC" value="3550" />
916 <reference anchor="rfc3551">
918 <title>RTP Profile for Audio and Video Conferences with Minimal Control</title>
919 <author initials="H." surname="Schulzrinne" fullname="Henning Schulzrinne"></author>
920 <author initials="S." surname="Casner" fullname="Stephen Casner"></author>
922 <date month="July" year="2003" />
923 <seriesInfo name="RFC" value="3551" />
926 <reference anchor="rfc2327">
928 <title>SDP: Session Description Protocol</title>
929 <author initials="M." surname="Handley" fullname="Mark Handley"></author>
930 <author initials="V." surname="Jacobson" fullname="Van Jacobson"></author>
932 <seriesInfo name="RFC" value="2327" />
935 <reference anchor="rfc1063">
937 <title>Path MTU Discovery</title>
938 <author initials="J." surname="Mogul et al." fullname="J. Mogul et al."></author>
940 <seriesInfo name="RFC" value="1191" />
943 <reference anchor="rfc1981">
945 <title>Path MTU Discovery for IP version 6</title>
946 <author initials="J." surname="McCann et al." fullname="J. McCann et al."></author>
948 <seriesInfo name="RFC" value="1981" />
951 <reference anchor="rfc3264">
953 <title>An Offer/Answer Model with Session Description Protocol (SDP)</title>
954 <author initials="J." surname="Rosenberg" fullname="Jonathan Rosenberg"></author>
955 <author initials="H." surname="Schulzrinne" fullname="Henning Schulzrinne"></author>
957 <seriesInfo name="RFC" value="3264" />
960 <reference anchor="rfc3548">
962 <title>The Base16, Base32, and Base64 Data Encodings</title>
963 <author initials="S." surname="Josefsson" fullname="Simon Josefsson"></author>
965 <seriesInfo name="RFC" value="3548" />
968 <reference anchor="rtcp-feedback">
970 <title>Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)</title>
971 <author initials="J." surname="Ott" fullname="Joerg Ott"></author>
972 <author initials="S." surname="Wenger" fullname="Stephan Wenger"></author>
973 <author initials="N." surname="Sato" fullname="Noriyuki Sato"></author>
974 <author initials="C." surname="Burmeister" fullname="Carsten Burmeister"></author>
975 <author initials="J." surname="Rey" fullname="Jose Rey"></author>
977 <seriesInfo name="Internet Draft" value="(draft-ietf-avt-rtcp-feedback-11: Work in progress)" />
983 <references title="Informative References">
984 <reference anchor="libvorbis">
986 <title>libvorbis: Available from the Xiph website, http://www.xiph.org</title>
990 <reference anchor="vorbis-spec-ref">
992 <title>Ogg Vorbis I specification: Codec setup and packet decode. Available from the Xiph website, http://www.xiph.org</title>
996 <reference anchor="v-comment">
998 <title>Ogg Vorbis I specification: Comment field and header specification. Available from the Xiph website,
999 http://www.xiph.org</title>
1003 <reference anchor="775itu">
1005 <title>ITU (1992-1994) ITU-R Recommendation BS. 775-1 Multi-channel stereophonic sound system with or without accompanying
1006 picture. International Telecommunications Union. Available from the ITU website, http://www.itu.int