MIME types in GStreamer

A MIME type is a combination of two short strings: the content type and the
content subtype. Content types are broad categories that cover almost all kinds
of files; video, audio, text and application are common content types. The
subtype narrows the content type down to a more specific description, for
example 'application/ogg', 'audio/raw', 'video/mpeg' or 'text/plain'.
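
As a minimal sketch of the type/subtype split described above, the following
plain C helper separates a MIME type string at the slash. The function name
split_mime_type is hypothetical and is not part of the GStreamer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "type/subtype" into two caller-provided buffers.
 * Returns 0 on success, -1 if the string is not of the form "type/subtype"
 * or if one of the buffers is too small. (Hypothetical helper.) */
int split_mime_type(const char *mime, char *type, size_t type_len,
                    char *subtype, size_t subtype_len)
{
    const char *slash;
    size_t tlen, slen;

    assert(mime != NULL && type != NULL && subtype != NULL);

    slash = strchr(mime, '/');
    if (slash == NULL)
        return -1;                      /* no separator at all */
    if (strchr(slash + 1, '/') != NULL)
        return -1;                      /* more than one separator */

    tlen = (size_t)(slash - mime);
    slen = strlen(slash + 1);
    if (tlen == 0 || slen == 0 || tlen >= type_len || slen >= subtype_len)
        return -1;                      /* empty part or buffer too small */

    memcpy(type, mime, tlen);
    type[tlen] = '\0';
    memcpy(subtype, slash + 1, slen + 1);   /* copies the trailing NUL too */
    return 0;
}
```

For 'video/mpeg' this yields the content type "video" and the subtype "mpeg".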

Together, the content type and subtype form a pair that describes the type of
information contained in a file. In multimedia processing, MIME types describe
the type of information carried by a media stream. GStreamer uses MIME types in
the same way: to identify the types of information that are allowed to pass
between GStreamer elements. The MIME type is part of a GstCaps object that
describes a media stream. Besides a MIME type, a GstCaps object also contains a
name and some stream properties (GstProps, which hold combinations of

An example of a MIME type is 'video/mpeg'. A corresponding GstCaps could be
created as follows (the argument list must end with NULL):

  GstCaps *caps = gst_caps_new_simple ("video/mpeg",
                                       "width", G_TYPE_INT, 384,
                                       "height", G_TYPE_INT, 288,
                                       NULL);

MIME types and their corresponding properties are of major importance in
GStreamer for uniquely identifying media streams. Therefore, we define them
per media type. All GStreamer plugins should keep to these definitions.

Official MIME media types are assigned by the IANA. Current assignments are
listed at http://www.iana.org/assignments/media-types/.

Some streams may have MIME types or GstCaps that do not fully describe the
stream. In most cases this is not a problem. For example, if a stream contains
Ogg/Vorbis data (which is of type 'application/ogg'), we don't need to know the
samplerate of the raw audio stream, since we can't play the encoded audio
anyway. The samplerate is important for raw audio, however, so a decoder needs
to retrieve it from the Ogg/Vorbis stream headers (which are part of the
bytestream) and pass it on in the GstCaps that belongs to the decoded audio
(which becomes a type like 'audio/raw'). Other plugins might want to know such
properties even for compressed streams. One example is an AVI muxer, which
needs to know the samplerate of an audio stream even when it is compressed.

Another problem is that many media types can be defined in multiple ways. For
example, MJPEG video can be defined as 'video/jpeg', 'video/mjpeg',
'image/jpeg', 'video/x-msvideo' with a compression of (fourcc) MJPG, etc.
None of these is really official, since there is no official MIME type for
encoded MJPEG video.

The main focus of this document is to propose a standardized set of MIME types
and properties that will be used by the GStreamer plugins.

Different types of streams
==========================

There are several types of media streams. The most important distinction is
between container formats, audio codecs and video codecs. Container formats are
bytestreams that contain one or more substreams and don't provide any direct
media data themselves. Examples are Quicktime, AVI and MPEG System Stream. They
mostly consist of a set of headers that define the media streams packed inside
the container, along with the media data itself.

Video codecs and audio codecs describe encoded audio or video data. Examples are
MPEG-1 video, DivX video, MPEG-1 layer 3 (MP3) audio and Ogg/Vorbis audio.
Strictly speaking, Ogg is a container format (here carrying Vorbis audio), but
the two are usually used in conjunction with each other.

Finally, there are the somewhat obvious (but not commonly encountered as files)

1 - AVI (Microsoft RIFF/AVI)
    MIME type: video/x-msvideo
    Parser: avidemux, ffdemux_avi

    MIME type: video/quicktime

    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: mpegdemux, ffdemux_mpeg (PS), ffdemux_mpegts (TS), dvddemux

    MIME type: video/x-ms-asf
    Parser: asfdemux, ffdemux_asf

5 - WAV (Microsoft RIFF/WAV)
    MIME type: audio/x-wav
    Parser: wavparse, ffdemux_wav

    MIME type: application/vnd.rn-realmedia
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: rmdemux, ffdemux_rm

7 - DV (Digital Video)
    MIME type: video/x-dv
    Properties: 'systemstream' = TRUE (BOOLEAN)
    Parser: gst1394, ffdemux_dv

    MIME type: application/ogg

    MIME type: video/x-mkv
    Parser: matroskademux, ffdemux_matroska
    Formatter: matroskamux

10 - Shockwave (Macromedia)
    MIME type: application/x-shockwave-flash
    Parser: swfdec, ffdemux_swf

    MIME type: audio/x-au
    Parser: auparse, ffdemux_au

    MIME type: audio/x-mod
    Parser: modplug, mikmod

    MIME type: video/x-fli

    MIME type: application/x-ape

    MIME type: audio/x-aiff

    MIME type: audio/x-sid

Please note that we try to keep these MIME types as similar as possible to the
MIME types used as standards in Gnome (Gnome-VFS/Nautilus) and KDE
(Konqueror). Both will (in future) stick to a shared-mime-info database that
is hosted on freedesktop.org and is based on the IANA assignments.

Also, there is a very thin line between audio codecs and audio containers
(take mp3 vs. sid, etc.). This is decided case by case right now and needs to
be documented further.

For convenience, the fourcc codes used in the AVI container format will be
listed along with the MIME type and optional properties.
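
A fourcc is simply four characters packed into a 32-bit value. The sketch below
assumes the usual convention for AVI, where the four bytes are stored in file
order and read as a little-endian integer, so the first character ends up in
the lowest byte. The helper name make_fourcc is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack a four-character code into a 32-bit integer, first character in the
 * lowest byte (assumed little-endian convention). Hypothetical helper. */
uint32_t make_fourcc(const char *code)
{
    assert(code != NULL && strlen(code) == 4);

    return  (uint32_t)(unsigned char)code[0]
         | ((uint32_t)(unsigned char)code[1] << 8)
         | ((uint32_t)(unsigned char)code[2] << 16)
         | ((uint32_t)(unsigned char)code[3] << 24);
}
```

Since the comparison is bytewise, fourccs are case-sensitive. This is why
upper- and lowercase spellings of the same code are listed separately in the
"Known fourccs" lines.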

Optional properties for all video formats are the following:

  width        = 1 - MAXINT (INT)
  height       = 1 - MAXINT (INT)
  pixel_width  = 1 - MAXINT (INT, with pixel_height forms aspect ratio)
  pixel_height = 1 - MAXINT (INT, with pixel_width forms aspect ratio)
  framerate    = 0 - MAXFLOAT (FLOAT)
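
To illustrate how these properties combine, here is a small sketch that derives
a display aspect ratio. It assumes that pixel_width and pixel_height describe
the shape of a single pixel, so that the displayed picture is
(width * pixel_width) : (height * pixel_height). The function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor (Euclid's algorithm). */
static unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
    while (b != 0) {
        unsigned long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Compute the reduced display aspect ratio num:den from the frame size and
 * the pixel aspect ratio. Assumption: pixel_width/pixel_height give the
 * shape of one pixel. Hypothetical helper. */
void display_aspect(unsigned width, unsigned height,
                    unsigned pixel_width, unsigned pixel_height,
                    unsigned long long *num, unsigned long long *den)
{
    unsigned long long n, d, g;

    assert(width > 0 && height > 0 && pixel_width > 0 && pixel_height > 0);
    assert(num != NULL && den != NULL);

    n = (unsigned long long)width * pixel_width;
    d = (unsigned long long)height * pixel_height;
    g = gcd_ull(n, d);

    *num = n / g;
    *den = d / g;
}
```

With width = 720, height = 576 and a pixel aspect of 16:15, this reduces
11520:8640 to a 4:3 display aspect ratio.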

1 - MPEG-1, -2 and -4 video (ISO/LA MPEG)
    MIME type: video/mpeg
    Properties: systemstream = FALSE (BOOLEAN)
                mpegversion = 1/2/4 (INT)
    Known fourccs: MPEG, MPGI
    Encoder: mpeg1enc, mpeg2enc
    Decoder: mpeg1dec, mpeg2dec, mpeg2subt

2 - DivX 3.x, 4.x and 5.x video (divx.com)
    MIME type: video/x-divx
    Optional properties: divxversion = 3/4/5 (INT)
    Known fourccs: DIV3, DIV4, DIV5, DIVX, DX50, divx
    Decoder: divxdec, ffdec_mpeg4

3 - Microsoft MPEG 4.1, 4.2 and 4.3
    MIME type: video/x-msmpeg
    Optional properties: msmpegversion = 41/42/43 (INT)
    Known fourccs: MPG4, MP42, MP43
    Encoder: ffenc_msmpeg4, ffenc_msmpeg4v1, ffenc_msmpeg4v2
    Decoder: ffdec_msmpeg4, ffdec_msmpeg4v1, ffdec_msmpeg4v2

4 - Motion-JPEG (official and extended)
    MIME type: video/x-jpeg
    Known fourccs: MJPG (YUY2 MJPEG), JPEG (any), PIXL (Pinnacle/Miro), VIXL
    Decoder: jpegdec, ffdec_mjpeg

5 - Sorensen (Quicktime - SVQ1/SVQ3)
    MIME type: video/x-svq
    Properties: svqversion = 1/3 (INT)
    Decoder: ffdec_svq1, ffdec_svq3

6 - H263 and related codecs
    MIME type: video/x-h263
    Known fourccs: H263/h263, i263, L263, M263/m263, s263, x263, VDOW, VIVO
    Encoder: ffenc_h263, ffenc_h263p
    Decoder: ffdec_h263, ffdec_h263i

    MIME type: video/x-pn-realvideo
    Properties: rmversion = 1/2/3/4 (INT)
    Known fourccs: RV10, RV20, RV30, RV40
    Decoder: ffdec_rv10, ffdec_rv20

8 - Digital Video (DV)
    MIME type: video/x-dv
    Properties: systemstream = FALSE (BOOLEAN)
    Known fourccs: DVSD/dvsd (SDTV), dvhd (HDTV), dvsl (SDTV LongPlay)
    Encoder: ffenc_dvvideo
    Decoder: dvdec, ffdec_dvvideo

9 - Windows Media Video 1, 2 and 3 (WMV)
    MIME type: video/x-wmv
    Properties: wmvversion = 1/2/3 (INT)
    Encoder: ffenc_wmv1, ffenc_wmv2, none
    Decoder: ffdec_wmv1, ffdec_wmv2, none

    MIME type: video/x-xvid
    Known fourccs: xvid, XVID
    Decoder: xviddec, ffdec_mpeg4

    MIME type: video/x-3ivx
    Known fourccs: 3IV0, 3IV1, 3IV2

12 - Ogg/Tarkin (Xiph)
    MIME type: video/x-tarkin

    MIME type: video/x-vp3

14 - Ogg/Theora (Xiph, VP3-like)
    MIME type: video/x-theora
    Decoder: theoradec, ffdec_theora
    This is the raw stream that comes out of an ogg file.

    MIME type: video/x-huffyuv

16 - FF Video 1 (FFMPEG)
    MIME type: video/x-ffv
    Properties: ffvversion = 1 (INT)

    MIME type: video/x-h264

    MIME type: video/x-indeo
    Properties: indeoversion = 3 (INT)
    Decoder: ffdec_indeo3

19 - Portable Network Graphics (PNG)
    MIME type: video/x-png
    Decoder: pngdec, gdkpixbufdec

    MIME type: video/x-cinepak
    Decoder: ffdec_cinepak

TODO: subsampling information for YUV?

TODO: colorspace identifications for MJPEG? How?

TODO: how to distinguish MJPEG-A/B (Quicktime) and lossless JPEG?

TODO: divx4/divx5/xvid/3ivx/mpeg-4 - how to make them overlap? (all
      ISO MPEG-4 compatible)

For convenience, the two-byte hexcodes (as used for identification in AVI files)

Properties for all audio formats include the following:

  rate     = 1 - MAXINT (INT, sampling rate)
  channels = 1 - MAXINT (INT, number of audio channels)

    MIME type: audio/x-alaw

    MIME type: audio/x-mulaw

3 - MPEG-1 layer 1/2/3 audio
    MIME type: audio/mpeg
    Properties: mpegversion = 1 (INT)
    Encoder: lame, ffdec_mp3

    MIME type: audio/x-vorbis
    Encoder: rawvorbisenc (vorbisenc does rawvorbisenc+oggmux)

5 - Windows Media Audio 1, 2 and 3 (WMA)
    MIME type: audio/x-wma
    Properties: wmaversion = 1/2/3 (INT)
    Decoder: ffdec_wmav1, ffdec_wmav2, none

    MIME type: audio/x-ac3
    Decoder: a52dec, ac3parse

7 - FLAC (Free Lossless Audio Codec)
    MIME type: audio/x-flac
    Decoder: flacdec, ffdec_flac

8 - MACE 3/6 (Quicktime audio)
    MIME type: audio/x-mace
    Properties: maceversion = 3/6 (INT)
    Decoder: ffdec_mace3, ffdec_mace6

    MIME type: audio/mpeg
    Properties: mpegversion = 4 (INT)

10 - (IMA) ADPCM (Quicktime/WAV/Microsoft/4XM)
    MIME type: audio/x-adpcm
    Properties: layout = "quicktime"/"wav"/"microsoft"/"4xm"/"g721"/"g722"/"g723_3"/"g723_5" (STRING)
    Encoder: ffenc_adpcm_ima_[qt/wav/dk3/dk4/ws/smjpeg], ffenc_adpcm_[ms/4xm/xa/adx/ea]
    Decoder: ffdec_adpcm_ima_[qt/wav/dk3/dk4/ws/smjpeg], ffdec_adpcm_[ms/4xm/xa/adx/ea]

    Note: The difference between each of these four ADPCM formats is the
          number of samples packed together per channel. For WAV, for example,
          each sample is 4 bits, and 8 samples are packed together per channel
          in the bytestream. For the others, refer to technical documentation.
          We probably want to distinguish these differently, but I don't know
          how,

11 - RealAudio (Real)
    MIME type: audio/x-pn-realaudio
    Properties: raversion = 1/2 (INT)
    Known fourccs: 14_4, 28_8
    Decoder: ffdec_real_144 / ffdec_real_288

    MIME type: audio/x-dv

    MIME type: audio/x-gsm
    Encoder: gsmenc, rtpgsmenc
    Decoder: gsmdec, rtpgsmparse

    MIME type: audio/x-speex

    MIME type: audio/x-qdm2

16 - Sony ATRAC3 (detected inside realmedia and wave/avi streams, nothing to
     decode it yet)
    MIME type: audio/x-vnd.sony.atrac3

17 - Ensoniq PARIS audio
    MIME type: audio/x-paris

18 - Amiga IFF / SVX8 / SV16 audio
    MIME type: audio/x-svx

19 - Sphere NIST audio
    MIME type: audio/x-nist

20 - Sound Blaster VOC audio
    MIME type: audio/x-voc

21 - Berkeley/IRCAM/CARL audio
    MIME type: audio/x-ircam

22 - Sonic Foundry's 64 bit RIFF/WAV
    MIME type: audio/x-w64

TODO: adpcm/dv needs confirmation from someone with knowledge...

Raw formats contain unencoded, raw media information. These are rather rare
from an end user's point of view, since raw media files have historically been
prohibitively large; hence the multitude of encoding formats.

Raw video formats require the following common properties, in addition to
format-specific properties:

  width  = 1 - MAXINT (INT)
  height = 1 - MAXINT (INT)

1 - Raw Video (YUV/YCbCr)
    MIME type: video/x-raw-yuv
    Properties: 'format' = 'XXXX' (fourcc)
    Known fourccs: YUY2, I420, Y41P, YVYU, UYVY, etc.

    Some raw video formats have implicit alignment rules. We should discuss
    this more. Also, some formats have multiple fourccs (e.g. IYUV/I420 or
    YUY2/YUYV). For each of these, we only use one (e.g. I420 and YUY2).

    Currently recognized formats:

      YUY2: packed, Y-U-Y-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
      YVYU: packed, Y-V-Y-U order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
      UYVY: packed, U-Y-V-Y order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
      Y41P: packed, UYVYUYVYYYYY order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
      IYU2: packed, U-Y-V order, not subsampled (YUV-4:4:4, 24 bpp)

      Y42B: planar, Y-U-V order, U/V hor 2x subsampled (YUV-4:2:2, 16 bpp)
      YV12: planar, Y-V-U order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
      I420: planar, Y-U-V order, U/V hor+ver 2x subsampled (YUV-4:2:0, 12 bpp)
      Y41B: planar, Y-U-V order, U/V hor 4x subsampled (YUV-4:1:1, 12 bpp)
      YUV9: planar, Y-U-V order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)
      YVU9: planar, Y-V-U order, U/V hor+ver 4x subsampled (YUV-4:1:0, 9 bpp)

      Y800: one-plane (Y-only, YUV-4:0:0, 8 bpp)

    See http://www.fourcc.org/ for more information.

    Note: other YUV-4:4:4 formats (both planar and packed, in multiple orders)
          are missing.
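
The bpp figures in the format list above follow directly from the subsampling
factors: every pixel carries 8 bits of Y, while one 8-bit U and one 8-bit V
sample are shared by a block of pixels. The sketch below reproduces that
arithmetic. It assumes 8-bit samples and ignores any row padding or alignment
rules, so the frame size is a lower bound only. Both function names are
hypothetical.

```c
#include <assert.h>

/* Average bits per pixel for 8-bit YUV where chroma is subsampled by
 * 'hsub' horizontally and 'vsub' vertically (1 = not subsampled).
 * The integer division is exact for the factors used above (1, 2 and 4). */
unsigned yuv_bits_per_pixel(unsigned hsub, unsigned vsub)
{
    assert(hsub > 0 && vsub > 0);

    /* 8 bits of Y per pixel, plus U and V (8 bits each) shared by
     * hsub * vsub pixels. */
    return 8 + (2 * 8) / (hsub * vsub);
}

/* Minimum number of bytes for one frame, ignoring padding and alignment. */
unsigned long yuv_frame_size(unsigned width, unsigned height, unsigned bpp)
{
    assert(width > 0 && height > 0 && bpp > 0);

    return ((unsigned long)width * height * bpp + 7) / 8;
}
```

For example, I420 (2x horizontal and 2x vertical subsampling) gives 12 bpp, so
a 384x288 frame needs at least 165888 bytes.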

    MIME type: video/x-raw-rgb
    Properties: endianness = 1234/4321 (INT) <- use G_LITTLE_ENDIAN/G_BIG_ENDIAN
                depth = 15/16/24 (INT, color depth)
                bpp = 16/24/32 (INT, bits used to store each pixel)
                red_mask = bitmask (0x..) (INT)
                green_mask = bitmask (0x..) (INT)
                blue_mask = bitmask (0x..) (INT)

    24 and 32 bit RGB should always be specified as big endian, since any
    little endian format can be transformed into big endian by rearranging the
    color masks. 15 and 16 bit formats should generally have the same byte
    order as

    Color masks are interpreted by loading 'bpp' number of bits using the given
    'endianness', and masking and shifting by each color mask. Loading a 24-bit
    value cannot be done directly, but one can perform an equivalent operation.
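
The load, mask and shift procedure just described can be sketched in plain C as
follows. Loading byte by byte is one way to perform the "equivalent operation"
for 24-bit pixels. The function names are hypothetical, and the sketch assumes
that bpp is a multiple of 8 and that each mask is a contiguous run of bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load one pixel of 'bpp' bits from memory using the given byte order.
 * Reading byte by byte also works for 24-bit pixels, which cannot be
 * loaded with a single native integer read. Hypothetical helper. */
uint32_t load_pixel(const uint8_t *p, unsigned bpp, int big_endian)
{
    unsigned nbytes = bpp / 8;
    unsigned i;
    uint32_t v = 0;

    assert(p != NULL && bpp % 8 == 0 && nbytes >= 1 && nbytes <= 4);

    for (i = 0; i < nbytes; i++) {
        if (big_endian)
            v = (v << 8) | p[i];              /* first byte is most significant */
        else
            v |= (uint32_t)p[i] << (8 * i);   /* first byte is least significant */
    }
    return v;
}

/* Mask out one color component and shift it down to bit 0. */
uint32_t extract_component(uint32_t pixel, uint32_t mask)
{
    assert(mask != 0);

    pixel &= mask;
    while ((mask & 1) == 0) {
        mask >>= 1;
        pixel >>= 1;
    }
    return pixel;
}
```

With 24-bit big endian data and green_mask = 0x00ff00, the bytes 0x11 0x22 0x33
load as 0x112233, and extract_component() returns 0x22 for green.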

    - memory: RRRRRRRR GGGGGGGG BBBBBBBB RRRRRRRR GGGGGGGG ...
      endianness = 4321 (G_BIG_ENDIAN)
      green_mask = 0x00ff00

    - memory: xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG ...
      endianness = 4321 (G_BIG_ENDIAN)

    - memory: GGGBBBBB xRRRRRGG GGGBBBBB xRRRRRGG GGGBBBBB ...
      endianness = 1234 (G_LITTLE_ENDIAN)

The raw audio formats require the following common properties, in addition to
format-specific properties:

  rate       = 1 - MAXINT (INT, sampling rate)
  channels   = 1 - MAXINT (INT, number of audio channels)
  endianness = 1234/4321 (INT) <- use G_LITTLE_ENDIAN/G_BIG_ENDIAN/G_BYTE_ORDER

3 - Raw audio (integer format)
    MIME type: audio/x-raw-int
    Properties: width = 8/16/24/32 (INT, bits used to store each sample)
                depth = 8 - 32 (INT, bits actually used per sample)
                signed = TRUE/FALSE (BOOLEAN)
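
As a small sketch of how 'width', 'depth' and 'channels' relate, the helpers
below check that a width/depth pair is consistent and compute the storage
needed for a number of audio frames (one frame being one sample for every
channel). The function names are hypothetical. Note that the storage size
depends on 'width' only, never on 'depth'.

```c
#include <assert.h>

/* Return 1 if 'depth' (bits actually used) fits into 'width' (bits used to
 * store each sample) and both are within the ranges listed above. */
int raw_int_format_valid(unsigned width, unsigned depth)
{
    if (width != 8 && width != 16 && width != 24 && width != 32)
        return 0;
    return depth >= 8 && depth <= width;
}

/* Bytes needed to store 'frames' audio frames of integer samples. */
unsigned long raw_int_bytes(unsigned long frames, unsigned channels,
                            unsigned width)
{
    assert(channels >= 1);
    assert(width == 8 || width == 16 || width == 24 || width == 32);

    return frames * channels * (width / 8);
}
```

One second of 16-bit stereo audio at rate = 44100 therefore takes
44100 * 2 * 2 = 176400 bytes.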

4 - Raw audio (floating point format)
    MIME type: audio/x-raw-float
    Properties: width = 32/64 (INT)
                buffer-frames: number of audio frames per buffer, 0=undefined

So, a short bit on what plugins should do. Above, I've stated that audio
properties like 'channels' and 'rate' or video properties like 'width' and
'height' are all optional. This doesn't mean you can simply omit them and
everything will still work!

An example is the best way to explain all this. AVI needs the width, height,
rate and channels for the AVI header. So if these properties are missing, the
avimux element cannot properly create the AVI header. On the other hand, MPEG
doesn't have such properties in its header, so the mpegdemux element would need
to parse the separate streams in order to find them out. We don't want that
either, because a plugin only does one job. So normally, mpegdemux and avimux
couldn't be linked for transcoding. To solve this problem, there are stream
parser elements (such as mpegaudioparse, ac3parse and mpeg1videoparse).

The conclusion to draw from this: a plugin provides the information it can, as
seen from its own task/job. If it can't provide information that other elements
still need, a stream parser needs to be written if one doesn't already exist.

On properties that can be described by one of these (properties such as 'width',
'height', 'fps', etc.): they're forbidden and should be handled using filtered

Status of this document
=======================

Not all plugins strictly follow these guidelines yet, but these are the official
types. Plugins not following these specs either use extensions that should be
documented, or are buggy (and should be fixed).

Blame Ronald Bultje <rbultje@ronald.bitfreak.net> aka BBB for any mistakes in