1 <!doctype book PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
2 <!notation PNG system "PNG">
3 <!entity % local.notation.class "| PNG">
8 <title>GMime 2.6 tutorial</title>
9 <date>Oct 28, 2010</date>
12 <firstname>Jeffrey</firstname>
13 <surname>Stedfast</surname>
16 <email>fejj@gnome.org</email>
22 <para>This tutorial is meant to demonstrate how to use the GMime
29 <!-- ***************************************************************** -->
30 <chapter id="ch-availability">
31 <title>Tutorial Availability</title>
33 <para>A copy of this tutorial is distributed with each source-code
34 release of GMime in both SGML and HTML formats. For binary
35 distributions, please check with your vendor.</para>
37 <para>An online version of this tutorial is also available at <ulink
38 url="http://spruce.sourceforge.net/gmime/tutorial/">http://spruce.sourceforge.net/gmime/tutorial/</ulink>.
43 <!-- ***************************************************************** -->
44 <chapter id="ch-introduction">
45 <title>Introduction</title>
<para>GMime is a library for parsing and creating messages using the
Multipurpose Internet Mail Extensions (MIME) format. It is licensed
url="http://www.gnu.org/licenses/licenses.html#LGPL">GNU Lesser General
51 Public License (LGPL)</ulink> so you are free to develop your Free
52 Software applications using GMime without having to spend anything for
53 licenses or royalties.</para>
55 <para>The primary author and maintainer of GMime is:</para>
59 <simpara>Jeffrey Stedfast <ulink url="mailto:fejj@gnome.org">fejj@gnome.org</ulink></simpara>
<para>GMime is essentially an object-oriented application programming
interface (API). Although written completely in C, it is implemented
using the idea of classes and callback functions (pointers to
functions).</para>
68 <para>GMime is built upon another library called GLib which also
69 serves as the foundation for such libraries as the GIMP ToolKit
70 (Gtk+). GLib is mostly a portability layer but also contains
71 additional functionality such as hash tables, linked lists, etc. For
72 more information on GLib, you should see the API reference at <ulink
73 url="http://library.gnome.org/devel/glib/stable/">http://library.gnome.org/devel/glib/stable/</ulink>.</para>
77 <!-- ***************************************************************** -->
78 <chapter id="ch-getting-started">
79 <title>Getting Started</title>
81 <para>The first thing you need to do, of course, is download the
82 GMime source and install it. You can always get the latest version
83 from <ulink url="http://download.gnome.org/pub/GNOME/sources/gmime/">http://download.gnome.org/pub/GNOME/sources/gmime/</ulink>. GMime
84 uses GNU autoconf for configuration. Once untar'd, type
85 <literal>./configure --help</literal> to see a list of options.</para>
87 <para>More information about building GMime is available in either the
88 source distribution under <filename>docs/reference/</filename> or via
89 the online reference at <ulink
90 url="http://library.gnome.org/devel/gmime/stable/gmime-building.html">http://library.gnome.org/devel/gmime/stable/gmime-building.html</ulink>.</para>
94 <!-- ***************************************************************** -->
95 <chapter id="ch-basics">
96 <title>Getting Down to the Basics</title>
98 <!-- ----------------------------------------------------------------- -->
99 <sect1 id="sec-compiling">
100 <title>Compiling</title>
<para>The first thing you need to learn is how to compile your
program with the proper compiler and linker flags so that it picks
up the correct GMime headers and libraries.</para>
106 <para>To compile and link a simple program, you'll want to do
107 the following:</para>
<literal>gcc -g -Wall -o simple simple.c `pkg-config --cflags --libs gmime-2.6`</literal>
116 <!-- ----------------------------------------------------------------- -->
117 <sect1 id="sec-stream-basics">
118 <title>GMimeStream Basics</title>
<para>Before we get too deep into using GMime, it is important
to understand how to use the underlying I/O classes, since GMime
depends on them so heavily.</para>
<para>If you've already looked at the API, you will probably have
noticed that the stream functions work very much like the standard
low-level UNIX I/O functions (those that use file descriptors), with
a few extras taken from the higher-level Standard C I/O
API.</para>
<para>Let's take a moment to regress to our early days of
131 programming where we learned how to write "Hello World!" on the
<programlisting role="C">
#include <stdio.h>
int main (int argc, char **argv)
{
	fprintf (stdout, "Hello World!\n");
	fflush (stdout);
}
</programlisting>
146 <para>Everyone should recognize what that program does. The
147 above program, rewritten to use GMime's stream classes would
148 look something like this:</para>
<programlisting role="C">
#include <stdio.h>
#include <gmime/gmime.h>

int main (int argc, char **argv)
{
	GMimeStream *stream;
	/* initialize GMime */
	g_mime_init (0);

	/* create a stream around stdout */
	stream = g_mime_stream_file_new (stdout);

	/* write the greeting, then flush it */
	g_mime_stream_printf (stream, "Hello World!\n");
	g_mime_stream_flush (stream);

	/* free/close the stream */
	g_object_unref (stream);
	return 0;
}
</programlisting>
<para>Hopefully, the only thing that may be new to you in either
of the above examples is the flushing of the stream after
writing to it. In both examples the call is most likely
unnecessary, but it is there for completeness, and you should
probably get into the habit of flushing a stream after you've
finished writing to it. Like fflush(), g_mime_stream_flush()
183 will flush any write-buffers that the previous write-calls may
186 <para>The first function called in the second example is
187 <literal>g_mime_init</literal> with a value of
188 <literal>0</literal>. If you haven't guessed,
189 <literal>g_mime_init</literal> initializes the GMime library. It
190 takes a single bit-mask argument specifying which options to
191 enable. Currently there is only one optional bit-flag,
192 <literal>GMIME_INIT_FLAG_UTF8</literal> which is the default
193 anyway, so a value of <literal>0</literal> is used here. The
194 UTF-8 flag only exists for historical reasons.</para>
196 <para>The only other line that should need explaining might be:</para>
198 <programlisting role="C">
199 stream = g_mime_stream_file_new (stdout);
<para>This line creates a new GMimeStreamFile object from a
<literal>FILE*</literal> argument. Once the GMimeStreamFile is
created, it takes ownership of the <literal>FILE*</literal>, so be
careful if you want to use that <literal>FILE*</literal> handle
again later in your program, or if you do not want it to be closed
when the GMimeStreamFile is closed later.</para>
210 <para>One way of working around this is to do something like the
211 following example:</para>
<programlisting role="C">
#include <stdio.h>
#include <unistd.h>
#include <gmime/gmime.h>

int main (int argc, char **argv)
{
	GMimeStream *stream;

	/* initialize GMime */
	g_mime_init (0);

	/* create a stream around a duplicate of stdout's descriptor */
	stream = g_mime_stream_fs_new (dup (fileno (stdout)));

	/* write the greeting, then flush it */
	g_mime_stream_printf (stream, "Hello World!\n");
	g_mime_stream_flush (stream);

	/* free/close the stream */
	g_object_unref (stream);
	return 0;
}
</programlisting>
<para>Here we have made a duplicate of stdout's file descriptor to
give to <literal>g_mime_stream_fs_new()</literal>. GMimeStreamFs is
the second type of stream meant for basic I/O, but it uses an integer
file descriptor rather than a <literal>FILE*</literal> handle. The
<literal>fileno()</literal> function returns the integer file
descriptor for a given <literal>FILE*</literal> handle. The
<literal>dup()</literal> function makes a duplicate of the file
descriptor passed to it. You can read more about these two functions
by using <literal>man</literal> on your local UNIX system or by
reading the reference manual for your libc.</para>
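<para>As a rough illustration of why the duplicate matters, the
following sketch uses only Standard C and POSIX calls (no GMime).
The helper name <literal>write_through_duplicate()</literal> is
hypothetical. It writes through a duplicated descriptor, closes the
duplicate, and then checks that the original descriptor is still
open, which is the property the example above relies on.</para>

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write msg through a duplicate of fp's descriptor,
 * close the duplicate, and return 0 if fp's own descriptor is still open
 * (or -1 on any error). */
static int
write_through_duplicate (FILE *fp, const char *msg)
{
	FILE *copy;
	int fd;
	
	assert (fp != NULL && msg != NULL);
	
	/* flush first so previously buffered output keeps its order */
	if (fflush (fp) == EOF)
		return -1;
	
	if ((fd = dup (fileno (fp))) == -1)
		return -1;
	
	if (!(copy = fdopen (fd, "w"))) {
		close (fd);
		return -1;
	}
	
	if (fputs (msg, copy) == EOF) {
		fclose (copy);
		return -1;
	}
	
	/* this closes only the duplicate descriptor */
	if (fclose (copy) == EOF)
		return -1;
	
	/* F_GETFD fails if the original descriptor had been closed */
	return fcntl (fileno (fp), F_GETFD) == -1 ? -1 : 0;
}
```

<para>Calling <literal>write_through_duplicate (stdout, "Hello
World!\n")</literal> should leave <literal>stdout</literal> usable
for later <literal>printf()</literal> calls.</para>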
253 <para>There are also some functions to tell GMimeStreamFile,
254 GMimeStreamFs and GMimeStreamMem that they are not the owners of
255 the backend storage and so when they are destroyed, they should
256 not close the file or free the memory buffer
257 (respectively). These functions are:</para>
259 <programlisting role="C">
260 void g_mime_stream_file_set_owner (GMimeStreamFile *stream, gboolean owner);
261 void g_mime_stream_fs_set_owner (GMimeStreamFs *stream, gboolean owner);
262 void g_mime_stream_mem_set_owner (GMimeStreamMem *stream, gboolean owner);
265 <para>Next, let's examine some of the other stream
268 <programlisting role="C">
269 g_mime_stream_eos (stream);
272 <para>This function is useful for finding out if the
273 End-Of-Stream has been reached. This is similar in functionality
274 to Standard C's <literal>feof()</literal> function.</para>
276 <programlisting role="C">
277 g_mime_stream_reset (stream);
280 <para>This function will reset the state of a stream. Usually
281 this only means 'rewinding' to the beginning of the file. For
282 more complex streams, such as GMimeStreamFilter, however, this
283 will also reset the state of all of the filters that have been
284 attached to it (more on this later).</para>
286 <programlisting role="C">
287 g_mime_stream_length (stream);
<para>This function will return the length of the stream if
known, or -1 otherwise. For the most part, this function should
be avoided unless you absolutely need to know the stream length
and there is no other way to get it. The result may be inaccurate
if any filters are to be applied, and computing it may be slow
depending on the underlying storage device.</para>
298 <programlisting role="C">
299 g_mime_stream_substream (stream, start, end);
302 <para>This function will return a substream of the original
303 stream, where the beginning of the new substream is the start
304 offset and the end is the end offset. These start and end
305 offsets MUST be within the bounds of the original
306 stream. Substreams can be useful if you want to only allow
307 reading and writing to a subsection of the original
310 <programlisting role="C">
311 g_mime_stream_read (stream, buf, n);
314 <para>Like POSIX <literal>read()</literal>, this function will
315 try to read <literal>n</literal> bytes from the stream
316 into <literal>buf</literal>, but be warned that it is not
317 guaranteed to read the full requested buffer size if that much
318 data is not currently available.</para>
320 <programlisting role="C">
321 g_mime_stream_write (stream, buf, n);
324 <para>Like POSIX <literal>write()</literal> and standard
325 C's <literal>fwrite()</literal>, this function will write a
326 buffer of the specified length to the underlying
327 stream. However, unlike the POSIX <literal>write()</literal>
328 function, it will only fail if an irrecoverable error has
329 occurred and so it is not necessary to loop write attempts until
330 the entire buffer is written.</para>
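<para>For comparison, this is the kind of retry loop that plain
POSIX <literal>write()</literal> requires and that, per the
description above, you do not need to write around
<literal>g_mime_stream_write()</literal>. It is a sketch using only
POSIX calls, not GMime's implementation, and the name
<literal>write_all()</literal> is hypothetical.</para>

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all n bytes are written.
 * Returns n on success or -1 on an unrecoverable error. */
static ssize_t
write_all (int fd, const char *buf, size_t n)
{
	size_t written = 0;
	ssize_t w;
	
	assert (buf != NULL);
	
	while (written < n) {
		w = write (fd, buf + written, n - written);
		if (w == -1) {
			/* interrupted by a signal: simply try again */
			if (errno == EINTR)
				continue;
			
			return -1;
		}
		
		/* a short write is not an error; continue where it stopped */
		written += (size_t) w;
	}
	
	return (ssize_t) written;
}
```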
332 <programlisting role="C">
333 g_mime_stream_seek (stream, offset, GMIME_STREAM_SEEK_SET);
334 g_mime_stream_seek (stream, offset, GMIME_STREAM_SEEK_CUR);
335 g_mime_stream_seek (stream, offset, GMIME_STREAM_SEEK_END);
338 <para>This function works exactly like the
339 POSIX <literal>lseek()</literal> or standard
340 C's <literal>fseek()</literal> functions.</para>
344 <!-- ----------------------------------------------------------------- -->
345 <sect1 id="sec-stream-classes">
346 <title>Stream Class Overview</title>
348 <para>There are a number of stream classes included with GMime,
349 but we are only going to go over the more widely useful stream
350 classes. You should be able to figure out the others on your
<para>We've already seen GMimeStreamFile and GMimeStreamFs in
action in the previous section, so let's skip them and start with
GMimeStreamMem.</para>
<para>GMimeStreamMem is a stream abstraction that reads from and
writes to a memory buffer. Like any other stream, the basic
stream functions (read, write, seek, substream, eos, etc.) apply
360 here as well. Internally, GMimeStreamMem uses the GLib
361 GByteArray structure for storage so you may want to read up on
364 <para>There are several ways to instantiate a GMimeStreamMem
365 object. You will probably use
366 <literal>g_mime_stream_mem_new()</literal> most of the
367 time. There may be times, however, when you will already have a
368 memory buffer that you'd like to use as a stream. There are
369 several ways to create a GMimeStreamMem object to use this
370 buffer (or a copy of it).</para>
373 <literal>g_mime_stream_mem_new_with_byte_array()</literal>. This
374 assumes that you are already using a GByteArray and want to use
it as a stream. The ownership rules explained in the previous
section for GMimeStreamFile apply here as well: when the
GMimeStreamMem is destroyed, so are the GByteArray structure and
the memory buffer it contains. To get around this, create a new
GMimeStreamMem object using
<literal>g_mime_stream_mem_new()</literal> and then use
<literal>g_mime_stream_mem_set_byte_array()</literal> to set the
GByteArray as the memory buffer. The GMimeStreamMem then does not
own the GByteArray, so when the GMimeStreamMem object is
destroyed, the GByteArray will
387 <para>Also at your disposal for creating GMimeStreamMem objects
388 with an initial buffer is
389 <literal>g_mime_stream_mem_new_with_buffer()</literal>. This
390 function, however, will duplicate the buffer passed to it so if
391 you have memory quotas you are trying to keep, you may wish to
392 find a way to use one of the above methods.</para>
394 <para>That pretty much sums up how to use GMimeStreamMem. The
395 next most widely used stream class is probably
396 GMimeStreamBuffer. This stream class actually wraps another
397 stream object adding additional functionality such as read and
398 write buffering and a few additional read methods.</para>
400 <para>As you may or may not know, buffering reads and writes is
401 a great way to improve I/O performance in applications. The time
402 it takes to do a lot of small reads and writes accumulates
405 <para>When using a GMimeStreamBuffer in
406 <literal>GMIME_STREAM_BUFFER_BLOCK_READ</literal> mode, a block
407 of 4K (4096 bytes) will be read into an intermediate
408 buffer. Each time your application performs a read on this
409 GMimeStreamBuffer stream, a chunk of that intermediate buffer
410 will be copied to your read buffer until all 4K have been read,
411 at which point GMimeStreamBuffer will pre-scan the next 4K and so
414 <para>Similarly, using mode
415 <literal>GMIME_STREAM_BUFFER_BLOCK_WRITE</literal> will copy
416 each of your application write-buffers into an intermediate 4K
417 buffer. When that 4K buffer fills up, it will be flushed to the
418 underlying stream. You may also use
419 <literal>g_mime_stream_flush()</literal> to force the
420 intermediate buffer to be written to the underlying
<para>Note that the intermediate buffer size is 4096 bytes. If
you will mostly be reading and writing blocks larger than 4K, it
is probably best to avoid GMimeStreamBuffer, as it is unlikely to
gain you any performance and may even decrease it.</para>
<para>GMimeStreamBuffer also adds two convenience functions for
reading. While they will both work with any stream class, they
are much faster when used with a GMimeStreamBuffer in mode
<literal>GMIME_STREAM_BUFFER_BLOCK_READ</literal>. These functions
are:</para>
435 <programlisting role="C">
436 ssize_t g_mime_stream_buffer_gets (GMimeStream *stream, char *buf, size_t max);
438 void g_mime_stream_buffer_readln (GMimeStream *stream, GByteArray *buffer);
441 <para>The first function is similar to Standard C's
442 <literal>fgets()</literal> function (although the arguments are
443 in a slightly different order). It reads up to the first
444 <literal>max - 1</literal> bytes, stopping after a
445 <literal>\n</literal> character, if found. <literal>buf</literal>
446 will always be nul-terminated.</para>
448 <para>The second function,
449 <literal>g_mime_stream_buffer_readln()</literal>, has no
450 Standard C equivalent that I am aware of, but you should get the
451 idea of what it does based on the function name (I hope). It
452 reads exactly one (1) line (including the <literal>\n</literal>
453 character) and appends it to the end of
454 <literal>buffer</literal>.</para>
456 <para>The last stream class you really need to know (and the
457 last one I have the patience to explain) is
458 GMimeStreamFilter. This is a special stream class which you can
459 attach GMimeFilters to so that reading/writing to this stream
460 will automagically convert the stream from one form to
461 another. GMime uses this stream internally for converting base64
462 encoded attachments into their raw form and vice versa.</para>
<para>As mentioned in the previous section concerning
466 <literal>g_mime_stream_reset()</literal>, resetting a
467 GMimeStreamFilter stream will also reset all of the filters
471 <para>A great example usage of GMimeStreamFilter can be found in
472 the <filename>src/uuencode.c</filename> source file found in the
473 source distribution. Here's a clip of that source file
474 illustrating how to use stream filters:</para>
476 <programlisting role="C">
477 GMimeStream *istream, *ostream, *fstream;
483 if (g_mime_stream_printf (ostream, "begin %.3o %s\n", st.st_mode & 0777, name) == -1) {
484 fprintf (stderr, "%s: %s\n", progname, strerror (errno));
485 g_object_unref (ostream);
489 istream = g_mime_stream_fs_new (fd);
491 fstream = g_mime_stream_filter_new (ostream);
493 filter = g_mime_filter_basic_new (GMIME_CONTENT_ENCODING_UUENCODE, TRUE);
494 g_mime_stream_filter_add ((GMimeStreamFilter *) fstream, filter);
495 g_object_unref (filter);
497 if (g_mime_stream_write_to_stream (istream, fstream) == -1) {
498 fprintf (stderr, "%s: %s\n", progname, strerror (errno));
499 g_object_unref (fstream);
500 g_object_unref (istream);
501 g_object_unref (ostream);
505 g_mime_stream_flush (fstream);
506 g_object_unref (fstream);
507 g_object_unref (istream);
509 if (g_mime_stream_write_string (ostream, "end\n") == -1) {
510 fprintf (stderr, "%s: %s\n", progname, strerror (errno));
511 g_object_unref (ostream);
515 g_object_unref (ostream);
518 <para>The above snippet of code will read the contents of the input
519 stream (<literal>istream</literal>) and write it to our output
520 stream (<literal>ostream</literal>), but only after it has
521 passed through our filter-stream
522 (<literal>fstream</literal>). The filter attached to
523 <literal>fstream</literal> is one of the basic MIME filters that
524 encodes data in the traditional UUCP format. You have probably
525 run a program to do this many times in the past using the UNIX
526 command <literal>uuencode</literal>. Never thought writing a
527 replacement for <literal>uuencode</literal> could be so easy,
528 did you? Well, it is. And not only is it <emphasis>that
529 easy</emphasis>, but it also runs faster than the
530 <literal>uuencode</literal> shipped with GNU Sharutils (at least
531 up to and including the 4.2.1 release).</para>
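<para>For readers curious about what the uuencode filter emits
between the <literal>begin</literal> and <literal>end</literal>
lines, here is a small standalone sketch that encodes a single line
of up to 45 input bytes. It is not GMime's implementation; the names
<literal>uu_char()</literal> and <literal>uuencode_line()</literal>
are hypothetical, and the choice of a backquote for the value zero
(rather than a space) is an assumption that follows what many
encoders do.</para>

```c
#include <assert.h>
#include <stddef.h>

/* Map a 6-bit value to its printable character (assumption: 0 -> '`'). */
static char
uu_char (unsigned int v)
{
	return v ? (char) (v + ' ') : '`';
}

/* Hypothetical helper: uuencode n (<= 45) bytes of in as one line.
 * out must have room for 1 + 4 * ((n + 2) / 3) + 2 bytes.
 * Returns the number of characters written, not counting the NUL. */
static size_t
uuencode_line (const unsigned char *in, size_t n, char *out)
{
	unsigned int b0, b1, b2;
	size_t i, o = 0;
	
	assert (in != NULL && out != NULL && n <= 45);
	
	/* the first character encodes the number of input bytes */
	out[o++] = uu_char ((unsigned int) n);
	
	/* every 3 input bytes become 4 characters of 6 bits each */
	for (i = 0; i < n; i += 3) {
		b0 = in[i];
		b1 = i + 1 < n ? in[i + 1] : 0;
		b2 = i + 2 < n ? in[i + 2] : 0;
		
		out[o++] = uu_char (b0 >> 2);
		out[o++] = uu_char (((b0 << 4) | (b1 >> 4)) & 0x3f);
		out[o++] = uu_char (((b1 << 2) | (b2 >> 6)) & 0x3f);
		out[o++] = uu_char (b2 & 0x3f);
	}
	
	out[o++] = '\n';
	out[o] = '\0';
	
	return o;
}
```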
535 <!-- ----------------------------------------------------------------- -->
536 <sect1 id="sec-filter-classes">
537 <title>Filter Class Overview</title>
539 <para>GMime comes pre-bundled with a number of stream filters
540 for your convenience and more may be added in the future. For
541 now, let's breeze through a summary of some of the more important
544 <para>GMimeFilterBasic is used quite a lot internally in GMime
545 for encoding and decoding the content of MIME parts. This class
546 contains a mode for encoding and decoding each of Base64,
547 Quoted-Printable, and Uuencode.</para>
549 <para>If you are interested in converting between charsets for
550 your users, you will likely want to become familiar with
551 GMimeFilterCharset which provides a convenient way to convert
552 text streams of one charset into another charset.</para>
<para>GMimeFilterCRLF will likely become very useful to you if
you are implementing any internet standards or DOS/UNIX
compatibility. This filter converts line endings from the
traditional UNIX sequence (LF) to the internet standard (and DOS)
sequence, CRLF, and vice versa. It can also escape and unescape
lines beginning with '.' in the manner used by the SMTP and POP
protocols.</para>
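<para>To make the two transformations concrete, here is a
standalone sketch of the encoding direction: bare LF becomes CRLF,
and a line that starts with '.' gets an extra '.' prepended. It
illustrates the transformation only and is not how GMimeFilterCRLF
is implemented; the name <literal>crlf_dot_encode()</literal> is
hypothetical.</para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the NUL-terminated string in to out, turning
 * bare LF into CRLF and doubling a '.' at the start of a line.
 * Returns the output length, or (size_t) -1 if out is too small. */
static size_t
crlf_dot_encode (const char *in, char *out, size_t outlen)
{
	char prev = '\0';
	size_t o = 0;
	int bol = 1; /* are we at the beginning of a line? */
	
	assert (in != NULL && out != NULL && outlen > 0);
	
	for (; *in; prev = *in++) {
		/* each step emits at most 2 bytes; keep 1 for the NUL */
		if (o + 3 > outlen)
			return (size_t) -1;
		
		if (bol && *in == '.')
			out[o++] = '.';
		
		/* leave an existing CRLF pair alone */
		if (*in == '\n' && prev != '\r')
			out[o++] = '\r';
		
		out[o++] = *in;
		bol = (*in == '\n');
	}
	
	out[o] = '\0';
	
	return o;
}
```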
<para>GMimeFilterFrom is one you will likely need if you ever
write to an mbox-formatted mail spool. At present, it has two
modes: <literal>GMIME_FILTER_FROM_MODE_ESCAPE</literal> and
<literal>GMIME_FILTER_FROM_MODE_ARMOR</literal>. If you are writing
to an mbox-formatted spool, you will always want to use the
<literal>ESCAPE</literal> mode, which escapes lines beginning with
"From " by prepending a '>' character, resulting in ">From ". The
<literal>ARMOR</literal> mode might come in handy if you are
implementing a multipart/signed method where you are
quoted-printable encoding a text stream and need to special-case
From-lines, protecting the message against UNIX systems that would
otherwise alter it (as the <literal>ESCAPE</literal> mode does)
when writing it to an mbox file. The result is something like
"=46rom ", which removes the need to prepend a '>' character when
the message arrives at a UNIX machine.</para>
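<para>The difference between the two modes can be sketched in a few
lines of standalone C. This is only an illustration of the output
described above, not GMimeFilterFrom's implementation; the function
name <literal>from_escape()</literal> and its
<literal>armor</literal> flag are hypothetical.</para>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the NUL-terminated string in to out, rewriting
 * lines that begin with "From ". With armor == 0 a '>' is prepended; with
 * armor != 0 the 'F' is replaced by its quoted-printable form "=46".
 * Returns the output length, or (size_t) -1 if out is too small. */
static size_t
from_escape (const char *in, int armor, char *out, size_t outlen)
{
	size_t o = 0;
	int bol = 1; /* are we at the beginning of a line? */
	
	assert (in != NULL && out != NULL && outlen > 0);
	
	for (; *in; in++) {
		/* each step emits at most 3 bytes; keep 1 for the NUL */
		if (o + 4 > outlen)
			return (size_t) -1;
		
		if (bol && strncmp (in, "From ", 5) == 0) {
			if (armor) {
				memcpy (out + o, "=46", 3);
				o += 3;
				bol = 0;
				continue;
			}
			
			out[o++] = '>';
		}
		
		out[o++] = *in;
		bol = (*in == '\n');
	}
	
	out[o] = '\0';
	
	return o;
}
```

<para>Note that only a "From " at the very start of a line is
touched; the same text in the middle of a line is copied
unchanged.</para>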
<para>Also included are GMimeFilterBest (which will likely not
concern you), GMimeFilterEnriched (which will convert
text/enriched and/or text/rtf to text/html), GMimeFilterHTML
(which will convert text/plain into text/html, with options to
wrap strings that appear to be hyperlinks in appropriate <a
href=...> tags), GMimeFilterStrip (again, likely this won't
concern you), and finally GMimeFilterYenc, which will encode or
decode the yEnc encoding.</para>
<para>For an example of how to use filters, please see the end
of the previous section, which discusses GMimeStreamFilter and
provides a snippet from
<filename>src/uuencode.c</filename>.</para>
593 <para>Note: Since it may be non-obvious, filters are applied
594 to a stream in the same order that they are added to the
GMimeStreamFilter. This means that if you add a base64 encode
596 filter and then add a CRLF filter, the stream will first be
597 base64 encoded and then the end-of-line formatting will be
598 canonicalised to CRLF.</para>
605 <!-- ***************************************************************** -->
606 <chapter id="ch-mime">
607 <title>MIME, MIME, and more MIME</title>
609 <!-- ----------------------------------------------------------------- -->
610 <sect1 id="sec-mime-part">
611 <title>GMimePart</title>
613 <para>Since most people seem to want to know how to "save an
614 attachment", let's start there.</para>
616 <para>Given a GMimePart object, the first step to saving an
617 attachment is probably going to be figuring out what the
618 filename is. To do that, you'll likely want to do something
621 <programlisting role="C">
623 save_attachment (GMimePart *part)
625 GMimeDataWrapper *content;
626 const char *filename;
630 filename = g_mime_part_get_filename (part);
634 <para>The <literal>g_mime_part_get_filename()</literal> function
635 will first check for a <literal>filename</literal> parameter in
636 the Content-Disposition header. If that parameter exists,
637 it will return the value as the filename. However, if that does
638 not exist, it will fall back to checking for the
639 <literal>name</literal> parameter sometimes found in the
640 Content-Type header and return that value if it exists
641 (Microsoft Outlook, for example, will set the name parameter,
642 but will not set the filename parameter). If neither of these
643 param values are found, it will simply return
644 <literal>NULL</literal>.</para>
646 <para>Now that you've got a filename for the MIME part (well,
647 assuming that it isn't NULL - in which case you'll have to
648 prompt the user or make up your own filename or something), the
649 next step is to open an output stream and write the MIME part's
650 content to disk:</para>
652 <programlisting role="C">
	if ((fd = open (filename, O_CREAT | O_WRONLY | O_TRUNC, 0666)) == -1)
657 stream = g_mime_stream_fs_new (fd);
659 content = g_mime_part_get_content_object (part);
660 g_mime_data_wrapper_write_to_stream (content, stream);
661 g_mime_stream_flush (stream);
662 g_object_unref (stream);
<para>In order to get the content of a MIME part (i.e. the body
of a part, not including the headers), you'll want to use
<literal>g_mime_part_get_content_object()</literal>. To write
the content object to a stream, you can use
<literal>g_mime_data_wrapper_write_to_stream()</literal>. On
failure, this function will return <literal>-1</literal>; otherwise
it will return some positive value which will usually equal
the number of bytes written (but not always, due to filter
transformations). Generally it's a good idea not to rely on the
returned value for anything other than error-checking.</para>