libs/gil/doc/tutorial.rst

   1 Tutorial
   2 ========
   3
   4 .. contents::
   5    :local:
   6
   7 Installation
   8 ------------
   9
  10 The latest version of GIL can be downloaded from http://boostorg.github.com/gil.
  11 GIL consists of header files only and does not require any libraries to
  12 link against. It does not require Boost to be built and including
  13 ``boost/gil.hpp`` will be sufficient for most projects.
  14
  15 Video Lecture by Lubomir Bourdev
  16 --------------------------------
  17
  18 .. raw:: html
  19
  20     <div style="position: relative; overflow: hidden; max-width: 100%; height: auto;">
  21         <iframe width="800" height="450" src="https://www.youtube.com/embed/sR8Wjg0pceE?rel=0" frameborder="0" allow="autoplay=false; encrypted-media" allowfullscreen></iframe>
  22     </div>
  23
  24 Source link: https://www.youtube.com/watch?v=sR8Wjg0pceE
  25
  26 Example - Computing the Image Gradient
  27 --------------------------------------
  28
  29 This tutorial will walk through an example of using GIL to compute the
  30 image gradients. We will start with some very simple and non-generic
  31 code and make it more generic as we go along.  Let us start with a
  32 horizontal gradient and use the simplest possible approximation to a
  33 gradient - central difference.  The gradient at pixel x can be
  34 approximated with the half-difference of its two neighboring pixels:
  35 D[x] = (I[x-1] - I[x+1]) / 2
  36
  37 For simplicity, we will also ignore the boundary cases - the pixels
  38 along the edges of the image for which one of the neighbors is not
  39 defined.  The focus of this document is how to use GIL, not how to
  40 create a good gradient generation algorithm.
  41
  42 Interface and Glue Code
  43 ~~~~~~~~~~~~~~~~~~~~~~~
  44
  45 Let us first start with 8-bit unsigned grayscale image as the input and
  46 8-bit signed grayscale image as the output.
  47 Here is how the interface to our algorithm looks like::
  48
  49   #include <boost/gil.hpp>
  50   using namespace boost::gil;
  51
  52   void x_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
  53   {
  54     assert(src.dimensions() == dst.dimensions());
  55     ...    // compute the gradient
  56   }
  57
  58 ``gray8c_view_t`` is the type of the source image view - an 8-bit
  59 grayscale view, whose pixels are read-only (denoted by the "c"). The
  60 output is a grayscale view with a 8-bit signed (denoted by the "s")
  61 integer channel type. See Appendix 1 for the complete convention GIL
  62 uses to name concrete types.
  63
  64 GIL makes a distinction between an image and an image view. A GIL
  65 **image view**, is a shallow, lightweight view of a rectangular grid
  66 of pixels. It provides access to the pixels but does not own the
  67 pixels. Copy-constructing a view does not deep-copy the pixels. Image
  68 views do not propagate their constness to the pixels and should always
  69 be taken by a const reference. Whether a view is mutable or read-only
  70 (immutable) is a property of the view type.
  71
  72 A GIL `image`, on the other hand, is a view with associated
  73 ownership. It is a container of pixels; its constructor/destructor
  74 allocates/deallocates the pixels, its copy-constructor performs
  75 deep-copy of the pixels and its operator== performs deep-compare of
  76 the pixels. Images also propagate their constness to their pixels - a
  77 constant reference to an image will not allow for modifying its
  78 pixels.
  79
  80 Most GIL algorithms operate on image views; images are rarely
  81 needed. GIL's design is very similar to that of the STL. The STL
  82 equivalent of GIL's image is a container, like ``std::vector``,
  83 whereas GIL's image view corresponds to STL range, which is often
  84 represented with a pair of iterators. STL algorithms operate on
  85 ranges, just like GIL algorithms operate on image views.
  86
  87 GIL's image views can be constructed from raw data - the dimensions,
  88 the number of bytes per row and the pixels, which for chunky views are
  89 represented with one pointer. Here is how to provide the glue between
  90 your code and GIL::
  91
  92   void ComputeXGradientGray8(const unsigned char* src_pixels, ptrdiff_t src_row_bytes, int w, int h,
  93                              signed char* dst_pixels, ptrdiff_t dst_row_bytes)
  94   {
  95     gray8c_view_t src = interleaved_view(w, h, (const gray8_pixel_t*)src_pixels,src_row_bytes);
  96     gray8s_view_t dst = interleaved_view(w, h, (     gray8s_pixel_t*)dst_pixels,dst_row_bytes);
  97     x_gradient(src,dst);
  98   }
  99
 100 This glue code is very fast and views are lightweight - in the above
 101 example the views have a size of 16 bytes. They consist of a pointer
 102 to the top left pixel and three integers - the width, height, and
 103 number of bytes per row.
 104
 105 First Implementation
 106 ~~~~~~~~~~~~~~~~~~~~
 107
 108 Focusing on simplicity at the expense of speed, we can compute the horizontal
 109 gradient like this::
 110
 111   void x_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 112   {
 113     for (int y=0; y<src.height(); ++y)
 114         for (int x=1; x<src.width()-1; ++x)
 115             dst(x,y) = (src(x-1,y) - src(x+1,y)) / 2;
 116   }
 117
 118 We use image view's ``operator(x,y)`` to get a reference to the pixel
 119 at a given location and we set it to the half-difference of its left
 120 and right neighbors.  ``operator()`` returns a reference to a
 121 grayscale pixel. A grayscale pixel is convertible to its channel type
 122 (``unsigned char`` for ``src``) and it can be copy-constructed from a
 123 channel.  (This is only true for grayscale pixels).  While the above
 124 code is easy to read, it is not very fast, because the binary
 125 ``operator()`` computes the location of the pixel in a 2D grid, which
 126 involves addition and multiplication. Here is a faster version of the
 127 above::
 128
 129   void x_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 130   {
 131     for (int y=0; y<src.height(); ++y)
 132     {
 133         gray8c_view_t::x_iterator src_it = src.row_begin(y);
 134         gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
 135
 136         for (int x=1; x<src.width()-1; ++x)
 137             dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
 138     }
 139   }
 140
 141 We use pixel iterators initialized at the beginning of each row. GIL's
 142 iterators are Random Access Traversal iterators. If you are not
 143 familiar with random access iterators, think of them as if they were
 144 pointers. In fact, in the above example the two iterator types are raw
 145 C pointers and their ``operator[]`` is a fast pointer indexing
 146 operator.
 147
 148 The code to compute gradient in the vertical direction is very
 149 similar::
 150
 151   void y_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 152   {
 153     for (int x=0; x<src.width(); ++x)
 154     {
 155         gray8c_view_t::y_iterator src_it = src.col_begin(x);
 156         gray8s_view_t::y_iterator dst_it = dst.col_begin(x);
 157
 158         for (int y=1; y<src.height()-1; ++y)
 159             dst_it[y] = (src_it[y-1] - src_it[y+1])/2;
 160     }
 161   }
 162
 163 Instead of looping over the rows, we loop over each column and create
 164 a \p y_iterator, an iterator moving vertically. In this case a simple
 165 pointer cannot be used because the distance between two adjacent
 166 pixels equals the number of bytes in each row of the image. GIL uses
 167 here a special step iterator class whose size is 8 bytes - it contains
 168 a raw C pointer and a step.  Its ``operator[]`` multiplies the index
 169 by its step.
 170
 171 The above version of ``y_gradient``, however, is much slower (easily
 172 an order of magnitude slower) than ``x_gradient`` because of the
 173 memory access pattern; traversing an image vertically results in lots
 174 of cache misses. A much more efficient and cache-friendly version will
 175 iterate over the columns in the inner loop::
 176
 177   void y_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 178   {
 179     for (int y=1; y<src.height()-1; ++y)
 180     {
 181         gray8c_view_t::x_iterator src1_it = src.row_begin(y-1);
 182         gray8c_view_t::x_iterator src2_it = src.row_begin(y+1);
 183         gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
 184
 185         for (int x=0; x<src.width(); ++x) {
 186             *dst_it = ((*src1_it) - (*src2_it))/2;
 187             ++dst_it;
 188             ++src1_it;
 189             ++src2_it;
 190         }
 191     }
 192   }
 193
 194 This sample code also shows an alternative way of using pixel
 195 iterators - instead of ``operator[]`` one could use increments and
 196 dereferences.
 197
 198 Using Locators
 199 ~~~~~~~~~~~~~~
 200
 201 Unfortunately this cache-friendly version requires the extra hassle of
 202 maintaining two separate iterators in the source view. For every
 203 pixel, we want to access its neighbors above and below it. Such
 204 relative access can be done with GIL locators::
 205
 206   void y_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 207   {
 208     gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
 209     for (int y=1; y<src.height()-1; ++y)
 210     {
 211         gray8s_view_t::x_iterator dst_it  = dst.row_begin(y);
 212
 213         for (int x=0; x<src.width(); ++x)
 214     {
 215             (*dst_it) = (src_loc(0,-1) - src_loc(0,1)) / 2;
 216             ++dst_it;
 217             ++src_loc.x(); // each dimension can be advanced separately
 218         }
 219         src_loc+=point<std::ptrdiff_t>(-src.width(),1); // carriage return
 220     }
 221   }
 222
 223 The first line creates a locator pointing to the first pixel of the
 224 second row of the source view. A GIL pixel locator is very similar to
 225 an iterator, except that it can move both horizontally and
 226 vertically. ``src_loc.x()`` and ``src_loc.y()`` return references to a
 227 horizontal and a vertical iterator respectively, which can be used to
 228 move the locator along the desired dimension, as shown
 229 above. Additionally, the locator can be advanced in both dimensions
 230 simultaneously using its ``operator+=`` and ``operator-=``. Similar to
 231 image views, locators provide binary ``operator()`` which returns a
 232 reference to a pixel with a relative offset to the current locator
 233 position. For example, ``src_loc(0,1)`` returns a reference to the
 234 neighbor below the current pixel.  Locators are very lightweight
 235 objects - in the above example the locator has a size of 8 bytes - it
 236 consists of a raw pointer to the current pixel and an int indicating
 237 the number of bytes from one row to the next (which is the step when
 238 moving vertically). The call to ``++src_loc.x()`` corresponds to a
 239 single C pointer increment.  However, the example above performs more
 240 computations than necessary. The code ``src_loc(0,1)`` has to compute
 241 the offset of the pixel in two dimensions, which is slow.  Notice
 242 though that the offset of the two neighbors is the same, regardless of
 243 the pixel location. To improve the performance, GIL can cache and
 244 reuse this offset::
 245
 246   void y_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 247   {
 248     gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
 249     gray8c_view_t::xy_locator::cached_location_t above = src_loc.cache_location(0,-1);
 250     gray8c_view_t::xy_locator::cached_location_t below = src_loc.cache_location(0, 1);
 251
 252     for (int y=1; y<src.height()-1; ++y)
 253     {
 254         gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
 255
 256         for (int x=0; x<src.width(); ++x)
 257     {
 258             (*dst_it) = (src_loc[above] - src_loc[below])/2;
 259             ++dst_it;
 260             ++src_loc.x();
 261         }
 262         src_loc+=point<std::ptrdiff_t>(-src.width(),1);
 263     }
 264   }
 265
 266 In this example ``src_loc[above]`` corresponds to a fast pointer
 267 indexing operation and the code is efficient.
 268
 269 Creating a Generic Version of GIL Algorithms
 270 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 271
 272 Let us make our ``x_gradient`` more generic. It should work with any
 273 image views, as long as they have the same number of channels.  The
 274 gradient operation is to be computed for each channel
 275 independently. Here is how the new interface looks like::
 276
 277   template <typename SrcView, typename DstView>
 278   void x_gradient(const SrcView& src, const DstView& dst)
 279   {
 280     gil_function_requires<ImageViewConcept<SrcView> >();
 281     gil_function_requires<MutableImageViewConcept<DstView> >();
 282     gil_function_requires<ColorSpacesCompatibleConcept<
 283                                 typename color_space_type<SrcView>::type,
 284                                 typename color_space_type<DstView>::type> >();
 285
 286     ... // compute the gradient
 287   }
 288
 289 The new algorithm now takes the types of the input and output image
 290 views as template parameters.  That allows using both built-in GIL
 291 image views, as well as any user-defined image view classes.  The
 292 first three lines are optional; they use ``boost::concept_check`` to
 293 ensure that the two arguments are valid GIL image views, that the
 294 second one is mutable and that their color spaces are compatible
 295 (i.e. have the same set of channels).
 296
 297 GIL does not require using its own built-in constructs. You are free
 298 to use your own channels, color spaces, iterators, locators, views and
 299 images.  However, to work with the rest of GIL they have to satisfy a
 300 set of requirements; in other words, they have to \e model the
 301 corresponding GIL _concept_.  GIL's concepts are defined in the user
 302 guide.
 303
 304 One of the biggest drawbacks of using templates and generic
 305 programming in C++ is that compile errors can be very difficult to
 306 comprehend.  This is a side-effect of the lack of early type
 307 checking - a generic argument may not satisfy the requirements of a
 308 function, but the incompatibility may be triggered deep into a nested
 309 call, in code unfamiliar and hardly related to the problem.  GIL uses
 310 ``boost::concept_check`` to mitigate this problem. The above three
 311 lines of code check whether the template parameters are valid models
 312 of their corresponding concepts.  If a model is incorrect, the compile
 313 error will be inside ``gil_function_requires``, which is much closer
 314 to the problem and easier to track. Furthermore, such checks get
 315 compiled out and have zero performance overhead. The disadvantage of
 316 using concept checks is the sometimes severe impact they have on
 317 compile time. This is why GIL performs concept checks only in debug
 318 mode, and only if ``BOOST_GIL_USE_CONCEPT_CHECK`` is defined (off by
 319 default).
 320
 321 The body of the generic function is very similar to that of the
 322 concrete one. The biggest difference is that we need to loop over the
 323 channels of the pixel and compute the gradient for each channel::
 324
 325   template <typename SrcView, typename DstView>
 326   void x_gradient(const SrcView& src, const DstView& dst)
 327   {
 328     for (int y=0; y<src.height(); ++y)
 329     {
 330         typename SrcView::x_iterator src_it = src.row_begin(y);
 331         typename DstView::x_iterator dst_it = dst.row_begin(y);
 332
 333         for (int x=1; x<src.width()-1; ++x)
 334             for (int c=0; c<num_channels<SrcView>::value; ++c)
 335                 dst_it[x][c] = (src_it[x-1][c]- src_it[x+1][c])/2;
 336     }
 337   }
 338
 339 Having an explicit loop for each channel could be a performance
 340 problem. GIL allows us to abstract out such per-channel operations::
 341
 342   template <typename Out>
 343   struct halfdiff_cast_channels
 344   {
 345     template <typename T> Out operator()(const T& in1, const T& in2) const
 346     {
 347         return Out((in1-in2)/2);
 348     }
 349   };
 350
 351   template <typename SrcView, typename DstView>
 352   void x_gradient(const SrcView& src, const DstView& dst)
 353   {
 354     typedef typename channel_type<DstView>::type dst_channel_t;
 355
 356     for (int y=0; y<src.height(); ++y)
 357     {
 358         typename SrcView::x_iterator src_it = src.row_begin(y);
 359         typename DstView::x_iterator dst_it = dst.row_begin(y);
 360
 361         for (int x=1; x<src.width()-1; ++x)
 362             static_transform(src_it[x-1], src_it[x+1], dst_it[x],
 363                                halfdiff_cast_channels<dst_channel_t>());
 364     }
 365   }
 366
 367 ``static_transform`` is an example of a channel-level GIL
 368 algorithm. Other such algorithms are ``static_generate``,
 369 ``static_fill`` and ``static_for_each``. They are the channel-level
 370 equivalents of STL ``generate``, ``transform``, ``fill`` and
 371 ``for_each`` respectively. GIL channel algorithms use static recursion
 372 to unroll the loops; they never loop over the channels explicitly.
 373 Note that sometimes modern compilers (at least Visual Studio 8)
 374 already unroll channel-level loops, such as the one above. However,
 375 another advantage of using GIL's channel-level algorithms is that they
 376 pair the channels semantically, not based on their order in
 377 memory. For example, the above example will properly match an RGB
 378 source with a BGR destination.
 379
 380 Here is how we can use our generic version with images of different
 381 types::
 382
 383   // Calling with 16-bit grayscale data
 384   void XGradientGray16_Gray32(const unsigned short* src_pixels, ptrdiff_t src_row_bytes, int w, int h,
 385                               signed int* dst_pixels, ptrdiff_t dst_row_bytes)
 386   {
 387     gray16c_view_t src=interleaved_view(w,h,(const gray16_pixel_t*)src_pixels,src_row_bytes);
 388     gray32s_view_t dst=interleaved_view(w,h,(     gray32s_pixel_t*)dst_pixels,dst_row_bytes);
 389     x_gradient(src,dst);
 390   }
 391
 392   // Calling with 8-bit RGB data into 16-bit BGR
 393   void XGradientRGB8_BGR16(const unsigned char* src_pixels, ptrdiff_t src_row_bytes, int w, int h,
 394                            signed short* dst_pixels, ptrdiff_t dst_row_bytes)
 395   {
 396     rgb8c_view_t  src = interleaved_view(w,h,(const rgb8_pixel_t*)src_pixels,src_row_bytes);
 397     rgb16s_view_t dst = interleaved_view(w,h,(    rgb16s_pixel_t*)dst_pixels,dst_row_bytes);
 398     x_gradient(src,dst);
 399   }
 400
 401   // Either or both the source and the destination could be planar - the gradient code does not change
 402   void XGradientPlanarRGB8_RGB32(
 403            const unsigned short* src_r, const unsigned short* src_g, const unsigned short* src_b,
 404            ptrdiff_t src_row_bytes, int w, int h,
 405            signed int* dst_pixels, ptrdiff_t dst_row_bytes)
 406   {
 407     rgb16c_planar_view_t src=planar_rgb_view (w,h, src_r,src_g,src_b,         src_row_bytes);
 408     rgb32s_view_t        dst=interleaved_view(w,h,(rgb32s_pixel_t*)dst_pixels,dst_row_bytes);
 409     x_gradient(src,dst);
 410   }
 411
 412 As these examples illustrate, both the source and the destination can
 413 be interleaved or planar, of any channel depth (assuming the
 414 destination channel is assignable to the source), and of any
 415 compatible color spaces.
 416
 417 GIL 2.1 can also natively represent images whose channels are not
 418 byte-aligned, such as 6-bit RGB222 image or a 1-bit Gray1 image.  GIL
 419 algorithms apply to these images natively. See the design guide or
 420 sample files for more on using such images.
 421
 422 Image View Transformations
 423 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 424
 425 One way to compute the y-gradient is to rotate the image by 90
 426 degrees, compute the x-gradient and rotate the result back. Here is
 427 how to do this in GIL::
 428
 429   template <typename SrcView, typename DstView>
 430   void y_gradient(const SrcView& src, const DstView& dst)
 431   {
 432     x_gradient(rotated90ccw_view(src), rotated90ccw_view(dst));
 433   }
 434
 435 ``rotated90ccw_view`` takes an image view and returns an image view
 436 representing 90-degrees counter-clockwise rotation of its input. It is
 437 an example of a GIL view transformation function. GIL provides a
 438 variety of transformation functions that can perform any axis-aligned
 439 rotation, transpose the view, flip it vertically or horizontally,
 440 extract a rectangular subimage, perform color conversion, subsample
 441 view, etc. The view transformation functions are fast and shallow -
 442 they don't copy the pixels, they just change the "coordinate system"
 443 of accessing the pixels. ``rotated90cw_view``, for example, returns a
 444 view whose horizontal iterators are the vertical iterators of the
 445 original view. The above code to compute ``y_gradient`` is slow
 446 because of the memory access pattern; using ``rotated90cw_view`` does
 447 not make it any slower.
 448
 449 Another example: suppose we want to compute the gradient of the N-th
 450 channel of a color image. Here is how to do that::
 451
 452   template <typename SrcView, typename DstView>
 453   void nth_channel_x_gradient(const SrcView& src, int n, const DstView& dst)
 454   {
 455     x_gradient(nth_channel_view(src, n), dst);
 456   }
 457
 458 ``nth_channel_view`` is a view transformation function that takes any
 459 view and returns a single-channel (grayscale) view of its N-th
 460 channel.  For interleaved RGB view, for example, the returned view is
 461 a step view - a view whose horizontal iterator skips over two channels
 462 when incremented.  If applied on a planar RGB view, the returned type
 463 is a simple grayscale view whose horizontal iterator is a C pointer.
 464 Image view transformation functions can be piped together. For
 465 example, to compute the y gradient of the second channel of the even
 466 pixels in the view, use::
 467
 468   y_gradient(subsampled_view(nth_channel_view(src, 1), 2,2), dst);
 469
 470 GIL can sometimes simplify piped views. For example, two nested
 471 subsampled views (views that skip over pixels in X and in Y) can be
 472 represented as a single subsampled view whose step is the product of
 473 the steps of the two views.
 474
 475 1D pixel iterators
 476 ~~~~~~~~~~~~~~~~~~
 477
 478 Let's go back to ``x_gradient`` one more time.  Many image view
 479 algorithms apply the same operation for each pixel and GIL provides an
 480 abstraction to handle them. However, our algorithm has an unusual
 481 access pattern, as it skips the first and the last column. It would be
 482 nice and instructional to see how we can rewrite it in canonical
 483 form. The way to do that in GIL is to write a version that works for
 484 every pixel, but apply it only on the subimage that excludes the first
 485 and last column::
 486
 487   void x_gradient_unguarded(const gray8c_view_t& src, const gray8s_view_t& dst)
 488   {
 489     for (int y=0; y<src.height(); ++y)
 490     {
 491         gray8c_view_t::x_iterator src_it = src.row_begin(y);
 492         gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
 493
 494         for (int x=0; x<src.width(); ++x)
 495             dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
 496     }
 497   }
 498
 499   void x_gradient(const gray8c_view_t& src, const gray8s_view_t& dst)
 500   {
 501     assert(src.width()>=2);
 502     x_gradient_unguarded(subimage_view(src, 1, 0, src.width()-2, src.height()),
 503                          subimage_view(dst, 1, 0, src.width()-2, src.height()));
 504   }
 505
 506 ``subimage_view`` is another example of a GIL view transformation
 507 function. It takes a source view and a rectangular region (in this
 508 case, defined as x_min,y_min,width,height) and returns a view
 509 operating on that region of the source view. The above implementation
 510 has no measurable performance degradation from the version that
 511 operates on the original views.
 512
 513 Now that ``x_gradient_unguarded`` operates on every pixel, we can
 514 rewrite it more compactly::
 515
 516   void x_gradient_unguarded(const gray8c_view_t& src, const gray8s_view_t& dst)
 517   {
 518     gray8c_view_t::iterator src_it = src.begin();
 519     for (gray8s_view_t::iterator dst_it = dst.begin(); dst_it!=dst.end(); ++dst_it, ++src_it)
 520         *dst_it = (src_it.x()[-1] - src_it.x()[1]) / 2;
 521   }
 522
 523 GIL image views provide ``begin()`` and ``end()`` methods that return
 524 one dimensional pixel iterators which iterate over each pixel in the
 525 view, left to right and top to bottom. They do a proper "carriage
 526 return" - they skip any unused bytes at the end of a row. As such,
 527 they are slightly suboptimal, because they need to keep track of their
 528 current position with respect to the end of the row. Their increment
 529 operator performs one extra check (are we at the end of the row?), a
 530 check that is avoided if two nested loops are used instead. These
 531 iterators have a method ``x()`` which returns the more lightweight
 532 horizontal iterator that we used previously. Horizontal iterators have
 533 no notion of the end of rows. In this case, the horizontal iterators
 534 are raw C pointers. In our example, we must use the horizontal
 535 iterators to access the two neighbors properly, since they could
 536 reside outside the image view.
 537
 538 STL Equivalent Algorithms
 539 ~~~~~~~~~~~~~~~~~~~~~~~~~
 540
 541 GIL provides STL equivalents of many algorithms. For example,
 542 ``std::transform`` is an STL algorithm that sets each element in a
 543 destination range the result of a generic function taking the
 544 corresponding element of the source range. In our example, we want to
 545 assign to each destination pixel the value of the half-difference of
 546 the horizontal neighbors of the corresponding source pixel.  If we
 547 abstract that operation in a function object, we can use GIL's
 548 ``transform_pixel_positions`` to do that::
 549
 550   struct half_x_difference
 551   {
 552     int operator()(const gray8c_loc_t& src_loc) const
 553     {
 554         return (src_loc.x()[-1] - src_loc.x()[1]) / 2;
 555     }
 556   };
 557
 558   void x_gradient_unguarded(const gray8c_view_t& src, const gray8s_view_t& dst)
 559   {
 560     transform_pixel_positions(src, dst, half_x_difference());
 561   }
 562
 563 GIL provides the algorithms ``for_each_pixel`` and
 564 ``transform_pixels`` which are image view equivalents of STL
 565 ``std::for_each`` and ``std::transform``. It also provides
 566 ``for_each_pixel_position`` and ``transform_pixel_positions``, which
 567 instead of references to pixels, pass to the generic function pixel
 568 locators. This allows for more powerful functions that can use the
 569 pixel neighbors through the passed locators.  GIL algorithms iterate
 570 through the pixels using the more efficient two nested loops (as
 571 opposed to the single loop using 1-D iterators)
 572
 573 Color Conversion
 574 ~~~~~~~~~~~~~~~~
 575
 576 Instead of computing the gradient of each color plane of an image, we
 577 often want to compute the gradient of the luminosity. In other words,
 578 we want to convert the color image to grayscale and compute the
 579 gradient of the result. Here how to compute the luminosity gradient of
 580 a 32-bit float RGB image::
 581
 582   void x_gradient_rgb_luminosity(const rgb32fc_view_t& src, const gray8s_view_t& dst)
 583   {
 584     x_gradient(color_converted_view<gray8_pixel_t>(src), dst);
 585   }
 586
 587 ``color_converted_view`` is a GIL view transformation function that
 588 takes any image view and returns a view in a target color space and
 589 channel depth (specified as template parameters). In our example, it
 590 constructs an 8-bit integer grayscale view over 32-bit float RGB
 591 pixels. Like all other view transformation functions,
 592 ``color_converted_view`` is very fast and shallow. It doesn't copy the
 593 data or perform any color conversion. Instead it returns a view that
 594 performs color conversion every time its pixels are accessed.
 595
 596 In the generic version of this algorithm we might like to convert the
 597 color space to grayscale, but keep the channel depth the same. We do
 598 that by constructing the type of a GIL grayscale pixel with the same
 599 channel as the source, and color convert to that pixel type::
 600
 601   template <typename SrcView, typename DstView>
 602   void x_luminosity_gradient(const SrcView& src, const DstView& dst)
 603   {
 604     typedef pixel<typename channel_type<SrcView>::type, gray_layout_t> gray_pixel_t;
 605     x_gradient(color_converted_view<gray_pixel_t>(src), dst);
 606   }
 607
 608 When the destination color space and channel type happens to be the
 609 same as the source one, color conversion is unnecessary. GIL detects
 610 this case and avoids calling the color conversion code at all -
 611 i.e. ``color_converted_view`` returns back the source view unchanged.
 612
 613 Image
 614 ~~~~~
 615
 616 The above example has a performance problem - ``x_gradient``
 617 dereferences most source pixels twice, which will cause the above code
 618 to perform color conversion twice.  Sometimes it may be more efficient
 619 to copy the color converted image into a temporary buffer and use it
 620 to compute the gradient - that way color conversion is invoked once
 621 per pixel.  Using our non-generic version we can do it like this::
 622
 623   void x_luminosity_gradient(const rgb32fc_view_t& src, const gray8s_view_t& dst)
 624   {
 625     gray8_image_t ccv_image(src.dimensions());
 626     copy_pixels(color_converted_view<gray8_pixel_t>(src), view(ccv_image));
 627
 628     x_gradient(const_view(ccv_image), dst);
 629   }
 630
 631 First we construct an 8-bit grayscale image with the same dimensions
 632 as our source. Then we copy a color-converted view of the source into
 633 the temporary image.  Finally we use a read-only view of the temporary
 634 image in our ``x_gradient algorithm``. As the example shows, GIL
 635 provides global functions ``view`` and ``const_view`` that take an
 636 image and return a mutable or an immutable view of its pixels.
 637
 638 Creating a generic version of the above is a bit trickier::
 639
 640   template <typename SrcView, typename DstView>
 641   void x_luminosity_gradient(const SrcView& src, const DstView& dst)
 642   {
 643     typedef typename channel_type<DstView>::type d_channel_t;
 644     typedef typename channel_convert_to_unsigned<d_channel_t>::type channel_t;
 645     typedef pixel<channel_t, gray_layout_t>  gray_pixel_t;
 646     typedef image<gray_pixel_t, false>       gray_image_t;
 647
 648     gray_image_t ccv_image(src.dimensions());
 649     copy_pixels(color_converted_view<gray_pixel_t>(src), view(ccv_image));
 650     x_gradient(const_view(ccv_image), dst);
 651   }
 652
 653 First we use the ``channel_type`` metafunction to get the channel type
 654 of the destination view. A metafunction is a function operating on
 655 types. In GIL metafunctions are class templates (declared with
 656 ``struct`` type specifier) which take their parameters as template
 657 parameters and return their result in a nested typedef called
 658 ``type``. In this case, ``channel_type`` is a unary metafunction which
 659 in this example is called with the type of an image view and returns
 660 the type of the channel associated with that image view.
 661
 662 GIL constructs that have an associated pixel type, such as pixels,
 663 pixel iterators, locators, views and images, all model
 664 ``PixelBasedConcept``, which means that they provide a set of
 665 metafunctions to query the pixel properties, such as ``channel_type``,
 666 ``color_space_type``, ``channel_mapping_type``, and ``num_channels``.
 667
 668 After we get the channel type of the destination view, we use another
 669 metafunction to remove its sign (if it is a signed integral type) and
 670 then use it to generate the type of a grayscale pixel. From the pixel
 671 type we create the image type. GIL's image class is specialized over
 672 the pixel type and a boolean indicating whether the image should be
 673 planar or interleaved.  Single-channel (grayscale) images in GIL must
 674 always be interleaved. There are multiple ways of constructing types
 675 in GIL. Instead of instantiating the classes directly we could have
 676 used type factory metafunctions. The following code is equivalent::
 677
 678   template <typename SrcView, typename DstView>
 679   void x_luminosity_gradient(const SrcView& src, const DstView& dst)
 680   {
 681     typedef typename channel_type<DstView>::type d_channel_t;
 682     typedef typename channel_convert_to_unsigned<d_channel_t>::type channel_t;
 683     typedef typename image_type<channel_t, gray_layout_t>::type gray_image_t;
 684     typedef typename gray_image_t::value_type gray_pixel_t;
 685
 686     gray_image_t ccv_image(src.dimensions());
 687     copy_and_convert_pixels(src, view(ccv_image));
 688     x_gradient(const_view(ccv_image), dst);
 689   }
 690
 691 GIL provides a set of metafunctions that generate GIL types -
 692 ``image_type`` is one such meta-function that constructs the type of
 693 an image from a given channel type, color layout, and
 694 planar/interleaved option (the default is interleaved). There are also
 695 similar meta-functions to construct the types of pixel references,
 696 iterators, locators and image views. GIL also has metafunctions
 697 ``derived_pixel_reference_type``, ``derived_iterator_type``,
 698 ``derived_view_type`` and ``derived_image_type`` that construct the
 699 type of a GIL construct from a given source one by changing one or
 700 more properties of the type and keeping the rest.
 701
 702 From the image type we can use the nested typedef ``value_type`` to
 703 obtain the type of a pixel. GIL images, image views and locators have
 704 nested typedefs ``value_type`` and ``reference`` to obtain the type of
 705 the pixel and a reference to the pixel. If you have a pixel iterator,
 706 you can get these types from its ``iterator_traits``. Note also the
 707 algorithm ``copy_and_convert_pixels``, which is an abbreviated version
 708 of ``copy_pixels`` with a color converted source view.
 709
 710 Virtual Image Views
 711 ~~~~~~~~~~~~~~~~~~~
 712
 713 So far we have been dealing with images that have pixels stored in
 714 memory. GIL allows you to create an image view of an arbitrary image,
 715 including a synthetic function. To demonstrate this, let us create a
 716 view of the Mandelbrot set.  First, we need to create a function
 717 object that computes the value of the Mandelbrot set at a given
 718 location (x,y) in the image::
 719
 720   // models PixelDereferenceAdaptorConcept
 721   struct mandelbrot_fn
 722   {
 723     typedef point<ptrdiff_t>   point_t;
 724
 725     typedef mandelbrot_fn       const_t;
 726     typedef gray8_pixel_t       value_type;
 727     typedef value_type          reference;
 728     typedef value_type          const_reference;
 729     typedef point_t             argument_type;
 730     typedef reference           result_type;
 731     static bool constexpr is_mutable = false;
 732
 733     mandelbrot_fn() {}
 734     mandelbrot_fn(const point_t& sz) : _img_size(sz) {}
 735
 736     result_type operator()(const point_t& p) const
 737     {
 738         // normalize the coords to (-2..1, -1.5..1.5)
 739         double t=get_num_iter(point<double>(p.x/(double)_img_size.x*3-2, p.y/(double)_img_size.y*3-1.5f));
 740         return value_type((bits8)(pow(t,0.2)*255));   // raise to power suitable for viewing
 741     }
 742   private:
 743     point_t _img_size;
 744
 745     double get_num_iter(const point<double>& p) const
 746     {
 747         point<double> Z(0,0);
 748         for (int i=0; i<100; ++i)  // 100 iterations
 749     {
 750             Z = point<double>(Z.x*Z.x - Z.y*Z.y + p.x, 2*Z.x*Z.y + p.y);
 751             if (Z.x*Z.x + Z.y*Z.y > 4)
 752                 return i/(double)100;
 753         }
 754         return 0;
 755     }
 756   };
 757
 758 We can now use GIL's ``virtual_2d_locator`` with this function object
 759 to construct a Mandelbrot view of size 200x200 pixels::
 760
 761   typedef mandelbrot_fn::point_t point_t;
 762   typedef virtual_2d_locator<mandelbrot_fn,false> locator_t;
 763   typedef image_view<locator_t> my_virt_view_t;
 764
 765   point_t dims(200,200);
 766
 767   // Construct a Mandelbrot view with a locator, taking top-left corner (0,0) and step (1,1)
 768   my_virt_view_t mandel(dims, locator_t(point_t(0,0), point_t(1,1), mandelbrot_fn(dims)));
 769
 770 We can treat the synthetic view just like a real one. For example,
 771 let's invoke our ``x_gradient`` algorithm to compute the gradient of
 772 the 90-degree rotated view of the Mandelbrot set and save the original
 773 and the result::
 774
 775   gray8s_image_t img(dims);
 776   x_gradient(rotated90cw_view(mandel), view(img));
 777
 778   // Save the Mandelbrot set and its 90-degree rotated gradient (jpeg cannot save signed char; must convert to unsigned char)
 779   jpeg_write_view("mandel.jpg",mandel);
 780   jpeg_write_view("mandel_grad.jpg",color_converted_view<gray8_pixel_t>(const_view(img)));
 781
 782 Here is what the two files look like:
 783
 784 .. image:: images/mandel.jpg
 785
 786 Run-Time Specified Images and Image Views
 787 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 788
 789 So far we have created a generic function that computes the image
 790 gradient of an image view template specialization.  Sometimes,
 791 however, the properties of an image view, such as its color space and
 792 channel depth, may not be available at compile time.  GIL's
 793 ``dynamic_image`` extension allows for working with GIL constructs
 794 that are specified at run time, also called _variants_. GIL provides
 795 models of a run-time instantiated image, ``any_image``, and a run-time
 796 instantiated image view, ``any_image_view``. The mechanisms are in
 797 place to create other variants, such as ``any_pixel``,
 798 ``any_pixel_iterator``, etc.  Most of GIL's algorithms and all of the
 799 view transformation functions also work with run-time instantiated
 800 image views and binary algorithms, such as ``copy_pixels`` can have
 801 either or both arguments be variants.
 802
 803 Lets make our ``x_luminosity_gradient`` algorithm take a variant image
 804 view. For simplicity, let's assume that only the source view can be a
 805 variant.  (As an example of using multiple variants, see GIL's image
 806 view algorithm overloads taking multiple variants.)
 807
 808 First, we need to make a function object that contains the templated
 809 destination view and has an application operator taking a templated
 810 source view::
 811
 812   #include <boost/gil/extension/dynamic_image/dynamic_image_all.hpp>
 813
 814   template <typename DstView>
 815   struct x_gradient_obj
 816   {
 817     typedef void result_type;        // required typedef
 818
 819     const DstView& _dst;
 820     x_gradient_obj(const DstView& dst) : _dst(dst) {}
 821
 822     template <typename SrcView>
 823     void operator()(const SrcView& src) const { x_luminosity_gradient(src, _dst); }
 824   };
 825
 826 The second step is to provide an overload of ``x_luminosity_gradient``
 827 that takes image view variant and calls GIL's ``apply_operation``
 828 passing it the function object::
 829
 830   template <typename SrcViews, typename DstView>
 831   void x_luminosity_gradient(const any_image_view<SrcViews>& src, const DstView& dst)
 832   {
 833     apply_operation(src, x_gradient_obj<DstView>(dst));
 834   }
 835
 836 ``any_image_view<SrcViews>`` is the image view variant. It is
 837 templated over ``SrcViews``, an enumeration of all possible view types
 838 the variant can take.  ``src`` contains inside an index of the
 839 currently instantiated type, as well as a block of memory containing
 840 the instance.  ``apply_operation`` goes through a switch statement
 841 over the index, each case of which casts the memory to the correct
 842 view type and invokes the function object with it. Invoking an
 843 algorithm on a variant has the overhead of one switch
 844 statement. Algorithms that perform an operation for each pixel in an
 845 image view have practically no performance degradation when used with
 846 a variant.
 847
 848 Here is how we can construct a variant and invoke the algorithm::
 849
 850   #include <boost/mpl/vector.hpp>
 851   #include <boost/gil/extension/io/jpeg_dynamic_io.hpp>
 852
 853   typedef mpl::vector<gray8_image_t, gray16_image_t, rgb8_image_t, rgb16_image_t> my_img_types;
 854   any_image<my_img_types> runtime_image;
 855   jpeg_read_image("input.jpg", runtime_image);
 856
 857   gray8s_image_t gradient(runtime_image.dimensions());
 858   x_luminosity_gradient(const_view(runtime_image), view(gradient));
 859   jpeg_write_view("x_gradient.jpg", color_converted_view<gray8_pixel_t>(const_view(gradient)));
 860
 861 In this example, we create an image variant that could be 8-bit or
 862 16-bit RGB or grayscale image. We then use GIL's I/O extension to load
 863 the image from file in its native color space and channel depth. If
 864 none of the allowed image types matches the image on disk, an
 865 exception will be thrown.  We then construct a 8 bit signed
 866 (i.e. ``char``) image to store the gradient and invoke ``x_gradient``
 867 on it. Finally we save the result into another file.  We save the view
 868 converted to 8-bit unsigned, because JPEG I/O does not support signed
 869 char.
 870
 871 Note how free functions and methods such as ``jpeg_read_image``,
 872 ``dimensions``, ``view`` and ``const_view`` work on both templated and
 873 variant types.  For templated images ``view(img)`` returns a templated
 874 view, whereas for image variants it returns a view variant.  For
 875 example, the return type of ``view(runtime_image)`` is
 876 ``any_image_view<Views>`` where ``Views`` enumerates four views
 877 corresponding to the four image types.  ``const_view(runtime_image)``
 878 returns a ``any_image_view`` of the four read-only view types, etc.
 879
 880 A warning about using variants: instantiating an algorithm with a
 881 variant effectively instantiates it with every possible type the
 882 variant can take.  For binary algorithms, the algorithm is
 883 instantiated with every possible combination of the two input types!
 884 This can take a toll on both the compile time and the executable size.
 885
 886 Conclusion
 887 ~~~~~~~~~~
 888
 889 This tutorial provides a glimpse at the challenges associated with
 890 writing generic and efficient image processing algorithms in GIL.  We
 891 have taken a simple algorithm and shown how to make it work with image
 892 representations that vary in bit depth, color space, ordering of the
 893 channels, and planar/interleaved structure. We have demonstrated that
 894 the algorithm can work with fully abstracted virtual images, and even
 895 images whose type is specified at run time. The associated video
 896 presentation also demonstrates that even for complex scenarios the
 897 generated assembly is comparable to that of a C version of the
 898 algorithm, hand-written for the specific image types.
 899
 900 Yet, even for such a simple algorithm, we are far from making a fully
 901 generic and optimized code. In particular, the presented algorithms
 902 work on homogeneous images, i.e. images whose pixels have channels
 903 that are all of the same type. There are examples of images, such as a
 904 packed 565 RGB format, which contain channels of different
 905 types. While GIL provides concepts and algorithms operating on
 906 heterogeneous pixels, we leave the task of extending x_gradient as an
 907 exercise for the reader.  Second, after computing the value of the
 908 gradient we are simply casting it to the destination channel
 909 type. This may not always be the desired operation. For example, if
 910 the source channel is a float with range [0..1] and the destination is
 911 unsigned char, casting the half-difference to unsigned char will
 912 result in either 0 or 1. Instead, what we might want to do is scale
 913 the result into the range of the destination channel. GIL's
 914 channel-level algorithms might be useful in such cases. For example,
 915 \p channel_convert converts between channels by linearly scaling the
 916 source channel value into the range of the destination channel.
 917
 918 There is a lot to be done in improving the performance as
 919 well. Channel-level operations, such as the half-difference, could be
 920 abstracted out into atomic channel-level algorithms and performance
 921 overloads could be provided for concrete channel
 922 types. Processor-specific operations could be used, for example, to
 923 perform the operation over an entire row of pixels simultaneously, or
 924 the data could be pre-fetched. All of these optimizations can be
 925 realized as performance specializations of the generic
 926 algorithm. Finally, compilers, while getting better over time, are
 927 still failing to fully optimize generic code in some cases, such as
 928 failing to inline some functions or put some variables into
 929 registers. If performance is an issue, it might be worth trying your
 930 code with different compilers.
 931
 932 Appendix
 933 --------
 934
 935 Naming convention for GIL concrete types
 936 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 937
 938 Concrete (non-generic) GIL types follow this naming convention:
 939
 940 _ColorSpace_ + _BitDepth_ + [``f`` | ``s``]+ [``c``] + [``_planar``] +
 941 [``_step``] + _ClassType_ + ``_t``
 942
 943 Where _ColorSpace_ also indicates the ordering of components. Examples
 944 are ``rgb``, ``bgr``, ``cmyk``, ``rgba``.  _BitDepth_ indicates the
 945 bit depth of the color channel. Examples are ``8``,``16``,``32``. By
 946 default the type of channel is unsigned integral; using ``s``
 947 indicates signed integral and ``f`` - a floating point type, which is
 948 always signed. ``c`` indicates object operating over immutable
 949 pixels. ``_planar`` indicates planar organization (as opposed to
 950 interleaved). ``_step`` indicates special image views, locators and
 951 iterators which traverse the data in non-trivial way (for example,
 952 backwards or every other pixel).  _ClassType_ is ``_image`` (image),
 953 ``_view`` (image view), ``_loc`` (pixel 2D locator) ``_ptr`` (pixel
 954 iterator), ``_ref`` (pixel reference), ``_pixel`` (pixel value).
 955
 956 examples::
 957
 958    bgr8_image_t             a;    // 8-bit interleaved BGR image
 959    cmyk16_pixel_t           b;    // 16-bit CMYK pixel value;
 960    cmyk16c_planar_ref_t     c(b); // const reference to a 16-bit planar CMYK pixel.
 961    rgb32f_planar_step_ptr_t d;    // step pointer to a 32-bit planar RGB pixel.
 962