README.md

Background
==========

libjpeg-turbo is a JPEG image codec that uses SIMD instructions to accelerate
baseline JPEG compression and decompression on x86, x86-64, Arm, PowerPC, and
MIPS systems, as well as progressive JPEG compression on x86 and x86-64
systems.  On such systems, libjpeg-turbo is generally 2-6x as fast as libjpeg,
all else being equal.  On other types of systems, libjpeg-turbo can still
outperform libjpeg by a significant amount, by virtue of its highly optimized
Huffman coding routines.  In many cases, the performance of libjpeg-turbo
rivals that of proprietary high-speed JPEG codecs.

libjpeg-turbo implements both the traditional libjpeg API and the less
powerful but more straightforward TurboJPEG API.  libjpeg-turbo also features
colorspace extensions that allow it to compress from/decompress to 32-bit and
big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java
interface.

libjpeg-turbo was originally based on libjpeg/SIMD, an MMX-accelerated
derivative of libjpeg v6b developed by Miyasaka Masaru.  The TigerVNC and
VirtualGL projects made numerous enhancements to the codec in 2009, and in
early 2010, libjpeg-turbo spun off into an independent project, with the goal
of making high-speed JPEG compression/decompression technology available to a
broader range of users and developers.


License
=======

libjpeg-turbo is covered by three compatible BSD-style open source licenses.
Refer to [LICENSE.md](LICENSE.md) for a roll-up of license terms.


Building libjpeg-turbo
======================

Refer to [BUILDING.md](BUILDING.md) for complete instructions.


Using libjpeg-turbo
===================

libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:

- **TurboJPEG API**<br>
  This API provides an easy-to-use interface for compressing and decompressing
  JPEG images in memory.  It also provides some functionality that would not be
  straightforward to achieve using the underlying libjpeg API, such as
  generating planar YUV images and performing multiple simultaneous lossless
  transforms on an image.  The Java interface for libjpeg-turbo is written on
  top of the TurboJPEG API.  The TurboJPEG API is recommended for first-time
  users of libjpeg-turbo.  Refer to [tjexample.c](tjexample.c) and
  [TJExample.java](java/TJExample.java) for examples of its usage and to
  <http://libjpeg-turbo.org/Documentation/Documentation> for API documentation.

- **libjpeg API**<br>
  This is the de facto industry-standard API for compressing and decompressing
  JPEG images.  It is more difficult to use than the TurboJPEG API but also
  more powerful.  The libjpeg API implementation in libjpeg-turbo is both
  API/ABI-compatible and mathematically compatible with libjpeg v6b.  It can
  also optionally be configured to be API/ABI-compatible with libjpeg v7 and v8
  (see below).  Refer to [cjpeg.c](cjpeg.c) and [djpeg.c](djpeg.c) for examples
  of its usage and to [libjpeg.txt](libjpeg.txt) for API documentation.

There is no significant performance advantage to either API when both are used
to perform similar operations.

Colorspace Extensions
---------------------

libjpeg-turbo includes extensions that allow JPEG images to be compressed
directly from (and decompressed directly to) buffers that use BGR, BGRX,
RGBX, XBGR, and XRGB pixel ordering.  This is implemented with ten new
colorspace constants:

    JCS_EXT_RGB   /* red/green/blue */
    JCS_EXT_RGBX  /* red/green/blue/x */
    JCS_EXT_BGR   /* blue/green/red */
    JCS_EXT_BGRX  /* blue/green/red/x */
    JCS_EXT_XBGR  /* x/blue/green/red */
    JCS_EXT_XRGB  /* x/red/green/blue */
    JCS_EXT_RGBA  /* red/green/blue/alpha */
    JCS_EXT_BGRA  /* blue/green/red/alpha */
    JCS_EXT_ABGR  /* alpha/blue/green/red */
    JCS_EXT_ARGB  /* alpha/red/green/blue */

Setting `cinfo.in_color_space` (compression) or `cinfo.out_color_space`
(decompression) to one of these values will cause libjpeg-turbo to read the
red, green, and blue values from (or write them to) the appropriate position in
the pixel when compressing from/decompressing to an RGB buffer.
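
To make "the appropriate position in the pixel" concrete, the following sketch
tabulates the byte offsets implied by the component order in each constant's
comment.  The `pixel_layout` structure, the `layouts` table, and `read_rgb()`
are hypothetical names used only for illustration.  They are not part of the
libjpeg or TurboJPEG API, and the library maintains its own internal tables.

```c
#include <stddef.h>

/* Hypothetical description of one packed-pixel layout (illustration only) */
struct pixel_layout {
  const char *name;   /* colorspace constant that the row describes */
  int pixel_size;     /* bytes per pixel */
  int r, g, b;        /* byte offsets of red, green, and blue in a pixel */
};

/* Offsets follow the component order given in each constant's comment. */
static const struct pixel_layout layouts[] = {
  { "JCS_EXT_RGB",  3, 0, 1, 2 },
  { "JCS_EXT_RGBX", 4, 0, 1, 2 },
  { "JCS_EXT_BGR",  3, 2, 1, 0 },
  { "JCS_EXT_BGRX", 4, 2, 1, 0 },
  { "JCS_EXT_XBGR", 4, 3, 2, 1 },
  { "JCS_EXT_XRGB", 4, 1, 2, 3 }
};

/* Copy the red, green, and blue samples of pixel x in a row into rgb[0..2]. */
void read_rgb(const unsigned char *row, size_t x,
              const struct pixel_layout *layout, unsigned char rgb[3])
{
  const unsigned char *p = row + x * (size_t)layout->pixel_size;

  rgb[0] = p[layout->r];
  rgb[1] = p[layout->g];
  rgb[2] = p[layout->b];
}
```

The four alpha variants (`JCS_EXT_RGBA`, `JCS_EXT_BGRA`, `JCS_EXT_ABGR`, and
`JCS_EXT_ARGB`) use the same offsets as their X counterparts.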

Your application can check for the existence of these extensions at compile
time with:

    #ifdef JCS_EXTENSIONS

At run time, attempting to use these extensions with a libjpeg implementation
that does not support them will result in a "Bogus input colorspace" error.
Applications can trap this error in order to test whether run-time support is
available for the colorspace extensions.

When using the RGBX, BGRX, XBGR, and XRGB colorspaces during decompression, the
X byte is undefined, and in order to ensure the best performance, libjpeg-turbo
can set that byte to whatever value it wishes.  If an application expects the X
byte to be used as an alpha channel, then it should specify `JCS_EXT_RGBA`,
`JCS_EXT_BGRA`, `JCS_EXT_ABGR`, or `JCS_EXT_ARGB`.  When these colorspace
constants are used, the X byte is guaranteed to be 0xFF, which is interpreted
as opaque.
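
If an application has to decompress with one of the X colorspaces (for
instance, because the alpha extensions are unavailable in the libjpeg
implementation it was built against), it cannot rely on the X byte.  One
possible workaround, sketched below under that assumption, is to overwrite the
X byte after decompression.  `fill_opaque()` is a hypothetical helper, not a
library function.

```c
#include <stddef.h>

/*
 * Set the X component of every 4-byte pixel to 0xFF (opaque).
 *
 * buf      - first byte of the image
 * pitch    - bytes per row, including any padding
 * x_offset - position of the X byte in a pixel (3 for RGBX/BGRX, 0 for
 *            XBGR/XRGB)
 */
void fill_opaque(unsigned char *buf, size_t width, size_t height,
                 size_t pitch, int x_offset)
{
  size_t row, col;

  for (row = 0; row < height; row++) {
    unsigned char *p = buf + row * pitch;

    for (col = 0; col < width; col++)
      p[col * 4 + x_offset] = 0xFF;
  }
}
```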

Your application can check for the existence of the alpha channel colorspace
extensions at compile time with:

    #ifdef JCS_ALPHA_EXTENSIONS

[jcstest.c](jcstest.c), located in the libjpeg-turbo source tree, demonstrates
how to check for the existence of the colorspace extensions at compile time and
run time.

libjpeg v7 and v8 API/ABI Emulation
-----------------------------------

With libjpeg v7 and v8, new features were added that necessitated extending the
compression and decompression structures.  Unfortunately, due to the exposed
nature of those structures, extending them also necessitated breaking backward
ABI compatibility with previous libjpeg releases.  Thus, programs that were
built to use libjpeg v7 or v8 did not work with libjpeg-turbo, since it is
based on the libjpeg v6b code base.  Although libjpeg v7 and v8 are not
as widely used as v6b, enough programs (including a few Linux distros) made
the switch that there was a demand to emulate the libjpeg v7 and v8 ABIs
in libjpeg-turbo.  It should be noted, however, that this feature was added
primarily so that applications that had already been compiled to use libjpeg
v7+ could take advantage of accelerated baseline JPEG encoding/decoding
without recompiling.  libjpeg-turbo does not claim to support all of the
libjpeg v7+ features, nor to produce identical output to libjpeg v7+ in all
cases (see below).

By passing an argument of `-DWITH_JPEG7=1` or `-DWITH_JPEG8=1` to `cmake`, you
can build a version of libjpeg-turbo that emulates the libjpeg v7 or v8 ABI, so
that programs that are built against libjpeg v7 or v8 can be run with
libjpeg-turbo.  The following section describes which libjpeg v7+ features are
supported and which aren't.

### Support for libjpeg v7 and v8 Features

#### Fully supported

- **libjpeg API: IDCT scaling extensions in decompressor**<br>
  libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
  1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
  and 1/2 are SIMD-accelerated).

- **libjpeg API: Arithmetic coding**

- **libjpeg API: In-memory source and destination managers**<br>
  See notes below.

- **cjpeg: Separate quality settings for luminance and chrominance**<br>
  Note that the libjpeg v7+ API was extended to accommodate this feature only
  for convenience purposes.  It has always been possible to implement this
  feature with libjpeg v6b (see rdswitch.c for an example).

- **cjpeg: 32-bit BMP support**

- **cjpeg: `-rgb` option**

- **jpegtran: Lossless cropping**

- **jpegtran: `-perfect` option**

- **jpegtran: Forcing width/height when performing lossless crop**

- **rdjpgcom: `-raw` option**

- **rdjpgcom: Locale awareness**
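
For the IDCT scaling factors listed above, an application usually needs to know
the size of the scaled image in order to allocate its output buffer.  The
sketch below assumes that the scaled dimension is the exact product rounded up
to the next integer.  `scaled_dimension()` is a hypothetical helper, and when
using the libjpeg API, the authoritative values are the `output_width` and
`output_height` fields that the decompressor fills in.

```c
/*
 * Compute ceil(dimension * num / denom) with integer arithmetic, e.g.
 * num = 3 and denom = 8 for a scaling factor of 3/8.
 */
unsigned long scaled_dimension(unsigned long dimension, unsigned int num,
                               unsigned int denom)
{
  return (dimension * num + denom - 1) / denom;
}
```

For example, under this assumption a 1920-pixel-wide image decompressed with a
scaling factor of 1/8 would be 240 pixels wide, and a 101-pixel-wide image
decompressed with a scaling factor of 1/2 would be 51 pixels wide.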


#### Not supported

NOTE:  As of this writing, extensive research has been conducted into the
usefulness of DCT scaling as a means of data reduction and SmartScale as a
means of quality improvement.  Readers are invited to peruse the research at
<http://www.libjpeg-turbo.org/About/SmartScale> and draw their own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

- **libjpeg API: DCT scaling in compressor**<br>
  `cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
  There is no technical reason why DCT scaling could not be supported when
  emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
  below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
  8/9 would be available, which is of limited usefulness.

- **libjpeg API: SmartScale**<br>
  `cinfo.block_size` is silently ignored.
  SmartScale is an extension to the JPEG format that allows for DCT block
  sizes other than 8x8.  Providing support for this new format would be
  feasible (particularly without full acceleration).  However, until/unless
  the format becomes either an official industry standard or, at minimum, an
  accepted solution in the community, we are hesitant to implement it, as
  there is no sense of whether or how it might change in the future.  It is
  our belief that SmartScale has not demonstrated sufficient usefulness as a
  lossless format or as a means of quality enhancement, and thus our primary
  interest in providing this feature would be as a means of supporting
  additional DCT scaling factors.

- **libjpeg API: Fancy downsampling in compressor**<br>
  `cinfo.do_fancy_downsampling` is silently ignored.
  This requires the DCT scaling feature, which is not supported.

- **jpegtran: Scaling**<br>
  This requires both the DCT scaling and SmartScale features, which are not
  supported.

- **Lossless RGB JPEG files**<br>
  This requires the SmartScale feature, which is not supported.
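
The eight compressor scaling factors listed above under "DCT scaling in
compressor" are the fractions 8/N for N = 16 down to 9, reduced to lowest
terms.  The following sketch reproduces that arithmetic.  Both function names
are hypothetical, and the code only illustrates the list of factors, not how
any libjpeg release implements DCT scaling.

```c
/* Greatest common divisor (Euclid's algorithm) */
static unsigned int gcd(unsigned int a, unsigned int b)
{
  while (b != 0) {
    unsigned int t = a % b;

    a = b;
    b = t;
  }
  return a;
}

/* Reduce the fraction 8/n to lowest terms, e.g. 8/14 becomes 4/7. */
void reduce_8_over(unsigned int n, unsigned int *num, unsigned int *denom)
{
  unsigned int g = gcd(8, n);

  *num = 8 / g;
  *denom = n / g;
}
```

Calling `reduce_8_over()` with n = 16, 15, 14, 13, 12, 11, 10, and 9 yields
1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and 8/9.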

### What About libjpeg v9?

libjpeg v9 introduced yet another field to the JPEG compression structure
(`color_transform`), thus making the ABI backward incompatible with that of
libjpeg v8.  This new field was introduced solely for the purpose of supporting
lossless SmartScale encoding.  Furthermore, there was actually no reason to
extend the API in this manner, as the color transform could have just as easily
been activated by way of a new JPEG colorspace constant, thus preserving
backward ABI compatibility.

Our research (see link above) has shown that lossless SmartScale does not
generally accomplish anything that can't already be accomplished better with
existing, standard lossless formats.  Therefore, at this time it is our belief
that there is not sufficient technical justification for software projects to
upgrade from libjpeg v8 to libjpeg v9, and thus there is not sufficient
technical justification for us to emulate the libjpeg v9 ABI.

In-Memory Source/Destination Managers
-------------------------------------

By default, libjpeg-turbo 1.3 and later includes the `jpeg_mem_src()` and
`jpeg_mem_dest()` functions, even when not emulating the libjpeg v8 API/ABI.
Previously, it was necessary to build libjpeg-turbo from source with libjpeg v8
API/ABI emulation in order to use the in-memory source/destination managers,
but several projects requested that those functions be included when emulating
the libjpeg v6b API/ABI as well.  This allows the use of those functions by
programs that need them, without breaking ABI compatibility for programs that
don't, and it allows those functions to be provided in the "official"
libjpeg-turbo binaries.

Those who are concerned about maintaining strict conformance with the libjpeg
v6b or v7 API can pass an argument of `-DWITH_MEM_SRCDST=0` to `cmake` prior to
building libjpeg-turbo.  This will restore the pre-1.3 behavior, in which
`jpeg_mem_src()` and `jpeg_mem_dest()` are only included when emulating the
libjpeg v8 API/ABI.

On Un*x systems, including the in-memory source/destination managers changes
the dynamic library version from 62.2.0 to 62.3.0 if using libjpeg v6b API/ABI
emulation and from 7.2.0 to 7.3.0 if using libjpeg v7 API/ABI emulation.

Note that, on most Un*x systems, the dynamic linker will not look for a
function in a library until that function is actually used.  Thus, if a program
is built against libjpeg-turbo 1.3+ and uses `jpeg_mem_src()` or
`jpeg_mem_dest()`, that program will not fail if run against an older version
of libjpeg-turbo or against libjpeg v7 or earlier until the program actually
tries to call `jpeg_mem_src()` or `jpeg_mem_dest()`.  Such is not the case on
Windows.  If a program is built against the libjpeg-turbo 1.3+ DLL and uses
`jpeg_mem_src()` or `jpeg_mem_dest()`, then it must use the libjpeg-turbo 1.3+
DLL at run time.

Both cjpeg and djpeg have been extended to allow testing the in-memory
source/destination manager functions.  See their respective man pages for more
details.


Mathematical Compatibility
==========================

For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b.  The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:

- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
  slightly more accurate than the implementation in libjpeg v6b, but not by
  any amount perceptible to human vision (generally in the range of 0.01 to
  0.08 dB gain in PSNR).

- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
  (and slightly faster) floating point IDCT algorithm introduced in libjpeg
  v8a as opposed to the algorithm used in libjpeg v6b.  It should be noted,
  however, that this algorithm basically brings the accuracy of the floating
  point IDCT in line with the accuracy of the accurate integer IDCT.  The
  floating point DCT/IDCT algorithms are mainly a legacy feature, and they do
  not produce significantly more accuracy than the accurate integer algorithms
  (to put numbers on this, the typical difference in PSNR between the two
  algorithms is less than 0.10 dB, whereas changing the quality level by 1 in
  the upper range of the quality scale is typically more like a 1.0 dB
  difference).

- If the floating point algorithms in libjpeg-turbo are not implemented using
  SIMD instructions on a particular platform, then the accuracy of the
  floating point DCT/IDCT can depend on the compiler settings.

While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood it is
still using the same algorithms as libjpeg v6b, so there are several specific
cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:

- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
  implements those scaling algorithms differently than libjpeg v6b does, and
  libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.

- When using chrominance subsampling, because libjpeg v8 implements this
  with its DCT/IDCT scaling algorithms rather than with a separate
  downsampling/upsampling algorithm.  In our testing, the subsampled/upsampled
  output of libjpeg v8 is less accurate than that of libjpeg v6b for this
  reason.

- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
  "non-smooth") chrominance upsampling, because libjpeg v8 does not support
  merged upsampling with scaling factors > 1.


Performance Pitfalls
====================

Restart Markers
---------------

The optimized Huffman decoder in libjpeg-turbo does not handle restart markers
in a way that makes the rest of the libjpeg infrastructure happy, so it is
necessary to use the slow Huffman decoder when decompressing a JPEG image that
has restart markers.  This can cause the decompression performance to drop by
as much as 20%, but the performance will still be much greater than that of
libjpeg.  Many consumer packages, such as Photoshop, use restart markers when
generating JPEG images, so images generated by those programs will experience
this issue.
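
An application that wants to know in advance whether an image is likely to
take the slower path can look for a DRI (Define Restart Interval, 0xFFDD)
marker segment in the JPEG headers.  The sketch below is a simplified header
scan, not a libjpeg-turbo function.  `jpeg_restart_interval()` is a
hypothetical name, the scan stops at the first SOS marker (so it ignores any
DRI segment that appears between the scans of a multi-scan image), and it
assumes that every marker before the first SOS carries a length field.

```c
#include <stddef.h>

/*
 * Return the restart interval declared before the first scan of a JPEG
 * image held in memory, 0 if restart markers are not enabled, or -1 if the
 * buffer could not be parsed.
 */
long jpeg_restart_interval(const unsigned char *buf, size_t len)
{
  size_t pos = 2;
  long interval = 0;

  if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
    return -1;                          /* no SOI marker */

  while (pos + 4 <= len) {
    unsigned int marker;
    size_t seglen;

    if (buf[pos] != 0xFF)
      return -1;                        /* not positioned at a marker */
    marker = buf[pos + 1];
    if (marker == 0xFF) {               /* fill byte preceding a marker */
      pos++;
      continue;
    }
    if (marker == 0xD9)                 /* EOI */
      return interval;

    /* The 2-byte big-endian segment length includes the length bytes. */
    seglen = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
    if (seglen < 2 || pos + 2 + seglen > len)
      return -1;

    if (marker == 0xDD) {               /* DRI */
      if (seglen != 4)
        return -1;
      interval = ((long)buf[pos + 4] << 8) | buf[pos + 5];
    }
    if (marker == 0xDA)                 /* SOS: entropy-coded data follows */
      return interval;

    pos += 2 + seglen;
  }
  return -1;                            /* ran out of data before SOS */
}
```

A result greater than zero indicates that the image declares a restart
interval and will therefore be affected by the issue described above.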

Fast Integer Forward DCT at High Quality Levels
-----------------------------------------------

The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100.  Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases.  This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the accurate integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
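
The advice above can be expressed as a small selection helper.  This is a
sketch only.  `choose_forward_dct()` and the `dct_choice` enumeration are
hypothetical stand-ins so that the example compiles without jpeglib.h, and in
a real program the result would be mapped to `JDCT_ISLOW` or `JDCT_IFAST` and
assigned to `cinfo.dct_method` before compression starts.

```c
/* Hypothetical stand-ins for the libjpeg constants JDCT_ISLOW and JDCT_IFAST */
enum dct_choice {
  DCT_ACCURATE_INT,   /* corresponds to JDCT_ISLOW */
  DCT_FAST_INT        /* corresponds to JDCT_IFAST */
};

/*
 * Pick a forward DCT method for the given JPEG quality (1-100).  The fast
 * integer DCT is chosen only if the caller prefers it and the quality is
 * below 98, the range in which SIMD quantization remains usable.
 */
enum dct_choice choose_forward_dct(int quality, int prefer_fast)
{
  if (prefer_fast && quality < 98)
    return DCT_FAST_INT;
  return DCT_ACCURATE_INT;
}
```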


Memory Debugger Pitfalls
========================

Valgrind and Memory Sanitizer (MSan) can generate false positives
(specifically, incorrect reports of uninitialized memory accesses) when used
with libjpeg-turbo's SIMD extensions.  It is generally recommended that the
SIMD extensions be disabled, either by passing an argument of `-DWITH_SIMD=0`
to `cmake` when configuring the build or by setting the environment variable
`JSIMD_FORCENONE` to `1` at run time, when testing libjpeg-turbo with Valgrind,
MSan, or other memory debuggers.
 356 `JSIMD_FORCENONE` to `1` at run time, when testing libjpeg-turbo with Valgrind,
 357 MSan, or other memory debuggers.