ChangeLog.md

   1 3.0.1
   2 =====
   3
   4 ### Significant changes relative to 3.0.0:
   5
   6 1. The x86-64 SIMD functions now use a standard stack frame, prologue, and
   7 epilogue so that debuggers and profilers can reliably capture backtraces from
   8 within the functions.
   9
  10 2. Fixed two minor issues in the interblock smoothing algorithm that caused
  11 mathematical (but not necessarily perceptible) edge block errors when
  12 decompressing progressive JPEG images exactly two MCU blocks in width or that
  13 use vertical chrominance subsampling.
  14
  15 3. Fixed a regression introduced by 3.0 beta2[6] that, in rare cases, caused
  16 the C Huffman encoder (which is not used by default on x86 and Arm CPUs) to
  17 generate incorrect results if the Neon SIMD extensions were explicitly disabled
  18 at build time (by setting the `WITH_SIMD` CMake variable to `0`) in an AArch64
  19 build of libjpeg-turbo.
  20
  21
  22 3.0.0
  23 =====
  24
  25 ### Significant changes relative to 3.0 beta2:
  26
  27 1. The TurboJPEG API now supports 4:4:1 (transposed 4:1:1) chrominance
  28 subsampling, which allows losslessly transposed or rotated 4:1:1 JPEG images to
  29 be losslessly cropped, partially decompressed, or decompressed to planar YUV
  30 images.
  31
  32 2. Fixed various segfaults and buffer overruns (CVE-2023-2804) that occurred
  33 when attempting to decompress various specially-crafted malformed
  34 12-bit-per-component and 16-bit-per-component lossless JPEG images using color
  35 quantization or merged chroma upsampling/color conversion.  The underlying
  36 cause of these issues was that the color quantization and merged chroma
  37 upsampling/color conversion algorithms were not designed with lossless
  38 decompression in mind.  Since libjpeg-turbo explicitly does not support color
  39 conversion when compressing or decompressing lossless JPEG images, merged
  40 chroma upsampling/color conversion never should have been enabled for such
  41 images.  Color quantization is a legacy feature that serves little or no
  42 purpose with lossless JPEG images, so it is also now disabled when
  43 decompressing such images.  (As a result, djpeg can no longer decompress a
  44 lossless JPEG image into a GIF image.)
  45
  46 3. Fixed an oversight in 1.4 beta1[8] that caused various segfaults and buffer
  47 overruns when attempting to decompress various specially-crafted malformed
  48 12-bit-per-component JPEG images using djpeg with both color quantization and
  49 RGB565 color conversion enabled.
  50
  51 4. Fixed an issue whereby `jpeg_crop_scanline()` sometimes miscalculated the
  52 downsampled width for components with 4x2 or 2x4 subsampling factors if
  53 decompression scaling was enabled.  This caused the components to be upsampled
  54 incompletely, which caused the color converter to read from uninitialized
  55 memory.  With 12-bit data precision, this caused a buffer overrun or underrun
  56 and subsequent segfault if the sample value read from uninitialized memory was
  57 outside of the valid sample range.
  58
  59 5. Fixed a long-standing issue whereby the `tj3Transform()` function, when used
  60 with the `TJXOP_TRANSPOSE`, `TJXOP_TRANSVERSE`, `TJXOP_ROT90`, or
  61 `TJXOP_ROT270` transform operation and without automatic JPEG destination
  62 buffer (re)allocation or lossless cropping, computed the worst-case transformed
  63 JPEG image size based on the source image dimensions rather than the
  64 transformed image dimensions.  If a calling program allocated the JPEG
  65 destination buffer based on the transformed image dimensions, as the API
  66 documentation instructs, and attempted to transform a specially-crafted 4:2:2,
  67 4:4:0, 4:1:1, or 4:4:1 JPEG source image containing a large amount of metadata,
  68 the issue caused `tj3Transform()` to overflow the JPEG destination buffer
  69 rather than fail gracefully.  The issue could be worked around by setting
  70 `TJXOPT_COPYNONE`.  Note that, irrespective of this issue, `tj3Transform()`
  71 cannot reliably transform JPEG source images that contain a large amount of
  72 metadata unless automatic JPEG destination buffer (re)allocation is used or
  73 `TJXOPT_COPYNONE` is set.
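
As a rough illustration of the dimension swap described in item 5, the sketch
below derives the transformed image dimensions that a calling program would use
when sizing the JPEG destination buffer.  The `xform_op` enum and the
`transformed_dims()` helper are hypothetical names invented for this example.
They are not part of the TurboJPEG API.

```c
/* Hypothetical stand-ins for the TurboJPEG transform operations. */
typedef enum {
  OP_NONE, OP_HFLIP, OP_VFLIP, OP_TRANSPOSE, OP_TRANSVERSE, OP_ROT90,
  OP_ROT180, OP_ROT270
} xform_op;

/* Transpose, transverse, and 90/270-degree rotation exchange the width and
   height.  Every other operation leaves the dimensions unchanged. */
static void transformed_dims(xform_op op, int src_w, int src_h, int *dst_w,
                             int *dst_h)
{
  int swap = (op == OP_TRANSPOSE || op == OP_TRANSVERSE || op == OP_ROT90 ||
              op == OP_ROT270);

  *dst_w = swap ? src_h : src_w;
  *dst_h = swap ? src_w : src_h;
}
```

A worst-case size estimate computed from `dst_w` and `dst_h` rather than from
the source dimensions matches what the API documentation instructs callers to
allocate.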
  74
  75 6. Fixed a regression introduced by 3.0 beta2[6] that prevented the djpeg
  76 `-map` option from working when decompressing 12-bit-per-component lossy JPEG
  77 images.
  78
  79 7. Fixed an issue that caused the C Huffman encoder (which is not used by
  80 default on x86 and Arm CPUs) to read from uninitialized memory when attempting
  81 to transform a specially-crafted malformed arithmetic-coded JPEG source image
  82 into a baseline Huffman-coded JPEG destination image.
  83
  84
  85 2.1.91 (3.0 beta2)
  86 ==================
  87
  88 ### Significant changes relative to 2.1.5.1:
  89
  90 1. Significantly sped up the computation of optimal Huffman tables.  This
  91 speeds up the compression of tiny images by as much as 2x and provides a
  92 noticeable speedup for images as large as 256x256 when using optimal Huffman
  93 tables.
  94
  95 2. All deprecated fields, constructors, and methods in the TurboJPEG Java API
  96 have been removed.
  97
  98 3. Arithmetic entropy coding is now supported with 12-bit-per-component JPEG
  99 images.
 100
 101 4. Overhauled the TurboJPEG API to address long-standing limitations and to
 102 make the API more extensible and intuitive:
 103
 104      - All C function names are now prefixed with `tj3`, and all version
 105 suffixes have been removed from the function names.  Future API overhauls will
 106 increment the prefix to `tj4`, etc., thus retaining backward API/ABI
 107 compatibility without versioning each individual function.
 108      - Stateless boolean flags have been replaced with stateful integer API
 109 parameters, the values of which persist between function calls.  New
 110 functions/methods (`tj3Set()`/`TJCompressor.set()`/`TJDecompressor.set()` and
 111 `tj3Get()`/`TJCompressor.get()`/`TJDecompressor.get()`) can be used to set and
 112 query the value of a particular API parameter.
 113      - The JPEG quality and subsampling are now implemented using API
 114 parameters rather than stateless function arguments (C) or dedicated set/get
 115 methods (Java.)
 116      - `tj3DecompressHeader()` now stores all relevant information about the
 117 JPEG image, including the width, height, subsampling type, entropy coding
 118 algorithm, etc., in API parameters rather than returning that information
 119 through pointer arguments.
 120      - `TJFLAG_LIMITSCANS`/`TJ.FLAG_LIMITSCANS` has been reimplemented as an
 121 API parameter (`TJPARAM_SCANLIMIT`/`TJ.PARAM_SCANLIMIT`) that allows the number
 122 of scans to be specified.
 123      - Optimized baseline entropy coding (the computation of optimal Huffman
 124 tables, as opposed to using the default Huffman tables) can now be specified,
 125 using a new API parameter (`TJPARAM_OPTIMIZE`/`TJ.PARAM_OPTIMIZE`), a new
 126 transform option (`TJXOPT_OPTIMIZE`/`TJTransform.OPT_OPTIMIZE`), and a new
 127 TJBench option (`-optimize`.)
 128      - Arithmetic entropy coding can now be specified or queried, using a new
 129 API parameter (`TJPARAM_ARITHMETIC`/`TJ.PARAM_ARITHMETIC`), a new transform
 130 option (`TJXOPT_ARITHMETIC`/`TJTransform.OPT_ARITHMETIC`), and a new TJBench
 131 option (`-arithmetic`.)
 132      - The restart marker interval can now be specified, using new API
 133 parameters (`TJPARAM_RESTARTROWS`/`TJ.PARAM_RESTARTROWS` and
 134 `TJPARAM_RESTARTBLOCKS`/`TJ.PARAM_RESTARTBLOCKS`) and a new TJBench option
 135 (`-restart`.)
 136      - Pixel density can now be specified or queried, using new API parameters
 137 (`TJPARAM_XDENSITY`/`TJ.PARAM_XDENSITY`,
 138 `TJPARAM_YDENSITY`/`TJ.PARAM_YDENSITY`, and
 139 `TJPARAM_DENSITYUNITS`/`TJ.PARAM_DENSITYUNITS`.)
 140      - The accurate DCT/IDCT algorithms are now the default for both
 141 compression and decompression, since the "fast" algorithms are considered to be
 142 a legacy feature.  (The "fast" algorithms do not pass the ISO compliance tests,
 143 and those algorithms are not any faster than the accurate algorithms on modern
 144 x86 CPUs.)
 145      - All C initialization functions have been combined into a single function
 146 (`tj3Init()`) that accepts an integer argument specifying the subsystems to
 147 initialize.
 148      - All C functions now use the `const` keyword for pointer arguments that
 149 point to unmodified buffers (and for both dimensions of pointer arguments that
 150 point to sets of unmodified buffers.)
 151      - All C functions now use `size_t` rather than `unsigned long` to
 152 represent buffer sizes, for compatibility with `malloc()` and to avoid
 153 disparities in the size of `unsigned long` between LP64 (Un*x) and LLP64
 154 (Windows) operating systems.
 155      - All C buffer size functions now return 0 if an error occurs, rather than
 156 trying to awkwardly return -1 in an unsigned data type (which could easily be
 157 misinterpreted as a very large value.)
 158      - Decompression scaling is now enabled explicitly, using a new
 159 function/method (`tj3SetScalingFactor()`/`TJDecompressor.setScalingFactor()`),
 160 rather than implicitly using awkward "desired width"/"desired height"
 161 arguments.
 162      - Partial image decompression has been implemented, using a new
 163 function/method (`tj3SetCroppingRegion()`/`TJDecompressor.setCroppingRegion()`)
 164 and a new TJBench option (`-crop`.)
 165      - The JPEG colorspace can now be specified explicitly when compressing,
 166 using a new API parameter (`TJPARAM_COLORSPACE`/`TJ.PARAM_COLORSPACE`.)  This
 167 allows JPEG images with the RGB and CMYK colorspaces to be created.
 168      - TJBench no longer generates error/difference images, since identical
 169 functionality is already available in ImageMagick.
 170      - JPEG images with unknown subsampling configurations can now be
 171 fully decompressed into packed-pixel images or losslessly transformed (with the
 172 exception of lossless cropping.)  They cannot currently be partially
 173 decompressed or decompressed into planar YUV images.
 174      - `tj3Destroy()` now silently accepts a NULL handle.
 175      - `tj3Alloc()` and `tj3Free()` now return/accept void pointers, as
 176 `malloc()` and `free()` do.
 177      - The C image I/O functions now accept a TurboJPEG instance handle, which
 178 is used to transmit/receive API parameter values and to receive error
 179 information.
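
The reasoning behind the new buffer size error value can be sketched in plain
C.  Both functions below are hypothetical and only mimic the two error
conventions.  They do not reproduce any TurboJPEG buffer size calculation.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical old-style size function: reports failure as -1, which wraps
   around to ULONG_MAX when returned through an unsigned type. */
static unsigned long old_style_buf_size(int width, int height)
{
  if (width < 1 || height < 1) return (unsigned long)-1;
  return (unsigned long)width * (unsigned long)height;
}

/* Hypothetical new-style size function: reports failure as 0, which no valid
   image can produce. */
static size_t new_style_buf_size(int width, int height)
{
  if (width < 1 || height < 1) return 0;
  return (size_t)width * (size_t)height;
}
```

A caller that forgets to check the old-style result would pass `ULONG_MAX` to
`malloc()`, whereas the new-style result can be tested with a simple
`if (size == 0)`.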
 180
 181 5. Added support for 8-bit-per-component, 12-bit-per-component, and
 182 16-bit-per-component lossless JPEG images.  A new libjpeg API function
 183 (`jpeg_enable_lossless()`), TurboJPEG API parameters
 184 (`TJPARAM_LOSSLESS`/`TJ.PARAM_LOSSLESS`,
 185 `TJPARAM_LOSSLESSPSV`/`TJ.PARAM_LOSSLESSPSV`, and
 186 `TJPARAM_LOSSLESSPT`/`TJ.PARAM_LOSSLESSPT`), and a cjpeg/TJBench option
 187 (`-lossless`) can be used to create a lossless JPEG image.  (Decompression of
 188 lossless JPEG images is handled automatically.)  Refer to
 189 [libjpeg.txt](libjpeg.txt), [usage.txt](usage.txt), and the TurboJPEG API
 190 documentation for more details.
 191
 192 6. Added support for 12-bit-per-component (lossy and lossless) and
 193 16-bit-per-component (lossless) JPEG images to the libjpeg and TurboJPEG APIs:
 194
 195      - The existing `data_precision` field in `jpeg_compress_struct` and
 196 `jpeg_decompress_struct` has been repurposed to enable the creation of
 197 12-bit-per-component and 16-bit-per-component JPEG images or to detect whether
 198 a 12-bit-per-component or 16-bit-per-component JPEG image is being
 199 decompressed.
 200      - New 12-bit-per-component and 16-bit-per-component versions of
 201 `jpeg_write_scanlines()` and `jpeg_read_scanlines()`, as well as new
 202 12-bit-per-component versions of `jpeg_write_raw_data()`,
 203 `jpeg_skip_scanlines()`, `jpeg_crop_scanline()`, and `jpeg_read_raw_data()`,
 204 provide interfaces for compressing from/decompressing to 12-bit-per-component
 205 and 16-bit-per-component packed-pixel and planar YUV image buffers.
 206      - New 12-bit-per-component and 16-bit-per-component compression,
 207 decompression, and image I/O functions/methods have been added to the TurboJPEG
 208 API, and a new API parameter (`TJPARAM_PRECISION`/`TJ.PARAM_PRECISION`) can be
 209 used to query the data precision of a JPEG image.  (YUV functions are currently
 210 limited to 8-bit data precision but can be expanded to accommodate 12-bit data
 211 precision in the future, if such is deemed beneficial.)
 212      - A new cjpeg and TJBench command-line argument (`-precision`) can be used
 213 to create a 12-bit-per-component or 16-bit-per-component JPEG image.
 214 (Decompression and transformation of 12-bit-per-component and
 215 16-bit-per-component JPEG images is handled automatically.)
 216
 217     Refer to [libjpeg.txt](libjpeg.txt), [usage.txt](usage.txt), and the
 218 TurboJPEG API documentation for more details.
 219
 220
 221 2.1.5.1
 222 =======
 223
 224 ### Significant changes relative to 2.1.5:
 225
 226 1. The SIMD dispatchers in libjpeg-turbo 2.1.4 and prior stored the list of
 227 supported SIMD instruction sets in a global variable, which caused an innocuous
 228 race condition whereby the variable could have been initialized multiple times
 229 if `jpeg_start_*compress()` was called simultaneously in multiple threads.
 230 libjpeg-turbo 2.1.5 included an undocumented attempt to fix this race condition
 231 by making the SIMD support variable thread-local.  However, that caused another
 232 issue whereby, if `jpeg_start_*compress()` was called in one thread and
 233 `jpeg_read_*()` or `jpeg_write_*()` was called in a second thread, the SIMD
 234 support variable was never initialized in the second thread.  On x86 systems,
 235 this led the second thread to incorrectly assume that AVX2 instructions were
 236 always available, and when it attempted to use those instructions on older x86
 237 CPUs that do not support them, an illegal instruction error occurred.  The SIMD
 238 dispatchers now ensure that the SIMD support variable is initialized before
 239 dispatching based on its value.
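
The "initialize before dispatching" pattern described above can be sketched as
follows.  This is a minimal illustration that assumes C11 thread-local storage.
All of the names (`simd_support`, `detect_simd()`, `get_simd_support()`, and
the capability bits) are hypothetical, and the detection routine is a
placeholder rather than real CPU feature detection.

```c
/* Sentinel meaning "not yet detected in this thread". */
#define SIMD_UNINITIALIZED  (~0U)

/* Hypothetical capability bits. */
#define SIMD_SSE2  0x01U
#define SIMD_AVX2  0x02U

/* One copy of the variable per thread (C11 thread-local storage). */
static _Thread_local unsigned int simd_support = SIMD_UNINITIALIZED;

/* Placeholder for real CPU feature detection.  This sketch pretends that only
   SSE2 is available. */
static unsigned int detect_simd(void)
{
  return SIMD_SSE2;
}

/* Every dispatcher calls this first, so a thread that never ran the start
   function still sees an initialized value. */
static unsigned int get_simd_support(void)
{
  if (simd_support == SIMD_UNINITIALIZED)
    simd_support = detect_simd();
  return simd_support;
}

/* Example dispatcher check that goes through the accessor. */
static int can_use_avx2(void)
{
  return (get_simd_support() & SIMD_AVX2) != 0;
}
```

In this sketch, testing the sentinel value directly against `SIMD_AVX2` would
succeed because every bit is set, which is the kind of false positive that the
accessor prevents.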
 240
 241
 242 2.1.5
 243 =====
 244
 245 ### Significant changes relative to 2.1.4:
 246
 247 1. Fixed issues in the build system whereby, when using the Ninja Multi-Config
 248 CMake generator, a static build of libjpeg-turbo (a build in which
 249 `ENABLE_SHARED` is `0`) could not be installed, a Windows installer could not
 250 be built, and the Java regression tests failed.
 251
 252 2. Fixed a regression introduced by 2.0 beta1[15] that caused a buffer overrun
 253 in the progressive Huffman encoder when attempting to transform a
 254 specially-crafted malformed 12-bit-per-component JPEG image into a progressive
 255 12-bit-per-component JPEG image using a 12-bit-per-component build of
 256 libjpeg-turbo (`-DWITH_12BIT=1`.)  Given that the buffer overrun was fully
 257 contained within the progressive Huffman encoder structure and did not cause a
 258 segfault or other user-visible errant behavior, given that the lossless
 259 transformer (unlike the decompressor) is not generally exposed to arbitrary
 260 data exploits, and given that 12-bit-per-component builds of libjpeg-turbo are
 261 uncommon, this issue did not likely pose a security risk.
 262
 263 3. Fixed an issue whereby, when using a 12-bit-per-component build of
 264 libjpeg-turbo (`-DWITH_12BIT=1`), passing samples with values greater than 4095
 265 or less than 0 to `jpeg_write_scanlines()` caused a buffer overrun or underrun
 266 in the RGB-to-YCbCr color converter.
 267
 268 4. Fixed a floating point exception that occurred when attempting to use the
 269 jpegtran `-drop` and `-trim` options to losslessly transform a
 270 specially-crafted malformed JPEG image.
 271
 272 5. Fixed an issue in `tjBufSizeYUV2()` whereby it returned a bogus result,
 273 rather than throwing an error, if the `align` parameter was not a power of 2.
 274 Fixed a similar issue in `tjCompressFromYUV()` whereby it generated a corrupt
 275 JPEG image in certain cases, rather than throwing an error, if the `align`
 276 parameter was not a power of 2.
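
A small sketch shows why a non-power-of-2 alignment produces a bogus result
when padding is implemented with a bit mask.  The helper names are hypothetical
and the code is illustrative only.

```c
/* Nonzero if x is a positive power of 2 (exactly one bit set). */
static int is_power_of_2(int x)
{
  return x > 0 && (x & (x - 1)) == 0;
}

/* Round v up to a multiple of p using a bit mask.  The mask trick is valid
   only when p is a power of 2. */
static int pad_to(int v, int p)
{
  return (v + p - 1) & ~(p - 1);
}

/* Hypothetical checked wrapper: returns -1 rather than a bogus value when the
   alignment is not a power of 2. */
static int checked_pad(int v, int align)
{
  if (v < 0 || !is_power_of_2(align)) return -1;
  return pad_to(v, align);
}
```

With an alignment of 4, `pad_to(5, 4)` yields 8 as expected.  With an alignment
of 3, `pad_to(5, 3)` yields 5, which is not a multiple of 3, so validating the
argument first is the safer choice.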
 277
 278 6. Fixed an issue whereby `tjDecompressToYUV2()`, which is a wrapper for
 279 `tjDecompressToYUVPlanes()`, used the desired YUV image dimensions rather than
 280 the actual scaled image dimensions when computing the plane pointers and
 281 strides to pass to `tjDecompressToYUVPlanes()`.  This caused a buffer overrun
 282 and subsequent segfault if the desired image dimensions exceeded the scaled
 283 image dimensions.
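
The difference between the desired and the actual scaled dimensions can be
sketched as follows.  The list of scaling factors and the `fit_scaled_dims()`
helper are assumptions made for this example.  The real set of supported
scaling factors is larger.

```c
#include <stddef.h>

typedef struct { int num, denom; } scaling_factor;

/* Hypothetical subset of scaling factors, largest first. */
static const scaling_factor factors[] = {
  { 1, 1 }, { 1, 2 }, { 1, 4 }, { 1, 8 }
};
#define NUM_FACTORS  (sizeof(factors) / sizeof(factors[0]))

/* Scale a dimension by num/denom, rounding up.  Assumes dimension >= 0,
   num > 0, and denom > 0. */
static int scaled_dim(int dimension, int num, int denom)
{
  return (dimension * num + denom - 1) / denom;
}

/* Find the largest scaled size that fits within the desired size.  Returns 0
   and stores the actual scaled dimensions on success, or returns -1 if no
   factor fits.  Plane pointers and strides should be derived from
   *scaled_w and *scaled_h, not from desired_w and desired_h. */
static int fit_scaled_dims(int jpeg_w, int jpeg_h, int desired_w,
                           int desired_h, int *scaled_w, int *scaled_h)
{
  size_t i;

  for (i = 0; i < NUM_FACTORS; i++) {
    int w = scaled_dim(jpeg_w, factors[i].num, factors[i].denom);
    int h = scaled_dim(jpeg_h, factors[i].num, factors[i].denom);

    if (w <= desired_w && h <= desired_h) {
      *scaled_w = w;
      *scaled_h = h;
      return 0;
    }
  }
  return -1;
}
```

For a 1000x750 JPEG image and a desired size of 600x600, this sketch selects
500x375.  Computing plane offsets from 600x600 instead would describe a larger
image than the one actually produced.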
 284
 285 7. Fixed an issue whereby, when decompressing a 12-bit-per-component JPEG image
 286 (`-DWITH_12BIT=1`) using an alpha-enabled output color space such as
 287 `JCS_EXT_RGBA`, the alpha channel was set to 255 rather than 4095.
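
The relationship between data precision and the opaque alpha value can be
sketched as follows.  The helper names and the use of `unsigned short` samples
are assumptions made for this example, not the library's internal types.

```c
/* Largest sample value representable at the given data precision, e.g. 255
   for 8 bits and 4095 for 12 bits.  Assumes 1 <= precision <= 16. */
static int max_sample_value(int precision)
{
  return (1 << precision) - 1;
}

/* Fill the alpha channel of an RGBA row with the opaque value for the given
   precision.  Samples are stored as unsigned short so that 12-bit values
   fit. */
static void set_opaque_alpha(unsigned short *row, int num_pixels,
                             int precision)
{
  int i;

  for (i = 0; i < num_pixels; i++)
    row[i * 4 + 3] = (unsigned short)max_sample_value(precision);
}
```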
 288
 289 8. Fixed an issue whereby the Java version of TJBench did not accept a range of
 290 quality values.
 291
 292 9. Fixed an issue whereby, when `-progressive` was passed to TJBench, the JPEG
 293 input image was not transformed into a progressive JPEG image prior to
 294 decompression.
 295
 296
 297 2.1.4
 298 =====
 299
 300 ### Significant changes relative to 2.1.3:
 301
 302 1. Fixed a regression introduced in 2.1.3 that caused build failures with
 303 Visual Studio 2010.
 304
 305 2. The `tjDecompressHeader3()` function in the TurboJPEG C API and the
 306 `TJDecompressor.setSourceImage()` method in the TurboJPEG Java API now accept
 307 "abbreviated table specification" (AKA "tables-only") datastreams, which can be
 308 used to prime the decompressor with quantization and Huffman tables that can be
 309 used when decompressing subsequent "abbreviated image" datastreams.
 310
 311 3. libjpeg-turbo now performs run-time detection of AltiVec instructions on
 312 OS X/PowerPC systems if AltiVec instructions are not enabled at compile time.
 313 This allows both AltiVec-equipped (PowerPC G4 and G5) and non-AltiVec-equipped
 314 (PowerPC G3) CPUs to be supported using the same build of libjpeg-turbo.
 315
 316 4. Fixed an error ("Bogus virtual array access") that occurred when attempting
 317 to decompress a progressive JPEG image with a height less than or equal to one
 318 iMCU (8 * the vertical sampling factor) using buffered-image mode with
 319 interblock smoothing enabled.  This was a regression introduced by
 320 2.1 beta1[6(b)].
 321
 322 5. Fixed two issues that prevented partial image decompression from working
 323 properly with buffered-image mode:
 324
 325      - Attempting to call `jpeg_crop_scanline()` after
 326 `jpeg_start_decompress()` but before `jpeg_start_output()` resulted in an error
 327 ("Improper call to JPEG library in state 207".)
 328      - Attempting to use `jpeg_skip_scanlines()` resulted in an error ("Bogus
 329 virtual array access") under certain circumstances.
 330
 331
 332 2.1.3
 333 =====
 334
 335 ### Significant changes relative to 2.1.2:
 336
 337 1. Fixed a regression introduced by 2.0 beta1[7] whereby cjpeg compressed PGM
 338 input files into full-color JPEG images unless the `-grayscale` option was
 339 used.
 340
 341 2. cjpeg now automatically compresses GIF and 8-bit BMP input files into
 342 grayscale JPEG images if the input files contain only shades of gray.
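
One plausible way to detect a palette that contains only shades of gray is to
check that the red, green, and blue values of every colormap entry are equal.
The function below is a hypothetical sketch of that check and is not taken from
the cjpeg source.

```c
/* Returns 1 if every colormap entry has equal red, green, and blue values
   (that is, the palette contains only shades of gray), otherwise 0.  The
   colormap is assumed to be a flat array of num_entries RGB triplets. */
static int colormap_is_gray(const unsigned char *colormap, int num_entries)
{
  int i;

  for (i = 0; i < num_entries; i++) {
    const unsigned char *rgb = &colormap[i * 3];

    if (rgb[0] != rgb[1] || rgb[1] != rgb[2]) return 0;
  }
  return 1;
}
```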
 343
 344 3. The build system now enables the intrinsics implementation of the AArch64
 345 (Arm 64-bit) Neon SIMD extensions by default when using GCC 12 or later.
 346
 347 4. Fixed a segfault that occurred while decompressing a 4:2:0 JPEG image using
 348 the merged (non-fancy) upsampling algorithms (that is, with
 349 `cinfo.do_fancy_upsampling` set to `FALSE`) along with `jpeg_crop_scanline()`.
 350 Specifically, the segfault occurred if the number of bytes remaining in the
 351 output buffer was less than the number of bytes required to represent one
 352 uncropped scanline of the output image.  For that reason, the issue could only
 353 be reproduced using the libjpeg API, not using djpeg.
 354
 355
 356 2.1.2
 357 =====
 358
 359 ### Significant changes relative to 2.1.1:
 360
 361 1. Fixed a regression introduced by 2.1 beta1[13] that caused the remaining
 362 GAS implementations of AArch64 (Arm 64-bit) Neon SIMD functions (which are used
 363 by default with GCC for performance reasons) to be placed in the `.rodata`
 364 section rather than in the `.text` section.  This caused the GNU linker to
 365 automatically place the `.rodata` section in an executable segment, which
 366 prevented libjpeg-turbo from working properly with other linkers and also
 367 represented a potential security risk.
 368
 369 2. Fixed an issue whereby the `tjTransform()` function incorrectly computed the
 370 MCU block size for 4:4:4 JPEG images with non-unary sampling factors and thus
 371 unduly rejected some cropping regions, even though those regions aligned with
 372 8x8 MCU block boundaries.
 373
 374 3. Fixed a regression introduced by 2.1 beta1[13] that caused the build system
  375 to enable the Arm Neon SIMD extensions when targeting Armv6 and other legacy
 376 architectures that do not support Neon instructions.
 377
 378 4. libjpeg-turbo now performs run-time detection of AltiVec instructions on
 379 FreeBSD/PowerPC systems if AltiVec instructions are not enabled at compile
 380 time.  This allows both AltiVec-equipped and non-AltiVec-equipped CPUs to be
 381 supported using the same build of libjpeg-turbo.
 382
 383 5. cjpeg now accepts a `-strict` argument similar to that of djpeg and
 384 jpegtran, which causes the compressor to abort if an LZW-compressed GIF input
 385 image contains incomplete or corrupt image data.
 386
 387
 388 2.1.1
 389 =====
 390
 391 ### Significant changes relative to 2.1.0:
 392
 393 1. Fixed a regression introduced in 2.1.0 that caused build failures with
 394 non-GCC-compatible compilers for Un*x/Arm platforms.
 395
 396 2. Fixed a regression introduced by 2.1 beta1[13] that prevented the Arm 32-bit
 397 (AArch32) Neon SIMD extensions from building unless the C compiler flags
 398 included `-mfloat-abi=softfp` or `-mfloat-abi=hard`.
 399
 400 3. Fixed an issue in the AArch32 Neon SIMD Huffman encoder whereby reliance on
 401 undefined C compiler behavior led to crashes ("SIGBUS: illegal alignment") on
 402 Android systems when running AArch32/Thumb builds of libjpeg-turbo built with
 403 recent versions of Clang.
 404
 405 4. Added a command-line argument (`-copy icc`) to jpegtran that causes it to
 406 copy only the ICC profile markers from the source file and discard any other
 407 metadata.
 408
 409 5. libjpeg-turbo should now build and run on CHERI-enabled architectures, which
 410 use capability pointers that are larger than the size of `size_t`.
 411
 412 6. Fixed a regression (CVE-2021-37972) introduced by 2.1 beta1[5] that caused a
 413 segfault in the 64-bit SSE2 Huffman encoder when attempting to losslessly
 414 transform a specially-crafted malformed JPEG image.
 415
 416
 417 2.1.0
 418 =====
 419
 420 ### Significant changes relative to 2.1 beta1:
 421
 422 1. Fixed a regression (CVE-2021-29390) introduced by 2.1 beta1[6(b)] whereby
 423 attempting to decompress certain progressive JPEG images with one or more
 424 component planes of width 8 or less caused a buffer overrun.
 425
 426 2. Fixed a regression introduced by 2.1 beta1[6(b)] whereby attempting to
 427 decompress a specially-crafted malformed progressive JPEG image caused the
 428 block smoothing algorithm to read from uninitialized memory.
 429
 430 3. Fixed an issue in the Arm Neon SIMD Huffman encoders that caused the
 431 encoders to generate incorrect results when using the Clang compiler with
 432 Visual Studio.
 433
 434 4. Fixed a floating point exception (CVE-2021-20205) that occurred when
 435 attempting to compress a specially-crafted malformed GIF image with a specified
 436 image width of 0 using cjpeg.
 437
 438 5. Fixed a regression introduced by 2.0 beta1[15] whereby attempting to
 439 generate a progressive JPEG image on an SSE2-capable CPU using a scan script
 440 containing one or more scans with lengths divisible by 32 and non-zero
 441 successive approximation low bit positions would, under certain circumstances,
 442 result in an error ("Missing Huffman code table entry") and an invalid JPEG
 443 image.
 444
 445 6. Introduced a new flag (`TJFLAG_LIMITSCANS` in the TurboJPEG C API and
 446 `TJ.FLAG_LIMITSCANS` in the TurboJPEG Java API) and a corresponding TJBench
 447 command-line argument (`-limitscans`) that causes the TurboJPEG decompression
 448 and transform functions/operations to return/throw an error if a progressive
 449 JPEG image contains an unreasonably large number of scans.  This allows
 450 applications that use the TurboJPEG API to guard against an exploit of the
 451 progressive JPEG format described in the report
 452 ["Two Issues with the JPEG Standard"](https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf).
 453
 454 7. The PPM reader now throws an error, rather than segfaulting (due to a buffer
 455 overrun, CVE-2021-46822) or generating incorrect pixels, if an application
 456 attempts to use the `tjLoadImage()` function to load a 16-bit binary PPM file
 457 (a binary PPM file with a maximum value greater than 255) into a grayscale
 458 image buffer or to load a 16-bit binary PGM file into an RGB image buffer.
 459
 460 8. Fixed an issue in the PPM reader that caused incorrect pixels to be
 461 generated when using the `tjLoadImage()` function to load a 16-bit binary PPM
 462 file into an extended RGB image buffer.
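
Handling a 16-bit binary PPM sample involves two steps: assembling the
big-endian value and rescaling it to the range of the destination buffer.  The
sketch below assumes an 8-bit destination and uses hypothetical helper names.
It is not the PPM reader's actual code.

```c
/* Read one 16-bit sample from a binary PPM/PGM stream.  The Netpbm formats
   store 16-bit samples most significant byte first. */
static unsigned int read_be16(const unsigned char *p)
{
  return ((unsigned int)p[0] << 8) | p[1];
}

/* Rescale a sample from the range [0, maxval] to [0, 255], rounding to
   nearest.  Assumes 0 < maxval <= 65535 and sample <= maxval. */
static unsigned char rescale_to_8bit(unsigned int sample, unsigned int maxval)
{
  return (unsigned char)((sample * 255U + maxval / 2U) / maxval);
}
```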
 463
 464 9. Fixed an issue whereby, if a JPEG buffer was automatically re-allocated by
 465 one of the TurboJPEG compression or transform functions and an error
 466 subsequently occurred during compression or transformation, the JPEG buffer
 467 pointer passed by the application was not updated when the function returned.
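
The general pattern behind this kind of fix can be sketched in plain C: once a
buffer has been re-allocated, the caller's pointer must be updated on every
return path, including the error path.  `grow_then_work()` and its
`simulate_failure` argument are hypothetical and exist only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Grow *buf to at least "needed" bytes and then perform some work that may
   fail.  Returns 0 on success or -1 on error.  On every return path after a
   successful re-allocation, *buf and *size are updated so that the caller
   never holds a stale pointer. */
static int grow_then_work(unsigned char **buf, size_t *size, size_t needed,
                          int simulate_failure)
{
  unsigned char *work = *buf;
  size_t work_size = *size;
  int retval = 0;

  if (work_size < needed) {
    unsigned char *tmp = (unsigned char *)realloc(work, needed);

    if (!tmp) return -1;  /* realloc() failed, so *buf is still valid */
    work = tmp;
    work_size = needed;
  }

  if (simulate_failure) {
    retval = -1;
    goto bailout;
  }
  memset(work, 0, work_size);

bailout:
  /* Publish the (possibly moved) buffer even when an error occurred. */
  *buf = work;
  *size = work_size;
  return retval;
}
```

Because the pointer is published in the common exit path, the caller can always
pass `*buf` to `free()` regardless of the return value.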
 468
 469
 470 2.0.90 (2.1 beta1)
 471 ==================
 472
 473 ### Significant changes relative to 2.0.6:
 474
 475 1. The build system, x86-64 SIMD extensions, and accelerated Huffman codec now
 476 support the x32 ABI on Linux, which allows for using x86-64 instructions with
 477 32-bit pointers.  The x32 ABI is generally enabled by adding `-mx32` to the
 478 compiler flags.
 479
 480      Caveats:
 481      - CMake 3.9.0 or later is required in order for the build system to
 482 automatically detect an x32 build.
 483      - Java does not support the x32 ABI, and thus the TurboJPEG Java API will
 484 automatically be disabled with x32 builds.
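
For code that needs to distinguish an x32 build at compile time, a check along
the following lines is one option.  It relies on GCC and Clang defining both
`__x86_64__` and `__ILP32__` for x32 targets.  The `is_x32_abi()` function name
is hypothetical.

```c
/* Returns 1 when compiled for the x32 ABI (x86-64 instruction set with
   32-bit pointers), otherwise 0. */
static int is_x32_abi(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
  return 1;
#else
  return 0;
#endif
}
```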
 485
 486 2. Added Loongson MMI SIMD implementations of the RGB-to-grayscale, 4:2:2 fancy
 487 chroma upsampling, 4:2:2 and 4:2:0 merged chroma upsampling/color conversion,
 488 and fast integer DCT/IDCT algorithms.  Relative to libjpeg-turbo 2.0.x, this
 489 speeds up:
 490
 491      - the compression of RGB source images into grayscale JPEG images by
 492 approximately 20%
 493      - the decompression of 4:2:2 JPEG images by approximately 40-60% when
 494 using fancy upsampling
 495      - the decompression of 4:2:2 and 4:2:0 JPEG images by approximately
 496 15-20% when using merged upsampling
 497      - the compression of RGB source images by approximately 30-45% when using
 498 the fast integer DCT
 499      - the decompression of JPEG images into RGB destination images by
 500 approximately 2x when using the fast integer IDCT
 501
 502     The overall decompression speedup for RGB images is now approximately
 503 2.3-3.7x (compared to 2-3.5x with libjpeg-turbo 2.0.x.)
 504
 505 3. 32-bit (Armv7 or Armv7s) iOS builds of libjpeg-turbo are no longer
 506 supported, and the libjpeg-turbo build system can no longer be used to package
 507 such builds.  32-bit iOS apps cannot run in iOS 11 and later, and the App Store
 508 no longer allows them.
 509
 510 4. 32-bit (i386) OS X/macOS builds of libjpeg-turbo are no longer supported,
 511 and the libjpeg-turbo build system can no longer be used to package such
 512 builds.  32-bit Mac applications cannot run in macOS 10.15 "Catalina" and
 513 later, and the App Store no longer allows them.
 514
 515 5. The SSE2 (x86 SIMD) and C Huffman encoding algorithms have been
 516 significantly optimized, resulting in a measured average overall compression
 517 speedup of 12-28% for 64-bit code and 22-52% for 32-bit code on various Intel
 518 and AMD CPUs, as well as a measured average overall compression speedup of
 519 0-23% on platforms that do not have a SIMD-accelerated Huffman encoding
 520 implementation.
 521
 522 6. The block smoothing algorithm that is applied by default when decompressing
 523 progressive Huffman-encoded JPEG images has been improved in the following
 524 ways:
 525
 526      - The algorithm is now more fault-tolerant.  Previously, if a particular
 527 scan was incomplete, then the smoothing parameters for the incomplete scan
 528 would be applied to the entire output image, including the parts of the image
 529 that were generated by the prior (complete) scan.  Visually, this had the
 530 effect of removing block smoothing from lower-frequency scans if they were
 531 followed by an incomplete higher-frequency scan.  libjpeg-turbo now applies
 532 block smoothing parameters to each iMCU row based on which scan generated the
 533 pixels in that row, rather than always using the block smoothing parameters for
 534 the most recent scan.
 535      - When applying block smoothing to DC scans, a Gaussian-like kernel with a
 536 5x5 window is used to reduce the "blocky" appearance.
 537
 538 7. Added SIMD acceleration for progressive Huffman encoding on Arm platforms.
 539 This speeds up the compression of full-color progressive JPEGs by about 30-40%
 540 on average (relative to libjpeg-turbo 2.0.x) when using modern Arm CPUs.
 541
 542 8. Added configure-time and run-time auto-detection of Loongson MMI SIMD
 543 instructions, so that the Loongson MMI SIMD extensions can be included in any
 544 MIPS64 libjpeg-turbo build.
 545
 546 9. Added fault tolerance features to djpeg and jpegtran, mainly to demonstrate
 547 methods by which applications can guard against the exploits of the JPEG format
 548 described in the report
 549 ["Two Issues with the JPEG Standard"](https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf).
 550
 551      - Both programs now accept a `-maxscans` argument, which can be used to
 552 limit the number of allowable scans in the input file.
 553      - Both programs now accept a `-strict` argument, which can be used to
 554 treat all warnings as fatal.
 555
 556 10. CMake package config files are now included for both the libjpeg and
 557 TurboJPEG API libraries.  This facilitates using libjpeg-turbo with CMake's
 558 `find_package()` function.  For example:
 559
 560         find_package(libjpeg-turbo CONFIG REQUIRED)
 561
 562         add_executable(libjpeg_program libjpeg_program.c)
 563         target_link_libraries(libjpeg_program PUBLIC libjpeg-turbo::jpeg)
 564
 565         add_executable(libjpeg_program_static libjpeg_program.c)
 566         target_link_libraries(libjpeg_program_static PUBLIC
 567           libjpeg-turbo::jpeg-static)
 568
 569         add_executable(turbojpeg_program turbojpeg_program.c)
 570         target_link_libraries(turbojpeg_program PUBLIC
 571           libjpeg-turbo::turbojpeg)
 572
 573         add_executable(turbojpeg_program_static turbojpeg_program.c)
 574         target_link_libraries(turbojpeg_program_static PUBLIC
 575           libjpeg-turbo::turbojpeg-static)
 576
 577 11. Since the Unisys LZW patent has long expired, cjpeg and djpeg can now
 578 read/write both LZW-compressed and uncompressed GIF files (feature ported from
 579 jpeg-6a and jpeg-9d.)
 580
 581 12. jpegtran now includes the `-wipe` and `-drop` options from jpeg-9a and
 582 jpeg-9d, as well as the ability to expand the image size using the `-crop`
 583 option.  Refer to jpegtran.1 or usage.txt for more details.
 584
 585 13. Added a complete intrinsics implementation of the Arm Neon SIMD extensions,
 586 thus providing SIMD acceleration on Arm platforms for all of the algorithms
 587 that are SIMD-accelerated on x86 platforms.  This new implementation is
 588 significantly faster in some cases than the old GAS implementation--
 589 depending on the algorithms used, the type of CPU core, and the compiler.  GCC,
 590 as of this writing, does not provide a full or optimal set of Neon intrinsics,
 591 so for performance reasons, the default when building libjpeg-turbo with GCC is
 592 to continue using the GAS implementation of the following algorithms:
 593
 594      - 32-bit RGB-to-YCbCr color conversion
 595      - 32-bit fast and accurate inverse DCT
 596      - 64-bit RGB-to-YCbCr and YCbCr-to-RGB color conversion
 597      - 64-bit accurate forward and inverse DCT
 598      - 64-bit Huffman encoding
 599
 600     A new CMake variable (`NEON_INTRINSICS`) can be used to override this
 601 default.
 602
 603     Since the new intrinsics implementation includes SIMD acceleration
 604 for merged upsampling/color conversion, 1.5.1[5] is no longer necessary and has
 605 been reverted.
 606
 607 14. The Arm Neon SIMD extensions can now be built using Visual Studio.
 608
 609 15. The build system can now be used to generate a universal x86-64 + Armv8
 610 libjpeg-turbo SDK package for both iOS and macOS.
 611
 612
 613 2.0.6
 614 =====
 615
 616 ### Significant changes relative to 2.0.5:
 617
 618 1. Fixed "using JNI after critical get" errors that occurred on Android
 619 platforms when using any of the YUV encoding/compression/decompression/decoding
 620 methods in the TurboJPEG Java API.
 621
 622 2. Fixed or worked around multiple issues with `jpeg_skip_scanlines()`:
 623
 624      - Fixed segfaults (CVE-2020-35538) or "Corrupt JPEG data: premature end of
 625 data segment" errors in `jpeg_skip_scanlines()` that occurred when
 626 decompressing 4:2:2 or 4:2:0 JPEG images using merged (non-fancy)
 627 upsampling/color conversion (that is, when setting `cinfo.do_fancy_upsampling`
 628 to `FALSE`.)  2.0.0[6] was a similar fix, but it did not cover all cases.
 629      - `jpeg_skip_scanlines()` now throws an error if two-pass color
 630 quantization is enabled.  Two-pass color quantization never worked properly
 631 with `jpeg_skip_scanlines()`, and the issues could not readily be fixed.
 632      - Fixed an issue whereby `jpeg_skip_scanlines()` always returned 0 when
 633 skipping past the end of an image.
 634
 635 3. The Arm 64-bit (Armv8) Neon SIMD extensions can now be built using MinGW
 636 toolchains targeting Arm64 (AArch64) Windows binaries.
 637
 638 4. Fixed unexpected visual artifacts that occurred when using
 639 `jpeg_crop_scanline()` and interblock smoothing while decompressing only the DC
 640 scan of a progressive JPEG image.
 641
 642 5. Fixed an issue whereby libjpeg-turbo would not build if 12-bit-per-component
 643 JPEG support (`WITH_12BIT`) was enabled along with libjpeg v7 or libjpeg v8
 644 API/ABI emulation (`WITH_JPEG7` or `WITH_JPEG8`.)
 645
 646
 647 2.0.5
 648 =====
 649
 650 ### Significant changes relative to 2.0.4:
 651
 652 1. Worked around issues in the MIPS DSPr2 SIMD extensions that caused failures
 653 in the libjpeg-turbo regression tests.  Specifically, the
 654 `jsimd_h2v1_downsample_dspr2()` and `jsimd_h2v2_downsample_dspr2()` functions
 655 in the MIPS DSPr2 SIMD extensions are now disabled until/unless they can be
 656 fixed, and other functions that are incompatible with big endian MIPS CPUs are
 657 disabled when building libjpeg-turbo for such CPUs.
 658
 659 2. Fixed an oversight in the `TJCompressor.compress(int)` method in the
 660 TurboJPEG Java API that caused an error ("java.lang.IllegalStateException: No
 661 source image is associated with this instance") when attempting to use that
 662 method to compress a YUV image.
 663
 664 3. Fixed an issue (CVE-2020-13790) in the PPM reader that caused a buffer
 665 overrun in cjpeg, TJBench, or the `tjLoadImage()` function if one of the values
 666 in a binary PPM/PGM input file exceeded the maximum value defined in the file's
 667 header and that maximum value was less than 255.  libjpeg-turbo 1.5.0 already
 668 included a similar fix for binary PPM/PGM files with maximum values greater
 669 than 255.
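
The class of overrun described in this entry can be sketched as follows. This is an illustration only, not the code from the PPM reader: the helper names `build_rescale_table()` and `rescale_sample()` are hypothetical, and the sketch assumes that samples are rescaled through a lookup table with one entry per legal sample value.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not libjpeg-turbo code: build a lookup table that maps
 * samples in [0, maxval] onto [0, 255].  The table has only maxval + 1
 * entries, so it must never be indexed with a larger value. */
static unsigned char *build_rescale_table(unsigned int maxval)
{
  unsigned char *table;
  unsigned int i;

  if (maxval == 0 || maxval > 255) return NULL;
  table = malloc((size_t)maxval + 1);
  if (!table) return NULL;
  for (i = 0; i <= maxval; i++)
    table[i] = (unsigned char)((i * 255 + maxval / 2) / maxval);
  return table;
}

/* Returns 0 and stores the rescaled sample on success, or -1 if the sample
 * exceeds the maximum value declared in the file header. */
static int rescale_sample(const unsigned char *table, unsigned int maxval,
                          unsigned int sample, unsigned char *out)
{
  if (sample > maxval) return -1;  /* would read past the end of the table */
  *out = table[sample];
  return 0;
}
```

With a declared maximum value of 15, a byte value of 200 in the pixel data is rejected rather than used as a table index.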
 670
 671 4. The TurboJPEG API library's global error handler, which is used in functions
 672 such as `tjBufSize()` and `tjLoadImage()` that do not require a TurboJPEG
 673 instance handle, is now thread-safe on platforms that support thread-local
 674 storage.
 675
 676
 677 2.0.4
 678 =====
 679
 680 ### Significant changes relative to 2.0.3:
 681
 682 1. Fixed a regression in the Windows packaging system (introduced by
 683 2.0 beta1[2]) whereby, if both the 64-bit libjpeg-turbo SDK for GCC and the
 684 64-bit libjpeg-turbo SDK for Visual C++ were installed on the same system, only
 685 one of them could be uninstalled.
 686
 687 2. Fixed a signed integer overflow and subsequent segfault that occurred when
 688 attempting to decompress images with more than 715827882 pixels using the
 689 64-bit C version of TJBench.
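
The threshold in this entry is consistent with a pixel count being multiplied by 3 samples per pixel in a 32-bit signed integer: 715827882 x 3 = 2147483646, and one more pixel exceeds `INT_MAX`.  The entry does not say which expression overflowed, so that reading is an inference.  A minimal sketch of a guarded size computation, using a hypothetical helper name and assuming a 32-bit `int`:

```c
#include <limits.h>

/* Hypothetical helper: computes pixels * samples_per_pixel without
 * overflowing a signed int.  Returns 0 and stores the size on success, or -1
 * if the product cannot be represented as an int. */
static int checked_buffer_size(int pixels, int samples_per_pixel, int *size)
{
  if (pixels < 0 || samples_per_pixel <= 0) return -1;
  if (pixels > INT_MAX / samples_per_pixel) return -1;  /* would overflow */
  *size = pixels * samples_per_pixel;
  return 0;
}
```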
 690
 691 3. Fixed an out-of-bounds write in `tjDecompressToYUV2()` and
 692 `tjDecompressToYUVPlanes()` (sometimes manifesting as a double free) that
 693 occurred when attempting to decompress grayscale JPEG images that were
 694 compressed with a sampling factor other than 1 (for instance, with
 695 `cjpeg -grayscale -sample 2x2`).
 696
 697 4. Fixed a regression introduced by 2.0.2[5] that caused the TurboJPEG API to
 698 incorrectly identify some JPEG images with unusual sampling factors as 4:4:4
 699 JPEG images.  This was known to cause a buffer overflow when attempting to
 700 decompress some such images using `tjDecompressToYUV2()` or
 701 `tjDecompressToYUVPlanes()`.
 702
 703 5. Fixed an issue (CVE-2020-17541), detected by ASan, whereby attempting to
 704 losslessly transform a specially-crafted malformed JPEG image containing an
 705 extremely-high-frequency coefficient block (junk image data that could never be
 706 generated by a legitimate JPEG compressor) could cause the Huffman encoder's
 707 local buffer to be overrun. (Refer to 1.4.0[9] and 1.4beta1[15].)  Given that
 708 the buffer overrun was fully contained within the stack and did not cause a
 709 segfault or other user-visible errant behavior, and given that the lossless
 710 transformer (unlike the decompressor) is not generally exposed to arbitrary
 711 data exploits, this issue did not likely pose a security risk.
 712
 713 6. The Arm 64-bit (Armv8) Neon SIMD assembly code now stores constants in a
 714 separate read-only data section rather than in the text section, to support
 715 execute-only memory layouts.
 716
 717
 718 2.0.3
 719 =====
 720
 721 ### Significant changes relative to 2.0.2:
 722
 723 1. Fixed "using JNI after critical get" errors that occurred on Android
 724 platforms when passing invalid arguments to certain methods in the TurboJPEG
 725 Java API.
 726
 727 2. Fixed a regression in the SIMD feature detection code, introduced by
 728 the AVX2 SIMD extensions (2.0 beta1[1]), that was known to cause an illegal
 729 instruction exception, in rare cases, on CPUs that lack support for CPUID leaf
 730 07H (or on which the maximum CPUID leaf has been limited by way of a BIOS
 731 setting.)
 732
 733 3. The 4:4:0 (h1v2) fancy (smooth) chroma upsampling algorithm in the
 734 decompressor now uses a similar bias pattern to that of the 4:2:2 (h2v1) fancy
 735 chroma upsampling algorithm, rounding up or down the upsampled result for
 736 alternate pixels rather than always rounding down.  This ensures that,
 737 regardless of whether a 4:2:2 JPEG image is rotated or transposed prior to
 738 decompression (in the frequency domain) or after decompression (in the spatial
 739 domain), the final image will be similar.
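
The alternating bias can be sketched for a single row as follows.  This is a simplified illustration of a triangle-filter ("fancy") 2x upsampler, not the library's implementation; the function name is hypothetical, and applying the same arithmetic along columns rather than rows corresponds to the h1v2 case.

```c
#include <stddef.h>

/* Sketch of a 2x "fancy" (triangle filter) upsampler for one row.  Each
 * output sample is 3/4 of the nearer input sample plus 1/4 of the farther
 * one.  The rounding bias alternates between 1 and 2 so that results are
 * not consistently rounded in one direction.  out must hold 2 * n samples. */
static void fancy_upsample_row(const unsigned char *in, size_t n,
                               unsigned char *out)
{
  size_t i;

  for (i = 0; i < n; i++) {
    int cur = in[i] * 3;
    int left = in[i > 0 ? i - 1 : i];       /* clamp at the left edge */
    int right = in[i + 1 < n ? i + 1 : i];  /* clamp at the right edge */

    out[2 * i] = (unsigned char)((cur + left + 1) >> 2);
    out[2 * i + 1] = (unsigned char)((cur + right + 2) >> 2);
  }
}
```

Because the edges are clamped, the first and last output samples reproduce the first and last input samples exactly.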
 740
 741 4. Fixed an integer overflow and subsequent segfault that occurred when
 742 attempting to compress or decompress images with more than 1 billion pixels
 743 using the TurboJPEG API.
 744
 745 5. Fixed a regression introduced by 2.0 beta1[15] whereby attempting to
 746 generate a progressive JPEG image on an SSE2-capable CPU using a scan script
 747 containing one or more scans with lengths divisible by 16 would result in an
 748 error ("Missing Huffman code table entry") and an invalid JPEG image.
 749
 750 6. Fixed an issue whereby `tjDecodeYUV()` and `tjDecodeYUVPlanes()` would throw
 751 an error ("Invalid progressive parameters") or a warning ("Inconsistent
 752 progression sequence") if passed a TurboJPEG instance that was previously used
 753 to decompress a progressive JPEG image.
 754
 755
 756 2.0.2
 757 =====
 758
 759 ### Significant changes relative to 2.0.1:
 760
 761 1. Fixed a regression introduced by 2.0.1[5] that prevented a runtime search
 762 path (rpath) from being embedded in the libjpeg-turbo shared libraries and
 763 executables for macOS and iOS.  This caused a fatal error of the form
 764 "dyld: Library not loaded" when attempting to use one of the executables,
 765 unless `DYLD_LIBRARY_PATH` was explicitly set to the location of the
 766 libjpeg-turbo shared libraries.
 767
 768 2. Fixed an integer overflow and subsequent segfault (CVE-2018-20330) that
 769 occurred when attempting to load a BMP file with more than 1 billion pixels
 770 using the `tjLoadImage()` function.
 771
 772 3. Fixed a buffer overrun (CVE-2018-19664) that occurred when attempting to
 773 decompress a specially-crafted malformed JPEG image to a 256-color BMP using
 774 djpeg.
 775
 776 4. Fixed a floating point exception that occurred when attempting to
 777 decompress a specially-crafted malformed JPEG image with a specified image
 778 width or height of 0 using the C version of TJBench.
 779
 780 5. The TurboJPEG API will now decompress 4:4:4 JPEG images with 2x1, 1x2, 3x1,
 781 or 1x3 luminance and chrominance sampling factors.  This is a non-standard way
 782 of specifying 1x subsampling (normally 4:4:4 JPEGs have 1x1 luminance and
 783 chrominance sampling factors), but the JPEG format and the libjpeg API both
 784 allow it.
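
The underlying idea is that subsampling is determined by the ratio between the luminance and chrominance sampling factors, not by their absolute values.  A minimal sketch, using a hypothetical helper that is more general than the specific factors listed above:

```c
/* Hypothetical helper: returns 1 if a three-component image has no chroma
 * subsampling, i.e. if both chroma components use the same horizontal and
 * vertical sampling factors as the luminance component (index 0). */
static int is_unsubsampled(const int h[3], const int v[3])
{
  int c;

  for (c = 1; c < 3; c++) {
    if (h[c] != h[0] || v[c] != v[0]) return 0;
  }
  return 1;
}
```

Under this check, an image in which all three components use 2x1 sampling factors is treated the same as one in which all three use 1x1.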
 785
 786 6. Fixed a regression introduced by 2.0 beta1[7] that caused djpeg to generate
 787 incorrect PPM images when used with the `-colors` option.
 788
 789 7. Fixed an issue whereby a static build of libjpeg-turbo (a build in which
 790 `ENABLE_SHARED` is `0`) could not be installed using the Visual Studio IDE.
 791
 792 8. Fixed a severe performance issue in the Loongson MMI SIMD extensions that
 793 occurred when compressing RGB images whose image rows were not 64-bit-aligned.
 794
 795
 796 2.0.1
 797 =====
 798
 799 ### Significant changes relative to 2.0.0:
 800
 801 1. Fixed a regression introduced with the new CMake-based Un*x build system,
 802 whereby jconfig.h could cause compiler warnings of the form
 803 `"HAVE_*_H" redefined` if it was included by downstream Autotools-based
 804 projects that used `AC_CHECK_HEADERS()` to check for the existence of locale.h,
 805 stddef.h, or stdlib.h.
 806
 807 2. The `jsimd_quantize_float_dspr2()` and `jsimd_convsamp_float_dspr2()`
 808 functions in the MIPS DSPr2 SIMD extensions are now disabled at compile time
 809 if the soft float ABI is enabled.  Those functions use instructions that are
 810 incompatible with the soft float ABI.
 811
 812 3. Fixed a regression in the SIMD feature detection code, introduced by
 813 the AVX2 SIMD extensions (2.0 beta1[1]), that caused libjpeg-turbo to crash on
 814 Windows 7 if Service Pack 1 was not installed.
 815
 816 4. Fixed an out-of-bounds read in cjpeg that occurred when attempting to compress
 817 a specially-crafted malformed color-index (8-bit-per-sample) Targa file in
 818 which some of the samples (color indices) exceeded the bounds of the Targa
 819 file's color table.
 820
 821 5. Fixed an issue whereby installing a fully static build of libjpeg-turbo
 822 (a build in which `CFLAGS` contains `-static` and `ENABLE_SHARED` is `0`) would
 823 fail with "No valid ELF RPATH or RUNPATH entry exists in the file."
 824
 825
 826 2.0.0
 827 =====
 828
 829 ### Significant changes relative to 2.0 beta1:
 830
 831 1. The TurboJPEG API can now decompress CMYK JPEG images that have subsampled M
 832 and Y components (not to be confused with YCCK JPEG images, in which the C/M/Y
 833 components have been transformed into luma and chroma.)  Previously, an error
 834 was generated ("Could not determine subsampling type for JPEG image") when such
 835 an image was passed to `tjDecompressHeader3()`, `tjTransform()`,
 836 `tjDecompressToYUVPlanes()`, `tjDecompressToYUV2()`, or the equivalent Java
 837 methods.
 838
 839 2. Fixed an issue (CVE-2018-11813) whereby a specially-crafted malformed input
 840 file (specifically, a file with a valid Targa header but incomplete pixel data)
 841 would cause cjpeg to generate a JPEG file that was potentially thousands of
 842 times larger than the input file.  The Targa reader in cjpeg was not properly
 843 detecting that the end of the input file had been reached prematurely, so after
 844 all valid pixels had been read from the input, the reader injected dummy pixels
 845 with values of 255 into the JPEG compressor until the number of pixels
 846 specified in the Targa header had been compressed.  The Targa reader in cjpeg
 847 now behaves like the PPM reader and aborts compression if the end of the input
 848 file is reached prematurely.  Because this issue only affected cjpeg and not
 849 the underlying library, and because it did not involve any out-of-bounds reads
 850 or other exploitable behaviors, it was not believed to represent a security
 851 threat.
 852
 853 3. Fixed an issue whereby the `tjLoadImage()` and `tjSaveImage()` functions
 854 would produce a "Bogus message code" error message if the underlying bitmap and
 855 PPM readers/writers threw an error that was specific to the readers/writers
 856 (as opposed to a general libjpeg API error.)
 857
 858 4. Fixed an issue (CVE-2018-1152) whereby a specially-crafted malformed BMP
 859 file, one in which the header specified an image width of 1073741824 pixels,
 860 would trigger a floating point exception (division by zero) in the
 861 `tjLoadImage()` function when attempting to load the BMP file into a
 862 4-component image buffer.
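
The arithmetic behind this entry: 1073741824 x 4 = 2^32, which wraps to 0 in 32-bit unsigned arithmetic, so any later division by the row size divides by zero.  The entry does not show the affected expression, so the sketch below is an assumption about the general pattern, with hypothetical function names:

```c
#include <stdint.h>

/* Row size in bytes as computed with 32-bit unsigned arithmetic, which wraps
 * modulo 2^32. */
static uint32_t row_bytes_wrapping(uint32_t width, uint32_t components)
{
  return width * components;
}

/* Hypothetical guard: returns 0 and stores the row size on success, or -1 if
 * either argument is zero or the product would not fit in 32 bits. */
static int row_bytes_checked(uint32_t width, uint32_t components,
                             uint32_t *row_bytes)
{
  if (width == 0 || components == 0) return -1;
  if (width > UINT32_MAX / components) return -1;
  *row_bytes = width * components;
  return 0;
}
```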
 863
 864 5. Fixed an issue whereby certain combinations of calls to
 865 `jpeg_skip_scanlines()` and `jpeg_read_scanlines()` could trigger an infinite
 866 loop when decompressing progressive JPEG images that use vertical chroma
 867 subsampling (for instance, 4:2:0 or 4:4:0.)
 868
 869 6. Fixed a segfault in `jpeg_skip_scanlines()` that occurred when decompressing
 870 a 4:2:2 or 4:2:0 JPEG image using the merged (non-fancy) upsampling algorithms
 871 (that is, when setting `cinfo.do_fancy_upsampling` to `FALSE`.)
 872
 873 7. The new CMake-based build system will now disable the MIPS DSPr2 SIMD
 874 extensions if it detects that the compiler does not support DSPr2 instructions.
 875
 876 8. Fixed an out-of-bounds read in cjpeg (CVE-2018-14498) that occurred when
 877 attempting to compress a specially-crafted malformed color-index
 878 (8-bit-per-sample) BMP file in which some of the samples (color indices)
 879 exceeded the bounds of the BMP file's color table.
 880
 881 9. Fixed a signed integer overflow in the progressive Huffman decoder, detected
 882 by the Clang and GCC undefined behavior sanitizers, that could be triggered by
 883 attempting to decompress a specially-crafted malformed JPEG image.  This issue
 884 did not pose a security threat, but removing the warning made it easier to
 885 detect actual security issues, should they arise in the future.
 886
 887
 888 1.5.90 (2.0 beta1)
 889 ==================
 890
 891 ### Significant changes relative to 1.5.3:
 892
 893 1. Added AVX2 SIMD implementations of the colorspace conversion, chroma
 894 downsampling and upsampling, integer quantization and sample conversion, and
 895 accurate integer DCT/IDCT algorithms.  When using the accurate integer DCT/IDCT
 896 algorithms on AVX2-equipped CPUs, the compression of RGB images is
 897 approximately 13-36% (avg. 22%) faster (relative to libjpeg-turbo 1.5.x) with
 898 64-bit code and 11-21% (avg. 17%) faster with 32-bit code, and the
 899 decompression of RGB images is approximately 9-35% (avg. 17%) faster with
 900 64-bit code and 7-17% (avg. 12%) faster with 32-bit code.  (As tested on a
 901 3 GHz Intel Core i7.  Actual mileage may vary.)
 902
 903 2. Overhauled the build system to use CMake on all platforms, and removed the
 904 autotools-based build system.  This decision resulted from extensive
 905 discussions within the libjpeg-turbo community.  libjpeg-turbo traditionally
 906 used CMake only for Windows builds, but there was an increasing amount of
 907 demand to extend CMake support to other platforms.  However, because of the
 908 unique nature of our code base (the need to support different assemblers on
 909 each platform, the need for Java support, etc.), providing dual build systems
 910 as other OSS imaging libraries do (including libpng and libtiff) would have
 911 created a maintenance burden.  The use of CMake greatly simplifies some aspects
 912 of our build system, owing to CMake's built-in support for various assemblers,
 913 Java, and unit testing, as well as generally fewer quirks that have to be
 914 worked around in order to implement our packaging system.  Eliminating
 915 autotools puts our project slightly at odds with the traditional practices of
 916 the OSS community, since most "system libraries" tend to be built with
 917 autotools, but it is believed that the benefits of this move outweigh the
 918 risks.  In addition to providing a unified build environment, switching to
 919 CMake allows for the use of various build tools and IDEs that aren't supported
 920 under autotools, including Xcode, Ninja, and Eclipse.  It also eliminates the
 921 need to install autotools via MacPorts/Homebrew on OS X and allows
 922 libjpeg-turbo to be configured without the use of a terminal/command prompt.
 923 Extensive testing was conducted to ensure that all features provided by the
 924 autotools-based build system are provided by the new build system.
 925
 926 3. The libjpeg API in this version of libjpeg-turbo now includes two additional
 927 functions, `jpeg_read_icc_profile()` and `jpeg_write_icc_profile()`, that can
 928 be used to extract ICC profile data from a JPEG file while decompressing or to
 929 embed ICC profile data in a JPEG file while compressing or transforming.  This
 930 eliminates the need for downstream projects, such as color management libraries
 931 and browsers, to include their own glueware for accomplishing this.
 932
 933 4. Improved error handling in the TurboJPEG API library:
 934
 935      - Introduced a new function (`tjGetErrorStr2()`) in the TurboJPEG C API
 936 that allows compression/decompression/transform error messages to be retrieved
 937 in a thread-safe manner.  Retrieving error messages from global functions, such
 938 as `tjInitCompress()` or `tjBufSize()`, is still thread-unsafe, but since those
 939 functions will only throw errors if passed an invalid argument or if a memory
 940 allocation failure occurs, thread safety is not as much of a concern.
 941      - Introduced a new function (`tjGetErrorCode()`) in the TurboJPEG C API
 942 and a new method (`TJException.getErrorCode()`) in the TurboJPEG Java API that
 943 can be used to determine the severity of the last
 944 compression/decompression/transform error.  This allows applications to
 945 choose whether to ignore warnings (non-fatal errors) from the underlying
 946 libjpeg API or to treat them as fatal.
 947      - Introduced a new flag (`TJFLAG_STOPONWARNING` in the TurboJPEG C API and
 948 `TJ.FLAG_STOPONWARNING` in the TurboJPEG Java API) that causes the library to
 949 immediately halt a compression/decompression/transform operation if it
 950 encounters a warning from the underlying libjpeg API (the default behavior is
 951 to allow the operation to complete unless a fatal error is encountered.)
 952
 953 5. Introduced a new flag in the TurboJPEG C and Java APIs (`TJFLAG_PROGRESSIVE`
 954 and `TJ.FLAG_PROGRESSIVE`, respectively) that causes the library to use
 955 progressive entropy coding in JPEG images generated by compression and
 956 transform operations.  Additionally, a new transform option
 957 (`TJXOPT_PROGRESSIVE` in the C API and `TJTransform.OPT_PROGRESSIVE` in the
 958 Java API) has been introduced, allowing progressive entropy coding to be
 959 enabled for selected transforms in a multi-transform operation.
 960
 961 6. Introduced a new transform option in the TurboJPEG API (`TJXOPT_COPYNONE` in
 962 the C API and `TJTransform.OPT_COPYNONE` in the Java API) that allows the
 963 copying of markers (including EXIF and ICC profile data) to be disabled for a
 964 particular transform.
 965
 966 7. Added two functions to the TurboJPEG C API (`tjLoadImage()` and
 967 `tjSaveImage()`) that can be used to load/save a BMP or PPM/PGM image to/from a
 968 memory buffer with a specified pixel format and layout.  These functions
 969 replace the project-private (and slow) bmp API, which was previously used by
 970 TJBench, and they also provide a convenient way for first-time users of
 971 libjpeg-turbo to quickly develop a complete JPEG compression/decompression
 972 program.
 973
 974 8. The TurboJPEG C API now includes a new convenience array (`tjAlphaOffset[]`)
 975 that contains the alpha component index for each pixel format (or -1 if the
 976 pixel format lacks an alpha component.)  The TurboJPEG Java API now includes a
 977 new method (`TJ.getAlphaOffset()`) that returns the same value.  In addition,
 978 the `tjRedOffset[]`, `tjGreenOffset[]`, and `tjBlueOffset[]` arrays-- and the
 979 corresponding `TJ.getRedOffset()`, `TJ.getGreenOffset()`, and
 980 `TJ.getBlueOffset()` methods-- now return -1 for `TJPF_GRAY`/`TJ.PF_GRAY`
 981 rather than 0.  This allows programs to easily determine whether a pixel format
 982 has red, green, blue, and alpha components.
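
The -1 convention makes component checks a simple sign test.  The sketch below uses stand-in tables so that it is self-contained; the real definitions (`TJPF_*`, `tjRedOffset[]`, `tjAlphaOffset[]`, etc.) live in turbojpeg.h, and the names and values shown here are assumptions made for illustration.

```c
/* Stand-in pixel formats and offset tables for illustration only.  These are
 * not the definitions from turbojpeg.h. */
enum { PF_RGB, PF_RGBA, PF_GRAY, PF_COUNT };

static const int red_offset[PF_COUNT]   = { 0, 0, -1 };
static const int alpha_offset[PF_COUNT] = { -1, 3, -1 };

/* An offset of -1 means that the pixel format lacks that component. */
static int has_color(int pf) { return red_offset[pf] >= 0; }
static int has_alpha(int pf) { return alpha_offset[pf] >= 0; }
```

If the grayscale entry were 0 rather than -1, `has_color()` could not distinguish a grayscale format from one whose red component is stored first.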
 983
 984 9. Added a new example (tjexample.c) that demonstrates the basic usage of the
 985 TurboJPEG C API.  This example mirrors the functionality of TJExample.java.
 986 Both files are now included in the libjpeg-turbo documentation.
 987
 988 10. Fixed two signed integer overflows in the arithmetic decoder, detected by
 989 the Clang undefined behavior sanitizer, that could be triggered by attempting
 990 to decompress a specially-crafted malformed JPEG image.  These issues did not
 991 pose a security threat, but removing the warnings makes it easier to detect
 992 actual security issues, should they arise in the future.
 993
 994 11. Fixed a bug in the merged 4:2:0 upsampling/dithered RGB565 color conversion
 995 algorithm that caused incorrect dithering in the output image.  This algorithm
 996 now produces bitwise-identical results to the unmerged algorithms.
 997
 998 12. The SIMD function symbols for x86[-64]/ELF, MIPS/ELF, macOS/x86[-64] (if
 999 libjpeg-turbo is built with Yasm), and iOS/Arm[64] builds are now private.
1000 This prevents those symbols from being exposed in applications or shared
1001 libraries that link statically with libjpeg-turbo.
1002
1003 13. Added Loongson MMI SIMD implementations of the RGB-to-YCbCr and
1004 YCbCr-to-RGB colorspace conversion, 4:2:0 chroma downsampling, 4:2:0 fancy
1005 chroma upsampling, integer quantization, and accurate integer DCT/IDCT
1006 algorithms.  When using the accurate integer DCT/IDCT, this speeds up the
1007 compression of RGB images by approximately 70-100% and the decompression of RGB
1008 images by approximately 2-3.5x.
1009
1010 14. Fixed a build error when building with older MinGW releases (regression
1011 caused by 1.5.1[7].)
1012
1013 15. Added SIMD acceleration for progressive Huffman encoding on SSE2-capable
1014 x86 and x86-64 platforms.  This speeds up the compression of full-color
1015 progressive JPEGs by about 85-90% on average (relative to libjpeg-turbo 1.5.x)
1016 when using modern Intel and AMD CPUs.
1017
1018
1019 1.5.3
1020 =====
1021
1022 ### Significant changes relative to 1.5.2:
1023
1024 1. Fixed a NullPointerException in the TurboJPEG Java wrapper that occurred
1025 when using the YUVImage constructor that creates an instance backed by separate
1026 image planes and allocates memory for the image planes.
1027
1028 2. Fixed an issue whereby the Java version of TJUnitTest would fail when
1029 testing BufferedImage encoding/decoding on big endian systems.
1030
1031 3. Fixed a segfault in djpeg that would occur if an output format other than
1032 PPM/PGM was selected along with the `-crop` option.  The `-crop` option now
1033 works with the GIF and Targa formats as well (unfortunately, it cannot be made
1034 to work with the BMP and RLE formats because those output engines
1035 write scanlines in bottom-up order.)  djpeg will now exit gracefully if an
1036 output format other than PPM/PGM, GIF, or Targa is selected along with the
1037 `-crop` option.
1038
1039 4. Fixed an issue (CVE-2017-15232) whereby `jpeg_skip_scanlines()` would
1040 segfault if color quantization was enabled.
1041
1042 5. TJBench (both C and Java versions) will now display usage information if any
1043 command-line argument is unrecognized.  This prevents the program from silently
1044 ignoring typos.
1045
1046 6. Fixed an access violation in tjbench.exe (Windows) that occurred when the
1047 program was used to decompress an existing JPEG image.
1048
1049 7. Fixed an ArrayIndexOutOfBoundsException in the TJExample Java program that
1050 occurred when attempting to decompress a JPEG image that had been compressed
1051 with 4:1:1 chrominance subsampling.
1052
1053 8. Fixed an issue whereby, when using `jpeg_skip_scanlines()` to skip to the
1054 end of a single-scan (non-progressive) image, subsequent calls to
1055 `jpeg_consume_input()` would return `JPEG_SUSPENDED` rather than
1056 `JPEG_REACHED_EOI`.
1057
1058 9. `jpeg_crop_scanline()` now works correctly when decompressing grayscale JPEG
1059 images that were compressed with a sampling factor other than 1 (for instance,
1060 with `cjpeg -grayscale -sample 2x2`).
1061
1062
1063 1.5.2
1064 =====
1065
1066 ### Significant changes relative to 1.5.1:
1067
1068 1. Fixed a regression introduced by 1.5.1[7] that prevented libjpeg-turbo from
1069 building with Android NDK platforms prior to android-21 (5.0).
1070
1071 2. Fixed a regression introduced by 1.5.1[1] that prevented the MIPS DSPr2 SIMD
1072 code in libjpeg-turbo from building.
1073
1074 3. Fixed a regression introduced by 1.5 beta1[11] that prevented the Java
1075 version of TJBench from outputting any reference images (the `-nowrite` switch
1076 was accidentally enabled by default.)
1077
1078 4. libjpeg-turbo should now build and run with full AltiVec SIMD acceleration
1079 on PowerPC-based AmigaOS 4 and OpenBSD systems.
1080
1081 5. Fixed build and runtime errors on Windows that occurred when building
1082 libjpeg-turbo with libjpeg v7 API/ABI emulation and the in-memory
1083 source/destination managers.  Due to an oversight, the `jpeg_skip_scanlines()`
1084 and `jpeg_crop_scanline()` functions were not being included in jpeg7.dll when
1085 libjpeg-turbo was built with `-DWITH_JPEG7=1` and `-DWITH_MEMSRCDST=1`.
1086
1087 6. Fixed a "Bogus virtual array access" error that occurred when using the
1088 lossless crop feature in jpegtran or the TurboJPEG API, if libjpeg-turbo was
1089 built with libjpeg v7 API/ABI emulation.  This was apparently a long-standing
1090 bug that has existed since the introduction of libjpeg v7/v8 API/ABI emulation
1091 in libjpeg-turbo v1.1.
1092
1093 7. The lossless transform features in jpegtran and the TurboJPEG API will now
1094 always attempt to adjust the EXIF image width and height tags if the image size
1095 changed as a result of the transform.  This behavior has always existed when
1096 using libjpeg v8 API/ABI emulation.  It was supposed to be available with
1097 libjpeg v7 API/ABI emulation as well but did not work properly due to a bug.
1098 Furthermore, there was never any good reason not to enable it with libjpeg v6b
1099 API/ABI emulation, since the behavior is entirely internal.  Note that
1100 `-copy all` must be passed to jpegtran in order to transfer the EXIF tags from
1101 the source image to the destination image.
1102
1103 8. Fixed several memory leaks in the TurboJPEG API library that could occur
1104 if the library was built with certain compilers and optimization levels
1105 (known to occur with GCC 4.x and Clang with `-O1` and higher but not with
1106 GCC 5.x or 6.x) and one of the underlying libjpeg API functions threw an error
1107 after a TurboJPEG API function allocated a local buffer.
1108
1109 9. The libjpeg-turbo memory manager will now honor the `max_memory_to_use`
1110 structure member in `jpeg_memory_mgr`, which can be set to the maximum amount
1111 of memory (in bytes) that libjpeg-turbo should use during decompression or
1112 multi-pass (including progressive) compression.  This limit can also be set
1113 using the `JPEGMEM` environment variable or using the `-maxmemory` switch in
1114 cjpeg/djpeg/jpegtran (refer to the respective man pages for more details.)
1115 This has been a documented feature of libjpeg since v5, but the
1116 `malloc()`/`free()` implementation of the memory manager (jmemnobs.c) never
1117 implemented the feature.  Restricting libjpeg-turbo's memory usage is useful
1118 for two reasons:  it allows testers to more easily work around the 2 GB limit
1119 in libFuzzer, and it allows developers of security-sensitive applications to
1120 more easily defend against one of the progressive JPEG exploits (LJT-01-004)
1121 identified in
1122 [this report](http://www.libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf).
1123
1124 10. TJBench will now run each benchmark for 1 second prior to starting the
1125 timer, in order to improve the consistency of the results.  Furthermore, the
1126 `-warmup` option is now used to specify the amount of warmup time rather than
1127 the number of warmup iterations.
1128
1129 11. Fixed an error (`short jump is out of range`) that occurred when assembling
1130 the 32-bit x86 SIMD extensions with NASM versions prior to 2.04.  This was a
1131 regression introduced by 1.5 beta1[12].
1132
1133
1134 1.5.1
1135 =====
1136
1137 ### Significant changes relative to 1.5.0:
1138
1139 1. Previously, the undocumented `JSIMD_FORCE*` environment variables could be
1140 used to force-enable a particular SIMD instruction set if multiple instruction
1141 sets were available on a particular platform.  On x86 platforms, where CPU
1142 feature detection is bulletproof and multiple SIMD instruction sets are
1143 available, it makes sense for those environment variables to allow forcing the
1144 use of an instruction set only if that instruction set is available.  However,
1145 since the ARM implementations of libjpeg-turbo can only use one SIMD
1146 instruction set, and since their feature detection code is less bulletproof
1147 (parsing /proc/cpuinfo), it makes sense for the `JSIMD_FORCENEON` environment
1148 variable to bypass the feature detection code and really force the use of NEON
1149 instructions.  A new environment variable (`JSIMD_FORCEDSPR2`) was introduced
1150 in the MIPS implementation for the same reasons, and the existing
1151 `JSIMD_FORCENONE` environment variable was extended to that implementation.
1152 These environment variables provide a workaround for those attempting to test
1153 ARM and MIPS builds of libjpeg-turbo in QEMU, which passes through
1154 /proc/cpuinfo from the host system.
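
As a rough illustration of the override behavior described above, the following
C sketch lets an environment variable take precedence over an unreliable
feature probe.  The function names (`env_is_one()`, `use_neon()`) and the
`detected` parameter are hypothetical and are not taken from the libjpeg-turbo
source.  Only the `JSIMD_FORCENEON` and `JSIMD_FORCENONE` variable names come
from the text, and giving `JSIMD_FORCENONE` priority is an assumption of the
sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the named environment variable is set to exactly "1". */
int env_is_one(const char *name)
{
  const char *value = getenv(name);

  return value != NULL && strcmp(value, "1") == 0;
}

/* Hypothetical helper.  `detected` is the (possibly unreliable) result of
   run-time feature detection, e.g. from parsing /proc/cpuinfo. */
int use_neon(int detected)
{
  if (env_is_one("JSIMD_FORCENONE"))
    return 0;                 /* force-disable SIMD */
  if (env_is_one("JSIMD_FORCENEON"))
    return 1;                 /* bypass detection and force-enable NEON */
  return detected;            /* no override:  trust the detection result */
}
```

With neither variable set, `use_neon()` simply returns whatever the detection
code reported.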
1155
1156 2. libjpeg-turbo previously assumed that AltiVec instructions were always
1157 available on PowerPC platforms, which led to "illegal instruction" errors when
1158 running on PowerPC chips that lack AltiVec support (such as the older 7xx/G3
1159 and newer e5500 series.)  libjpeg-turbo now examines /proc/cpuinfo on
1160 Linux/Android systems and enables AltiVec instructions only if the CPU supports
1161 them.  It also now provides two environment variables, `JSIMD_FORCEALTIVEC` and
1162 `JSIMD_FORCENONE`, to force-enable and force-disable AltiVec instructions in
1163 environments where /proc/cpuinfo is an unreliable means of CPU feature
1164 detection (such as when running in QEMU.)  On OS X, libjpeg-turbo continues to
1165 assume that AltiVec support is always available, which means that libjpeg-turbo
1166 cannot be used with G3 Macs unless you set the environment variable
1167 `JSIMD_FORCENONE` to `1`.
1168
1169 3. Fixed an issue whereby 64-bit ARM (AArch64) builds of libjpeg-turbo would
1170 crash when built with recent releases of the Clang/LLVM compiler.  This was
1171 caused by an ABI conformance issue in some of libjpeg-turbo's 64-bit NEON SIMD
1172 routines.  Those routines were incorrectly using 64-bit instructions to
1173 transfer a 32-bit JDIMENSION argument, whereas the ABI allows the upper
1174 (unused) 32 bits of a 32-bit argument's register to be undefined.  The new
1175 Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
1176 structure members into a single 64-bit register, and this exposed the ABI
1177 conformance issue.
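
The effect of this class of bug can be modeled in plain C.  The sketch below
simulates a 64-bit register whose low 32 bits carry an argument and whose upper
32 bits are unspecified.  All of the function names are hypothetical; the
actual fix was made in the NEON assembly code, not in C.

```c
#include <stdint.h>

/* Simulates a 64-bit register in which only the low 32 bits carry the
   argument.  The upper 32 bits hold whatever the caller left there. */
uint64_t make_register(uint32_t arg, uint32_t upper_garbage)
{
  return ((uint64_t)upper_garbage << 32) | arg;
}

/* Incorrect:  treats the whole 64-bit register as the argument. */
uint64_t read_arg_wrong(uint64_t reg)
{
  return reg;
}

/* Correct:  uses only the low 32 bits and zero-extends them. */
uint64_t read_arg_right(uint64_t reg)
{
  return (uint64_t)(uint32_t)reg;
}
```

The incorrect version appears to work whenever the upper half happens to be
zero, which is why the problem surfaced only after the compiler began leaving
other data there.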
1178
1179 4. Fancy upsampling is now supported when decompressing JPEG images that use
1180 4:4:0 (h1v2) chroma subsampling.  These images are generated when losslessly
1181 rotating or transposing JPEG images that use 4:2:2 (h2v1) chroma subsampling.
1182 The h1v2 fancy upsampling algorithm is not currently SIMD-accelerated.
1183
1184 5. If merged upsampling isn't SIMD-accelerated but YCbCr-to-RGB conversion is,
1185 then libjpeg-turbo will now disable merged upsampling when decompressing YCbCr
1186 JPEG images into RGB or extended RGB output images.  This significantly speeds
1187 up the decompression of 4:2:0 and 4:2:2 JPEGs on ARM platforms if fancy
1188 upsampling is not used (for example, if the `-nosmooth` option to djpeg is
1189 specified.)
1190
1191 6. The TurboJPEG API will now decompress 4:2:2 and 4:4:0 JPEG images with
1192 2x2 luminance sampling factors and 2x1 or 1x2 chrominance sampling factors.
1193 This is a non-standard way of specifying 2x subsampling (normally 4:2:2 JPEGs
1194 have 2x1 luminance and 1x1 chrominance sampling factors, and 4:4:0 JPEGs have
1195 1x2 luminance and 1x1 chrominance sampling factors), but the JPEG format and
1196 the libjpeg API both allow it.
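
One way to see why both forms describe the same subsampling is to classify an
image by the ratio of luminance to chrominance sampling factors rather than by
their absolute values.  The sketch below does that.  The enum and function
names are hypothetical, and the sketch assumes that both chrominance components
share the same sampling factors.

```c
/* Hypothetical labels used only by this sketch. */
enum subsamp_kind {
  SUBSAMP_444, SUBSAMP_422, SUBSAMP_440, SUBSAMP_420, SUBSAMP_UNKNOWN
};

/* Classifies subsampling from the luminance (lum_h x lum_v) and chrominance
   (chrom_h x chrom_v) sampling factors. */
enum subsamp_kind classify_subsamp(int lum_h, int lum_v, int chrom_h,
                                   int chrom_v)
{
  int ratio_h, ratio_v;

  if (lum_h < 1 || lum_v < 1 || chrom_h < 1 || chrom_v < 1 ||
      lum_h % chrom_h != 0 || lum_v % chrom_v != 0)
    return SUBSAMP_UNKNOWN;

  ratio_h = lum_h / chrom_h;    /* horizontal subsampling factor */
  ratio_v = lum_v / chrom_v;    /* vertical subsampling factor */

  if (ratio_h == 1 && ratio_v == 1) return SUBSAMP_444;
  if (ratio_h == 2 && ratio_v == 1) return SUBSAMP_422;
  if (ratio_h == 1 && ratio_v == 2) return SUBSAMP_440;
  if (ratio_h == 2 && ratio_v == 2) return SUBSAMP_420;
  return SUBSAMP_UNKNOWN;
}
```

Under this scheme, 2x2 luminance with 1x2 chrominance classifies as 4:2:2, and
2x2 luminance with 2x1 chrominance classifies as 4:4:0.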
1197
1198 7. Fixed an unsigned integer overflow in the libjpeg memory manager, detected
1199 by the Clang undefined behavior sanitizer, that could be triggered by
1200 attempting to decompress a specially-crafted malformed JPEG image.  This issue
1201 affected only 32-bit code and did not pose a security threat, but removing the
1202 warning makes it easier to detect actual security issues, should they arise in
1203 the future.
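
For readers unfamiliar with this kind of warning, the following generic C
sketch shows how an unsigned size computation can wrap around and how a guard
can detect that beforehand.  It is not the code from the libjpeg memory
manager.  Both function names are hypothetical, and rounding a size up to an
alignment boundary is used only as a convenient example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if rounding `size` up to a multiple of `align` would wrap around
   the range of size_t, or 0 if it is safe.  `align` must be a nonzero power
   of 2. */
int round_up_would_wrap(size_t size, size_t align)
{
  return size > SIZE_MAX - (align - 1);
}

/* Rounds `size` up to the next multiple of `align` (a nonzero power of 2).
   The caller is expected to call round_up_would_wrap() first. */
size_t round_up_pow2(size_t size, size_t align)
{
  return (size + align - 1) & ~(align - 1);
}
```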
1204
1205 8. Fixed additional negative left shifts and other issues reported by the GCC
1206 and Clang undefined behavior sanitizers when attempting to decompress
1207 specially-crafted malformed JPEG images.  None of these issues posed a security
1208 threat, but removing the warnings makes it easier to detect actual security
1209 issues, should they arise in the future.
1210
1211 9. Fixed an out-of-bounds array reference, introduced by 1.4.90[2] (partial
1212 image decompression) and detected by the Clang undefined behavior sanitizer,
1213 that could be triggered by a specially-crafted malformed JPEG image with more
1214 than four components.  Because the out-of-bounds reference was still within the
1215 same structure, it was not known to pose a security threat, but removing the
1216 warning makes it easier to detect actual security issues, should they arise in
1217 the future.
1218
1219 10. Fixed another ABI conformance issue in the 64-bit ARM (AArch64) NEON SIMD
1220 code.  Some of the routines were incorrectly reading and storing data below the
1221 stack pointer, which caused segfaults in certain applications under specific
1222 circumstances.
1223
1224
1225 1.5.0
1226 =====
1227
1228 ### Significant changes relative to 1.5 beta1:
1229
1230 1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast
1231 path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory.
1232
1233 2. Added libjpeg-turbo version and build information to the global string table
1234 of the libjpeg and TurboJPEG API libraries.  This is a common practice in other
1235 infrastructure libraries, such as OpenSSL and libpng, because it makes it easy
1236 to examine an application binary and determine which version of the library the
1237 application was linked against.
1238
1239 3. Fixed a couple of issues in the PPM reader that would cause buffer overruns
1240 in cjpeg if one of the values in a binary PPM/PGM input file exceeded the
1241 maximum value defined in the file's header and that maximum value was greater
1242 than 255.  libjpeg-turbo 1.4.2 already included a similar fix for ASCII PPM/PGM
1243 files.  Note that these issues were not security bugs, since they were confined
1244 to the cjpeg program and did not affect any of the libjpeg-turbo libraries.
1245
1246 4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt
1247 header using the `tjDecompressToYUV2()` function would cause the function to
1248 abort without returning an error and, under certain circumstances, corrupt the
1249 stack.  This only occurred if `tjDecompressToYUV2()` was called prior to
1250 calling `tjDecompressHeader3()`, or if the return value from
1251 `tjDecompressHeader3()` was ignored (both cases represent incorrect usage of
1252 the TurboJPEG API.)
1253
1254 5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that
1255 prevented the code from assembling properly with clang.
1256
1257 6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and
1258 `jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a
1259 source/destination manager has already been assigned to the compress or
1260 decompress object by a different function or by the calling program.  This
1261 prevents these functions from attempting to reuse a source/destination manager
1262 structure that was allocated elsewhere, because there is no way to ensure that
1263 it would be big enough to accommodate the new source/destination manager.
1264
1265
1266 1.4.90 (1.5 beta1)
1267 ==================
1268
1269 ### Significant changes relative to 1.4.2:
1270
1271 1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX
1272 (128-bit SIMD) instructions.  Although the performance of libjpeg-turbo on
1273 PowerPC was already good, due to the increased number of registers available
1274 to the compiler vs. x86, it was still possible to speed up compression by about
1275 3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the
1276 use of AltiVec instructions.
1277
1278 2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and
1279 `jpeg_crop_scanline()`) that can be used to partially decode a JPEG image.  See
1280 [libjpeg.txt](libjpeg.txt) for more details.
1281
1282 3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now
1283 implement the Closeable interface, so those classes can be used with a
1284 try-with-resources statement.
1285
1286 4. The TurboJPEG Java classes now throw idiomatic unchecked exceptions
1287 (IllegalArgumentException, IllegalStateException) for unrecoverable errors
1288 caused by incorrect API usage, and those classes throw a new checked exception
1289 type (TJException) for errors that are passed through from the C library.
1290
1291 5. Source buffers for the TurboJPEG C API functions, as well as the
1292 `jpeg_mem_src()` function in the libjpeg API, are now declared as const
1293 pointers.  This facilitates passing read-only buffers to those functions and
1294 assures the caller that the source buffer will not be modified.  This should
1295 not create any backward API or ABI incompatibilities with prior libjpeg-turbo
1296 releases.
1297
1298 6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1
1299 FPUs.
1300
1301 7. Fixed additional negative left shifts and other issues reported by the GCC
1302 and Clang undefined behavior sanitizers.  Most of these issues affected only
1303 32-bit code, and none of them was known to pose a security threat, but removing
1304 the warnings makes it easier to detect actual security issues, should they
1305 arise in the future.
1306
1307 8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code.
1308 This directive was preventing the code from assembling using the clang
1309 integrated assembler.
1310
1311 9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit
1312 libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora
1313 distributions.  This was due to the addition of a macro in jconfig.h that
1314 allows the Huffman codec to determine the word size at compile time.  Since
1315 that macro differs between 32-bit and 64-bit builds, this caused a conflict
1316 between the i386 and x86_64 RPMs (any differing files, other than executables,
1317 are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
1318 Since the macro is used only internally, it has been moved into jconfigint.h.
1319
1320 10. The x86-64 SIMD code can now be disabled at run time by setting the
1321 `JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations
1322 already had this capability.)
1323
1324 11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the
1325 benchmark from outputting any images.  This removes any potential operating
1326 system overhead that might be caused by lazy writes to disk and thus improves
1327 the consistency of the performance measurements.
1328
1329 12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64
1330 platforms.  This speeds up the compression of full-color JPEGs by about 10-15%
1331 on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD
1332 CPUs.  Additionally, this works around an issue in the clang optimizer that
1333 prevents it (as of this writing) from achieving the same performance as GCC
1334 when compiling the C version of the Huffman encoder
1335 (<https://llvm.org/bugs/show_bug.cgi?id=16035>).  For the purposes of
1336 benchmarking or regression testing, SIMD-accelerated Huffman encoding can be
1337 disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`.
1338
1339 13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
1340 compression algorithms (including the accurate integer forward DCT and h2v2 &
1341 h2v1 downsampling algorithms, which are not accelerated in the 32-bit NEON
1342 implementation.)  This speeds up the compression of full-color JPEGs by about
1343 75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
1344 Cortex-A53 and Cortex-A57 cores.
1345
1346 14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
1347 and 64-bit platforms.
1348
1349     For 32-bit code, this speeds up the compression of full-color JPEGs by
1350 about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by
1351 about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and
1352 Cortex-A57), relative to libjpeg-turbo 1.4.x.  Note that the larger speedup
1353 under iOS is due to the fact that iOS builds use LLVM, which does not optimize
1354 the C Huffman encoder as well as GCC does.
1355
1356     For 64-bit code, NEON-accelerated Huffman encoding speeds up the
1357 compression of full-color JPEGs by about 40% on average on a typical iOS device
1358 (iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device
1359 (Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in
1360 [13] above.
1361
1362     For the purposes of benchmarking or regression testing, SIMD-accelerated
1363 Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment
1364 variable to `1`.
1365
1366 15. pkg-config (.pc) scripts are now included for both the libjpeg and
1367 TurboJPEG API libraries on Un*x systems.  Note that if a project's build system
1368 relies on these scripts, then it will not be possible to build that project
1369 with libjpeg or with a prior version of libjpeg-turbo.
1370
1371 16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
1372 improve performance on CPUs with in-order pipelines.  This speeds up the
1373 decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
1374 processor and by about 15% on average on a Cortex-A53 core.
1375
1376 17. Fixed an issue in the accelerated Huffman decoder that could have caused
1377 the decoder to read past the end of the input buffer when a malformed,
1378 specially-crafted JPEG image was being decompressed.  In prior versions of
1379 libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
1380 if there were > 128 bytes of data in the input buffer.  However, it is possible
1381 to construct a JPEG image in which a single Huffman block is over 430 bytes
1382 long, so this version of libjpeg-turbo activates the accelerated Huffman
1383 decoder only if there are > 512 bytes of data in the input buffer.
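
The guard described above amounts to a simple threshold test, sketched below in
C.  The constant and function names are hypothetical, and the real decoder may
take additional conditions into account.  Only the 512-byte figure comes from
the text.

```c
#include <stddef.h>

/* Hypothetical constant for this sketch:  the accelerated decoder is used
   only when more than this many bytes remain in the input buffer. */
#define FAST_PATH_MIN_BYTES  512

/* Returns 1 if the accelerated decoding path may be used, or 0 if the decoder
   should fall back to the slower path. */
int use_fast_huffman_path(size_t bytes_in_buffer)
{
  return bytes_in_buffer > FAST_PATH_MIN_BYTES;
}
```

Because the threshold exceeds the longest block mentioned above (just over 430
bytes), a block that starts on the fast path cannot run off the end of the
buffer.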
1384
1385 18. Fixed a memory leak in tjunittest encountered when running the program
1386 with the `-yuv` option.
1387
1388
1389 1.4.2
1390 =====
1391
1392 ### Significant changes relative to 1.4.1:
1393
1394 1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a
1395 negative width or height was used as an input image (Windows bitmaps can have
1396 a negative height if they are stored in top-down order, but such files are
1397 rare and not supported by libjpeg-turbo.)
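
A minimal sketch of the kind of header validation that avoids this crash is
shown below.  The function name is hypothetical, and the sketch assumes that
the width and height have already been read from the bitmap header as signed
32-bit values.

```c
#include <stdint.h>

/* Returns 1 if the dimensions read from a bitmap header are acceptable to a
   reader that supports only bottom-up bitmaps, or 0 otherwise. */
int bmp_dimensions_ok(int32_t width, int32_t height)
{
  return width > 0 && height > 0;
}
```

A reader would reject the file with an error, rather than continue, whenever
this check fails.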
1398
1399 2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would
1400 incorrectly encode certain JPEG images when quality=100 and the fast integer
1401 forward DCT were used.  This was known to cause `make test` to fail when the
1402 library was built with `-march=haswell` on x86 systems.
1403
1404 3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest
1405 & greatest development version of the Clang/LLVM compiler.  This was caused by
1406 an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD
1407 routines.  Those routines were incorrectly using a 64-bit `mov` instruction to
1408 transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper
1409 (unused) 32 bits of a 32-bit argument's register to be undefined.  The new
1410 Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
1411 structure members into a single 64-bit register, and this exposed the ABI
1412 conformance issue.
1413
1414 4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged)
1415 upsampling routine that caused a buffer overflow (and subsequent segfault) when
1416 decompressing a 4:2:0 JPEG image whose scaled output width was less than 16
1417 pixels.  The "plain" upsampling routines are normally only used when
1418 decompressing a non-YCbCr JPEG image, but they are also used when decompressing
1419 a JPEG image whose scaled output height is 1.
1420
1421 5. Fixed various negative left shifts and other issues reported by the GCC and
1422 Clang undefined behavior sanitizers.  None of these was known to pose a
1423 security threat, but removing the warnings makes it easier to detect actual
1424 security issues, should they arise in the future.
1425
1426
1427 1.4.1
1428 =====
1429
1430 ### Significant changes relative to 1.4.0:
1431
1432 1. tjbench now properly handles CMYK/YCCK JPEG files.  Passing an argument of
1433 `-cmyk` (instead of, for instance, `-rgb`) will cause tjbench to internally
1434 convert the source bitmap to CMYK prior to compression, to generate YCCK JPEG
1435 files, and to internally convert the decompressed CMYK pixels back to RGB after
1436 decompression (the latter is done automatically if a CMYK or YCCK JPEG is
1437 passed to tjbench as a source image.)  The CMYK<->RGB conversion operation is
1438 not benchmarked.  NOTE: The quick & dirty CMYK<->RGB conversions that tjbench
1439 uses are suitable for testing only.  Proper conversion between CMYK and RGB
1440 requires a color management system.
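
To give a sense of what a "quick & dirty" conversion looks like, here is a
naive integer CMYK<->RGB sketch in C.  It is a generic textbook-style formula,
not necessarily the one that tjbench uses, and all of the type and function
names are hypothetical.  Like the conversions mentioned above, it ignores color
management entirely and is suitable for testing only.

```c
#include <stdint.h>

struct rgb  { uint8_t r, g, b; };
struct cmyk { uint8_t c, m, y, k; };

static uint8_t max3(uint8_t a, uint8_t b, uint8_t c)
{
  uint8_t m = a > b ? a : b;

  return m > c ? m : c;
}

/* Naive RGB-to-CMYK conversion (0 = no ink, 255 = full ink). */
struct cmyk naive_rgb_to_cmyk(uint8_t r, uint8_t g, uint8_t b)
{
  struct cmyk out = { 0, 0, 0, 0 };
  int k = 255 - max3(r, g, b);

  out.k = (uint8_t)k;
  if (k < 255) {              /* pure black keeps c = m = y = 0 */
    out.c = (uint8_t)((255 - r - k) * 255 / (255 - k));
    out.m = (uint8_t)((255 - g - k) * 255 / (255 - k));
    out.y = (uint8_t)((255 - b - k) * 255 / (255 - k));
  }
  return out;
}

/* Naive CMYK-to-RGB conversion, the approximate inverse of the above. */
struct rgb naive_cmyk_to_rgb(struct cmyk in)
{
  struct rgb out;

  out.r = (uint8_t)((255 - in.c) * (255 - in.k) / 255);
  out.g = (uint8_t)((255 - in.m) * (255 - in.k) / 255);
  out.b = (uint8_t)((255 - in.y) * (255 - in.k) / 255);
  return out;
}
```

Integer division means that a round trip is exact only for some colors, which
is acceptable for a benchmark but not for production use.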
1441
1442 2. `make test` now performs additional bitwise regression tests using tjbench,
1443 mainly for the purpose of testing compression from/decompression to a subregion
1444 of a larger image buffer.
1445
1446 3. `make test` no longer tests the regression of the floating point DCT/IDCT
1447 by default, since the results of those tests can vary if the algorithms in
1448 question are not implemented using SIMD instructions on a particular platform.
1449 See the comments in [Makefile.am](Makefile.am) for information on how to
1450 re-enable the tests and to specify an expected result for them based on the
1451 particulars of your platform.
1452
1453 4. The NULL color conversion routines have been significantly optimized,
1454 which speeds up the compression of RGB and CMYK JPEGs by 5-20% when using
1455 64-bit code and 0-3% when using 32-bit code, and the decompression of those
1456 images by 10-30% when using 64-bit code and 3-12% when using 32-bit code.
1457
1458 5. Fixed an "illegal instruction" error that occurred when djpeg from a
1459 SIMD-enabled libjpeg-turbo MIPS build was executed with the `-nosmooth` option
1460 on a MIPS machine that lacked DSPr2 support.  The MIPS SIMD routines for h2v1
1461 and h2v2 merged upsampling were not properly checking for the existence of
1462 DSPr2.
1463
1464 6. Performance has been improved significantly on 64-bit non-Linux and
1465 non-Windows platforms (generally 10-20% faster compression and 5-10% faster
1466 decompression.)  Due to an oversight, the 64-bit version of the accelerated
1467 Huffman codec was not being compiled in when libjpeg-turbo was built on
1468 platforms other than Windows or Linux.  Oops.
1469
1470 7. Fixed an extremely rare bug in the Huffman encoder that caused 64-bit
1471 builds of libjpeg-turbo to incorrectly encode a few specific test images when
1472 quality=98, an optimized Huffman table, and the accurate integer forward DCT
1473 were used.
1474
1475 8. The Windows (CMake) build system now supports building only static or only
1476 shared libraries.  This is accomplished by adding either `-DENABLE_STATIC=0` or
1477 `-DENABLE_SHARED=0` to the CMake command line.
1478
1479 9. TurboJPEG API functions will now return an error code if a warning is
1480 triggered in the underlying libjpeg API.  For instance, if a JPEG file is
1481 corrupt, the TurboJPEG decompression functions will attempt to decompress
1482 as much of the image as possible, but those functions will now return -1 to
1483 indicate that the decompression was not entirely successful.
1484
1485 10. Fixed a bug in the MIPS DSPr2 4:2:2 fancy upsampling routine that caused a
1486 buffer overflow (and subsequent segfault) when decompressing a 4:2:2 JPEG image
1487 in which the right-most MCU was 5 or 6 pixels wide.
1488
1489
1490 1.4.0
1491 =====
1492
1493 ### Significant changes relative to 1.4 beta1:
1494
1495 1. Fixed a build issue on OS X PowerPC platforms (md5cmp failed to build
1496 because OS X does not provide the `le32toh()` and `htole32()` functions.)
1497
1498 2. The non-SIMD RGB565 color conversion code did not work correctly on big
1499 endian machines.  This has been fixed.
1500
1501 3. Fixed an issue in `tjPlaneSizeYUV()` whereby it would erroneously return 1
1502 instead of -1 if `componentID` was > 0 and `subsamp` was `TJSAMP_GRAY`.
1503
1504 4. Fixed an issue in `tjBufSizeYUV2()` whereby it would erroneously return 0
1505 instead of -1 if `width` was < 1.
1506
1507 5. The Huffman encoder now uses `clz` and `bsr` instructions for bit counting
1508 on ARM64 platforms (see 1.4 beta1[5].)
1509
1510 6. The `close()` method in the TJCompressor and TJDecompressor Java classes is
1511 now idempotent.  Previously, that method would call the native `tjDestroy()`
1512 function even if the TurboJPEG instance had already been destroyed.  This
1513 caused an exception to be thrown during finalization, if the `close()` method
1514 had already been called.  The exception was caught, but it was still an
1515 expensive operation.
1516
1517 7. The TurboJPEG API previously generated an error (`Could not determine
1518 subsampling type for JPEG image`) when attempting to decompress grayscale JPEG
1519 images that were compressed with a sampling factor other than 1 (for instance,
1520 with `cjpeg -grayscale -sample 2x2`).  Subsampling technically has no meaning
1521 with grayscale JPEGs, and thus the horizontal and vertical sampling factors
1522 for such images are ignored by the decompressor.  However, the TurboJPEG API
1523 was being too rigid and was expecting the sampling factors to be equal to 1
1524 before it treated the image as a grayscale JPEG.
1525
1526 8. cjpeg, djpeg, and jpegtran now accept an argument of `-version`, which will
1527 print the library version and exit.
1528
1529 9. Referring to 1.4 beta1[15], another extremely rare circumstance was
1530 discovered under which the Huffman encoder's local buffer can be overrun
1531 when a buffered destination manager is being used and an
1532 extremely-high-frequency block (basically junk image data) is being encoded.
1533 Even though the Huffman local buffer was increased from 128 bytes to 136 bytes
1534 to address the previous issue, the new issue caused even the larger buffer to
1535 be overrun.  Further analysis reveals that, in the absolute worst case (such as
1536 setting alternating AC coefficients to 32767 and -32768 in the JPEG scanning
1537 order), the Huffman encoder can produce encoded blocks that approach double the
1538 size of the unencoded blocks.  Thus, the Huffman local buffer was increased to
1539 256 bytes, which should prevent any such issue from re-occurring in the future.
1540
1541 10. The new `tjPlaneSizeYUV()`, `tjPlaneWidth()`, and `tjPlaneHeight()`
1542 functions were not actually usable on any platform except OS X and Windows,
1543 because those functions were not included in the libturbojpeg mapfile.  This
1544 has been fixed.
1545
1546 11. Restored the `JPP()`, `JMETHOD()`, and `FAR` macros in the libjpeg-turbo
1547 header files.  The `JPP()` and `JMETHOD()` macros were originally implemented
1548 in libjpeg as a way of supporting non-ANSI compilers that lacked support for
1549 prototype parameters.  libjpeg-turbo has never supported such compilers, but
1550 some software packages still use the macros to define their own prototypes.
1551 Similarly, libjpeg-turbo has never supported MS-DOS and other platforms that
1552 have far symbols, but some software packages still use the `FAR` macro.  A
1553 pretty good argument can be made that this is a bad practice on the part of the
1554 software in question, but since this affects more than one package, it's just
1555 easier to fix it here.
1556
1557 12. Fixed issues that were preventing the ARM 64-bit SIMD code from compiling
1558 for iOS, and included an ARMv8 architecture in all of the binaries installed by
1559 the "official" libjpeg-turbo SDK for OS X.
1560
1561
1562 1.3.90 (1.4 beta1)
1563 ==================
1564
1565 ### Significant changes relative to 1.3.1:
1566
1567 1. New features in the TurboJPEG API:
1568
1569      - YUV planar images can now be generated with an arbitrary line padding
1570 (previously only 4-byte padding, which was compatible with X Video, was
1571 supported.)
1572      - The decompress-to-YUV function has been extended to support image
1573 scaling.
1574      - JPEG images can now be compressed from YUV planar source images.
1575      - YUV planar images can now be decoded into RGB or grayscale images.
1576      - 4:1:1 subsampling is now supported.  This is mainly included for
1577 compatibility, since 4:1:1 is not fully accelerated in libjpeg-turbo and has no
1578 significant advantages relative to 4:2:0.
1579      - CMYK images are now supported.  This feature allows CMYK source images
1580 to be compressed to YCCK JPEGs and YCCK or CMYK JPEGs to be decompressed to
1581 CMYK destination images.  Conversion between CMYK/YCCK and RGB or YUV images is
1582 not supported.  Such conversion requires a color management system and is thus
1583 out of scope for a codec library.
1584      - The handling of YUV images in the Java API has been significantly
1585 refactored and should now be much more intuitive.
1586      - The Java API now supports encoding a YUV image from an arbitrary
1587 position in a large image buffer.
1588      - All of the YUV functions now have a corresponding function that operates
1589 on separate image planes instead of a unified image buffer.  This allows for
1590 compressing/decoding from or decompressing/encoding to a subregion of a larger
1591 YUV image.  It also allows for handling YUV formats that swap the order of the
1592 U and V planes.
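
The effect of line padding on the size of a YUV planar image can be sketched as
follows for the 4:2:0 case.  The function `yuv420_buf_size()` is hypothetical
and exists only for illustration.  It assumes that each line of each plane is
padded to a multiple of `pad` bytes (a power of 2) and that odd dimensions are
rounded up to an even number of luminance samples.  Applications should call
the TurboJPEG API itself rather than rely on this arithmetic.

```c
/* Rounds v up to the nearest multiple of p (p must be a power of 2). */
#define PAD(v, p)  (((v) + (p) - 1) & (~((p) - 1)))

/* Returns the number of bytes needed to hold a 4:2:0 YUV planar image, or
   (unsigned long)-1 if the arguments are out of bounds. */
unsigned long yuv420_buf_size(int width, int height, int pad)
{
  unsigned long luma_w, luma_h, chroma_w, chroma_h;

  if (width < 1 || height < 1 || pad < 1 || (pad & (pad - 1)) != 0)
    return (unsigned long)-1;

  luma_w = PAD((unsigned long)width, 2UL);
  luma_h = PAD((unsigned long)height, 2UL);
  chroma_w = luma_w / 2;      /* chrominance is halved horizontally ... */
  chroma_h = luma_h / 2;      /* ... and vertically in 4:2:0 */

  /* One luminance plane plus two chrominance planes, each with padded
     lines */
  return PAD(luma_w, (unsigned long)pad) * luma_h +
         2 * PAD(chroma_w, (unsigned long)pad) * chroma_h;
}
```

For a 5x5 image, for example, this yields 72 bytes with 4-byte line padding but
only 54 bytes with a padding of 1 (no padding).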
1593
1594 2. Added SIMD acceleration for DSPr2-capable MIPS platforms.  This speeds up
1595 the compression of full-color JPEGs by 70-80% on such platforms and
1596 decompression by 25-35%.
1597
1598 3. If an application attempts to decompress a Huffman-coded JPEG image whose
1599 header does not contain Huffman tables, libjpeg-turbo will now insert the
1600 default Huffman tables.  In order to save space, many motion JPEG video frames
1601 are encoded without the default Huffman tables, so these frames can now be
1602 successfully decompressed by libjpeg-turbo without additional work on the part
1603 of the application.  An application can still override the Huffman tables, for
1604 instance to re-use tables from a previous frame of the same video.
1605
1606 4. The Mac packaging system now uses pkgbuild and productbuild rather than
1607 PackageMaker (which is obsolete and no longer supported.)  This means that
1608 OS X 10.6 "Snow Leopard" or later must be used when packaging libjpeg-turbo,
1609 although the packages produced can be installed on OS X 10.5 "Leopard" or
1610 later.  OS X 10.4 "Tiger" is no longer supported.
1611
1612 5. The Huffman encoder now uses `clz` and `bsr` instructions for bit counting
1613 on ARM platforms rather than a lookup table.  This reduces the memory footprint
1614 by 64k, which may be important for some mobile applications.  Out of four
1615 Android devices that were tested, two demonstrated a small overall performance
1616 loss (~3-4% on average) with ARMv6 code and a small gain (also ~3-4%) with
1617 ARMv7 code when enabling this new feature, but the other two devices
1618 demonstrated a significant overall performance gain with both ARMv6 and ARMv7
1619 code (~10-20%) when enabling the feature.  Actual mileage may vary.
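
The idea behind this change can be sketched in C using the count-leading-zeros
builtin provided by GCC and Clang.  Both function names are hypothetical, and
the encoder's actual code may differ.  The point is only that the number of
bits in a coefficient can be computed directly instead of being looked up in a
large precomputed table.

```c
#include <limits.h>

/* Portable reference:  number of bits needed to represent x (0 for x == 0). */
int nbits_loop(unsigned int x)
{
  int n = 0;

  while (x) {
    n++;
    x >>= 1;
  }
  return n;
}

/* Same result using the GCC/Clang count-leading-zeros builtin.  The builtin
   is undefined for 0, so that case is handled separately. */
int nbits_clz(unsigned int x)
{
  return x ? (int)(sizeof(unsigned int) * CHAR_BIT) - __builtin_clz(x) : 0;
}
```

The trade-off is the one measured above:  the table costs memory, whereas the
instruction-based approach costs nothing in footprint but varies in speed from
one CPU to another.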
1620
1621 6. Worked around an issue with Visual C++ 2010 and later that caused incorrect
1622 pixels to be generated when decompressing a JPEG image to a 256-color bitmap,
1623 if compiler optimization was enabled when libjpeg-turbo was built.  This caused
1624 the regression tests to fail when doing a release build under Visual C++ 2010
1625 and later.
1626
1627 7. Improved the accuracy and performance of the non-SIMD implementation of the
1628 floating point inverse DCT (using code borrowed from libjpeg v8a and later.)
1629 The accuracy of this implementation now matches the accuracy of the SSE/SSE2
1630 implementation.  Note, however, that the floating point DCT/IDCT algorithms are
1631 mainly a legacy feature.  They generally do not produce significantly better
1632 accuracy than the accurate integer DCT/IDCT algorithms, and they are quite a
1633 bit slower.
1634
1635 8. Added a new output colorspace (`JCS_RGB565`) to the libjpeg API that allows
1636 for decompressing JPEG images into RGB565 (16-bit) pixels.  If dithering is not
1637 used, then this code path is SIMD-accelerated on ARM platforms.
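
For reference, an RGB565 pixel stores 5 bits of red, 6 bits of green, and 5
bits of blue in a 16-bit value.  The sketch below packs one pixel without
dithering.  The function name is hypothetical and is not part of the libjpeg
API.

```c
#include <stdint.h>

/* Packs 8-bit R, G, and B samples into one RGB565 value by keeping the top
   5, 6, and 5 bits of each sample, respectively. */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
  return (uint16_t)(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}
```

The sketch returns the pixel as a 16-bit integer.  How its two bytes are laid
out in memory depends on the byte order of the machine.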
1638
1639 9. Numerous obsolete features, such as support for non-ANSI compilers and
1640 support for the MS-DOS memory model, were removed from the libjpeg code,
1641 greatly improving its readability and making it easier to maintain and extend.
1642
1643 10. Fixed a segfault that occurred when calling `output_message()` with
1644 `msg_code` set to `JMSG_COPYRIGHT`.
1645
1646 11. Fixed an issue whereby wrjpgcom was allowing comments longer than 65k
1647 characters to be passed on the command line, which was causing it to generate
1648 incorrect JPEG files.
1649
1650 12. Fixed a bug in the build system that was causing the Windows version of
1651 wrjpgcom to be built using the rdjpgcom source code.
1652
1653 13. Restored 12-bit-per-component JPEG support.  A 12-bit version of
1654 libjpeg-turbo can now be built by passing an argument of `--with-12bit` to
1655 configure (Unix) or `-DWITH_12BIT=1` to cmake (Windows.)  12-bit JPEG support
1656 is included only for convenience.  Enabling this feature disables all of the
1657 performance features in libjpeg-turbo, as well as arithmetic coding and the
1658 TurboJPEG API.  The resulting library still contains the other libjpeg-turbo
1659 features (such as the colorspace extensions), but in general, it performs no
1660 faster than libjpeg v6b.
1661
1662 14. Added ARM 64-bit SIMD acceleration for the YCC-to-RGB color conversion
1663 and IDCT algorithms (both are used during JPEG decompression.)  For unknown
1664 reasons (probably related to clang), this code cannot currently be compiled for
1665 iOS.
1666
1667 15. Fixed an extremely rare bug (CVE-2014-9092) that could cause the Huffman
1668 encoder's local buffer to overrun when a very high-frequency MCU is compressed
1669 using quality 100 and no subsampling, and when the JPEG output buffer is being
1670 dynamically resized by the destination manager.  This issue was so rare that,
1671 even with a test program specifically designed to make the bug occur (by
1672 injecting random high-frequency YUV data into the compressor), it was
1673 reproducible only once in about every 25 million iterations.
1674
1675 16. Fixed an oversight in the TurboJPEG C wrapper:  if any of the JPEG
1676 compression functions was called repeatedly with the same
1677 automatically-allocated destination buffer, then TurboJPEG would erroneously
1678 assume that the `jpegSize` parameter was equal to the size of the buffer, when
1679 in fact that parameter was probably equal to the size of the most recently
1680 compressed JPEG image.  If the size of the previous JPEG image was not as large
1681 as the current JPEG image, then TurboJPEG would unnecessarily reallocate the
1682 destination buffer.
1683
1684
1685 1.3.1
1686 =====
1687
1688 ### Significant changes relative to 1.3.0:
1689
1690 1. On Un*x systems, `make install` now installs the libjpeg-turbo libraries
1691 into /opt/libjpeg-turbo/lib32 by default on any 32-bit system, not just x86,
1692 and into /opt/libjpeg-turbo/lib64 by default on any 64-bit system, not just
1693 x86-64.  You can override this by overriding either the `prefix` or `libdir`
1694 configure variables.
1695
1696 2. The Windows installer now places a copy of the TurboJPEG DLLs in the same
1697 directory as the rest of the libjpeg-turbo binaries.  This was mainly done
1698 to support TurboVNC 1.3, which bundles the DLLs in its Windows installation.
1699 When using a 32-bit version of CMake on 64-bit Windows, it is impossible to
1700 access the c:\WINDOWS\system32 directory, which made it impossible for the
1701 TurboVNC build scripts to bundle the 64-bit TurboJPEG DLL.
1702
1703 3. Fixed a bug whereby attempting to encode a progressive JPEG with arithmetic
1704 entropy coding (by passing arguments of `-progressive -arithmetic` to cjpeg or
1705 jpegtran, for instance) would result in an error, `Requested feature was
1706 omitted at compile time`.
1707
1708 4. Fixed a couple of issues (CVE-2013-6629 and CVE-2013-6630) whereby malformed
1709 JPEG images would cause libjpeg-turbo to use uninitialized memory during
1710 decompression.
1711
1712 5. Fixed an error (`Buffer passed to JPEG library is too small`) that occurred
1713 when calling the TurboJPEG YUV encoding function with a very small (< 5x5)
1714 source image, and added a unit test to check for this error.
1715
1716 6. The Java classes should now build properly under Visual Studio 2010 and
1717 later.
1718
1719 7. Fixed an issue that prevented SRPMs generated using the in-tree packaging
1720 tools from being rebuilt on certain newer Linux distributions.
1721
1722 8. Numerous minor fixes to eliminate compilation and build/packaging system
1723 warnings, fix cosmetic issues, improve documentation clarity, and other general
1724 source cleanup.
1725
1726
1727 1.3.0
1728 =====
1729
1730 ### Significant changes relative to 1.3 beta1:
1731
1732 1. `make test` now works properly on FreeBSD, and it no longer requires the
1733 md5sum executable to be present on other Un*x platforms.
1734
1735 2. Overhauled the packaging system:
1736
1737      - To avoid conflict with vendor-supplied libjpeg-turbo packages, the
1738 official RPMs and DEBs for libjpeg-turbo have been renamed to
1739 "libjpeg-turbo-official".
1740      - The TurboJPEG libraries are now located under /opt/libjpeg-turbo in the
1741 official Linux and Mac packages, to avoid conflict with vendor-supplied
1742 packages and also to streamline the packaging system.
1743      - Release packages are now created with the directory structure defined
1744 by the configure variables `prefix`, `bindir`, `libdir`, etc. (Un\*x) or by the
1745 `CMAKE_INSTALL_PREFIX` variable (Windows.)  The exception is that the docs are
1746 always located under the system default documentation directory on Un\*x and
1747 Mac systems, and on Windows, the TurboJPEG DLL is always located in the Windows
1748 system directory.
1749      - To avoid confusion, official libjpeg-turbo packages on Linux/Unix
1750 platforms (except for Mac) will always install the 32-bit libraries in
1751 /opt/libjpeg-turbo/lib32 and the 64-bit libraries in /opt/libjpeg-turbo/lib64.
1752      - Fixed an issue whereby, in some cases, the libjpeg-turbo executables on
1753 Un*x systems were not properly linking with the shared libraries installed by
1754 the same package.
1755      - Fixed an issue whereby building the "installer" target on Windows when
1756 `WITH_JAVA=1` would fail if the TurboJPEG JAR had not been previously built.
1757      - Building the "install" target on Windows now installs files into the
1758 same places that the installer does.
1759
1760 3. Fixed a Huffman encoder bug that prevented I/O suspension from working
1761 properly.
1762
1763
1764 1.2.90 (1.3 beta1)
1765 ==================
1766
1767 ### Significant changes relative to 1.2.1:
1768
1769 1. Added support for additional scaling factors (3/8, 5/8, 3/4, 7/8, 9/8, 5/4,
1770 11/8, 3/2, 13/8, 7/4, 15/8, and 2) when decompressing.  Note that the IDCT will
1771 not be SIMD-accelerated when using any of these new scaling factors.
1772
1773 2. The TurboJPEG dynamic library is now versioned.  It was not strictly
1774 necessary to do so, because TurboJPEG uses versioned symbols, and if a function
1775 changes in an ABI-incompatible way, that function is renamed and a legacy
1776 function is provided to maintain backward compatibility.  However, certain
1777 Linux distro maintainers have a policy against accepting any library that isn't
1778 versioned.
1779
1780 3. Extended the TurboJPEG Java API so that it can be used to compress a JPEG
1781 image from and decompress a JPEG image to an arbitrary position in a large
1782 image buffer.
1783
1784 4. The `tjDecompressToYUV()` function now supports the `TJFLAG_FASTDCT` flag.
1785
1786 5. The 32-bit supplementary package for amd64 Debian systems now provides
1787 symlinks in /usr/lib/i386-linux-gnu for the TurboJPEG libraries in /usr/lib32.
1788 This allows those libraries to be used on MultiArch-compatible systems (such as
1789 Ubuntu 11 and later) without setting the linker path.
1790
1791 6. The TurboJPEG Java wrapper should now find the JNI library on Mac systems
1792 without having to pass `-Djava.library.path=/usr/lib` to java.
1793
1794 7. TJBench has been ported to Java to provide a convenient way of validating
1795 the performance of the TurboJPEG Java API.  It can be run with
1796 `java -cp turbojpeg.jar TJBench`.
1797
1798 8. cjpeg can now be used to generate JPEG files with the RGB colorspace
1799 (feature ported from jpeg-8d.)
1800
1801 9. The width and height in the `-crop` argument passed to jpegtran can now be
1802 suffixed with `f` to indicate that, when the upper left corner of the cropping
1803 region is automatically moved to the nearest iMCU boundary, the bottom right
1804 corner should be moved by the same amount.  In other words, this feature causes
1805 jpegtran to strictly honor the specified width/height rather than the specified
1806 bottom right corner (feature ported from jpeg-8d.)
1807
1808 10. JPEG files using the RGB colorspace can now be decompressed into grayscale
1809 images (feature ported from jpeg-8d.)
1810
1811 11. Fixed a regression caused by 1.2.1[7] whereby the build would fail with
1812 multiple "Mismatch in operand sizes" errors when attempting to build the x86
1813 SIMD code with NASM 0.98.
1814
1815 12. The in-memory source/destination managers (`jpeg_mem_src()` and
1816 `jpeg_mem_dest()`) are now included by default when building libjpeg-turbo with
1817 libjpeg v6b or v7 emulation, so that programs can take advantage of these
1818 functions without requiring the use of the backward-incompatible libjpeg v8
1819 ABI.  The "age number" of the libjpeg-turbo library on Un*x systems has been
1820 incremented by 1 to reflect this.  You can disable this feature with a
1821 configure/CMake switch in order to retain strict API/ABI compatibility with the
1822 libjpeg v6b or v7 API/ABI (or with previous versions of libjpeg-turbo.)  See
1823 [README.md](README.md) for more details.
1824
1825 13. Added ARMv7s architecture to libjpeg.a and libturbojpeg.a in the official
1826 libjpeg-turbo binary package for OS X, so that those libraries can be used to
1827 build applications that leverage the faster CPUs in the iPhone 5 and iPad 4.
1828
1829
1830 1.2.1
1831 =====
1832
1833 ### Significant changes relative to 1.2.0:
1834
1835 1. Creating or decoding a JPEG file that uses the RGB colorspace should now
1836 properly work when the input or output colorspace is one of the libjpeg-turbo
1837 colorspace extensions.
1838
1839 2. When libjpeg-turbo was built without SIMD support and merged (non-fancy)
1840 upsampling was used along with an alpha-enabled colorspace during
1841 decompression, the unused byte of the decompressed pixels was not being set to
1842 0xFF.  This has been fixed.  TJUnitTest has also been extended to test for the
1843 correct behavior of the colorspace extensions when merged upsampling is used.
1844
1845 3. Fixed a bug whereby the libjpeg-turbo SSE2 SIMD code would not preserve the
1846 upper 64 bits of xmm6 and xmm7 on Win64 platforms, which violated the Win64
1847 calling conventions.
1848
1849 4. Fixed a regression (CVE-2012-2806) caused by 1.2.0[6] whereby decompressing
1850 corrupt JPEG images (specifically, images in which the component count was
1851 erroneously set to a large value) would cause libjpeg-turbo to segfault.
1852
1853 5. Worked around a severe performance issue with "Bobcat" (AMD Embedded APU)
1854 processors.  The `MASKMOVDQU` instruction, which was used by the libjpeg-turbo
1855 SSE2 SIMD code, is apparently implemented in microcode on AMD processors, and
1856 it is painfully slow on Bobcat processors in particular.  Eliminating the use
1857 of this instruction improved performance by an order of magnitude on Bobcat
1858 processors and by a small amount (typically 5%) on AMD desktop processors.
1859
1860 6. Added SIMD acceleration for performing 4:2:2 upsampling on NEON-capable ARM
1861 platforms.  This speeds up the decompression of 4:2:2 JPEGs by 20-25% on such
1862 platforms.
1863
1864 7. Fixed a regression caused by 1.2.0[2] whereby, on Linux/x86 platforms
1865 running the 32-bit SSE2 SIMD code in libjpeg-turbo, decompressing a 4:2:0 or
1866 4:2:2 JPEG image into a 32-bit (RGBX, BGRX, etc.) buffer without using fancy
1867 upsampling would produce several incorrect columns of pixels at the right-hand
1868 side of the output image if each row in the output image was not evenly
1869 divisible by 16 bytes.
1870
1871 8. Fixed an issue whereby attempting to build the SIMD extensions with Xcode
1872 4.3 on OS X platforms would cause NASM to return numerous errors of the form
1873 "'%define' expects a macro identifier".
1874
1875 9. Added flags to the TurboJPEG API that allow the caller to force the use of
1876 either the fast or the accurate DCT/IDCT algorithms in the underlying codec.
1877
1878
1879 1.2.0
1880 =====
1881
1882 ### Significant changes relative to 1.2 beta1:
1883
1884 1. Fixed build issue with Yasm on Unix systems (the libjpeg-turbo build system
1885 was not adding the current directory to the assembler include path, so Yasm
1886 was not able to find jsimdcfg.inc.)
1887
1888 2. Fixed out-of-bounds read in SSE2 SIMD code that occurred when decompressing
1889 a JPEG image to a bitmap buffer whose size was not a multiple of 16 bytes.
1890 This was more of an annoyance than an actual bug, since it did not cause any
1891 actual run-time problems, but the issue showed up when running libjpeg-turbo in
1892 valgrind.  See <http://crbug.com/72399> for more information.
1893
1894 3. Added a compile-time macro (`LIBJPEG_TURBO_VERSION`) that can be used to
1895 check the version of libjpeg-turbo against which an application was compiled.
1896
1897 4. Added new RGBA/BGRA/ABGR/ARGB colorspace extension constants (libjpeg API)
1898 and pixel formats (TurboJPEG API), which allow applications to specify that,
1899 when decompressing to a 4-component RGB buffer, the unused byte should be set
1900 to 0xFF so that it can be interpreted as an opaque alpha channel.
1901
1902 5. Fixed regression issue whereby DevIL failed to build against libjpeg-turbo
1903 because libjpeg-turbo's distributed version of jconfig.h contained an `INLINE`
1904 macro, which conflicted with a similar macro in DevIL.  This macro is used only
1905 internally when building libjpeg-turbo, so it was moved into config.h.
1906
1907 6. libjpeg-turbo will now correctly decompress erroneous CMYK/YCCK JPEGs whose
1908 K component is assigned a component ID of 1 instead of 4.  Although these files
1909 are in violation of the spec, other JPEG implementations handle them
1910 correctly.
1911
1912 7. Added ARMv6 and ARMv7 architectures to libjpeg.a and libturbojpeg.a in
1913 the official libjpeg-turbo binary package for OS X, so that those libraries can
1914 be used to build both OS X and iOS applications.
1915
1916
1917 1.1.90 (1.2 beta1)
1918 ==================
1919
1920 ### Significant changes relative to 1.1.1:
1921
1922 1. Added a Java wrapper for the TurboJPEG API.  See [java/README](java/README)
1923 for more details.
1924
1925 2. The TurboJPEG API can now be used to scale down images during
1926 decompression.
1927
1928 3. Added SIMD routines for RGB-to-grayscale color conversion, which
1929 significantly improves the performance of grayscale JPEG compression from an
1930 RGB source image.
1931
1932 4. Improved the performance of the C color conversion routines, which are used
1933 on platforms for which SIMD acceleration is not available.
1934
1935 5. Added a function to the TurboJPEG API that performs lossless transforms.
1936 This function is implemented using the same back end as jpegtran, but it
1937 performs transcoding entirely in memory and allows multiple transforms and/or
1938 crop operations to be batched together, so the source coefficients only need to
1939 be read once.  This is useful when generating image tiles from a single source
1940 JPEG.
1941
1942 6. Added tests for the new TurboJPEG scaled decompression and lossless
1943 transform features to tjbench (the TurboJPEG benchmark, formerly called
1944 "jpgtest".)
1945
1946 7. Added support for 4:4:0 (transposed 4:2:2) subsampling in TurboJPEG, which
1947 was necessary in order for it to read 4:2:2 JPEG files that had been losslessly
1948 transposed or rotated 90 degrees.
1949
1950 8. All legacy VirtualGL code has been re-factored, and this has allowed
1951 libjpeg-turbo, in its entirety, to be re-licensed under a BSD-style license.
1952
1953 9. libjpeg-turbo can now be built with Yasm.
1954
1955 10. Added SIMD acceleration for ARM Linux and iOS platforms that support
1956 NEON instructions.
1957
1958 11. Refactored the TurboJPEG C API and documented it using Doxygen.  The
1959 TurboJPEG 1.2 API uses pixel formats to define the size and component order of
1960 the uncompressed source/destination images, and it includes a more efficient
1961 version of `TJBUFSIZE()` that computes a worst-case JPEG size based on the
1962 level of chrominance subsampling.  The refactored implementation of the
1963 TurboJPEG API now uses the libjpeg memory source and destination managers,
1964 which allows the TurboJPEG compressor to grow the JPEG buffer as necessary.
1965
1966 12. Eliminated errors in the output of jpegtran on Windows that occurred when
1967 the application was invoked using I/O redirection
1968 (`jpegtran <input.jpg >output.jpg`.)
1969
1970 13. The inclusion of libjpeg v7 and v8 emulation as well as arithmetic coding
1971 support in libjpeg-turbo v1.1.0 introduced several new error constants in
1972 jerror.h, and these were mistakenly enabled for all emulation modes, causing
1973 the error enum in libjpeg-turbo to sometimes have different values than the
1974 same enum in libjpeg.  This represents an ABI incompatibility, and it caused
1975 problems with rare applications that took specific action based on a particular
1976 error value.  The fix was to include the new error constants conditionally
1977 based on whether libjpeg v7 or v8 emulation was enabled.
1978
1979 14. Fixed an issue whereby Windows applications that used libjpeg-turbo would
1980 fail to compile if the Windows system headers were included before jpeglib.h.
1981 This issue was caused by a conflict in the definition of the INT32 type.
1982
1983 15. Fixed 32-bit supplementary package for amd64 Debian systems, which was
1984 broken by enhancements to the packaging system in 1.1.
1985
1986 16. When decompressing a JPEG image using an output colorspace of
1987 `JCS_EXT_RGBX`, `JCS_EXT_BGRX`, `JCS_EXT_XBGR`, or `JCS_EXT_XRGB`,
1988 libjpeg-turbo will now set the unused byte to 0xFF, which allows applications
1989 to interpret that byte as an alpha channel (0xFF = opaque).
1990
1991
1992 1.1.1
1993 =====
1994
1995 ### Significant changes relative to 1.1.0:
1996
1997 1. Fixed a 1-pixel error in row 0, column 21 of the luminance plane generated
1998 by `tjEncodeYUV()`.
1999
2000 2. libjpeg-turbo's accelerated Huffman decoder previously ignored unexpected
2001 markers found in the middle of the JPEG data stream during decompression.  It
2002 will now hand off decoding of a particular block to the unaccelerated Huffman
2003 decoder if an unexpected marker is found, so that the unaccelerated Huffman
2004 decoder can generate an appropriate warning.
2005
2006 3. Older versions of MinGW64 prefixed symbol names with underscores by
2007 default, which differed from the behavior of 64-bit Visual C++.  MinGW64 1.0
2008 has adopted the behavior of 64-bit Visual C++ as the default, so to accommodate
2009 this, the libjpeg-turbo SIMD function names are no longer prefixed with an
2010 underscore when building with MinGW64.  This means that, when building
2011 libjpeg-turbo with older versions of MinGW64, you will now have to add
2012 `-fno-leading-underscore` to the `CFLAGS`.
2013
2014 4. Fixed a regression bug in the NSIS script that caused the Windows installer
2015 build to fail when using the Visual Studio IDE.
2016
2017 5. Fixed a bug in `jpeg_read_coefficients()` whereby it would not initialize
2018 `cinfo->image_width` and `cinfo->image_height` if libjpeg v7 or v8 emulation
2019 was enabled.  This specifically caused the jpegoptim program to fail if it was
2020 linked against a version of libjpeg-turbo that was built with libjpeg v7 or v8
2021 emulation.
2022
2023 6. Eliminated excessive I/O overhead that occurred when reading BMP files in
2024 cjpeg.
2025
2026 7. Eliminated errors in the output of cjpeg on Windows that occurred when the
2027 application was invoked using I/O redirection (`cjpeg <inputfile >output.jpg`.)
2028
2029
2030 1.1.0
2031 =====
2032
2033 ### Significant changes relative to 1.1 beta1:
2034
2035 1. The algorithm used by the SIMD quantization function cannot produce correct
2036 results when the JPEG quality is >= 98 and the fast integer forward DCT is
2037 used.  Thus, the non-SIMD quantization function is now used for those cases,
2038 and libjpeg-turbo should now produce identical output to libjpeg v6b in all
2039 cases.
2040
2041 2. Despite the above, the fast integer forward DCT still degrades somewhat for
2042 JPEG qualities greater than 95, so the TurboJPEG wrapper will now automatically
2043 use the accurate integer forward DCT when generating JPEG images of quality 96
2044 or greater.  This reduces compression performance by as much as 15% for these
2045 high-quality images but is necessary to ensure that the images are perceptually
2046 lossless.  It also ensures that the library can avoid the performance pitfall
2047 created by [1].
2048
2049 3. Ported jpgtest.cxx to pure C to avoid the need for a C++ compiler.
2050
2051 4. Fixed visual artifacts in grayscale JPEG compression caused by a typo in
2052 the RGB-to-luminance lookup tables.
2053
2054 5. The Windows distribution packages now include the libjpeg run-time programs
2055 (cjpeg, etc.)
2056
2057 6. All packages now include jpgtest.
2058
2059 7. The TurboJPEG dynamic library now uses versioned symbols.
2060
2061 8. Added two new TurboJPEG API functions, `tjEncodeYUV()` and
2062 `tjDecompressToYUV()`, to replace the somewhat hackish `TJ_YUV` flag.
2063
2064
2065 1.0.90 (1.1 beta1)
2066 ==================
2067
2068 ### Significant changes relative to 1.0.1:
2069
2070 1. Added emulation of the libjpeg v7 and v8 APIs and ABIs.  See
2071 [README.md](README.md) for more details.  This feature was sponsored by
2072 CamTrace SAS.
2073
2074 2. Created a new CMake-based build system for the Visual C++ and MinGW builds.
2075
2076 3. The TurboJPEG API can now be used to compress from and decompress to
2077 grayscale bitmaps.
2078
2079 4. jpgtest can now be used to test decompression performance with existing
2080 JPEG images.
2081
2082 5. If the default install prefix (/opt/libjpeg-turbo) is used, then
2083 `make install` now creates /opt/libjpeg-turbo/lib32 and
2084 /opt/libjpeg-turbo/lib64 symlinks to duplicate the behavior of the binary
2085 packages.
2086
2087 6. All symbols in the libjpeg-turbo dynamic library are now versioned, even
2088 when the library is built with libjpeg v6b emulation.
2089
2090 7. Added arithmetic encoding and decoding support (can be disabled with
2091 configure or CMake options.)
2092
2093 8. Added a `TJ_YUV` flag to the TurboJPEG API, which causes both the compressor
2094 and decompressor to output planar YUV images.
2095
2096 9. Added an extended version of `tjDecompressHeader()` to the TurboJPEG API,
2097 which allows the caller to determine the type of subsampling used in a JPEG
2098 image.
2099
2100 10. Added further protections against invalid Huffman codes.
2101
2102
2103 1.0.1
2104 =====
2105
2106 ### Significant changes relative to 1.0.0:
2107
2108 1. The Huffman decoder will now handle erroneous Huffman codes (for instance,
2109 from a corrupt JPEG image.)  Previously, these would cause libjpeg-turbo to
2110 crash under certain circumstances.
2111
2112 2. Fixed typo in SIMD dispatch routines that was causing 4:2:2 upsampling to
2113 be used instead of 4:2:0 when decompressing JPEG images using SSE2 code.
2114
2115 3. The configure script will now automatically determine whether the
2116 `INCOMPLETE_TYPES_BROKEN` macro should be defined.
2117
2118
2119 1.0.0
2120 =====
2121
2122 ### Significant changes relative to 0.0.93:
2123
2124 1. 2983700: Further FreeBSD build tweaks (no longer necessary to specify
2125 `--host` when configuring on a 64-bit system)
2126
2127 2. Created symlinks in the Unix/Linux packages so that the TurboJPEG
2128 include file can always be found in /opt/libjpeg-turbo/include, the 32-bit
2129 static libraries can always be found in /opt/libjpeg-turbo/lib32, and the
2130 64-bit static libraries can always be found in /opt/libjpeg-turbo/lib64.
2131
2132 3. The Unix/Linux distribution packages now include the libjpeg run-time
2133 programs (cjpeg, etc.) and man pages.
2134
2135 4. Created a 32-bit supplementary package for amd64 Debian systems, which
2136 contains just the 32-bit libjpeg-turbo libraries.
2137
2138 5. Moved the libraries from */lib32 to */lib in the i386 Debian package.
2139
2140 6. Include distribution package for Cygwin
2141
2142 7. No longer necessary to specify `--without-simd` on non-x86 architectures,
2143 and unit tests now work on those architectures.
2144
2145
2146 0.0.93
2147 ======
2148
2149 ### Significant changes since 0.0.91:
2150
2151 1. 2982659: Fixed x86-64 build on FreeBSD systems
2152
2153 2. 2988188: Added support for Windows 64-bit systems
2154
2155
2156 0.0.91
2157 ======
2158
2159 ### Significant changes relative to 0.0.90:
2160
2161 1. Added documentation to .deb packages
2162
2163 2. 2968313: Fixed data corruption issues when decompressing large JPEG images
2164 and/or using buffered I/O with the libjpeg-turbo decompressor
2165
2166
2167 0.0.90
2168 ======
2169
2170 Initial release