1 ------------------------------------------------------------------------
2 The list of most significant changes made over time in
3 Intel(R) Threading Building Blocks (Intel(R) TBB).
5 Intel TBB 2019 Update 9
6 TBB_INTERFACE_VERSION == 11009
8 Changes (w.r.t. Intel TBB 2019 Update 8):
10 - Multiple APIs are deprecated. For details, see the
11 Deprecated Features appendix in the TBB reference manual.
12 - Added C++17 deduction guides for flow graph nodes.
16 - Added isolated_task_group class that allows multiple threads to add
17 and execute tasks sharing the same isolation.
18 - Extended the flow graph API to simplify connecting nodes.
19 - Added erase() by heterogeneous keys for concurrent ordered containers.
20 - Added a possibility to suspend task execution at a specific point
25 - Fixed the emplace() method of concurrent unordered containers to
26 destroy a temporary element that was not inserted.
27 - Fixed a bug in the merge() method of concurrent unordered
29 - Fixed behavior of a continue_node that follows buffering nodes.
31 Open-source contributions integrated:
33 - Added support for move-only types to tbb::parallel_pipeline
34 (https://github.com/intel/tbb/pull/159) by Raf Schietekat.
35 - Fixed detection of clang version when CUDA toolkit is installed
36 (https://github.com/intel/tbb/pull/150) by Guilherme Amadio.
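Each release in this file is identified by a TBB_INTERFACE_VERSION value, and
code that depends on an entry listed here can compare against that value. The
sketch below is only an illustration: the helper names interface_major and
interface_at_least are hypothetical and not part of TBB, and the division by
1000 simply reflects the leading digits seen in this file (11 for 2019,
10 for 2018, 9 for 2017 and 4.4).

```cpp
#include <cassert>

// Leading part of an interface version, e.g. 11009 -> 11, 9108 -> 9.
// In real code the argument would typically be TBB_INTERFACE_VERSION.
inline int interface_major(int interface_version) {
    assert(interface_version > 0);
    return interface_version / 1000;
}

// True if `interface_version` is at least `required`,
// e.g. 11007 for a feature first listed under 2019 Update 7.
inline bool interface_at_least(int interface_version, int required) {
    return interface_version >= required;
}
```

With the TBB headers included, the same check can be made at compile time
with a preprocessor test such as "#if TBB_INTERFACE_VERSION >= 11007".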
38 ------------------------------------------------------------------------
39 Intel TBB 2019 Update 8
40 TBB_INTERFACE_VERSION == 11008
42 Changes (w.r.t. Intel TBB 2019 Update 7):
46 - Fixed a bug in TBB 2019 Update 7 that could lead to incorrect memory
47 reallocation on Linux (https://github.com/intel/tbb/issues/148).
48 - Fixed enqueuing tbb::task into tbb::task_arena not to fail on threads
49 with no task scheduler initialized
50 (https://github.com/intel/tbb/issues/116).
52 ------------------------------------------------------------------------
53 Intel TBB 2019 Update 7
54 TBB_INTERFACE_VERSION == 11007
56 Changes (w.r.t. Intel TBB 2019 Update 6):
58 - Added TBBMALLOC_SET_HUGE_SIZE_THRESHOLD parameter to set the lower
59 bound for allocations that are not released back to OS unless
60 a cleanup is explicitly requested.
61 - Added zip_iterator::base() method to get the tuple of underlying
63 - Improved async_node to never block a thread that sends a message
65 - Extended decrement port of the tbb::flow::limiter_node to accept
66 messages of integral types.
67 - Added support for Windows* to the CMake module TBBInstallConfig.
68 - Added packaging of CMake configuration files to TBB packages built
69 using build/build.py script
70 (https://github.com/intel/tbb/issues/141).
72 Changes affecting backward compatibility:
74 - Removed the number_of_decrement_predecessors parameter from the
75 constructor of flow::limiter_node. To allow its usage, set
76 TBB_DEPRECATED_LIMITER_NODE_CONSTRUCTOR macro to 1.
80 - Added ordered associative containers:
81 concurrent_{map,multimap,set,multiset} (requires C++11).
83 Open-source contributions integrated:
85 - Fixed makefiles to properly obtain the GCC version for GCC 7
86 and later (https://github.com/intel/tbb/pull/147) by Timmmm.
88 ------------------------------------------------------------------------
89 Intel TBB 2019 Update 6
90 TBB_INTERFACE_VERSION == 11006
92 Changes (w.r.t. Intel TBB 2019 Update 5):
94 - Added support for Microsoft* Visual Studio* 2019.
95 - Added support for enqueuing tbb::task into tbb::task_arena
96 (https://github.com/01org/tbb/issues/116).
97 - Improved support for allocator propagation on concurrent_hash_map
98 assigning and swapping.
99 - Improved scalable_allocation_command cleanup operations to release
100 more memory buffered by the calling thread.
101 - Separated allocation of small and large objects into distinct memory
102 regions, which helps to reduce excessive memory caching inside the
107 - Removed template class gfx_factory from the flow graph API.
109 ------------------------------------------------------------------------
110 Intel TBB 2019 Update 5
111 TBB_INTERFACE_VERSION == 11005
113 Changes (w.r.t. Intel TBB 2019 Update 4):
115 - Associating a task_scheduler_observer with an implicit or explicit
116 task arena is now a fully supported feature.
117 - Added a CMake module TBBInstallConfig that allows generating and
118 installing CMake configuration files for TBB packages.
119 Inspired by Hans Johnson (https://github.com/01org/tbb/pull/119).
120 - Added node handles, methods merge() and unsafe_extract() to concurrent
121 unordered containers.
122 - Added constructors with Compare argument to concurrent_priority_queue
123 (https://github.com/01org/tbb/issues/109).
124 - Controlling the stack size of worker threads is now supported for
125 Universal Windows Platform.
126 - Improved tbb::zip_iterator to work with algorithms that swap values
128 - Improved support for user-specified allocators in concurrent_hash_map,
129 including construction of allocator-aware data types.
130 - For ReaderWriterMutex types, upgrades and downgrades now succeed if
131 the mutex is already in the requested state.
132 Inspired by Niadb (https://github.com/01org/tbb/pull/122).
136 - The task_scheduler_observer::may_sleep() method has been removed.
140 - Fixed the issue with a pipeline parallel filter executing serially if
141 it follows a thread-bound filter.
142 - Fixed a performance regression observed when multiple parallel
143 algorithms start simultaneously.
145 ------------------------------------------------------------------------
146 Intel TBB 2019 Update 4
147 TBB_INTERFACE_VERSION == 11004
149 Changes (w.r.t. Intel TBB 2019 Update 3):
151 - global_control class is now a fully supported feature.
152 - Added deduction guides for tbb containers: concurrent_hash_map,
153 concurrent_unordered_map, concurrent_unordered_set.
154 - Added tbb::scalable_memory_resource function returning
155 std::pmr::memory_resource interface to the TBB memory allocator.
156 - Added tbb::cache_aligned_resource class that implements
157 std::pmr::memory_resource with cache alignment and no false sharing.
158 - Added rml::pool_msize function returning the usable size of a memory
159 block allocated from a given memory pool.
160 - Added default and copy constructors for tbb::counting_iterator
161 and tbb::zip_iterator.
162 - Added TBB_malloc_replacement_log function to obtain the status of
163 dynamic memory allocation replacement (Windows* only).
164 - CMake configuration file now supports release-only and debug-only
165 configurations (https://github.com/01org/tbb/issues/113).
166 - TBBBuild CMake module takes the C++ version from CMAKE_CXX_STANDARD.
170 - Fixed compilation for tbb::concurrent_vector when used with
171 std::pmr::polymorphic_allocator.
173 Open-source contributions integrated:
175 - TBB_INTERFACE_VERSION is now included in the TBB version in CMake
176 configuration (https://github.com/01org/tbb/pull/100)
178 - Fixed detection of C++17 deduction guides for Visual C++*
179 (https://github.com/01org/tbb/pull/112) by Marian Klymov.
181 ------------------------------------------------------------------------
182 Intel TBB 2019 Update 3
183 TBB_INTERFACE_VERSION == 11003
185 Changes (w.r.t. Intel TBB 2019 Update 2):
187 - Added tbb::transform_iterator.
188 - Added new Makefile target 'profile' to flow graph examples enabling
189 additional support for Intel(R) Parallel Studio XE tools.
190 - Added TBB_MALLOC_DISABLE_REPLACEMENT environment variable to switch off
191 dynamic memory allocation replacement on Windows*. Inspired by
192 a contribution from Edward Lam.
196 - Extended flow graph API to support relative priorities for functional
197 nodes, specified as an optional parameter to the node constructors.
199 Open-source contributions integrated:
201 - Enabled using process-local futex operations
202 (https://github.com/01org/tbb/pull/58) by Andrey Semashev.
204 ------------------------------------------------------------------------
205 Intel TBB 2019 Update 2
206 TBB_INTERFACE_VERSION == 11002
208 Changes (w.r.t. Intel TBB 2019 Update 1):
210 - Added overloads for parallel_reduce with default partitioner and
211 user-supplied context.
212 - Added deduction guides for tbb containers: concurrent_vector,
213 concurrent_queue, concurrent_bounded_queue,
214 concurrent_priority_queue.
215 - Reallocation of memory objects >1MB now copies and frees memory if
216 the size is decreased twice or more, trading performance off for
217 reduced memory usage.
218 - After a period of sleep, TBB worker threads now prefer returning to
219 their last used task arena.
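The reallocation note above (objects >1MB, size "decreased twice or more") can
be read as a simple rule. The sketch below states that reading as a predicate.
It is an interpretation, not the allocator's code: the function name, the
constant, and the assumption that the 1MB threshold applies to the old size
are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Threshold named in the note; applying it to the old size is an assumption.
const std::size_t kLargeObject = 1024 * 1024;

// Hypothetical predicate: does a shrinking reallocation copy the data into a
// smaller block and free the old one, instead of reusing the old block?
inline bool shrink_copies_and_frees(std::size_t old_size, std::size_t new_size) {
    assert(new_size <= old_size);  // only shrinking requests are modelled
    return old_size > kLargeObject && new_size <= old_size / 2;
}
```

Under this reading, shrinking a 4MB block to 2MB would copy and free, while
shrinking it to 3MB would keep the original block.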
223 - Fixed compilation of task_group.h when targeting macOS* 10.11 or
224 earlier (https://github.com/conda-forge/tbb-feedstock/issues/42).
226 Open-source contributions integrated:
228 - Added constructors with HashCompare argument to concurrent_hash_map
229 (https://github.com/01org/tbb/pull/63) by arewedancer.
231 ------------------------------------------------------------------------
232 Intel TBB 2019 Update 1
233 TBB_INTERFACE_VERSION == 11001
235 Changes (w.r.t. Intel TBB 2019):
237 - Doxygen documentation can now be built with the 'make doxygen' command.
239 Changes affecting backward compatibility:
241 - Enforced 8 byte alignment for tbb::atomic<long long> and
242 tbb::atomic<double>. On IA-32 architecture it may cause layout
243 changes in structures that use these types.
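To see why enforced 8-byte alignment can change structure layout, consider the
sketch below. It uses a hypothetical stand-in type (aligned_counter) rather
than tbb::atomic<long long> itself, and the record structure is likewise
invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 64-bit wrapper with enforced 8-byte alignment.
struct aligned_counter {
    alignas(8) long long value;
};

// A user structure that places the wrapper after a smaller field.
struct record {
    int tag;
    aligned_counter counter;
};

// Offset of the counter inside `record`; the alignas(8) above forces it to be
// a multiple of 8, so padding is inserted after `tag` where needed.
inline std::size_t counter_offset() {
    assert(offsetof(record, counter) % alignof(aligned_counter) == 0);
    return offsetof(record, counter);
}
```

On IA-32 with GCC, a plain long long member of a structure is typically only
4-byte aligned, so without the enforced alignment the counter could sit
directly after the 4-byte tag. Binaries built before and after such a change
may therefore disagree on the offsets and size of the structure.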
247 - Fixed an issue with dynamic memory allocation replacement on Windows*
248 that occurred with some versions of ucrtbase.dll.
249 - Fixed possible deadlock in tbbmalloc cleanup procedure during process
250 shutdown. Inspired by a contribution from Edward Lam.
251 - Fixed usage of std::uncaught_exception() deprecated in C++17
252 (https://github.com/01org/tbb/issues/67).
253 - Fixed a crash when a local observer is activated after an arena
255 - Fixed compilation of task_group.h by Visual C++* 15.7 with
256 /permissive- option (https://github.com/01org/tbb/issues/53).
257 - Fixed tbb4py to avoid dependency on Intel(R) C++ Compiler shared
259 - Fixed compilation for Anaconda environment with GCC 7.3 and higher.
261 Open-source contributions integrated:
263 - Fixed various warnings when building with Visual C++
264 (https://github.com/01org/tbb/pull/70) by Edward Lam.
266 ------------------------------------------------------------------------
267 Intel TBB 2019
268 TBB_INTERFACE_VERSION == 11000
270 Changes (w.r.t. Intel TBB 2018 Update 5):
272 - Lightweight policy for functional nodes in the flow graph is now
273 a fully supported feature.
274 - Reservation support in flow::write_once_node and flow::overwrite_node
275 is now a fully supported feature.
276 - Support for Flow Graph Analyzer and improvements for
277 Intel(R) VTune(TM) Amplifier are now a regular feature enabled by
278 TBB_USE_THREADING_TOOLS macro.
279 - Added support for std::new_handler in the replacement functions for
281 - Added C++14 constructors to concurrent unordered containers.
282 - Added tbb::counting_iterator and tbb::zip_iterator.
283 - Fixed multiple -Wextra warnings in TBB source files.
287 - Extracting nodes from a flow graph is deprecated and disabled by
288 default. To enable, use TBB_DEPRECATED_FLOW_NODE_EXTRACTION macro.
290 Changes affecting backward compatibility:
292 - Due to internal changes in the flow graph classes, recompilation is
293 recommended for all binaries that use the flow graph.
295 Open-source contributions integrated:
297 - Added support for OpenBSD by Anthony J. Bentley.
299 ------------------------------------------------------------------------
300 Intel TBB 2018 Update 6
301 TBB_INTERFACE_VERSION == 10006
303 Changes (w.r.t. Intel TBB 2018 Update 5):
307 - Fixed an issue with dynamic memory allocation replacement on Windows*
308 that occurred with some versions of ucrtbase.dll.
310 ------------------------------------------------------------------------
311 Intel TBB 2018 Update 5
312 TBB_INTERFACE_VERSION == 10005
314 Changes (w.r.t. Intel TBB 2018 Update 4):
318 - Added user event tracing API for Intel(R) VTune(TM) Amplifier and
323 - Fixed the memory allocator to properly support transparent huge pages.
324 - Removed dynamic exception specifications in tbbmalloc_proxy for C++11
325 and later (https://github.com/01org/tbb/issues/41).
326 - Added -flifetime-dse=1 option when building with GCC on macOS*
327 (https://github.com/01org/tbb/issues/60).
329 Open-source contributions integrated:
331 - Added ARMv8 support by Siddhesh Poyarekar.
332 - Avoid GCC warnings for clearing an object of non-trivial type
333 (https://github.com/01org/tbb/issues/54) by Daniel Arndt.
335 ------------------------------------------------------------------------
336 Intel TBB 2018 Update 4
337 TBB_INTERFACE_VERSION == 10004
339 Changes (w.r.t. Intel TBB 2018 Update 3):
343 - Improved support for Flow Graph Analyzer and Intel(R) VTune(TM)
344 Amplifier in the task scheduler and generic parallel algorithms.
345 - Default device set for opencl_node now includes all the devices from
346 the first available OpenCL* platform.
347 - Added lightweight policy for functional nodes in the flow graph. It
348 indicates that the node body has little work and should, if possible,
349 be executed immediately upon receiving a message, avoiding task
352 ------------------------------------------------------------------------
353 Intel TBB 2018 Update 3
354 TBB_INTERFACE_VERSION == 10003
356 Changes (w.r.t. Intel TBB 2018 Update 2):
360 - Added template class blocked_rangeNd for a generic multi-dimensional
361 range (requires C++11). Inspired by a contribution from Jeff Hammond.
365 - Fixed a crash with dynamic memory allocation replacement on
366 Windows* for applications using system() function.
367 - Fixed parallel_deterministic_reduce to split range correctly when used
368 with static_partitioner.
369 - Fixed a synchronization issue in task_group::run_and_wait() which
370 caused a simultaneous call to task_group::wait() to return
373 ------------------------------------------------------------------------
374 Intel TBB 2018 Update 2
375 TBB_INTERFACE_VERSION == 10002
377 Changes (w.r.t. Intel TBB 2018 Update 1):
379 - Added support for Android* NDK r16, macOS* 10.13, Fedora* 26.
380 - Binaries for Universal Windows Driver (vc14_uwd) now link with static
381 Microsoft* runtime libraries, and are only available in commercial
383 - Extended flow graph documentation with more code samples.
387 - Added a Python* module for multi-processing computations in numeric
392 - Fixed constructors of concurrent_hash_map to be exception-safe.
393 - Fixed auto-initialization in the main thread to be cleaned up at
395 - Fixed a crash when tbbmalloc_proxy is used together with dbghelp.
396 - Fixed static_partitioner to assign tasks properly in case of nested
399 ------------------------------------------------------------------------
400 Intel TBB 2018 Update 1
401 TBB_INTERFACE_VERSION == 10001
403 Changes (w.r.t. Intel TBB 2018):
405 - Added lambda-friendly overloads for parallel_scan.
406 - Added support for static and simple partitioners in
407 parallel_deterministic_reduce.
411 - Added initial support for Flow Graph Analyzer to parallel_for.
412 - Added reservation support in overwrite_node and write_once_node.
416 - Fixed a potential deadlock scenario in the flow graph that affected
419 ------------------------------------------------------------------------
420 Intel TBB 2018
421 TBB_INTERFACE_VERSION == 10000
423 Changes (w.r.t. Intel TBB 2017 Update 7):
425 - Introduced Parallel STL, an implementation of the C++ standard
426 library algorithms with support for execution policies. For more
427 information, see Getting Started with Parallel STL
428 (https://software.intel.com/en-us/get-started-with-pstl).
429 - this_task_arena::isolate() function is now a fully supported feature.
430 - this_task_arena::isolate() function and task_arena::execute() method
431 were extended to pass on the value returned by the executed functor
433 - task_arena::enqueue() and task_group::run() methods extended to accept
435 - A flow graph now spawns all tasks into the same task arena,
436 and waiting for graph completion also happens in that arena.
437 - Improved support for Flow Graph Analyzer in async_node, opencl_node,
439 - Added support for Android* NDK r15, r15b.
440 - Added support for Universal Windows Platform.
441 - Increased the minimum supported version of macOS*
442 (MACOSX_DEPLOYMENT_TARGET) to 10.11.
444 Changes affecting backward compatibility:
446 - Internal layout changes in some flow graph classes;
447 - Several undocumented methods are removed from class graph,
448 including set_active() and is_active().
449 - Due to incompatible changes, the namespace version is updated
450 for the flow graph; recompilation is recommended for all
451 binaries that use the flow graph classes.
455 - opencl_node can be used with any graph object; class opencl_graph
457 - graph::wait_for_all() now automatically waits for all not yet consumed
459 - Improved concurrent_lru_cache::handle_object to support C++11 move
460 semantics, default construction, and conversion to bool.
464 - Fixed a bug preventing use of streaming_node and opencl_node with
465 Clang; inspired by a contribution from Francisco Facioni.
466 - Fixed this_task_arena::isolate() function to work correctly with
467 parallel_invoke and parallel_do algorithms.
468 - Fixed a memory leak in composite_node.
469 - Fixed an assertion failure in debug tbbmalloc binaries when
470 TBBMALLOC_CLEAN_ALL_BUFFERS is used.
472 ------------------------------------------------------------------------
473 Intel TBB 2017 Update 8
474 TBB_INTERFACE_VERSION == 9108
476 Changes (w.r.t. Intel TBB 2017 Update 7):
480 - Fixed an assertion failure in debug tbbmalloc binaries when
481 TBBMALLOC_CLEAN_ALL_BUFFERS is used.
483 ------------------------------------------------------------------------
484 Intel TBB 2017 Update 7
485 TBB_INTERFACE_VERSION == 9107
487 Changes (w.r.t. Intel TBB 2017 Update 6):
489 - In the huge pages mode, the memory allocator now is also able to use
490 transparent huge pages.
494 - Added support for Intel TBB integration into CMake-aware
495 projects, with valuable guidance and feedback provided by Brad King
500 - Fixed scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS, 0)
501 to process memory left after exited threads.
503 ------------------------------------------------------------------------
504 Intel TBB 2017 Update 6
505 TBB_INTERFACE_VERSION == 9106
507 Changes (w.r.t. Intel TBB 2017 Update 5):
509 - Added support for Android* NDK r14.
513 - Added a blocking terminate extension to the task_scheduler_init class
514 that allows an object to wait for termination of worker threads.
518 - Fixed compilation and testing issues with MinGW (GCC 6).
519 - Fixed compilation with /std:c++latest option of VS 2017
520 (https://github.com/01org/tbb/issues/13).
522 ------------------------------------------------------------------------
523 Intel TBB 2017 Update 5
524 TBB_INTERFACE_VERSION == 9105
526 Changes (w.r.t. Intel TBB 2017 Update 4):
528 - Added support for Microsoft* Visual Studio* 2017.
529 - Added graph/matmult example to demonstrate support for compute offload
530 to Intel(R) Graphics Technology in the flow graph API.
531 - The "compiler" build option now allows specifying a full path to the
534 Changes affecting backward compatibility:
536 - Constructors for many classes, including graph nodes, concurrent
537 containers, thread-local containers, etc., are declared explicit and
538 cannot be used for implicit conversions anymore.
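The effect of declaring a constructor explicit can be shown with a small
class. The class bounded_buffer and the function capacity_of below are
hypothetical and are not TBB types. They only illustrate the kind of call
site that stops compiling after such a change.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical container-like class with an explicit single-argument constructor.
class bounded_buffer {
public:
    explicit bounded_buffer(int capacity) : capacity_(capacity) {
        assert(capacity >= 0);
    }
    int capacity() const { return capacity_; }
private:
    int capacity_;
};

// Takes a buffer by value. A call such as capacity_of(8) no longer compiles,
// because an int is not implicitly converted to bounded_buffer.
inline int capacity_of(bounded_buffer b) { return b.capacity(); }
```

Code that relied on the implicit conversion needs to name the type at the
call site, for example capacity_of(bounded_buffer(8)).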
542 - Added a workaround for bug 16657 in the GNU C Library (glibc)
543 affecting the debug version of tbb::mutex.
544 - Fixed a crash in pool_identify() called for an object allocated in
547 ------------------------------------------------------------------------
548 Intel TBB 2017 Update 4
549 TBB_INTERFACE_VERSION == 9104
551 Changes (w.r.t. Intel TBB 2017 Update 3):
553 - Added support for C++11 move semantics in parallel_do.
554 - Added support for FreeBSD* 11.
556 Changes affecting backward compatibility:
558 - Minimal compiler versions required for support of C++11 move semantics
559 raised to GCC 4.5, VS 2012, and Intel(R) C++ Compiler 14.0.
563 - The workaround for crashes in the library compiled with GCC 6
564 (-flifetime-dse=1) was extended to Windows*.
566 ------------------------------------------------------------------------
567 Intel TBB 2017 Update 3
568 TBB_INTERFACE_VERSION == 9103
570 Changes (w.r.t. Intel TBB 2017 Update 2):
572 - Added support for Android* 7.0 and Android* NDK r13, r13b.
576 - Added template class gfx_factory to the flow graph API. It implements
577 the Factory concept for streaming_node to offload computations to
578 Intel(R) processor graphics.
582 - Fixed a possible deadlock caused by missed wakeup signals in
583 task_arena::execute().
585 Open-source contributions integrated:
587 - A build fix for Linux* s390x platform by Jerry J.
589 ------------------------------------------------------------------------
590 Intel TBB 2017 Update 2
591 TBB_INTERFACE_VERSION == 9102
593 Changes (w.r.t. Intel TBB 2017 Update 1):
595 - Removed the long-outdated support for Xbox* consoles.
599 - Fixed the issue with task_arena::execute() not being processed when
600 the calling thread cannot join the arena.
601 - Fixed dynamic memory allocation replacement failure on macOS* 10.12.
603 ------------------------------------------------------------------------
604 Intel TBB 2017 Update 1
605 TBB_INTERFACE_VERSION == 9101
607 Changes (w.r.t. Intel TBB 2017):
611 - Fixed dynamic memory allocation replacement failures on Windows* 10
613 - Fixed emplace() method of concurrent unordered containers to not
614 require a copy constructor.
616 ------------------------------------------------------------------------
617 Intel TBB 2017
618 TBB_INTERFACE_VERSION == 9100
620 Changes (w.r.t. Intel TBB 4.4 Update 5):
622 - static_partitioner class is now a fully supported feature.
623 - async_node class is now a fully supported feature.
624 - Improved dynamic memory allocation replacement on Windows* OS to skip
625 DLLs for which replacement cannot be done, instead of aborting.
626 - Intel TBB no longer performs dynamic memory allocation replacement
627 for Microsoft* Visual Studio* 2008.
628 - For 64-bit platforms, quadrupled the worst-case limit on the amount
629 of memory the Intel TBB allocator can handle.
630 - Added TBB_USE_GLIBCXX_VERSION macro to specify the version of GNU
631 libstdc++ when it cannot be properly recognized, e.g. when used
632 with Clang on Linux* OS. Inspired by a contribution from David A.
633 - Added graph/stereo example to demonstrate tbb::flow::async_msg.
634 - Removed a few cases of excessive user data copying in the flow graph.
635 - Reworked split_node to eliminate unnecessary overheads.
636 - Added support for C++11 move semantics to the argument of
637 tbb::parallel_do_feeder::add() method.
638 - Added C++11 move constructor and assignment operator to
639 tbb::combinable template class.
640 - Added tbb::this_task_arena::max_concurrency() function and
641 max_concurrency() method of class task_arena returning the maximal
642 number of threads that can work inside an arena.
643 - Deprecated tbb::task_arena::current_thread_index() static method;
644 use tbb::this_task_arena::current_thread_index() function instead.
645 - All examples for the commercial version of the library have moved online:
646 https://software.intel.com/en-us/product-code-samples. Examples are
647 available as a standalone package or as a part of Intel(R) Parallel
648 Studio XE or Intel(R) System Studio Online Samples packages.
650 Changes affecting backward compatibility:
652 - Renamed the following methods and types in async_node class:
654 async_gateway_type => gateway_type
655 async_gateway() => gateway()
656 async_try_put() => try_put()
657 async_reserve() => reserve_wait()
658 async_commit() => release_wait()
659 - Internal layout of some flow graph nodes has changed; recompilation
660 is recommended for all binaries that use the flow graph.
664 - Added template class streaming_node to the flow graph API. It allows
665 a flow graph to offload computations to other devices through
666 streaming or offloading APIs.
667 - Template class opencl_node reimplemented as a specialization of
668 streaming_node that works with OpenCL*.
669 - Added tbb::this_task_arena::isolate() function to isolate execution
670 of a group of tasks or an algorithm from other tasks submitted
675 - Added a workaround for GCC bug #62258 in std::rethrow_exception()
676 to prevent possible problems in case of exception propagation.
677 - Fixed parallel_scan to provide correct result if the initial value
678 of an accumulator is not the operation identity value.
679 - Fixed a memory corruption in the memory allocator when it meets
681 - Fixed the memory allocator on 64-bit platforms to align memory
682 to 16 bytes by default for all allocations bigger than 8 bytes.
683 - As a workaround for crashes in the Intel TBB library compiled with
684 GCC 6, added -flifetime-dse=1 to compilation options on Linux* OS.
685 - Fixed a race in the flow graph implementation.
687 Open-source contributions integrated:
689 - Enabling use of C++11 'override' keyword by Raf Schietekat.
691 ------------------------------------------------------------------------
692 Intel TBB 4.4 Update 6
693 TBB_INTERFACE_VERSION == 9006
695 Changes (w.r.t. Intel TBB 4.4 Update 5):
697 - For 64-bit platforms, quadrupled the worst-case limit on the amount
698 of memory the Intel TBB allocator can handle.
702 - Fixed a memory corruption in the memory allocator when it meets
704 - Fixed the memory allocator on 64-bit platforms to align memory
705 to 16 bytes by default for all allocations bigger than 8 bytes.
706 - Fixed parallel_scan to provide correct result if the initial value
707 of an accumulator is not the operation identity value.
708 - As a workaround for crashes in the Intel TBB library compiled with
709 GCC 6, added -flifetime-dse=1 to compilation options on Linux* OS.
711 ------------------------------------------------------------------------
712 Intel TBB 4.4 Update 5
713 TBB_INTERFACE_VERSION == 9005
715 Changes (w.r.t. Intel TBB 4.4 Update 4):
717 - Modified graph/fgbzip2 example to remove unnecessary data queuing.
721 - Added a Python* module which is able to replace Python's thread pool
722 class with the implementation based on Intel TBB task scheduler.
726 - Fixed the implementation of 64-bit tbb::atomic for IA-32 architecture
727 to work correctly with GCC 5.2 in C++11/14 mode.
728 - Fixed a possible crash when tasks with affinity (e.g. specified via
729 affinity_partitioner) are used simultaneously with task priority
732 ------------------------------------------------------------------------
733 Intel TBB 4.4 Update 4
734 TBB_INTERFACE_VERSION == 9004
736 Changes (w.r.t. Intel TBB 4.4 Update 3):
738 - Removed a few cases of excessive user data copying in the flow graph.
739 - Improved robustness of concurrent_bounded_queue::abort() in case of
740 simultaneous push and pop operations.
744 - Added tbb::flow::async_msg, a special message type to support
745 communications between the flow graph and external asynchronous
747 - async_node modified to support use with C++03 compilers.
751 - Fixed a bug in dynamic memory allocation replacement for Windows* OS.
752 - Fixed excessive memory consumption on Linux* OS caused by enabling
754 - Fixed performance regression on Intel(R) Xeon Phi(tm) coprocessor with
757 ------------------------------------------------------------------------
758 Intel TBB 4.4 Update 3
759 TBB_INTERFACE_VERSION == 9003
761 Changes (w.r.t. Intel TBB 4.4 Update 2):
763 - Modified parallel_sort to not require a default constructor for values
764 and to use iter_swap() for value swapping.
765 - Added support for creating or initializing a task_arena instance that
766 is connected to the arena currently used by the thread.
767 - graph/binpack example modified to use multifunction_node.
768 - For performance analysis, use Intel(R) VTune(TM) Amplifier XE 2015
769 and higher; older versions are no longer supported.
770 - Improved support for compilation with disabled RTTI, by omitting its use
771 in auxiliary code, such as assertions. However some functionality,
772 particularly the flow graph, does not work if RTTI is disabled.
773 - The tachyon example for Android* can be built using Android Studio 1.5
774 and higher with experimental Gradle plugin 0.4.0.
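The parallel_sort change listed above relaxes what a value type must provide:
no default constructor, and swapping through iter_swap(). The serial sketch
below shows a sort written under the same two constraints. It is an
illustration only and not TBB's algorithm; the ticket type and both function
names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Value type with no default constructor.
struct ticket {
    ticket() = delete;
    explicit ticket(int n) : number(n) {}
    int number;
};

inline bool operator<(const ticket& a, const ticket& b) {
    return a.number < b.number;
}

// Serial selection sort that never default-constructs a value and exchanges
// elements only through std::iter_swap.
template <typename RandomIt>
void sort_with_iter_swap(RandomIt first, RandomIt last) {
    for (RandomIt i = first; i != last; ++i) {
        RandomIt smallest = std::min_element(i, last);
        if (smallest != i) std::iter_swap(i, smallest);
    }
}

// Builds tickets from `numbers`, sorts them, and returns the sorted numbers.
inline std::vector<int> sorted_numbers(const std::vector<int>& numbers) {
    std::vector<ticket> tickets;
    for (int n : numbers) tickets.push_back(ticket(n));
    sort_with_iter_swap(tickets.begin(), tickets.end());
    assert(std::is_sorted(tickets.begin(), tickets.end()));
    std::vector<int> out;
    for (const ticket& t : tickets) out.push_back(t.number);
    return out;
}
```

A type like ticket is the kind of value that the relaxed requirements are
meant to admit.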
778 - Added class opencl_subbuffer that allows using OpenCL* sub-buffer
779 objects with opencl_node.
780 - Class global_control supports the value of 1 for
781 max_allowed_parallelism.
785 - Fixed a race causing "TBB Warning: setaffinity syscall failed" message.
786 - Fixed a compilation issue on OS X* with Intel(R) C++ Compiler 15.0.
787 - Fixed a bug in queuing_rw_mutex::downgrade() that could temporarily
789 - Fixed speculative_spin_rw_mutex to stop using the lazy subscription
790 technique due to its known flaws.
791 - Fixed memory leaks in the tool support code.
793 ------------------------------------------------------------------------
794 Intel TBB 4.4 Update 2
795 TBB_INTERFACE_VERSION == 9002
797 Changes (w.r.t. Intel TBB 4.4 Update 1):
799 - Improved interoperability with Intel(R) OpenMP RTL (libiomp) on Linux:
800 OpenMP affinity settings do not affect the default number of threads
801 used in the task scheduler. Intel(R) C++ Compiler 16.0 Update 1
802 or later is required.
803 - Added a new flow graph example with different implementations of the
804 Cholesky Factorization algorithm.
808 - Added template class opencl_node to the flow graph API. It allows a
809 flow graph to offload computations to OpenCL* devices.
810 - Extended join_node to use type-specified message keys. It simplifies
811 the API of the node by obtaining message keys via functions
812 associated with the message type (instead of node ports).
813 - Added static_partitioner that minimizes overhead of parallel_for and
814 parallel_reduce for well-balanced workloads.
815 - Improved template class async_node in the flow graph API to support
816 user-settable concurrency limits.
820 - Fixed a possible crash in the GUI layer for library examples on Linux.
822 ------------------------------------------------------------------------
823 Intel TBB 4.4 Update 1
824 TBB_INTERFACE_VERSION == 9001
826 Changes (w.r.t. Intel TBB 4.4):
828 - Added support for Microsoft* Visual Studio* 2015.
829 - Intel TBB no longer performs dynamic replacement of memory allocation
830 functions for Microsoft Visual Studio 2005 and earlier versions.
831 - For GCC 4.7 and higher, the intrinsics-based platform isolation layer
832 uses __atomic_* built-ins instead of the legacy __sync_* ones.
833 This change is inspired by a contribution from Mathieu Malaterre.
834 - Improvements in task_arena:
835 Several application threads may join a task_arena and execute tasks
836 simultaneously. The amount of concurrency reserved for application
837 threads at task_arena construction can be set to any value between
838 0 and the arena concurrency limit.
839 - The fractal example was modified to demonstrate class task_arena
840 and moved to examples/task_arena/fractal.
844 - Fixed a deadlock during destruction of task_scheduler_init objects
845 when one of the destructors is set to wait for worker threads.
846 - Added a workaround for a possible crash on OS X* when dynamic memory
847 allocator replacement (libtbbmalloc_proxy) is used and memory is
848 released during application startup.
849 - Usage of mutable functors with task_group::run_and_wait() and
850 task_arena::enqueue() is disabled. An attempt to pass a functor
851 whose operator()() is not const will produce compilation errors.
852 - Makefiles and environment scripts now properly recognize GCC 5.0 and
855 Open-source contributions integrated:
857 - Improved performance of parallel_for_each for inputs allowing random
858 access, by Raf Schietekat.
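The restriction on mutable functors noted above for task_group::run_and_wait()
and task_arena::enqueue() can be illustrated without the library. The sketch
below is not TBB code: run_const, CountCalls and Mutable are hypothetical
names, and the only assumption is that the API takes its functor by const
reference, so that only a const-qualified operator()() can be called.

```cpp
#include <cassert>

// Hypothetical stand-in for an API that accepts its functor by const
// reference; only a const-qualified operator()() can be invoked through it.
template <typename F>
void run_const(const F& f) {
    f();  // ill-formed if F::operator()() is not const
}

// Accepted: the call operator is const, the state it updates lives outside.
struct CountCalls {
    int* counter;
    void operator()() const { ++*counter; }
};

// Rejected: the call operator is not const, so run_const(Mutable())
// would fail to compile (intentionally not called anywhere here).
struct Mutable {
    int calls;
    void operator()() { ++calls; }
};

// Runs the const functor twice and returns the externally stored count.
int count_after_two_runs() {
    int n = 0;
    CountCalls f = { &n };
    run_const(f);
    run_const(f);
    assert(n == 2);
    return n;
}
```

A plain lambda has a const call operator by default, so it is accepted by such
an interface; a lambda declared `mutable` is not.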
860 ------------------------------------------------------------------------
861 Intel TBB 4.4
862 TBB_INTERFACE_VERSION == 9000
864 Changes (w.r.t. Intel TBB 4.3 Update 6):
866 - The following features are now fully supported:
867 tbb::flow::composite_node;
868 additional policies of tbb::flow::graph_node::reset().
869 - Platform abstraction layer for Windows* OS updated to use compiler
870 intrinsics for most atomic operations.
871 - The tbb/compat/thread header updated to automatically include
872 C++11 <thread> where available.
873 - Fixes and refactoring in the task scheduler and class task_arena.
874 - Added key_matching policy to tbb::flow::join_node, which removes
875 the restriction on the type that can be compared against.
876 - For tag_matching join_node, tag_value is redefined to be 64 bits
877 wide on all architectures.
878 - Expanded the documentation for the flow graph with details about
879 node semantics and behavior.
880 - Added dynamic replacement of C11 standard function aligned_alloc()
882 - Added C++11 move constructors and assignment operators to
883 tbb::enumerable_thread_specific container.
884 - Added hashing support for tbb::tbb_thread::id.
885 - On OS X*, binaries that depend on libstdc++ are not provided anymore.
886 In the makefiles, libc++ is now used by default; for building with
887 libstdc++, specify stdlib=libstdc++ in the make command line.
891 - Added a new example, graph/fgbzip2, that shows usage of
892 tbb::flow::async_node.
893 - Modification to the low-level API for memory pools:
894 added a function for finding a memory pool by an object allocated
896 - tbb::memory_pool now does not request memory till the first allocation
899 Changes affecting backward compatibility:
901 - Internal layout of flow graph nodes has changed; recompilation is
902 recommended for all binaries that use the flow graph.
903 - Resetting a tbb::flow::source_node will immediately activate it,
904 unless it was created in inactive state.
908 - Failure at creation of a memory pool will not cause process
911 Open-source contributions integrated:
913 - Supported building TBB with Clang on AArch64 with use of built-in
914 intrinsics by David A.
916 ------------------------------------------------------------------------
917 Intel TBB 4.3 Update 6
918 TBB_INTERFACE_VERSION == 8006
920 Changes (w.r.t. Intel TBB 4.3 Update 5):
922 - Supported zero-copy realloc for objects >1MB under Linux* via
924 - C++11 move-aware insert and emplace methods have been added to
925 concurrent_hash_map container.
926 - install_name is set to @rpath/<library name> on OS X*.
930 - Added template class async_node to the flow graph API. It allows a
931 flow graph to communicate with an external activity managed by
932 the user or another runtime.
933 - Improved speed of flow::graph::reset() clearing graph edges.
934 rf_extract flag has been renamed rf_clear_edges.
935 - extract() method of graph nodes now takes no arguments.
939 - concurrent_unordered_{set,map} behaves correctly for degenerate
941 - Fixed a race condition in the memory allocator that may lead to
942 excessive memory consumption under high multithreading load.
944 ------------------------------------------------------------------------
945 Intel TBB 4.3 Update 5
946 TBB_INTERFACE_VERSION == 8005
948 Changes (w.r.t. Intel TBB 4.3 Update 4):
950 - Added add_ref_count() method of class tbb::task.
954 - Added class global_control for application-wide control of allowed
955 parallelism and thread stack size.
956 - memory_pool_allocator now throws the std::bad_alloc exception on
958 - Exceptions thrown by memory pool constructors changed from
959 std::bad_alloc to std::invalid_argument and std::runtime_error.
963 - scalable_allocator now throws the std::bad_alloc exception on
965 - Fixed a race condition in the memory allocator that may lead to
966 excessive memory consumption under high multithreading load.
967 - A new scheduler created right after destruction of the previous one
968 might be unable to modify the number of worker threads.
970 Open-source contributions integrated:
972 - (Added but not enabled) push_front() method of class tbb::task_list
975 ------------------------------------------------------------------------
976 Intel TBB 4.3 Update 4
977 TBB_INTERFACE_VERSION == 8004
979 Changes (w.r.t. Intel TBB 4.3 Update 3):
981 - Added a C++11 variadic constructor for enumerable_thread_specific.
982 The arguments from this constructor are used to construct
984 - Improved exception safety for enumerable_thread_specific.
985 - Added documentation for tbb::flow::tagged_msg class and
986 tbb::flow::output_port function.
987 - Fixed build errors for systems that do not support dynamic linking.
988 - C++11 move-aware insert and emplace methods have been added to
989 concurrent unordered containers.
993 - Interface-breaking change: typedefs changed for node predecessor and
994 successor lists, affecting copy_predecessors and copy_successors
996 - Added template class composite_node to the flow graph API. It packages
997 a subgraph to represent it as a first-class flow graph node.
998 - make_edge and remove_edge now accept multiport nodes as arguments,
999 automatically using the node port with index 0 for an edge.
1001 Open-source contributions integrated:
1003 - Draft code for enumerable_thread_specific constructor with multiple
1004 arguments (see above) by Adrien Guinet.
1005 - Fix for GCC invocation on IBM* Blue Gene*
1006 by Jeff Hammond and Raf Schietekat.
1007 - Extended testing with smart pointers for Clang & libc++
1010 ------------------------------------------------------------------------
1011 Intel TBB 4.3 Update 3
1012 TBB_INTERFACE_VERSION == 8003
1014 Changes (w.r.t. Intel TBB 4.3 Update 2):
1016 - Move constructor and assignment operator were added to unique_lock.
1020 - Time overhead for memory pool destruction was reduced.
1022 Open-source contributions integrated:
1024 - Build error fix for iOS* by Raf Schietekat.
1026 ------------------------------------------------------------------------
1027 Intel TBB 4.3 Update 2
1028 TBB_INTERFACE_VERSION == 8002
1030 Changes (w.r.t. Intel TBB 4.3 Update 1):
1032 - Binary files for 64-bit Android* applications were added as part of the
1034 - Exact exception propagation is enabled for Intel C++ Compiler on OS X*.
1035 - concurrent_vector::shrink_to_fit was optimized for types that support
1036 C++11 move semantics.
1040 - Fixed concurrent unordered containers to insert elements much faster
1042 - Fixed concurrent priority queue to support types that do not have
1044 - Fixed enumerable_thread_specific to forbid copying from an instance
1045 with a different value type.
1047 Open-source contributions integrated:
1049 - Support for PathScale* EKOPath* Compiler by Erik Lindahl.
1051 ------------------------------------------------------------------------
1052 Intel TBB 4.3 Update 1
1053 TBB_INTERFACE_VERSION == 8001
1055 Changes (w.r.t. Intel TBB 4.3):
1057 - The ability to split blocked_ranges in a proportion, used by
1058 affinity_partitioner since version 4.2 Update 4, became a formal
1059 extension of the Range concept.
1060 - More checks for an incorrect address to release added to the debug
1061 version of the memory allocator.
1062 - Different kinds of solutions for each TBB example were merged.
1066 - Task priorities are re-enabled in preview binaries.
1070 - Fixed a duplicate symbol when TBB_PREVIEW_VARIADIC_PARALLEL_INVOKE is
1071 used in multiple compilation units.
1072 - Fixed a crash in __itt_fini_ittlib seen on Ubuntu 14.04.
1073 - Fixed a crash in memory release after dynamic replacement of the
1074 OS X* memory allocator.
1075 - Fixed incorrect indexing of arrays in seismic example.
1076 - Fixed a data race in lazy initialization of task_arena.
1078 Open-source contributions integrated:
1080 - Fix for dumping information about gcc and clang compiler versions
1083 ------------------------------------------------------------------------
1084 Intel TBB 4.3
1085 TBB_INTERFACE_VERSION == 8000
1087 Changes (w.r.t. Intel TBB 4.2 Update 5):
1089 - The following features are now fully supported: flow::indexer_node,
1090 task_arena, speculative_spin_rw_mutex.
1091 - Compatibility with C++11 standard improved for tbb/compat/thread
1093 - C++11 move constructors have been added to concurrent_queue and
1094 concurrent_bounded_queue.
1095 - C++11 move constructors and assignment operators have been added to
1096 concurrent_vector, concurrent_hash_map, concurrent_priority_queue,
1097 concurrent_unordered_{set,multiset,map,multimap}.
1098 - C++11 move-aware emplace/push/pop methods have been added to
1099 concurrent_vector, concurrent_queue, concurrent_bounded_queue,
1100 concurrent_priority_queue.
1101 - Methods to insert a C++11 initializer list have been added:
1102 concurrent_vector::grow_by(), concurrent_hash_map::insert(),
1103 concurrent_unordered_{set,multiset,map,multimap}::insert().
1104 - Testing for compatibility of containers with some C++11 standard
1105 library types has been added.
1106 - Dynamic replacement of standard memory allocation routines has been
1108 - Microsoft* Visual Studio* projects for Intel TBB examples updated
1110 - For open-source packages, debugging information (line numbers) in
1111 precompiled binaries now matches the source code.
1112 - Debug information was added to release builds for OS X*, Solaris*,
1113 FreeBSD* operating systems and MinGW*.
1114 - Various improvements in documentation, debug diagnostics and examples.
1118 - Additional actions on reset of graphs, and extraction of individual
1119 nodes from a graph (TBB_PREVIEW_FLOW_GRAPH_FEATURES).
1120 - Support for an arbitrary number of arguments in parallel_invoke
1121 (TBB_PREVIEW_VARIADIC_PARALLEL_INVOKE).
1123 Changes affecting backward compatibility:
1125 - For compatibility with C++11 standard, copy and move constructors and
1126 assignment operators are disabled for all mutex classes. To allow
1127 the old behavior, use TBB_DEPRECATED_MUTEX_COPYING macro.
1128 - flow::sequencer_node rejects messages with repeating sequence numbers.
1129 - Changed internal interface between tbbmalloc and tbbmalloc_proxy.
1130 - The following deprecated functionality has been removed:
1131 old debugging macros TBB_DO_ASSERT & TBB_DO_THREADING_TOOLS;
1132 no-op depth-related methods in class task;
1133 tbb::deprecated::concurrent_queue;
1134 deprecated variants of concurrent_vector methods.
1135 - register_successor() and remove_successor() are deprecated as methods
1136 to add and remove edges in flow::graph; use make_edge() and
1137 remove_edge() instead.
1141 - Fixed incorrect scalable_msize() implementation for aligned objects.
1142 - Flow graph buffering nodes now destroy their copy of forwarded items.
1143 - Multiple fixes in task_arena implementation, including for:
1144 inconsistent task scheduler state inside executed functions;
1145 incorrect floating-point settings and exception propagation;
1146 possible stalls in concurrent invocations of execute().
1147 - Fixed floating-point settings propagation when the same instance of
1148 task_group_context is used in different arenas.
1149 - Fixed compilation error in pipeline.h with Intel Compiler on OS X*.
1150 - Added missing headers for individual components to tbb.h.
1152 Open-source contributions integrated:
1154 - Range interface addition to parallel_do, parallel_for_each and
1155 parallel_sort by Stephan Dollberg.
1156 - Variadic template implementation of parallel_invoke
1157 by Kizza George Mbidde (see Preview Features).
1158 - Improvement in Seismic example for MacBook Pro* with Retina* display
1161 ------------------------------------------------------------------------
1162 Intel TBB 4.2 Update 5
1163 TBB_INTERFACE_VERSION == 7005
1165 Changes (w.r.t. Intel TBB 4.2 Update 4):
1167 - The second template argument of class aligned_space<T,N> now is set
1172 - Better support for exception safety, task priorities and floating
1173 point settings in class task_arena.
1174 - task_arena::current_slot() has been renamed to
1175 task_arena::current_thread_index().
1179 - Task priority change possibly ignored by a worker thread entering
1180 a nested parallel construct.
1181 - Memory leaks inside the task scheduler when running on
1182 Intel(R) Xeon Phi(tm) coprocessor.
1184 Open-source contributions integrated:
1186 - Improved detection of X Window support for Intel TBB examples
1187 and other feedback by Raf Schietekat.
1189 ------------------------------------------------------------------------
1190 Intel TBB 4.2 Update 4
1191 TBB_INTERFACE_VERSION == 7004
1193 Changes (w.r.t. Intel TBB 4.2 Update 3):
1195 - Added the possibility to specify floating-point settings at invocation
1196 of most parallel algorithms (including flow::graph) via
1198 - Added dynamic replacement of malloc_usable_size() under
1199 Linux*/Android* and dlmalloc_usable_size() under Android*.
1200 - Added new methods to concurrent_vector:
1201 grow_by() that appends a sequence between two given iterators;
1202 grow_to_at_least() that initializes new elements with a given value.
1203 - Improved affinity_partitioner for better performance on balanced
1205 - Improvements in the task scheduler, including better scalability
1206 when threads search for a task arena, and better diagnostics.
1207 - Improved allocation performance for workloads that do intensive
1208 allocation/releasing of same-size objects larger than ~8KB from
1210 - Exception support is enabled by default for 32-bit MinGW compilers.
1211 - The tachyon example for Android* can be built for all targets
1212 supported by the installed NDK.
1213 - Added Windows Store* version of the tachyon example.
1214 - GettingStarted/sub_string_finder example ported to offload execution
1215 on Windows* for Intel(R) Many Integrated Core Architecture.
1219 - Removed task_scheduler_observer::on_scheduler_leaving() callback.
1220 - Added task_scheduler_observer::may_sleep() callback.
1221 - The CPF or_node has been renamed indexer_node. The input to
1222 indexer_node is now a list of types. The output of indexer_node is
1223 a tagged_msg type composed of a tag and a value. For indexer_node,
1224 the tag is a size_t.
1228 - Fixed data races in preview extensions of task_scheduler_observer.
1229 - Added noexcept(false) for destructor of task_group_base to avoid
1230 crash on cancellation of structured task group in C++11.
1232 Open-source contributions integrated:
1234 - Improved concurrency detection for BG/Q, and other improvements
1236 - Fix for crashes in enumerable_thread_specific when a contained
1237 object is too big to be constructed on the stack, by Adrien Guinet.
1239 ------------------------------------------------------------------------
1240 Intel TBB 4.2 Update 3
1241 TBB_INTERFACE_VERSION == 7003
1243 Changes (w.r.t. Intel TBB 4.2 Update 2):
1245 - Added support for Microsoft* Visual Studio* 2013.
1246 - Improved Microsoft* PPL-compatible form of parallel_for for better
1247 support of auto-vectorization.
1248 - Added a new example for cancellation and reset in the flow graph:
1249 Kohonen self-organizing map (examples/graph/som).
1250 - Various improvements in source code, tests, and makefiles.
1254 - Added dynamic replacement of _aligned_msize() previously missed.
1255 - Fixed task_group::run_and_wait() to throw invalid_multiple_scheduling
1256 exception if the specified task handle is already scheduled.
1258 Open-source contributions integrated:
1260 - A fix for ARM* processors by Steve Capper.
1261 - Improvements in std::swap calls by Robert Maynard.
1263 ------------------------------------------------------------------------
1264 Intel TBB 4.2 Update 2
1265 TBB_INTERFACE_VERSION == 7002
1267 Changes (w.r.t. Intel TBB 4.2 Update 1):
1269 - Enabled C++11 features for Microsoft* Visual Studio* 2013 Preview.
1270 - Added a test for compatibility of TBB containers with C++11
1271 range-based for loop.
1273 Changes affecting backward compatibility:
1275 - Internal layout changed for class tbb::flow::limiter_node.
1279 - Added speculative_spin_rw_mutex, a read-write lock class which uses
1280 Intel(R) Transactional Synchronization Extensions.
1284 - When building for Intel(R) Xeon Phi(tm) coprocessor, TBB programs
1285 no longer require explicit linking with librt and libpthread.
1287 Open-source contributions integrated:
1289 - Fixes for ARM* processors by Steve Capper, Leif Lindholm
1291 - Support for Clang on Linux by Raf Schietekat.
1292 - Typo correction in scheduler.cpp by Julien Schueller.
1294 ------------------------------------------------------------------------
1295 Intel TBB 4.2 Update 1
1296 TBB_INTERFACE_VERSION == 7001
1298 Changes (w.r.t. Intel TBB 4.2):
1300 - Added project files for Microsoft* Visual Studio* 2010.
1301 - Initial support of Microsoft* Visual Studio* 2013 Preview.
1302 - Enabled C++11 features available in Intel(R) C++ Compiler 14.0.
1303 - scalable_allocation_mode(TBBMALLOC_SET_SOFT_HEAP_LIMIT, <size>) can be
1304 used to urge releasing memory from tbbmalloc internal buffers when
1305 the given limit is exceeded.
1309 - Class task_arena no longer requires linking with a preview library,
1310 though still remains a community preview feature.
1311 - The method task_arena::wait_until_empty() is removed.
1312 - The method task_arena::current_slot() now returns -1 if
1313 the task scheduler is not initialized in the thread.
1315 Changes affecting backward compatibility:
1317 - Because of changes in internal layout of graph nodes, the namespace
1318 interface number of flow::graph has been incremented from 6 to 7.
1322 - Fixed a race in lazy initialization of task_arena.
1323 - Fixed flow::graph::reset() to prevent situations where tasks would be
1324 spawned in the process of resetting the graph to its initial state.
1325 - Fixed decrement bug in limiter_node.
1326 - Fixed a race in arc deletion in the flow graph.
1328 Open-source contributions integrated:
1330 - Improved support for IBM* Blue Gene* by Raf Schietekat.
1332 ------------------------------------------------------------------------
1333 Intel TBB 4.2
1334 TBB_INTERFACE_VERSION == 7000
1336 Changes (w.r.t. Intel TBB 4.1 Update 4):
1338 - Added speculative_spin_mutex, which uses Intel(R) Transactional
1339 Synchronization Extensions when they are supported by hardware.
1340 - Binary files linked with libc++ (the C++ standard library in Clang)
1341 were added on OS X*.
1342 - For OS X* exact exception propagation is supported with Clang;
1343 it requires use of libc++ and corresponding Intel TBB binaries.
1344 - Support for C++11 initializer lists in constructor and assignment
1345 has been added to concurrent_hash_map, concurrent_unordered_set,
1346 concurrent_unordered_multiset, concurrent_unordered_map,
1347 concurrent_unordered_multimap.
1348 - The memory allocator may now clean its per-thread memory caches
1349 when it cannot get more memory.
1350 - Added the scalable_allocation_command() function for on-demand
1351 cleaning of internal memory caches.
1352 - Reduced the time overhead for freeing memory objects smaller than ~8K.
1353 - Simplified linking with the debug library for applications that use
1354 Intel TBB in code offloaded to Intel(R) Xeon Phi(tm) coprocessors.
1356 examples/GettingStarted/sub_string_finder/Makefile.
1357 - Various improvements in source code, scripts and makefiles.
1359 Changes affecting backward compatibility:
1361 - tbb::flow::graph has been modified to spawn its tasks;
1362 the old behaviour (task enqueuing) is deprecated. This change may
1363 impact applications that expected a flow graph to make progress
1364 without calling wait_for_all(), which is no longer guaranteed. See
1365 the documentation for more details.
1366 - Changed the return values of the scalable_allocation_mode() function.
1370 - Fixed a leak of parallel_reduce body objects when execution is
1371 cancelled or an exception is thrown, as suggested by Darcy Harrison.
1372 - Fixed a race in the task scheduler which can lower the effective
1373 priority despite the existence of higher priority tasks.
1374 - On Linux an error during destruction of the internal thread local
1375 storage no longer results in an exception.
1377 Open-source contributions integrated:
1379 - Fixed task_group_context state propagation to unrelated context trees
1382 ------------------------------------------------------------------------
1383 Intel TBB 4.1 Update 4
1384 TBB_INTERFACE_VERSION == 6105
1386 Changes (w.r.t. Intel TBB 4.1 Update 3):
1388 - Use /volatile:iso option with VS 2012 to disable extended
1389 semantics for volatile variables.
1390 - Various improvements in affinity_partitioner, scheduler,
1391 tests, examples, makefiles.
1392 - concurrent_priority_queue class now supports initialization/assignment
1393 via C++11 initializer list feature (std::initializer_list<T>).
1397 - Fixed more possible stalls in concurrent invocations of
1398 task_arena::execute(), especially waiting for enqueued tasks.
1399 - Fixed requested number of workers for task_arena(P,0).
1400 - Fixed interoperability with Intel(R) VTune(TM) Amplifier XE in
1401 case of using task_arena::enqueue() from a terminating thread.
1403 Open-source contributions integrated:
1405 - Type fixes, cleanups, and code beautification by Raf Schietekat.
1406 - Improvements in atomic operations for big endian platforms
1409 ------------------------------------------------------------------------
1410 Intel TBB 4.1 Update 3
1411 TBB_INTERFACE_VERSION == 6103
1413 Changes (w.r.t. Intel TBB 4.1 Update 2):
1415 - Binary files for Android* applications were added to the Linux* OS
1417 - Binary files for Windows Store* applications were added to the
1418 Windows* OS package.
1419 - Exact exception propagation (exception_ptr) support on Linux OS is
1420 now turned on by default for GCC 4.4 and higher.
1421 - Stopped implicit use of large memory pages by tbbmalloc (Linux-only).
1422 Now use of large pages must be explicitly enabled with
1423 scalable_allocation_mode() function or TBB_MALLOC_USE_HUGE_PAGES
1424 environment variable.
1426 Community Preview Features:
1428 - Extended class task_arena constructor and method initialize() to
1429 allow some concurrency to be reserved strictly for application
1431 - New methods terminate() and is_active() were added to class
1436 - Fixed initialization of hashing helper constant in the hash
1438 - Fixed possible stalls in concurrent invocations of
1439 task_arena::execute() when no worker thread is available to make
1441 - Fixed incorrect calculation of hardware concurrency in the presence
1442 of inactive processor groups, particularly on systems running
1443 Windows* 8 and Windows* Server 2012.
1445 Open-source contributions integrated:
1447 - The fix for the GUI examples on OS X* systems by Raf Schietekat.
1448 - Moved some power-of-2 calculations to functions to improve readability
1450 - C++11/Clang support improvements by arcata.
1451 - ARM* platform isolation layer by Steve Capper, Leif Lindholm, Leo Lara
1454 ------------------------------------------------------------------------
1455 Intel TBB 4.1 Update 2
1456 TBB_INTERFACE_VERSION == 6102
1458 Changes (w.r.t. Intel TBB 4.1 Update 1):
1460 - Objects up to 128 MB are now cached by tbbmalloc. Previously
1461 the threshold was 8 MB. Objects larger than 128 MB are still
1462 processed by direct OS calls.
1463 - concurrent_unordered_multiset and concurrent_unordered_multimap
1464 have been added, based on Microsoft* PPL prototype.
1465 - Ability to value-initialize a tbb::atomic<T> variable on construction
1466 in C++11, with const expressions properly supported.
1468 Community Preview Features:
1470 - Added a possibility to wait until all worker threads terminate.
1471 This is necessary before calling fork() from an application.
1475 - Fixed data race in tbbmalloc that might lead to memory leaks
1476 for large object allocations.
1477 - Fixed task_arena::enqueue() to use task_group_context of target arena.
1478 - Improved implementation of 64-bit atomics on ia32.
1480 ------------------------------------------------------------------------
1481 Intel TBB 4.1 Update 1
1482 TBB_INTERFACE_VERSION == 6101
1484 Changes (w.r.t. Intel TBB 4.1):
1486 - concurrent_vector class now supports initialization/assignment
1487 via C++11 initializer list feature (std::initializer_list<T>).
1488 - Added implementation of the platform isolation layer based on
1489 Intel compiler atomic built-ins; it is expected to work on
1490 any platform supported by compiler version 12.1 and newer.
1491 - GetNativeSystemInfo() is now used instead of GetSystemInfo() to support
1492 more than 32 processors for 32-bit applications under WOW64.
1493 - The following form of parallel_for:
1494 parallel_for(first, last, [step,] f[, context]) now accepts an
1495 optional partitioner parameter after the function f.
1497 Backward-incompatible API changes:
1499 - The library no longer injects tuple into namespace std.
1500 In previous releases, tuple was injected into namespace std by
1501 flow_graph.h when std::tuple was not available. In this release,
1502 flow_graph.h now uses tbb::flow::tuple. On platforms where
1503 std::tuple is available, tbb::flow::tuple is typedef'ed to
1504 std::tuple. On all other platforms, tbb::flow::tuple provides
1505 a subset of the functionality defined by std::tuple. Users of
1506 flow_graph.h may need to change their uses of std::tuple to
1507 tbb::flow::tuple to ensure compatibility with non-C++11 compliant
1512 - Fixed local observer to be able to override propagated CPU state and
1513 to provide correct value of task_arena::current_slot() in callbacks.
1515 ------------------------------------------------------------------------
1516 Intel TBB 4.1
1517 TBB_INTERFACE_VERSION == 6100
1519 Changes (w.r.t. Intel TBB 4.0 Update 5):
1521 - _WIN32_WINNT must be set to 0x0501 or greater in order to use TBB
1522 on Microsoft* Windows*.
1523 - parallel_deterministic_reduce template function is fully supported.
1524 - TBB headers can be used with C++0x/C++11 mode (-std=c++0x) of GCC
1525 and Intel(R) Compiler.
1526 - C++11 std::make_exception_ptr is used where available, instead of
1527 std::copy_exception from earlier C++0x implementations.
1528 - Improvements in the TBB allocator to reduce extra memory consumption.
1529 - Partial refactoring of the task scheduler data structures.
1530 - TBB examples allow more flexible specification of the thread number,
1531 including arithmetic and geometric progression.
1535 - On Linux & OS X*, pre-built TBB binaries do not yet support exact
1536 exception propagation via C++11 exception_ptr. To prevent run time
1537 errors, by default TBB headers disable exact exception propagation
1538 even if the C++ implementation provides exception_ptr.
1540 Community Preview Features:
1542 - Added: class task_arena, for work submission by multiple application
1543 threads with thread-independent control of concurrency level.
1544 - Added: task_scheduler_observer can be created as local to a master
1545 thread, to observe threads that work on behalf of that master.
1546 Local observers may have a new on_scheduler_leaving() callback.
1548 ------------------------------------------------------------------------
1549 Intel TBB 4.0 Update 5
1550 TBB_INTERFACE_VERSION == 6005
1552 Changes (w.r.t. Intel TBB 4.0 Update 4):
1554 - Parallel pipeline optimization (directly storing small objects in the
1555 interstage data buffers) limited to trivially-copyable types for
1556 C++11 and a short list of types for earlier compilers.
1557 - _VARIADIC_MAX switch is honored for TBB tuple implementation
1558 and flow::graph nodes based on tuple.
1559 - Support of Cocoa framework was added to the GUI examples on OS X*
1564 - Fixed a tv_nsec overflow bug in condition_variable::wait_for.
1565 - Fixed execution order of enqueued tasks with different priorities.
1566 - Fixed a bug with task priority changes causing lack of progress
1567 for fire-and-forget tasks when TBB was initialized to use 1 thread.
1568 - Fixed duplicate symbol problem when linking multiple compilation
1569 units that include flow_graph.h on VC 10.
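The tv_nsec overflow noted above for condition_variable::wait_for belongs to
a general class of errors: when a relative timeout is added to an absolute
time, the nanosecond field can reach or exceed one second and must be carried
into the seconds field. The sketch below is not the TBB fix; TimeSpec and
add_normalized are hypothetical names, and the struct only mirrors the layout
of POSIX timespec.

```cpp
#include <cassert>

// Hypothetical mirror of POSIX timespec, defined locally for portability.
struct TimeSpec {
    long long sec;   // whole seconds
    long      nsec;  // nanoseconds, must stay in [0, 1000000000)
};

const long kNanosPerSecond = 1000000000L;

// Returns base + delta with the nanosecond field normalized.
// Assumes both inputs are already normalized and non-negative.
TimeSpec add_normalized(TimeSpec base, TimeSpec delta) {
    assert(base.nsec >= 0 && base.nsec < kNanosPerSecond);
    assert(delta.nsec >= 0 && delta.nsec < kNanosPerSecond);
    TimeSpec out;
    out.sec  = base.sec + delta.sec;
    out.nsec = base.nsec + delta.nsec;   // at most 1999999998, fits in long
    if (out.nsec >= kNanosPerSecond) {   // carry one second
        out.nsec -= kNanosPerSecond;
        out.sec  += 1;
    }
    return out;
}
```

Without the carry step, a sum such as 0.9 s + 0.2 s would leave the
nanosecond field at 1100000000, which is outside the valid range for a
timespec passed to a timed wait.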
1571 ------------------------------------------------------------------------
1572 Intel TBB 4.0 Update 4
1573 TBB_INTERFACE_VERSION == 6004
1575 Changes (w.r.t. Intel TBB 4.0 Update 3):
1577 - The TBB memory allocator transparently supports large pages on Linux.
1578 - A new flow_graph example, logic_sim, was added.
1579 - Support for DirectX* 9 was added to GUI examples.
1581 Community Preview Features:
1583 - Added: aggregator, a new concurrency control mechanism.
1587 - The abort operation on concurrent_bounded_queue now leaves the queue
1588 in a reusable state. If a bad_alloc or bad_last_alloc exception is
1589 thrown while the queue is recovering from an abort, that exception
1590 will be reported instead of user_abort on the thread on which it
1591 occurred, and the queue will not be reusable.
1592 - Steal limiting heuristic fixed to avoid premature disabling of stealing
1593 when a large amount of __thread data is allocated on thread stack.
1594 - Fixed a low-probability leak of arenas in the task scheduler.
1595 - In STL-compatible allocator classes, the method construct() was fixed
1596 to comply with C++11 requirements.
1597 - Fixed a bug that prevented creation of fixed-size memory pools
1599 - Significantly reduced the number of warnings from various compilers.
1601 Open-source contributions integrated:
1603 - Multiple improvements by Raf Schietekat.
1604 - Basic support for Clang on OS X* by Blas Rodriguez Somoza.
1605 - Fixes for warnings and corner-case bugs by Blas Rodriguez Somoza
1608 ------------------------------------------------------------------------
1609 Intel TBB 4.0 Update 3
1610 TBB_INTERFACE_VERSION == 6003
1612 Changes (w.r.t. Intel TBB 4.0 Update 2):
1614 - Modifications to the low-level API for memory pools:
1615 added support for aligned allocations;
1616 pool policies reworked to allow backward-compatible extensions;
1617 added a policy to not return memory space till destruction;
1618 pool_reset() does not return memory space anymore.
1619 - Class tbb::flow::graph_iterator added to iterate over all nodes
1620 registered with a graph instance.
1621 - multioutput_function_node has been renamed multifunction_node.
1622 multifunction_node and split_node are now fully-supported features.
1623 - For the tagged join node, the policy for try_put of an item with
1624 an already existing tag has been defined: the item will be rejected.
1625 - Matching the behavior on Windows, on other platforms the optional
1626 shared libraries (libtbbmalloc, libirml) now are also searched
1627 only in the directory where libtbb is located.
1628 - The platform isolation layer based on GCC built-ins is extended.
1630 Backward-incompatible API changes:
1632 - A graph reference parameter is now required to be passed to the
1633 constructors of the following flow graph nodes: overwrite_node,
1634 write_once_node, broadcast_node, and the CPF or_node.
1635 - The following tbb::flow node methods and typedefs have been renamed:
1637 join_node and or_node:
1638 inputs() -> input_ports()
1639 input_ports_tuple_type -> input_ports_type
1640 multifunction_node and split_node:
1641 ports_type -> output_ports_type
1645 - Not all logical processors were utilized on systems with more than
1646 64 cores split by Windows into several processor groups.
1648 ------------------------------------------------------------------------
1649 Intel TBB 4.0 Update 2 commercial-aligned release
1650 TBB_INTERFACE_VERSION == 6002
1652 Changes (w.r.t. Intel TBB 4.0 Update 1 commercial-aligned release):
1654 - concurrent_bounded_queue now has an abort() operation that releases
1655 threads involved in pending push or pop operations. The released
1656 threads will receive a tbb::user_abort exception.
1657 - Added Community Preview Feature: concurrent_lru_cache container,
1658 a concurrent implementation of LRU (least-recently-used) cache.
1660 Bugs fixed:
1662 - fixed a race condition in the TBB scalable allocator.
1663 - concurrent_queue counter wraparound bug was fixed, which occurred when
1664 the number of push and pop operations exceeded ~4 billion on IA32.
1665 - fixed races in the TBB scheduler that could put workers to sleep too
1666 early, especially in the presence of affinitized tasks.
1668 ------------------------------------------------------------------------
1669 Intel TBB 4.0 Update 1 commercial-aligned release
1670 TBB_INTERFACE_VERSION == 6000 (not incremented by mistake)
1672 Changes (w.r.t. Intel TBB 4.0 commercial-aligned release):
1674 - Memory leaks fixed in binpack example.
1675 - Improvements and fixes in the TBB allocator.
1677 ------------------------------------------------------------------------
1678 Intel TBB 4.0 commercial-aligned release
1679 TBB_INTERFACE_VERSION == 6000
1681 Changes (w.r.t. Intel TBB 3.0 Update 8 commercial-aligned release):
1683 - concurrent_priority_queue is now a fully supported feature.
1684 Capacity control methods were removed.
1685 - Flow graph is now a fully supported feature.
1686 - A new memory backend has been implemented in the TBB allocator.
1687 It can reuse freed memory for both small and large objects, and
1688 returns unused memory blocks to the OS more actively.
1689 - Improved partitioning algorithms for parallel_for and parallel_reduce
1690 to better handle load imbalance.
1691 - The convex_hull example has been refactored for reproducible
1692 performance results.
1693 - The major interface version has changed from 5 to 6.
1694 Deprecated interfaces might be removed in future releases.
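As a rough illustration of how the TBB_INTERFACE_VERSION values listed in
this file relate to the major interface version: the major part is the
value divided by 1000 (5008 -> 5, 6000 -> 6). The helper names below are
hypothetical and not part of TBB; the sketch uses only standard C++ and
treats the remainder simply as a counter within one major version.

```cpp
// Hypothetical helpers (not TBB API): split an interface version number,
// as listed in these release notes, into its major part and the remainder.
constexpr int interface_major(int version) { return version / 1000; }
constexpr int interface_remainder(int version) { return version % 1000; }

// Values taken from release entries in this file.
static_assert(interface_major(5008) == 5, "TBB 3.0 Update 8");
static_assert(interface_major(6000) == 6, "TBB 4.0");
static_assert(interface_remainder(6002) == 2, "TBB 4.0 Update 2");
```

Code that must build against several releases could compare the major
part in a preprocessor check before using an interface that these notes
mark as deprecated or removed.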
1696 Community Preview Features:
1698 - Added: serial subset, i.e. sequential implementations of TBB generic
1699 algorithms (currently, only provided for parallel_for).
1700 - Preview of new flow graph nodes:
1701 or_node (accepts multiple inputs, forwards each input separately
1703 split_node (accepts tuples, and forwards each element of a tuple
1704 to a corresponding successor), and
1705 multioutput_function_node (accepts one input, and passes the input
1706 and a tuple of output ports to the function body to support outputs
1707 to multiple successors).
1708 - Added: memory pools for more control on memory source, grouping,
1709 and collective deallocation.
1711 ------------------------------------------------------------------------
1712 Intel TBB 3.0 Update 8 commercial-aligned release
1713 TBB_INTERFACE_VERSION == 5008
1715 Changes (w.r.t. Intel TBB 3.0 Update 7 commercial-aligned release):
1717 - Task priorities are now an official feature of TBB,
1718 no longer a Community Preview Feature.
1719 - Atomics API extended, and implementation refactored.
1720 - Added task::set_parent() method.
1721 - Added concurrent_unordered_set container.
1723 Open-source contributions integrated:
1725 - PowerPC support by Raf Schietekat.
1726 - Fix of potential task pool overrun and other improvements
1727 in the task scheduler by Raf Schietekat.
1728 - Fix in parallel_for_each to work with std::set in Visual* C++ 2010.
1730 Community Preview Features:
1732 - Graph community preview feature was renamed to flow graph.
1733 Multiple improvements in the implementation.
1734 Binpack example was added for the feature.
1735 - A number of improvements to concurrent_priority_queue.
1736 Shortpath example was added for the feature.
1737 - TBB runtime loaded functionality was added (Windows*-only).
1738 It allows specifying which versions of TBB should be used,
1739 as well as to set directories for the library search.
1740 - parallel_deterministic_reduce template function was added.
1742 ------------------------------------------------------------------------
1743 Intel TBB 3.0 Update 7 commercial-aligned release
1744 TBB_INTERFACE_VERSION == 5006 (not incremented by mistake)
1746 Changes (w.r.t. Intel TBB 3.0 Update 6 commercial-aligned release):
1748 - Added implementation of the platform isolation layer based on
1749 GCC atomic built-ins; it is supposed to work on any platform
1750 where GCC has these built-ins.
1752 Community Preview Features:
1754 - Graph's dining_philosophers example added.
1755 - A number of improvements to graph and concurrent_priority_queue.
1758 ------------------------------------------------------------------------
1759 Intel TBB 3.0 Update 6 commercial-aligned release
1760 TBB_INTERFACE_VERSION == 5006
1762 Changes (w.r.t. Intel TBB 3.0 Update 5 commercial-aligned release):
1764 - Added Community Preview feature: task and task group priority, and
1765 Fractal example demonstrating it.
1766 - parallel_pipeline optimized for data items of small and large sizes.
1767 - Graph's join_node is now parametrized with a tuple of up to 10 types.
1768 - Improved performance of concurrent_priority_queue.
1770 Open-source contributions integrated:
1772 - Initial NetBSD support by Aleksej Saushev.
1774 Bugs fixed:
1776 - Failure to enable interoperability with Intel(R) Cilk(tm) Plus runtime
1777 library, and a crash caused by invoking the interoperability layer
1778 after one of the libraries was unloaded.
1779 - Data race that could result in concurrent_unordered_map structure
1780 corruption after call to clear() method.
1781 - Stack corruption caused by PIC version of 64-bit CAS compiled by Intel
1783 - Inconsistency of exception propagation mode possible when application
1784 built with Microsoft* Visual Studio* 2008 or earlier uses TBB built
1785 with Microsoft* Visual Studio* 2010.
1786 - Affinitizing master thread to a subset of available CPUs after TBB
1787 scheduler was initialized tied all worker threads to the same CPUs.
1788 - Method is_stolen_task() always returned 'false' for affinitized tasks.
1789 - write_once_node and overwrite_node did not immediately send buffered
1792 ------------------------------------------------------------------------
1793 Intel TBB 3.0 Update 5 commercial-aligned release
1794 TBB_INTERFACE_VERSION == 5005
1796 Changes (w.r.t. Intel TBB 3.0 Update 4 commercial-aligned release):
1798 - Added Community Preview feature: graph.
1799 - Added automatic propagation of master thread FPU settings to
1801 - Added a public function to perform a sequentially consistent full
1802 memory fence: tbb::atomic_fence() in tbb/atomic.h.
1804 Bugs fixed:
1806 - Data race that could result in scheduler data structures corruption
1807 when using fire-and-forget tasks.
1808 - Potential referencing of destroyed concurrent_hash_map element after
1809 using erase(accessor&A) method with A acquired as const_accessor.
1810 - Fixed a correctness bug in the convex hull example.
1812 Open-source contributions integrated:
1814 - Patch for calls to internal::atomic_do_once() by Andrey Semashev.
1816 ------------------------------------------------------------------------
1817 Intel TBB 3.0 Update 4 commercial-aligned release
1818 TBB_INTERFACE_VERSION == 5004
1820 Changes (w.r.t. Intel TBB 3.0 Update 3 commercial-aligned release):
1822 - Added Community Preview feature: concurrent_priority_queue.
1823 - Fixed library loading to avoid possibility for remote code execution,
1824 see http://www.microsoft.com/technet/security/advisory/2269637.mspx.
1825 - Added support of more than 64 cores for appropriate Microsoft*
1826 Windows* versions. For more details, see
1827 http://msdn.microsoft.com/en-us/library/dd405503.aspx.
1828 - Default number of worker threads is adjusted in accordance with
1829 process affinity mask.
1831 Bugs fixed:
1833 - Calls of scalable_* functions from inside the allocator library
1834 caused issues if the functions were overridden by another module.
1835 - A crash occurred if methods run() and wait() were called concurrently
1836 for an empty tbb::task_group (1736).
1837 - The tachyon example exhibited build problems associated with
1838 bug 554339 on Microsoft* Visual Studio* 2010. Project files were
1839 modified as a partial workaround to overcome the problem. See
1840 http://connect.microsoft.com/VisualStudio/feedback/details/554339.
1842 ------------------------------------------------------------------------
1843 Intel TBB 3.0 Update 3 commercial-aligned release
1844 TBB_INTERFACE_VERSION == 5003
1846 Changes (w.r.t. Intel TBB 3.0 Update 2 commercial-aligned release):
1848 - cache_aligned_allocator class reworked to use scalable_aligned_malloc.
1849 - Improved performance of count() and equal_range() methods
1850 in concurrent_unordered_map.
1851 - Improved implementation of 64-bit atomic loads and stores on 32-bit
1852 platforms, including compilation with VC 7.1.
1853 - Added implementation of atomic operations on top of OSAtomic API
1855 - Removed gratuitous try/catch blocks surrounding thread function calls
1857 - Xcode* projects were added for sudoku and game_of_life examples.
1858 - Xcode* projects were updated to work without TBB framework.
1860 Bugs fixed:
1862 - Fixed a data race in task scheduler destruction that on rare occasion
1863 could result in memory corruption.
1864 - Fixed idle spinning in thread bound filters in tbb::pipeline (1670).
1866 Open-source contributions integrated:
1868 - MinGW-64 basic support by brsomoza (partially).
1869 - Patch for atomic.h by Andrey Semashev.
1870 - Support for AIX & GCC on PowerPC by Giannis Papadopoulos.
1871 - Various improvements by Raf Schietekat.
1873 ------------------------------------------------------------------------
1874 Intel TBB 3.0 Update 2 commercial-aligned release
1875 TBB_INTERFACE_VERSION == 5002
1877 Changes (w.r.t. Intel TBB 3.0 Update 1 commercial-aligned release):
1879 - Destructor of tbb::task_group class throws missing_wait exception
1880 if there are tasks running when it is invoked.
1881 - Interoperability layer with Intel Cilk Plus runtime library added
1882 to protect TBB TLS in case of nested usage with Intel Cilk Plus.
1883 - Compilation fix for dependent template names in concurrent_queue.
1884 - Memory allocator code refactored to ease development and maintenance.
1886 Bugs fixed:
1888 - Improved interoperability with other Intel software tools on Linux in
1889 case of dynamic replacement of memory allocator (1700).
1890 - Fixed install issues that prevented installation on
1891 Mac OS* X 10.6.4 (1711).
1893 ------------------------------------------------------------------------
1894 Intel TBB 3.0 Update 1 commercial-aligned release
1895 TBB_INTERFACE_VERSION == 5000 (not incremented by mistake)
1897 Changes (w.r.t. Intel TBB 3.0 commercial-aligned release):
1899 - Decreased memory fragmentation caused by allocations bigger than 8K.
1900 - Lazily allocate worker threads, to avoid creating unnecessary stacks.
1902 Bugs fixed:
1904 - TBB allocator used much more memory than malloc (1703) - see above.
1905 - Deadlocks happened in some specific initialization scenarios
1906 of the TBB allocator (1701, 1704).
1907 - Regression in enumerable_thread_specific: excessive requirements
1908 for object constructors.
1909 - A bug in construction of parallel_pipeline filters when body instance
1910 was a temporary object.
1911 - Incorrect usage of memory fences on PowerPC and XBOX360 platforms.
1912 - A subtle issue in task group context binding that could result
1913 in cancellation signal being missed by nested task groups.
1914 - Incorrect construction of concurrent_unordered_map if specified
1915 number of buckets is not power of two.
1916 - Broken count() and equal_range() of concurrent_unordered_map.
1917 - Return type of postfix form of operator++ for hash map's iterators.
1919 ------------------------------------------------------------------------
1920 Intel TBB 3.0 commercial-aligned release
1921 TBB_INTERFACE_VERSION == 5000
1923 Changes (w.r.t. Intel TBB 2.2 Update 3 commercial-aligned release):
1925 - All open-source-release changes down to TBB 2.2 U3 below
1926 were incorporated into this release.
1928 ------------------------------------------------------------------------
1929 20100406 open-source release
1931 Changes (w.r.t. 20100310 open-source release):
1933 - Added support for Microsoft* Visual Studio* 2010, including binaries.
1934 - Added a PDF file with recommended Design Patterns for TBB.
1935 - Added parallel_pipeline function and companion classes and functions
1936 that provide a strongly typed lambda-friendly pipeline interface.
1937 - Reworked enumerable_thread_specific to use a custom implementation of
1938 hash map that is more efficient for ETS usage models.
1939 - Added example for class task_group; see examples/task_group/sudoku.
1940 - Removed two examples, as they were long outdated and superseded:
1941 pipeline/text_filter (use pipeline/square);
1942 parallel_while/parallel_preorder (use parallel_do/parallel_preorder).
1943 - PDF documentation updated.
1944 - Other fixes and changes in code, tests, and examples.
1946 Bugs fixed:
1948 - Eliminated build errors with MinGW32.
1949 - Fixed post-build step and other issues in VS projects for examples.
1950 - Fixed discrepancy between scalable_realloc and scalable_msize that
1951 caused crashes with malloc replacement on Windows.
1953 ------------------------------------------------------------------------
1954 20100310 open-source release
1956 Changes (w.r.t. Intel TBB 2.2 Update 3 commercial-aligned release):
1958 - Version macros changed in anticipation of a future release.
1959 - Directory structure aligned with Intel(R) C++ Compiler;
1960 now TBB binaries reside in <arch>/<os_key>/[bin|lib]
1961 (in TBB 2.x, it was [bin|lib]/<arch>/<os_key>).
1962 - Visual Studio projects changed for examples: instead of separate set
1963 of files for each VS version, now there is single 'msvs' directory
1964 that contains workspaces for MS C++ compiler (<example>_cl.sln) and
1965 Intel C++ compiler (<example>_icl.sln). Works with VS 2005 and above.
1966 - The name versioning scheme for backward compatibility was improved;
1967 now compatibility-breaking changes are done in a separate namespace.
1968 - Added concurrent_unordered_map implementation based on a prototype
1969 developed in Microsoft for a future version of PPL.
1970 - Added PPL-compatible writer-preference RW lock (reader_writer_lock).
1971 - Added TBB_IMPLEMENT_CPP0X macro to control injection of C++0x names
1972 implemented in TBB into namespace std.
1973 - Added almost-C++0x-compatible std::condition_variable, plus a bunch
1974 of other C++0x classes required by condition_variable.
1975 - With TBB_IMPLEMENT_CPP0X, tbb_thread can be also used as std::thread.
1976 - task.cpp was split into several translation units to structure
1977 TBB scheduler sources layout. Static data layout and library
1978 initialization logic were also updated.
1979 - TBB scheduler reworked to prevent master threads from stealing
1980 work belonging to other masters.
1981 - Class task was extended with the enqueue() method, and the semantics
1982 of methods spawn() and destroy() changed slightly. For exact semantics,
1983 refer to the TBB Reference manual.
1984 - task_group_context now allows for destruction by non-owner threads.
1985 - Added TBB_USE_EXCEPTIONS macro to control use of exceptions in TBB
1986 headers. It turns off (i.e. sets to 0) automatically if specified
1987 compiler options disable exception handling.
1988 - TBB is enabled to run on top of Microsoft's Concurrency Runtime
1989 on Windows* 7 (via our worker dispatcher known as RML).
1990 - Removed old unused busy-waiting code in concurrent_queue.
1991 - Described the advanced build & test options in src/index.html.
1992 - Warning level for GCC raised with -Wextra and a few other options.
1993 - Multiple fixes and improvements in code, tests, examples, and docs.
1995 Open-source contributions integrated:
1997 - Xbox support by Roman Lut (Deep Shadows), though further changes are
1998 required to make it work; e.g. post-2.1 entry points are missing.
1999 - "Eventcount" by Dmitry Vyukov evolved into concurrent_monitor,
2000 an internal class used in the implementation of concurrent_queue.
2002 ------------------------------------------------------------------------
2003 Intel TBB 2.2 Update 3 commercial-aligned release
2004 TBB_INTERFACE_VERSION == 4003
2006 Changes (w.r.t. Intel TBB 2.2 Update 2 commercial-aligned release):
2008 - PDF documentation updated.
2010 Bugs fixed:
2012 - concurrent_hash_map compatibility issue exposed on Linux in case
2013 two versions of the container were used by different modules.
2014 - enforce 16-byte stack alignment for consistency with GCC; required
2015 to work correctly with 128-bit variables processed by SSE.
2016 - construct() methods of allocator classes now use global operator new.
2018 ------------------------------------------------------------------------
2019 Intel TBB 2.2 Update 2 commercial-aligned release
2020 TBB_INTERFACE_VERSION == 4002
2022 Changes (w.r.t. Intel TBB 2.2 Update 1 commercial-aligned release):
2024 - parallel_invoke and parallel_for_each now take function objects
2025 by const reference, not by value.
2026 - Building TBB with /MT is supported, to avoid dependency on particular
2027 versions of Visual C++* runtime DLLs. TBB DLLs built with /MT
2028 are located in vc_mt directory.
2029 - Class critical_section introduced.
2030 - Improvements in exception support: new exception classes introduced,
2031 all exceptions are thrown via an out-of-line internal method.
2032 - Improvements and fixes in the TBB allocator and malloc replacement,
2033 including robust memory identification, and more reliable dynamic
2034 function substitution on Windows*.
2035 - Method swap() added to class tbb_thread.
2036 - Methods rehash() and bucket_count() added to concurrent_hash_map.
2037 - Added support for Visual Studio* 2010 Beta2. No special binaries
2038 provided, but CRT-independent DLLs (vc_mt) should work.
2039 - Other fixes and improvements in code, tests, examples, and docs.
2041 Open-source contributions integrated:
2043 - The fix to build 32-bit TBB on Mac OS* X 10.6.
2044 - GCC-based port for SPARC Solaris by Michailo Matijkiw, with use of
2045 earlier work by Raf Schietekat.
2047 Bugs fixed:
2049 - 159 - TBB build for PowerPC* running Mac OS* X.
2050 - 160 - IBM* Java segfault if used with TBB allocator.
2051 - crash in concurrent_queue<char> (1616).
2053 ------------------------------------------------------------------------
2054 Intel TBB 2.2 Update 1 commercial-aligned release
2055 TBB_INTERFACE_VERSION == 4001
2057 Changes (w.r.t. Intel TBB 2.2 commercial-aligned release):
2059 - Incorporates all changes from open-source releases below.
2060 - Documentation was updated.
2061 - TBB scheduler auto-initialization now covers all possible use cases.
2062 - concurrent_queue: made argument types of sizeof used in paddings
2063 consistent with those actually used.
2064 - Memory allocator was improved: supported corner case of user's malloc
2065 calling scalable_malloc (non-Windows), corrected processing of
2066 memory allocation requests during tbb memory allocator startup
2068 - Windows malloc replacement now has better support for static objects.
2069 - In pipeline setups that do not allow actual parallelism, execution
2070 by a single thread is guaranteed, idle spinning eliminated, and
2071 performance improved.
2072 - RML refactoring and clean-up.
2073 - New constructor for concurrent_hash_map allows reserving space for
2075 - Operator delete() added to the TBB exception classes.
2076 - Lambda support was improved in parallel_reduce.
2077 - gcc 4.3 warnings were fixed for concurrent_queue.
2078 - Fixed possible initialization deadlock in modules using TBB entities
2079 during construction of global static objects.
2080 - Copy constructor in concurrent_hash_map was fixed.
2081 - Fixed a couple of rare crashes in the scheduler that were possible
2082 in very specific use cases.
2083 - Fixed a rare crash in the TBB allocator running out of memory.
2084 - New tests were implemented, including test_lambda.cpp that checks
2085 support for lambda expressions.
2086 - A few other small changes in code, tests, and documentation.
2088 ------------------------------------------------------------------------
2089 20090809 open-source release
2091 Changes (w.r.t. Intel TBB 2.2 commercial-aligned release):
2093 - Fixed known exception safety issues in concurrent_vector.
2094 - Better concurrency of simultaneous grow requests in concurrent_vector.
2095 - TBB allocator further improves performance of large object allocation.
2096 - Problem with source of text relocations was fixed on Linux.
2097 - Fixed bugs related to malloc replacement under Windows.
2098 - A few other small changes in code and documentation.
2100 ------------------------------------------------------------------------
2101 Intel TBB 2.2 commercial-aligned release
2102 TBB_INTERFACE_VERSION == 4000
2104 Changes (w.r.t. Intel TBB 2.1 U4 commercial-aligned release):
2106 - Incorporates all changes from open-source releases below.
2107 - Architecture folders renamed from em64t to intel64 and from itanium
2109 - Major Interface version changed from 3 to 4. Deprecated interfaces
2110 might be removed in future releases.
2111 - Parallel algorithms that use partitioners have switched to use
2112 the auto_partitioner by default.
2113 - Improved memory allocator performance for allocations bigger than 8K.
2114 - Added new thread-bound filters functionality for pipeline.
2115 - New implementation of concurrent_hash_map that improves performance
2117 - A few other small changes in code and documentation.
2119 ------------------------------------------------------------------------
2120 20090511 open-source release
2122 Changes (w.r.t. previous open-source release):
2124 - Basic support for MinGW32 development kit.
2125 - Added tbb::zero_allocator class that initializes memory with zeros.
2126 It can be used as an adaptor to any STL-compatible allocator class.
2127 - Added tbb::parallel_for_each template function as alias to parallel_do.
2128 - Added more overloads for tbb::parallel_for.
2129 - Added support for exact exception propagation (can only be used with
2130 compilers that support C++0x std::exception_ptr).
2131 - tbb::atomic template class can be used with enumerations.
2132 - mutex, recursive_mutex, spin_mutex, spin_rw_mutex classes extended
2133 with explicit lock/unlock methods.
2134 - Fixed size() and grow_to_at_least() methods of tbb::concurrent_vector
2135 to provide space allocation guarantees. More methods added for
2136 compatibility with std::vector, including some from C++0x.
2137 - Preview of a lambda-friendly interface for low-level use of tasks.
2138 - scalable_msize function added to the scalable allocator (Windows only).
2139 - Rationalized internal auxiliary functions for spin-waiting and backoff.
2140 - Several tests were substantially refactored.
2142 Changes affecting backward compatibility:
2144 - Improvements in concurrent_queue, including limited API changes.
2145 The previous version is deprecated; its functionality is accessible
2146 via methods of the new tbb::concurrent_bounded_queue class.
2147 - grow* and push_back methods of concurrent_vector changed to return
2148 iterators; old semantics is deprecated.
2150 ------------------------------------------------------------------------
2151 Intel TBB 2.1 Update 4 commercial-aligned release
2152 TBB_INTERFACE_VERSION == 3016
2154 Changes (w.r.t. Intel TBB 2.1 U3 commercial-aligned release):
2156 - Added tests for aligned memory allocations and malloc replacement.
2157 - Several improvements for better bundling with Intel(R) C++ Compiler.
2158 - A few other small changes in code and documentation.
2160 Bugs fixed:
2162 - 150 - request to build TBB examples with debug info in release mode.
2163 - backward compatibility issue with concurrent_queue on Windows.
2164 - dependency on VS 2005 SP1 runtime libraries removed.
2165 - compilation of GUI examples under Xcode* 3.1 (1577).
2166 - On Windows, TBB allocator classes can be instantiated with const types
2167 for compatibility with MS implementation of STL containers (1566).
2169 ------------------------------------------------------------------------
2170 20090313 open-source release
2172 Changes (w.r.t. 20081109 open-source release):
2174 - Includes all changes introduced in TBB 2.1 Update 2 & Update 3
2175 commercial-aligned releases (see below for details).
2176 - Added tbb::parallel_invoke template function. It runs up to 10
2177 user-defined functions in parallel and waits for them to complete.
2178 - Added a special library providing ability to replace the standard
2179 memory allocation routines in Microsoft* C/C++ RTL (malloc/free,
2180 global new/delete, etc.) with the TBB memory allocator.
2181 Usage details are described in include/tbb/tbbmalloc_proxy.h file.
2182 - Task scheduler switched to use new implementation of its core
2183 functionality (deque based task pool, new structure of arena slots).
2184 - Preview of Microsoft* Visual Studio* 2005 project files for
2185 building the library is available in build/vsproject folder.
2186 - Added tests for aligned memory allocations and malloc replacement.
2187 - Added parallel_for/game_of_life.net example (for Windows only)
2188 showing TBB usage in a .NET application.
2189 - A number of other fixes and improvements to code, tests, makefiles,
2190 examples and documents.
2192 Bugs fixed:
2194 - The same list as in TBB 2.1 Update 4 right above.
2196 ------------------------------------------------------------------------
2197 Intel TBB 2.1 Update 3 commercial-aligned release
2198 TBB_INTERFACE_VERSION == 3015
2200 Changes (w.r.t. Intel TBB 2.1 U2 commercial-aligned release):
2202 - Added support for aligned allocations to the TBB memory allocator.
2203 - Added a special library to use with LD_PRELOAD on Linux* in order to
2204 replace the standard memory allocation routines in C/C++ with the
2205 TBB memory allocator.
2206 - Added null_mutex and null_rw_mutex: no-op classes interface-compliant
2207 to other TBB mutexes.
2208 - Improved performance of parallel_sort, to close most of the serial gap
2209 with std::sort, and beat it on 2 or more cores.
2210 - A few other small changes.
2212 Bugs fixed:
2214 - the problem where parallel_for hung after an exception was thrown
2215 if affinity_partitioner was used (1556).
2216 - get rid of VS warnings about mbstowcs deprecation (1560),
2217 as well as some other warnings.
2218 - operator== for concurrent_vector::iterator fixed to work correctly
2219 with different vector instances.
2221 ------------------------------------------------------------------------
2222 Intel TBB 2.1 Update 2 commercial-aligned release
2223 TBB_INTERFACE_VERSION == 3014
2225 Changes (w.r.t. Intel TBB 2.1 U1 commercial-aligned release):
2227 - Incorporates all open-source-release changes down to TBB 2.1 U1,
2229 - 20081019 addition of enumerable_thread_specific;
2230 - Warning level for Microsoft* Visual C++* compiler raised to /W4 /Wp64;
2231 warnings found on this level were cleaned or suppressed.
2232 - Added TBB_runtime_interface_version API function.
2233 - Added new example: pipeline/square.
2234 - Added exception handling and cancellation support
2235 for parallel_do and pipeline.
2236 - Added copy constructor and [begin,end) constructor to concurrent_queue.
2237 - Added some support for beta version of Intel(R) Parallel Amplifier.
2238 - Added scripts to set environment for cross-compilation of 32-bit
2239 applications on 64-bit Linux with Intel(R) C++ Compiler.
2240 - Fixed semantics of concurrent_vector::clear() to not deallocate
2241 internal arrays. Fixed compact() to perform such deallocation later.
2242 - Fixed the issue with atomic<T*> when T is incomplete type.
2243 - Improved support for PowerPC* Macintosh*, including the fix
2244 for a bug in masked compare-and-swap reported by a customer.
2245 - As usual, a number of other improvements everywhere.
2247 ------------------------------------------------------------------------
2248 20081109 open-source release
2250 Changes (w.r.t. previous open-source release):
2252 - Added new serial out of order filter for tbb::pipeline.
2253 - Fixed the issue with atomic<T*>::operator= reported at the forum.
2254 - Fixed the issue with using tbb::task::self() in task destructor
2255 reported at the forum.
2256 - A number of other improvements to code, tests, makefiles, examples
2257 and documents.
2259 Open-source contributions integrated:
2260 - Changes in the memory allocator were partially integrated.
2262 ------------------------------------------------------------------------
2263 20081019 open-source release
2265 Changes (w.r.t. previous open-source release):
2267 - Introduced enumerable_thread_specific<T>. This new class provides a
2268 wrapper around native thread local storage as well as iterators and
2269 ranges for accessing the thread local copies (1533).
2270 - Improved support for Intel(R) Threading Analysis Tools
2271 on Intel(R) 64 architecture.
2272 - Dependency on Microsoft* CRT was integrated into the libraries using
2273 manifests, to avoid issues if called from code that uses different
2274 version of Visual C++* runtime than the library.
2275 - Introduced new defines TBB_USE_ASSERT, TBB_USE_DEBUG,
2276 TBB_USE_PERFORMANCE_WARNINGS, TBB_USE_THREADING_TOOLS.
2277 - A number of other improvements to code, tests, makefiles, examples
2278 and documents.
2280 Open-source contributions integrated:
2282 - linker optimization: /incremental:no .
2284 ------------------------------------------------------------------------
2285 20080925 open-source release
2287 Changes (w.r.t. previous open-source release):
2289 - Same fix for a memory leak in the memory allocator as in TBB 2.1 U1.
2290 - Improved support for lambda functions.
2291 - Fixed more concurrent_queue issues reported at the forum.
2292 - A number of other improvements to code, tests, makefiles, examples
2293 and documents.
2295 ------------------------------------------------------------------------
2296 Intel TBB 2.1 Update 1 commercial-aligned release
2297 TBB_INTERFACE_VERSION == 3013
2299 Changes (w.r.t. Intel TBB 2.1 commercial-aligned release):
2301 - Fixed small memory leak in the memory allocator.
2302 - Incorporates all open-source-release changes since TBB 2.1,
2304 - 20080825 changes for parallel_do;
2306 ------------------------------------------------------------------------
2307 20080825 open-source release
2309 Changes (w.r.t. previous open-source release):
2311 - Added exception handling and cancellation support for parallel_do.
2312 - Added default HashCompare template argument for concurrent_hash_map.
2313 - Fixed concurrent_queue.clear() issues due to incorrect assumption
2314 about clear() being private method.
2315 - Added the possibility to use TBB in applications that change
2316 default calling conventions (Windows* only).
2317 - Many improvements to code, tests, examples, makefiles and documents.
2319 Bugs fixed:
2321 - 120, 130 - memset declaration missing in concurrent_hash_map.h
2323 ------------------------------------------------------------------------
2324 20080724 open-source release
2326 Changes (w.r.t. previous open-source release):
2328 - Inline assembly for atomic operations improved for gcc 4.3
2329 - A few more improvements to the code.
2331 ------------------------------------------------------------------------
2332 20080709 open-source release
2334 Changes (w.r.t. previous open-source release):
2336 - operator=() was added to the tbb_thread class according to
2337 the current working draft for std::thread.
2338 - Recognizing SPARC* in makefiles for Linux* and Sun Solaris*.
2340 Bugs fixed:
2342 - 127 - concurrent_hash_map::range fixed to split correctly.
2344 Open-source contributions integrated:
2346 - fix_set_midpoint.diff by jyasskin
2347 - SPARC* support in makefiles by Raf Schietekat
2349 ------------------------------------------------------------------------
2350 20080622 open-source release
2352 Changes (w.r.t. previous open-source release):
2354 - Fixed a hang that rarely happened on Linux
2355 during deinitialization of the TBB scheduler.
2356 - Improved support for Intel(R) Thread Checker.
2357 - A few more improvements to the code.
2359 ------------------------------------------------------------------------
2360 Intel TBB 2.1 commercial-aligned release
2361 TBB_INTERFACE_VERSION == 3011
2363 Changes (w.r.t. Intel TBB 2.0 U3 commercial-aligned release):
2365 - All open-source-release changes down to, and including, TBB 2.0 below,
2366 were incorporated into this release.
2368 ------------------------------------------------------------------------
2369 20080605 open-source release
2371 Changes (w.r.t. previous open-source release):
2373 - Explicit control of exported symbols by version scripts added on Linux.
2374 - Interfaces polished for exception handling & algorithm cancellation.
2375 - Cache behavior improvements in the scalable allocator.
2376 - Improvements in text_filter, polygon_overlay, and other examples.
2377 - A lot of other stability improvements in code, tests, and makefiles.
2378 - First release where binary packages include headers/docs/examples, so
2379 binary packages are now self-sufficient for using TBB.
2381 Open-source contributions integrated:
2383 - atomics patch (partially).
2384 - tick_count warning patch.
2388 - 118 - fix for boost compatibility.
2389 - 123 - fix for tbb_machine.h.
2391 ------------------------------------------------------------------------
2392 20080512 open-source release
2394 Changes (w.r.t. previous open-source release):
2396 - Fixed a problem with backward binary compatibility
2397 of debug Linux builds.
2398 - Sun* Studio* support added.
2399 - soname support added on Linux via linker script. To restore backward
2400 binary compatibility, *.so -> *.so.2 softlinks should be created.
2401 - concurrent_hash_map improvements: added a few new forms of the insert()
2402 method and fixed the preconditions and guarantees of the erase() methods.
2403 Added a runtime warning about a bad hash function being used for
2404 the container. Various performance and concurrency improvements.
2405 - Cancellation mechanism reworked so that it does not hurt scalability.
2406 - Algorithm parallel_do reworked. Requirement for Body::argument_type
2407 definition removed, and work item argument type can be arbitrarily
2409 - polygon_overlay example added.
2410 - A few more improvements to code, tests, examples and Makefiles.
2412 Open-source contributions integrated:
2414 - Soname support patch for Bugzilla #112.
2418 - 112 - fix for soname support.
2420 ------------------------------------------------------------------------
2421 Intel TBB 2.0 U3 commercial-aligned release (package 017, April 20, 2008)
2423 Corresponds to commercial 019 (for Linux*, 020; for Mac OS* X, 018)
2426 Changes (w.r.t. Intel TBB 2.0 U2 commercial-aligned release):
2428 - Does not contain open-source-release changes below; this release is
2429 only a minor update of TBB 2.0 U2.
2430 - Removed spin-waiting in pipeline and concurrent_queue.
2431 - A few more small bug fixes from open-source releases below.
2433 ------------------------------------------------------------------------
2434 20080408 open-source release
2436 Changes (w.r.t. previous open-source release):
2438 - count_strings example reworked: new word generator implemented, hash
2439 function replaced, and tbb_allocator is used with std::string class.
2440 - Static methods of spin_rw_mutex were replaced by normal member
2441 functions, and the class name was versioned.
2442 - tacheon example was renamed to tachyon.
2443 - Improved support for Intel(R) Thread Checker.
2444 - A few more minor improvements.
2446 Open-source contributions integrated:
2448 - Two sets of Sun patches for IA Solaris support.
2450 ------------------------------------------------------------------------
2451 20080402 open-source release
2453 Changes (w.r.t. previous open-source release):
2455 - Exception handling and cancellation support for tasks and algorithms
2457 - Exception safety guarantees defined and fixed for all concurrent
2459 - User-defined memory allocator support added to all concurrent
2461 - Performance improvement of concurrent_hash_map, spin_rw_mutex.
2462 - Critical fix for a rare race condition during scheduler
2463 initialization/de-initialization.
2464 - New methods added for concurrent containers to be closer to STL,
2465 as well as automatic filters removal from pipeline
2466 and __TBB_AtomicAND function.
2467 - The volatile keyword dropped from where it is not really needed.
2468 - A few more minor improvements.
2470 ------------------------------------------------------------------------
2471 20080319 open-source release
2473 Changes (w.r.t. previous open-source release):
2475 - Support for gcc version 4.3 was added.
2476 - tbb_thread class, nearly compatible with std::thread expected in C++0x,
2481 - 116 - fix for compilation issues with gcc version 4.2.1.
2482 - 120 - fix for compilation issues with gcc version 4.3.
2484 ------------------------------------------------------------------------
2485 20080311 open-source release
2487 Changes (w.r.t. previous open-source release):
2489 - An enumerator added for pipeline filter types (serial vs. parallel).
2490 - New task_scheduler_observer class introduced, to observe when
2491 threads start and finish interacting with the TBB task scheduler.
2492 - task_scheduler_init reverted to not use internal versioned class;
2493 binary compatibility guaranteed with stable releases only.
2494 - Various improvements to code, tests, examples and Makefiles.
2496 ------------------------------------------------------------------------
2497 20080304 open-source release
2499 Changes (w.r.t. previous open-source release):
2501 - Task-to-thread affinity support, previously kept under a macro,
2502 now fully legalized.
2503 - Work-in-progress on cache_aligned_allocator improvements.
2504 - Pipeline now truly supports a parallel input stage; it is no longer
serialized.
2505 - Various improvements to code, tests, examples and Makefiles.
2509 - 119 - fix for scalable_malloc sometimes failing to return a big block.
2510 - TR575 - fixed a deadlock occurring on Windows in startup/shutdown
2511 under some conditions.
2513 ------------------------------------------------------------------------
2514 20080226 open-source release
2516 Changes (w.r.t. previous open-source release):
2518 - Introduced tbb_allocator to select between standard allocator and
2519 tbb::scalable_allocator when available.
2520 - Removed spin-waiting in pipeline and concurrent_queue.
2521 - Improved performance of concurrent_hash_map by using tbb_allocator.
2522 - Improved support for Intel(R) Thread Checker.
2523 - Various improvements to code, tests, examples and Makefiles.
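
The removal of spin-waiting noted above replaces busy polling with a blocking
wait. The sketch below is not TBB's implementation; it only illustrates the
general idea using the C++ standard library. The class name BlockingQueue is
hypothetical. A consumer that finds the queue empty sleeps on a condition
variable instead of spinning, and a producer wakes it after pushing.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical queue whose pop() blocks in the OS instead of spin-waiting.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        // Wake one sleeping consumer, if any.
        not_empty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate form re-checks the condition after every wakeup,
        // so spurious wakeups are handled without a hand-written loop.
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

The trade-off is the usual one: blocking frees the core for other work while
the queue is empty, at the cost of a wakeup latency that a spinning consumer
would not pay.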
2525 ------------------------------------------------------------------------
2526 Intel TBB 2.0 U2 commercial-aligned release (package 017, February 14, 2008)
2528 Corresponds to commercial 017 (for Linux*, 018; for Mac OS* X, 016)
2531 Changes (w.r.t. Intel TBB 2.0 U1 commercial-aligned release):
2533 - Does not contain open-source-release changes below; this release is
2534 only a minor update of TBB 2.0 U1.
2535 - Add support for Microsoft* Visual Studio* 2008, including binary
2536 libraries and VS2008 projects for examples.
2537 - Use SwitchToThread() not Sleep() to yield threads on Windows*.
2538 - Enhancements to Doxygen-readable comments in source code.
2539 - A few more small bug fixes from open-source releases below.
2543 - TR569 - Memory leak in concurrent_queue.
2545 ------------------------------------------------------------------------
2546 20080207 open-source release
2548 Changes (w.r.t. previous open-source release):
2550 - Improvements and minor fixes in VS2008 projects for examples.
2551 - Improvements in code for gating worker threads that wait for work,
2552 previously consolidated under #if IMPROVED_GATING, now legalized.
2553 - Cosmetic changes in code, examples, tests.
2557 - 113 - Iterators and ranges should be convertible to their const
2559 - TR569 - Memory leak in concurrent_queue.
2561 ------------------------------------------------------------------------
2562 20080122 open-source release
2564 Changes (w.r.t. previous open-source release):
2566 - Updated examples/parallel_for/seismic to improve the visuals and to
2567 use the affinity_partitioner (20071127 and forward) for better
2569 - Minor improvements to unittests and performance tests.
2571 ------------------------------------------------------------------------
2572 20080115 open-source release
2574 Changes (w.r.t. previous open-source release):
2576 - Cleanup, simplifications and enhancements to the Makefiles for
2577 building the libraries (see build/index.html for high-level
2578 changes) and the examples.
2579 - Use SwitchToThread() not Sleep() to yield threads on Windows*.
2580 - Engineering work-in-progress on exception safety/support.
2581 - Engineering work-in-progress on affinity_partitioner for
2583 - Engineering work-in-progress on improved gating for worker threads
2584 (idle workers now block in the OS instead of spinning).
2585 - Enhancements to Doxygen-readable comments in source code.
2589 - 102 - Support for parallel build with gmake -j
2590 - 114 - /Wp64 build warning on Windows*.
2592 ------------------------------------------------------------------------
2593 20071218 open-source release
2595 Changes (w.r.t. previous open-source release):
2597 - Full support for Microsoft* Visual Studio* 2008 in open-source.
2598 Binaries for vc9/ will be available in future stable releases.
2599 - New recursive_mutex class.
2600 - Full support for 32-bit PowerMac including export files for builds.
2601 - Improvements to parallel_do.
2603 ------------------------------------------------------------------------
2604 20071206 open-source release
2606 Changes (w.r.t. previous open-source release):
2608 - Support for Microsoft* Visual Studio* 2008 in building libraries
2609 from source as well as in vc9/ projects for examples.
2610 - Small fixes to the affinity_partitioner first introduced in 20071127.
2611 - Small fixes to the thread-stack size hook first introduced in 20071127.
2612 - Engineering work in progress on concurrent_vector.
2613 - Engineering work in progress on exception behavior.
2614 - Unittest improvements.
2616 ------------------------------------------------------------------------
2617 20071127 open-source release
2619 Changes (w.r.t. previous open-source release):
2621 - Task-to-thread affinity support (affinity partitioner) first appears.
2622 - More work on concurrent_vector.
2623 - New parallel_do algorithm (function-style version of parallel while)
2624 and parallel_do/parallel_preorder example.
2625 - New task_scheduler_init() hooks for getting default_num_threads() and
2626 for setting thread stack size.
2627 - Support for weak memory consistency models in the code base.
2628 - Futex usage in the task scheduler (Linux).
2629 - Started adding 32-bit PowerMac support.
2630 - Intel(R) 9.1 compilers are now the base supported Intel(R) compiler
2632 - TBB libraries added to link line automatically on Microsoft Windows*
2633 systems via #pragma comment linker directives.
2635 Open-source contributions integrated:
2637 - FreeBSD platform support patches.
2638 - AIX weak memory model patch.
2642 - 108 - Removed broken affinity.h reference.
2643 - 101 - Does not build on Debian Lenny (replaced arch with uname -m).
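
The parallel_do entry above describes a function-style loop over a work list
that can grow while it is being processed. As a rough serial stand-in for
that pattern (not the TBB algorithm, and with no parallelism at all), the
sketch below drains a list of items and hands the body a callback through
which it can add more work. The names serial_do, Feeder, and halving_chain
are hypothetical.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical callback type a body uses to add more work items.
template <typename Item>
using Feeder = std::function<void(Item)>;

// Serial stand-in for a "do" loop over a growing work list: process items
// in FIFO order until none remain, letting the body feed new ones.
template <typename Item, typename Body>
void serial_do(const std::vector<Item>& initial, Body body) {
    std::deque<Item> pending(initial.begin(), initial.end());
    Feeder<Item> feed = [&pending](Item extra) {
        pending.push_back(std::move(extra));
    };
    while (!pending.empty()) {
        Item item = std::move(pending.front());
        pending.pop_front();
        body(item, feed);
    }
}

// Example body: visit n, then feed n / 2 as new work while n > 1.
inline std::vector<int> halving_chain(int start) {
    std::vector<int> visited;
    serial_do<int>({start}, [&visited](int n, const Feeder<int>& feed) {
        visited.push_back(n);
        if (n > 1) feed(n / 2);
    });
    return visited;
}
```

Starting from 8, the body visits 8, 4, 2, and 1, each visit having been
scheduled by the previous one.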
2645 ------------------------------------------------------------------------
2646 20071030 open-source release
2648 Changes (w.r.t. previous open-source release):
2650 - More work on concurrent_vector.
2651 - Better support for building with -Wall -Werror (or not) as desired.
2652 - A few fixes to eliminate extraneous warnings.
2653 - Begin introduction of versioning hooks so that the internal/API
2654 version is tracked via TBB_INTERFACE_VERSION. The newest binary
2655 libraries should always work with previously-compiled code when-
2657 - Engineering work in progress on using futex inside the mutexes (Linux).
2658 - Engineering work in progress on exception behavior.
2659 - Engineering work in progress on a new parallel_do algorithm.
2660 - Unittest improvements.
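
Because the interface version is exposed as the TBB_INTERFACE_VERSION macro,
client code can gate features on it at compile time. The sketch below shows
one way to do that. In real code the macro comes from the TBB headers; the
fallback definition here is an assumption so that the example compiles
without TBB, and 3011 is simply the value this file lists for TBB 2.1.
MYAPP_MIN_TBB_INTERFACE, kHasRequiredTbbInterface, and interface_at_least
are hypothetical names.

```cpp
#include <cassert>

// Assumption: stand-in value so the sketch builds without TBB installed.
#ifndef TBB_INTERFACE_VERSION
#define TBB_INTERFACE_VERSION 3011
#endif

// Hypothetical minimum interface version an application depends on.
#define MYAPP_MIN_TBB_INTERFACE 3011

// Compile-time gate: choose a code path based on the interface version.
#if TBB_INTERFACE_VERSION >= MYAPP_MIN_TBB_INTERFACE
constexpr bool kHasRequiredTbbInterface = true;
#else
constexpr bool kHasRequiredTbbInterface = false;
#endif

// The same comparison as a function, for checks written in ordinary code.
constexpr bool interface_at_least(int required) {
    return TBB_INTERFACE_VERSION >= required;
}
```

A preprocessor check is needed when the gated code would not compile against
older headers; the constexpr form is enough when only behavior differs.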
2662 ------------------------------------------------------------------------
2663 20070927 open-source release
2665 Changes (w.r.t. Intel TBB 2.0 U1 commercial-aligned release):
2667 - Minor update to TBB 2.0 U1 below.
2668 - Begin introduction of new concurrent_vector interfaces not released
2671 ------------------------------------------------------------------------
2672 Intel TBB 2.0 U1 commercial-aligned release (package 014, October 1, 2007)
2674 Corresponds to commercial 014 (for Linux*, 016) packages.
2676 Changes (w.r.t. Intel TBB 2.0 commercial-aligned release):
2678 - All open-source-release changes down to, and including, TBB 2.0
2679 below, were incorporated into this release.
2680 - Made a number of changes to the officially supported OS list:
2682 Asianux* 3, Debian* 4.0, Fedora Core* 6, Fedora* 7,
2683 Turbo Linux* 11, Ubuntu* 7.04;
2685 Asianux* 2, Fedora Core* 4, Haansoft* Linux 2006 Server,
2686 Mandriva/Mandrake* 10.1, Miracle Linux* 4.0,
2687 Red Flag* DC Server 5.0;
2688 Only Mac OS* X 10.4.9 (and forward) and Xcode* tool suite 2.4.1 (and
2689 forward) are now supported.
2690 - Commercial installers on Linux* fixed to recommend the correct
2691 binaries to use in more cases, with fewer unnecessary warnings.
2692 - Changes to eliminate spurious build warnings.
2694 Open-source contributions integrated:
2696 - Two small header guard macro patches; it also fixed bug #94.
2697 - New blocked_range3d class.
2701 - 93 - Removed misleading comments in task.h.
2704 ------------------------------------------------------------------------
2705 20070815 open-source release
2709 - Changes to eliminate spurious build warnings.
2710 - Engineering work in progress on concurrent_vector allocator behavior.
2711 - Added hooks to use the Intel(R) compiler code coverage tools.
2713 Open-source contributions integrated:
2715 - Mac OS* X build warning patch.
2719 - 88 - Fixed TBB compilation errors if both VS2005 and Windows SDK are
2722 ------------------------------------------------------------------------
2723 20070719 open-source release
2727 - Minor update to TBB 2.0 commercial-aligned release below.
2728 - Changes to eliminate spurious build warnings.
2730 ------------------------------------------------------------------------
2731 Intel TBB 2.0 commercial-aligned release (package 010, July 19, 2007)
2733 Corresponds to commercial 010 (for Linux*, 012) packages.
2735 - TBB open-source debut release.
2737 ------------------------------------------------------------------------
2738 Intel TBB 1.1 commercial release (April 10, 2007)
2740 Changes (w.r.t. Intel TBB 1.0 commercial release):
2742 - Added auto_partitioner, which estimates the best granularity for tasks
2743 automatically, as an alternative to specifying a grain size parameter.
2744 - The release was added to the Intel(R) C++ Compiler 10.0 Pro.
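
For context on the grain size that auto_partitioner makes optional: a grain
size is the threshold at or below which a range is no longer split into
smaller tasks. The sketch below is a serial illustration of that manual
approach, assuming simple midpoint splitting; it is not TBB code, and the
names Range and split_by_grain are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open index range [first, second).
using Range = std::pair<std::size_t, std::size_t>;

// Recursively halve [begin, end) until each piece holds at most `grain`
// elements, collecting the resulting leaf ranges in order.
inline void split_by_grain(std::size_t begin, std::size_t end,
                           std::size_t grain, std::vector<Range>& out) {
    assert(grain > 0 && begin <= end);
    if (end - begin <= grain) {
        if (begin != end) out.push_back(Range(begin, end));
        return;
    }
    const std::size_t mid = begin + (end - begin) / 2;
    split_by_grain(begin, mid, grain, out);
    split_by_grain(mid, end, grain, out);
}
```

Splitting [0, 100) with a grain size of 30 yields four ranges of 25 elements
each. Picking that number by hand is the tuning step an automatic
partitioner is meant to remove.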
2746 ------------------------------------------------------------------------
2747 Intel TBB 1.0 Update 2 commercial release
2749 Changes (w.r.t. Intel TBB 1.0 Update 1 commercial release):
2751 - Mac OS* X 64-bit support added.
2752 - Source packages for commercial releases introduced.
2754 ------------------------------------------------------------------------
2755 Intel TBB 1.0 Update 1 commercial-aligned release
2757 Changes (w.r.t. Intel TBB 1.0 commercial release):
2759 - Fix for critical package issue on Mac OS* X.
2761 ------------------------------------------------------------------------
2762 Intel TBB 1.0 commercial release (August 29, 2006)
2764 Changes (w.r.t. Intel TBB 1.0 beta commercial release):
2766 - New namespace (and compatibility headers for old namespace).
2767 Namespaces are tbb and tbb::internal, and all classes now use
2768 underscore_style naming instead of WindowsStyle.
2769 - New class: scalable_allocator (and cache_aligned_allocator using that
2771 - Added parallel_for/tacheon example.
2772 - Removed C-style casts from headers for better C++ compliance.
2774 - Documentation improvements.
2775 - Improved performance of the concurrent_hash_map class.
2776 - Upgraded parallel_sort() to support STL-style random-access iterators
2777 instead of just pointers.
2778 - The Windows vs7_1 directories renamed to vs7.1 in examples.
2779 - New class: spin version of reader-writer lock.
2780 - Added push_back() interface to concurrent_vector.
2782 ------------------------------------------------------------------------
2783 Intel TBB 1.0 beta commercial release
2789 - Concurrent containers: ConcurrentHashTable, ConcurrentVector,
2791 - Parallel algorithms: ParallelFor, ParallelReduce, ParallelScan,
2792 ParallelWhile, Pipeline, ParallelSort.
2793 - Support: AlignedSpace, BlockedRange (i.e., 1D), BlockedRange2D
2794 - Task scheduler with multi-master support.
2795 - Atomics: read, write, fetch-and-store, fetch-and-add, compare-and-swap.
2796 - Locks: spin, reader-writer, queuing, OS-wrapper.
2797 - Memory allocation: STL-style memory allocator that avoids false
2802 - Intel(R) Thread Checker 3.0.
2803 - Intel(R) Thread Profiler 3.0.
2806 - First Use Documents: README.txt, INSTALL.txt, Release_Notes.txt,
2807 Doc_Index.html, Getting_Started.pdf, Tutorial.pdf, Reference.pdf.
2808 - Class hierarchy HTML pages (Doxygen).
2809 - Tree of index.html pages for navigating the installed package, esp.
2813 - One for each of these TBB features: ConcurrentHashTable, ParallelFor,
2814 ParallelReduce, ParallelWhile, Pipeline, Task.
2815 - Live copies of examples from Getting_Started.pdf.
2816 - TestAll example that exercises every class and header in the package
2817 (i.e., a "liveness test").
2818 - Compilers: see Release_Notes.txt.
2819 - APIs: OpenMP, WinThreads, Pthreads.
2822 - Package for Windows installs IA-32 and EM64T bits.
2823 - Package for Linux installs IA-32, EM64T and IPF bits.
2824 - Package for Mac OS* X installs IA-32 bits.
2825 - All packages support Intel(R) software setup assistant (ISSA) and
2826 install-time FLEXlm license checking.
2827 - ISSA support allows license file to be specified directly in case of
2828 no Internet connection or problems with IRC or serial #s.
2829 - Linux installer allows root or non-root, RPM or non-RPM installs.
2830 - FLEXlm license servers (for those who need floating/counted licenses)
2831 are provided separately on Intel(R) Premier.
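
The atomic operations listed for this release (fetch-and-store,
fetch-and-add, compare-and-swap) have direct counterparts in the C++
standard library that arrived later. The sketch below expresses them with
std::atomic purely as an illustration of their semantics. The free-function
wrappers are hypothetical and are not the TBB interface.

```cpp
#include <atomic>
#include <cassert>

// fetch-and-store: write a new value and return the previous one.
inline int fetch_and_store(std::atomic<int>& a, int value) {
    return a.exchange(value);
}

// fetch-and-add: add to the value and return what it was before the add.
inline int fetch_and_add(std::atomic<int>& a, int addend) {
    return a.fetch_add(addend);
}

// compare-and-swap: store new_value only if the current value equals
// comparand; in either case return the value observed before the attempt.
inline int compare_and_swap(std::atomic<int>& a, int new_value, int comparand) {
    int observed = comparand;
    // On failure, compare_exchange_strong overwrites `observed` with the
    // current value; on success it is left equal to the old value.
    a.compare_exchange_strong(observed, new_value);
    return observed;
}
```

A caller can tell whether compare_and_swap succeeded by checking that the
returned value equals the comparand it passed in.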
2833 ------------------------------------------------------------------------
2834 Intel, the Intel logo, Xeon, Intel Xeon Phi, and Cilk are registered
2835 trademarks or trademarks of Intel Corporation or its subsidiaries in
2836 the United States and other countries.
2838 * Other names and brands may be claimed as the property of others.