From b6001a2ee342f7b8e2c7f8d92e3487b82653b3ec Mon Sep 17 00:00:00 2001
From: Martin Kroeker
Date: Sun, 19 Dec 2021 14:34:14 +0100
Subject: [PATCH] Update with 0.3.19 changes

---
 Changelog.txt | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/Changelog.txt b/Changelog.txt
index 59fe1d4..180f7ad 100644
--- a/Changelog.txt
+++ b/Changelog.txt
@@ -1,5 +1,52 @@
 OpenBLAS ChangeLog
 ====================================================================
+Version 0.3.19
+ 19-Dec-2021
+
+ general:
+  - reverted unsafe TRSV/ZRSV optimizations introduced in 0.3.16
+  - fixed a potential thread race in the thread buffer reallocation routines
+    that were introduced in 0.3.18
+  - fixed miscounting of thread pool size on Linux with OMP_PROC_BIND=TRUE
+  - fixed CBLAS interfaces for CSROT/ZSROT and CROTG/ZROTG
+  - made automatic library suffix for CMAKE builds with INTERFACE64 available
+    to CBLAS-only builds
+
+x86_64:
+  - DYNAMIC_ARCH builds now fall back to the cpu with most similar capabilities
+    when an unknown CPUID is encountered, instead of defaulting to Prescott
+  - added cpu detection for Intel Alder Lake
+  - added cpu detection for Intel Sapphire Rapids
+  - added an optimized SBGEMM kernel for Sapphire Rapids
+  - fixed DYNAMIC_ARCH builds on OSX with CMAKE
+  - worked around DYNAMIC_ARCH builds made on Sandybridge failing on SkylakeX
+  - fixed missing thread initialization for static builds on Windows/MSVC
+  - fixed an excessive read in ZSYMV
+
+POWER:
+  - added support for POWER10 in big-endian mode
+  - added support for building with CMAKE
+  - added optimized SGEMM and DGEMM kernels for small matrix sizes
+
+ARMV8:
+  - added basic support and cputype detection for Fujitsu A64FX
+  - added a generic ARMV8SVE target
+  - added SVE-enabled SGEMM and DGEMM kernels for ARMV8SVE and A64FX
+  - added optimized CGEMM and ZGEMM kernels for Cortex A53 and A55 cpus
+  - fixed cpuid detection for Apple M1 and improved performance
+  - improved compiler flag setting in CMAKE builds
+
+RISCV64:
+  - fixed improper initialization in CSCAL/ZSCAL for strided access patterns
+
+MIPS:
+  - added a GENERIC target for MIPS32
+  - added support for cross-compiling to MIPS32 on x86_64 using CMAKE
+
+MIPS64:
+  - fixed misdetection of MSA capability
+
+====================================================================
 Version 0.3.18
  02-Oct-2021
 
-- 
2.7.4