Martin Kroeker [Thu, 31 Jan 2019 18:10:03 +0000 (19:10 +0100)]
Merge pull request #1991 from maamountki/z14
[ZARCH] Z14 Support, BLAS 1/2 single precision implementations
maamountki [Thu, 31 Jan 2019 17:36:41 +0000 (19:36 +0200)]
Merge branch 'develop' into z14
maamountki [Thu, 31 Jan 2019 17:11:11 +0000 (19:11 +0200)]
[ZARCH] Add Z13 version for max/min functions
maamountki [Thu, 31 Jan 2019 16:52:11 +0000 (18:52 +0200)]
[ZARCH] Improve loading performance for camax/icamax
Martin Kroeker [Thu, 31 Jan 2019 14:27:21 +0000 (15:27 +0100)]
Fix wrong comparison that made IMIN identical to IMAX
as reported by aarnez in #1990
Martin Kroeker [Thu, 31 Jan 2019 14:25:15 +0000 (15:25 +0100)]
Fix wrong comparison that made IMIN identical to IMAX
as suggested in #1990
maamountki [Thu, 31 Jan 2019 07:26:50 +0000 (09:26 +0200)]
Remove ztest
maamountki [Tue, 29 Jan 2019 15:59:38 +0000 (17:59 +0200)]
[ZARCH] Fix bug in max/min functions
maamountki [Tue, 29 Jan 2019 01:47:49 +0000 (03:47 +0200)]
[ZARCH] Fix icamax/icamin
maamountki [Mon, 28 Jan 2019 15:52:23 +0000 (17:52 +0200)]
[ZARCH] Fix iamax/imax single precision
maamountki [Mon, 28 Jan 2019 15:32:24 +0000 (17:32 +0200)]
[ZARCH] Undo the last commit
maamountki [Mon, 28 Jan 2019 15:16:18 +0000 (17:16 +0200)]
[ZARCH] Fix bug in iamax/iamin/imax/imin
Martin Kroeker [Mon, 28 Jan 2019 14:44:57 +0000 (15:44 +0100)]
Merge pull request #1985 from martin-frbg/issue1984
Correct naming of getrf_parallel object
Martin Kroeker [Mon, 28 Jan 2019 14:44:42 +0000 (15:44 +0100)]
Merge pull request #1981 from edisongustavo/develop
Fix include directory of exported targets
Martin Kroeker [Mon, 28 Jan 2019 14:43:35 +0000 (15:43 +0100)]
Merge pull request #1978 from danielgindi/feature/msvc_cmake
Better support for MSVC/Windows in CMake (v0.3.x)
Martin Kroeker [Mon, 28 Jan 2019 14:42:57 +0000 (15:42 +0100)]
Merge pull request #1962 from brada4/r
Modernize R benchmarks slightly
Martin Kroeker [Sat, 26 Jan 2019 21:25:29 +0000 (22:25 +0100)]
Merge pull request #1987 from martin-frbg/issue1961
Change ARMV8 target with BINARY=32 to ARMV7 automatically
Martin Kroeker [Sat, 26 Jan 2019 16:52:33 +0000 (17:52 +0100)]
Change ARMV8 target to ARMV7 for BINARY=32
Martin Kroeker [Sat, 26 Jan 2019 16:47:22 +0000 (17:47 +0100)]
Change ARMV8 target to ARMV7 when BINARY32 is set
fixes #1961
Martin Kroeker [Fri, 25 Jan 2019 23:45:45 +0000 (00:45 +0100)]
Correct naming of getrf_parallel object
fixes #1984
Martin Kroeker [Thu, 24 Jan 2019 08:17:48 +0000 (09:17 +0100)]
Merge pull request #1971 from martin-frbg/trsm-threshold
Shift transition to multithreading towards larger matrix sizes
Edison Gustavo Muenz [Wed, 23 Jan 2019 14:09:13 +0000 (15:09 +0100)]
Fix include directory of exported targets
Martin Kroeker [Wed, 23 Jan 2019 09:03:00 +0000 (10:03 +0100)]
Avoid penalizing tall skinny matrices
Martin Kroeker [Tue, 22 Jan 2019 20:10:38 +0000 (21:10 +0100)]
Merge pull request #1980 from martin-frbg/issue1979
Report SkylakeX as Haswell if compiler does not support AVX512
Martin Kroeker [Tue, 22 Jan 2019 17:55:43 +0000 (18:55 +0100)]
Syntax fix
Martin Kroeker [Tue, 22 Jan 2019 17:47:12 +0000 (18:47 +0100)]
Report SkylakeX as Haswell if compiler does not support AVX512
... or make was invoked with NO_AVX512=1
Daniel Cohen Gindi [Tue, 22 Jan 2019 12:38:01 +0000 (14:38 +0200)]
Adjust test script for correct deployment
Martin Kroeker [Tue, 22 Jan 2019 11:32:24 +0000 (12:32 +0100)]
Use VERSION_LESS for comparisons involving software version numbers
Daniel Cohen Gindi [Mon, 21 Jan 2019 06:35:23 +0000 (08:35 +0200)]
Better support for MSVC/Windows in CMake
maamountki [Mon, 21 Jan 2019 13:56:04 +0000 (15:56 +0200)]
[ZARCH] Update max/min functions
Martin Kroeker [Sun, 20 Jan 2019 19:30:11 +0000 (20:30 +0100)]
Merge pull request #1973 from martin-frbg/issue1464
Increase Zen SWITCH_RATIO to 16
Martin Kroeker [Sun, 20 Jan 2019 11:18:53 +0000 (12:18 +0100)]
Fix compilation with NO_AVX=1 set
fixes #1974
Martin Kroeker [Sat, 19 Jan 2019 22:01:31 +0000 (23:01 +0100)]
Increase Zen SWITCH_RATIO to 16
following GEMM benchmarks on Ryzen2700X. For #1464
Martin Kroeker [Fri, 18 Jan 2019 23:10:01 +0000 (00:10 +0100)]
Shift transition to multithreading towards larger matrix sizes
See #1886 and JuliaRobotics issue 500. trsm benchmarks on Haswell and Zen showed that with these values performance is roughly doubled for matrix sizes between 8x8 and 14x14, and still 10 to 20 percent better near the new cutoff at 32x32.
Martin Kroeker [Fri, 18 Jan 2019 07:11:39 +0000 (08:11 +0100)]
Fix declaration of input arguments in the Sandybridge GER microkernels (#1967)
* Tag arguments 0 and 1 as both input and output
Martin Kroeker [Fri, 18 Jan 2019 07:11:07 +0000 (08:11 +0100)]
Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966)
* Tag arguments 0 and 1 as both input and output (see #1964)
Martin Kroeker [Thu, 17 Jan 2019 22:20:32 +0000 (23:20 +0100)]
Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965)
* Tag operands 0 and 1 as both input and output
For #1964 (basically a continuation of coding problems first seen in #1292)
Martin Kroeker [Thu, 17 Jan 2019 15:42:11 +0000 (16:42 +0100)]
Merge pull request #1970 from quickwritereader/develop
crot fix
Martin Kroeker [Thu, 17 Jan 2019 15:19:03 +0000 (16:19 +0100)]
Bump xcode version to 10.1 to make sure it handles AVX512
Ubuntu [Thu, 17 Jan 2019 14:45:31 +0000 (14:45 +0000)]
crot fix
Martin Kroeker [Wed, 16 Jan 2019 17:41:03 +0000 (18:41 +0100)]
Merge pull request #1963 from quickwritereader/develop
Blas1 single missing kernels implemented with vector builtins
Abdelrauf [Wed, 16 Jan 2019 15:25:13 +0000 (19:25 +0400)]
Merge branch 'develop' into develop
Ubuntu [Wed, 16 Jan 2019 15:16:21 +0000 (15:16 +0000)]
Added missing Blas1 single fp {saxpy, caxpy, cdot, crot (refactored version of srot), isamax, isamin, icamax, icamin},
Fixed idamin, icamin choosing the first occurrence index of equal minima
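The tie-breaking rule mentioned in this entry (report the first index when several elements share the smallest magnitude) can be illustrated with a minimal scalar sketch. This is not the vectorized kernel from the commit: `ref_isamin` is a hypothetical name, the stride argument of the real BLAS routine is left out, and the 1-based return value with 0 for an empty vector is assumed here to follow the usual BLAS convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (illustration only, not OpenBLAS code).
 * Returns the 1-based index of the first element with the smallest
 * absolute value, or 0 when n == 0. Unit stride is assumed. */
size_t ref_isamin(size_t n, const float *x)
{
    if (n == 0)
        return 0;
    assert(x != NULL);

    size_t best = 0;
    float best_abs = x[0] < 0.0f ? -x[0] : x[0];

    for (size_t i = 1; i < n; i++) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        /* Strict '<' keeps the earliest index when magnitudes are equal;
         * using '<=' here would return the last occurrence instead. */
        if (a < best_abs) {
            best_abs = a;
            best = i;
        }
    }
    return best + 1;
}
```

For the vector {3, -1, 2, 1, -1} the smallest magnitude 1 occurs at positions 2, 4 and 5, and the sketch reports position 2.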
Andrew [Wed, 16 Jan 2019 09:54:22 +0000 (11:54 +0200)]
disable NaN checks before BLAS calls dgemm.R
Andrew [Wed, 16 Jan 2019 09:41:46 +0000 (11:41 +0200)]
disable NaN checks before BLAS calls deig.R (shorten matrix def)
Andrew [Wed, 16 Jan 2019 09:38:14 +0000 (11:38 +0200)]
disable NaN checks before BLAS calls deig.R
Andrew [Wed, 16 Jan 2019 09:34:46 +0000 (11:34 +0200)]
disable NaN checks before BLAS calls dsolve.R (shorter formula)
Martin Kroeker [Wed, 16 Jan 2019 09:27:14 +0000 (10:27 +0100)]
Merge pull request #1960 from cnjsdfcy/Hygon
Add support for Hygon Dhyana
Andrew [Wed, 16 Jan 2019 09:23:51 +0000 (11:23 +0200)]
disable NaN checks before BLAS calls dsolve.R (shorter config part)
Andrew [Wed, 16 Jan 2019 09:18:54 +0000 (11:18 +0200)]
disable NaN checks before BLAS calls dsolve.R
Andrew [Wed, 16 Jan 2019 07:51:29 +0000 (09:51 +0200)]
init
caiyu [Wed, 16 Jan 2019 06:25:19 +0000 (14:25 +0800)]
Add support for Hygon Dhyana
maamountki [Tue, 15 Jan 2019 19:04:22 +0000 (21:04 +0200)]
[ZARCH] fix a bug in max/min functions
Martin Kroeker [Mon, 14 Jan 2019 21:41:31 +0000 (22:41 +0100)]
Fix missing braces in support_avx() call
Martin Kroeker [Mon, 14 Jan 2019 21:38:32 +0000 (22:38 +0100)]
Fix missing braces in support_avx()
maamountki [Fri, 11 Jan 2019 15:43:11 +0000 (17:43 +0200)]
[ZARCH] Update dgemv_n_4.c
maamountki [Fri, 11 Jan 2019 15:39:17 +0000 (17:39 +0200)]
[ZARCH] update cgemv_n_4.c
maamountki [Fri, 11 Jan 2019 15:37:11 +0000 (17:37 +0200)]
[ZARCH] Update cgemv_t_4.c
maamountki [Fri, 11 Jan 2019 15:14:04 +0000 (17:14 +0200)]
Update sgemv_t_4.c
maamountki [Fri, 11 Jan 2019 15:13:02 +0000 (17:13 +0200)]
Update dgemv_t_4.c
maamountki [Fri, 11 Jan 2019 15:08:24 +0000 (17:08 +0200)]
[ZARCH] fix sgemv_n_4.c
maamountki [Fri, 11 Jan 2019 14:44:46 +0000 (16:44 +0200)]
[ZARCH] fix cgemv_n_4.c
Martin Kroeker [Thu, 10 Jan 2019 11:04:08 +0000 (12:04 +0100)]
Merge pull request #1957 from martin-frbg/issue1954
Move TLS key deletion to openblas_quit
Martin Kroeker [Wed, 9 Jan 2019 23:32:50 +0000 (00:32 +0100)]
Move TLS key deletion to openblas_quit
fixes #1954 (as suggested by thrasibule in that issue)
maamountki [Wed, 9 Jan 2019 14:50:07 +0000 (16:50 +0200)]
[ZARCH] fix data prefetch type in sdot
maamountki [Wed, 9 Jan 2019 14:49:44 +0000 (16:49 +0200)]
[ZARCH] fix data prefetch type in ddot
maamountki [Wed, 9 Jan 2019 14:33:54 +0000 (16:33 +0200)]
[ZARCH] fix dsdot.c
maamountki [Wed, 9 Jan 2019 05:43:45 +0000 (07:43 +0200)]
[ZARCH] fix cgemv_n_4.c
Martin Kroeker [Tue, 8 Jan 2019 19:44:08 +0000 (20:44 +0100)]
Merge pull request #1949 from martin-frbg/issue1947
Query AVX2 and AVX512VL support when selecting x86 kernels
Martin Kroeker [Tue, 8 Jan 2019 13:43:45 +0000 (14:43 +0100)]
Bump xcode to 8.3
Martin Kroeker [Tue, 8 Jan 2019 13:41:48 +0000 (14:41 +0100)]
Update OSX environment to Sierra
as homebrew seems to have dropped support for El Capitan in their gcc packages
Martin Kroeker [Tue, 8 Jan 2019 09:46:47 +0000 (10:46 +0100)]
Add travis_wait to the OSX brew install phase
Martin Kroeker [Sat, 5 Jan 2019 18:41:13 +0000 (19:41 +0100)]
Add message for SkylakeX and KNL fallbacks to Haswell
Martin Kroeker [Sat, 5 Jan 2019 17:08:02 +0000 (18:08 +0100)]
Add xcr0 (os support) check
Martin Kroeker [Sat, 5 Jan 2019 17:07:14 +0000 (18:07 +0100)]
Add xcr0 (os support) check
Martin Kroeker [Sat, 5 Jan 2019 15:58:56 +0000 (16:58 +0100)]
Query AVX2 and AVX512VL capability in x86 cpu detection
Martin Kroeker [Sat, 5 Jan 2019 15:55:33 +0000 (16:55 +0100)]
Query AVX2 and AVX512 capability for runtime cpu selection
maamountki [Fri, 4 Jan 2019 15:45:56 +0000 (17:45 +0200)]
[ZARCH] fix cgemv_n_4.c
Martin Kroeker [Fri, 4 Jan 2019 00:37:37 +0000 (01:37 +0100)]
Merge pull request #1946 from martin-frbg/issue1908
More fixes for cross-compiling ARM64 targets
maamountki [Thu, 3 Jan 2019 23:38:18 +0000 (01:38 +0200)]
[ZARCH] fix sgemv_t_4.c
Martin Kroeker [Thu, 3 Jan 2019 21:17:31 +0000 (22:17 +0100)]
More fixes for cross-compiling ARM64 targets
Fixed core naming for DYNAMIC_ARCH. Corrected GEMM_DEFAULT entries and added SYMV_P. Replaced outdated VULCAN define for ThunderX2T99 with ARMV8 to get basic definitions back. For issue #1908
Martin Kroeker [Wed, 2 Jan 2019 19:15:35 +0000 (20:15 +0100)]
Fix missing quotes around thunderx targets
TiborGY [Mon, 31 Dec 2018 22:19:44 +0000 (23:19 +0100)]
Validate user supplied TARGET (#1941)
The build now aborts with an error message when an undefined build TARGET is named
Fixes #1938
Martin Kroeker [Mon, 31 Dec 2018 22:11:37 +0000 (23:11 +0100)]
Increment version to 0.3.6.dev
Martin Kroeker [Mon, 31 Dec 2018 22:10:59 +0000 (23:10 +0100)]
Increment version to 0.3.6.dev
Martin Kroeker [Mon, 31 Dec 2018 22:07:53 +0000 (23:07 +0100)]
Merge branch 'release-0.3.0' into develop
Martin Kroeker [Mon, 31 Dec 2018 22:00:46 +0000 (23:00 +0100)]
Update ChangeLog.txt with changes from 0.3.5
Martin Kroeker [Mon, 31 Dec 2018 17:36:18 +0000 (18:36 +0100)]
Merge pull request #1944 from hartzell/patch-1
Typo: Skyalke -> Skylake
George Hartzell [Sun, 30 Dec 2018 22:55:34 +0000 (14:55 -0800)]
Typo: Skyalke -> Skylake
Worth fixing, it gets in the way of searching....
Martin Kroeker [Sun, 30 Dec 2018 19:10:05 +0000 (20:10 +0100)]
Merge pull request #1939 from TiborGY/patch-2
Fix typo in UNKNOWN core name
Martin Kroeker [Sun, 30 Dec 2018 19:07:01 +0000 (20:07 +0100)]
Merge pull request #1943 from martin-frbg/issue1748
Re-enable loop unrolling in trmv and remove the scary warning
Martin Kroeker [Sun, 30 Dec 2018 14:22:37 +0000 (15:22 +0100)]
Re-enable loop unrolling in trmv and remove the scary warning
fixes #1748 as that half of the fix for #1332 appears to have been an overreaction on my part.
Martin Kroeker [Sun, 30 Dec 2018 13:47:05 +0000 (14:47 +0100)]
Merge pull request #1942 from martin-frbg/issue1720
Delete the pthread key on cleanup in TLS mode
Martin Kroeker [Sun, 30 Dec 2018 13:39:18 +0000 (14:39 +0100)]
Remove stray include of complex.h
already provided conditionally by common.h via openblas_utest.h
Unconditional inclusion breaks older Android and similar platforms that use OPENBLAS_COMPLEX_STRUCT
Martin Kroeker [Sat, 29 Dec 2018 20:59:31 +0000 (21:59 +0100)]
Delete the pthread key on cleanup in TLS mode
to avoid a crash when OpenBLAS was loaded via dlopen and libc tries to clean up the leaked TLS after dlclose
Fixes #1720
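The general pattern behind this entry can be sketched as follows, under stated assumptions: a library creates a pthread key with a destructor, and its cleanup routine deletes the key so that no destructor pointing into unloaded code remains registered after dlclose. The names `lib_init`, `lib_get_buffer` and `lib_quit` are hypothetical and do not reflect the actual OpenBLAS functions or their internals.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical library state (illustration only). */
static pthread_key_t buf_key;
static int key_live = 0;

/* Destructor run by the threading library when a thread exits. */
static void free_buf(void *p) { free(p); }

/* Create the thread-local key once; returns 0 on success, -1 on failure. */
int lib_init(void)
{
    if (key_live)
        return 0;
    if (pthread_key_create(&buf_key, free_buf) != 0)
        return -1;
    key_live = 1;
    return 0;
}

/* Return this thread's buffer, allocating it on first use. */
void *lib_get_buffer(size_t size)
{
    assert(size > 0);
    if (!key_live)
        return NULL;
    void *p = pthread_getspecific(buf_key);
    if (p == NULL) {
        p = malloc(size);
        if (p != NULL && pthread_setspecific(buf_key, p) != 0) {
            free(p);
            p = NULL;
        }
    }
    return p;
}

/* Cleanup: pthread_key_delete() does not call destructors, so the
 * calling thread's buffer is released by hand before the key goes away. */
void lib_quit(void)
{
    if (!key_live)
        return;
    free(pthread_getspecific(buf_key));
    pthread_setspecific(buf_key, NULL);
    pthread_key_delete(buf_key);
    key_live = 0;
}
```

In this sketch only the calling thread's buffer is freed on cleanup; buffers owned by other still-running threads would need separate bookkeeping, which is left out here.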
Martin Kroeker [Sat, 29 Dec 2018 17:12:54 +0000 (18:12 +0100)]
Fix wrong case in TARGET setting for Alpine
TiborGY [Fri, 28 Dec 2018 13:36:39 +0000 (14:36 +0100)]
Update cpuid_mips64.c
TiborGY [Fri, 28 Dec 2018 13:35:41 +0000 (14:35 +0100)]
Update Makefile
TiborGY [Fri, 28 Dec 2018 13:34:38 +0000 (14:34 +0100)]
Update cpuid_mips.c
TiborGY [Fri, 28 Dec 2018 13:33:18 +0000 (14:33 +0100)]
Update cpuid_arm.c