platform/upstream/openblas.git
Martin Kroeker [Thu, 31 Jan 2019 20:13:46 +0000 (21:13 +0100)]
Add Z14 target

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 18:10:03 +0000 (19:10 +0100)]
Merge pull request #1991 from maamountki/z14

[ZARCH] Z14 Support, BLAS 1/2 single precision implementations

maamountki [Thu, 31 Jan 2019 17:36:41 +0000 (19:36 +0200)]
Merge branch 'develop' into z14

maamountki [Thu, 31 Jan 2019 17:11:11 +0000 (19:11 +0200)]
[ZARCH] Add Z13 version for max/min functions

maamountki [Thu, 31 Jan 2019 16:52:11 +0000 (18:52 +0200)]
[ZARCH] Improve loading performance for camax/icamax

Martin Kroeker [Thu, 31 Jan 2019 14:27:21 +0000 (15:27 +0100)]
Fix wrong comparison that made IMIN identical to IMAX

as reported by aarnez in #1990

Martin Kroeker [Thu, 31 Jan 2019 14:25:15 +0000 (15:25 +0100)]
Fix wrong comparison that made IMIN identical to IMAX

as suggested in #1990

maamountki [Thu, 31 Jan 2019 07:26:50 +0000 (09:26 +0200)]
Remove ztest

maamountki [Tue, 29 Jan 2019 15:59:38 +0000 (17:59 +0200)]
[ZARCH] Fix bug in max/min functions

maamountki [Tue, 29 Jan 2019 01:47:49 +0000 (03:47 +0200)]
[ZARCH] Fix icamax/icamin

maamountki [Mon, 28 Jan 2019 15:52:23 +0000 (17:52 +0200)]
[ZARCH] Fix iamax/imax single precision

maamountki [Mon, 28 Jan 2019 15:32:24 +0000 (17:32 +0200)]
[ZARCH] Undo the last commit

maamountki [Mon, 28 Jan 2019 15:16:18 +0000 (17:16 +0200)]
[ZARCH] Fix bug in iamax/iamin/imax/imin

Martin Kroeker [Mon, 28 Jan 2019 14:44:57 +0000 (15:44 +0100)]
Merge pull request #1985 from martin-frbg/issue1984

Correct naming of getrf_parallel object

Martin Kroeker [Mon, 28 Jan 2019 14:44:42 +0000 (15:44 +0100)]
Merge pull request #1981 from edisongustavo/develop

Fix include directory of exported targets

Martin Kroeker [Mon, 28 Jan 2019 14:43:35 +0000 (15:43 +0100)]
Merge pull request #1978 from danielgindi/feature/msvc_cmake

Better support for MSVC/Windows in CMake (v0.3.x)

Martin Kroeker [Mon, 28 Jan 2019 14:42:57 +0000 (15:42 +0100)]
Merge pull request #1962 from brada4/r

Modernize R benchmarks slightly

Martin Kroeker [Sat, 26 Jan 2019 21:25:29 +0000 (22:25 +0100)]
Merge pull request #1987 from martin-frbg/issue1961

Change ARMV8 target with BINARY=32 to ARMV7 automatically

Martin Kroeker [Sat, 26 Jan 2019 16:52:33 +0000 (17:52 +0100)]
Change ARMV8 target to ARMV7 for BINARY=32

Martin Kroeker [Sat, 26 Jan 2019 16:47:22 +0000 (17:47 +0100)]
Change ARMV8 target to ARMV7 when BINARY32 is set

fixes #1961

Martin Kroeker [Fri, 25 Jan 2019 23:45:45 +0000 (00:45 +0100)]
Correct naming of getrf_parallel object

fixes #1984

Martin Kroeker [Thu, 24 Jan 2019 08:17:48 +0000 (09:17 +0100)]
Merge pull request #1971 from martin-frbg/trsm-threshold

Shift transition to multithreading towards larger matrix sizes

Edison Gustavo Muenz [Wed, 23 Jan 2019 14:09:13 +0000 (15:09 +0100)]
Fix include directory of exported targets

Martin Kroeker [Wed, 23 Jan 2019 09:03:00 +0000 (10:03 +0100)]
Avoid penalizing tall skinny matrices

Martin Kroeker [Tue, 22 Jan 2019 20:10:38 +0000 (21:10 +0100)]
Merge pull request #1980 from martin-frbg/issue1979

Report SkylakeX as Haswell if compiler does not support AVX512

Martin Kroeker [Tue, 22 Jan 2019 17:55:43 +0000 (18:55 +0100)]
Syntax fix

Martin Kroeker [Tue, 22 Jan 2019 17:47:12 +0000 (18:47 +0100)]
Report SkylakeX as Haswell if compiler does not support AVX512

... or make was invoked with NO_AVX512=1

Daniel Cohen Gindi [Tue, 22 Jan 2019 12:38:01 +0000 (14:38 +0200)]
Adjust test script for correct deployment

Martin Kroeker [Tue, 22 Jan 2019 11:32:24 +0000 (12:32 +0100)]
Use VERSION_LESS for comparisons involving software version numbers

Daniel Cohen Gindi [Mon, 21 Jan 2019 06:35:23 +0000 (08:35 +0200)]
Better support for MSVC/Windows in CMake

maamountki [Mon, 21 Jan 2019 13:56:04 +0000 (15:56 +0200)]
[ZARCH] Update max/min functions

Martin Kroeker [Sun, 20 Jan 2019 19:30:11 +0000 (20:30 +0100)]
Merge pull request #1973 from martin-frbg/issue1464

Increase Zen SWITCH_RATIO to 16

Martin Kroeker [Sun, 20 Jan 2019 11:18:53 +0000 (12:18 +0100)]
Fix compilation with NO_AVX=1 set

fixes #1974

Martin Kroeker [Sat, 19 Jan 2019 22:01:31 +0000 (23:01 +0100)]
Increase Zen SWITCH_RATIO to 16

following GEMM benchmarks on Ryzen2700X. For #1464

Martin Kroeker [Fri, 18 Jan 2019 23:10:01 +0000 (00:10 +0100)]
Shift transition to multithreading towards larger matrix sizes

See #1886 and JuliaRobotics issue 500. trsm benchmarks on Haswell and Zen showed that with these values performance is roughly doubled for matrix sizes between 8x8 and 14x14, and still 10 to 20 percent better near the new cutoff at 32x32.

Martin Kroeker [Fri, 18 Jan 2019 07:11:39 +0000 (08:11 +0100)]
Fix declaration of input arguments in the Sandybridge GER microkernels (#1967)

* Tag arguments 0 and 1 as both input and output

Martin Kroeker [Fri, 18 Jan 2019 07:11:07 +0000 (08:11 +0100)]
Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966)

* Tag arguments 0 and 1 as both input and output (see #1964)

Martin Kroeker [Thu, 17 Jan 2019 22:20:32 +0000 (23:20 +0100)]
Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965)

* Tag operands 0 and 1 as both input and output

For #1964 (basically a continuation of coding problems first seen in #1292)

Martin Kroeker [Thu, 17 Jan 2019 15:42:11 +0000 (16:42 +0100)]
Merge pull request #1970 from quickwritereader/develop

crot fix

Martin Kroeker [Thu, 17 Jan 2019 15:19:03 +0000 (16:19 +0100)]
Bump xcode version to 10.1 to make sure it handles AVX512

Ubuntu [Thu, 17 Jan 2019 14:45:31 +0000 (14:45 +0000)]
crot fix

Martin Kroeker [Wed, 16 Jan 2019 17:41:03 +0000 (18:41 +0100)]
Merge pull request #1963 from quickwritereader/develop

Blas1 single missing kernels implemented with vector builtins

Abdelrauf [Wed, 16 Jan 2019 15:25:13 +0000 (19:25 +0400)]
Merge branch 'develop' into develop

Ubuntu [Wed, 16 Jan 2019 15:16:21 +0000 (15:16 +0000)]
Added missing Blas1 single fp {saxpy, caxpy, cdot, crot (refactored version of srot), isamax, isamin, icamax, icamin}.
Fixed idamin, icamin to choose the index of the first occurrence among equal minima

Andrew [Wed, 16 Jan 2019 09:54:22 +0000 (11:54 +0200)]
disable NaN checks before BLAS calls dgemm.R

Andrew [Wed, 16 Jan 2019 09:41:46 +0000 (11:41 +0200)]
disable NaN checks before BLAS calls deig.R (shorten matrix def)

Andrew [Wed, 16 Jan 2019 09:38:14 +0000 (11:38 +0200)]
disable NaN checks before BLAS calls deig.R

Andrew [Wed, 16 Jan 2019 09:34:46 +0000 (11:34 +0200)]
disable NaN checks before BLAS calls dsolve.R (shorter formula)

Martin Kroeker [Wed, 16 Jan 2019 09:27:14 +0000 (10:27 +0100)]
Merge pull request #1960 from cnjsdfcy/Hygon

Add support for Hygon Dhyana

Andrew [Wed, 16 Jan 2019 09:23:51 +0000 (11:23 +0200)]
disable NaN checks before BLAS calls dsolve.R (shorter config part)

Andrew [Wed, 16 Jan 2019 09:18:54 +0000 (11:18 +0200)]
disable NaN checks before BLAS calls dsolve.R

Andrew [Wed, 16 Jan 2019 07:51:29 +0000 (09:51 +0200)]
init

caiyu [Wed, 16 Jan 2019 06:25:19 +0000 (14:25 +0800)]
Add support for Hygon Dhyana

maamountki [Tue, 15 Jan 2019 19:04:22 +0000 (21:04 +0200)]
[ZARCH] fix a bug in max/min functions

Martin Kroeker [Mon, 14 Jan 2019 21:41:31 +0000 (22:41 +0100)]
Fix missing braces in support_av() call

Martin Kroeker [Mon, 14 Jan 2019 21:38:32 +0000 (22:38 +0100)]
Fix missing braces in support_avx()

maamountki [Fri, 11 Jan 2019 15:43:11 +0000 (17:43 +0200)]
[ZARCH] Update dgemv_n_4.c

maamountki [Fri, 11 Jan 2019 15:39:17 +0000 (17:39 +0200)]
[ZARCH] update cgemv_n_4.c

maamountki [Fri, 11 Jan 2019 15:37:11 +0000 (17:37 +0200)]
[ZARCH] Update cgemv_t_4.c

maamountki [Fri, 11 Jan 2019 15:14:04 +0000 (17:14 +0200)]
Update sgemv_t_4.c

maamountki [Fri, 11 Jan 2019 15:13:02 +0000 (17:13 +0200)]
Update dgemv_t_4.c

maamountki [Fri, 11 Jan 2019 15:08:24 +0000 (17:08 +0200)]
[ZARCH] fix sgemv_n_4.c

maamountki [Fri, 11 Jan 2019 14:44:46 +0000 (16:44 +0200)]
[ZARCH] fix cgemv_n_4.c

Martin Kroeker [Thu, 10 Jan 2019 11:04:08 +0000 (12:04 +0100)]
Merge pull request #1957 from martin-frbg/issue1954

Move TLS key deletion to openblas_quit

Martin Kroeker [Wed, 9 Jan 2019 23:32:50 +0000 (00:32 +0100)]
Move TLS key deletion to openblas_quit

fixes #1954 (as suggested by thrasibule in that issue)

maamountki [Wed, 9 Jan 2019 14:50:07 +0000 (16:50 +0200)]
[ZARCH] fix data prefetch type in sdot

maamountki [Wed, 9 Jan 2019 14:49:44 +0000 (16:49 +0200)]
[ZARCH] fix data prefetch type in ddot

maamountki [Wed, 9 Jan 2019 14:33:54 +0000 (16:33 +0200)]
[ZARCH] fix dsdot.c

maamountki [Wed, 9 Jan 2019 05:43:45 +0000 (07:43 +0200)]
[ZARCH] fix cgemv_n_4.c

Martin Kroeker [Tue, 8 Jan 2019 19:44:08 +0000 (20:44 +0100)]
Merge pull request #1949 from martin-frbg/issue1947

Query AVX2 and AVX512VL support when selecting x86 kernels

Martin Kroeker [Tue, 8 Jan 2019 13:43:45 +0000 (14:43 +0100)]
Bump xcode to 8.3

Martin Kroeker [Tue, 8 Jan 2019 13:41:48 +0000 (14:41 +0100)]
Update OSX environment to Sierra

as homebrew seems to have dropped support for El Capitan in their gcc packages

Martin Kroeker [Tue, 8 Jan 2019 09:46:47 +0000 (10:46 +0100)]
Add travis_wait to the OSX brew install phase

Martin Kroeker [Sat, 5 Jan 2019 18:41:13 +0000 (19:41 +0100)]
Add message for SkylakeX and KNL fallbacks to Haswell

Martin Kroeker [Sat, 5 Jan 2019 17:08:02 +0000 (18:08 +0100)]
Add xcr0 (os support) check

Martin Kroeker [Sat, 5 Jan 2019 17:07:14 +0000 (18:07 +0100)]
Add xcr0 (os support) check

Martin Kroeker [Sat, 5 Jan 2019 15:58:56 +0000 (16:58 +0100)]
Query AVX2 and AVX512VL capability in x86 cpu detection

Martin Kroeker [Sat, 5 Jan 2019 15:55:33 +0000 (16:55 +0100)]
Query AVX2 and AVX512 capability for runtime cpu selection

maamountki [Fri, 4 Jan 2019 15:45:56 +0000 (17:45 +0200)]
[ZARCH] fix cgemv_n_4.c

Martin Kroeker [Fri, 4 Jan 2019 00:37:37 +0000 (01:37 +0100)]
Merge pull request #1946 from martin-frbg/issue1908

More fixes for cross-compiling ARM64 targets

maamountki [Thu, 3 Jan 2019 23:38:18 +0000 (01:38 +0200)]
[ZARCH] fix sgemv_t_4.c

Martin Kroeker [Thu, 3 Jan 2019 21:17:31 +0000 (22:17 +0100)]
More fixes for cross-compiling ARM64 targets

Fixed core naming for DYNAMIC_ARCH. Corrected GEMM_DEFAULT entries and added SYMV_P. Replaced outdated VULCAN define for ThunderX2T99 with ARMV8 to get basic definitions back. For issue #1908

Martin Kroeker [Wed, 2 Jan 2019 19:15:35 +0000 (20:15 +0100)]
Fix missing quotes around thunderx targets

TiborGY [Mon, 31 Dec 2018 22:19:44 +0000 (23:19 +0100)]
Validate user supplied TARGET (#1941)

the build will now abort with an error message when an undefined build TARGET is named

Fixes #1938

Martin Kroeker [Mon, 31 Dec 2018 22:11:37 +0000 (23:11 +0100)]
Increment version to 0.3.6.dev

Martin Kroeker [Mon, 31 Dec 2018 22:10:59 +0000 (23:10 +0100)]
Increment version to 0.3.6.dev

Martin Kroeker [Mon, 31 Dec 2018 22:07:53 +0000 (23:07 +0100)]
Merge branch 'release-0.3.0' into develop

Martin Kroeker [Mon, 31 Dec 2018 22:00:46 +0000 (23:00 +0100)]
Update ChangeLog.txt with changes from 0.3.5

Martin Kroeker [Mon, 31 Dec 2018 17:36:18 +0000 (18:36 +0100)]
Merge pull request #1944 from hartzell/patch-1

Typo: Skyalke -> Skylake

George Hartzell [Sun, 30 Dec 2018 22:55:34 +0000 (14:55 -0800)]
Typo: Skyalke -> Skylake

Worth fixing, it gets in the way of searching....

Martin Kroeker [Sun, 30 Dec 2018 19:10:05 +0000 (20:10 +0100)]
Merge pull request #1939 from TiborGY/patch-2

Fix typo in UNKNOWN core name

Martin Kroeker [Sun, 30 Dec 2018 19:07:01 +0000 (20:07 +0100)]
Merge pull request #1943 from martin-frbg/issue1748

Re-enable loop unrolling in trmv and remove the scary warning

Martin Kroeker [Sun, 30 Dec 2018 14:22:37 +0000 (15:22 +0100)]
Re-enable loop unrolling in trmv and remove the scary warning

fixes #1748 as that half of the fix for #1332 appears to have been an overreaction on my part.

Martin Kroeker [Sun, 30 Dec 2018 13:47:05 +0000 (14:47 +0100)]
Merge pull request #1942 from martin-frbg/issue1720

Delete the pthread key on cleanup in TLS mode

Martin Kroeker [Sun, 30 Dec 2018 13:39:18 +0000 (14:39 +0100)]
Remove stray include of complex.h

Already provided conditionally by common.h via openblas_utest.h.
Unconditional inclusion breaks older Android and similar platforms that use OPENBLAS_COMPLEX_STRUCT.

Martin Kroeker [Sat, 29 Dec 2018 20:59:31 +0000 (21:59 +0100)]
Delete the pthread key on cleanup in TLS mode

to avoid a crash when OpenBLAS was loaded via dlopen and libc tries to clean up the leaked TLS after dlclose
Fixes #1720

Martin Kroeker [Sat, 29 Dec 2018 17:12:54 +0000 (18:12 +0100)]
Fix wrong case in TARGET setting for Alpine

TiborGY [Fri, 28 Dec 2018 13:36:39 +0000 (14:36 +0100)]
Update cpuid_mips64.c

TiborGY [Fri, 28 Dec 2018 13:35:41 +0000 (14:35 +0100)]
Update Makefile

TiborGY [Fri, 28 Dec 2018 13:34:38 +0000 (14:34 +0100)]
Update cpuid_mips.c