platform/upstream/openblas.git
Martin Kroeker [Tue, 19 Feb 2019 21:16:33 +0000 (22:16 +0100)]
Fix error introduced during cleanup

Martin Kroeker [Tue, 19 Feb 2019 20:03:30 +0000 (21:03 +0100)]
Allow multithreading TRMV again

revert workaround introduced for issue #1332 as the actual cause appears to be my incorrect fix from #1262 (see #1388)

Martin Kroeker [Tue, 19 Feb 2019 19:59:48 +0000 (20:59 +0100)]
Correct range_n limiting

same bug as seen in #1388, somehow missed in corresponding PR #1389

Martin Kroeker [Thu, 7 Feb 2019 19:06:13 +0000 (20:06 +0100)]
Restore dropped patches in the non-TLS branch of memory.c (#2004)

* Restore dropped patches in the non-TLS branch of memory.c

As discovered in #2002, the reintroduction of the "original" non-TLS version of memory.c as an alternate branch had inadvertently used ba1f91f rather than a8002e2, thereby dropping the commits for #1450, #1468, #1501, #1504 and #1520.

Martin Kroeker [Wed, 6 Feb 2019 07:39:24 +0000 (08:39 +0100)]
Merge pull request #2001 from martin-frbg/cmake-dynlist

Support DYNAMIC_LIST option in cmake

Martin Kroeker [Tue, 5 Feb 2019 23:29:30 +0000 (00:29 +0100)]
Merge pull request #2000 from martin-frbg/issue1989

Make c_check robust against old or incomplete perl installations

Martin Kroeker [Tue, 5 Feb 2019 22:51:40 +0000 (23:51 +0100)]
Support DYNAMIC_LIST option in cmake

e.g. cmake -DDYNAMIC_ARCH=1 -DDYNAMIC_LIST="NEHALEM;HASWELL;ZEN" ..
original issue was #1639

Martin Kroeker [Tue, 5 Feb 2019 21:02:11 +0000 (22:02 +0100)]
Merge pull request #1999 from martin-frbg/issue1996-2

fix second instance of complex.h for c++ as well

Martin Kroeker [Tue, 5 Feb 2019 19:06:34 +0000 (20:06 +0100)]
Make c_check robust against old or incomplete perl installations

by catching and working around failures to load modules, and avoiding object-oriented syntax in tempfile creation.
Fixes #1989

Martin Kroeker [Tue, 5 Feb 2019 18:29:33 +0000 (19:29 +0100)]
fix second instance of complex.h for c++ as well

Martin Kroeker [Tue, 5 Feb 2019 16:39:59 +0000 (17:39 +0100)]
Merge pull request #1998 from martin-frbg/issue1992

Include complex rather than complex.h in C++ contexts

Martin Kroeker [Tue, 5 Feb 2019 12:30:13 +0000 (13:30 +0100)]
Include complex rather than complex.h in C++ contexts

to avoid name clashes e.g. with boost headers that use I as a generic placeholder.
Fixes #1992 as suggested by aprokop in that issue ticket.
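
For illustration: in C, `<complex.h>` defines the macro `I`, which collides with C++ code that uses `I` as an ordinary identifier (the boost case mentioned above), whereas the C++ header `<complex>` defines no such macro. A minimal sketch of selecting the include by language follows. The `demo_` names are hypothetical and are not taken from the OpenBLAS headers.

```c
/* Sketch only: pick the complex header by language so that C++ translation
 * units never see the C macro `I`. */
#ifdef __cplusplus
#include <complex>
typedef std::complex<float> demo_complex_float;   /* hypothetical name */
#else
#include <complex.h>
typedef float _Complex demo_complex_float;        /* hypothetical name */
#endif

/* Return the real part, using whichever API the selected header provides. */
static float demo_real_part(demo_complex_float z)
{
#ifdef __cplusplus
    return z.real();
#else
    return crealf(z);
#endif
}
```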

Martin Kroeker [Mon, 4 Feb 2019 15:52:04 +0000 (16:52 +0100)]
Merge pull request #1996 from quickwritereader/develop

NBMAX=4096 for gemvn, added sgemvn 8x8 for future

Ubuntu [Mon, 4 Feb 2019 15:41:56 +0000 (15:41 +0000)]
Note for unused kernels

Ubuntu [Mon, 4 Feb 2019 06:57:11 +0000 (06:57 +0000)]
NBMAX=4096 for gemvn, added sgemvn 8x8 for future

Martin Kroeker [Fri, 1 Feb 2019 20:04:47 +0000 (21:04 +0100)]
Merge pull request #1994 from quickwritereader/develop

sgemv cgemv pairs

Ubuntu [Fri, 1 Feb 2019 13:45:00 +0000 (13:45 +0000)]
sgemv cgemv pairs

Martin Kroeker [Fri, 1 Feb 2019 11:58:59 +0000 (12:58 +0100)]
Fix incorrect sgemv results for IBM z14

part of PR #1993 that was inadvertently misplaced into the toplevel directory

Martin Kroeker [Fri, 1 Feb 2019 11:57:01 +0000 (12:57 +0100)]
Delete misplaced file sgemv_t_4.c

from #1993; the file should have gone into kernel/zarch

Martin Kroeker [Thu, 31 Jan 2019 20:27:00 +0000 (21:27 +0100)]
Merge pull request #1993 from martin-frbg/aarnes-zarch

Various fixes for the new Z14 target

Martin Kroeker [Thu, 31 Jan 2019 20:24:55 +0000 (21:24 +0100)]
Improve the z14 SGEMVT kernel

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 20:22:26 +0000 (21:22 +0100)]
Fix precision of zarch DSDOT

from patch provided by aarnez in #991
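
For context, DSDOT takes single-precision input vectors but accumulates and returns their dot product in double precision, which is why the precision of the accumulation matters. A plain C reference sketch follows. It is not the zarch kernel; `ref_dsdot` is a hypothetical name and unit strides are assumed.

```c
#include <stddef.h>

/* Reference sketch: dot product of two float vectors, accumulated in double.
 * Each product is widened to double before it is added to the running sum. */
static double ref_dsdot(size_t n, const float *x, const float *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double)x[i] * (double)y[i];
    return acc;
}
```

With x = {1e8f, 1.0f, -1e8f} and y = {1, 1, 1}, the double accumulator keeps the middle term and the result is 1.0. An accumulator of type float would lose that term.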

Martin Kroeker [Thu, 31 Jan 2019 20:21:40 +0000 (21:21 +0100)]
Fix typo in the zarch min/max kernels

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 20:18:09 +0000 (21:18 +0100)]
USE_TRMM on Z14

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 20:16:44 +0000 (21:16 +0100)]
Add cache sizes for Z14

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 20:15:50 +0000 (21:15 +0100)]
Add FORCE Z14

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 20:14:37 +0000 (21:14 +0100)]
Add parameters for Z14

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 20:13:46 +0000 (21:13 +0100)]
Add Z14 target

from patch provided by aarnez in #991

Martin Kroeker [Thu, 31 Jan 2019 18:10:03 +0000 (19:10 +0100)]
Merge pull request #1991 from maamountki/z14

[ZARCH] Z14 Support, BLAS 1/2 single precision implementations

maamountki [Thu, 31 Jan 2019 17:36:41 +0000 (19:36 +0200)]
Merge branch 'develop' into z14

maamountki [Thu, 31 Jan 2019 17:11:11 +0000 (19:11 +0200)]
[ZARCH] Add Z13 version for max/min functions

maamountki [Thu, 31 Jan 2019 16:52:11 +0000 (18:52 +0200)]
[ZARCH] Improve loading performance for camax/icamax

Martin Kroeker [Thu, 31 Jan 2019 14:27:21 +0000 (15:27 +0100)]
Fix wrong comparison that made IMIN identical to IMAX

as reported by aarnez in #1990

Martin Kroeker [Thu, 31 Jan 2019 14:25:15 +0000 (15:25 +0100)]
Fix wrong comparison that made IMIN identical to IMAX

as suggested in #1990
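
This class of bug is easy to see in a scalar reference: IMIN and IMAX differ only in the direction of a single comparison, so a kernel copied without flipping it computes IMAX twice. A minimal sketch follows, assuming unit stride and BLAS-style 1-based results. `ref_imax` and `ref_imin` are hypothetical names, not the kernels touched by these commits.

```c
#include <stddef.h>

/* 1-based index of the first largest element, or 0 when n == 0. */
static size_t ref_imax(size_t n, const float *x)
{
    if (n == 0) return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] > x[best]) best = i;   /* strict '>' keeps the first maximum */
    return best + 1;
}

/* 1-based index of the first smallest element, or 0 when n == 0.
 * The only difference from ref_imax is the direction of the comparison. */
static size_t ref_imin(size_t n, const float *x)
{
    if (n == 0) return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] < x[best]) best = i;   /* strict '<' keeps the first minimum */
    return best + 1;
}
```

Using strict comparisons also means that ties resolve to the first occurrence.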

maamountki [Thu, 31 Jan 2019 07:26:50 +0000 (09:26 +0200)]
Remove ztest

maamountki [Tue, 29 Jan 2019 15:59:38 +0000 (17:59 +0200)]
[ZARCH] Fix bug in max/min functions

maamountki [Tue, 29 Jan 2019 01:47:49 +0000 (03:47 +0200)]
[ZARCH] Fix icamax/icamin

maamountki [Mon, 28 Jan 2019 15:52:23 +0000 (17:52 +0200)]
[ZARCH] Fix iamax/imax single precision

maamountki [Mon, 28 Jan 2019 15:32:24 +0000 (17:32 +0200)]
[ZARCH] Undo the last commit

maamountki [Mon, 28 Jan 2019 15:16:18 +0000 (17:16 +0200)]
[ZARCH] Fix bug in iamax/iamin/imax/imin

Martin Kroeker [Mon, 28 Jan 2019 14:44:57 +0000 (15:44 +0100)]
Merge pull request #1985 from martin-frbg/issue1984

Correct naming of getrf_parallel object

Martin Kroeker [Mon, 28 Jan 2019 14:44:42 +0000 (15:44 +0100)]
Merge pull request #1981 from edisongustavo/develop

Fix include directory of exported targets

Martin Kroeker [Mon, 28 Jan 2019 14:43:35 +0000 (15:43 +0100)]
Merge pull request #1978 from danielgindi/feature/msvc_cmake

Better support for MSVC/Windows in CMake (v0.3.x)

Martin Kroeker [Mon, 28 Jan 2019 14:42:57 +0000 (15:42 +0100)]
Merge pull request #1962 from brada4/r

Modernize R benchmarks slightly

Martin Kroeker [Sat, 26 Jan 2019 21:25:29 +0000 (22:25 +0100)]
Merge pull request #1987 from martin-frbg/issue1961

Change ARMV8 target with BINARY=32 to ARMV7 automatically

Martin Kroeker [Sat, 26 Jan 2019 16:52:33 +0000 (17:52 +0100)]
Change ARMV8 target to ARMV7 for BINARY=32

Martin Kroeker [Sat, 26 Jan 2019 16:47:22 +0000 (17:47 +0100)]
Change ARMV8 target to ARMV7 when BINARY32 is set

fixes #1961

Martin Kroeker [Fri, 25 Jan 2019 23:45:45 +0000 (00:45 +0100)]
Correct naming of getrf_parallel object

fixes #1984

Martin Kroeker [Thu, 24 Jan 2019 08:17:48 +0000 (09:17 +0100)]
Merge pull request #1971 from martin-frbg/trsm-threshold

Shift transition to multithreading towards larger matrix sizes

Edison Gustavo Muenz [Wed, 23 Jan 2019 14:09:13 +0000 (15:09 +0100)]
Fix include directory of exported targets

Martin Kroeker [Wed, 23 Jan 2019 09:03:00 +0000 (10:03 +0100)]
Avoid penalizing tall skinny matrices

Martin Kroeker [Tue, 22 Jan 2019 20:10:38 +0000 (21:10 +0100)]
Merge pull request #1980 from martin-frbg/issue1979

Report SkylakeX as Haswell if compiler does not support AVX512

Martin Kroeker [Tue, 22 Jan 2019 17:55:43 +0000 (18:55 +0100)]
Syntax fix

Martin Kroeker [Tue, 22 Jan 2019 17:47:12 +0000 (18:47 +0100)]
Report SkylakeX as Haswell if compiler does not support AVX512

... or make was invoked with NO_AVX512=1

Daniel Cohen Gindi [Tue, 22 Jan 2019 12:38:01 +0000 (14:38 +0200)]
Adjust test script for correct deployment

Martin Kroeker [Tue, 22 Jan 2019 11:32:24 +0000 (12:32 +0100)]
Use VERSION_LESS for comparisons involving software version numbers

Daniel Cohen Gindi [Mon, 21 Jan 2019 06:35:23 +0000 (08:35 +0200)]
Better support for MSVC/Windows in CMake

maamountki [Mon, 21 Jan 2019 13:56:04 +0000 (15:56 +0200)]
[ZARCH] Update max/min functions

Martin Kroeker [Sun, 20 Jan 2019 19:30:11 +0000 (20:30 +0100)]
Merge pull request #1973 from martin-frbg/issue1464

Increase Zen SWITCH_RATIO to 16

Martin Kroeker [Sun, 20 Jan 2019 11:18:53 +0000 (12:18 +0100)]
Fix compilation with NO_AVX=1 set

fixes #1974

Martin Kroeker [Sat, 19 Jan 2019 22:01:31 +0000 (23:01 +0100)]
Increase Zen SWITCH_RATIO to 16

following GEMM benchmarks on Ryzen2700X. For #1464

Martin Kroeker [Fri, 18 Jan 2019 23:10:01 +0000 (00:10 +0100)]
Shift transition to multithreading towards larger matrix sizes

See #1886 and JuliaRobotics issue 500. trsm benchmarks on Haswell and Zen showed that with these values performance is roughly doubled for matrix sizes between 8x8 and 14x14, and still 10 to 20 percent better near the new cutoff at 32x32.
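
The idea of a size-based cutoff can be sketched as follows. This is an illustration only: the constant is taken from the "32x32" figure in the message, testing the product m*n (rather than each dimension separately) is an assumption, and `demo_pick_threads` is a hypothetical helper. The actual condition in the OpenBLAS interface code may differ.

```c
#include <stddef.h>

/* Assumed from the "32x32" cutoff quoted in the commit message. */
#define DEMO_MT_CUTOFF 32

/* Hypothetical helper: stay single-threaded for small problems, where thread
 * startup and synchronization cost more than the parallel work saves. */
static int demo_pick_threads(size_t m, size_t n, int max_threads)
{
    if (max_threads < 1)
        return 1;
    if (m * n < (size_t)DEMO_MT_CUTOFF * DEMO_MT_CUTOFF)
        return 1;               /* below the cutoff: run serially */
    return max_threads;         /* at or above the cutoff: use all threads */
}
```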

Martin Kroeker [Fri, 18 Jan 2019 07:11:39 +0000 (08:11 +0100)]
Fix declaration of input arguments in the Sandybridge GER microkernels (#1967)

* Tag arguments 0 and 1 as both input and output

Martin Kroeker [Fri, 18 Jan 2019 07:11:07 +0000 (08:11 +0100)]
Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966)

* Tag arguments 0 and 1 as both input and output (see #1964)

Martin Kroeker [Thu, 17 Jan 2019 22:20:32 +0000 (23:20 +0100)]
Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965)

* Tag operands 0 and 1 as both input and output

For #1964 (basically a continuation of coding problems first seen in #1292)
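
The three microkernel fixes above share one pattern. In GCC extended inline assembly, a register that the assembly modifies must be declared as an output; the `+` modifier marks an operand as both read and written. If a loop counter or pointer that the loop advances is listed only as an input, the compiler may assume it is unchanged afterwards. The sketch below is a hypothetical scaling loop, not an OpenBLAS kernel, with operands 0 and 1 tagged `"+r"`. A plain C fallback is used on targets other than x86_64.

```c
#include <stddef.h>

/* Hypothetical example: scale n floats in place by alpha. */
static void demo_scale(size_t n, float *x, float alpha)
{
    if (n == 0)
        return;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ __volatile__(
        "1:                     \n\t"
        "movss  (%1), %%xmm0    \n\t"   /* load x[i]                    */
        "mulss  %2, %%xmm0      \n\t"   /* multiply by alpha            */
        "movss  %%xmm0, (%1)    \n\t"   /* store back                   */
        "addq   $4, %1          \n\t"   /* advance pointer: modifies %1 */
        "decq   %0              \n\t"   /* count down: modifies %0      */
        "jnz    1b              \n\t"
        : "+r"(n), "+r"(x)              /* operands 0 and 1: read-write */
        : "x"(alpha)                    /* operand 2: input only        */
        : "cc", "memory", "xmm0");
#else
    for (size_t i = 0; i < n; i++)
        x[i] *= alpha;
#endif
}
```

The `"memory"` clobber is there because the loop writes through the pointer, and `"cc"` because `decq` changes the flags.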

Martin Kroeker [Thu, 17 Jan 2019 15:42:11 +0000 (16:42 +0100)]
Merge pull request #1970 from quickwritereader/develop

crot fix

Martin Kroeker [Thu, 17 Jan 2019 15:19:03 +0000 (16:19 +0100)]
Bump xcode version to 10.1 to make sure it handles AVX512

Ubuntu [Thu, 17 Jan 2019 14:45:31 +0000 (14:45 +0000)]
crot fix

Martin Kroeker [Wed, 16 Jan 2019 17:41:03 +0000 (18:41 +0100)]
Merge pull request #1963 from quickwritereader/develop

Blas1 single missing kernels implemented with vector builtins

Abdelrauf [Wed, 16 Jan 2019 15:25:13 +0000 (19:25 +0400)]
Merge branch 'develop' into develop

Ubuntu [Wed, 16 Jan 2019 15:16:21 +0000 (15:16 +0000)]
Added missing Blas1 single fp {saxpy, caxpy, cdot, crot (refactored version of srot), isamax, isamin, icamax, icamin},
Fixed idamin, icamin choosing the first occurrence index of equal minima

Andrew [Wed, 16 Jan 2019 09:54:22 +0000 (11:54 +0200)]
disable NaN checks before BLAS calls dgemm.R

Andrew [Wed, 16 Jan 2019 09:41:46 +0000 (11:41 +0200)]
disable NaN checks before BLAS calls deig.R (shorten matrix def)

Andrew [Wed, 16 Jan 2019 09:38:14 +0000 (11:38 +0200)]
disable NaN checks before BLAS calls deig.R

Andrew [Wed, 16 Jan 2019 09:34:46 +0000 (11:34 +0200)]
disable NaN checks before BLAS calls dsolve.R (shorter formula)

Martin Kroeker [Wed, 16 Jan 2019 09:27:14 +0000 (10:27 +0100)]
Merge pull request #1960 from cnjsdfcy/Hygon

Add support for Hygon Dhyana

Andrew [Wed, 16 Jan 2019 09:23:51 +0000 (11:23 +0200)]
disable NaN checks before BLAS calls dsolve.R (shorter config part)

Andrew [Wed, 16 Jan 2019 09:18:54 +0000 (11:18 +0200)]
disable NaN checks before BLAS calls dsolve.R

Andrew [Wed, 16 Jan 2019 07:51:29 +0000 (09:51 +0200)]
init

caiyu [Wed, 16 Jan 2019 06:25:19 +0000 (14:25 +0800)]
Add support for Hygon Dhyana

maamountki [Tue, 15 Jan 2019 19:04:22 +0000 (21:04 +0200)]
[ZARCH] fix a bug in max/min functions

Martin Kroeker [Mon, 14 Jan 2019 21:41:31 +0000 (22:41 +0100)]
Fix missing braces in support_avx() call

Martin Kroeker [Mon, 14 Jan 2019 21:38:32 +0000 (22:38 +0100)]
Fix missing braces in support_avx()

maamountki [Fri, 11 Jan 2019 15:43:11 +0000 (17:43 +0200)]
[ZARCH] Update dgemv_n_4.c

maamountki [Fri, 11 Jan 2019 15:39:17 +0000 (17:39 +0200)]
[ZARCH] update cgemv_n_4.c

maamountki [Fri, 11 Jan 2019 15:37:11 +0000 (17:37 +0200)]
[ZARCH] Update cgemv_t_4.c

maamountki [Fri, 11 Jan 2019 15:14:04 +0000 (17:14 +0200)]
Update sgemv_t_4.c

maamountki [Fri, 11 Jan 2019 15:13:02 +0000 (17:13 +0200)]
Update dgemv_t_4.c

maamountki [Fri, 11 Jan 2019 15:08:24 +0000 (17:08 +0200)]
[ZARCH] fix sgemv_n_4.c

maamountki [Fri, 11 Jan 2019 14:44:46 +0000 (16:44 +0200)]
[ZARCH] fix cgemv_n_4.c

Martin Kroeker [Thu, 10 Jan 2019 11:04:08 +0000 (12:04 +0100)]
Merge pull request #1957 from martin-frbg/issue1954

Move TLS key deletion to openblas_quit

Martin Kroeker [Wed, 9 Jan 2019 23:32:50 +0000 (00:32 +0100)]
Move TLS key deletion to openblas_quit

fixes #1954 (as suggested by thrasibule in that issue)

maamountki [Wed, 9 Jan 2019 14:50:07 +0000 (16:50 +0200)]
[ZARCH] fix data prefetch type in sdot

maamountki [Wed, 9 Jan 2019 14:49:44 +0000 (16:49 +0200)]
[ZARCH] fix data prefetch type in ddot

maamountki [Wed, 9 Jan 2019 14:33:54 +0000 (16:33 +0200)]
[ZARCH] fix dsdot.c

maamountki [Wed, 9 Jan 2019 05:43:45 +0000 (07:43 +0200)]
[ZARCH] fix cgemv_n_4.c

Martin Kroeker [Tue, 8 Jan 2019 19:44:08 +0000 (20:44 +0100)]
Merge pull request #1949 from martin-frbg/issue1947

Query AVX2 and AVX512VL support when selecting x86 kernels

Martin Kroeker [Tue, 8 Jan 2019 13:43:45 +0000 (14:43 +0100)]
Bump xcode to 8.3

Martin Kroeker [Tue, 8 Jan 2019 13:41:48 +0000 (14:41 +0100)]
Update OSX environment to Sierra

as homebrew seems to have dropped support for El Capitan in their gcc packages

Martin Kroeker [Tue, 8 Jan 2019 09:46:47 +0000 (10:46 +0100)]
Add travis_wait to the OSX brew install phase