platform/upstream/openblas.git
Martin Kroeker [Mon, 24 Feb 2020 18:20:00 +0000 (19:20 +0100)]
Add DYNAMIC_ARCH support for ARMV8 EMAG8180

Martin Kroeker [Wed, 19 Feb 2020 18:00:28 +0000 (19:00 +0100)]
Add preliminary support for EMAG8180

Martin Kroeker [Wed, 19 Feb 2020 17:57:26 +0000 (18:57 +0100)]
Add preliminary support for EMAG8180 ARMV8 processor

Martin Kroeker [Wed, 19 Feb 2020 17:49:13 +0000 (18:49 +0100)]
Recognize Ampere EMAG8180

Martin Kroeker [Wed, 19 Feb 2020 17:06:39 +0000 (18:06 +0100)]
Merge pull request #32 from xianyi/develop

rebase

Martin Kroeker [Wed, 19 Feb 2020 07:14:28 +0000 (08:14 +0100)]
Merge pull request #2432 from isuruf/install_name

Fix install name on osx again

Isuru Fernando [Tue, 18 Feb 2020 18:22:49 +0000 (10:22 -0800)]
Fix install name on osx again

Martin Kroeker [Tue, 18 Feb 2020 11:09:15 +0000 (12:09 +0100)]
Merge pull request #2426 from zbeekman/nightly-homebrew-check

Nightly homebrew check

Martin Kroeker [Tue, 18 Feb 2020 07:15:02 +0000 (08:15 +0100)]
Merge pull request #2427 from martin-frbg/powermin

Fix ISMIN and ISMAX kernel choices for POWER8

Martin Kroeker [Mon, 17 Feb 2020 18:55:39 +0000 (19:55 +0100)]
Specify ismin/ismax assembly kernels for POWER8 directly

to fix the utest failure in the new ismin test; the Makefile.L1 defaults look wrong

Izaak Beekman [Mon, 17 Feb 2020 18:32:33 +0000 (13:32 -0500)]
Fix bottle upload problem & typo

Izaak Beekman [Mon, 17 Feb 2020 18:12:50 +0000 (13:12 -0500)]
Test push & PRs only when workflow file changes

Also, add comments to clarify what the test is testing

Izaak Beekman [Mon, 17 Feb 2020 16:49:53 +0000 (11:49 -0500)]
Add Github Action to build development branch nightly with Homebrew

Martin Kroeker [Mon, 17 Feb 2020 16:00:08 +0000 (17:00 +0100)]
Merge pull request #2424 from isuruf/osx

Fix building on osx

Martin Kroeker [Mon, 17 Feb 2020 13:53:46 +0000 (14:53 +0100)]
Merge pull request #30 from xianyi/develop

rebase

Martin Kroeker [Mon, 17 Feb 2020 13:50:18 +0000 (14:50 +0100)]
Merge pull request #2414 from marxin/fix-iamax_sse-implementation

Fix iamax sse implementation and add utests

Martin Liska [Thu, 13 Feb 2020 13:42:45 +0000 (14:42 +0100)]
Come up with LOAD_AND_COMPARE_TO_MXX macro in iamax_sse.S.

Martin Liska [Thu, 13 Feb 2020 13:32:24 +0000 (14:32 +0100)]
Fix implementation of iamax_sse.S as reported in #2116.

There was a typo in iamax_sse.S where one of the comparisons
was cmpeqps instead of cmpeqss. That misdetected the index
for sequences where the minimum value was 0.
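
The difference between the two instructions can be illustrated with their C intrinsics. This is a minimal sketch and not the code from iamax_sse.S: the helper names packed_eq_mask and scalar_eq_mask are hypothetical, and a plain C fallback is included for builds without SSE. cmpeqps compares all four lanes, while cmpeqss compares only the lowest one, so zero-valued upper lanes can produce matches that a scalar comparison would never report.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>

/* Bit i of the result is set when lane i of a equals lane i of b (cmpeqps). */
static int packed_eq_mask(const float a[4], const float b[4]) {
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    return _mm_movemask_ps(_mm_cmpeq_ps(va, vb));
}

/* Only lane 0 is compared (cmpeqss); the upper lanes are masked off here. */
static int scalar_eq_mask(const float a[4], const float b[4]) {
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    return _mm_movemask_ps(_mm_cmpeq_ss(va, vb)) & 1;
}
#else
/* Portable fallback with the same results, for targets without SSE. */
static int packed_eq_mask(const float a[4], const float b[4]) {
    int mask = 0;
    for (size_t i = 0; i < 4; i++) {
        if (a[i] == b[i]) mask |= 1 << i;
    }
    return mask;
}

static int scalar_eq_mask(const float a[4], const float b[4]) {
    return a[0] == b[0] ? 1 : 0;
}
#endif
```

With a = {1, 0, 0, 0} and b = {2, 0, 0, 0}, the packed comparison reports three matching lanes even though the element of interest in lane 0 differs, whereas the scalar comparison reports none.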

Martin Liska [Fri, 14 Feb 2020 09:35:51 +0000 (10:35 +0100)]
Add missing USE_MIN in kernel/CMakeLists.txt.

Martin Kroeker [Mon, 17 Feb 2020 06:24:02 +0000 (07:24 +0100)]
Merge pull request #2423 from xianyi/issue2419

Restore -march flag for Android builds

Isuru Fernando [Sun, 16 Feb 2020 21:11:40 +0000 (15:11 -0600)]
Pass CFLAGS from env to Makefile.prebuild and remove iOS hack

Martin Kroeker [Sun, 16 Feb 2020 16:32:13 +0000 (17:32 +0100)]
Restore -march flag for Android builds

fixes #2419 - renewed discussion in #2112 suggests removal of the option was primarily aimed at non-Android builds

Martin Kroeker [Sun, 16 Feb 2020 16:29:35 +0000 (17:29 +0100)]
Update KERNEL.POWER8

Martin Kroeker [Sun, 16 Feb 2020 16:28:10 +0000 (17:28 +0100)]
Merge pull request #29 from xianyi/develop

rebase

Martin Kroeker [Sat, 15 Feb 2020 22:07:50 +0000 (23:07 +0100)]
Update KERNEL.POWER8

Martin Kroeker [Sat, 15 Feb 2020 22:06:51 +0000 (23:06 +0100)]
Update KERNEL.POWER8

Martin Kroeker [Sat, 15 Feb 2020 20:57:41 +0000 (21:57 +0100)]
Merge pull request #2417 from marxin/make-ctest-verbose-for-drone

Make ctest verbose for drone

Martin Kroeker [Sat, 15 Feb 2020 20:57:03 +0000 (21:57 +0100)]
Merge pull request #2415 from marxin/add-cmake-to-gitignore

Add CMake related files to .gitignore.

Martin Kroeker [Sat, 15 Feb 2020 20:56:16 +0000 (21:56 +0100)]
Merge pull request #2420 from martin-frbg/issue2396

Correct generation of GETRF files by the CMAKE build

Martin Kroeker [Sat, 15 Feb 2020 18:29:14 +0000 (19:29 +0100)]
Correct generation of GETRF files by the CMAKE build

fixes #2396

Martin Kroeker [Fri, 14 Feb 2020 22:07:43 +0000 (23:07 +0100)]
Merge pull request #2411 from martin-frbg/fix2254-038

Fix pre-processed POWER8 codes and wrong conditionals in the POWER8, PPC440 and PPC970 KERNEL files

Martin Liska [Fri, 14 Feb 2020 09:45:31 +0000 (10:45 +0100)]
Make ctest verbose for drone builder.

Martin Kroeker [Thu, 13 Feb 2020 21:44:09 +0000 (22:44 +0100)]
Update caxpy_power8.S

Martin Kroeker [Thu, 13 Feb 2020 20:24:54 +0000 (21:24 +0100)]
Update caxpy_power8.S

Martin Kroeker [Thu, 13 Feb 2020 17:38:43 +0000 (18:38 +0100)]
Update icamin_power8.S

Martin Liska [Thu, 13 Feb 2020 13:51:55 +0000 (14:51 +0100)]
Add CMake related files to .gitignore.

Martin Kroeker [Wed, 12 Feb 2020 23:00:32 +0000 (00:00 +0100)]
Update isamin_power8.S

Martin Kroeker [Wed, 12 Feb 2020 22:59:50 +0000 (23:59 +0100)]
Update isamax_power8.S

Martin Kroeker [Wed, 12 Feb 2020 22:57:48 +0000 (23:57 +0100)]
Update isamin_power8.S

Martin Kroeker [Wed, 12 Feb 2020 22:56:57 +0000 (23:56 +0100)]
Update isamax_power8.S

Martin Kroeker [Wed, 12 Feb 2020 19:00:29 +0000 (20:00 +0100)]
Fix syntax of endianness conditional

Martin Kroeker [Wed, 12 Feb 2020 18:58:42 +0000 (19:58 +0100)]
Fix syntax of endianness conditional

Martin Kroeker [Wed, 12 Feb 2020 18:56:52 +0000 (19:56 +0100)]
Fix syntax of endianness conditional and add gcc version check for workaround

Martin Kroeker [Wed, 12 Feb 2020 18:16:14 +0000 (19:16 +0100)]
Merge pull request #27 from xianyi/develop

rebase

Martin Kroeker [Wed, 12 Feb 2020 14:38:37 +0000 (15:38 +0100)]
Merge pull request #2410 from bartoldeman/fix-dscal-inline-asm

Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408

Bart Oldeman [Wed, 12 Feb 2020 14:11:44 +0000 (14:11 +0000)]
Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408

The leaq instructions in dscal_kernel_inc_8 modify x and x1 so they
must be declared as input/output constraints, otherwise the compiler
may assume the corresponding registers are not modified.
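
The constraint issue described above can be sketched with a much smaller routine. This is an illustrative example only, not the dscal kernel: the function scale_in_place is hypothetical, the asm path assumes GCC or Clang on x86-64, and other targets fall back to plain C. Because leaq advances the pointer inside the asm block, the pointer operand is declared with "+r" (input and output) rather than as an input only.

```c
#include <stddef.h>

/* Hypothetical helper: multiplies n doubles by alpha, one element per pass. */
static void scale_in_place(double *x, size_t n, double alpha) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    while (n--) {
        __asm__ volatile(
            "movsd (%[p]), %%xmm0   \n\t" /* load *x                        */
            "mulsd %[a], %%xmm0     \n\t" /* multiply by alpha              */
            "movsd %%xmm0, (%[p])   \n\t" /* store the result back to *x    */
            "leaq 8(%[p]), %[p]     \n\t" /* advance x: the register changes */
            : [p] "+r"(x)                 /* "+r": x is read AND written     */
            : [a] "x"(alpha)              /* alpha lives in an SSE register  */
            : "xmm0", "memory");          /* scratch register, memory store  */
    }
#else
    /* Plain C fallback for other compilers and architectures. */
    while (n--) {
        *x *= alpha;
        x++;
    }
#endif
}
```

Had x been listed as a plain input operand, the compiler would be entitled to assume its register still held the original address after the asm statement, and the loop could keep rewriting the same element.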

Martin Kroeker [Tue, 11 Feb 2020 12:04:44 +0000 (13:04 +0100)]
Merge pull request #2407 from susilehtola/patch-2

Patch out instances of Z15 in dynamic_zarch.c

Martin Kroeker [Tue, 11 Feb 2020 12:03:35 +0000 (13:03 +0100)]
Merge pull request #2405 from susilehtola/patch-1

Fix typo in dynamic_zarch.c

Martin Kroeker [Tue, 11 Feb 2020 12:00:36 +0000 (13:00 +0100)]
Merge pull request #2404 from martin-frbg/issue2395

Fix spurious application of USE_TRMM in cmake builds

Martin Kroeker [Tue, 11 Feb 2020 12:00:16 +0000 (13:00 +0100)]
Merge pull request #2403 from martin-frbg/issue2400

Fix coretype identification of Intel Cannon Lake, Ice Lake and Goldmont

Martin Kroeker [Tue, 11 Feb 2020 11:59:53 +0000 (12:59 +0100)]
Merge pull request #2402 from gxw-loongson/develop

Avoid printing the following information on mips and mips64 when check msa

Martin Kroeker [Tue, 11 Feb 2020 11:56:56 +0000 (12:56 +0100)]
Merge pull request #2399 from martin-frbg/buffersize

Make BUFFER_SIZE configurable at build time

Susi Lehtola [Tue, 11 Feb 2020 02:07:33 +0000 (15:07 +1300)]
Patch out instances of Z15 in dynamic_zarch.c

There does not appear to be a Z15 kernel yet, causing link errors from the code. This patch fixes the issue.

Susi Lehtola [Tue, 11 Feb 2020 01:46:30 +0000 (14:46 +1300)]
Fix typo in dynamic_zarch.c

Martin Kroeker [Mon, 10 Feb 2020 20:17:39 +0000 (21:17 +0100)]
Fix bad conditional syntax that caused spurious application of USE_TRMM

Martin Kroeker [Mon, 10 Feb 2020 18:17:32 +0000 (19:17 +0100)]
Fix coretype detection for Intel extended models 6 and 7

affecting Goldmont, Cannon Lake, Ice Lake autodetection
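
For context, the "extended model" is a field of the CPUID leaf 1 EAX signature that is combined with the base model number on family 6 and family 15 parts. The sketch below follows Intel's documented bit layout; the type cpu_signature and the function decode_signature are hypothetical names for illustration and are not taken from the OpenBLAS sources.

```c
#include <stdint.h>

/* Decoded fields of the CPUID leaf 1 EAX value. */
typedef struct {
    unsigned family;
    unsigned model;
    unsigned stepping;
} cpu_signature;

/* Hypothetical helper: splits a raw EAX signature into display values. */
static cpu_signature decode_signature(uint32_t eax) {
    unsigned stepping = eax & 0xf;          /* bits 3..0   */
    unsigned model    = (eax >> 4) & 0xf;   /* bits 7..4   */
    unsigned family   = (eax >> 8) & 0xf;   /* bits 11..8  */
    unsigned exmodel  = (eax >> 16) & 0xf;  /* bits 19..16 */
    unsigned exfamily = (eax >> 20) & 0xff; /* bits 27..20 */

    cpu_signature sig;
    sig.stepping = stepping;
    sig.family = family;
    sig.model = model;
    /* The extended model only contributes for families 6 and 15. */
    if (family == 6 || family == 15) {
        sig.model = (exmodel << 4) | model;
    }
    /* The extended family only contributes for family 15. */
    if (family == 15) {
        sig.family = family + exfamily;
    }
    return sig;
}
```

As an example, a signature of 0x000706E5 decodes to family 6, extended model 7, base model 0xE, giving display model 0x7E. A detection routine that branches on the extended model first therefore needs cases for values 6 and 7 to reach such parts.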

gxw [Mon, 10 Feb 2020 11:11:45 +0000 (19:11 +0800)]
Avoid printing the following information on mips and mips64 when check msa:
"unrecognized command line option ‘-mmsa’"

Martin Kroeker [Sun, 9 Feb 2020 22:32:57 +0000 (23:32 +0100)]
Make BUFFER_SIZE configurable

Martin Kroeker [Sun, 9 Feb 2020 22:30:22 +0000 (23:30 +0100)]
Make BUFFER_SIZE configurable

Martin Kroeker [Sun, 9 Feb 2020 22:28:04 +0000 (23:28 +0100)]
Add configuration option for BUFFER_SIZE

Martin Kroeker [Sun, 9 Feb 2020 22:23:55 +0000 (23:23 +0100)]
Merge pull request #26 from xianyi/develop

rebase

Martin Kroeker [Sun, 9 Feb 2020 22:18:44 +0000 (23:18 +0100)]
Increment version to 0.3.9.dev

Martin Kroeker [Sun, 9 Feb 2020 22:18:07 +0000 (23:18 +0100)]
Increment version to 0.3.9.dev

Martin Kroeker [Sun, 9 Feb 2020 22:16:06 +0000 (23:16 +0100)]
Merge branch 'release-0.3.0' into develop

Martin Kroeker [Sun, 9 Feb 2020 22:01:52 +0000 (23:01 +0100)]
Merge pull request #2397 from martin-frbg/038changes

Update Changelog with changes from 0.3.8

Martin Kroeker [Sun, 9 Feb 2020 22:00:36 +0000 (23:00 +0100)]
Update with changes from 0.3.8

Martin Kroeker [Sun, 9 Feb 2020 21:48:15 +0000 (22:48 +0100)]
Merge pull request #25 from xianyi/develop

rebase

Martin Kroeker [Sun, 9 Feb 2020 00:06:40 +0000 (01:06 +0100)]
typo fixes

Martin Kroeker [Sun, 9 Feb 2020 00:00:33 +0000 (01:00 +0100)]
Merge pull request #2393 from martin-frbg/issue2388

Provide more documentation in README.md

Martin Kroeker [Sat, 8 Feb 2020 23:13:40 +0000 (00:13 +0100)]
Merge pull request #2390 from martin-frbg/pgi

Small corrections for compilation with PGI compilers

Martin Kroeker [Sat, 8 Feb 2020 23:06:07 +0000 (00:06 +0100)]
Update CPU and OS support and document DYNAMIC_ARCH option in README.md

prompted by #2388

Martin Kroeker [Sat, 8 Feb 2020 09:20:13 +0000 (10:20 +0100)]
Remove PGI from list again as it is actually still not capable

Martin Kroeker [Fri, 7 Feb 2020 15:05:46 +0000 (16:05 +0100)]
Merge pull request #2389 from Zeyiii/develop

Fix bugs in benchmark of gemv

Martin Kroeker [Fri, 7 Feb 2020 15:03:51 +0000 (16:03 +0100)]
Remove OpenMP libraries from link list

Martin Kroeker [Fri, 7 Feb 2020 15:02:17 +0000 (16:02 +0100)]
Remove OpenMP libraries from link list

Martin Kroeker [Fri, 7 Feb 2020 12:47:12 +0000 (13:47 +0100)]
Merge pull request #2384 from wjc404/develop

Optimize AVX512 DGEMM (& DTRMM)

Martin Kroeker [Fri, 7 Feb 2020 12:01:31 +0000 (13:01 +0100)]
Add PGI to avx512-supporting compilers

Martin Kroeker [Fri, 7 Feb 2020 09:15:18 +0000 (10:15 +0100)]
Fix utest compilation with PGI

Martin Kroeker [Fri, 7 Feb 2020 09:09:25 +0000 (10:09 +0100)]
Set SUFFIX in tempfile commands, fix bad architecture option for PGI compiler in avx512 test

Martin Kroeker [Fri, 7 Feb 2020 09:03:02 +0000 (10:03 +0100)]
Merge pull request #24 from xianyi/develop

rebase

wjc404 [Thu, 6 Feb 2020 02:14:10 +0000 (02:14 +0000)]
Update dgemm_kernel_16x2_skylakex.c

wjc404 [Thu, 6 Feb 2020 01:47:46 +0000 (01:47 +0000)]
Update sgemm_kernel_8x4_haswell.c

wjc404 [Thu, 6 Feb 2020 01:46:36 +0000 (01:46 +0000)]
Update dgemm_kernel_16x2_skylakex.c

w00421467 [Wed, 5 Feb 2020 07:07:18 +0000 (15:07 +0800)]
Fix another branch

w00421467 [Wed, 5 Feb 2020 06:53:37 +0000 (14:53 +0800)]
Fix bugs in benchmark of gemv

wjc404 [Wed, 5 Feb 2020 05:36:57 +0000 (13:36 +0800)]
Update dgemm_kernel_16x2_skylakex.c

wjc404 [Wed, 5 Feb 2020 02:15:02 +0000 (10:15 +0800)]
Update trmm_R.c

wjc404 [Wed, 5 Feb 2020 02:09:41 +0000 (10:09 +0800)]
Update trmm_L.c

wjc404 [Tue, 4 Feb 2020 12:33:08 +0000 (20:33 +0800)]
Update level3_thread.c

wjc404 [Tue, 4 Feb 2020 12:30:23 +0000 (20:30 +0800)]
Update level3.c

wjc404 [Tue, 4 Feb 2020 11:55:26 +0000 (19:55 +0800)]
Update param.h

wjc404 [Mon, 3 Feb 2020 13:38:08 +0000 (21:38 +0800)]
Update KERNEL.SKYLAKEX

wjc404 [Mon, 3 Feb 2020 13:34:12 +0000 (21:34 +0800)]
Update param.h

wjc404 [Mon, 3 Feb 2020 13:32:56 +0000 (21:32 +0800)]
AVX512 16x2 DGEMM kernel

Martin Kroeker [Thu, 30 Jan 2020 16:07:19 +0000 (17:07 +0100)]
Merge pull request #2378 from martin-frbg/issue2377

Add -march option for AVX512 in cmake as well

Martin Kroeker [Thu, 30 Jan 2020 11:41:18 +0000 (12:41 +0100)]
Add -march option for AVX512

Martin Kroeker [Thu, 30 Jan 2020 09:27:29 +0000 (10:27 +0100)]
Merge pull request #2375 from ewanglong/master

Fix a few performance drops for some matrix sizes per data type

Martin Kroeker [Thu, 23 Jan 2020 20:50:19 +0000 (21:50 +0100)]
Merge pull request #2376 from wjc404/develop

Fix remaining bugs in parallel GEMM3M

wjc404 [Wed, 22 Jan 2020 17:40:03 +0000 (17:40 +0000)]
Update level3_gemm3m_thread.c

Wang,Long [Wed, 22 Jan 2020 15:07:50 +0000 (15:07 +0000)]
Fix a few performance drops for some matrix sizes per data type

Signed-off-by: Wang,Long <long1.wang@intel.com>