Martin Kroeker [Mon, 30 Mar 2020 18:15:37 +0000 (20:15 +0200)]
Merge pull request #2533 from martin-frbg/gemmdirect2
Use runtime check for AVX512 capability in DYNAMIC_ARCH builds made on SKX
Martin Kroeker [Thu, 26 Mar 2020 20:25:39 +0000 (21:25 +0100)]
Expose the support_avx512 function provided in dynamic.c
Martin Kroeker [Thu, 26 Mar 2020 20:12:56 +0000 (21:12 +0100)]
Use runtime check for AVX512 (sgemm_direct) capability when using DYNAMIC_ARCH
Martin Kroeker [Thu, 26 Mar 2020 20:06:51 +0000 (21:06 +0100)]
Merge pull request #39 from xianyi/develop
rebase
Martin Kroeker [Tue, 24 Mar 2020 14:44:46 +0000 (15:44 +0100)]
Merge pull request #2530 from martin-frbg/dynmsg
Add message highlighting minimum target choice at end of DYNAMIC_ARCH…
Martin Kroeker [Tue, 24 Mar 2020 14:44:27 +0000 (15:44 +0100)]
Merge pull request #2529 from shengyang-3390/dev1
Add ctest for drotm and modify ctest for drot.
Martin Kroeker [Mon, 23 Mar 2020 18:35:51 +0000 (19:35 +0100)]
Add message highlighting minimum target choice at end of DYNAMIC_ARCH builds
related to #2526
Martin Kroeker [Mon, 23 Mar 2020 11:47:19 +0000 (12:47 +0100)]
Merge pull request #2527 from martin-frbg/gemmdirect
Avoid calling DIRECT codepath in DYNAMIC_ARCH on non-SKX
shengyang [Sat, 21 Mar 2020 07:58:21 +0000 (15:58 +0800)]
Add ctest for drotm and modify ctest for drot.
Make sure that the test cases cover all code paths when the kernel uses loop unrolling.
Martin Kroeker [Sun, 22 Mar 2020 13:33:16 +0000 (14:33 +0100)]
Avoid calling DIRECT codepath in DYNAMIC_ARCH on non-SKX
Martin Kroeker [Sun, 22 Mar 2020 00:03:42 +0000 (01:03 +0100)]
Merge pull request #2521 from martin-frbg/cm-avx512
Use proper extension on the avx512 testcase filename
Martin Kroeker [Sat, 21 Mar 2020 17:47:48 +0000 (18:47 +0100)]
Merge pull request #2525 from andreas-schwab/develop
Fix ARCHCONFIG for Neoverse-N1
Andreas Schwab [Sat, 21 Mar 2020 16:33:33 +0000 (17:33 +0100)]
Fix ARCHCONFIG for Neoverse-N1
../config_kernel.h:24:9: warning: missing whitespace after the macro name
   24 | #define ARMV8-march armv8.2-a
      |         ^~~~~
Martin Kroeker [Fri, 20 Mar 2020 22:05:53 +0000 (23:05 +0100)]
Use proper extension on the avx512 testcase filename
The file only needed to be called .tmp while it was generated by a tmpfile call, and the "-x c" option that tells the compiler it is actually a C source is not universally supported (this broke the test with clang-cl at least).
Martin Kroeker [Fri, 20 Mar 2020 22:00:06 +0000 (23:00 +0100)]
Merge pull request #2518 from shengyang-3390/dev
Add ctest for srotm and modify ctest for srot.
Martin Kroeker [Fri, 20 Mar 2020 21:57:44 +0000 (22:57 +0100)]
Merge pull request #2519 from martin-frbg/issue2472
Fix cmake compilation with ifort on Windows
Martin Kroeker [Fri, 20 Mar 2020 00:08:10 +0000 (01:08 +0100)]
Make ifort on Windows create lowercase symbols with appended underscore
tentative fix for #2472
Martin Kroeker [Fri, 20 Mar 2020 00:05:22 +0000 (01:05 +0100)]
Merge pull request #38 from xianyi/develop
rebase
shengyang [Wed, 18 Mar 2020 06:17:32 +0000 (14:17 +0800)]
Add ctest for srotm and modify ctest for srot.
Make sure that the test cases cover all code paths when the kernel uses loop unrolling.
Martin Kroeker [Tue, 17 Mar 2020 09:12:53 +0000 (10:12 +0100)]
Merge pull request #2517 from wjc404/develop
Temporary fix for SKX STRSM
wjc404 [Tue, 17 Mar 2020 04:52:55 +0000 (12:52 +0800)]
Update KERNEL.SKYLAKEX
Martin Kroeker [Mon, 16 Mar 2020 20:59:55 +0000 (21:59 +0100)]
Merge pull request #2515 from zelong-1024/develop
[OpenBLAS]: benchmark for her/her2 LEVEL2 functions
Martin Kroeker [Mon, 16 Mar 2020 20:58:55 +0000 (21:58 +0100)]
Merge pull request #2513 from aaawuanjun/develop
[OpenBlas]: Add benchmark tpsv file and modify benchmark/Makefile
Martin Kroeker [Mon, 16 Mar 2020 20:58:34 +0000 (21:58 +0100)]
Merge pull request #2516 from wjc404/develop
AVX2 STRSM kernels
wjc404 [Mon, 16 Mar 2020 16:39:37 +0000 (16:39 +0000)]
Update KERNEL.ZEN
wjc404 [Mon, 16 Mar 2020 16:34:08 +0000 (00:34 +0800)]
AVX2 STRSM kernel
l00536773 [Mon, 16 Mar 2020 03:19:05 +0000 (11:19 +0800)]
[OpenBLAS]: benchmark for her/her2 LEVEL2 functions
[description]: benchmark for her/her2
[solution]: added benchmark for her/her2, modified makefile in benchmark
[dts]:
Martin Kroeker [Sat, 14 Mar 2020 13:21:30 +0000 (14:21 +0100)]
Merge pull request #2508 from liujingjue/develop
[OpenBLAS]:fix the iamax benchmark error
Martin Kroeker [Sat, 14 Mar 2020 12:27:40 +0000 (13:27 +0100)]
Merge pull request #2512 from martin-frbg/lapackh
Move declarations of lapack_complex_custom types outside the extern C
Martin Kroeker [Sat, 14 Mar 2020 12:08:36 +0000 (13:08 +0100)]
Merge pull request #2506 from xiaofengF/develop
Add benchmark for SPMV and fix segmentation fault when data size >= 50000
wuanjun 00447568 [Sat, 14 Mar 2020 01:11:08 +0000 (09:11 +0800)]
[OpenBlas]: Add benchmark tpsv file and modify benchmark/Makefile
[Description]: Solve lack of tpsv benchmark.
Martin Kroeker [Fri, 13 Mar 2020 22:16:35 +0000 (23:16 +0100)]
Merge pull request #2505 from aaawuanjun/develop
[OpenBlas]:Add benchmark tpmv.c and modify benchmark/Makefile
Martin Kroeker [Fri, 13 Mar 2020 22:04:01 +0000 (23:04 +0100)]
Merge pull request #2511 from martin-frbg/fixppctest
Prevent attempts to run ctest or test when fortran is not available
Martin Kroeker [Fri, 13 Mar 2020 19:34:13 +0000 (20:34 +0100)]
Move declarations of lapack_complex_custom types outside the extern C
fixes #2510
Martin Kroeker [Fri, 13 Mar 2020 19:11:19 +0000 (20:11 +0100)]
Do not attempt to run test without fortran
Martin Kroeker [Fri, 13 Mar 2020 19:10:26 +0000 (20:10 +0100)]
Do not attempt to run ctest without fortran
The main Makefile takes care of this in the build process, but users or CI jobs may try to run this directly
l00546269 [Fri, 13 Mar 2020 02:58:39 +0000 (10:58 +0800)]
[OpenBLAS]:fix the iamax benchmark error
[Description]: the result for i?amax is reported in MBytes, not MFlops
jayfely@qq.com [Wed, 11 Mar 2020 09:02:34 +0000 (17:02 +0800)]
Remove cspmv and zspmv to remove the error that occurred in Travis CI
jayfely@qq.com [Wed, 11 Mar 2020 08:36:45 +0000 (16:36 +0800)]
Modify Makefile in interface to remove the error that occurred in Travis CI
jayfely@qq.com [Wed, 11 Mar 2020 07:48:58 +0000 (15:48 +0800)]
Only keep spmv.goto and spmv.atlas
wuanjun 00447568 [Wed, 11 Mar 2020 04:31:48 +0000 (12:31 +0800)]
[OpenBlas]:Add benchmark tpmv.c and modify Makefile
[Description]:Solve the problem of missing tpmv.c benchmark file
jayfely@qq.com [Wed, 11 Mar 2020 02:30:09 +0000 (10:30 +0800)]
Update spmv.c: solve segmentation fault when m and n are larger than 50000
Martin Kroeker [Tue, 10 Mar 2020 22:38:07 +0000 (23:38 +0100)]
Merge pull request #2503 from martin-frbg/xerbl
Apply fix for LAPACK issue 394 (fixed-form code beyond column 72)
Martin Kroeker [Tue, 10 Mar 2020 19:01:23 +0000 (20:01 +0100)]
Merge pull request #2502 from martin-frbg/issue2497
Fix INTERFACE64 not propagating to the fortran codes on ARMV8
Martin Kroeker [Tue, 10 Mar 2020 15:44:40 +0000 (16:44 +0100)]
Merge pull request #2501 from jijiwawa/Fix_mistakes
Fix pr #2487 error
s00527847 [Tue, 10 Mar 2020 23:26:06 +0000 (19:26 -0400)]
Use the correct unit of measure
Martin Kroeker [Tue, 10 Mar 2020 12:37:41 +0000 (13:37 +0100)]
Apply fix for Reference-LAPACK issue 394
A reference to XERBLA extended beyond column 72, breaking builds with compilers that default to the traditional punch card (fixed-form) format.
Martin Kroeker [Tue, 10 Mar 2020 11:51:07 +0000 (12:51 +0100)]
Restore INTERFACE64 for arm64
Martin Kroeker [Tue, 10 Mar 2020 11:49:21 +0000 (12:49 +0100)]
Merge pull request #37 from xianyi/develop
rebase
jayfely@qq.com [Tue, 10 Mar 2020 06:32:18 +0000 (14:32 +0800)]
Modify Makefile in Benchmark
jayfely@qq.com [Tue, 10 Mar 2020 06:22:18 +0000 (14:22 +0800)]
Add benchmark for SPMV
Zhang Xianyi [Mon, 9 Mar 2020 08:04:33 +0000 (16:04 +0800)]
Merge pull request #2498 from njutcz/develop
Add benchmark for ?amax, ?max, ?amin, ?min, i?max, i?amin and i?min.
s00548429 [Mon, 9 Mar 2020 07:36:50 +0000 (15:36 +0800)]
Fix the functional bugs for zamax.
s00548429 [Mon, 9 Mar 2020 06:59:03 +0000 (14:59 +0800)]
Add benchmark for ?amax, ?max, ?amin, ?min, i?max, i?amin and i?min.
njutcz [Mon, 9 Mar 2020 02:39:40 +0000 (10:39 +0800)]
Merge pull request #1 from xianyi/develop
update
Martin Kroeker [Sun, 8 Mar 2020 07:09:58 +0000 (08:09 +0100)]
Merge pull request #2495 from ZuoQ3/develop
add benchmark for axpby test
Martin Kroeker [Sat, 7 Mar 2020 22:04:21 +0000 (23:04 +0100)]
Merge pull request #2494 from shengyang-3390/develop
add benchmark for csrot and zdrot
Martin Kroeker [Sat, 7 Mar 2020 21:26:00 +0000 (22:26 +0100)]
Merge pull request #2489 from jijiwawa/brightness
Remove redundant code
s00527847 [Sat, 7 Mar 2020 18:09:19 +0000 (13:09 -0500)]
add trmm.c
s00527847 [Wed, 4 Mar 2020 22:44:50 +0000 (17:44 -0500)]
Remove redundant code
Martin Kroeker [Sat, 7 Mar 2020 15:55:53 +0000 (16:55 +0100)]
Merge pull request #2493 from martin-frbg/plainmake
Fix use of make vs $(MAKE) in building lapack-testing
Martin Kroeker [Sat, 7 Mar 2020 15:52:29 +0000 (16:52 +0100)]
Merge pull request #2488 from liujingjue/develop
Modify the main Makefile in OpenBLAS
zq [Sat, 7 Mar 2020 09:48:55 +0000 (17:48 +0800)]
Add benchmark file axpby.c and modify benchmark/Makefile to test s/d/c/zaxpby
zq [Sat, 7 Mar 2020 09:04:59 +0000 (17:04 +0800)]
Merge pull request #1 from xianyi/develop
update
shengyang [Sat, 7 Mar 2020 07:17:49 +0000 (15:17 +0800)]
add benchmark for csrot and zdrot
modified: benchmark/Makefile
modified: benchmark/rot.c
l00546269 [Sat, 7 Mar 2020 02:14:33 +0000 (10:14 +0800)]
[OpenBLAS]: modified the Makefile
[Description]: check the compiler version and show detailed information
Martin Kroeker [Fri, 6 Mar 2020 14:37:26 +0000 (15:37 +0100)]
Fix another spot where make was used instead of $(MAKE)
This broke lapack-testing on BSD, whose default "make" does not support GNU Makefile syntax.
Martin Kroeker [Fri, 6 Mar 2020 14:32:27 +0000 (15:32 +0100)]
Merge pull request #36 from xianyi/develop
rebase
Martin Kroeker [Fri, 6 Mar 2020 14:06:42 +0000 (15:06 +0100)]
Merge pull request #2491 from chenxuqiang/hbmv_benchmark
benchmark/hpmv&hbmv: add benchmark/hpmv.c and benchmark/hbmv.c
Martin Kroeker [Fri, 6 Mar 2020 14:05:55 +0000 (15:05 +0100)]
Merge pull request #2490 from shengyang-3390/develop
Add benchmark file rotm.c and modify benchmark/Makefile to test s/drotm
Martin Kroeker [Fri, 6 Mar 2020 13:42:25 +0000 (14:42 +0100)]
Merge pull request #2487 from jijiwawa/develop
add benchmark for spr/spr2
Martin Kroeker [Fri, 6 Mar 2020 13:41:40 +0000 (14:41 +0100)]
Merge branch 'develop' into develop
Martin Kroeker [Fri, 6 Mar 2020 13:30:09 +0000 (14:30 +0100)]
Merge pull request #2486 from qqqil/develop
add benchmark for trsv
Martin Kroeker [Fri, 6 Mar 2020 13:29:27 +0000 (14:29 +0100)]
Merge pull request #2485 from Darkness303/develop
Add syr2 benchmark
Martin Kroeker [Fri, 6 Mar 2020 13:28:58 +0000 (14:28 +0100)]
Merge pull request #2469 from AGSaidi/acq-rel-2
Use acq/rel semantics to pass flags/pointers in getrf_parallel.
Ali Saidi [Mon, 24 Feb 2020 05:45:30 +0000 (05:45 +0000)]
Use acq/rel semantics to pass flags/pointers in getrf_parallel.
The current implementation has locks, but the locks each only
have a critical section of one variable so atomic reads/writes
with barriers can be used to achieve the same behavior.
As with the previous patch, the motivation is that pthread_mutex_lock isn't fair: in a
tight loop the thread that last held the lock can keep reacquiring it,
starving another thread, even if that thread is about to write
the data that will stop the current thread from spinning.
On a 64-core Arm system this improves performance by 20x on sgesv.goto.
chenxuqiang [Fri, 6 Mar 2020 06:02:02 +0000 (01:02 -0500)]
benchmark/hpmv&hbmv: add benchmark/hpmv.c and benchmark/hbmv.c
Signed-off-by: Xuqiang Chen chenxuqiang3@hisilicon.com
shengyang [Thu, 5 Mar 2020 01:55:16 +0000 (09:55 +0800)]
Add benchmark file rotm.c and modify benchmark/Makefile to test s/drotm
modified: benchmark/Makefile
new file: benchmark/rotm.c
s00527847 [Wed, 4 Mar 2020 20:50:19 +0000 (15:50 -0500)]
add benchmark for spr/spr2
q00437336 [Wed, 4 Mar 2020 08:54:40 +0000 (03:54 -0500)]
change clock to CLOCK_PROCESS_CPUTIME_ID
l00546269 [Wed, 4 Mar 2020 08:47:23 +0000 (16:47 +0800)]
[OpenBLAS]: modified the Makefile
[Description]: add C/Fortran compiler version information to the final note
q00437336 [Wed, 4 Mar 2020 07:57:33 +0000 (02:57 -0500)]
add benchmark for trsv
Martin Kroeker [Wed, 4 Mar 2020 07:06:06 +0000 (08:06 +0100)]
Merge pull request #2484 from RajalakshmiSR/power-dynamic
Fix DYNAMIC_ARCH build for POWER9
Martin Kroeker [Wed, 4 Mar 2020 06:59:56 +0000 (07:59 +0100)]
Merge pull request #2483 from aaawuanjun/develop
Add benchmark file trmv.c and modify benchmark/Makefile to test s/d/c/ztrmv
Martin Kroeker [Wed, 4 Mar 2020 06:59:31 +0000 (07:59 +0100)]
Merge pull request #2466 from AGSaidi/acq-rel-1
Switch blas_server to use acq/rel semantics
Darkness303 [Wed, 4 Mar 2020 06:09:10 +0000 (14:09 +0800)]
1. Add syr2 benchmark
2. Fix some errors
Martin Kroeker [Tue, 3 Mar 2020 20:37:48 +0000 (21:37 +0100)]
Fix cut/paste glitch
Martin Kroeker [Tue, 3 Mar 2020 20:04:12 +0000 (21:04 +0100)]
Restore initializers for mutex and conditional
Rajalakshmi Srinivasaraghavan [Tue, 3 Mar 2020 18:35:10 +0000 (12:35 -0600)]
Fix DYNAMIC_ARCH build for POWER9
Setting DYNAMIC_ARCH=1 on POWER9 does not build POWER9 files due to some
compiler version checks. This patch fixes some of the macros that are used
to check compiler version. On fixing those checks, there are some new make
failures related to icamin, icamax, isamin, isamax and caxpy files on POWER9.
This patch fixes those failures as well.
wuanjun 00447568 [Tue, 3 Mar 2020 11:03:57 +0000 (19:03 +0800)]
Merge branch 'develop' of https://github.com/aaawuanjun/OpenBLAS into develop
wuanjun 00447568 [Tue, 3 Mar 2020 09:39:26 +0000 (17:39 +0800)]
Merge branch 'develop' of https://github.com/aaawuanjun/OpenBLAS into develop
wuanjun 00447568 [Tue, 3 Mar 2020 09:13:49 +0000 (17:13 +0800)]
[OpenBlas]: add benchmark file trmv.c and modify benchmark/Makefile to test s/d/c/ztrmv
Martin Kroeker [Tue, 3 Mar 2020 07:46:49 +0000 (08:46 +0100)]
Merge pull request #2479 from Darkness303/develop
Fix potential index overflows at large matrix sizes in the benchmark codes
Martin Kroeker [Tue, 3 Mar 2020 07:43:00 +0000 (08:43 +0100)]
Merge pull request #2436 from marxin/improve-utest-coverage
Improve test coverage for utests.
Martin Kroeker [Mon, 2 Mar 2020 20:21:29 +0000 (21:21 +0100)]
Merge pull request #2481 from ChinouneMehdi/fix2480
Fix #2480
Martin Kroeker [Mon, 2 Mar 2020 20:20:51 +0000 (21:20 +0100)]
Merge pull request #2478 from MacChen02/develop
Update benchmark statistical time function
مهدي شينون (Mehdi Chinoune) [Mon, 2 Mar 2020 16:22:28 +0000 (17:22 +0100)]
fixes #2480
Martin Liska [Wed, 19 Feb 2020 17:24:01 +0000 (18:24 +0100)]
Improve test coverage for utests.