Martin Kroeker [Tue, 29 May 2018 20:02:06 +0000 (22:02 +0200)]
Merge pull request #1579 from martin-frbg/issue1574
Adapt lapack-test and blas-test to changes in netlib directory layout
Martin Kroeker [Tue, 29 May 2018 12:27:46 +0000 (14:27 +0200)]
Adapt lapack-test and blas-test to changes in netlib directory layout
partial fix for #1574 - the problem with lapack_testing.py looks like an upstream bug
Zhang Xianyi [Thu, 24 May 2018 12:56:24 +0000 (20:56 +0800)]
Add -lm for Android.
Conflicts:
exports/Makefile
Martin Kroeker [Wed, 23 May 2018 20:55:37 +0000 (22:55 +0200)]
Merge pull request #1572 from martin-frbg/issue1571
Use the new zrot.c on POWER8 for crot as well
Martin Kroeker [Wed, 23 May 2018 20:54:39 +0000 (22:54 +0200)]
Use the new zrot.c on POWER8 for crot as well
fixes #1571 (the old zrot.S assembly does not handle incx=0 correctly)
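The incx=0 case mentioned above can be illustrated with a minimal sketch, assuming a plain strided loop in C. This is not the actual zrot.c from OpenBLAS; the type `zsketch_t` and the function `zrot_sketch` are hypothetical names used only for illustration. The point is that a generic loop needs no special case for a zero increment, since it simply revisits the same element on every iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex double type, not OpenBLAS's internal layout. */
typedef struct { double re, im; } zsketch_t;

/* Apply a plane rotation with real c and s to strided complex vectors:
 *   x[i] <- c*x[i] + s*y[i]
 *   y[i] <- c*y[i] - s*x[i]
 * A zero increment revisits the same element n times, so the loop
 * needs no special handling for incx == 0 or incy == 0. */
void zrot_sketch(int n, zsketch_t *x, int incx, zsketch_t *y, int incy,
                 double c, double s)
{
    assert(x != NULL && y != NULL);
    if (n <= 0) return;

    /* Negative increments walk the vector backwards from its far end. */
    ptrdiff_t ix = incx < 0 ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = incy < 0 ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int i = 0; i < n; i++) {
        zsketch_t xv = x[ix];
        zsketch_t yv = y[iy];
        x[ix].re = c * xv.re + s * yv.re;
        x[ix].im = c * xv.im + s * yv.im;
        y[iy].re = c * yv.re - s * xv.re;
        y[iy].im = c * yv.im - s * xv.im;
        ix += incx;
        iy += incy;
    }
}
```

With incx=0 and incy=1, the single x element is updated once per iteration and each y element sees the x value left by the previous step.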
Martin Kroeker [Thu, 17 May 2018 18:50:23 +0000 (20:50 +0200)]
Merge pull request #1567 from martin-frbg/mipstrmm
Revert "Switch mips32 target to USE_TRMM to fix complex TRMM"
Martin Kroeker [Thu, 17 May 2018 18:30:03 +0000 (20:30 +0200)]
Revert "Switch mips32 target to USE_TRMM to fix complex TRMM"
... as it was just a silly workaround for the issue seen in #1563, caused by #1419
Martin Kroeker [Thu, 17 May 2018 18:22:58 +0000 (20:22 +0200)]
Merge pull request #1565 from martin-frbg/mipstypo
Remove extraneous brace from previous commit of mips dsdot fix
Martin Kroeker [Thu, 17 May 2018 16:43:59 +0000 (18:43 +0200)]
Remove extraneous brace from previous commit
Martin Kroeker [Thu, 17 May 2018 12:04:13 +0000 (14:04 +0200)]
Merge pull request #1564 from martin-frbg/issue1563
Revert changes from PR#1419
Martin Kroeker [Thu, 17 May 2018 09:40:08 +0000 (11:40 +0200)]
Revert changes from PR#1419
at least one of these changes apparently is an oversimplification, leading to TRMM breakage on some platforms as observed in #1563
Martin Kroeker [Tue, 15 May 2018 15:46:09 +0000 (17:46 +0200)]
Merge pull request #1562 from martin-frbg/issue1561
Use correct data type for initializers of v2f64, v4f32
Martin Kroeker [Tue, 15 May 2018 12:42:12 +0000 (14:42 +0200)]
Use correct data type for initializers of v2f64, v4f32
Fixes #1561
Martin Kroeker [Mon, 14 May 2018 16:49:53 +0000 (18:49 +0200)]
Merge pull request #1559 from martin-frbg/buildconf
Add build-time configuration options to pkgconfig file
Martin Kroeker [Mon, 14 May 2018 15:38:12 +0000 (17:38 +0200)]
Merge pull request #1558 from martin-frbg/instpc
Overwrite any pre-existing openblas.pc rather than append to it
Martin Kroeker [Mon, 14 May 2018 15:37:55 +0000 (17:37 +0200)]
Merge pull request #1557 from martin-frbg/getconfig
Add threading and OpenMP information to output
Martin Kroeker [Sun, 13 May 2018 22:10:15 +0000 (00:10 +0200)]
Add build-time configuration options to pkgconfig file
Martin Kroeker [Sun, 13 May 2018 22:09:35 +0000 (00:09 +0200)]
Add build-time configuration options to pkgconfig file
Martin Kroeker [Sat, 12 May 2018 20:11:27 +0000 (22:11 +0200)]
Overwrite any pre-existing openblas.pc rather than append to it
Martin Kroeker [Sat, 12 May 2018 10:11:38 +0000 (12:11 +0200)]
Add threading and OpenMP information to output
For #1416 and #1529, more information about the options OpenBLAS was built with is needed. Additionally we may want to add this data to the openblas.pc file (but not all projects use pkgconfig, and as far as I am aware the cmake module for accessing it does not make such "private" declarations available)
Zhang Xianyi [Fri, 11 May 2018 09:02:47 +0000 (17:02 +0800)]
Merge pull request #1556 from WestAlgo/develop
move _Atomic define to common.h
zhiyong.dang [Fri, 11 May 2018 07:13:16 +0000 (00:13 -0700)]
move _Atomic define to common.h
Zhang Xianyi [Fri, 11 May 2018 04:25:24 +0000 (12:25 +0800)]
Merge pull request #1555 from WestAlgo/develop
Change _STDC_VERSION__ to __STDC_VERSION__
Zhiyong Dang [Fri, 11 May 2018 04:15:08 +0000 (12:15 +0800)]
Change _STDC_VERSION__ to __STDC_VERSION__
Change-Id: Id3fa4e8d9eedd4ef7230df69b611e7f397301a42
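Why the macro spelling matters can be shown with a small sketch. In a `#if` expression the C preprocessor treats an undefined identifier as 0, so a misspelled `_STDC_VERSION__` silently disables the C11 branch instead of raising an error. The guard below is an assumption about what such a check might look like, not the actual contents of common.h; `SKETCH_ATOMIC`, `sketch_counter` and `sketch_bump` are hypothetical names.

```c
#include <assert.h>

/* SKETCH_ATOMIC is a hypothetical name used for illustration only.
 * Use _Atomic when the compiler reports C11 with atomics support,
 * otherwise fall back to volatile. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
#define SKETCH_ATOMIC _Atomic
#else
#define SKETCH_ATOMIC volatile
#endif

static SKETCH_ATOMIC int sketch_counter = 0;

/* Increment the shared counter and return the new value.
 * The increment is atomic only on the _Atomic branch. */
int sketch_bump(void)
{
    int next = ++sketch_counter;
    assert(next > 0);
    return next;
}
```

On the fallback branch `volatile` only prevents the compiler from caching the value; it does not make the increment atomic.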
Zhang Xianyi [Fri, 11 May 2018 02:09:14 +0000 (10:09 +0800)]
Merge pull request #1536 from WestAlgo/develop
Fix race condition in blas_server_omp.c
Martin Kroeker [Thu, 10 May 2018 13:32:08 +0000 (15:32 +0200)]
Merge pull request #1554 from martin-frbg/lapack-249
LAPACKE fixes from lapack PR249
Martin Kroeker [Thu, 10 May 2018 11:15:42 +0000 (13:15 +0200)]
LAPACKE fixes from lapack PR249
Copied from Reference-LAPACK/lapack#249, this fixes out-of-bounds memory accesses
in the nancheck calls of the LAPACKE lacgv, lassq, larfg, larfb, larfx and mtr functions
Martin Kroeker [Wed, 9 May 2018 12:39:52 +0000 (14:39 +0200)]
Merge pull request #1553 from martin-frbg/ifort-openmpflag
Change -openmp to -fopenmp for ifort entry as well
Martin Kroeker [Wed, 9 May 2018 10:34:09 +0000 (12:34 +0200)]
Change -openmp to -fopenmp for ifort entry as well
Martin Kroeker [Wed, 9 May 2018 07:02:52 +0000 (09:02 +0200)]
Merge pull request #1551 from martin-frbg/f_check_fix
Fixes for ifort 2018
Martin Kroeker [Wed, 9 May 2018 07:02:38 +0000 (09:02 +0200)]
Merge pull request #1550 from martin-frbg/ifort-openmpflag
Update compiler flag for openmp use with ICC
Martin Kroeker [Tue, 8 May 2018 21:52:55 +0000 (23:52 +0200)]
Merge pull request #1549 from martin-frbg/fix_ompcheck
Drop C-style "L" suffix from OPENMP version number tests in the LAPACK source
Martin Kroeker [Tue, 8 May 2018 19:55:37 +0000 (21:55 +0200)]
Fixes for ifort 2018
1. the already deprecated -openmp option was removed in 2018, switch to -fopenmp
2. add leading blank in search for "zho_ge__" symbol to work around misleading tags in the 2018 assembly
Expected to fix #1548
Martin Kroeker [Tue, 8 May 2018 19:47:10 +0000 (21:47 +0200)]
Update compiler flag for openmp use with ICC
The deprecated -openmp option was finally removed in favor of -qopenmp or -fopenmp, picking the latter to stay compatible with Intel compiler versions before 2015 (when -q options were introduced). Fixes #1546
Martin Kroeker [Tue, 8 May 2018 19:39:42 +0000 (21:39 +0200)]
Drop C-style "L" suffix from OPENMP version number in check
Martin Kroeker [Tue, 8 May 2018 19:38:25 +0000 (21:38 +0200)]
Drop C-style "L" suffix from OPENMP version number in check
Martin Kroeker [Tue, 8 May 2018 19:36:56 +0000 (21:36 +0200)]
Drop C-style "L" suffix from OPENMP version number in check
Martin Kroeker [Wed, 2 May 2018 20:47:45 +0000 (22:47 +0200)]
Merge pull request #1543 from martin-frbg/mips32
Fix MIPS32 build and add MIPS 1004K cpu (MT7621 SOC)
Martin Kroeker [Wed, 2 May 2018 18:37:06 +0000 (20:37 +0200)]
Restore compiler options for mips P5600 target
Martin Kroeker [Wed, 2 May 2018 18:27:56 +0000 (20:27 +0200)]
Add MIPS 1004K target
Martin Kroeker [Wed, 2 May 2018 18:25:32 +0000 (20:25 +0200)]
Switch mips32 target to USE_TRMM to fix complex TRMM
Martin Kroeker [Wed, 2 May 2018 18:20:44 +0000 (20:20 +0200)]
Add MIPS 1004K target (Mediatek MT7621 SOC)
Martin Kroeker [Wed, 2 May 2018 18:17:26 +0000 (20:17 +0200)]
Add mips32r2 api target
Martin Kroeker [Wed, 2 May 2018 18:12:25 +0000 (20:12 +0200)]
Make cpuid_mips compile again and add 1004K cpu
Martin Kroeker [Wed, 2 May 2018 16:11:50 +0000 (18:11 +0200)]
Merge pull request #1542 from martin-frbg/quickdiv64
Avoid out-of-bounds accesses in blas_quickdivide on big X86 systems
Martin Kroeker [Wed, 2 May 2018 12:44:50 +0000 (14:44 +0200)]
Omit the divide table overflow check on small systems
Martin Kroeker [Wed, 2 May 2018 12:43:08 +0000 (14:43 +0200)]
Omit the table overflow check when building for small systems
Martin Kroeker [Sun, 29 Apr 2018 12:40:12 +0000 (14:40 +0200)]
Update common_x86_64.h
Martin Kroeker [Sun, 29 Apr 2018 12:38:55 +0000 (14:38 +0200)]
Avoid out-of-bounds reads from blas_quick_divide_table on big systems
Martin Kroeker [Sun, 29 Apr 2018 12:34:33 +0000 (14:34 +0200)]
Avoid out-of-bounds reads from blas_quick_divide_table on big systems
Should fix #1541
Martin Kroeker [Fri, 27 Apr 2018 21:10:21 +0000 (23:10 +0200)]
Merge pull request #1539 from martin-frbg/ztrmv-1332
Disable multithreading in ztrmv
Martin Kroeker [Fri, 27 Apr 2018 21:09:57 +0000 (23:09 +0200)]
Merge pull request #1486 from martin-frbg/atomic
Use _Atomic instead of volatile for thread safety where C11 is supported
Martin Kroeker [Fri, 27 Apr 2018 10:08:06 +0000 (12:08 +0200)]
Update Makefile.rule
Zhiyong Dang [Tue, 24 Apr 2018 02:34:53 +0000 (10:34 +0800)]
Fix race condition in blas_server_omp.c
Change-Id: Ic896276cd073d6b41930c7c5a29d66348cd1725d
Martin Kroeker [Wed, 25 Apr 2018 21:23:00 +0000 (23:23 +0200)]
Merge pull request #1540 from martin-frbg/mips32-zasum
Fix typo in MIPS P5600 complex ASUM code selection
Martin Kroeker [Wed, 25 Apr 2018 20:50:10 +0000 (22:50 +0200)]
Fix typo in MIPS P5600 complex ASUM code selection
Martin Kroeker [Wed, 25 Apr 2018 20:35:46 +0000 (22:35 +0200)]
Disable multithreading in ztrmv
BLAS-Tester shows that the same problem exists as with DTRMV (issue #1332)
Martin Kroeker [Wed, 25 Apr 2018 06:38:58 +0000 (08:38 +0200)]
Merge pull request #1538 from martin-frbg/arm7utest
Fix handling of zero INCX, INCY in ArmV7 AXPY and ROT
Martin Kroeker [Tue, 24 Apr 2018 20:43:00 +0000 (22:43 +0200)]
Move the test for zero incx,incy in ARMV7 ROT
to pass the related utest (see #1469)
Martin Kroeker [Tue, 24 Apr 2018 20:39:50 +0000 (22:39 +0200)]
Drop test for zero incx,incy in armv7 AXPY
...to pass the related utest (see #1469)
Martin Kroeker [Mon, 23 Apr 2018 17:05:49 +0000 (19:05 +0200)]
Use generic zrot.c on ppc64/POWER6 to work around utest failure from … (#1535)
* Use generic C implementation of zrot on ppc64/POWER6 to work around utest failure from #1469
Martin Kroeker [Sun, 22 Apr 2018 21:34:17 +0000 (23:34 +0200)]
Merge pull request #1534 from xianyi/revert-1333-haswell32
Revert "Fix 32bit HASWELL builds"
Martin Kroeker [Sun, 22 Apr 2018 18:20:04 +0000 (20:20 +0200)]
Revert "Fix 32bit HASWELL builds"
Martin Kroeker [Fri, 20 Apr 2018 21:44:15 +0000 (23:44 +0200)]
Merge pull request #1532 from martin-frbg/utest-cblas
Do not try to build the fork utest when NO_CBLAS=1
Martin Kroeker [Fri, 20 Apr 2018 13:43:59 +0000 (15:43 +0200)]
fork utest depends on CBLAS
Martin Kroeker [Fri, 20 Apr 2018 13:42:13 +0000 (15:42 +0200)]
fork utest depends on CBLAS
Martin Kroeker [Thu, 19 Apr 2018 12:10:57 +0000 (14:10 +0200)]
Merge pull request #1530 from ashwinyes/develop_20180419_Tx2AutoDetect
ARM64: Enable Auto Detection of ThunderX2T99
Ashwin Sekhar T K [Thu, 19 Apr 2018 09:05:25 +0000 (09:05 +0000)]
ARM64: Enable Auto Detection of ThunderX2T99
Martin Kroeker [Sun, 15 Apr 2018 11:09:30 +0000 (13:09 +0200)]
Merge pull request #1523 from martin-frbg/utest_waith
Include sys/types.h for proper typedefs related to wait()
Martin Kroeker [Sat, 14 Apr 2018 20:24:34 +0000 (22:24 +0200)]
Merge pull request #1520 from martin-frbg/cpucounts
Catch invalid cpu count returned by CPU_COUNT_S
Martin Kroeker [Sat, 14 Apr 2018 16:59:46 +0000 (18:59 +0200)]
Include sys/types.h for proper typedefs related to wait()
Should fix #1519
Martin Kroeker [Sat, 14 Apr 2018 16:29:10 +0000 (18:29 +0200)]
Catch invalid cpu count returned by CPU_COUNT_S
mips32 was seen to return zero here, driving nthreads to zero and causing a subsequent floating-point exception (FPE) in blas_quickdivide
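The guard described above might look roughly like the following sketch. It does not call CPU_COUNT_S itself, so that it stays portable; the reported count is passed in as a parameter. The names `clamp_cpu_count` and `split_work` are hypothetical and not taken from the OpenBLAS sources.

```c
#include <assert.h>

/* Treat any non-positive CPU count as "one CPU" so that later
 * divisions by the thread count cannot divide by zero. */
int clamp_cpu_count(int reported)
{
    return reported < 1 ? 1 : reported;
}

/* Divide a workload among threads, using the clamped count. */
int split_work(int total, int reported_cpus)
{
    int nthreads = clamp_cpu_count(reported_cpus);
    assert(nthreads >= 1);
    return total / nthreads;
}
```

Clamping at the point where the count is read keeps every later consumer of the thread count safe without each one needing its own check.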
Martin Kroeker [Wed, 11 Apr 2018 06:21:25 +0000 (08:21 +0200)]
Merge pull request #1515 from martin-frbg/mipsdot
Correct precision of mips dsdot
Martin Kroeker [Tue, 10 Apr 2018 21:30:59 +0000 (23:30 +0200)]
Fix precision of mips dsdot
Martin Kroeker [Sat, 7 Apr 2018 21:31:26 +0000 (23:31 +0200)]
Merge pull request #1512 from ararslan/aa/travis-macos-2
Add macOS to the Travis testing matrix: Take 2!
Alex Arslan [Sat, 7 Apr 2018 19:29:57 +0000 (12:29 -0700)]
Add a BINARY=32 build to macOS
Alex Arslan [Sat, 7 Apr 2018 17:56:34 +0000 (10:56 -0700)]
Add macOS to the Travis testing matrix
Martin Kroeker [Sat, 7 Apr 2018 11:29:31 +0000 (13:29 +0200)]
Merge pull request #1511 from xianyi/revert-1510-aa/travis-macos
Revert "Add macOS to the Travis testing matrix"
Martin Kroeker [Sat, 7 Apr 2018 11:27:24 +0000 (13:27 +0200)]
Revert "Add macOS to the Travis testing matrix"
Martin Kroeker [Sat, 7 Apr 2018 10:09:39 +0000 (12:09 +0200)]
Merge branch 'develop' into atomic
Martin Kroeker [Sat, 7 Apr 2018 10:07:12 +0000 (12:07 +0200)]
Merge pull request #1510 from ararslan/aa/travis-macos
Add macOS to the Travis testing matrix
Martin Kroeker [Sat, 7 Apr 2018 10:06:57 +0000 (12:06 +0200)]
Merge pull request #1509 from ararslan/aa/dragonfly
Add DragonFly to exports/Makefile
Alex Arslan [Sat, 7 Apr 2018 00:53:58 +0000 (17:53 -0700)]
Add macOS to the Travis testing matrix
Alex Arslan [Sat, 7 Apr 2018 00:30:10 +0000 (17:30 -0700)]
Add DragonFly to exports/Makefile
Its exclusion was an oversight on my part.
Martin Kroeker [Thu, 5 Apr 2018 21:46:36 +0000 (23:46 +0200)]
Merge pull request #1506 from martin-frbg/issue1497
Fix thread races and infinite looping on systems with many cpus
Martin Kroeker [Thu, 5 Apr 2018 06:54:07 +0000 (08:54 +0200)]
Merge pull request #1507 from martin-frbg/threads_usage
Underline importance of NUM_THREADS setting for BUFFER allocation
Martin Kroeker [Thu, 5 Apr 2018 06:53:38 +0000 (08:53 +0200)]
Merge pull request #1508 from ararslan/aa/wording
Minor changes to wording and formatting in the README
Alex Arslan [Wed, 4 Apr 2018 21:30:32 +0000 (14:30 -0700)]
Minor changes to wording and formatting in the README
The wording in some places is not grammatically correct. This change
also provides minor adjustments to the Markdown formatting which provide
modest improvements to readability.
Martin Kroeker [Wed, 4 Apr 2018 20:45:33 +0000 (22:45 +0200)]
Merge pull request #1505 from ararslan/aa/compiler
Compile with cc rather than gcc whenever possible
Martin Kroeker [Wed, 4 Apr 2018 20:40:30 +0000 (22:40 +0200)]
Remove unguarded use of _Atomic and fix tabbing
Martin Kroeker [Wed, 4 Apr 2018 20:26:51 +0000 (22:26 +0200)]
Underline importance of NUM_THREADS setting for BUFFER allocation
following augray's suggestion from #1451, and incorporating ashwinyes' comments from #1141 on the importance of NUM_THREADS even for single-threaded builds.
Alex Arslan [Wed, 4 Apr 2018 18:41:45 +0000 (11:41 -0700)]
Reinstate macOS logic
Alex Arslan [Tue, 3 Apr 2018 22:09:25 +0000 (15:09 -0700)]
Compile with cc rather than gcc whenever possible
Martin Kroeker [Wed, 4 Apr 2018 16:16:52 +0000 (18:16 +0200)]
Fix thread races and infinite looping on systems with many cpus
On systems with more than 64 CPUs, blas_quickdivide will sometimes return zero, which creates bogus workloads when used for the stride calculation. This then leads to threads spinning incessantly while waiting for a status change that never happens, as seen in #1497.
This patch also fixes several data races that were found by helgrind and/or tsan while debugging the issue.
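A minimal sketch of the general technique, assuming a reciprocal table indexed by the divisor: the lookup is only valid for divisors the table covers, so larger divisors have to fall back to ordinary division. The table size of 64 is chosen here to mirror the "more than 64 CPUs" wording above; the real layout of blas_quick_divide_table is not shown in this log, and all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TABLE_MAX 64

/* sketch_table[y] holds floor(2^32 / y) + 1 for 2 <= y <= 64. */
static uint32_t sketch_table[SKETCH_TABLE_MAX + 1];
static int sketch_table_ready = 0;

/* Fill the table once. Not thread-safe; a real library would
 * use a precomputed constant table instead. */
static void sketch_init_table(void)
{
    for (unsigned int y = 2; y <= SKETCH_TABLE_MAX; y++)
        sketch_table[y] = (uint32_t)((UINT64_C(1) << 32) / y + 1);
    sketch_table_ready = 1;
}

/* Integer division x / y using a multiply and shift for small y.
 * The table result is exact for x < 2^32 / y. Divisors beyond the
 * table are handled by plain division instead of reading past
 * the end of the table. */
unsigned int quickdivide_sketch(unsigned int x, unsigned int y)
{
    assert(y != 0);
    if (y == 1) return x;
    if (y > SKETCH_TABLE_MAX) return x / y;   /* bounds check */
    if (!sketch_table_ready) sketch_init_table();
    return (unsigned int)(((uint64_t)x * sketch_table[y]) >> 32);
}
```

Without the bounds check, a divisor larger than the table would read whatever memory follows it, which is one way a quotient could come out as zero.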
Martin Kroeker [Wed, 4 Apr 2018 13:26:46 +0000 (15:26 +0200)]
Merge pull request #1504 from ararslan/aa/openbsd
Allow building on OpenBSD
Martin Kroeker [Wed, 4 Apr 2018 13:26:21 +0000 (15:26 +0200)]
Merge pull request #1501 from martin-frbg/issue875
Add workaround for old gcc and clang versions
Alex Arslan [Tue, 3 Apr 2018 23:42:01 +0000 (16:42 -0700)]
Add OpenBSD and DragonFly to community supported platforms
Alex Arslan [Tue, 3 Apr 2018 23:39:29 +0000 (16:39 -0700)]
Add support for DragonFly BSD
Alex Arslan [Mon, 2 Apr 2018 17:48:22 +0000 (10:48 -0700)]
Allow building on OpenBSD
With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2
using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.
Martin Kroeker [Sat, 31 Mar 2018 20:32:06 +0000 (22:32 +0200)]
Update memory.c