platform/upstream/openblas.git
Zhiyong Dang [Fri, 11 May 2018 04:15:08 +0000 (12:15 +0800)]
Change _STDC_VERSION__ to __STDC_VERSION__

Change-Id: Id3fa4e8d9eedd4ef7230df69b611e7f397301a42

Zhang Xianyi [Fri, 11 May 2018 02:09:14 +0000 (10:09 +0800)]
Merge pull request #1536 from WestAlgo/develop

Fix race condition in blas_server_omp.c

Martin Kroeker [Thu, 10 May 2018 13:32:08 +0000 (15:32 +0200)]
Merge pull request #1554 from martin-frbg/lapack-249

LAPACKE fixes from lapack PR249

Martin Kroeker [Thu, 10 May 2018 11:15:42 +0000 (13:15 +0200)]
LAPACKE fixes from lapack PR249

Copied from Reference-LAPACK/lapack#249, this fixes out-of-bounds memory accesses
in the nancheck calls of the LAPACKE lacgv, lassq, larfg, larfb, larfx and mtr functions

Martin Kroeker [Wed, 9 May 2018 12:39:52 +0000 (14:39 +0200)]
Merge pull request #1553 from martin-frbg/ifort-openmpflag

Change -openmp to -fopenmp for ifort entry as well

Martin Kroeker [Wed, 9 May 2018 10:34:09 +0000 (12:34 +0200)]
Change -openmp to -fopenmp for ifort entry as well

Martin Kroeker [Wed, 9 May 2018 07:02:52 +0000 (09:02 +0200)]
Merge pull request #1551 from martin-frbg/f_check_fix

Fixes for ifort 2018

Martin Kroeker [Wed, 9 May 2018 07:02:38 +0000 (09:02 +0200)]
Merge pull request #1550 from martin-frbg/ifort-openmpflag

Update compiler flag for openmp use with ICC

Martin Kroeker [Tue, 8 May 2018 21:52:55 +0000 (23:52 +0200)]
Merge pull request #1549 from martin-frbg/fix_ompcheck

Drop C-style "L" suffix from OPENMP version number tests in the LAPACK source

Martin Kroeker [Tue, 8 May 2018 19:55:37 +0000 (21:55 +0200)]
Fixes for ifort 2018

1. the already deprecated -openmp option was removed in 2018, switch to -fopenmp
2. add leading blank in search for "zho_ge__" symbol to work around misleading tags in the 2018 assembly
Expected to fix #1548

Martin Kroeker [Tue, 8 May 2018 19:47:10 +0000 (21:47 +0200)]
Update compiler flag for openmp use with ICC

The deprecated -openmp option was finally removed in favor of -qopenmp or -fopenmp, picking the latter to stay compatible with Intel compiler versions before 2015 (when -q options were introduced). Fixes #1546

Martin Kroeker [Tue, 8 May 2018 19:39:42 +0000 (21:39 +0200)]
Drop C-style "L" suffix from OPENMP version number in check

Martin Kroeker [Tue, 8 May 2018 19:38:25 +0000 (21:38 +0200)]
Drop C-style "L" suffix from OPENMP version number in check

Martin Kroeker [Tue, 8 May 2018 19:36:56 +0000 (21:36 +0200)]
Drop C-style "L" suffix from OPENMP version number in check

Martin Kroeker [Wed, 2 May 2018 20:47:45 +0000 (22:47 +0200)]
Merge pull request #1543 from martin-frbg/mips32

Fix MIPS32 build and add MIPS 1004K cpu (MT7621 SOC)

Martin Kroeker [Wed, 2 May 2018 18:37:06 +0000 (20:37 +0200)]
Restore compiler options for mips P5600 target

Martin Kroeker [Wed, 2 May 2018 18:27:56 +0000 (20:27 +0200)]
Add MIPS 1004K target

Martin Kroeker [Wed, 2 May 2018 18:25:32 +0000 (20:25 +0200)]
Switch mips32 target to USE_TRMM to fix complex TRMM

Martin Kroeker [Wed, 2 May 2018 18:20:44 +0000 (20:20 +0200)]
Add MIPS 1004K target (Mediatek MT7621 SOC)

Martin Kroeker [Wed, 2 May 2018 18:17:26 +0000 (20:17 +0200)]
Add mips32r2 api target

Martin Kroeker [Wed, 2 May 2018 18:12:25 +0000 (20:12 +0200)]
Make cpuid_mips compile again and add 1004K cpu

Martin Kroeker [Wed, 2 May 2018 16:11:50 +0000 (18:11 +0200)]
Merge pull request #1542 from martin-frbg/quickdiv64

Avoid out-of-bounds accesses in blas_quickdivide on big X86 systems

Martin Kroeker [Wed, 2 May 2018 12:44:50 +0000 (14:44 +0200)]
Omit the divide table overflow check on small systems

Martin Kroeker [Wed, 2 May 2018 12:43:08 +0000 (14:43 +0200)]
Omit the table overflow check when building for small systems

Martin Kroeker [Sun, 29 Apr 2018 12:40:12 +0000 (14:40 +0200)]
Update common_x86_64.h

Martin Kroeker [Sun, 29 Apr 2018 12:38:55 +0000 (14:38 +0200)]
Avoid out-of-bounds reads from blas_quick_divide_table on big systems

Martin Kroeker [Sun, 29 Apr 2018 12:34:33 +0000 (14:34 +0200)]
Avoid out of bounds reads from blas_quick_divide_table on big systems

Should fix #1541

Martin Kroeker [Fri, 27 Apr 2018 21:10:21 +0000 (23:10 +0200)]
Merge pull request #1539 from martin-frbg/ztrmv-1332

Disable multithreading in ztrmv

Martin Kroeker [Fri, 27 Apr 2018 21:09:57 +0000 (23:09 +0200)]
Merge pull request #1486 from martin-frbg/atomic

Use _Atomic instead of volatile for thread safety where C11 is supported

Martin Kroeker [Fri, 27 Apr 2018 10:08:06 +0000 (12:08 +0200)]
Update Makefile.rule

Zhiyong Dang [Tue, 24 Apr 2018 02:34:53 +0000 (10:34 +0800)]
Fix race condition in blas_server_omp.c

Change-Id: Ic896276cd073d6b41930c7c5a29d66348cd1725d

Martin Kroeker [Wed, 25 Apr 2018 21:23:00 +0000 (23:23 +0200)]
Merge pull request #1540 from martin-frbg/mips32-zasum

Fix typo in MIPS P5600 complex ASUM code selection

Martin Kroeker [Wed, 25 Apr 2018 20:50:10 +0000 (22:50 +0200)]
Fix typo in MIPS P5600 complex ASUM code selection

Martin Kroeker [Wed, 25 Apr 2018 20:35:46 +0000 (22:35 +0200)]
Disable multithreading in ztrmv

BLAS-Tester shows that the same problem exists as with DTRMV (issue #1332)

Martin Kroeker [Wed, 25 Apr 2018 06:38:58 +0000 (08:38 +0200)]
Merge pull request #1538 from martin-frbg/arm7utest

Fix handling of zero INCX, INCY in ArmV7 AXPY and ROT

Martin Kroeker [Tue, 24 Apr 2018 20:43:00 +0000 (22:43 +0200)]
Move the test for zero incx,incy in ARMV7 ROT

to pass the related utest (see #1469)

Martin Kroeker [Tue, 24 Apr 2018 20:39:50 +0000 (22:39 +0200)]
Drop test for zero incx,incy in armv7 AXPY

...to pass the related utest (see #1469)
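
As an illustration of why an early exit on zero increments can disagree with a reference-style test, here is a minimal sketch of an AXPY loop that simply lets zero strides run. It is not the ARMv7 kernel from this commit; the name `sketch_daxpy` is hypothetical, and the assumption is that the utest from #1469 expects reference-BLAS behaviour, where a zero increment means the same element is reused on every iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-style AXPY: y := a*x + y over n logical elements.
 * There is deliberately no early return for incx == 0 or incy == 0;
 * a zero increment just keeps addressing the same element. */
static void sketch_daxpy(int n, double a, const double *x, int incx,
                         double *y, int incy)
{
    assert(x != NULL && y != NULL);
    if (n <= 0) return;

    /* Negative increments start from the far end, as in reference BLAS. */
    ptrdiff_t ix = (incx < 0) ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = (incy < 0) ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int i = 0; i < n; i++) {
        y[iy] += a * x[ix];
        ix += incx;
        iy += incy;
    }
}
```

With `incy == 0` every product is accumulated into `y[0]`; with `incx == 0` every element of `y` receives `a * x[0]`. A kernel that returns early in either case leaves `y` unchanged instead.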

Martin Kroeker [Mon, 23 Apr 2018 17:05:49 +0000 (19:05 +0200)]
Use generic zrot.c on ppc64/POWER6 to work around utest failure from … (#1535)

* Use generic C implementation of zrot on ppc64/POWER6 to work around utest failure from #1469

Martin Kroeker [Sun, 22 Apr 2018 21:34:17 +0000 (23:34 +0200)]
Merge pull request #1534 from xianyi/revert-1333-haswell32

Revert "Fix 32bit HASWELL builds"

Martin Kroeker [Sun, 22 Apr 2018 18:20:04 +0000 (20:20 +0200)]
Revert "Fix 32bit HASWELL builds"

Martin Kroeker [Fri, 20 Apr 2018 21:44:15 +0000 (23:44 +0200)]
Merge pull request #1532 from martin-frbg/utest-cblas

Do not try to build the fork utest when NO_CBLAS=1

Martin Kroeker [Fri, 20 Apr 2018 13:43:59 +0000 (15:43 +0200)]
fork utest depends on CBLAS

Martin Kroeker [Fri, 20 Apr 2018 13:42:13 +0000 (15:42 +0200)]
fork utest depends on CBLAS

Martin Kroeker [Thu, 19 Apr 2018 12:10:57 +0000 (14:10 +0200)]
Merge pull request #1530 from ashwinyes/develop_20180419_Tx2AutoDetect

ARM64: Enable Auto Detection of ThunderX2T99

Ashwin Sekhar T K [Thu, 19 Apr 2018 09:05:25 +0000 (09:05 +0000)]
ARM64: Enable Auto Detection of ThunderX2T99

Martin Kroeker [Sun, 15 Apr 2018 11:09:30 +0000 (13:09 +0200)]
Merge pull request #1523 from martin-frbg/utest_waith

Include sys/types.h for proper typedefs related to wait()

Martin Kroeker [Sat, 14 Apr 2018 20:24:34 +0000 (22:24 +0200)]
Merge pull request #1520 from martin-frbg/cpucounts

Catch invalid cpu count returned by CPU_COUNT_S

Martin Kroeker [Sat, 14 Apr 2018 16:59:46 +0000 (18:59 +0200)]
Include sys/types.h for proper typedefs related to wait()

Should fix #1519

Martin Kroeker [Sat, 14 Apr 2018 16:29:10 +0000 (18:29 +0200)]
Catch invalid cpu count returned by CPU_COUNT_S

mips32 was seen to return zero here, driving nthreads to zero with subsequent fpe in blas_quickdivide
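
The kind of guard described here can be sketched as a small validation step between the affinity query and the thread count. This is only an illustration: the helper name `sketch_pick_cpu_count` and the choice of falling back to a separately obtained online CPU count are assumptions, not the code from this commit.

```c
#include <assert.h>

/* Hypothetical helper: accept the count derived from the affinity mask
 * only if it is plausible, otherwise fall back to the online CPU count.
 * The result is never below 1, so it is safe to use as a divisor. */
static int sketch_pick_cpu_count(int affinity_count, int online_count)
{
    int result;

    if (online_count < 1)
        online_count = 1;            /* the fallback itself must be sane */

    if (affinity_count < 1 || affinity_count > online_count)
        result = online_count;       /* e.g. a zero count from the mask */
    else
        result = affinity_count;

    assert(result >= 1);
    return result;
}
```

A caller would pass the value obtained from `CPU_COUNT_S` as `affinity_count`; a zero there then no longer propagates into the thread count.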

Martin Kroeker [Wed, 11 Apr 2018 06:21:25 +0000 (08:21 +0200)]
Merge pull request #1515 from martin-frbg/mipsdot

Correct precision of mips dsdot

Martin Kroeker [Tue, 10 Apr 2018 21:30:59 +0000 (23:30 +0200)]
Fix precision of mips dsdot

Martin Kroeker [Sat, 7 Apr 2018 21:31:26 +0000 (23:31 +0200)]
Merge pull request #1512 from ararslan/aa/travis-macos-2

Add macOS to the Travis testing matrix: Take 2!

Alex Arslan [Sat, 7 Apr 2018 19:29:57 +0000 (12:29 -0700)]
Add a BINARY=32 build to macOS

Alex Arslan [Sat, 7 Apr 2018 17:56:34 +0000 (10:56 -0700)]
Add macOS to the Travis testing matrix

Martin Kroeker [Sat, 7 Apr 2018 11:29:31 +0000 (13:29 +0200)]
Merge pull request #1511 from xianyi/revert-1510-aa/travis-macos

Revert "Add macOS to the Travis testing matrix"

Martin Kroeker [Sat, 7 Apr 2018 11:27:24 +0000 (13:27 +0200)]
Revert "Add macOS to the Travis testing matrix"

Martin Kroeker [Sat, 7 Apr 2018 10:09:39 +0000 (12:09 +0200)]
Merge branch 'develop' into atomic

Martin Kroeker [Sat, 7 Apr 2018 10:07:12 +0000 (12:07 +0200)]
Merge pull request #1510 from ararslan/aa/travis-macos

Add macOS to the Travis testing matrix

Martin Kroeker [Sat, 7 Apr 2018 10:06:57 +0000 (12:06 +0200)]
Merge pull request #1509 from ararslan/aa/dragonfly

Add DragonFly to exports/Makefile

Alex Arslan [Sat, 7 Apr 2018 00:53:58 +0000 (17:53 -0700)]
Add macOS to the Travis testing matrix

Alex Arslan [Sat, 7 Apr 2018 00:30:10 +0000 (17:30 -0700)]
Add DragonFly to exports/Makefile

Its exclusion was an oversight on my part.

Martin Kroeker [Thu, 5 Apr 2018 21:46:36 +0000 (23:46 +0200)]
Merge pull request #1506 from martin-frbg/issue1497

Fix thread races and infinite looping on systems with many cpus

Martin Kroeker [Thu, 5 Apr 2018 06:54:07 +0000 (08:54 +0200)]
Merge pull request #1507 from martin-frbg/threads_usage

Underline importance of NUM_THREADS setting for BUFFER allocation

Martin Kroeker [Thu, 5 Apr 2018 06:53:38 +0000 (08:53 +0200)]
Merge pull request #1508 from ararslan/aa/wording

Minor changes to wording and formatting in the README

Alex Arslan [Wed, 4 Apr 2018 21:30:32 +0000 (14:30 -0700)]
Minor changes to wording and formatting in the README

The wording in some places is not grammatically correct. This change
also provides minor adjustments to the Markdown formatting which provide
modest improvements to readability.

Martin Kroeker [Wed, 4 Apr 2018 20:45:33 +0000 (22:45 +0200)]
Merge pull request #1505 from ararslan/aa/compiler

Compile with cc rather than gcc whenever possible

Martin Kroeker [Wed, 4 Apr 2018 20:40:30 +0000 (22:40 +0200)]
Remove unguarded use of _Atomic and fix tabbing

Martin Kroeker [Wed, 4 Apr 2018 20:26:51 +0000 (22:26 +0200)]
Underline importance of NUM_THREADS setting for BUFFER allocation

following augray's suggestion from #1451, and incorporating ashwinyes' comments from #1141 on the importance of NUM_THREADS even for single-threaded builds.

Alex Arslan [Wed, 4 Apr 2018 18:41:45 +0000 (11:41 -0700)]
Reinstate macOS logic

Alex Arslan [Tue, 3 Apr 2018 22:09:25 +0000 (15:09 -0700)]
Compile with cc rather than gcc whenever possible

Martin Kroeker [Wed, 4 Apr 2018 16:16:52 +0000 (18:16 +0200)]
Fix thread races and infinite looping on systems with many cpus

On systems with more than 64 cpus, blas_quickdivide will sometimes return zero, which creates bogus workloads when used for the stride calculation. This then leads to threads spinning incessantly, waiting for a status change that never happens, as seen in #1497.
This patch also fixes several data races that were found by helgrind and/or tsan while debugging the issue.
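
To make the failure mode concrete, the following is a portable sketch of a table-driven divide with a bounds guard. It is not the OpenBLAS implementation: the names, the table size constant `SKETCH_TABLE_MAX` and the way the table is filled (`ceil(2^32 / y)`) are assumptions chosen for illustration. The point is only that a divisor beyond the table must take the ordinary division path rather than index past the end.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TABLE_MAX 64u

/* Hypothetical reciprocal table: entry y holds ceil(2^32 / y) for y in 2..64.
 * Entries 0 and 1 are unused. */
static uint32_t sketch_table[SKETCH_TABLE_MAX + 1];

static void sketch_init_table(void)
{
    for (uint32_t y = 2; y <= SKETCH_TABLE_MAX; y++)
        sketch_table[y] = (uint32_t)((((uint64_t)1 << 32) + y - 1) / y);
}

/* Divide x by y using the table where an entry exists.
 * The multiply-and-shift result equals x / y for x below 2^26,
 * which is far above any realistic workload split. */
static uint32_t sketch_quickdivide(uint32_t x, uint32_t y)
{
    assert(y != 0);
    if (y <= 1)
        return x;
    if (y > SKETCH_TABLE_MAX)
        return x / y;               /* no table entry: divide directly */
    return (uint32_t)(((uint64_t)x * sketch_table[y]) >> 32);
}
```

Call `sketch_init_table()` once before the first `sketch_quickdivide()`. Without the `y > SKETCH_TABLE_MAX` branch, a thread count such as 96 would read whatever lies behind the table and use it as the multiplier, so the quotient could be arbitrary, including zero.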

Martin Kroeker [Wed, 4 Apr 2018 13:26:46 +0000 (15:26 +0200)]
Merge pull request #1504 from ararslan/aa/openbsd

Allow building on OpenBSD

Martin Kroeker [Wed, 4 Apr 2018 13:26:21 +0000 (15:26 +0200)]
Merge pull request #1501 from martin-frbg/issue875

Add workaround for old gcc and clang versions

Alex Arslan [Tue, 3 Apr 2018 23:42:01 +0000 (16:42 -0700)]
Add OpenBSD and DragonFly to community supported platforms

Alex Arslan [Tue, 3 Apr 2018 23:39:29 +0000 (16:39 -0700)]
Add support for DragonFly BSD

Alex Arslan [Mon, 2 Apr 2018 17:48:22 +0000 (10:48 -0700)]
Allow building on OpenBSD

With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2
using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.

Martin Kroeker [Sat, 31 Mar 2018 20:32:06 +0000 (22:32 +0200)]
Update memory.c

Martin Kroeker [Thu, 29 Mar 2018 11:13:49 +0000 (13:13 +0200)]
Update memory.c

Martin Kroeker [Thu, 29 Mar 2018 09:56:56 +0000 (11:56 +0200)]
Add workaround for old gcc and clang versions

Old gcc and clang versions do not handle constructor arguments. This finally fixes #875 as discussed there, using the Fedora patch.

Martin Kroeker [Wed, 28 Mar 2018 07:15:34 +0000 (09:15 +0200)]
Merge pull request #1500 from martin-frbg/issue1474

Correct index variables used in MFlops calculation

Martin Kroeker [Tue, 27 Mar 2018 19:52:29 +0000 (21:52 +0200)]
Correct index variables used in MFlops calculation

Fixes #1474

Martin Kroeker [Tue, 27 Mar 2018 19:43:23 +0000 (21:43 +0200)]
Merge pull request #1499 from quickwritereader/develop

Implemented missing vsx simd kernels for power8 blas1/2 double. z13 modifications

Martin Kroeker [Tue, 27 Mar 2018 19:43:05 +0000 (21:43 +0200)]
Merge pull request #1491 from martin-frbg/ddot_mt

Add multithreading support for Haswell DDOT

QWR QWR [Wed, 7 Mar 2018 15:01:03 +0000 (10:01 -0500)]
power8: Added initial zgemv_(t|n), i(d|z)amax, i(d|z)amin, dgemv_t (transposed), zrot
z13: improved zgemv_(t|n)_4, zscal, zaxpy

Martin Kroeker [Mon, 19 Mar 2018 17:03:25 +0000 (18:03 +0100)]
Merge pull request #1495 from martin-frbg/aff

Disable CPU affinity by default again

Martin Kroeker [Mon, 19 Mar 2018 17:02:23 +0000 (18:02 +0100)]
Disable CPU affinity by default again

This setting must have been changed unintentionally by my PR #1214 (probably leftover from unrelated tests)

Martin Kroeker [Sat, 17 Mar 2018 14:26:47 +0000 (15:26 +0100)]
Merge pull request #1494 from martin-frbg/x86_dsdot

Use generic/dot.c instead of the inferior arm/dot.c for x86 DSDOT

Martin Kroeker [Sat, 17 Mar 2018 12:49:15 +0000 (13:49 +0100)]
Use generic/dot.c instead of the inferior arm/dot.c for x86 DSDOT

to resolve dsdot utest failure seen in #1492

Martin Kroeker [Fri, 16 Mar 2018 21:23:36 +0000 (22:23 +0100)]
Declare dot_compute static to avoid conflicts in multiarch builds

Martin Kroeker [Fri, 16 Mar 2018 15:58:47 +0000 (16:58 +0100)]
Add multithreading support for Haswell DDOT

copied from ashwinyes' implementation in dot_thunderx2t99.c

Martin Kroeker [Fri, 9 Mar 2018 23:15:44 +0000 (00:15 +0100)]
Use _Atomic instead of volatile for thread safety where C11 is supported

Martin Kroeker [Fri, 9 Mar 2018 23:03:49 +0000 (00:03 +0100)]
Use _Atomic instead of volatile for thread safety where C11 is supported

Suggested by dodomorandi in #660
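
A guarded switch between `_Atomic` and `volatile` might look like the sketch below. The macro and type names (`SKETCH_ATOMIC`, `sketch_job_t`) are hypothetical and the struct is a stand-in for whatever shared status field needs protecting; only `__STDC_VERSION__`, `__STDC_NO_ATOMICS__` and the `_Atomic` qualifier come from the C11 standard.

```c
#include <assert.h>
#include <stddef.h>

/* Use the C11 qualifier only when the compiler claims C11 and does not
 * opt out of atomics; otherwise keep the older volatile declaration. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
    !defined(__STDC_NO_ATOMICS__)
#define SKETCH_ATOMIC _Atomic
#else
#define SKETCH_ATOMIC volatile
#endif

/* Hypothetical shared record with a status flag polled by other threads. */
typedef struct {
    SKETCH_ATOMIC int status;
} sketch_job_t;

/* With _Atomic, plain assignment and plain reads are sequentially
 * consistent atomic operations; with volatile they are ordinary
 * accesses that the compiler merely may not elide. */
static void sketch_set_status(sketch_job_t *job, int value)
{
    assert(job != NULL);
    job->status = value;
}

static int sketch_get_status(sketch_job_t *job)
{
    assert(job != NULL);
    return job->status;
}
```

The guard matters because an unguarded `_Atomic` fails to compile on pre-C11 toolchains, while the `volatile` fallback keeps the old behaviour there without providing atomicity.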

Martin Kroeker [Sun, 4 Mar 2018 21:21:18 +0000 (22:21 +0100)]
Merge pull request #1482 from martin-frbg/haswell_axpy

Re-enable DAXPY AVX microkernels for x86_64

Martin Kroeker [Sun, 4 Mar 2018 18:37:03 +0000 (19:37 +0100)]
Re-enable DAXPY microkernels for x86_64

as the inaccuracies seen in the original testcase for #1332 appear to be due to an artefact that amplifies the very small rounding differences between FMA and discrete multiply+add

Martin Kroeker [Sun, 4 Mar 2018 16:39:56 +0000 (17:39 +0100)]
Rewrite ROTMG to address cases not covered by the netlib algorithm (#1480)

* Rewrite ROTMG following the new implementation in GONUM, which is based on the algorithm proposed by Tim Hopkins; see issue 1452 for the reference
* Correct ROTMG utest for issue1452 and add another from gonum, also correct transposition of expected and observed values in error messages

Martin Kroeker [Sat, 3 Mar 2018 21:43:56 +0000 (22:43 +0100)]
Merge pull request #1481 from martin-frbg/utest-fixup

Fix transposition of expected and computed values in error message

Martin Kroeker [Sat, 3 Mar 2018 17:01:51 +0000 (18:01 +0100)]
Fix transposition of expected and computed values in error message

Martin Kroeker [Wed, 28 Feb 2018 17:47:57 +0000 (18:47 +0100)]
Merge pull request #1476 from xsacha/patch-1

Fix CMake cross-compiling

Martin Kroeker [Wed, 28 Feb 2018 17:46:54 +0000 (18:46 +0100)]
Merge pull request #1477 from quickwritereader/develop

Power8 blas3 copy-pack routines

Martin Kroeker [Wed, 28 Feb 2018 17:40:31 +0000 (18:40 +0100)]
Merge pull request #1468 from martin-frbg/martin-frbg-patch-1

Limit the additional locking from PRs 1052,1299 to non-OpenMP cases