platform/upstream/openblas.git
Ashwin Sekhar T K [Wed, 17 Oct 2018 15:11:27 +0000 (08:11 -0700)]
ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8

Currently the generic ARMV8 target uses C implementations
for many routines. Replace these with the NEON implementations
written for the THUNDERX2T99 target, which are up to 6x faster for
certain routines.

Ashwin Sekhar T K [Wed, 17 Oct 2018 15:02:40 +0000 (08:02 -0700)]
ARM64: Remove dependency of THUNDERX2T99 Makefile on CORTEXA57 Makefile

Ashwin Sekhar T K [Wed, 17 Oct 2018 15:02:16 +0000 (08:02 -0700)]
ARM64: Remove dependency of THUNDERX Makefile on ARMV8 Makefile

Ashwin Sekhar T K [Wed, 17 Oct 2018 15:01:45 +0000 (08:01 -0700)]
ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile

Ashwin Sekhar T K [Wed, 17 Oct 2018 15:01:27 +0000 (08:01 -0700)]
ARM64: Remove dependency of XGENE1 Makefile on ARMV8 Makefile

Martin Kroeker [Sun, 14 Oct 2018 17:57:34 +0000 (19:57 +0200)]
Merge pull request #1815 from fenrus75/sgemm_beta_fix

enable the SGEMM/SKX C based kernel

Arjan van de Ven [Fri, 12 Oct 2018 09:30:35 +0000 (09:30 +0000)]
enable the SGEMM/SKX C based kernel

The final bug was found in QA, so the skylakex SGEMM C-based kernel can
now be activated.

Martin Kroeker [Thu, 11 Oct 2018 19:51:31 +0000 (21:51 +0200)]
Merge pull request #1812 from martin-frbg/issue1806-2

Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake…

Martin Kroeker [Thu, 11 Oct 2018 09:03:27 +0000 (11:03 +0200)]
Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake-avx512

Martin Kroeker [Thu, 11 Oct 2018 05:48:08 +0000 (07:48 +0200)]
Merge pull request #1808 from martin-frbg/issue1806

Add -march=skylake-avx512 to CFLAGS when the target is Skylake

Martin Kroeker [Thu, 11 Oct 2018 05:47:53 +0000 (07:47 +0200)]
Merge pull request #1807 from xianyi/revert-1798-cmake-avx512

Revert "Add -march=skylake-avx512 when required"

Martin Kroeker [Wed, 10 Oct 2018 21:47:35 +0000 (23:47 +0200)]
Syntax fix

Martin Kroeker [Wed, 10 Oct 2018 17:22:01 +0000 (19:22 +0200)]
Add -march=skylake-avx512 to CFLAGS when the target is Skylake

Should fix #1806 and #1801

Martin Kroeker [Wed, 10 Oct 2018 17:15:32 +0000 (19:15 +0200)]
Revert "Add -march=skylake-avx512 when required"

Martin Kroeker [Wed, 10 Oct 2018 06:52:53 +0000 (08:52 +0200)]
Merge pull request #1802 from martin-frbg/issue1801

Use avx512 workaround with msys2/mingw64 as well

Martin Kroeker [Wed, 10 Oct 2018 06:50:44 +0000 (08:50 +0200)]
Merge pull request #1804 from fenrus75/sgemm

Add a C+intrinsics version of the SGEMM/skylakex kernel

Arjan van de Ven [Wed, 10 Oct 2018 01:49:22 +0000 (01:49 +0000)]
Add a C+intrinsics version of the SGEMM/skylakex kernel

for most sizes this is 1.2x to 1.4x faster than the current code

Martin Kroeker [Tue, 9 Oct 2018 08:56:37 +0000 (10:56 +0200)]
Merge pull request #1800 from fengrl/patch-1

Update common_mips64.h for the 1st loop of blas_memory_alloc

Martin Kroeker [Tue, 9 Oct 2018 08:34:52 +0000 (10:34 +0200)]
Merge pull request #1792 from martin-frbg/cmakesuffix

Improve CMake help output and add SYMBOLPREFIX and -SUFFIX options

Martin Kroeker [Tue, 9 Oct 2018 08:31:59 +0000 (10:31 +0200)]
Use cygwin compilation workaround for avx512 on msys2/mingw64 as well

Martin Kroeker [Tue, 9 Oct 2018 06:20:52 +0000 (08:20 +0200)]
Merge pull request #1799 from martin-frbg/issue1796

Handle conflicting usage of ARCH in at least some BSD environments

Martin Kroeker [Tue, 9 Oct 2018 06:19:14 +0000 (08:19 +0200)]
Merge pull request #1793 from fenrus75/ncopy

Add optimized *copy versions for skylakex

fengrl [Tue, 9 Oct 2018 03:20:16 +0000 (11:20 +0800)]
Update common_mips64.h

Martin Kroeker [Mon, 8 Oct 2018 20:29:35 +0000 (22:29 +0200)]
Catch conflicting usage of ARCH in at least some BSD environments

fixes #1796

Martin Kroeker [Mon, 8 Oct 2018 20:26:59 +0000 (22:26 +0200)]
Use override for ARCH in make.inc

in case a conflicting setting of ARCH (for architecture) gets pulled in from the environment
(originally suggested by dloghin in #1753)

Martin Kroeker [Mon, 8 Oct 2018 19:15:17 +0000 (21:15 +0200)]
Merge pull request #1798 from martin-frbg/cmake-avx512

Add -march=skylake-avx512 when required

Martin Kroeker [Mon, 8 Oct 2018 17:18:12 +0000 (19:18 +0200)]
Add -march=skylake-avx512 when required

fixes #1797

Arjan van de Ven [Sat, 6 Oct 2018 23:13:26 +0000 (23:13 +0000)]
dgemm/skylakex: replace discrete mul/add with fma

very minor gains since it's not super hot code, but general principles

Arjan van de Ven [Sat, 6 Oct 2018 21:18:12 +0000 (21:18 +0000)]
Add vector optimizations for ncopy as well for dgemm/skylakex

Arjan van de Ven [Sat, 6 Oct 2018 16:36:26 +0000 (16:36 +0000)]
add a skylakex optimized dgemm beta function

Martin Kroeker [Sat, 6 Oct 2018 14:29:29 +0000 (16:29 +0200)]
Merge pull request #1791 from dev-zero/develop

fix parallel build issues with APFS/HFS+/ext2/3 in netlib-lapack

Arjan van de Ven [Sat, 6 Oct 2018 14:12:32 +0000 (14:12 +0000)]
dgemm/avx512 simplify and speed up the 4x4 kernel

Arjan van de Ven [Sat, 6 Oct 2018 14:00:37 +0000 (14:00 +0000)]
undo slow dgemm/skylake microoptimization

the compare is more costly than the work

Arjan van de Ven [Sat, 6 Oct 2018 13:47:20 +0000 (13:47 +0000)]
Add optimized *copy versions for skylakex

Add optimized n/t copy versions for skylakex; in the patch the
tcopy is also rewritten using intrinsics; the ncopy file
will be worked on in a future commit

Martin Kroeker [Sat, 6 Oct 2018 12:36:36 +0000 (14:36 +0200)]
Merge pull request #6 from xianyi/develop

merge develop

Martin Kroeker [Sat, 6 Oct 2018 12:28:04 +0000 (14:28 +0200)]
Add SYMBOLPREFIX and -SUFFIX options and improve help output

Tiziano Müller [Sat, 6 Oct 2018 12:10:02 +0000 (14:10 +0200)]
fix parallel build issues with APFS/HFS+/ext2/3 in netlib-lapack

The problem is that OpenBLAS sets the LAPACKE_LIB and the TMGLIB to the
same object and uses the `ar` feature to update the archive file. If the
underlying filesystem does not have sub-second timestamp resolution and
the system is fast enough (or `ccache` is used), the timestamp of the
builds which should be added to the previously generated archive is the
same as the archive file itself and therefore `make` does not update the
archive.

Since OpenBLAS takes care to not run the different targets updating the
archive in parallel, the easiest solution is to declare the respective
targets `.PHONY`, forcing `make` to always update them.
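
The timestamp problem described above can be illustrated with a simplified model of the "is the target out of date?" decision. This is only a sketch: `needs_update` and its `subsecond_fs` flag are hypothetical names, and the real logic in `make` is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Simplified model (an assumption, not make's actual code): a target is
 * rebuilt only if the prerequisite is strictly newer than the target.
 * subsecond_fs says whether the filesystem keeps sub-second timestamps. */
static bool needs_update(struct timespec target, struct timespec prereq,
                         bool subsecond_fs)
{
    assert(target.tv_nsec >= 0 && prereq.tv_nsec >= 0);

    if (prereq.tv_sec != target.tv_sec)
        return prereq.tv_sec > target.tv_sec;

    /* Same whole second: without sub-second resolution the two files look
     * equally old, so the archive is wrongly treated as up to date. */
    if (!subsecond_fs)
        return false;

    return prereq.tv_nsec > target.tv_nsec;
}
```

With an archive written at 100.2 s and an object file written at 100.7 s, the model reports an update only when sub-second timestamps are available. Declaring the targets `.PHONY` sidesteps the comparison entirely.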

fixes #1682

Martin Kroeker [Fri, 5 Oct 2018 18:42:37 +0000 (20:42 +0200)]
Merge pull request #1789 from brada4/develop

update travis alpine chroot with avx512 intrinsics headers

Martin Kroeker [Fri, 5 Oct 2018 18:40:38 +0000 (20:40 +0200)]
Merge pull request #1788 from fenrus75/avx512-8x16

skylake dgemm: Add a 16x8 kernel

Arjan van de Ven [Fri, 5 Oct 2018 13:22:21 +0000 (13:22 +0000)]
Add a 24x8 kernel to the skylakex dgemm implementation

Minor gains for small matrices, but at 512x512 and above the gain
becomes more significant.

Arjan van de Ven [Fri, 5 Oct 2018 11:49:43 +0000 (11:49 +0000)]
skylake dgemm: Add a 16x8 kernel

The next step for the avx512 dgemm code is adding a 16x8 kernel.
In the 8x8 kernel, each FMA has a matching load (the broadcast);
in the 16x8 kernel we can reuse this load for 2 FMAs, which
in turn reduces pressure on the load ports of the CPU and gives
a nice performance boost (in the 25% range).
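
The load reuse described above can be sketched in portable scalar C. This is not the kernel from the commit: the function name `dgemm_16x8_sketch` and the packed layouts are assumptions chosen for illustration, and each 8-element inner loop stands in for one 512-bit vector FMA.

```c
#include <assert.h>
#include <stddef.h>

enum { M = 16, N = 8, HALF = 8 };  /* HALF = doubles per 512-bit vector */

/* Hypothetical sketch: C[16x8] += A[16xk] * B[kx8].
 * Assumed layouts: a[p*16 + i], b[p*8 + j], c[j*16 + i]. */
static void dgemm_16x8_sketch(size_t k, const double *a, const double *b,
                              double *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t p = 0; p < k; p++) {
        const double *a_lo = a + p * M;    /* rows 0..7 of column p  */
        const double *a_hi = a_lo + HALF;  /* rows 8..15 of column p */

        for (size_t j = 0; j < N; j++) {
            /* One load of B, standing in for the broadcast. */
            const double bv = b[p * N + j];
            double *c_lo = c + j * M;
            double *c_hi = c_lo + HALF;

            for (size_t i = 0; i < HALF; i++) {
                c_lo[i] += a_lo[i] * bv;  /* first multiply-add uses bv   */
                c_hi[i] += a_hi[i] * bv;  /* second one reuses the same bv */
            }
        }
    }
}
```

In an 8x8 variant only the `c_lo` update would exist, so every multiply-add would need its own load of `bv`. Doubling the rows halves the number of B loads per multiply-add.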

Andrew [Fri, 5 Oct 2018 12:47:55 +0000 (15:47 +0300)]
update travis alpine chroot with avx512 intrinsics headers

Andrew [Fri, 5 Oct 2018 12:41:52 +0000 (15:41 +0300)]
update travis alpine chroot with avx512 intrinsics headers

Martin Kroeker [Fri, 5 Oct 2018 06:25:38 +0000 (08:25 +0200)]
Merge pull request #1785 from brada4/develop

address #1782 2nd loop

Martin Kroeker [Fri, 5 Oct 2018 06:03:27 +0000 (08:03 +0200)]
Merge pull request #1784 from fenrus75/dgemm-avx512

Create a AVX512 enabled version of DGEMM

Martin Kroeker [Thu, 4 Oct 2018 17:14:59 +0000 (19:14 +0200)]
Function name needs to be CNAME, set from outside to allow suffixing for dynamic_arch

Martin Kroeker [Thu, 4 Oct 2018 16:41:47 +0000 (18:41 +0200)]
Merge pull request #1787 from jeromerobert/develop

Fix unknown type name __WAIT_STATUS on RHEL5

Jerome Robert [Thu, 4 Oct 2018 10:27:44 +0000 (12:27 +0200)]
Fix unknown type name __WAIT_STATUS on RHEL5

With glibc 2.5, _XOPEN_SOURCE must be defined to a value >= 500 in order
to use wait. Reading the glibc code, however, this is only needed if
stdlib.h was included before sys/wait.h, which was the case here through
openblas_utest.h. Changing the include order fixes compilation on RHEL5
and should not hurt on more recent distributions.

* Problem found when using with gcc 5.5 and 4.7.2 on RHEL5/CENTOS5
* Fix #1519
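
A minimal sketch of the include ordering described above, assuming a POSIX system. The helper `child_exit_status` is hypothetical and is not taken from the OpenBLAS test suite; it only shows `sys/wait.h` being pulled in ahead of `stdlib.h` in a file that uses `waitpid`.

```c
/* Order matters on the old glibc discussed above: sys/wait.h comes first. */
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>   /* included after sys/wait.h on purpose */
#include <unistd.h>
#include <assert.h>

/* Hypothetical helper: fork a child that exits with `code` and return the
 * exit status observed by the parent, or -1 on failure. */
static int child_exit_status(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);  /* child: terminate immediately */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On current glibc versions either include order compiles, so the ordering here is harmless on newer systems.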

Martin Kroeker [Thu, 4 Oct 2018 07:07:09 +0000 (09:07 +0200)]
Merge pull request #1786 from martin-frbg/immintrin

Check for immintrin.h presence in the AVX512 compatibility test as well

Martin Kroeker [Thu, 4 Oct 2018 05:36:49 +0000 (07:36 +0200)]
Check availability of immintrin.h in the AVX512 compatibility test

Martin Kroeker [Thu, 4 Oct 2018 05:35:30 +0000 (07:35 +0200)]
Check availability of immintrin.h in the AVX512 compatibility test

Andrew [Wed, 3 Oct 2018 19:20:50 +0000 (21:20 +0200)]
address #1782 2nd loop

Arjan van de Ven [Wed, 3 Oct 2018 14:45:25 +0000 (14:45 +0000)]
Create a AVX512 enabled version of DGEMM

This patch adds dgemm_kernel_4x8_skylakex.c which is
* dgemm_kernel_4x8_haswell.s converted to C + intrinsics
* 8x8 support added
* 8x8 kernel implemented using AVX512

Performance is a work in progress, but already shows a 10% - 20%
increase for a wide range of matrix sizes.

Martin Kroeker [Sat, 29 Sep 2018 07:27:47 +0000 (09:27 +0200)]
Merge pull request #1780 from martin-frbg/issue1774-2

Convert fldmia/fstmia instructions to UAL syntax for clang7

Martin Kroeker [Fri, 28 Sep 2018 21:05:15 +0000 (23:05 +0200)]
Convert fldmia/fstmia instructions to UAL syntax for clang7

second part of fix for #1774, containing files missed in #1775

Martin Kroeker [Wed, 26 Sep 2018 09:14:58 +0000 (11:14 +0200)]
Merge pull request #1778 from fengrl/develop

test_axpy work error on LOONGSON3A platform #1777

fengruilin [Wed, 26 Sep 2018 07:14:04 +0000 (15:14 +0800)]
test_axpy work error on LOONGSON3A platform #1777

Martin Kroeker [Tue, 25 Sep 2018 16:58:39 +0000 (18:58 +0200)]
Merge pull request #1775 from martin-frbg/issue1774

Convert fldmia/fstmia instructions to UAL syntax for clang7

Martin Kroeker [Tue, 25 Sep 2018 07:41:58 +0000 (09:41 +0200)]
Convert fldmia/fstmia instructions to UAL syntax for clang7

fixes #1774

Martin Kroeker [Sun, 23 Sep 2018 21:25:15 +0000 (23:25 +0200)]
Merge pull request #1773 from martin-frbg/issue1767

Include thread numbers in failure message from blas_thread_init

Martin Kroeker [Sat, 22 Sep 2018 12:00:15 +0000 (14:00 +0200)]
Include thread numbers in failure message from blas_thread_init

to aid in debugging cases like #1767

Martin Kroeker [Sat, 22 Sep 2018 11:11:39 +0000 (13:11 +0200)]
Merge pull request #1771 from staticfloat/sf/ldflags

Add `$(LDFLAGS)` to `$(CC)` and `$(FC)` invocations within `exports/Makefile`

Martin Kroeker [Sat, 22 Sep 2018 10:31:37 +0000 (12:31 +0200)]
Document the stub status of the QUAD_PRECISION code (#1772)

* Document the stub status of the QUAD_PRECISION code inherited from GotoBLAS2

in response to #1769

Elliot Saba [Fri, 21 Sep 2018 09:19:51 +0000 (09:19 +0000)]
Add `$(LDFLAGS)` to `$(CC)` and `$(FC)` invocations within `exports/Makefile`

Martin Kroeker [Wed, 19 Sep 2018 20:02:21 +0000 (22:02 +0200)]
Merge pull request #1765 from martin-frbg/issue1761

Do not use the new TLS-enabled memory allocator for non-threaded builds, and disable TLS by default in gmake as well

Martin Kroeker [Wed, 19 Sep 2018 16:16:38 +0000 (18:16 +0200)]
Merge pull request #1764 from yurivict/64-suffix

Allow to install the 'interface64' version concurrently with the regular version

Martin Kroeker [Wed, 19 Sep 2018 16:16:21 +0000 (18:16 +0200)]
Merge pull request #1762 from martin-frbg/issue1710-2

Add explicit casts to silence compiler warnings

Martin Kroeker [Wed, 19 Sep 2018 16:08:31 +0000 (18:08 +0200)]
Fix default settings - USE_TLS and USE_SIMPLE_THREADED_LEVEL3 should both be off

Martin Kroeker [Wed, 19 Sep 2018 16:03:43 +0000 (18:03 +0200)]
Catch inadvertent USE_TLS=0 declaration

for #1766

Martin Kroeker [Sun, 16 Sep 2018 10:43:36 +0000 (12:43 +0200)]
Do not use the new TLS code for non-threaded builds even if USE_TLS is set

Workaround for #1761 as that exposed a problem in the new code (which was intended to speed up multithreaded code only anyway).

Martin Kroeker [Sun, 16 Sep 2018 10:36:49 +0000 (12:36 +0200)]
Merge pull request #4 from xianyi/develop

Update branch

Yuri [Sun, 16 Sep 2018 02:59:17 +0000 (19:59 -0700)]
Allow to install the 'interface64' version concurrently with the regular version

Martin Kroeker [Thu, 13 Sep 2018 12:24:29 +0000 (14:24 +0200)]
Add an explicit cast to silence a warning

for #1710

Martin Kroeker [Thu, 13 Sep 2018 12:23:31 +0000 (14:23 +0200)]
Add explicit cast to silence a warning

for #1710

Martin Kroeker [Tue, 11 Sep 2018 11:52:09 +0000 (13:52 +0200)]
Merge pull request #1759 from martin-frbg/lapack283

Remove an unused variable from several LAPACKE 2stage_work functions

Martin Kroeker [Tue, 11 Sep 2018 08:53:47 +0000 (10:53 +0200)]
remove unused variable ldb_t

Copied from Reference-LAPACK PR283

Martin Kroeker [Tue, 11 Sep 2018 08:52:30 +0000 (10:52 +0200)]
remove unused variable ldb_t

Copied from Reference-LAPACK PR283

Martin Kroeker [Tue, 11 Sep 2018 08:51:17 +0000 (10:51 +0200)]
remove unused variable ldb_t

Copied from Reference-LAPACK PR283

Martin Kroeker [Sun, 9 Sep 2018 20:55:15 +0000 (22:55 +0200)]
Merge pull request #1757 from brada4/develop

fix small typo in strmm_ LN

Andrew [Sun, 9 Sep 2018 14:52:25 +0000 (16:52 +0200)]
fix small typo

Martin Kroeker [Fri, 7 Sep 2018 09:02:18 +0000 (11:02 +0200)]
Merge pull request #1756 from martin-frbg/issue1754

Follow netlib renaming/aliasing CBLAS_ORDER to CBLAS_LAYOUT

Martin Kroeker [Fri, 7 Sep 2018 09:02:01 +0000 (11:02 +0200)]
Merge pull request #1749 from martin-frbg/issue1531

Fix ARMV8 cross-compilation for IOS

Martin Kroeker [Thu, 6 Sep 2018 19:41:54 +0000 (21:41 +0200)]
Adjust ARMV8 SGEMM unrolling when using the C fallback kernel_2x2 for IOS

Martin Kroeker [Thu, 6 Sep 2018 14:54:31 +0000 (16:54 +0200)]
just make CBLAS_LAYOUT an alias of the existing CBLAS_ORDER

to avoid having to change all instances of enum CBLAS_ORDER in this file
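
The aliasing approach can be sketched as follows. The enum values are the conventional CBLAS ones; the helper functions `leading_dim` and `leading_dim_old` are hypothetical and exist only to show that both spellings name the same type.

```c
#include <assert.h>

/* Existing type, kept unchanged. */
typedef enum CBLAS_ORDER { CblasRowMajor = 101, CblasColMajor = 102 } CBLAS_ORDER;

/* New name introduced as a plain alias, so no existing use has to change. */
typedef CBLAS_ORDER CBLAS_LAYOUT;

/* Hypothetical helper written against the new name. */
static int leading_dim(CBLAS_LAYOUT layout, int rows, int cols)
{
    assert(layout == CblasRowMajor || layout == CblasColMajor);
    return layout == CblasRowMajor ? cols : rows;
}

/* Hypothetical helper still written against the old spelling; it can pass
 * its argument straight through because the two types are identical. */
static int leading_dim_old(enum CBLAS_ORDER order, int rows, int cols)
{
    return leading_dim(order, rows, cols);
}
```

Because `CBLAS_LAYOUT` is a typedef rather than a second enum, code using `enum CBLAS_ORDER` and code using `CBLAS_LAYOUT` can call each other without casts.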

Martin Kroeker [Thu, 6 Sep 2018 14:39:52 +0000 (16:39 +0200)]
Follow netlib renaming/aliasing CBLAS_ORDER to CBLAS_LAYOUT

fixes #1754

Martin Kroeker [Tue, 4 Sep 2018 09:06:51 +0000 (11:06 +0200)]
Conditional compilation of assembly files that IOS does not like

Martin Kroeker [Tue, 4 Sep 2018 08:51:19 +0000 (10:51 +0200)]
Fix paths to C kernels for nrm2

Martin Kroeker [Thu, 30 Aug 2018 22:21:13 +0000 (00:21 +0200)]
Update with the changes from 0.3.3

Martin Kroeker [Thu, 30 Aug 2018 22:19:21 +0000 (00:19 +0200)]
Update version to 0.3.4.dev

Martin Kroeker [Thu, 30 Aug 2018 22:18:37 +0000 (00:18 +0200)]
Update version to 0.3.4.dev

Martin Kroeker [Thu, 30 Aug 2018 15:48:07 +0000 (17:48 +0200)]
Merge pull request #1746 from martin-frbg/issue1674

Assume cross-compilation if host and target os differ

Martin Kroeker [Thu, 30 Aug 2018 11:28:46 +0000 (13:28 +0200)]
Assume cross-compilation if host and target os differ

fixes #1674

Martin Kroeker [Wed, 29 Aug 2018 05:43:58 +0000 (07:43 +0200)]
Merge pull request #1745 from martin-frbg/issue1743

Set USE_TRMM for all ZARCH variants to fix TRMM faults with zarch-gen…

Martin Kroeker [Tue, 28 Aug 2018 20:58:58 +0000 (22:58 +0200)]
Merge pull request #1744 from martin-frbg/lapack272

Fix missing replacements of ILAENV by ILAENV_2STAGE (lapack PR 272)

Martin Kroeker [Tue, 28 Aug 2018 19:34:07 +0000 (21:34 +0200)]
Set USE_TRMM for all ZARCH variants to fix TRMM faults with zarch-generic

fixes #1743

Martin Kroeker [Tue, 28 Aug 2018 19:11:54 +0000 (21:11 +0200)]
Fix missing replacements of ILAENV by ILAENV_2STAGE (lapack PR 272)

This could cause spurious "parameter has an illegal value" errors in DSYEVR and related routines, see https://github.com/Reference-LAPACK/lapack/issues/262

Martin Kroeker [Tue, 28 Aug 2018 06:02:15 +0000 (08:02 +0200)]
Merge pull request #1742 from martin-frbg/interim033

Add combination of old and new thread memory code selectable by new option USE_TLS

Martin Kroeker [Sun, 26 Aug 2018 09:31:07 +0000 (11:31 +0200)]
typo fix

Martin Kroeker [Sun, 26 Aug 2018 09:18:02 +0000 (11:18 +0200)]
Rewrite glibc version check

Martin Kroeker [Sat, 25 Aug 2018 20:12:40 +0000 (22:12 +0200)]
Update memory.c