platform/upstream/openblas.git
Martin Kroeker [Tue, 10 Sep 2019 06:27:06 +0000 (08:27 +0200)]
Fix C compiler handling and BINARY=32 mode in CMAKE builds (#2248)

* Fix compiler identification and option setting

* Handle BINARY=32 option on X86_64

* Add xGEMM3M unroll parameters for crossbuild-target CORE2

* Replace bogus mingw64/32bit CI job with actual 32bit build

mingw64 is not multilib-capable, so using an x86_64 mingw with BINARY=32 in the CI was never going to work (the build only passed because BINARY=32 was being ignored).

Martin Kroeker [Tue, 3 Sep 2019 20:41:17 +0000 (22:41 +0200)]
Improve cmake build behaviour with non-host cpu targets (#2246)

1. Supply appropriate values for C/Z GEMM unroll when cross-compiling for CORE2 or ARMV7
2. Add the required xLOCAL_BUFFER_SIZE parameters for cross-compiling CORE2
3. Add -DFORCE_<target> option to getarch when building with -DTARGET=target
for #2245

Martin Kroeker [Mon, 2 Sep 2019 20:06:29 +0000 (22:06 +0200)]
Merge pull request #2242 from martin-frbg/issue2235

Add arch data for cmake cross-compiling to CORE2

Martin Kroeker [Mon, 2 Sep 2019 13:03:45 +0000 (15:03 +0200)]
Add cgemm and zgemm unroll factors for core2

Martin Kroeker [Sat, 31 Aug 2019 16:06:12 +0000 (18:06 +0200)]
Disable ppc64le test environment on Travis CI

as this semi-official beta option has suddenly reverted to a standard x86_64 environment, causing spurious failures

Martin Kroeker [Fri, 30 Aug 2019 21:06:23 +0000 (23:06 +0200)]
Merge pull request #2243 from quickwritereader/develop

possible cgemv,caxpy,cdot fix

AbdelRauf [Fri, 30 Aug 2019 11:14:55 +0000 (11:14 +0000)]
 fix uninitialized variables i

AbdelRauf [Fri, 30 Aug 2019 04:09:15 +0000 (04:09 +0000)]
caxpy and cdot are using vec_vsx_ld

AbdelRauf [Fri, 30 Aug 2019 02:52:04 +0000 (02:52 +0000)]
cgemv using vec_vsx_ld instead of letting gcc decide

AbdelRauf [Thu, 29 Aug 2019 23:22:23 +0000 (23:22 +0000)]
aligned

Martin Kroeker [Thu, 29 Aug 2019 05:12:54 +0000 (07:12 +0200)]
Merge pull request #2241 from martin-frbg/zdotfix

Make x86_64 zdot compile with PGI and Sun C again

Martin Kroeker [Wed, 28 Aug 2019 16:07:44 +0000 (18:07 +0200)]
Keep both PGI/SUN and default code paths to avoid breaking Clang/Windows

Martin Kroeker [Wed, 28 Aug 2019 15:35:56 +0000 (17:35 +0200)]
Add arch data for cross-compiling to CORE2

for #2235

Martin Kroeker [Wed, 28 Aug 2019 13:30:53 +0000 (15:30 +0200)]
Merge pull request #2240 from martin-frbg/issue2237

Fix PGI build options (again)

Martin Kroeker [Wed, 28 Aug 2019 09:35:31 +0000 (11:35 +0200)]
Make x86_64 zdot compile with PGI and Sun C again

broken by #2222 as CREAL,CIMAG do not expand to a valid lvalue with these compilers
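
The lvalue issue mentioned above can be illustrated with a short C sketch. It assumes C99 <complex.h>; the function names are hypothetical and this is not the code from the commit. In standard C, creal(z) and cimag(z) are function calls that yield values, so they cannot be assigned to. A portable way to set the two parts is to write through a pointer to the underlying pair of doubles, or to build the value arithmetically.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical helper: build a complex value from two doubles without
 * assigning to creal(z) or cimag(z), which are not lvalues in ISO C. */
double complex make_complex(double re, double im)
{
    double complex z;
    /* C99 gives a complex type the same representation as an array of
     * two reals: element 0 is the real part, element 1 the imaginary. */
    double *parts = (double *)&z;
    parts[0] = re;
    parts[1] = im;
    return z;
}

/* Equivalent construction by arithmetic (fine for finite inputs). */
double complex make_complex_arith(double re, double im)
{
    return re + im * I;
}

/* Small self-check: both constructions agree for a sample value. */
void check_make_complex(void)
{
    assert(make_complex(1.5, -2.0) == make_complex_arith(1.5, -2.0));
}
```

Some compilers additionally offer extensions (for example GCC's __real__ and __imag__ operators) that do produce lvalues, which is one way such code can build on one toolchain and fail on another.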

Martin Kroeker [Wed, 28 Aug 2019 09:31:20 +0000 (11:31 +0200)]
Fix PGI build options (again)

for #2237

Martin Kroeker [Wed, 28 Aug 2019 05:54:57 +0000 (07:54 +0200)]
Merge pull request #2239 from martin-frbg/issue2231

Fix 32bit armv8 compilation regression

Martin Kroeker [Tue, 27 Aug 2019 20:52:17 +0000 (22:52 +0200)]
Do not abuse the global ARCH variable as a local temporary

Setting it with a simple "uname -m" just to be able to decide whether to compile getarch.c with -march=native may actually keep getarch from doing a proper probe.
Fixes #2231, a regression caused by #2110

Martin Kroeker [Tue, 27 Aug 2019 20:41:31 +0000 (22:41 +0200)]
Merge pull request #2 from xianyi/develop

merge develop

Martin Kroeker [Mon, 19 Aug 2019 16:26:51 +0000 (18:26 +0200)]
Merge pull request #2228 from martin-frbg/issue2227

Add Intel Goldmont Plus CPUID

Martin Kroeker [Mon, 19 Aug 2019 12:20:39 +0000 (14:20 +0200)]
Merge branch 'develop' into issue2227

Martin Kroeker [Mon, 19 Aug 2019 12:19:21 +0000 (14:19 +0200)]
Add Intel Goldmont Plus CPUID

fixes #2227
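
As background for CPUID-based detection, here is a minimal C sketch of how a family/model pair is decoded from the EAX value returned by CPUID leaf 1, following the Intel convention. The struct and function names are hypothetical, and the sample value 0x000706A1 is only an illustrative signature, not taken from the commit; it decodes to family 6, model 0x7A.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of the CPUID leaf 1 EAX signature. */
struct cpu_signature {
    unsigned family;    /* display family */
    unsigned model;     /* display model */
    unsigned stepping;
};

/* Hypothetical decoder following the Intel convention: the extended
 * model bits are used when the base family is 6 or 15, and the extended
 * family bits are added when the base family is 15. */
struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model  = (eax >> 4) & 0xF;
    unsigned ext_model   = (eax >> 16) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;

    s.stepping = eax & 0xF;
    s.family = (base_family == 0xF) ? base_family + ext_family : base_family;
    s.model = (base_family == 0x6 || base_family == 0xF)
                  ? ((ext_model << 4) | base_model)
                  : base_model;
    return s;
}

/* Self-check with the sample signature. */
void check_decode_signature(void)
{
    struct cpu_signature s = decode_signature(0x000706A1u);
    assert(s.family == 6 && s.model == 0x7A && s.stepping == 1);
}
```

A detection routine would then typically switch on the decoded family and model to select a kernel target.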

Martin Kroeker [Fri, 16 Aug 2019 10:21:30 +0000 (12:21 +0200)]
Merge pull request #2223 from martin-frbg/getarch-pgi

Make getarch compile with PGI

Martin Kroeker [Fri, 16 Aug 2019 07:00:11 +0000 (09:00 +0200)]
Fix PGI compiler detection for getarch

Martin Kroeker [Fri, 16 Aug 2019 06:58:10 +0000 (08:58 +0200)]
Do not use -march=native with the PGI compiler

Martin Kroeker [Fri, 16 Aug 2019 06:56:15 +0000 (08:56 +0200)]
Merge pull request #1 from xianyi/develop

rebase

Martin Kroeker [Thu, 15 Aug 2019 20:09:12 +0000 (22:09 +0200)]
Add multithreading support to the x86_64 zdot kernel (#2222)

* Add multithreading support

copied from the ThunderX2T99 kernel. For #2221
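
The general idea behind threading a dot product can be sketched in C as follows. This is an illustration under assumptions, not the kernel from the commit: the function names are hypothetical, the chunks are processed sequentially rather than on real threads, and the unconjugated product is used. Each worker would compute a partial sum over its own slice, and the partial sums are then added together.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Unconjugated complex dot product over one contiguous chunk. */
double complex zdotu_chunk(const double complex *x,
                           const double complex *y, size_t n)
{
    double complex acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Split [0, n) into nchunks nearly equal ranges, compute one partial
 * sum per range, and reduce. The ranges run sequentially here; a
 * threaded version would hand each range to its own worker and perform
 * the same final reduction. */
double complex zdotu_partitioned(const double complex *x,
                                 const double complex *y,
                                 size_t n, size_t nchunks)
{
    assert(nchunks > 0);
    double complex total = 0.0;
    size_t start = 0;
    for (size_t c = 0; c < nchunks; c++) {
        /* The first (n % nchunks) ranges get one extra element. */
        size_t len = n / nchunks + (c < n % nchunks ? 1 : 0);
        total += zdotu_chunk(x + start, y + start, len);
        start += len;
    }
    return total;
}
```

Because floating-point addition is not associative, a partitioned sum can differ from the single-pass sum in the last bits for general inputs; the two agree exactly only when every intermediate value is exactly representable.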

Martin Kroeker [Wed, 14 Aug 2019 05:32:31 +0000 (07:32 +0200)]
Merge pull request #2218 from martin-frbg/issue2215

Make the new DGEMM regression test properly depend on CBLAS and LAPACKE

Martin Kroeker [Tue, 13 Aug 2019 20:29:48 +0000 (22:29 +0200)]
Make the new DGEMM regression test properly depend on CBLAS and LAPACKE

fixes #2215

Martin Kroeker [Tue, 13 Aug 2019 11:59:33 +0000 (13:59 +0200)]
Merge pull request #2216 from martin-frbg/issue2214

Remove case-sensitivity in x86 LSAME on (AMD) cpus without CMOV

Martin Kroeker [Tue, 13 Aug 2019 08:19:10 +0000 (10:19 +0200)]
Fix unwanted case-sensitivity in x86 LSAME for (AMD) processors without CMOV

The problem was already noticed some years ago in #238, but back then it was corrected in only one of the #ifdef branches.
Fixes #2214
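
For reference, the behaviour expected of LSAME is a case-insensitive comparison of two single characters. A minimal C sketch of that contract follows; the name lsame_sketch is hypothetical, and this is not the x86 assembly touched by the commit.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for LSAME: returns 1 if a and b are the same
 * letter regardless of case, 0 otherwise. The casts to unsigned char
 * keep toupper() well defined for any char value. */
int lsame_sketch(char a, char b)
{
    return toupper((unsigned char)a) == toupper((unsigned char)b);
}

/* Self-check: option letters must match in either case. */
void check_lsame_sketch(void)
{
    assert(lsame_sketch('n', 'N') == 1);
    assert(lsame_sketch('N', 'T') == 0);
}
```

A caller passing 'n' instead of 'N' for a transpose option should therefore see the same result, which is the property the fix restores on the affected code path.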

Martin Kroeker [Sun, 11 Aug 2019 21:31:36 +0000 (23:31 +0200)]
Update with changes from 0.3.7

Martin Kroeker [Sun, 11 Aug 2019 21:28:47 +0000 (23:28 +0200)]
Increment version to 0.3.8.dev

Martin Kroeker [Sun, 11 Aug 2019 21:28:13 +0000 (23:28 +0200)]
Increment version to 0.3.8.dev

Martin Kroeker [Sun, 11 Aug 2019 18:26:34 +0000 (20:26 +0200)]
Merge pull request #2212 from martin-frbg/nofort-nolib

Avoid spurious dependency on the fortran runtime despite NOFORTRAN=1

Martin Kroeker [Sun, 11 Aug 2019 14:24:39 +0000 (16:24 +0200)]
Avoid adding a spurious dependency on the fortran runtime despite NOFORTRAN=1

for cases where a fortran compiler is present but not wanted (e.g. not fully functional)

Martin Kroeker [Sun, 11 Aug 2019 14:08:05 +0000 (16:08 +0200)]
Merge pull request #2211 from martin-frbg/arm64_gcc_trivial

Silence two nuisance warnings from gcc

Martin Kroeker [Sun, 11 Aug 2019 10:46:05 +0000 (12:46 +0200)]
Silence two nuisance warnings from gcc

Martin Kroeker [Fri, 9 Aug 2019 05:55:35 +0000 (07:55 +0200)]
Merge pull request #2208 from martin-frbg/munmap-debug

Provide more information on mmap/munmap failure

Martin Kroeker [Fri, 9 Aug 2019 05:55:20 +0000 (07:55 +0200)]
Merge pull request #2206 from martin-frbg/zen-dtrmm

Replace vpermpd with vpermilpd in the Haswell DTRMM kernel

Martin Kroeker [Fri, 9 Aug 2019 05:55:02 +0000 (07:55 +0200)]
Merge pull request #2199 from martin-frbg/zen-dtrsm

Replace most vpermpd calls in the Haswell DTRSM_RN kernel

Martin Kroeker [Thu, 8 Aug 2019 22:08:11 +0000 (00:08 +0200)]
Add files via upload

Martin Kroeker [Thu, 8 Aug 2019 21:15:35 +0000 (23:15 +0200)]
Provide more information on mmap/munmap failure

for #2207
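
The kind of diagnostic this refers to can be sketched in C as below. It is an illustration only: the wrapper names and message format are hypothetical and do not reproduce the allocator code in the commit. The point is to capture errno immediately after the failing call and print it together with the arguments.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical wrapper: map an anonymous read/write region and report
 * the reason on failure. Returns NULL if the mapping fails. */
void *acquire_buffer(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        int err = errno; /* save before any other call can change it */
        fprintf(stderr, "mmap of %zu bytes failed: errno=%d (%s)\n",
                len, err, strerror(err));
        return NULL;
    }
    return p;
}

/* Hypothetical wrapper: unmap a region and report address, length and
 * errno on failure. Returns 0 on success, -1 on failure. */
int release_buffer(void *addr, size_t len)
{
    assert(addr != NULL);
    if (munmap(addr, len) != 0) {
        int err = errno;
        fprintf(stderr, "munmap(%p, %zu) failed: errno=%d (%s)\n",
                addr, len, err, strerror(err));
        return -1;
    }
    return 0;
}
```

Printing the address and length alongside the error text makes it easier to tell a bad argument apart from resource exhaustion when a user reports the message.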

Martin Kroeker [Sat, 3 Aug 2019 10:40:13 +0000 (12:40 +0200)]
Replace most vpermpd calls in the Haswell DTRSM_RN kernel

Martin Kroeker [Fri, 2 Aug 2019 06:36:14 +0000 (08:36 +0200)]
Merge pull request #2198 from martin-frbg/icelake

Update CPUID recognition for Intel Ice Lake

Martin Kroeker [Thu, 1 Aug 2019 20:52:35 +0000 (22:52 +0200)]
Add CPUID identification of Intel Ice Lake

Martin Kroeker [Thu, 1 Aug 2019 20:51:09 +0000 (22:51 +0200)]
Autodetect Intel Ice Lake (as SKYLAKEX target)

Martin Kroeker [Sun, 28 Jul 2019 21:17:28 +0000 (23:17 +0200)]
Replace vpermpd with vpermilpd in the Haswell DTRMM kernel

to improve performance on AMD Zen (#2180), applying wjc404's improvement of the DGEMM kernel from #2186
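
Why one instruction can stand in for the other is easiest to see in a scalar model. The C sketch below emulates the immediate forms of the two instructions on plain arrays; the function names are hypothetical and no SIMD intrinsics are used, so it says nothing about timing. vpermpd can move any of the four doubles to any position, while the 256-bit vpermilpd can only choose between the two doubles inside each 128-bit lane. For shuffles that stay inside a lane, both give the same result.

```c
#include <assert.h>
#include <string.h>

/* Scalar model of vpermpd ymm, ymm, imm8: each destination element
 * picks any of the four source elements through a 2-bit index. */
void model_vpermpd(double dst[4], const double src[4], unsigned imm8)
{
    double tmp[4];
    for (int i = 0; i < 4; i++)
        tmp[i] = src[(imm8 >> (2 * i)) & 3];
    memcpy(dst, tmp, sizeof tmp);
}

/* Scalar model of vpermilpd ymm, ymm, imm8: each destination element
 * picks the low or high element of its own 128-bit lane via one bit. */
void model_vpermilpd(double dst[4], const double src[4], unsigned imm8)
{
    double tmp[4];
    for (int i = 0; i < 4; i++) {
        int lane_base = (i / 2) * 2; /* 0 for the low lane, 2 for the high */
        tmp[i] = src[lane_base + ((imm8 >> i) & 1)];
    }
    memcpy(dst, tmp, sizeof tmp);
}

/* Self-check: swapping the two elements inside each lane is expressible
 * by both instructions (vpermpd imm 0xB1, vpermilpd imm 0x5). */
void check_in_lane_swap(void)
{
    const double src[4] = {0.0, 1.0, 2.0, 3.0};
    double a[4], b[4];
    model_vpermpd(a, src, 0xB1);
    model_vpermilpd(b, src, 0x5);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

A pattern such as vpermpd with immediate 0x4E, which swaps the two 128-bit halves, has no vpermilpd equivalent, because it moves data across lanes.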

Martin Kroeker [Sun, 28 Jul 2019 21:11:40 +0000 (23:11 +0200)]
Merge pull request #2196 from wjc404/develop

Add vbroadcastsd kernel to dgemm_kernel_4x8_haswell.S

wjc404 [Sat, 27 Jul 2019 23:39:09 +0000 (07:39 +0800)]
Add files via upload

Martin Kroeker [Sat, 27 Jul 2019 11:00:13 +0000 (13:00 +0200)]
Merge pull request #2112 from ffontaine/develop

Makefile.arm: remove -march flags

Martin Kroeker [Wed, 24 Jul 2019 18:19:21 +0000 (20:19 +0200)]
Merge pull request #2193 from martin-frbg/makeutest

Override special make variables

Martin Kroeker [Wed, 24 Jul 2019 13:26:09 +0000 (15:26 +0200)]
Unset special make variables in ctest Makefile as well

Martin Kroeker [Tue, 23 Jul 2019 14:56:40 +0000 (16:56 +0200)]
Override special make variables

As seen in https://github.com/xianyi/OpenBLAS/issues/1912#issuecomment-514183900, any external setting of TARGET_ARCH (which could result from building OpenBLAS as part of a larger project that actually uses this variable) would cause the utest build to fail.
(Other subtargets appear to be unaffected, as they do not use implicit make rules.)

Martin Kroeker [Tue, 23 Jul 2019 14:20:39 +0000 (16:20 +0200)]
Merge pull request #2191 from tylerjereddy/conditional_updates

MAINT: remove legacy CMake endif()

Martin Kroeker [Tue, 23 Jul 2019 14:15:08 +0000 (16:15 +0200)]
Merge pull request #2190 from martin-frbg/zdot-zen

Replace vpermpd with vpermilpd in the Haswell/Zen zdot microkernel

Martin Kroeker [Tue, 23 Jul 2019 06:32:56 +0000 (08:32 +0200)]
Merge pull request #2189 from wjc404/develop

Update dgemm_kernel_4x8_haswell.S for reducing cache misses

Tyler Reddy [Tue, 23 Jul 2019 03:24:57 +0000 (21:24 -0600)]
MAINT: remove legacy CMake endif()

* clean up a case where CMake endif() contained the conditional used
  in the if(), which is no longer needed (and is discouraged) since our
  minimum required CMake version supports the modern syntax

Martin Kroeker [Mon, 22 Jul 2019 06:28:16 +0000 (08:28 +0200)]
Replace vpermpd with vpermilpd

to improve performance on Zen/Zen2 (as demonstrated by wjc404 in #2180)

wjc404 [Sat, 20 Jul 2019 17:10:32 +0000 (01:10 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Sat, 20 Jul 2019 16:47:45 +0000 (00:47 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Sat, 20 Jul 2019 14:08:22 +0000 (22:08 +0800)]
Add files via upload

wjc404 [Sat, 20 Jul 2019 14:04:41 +0000 (22:04 +0800)]
Add files via upload

wjc404 [Sat, 20 Jul 2019 06:33:37 +0000 (14:33 +0800)]
Add files via upload

wjc404 [Fri, 19 Jul 2019 15:58:24 +0000 (23:58 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Fri, 19 Jul 2019 15:47:58 +0000 (23:47 +0800)]
Add files via upload

Martin Kroeker [Thu, 18 Jul 2019 14:04:44 +0000 (16:04 +0200)]
Merge pull request #2186 from wjc404/develop

Update "dgemm_kernel_4x8_haswell.S" for improving performance on zen2 chips

wjc404 [Wed, 17 Jul 2019 15:50:03 +0000 (23:50 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 15:47:30 +0000 (23:47 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 14:39:15 +0000 (22:39 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 13:27:41 +0000 (21:27 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 09:02:35 +0000 (17:02 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Tue, 16 Jul 2019 16:55:06 +0000 (00:55 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Tue, 16 Jul 2019 16:46:51 +0000 (00:46 +0800)]
Update dgemm_kernel_4x8_haswell.S for zen2

replaced a bunch of vpermpd instructions with vpermilpd and vperm2f128

Martin Kroeker [Tue, 9 Jul 2019 18:08:52 +0000 (20:08 +0200)]
Merge pull request #2181 from isuruf/install_name

Change install_name on osx to match linux

Isuru Fernando [Mon, 8 Jul 2019 22:13:21 +0000 (17:13 -0500)]
Change install_name on osx to match linux

Martin Kroeker [Sun, 7 Jul 2019 16:28:21 +0000 (18:28 +0200)]
Merge pull request #2177 from martin-frbg/noaff

Fix surprising behaviour of NO_AFFINITY=0

Martin Kroeker [Sun, 7 Jul 2019 14:04:45 +0000 (16:04 +0200)]
Fix surprising behaviour of NO_AFFINITY=0

Martin Kroeker [Sat, 6 Jul 2019 16:07:19 +0000 (18:07 +0200)]
Merge pull request #2175 from martin-frbg/cmake-mingw-fixes

Fix CMAKE compilation with MinGW32 and add it to Appveyor

Martin Kroeker [Sat, 6 Jul 2019 13:07:15 +0000 (15:07 +0200)]
Mingw32 needs leading underscore on object names

(also copy BUNDERSCORE settings for FORTRAN from the corresponding Makefile)

Martin Kroeker [Sat, 6 Jul 2019 13:05:04 +0000 (15:05 +0200)]
Make disabling DYNAMIC_ARCH on unsupported systems work

needs to be unset in the cache for the change to have any effect

Martin Kroeker [Sat, 6 Jul 2019 13:02:39 +0000 (15:02 +0200)]
Add getarch flags to disable AVX on x86

(and other small fixes to match Makefile behaviour)

Martin Kroeker [Sat, 6 Jul 2019 12:30:33 +0000 (14:30 +0200)]
Add mingw builds to Appveyor config

Martin Kroeker [Sat, 6 Jul 2019 12:29:47 +0000 (14:29 +0200)]
Utest needs CBLAS but not necessarily FORTRAN

Martin Kroeker [Wed, 3 Jul 2019 17:16:30 +0000 (19:16 +0200)]
Merge pull request #2162 from martin-frbg/pgi

Fixes for PGI compiler

Martin Kroeker [Mon, 1 Jul 2019 19:06:02 +0000 (21:06 +0200)]
Merge pull request #2172 from quickwritereader/develop

power9 cgemm/ctrmm. new sgemm 8x16

AbdelRauf [Tue, 18 Jun 2019 15:55:56 +0000 (15:55 +0000)]
cgemm/ctrmm power9

Martin Kroeker [Sun, 30 Jun 2019 21:29:02 +0000 (23:29 +0200)]
Merge pull request #2170 from pkubaj/patch-1

Fix build on PPC970 for FreeBSD

pkubaj [Fri, 28 Jun 2019 10:31:45 +0000 (10:31 +0000)]
Fix build for PPC970 on FreeBSD pt.2

FreeBSD needs those macros too.

pkubaj [Fri, 28 Jun 2019 10:29:44 +0000 (10:29 +0000)]
Fix build for PPC970 on FreeBSD pt. 1

FreeBSD needs DCBT_ARG=0 as well.

Martin Kroeker [Tue, 25 Jun 2019 10:56:33 +0000 (12:56 +0200)]
Merge pull request #2169 from pkubaj/develop

Fix build on FreeBSD/powerpc64.

Piotr Kubaj [Tue, 25 Jun 2019 08:58:56 +0000 (10:58 +0200)]
Fix build on FreeBSD/powerpc64.

Signed-off-by: Piotr Kubaj <pkubaj@anongoth.pl>

Martin Kroeker [Thu, 20 Jun 2019 17:56:01 +0000 (19:56 +0200)]
PGI compiler does not like -march=native

Martin Kroeker [Wed, 19 Jun 2019 12:38:01 +0000 (14:38 +0200)]
Merge pull request #2167 from kavanabhat/dtrmm_power8_segfault

Fix DTRMMKERNEL register save for power8 64-bit mode (Fix for #2166)

kavanabhat [Wed, 19 Jun 2019 09:57:14 +0000 (15:27 +0530)]
Update dtrmm_kernel_16x4_power8.S

AbdelRauf [Mon, 17 Jun 2019 15:33:38 +0000 (15:33 +0000)]
new sgemm 8x16

Martin Kroeker [Sun, 16 Jun 2019 16:35:43 +0000 (18:35 +0200)]
Fix mov syntax

Martin Kroeker [Sun, 16 Jun 2019 13:04:10 +0000 (15:04 +0200)]
Zero ecx with a mov instruction

PGI assembler does not like the initialization in the constraints.

Martin Kroeker [Fri, 14 Jun 2019 06:08:11 +0000 (08:08 +0200)]
Update Makefile.x86_64

Martin Kroeker [Thu, 13 Jun 2019 21:01:35 +0000 (23:01 +0200)]
Do not force gcc options on non-gcc compilers

fixes compile failure with pgi 18.10 as reported on OpenBLAS-users