platform/upstream/openblas.git
AbdelRauf [Tue, 18 Jun 2019 15:55:56 +0000 (15:55 +0000)]
cgemm/ctrmm power9

AbdelRauf [Mon, 17 Jun 2019 15:33:38 +0000 (15:33 +0000)]
new sgemm 8x16

AbdelRauf [Wed, 5 Jun 2019 20:50:50 +0000 (20:50 +0000)]
conflict resolve

AbdelRauf [Wed, 5 Jun 2019 10:30:57 +0000 (10:30 +0000)]
power9 zgemm ztrmm optimized

AbdelRauf [Fri, 31 May 2019 22:48:16 +0000 (22:48 +0000)]
sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52

AbdelRauf [Thu, 23 May 2019 04:23:43 +0000 (04:23 +0000)]
improved zgemm power9 based on power8

AbdelRauf [Wed, 1 May 2019 19:36:22 +0000 (19:36 +0000)]
conflict resolve

AbdelRauf [Mon, 29 Apr 2019 08:57:44 +0000 (08:57 +0000)]
Merge branch 'develop' of https://github.com/quickwritereader/OpenBLAS into develop

AbdelRauf [Sat, 13 Apr 2019 13:56:19 +0000 (13:56 +0000)]
sgemm/strmm

Martin Kroeker [Fri, 29 Mar 2019 18:36:29 +0000 (19:36 +0100)]
Merge branch 'develop' into develop

AbdelRauf [Thu, 14 Mar 2019 10:42:04 +0000 (10:42 +0000)]
power9 makefile. dgemm based on the power8 kernel with the following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). Improvement from 17 to 22-23 GFLOPS. dtrmm cases were added into dgemm itself.

Martin Kroeker [Mon, 25 Mar 2019 20:34:30 +0000 (21:34 +0100)]
Merge pull request #2069 from aixoss/aix-asm-change

AIX asm syntax changes needed for shared object creation

Ayappan P [Mon, 25 Mar 2019 13:23:25 +0000 (18:53 +0530)]
AIX asm syntax changes needed for shared object creation

Martin Kroeker [Tue, 19 Mar 2019 21:12:51 +0000 (22:12 +0100)]
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup

Fix for #2063

Erik M. Bray [Tue, 19 Mar 2019 09:22:02 +0000 (10:22 +0100)]
Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles.

Erik M. Bray [Mon, 18 Mar 2019 19:32:48 +0000 (20:32 +0100)]
Fix for #2063: The DllMain used in Cygwin did not run the thread memory
pool cleanup upon THREAD_DETACH which is needed when compiled with
USE_TLS=1.

Martin Kroeker [Sat, 16 Mar 2019 10:57:23 +0000 (11:57 +0100)]
Merge pull request #2058 from xsacha/patch-3

Change 64-bit detection as explained in #2056

Martin Kroeker [Sat, 16 Mar 2019 10:56:51 +0000 (11:56 +0100)]
Merge pull request #2060 from embray/cygwin/readenv

Use POSIX getenv on Cygwin

Erik M. Bray [Fri, 15 Mar 2019 14:06:30 +0000 (15:06 +0100)]
Use POSIX getenv on Cygwin

The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
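
A minimal sketch of the idea in C, not the actual OpenBLAS code: the helper name `env_to_int` is hypothetical. On Cygwin (and any other POSIX system) it reads the variable with `getenv`; only a native Windows build goes through `GetEnvironmentVariable`.

```c
#include <stdlib.h>

#if defined(_WIN32) && !defined(__CYGWIN__)
#include <windows.h>
#endif

/* Hypothetical helper: return the integer value of an environment
 * variable, or 0 if it is unset. */
int env_to_int(const char *name)
{
#if defined(_WIN32) && !defined(__CYGWIN__)
    /* Native Windows build: query the Windows environment block. */
    char buf[64];
    DWORD len = GetEnvironmentVariableA(name, buf, (DWORD)sizeof(buf));
    if (len == 0 || len >= sizeof(buf))
        return 0;               /* unset, or value too long for buf */
    return atoi(buf);
#else
    /* Cygwin and other POSIX systems: getenv sees variables set through
     * the POSIX layer, including those set before a fork(). */
    const char *value = getenv(name);
    return value ? atoi(value) : 0;
#endif
}
```

The `!defined(__CYGWIN__)` guard is the point of the sketch: it keeps Cygwin builds on the POSIX branch even if `_WIN32` happens to be defined.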

Martin Kroeker [Wed, 13 Mar 2019 18:20:23 +0000 (19:20 +0100)]
Trivial typo fix

as suggested in #2022

Sacha [Wed, 13 Mar 2019 13:21:54 +0000 (23:21 +1000)]
Change 64-bit detection as explained in #2056

Martin Kroeker [Tue, 12 Mar 2019 21:57:39 +0000 (22:57 +0100)]
Merge pull request #2042 from maomao194313/develop

add TARGET support for HiSilicon tsv110 CPUs

Martin Kroeker [Tue, 12 Mar 2019 21:57:07 +0000 (22:57 +0100)]
Merge pull request #2055 from martin-frbg/atomid

Add CPUID data for Intel Denverton (as Nehalem)

Martin Kroeker [Tue, 12 Mar 2019 15:09:55 +0000 (16:09 +0100)]
Add Intel Denverton

Martin Kroeker [Tue, 12 Mar 2019 15:03:56 +0000 (16:03 +0100)]
Add Intel Denverton

for #2048

maomao194313 [Tue, 12 Mar 2019 08:11:01 +0000 (16:11 +0800)]
make DYNAMIC_ARCH=1 package work on TSV110

maomao194313 [Tue, 12 Mar 2019 08:05:19 +0000 (16:05 +0800)]
make DYNAMIC_ARCH=1 package work on TSV110.

Martin Kroeker [Sat, 9 Mar 2019 15:39:35 +0000 (16:39 +0100)]
Merge pull request #2051 from martin-frbg/issue2048

Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1

Martin Kroeker [Sat, 9 Mar 2019 15:39:08 +0000 (16:39 +0100)]
Merge pull request #2050 from kencu/PowerMacFix

PowerMac 970 fixes

Martin Kroeker [Sat, 9 Mar 2019 10:21:16 +0000 (11:21 +0100)]
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1

for issue #2048

ken-cunningham-webuse [Thu, 7 Mar 2019 19:41:58 +0000 (11:41 -0800)]
common_power.h: force DCBT_ARG 0 on PPC970 Darwin

without this, we see
../kernel/power/gemv_n.S:427:Parameter syntax error
and many more similar entries

that relates to this assembly command
dcbt 8, r24, r18

this change makes the DCBT_ARG = 0
and openblas builds through to completion on PowerMac 970
Tests pass

ken-cunningham-webuse [Thu, 7 Mar 2019 19:36:35 +0000 (11:36 -0800)]
param.h : enable defines for PPC970 on DarwinOS

fixes:
gemm.c: In function 'sgemm_':
../common_param.h:981:18: error: 'SGEMM_DEFAULT_P' undeclared (first use in this function)
 #define SGEMM_P  SGEMM_DEFAULT_P
                  ^

Martin Kroeker [Thu, 7 Mar 2019 18:28:06 +0000 (19:28 +0100)]
Merge pull request #2049 from Celelibi/fix_crash_sgemm_sse_x64

Fix crash in sgemm SSE/nano kernel on x86_64

Celelibi [Thu, 7 Mar 2019 15:39:41 +0000 (16:39 +0100)]
Fix crash in sgemm SSE/nano kernel on x86_64

Fix bug #2047.

Signed-off-by: Celelibi <celelibi@gmail.com>

Martin Kroeker [Thu, 7 Mar 2019 13:51:41 +0000 (14:51 +0100)]
Merge pull request #2046 from kencu/powermac

ctest.c : add __POWERPC__ for PowerMac

ken-cunningham-webuse [Thu, 7 Mar 2019 04:55:06 +0000 (20:55 -0800)]
ctest.c : add __POWERPC__ for PowerMac

Martin Kroeker [Wed, 6 Mar 2019 21:40:26 +0000 (22:40 +0100)]
Merge pull request #2045 from martin-frbg/2033-3

Do not compile in AVX512 check if AVX support is disabled

Martin Kroeker [Tue, 5 Mar 2019 15:04:25 +0000 (16:04 +0100)]
Do not compile in AVX512 check if AVX support is disabled

The xgetbv function depends on NO_AVX being undefined. We could change that too, but that combination is unlikely to work anyway.

Martin Kroeker [Tue, 5 Mar 2019 11:11:32 +0000 (12:11 +0100)]
Merge pull request #2044 from martin-frbg/issue2043

Fix module definition conflicts between LAPACK and ReLAPACK

Martin Kroeker [Tue, 5 Mar 2019 11:11:15 +0000 (12:11 +0100)]
Merge pull request #2039 from brada4/meminit

Address warning in memory.c

Martin Kroeker [Mon, 4 Mar 2019 20:17:08 +0000 (21:17 +0100)]
Fix module definition conflicts between LAPACK and ReLAPACK

for #2043

Martin Kroeker [Mon, 4 Mar 2019 14:08:31 +0000 (15:08 +0100)]
Merge pull request #2026 from martin-frbg/trmv_threads

Correct range limiting in trmv_thread and re-enable TRMV multithreading

Martin Kroeker [Mon, 4 Mar 2019 14:07:48 +0000 (15:07 +0100)]
Merge pull request #2038 from martin-frbg/issue2035

Improve handling of NO_STATIC and NO_SHARED

Martin Kroeker [Mon, 4 Mar 2019 14:07:14 +0000 (15:07 +0100)]
Merge pull request #2040 from martin-frbg/locks2002

Restore locking optimizations for OpenMP case

maomao194313 [Mon, 4 Mar 2019 08:48:49 +0000 (16:48 +0800)]
add TARGET support for HiSilicon tsv110 CPUs

maomao194313 [Mon, 4 Mar 2019 08:45:22 +0000 (16:45 +0800)]
add TARGET support for HiSilicon tsv110 CPUs

maomao194313 [Mon, 4 Mar 2019 08:41:21 +0000 (16:41 +0800)]
add TARGET support for HiSilicon tsv110 CPUs

maomao194313 [Mon, 4 Mar 2019 08:30:50 +0000 (16:30 +0800)]
HiSilicon tsv110 CPUs optimization branch

add HiSilicon tsv110 CPUs optimization branch

Martin Kroeker [Sun, 3 Mar 2019 13:17:07 +0000 (14:17 +0100)]
Restore locking optimizations for OpenMP case

restore another accidentally dropped part of #1468 that was missed in #2004 to address performance regression reported in #1461

Andrew [Sun, 3 Mar 2019 07:05:11 +0000 (09:05 +0200)]
address warning introed with #1814 et al

Andrew [Sun, 3 Mar 2019 06:59:27 +0000 (08:59 +0200)]
init

Martin Kroeker [Sat, 2 Mar 2019 22:36:36 +0000 (23:36 +0100)]
Improve handling of NO_STATIC and NO_SHARED

to avoid surprises from defining either as zero. Fixes #2035 by addressing some concerns from #1422

Martin Kroeker [Fri, 1 Mar 2019 10:45:02 +0000 (11:45 +0100)]
Merge pull request #2037 from martin-frbg/issue2033-2

Make sure that AVX512 is disabled in 32bit builds

Martin Kroeker [Fri, 1 Mar 2019 08:23:03 +0000 (09:23 +0100)]
Make sure that AVX512 is disabled in 32bit builds

for #2033

Martin Kroeker [Thu, 28 Feb 2019 21:10:12 +0000 (22:10 +0100)]
Merge pull request #2034 from martin-frbg/issue2033

Make x86_32 imply NO_AVX2, NO_AVX512 in addition to NO_AVX

Martin Kroeker [Thu, 28 Feb 2019 09:51:54 +0000 (10:51 +0100)]
Keep xcode8.3 for osx BINARY=32 build

as xcode10 deprecated i386

Martin Kroeker [Thu, 28 Feb 2019 08:58:25 +0000 (09:58 +0100)]
Make x86_32 imply NO_AVX2, NO_AVX512 in addition to NO_AVX

fixes #2033

Martin Kroeker [Mon, 25 Feb 2019 16:58:31 +0000 (17:58 +0100)]
Fix AVX512 test always returning false due to missing compiler option

Martin Kroeker [Mon, 25 Feb 2019 16:55:36 +0000 (17:55 +0100)]
Fix missing -c option in AVX512 test

Martin Kroeker [Sun, 24 Feb 2019 18:50:23 +0000 (19:50 +0100)]
Merge pull request #2028 from brada4/mv

Move one of clobber fixes to right place

Andrew [Sun, 24 Feb 2019 18:41:02 +0000 (20:41 +0200)]
move fix to right place

Andrew [Sun, 24 Feb 2019 18:39:25 +0000 (20:39 +0200)]
init

Martin Kroeker [Wed, 20 Feb 2019 09:27:48 +0000 (10:27 +0100)]
Reduce list of kernels in the dynamic arch build

to make compilation complete reliably within the 1h limit again

Martin Kroeker [Tue, 19 Feb 2019 21:16:33 +0000 (22:16 +0100)]
Fix error introduced during cleanup

Martin Kroeker [Tue, 19 Feb 2019 20:03:30 +0000 (21:03 +0100)]
Allow multithreading TRMV again

revert workaround introduced for issue #1332 as the actual cause appears to be my incorrect fix from #1262 (see #1388)

Martin Kroeker [Tue, 19 Feb 2019 19:59:48 +0000 (20:59 +0100)]
Correct range_n limiting

same bug as seen in #1388, somehow missed in corresponding PR #1389

Martin Kroeker [Sun, 17 Feb 2019 10:49:15 +0000 (11:49 +0100)]
Merge pull request #2024 from martin-frbg/gcc9fixes4

Fix inline assembly constraints in Bulldozer TRSM kernels

Martin Kroeker [Sun, 17 Feb 2019 10:48:57 +0000 (11:48 +0100)]
Merge pull request #2023 from martin-frbg/gcc9fixes3

Fix inline assembly constraints in various x86_64 GEMVN kernels

Martin Kroeker [Sun, 17 Feb 2019 10:36:04 +0000 (11:36 +0100)]
Merge pull request #1988 from TiborGY/patch-1

Reword/expand comments in Makefile.rule

TiborGY [Sat, 16 Feb 2019 22:26:13 +0000 (23:26 +0100)]
fix the the

Martin Kroeker [Sat, 16 Feb 2019 19:06:48 +0000 (20:06 +0100)]
Fix inline assembly constraints in Bulldozer TRSM kernels

Rework indices to allow marking i, as, and bs as both input and output (marked operand n1 as well for simplicity). For #2009

Martin Kroeker [Sat, 16 Feb 2019 17:51:09 +0000 (18:51 +0100)]
Fix inline assembly constraints

Martin Kroeker [Sat, 16 Feb 2019 17:46:17 +0000 (18:46 +0100)]
Fix inline assembly constraints

Martin Kroeker [Sat, 16 Feb 2019 17:36:39 +0000 (18:36 +0100)]
Fix inline assembly constraints

rework indices to allow marking argument lda as input and output.

Martin Kroeker [Sat, 16 Feb 2019 17:24:11 +0000 (18:24 +0100)]
Fix inline assembly constraints

rework indices to allow marking argument lda4 as input and output. For #2009

Martin Kroeker [Sat, 16 Feb 2019 17:05:40 +0000 (18:05 +0100)]
Merge pull request #2021 from martin-frbg/gcc9fixes2

Fix wrong constraints in inline assembly of Haswell DTRSM kernel

TiborGY [Sat, 16 Feb 2019 11:12:39 +0000 (12:12 +0100)]
Update Makefile.rule

add note about NUM_THREADS for package maintainers, add examples of programs that cause affinity troubles

Martin Kroeker [Fri, 15 Feb 2019 14:08:16 +0000 (15:08 +0100)]
Fix wrong constraints in inline assembly

for #2009

Martin Kroeker [Fri, 15 Feb 2019 14:02:54 +0000 (15:02 +0100)]
Merge pull request #2019 from martin-frbg/gcc9fixes

Fix unannounced modification of input operand 8 (lda4) in Haswell GEMVN microkernel

Martin Kroeker [Fri, 15 Feb 2019 09:10:04 +0000 (10:10 +0100)]
Rename operands to put lda on the input/output constraint list

Martin Kroeker [Fri, 15 Feb 2019 08:57:59 +0000 (09:57 +0100)]
Merge pull request #2020 from martin-frbg/issue1956

With the Intel compiler on Linux, prefer ifort for the final link step

Martin Kroeker [Thu, 14 Feb 2019 21:57:30 +0000 (22:57 +0100)]
With the Intel compiler on Linux, prefer ifort for the final link step

icc has known problems with mixed-language builds that ifort can handle just fine. Fixes #1956

Martin Kroeker [Thu, 14 Feb 2019 21:43:18 +0000 (22:43 +0100)]
Save and restore input argument 8 (lda4)

Fixes miscompilation with gcc9 -ftree-vectorize (related to issue #2009)

Martin Kroeker [Thu, 14 Feb 2019 20:55:11 +0000 (21:55 +0100)]
Merge pull request #2018 from bartoldeman/fix-dgemv-znver1-tree-vectorize

dgemv_kernel_4x4(Haswell): add missing clobbers for xmm0,xmm1,xmm2,xmm3

Bart Oldeman [Thu, 14 Feb 2019 16:19:41 +0000 (16:19 +0000)]
dgemv_kernel_4x4(Haswell): add missing clobbers for xmm0,xmm1,xmm2,xmm3

This fixes a crash in dblat2 when OpenBLAS is compiled using
-march=znver1 -ftree-vectorize -O2

See also:
https://github.com/easybuilders/easybuild-easyconfigs/issues/7180
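
To illustrate why a missing clobber matters, here is a small sketch in C. It is not the dgemv kernel: the function name `add_via_xmm0` is hypothetical, and the inline assembly is only used on x86_64 with a GCC-compatible compiler (a plain C fallback is used elsewhere). The assembly uses `xmm0` as scratch, so `xmm0` has to appear in the clobber list; otherwise the compiler may keep a live value in that register across the statement.

```c
/* Hypothetical example: add two doubles using xmm0 as a scratch register. */
double add_via_xmm0(const double *a, const double *b)
{
    double out;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ __volatile__(
        "movsd (%[a]), %%xmm0 \n\t"   /* xmm0 = *a            */
        "addsd (%[b]), %%xmm0 \n\t"   /* xmm0 += *b           */
        "movsd %%xmm0, %[out] \n\t"   /* store result to out  */
        : [out] "=m"(out)             /* output: written only */
        : [a] "r"(a), [b] "r"(b)      /* inputs: read only    */
        : "xmm0",                     /* scratch register the asm overwrites */
          "memory");                  /* the asm reads *a and *b via pointers */
#else
    out = *a + *b;                    /* portable fallback */
#endif
    return out;
}
```

Dropping `"xmm0"` from the clobber list would still compile, which is what makes this class of bug easy to miss until an optimization such as `-ftree-vectorize` starts using the register.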

Martin Kroeker [Thu, 14 Feb 2019 14:21:36 +0000 (15:21 +0100)]
Fix missing clobber in x86/x86_64 blas_quickdivide inline assembly function (#2017)

* Fix missing clobber in blas_quickdivide assembly

Martin Kroeker [Thu, 14 Feb 2019 08:29:34 +0000 (09:29 +0100)]
Merge pull request #2013 from martin-frbg/issue2011

Fix invalid memory access in PPC gemm_beta

Martin Kroeker [Wed, 13 Feb 2019 21:08:37 +0000 (22:08 +0100)]
Fix out-of-bounds memory access in gemm_beta

Fixes #2011 (as suggested by davemq), assuming typo by K.Goto

Martin Kroeker [Wed, 13 Feb 2019 21:06:41 +0000 (22:06 +0100)]
Fix out-of-bounds memory access in gemm_beta

Fixes #2011 (as suggested by davemq) presuming typo by K.Goto

Martin Kroeker [Wed, 13 Feb 2019 19:15:56 +0000 (20:15 +0100)]
Merge pull request #2012 from maamountki/z14

[ZARCH] Many improvements

maamountki [Wed, 13 Feb 2019 19:06:25 +0000 (21:06 +0200)]
[ZARCH] Modify constraints

maamountki [Wed, 13 Feb 2019 10:54:35 +0000 (12:54 +0200)]
[ZARCH] Fix caxpy

Martin Kroeker [Tue, 12 Feb 2019 22:24:02 +0000 (23:24 +0100)]
Merge pull request #2010 from martin-frbg/issue2009

Fix declaration of input arguments in x86_64 GEMV, SYMV and DSCAL

Martin Kroeker [Tue, 12 Feb 2019 15:14:02 +0000 (16:14 +0100)]
Fix declaration of arguments in inline assembly

Argument 0 is modified so should be input and output

Martin Kroeker [Tue, 12 Feb 2019 15:00:18 +0000 (16:00 +0100)]
Fix declaration of assembly arguments in SSYMV and DSYMV microkernels

Arguments 0 and 1 are both input and output

Martin Kroeker [Tue, 12 Feb 2019 14:51:43 +0000 (15:51 +0100)]
Fix declaration of input arguments in inline assembly

Argument 0 is modified as it doubles as a counter

Martin Kroeker [Tue, 12 Feb 2019 14:33:48 +0000 (15:33 +0100)]
Fix declaration of input arguments in the x86_64 s/dGEMV_T and s/dGEMV_N kernels

Arguments 0 and 1 need to be tagged as both input and output
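
The input/output tagging described in the commits above can be sketched as follows. This is an illustration only, not one of the OpenBLAS microkernels: the function name `sum_i64` is hypothetical, and the assembly path is limited to x86_64 with a GCC-compatible compiler. The pointer and the element count are both modified inside the loop (the count doubles as the loop counter), so they are declared with `"+r"` rather than as plain `"r"` inputs.

```c
#include <stdint.h>

/* Hypothetical example: sum n 64-bit integers with an inline-asm loop. */
int64_t sum_i64(const int64_t *x, int64_t n)
{
    int64_t acc = 0;
    if (n <= 0)
        return 0;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ __volatile__(
        "1:                    \n\t"
        "addq (%[x]), %[acc]   \n\t"  /* acc += *x                 */
        "addq $8, %[x]         \n\t"  /* advance pointer: modified */
        "subq $1, %[n]         \n\t"  /* n is the loop counter     */
        "jnz 1b                \n\t"
        : [acc] "+r"(acc),            /* read and written          */
          [x]   "+r"(x),              /* read and written          */
          [n]   "+r"(n)               /* read and written          */
        :                             /* no input-only operands    */
        : "cc", "memory");            /* flags change; memory read */
#else
    for (int64_t i = 0; i < n; i++)   /* portable fallback */
        acc += x[i];
#endif
    return acc;
}
```

Had `x` and `n` been listed as input-only operands, the compiler would be entitled to assume their registers still hold the original values after the statement, which is the kind of miscompilation reported in #2009.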

maamountki [Tue, 12 Feb 2019 11:12:28 +0000 (13:12 +0200)]
[ZARCH] Fix cgemv_t_4

maamountki [Mon, 11 Feb 2019 14:01:13 +0000 (16:01 +0200)]
[ZARCH] Fix constraints and source code formatting

Martin Kroeker [Sun, 10 Feb 2019 22:24:45 +0000 (23:24 +0100)]
Fix potential memory leak in cpu enumeration on Linux (#2008)

* Fix potential memory leak in cpu enumeration with glibc

An early return after a failed call to sched_getaffinity would leak the previously allocated cpu_set_t. Wrong calculation of the size argument in that call increased the likelihood of that failure. Fixes #2003
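
A rough sketch of the corrected pattern in C, assuming glibc on Linux; it is not the OpenBLAS code, and the function name `count_affinity_cpus` is hypothetical. The size passed to `sched_getaffinity` comes from `CPU_ALLOC_SIZE` (a byte count), and the set is freed on the error path as well as on success. Where the glibc `CPU_*` macros are not available, the sketch simply returns -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes CPU_ALLOC and friends in glibc */
#endif
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: number of CPUs in the calling thread's affinity
 * mask, or -1 on failure or where the CPU_* macros are unavailable. */
int count_affinity_cpus(int max_cpus)
{
#if defined(__linux__) && defined(CPU_ALLOC)
    if (max_cpus <= 0)
        return -1;

    cpu_set_t *set = CPU_ALLOC(max_cpus);
    if (set == NULL)
        return -1;

    /* Size in bytes of a set that holds max_cpus bits. */
    size_t size = CPU_ALLOC_SIZE(max_cpus);
    CPU_ZERO_S(size, set);

    if (sched_getaffinity(0, size, set) != 0) {
        CPU_FREE(set);          /* release the set on the error path too */
        return -1;
    }

    int count = CPU_COUNT_S(size, set);
    CPU_FREE(set);
    return count;
#else
    (void)max_cpus;
    return -1;                  /* not supported on this platform */
#endif
}
```

Funnelling every exit through a `CPU_FREE` call is the whole fix; computing `size` once with `CPU_ALLOC_SIZE` keeps the allocation and the system call in agreement about how large the set is.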