platform/upstream/openblas.git
wjc404 [Sat, 27 Jul 2019 23:39:09 +0000 (07:39 +0800)]
Add files via upload

wjc404 [Sat, 20 Jul 2019 17:10:32 +0000 (01:10 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Sat, 20 Jul 2019 16:47:45 +0000 (00:47 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Sat, 20 Jul 2019 14:08:22 +0000 (22:08 +0800)]
Add files via upload

wjc404 [Sat, 20 Jul 2019 14:04:41 +0000 (22:04 +0800)]
Add files via upload

wjc404 [Sat, 20 Jul 2019 06:33:37 +0000 (14:33 +0800)]
Add files via upload

wjc404 [Fri, 19 Jul 2019 15:58:24 +0000 (23:58 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Fri, 19 Jul 2019 15:47:58 +0000 (23:47 +0800)]
Add files via upload

wjc404 [Wed, 17 Jul 2019 15:50:03 +0000 (23:50 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 15:47:30 +0000 (23:47 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 14:39:15 +0000 (22:39 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 13:27:41 +0000 (21:27 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Wed, 17 Jul 2019 09:02:35 +0000 (17:02 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Tue, 16 Jul 2019 16:55:06 +0000 (00:55 +0800)]
Update dgemm_kernel_4x8_haswell.S

wjc404 [Tue, 16 Jul 2019 16:46:51 +0000 (00:46 +0800)]
Update dgemm_kernel_4x8_haswell.S for zen2

replaced a bunch of vpermpd instructions with vpermilpd and vperm2f128

Martin Kroeker [Tue, 9 Jul 2019 18:08:52 +0000 (20:08 +0200)]
Merge pull request #2181 from isuruf/install_name

Change install_name on osx to match linux

Isuru Fernando [Mon, 8 Jul 2019 22:13:21 +0000 (17:13 -0500)]
Change install_name on osx to match linux

Martin Kroeker [Sun, 7 Jul 2019 16:28:21 +0000 (18:28 +0200)]
Merge pull request #2177 from martin-frbg/noaff

Fix surprising behaviour of NO_AFFINITY=0

Martin Kroeker [Sun, 7 Jul 2019 14:04:45 +0000 (16:04 +0200)]
Fix surprising behaviour of NO_AFFINITY=0

Martin Kroeker [Sat, 6 Jul 2019 16:07:19 +0000 (18:07 +0200)]
Merge pull request #2175 from martin-frbg/cmake-mingw-fixes

Fix CMAKE compilation with MinGW32 and add it to Appveyor

Martin Kroeker [Sat, 6 Jul 2019 13:07:15 +0000 (15:07 +0200)]
Mingw32 needs leading underscore on object names

(also copy BUNDERSCORE settings for FORTRAN from the corresponding Makefile)

Martin Kroeker [Sat, 6 Jul 2019 13:05:04 +0000 (15:05 +0200)]
Make disabling DYNAMIC_ARCH on unsupported systems work

needs to be unset in the cache for the change to have any effect

Martin Kroeker [Sat, 6 Jul 2019 13:02:39 +0000 (15:02 +0200)]
Add getarch flags to disable AVX on x86

(and other small fixes to match Makefile behaviour)

Martin Kroeker [Sat, 6 Jul 2019 12:30:33 +0000 (14:30 +0200)]
Add mingw builds to Appveyor config

Martin Kroeker [Sat, 6 Jul 2019 12:29:47 +0000 (14:29 +0200)]
Utest needs CBLAS but not necessarily FORTRAN

Martin Kroeker [Wed, 3 Jul 2019 17:16:30 +0000 (19:16 +0200)]
Merge pull request #2162 from martin-frbg/pgi

Fixes for PGI compiler

Martin Kroeker [Mon, 1 Jul 2019 19:06:02 +0000 (21:06 +0200)]
Merge pull request #2172 from quickwritereader/develop

power9 cgemm/ctrmm. new sgemm 8x16

AbdelRauf [Tue, 18 Jun 2019 15:55:56 +0000 (15:55 +0000)]
cgemm/ctrmm power9

Martin Kroeker [Sun, 30 Jun 2019 21:29:02 +0000 (23:29 +0200)]
Merge pull request #2170 from pkubaj/patch-1

Fix build on PPC970 for FreeBSD

pkubaj [Fri, 28 Jun 2019 10:31:45 +0000 (10:31 +0000)]
Fix build for PPC970 on FreeBSD pt.2

FreeBSD needs those macros too.

pkubaj [Fri, 28 Jun 2019 10:29:44 +0000 (10:29 +0000)]
Fix build for PPC970 on FreeBSD pt. 1

FreeBSD needs DCBT_ARG=0 as well.

Martin Kroeker [Tue, 25 Jun 2019 10:56:33 +0000 (12:56 +0200)]
Merge pull request #2169 from pkubaj/develop

Fix build on FreeBSD/powerpc64.

Piotr Kubaj [Tue, 25 Jun 2019 08:58:56 +0000 (10:58 +0200)]
Fix build on FreeBSD/powerpc64.

Signed-off-by: Piotr Kubaj <pkubaj@anongoth.pl>

Martin Kroeker [Thu, 20 Jun 2019 17:56:01 +0000 (19:56 +0200)]
PGI compiler does not like -march=native

Martin Kroeker [Wed, 19 Jun 2019 12:38:01 +0000 (14:38 +0200)]
Merge pull request #2167 from kavanabhat/dtrmm_power8_segfault

Fix DTRMMKERNEL register save for power8 64-bit mode (Fix for #2166)

kavanabhat [Wed, 19 Jun 2019 09:57:14 +0000 (15:27 +0530)]
Update dtrmm_kernel_16x4_power8.S

AbdelRauf [Mon, 17 Jun 2019 15:33:38 +0000 (15:33 +0000)]
new sgemm 8x16

Martin Kroeker [Sun, 16 Jun 2019 16:35:43 +0000 (18:35 +0200)]
Fix mov syntax

Martin Kroeker [Sun, 16 Jun 2019 13:04:10 +0000 (15:04 +0200)]
Zero ecx with a mov instruction

PGI assembler does not like the initialization in the constraints.

Martin Kroeker [Fri, 14 Jun 2019 06:08:11 +0000 (08:08 +0200)]
Update Makefile.x86_64

Martin Kroeker [Thu, 13 Jun 2019 21:01:35 +0000 (23:01 +0200)]
Do not force gcc options on non-gcc compilers

fixes compile failure with pgi 18.10 as reported on OpenBLAS-users

Martin Kroeker [Mon, 10 Jun 2019 17:12:45 +0000 (19:12 +0200)]
Merge pull request #2159 from martin-frbg/issue2149

Avoid unintentional activation of TLS codepath via USE_TLS=0

Martin Kroeker [Mon, 10 Jun 2019 15:24:15 +0000 (17:24 +0200)]
Avoid unintentional activation of TLS code via USE_TLS=0

fixes #2149

Martin Kroeker [Mon, 10 Jun 2019 12:08:11 +0000 (14:08 +0200)]
Merge pull request #2158 from martin-frbg/issue2143

Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds

Martin Kroeker [Mon, 10 Jun 2019 07:50:13 +0000 (09:50 +0200)]
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds

from #2143, -march=native precludes use of more specific options like -march=skylake-avx512 in individual kernels, and defeats the purpose of dynamic arch anyway.

Martin Kroeker [Sun, 9 Jun 2019 10:19:08 +0000 (12:19 +0200)]
Merge pull request #2157 from martin-frbg/2154-2

Add gfortran workaround for potential ABI violation

Martin Kroeker [Sun, 9 Jun 2019 07:31:13 +0000 (09:31 +0200)]
Update fc.cmake

Martin Kroeker [Sat, 8 Jun 2019 21:17:03 +0000 (23:17 +0200)]
Add gfortran workaround for potential ABI violation

for #2154

Martin Kroeker [Fri, 7 Jun 2019 11:23:07 +0000 (13:23 +0200)]
Merge pull request #2148 from TiborGY/cpp_thread_test_2

Thread safety tester using C++11 threading (cleaned history)

Martin Kroeker [Thu, 6 Jun 2019 11:43:12 +0000 (13:43 +0200)]
Merge pull request #2156 from martin-frbg/issue2154

Add gfortran workaround for C->FORTRAN ABI violation

Martin Kroeker [Thu, 6 Jun 2019 08:24:16 +0000 (10:24 +0200)]
Add gfortran workaround for ABI violations

for #2154 (see gcc bug 90329)

Martin Kroeker [Thu, 6 Jun 2019 08:18:40 +0000 (10:18 +0200)]
Add gfortran workaround for ABI violations in LAPACKE

for #2154 (see gcc bug 90329)

Martin Kroeker [Thu, 6 Jun 2019 05:42:56 +0000 (07:42 +0200)]
Merge pull request #2153 from quickwritereader/develop

improved power9 zgemm,sgemm

AbdelRauf [Wed, 5 Jun 2019 20:50:50 +0000 (20:50 +0000)]
conflict resolve

AbdelRauf [Wed, 5 Jun 2019 10:30:57 +0000 (10:30 +0000)]
power9 zgemm ztrmm optimized

Martin Kroeker [Wed, 5 Jun 2019 18:27:45 +0000 (20:27 +0200)]
Merge pull request #2145 from martin-frbg/1912-3

Separate implementations of AMAX and IAMAX on arm

Martin Kroeker [Wed, 5 Jun 2019 18:27:05 +0000 (20:27 +0200)]
Merge pull request #2110 from pc2/cpu-detection

Fix detection of Skylake processors when using GCC

Michael Lass [Fri, 3 May 2019 19:22:27 +0000 (21:22 +0200)]
c_check: Unlink correct file

Michael Lass [Fri, 3 May 2019 19:07:14 +0000 (21:07 +0200)]
Fix detection of AVX512 capable compilers in getarch

21eda8b5 introduced a check in getarch.c to test if the compiler is capable of
AVX512. This check currently fails, since the used __AVX2__ macro is only
defined if getarch itself was compiled with AVX2/AVX512 support. Make sure this
is the case by building getarch with -march=native on x86_64. It is only
supposed to run on the build host anyway.

AbdelRauf [Fri, 31 May 2019 22:48:16 +0000 (22:48 +0000)]
sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52

Martin Kroeker [Mon, 3 Jun 2019 09:01:33 +0000 (11:01 +0200)]
Document NO_AVX512

for #2151

TiborGY [Sat, 1 Jun 2019 19:36:41 +0000 (21:36 +0200)]
add c++ thread test option to Makefile.rule

TiborGY [Sat, 1 Jun 2019 19:32:52 +0000 (21:32 +0200)]
hook up c++ thread safety test (main Makefile)

TiborGY [Sat, 1 Jun 2019 19:30:06 +0000 (21:30 +0200)]
upload thread safety test folder

AbdelRauf [Thu, 23 May 2019 04:23:43 +0000 (04:23 +0000)]
improved zgemm power9 based on power8

Martin Kroeker [Thu, 30 May 2019 09:38:11 +0000 (11:38 +0200)]
Use generic kernels for complex (I)AMAX to support softfp

Martin Kroeker [Thu, 30 May 2019 09:25:43 +0000 (11:25 +0200)]
Ensure correct output for DAMAX with softfp

Martin Kroeker [Wed, 29 May 2019 13:02:51 +0000 (15:02 +0200)]
Separate implementations of AMAX and IAMAX on arm

As noted in #1912 and comment on #1942, the combined implementation happens to "do the right thing" on hardfp, but cannot return both value and index on softfp where they would have to share the return register

Martin Kroeker [Wed, 29 May 2019 12:09:10 +0000 (14:09 +0200)]
Merge pull request #2144 from xianyi/revert-2142-issue1912-2

Revert "Add softfp support in min/max kernels"

Martin Kroeker [Wed, 29 May 2019 12:07:17 +0000 (14:07 +0200)]
Revert "Add softfp support in min/max kernels"

Martin Kroeker [Tue, 28 May 2019 20:56:08 +0000 (22:56 +0200)]
Merge pull request #2142 from martin-frbg/issue1912-2

Add softfp support in min/max kernels

Martin Kroeker [Tue, 28 May 2019 18:50:40 +0000 (20:50 +0200)]
Merge pull request #2141 from martin-frbg/issue1912

Build and run utests independently of fortran

Martin Kroeker [Tue, 28 May 2019 18:34:22 +0000 (20:34 +0200)]
Add softfp support in min/max kernels

fix for #1912

Martin Kroeker [Sun, 26 May 2019 10:39:20 +0000 (12:39 +0200)]
Merge pull request #2140 from martin-frbg/pgi19

Do not try ancient PGI hacks with recent versions of that compiler

Martin Kroeker [Fri, 24 May 2019 11:02:23 +0000 (13:02 +0200)]
Build and run utests in any case, they do their own checks for fortran availability

Martin Kroeker [Wed, 22 May 2019 11:48:27 +0000 (13:48 +0200)]
Do not try ancient PGI hacks with recent versions of that compiler

should fix #2139

Martin Kroeker [Thu, 16 May 2019 10:08:16 +0000 (12:08 +0200)]
Merge pull request #2136 from martin-frbg/issue2126

Add option to allow combining USE_THREAD=0 with thread locking support

Martin Kroeker [Wed, 15 May 2019 21:40:06 +0000 (23:40 +0200)]
Merge pull request #2134 from tylerjereddy/skylake_regress_guard_may14

TST: add SkylakeX AVX512 CI test

Martin Kroeker [Wed, 15 May 2019 21:38:12 +0000 (23:38 +0200)]
Remove unrelated change

Martin Kroeker [Wed, 15 May 2019 21:36:17 +0000 (23:36 +0200)]
Add option USE_LOCKING but keep default settings intact

Martin Kroeker [Wed, 15 May 2019 21:21:20 +0000 (23:21 +0200)]
Add option USE_LOCKING for SMP-like locking in USE_THREAD=0 builds

Martin Kroeker [Wed, 15 May 2019 21:19:30 +0000 (23:19 +0200)]
Add option USE_LOCKING for single-threaded build with locking support

Martin Kroeker [Wed, 15 May 2019 21:18:43 +0000 (23:18 +0200)]
Add option USE_LOCKING for single-threaded build with locking support

for calling from concurrent threads

Tyler Reddy [Tue, 14 May 2019 18:32:23 +0000 (11:32 -0700)]
TST: add SkylakeX AVX512 CI test

* adapt the C-level reproducer code for some
recent SkylakeX AVX512 kernel issues, provided
by Isuru Fernando and modified by Martin Kroeker,
for usage in the utest suite

* add an Intel SDE SkylakeX emulation utest run to
the Azure CI matrix; a custom Docker build was required
because Ubuntu image provided by Azure does not support
AVX512VL instructions

Martin Kroeker [Tue, 14 May 2019 07:37:00 +0000 (09:37 +0200)]
Merge pull request #2130 from isuruf/drone

Drone CI for arm64 native builds

Isuru Fernando [Sun, 12 May 2019 20:25:45 +0000 (15:25 -0500)]
Fix typo

Isuru Fernando [Sun, 12 May 2019 20:14:46 +0000 (15:14 -0500)]
arm32 build

Isuru Fernando [Sun, 12 May 2019 20:09:53 +0000 (15:09 -0500)]
Remove qemu armv8 builds

Isuru Fernando [Sun, 12 May 2019 19:28:48 +0000 (14:28 -0500)]
See if ubuntu 19.04 fixes the ICE

Isuru Fernando [Sun, 12 May 2019 19:22:36 +0000 (14:22 -0500)]
parallel build

Isuru Fernando [Sun, 12 May 2019 19:17:12 +0000 (14:17 -0500)]
build without lapack on cmake

Isuru Fernando [Sun, 12 May 2019 19:09:29 +0000 (14:09 -0500)]
Add cmake builds and print options

Isuru Fernando [Sun, 12 May 2019 19:06:04 +0000 (14:06 -0500)]
Add a cmake build as well

Isuru Fernando [Sun, 12 May 2019 19:02:39 +0000 (14:02 -0500)]
no need of gcc in clang build

Isuru Fernando [Sun, 12 May 2019 18:56:59 +0000 (13:56 -0500)]
update yes

Isuru Fernando [Sun, 12 May 2019 18:55:38 +0000 (13:55 -0500)]
Fix typo

Isuru Fernando [Sun, 12 May 2019 18:55:04 +0000 (13:55 -0500)]
apt update

Isuru Fernando [Sun, 12 May 2019 18:53:58 +0000 (13:53 -0500)]
Switch to ubuntu and parallel jobs

Isuru Fernando [Sun, 12 May 2019 18:50:37 +0000 (13:50 -0500)]
gfortran->gcc-gfortran

Isuru Fernando [Sun, 12 May 2019 18:47:49 +0000 (13:47 -0500)]
Install gfortran and add a clang job