platform/upstream/openblas.git
4 years ago Update DYNAMIC_ARCH support for ARM64 and PPC (#2332)
Martin Kroeker [Wed, 4 Dec 2019 10:06:03 +0000 (11:06 +0100)]
Update DYNAMIC_ARCH support for ARM64 and PPC (#2332)

* Update DYNAMIC_ARCH list of ARM64 targets for gmake
* Update arm64 cpu list for runtime detection
* Update DYNAMIC_ARCH list of ARM64 targets for cmake and add POWERPC targets

4 years ago Merge pull request #2334 from martin-frbg/fix2228
Martin Kroeker [Tue, 3 Dec 2019 21:23:52 +0000 (22:23 +0100)]
Merge pull request #2334 from martin-frbg/fix2228

Remove misplaced file

4 years ago Add Intel Goldmont+ cpuid
Martin Kroeker [Tue, 3 Dec 2019 07:32:29 +0000 (08:32 +0100)]
Add Intel Goldmont+ cpuid

was originally in #2228 but that PR had misplaced the file in the toplevel directory

4 years ago Delete stray copy of dynamic.c from PR 2228
Martin Kroeker [Tue, 3 Dec 2019 07:24:10 +0000 (08:24 +0100)]
Delete stray copy of dynamic.c from PR 2228

4 years ago Merge pull request #20 from xianyi/develop
Martin Kroeker [Tue, 3 Dec 2019 07:22:40 +0000 (08:22 +0100)]
Merge pull request #20 from xianyi/develop

Rebase

4 years ago Merge pull request #2329 from isuruf/patch-1
Martin Kroeker [Mon, 2 Dec 2019 07:30:43 +0000 (08:30 +0100)]
Merge pull request #2329 from isuruf/patch-1

Workaround an ICE in clang 9.0.0

4 years ago Workaround an ICE in clang 9.0.0
Isuru Fernando [Sun, 1 Dec 2019 17:55:49 +0000 (11:55 -0600)]
Workaround an ICE in clang 9.0.0

This bug is not there in 8.x nor in the 9.0 daily snapshot.

4 years ago Merge pull request #2328 from martin-frbg/ppc9
Martin Kroeker [Sat, 30 Nov 2019 11:23:57 +0000 (12:23 +0100)]
Merge pull request #2328 from martin-frbg/ppc9

Fix precompiled kernels on POWER9 and make their use conditional on (old) gcc version

4 years ago Merge pull request #2324 from antonblanchard/power9_segv
Martin Kroeker [Fri, 29 Nov 2019 23:03:42 +0000 (00:03 +0100)]
Merge pull request #2324 from antonblanchard/power9_segv

Fix SEGV in cdot_power9

4 years ago Fix caxpy/caxpyc naming in localentry
Martin Kroeker [Fri, 29 Nov 2019 22:56:57 +0000 (23:56 +0100)]
Fix caxpy/caxpyc naming in localentry

4 years ago Fix caxpy/caxpyc naming in localentry
Martin Kroeker [Fri, 29 Nov 2019 22:54:15 +0000 (23:54 +0100)]
Fix caxpy/caxpyc naming in localentry

4 years ago Substitute precompiled gcc7 codes only when gcc is older than 9.x
Martin Kroeker [Fri, 29 Nov 2019 22:49:50 +0000 (23:49 +0100)]
Substitute precompiled gcc7 codes only when gcc is older than 9.x

4 years ago Add variable for gcc >=9 test
Martin Kroeker [Fri, 29 Nov 2019 22:47:23 +0000 (23:47 +0100)]
Add variable for gcc >=9 test

used in KERNEL.POWER9

4 years ago Merge pull request #19 from xianyi/develop
Martin Kroeker [Fri, 29 Nov 2019 22:44:09 +0000 (23:44 +0100)]
Merge pull request #19 from xianyi/develop

rebase

4 years ago Merge pull request #2323 from wjc404/develop
Martin Kroeker [Thu, 28 Nov 2019 19:55:16 +0000 (20:55 +0100)]
Merge pull request #2323 from wjc404/develop

some optimizations of AVX512 DGEMM

4 years ago Update param.h
wjc404 [Thu, 28 Nov 2019 11:57:50 +0000 (19:57 +0800)]
Update param.h

4 years ago Update dgemm_kernel_4x8_skylakex_2.c
wjc404 [Thu, 28 Nov 2019 11:56:35 +0000 (19:56 +0800)]
Update dgemm_kernel_4x8_skylakex_2.c

4 years ago Merge pull request #2321 from martin-frbg/issue2319
Martin Kroeker [Thu, 28 Nov 2019 08:30:24 +0000 (09:30 +0100)]
Merge pull request #2321 from martin-frbg/issue2319

Fix race conditions in multithreaded GEMM3M

4 years ago Merge pull request #2327 from martin-frbg/travisosx
Martin Kroeker [Thu, 28 Nov 2019 07:43:45 +0000 (08:43 +0100)]
Merge pull request #2327 from martin-frbg/travisosx

Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now

4 years ago Merge pull request #2326 from xianyi/revert-2325-travisosx
Martin Kroeker [Wed, 27 Nov 2019 23:17:19 +0000 (00:17 +0100)]
Merge pull request #2326 from xianyi/revert-2325-travisosx

Revert "Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for now"

4 years ago Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now
Martin Kroeker [Wed, 27 Nov 2019 23:15:36 +0000 (00:15 +0100)]
Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now

Travis recently appears unable to find a matching homebrew package for 32bit gfortran,
and the IOS crossbuild suffered from excessive output due to the known problem with "ASMNAME redefined"
warnings when CFLAGS is set in the environment

4 years ago Revert "Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for...
Martin Kroeker [Wed, 27 Nov 2019 23:09:06 +0000 (00:09 +0100)]
Revert "Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for now"

4 years ago Merge pull request #2325 from martin-frbg/travisosx
Martin Kroeker [Wed, 27 Nov 2019 20:59:36 +0000 (21:59 +0100)]
Merge pull request #2325 from martin-frbg/travisosx

Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for now

4 years ago Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now
Martin Kroeker [Wed, 27 Nov 2019 14:10:57 +0000 (15:10 +0100)]
Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now

Travis recently appears unable to find a matching homebrew package for 32bit gfortran,
and the IOS crossbuild suffered from excessive output due to the known problem with "ASMNAME redefined"
warnings when CFLAGS is set in the environment

4 years ago Fix SEGV in cdot_power9
Anton Blanchard [Wed, 27 Nov 2019 04:55:04 +0000 (21:55 -0700)]
Fix SEGV in cdot_power9

We were corrupting r2 because the local entry wasn't being
set up correctly.

4 years ago some optimizations
wjc404 [Tue, 26 Nov 2019 06:12:20 +0000 (14:12 +0800)]
some optimizations

4 years ago Fix AVX512 capability test (always returning zero)
Martin Kroeker [Sat, 23 Nov 2019 21:38:07 +0000 (22:38 +0100)]
Fix AVX512 capability test (always returning zero)

from #2322

4 years ago Fix race conditions in multithreaded GEMM3M
Martin Kroeker [Sat, 23 Nov 2019 18:54:56 +0000 (19:54 +0100)]
Fix race conditions in multithreaded GEMM3M

by adding barriers (and a mutex lock for the non-OpenMP case) like it was already done for GEMM in level3_thread.c some time ago

4 years ago Add the cpuid of the business/rackmount version of z15 as well
Martin Kroeker [Thu, 21 Nov 2019 17:14:29 +0000 (18:14 +0100)]
Add the cpuid of the business/rackmount version of z15 as well

4 years ago Merge pull request #2316 from sharkcz/s390x
Martin Kroeker [Thu, 21 Nov 2019 17:03:00 +0000 (18:03 +0100)]
Merge pull request #2316 from sharkcz/s390x

zarch: treat z15 as z14 instead of generic

4 years ago Merge pull request #2317 from aarnez/develop
Martin Kroeker [Thu, 21 Nov 2019 16:59:21 +0000 (17:59 +0100)]
Merge pull request #2317 from aarnez/develop

Change bad usage of "asum" to "sum" in ZARCH versions of ?sum

4 years ago Change bad usage of "asum" to "sum" in ZARCH versions of ?sum
Andreas Arnez [Fri, 20 Sep 2019 16:32:47 +0000 (18:32 +0200)]
Change bad usage of "asum" to "sum" in ZARCH versions of ?sum

The ZARCH implementations of ?sum contain a cut & paste-error: An inline
assembly argument is named "sum", but the assembly references "asum"
instead.  The mismatch causes a build error.  This is fixed.

4 years ago zarch: treat z15 as z14 instead of generic
Dan Horák [Thu, 21 Nov 2019 11:49:54 +0000 (12:49 +0100)]
zarch: treat z15 as z14 instead of generic

Signed-off-by: Dan Horák <dan@danny.cz>

4 years ago Merge pull request #2315 from ewanglong/develop
Martin Kroeker [Thu, 21 Nov 2019 04:06:44 +0000 (05:06 +0100)]
Merge pull request #2315 from ewanglong/develop

revised fix windows compatible for #2313

4 years ago revised fix windows compatible for #2313
Wang, Long [Thu, 21 Nov 2019 02:19:40 +0000 (10:19 +0800)]
revised fix windows compatible for #2313

Signed-off-by: Wang, Long <long1.wang@intel.com>

4 years ago Merge pull request #2314 from Jehan/wip/Jehan/fix-openblas-crash
Martin Kroeker [Wed, 20 Nov 2019 15:16:35 +0000 (16:16 +0100)]
Merge pull request #2314 from Jehan/wip/Jehan/fix-openblas-crash

Fix usage of TerminateThread() causing critical section corruption.

4 years ago Merge pull request #2312 from martin-frbg/power8be
Martin Kroeker [Wed, 20 Nov 2019 14:12:06 +0000 (15:12 +0100)]
Merge pull request #2312 from martin-frbg/power8be

Further Power8 big-endian corrections

4 years ago Merge pull request #2313 from ewanglong/develop
Martin Kroeker [Wed, 20 Nov 2019 13:49:15 +0000 (14:49 +0100)]
Merge pull request #2313 from ewanglong/develop

Fix the integer overflow issue for large matrix size

4 years ago For the sake of windows compatible, used "unsigned long long" to ensure 64-bit length
Wang, Long [Wed, 20 Nov 2019 13:30:16 +0000 (21:30 +0800)]
For the sake of windows compatible, used "unsigned long long" to ensure 64-bit length

Signed-off-by: Wang, Long <long1.wang@intel.com>

4 years ago Fix usage of TerminateThread() causing critical section corruption.
Jehan [Wed, 20 Nov 2019 11:21:35 +0000 (12:21 +0100)]
Fix usage of TerminateThread() causing critical section corruption.

This patch was submitted to the GIMP project by a publisher who wished to
remain confidential (hence anonymously). I am just passing the patch along.
Here is the explanation that came with it:

First they remind us what Microsoft documentation says about
TerminateThread:
> TerminateThread is a dangerous function that should only be used in
> the most extreme cases. You should call TerminateThread only if you
> know exactly what the target thread is doing, and you control all of
> the code that the target thread could possibly be running at the time
> of the termination.
(https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-terminatethread)

Then they say that a 5-millisecond time-out might not be long enough for
the thread to exit gracefully. They propose to set it to a much higher
value (for instance, 5 seconds here).

And finally, you should always check the return value of
WaitForSingleObject(). In particular, you want to run TerminateThread()
only if WaitForSingleObject() failed, not in the success case.
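
The three points above can be sketched roughly as follows. This is an illustration only, not the code from the patch: `must_force_terminate` and `shutdown_worker` are hypothetical names, and the `#else` branch merely mirrors the documented Win32 constants so that the decision logic can be compiled on other platforms.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins mirroring the documented Win32 values, so the decision
   logic below also compiles on other platforms (illustration only). */
#define WAIT_OBJECT_0 0x00000000UL
#define WAIT_TIMEOUT  0x00000102UL
#define WAIT_FAILED   0xFFFFFFFFUL
#endif

/* Hypothetical helper: forced termination is warranted only when the
   wait did NOT report that the thread has exited. */
static int must_force_terminate(unsigned long wait_result)
{
    return wait_result != WAIT_OBJECT_0;
}

#ifdef _WIN32
/* Hypothetical shutdown routine for a single worker thread handle. */
static void shutdown_worker(HANDLE thread)
{
    assert(thread != NULL);
    /* Give the thread 5 seconds (rather than 5 ms) to exit on its own. */
    DWORD result = WaitForSingleObject(thread, 5000);
    if (must_force_terminate(result))
        TerminateThread(thread, 0);   /* last resort, see the warning above */
    CloseHandle(thread);
}
#endif
```

Keeping the decision in a separate function makes the "only on failure" rule explicit: a successful wait never reaches TerminateThread().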

4 years ago Fix the integer overflow issue for large matrix size
Wang, Long [Wed, 20 Nov 2019 03:50:37 +0000 (11:50 +0800)]
Fix the integer overflow issue for large matrix size

For large matrices, e.g. M=N=K with M>1290, int mnk=M*N*K will overflow.
This leads to wrongly branching to the single-threaded path, and
performance degrades significantly.

Signed-off-by: Wang, Long <long1.wang@intel.com>
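
The threshold quoted above can be checked with a small sketch. It assumes a 32-bit `int` (as on the common platforms) and uses hypothetical helper names; it is not the code from the patch, only an illustration of why widening the product to `unsigned long long` avoids the problem.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the product m*n*k computed in 64-bit arithmetic.
   Each operand is widened BEFORE multiplying, so no intermediate
   result is ever computed in int. */
static unsigned long long gemm_work(int m, int n, int k)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    return (unsigned long long)m * (unsigned long long)n
         * (unsigned long long)k;
}

/* Returns 1 if m*n*k would not fit into a signed int. */
static int overflows_int(int m, int n, int k)
{
    return gemm_work(m, n, k) > (unsigned long long)INT_MAX;
}
```

With a 32-bit int, 1290^3 = 2,146,689,000 still fits below INT_MAX (2,147,483,647), while 1291^3 = 2,151,685,171 does not, which matches the M>1290 limit in the commit message.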

4 years ago Merge pull request #2310 from martin-frbg/ppc440
Martin Kroeker [Sun, 17 Nov 2019 22:19:48 +0000 (23:19 +0100)]
Merge pull request #2310 from martin-frbg/ppc440

Fix PPC440 big-endian support and disable the QCDOC qalloc routine by default

4 years ago Define alternate kernels for big-endian POWER8
Martin Kroeker [Sun, 17 Nov 2019 22:12:10 +0000 (23:12 +0100)]
Define alternate kernels for big-endian POWER8

4 years ago Fix compilation for big-endian POWER8
Martin Kroeker [Sun, 17 Nov 2019 21:58:32 +0000 (22:58 +0100)]
Fix compilation for big-endian POWER8

4 years ago Define alternate kernels for big-endian PPC440
Martin Kroeker [Sun, 17 Nov 2019 18:25:08 +0000 (19:25 +0100)]
Define alternate kernels for big-endian PPC440

4 years ago Disable the old QCDOC qalloc by default and copy utility functions from memory.c
Martin Kroeker [Sun, 17 Nov 2019 18:22:04 +0000 (19:22 +0100)]
Disable the old QCDOC qalloc by default and copy utility functions from memory.c

1. qalloc() appears to have been a special routine written for the PPC440-based QCDOC supercomputer(s) from around 2005, and its source does not seem to be readily available. So switch the #if 1 in the code to rely on standard malloc() by default.
2. Utility functions like get_num_procs and get_num_threads, which were added to the "normally" used memory.c in the meantime, were still missing here.

4 years ago Merge pull request #17 from xianyi/develop
Martin Kroeker [Sun, 17 Nov 2019 18:09:49 +0000 (19:09 +0100)]
Merge pull request #17 from xianyi/develop

rebase

4 years ago Merge pull request #2309 from martin-frbg/ppc970-be
Martin Kroeker [Sun, 17 Nov 2019 17:22:24 +0000 (18:22 +0100)]
Merge pull request #2309 from martin-frbg/ppc970-be

Fix PPC970 big-endian support

4 years ago Define alternate kernels for big-endian PPC970
Martin Kroeker [Sun, 17 Nov 2019 14:19:39 +0000 (15:19 +0100)]
Define alternate kernels for big-endian PPC970

The altivec versions of SGEMM and CGEMM fail most tests in LAPACK-TESTING when compiled for big endian; STRSM/CTRSM even cause segfaults. The rot kernels either fail the corresponding utest or lead to failures in LAPACK-TESTING.

4 years ago Use "generic" S/CGEMM unroll M on big-endian PPC970
Martin Kroeker [Sun, 17 Nov 2019 14:10:26 +0000 (15:10 +0100)]
Use "generic" S/CGEMM unroll M on big-endian PPC970

as the respective PPC970 "altivec" kernels give wrong results when compiled for big endian

4 years ago Merge pull request #2308 from martin-frbg/ctestfix
Martin Kroeker [Fri, 15 Nov 2019 07:33:17 +0000 (08:33 +0100)]
Merge pull request #2308 from martin-frbg/ctestfix

Fix potential issue in the c/z blas3 ctests

4 years ago Fix potential spurious failure from uninitialized variable
Martin Kroeker [Thu, 14 Nov 2019 23:20:36 +0000 (00:20 +0100)]
Fix potential spurious failure from uninitialized variable

4 years ago Fix potential spurious failure from uninitialized variable
Martin Kroeker [Thu, 14 Nov 2019 23:19:24 +0000 (00:19 +0100)]
Fix potential spurious failure from uninitialized variable

5 years ago Merge pull request #2305 from wjc404/develop
Martin Kroeker [Tue, 12 Nov 2019 06:38:37 +0000 (07:38 +0100)]
Merge pull request #2305 from wjc404/develop

AVX512 CGEMM & ZGEMM kernels

5 years ago AVX512 CGEMM & ZGEMM kernels
wjc404 [Mon, 11 Nov 2019 12:04:52 +0000 (20:04 +0800)]
AVX512 CGEMM & ZGEMM kernels

96-99% 1-thread performance of MKL2018

5 years ago Merge pull request #15 from xianyi/develop
Martin Kroeker [Sat, 9 Nov 2019 17:52:08 +0000 (18:52 +0100)]
Merge pull request #15 from xianyi/develop

rebase

5 years ago Merge pull request #2300 from wjc404/develop
Martin Kroeker [Wed, 6 Nov 2019 06:27:33 +0000 (07:27 +0100)]
Merge pull request #2300 from wjc404/develop

Optimize SGEMM on SKYLAKEX CPUs

5 years ago optimizations of software prefetching
wjc404 [Tue, 5 Nov 2019 05:36:56 +0000 (13:36 +0800)]
optimizations of software prefetching

5 years ago Merge pull request #2302 from martin-frbg/ppc970
Martin Kroeker [Mon, 4 Nov 2019 21:55:05 +0000 (22:55 +0100)]
Merge pull request #2302 from martin-frbg/ppc970

Disable three-operand DCBT on PPC970 regardless of operating system

5 years ago Merge pull request #2301 from martin-frbg/ppc8be
Martin Kroeker [Mon, 4 Nov 2019 21:54:28 +0000 (22:54 +0100)]
Merge pull request #2301 from martin-frbg/ppc8be

Disable IDAMIN/MAX and IZAMIN/MAX optimizations on big-endian POWER8

5 years ago Merge pull request #2294 from martin-frbg/ios-cleanup
Martin Kroeker [Mon, 4 Nov 2019 21:53:58 +0000 (22:53 +0100)]
Merge pull request #2294 from martin-frbg/ios-cleanup

Remove obsolete workarounds for IOS on ARMV8

5 years ago Add files via upload
wjc404 [Mon, 4 Nov 2019 12:10:12 +0000 (20:10 +0800)]
Add files via upload

5 years ago optimizations via software prefetches
wjc404 [Mon, 4 Nov 2019 11:37:19 +0000 (19:37 +0800)]
optimizations via software prefetches

5 years ago Use the two-operand form of DCBT on all PPC970 regardless of OS
Martin Kroeker [Sun, 3 Nov 2019 21:55:31 +0000 (22:55 +0100)]
Use the two-operand form of DCBT on all PPC970 regardless of OS

There seems to be no advantage to the three-operand form used in the earliest GotoBLAS kernels, and it causes compilation problems on platforms other than the previously special-cased ones as well

5 years ago The assembly microkernel is not safe to use on ELFv1
Martin Kroeker [Sun, 3 Nov 2019 21:42:46 +0000 (22:42 +0100)]
The assembly microkernel is not safe to use on ELFv1

5 years ago The assembly microkernel is not safe to use on ELFv1
Martin Kroeker [Sun, 3 Nov 2019 21:41:19 +0000 (22:41 +0100)]
The assembly microkernel is not safe to use on ELFv1

5 years ago The assembly microkernel is not safe to use on ELFv1
Martin Kroeker [Sun, 3 Nov 2019 21:39:06 +0000 (22:39 +0100)]
The assembly microkernel is not safe to use on ELFv1

5 years ago The assembly microkernel is not safe to use on ELFv1
Martin Kroeker [Sun, 3 Nov 2019 21:37:27 +0000 (22:37 +0100)]
The assembly microkernel is not safe to use on ELFv1

5 years ago Merge pull request #13 from xianyi/develop
Martin Kroeker [Sun, 3 Nov 2019 21:33:31 +0000 (22:33 +0100)]
Merge pull request #13 from xianyi/develop

resync with upstream

5 years ago Add files via upload
wjc404 [Sat, 2 Nov 2019 02:09:19 +0000 (10:09 +0800)]
Add files via upload

5 years ago Add files via upload
wjc404 [Sat, 2 Nov 2019 02:06:13 +0000 (10:06 +0800)]
Add files via upload

5 years ago new sgemm kernel for skylakex
wjc404 [Fri, 1 Nov 2019 16:00:48 +0000 (00:00 +0800)]
new sgemm kernel for skylakex

5 years ago update sgemm_q on skylakex cpus
wjc404 [Fri, 1 Nov 2019 15:59:18 +0000 (23:59 +0800)]
update sgemm_q on skylakex cpus

5 years ago Merge pull request #2296 from kdunee/develop
Martin Kroeker [Mon, 28 Oct 2019 12:24:18 +0000 (13:24 +0100)]
Merge pull request #2296 from kdunee/develop

Fixed a minor cmake problem, occurring when DYNAMIC_ARCH=ON and CMAKE_C_FLAGS was empty

5 years ago Fixed a minor cmake problem, occurring when DYNAMIC_CORE=ON and CMAKE_C_FLAGS was...
k.dunikowski [Mon, 28 Oct 2019 07:51:05 +0000 (08:51 +0100)]
Fixed a minor cmake problem, occurring when DYNAMIC_CORE=ON and CMAKE_C_FLAGS was empty

5 years ago Merge pull request #2293 from martin-frbg/pr2288
Martin Kroeker [Fri, 25 Oct 2019 21:46:39 +0000 (23:46 +0200)]
Merge pull request #2293 from martin-frbg/pr2288

Add support for NetBSD by adding it to the existing xBSD conditionals

5 years ago Remove special parameter set for obsolete IOS/ARMV8 workaround
Martin Kroeker [Fri, 25 Oct 2019 21:07:00 +0000 (23:07 +0200)]
Remove special parameter set for obsolete IOS/ARMV8 workaround

5 years ago Remove the IOS fallbacks to generic C kernels
Martin Kroeker [Fri, 25 Oct 2019 21:02:37 +0000 (23:02 +0200)]
Remove the IOS fallbacks to generic C kernels

5 years ago Fix regex to parse -R options with and without whitespace
Martin Kroeker [Fri, 25 Oct 2019 20:52:30 +0000 (22:52 +0200)]
Fix regex to parse -R options with and without whitespace

Both forms are seen on NetBSD (#2288)
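
The two forms mentioned above ("-R/dir" and "-R /dir") can be illustrated with a small sketch. Note the swap: the actual fix is a regex in the build scripts, whereas this is plain C string handling over an already tokenized argument list, and `parse_rpath` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the directory named by the -R option at
   argv[i], or NULL if argv[i] is not a usable -R option.
   *used receives the number of argv entries consumed (1 or 2). */
static const char *parse_rpath(int argc, char **argv, int i, int *used)
{
    assert(argv != NULL && used != NULL && i >= 0 && i < argc);
    *used = 1;
    if (strncmp(argv[i], "-R", 2) != 0)
        return NULL;                 /* not a -R option at all */
    if (argv[i][2] != '\0')
        return argv[i] + 2;          /* attached form: -R/usr/pkg/lib */
    if (i + 1 < argc) {
        *used = 2;
        return argv[i + 1];          /* detached form: -R /usr/pkg/lib */
    }
    return NULL;                     /* trailing -R without a directory */
}
```

A caller would advance its index by `*used` after each call, so that the directory of a detached "-R dir" pair is not parsed again as a separate option.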

5 years ago Add NetBSD to the xBSD conditionals
Martin Kroeker [Fri, 25 Oct 2019 10:52:49 +0000 (12:52 +0200)]
Add NetBSD to the xBSD conditionals

5 years ago Add NetBSD
Martin Kroeker [Fri, 25 Oct 2019 10:51:06 +0000 (12:51 +0200)]
Add NetBSD

5 years ago Merge pull request #2292 from martin-frbg/g95fixes
Martin Kroeker [Fri, 25 Oct 2019 08:35:17 +0000 (10:35 +0200)]
Merge pull request #2292 from martin-frbg/g95fixes

Improve support for g95 and non-GNU ld

5 years ago Merge pull request #2291 from martin-frbg/gensymbol
Martin Kroeker [Fri, 25 Oct 2019 08:34:50 +0000 (10:34 +0200)]
Merge pull request #2291 from martin-frbg/gensymbol

Fix netlib 3.7/3.8 function enumeration for linktest

5 years ago Merge pull request #2282 from martin-frbg/issue2281
Martin Kroeker [Fri, 25 Oct 2019 07:56:30 +0000 (09:56 +0200)]
Merge pull request #2282 from martin-frbg/issue2281

Optimize RPCC function on ARM64

5 years ago Merge pull request #2290 from martin-frbg/cpuidfixes
Martin Kroeker [Thu, 24 Oct 2019 20:52:15 +0000 (22:52 +0200)]
Merge pull request #2290 from martin-frbg/cpuidfixes

Fixup x86 cpuid changes from #2283

5 years ago Improve support for g95 and non-GNU ld
Martin Kroeker [Thu, 24 Oct 2019 20:43:27 +0000 (22:43 +0200)]
Improve support for g95 and non-GNU ld

Auto-add "-fno-second-underscore" option to make LAPACKE compile (as it calls LAPACK functions that may have gotten a second underscore added otherwise). Also support -R for rpath when parsing compiler directives in f_check

5 years ago Move most lapack 3.7/3.8 additions to the embedded_underscores list
Martin Kroeker [Thu, 24 Oct 2019 19:26:20 +0000 (21:26 +0200)]
Move most lapack 3.7/3.8 additions to the embedded_underscores list

to allow linktest to pass with a compiler that adds a second underscore to such names
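
As a rough illustration of why names with embedded underscores need special handling, the sketch below models the g77-style naming convention: one underscore is appended to every external name, and a second one is appended when the name itself already contains an underscore (unless the compiler is told otherwise, e.g. with -fno-second-underscore). `fortran_symbol` and the sample names are hypothetical, and lower-case input is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the external symbol for the Fortran
   routine `name` into `out`. Returns 0 on success, -1 if `out` is too
   small. Assumes `name` is already lower-case. */
static int fortran_symbol(const char *name, int second_underscore,
                          char *out, size_t outlen)
{
    assert(name != NULL && out != NULL);
    /* The extra underscore applies only to names that contain one. */
    const char *suffix =
        (second_underscore && strchr(name, '_') != NULL) ? "__" : "_";
    int n = snprintf(out, outlen, "%s%s", name, suffix);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

Under this convention a name without an underscore maps to the same symbol either way, which is why only the routines with embedded underscores have to be listed separately.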

5 years ago Disable direct clock register access on IOS and Android
Martin Kroeker [Thu, 24 Oct 2019 19:18:17 +0000 (21:18 +0200)]
Disable direct clock register access on IOS and Android

as I find conflicting information on accessibility from non-privileged processes

5 years ago Remove prototype of unused, unimplemented function (#2274)
luzpaz [Thu, 24 Oct 2019 16:56:53 +0000 (12:56 -0400)]
Remove prototype of unused, unimplemented function (#2274)

* Fix source typo

Found via `codespell -q 3 -L amin,als,ba,dum,mone,nd,nto,orign -S Changelog.txt,./lapack*`

* Remove beta-thread function per request

5 years ago Restore Goldmont ID and improve QEMU support
Martin Kroeker [Thu, 24 Oct 2019 16:45:27 +0000 (18:45 +0200)]
Restore Goldmont ID and improve QEMU support

#2283 had inadvertently removed Goldmont+, and cpuid was reporting a mix of Core2 and Pentium2 for some QEMU configurations

5 years ago Merge pull request #12 from xianyi/develop
Martin Kroeker [Thu, 24 Oct 2019 16:40:13 +0000 (18:40 +0200)]
Merge pull request #12 from xianyi/develop

resync with upstream

5 years ago Merge pull request #2286 from wjc404/develop
Martin Kroeker [Sun, 20 Oct 2019 10:44:19 +0000 (12:44 +0200)]
Merge pull request #2286 from wjc404/develop

AVX512 DGEMM kernel

5 years ago native support for icopy_4
wjc404 [Fri, 18 Oct 2019 19:54:44 +0000 (03:54 +0800)]
native support for icopy_4

90% MKL 1-thread performance.

5 years ago Update dgemm_kernel_8x8_skylakex.c
wjc404 [Fri, 18 Oct 2019 07:00:17 +0000 (15:00 +0800)]
Update dgemm_kernel_8x8_skylakex.c

5 years ago some correction
wjc404 [Fri, 18 Oct 2019 06:58:07 +0000 (14:58 +0800)]
some correction

5 years ago make further changes to icopy_8 easier
wjc404 [Fri, 18 Oct 2019 02:47:31 +0000 (10:47 +0800)]
make further changes to icopy_8 easier

5 years ago Add files via upload
wjc404 [Wed, 16 Oct 2019 11:23:36 +0000 (19:23 +0800)]
Add files via upload

5 years ago Update dgemm_kernel_8x8_skylakex.c
wjc404 [Wed, 16 Oct 2019 02:14:51 +0000 (10:14 +0800)]
Update dgemm_kernel_8x8_skylakex.c

5 years ago Update dgemm_kernel_8x8_skylakex.c
wjc404 [Tue, 15 Oct 2019 19:20:08 +0000 (03:20 +0800)]
Update dgemm_kernel_8x8_skylakex.c

5 years ago Add files via upload
wjc404 [Tue, 15 Oct 2019 18:01:13 +0000 (02:01 +0800)]
Add files via upload