platform/upstream/openblas.git
Martin Kroeker [Fri, 29 Nov 2019 22:44:09 +0000 (23:44 +0100)]
Merge pull request #19 from xianyi/develop

rebase

Martin Kroeker [Thu, 28 Nov 2019 19:55:16 +0000 (20:55 +0100)]
Merge pull request #2323 from wjc404/develop

some optimizations of AVX512 DGEMM

wjc404 [Thu, 28 Nov 2019 11:57:50 +0000 (19:57 +0800)]
Update param.h

wjc404 [Thu, 28 Nov 2019 11:56:35 +0000 (19:56 +0800)]
Update dgemm_kernel_4x8_skylakex_2.c

Martin Kroeker [Thu, 28 Nov 2019 08:30:24 +0000 (09:30 +0100)]
Merge pull request #2321 from martin-frbg/issue2319

Fix race conditions in multithreaded GEMM3M

Martin Kroeker [Thu, 28 Nov 2019 07:43:45 +0000 (08:43 +0100)]
Merge pull request #2327 from martin-frbg/travisosx

Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now

Martin Kroeker [Wed, 27 Nov 2019 23:17:19 +0000 (00:17 +0100)]
Merge pull request #2326 from xianyi/revert-2325-travisosx

Revert "Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for now"

Martin Kroeker [Wed, 27 Nov 2019 23:15:36 +0000 (00:15 +0100)]
Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now

 Travis recently appears unable to find a matching homebrew package for 32bit gfortran,
and the IOS crossbuild suffered from excessive output due to the known problem with "ASMNAME redefined"
warnings when CFLAGS is set in the environment

Martin Kroeker [Wed, 27 Nov 2019 23:09:06 +0000 (00:09 +0100)]
Revert "Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for now"

Martin Kroeker [Wed, 27 Nov 2019 20:59:36 +0000 (21:59 +0100)]
Merge pull request #2325 from martin-frbg/travisosx

Cleanup Travis IOS xbuild and disable FORTRAN on 32bit and ios builds for now

Martin Kroeker [Wed, 27 Nov 2019 14:10:57 +0000 (15:10 +0100)]
Cleanup IOS build and disable FORTRAN on 32bit and ios builds for now

Travis recently appears unable to find a matching homebrew package for 32bit gfortran,
and the IOS crossbuild suffered from excessive output due to the known problem with "ASMNAME redefined"
warnings when CFLAGS is set in the environment

wjc404 [Tue, 26 Nov 2019 06:12:20 +0000 (14:12 +0800)]
some optimizations

Martin Kroeker [Sat, 23 Nov 2019 21:38:07 +0000 (22:38 +0100)]
Fix AVX512 capability test (always returning zero)

from #2322

Martin Kroeker [Sat, 23 Nov 2019 18:54:56 +0000 (19:54 +0100)]
Fix race conditions in multithreaded GEMM3M

by adding barriers (and a mutex lock for the non-OpenMP case), as was already done for GEMM in level3_thread.c some time ago

Martin Kroeker [Thu, 21 Nov 2019 17:14:29 +0000 (18:14 +0100)]
Add the cpuid of the business/rackmount version of z15 as well

Martin Kroeker [Thu, 21 Nov 2019 17:03:00 +0000 (18:03 +0100)]
Merge pull request #2316 from sharkcz/s390x

zarch: treat z15 as z14 instead of generic

Martin Kroeker [Thu, 21 Nov 2019 16:59:21 +0000 (17:59 +0100)]
Merge pull request #2317 from aarnez/develop

Change bad usage of "asum" to "sum" in ZARCH versions of ?sum

Andreas Arnez [Fri, 20 Sep 2019 16:32:47 +0000 (18:32 +0200)]
Change bad usage of "asum" to "sum" in ZARCH versions of ?sum

The ZARCH implementations of ?sum contain a cut-and-paste error: an inline
assembly argument is named "sum", but the assembly references "asum"
instead.  The mismatch causes a build error.  This is fixed.

Dan Horák [Thu, 21 Nov 2019 11:49:54 +0000 (12:49 +0100)]
zarch: treat z15 as z14 instead of generic

Signed-off-by: Dan Horák <dan@danny.cz>
Martin Kroeker [Thu, 21 Nov 2019 04:06:44 +0000 (05:06 +0100)]
Merge pull request #2315 from ewanglong/develop

Revised fix for Windows compatibility for #2313

Wang, Long [Thu, 21 Nov 2019 02:19:40 +0000 (10:19 +0800)]
Revised fix for Windows compatibility for #2313

Signed-off-by: Wang, Long <long1.wang@intel.com>
Martin Kroeker [Wed, 20 Nov 2019 15:16:35 +0000 (16:16 +0100)]
Merge pull request #2314 from Jehan/wip/Jehan/fix-openblas-crash

Fix usage of TerminateThread() causing critical section corruption.

Martin Kroeker [Wed, 20 Nov 2019 14:12:06 +0000 (15:12 +0100)]
Merge pull request #2312 from martin-frbg/power8be

Further Power8 big-endian corrections

Martin Kroeker [Wed, 20 Nov 2019 13:49:15 +0000 (14:49 +0100)]
Merge pull request #2313 from ewanglong/develop

Fix the integer overflow issue for large matrix size

Wang, Long [Wed, 20 Nov 2019 13:30:16 +0000 (21:30 +0800)]
For Windows compatibility, use "unsigned long long" to ensure 64-bit length

Signed-off-by: Wang, Long <long1.wang@intel.com>
Jehan [Wed, 20 Nov 2019 11:21:35 +0000 (12:21 +0100)]
Fix usage of TerminateThread() causing critical section corruption.

This patch was submitted to the GIMP project by a publisher wishing to
remain anonymous. I am just passing the patch along.
Here is the explanation that came with the patch:

First they remind us what Microsoft documentation says about
TerminateThread:
> TerminateThread is a dangerous function that should only be used in
> the most extreme cases. You should call TerminateThread only if you
> know exactly what the target thread is doing, and you control all of
> the code that the target thread could possibly be running at the time
> of the termination.
(https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-terminatethread)

Then they say that a 5-millisecond time-out might not be long enough for
the thread to exit gracefully. They propose setting it to a much higher
value (for instance, 5 seconds here).

Finally, you should always check the return value of
WaitForSingleObject(). In particular, you want to run TerminateThread()
only if WaitForSingleObject() failed, not in the success case.
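
The advice above can be sketched in C roughly as follows. This is an illustration only, not the code from the patch: `THREAD_EXIT_TIMEOUT_MS`, `must_force_terminate` and `stop_worker` are hypothetical names, and the non-Windows stand-in definitions exist only so the decision logic can be compiled elsewhere.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the decision logic also compiles off Windows; the numeric
   values mirror the Windows SDK definitions. */
typedef unsigned long DWORD;
#define WAIT_OBJECT_0 0x00000000UL
#define WAIT_TIMEOUT  0x00000102UL
#define WAIT_FAILED   0xFFFFFFFFUL
#endif

/* Hypothetical constant: the message suggests "for instance" 5 seconds. */
#define THREAD_EXIT_TIMEOUT_MS 5000UL

/* Returns 1 only when the wait did NOT report that the thread exited,
   i.e. on a time-out or an outright failure. */
static int must_force_terminate(DWORD wait_result)
{
    return wait_result != WAIT_OBJECT_0;
}

#ifdef _WIN32
/* Hypothetical shutdown helper: wait for a graceful exit first and fall
   back to TerminateThread() only when the wait was not successful. */
static void stop_worker(HANDLE thread)
{
    DWORD r = WaitForSingleObject(thread, THREAD_EXIT_TIMEOUT_MS);
    if (must_force_terminate(r)) {
        TerminateThread(thread, 0);   /* last resort, see the quoted docs */
    }
    CloseHandle(thread);
}
#endif
```

Keeping the decision in a small pure function makes the intended behaviour easy to check: a successful wait never leads to TerminateThread(), while a time-out or failure does.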

Wang, Long [Wed, 20 Nov 2019 03:50:37 +0000 (11:50 +0800)]
Fix the integer overflow issue for large matrix size

For large matrices, e.g. M=N=K with M>1290, int mnk=M*N*K overflows.
This leads to wrongly branching to the single-threaded path, which
degrades performance significantly.

Signed-off-by: Wang, Long <long1.wang@intel.com>
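
The threshold quoted in this commit can be checked by hand: 1290^3 = 2,146,689,000 still fits in a signed 32-bit int (maximum 2,147,483,647), while 1291^3 = 2,151,685,171 does not. A minimal C sketch of the widening idea follows; `mnk_product` and `mnk_overflows_int` are hypothetical helper names, not the OpenBLAS source, and non-negative dimensions are assumed.

```c
#include <limits.h>

/* Hypothetical helper: widen each factor to 64 bits *before* multiplying,
   so the product cannot wrap around at 2^31. Assumes m, n, k >= 0. */
static unsigned long long mnk_product(int m, int n, int k)
{
    return (unsigned long long)m * (unsigned long long)n * (unsigned long long)k;
}

/* Returns 1 if m*n*k would not fit in a plain int. */
static int mnk_overflows_int(int m, int n, int k)
{
    return mnk_product(m, n, k) > (unsigned long long)INT_MAX;
}
```

On a platform with 32-bit int, `mnk_overflows_int(1290, 1290, 1290)` is 0 and `mnk_overflows_int(1291, 1291, 1291)` is 1, which matches the M>1290 bound stated in the message.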
Martin Kroeker [Sun, 17 Nov 2019 22:19:48 +0000 (23:19 +0100)]
Merge pull request #2310 from martin-frbg/ppc440

Fix PPC440 big-endian support and disable the QCDOC qalloc routine by default

Martin Kroeker [Sun, 17 Nov 2019 22:12:10 +0000 (23:12 +0100)]
Define alternate kernels for big-endian POWER8

Martin Kroeker [Sun, 17 Nov 2019 21:58:32 +0000 (22:58 +0100)]
Fix compilation for big-endian POWER8

Martin Kroeker [Sun, 17 Nov 2019 18:25:08 +0000 (19:25 +0100)]
Define alternate kernels for big-endian PPC440

Martin Kroeker [Sun, 17 Nov 2019 18:22:04 +0000 (19:22 +0100)]
Disable the old QCDOC qalloc by default and copy utility functions from memory.c

1. qalloc() appears to have been a special routine written for the PPC440-based QCDOC supercomputer(s) from around 2005; its source does not seem to be readily available. So switch the #if 1 in the code to rely on standard malloc() by default.
2. Utility functions like get_num_procs, get_num_threads that were added to the "normally" used memory.c in the meantime were still missing here.

Martin Kroeker [Sun, 17 Nov 2019 18:09:49 +0000 (19:09 +0100)]
Merge pull request #17 from xianyi/develop

rebase

Martin Kroeker [Sun, 17 Nov 2019 17:22:24 +0000 (18:22 +0100)]
Merge pull request #2309 from martin-frbg/ppc970-be

Fix PPC970 big-endian support

Martin Kroeker [Sun, 17 Nov 2019 14:19:39 +0000 (15:19 +0100)]
Define alternate kernels for big-endian PPC970

The altivec versions of SGEMM and CGEMM fail most tests in LAPACK-TESTING when compiled for big endian; STRSM/CTRSM even cause segfaults. The rot kernels either fail the corresponding utest or lead to failures in LAPACK-TESTING.

Martin Kroeker [Sun, 17 Nov 2019 14:10:26 +0000 (15:10 +0100)]
Use "generic" S/CGEMM unroll M on big-endian PPC970

as the respective PPC970 "altivec" kernels give wrong results when compiled for big endian

Martin Kroeker [Fri, 15 Nov 2019 07:33:17 +0000 (08:33 +0100)]
Merge pull request #2308 from martin-frbg/ctestfix

Fix potential issue in the c/z blas3 ctests

Martin Kroeker [Thu, 14 Nov 2019 23:20:36 +0000 (00:20 +0100)]
Fix potential spurious failure from uninitialized variable

Martin Kroeker [Thu, 14 Nov 2019 23:19:24 +0000 (00:19 +0100)]
Fix potential spurious failure from uninitialized variable

Martin Kroeker [Tue, 12 Nov 2019 06:38:37 +0000 (07:38 +0100)]
Merge pull request #2305 from wjc404/develop

AVX512 CGEMM & ZGEMM kernels

wjc404 [Mon, 11 Nov 2019 12:04:52 +0000 (20:04 +0800)]
AVX512 CGEMM & ZGEMM kernels

96-99% of the 1-thread performance of MKL2018

Martin Kroeker [Sat, 9 Nov 2019 17:52:08 +0000 (18:52 +0100)]
Merge pull request #15 from xianyi/develop

rebase

Martin Kroeker [Wed, 6 Nov 2019 06:27:33 +0000 (07:27 +0100)]
Merge pull request #2300 from wjc404/develop

Optimize SGEMM on SKYLAKEX CPUs

wjc404 [Tue, 5 Nov 2019 05:36:56 +0000 (13:36 +0800)]
optimizations of software prefetching

Martin Kroeker [Mon, 4 Nov 2019 21:55:05 +0000 (22:55 +0100)]
Merge pull request #2302 from martin-frbg/ppc970

Disable three-operand DCBT on PPC970 regardless of operating system

Martin Kroeker [Mon, 4 Nov 2019 21:54:28 +0000 (22:54 +0100)]
Merge pull request #2301 from martin-frbg/ppc8be

Disable IDAMIN/MAX and IZAMIN/MAX optimizations on big-endian POWER8

Martin Kroeker [Mon, 4 Nov 2019 21:53:58 +0000 (22:53 +0100)]
Merge pull request #2294 from martin-frbg/ios-cleanup

Remove obsolete workarounds for IOS on ARMV8

wjc404 [Mon, 4 Nov 2019 12:10:12 +0000 (20:10 +0800)]
Add files via upload

wjc404 [Mon, 4 Nov 2019 11:37:19 +0000 (19:37 +0800)]
optimizations via software prefetches

Martin Kroeker [Sun, 3 Nov 2019 21:55:31 +0000 (22:55 +0100)]
Use the two-operand form of DCBT on all PPC970 regardless of OS

There seems to be no advantage to the three-operand form used in the earliest GotoBLAS kernels, and it causes compilation problems on platforms other than the previously special-cased ones as well

Martin Kroeker [Sun, 3 Nov 2019 21:42:46 +0000 (22:42 +0100)]
The assembly microkernel is not safe to use on ELFv1

Martin Kroeker [Sun, 3 Nov 2019 21:41:19 +0000 (22:41 +0100)]
The assembly microkernel is not safe to use on ELFv1

Martin Kroeker [Sun, 3 Nov 2019 21:39:06 +0000 (22:39 +0100)]
The assembly microkernel is not safe to use on ELFv1

Martin Kroeker [Sun, 3 Nov 2019 21:37:27 +0000 (22:37 +0100)]
The assembly microkernel is not safe to use on ELFv1

Martin Kroeker [Sun, 3 Nov 2019 21:33:31 +0000 (22:33 +0100)]
Merge pull request #13 from xianyi/develop

resync with upstream

wjc404 [Sat, 2 Nov 2019 02:09:19 +0000 (10:09 +0800)]
Add files via upload

wjc404 [Sat, 2 Nov 2019 02:06:13 +0000 (10:06 +0800)]
Add files via upload

wjc404 [Fri, 1 Nov 2019 16:00:48 +0000 (00:00 +0800)]
new sgemm kernel for skylakex

wjc404 [Fri, 1 Nov 2019 15:59:18 +0000 (23:59 +0800)]
update sgemm_q on skylakex cpus

Martin Kroeker [Mon, 28 Oct 2019 12:24:18 +0000 (13:24 +0100)]
Merge pull request #2296 from kdunee/develop

Fixed a minor cmake problem, occurring when DYNAMIC_ARCH=ON and CMAKE_C_FLAGS was empty

k.dunikowski [Mon, 28 Oct 2019 07:51:05 +0000 (08:51 +0100)]
Fixed a minor cmake problem, occurring when DYNAMIC_CORE=ON and CMAKE_C_FLAGS was empty

Martin Kroeker [Fri, 25 Oct 2019 21:46:39 +0000 (23:46 +0200)]
Merge pull request #2293 from martin-frbg/pr2288

Add support for NetBSD by adding it to the existing xBSD conditionals

Martin Kroeker [Fri, 25 Oct 2019 21:07:00 +0000 (23:07 +0200)]
Remove special parameter set for obsolete IOS/ARMV8 workaround

Martin Kroeker [Fri, 25 Oct 2019 21:02:37 +0000 (23:02 +0200)]
Remove the IOS fallbacks to generic C kernels

Martin Kroeker [Fri, 25 Oct 2019 20:52:30 +0000 (22:52 +0200)]
Fix regex to parse -R options with and without whitespace

Both forms are seen on NetBSD (#2288)
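
To illustrate the two forms mentioned here, the sketch below accepts both `-R/dir` and `-R /dir` when scanning a flag string. It is written in C for illustration and is not the actual parsing code touched by this commit; `first_rpath` is a hypothetical helper and the directory used in the usage note is only an example.

```c
#include <string.h>

/* Hypothetical helper: copy the directory named by the first -R option in
   `flags` into `out`. Accepts "-R/dir" as well as "-R /dir".
   Returns 1 if a directory was found and fits in `out`, 0 otherwise. */
static int first_rpath(const char *flags, char *out, size_t outsz)
{
    const char *p = flags;
    while ((p = strstr(p, "-R")) != NULL) {
        /* Only accept -R at the start of a whitespace-separated token. */
        if (p == flags || p[-1] == ' ' || p[-1] == '\t') {
            const char *dir = p + 2;
            while (*dir == ' ' || *dir == '\t')
                dir++;                          /* optional whitespace */
            size_t len = strcspn(dir, " \t");   /* directory ends at next blank */
            if (len > 0 && len < outsz) {
                memcpy(out, dir, len);
                out[len] = '\0';
                return 1;
            }
        }
        p += 2;
    }
    return 0;
}
```

With this sketch, both `"-L/usr/lib -R/usr/pkg/lib -lm"` and `"-L/usr/lib -R /usr/pkg/lib -lm"` yield `/usr/pkg/lib`. The sketch does not handle quoting or a `-R` that is followed by another option instead of a directory.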

Martin Kroeker [Fri, 25 Oct 2019 10:52:49 +0000 (12:52 +0200)]
Add NetBSD to the xBSD conditionals

Martin Kroeker [Fri, 25 Oct 2019 10:51:06 +0000 (12:51 +0200)]
Add NetBSD

Martin Kroeker [Fri, 25 Oct 2019 08:35:17 +0000 (10:35 +0200)]
Merge pull request #2292 from martin-frbg/g95fixes

Improve support for g95 and non-GNU ld

Martin Kroeker [Fri, 25 Oct 2019 08:34:50 +0000 (10:34 +0200)]
Merge pull request #2291 from martin-frbg/gensymbol

Fix netlib 3.7/3.8 function enumeration for linktest

Martin Kroeker [Fri, 25 Oct 2019 07:56:30 +0000 (09:56 +0200)]
Merge pull request #2282 from martin-frbg/issue2281

Optimize RPCC function on ARM64

Martin Kroeker [Thu, 24 Oct 2019 20:52:15 +0000 (22:52 +0200)]
Merge pull request #2290 from martin-frbg/cpuidfixes

Fixup x86 cpuid changes from #2283

Martin Kroeker [Thu, 24 Oct 2019 20:43:27 +0000 (22:43 +0200)]
Improve support for g95 and non-GNU ld

Auto-add "-fno-second-underscore" option to make LAPACKE compile (as it calls LAPACK functions that may have gotten a second underscore added otherwise). Also support -R for rpath when parsing compiler directives in f_check
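
For readers unfamiliar with the second-underscore convention: compilers that follow it append one underscore to every external Fortran name and a second one to names that already contain an underscore, so a C caller expecting a single trailing underscore fails to link. The sketch below models that rule; `fortran_symbol` is a hypothetical helper for illustration, not part of OpenBLAS or any compiler.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: build the linker symbol a Fortran compiler would
   emit for `name`. One underscore is always appended; when
   `second_underscore` is non-zero, names that already contain '_' get one
   more. Returns 0 on success, -1 if `out` is too small. */
static int fortran_symbol(const char *name, int second_underscore,
                          char *out, size_t outsz)
{
    size_t len = strlen(name);
    int extra = second_underscore && strchr(name, '_') != NULL;

    if (len + 1 + (size_t)extra + 1 > outsz)
        return -1;
    for (size_t i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    out[len] = '_';
    if (extra)
        out[len + 1] = '_';
    out[len + 1 + (size_t)extra] = '\0';
    return 0;
}
```

Under this model `dgemm` becomes `dgemm_` either way, while a name with an embedded underscore such as `dsytrf_aa` becomes `dsytrf_aa__` with the second underscore enabled and `dsytrf_aa_` with it disabled, which is the form a C wrapper with a single trailing underscore would look for.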

Martin Kroeker [Thu, 24 Oct 2019 19:26:20 +0000 (21:26 +0200)]
Move most lapack 3.7/3.8 additions to the embedded_underscores list

to allow linktest to pass with a compiler that adds a second underscore to such names

Martin Kroeker [Thu, 24 Oct 2019 19:18:17 +0000 (21:18 +0200)]
Disable direct clock register access on IOS and Android

as I find conflicting information on accessibility from non-privileged processes

luzpaz [Thu, 24 Oct 2019 16:56:53 +0000 (12:56 -0400)]
Remove prototype of unused, unimplemented function (#2274)

* Fix source typo

Found via `codespell -q 3 -L amin,als,ba,dum,mone,nd,nto,orign -S Changelog.txt,./lapack*`

* Remove beta-thread function per request

Martin Kroeker [Thu, 24 Oct 2019 16:45:27 +0000 (18:45 +0200)]
Restore Goldmont ID and improve QEMU support

#2283 had inadvertently removed Goldmont+, and cpuid was reporting a mix of Core2 and Pentium2 for some QEMU configurations

Martin Kroeker [Thu, 24 Oct 2019 16:40:13 +0000 (18:40 +0200)]
Merge pull request #12 from xianyi/develop

resync with upstream

Martin Kroeker [Sun, 20 Oct 2019 10:44:19 +0000 (12:44 +0200)]
Merge pull request #2286 from wjc404/develop

AVX512 DGEMM kernel

wjc404 [Fri, 18 Oct 2019 19:54:44 +0000 (03:54 +0800)]
native support for icopy_4

90% of MKL 1-thread performance.

wjc404 [Fri, 18 Oct 2019 07:00:17 +0000 (15:00 +0800)]
Update dgemm_kernel_8x8_skylakex.c

wjc404 [Fri, 18 Oct 2019 06:58:07 +0000 (14:58 +0800)]
some correction

wjc404 [Fri, 18 Oct 2019 02:47:31 +0000 (10:47 +0800)]
make further changes to icopy_8 easier

wjc404 [Wed, 16 Oct 2019 11:23:36 +0000 (19:23 +0800)]
Add files via upload

wjc404 [Wed, 16 Oct 2019 02:14:51 +0000 (10:14 +0800)]
Update dgemm_kernel_8x8_skylakex.c

wjc404 [Tue, 15 Oct 2019 19:20:08 +0000 (03:20 +0800)]
Update dgemm_kernel_8x8_skylakex.c

wjc404 [Tue, 15 Oct 2019 18:01:13 +0000 (02:01 +0800)]
Add files via upload

wjc404 [Tue, 15 Oct 2019 18:00:34 +0000 (02:00 +0800)]
Add files via upload

Martin Kroeker [Wed, 9 Oct 2019 20:06:09 +0000 (22:06 +0200)]
Merge pull request #2283 from martin-frbg/issue2176

Support QEMU virtual cpu in 64bit mode as CORE2 or BARCELONA

Martin Kroeker [Wed, 9 Oct 2019 16:24:13 +0000 (18:24 +0200)]
Support QEMU cpu calling itself 64bit AMD Athlon as well

Some QEMU instances pretend to be "AuthenticAMD" with the same family 6/model 6 even when running on an Intel host
(could be related to qemu or libvirt version and/or kvm availability). Also fix the define to depend on __x86_64__ set by the
compiler; the defines using __64BIT__ only work for getarch_2nd.

Martin Kroeker [Tue, 8 Oct 2019 20:30:02 +0000 (22:30 +0200)]
Support QEMU virtual cpu as CORE2

qemu itself claims it is a 64bit P6, which does not exist in the wild.

Martin Kroeker [Tue, 8 Oct 2019 18:13:14 +0000 (20:13 +0200)]
Simplify OSX/IOS cross-compilation and add a CI test for it (#2279)

* Add automatic fixups for OSX/IOS cross-compilation

* Add OSX/IOS cross-compilation test to Travis CI

* Handle platforms that lack hwcap.h by falling back to ARMV8

* Fix PROLOGUE for OSX/IOS

Martin Kroeker [Tue, 8 Oct 2019 18:12:08 +0000 (20:12 +0200)]
Update common_arm64.h

Martin Kroeker [Tue, 8 Oct 2019 08:25:25 +0000 (10:25 +0200)]
Merge pull request #2280 from martin-frbg/iosfix

Add overlooked part of IOS compilation fix

Martin Kroeker [Tue, 8 Oct 2019 06:37:50 +0000 (08:37 +0200)]
Remove automatic label postfixes from macro included only once

Martin Kroeker [Tue, 8 Oct 2019 06:32:52 +0000 (08:32 +0200)]
Merge pull request #11 from xianyi/develop

sync with upstream

Martin Kroeker [Tue, 8 Oct 2019 06:09:26 +0000 (08:09 +0200)]
Fix accidental duplication of jump instruction

Martin Kroeker [Sun, 6 Oct 2019 21:01:54 +0000 (23:01 +0200)]
Merge pull request #2277 from martin-frbg/issue2275

Rewrite ARMV8 code to allow cross-compilation for IOS

Martin Kroeker [Sun, 6 Oct 2019 09:12:44 +0000 (11:12 +0200)]
Merge pull request #2276 from xianyi/revert-2272-thread-sqrt-of-negative

Revert "Avoid taking root of negative number in symv_thread.c"

Martin Kroeker [Sat, 5 Oct 2019 08:52:47 +0000 (10:52 +0200)]
Move 32bit OSX build back to xcode 8.3 but switch to gcc8

Martin Kroeker [Fri, 4 Oct 2019 12:53:23 +0000 (14:53 +0200)]
Make local labels in macro compatible with the xcode assembler

... which does not perform the automatic numbering on instantiation that the _@ suffix signifies