platform/upstream/openblas.git
Martin Kroeker [Sun, 16 Jan 2022 23:10:21 +0000 (00:10 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:09:14 +0000 (00:09 +0100)]
Fix offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:08:20 +0000 (00:08 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:07:33 +0000 (00:07 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:06:41 +0000 (00:06 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:05:33 +0000 (00:05 +0100)]
Fix pivot index for negative increments

Martin Kroeker [Mon, 10 Jan 2022 08:12:52 +0000 (09:12 +0100)]
Merge pull request #3511 from martin-frbg/cmakeutils

Fix handling of ifdef/ifndef in CMAKE

Martin Kroeker [Sun, 9 Jan 2022 22:31:59 +0000 (23:31 +0100)]
Fix handling of ifdef/ifndef

Martin Kroeker [Sun, 9 Jan 2022 13:50:51 +0000 (14:50 +0100)]
Merge pull request #3510 from martin-frbg/issue3505

Fix recent SkylakeX/DYNAMIC_ARCH DGEMM breakage

Martin Kroeker [Sun, 9 Jan 2022 13:50:26 +0000 (14:50 +0100)]
Merge pull request #3508 from snadampal/v1_n2

OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics

Martin Kroeker [Sat, 8 Jan 2022 22:50:34 +0000 (23:50 +0100)]
make DYNAMIC_ARCH option available to getarch_2nd/param.h

Martin Kroeker [Sat, 8 Jan 2022 22:48:58 +0000 (23:48 +0100)]
Forward DYNAMIC_ARCH option to Makefile.prebuild

Martin Kroeker [Sat, 8 Jan 2022 22:48:13 +0000 (23:48 +0100)]
SkylakeX: match parameters to dgemm kernels for dyn/non-dyn

Sunita Nadampalli [Fri, 7 Jan 2022 00:28:17 +0000 (00:28 +0000)]
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics

Martin Kroeker [Sat, 1 Jan 2022 11:12:32 +0000 (12:12 +0100)]
Merge pull request #3502 from jgillis/develop

Fix cmake crosscompilation for core2 target

Martin Kroeker [Sat, 1 Jan 2022 10:43:17 +0000 (11:43 +0100)]
Merge pull request #3504 from martin-frbg/issue3503

Guard against omp_get_num_places returning zero

Martin Kroeker [Fri, 31 Dec 2021 23:46:23 +0000 (00:46 +0100)]
Guard against omp_get_num_places returning zero

jgillis [Wed, 29 Dec 2021 21:50:20 +0000 (22:50 +0100)]
Fix cmake crosscompilation for core2 target

Missing HAVE_SSE* cmake variables cause cc.cmake to forget about `-msse*` flags

Martin Kroeker [Tue, 28 Dec 2021 21:54:27 +0000 (22:54 +0100)]
Merge pull request #3500 from martin-frbg/osx_dyn_xerbla

Ensure that the right xerbla gets included in OSX DYNAMIC_ARCH builds

Martin Kroeker [Tue, 28 Dec 2021 18:06:55 +0000 (19:06 +0100)]
Ensure that the right xerbla gets included in OSX DYNAMIC_ARCH builds

Martin Kroeker [Tue, 28 Dec 2021 17:51:56 +0000 (18:51 +0100)]
Merge pull request #3496 from yuanhec/develop

Fixed MSA enabled optimization on Loongson-3A4000

yuanhecai [Mon, 27 Dec 2021 01:50:57 +0000 (09:50 +0800)]
Merge remote-tracking branch 'upstream/develop' into develop

yuanhecai [Thu, 23 Dec 2021 12:04:27 +0000 (20:04 +0800)]
Fixed MSA enabled optimization on Loongson-3A4000

Martin Kroeker [Wed, 22 Dec 2021 07:34:12 +0000 (08:34 +0100)]
Merge pull request #3491 from gxw-loongson/develop

loongarch64: Optimize dgemm_kernel

gxw [Tue, 21 Dec 2021 01:22:59 +0000 (09:22 +0800)]
loongarch64: Optimize dgemm_kernel

Martin Kroeker [Sun, 19 Dec 2021 20:22:19 +0000 (21:22 +0100)]
Update version to 0.3.19.dev

Martin Kroeker [Sun, 19 Dec 2021 20:21:47 +0000 (21:21 +0100)]
Update version to 0.3.19.dev

Martin Kroeker [Sun, 19 Dec 2021 20:21:13 +0000 (21:21 +0100)]
Merge pull request #3489 from xianyi/release-0.3.0

Merge back from release branch to copy 0.3.19 tag

Martin Kroeker [Sun, 19 Dec 2021 19:55:57 +0000 (20:55 +0100)]
Update version to 0.3.19

Martin Kroeker [Sun, 19 Dec 2021 19:54:49 +0000 (20:54 +0100)]
Merge pull request #3488 from xianyi/develop

Update from develop branch for 0.3.19 release

Martin Kroeker [Sun, 19 Dec 2021 15:35:07 +0000 (16:35 +0100)]
Merge branch 'release-0.3.0' into develop

Martin Kroeker [Sun, 19 Dec 2021 15:32:04 +0000 (16:32 +0100)]
Update version to 0.3.19

Martin Kroeker [Sun, 19 Dec 2021 15:30:47 +0000 (16:30 +0100)]
Merge pull request #3487 from martin-frbg/0319changes

Update Changelog for 0.3.19 release

Martin Kroeker [Sun, 19 Dec 2021 13:34:14 +0000 (14:34 +0100)]
Update with 0.3.19 changes

Martin Kroeker [Sat, 18 Dec 2021 22:09:30 +0000 (23:09 +0100)]
Merge pull request #3486 from martin-frbg/nvhpc

Update -tp option for recent nvfortran on x86_64

Martin Kroeker [Sat, 18 Dec 2021 20:56:26 +0000 (21:56 +0100)]
Update -tp option for recent nvfortran on x86_64

Martin Kroeker [Fri, 17 Dec 2021 10:08:36 +0000 (11:08 +0100)]
Merge pull request #3485 from martin-frbg/issue3453

Add feature-based fallback for unknown x86_64 cpus

Martin Kroeker [Thu, 16 Dec 2021 21:02:49 +0000 (22:02 +0100)]
Add feature-based fallback for unknown x86_64 cpus

Martin Kroeker [Thu, 16 Dec 2021 20:50:28 +0000 (21:50 +0100)]
Merge pull request #3484 from martin-frbg/issue3481

Enable delayed (re)init of Windows threads beyond Cygwin

Martin Kroeker [Thu, 16 Dec 2021 20:50:06 +0000 (21:50 +0100)]
Merge pull request #3480 from wzgpeter/develop

fix bug in zscal function

Martin Kroeker [Thu, 16 Dec 2021 20:49:19 +0000 (21:49 +0100)]
Merge pull request #3478 from ffontaine/develop

Makefile: also consider -O, -Og and -Os when stripping flags

Martin Kroeker [Thu, 16 Dec 2021 16:28:28 +0000 (17:28 +0100)]
define "unlikely" on non-cygwin too

Martin Kroeker [Thu, 16 Dec 2021 15:58:12 +0000 (16:58 +0100)]
Open up delayed (re)init to non-Cygwin OS as well

Martin Kroeker [Thu, 16 Dec 2021 10:54:20 +0000 (11:54 +0100)]
Merge pull request #3483 from martin-frbg/issue3482

Fix bracing in cpuid_mips64.c

Martin Kroeker [Thu, 16 Dec 2021 08:37:58 +0000 (09:37 +0100)]
move brace inside the ifdef block

Wu Zhigang [Wed, 15 Dec 2021 08:22:19 +0000 (00:22 -0800)]
fix bug in zscal function

memset can not be used in zscal because of
the stride parameters.

Signed-off-by: Wu Zhigang <zhigang.wu@starfivetech.com>

Thomas De Schampheleire [Tue, 14 Dec 2021 22:36:16 +0000 (23:36 +0100)]
Makefile: also consider -O, -Og and -Os when stripping flags

gcc also supports -O, -Og and -Os as optimization flags.
They may be given on the make command-line by users.

For the calculation of LAPACK_NOOPT, all such flags should be considered.

Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
[Retrieved from:
https://git.buildroot.net/buildroot/tree/package/openblas/0003-Makefile-also-consider-Os-when-determining-LAPACK_NO.patch]
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>

Martin Kroeker [Mon, 13 Dec 2021 07:34:52 +0000 (08:34 +0100)]
Move the threads overflow flag under the protection of the local blas lock (#3476)

* Move accesses to the overflow flag into the scope of the blas lock

Martin Kroeker [Sun, 12 Dec 2021 18:09:08 +0000 (19:09 +0100)]
Merge pull request #3475 from wjc404/optimize-A53-dgemm

optimize cgemm on ARM cortex A53 & cortex A55

Martin Kroeker [Sun, 12 Dec 2021 18:08:27 +0000 (19:08 +0100)]
Merge pull request #3474 from rafaelcfsousa/rafael/cmake_power

Add CMake support for Power

Jia-Chen [Sun, 12 Dec 2021 09:22:52 +0000 (17:22 +0800)]
optimize cgemm on ARM cortex A53 & cortex A55

Martin Kroeker [Sat, 11 Dec 2021 19:35:22 +0000 (20:35 +0100)]
Merge pull request #3464 from binebrank/arm_sve_sgemm

Add sgemm part for Arm SVE

Bine Brank [Sat, 11 Dec 2021 15:37:23 +0000 (16:37 +0100)]
fix UNROLL_MN and add to targets for SVE

Bine Brank [Sat, 11 Dec 2021 15:35:08 +0000 (16:35 +0100)]
adjust Makefile.L3 for SVE

Rafael Cardoso Fernandes Sousa [Fri, 10 Dec 2021 23:35:28 +0000 (17:35 -0600)]
Use CMake variables instead of as

Rafael Cardoso Fernandes Sousa [Thu, 9 Dec 2021 15:57:39 +0000 (09:57 -0600)]
Fix error cmake (small kernels)

Rafael Cardoso Fernandes Sousa [Thu, 9 Dec 2021 14:28:17 +0000 (08:28 -0600)]
Fix cmake for power

Martin Kroeker [Thu, 9 Dec 2021 06:28:57 +0000 (07:28 +0100)]
Merge pull request #3472 from kavanabhat/p10_aixas_p8

Fallback for Power kernels

Martin Kroeker [Wed, 8 Dec 2021 21:19:32 +0000 (22:19 +0100)]
Merge pull request #3469 from martin-frbg/issue2986

Roll back SkylakeX DGEMM kernels to 4x8 when compiling for DYNAMIC_ARCH

Martin Kroeker [Wed, 8 Dec 2021 21:18:44 +0000 (22:18 +0100)]
Fix ar path in ARMV7 Darwin NDK build on Azure (#3473)

* Adjust ar command in ARMV7 Darwin NDK build after homebrew update to NDK 23b

kavanabhat [Wed, 8 Dec 2021 09:52:23 +0000 (03:52 -0600)]
Fallback for Power kernels

Martin Kroeker [Mon, 6 Dec 2021 18:43:54 +0000 (19:43 +0100)]
roll back DGEMM kernels to 4x8 when compiling for DYNAMIC_ARCH

Martin Kroeker [Mon, 6 Dec 2021 18:42:51 +0000 (19:42 +0100)]
switch DGEMM unroll parameters for SkylakeX if DYNAMIC_ARCH

Bine Brank [Sun, 5 Dec 2021 17:47:29 +0000 (18:47 +0100)]
sgemm v2x8 SVE kernel

Martin Kroeker [Sun, 5 Dec 2021 14:52:44 +0000 (15:52 +0100)]
Merge pull request #3468 from martin-frbg/issue3467

Fix hardcoded library name in cpp_thread_test Makefile

Martin Kroeker [Sun, 5 Dec 2021 13:38:41 +0000 (14:38 +0100)]
Fix hardcoded library name

Bine Brank [Sun, 5 Dec 2021 13:03:08 +0000 (14:03 +0100)]
strmm sve v1x8 kernel

Martin Kroeker [Sat, 4 Dec 2021 21:24:02 +0000 (22:24 +0100)]
Fix DYNAMIC_ARCH builds with CMAKE on OSX and add corresponding test to Azure CI (#3409)

* Use linker response files and a custom link command to get around ARG_MAX limitations on OSX
* Reconfigure a redundant job on Azure to test shared library builds with CMAKE and DYNAMIC_ARCH on OSX

Martin Kroeker [Fri, 3 Dec 2021 11:12:20 +0000 (12:12 +0100)]
Merge pull request #3466 from rafaelcfsousa/rafael/small_matrix_p10

[POWER] Add small matrix for sgemm/dgemm on Power10

Martin Kroeker [Fri, 3 Dec 2021 11:11:43 +0000 (12:11 +0100)]
Merge pull request #3465 from kavanabhat/develop

Fix truncated assembler checks used to build Power10 Kernels

Martin Kroeker [Fri, 3 Dec 2021 10:41:53 +0000 (11:41 +0100)]
Delete test_zhemv.c

Martin Kroeker [Fri, 3 Dec 2021 09:01:20 +0000 (10:01 +0100)]
Merge pull request #3455 from cenewcombe/develop

Fix unsafe read during final iteration of zsymv_L_sse2.S

kavanabhat [Thu, 2 Dec 2021 07:59:38 +0000 (13:29 +0530)]
Update Makefile.system

kavanabhat [Wed, 1 Dec 2021 15:00:43 +0000 (20:30 +0530)]
Merge pull request #1 from kavanabhat/as_check_fix

Fix truncated assembler checks used for Power10 kernel build

kavanabhat [Wed, 1 Dec 2021 14:00:40 +0000 (19:30 +0530)]
Fix truncated assembler checks

Bine Brank [Mon, 29 Nov 2021 20:25:05 +0000 (21:25 +0100)]
trmm sve copy functions for single precision

Rafael Cardoso Fernandes Sousa [Tue, 16 Nov 2021 20:47:41 +0000 (14:47 -0600)]
[POWER] Add support for SMALL_MATRIX_OPT

Bine Brank [Sun, 28 Nov 2021 17:12:47 +0000 (18:12 +0100)]
add sgemm kernel and copy functions for sgemm and ssymm

Martin Kroeker [Fri, 26 Nov 2021 15:14:55 +0000 (16:14 +0100)]
Merge pull request #3425 from binebrank/arm_sve_dgemm

Add dgemm kernel for arm64 SVE

Martin Kroeker [Fri, 26 Nov 2021 14:19:24 +0000 (15:19 +0100)]
Merge pull request #3459 from rafaelcfsousa/fix_cmake

Fix issues when building OpenBLAS with cmake

Martin Kroeker [Fri, 26 Nov 2021 12:40:23 +0000 (13:40 +0100)]
Merge pull request #3462 from martin-frbg/azure-alpine2

Azure CI: Update alpine-chroot-install again

Martin Kroeker [Fri, 26 Nov 2021 12:39:49 +0000 (13:39 +0100)]
Update alpine-chroot-install again

Bine Brank [Fri, 26 Nov 2021 12:11:19 +0000 (13:11 +0100)]
update CONTRIBUTORS.md

Bine Brank [Fri, 26 Nov 2021 09:35:01 +0000 (10:35 +0100)]
Adapt CMake for SVE

Martin Kroeker [Fri, 26 Nov 2021 09:30:47 +0000 (10:30 +0100)]
Merge pull request #3457 from wjc404/optimize-A53-dgemm

MOD: optimize zgemm on cortex-A53/cortex-A55

Martin Kroeker [Fri, 26 Nov 2021 09:29:28 +0000 (10:29 +0100)]
Merge pull request #3456 from martin-frbg/issue3444

Add/restore a GENERIC target for MIPS32 and support MIPS32 cross-compilation using CMAKE

Martin Kroeker [Fri, 26 Nov 2021 08:38:41 +0000 (09:38 +0100)]
AzureCI: Fetch alpine-chroot-install from master to get key updates (#3460)

* Fetch alpine-chroot-install from master to get key updates

Jia-Chen [Thu, 25 Nov 2021 14:48:48 +0000 (22:48 +0800)]
MOD: add comments to a53 zgemm kernel

Rafael Cardoso Fernandes Sousa [Thu, 25 Nov 2021 02:07:20 +0000 (20:07 -0600)]
Modify the order that cmake set the KERNEL variables (generic now is fallback)

Rafael Cardoso Fernandes Sousa [Wed, 24 Nov 2021 20:07:28 +0000 (14:07 -0600)]
Fix the cmake parser to identify more patterns

Jia-Chen [Wed, 24 Nov 2021 13:51:45 +0000 (21:51 +0800)]
MOD: optimize zgemm on cortex-A53/cortex-A55

Bine Brank [Tue, 23 Nov 2021 20:18:08 +0000 (21:18 +0100)]
reduced dgemm_unroll_m to work with 128-bit sve

Bine Brank [Mon, 22 Nov 2021 09:12:34 +0000 (10:12 +0100)]
removed unused code (compiler warnings)

Bine Brank [Mon, 22 Nov 2021 08:54:20 +0000 (09:54 +0100)]
modify Makefile for SVE copy

Bine Brank [Sun, 21 Nov 2021 17:33:43 +0000 (18:33 +0100)]
configure SVE Makefile

Bine Brank [Sun, 21 Nov 2021 13:56:27 +0000 (14:56 +0100)]
some clean-up & commentary

Martin Kroeker [Sat, 20 Nov 2021 22:54:48 +0000 (23:54 +0100)]
Fix unintended reversion of recent CortexA53 changes

Martin Kroeker [Sat, 20 Nov 2021 16:34:28 +0000 (17:34 +0100)]
Add CMAKE support for cross-compiling to MIPS32

Martin Kroeker [Sat, 20 Nov 2021 16:31:51 +0000 (17:31 +0100)]
Add generic mips32 target

Martin Kroeker [Sat, 20 Nov 2021 16:31:11 +0000 (17:31 +0100)]
Add generic MIPS32 target