platform/upstream/openblas.git
Martin Kroeker [Thu, 24 Mar 2022 20:25:16 +0000 (21:25 +0100)]
Disable the LAPACK testsuite for the Windows clang/flang build as it takes too long

Martin Kroeker [Thu, 24 Mar 2022 20:23:28 +0000 (21:23 +0100)]
Add LAPACK-like option to omit the LAPACK testsuite

Martin Kroeker [Thu, 24 Mar 2022 10:33:00 +0000 (11:33 +0100)]
Merge pull request #3580 from martin-frbg/dynx86_sbgemm

Remove extraneous (and wrong) definition of sbgemm_r on x86_64

Martin Kroeker [Thu, 24 Mar 2022 07:28:37 +0000 (08:28 +0100)]
Merge pull request #3579 from martin-frbg/issue3557-2

Fix malfunctioning AVX512 check

Martin Kroeker [Wed, 23 Mar 2022 19:05:32 +0000 (20:05 +0100)]
Remove extraneous (and wrong) definition of sbgemm_r on x86_64

Martin Kroeker [Wed, 23 Mar 2022 18:13:54 +0000 (19:13 +0100)]
Merge branch 'xianyi:develop' into issue3557-2

Martin Kroeker [Wed, 23 Mar 2022 14:48:58 +0000 (15:48 +0100)]
Fix checks for AVX512 and atomics

Martin Kroeker [Wed, 23 Mar 2022 14:22:13 +0000 (15:22 +0100)]
Revert AVX512 capability check from PR #1980 (moved to build)

Martin Kroeker [Wed, 23 Mar 2022 14:19:55 +0000 (15:19 +0100)]
Utilize compiler AVX512 capability info from c_check when building getarch

Martin Kroeker [Wed, 23 Mar 2022 10:28:13 +0000 (11:28 +0100)]
Merge pull request #3561 from AlessioZanga/patch-msvc

Remove MSVC limitation

Martin Kroeker [Wed, 23 Mar 2022 06:19:15 +0000 (07:19 +0100)]
Merge pull request #3576 from martin-frbg/cmaketestbom

Skip BLAS tests if Windows powershell added a BOM

Martin Kroeker [Wed, 23 Mar 2022 06:18:45 +0000 (07:18 +0100)]
Merge pull request #3577 from martin-frbg/azure_win2022

Update Windows jobs in Azure CI to use Windows2022

Martin Kroeker [Tue, 22 Mar 2022 20:51:09 +0000 (21:51 +0100)]
Update Windows jobs in Azure CI to use Windows2022

Martin Kroeker [Tue, 22 Mar 2022 20:37:55 +0000 (21:37 +0100)]
Skip tests if Windows powershell added a BOM

Martin Kroeker [Sat, 19 Mar 2022 08:21:56 +0000 (09:21 +0100)]
Merge pull request #3574 from AdamNiederer/fix-dynamic-list-compilation

Fix broken elif in dynamic.c

Adam Niederer [Fri, 18 Mar 2022 00:02:39 +0000 (20:02 -0400)]
Fix broken elif in dynamic.c

This fixes compilation in the following case:

$(MAKE) USE_OPENMP=1 USE_THREAD=1 NO_LAPACK=0 DYNAMIC_ARCH=1 \
DYNAMIC_LIST="HASWELL SKYLAKEX ATOM COOPERLAKE SAPPHIRERAPIDS ZEN"

Martin Kroeker [Sat, 12 Mar 2022 12:40:17 +0000 (13:40 +0100)]
Merge pull request #3567 from cenewcombe/develop

Fix unsafe read of Y in zsymv_L_sse2.S

Caroline Newcombe [Fri, 11 Mar 2022 17:56:33 +0000 (11:56 -0600)]
fix unsafe read of Y in assembly kernel

Martin Kroeker [Fri, 11 Mar 2022 13:29:30 +0000 (14:29 +0100)]
Merge pull request #3565 from jonaszhou1/develop

Support Zhaoxin/Centaur kh40000 as ZEN

Martin Kroeker [Fri, 11 Mar 2022 13:27:27 +0000 (14:27 +0100)]
Merge pull request #3566 from martin-frbg/configtls

Report USE_TLS in get_config output if set

Martin Kroeker [Thu, 10 Mar 2022 15:19:29 +0000 (16:19 +0100)]
Report USE_TLS if set

JonasZhou [Fri, 4 Mar 2022 09:14:52 +0000 (17:14 +0800)]
Support Zhaoxin/Centaur kh40000 as ZEN

Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>

AlessioZanga [Sat, 5 Mar 2022 22:35:29 +0000 (23:35 +0100)]
Change `BUILD_WITHOUT_LAPACK` to `OFF` by default

Alessio Zanga [Fri, 4 Mar 2022 23:07:01 +0000 (00:07 +0100)]
Remove MSVC limitation

Martin Kroeker [Mon, 28 Feb 2022 07:28:02 +0000 (08:28 +0100)]
Merge pull request #3550 from guowangy/smatrix-mask-fix

Small Matrix: use proper inline asm input constraint for AVX512 mask

Wangyang Guo [Mon, 28 Feb 2022 03:22:31 +0000 (03:22 +0000)]
Small Matrix: use proper inline asm input constraint for AVX512 mask

Martin Kroeker [Sat, 26 Feb 2022 20:49:05 +0000 (21:49 +0100)]
Merge pull request #3549 from martin-frbg/issue3543

Annotate LAPACKE_lsame with attribute const for GCC(+compatible)

Martin Kroeker [Sat, 26 Feb 2022 20:48:39 +0000 (21:48 +0100)]
Merge pull request #3548 from martin-frbg/rela-gemmt

Enable the ?GEMMT functions in ReLAPACK

Martin Kroeker [Sat, 26 Feb 2022 18:27:34 +0000 (19:27 +0100)]
Annotate LAPACKE_lsame with the const attribute for GCC and compatible compilers

Martin Kroeker [Sat, 26 Feb 2022 18:24:27 +0000 (19:24 +0100)]
Fix xGEMMT argument lists

Martin Kroeker [Sat, 26 Feb 2022 18:23:40 +0000 (19:23 +0100)]
Enable xGEMMT functions

Martin Kroeker [Fri, 25 Feb 2022 20:54:11 +0000 (21:54 +0100)]
Merge pull request #3547 from martin-frbg/issue3540-2

More build fixes for CooperLake with BFLOAT16 and DYNAMIC_ARCH

Martin Kroeker [Fri, 25 Feb 2022 14:36:02 +0000 (15:36 +0100)]
really fix definition of SHUFFLE_MAGIC_NO

Martin Kroeker [Fri, 25 Feb 2022 14:33:02 +0000 (15:33 +0100)]
Remove stray $

Martin Kroeker [Fri, 25 Feb 2022 09:05:36 +0000 (10:05 +0100)]
Declare SHUFFLE_MAGIC_NO as const to placate clang

Martin Kroeker [Fri, 25 Feb 2022 09:04:00 +0000 (10:04 +0100)]
Define sbgemm_r to fix DYNAMIC_ARCH builds

Martin Kroeker [Wed, 23 Feb 2022 23:00:00 +0000 (00:00 +0100)]
Merge pull request #3542 from martin-frbg/issue3540

Fix compilation for CooperLake on Windows/clang

Martin Kroeker [Wed, 23 Feb 2022 22:57:57 +0000 (23:57 +0100)]
Merge pull request #3544 from giordano/mg/gcc6

Fix compilation of Skylake AVX512 kernels with GCC 6

Mosè Giordano [Wed, 23 Feb 2022 22:51:59 +0000 (22:51 +0000)]
Fix compilation of Skylake AVX512 kernels with GCC 6

Martin Kroeker [Wed, 23 Feb 2022 22:13:53 +0000 (23:13 +0100)]
Merge pull request #3541 from martin-frbg/issue3530

Fix compilation for SkylakeX with gcc 6.x

Martin Kroeker [Wed, 23 Feb 2022 19:12:20 +0000 (20:12 +0100)]
Prevent compiler attempts to use k0 as mask register

Martin Kroeker [Wed, 23 Feb 2022 19:10:59 +0000 (20:10 +0100)]
Fix non-portable u_int64_t

Martin Kroeker [Wed, 23 Feb 2022 19:06:14 +0000 (20:06 +0100)]
Guard uses of _mm512_reduce_add_p?

Martin Kroeker [Mon, 21 Feb 2022 05:57:27 +0000 (06:57 +0100)]
Merge pull request #3537 from xianyi/release-0.3.0

Merge back from 0.3.20 release to copy tag

Martin Kroeker [Sun, 20 Feb 2022 21:35:05 +0000 (22:35 +0100)]
Update version to 0.3.20

Martin Kroeker [Sun, 20 Feb 2022 21:33:59 +0000 (22:33 +0100)]
Merge pull request #3536 from xianyi/develop

Update from develop for release 0.3.20

Martin Kroeker [Sun, 20 Feb 2022 21:33:45 +0000 (22:33 +0100)]
Merge branch 'release-0.3.0' into develop

Martin Kroeker [Sun, 20 Feb 2022 21:30:50 +0000 (22:30 +0100)]
Update version to 0.3.20

Martin Kroeker [Sun, 20 Feb 2022 21:21:02 +0000 (22:21 +0100)]
Merge pull request #3535 from martin-frbg/0320changes

Update with 0.3.20 changes

Martin Kroeker [Sun, 20 Feb 2022 21:16:04 +0000 (22:16 +0100)]
Update with 0.3.20 changes

Martin Kroeker [Fri, 11 Feb 2022 10:44:32 +0000 (11:44 +0100)]
Merge pull request #3532 from martin-frbg/issue3528-2

Fix building a shared library on Mac with flang-classic

Martin Kroeker [Thu, 10 Feb 2022 22:04:45 +0000 (23:04 +0100)]
keep flang-classic on MacOS from trying to create an executable instead of a library

Martin Kroeker [Thu, 10 Feb 2022 22:03:05 +0000 (23:03 +0100)]
filter out libflangmain as well

Martin Kroeker [Thu, 10 Feb 2022 13:16:08 +0000 (14:16 +0100)]
Merge pull request #3531 from martin-frbg/issue2973

Add .NOTPARALLEL: to MATGEN Makefile as a workaround for builds on DFS

Martin Kroeker [Wed, 9 Feb 2022 21:09:25 +0000 (22:09 +0100)]
Add .NOTPARALLEL: as a workaround for builds on DFS

Martin Kroeker [Mon, 7 Feb 2022 07:14:11 +0000 (08:14 +0100)]
Merge pull request #3527 from martin-frbg/issue3490

Treat AVX512-enabled Alder Lake like Cooper Lake/Sapphire Rapids

Martin Kroeker [Sun, 6 Feb 2022 23:00:56 +0000 (00:00 +0100)]
Support AVX512-enabled Alder Lake

Martin Kroeker [Sun, 6 Feb 2022 23:00:15 +0000 (00:00 +0100)]
Support AVX512-enabled AlderLake

Martin Kroeker [Sun, 6 Feb 2022 22:55:06 +0000 (23:55 +0100)]
Merge pull request #3493 from martin-frbg/casts+cleanup

WIP casts and cleanups

Martin Kroeker [Sat, 5 Feb 2022 21:39:03 +0000 (22:39 +0100)]
Update azure-pipelines.yml

Martin Kroeker [Thu, 3 Feb 2022 21:31:23 +0000 (22:31 +0100)]
Merge pull request #3524 from martin-frbg/lapack646

Fix input argument check in ?GEQRT2 (from Reference-LAPACK PR 646)

Martin Kroeker [Thu, 3 Feb 2022 10:43:17 +0000 (11:43 +0100)]
Fix input argument check (LAPACK PR 646)

Martin Kroeker [Fri, 28 Jan 2022 12:39:36 +0000 (13:39 +0100)]
Merge pull request #3521 from martin-frbg/issue3520

Add proper defaults for Sparc IMIN/IMAX

Martin Kroeker [Fri, 28 Jan 2022 09:36:57 +0000 (10:36 +0100)]
Merge pull request #3522 from martin-frbg/issue3517

Disable building C/Z SPMV,SPR,SYMV,SYR when NO_LAPACK=1

Martin Kroeker [Thu, 27 Jan 2022 21:02:08 +0000 (22:02 +0100)]
Exclude some complex (LAPACK) functions when NO_LAPACK is set

Martin Kroeker [Thu, 27 Jan 2022 21:00:39 +0000 (22:00 +0100)]
Exclude some complex drivers when NO_LAPACK is set

Martin Kroeker [Thu, 27 Jan 2022 18:56:32 +0000 (19:56 +0100)]
Add proper defaults for IMIN/IMAX

Martin Kroeker [Tue, 25 Jan 2022 19:57:59 +0000 (20:57 +0100)]
Merge pull request #3518 from martin-frbg/elbrus

Add basic support for the (mostly x86_64 compatible) Elbrus E2000 architecture

Martin Kroeker [Tue, 25 Jan 2022 19:57:38 +0000 (20:57 +0100)]
Merge pull request #3516 from mmuetzel/no-fortran

cmake: Check if Fortran compiler is usable before enabling it.

Martin Kroeker [Sat, 22 Jan 2022 18:09:00 +0000 (19:09 +0100)]
Update CONTRIBUTORS.md

Martin Kroeker [Sat, 22 Jan 2022 18:02:57 +0000 (19:02 +0100)]
Update CONTRIBUTORS.md

Martin Kroeker [Sat, 22 Jan 2022 17:59:36 +0000 (18:59 +0100)]
Add default KERNEL file for Elbrus E2K arch

Martin Kroeker [Sat, 22 Jan 2022 17:57:28 +0000 (18:57 +0100)]
Create Makefile

Martin Kroeker [Sat, 22 Jan 2022 17:55:10 +0000 (18:55 +0100)]
Add Elbrus e2k architecture support

Martin Kroeker [Sat, 22 Jan 2022 17:53:38 +0000 (18:53 +0100)]
Add Elbrus E2000 architecture as generic x86_64 compatible

Martin Kroeker [Sat, 22 Jan 2022 17:27:38 +0000 (18:27 +0100)]
Add Elbrus e2k architecture detection

Markus Mützel [Fri, 21 Jan 2022 12:27:17 +0000 (13:27 +0100)]
cmake: Check if Fortran compiler is usable before enabling it.

Martin Kroeker [Tue, 18 Jan 2022 20:36:33 +0000 (21:36 +0100)]
Merge pull request #3492 from binebrank/arm_sve_zgemm

SVE zgemm&cgemm (and other BLAS 3 complex)

Bine Brank [Tue, 18 Jan 2022 07:28:31 +0000 (08:28 +0100)]
update armv8sve + contributors

Bine Brank [Mon, 17 Jan 2022 21:36:48 +0000 (22:36 +0100)]
adapt CMake

Martin Kroeker [Mon, 17 Jan 2022 18:22:18 +0000 (19:22 +0100)]
Merge pull request #3514 from martin-frbg/issue3513

Fix ?LASWP pivot index calculation for negative increments other than -1

Martin Kroeker [Sun, 16 Jan 2022 23:11:18 +0000 (00:11 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:10:21 +0000 (00:10 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:09:14 +0000 (00:09 +0100)]
Fix offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:08:20 +0000 (00:08 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:07:33 +0000 (00:07 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:06:41 +0000 (00:06 +0100)]
Fix pivot offset calculation for negative incx

Martin Kroeker [Sun, 16 Jan 2022 23:05:33 +0000 (00:05 +0100)]
Fix pivot index for negative increments

Bine Brank [Sun, 16 Jan 2022 20:40:56 +0000 (21:40 +0100)]
adapt Makefile for SVE trsm

Bine Brank [Sun, 16 Jan 2022 20:39:57 +0000 (21:39 +0100)]
fix ztrsm lt/ut copy

Bine Brank [Sat, 15 Jan 2022 21:27:25 +0000 (22:27 +0100)]
add sve ztrsm

Bine Brank [Sat, 15 Jan 2022 20:02:14 +0000 (21:02 +0100)]
fix sve dtrsm kernels

Bine Brank [Tue, 11 Jan 2022 20:16:38 +0000 (21:16 +0100)]
add remaining sve trsm copy kernels

Bine Brank [Mon, 10 Jan 2022 20:45:37 +0000 (21:45 +0100)]
trsm_lncopy_sve

Bine Brank [Mon, 10 Jan 2022 19:42:20 +0000 (20:42 +0100)]
sve trsmRN and trsmRT

Martin Kroeker [Mon, 10 Jan 2022 08:12:52 +0000 (09:12 +0100)]
Merge pull request #3511 from martin-frbg/cmakeutils

Fix handling of ifdef/ifndef in CMAKE

Martin Kroeker [Sun, 9 Jan 2022 22:31:59 +0000 (23:31 +0100)]
Fix handling of ifdef/ifndef

Bine Brank [Sun, 9 Jan 2022 19:11:47 +0000 (20:11 +0100)]
add trsm_kernel_LT_sve

Bine Brank [Sun, 9 Jan 2022 18:40:04 +0000 (19:40 +0100)]
sve trsm_kernel_LN

Martin Kroeker [Sun, 9 Jan 2022 13:50:51 +0000 (14:50 +0100)]
Merge pull request #3510 from martin-frbg/issue3505

Fix recent SkylakeX/DYNAMIC_ARCH DGEMM breakage