platform/upstream/openblas.git
Martin Kroeker [Wed, 16 Jun 2021 10:32:34 +0000 (12:32 +0200)]
Work around another recent macro name collision with winnt.h

Martin Kroeker [Wed, 16 Jun 2021 10:17:25 +0000 (12:17 +0200)]
Modify defines for CR and RC to work around name collision on Windows

Martin Kroeker [Tue, 15 Jun 2021 20:58:48 +0000 (22:58 +0200)]
Merge pull request #3273 from austinpagan/sbgemm_gcc10_fix

Power10: Fix for SBGEMM

Gordon Fossum [Tue, 15 Jun 2021 18:07:47 +0000 (13:07 -0500)]
Power10: Fix for SBGEMM

While testing the bfloat16 sbgemm kernel, some odd-value inputs failed because the result
was updated for additional bytes.

Martin Kroeker [Tue, 15 Jun 2021 14:14:20 +0000 (16:14 +0200)]
Merge pull request #3252 from martin-frbg/more_shortcuts

Further shortcuts for (small) cases that do not need buffer allocation

Martin Kroeker [Tue, 15 Jun 2021 12:50:14 +0000 (14:50 +0200)]
Merge pull request #3250 from martin-frbg/gemv-shortcut

Add shortcut for small-size S/D GEMV_N with increments of one

Martin Kroeker [Mon, 14 Jun 2021 11:00:33 +0000 (13:00 +0200)]
Merge pull request #3270 from ggouaillardet/topic/dznrm2_tx2

arm64: add the missing d9 register to the clobber list

Gilles Gouaillardet [Mon, 14 Jun 2021 08:01:28 +0000 (17:01 +0900)]
arm64: add the missing d9 register to the clobber list

Refs. numpy/numpy#18422

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

Martin Kroeker [Thu, 10 Jun 2021 16:05:47 +0000 (18:05 +0200)]
Merge pull request #3266 from martin-frbg/powerparam

Remove spurious casts from PPC parameters and fix compilation for older targets

Martin Kroeker [Thu, 10 Jun 2021 14:08:24 +0000 (16:08 +0200)]
Merge pull request #3260 from intelmy/sgemv_t_opt

Optimized sgemv_t for small N based on AVX512

Martin Kroeker [Thu, 10 Jun 2021 14:07:47 +0000 (16:07 +0200)]
Merge pull request #3264 from RajalakshmiSR/sbgemmp10

POWER10: Fixes for sbgemm kernel

Martin Kroeker [Thu, 10 Jun 2021 09:20:29 +0000 (11:20 +0200)]
Add prefetch values for power3

Martin Kroeker [Thu, 10 Jun 2021 09:19:40 +0000 (11:19 +0200)]
Add prefetch values for power3

Martin Kroeker [Thu, 10 Jun 2021 09:18:22 +0000 (11:18 +0200)]
Add prefetch values for power3

Martin Kroeker [Thu, 10 Jun 2021 09:17:33 +0000 (11:17 +0200)]
Add prefetch values for power3

Martin Kroeker [Thu, 10 Jun 2021 09:15:48 +0000 (11:15 +0200)]
Fix caxpy/zaxpy for big-endian

Martin Kroeker [Thu, 10 Jun 2021 09:14:03 +0000 (11:14 +0200)]
Fix inverted conditional for caxpy/zaxpy

Martin Kroeker [Thu, 10 Jun 2021 09:11:56 +0000 (11:11 +0200)]
fix c/zrot and sgemv for POWER5

Martin Kroeker [Thu, 10 Jun 2021 09:09:50 +0000 (11:09 +0200)]
Remove casts for PPC/POWER and complete parameters for POWER3/4

Rajalakshmi Srinivasaraghavan [Wed, 9 Jun 2021 17:20:09 +0000 (12:20 -0500)]
POWER10: Fixes for sbgemm kernel

While testing the bfloat16 sbgemm kernel, some odd-value inputs failed
because of array accesses beyond the boundary.
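
The kind of boundary problem described above can be illustrated with a minimal sketch. It is not the POWER10 sbgemm kernel; `add_pairs` is a hypothetical helper that processes two elements per iteration and handles an odd-length tail separately, so it never reads or writes past index `n - 1`.

```c
#include <stddef.h>

/* Hypothetical helper (not OpenBLAS code): dst[i] += src[i] for i < n,
 * unrolled by two. The loop condition i + 1 < n guarantees that both
 * elements of a pair are in range; an odd n leaves one element, which
 * is handled on its own instead of as part of a partial pair. */
void add_pairs(size_t n, const float *src, float *dst)
{
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        dst[i]     += src[i];
        dst[i + 1] += src[i + 1];
    }

    /* Tail: at most one element remains when n is odd. */
    if (i < n) {
        dst[i] += src[i];
    }
}
```

Calling `add_pairs(3, src, dst)` on a four-element `dst` leaves `dst[3]` unchanged, which is the property an unrolled kernel has to preserve for odd sizes.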

Ma, Yu [Tue, 8 Jun 2021 19:08:28 +0000 (15:08 -0400)]
Optimized sgemv_t for small N based on AVX512

Zhang Xianyi [Tue, 8 Jun 2021 08:26:56 +0000 (16:26 +0800)]
Merge pull request #3259 from zhaofengli/riscv64-fixes

riscv64 fixes

Zhaofeng Li [Mon, 7 Jun 2021 22:55:56 +0000 (22:55 +0000)]
riscv64: Add Makefile

Zhaofeng Li [Mon, 7 Jun 2021 22:50:23 +0000 (22:50 +0000)]
RISCV64_GENERIC: Use generic kernel for DSDOT for better precision

The implementation in `riscv64/dot.c` fails the `test_dsdot` test, and
the generic kernel seems to have better precision. Tested on SiFive
FU740 (HiFive Unmatched) and QEMU.

Also see #1469.
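
For context, DSDOT takes single-precision vectors and returns a double-precision dot product accumulated in double precision. The sketch below contrasts that with a float accumulator, one common way such a routine can lose precision. It is not taken from `riscv64/dot.c` or the generic kernel and makes no claim about the cause of the test failure; both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical reference: widen each operand to double before
 * multiplying and accumulate in double, as DSDOT is specified to do. */
double dsdot_double_acc(size_t n, const float *x, const float *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += (double)x[i] * (double)y[i];
    }
    return acc;
}

/* Hypothetical lossy variant: accumulate in float and widen only the
 * final result. Small terms can be rounded away once the running sum
 * is large. */
double dsdot_float_acc(size_t n, const float *x, const float *y)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += x[i] * y[i];
    }
    return (double)acc;
}
```

With `x = {16777216.0f, 1.0f, 1.0f}` and `y` all ones, the double accumulator returns exactly 16777218, while the float accumulator typically stays at 16777216 on IEEE 754 platforms because adding 1 to 2^24 is not representable in single precision.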

Zhaofeng Li [Mon, 7 Jun 2021 22:49:39 +0000 (22:49 +0000)]
riscv64/imin: Fix wrong comparison

Same as #1990.

Martin Kroeker [Sun, 6 Jun 2021 20:15:29 +0000 (22:15 +0200)]
Merge pull request #3258 from martin-frbg/hbaction

revert "try to work around gcc update problems" in Homebrew workflow

Martin Kroeker [Sun, 6 Jun 2021 17:17:36 +0000 (19:17 +0200)]
revert "try to work around gcc update problems"

...as homebrew has dropped at least gcc8 now

Martin Kroeker [Sat, 29 May 2021 20:28:00 +0000 (22:28 +0200)]
Add shortcuts for (small) cases that do not need expensive buffer allocation

Martin Kroeker [Sat, 29 May 2021 13:40:03 +0000 (15:40 +0200)]
revert symv changes for now

Martin Kroeker [Fri, 28 May 2021 07:38:48 +0000 (09:38 +0200)]
Fix copy-paste errors in variables used

Martin Kroeker [Thu, 27 May 2021 20:39:18 +0000 (22:39 +0200)]
Add shortcuts for (small) cases that do not need expensive buffer allocation

Martin Kroeker [Wed, 26 May 2021 20:02:34 +0000 (22:02 +0200)]
Add shortcut for small-size gemv_n with increments of one

Martin Kroeker [Wed, 26 May 2021 13:26:30 +0000 (15:26 +0200)]
Merge pull request #3249 from MikaelUrankar/develop

Fix typo

MikaelUrankar [Wed, 26 May 2021 10:14:57 +0000 (12:14 +0200)]
Fix typo

Martin Kroeker [Sat, 22 May 2021 20:38:09 +0000 (22:38 +0200)]
Merge pull request #3244 from martin-frbg/issue3237

Add fast path for small xSYR with INCX==1

Martin Kroeker [Sat, 22 May 2021 18:41:18 +0000 (20:41 +0200)]
Add fast path for small xSYR with INCX==1

Martin Kroeker [Sat, 22 May 2021 17:25:56 +0000 (19:25 +0200)]
Merge pull request #3243 from martin-frbg/lapack564

Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)

Martin Kroeker [Sat, 22 May 2021 17:25:28 +0000 (19:25 +0200)]
Merge pull request #3196 from guowangy/skylakex-gemm-batch-k

GEMM: skylake: improve the performance when m is small

Martin Kroeker [Sat, 22 May 2021 17:24:46 +0000 (19:24 +0200)]
Merge pull request #3242 from martin-frbg/issue3239

Handle inadvertent use of DYNAMIC_ARCH=0

Martin Kroeker [Sat, 22 May 2021 12:29:45 +0000 (14:29 +0200)]
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)

Martin Kroeker [Sat, 22 May 2021 12:23:49 +0000 (14:23 +0200)]
Handle inadvertent use of DYNAMIC_ARCH=0

Martin Kroeker [Mon, 17 May 2021 15:49:01 +0000 (17:49 +0200)]
Merge pull request #3205 from intelmy/sgemv_n_opt

optimize on sgemv_n for small n

Martin Kroeker [Mon, 17 May 2021 15:32:23 +0000 (17:32 +0200)]
Merge pull request #3238 from martin-frbg/lapack555

Correct function name in error message from SLASQ2 (LAPACK PR555)

Martin Kroeker [Mon, 17 May 2021 12:47:14 +0000 (14:47 +0200)]
Correct function name in error message from SLASQ2 (Reference-LAPACK PR 555)

Martin Kroeker [Sun, 16 May 2021 15:17:18 +0000 (17:17 +0200)]
Merge pull request #3236 from martin-frbg/issue3234

Add -lm for FreeBSD on ARM/ARM64

Martin Kroeker [Sun, 16 May 2021 15:15:45 +0000 (17:15 +0200)]
Merge pull request #3235 from dnoan/develop

Update Makefile.arm64

Martin Kroeker [Sun, 16 May 2021 11:04:38 +0000 (13:04 +0200)]
Add -lm for FreeBSD on ARM/ARM64

Noan [Sun, 16 May 2021 09:49:13 +0000 (09:49 +0000)]
Update Makefile.arm64

Added -march and -mtune flags for EMAG processors when building with GCC 9 or later

Martin Kroeker [Sat, 15 May 2021 07:06:12 +0000 (09:06 +0200)]
Merge pull request #3228 from martin-frbg/issue3226

filter out -mavx flag on Sandybridge zgemm/ztrmm kernels

Martin Kroeker [Fri, 14 May 2021 23:04:09 +0000 (01:04 +0200)]
Merge pull request #3233 from martin-frbg/issue3230

Add autodetection for Intel Ice Lake SP

Martin Kroeker [Fri, 14 May 2021 21:28:45 +0000 (23:28 +0200)]
Merge pull request #3232 from martin-frbg/lapack553

Reduce stack size requirements in the LAPACK LIN tests (LAPACK PR 553)

Martin Kroeker [Fri, 14 May 2021 21:28:06 +0000 (23:28 +0200)]
Merge pull request #3231 from martin-frbg/issue3227

Support compilation with pre-C99 versions of MSVC

Martin Kroeker [Fri, 14 May 2021 21:19:10 +0000 (23:19 +0200)]
Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels

Martin Kroeker [Fri, 14 May 2021 18:44:06 +0000 (20:44 +0200)]
Recognize Intel Ice Lake SP as Cooper Lake

Martin Kroeker [Fri, 14 May 2021 18:39:55 +0000 (20:39 +0200)]
Support Intel Ice Lake SP as Cooper Lake

Martin Kroeker [Fri, 14 May 2021 17:55:31 +0000 (19:55 +0200)]
Delete zchkaa.f

Martin Kroeker [Fri, 14 May 2021 17:54:54 +0000 (19:54 +0200)]
Delete schkaa.f

Martin Kroeker [Fri, 14 May 2021 17:54:13 +0000 (19:54 +0200)]
Delete dchkaa.f

Martin Kroeker [Fri, 14 May 2021 17:53:38 +0000 (19:53 +0200)]
Delete cchkaa.f

Martin Kroeker [Fri, 14 May 2021 17:53:03 +0000 (19:53 +0200)]
Convert ?chkaa to use dynamic allocation for the larger arrays

Martin Kroeker [Fri, 14 May 2021 13:08:12 +0000 (15:08 +0200)]
Support compilation with pre-C99 versions of MSVC

Martin Kroeker [Fri, 14 May 2021 13:06:44 +0000 (15:06 +0200)]
Drop redundant inclusion of complex.h

Martin Kroeker [Thu, 13 May 2021 21:05:00 +0000 (23:05 +0200)]
filter out -mavx flag on zgemm kernels as it can cause problems with older gcc

Martin Kroeker [Tue, 11 May 2021 14:00:45 +0000 (16:00 +0200)]
Merge pull request #3192 from damonyu1989/develop

Update the intrinsic API to the official name.

Martin Kroeker [Mon, 10 May 2021 06:02:01 +0000 (08:02 +0200)]
Add an Android crossbuild on OSX to Azure CI (#3224)

* Add an Android crossbuild on OSX

Martin Kroeker [Fri, 7 May 2021 06:51:45 +0000 (08:51 +0200)]
Merge pull request #3223 from martin-frbg/develop

Use percent instead of ampersand as placeholder for substitutions

Martin Kroeker [Thu, 6 May 2021 18:20:08 +0000 (20:20 +0200)]
Use percent instead of ampersand as placeholder for substitutions

Martin Kroeker [Wed, 5 May 2021 12:55:36 +0000 (14:55 +0200)]
Fix missing conditionals for non-SKX kernels

Martin Kroeker [Wed, 5 May 2021 12:30:41 +0000 (14:30 +0200)]
Merge pull request #3219 from austinpagan/Gemm.ErrorFix

Add error message token for SBGEMM in gemm.c

Martin Kroeker [Wed, 5 May 2021 12:30:24 +0000 (14:30 +0200)]
Merge pull request #3220 from drhpc/drhpc-fixup

Delete lapack_wrappers.c.orig

drhpc [Tue, 4 May 2021 19:02:07 +0000 (21:02 +0200)]
Delete lapack_wrappers.c.orig

This looks like a leftover from patching and confuses further patching;-)

Gordon Fossum [Tue, 4 May 2021 18:55:02 +0000 (13:55 -0500)]
Add error message token for SBGEMM in gemm.c

Martin Kroeker [Sun, 2 May 2021 22:01:08 +0000 (00:01 +0200)]
Update version to 0.3.15.dev

Martin Kroeker [Sun, 2 May 2021 22:00:29 +0000 (00:00 +0200)]
Update version to 0.3.15.dev

Martin Kroeker [Sun, 2 May 2021 21:59:55 +0000 (23:59 +0200)]
Merge pull request #3217 from xianyi/release-0.3.0

merge 0.3.15 back into develop to copy tag

Martin Kroeker [Sun, 2 May 2021 21:50:22 +0000 (23:50 +0200)]
Update version to 0.3.15

Martin Kroeker [Sun, 2 May 2021 21:49:49 +0000 (23:49 +0200)]
Update version to 0.3.15

Martin Kroeker [Sun, 2 May 2021 21:48:28 +0000 (23:48 +0200)]
Merge pull request #3216 from xianyi/develop

Update from develop for 0.3.15 release

Martin Kroeker [Sun, 2 May 2021 21:47:24 +0000 (23:47 +0200)]
Merge pull request #3215 from martin-frbg/cl0315

Update Changelog for 0.3.15

Martin Kroeker [Sun, 2 May 2021 21:46:55 +0000 (23:46 +0200)]
Update Changelog for 0.3.15

Martin Kroeker [Sun, 2 May 2021 21:40:03 +0000 (23:40 +0200)]
Merge pull request #3214 from martin-frbg/lapack-3.9.1hrt

Add new Householder Reconstruction functions from LAPACK 3.9.1

Martin Kroeker [Sun, 2 May 2021 18:47:58 +0000 (20:47 +0200)]
Add files via upload

Martin Kroeker [Sun, 2 May 2021 17:57:47 +0000 (19:57 +0200)]
Add LAPACKE interfaces for the new Householder Reconstruction functions from 3.9.1

Martin Kroeker [Sun, 2 May 2021 17:56:11 +0000 (19:56 +0200)]
Add entries for the new Householder Reconstruction functions from 3.9.1

Martin Kroeker [Sun, 2 May 2021 17:55:15 +0000 (19:55 +0200)]
Add entries for the new Householder Reconstruction functions from 3.9.1

Martin Kroeker [Sun, 2 May 2021 17:28:21 +0000 (19:28 +0200)]
Add new tests for Householder reconstruction functions from 3.9.1

Martin Kroeker [Sun, 2 May 2021 17:25:43 +0000 (19:25 +0200)]
Add new files for Householder reconstruction functions from 3.9.1

Martin Kroeker [Sun, 2 May 2021 17:21:59 +0000 (19:21 +0200)]
Add entries for Householder reconstruction functions from 3.9.1

Martin Kroeker [Sun, 2 May 2021 17:19:28 +0000 (19:19 +0200)]
Merge pull request #26 from xianyi/develop

rebase

Martin Kroeker [Sun, 2 May 2021 16:45:15 +0000 (18:45 +0200)]
Merge pull request #3213 from martin-frbg/lapack382

Avoid allocating the transposed triangular matrix in LAPACKE_xlantr_work (Reference-LAPACK 382)

Martin Kroeker [Sun, 2 May 2021 16:44:59 +0000 (18:44 +0200)]
Merge pull request #3212 from martin-frbg/lapack463

Initialize X and Y to zero for N=0 in xGGGLM (Reference-LAPACK PR463)

Martin Kroeker [Sun, 2 May 2021 16:44:29 +0000 (18:44 +0200)]
Merge pull request #3211 from martin-frbg/lapack471

Handle norm NaN value in xGESDD (Reference LAPACK PR471)

Martin Kroeker [Sun, 2 May 2021 10:18:17 +0000 (12:18 +0200)]
Avoid allocating the transposed triangular matrix (Reference-LAPACK PR382)

Martin Kroeker [Sun, 2 May 2021 09:40:56 +0000 (11:40 +0200)]
Initialize X and Y to zero for N=0 (Reference-LAPACK PR463)

Martin Kroeker [Sun, 2 May 2021 09:24:50 +0000 (11:24 +0200)]
Handle norm NaN value (Reference LAPACK PR471)

Martin Kroeker [Sun, 2 May 2021 07:02:11 +0000 (09:02 +0200)]
Merge pull request #3210 from martin-frbg/lapack502

Fix possible division by zero in LAPACK xTGSJA (Reference-LAPACK PR502)

Martin Kroeker [Sat, 1 May 2021 19:31:13 +0000 (21:31 +0200)]
Fix possible division by zero in xTGSJA (Reference-LAPACK PR502)

Martin Kroeker [Sat, 1 May 2021 18:18:29 +0000 (20:18 +0200)]
Merge pull request #3208 from martin-frbg/lapack534

Apply MKL team fixes to the LAPACKE interfaces (Reference-LAPACK PR 534)

Martin Kroeker [Sat, 1 May 2021 18:08:24 +0000 (20:08 +0200)]
Merge pull request #3209 from martin-frbg/issue3160

Add casts to prevent overflow of intermediate results

Martin Kroeker [Sat, 1 May 2021 12:48:19 +0000 (14:48 +0200)]
Add casts to prevent overflow of intermediate result