platform/upstream/openblas.git
Martin Kroeker [Mon, 19 Oct 2020 06:14:27 +0000 (08:14 +0200)]
Merge pull request #2919 from isuruf/export

Fix exporting some lapack and cblas symbols

Isuru Fernando [Mon, 19 Oct 2020 02:42:32 +0000 (21:42 -0500)]
Fix exporting some lapack and cblas

Martin Kroeker [Sun, 18 Oct 2020 22:09:54 +0000 (00:09 +0200)]
Merge pull request #2915 from bartoldeman/no-empty_sgemm_direct_skylakex

sgemm_direct_skylakex: fix 75eeb26 regression.

Martin Kroeker [Sun, 18 Oct 2020 21:04:56 +0000 (23:04 +0200)]
Merge pull request #2913 from martin-frbg/issue2910

Support cross-compiling for Apple Vortex

Bart Oldeman [Sun, 18 Oct 2020 19:50:38 +0000 (19:50 +0000)]
sgemm_direct_skylakex: fix 75eeb26 regression.

The
`#if defined(SKYLAKEX) || defined (COOPERLAKE)`
guard from that commit was placed before #include "common.h", so the
compiled function was empty and returned garbage results for
qualifying sgemm calls on those architectures.

Closes #2914

Martin Kroeker [Sun, 18 Oct 2020 17:22:05 +0000 (19:22 +0200)]
Fix naming of L2 cache size item reported for Vortex

Martin Kroeker [Sun, 18 Oct 2020 17:16:08 +0000 (19:16 +0200)]
Merge pull request #2909 from isuruf/patch-1

Need a space when redirecting to file

Martin Kroeker [Sun, 18 Oct 2020 17:10:58 +0000 (19:10 +0200)]
Support cross-compiling for Apple Vortex

Martin Kroeker [Sun, 18 Oct 2020 16:54:54 +0000 (18:54 +0200)]
Support cross-compiling for Apple Vortex

Martin Kroeker [Sun, 18 Oct 2020 16:49:59 +0000 (18:49 +0200)]
Merge pull request #102 from xianyi/develop

rebase

Isuru Fernando [Sun, 18 Oct 2020 14:40:31 +0000 (09:40 -0500)]
Need a space when redirecting to file

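The following two commands have completely different meanings:
perl ./gensymbol objcopy x86_64 _ 0 0  0 0 0 0 "" "64_" 1 0 1 1 1 1 > objcopy.def
perl ./gensymbol objcopy x86_64 _ 0 0  0 0 0 0 "" "64_" 1 0 1 1 1 1> objcopy.def
In the second, the shell reads `1>` as a redirection of file descriptor 1,
so the final `1` is never passed to gensymbol as an argument.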

Martin Kroeker [Sat, 17 Oct 2020 20:40:47 +0000 (22:40 +0200)]
Update version string to 0.3.11.dev

Martin Kroeker [Sat, 17 Oct 2020 20:40:06 +0000 (22:40 +0200)]
Update version string to 0.3.11.dev

Martin Kroeker [Sat, 17 Oct 2020 20:38:58 +0000 (22:38 +0200)]
Merge pull request #2908 from xianyi/release-0.3.0

Synchronize tag with release 0.3.11

Martin Kroeker [Sat, 17 Oct 2020 20:14:12 +0000 (22:14 +0200)]
Merge pull request #2907 from xianyi/develop

Update from develop for 0.3.11

Martin Kroeker [Sat, 17 Oct 2020 20:11:34 +0000 (22:11 +0200)]
Update version number to 0.3.11

Martin Kroeker [Sat, 17 Oct 2020 20:10:50 +0000 (22:10 +0200)]
Update version for 0.3.11 release

Martin Kroeker [Sat, 17 Oct 2020 20:07:14 +0000 (22:07 +0200)]
Merge pull request #2906 from martin-frbg/changelog-0311

Update Changelog.txt with the 0.3.11 changes

Martin Kroeker [Sat, 17 Oct 2020 20:05:36 +0000 (22:05 +0200)]
Update Changelog.txt with the 0.3.11 changes

Martin Kroeker [Sat, 17 Oct 2020 07:45:22 +0000 (09:45 +0200)]
Merge pull request #2905 from martin-frbg/aocc-clang

Add -mavx for clang & aocc

Martin Kroeker [Fri, 16 Oct 2020 18:52:15 +0000 (20:52 +0200)]
Add AVX flags for clang/aocc as well

Martin Kroeker [Fri, 16 Oct 2020 18:48:58 +0000 (20:48 +0200)]
Merge pull request #101 from xianyi/develop

rebase

Martin Kroeker [Fri, 16 Oct 2020 14:17:36 +0000 (16:17 +0200)]
Merge pull request #2900 from martin-frbg/fixcmake_sse

Add compiler options for SSE to the cmake support files

Martin Kroeker [Fri, 16 Oct 2020 08:47:06 +0000 (10:47 +0200)]
Add compiler options for sse/sse2/ssse3/sse4.1

Martin Kroeker [Fri, 16 Oct 2020 08:41:53 +0000 (10:41 +0200)]
Add sse options for use of intrinsics with older compilers

Martin Kroeker [Fri, 16 Oct 2020 07:55:48 +0000 (09:55 +0200)]
fix core list for sse/sse2

Martin Kroeker [Fri, 16 Oct 2020 05:26:39 +0000 (07:26 +0200)]
Merge pull request #2898 from martin-frbg/morefixes

More pre-release fixes

Martin Kroeker [Thu, 15 Oct 2020 20:10:32 +0000 (22:10 +0200)]
add sse2

Martin Kroeker [Thu, 15 Oct 2020 18:16:15 +0000 (20:16 +0200)]
Expressly enable -msse for 32bit DYNAMIC_ARCH kernels

Martin Kroeker [Thu, 15 Oct 2020 17:08:12 +0000 (19:08 +0200)]
Silence a redefinition warning

Martin Kroeker [Thu, 15 Oct 2020 17:06:45 +0000 (19:06 +0200)]
Add -msse where supported, apparently required for older gcc

Martin Kroeker [Thu, 15 Oct 2020 17:05:37 +0000 (19:05 +0200)]
Use ifdef instead of if

Martin Kroeker [Thu, 15 Oct 2020 16:54:20 +0000 (18:54 +0200)]
Merge pull request #100 from xianyi/develop

rebase

Martin Kroeker [Thu, 15 Oct 2020 09:12:35 +0000 (11:12 +0200)]
Merge pull request #2896 from martin-frbg/intrin-double

Add compiler flag for SSE4 where available

Martin Kroeker [Thu, 15 Oct 2020 06:38:24 +0000 (08:38 +0200)]
Merge pull request #2897 from Qiyu8/usimd-double

Add double precision universal intrinsics for X86/ARM

Martin Kroeker [Thu, 15 Oct 2020 06:37:02 +0000 (08:37 +0200)]
Revert "add double precision SSE"

Qiyu8 [Thu, 15 Oct 2020 03:08:10 +0000 (11:08 +0800)]
adapt arm platform

Qiyu8 [Thu, 15 Oct 2020 02:29:42 +0000 (10:29 +0800)]
Add double precision universal intrinsics for X86/ARM

Martin Kroeker [Wed, 14 Oct 2020 18:34:33 +0000 (20:34 +0200)]
add sse4.1 for DYNAMIC_ARCH kernels

Martin Kroeker [Wed, 14 Oct 2020 17:18:07 +0000 (19:18 +0200)]
Add -msse4.1 when SSE4.1 is supported

Martin Kroeker [Wed, 14 Oct 2020 16:10:45 +0000 (18:10 +0200)]
Add double precision operations

Martin Kroeker [Wed, 14 Oct 2020 16:09:20 +0000 (18:09 +0200)]
Merge pull request #99 from xianyi/develop

rebase

Martin Kroeker [Wed, 14 Oct 2020 07:02:03 +0000 (09:02 +0200)]
Merge pull request #2890 from martin-frbg/s-d-sum

Revert special handling of Windows xNRM2 and enable C+intrinsics kern…

Martin Kroeker [Wed, 14 Oct 2020 07:01:16 +0000 (09:01 +0200)]
Merge pull request #2895 from martin-frbg/sb-tests

Fix remaining build errors related to bfloat16 and cmake

Martin Kroeker [Wed, 14 Oct 2020 06:12:08 +0000 (08:12 +0200)]
Merge pull request #2894 from RajalakshmiSR/bf16_packing

POWER10: Change the packing format for bfloat16

Martin Kroeker [Tue, 13 Oct 2020 23:08:50 +0000 (01:08 +0200)]
Replace Makefile with simplified version again

Martin Kroeker [Tue, 13 Oct 2020 23:01:58 +0000 (01:01 +0200)]
Add express -mavx and -msse options (and fix a stray = for cooperlake)

Martin Kroeker [Tue, 13 Oct 2020 21:21:38 +0000 (23:21 +0200)]
Add the BFLOAT16 functions to cmake builds

Rajalakshmi Srinivasaraghavan [Tue, 13 Oct 2020 21:05:10 +0000 (16:05 -0500)]
POWER10: Change the packing format for bfloat16

The new MMA instructions need bfloat16 inputs in 4x2 order, so this
changes the format in the copy/packing code.  This avoids permute instructions
in the gemm kernel inner loop.

Martin Kroeker [Tue, 13 Oct 2020 18:07:19 +0000 (20:07 +0200)]
Rename "HALF" type to "BFLOAT16"

Martin Kroeker [Tue, 13 Oct 2020 17:56:09 +0000 (19:56 +0200)]
Cleanup

Martin Kroeker [Tue, 13 Oct 2020 17:55:14 +0000 (19:55 +0200)]
sh prefix renamed to sb

Martin Kroeker [Tue, 13 Oct 2020 16:50:30 +0000 (18:50 +0200)]
Merge pull request #98 from xianyi/develop

rebase

Martin Kroeker [Tue, 13 Oct 2020 16:48:37 +0000 (18:48 +0200)]
Merge pull request #2892 from RajalakshmiSR/bf16_make

Fix build issues with bfloat16

Rajalakshmi Srinivasaraghavan [Tue, 13 Oct 2020 16:00:22 +0000 (11:00 -0500)]
Fix build issues with bfloat16

This patch fixes compilation errors due to recent renaming from SH to SB
with BUILD_BFLOAT16.

Martin Kroeker [Tue, 13 Oct 2020 13:02:17 +0000 (15:02 +0200)]
Fix typo

Martin Kroeker [Tue, 13 Oct 2020 12:41:25 +0000 (14:41 +0200)]
Expressly enable -mavx2 on Zen, SkylakeX and Cooperlake as well

Martin Kroeker [Tue, 13 Oct 2020 11:46:17 +0000 (13:46 +0200)]
Merge pull request #2891 from martin-frbg/fix-2886

Fix several bugs and omissions from the BFLOAT16 rename

Martin Kroeker [Tue, 13 Oct 2020 09:57:04 +0000 (11:57 +0200)]
Add -mssse3 if supported by the hardware

Martin Kroeker [Tue, 13 Oct 2020 09:55:41 +0000 (11:55 +0200)]
Add -mssse3

Martin Kroeker [Tue, 13 Oct 2020 09:42:39 +0000 (11:42 +0200)]
Add Haswell and Zen to temporary sse3 whitelist

Martin Kroeker [Tue, 13 Oct 2020 08:32:19 +0000 (10:32 +0200)]
whitelist SANDYBRIDGE for SSE3

Martin Kroeker [Tue, 13 Oct 2020 08:14:08 +0000 (10:14 +0200)]
Cleanup

Martin Kroeker [Tue, 13 Oct 2020 07:17:15 +0000 (09:17 +0200)]
Fix typos in currently unused sections

Martin Kroeker [Tue, 13 Oct 2020 07:11:36 +0000 (09:11 +0200)]
Fix bfloat16 conditional

Martin Kroeker [Tue, 13 Oct 2020 07:07:50 +0000 (09:07 +0200)]
Add a POWER9 build with BFLOAT16 enabled

Martin Kroeker [Tue, 13 Oct 2020 07:05:04 +0000 (09:05 +0200)]
Fix some overlooked "SHBLAS" entries

Martin Kroeker [Tue, 13 Oct 2020 07:01:49 +0000 (09:01 +0200)]
Merge pull request #97 from xianyi/develop

rebase

Martin Kroeker [Mon, 12 Oct 2020 22:14:29 +0000 (00:14 +0200)]
Revert special handling of Windows xNRM2 and enable C+intrinsics kernel for SSUM/DSUM

Martin Kroeker [Mon, 12 Oct 2020 22:04:35 +0000 (00:04 +0200)]
Merge pull request #2886 from martin-frbg/issue_2767

Rename "HALF" precision functions (sh prefix) to "BFLOAT16" with "sb" prefix

Martin Kroeker [Mon, 12 Oct 2020 21:50:41 +0000 (23:50 +0200)]
Merge pull request #2881 from mattip/fninit

add fninit to reset fpu registers before assembler routines

Martin Kroeker [Mon, 12 Oct 2020 21:22:08 +0000 (23:22 +0200)]
Merge pull request #2888 from Qiyu8/usimd-sum

Optimize the performance of sum by using universal intrinsics

Matti Picus [Mon, 12 Oct 2020 15:15:01 +0000 (18:15 +0300)]
use emms instead, add WIN guards

Martin Kroeker [Mon, 12 Oct 2020 12:44:33 +0000 (14:44 +0200)]
Convert the prototypes of the unimplemented BFLOAT16 functions to the new naming scheme

Qiyu8 [Mon, 12 Oct 2020 11:48:53 +0000 (19:48 +0800)]
Optimize the performance of sum by using universal intrinsics

Martin Kroeker [Sun, 11 Oct 2020 22:42:05 +0000 (00:42 +0200)]
Restore -msse3

Martin Kroeker [Sun, 11 Oct 2020 22:27:11 +0000 (00:27 +0200)]
common_sh.h renamed to common_sb.h

Martin Kroeker [Sun, 11 Oct 2020 22:11:31 +0000 (00:11 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:08:29 +0000 (00:08 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:07:37 +0000 (00:07 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:06:06 +0000 (00:06 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:05:05 +0000 (00:05 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:03:21 +0000 (00:03 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:02:16 +0000 (00:02 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 22:00:55 +0000 (00:00 +0200)]
Change "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:56:17 +0000 (23:56 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:54:53 +0000 (23:54 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:53:50 +0000 (23:53 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:52:45 +0000 (23:52 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:51:34 +0000 (23:51 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:50:54 +0000 (23:50 +0200)]
Rename common_sh.h to common_sb.h

Martin Kroeker [Sun, 11 Oct 2020 21:49:22 +0000 (23:49 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:44:38 +0000 (23:44 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:43:36 +0000 (23:43 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:42:45 +0000 (23:42 +0200)]
Rename compare_sgemm_shgemm.c to compare_sgemm_sbgemm.c

Martin Kroeker [Sun, 11 Oct 2020 21:42:07 +0000 (23:42 +0200)]
Rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:41:13 +0000 (23:41 +0200)]
Rename shdot_microk_cooperlake.c to sbdot_microk_cooperlake.c

Martin Kroeker [Sun, 11 Oct 2020 21:40:43 +0000 (23:40 +0200)]
Rename shdot.c to sbdot.c

Martin Kroeker [Sun, 11 Oct 2020 21:39:42 +0000 (23:39 +0200)]
rename "HALF" and "sh" to "BFLOAT16" and "sb"

Martin Kroeker [Sun, 11 Oct 2020 21:37:38 +0000 (23:37 +0200)]
Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c