platform/upstream/openblas.git
Martin Kroeker [Thu, 30 Jul 2020 08:13:19 +0000 (10:13 +0200)]
Merge pull request #2750 from RajalakshmiSR/dgemv_p10

dgemv optimization for POWER10

Rajalakshmi Srinivasaraghavan [Wed, 29 Jul 2020 23:59:32 +0000 (18:59 -0500)]
dgemv optimization for POWER10

Making use of the new POWER10 vector pair instructions in dgemv_n and dgemv_t.
Also adding a new 4x128 block to make use of the Matrix-Multiply Assist (MMA)
feature introduced in POWER ISA v3.1. Tested on a simulator; there
are no new test failures.
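
For orientation, the operation that dgemv_n and dgemv_t accelerate can be written as a plain scalar loop. The sketch below is a portable reference only, not the POWER10 kernel from this commit: the names ref_dgemv_n and ref_dgemv_t are hypothetical, and column-major storage with leading dimension lda is assumed, as in the reference BLAS.

```c
#include <assert.h>
#include <stddef.h>

/* Reference y += alpha * A * x for an m-by-n column-major matrix A.
 * Hypothetical helper for illustration; not the optimized kernel. */
static void ref_dgemv_n(size_t m, size_t n, double alpha,
                        const double *a, size_t lda,
                        const double *x, double *y)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; j++) {
        double t = alpha * x[j];          /* scale column j once */
        for (size_t i = 0; i < m; i++)
            y[i] += t * a[i + j * lda];   /* walk down column j */
    }
}

/* Reference y += alpha * A^T * x; here x has m entries and y has n. */
static void ref_dgemv_t(size_t m, size_t n, double alpha,
                        const double *a, size_t lda,
                        const double *x, double *y)
{
    assert(lda >= m);
    for (size_t j = 0; j < n; j++) {
        double dot = 0.0;
        for (size_t i = 0; i < m; i++)
            dot += a[i + j * lda] * x[i]; /* column j dotted with x */
        y[j] += alpha * dot;
    }
}
```

Both loops read A one column at a time, so the inner loop touches contiguous memory. That access pattern is what makes wide vector loads attractive for this routine.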

Martin Kroeker [Wed, 29 Jul 2020 08:01:14 +0000 (10:01 +0200)]
Merge pull request #2741 from martin-frbg/issue2739

Adjust A53 SGEMM parameters to reflect recent switch to 8x8 kernel

Martin Kroeker [Tue, 28 Jul 2020 17:32:04 +0000 (19:32 +0200)]
Merge pull request #2744 from martin-frbg/issue2738

Add AMD Renoir/Matisse cpu autodetection and preliminary support for Zen3

Martin Kroeker [Tue, 28 Jul 2020 16:15:25 +0000 (18:15 +0200)]
Merge pull request #2740 from RajalakshmiSR/clang-power

Fix compilation issues with clang on POWER

Martin Kroeker [Tue, 28 Jul 2020 14:22:41 +0000 (14:22 +0000)]
Put hint to use git develop rather than master branch in README

Martin Kroeker [Tue, 28 Jul 2020 13:53:17 +0000 (13:53 +0000)]
Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2

also support AMD family 22 Jaguar/Puma as Bobcat

Martin Kroeker [Tue, 28 Jul 2020 13:45:23 +0000 (13:45 +0000)]
Add AMD Renoir models and preliminary support for ZEN3 as ZEN2

also remap erroneous family 16 entry to BOBCAT and reclaim erroneous family 25 "Barcelona" for Zen3
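
The family numbers in these commits (22, 25) are CPUID "display family" values, which combine two bit fields of the EAX register returned by CPUID leaf 1. A minimal sketch of that decoding follows. The enum and the mapping function are hypothetical names that only mirror the commit text; they are not the identifiers used in the OpenBLAS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Display family from the EAX value of CPUID leaf 1: the extended family
 * field (bits 27:20) is added only when the base family field
 * (bits 11:8) equals 0xF. */
static unsigned cpuid_display_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xFu;
    unsigned ext  = (eax >> 20) & 0xFFu;
    return (base == 0xFu) ? base + ext : base;
}

/* Hypothetical target labels, for illustration only. */
enum amd_target { TARGET_UNKNOWN, TARGET_BOBCAT, TARGET_ZEN2 };

/* Hypothetical mapping covering only the two cases named in the commits. */
static enum amd_target amd_family_to_target(unsigned family)
{
    assert(family <= 0xFu + 0xFFu); /* largest value the fields can encode */
    switch (family) {
    case 22: return TARGET_BOBCAT;  /* Jaguar/Puma handled as Bobcat */
    case 25: return TARGET_ZEN2;    /* Zen3, provisionally handled as Zen2 */
    default: return TARGET_UNKNOWN;
    }
}
```

For example, an EAX value of 0x00A20F10 has base family 0xF and extended family 0xA, giving display family 25.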

Martin Kroeker [Mon, 27 Jul 2020 20:19:22 +0000 (20:19 +0000)]
missing braces

Martin Kroeker [Mon, 27 Jul 2020 19:54:46 +0000 (19:54 +0000)]
Adjust A53 SGEMM parameters to reflect move to 8x8 kernel

Rajalakshmi Srinivasaraghavan [Mon, 27 Jul 2020 19:11:07 +0000 (14:11 -0500)]
Fix compilation issues with clang on POWER

As gcc defaults to -malign-power, removing that option. Also
adding -fno-integrated-as so that the GNU assembler is used for the powerpc
assembly optimization files. Fixed other compilation errors
reported in dgemv_t.c.

Martin Kroeker [Mon, 27 Jul 2020 13:19:47 +0000 (15:19 +0200)]
Merge pull request #2737 from ashwinyes/add_thunderx3_target

ARM64: Add THUNDERX3T110 Target

Ashwin Sekhar T K [Thu, 11 Jun 2020 11:12:49 +0000 (04:12 -0700)]
ARM64: Add THUNDERX3T110 Target

Martin Kroeker [Sun, 26 Jul 2020 17:54:11 +0000 (19:54 +0200)]
Merge pull request #2735 from martin-frbg/move_potrf

Move potrf_parallel.c from lapack/getrf to lapack/potrf where it belongs

Martin Kroeker [Sat, 25 Jul 2020 07:02:32 +0000 (09:02 +0200)]
Merge pull request #2734 from RajalakshmiSR/p10_fix

Fix to store results in correct order for POWER10 GEMM kernels

Martin Kroeker [Sat, 25 Jul 2020 06:52:24 +0000 (08:52 +0200)]
Use _Atomic instead of volatile where available (file moved from ../getrf)

Must have misplaced this in ../getrf when I made that change in March 2018 (40160ff).
The only changes since then were:
- RFC: Add half precision gemm for bfloat16 in OpenBLAS
  (Rajalakshmi Srinivasaraghavan, committed on 14 Apr 2020 as 7ebbb50)
- Change _STDC_VERSION__ to __STDC_VERSION__
  (Zhiyong Dang, committed on 11 May 2018 as 3716267)
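
The "_Atomic where available" pattern can be pictured as a type selected at compile time. The sketch below is an illustration under stated assumptions, not the code in potrf_parallel.c: job_flag_t, mark_done and is_done are hypothetical names, and HAVE_C11 is derived here from standard macros so the example stands alone, whereas a real build would presumably set it from a configure-time probe.

```c
#include <assert.h>

/* Assumption: HAVE_C11 means "C11 atomics and <stdatomic.h> are usable".
 * Derived from standard macros here so the example is self-contained. */
#if !defined(HAVE_C11) && defined(__STDC_VERSION__) && \
    __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_ATOMICS__)
#define HAVE_C11 1
#endif

#ifdef HAVE_C11
#include <stdatomic.h>
typedef _Atomic long job_flag_t;   /* loads and stores are atomic */
#else
typedef volatile long job_flag_t;  /* fallback: no atomicity guarantee */
#endif

/* Hypothetical shared flag polled by worker threads. */
static job_flag_t job_done;

static void mark_done(void)
{
    job_done = 1;   /* sequentially consistent store when _Atomic */
}

static long is_done(void)
{
    long v = job_done;  /* sequentially consistent load when _Atomic */
    assert(v == 0 || v == 1);
    return v;
}
```

Note the double leading underscore in __STDC_VERSION__. With only one, the macro is never defined and the guard silently selects the volatile fallback, which is the kind of slip the second commit listed above corrected.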

Martin Kroeker [Sat, 25 Jul 2020 06:42:39 +0000 (06:42 +0000)]
Delete potrf_parallel.c (moving it to ../potrf)

Rajalakshmi Srinivasaraghavan [Sat, 25 Jul 2020 04:08:11 +0000 (23:08 -0500)]
Fix to store results in correct order for POWER10 GEMM kernels

A recent compiler change in __builtin_mma_disassemble_acc() affects
the order in which results are stored on POWER10. Also removing the new LDFLAG
-mno-power10-stub, as it is handled by the linker automatically.

Martin Kroeker [Fri, 24 Jul 2020 21:19:45 +0000 (23:19 +0200)]
Merge pull request #2720 from martin-frbg/issue2694

WIP Further fixes for 32bit POWER8

Martin Kroeker [Fri, 24 Jul 2020 16:04:58 +0000 (16:04 +0000)]
Typo fix

Martin Kroeker [Fri, 24 Jul 2020 10:13:46 +0000 (10:13 +0000)]
Regroup the 32 and 64bit sections and restore 64bit CAXPY

Martin Kroeker [Fri, 24 Jul 2020 09:06:20 +0000 (11:06 +0200)]
Merge pull request #2721 from martin-frbg/p8align

Fix alignment errors in the power8 saxpy kernel

Martin Kroeker [Fri, 24 Jul 2020 09:05:16 +0000 (11:05 +0200)]
Merge pull request #2731 from martin-frbg/pgippc

Fixes for compilation on POWER with PGI compilers

Martin Kroeker [Thu, 23 Jul 2020 20:40:13 +0000 (20:40 +0000)]
Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only

Martin Kroeker [Thu, 23 Jul 2020 18:30:42 +0000 (18:30 +0000)]
Add ifdefs around call to altivec microkernel

Martin Kroeker [Thu, 23 Jul 2020 17:34:56 +0000 (17:34 +0000)]
Typo fix

Martin Kroeker [Thu, 23 Jul 2020 15:10:59 +0000 (17:10 +0200)]
Rewrite assignment to complex for better portability

Martin Kroeker [Thu, 23 Jul 2020 15:08:20 +0000 (17:08 +0200)]
Exclude altivec code paths if the compiler does not support them

Martin Kroeker [Thu, 23 Jul 2020 15:03:28 +0000 (17:03 +0200)]
Avoid undefining NAME, CNAME etc. for pgcc as it makes it ignore the new definitions

Martin Kroeker [Thu, 23 Jul 2020 14:59:06 +0000 (16:59 +0200)]
Merge pull request #73 from xianyi/develop

rebase

Martin Kroeker [Wed, 22 Jul 2020 20:45:57 +0000 (22:45 +0200)]
Merge pull request #2729 from martin-frbg/issue2728

Unify BUFFER_SIZE settings for x86_64 again to fix DYNAMIC_ARCH crashes

Martin Kroeker [Wed, 22 Jul 2020 17:30:55 +0000 (17:30 +0000)]
Unify BUFFER_SIZE settings for x86_64 again to fix potentially fatal mismatch in DYNAMIC_ARCH builds

Martin Kroeker [Tue, 21 Jul 2020 15:06:53 +0000 (17:06 +0200)]
Merge pull request #2727 from wyphan/develop

Patch for building on POWERPC with PGI compilers (was Patch for building on Summit)

Martin Kroeker [Tue, 21 Jul 2020 14:42:06 +0000 (16:42 +0200)]
Merge pull request #2726 from martin-frbg/2725-2

Add detection of stdatomic.h for cmake

Wileam Phan [Tue, 21 Jul 2020 03:30:28 +0000 (23:30 -0400)]
Patch for building on Summit

Martin Kroeker [Mon, 20 Jul 2020 22:52:09 +0000 (22:52 +0000)]
Add trivial check for stdatomic.h

Martin Kroeker [Mon, 20 Jul 2020 22:49:12 +0000 (00:49 +0200)]
Merge pull request #72 from xianyi/develop

rebase

Martin Kroeker [Sat, 18 Jul 2020 21:08:08 +0000 (23:08 +0200)]
Merge pull request #2725 from martin-frbg/ccheck_c11

Have c_check probe availability of C11 atomics support and stdatomic.h

Martin Kroeker [Sat, 18 Jul 2020 17:19:59 +0000 (17:19 +0000)]
Update conditional for atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:14:50 +0000 (17:14 +0000)]
Update conditional for atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:13:24 +0000 (17:13 +0000)]
Update conditional for C11 atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:09:56 +0000 (17:09 +0000)]
Update conditional for atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:09:01 +0000 (17:09 +0000)]
Update conditional for atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:07:38 +0000 (17:07 +0000)]
Update conditional for atomics to HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:05:59 +0000 (17:05 +0000)]
Update conditional for atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 17:03:31 +0000 (17:03 +0000)]
Update conditional for atomics to use HAVE_C11

Martin Kroeker [Sat, 18 Jul 2020 16:59:33 +0000 (16:59 +0000)]
Report availability of C11 support

Martin Kroeker [Sat, 18 Jul 2020 16:57:41 +0000 (16:57 +0000)]
Add a check for C11 atomics and stdatomic.h

Martin Kroeker [Sat, 18 Jul 2020 16:08:40 +0000 (18:08 +0200)]
Merge pull request #2724 from martin-frbg/loongsonreadme

Update cross-compiling example in README to reflect change in Loongson gcc

Martin Kroeker [Sat, 18 Jul 2020 12:51:37 +0000 (12:51 +0000)]
Update cross-compiling example to reflect change in Loongson gcc

for #2723

Martin Kroeker [Fri, 17 Jul 2020 08:33:03 +0000 (10:33 +0200)]
Merge pull request #2722 from martin-frbg/cmakefcheck

Handle lack of fortran compiler more gracefully in cmake

Martin Kroeker [Thu, 16 Jul 2020 22:36:35 +0000 (22:36 +0000)]
include CheckLanguage module

Martin Kroeker [Thu, 16 Jul 2020 22:17:39 +0000 (22:17 +0000)]
Handle lack of Fortran compiler more gracefully

Martin Kroeker [Thu, 16 Jul 2020 21:32:54 +0000 (23:32 +0200)]
Use vec_vsx_ld/st to fix misaligned accesses flagged by asan

Martin Kroeker [Wed, 15 Jul 2020 08:00:07 +0000 (10:00 +0200)]
remove debug output and revert changes to cdot and crot

Martin Kroeker [Tue, 14 Jul 2020 23:20:54 +0000 (01:20 +0200)]
Merge pull request #2716 from RajalakshmiSR/p10_ldflag

Add new linker option for POWER10

Rajalakshmi Srinivasaraghavan [Tue, 14 Jul 2020 16:54:04 +0000 (11:54 -0500)]
Add new linker option for POWER10

While building with DYNAMIC_ARCH on POWER9 with a POWER10-aware
toolchain, a new LDFLAG is needed to avoid POWER10
instructions in PLT calls.

Martin Kroeker [Tue, 14 Jul 2020 16:20:03 +0000 (18:20 +0200)]
fix trailing whitespace

Martin Kroeker [Tue, 14 Jul 2020 16:11:19 +0000 (18:11 +0200)]
Use POWER6 GEMM, TRMM and DTRSM on 32bit POWER8

Martin Kroeker [Tue, 14 Jul 2020 16:10:12 +0000 (18:10 +0200)]
Do not define USE_TRMM for 32bit POWER8

Martin Kroeker [Tue, 14 Jul 2020 16:07:58 +0000 (18:07 +0200)]
Use POWER6 GEMM parameters on 32bit POWER8

Martin Kroeker [Tue, 14 Jul 2020 16:01:34 +0000 (18:01 +0200)]
Merge pull request #71 from xianyi/develop

rebase

Martin Kroeker [Mon, 13 Jul 2020 12:43:24 +0000 (14:43 +0200)]
Merge pull request #2682 from martin-frbg/aix

[WIP] fix compilation on AIX

Martin Kroeker [Mon, 13 Jul 2020 11:00:39 +0000 (13:00 +0200)]
Merge pull request #2651 from leezu/actionsflang

Add flang build to Github Actions

Martin Kroeker [Sun, 12 Jul 2020 18:37:29 +0000 (20:37 +0200)]
Merge branch 'develop' into actionsflang

Martin Kroeker [Sun, 12 Jul 2020 18:17:11 +0000 (20:17 +0200)]
Merge pull request #2706 from jussienko/use-always-omp-threads

Fix OpenMP builds defaulting to singlethreading with OMP_PLACES or OMP_PROC_BIND set

Martin Kroeker [Sun, 12 Jul 2020 16:59:01 +0000 (18:59 +0200)]
Make 32bit POWER8 use POWER6 kernels for now

Martin Kroeker [Sun, 12 Jul 2020 16:51:58 +0000 (18:51 +0200)]
merge overwritten part of power10 support

Martin Kroeker [Fri, 10 Jul 2020 10:06:50 +0000 (12:06 +0200)]
Merge pull request #2710 from martin-frbg/cmake-lapacktest

Add LAPACK-TESTING to the cmake build

Martin Kroeker [Fri, 10 Jul 2020 08:43:33 +0000 (10:43 +0200)]
Merge pull request #2713 from RajalakshmiSR/p10-gcc10

Change minimum gcc version for POWER10

Rajalakshmi Srinivasaraghavan [Fri, 10 Jul 2020 02:46:06 +0000 (21:46 -0500)]
Change minimum gcc version for POWER10

As the MMA patches for POWER10 have been backported to gcc 10.2, changing
the minimum gcc version needed to build OpenBLAS for POWER10.

Martin Kroeker [Thu, 9 Jul 2020 11:57:27 +0000 (13:57 +0200)]
Do not build lapack-test on MSVC for now (same as with BLAS test)

Martin Kroeker [Thu, 9 Jul 2020 11:44:25 +0000 (13:44 +0200)]
enable fortran for cmake

Martin Kroeker [Thu, 9 Jul 2020 11:13:16 +0000 (13:13 +0200)]
Modify for building with OpenBLAS

Martin Kroeker [Thu, 9 Jul 2020 11:12:35 +0000 (13:12 +0200)]
Modify for building with OpenBLAS

Martin Kroeker [Thu, 9 Jul 2020 09:44:31 +0000 (11:44 +0200)]
Append crude hack for enabling lapack tests in the OpenBLAS build

Martin Kroeker [Thu, 9 Jul 2020 09:42:02 +0000 (11:42 +0200)]
Add lapack-test

Martin Kroeker [Wed, 8 Jul 2020 10:26:44 +0000 (12:26 +0200)]
Merge pull request #2708 from RajalakshmiSR/p10_future

Changing mcpu option as power10

Martin Kroeker [Tue, 7 Jul 2020 16:52:06 +0000 (18:52 +0200)]
Merge branch 'develop' into aix

Rajalakshmi Srinivasaraghavan [Tue, 7 Jul 2020 16:25:20 +0000 (11:25 -0500)]
Changing mcpu option as power10

As the compiler now accepts power10 as an -mcpu value, changing it from future.

Martin Kroeker [Tue, 7 Jul 2020 13:46:32 +0000 (15:46 +0200)]
Obtain actual cpu count on AIX and suppress spurious NO_AVX512 on non-x86

Jussi Enkovaara [Tue, 7 Jul 2020 10:35:43 +0000 (13:35 +0300)]
fixes #2238

Always obey omp_get_max_threads() when built with USE_OPENMP
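
As a rough illustration of what "obeying omp_get_max_threads()" means, a library can take its thread count directly from the OpenMP runtime instead of applying its own heuristics. The helper name blas_thread_count below is hypothetical, and the sketch keys on the standard _OPENMP macro rather than the OpenBLAS build option so that it compiles with or without OpenMP enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of threads the library should use. */
static int blas_thread_count(void)
{
    int n = 1;                    /* single-threaded fallback */
#ifdef _OPENMP
    /* Upper bound on the team size of the next parallel region;
     * reflects OMP_NUM_THREADS and omp_set_num_threads(). */
    n = omp_get_max_threads();
#endif
    assert(n >= 1);
    return n;
}
```

Without OpenMP support enabled at compile time the function simply returns 1.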

Martin Kroeker [Mon, 6 Jul 2020 22:12:28 +0000 (00:12 +0200)]
Merge pull request #2670 from mhillenibm/dumpfullversion_on_gcc7

RFC: Use -dumpfullversion to get minor version on gcc-7 and newer

Martin Kroeker [Thu, 2 Jul 2020 20:32:51 +0000 (22:32 +0200)]
Merge pull request #2703 from martin-frbg/issue2702

Compatibility fix for gcc < 4.7

Martin Kroeker [Thu, 2 Jul 2020 15:00:15 +0000 (17:00 +0200)]
Option -mavx2 requires at least gcc 4.7

Martin Kroeker [Thu, 2 Jul 2020 14:56:00 +0000 (16:56 +0200)]
Merge pull request #69 from xianyi/develop

rebase

Martin Kroeker [Wed, 1 Jul 2020 06:29:52 +0000 (08:29 +0200)]
Merge pull request #2693 from EGuesnet/AIX-build-on-POWER8-32bits

AIX build on POWER8 32bits

EGuesnet [Tue, 30 Jun 2020 13:16:39 +0000 (15:16 +0200)]
Update cgemm_kernel_8x4_power8.S

Martin Kroeker [Sat, 27 Jun 2020 15:47:24 +0000 (17:47 +0200)]
Merge pull request #2688 from martin-frbg/cometlake

Add autodetection of Intel Comet Lake H and S models

Martin Kroeker [Sat, 27 Jun 2020 12:41:24 +0000 (14:41 +0200)]
Add support for Comet Lake H and S

Martin Kroeker [Sat, 27 Jun 2020 12:36:37 +0000 (14:36 +0200)]
Add support for Comet Lake H & S

Martin Kroeker [Sat, 27 Jun 2020 12:29:29 +0000 (14:29 +0200)]
Merge pull request #68 from xianyi/develop

rebase

Martin Kroeker [Fri, 26 Jun 2020 20:53:09 +0000 (22:53 +0200)]
Merge pull request #2687 from martin-frbg/utfbom

Strip UTF8 byte order marker from source files

Martin Kroeker [Fri, 26 Jun 2020 20:52:45 +0000 (22:52 +0200)]
Merge pull request #2686 from RajalakshmiSR/p10_shgemm

powerpc: Optimized SHGEMM kernel for POWER10

Martin Kroeker [Fri, 26 Jun 2020 10:11:03 +0000 (12:11 +0200)]
Merge pull request #2683 from mtreinish/add-comet-lake-support

Add cpu detection support for comet lake U

Martin Kroeker [Fri, 26 Jun 2020 09:27:28 +0000 (11:27 +0200)]
Merge pull request #2680 from kavanabhat/aix_makefile_fix

Fix for #2671

Martin Kroeker [Fri, 26 Jun 2020 07:00:43 +0000 (09:00 +0200)]
Strip UTF8 byte order marker from source
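
For readers unfamiliar with the marker in question: a UTF-8 byte order mark is the three-byte sequence EF BB BF at the very start of a file, and some compilers and preprocessors choke on it. A minimal sketch of detecting it in a buffer is shown below. The helper name utf8_bom_length is hypothetical, and the commit itself edited the affected files rather than adding code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the number of leading bytes to skip: 3 if the buffer starts with
 * the UTF-8 byte order mark (EF BB BF), otherwise 0. */
static size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = {0xEF, 0xBB, 0xBF};
    assert(buf != NULL || len == 0);
    return (len >= 3 && memcmp(buf, bom, 3) == 0) ? 3 : 0;
}
```

A caller would advance the buffer pointer by the returned count before writing the contents back out.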

Rajalakshmi Srinivasaraghavan [Fri, 26 Jun 2020 03:19:08 +0000 (22:19 -0500)]
powerpc: Optimized SHGEMM kernel for POWER10

This patch introduces a new optimized version of the SHGEMM kernel
using the POWER10 Matrix-Multiply Assist (MMA) feature introduced in
POWER ISA v3.1. It makes use of the new POWER10 compute instructions
for matrix multiplication.

Tested on simulator and there are no new test failures.

Matthew Treinish [Thu, 25 Jun 2020 19:53:56 +0000 (15:53 -0400)]
Also set CPUTYPE in get_cpuname()

Matthew Treinish [Thu, 25 Jun 2020 15:56:49 +0000 (11:56 -0400)]
Add support to driver/others/dynamic.c too