powerpc: Optimized SGEMM/DGEMM/CGEMM for POWER10
author Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
Wed, 24 Jun 2020 19:48:15 +0000 (14:48 -0500)
committer Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
Wed, 24 Jun 2020 19:48:15 +0000 (14:48 -0500)
commit 571eadb88063c91ea9b5b1bcb2ae33cd8fbc5762
tree db072ffc237f358f93b1c3bdc1c56ae9a8dab02f
parent 93592d1260e09c74a42cccba0b6262f626ce775d
powerpc: Optimized SGEMM/DGEMM/CGEMM for POWER10

This patch introduces new optimized versions of SGEMM, CGEMM and DGEMM
that use the POWER10 Matrix-Multiply Assist (MMA) feature introduced in
POWER ISA v3.1. The kernels rely on the new POWER10 compute instructions
for the matrix multiplication operation.
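
For readers unfamiliar with MMA, the following is a minimal portable sketch
of the computation pattern these instructions accelerate. It is not code from
this patch. An MMA instruction such as xvf32gerpp (exposed in GCC as
__builtin_mma_xvf32gerpp) accumulates a 4x4 single-precision outer product
into an accumulator register; the scalar model below does the same with plain
loops so that it builds on any machine. The function names, the TILE constant
and the packed panel layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile size: one 4x4 single-precision accumulator. */
enum { TILE = 4 };

/* Scalar model of one rank-1 update: acc += x * y^T (outer product).
 * On POWER10 this step would map to a single MMA instruction. */
void rank1_update_4x4(float acc[TILE][TILE],
                      const float x[TILE], const float y[TILE])
{
    for (int i = 0; i < TILE; i++)
        for (int j = 0; j < TILE; j++)
            acc[i][j] += x[i] * y[j];
}

/* Compute one 4x4 tile of C += A * B by accumulating k rank-1 updates.
 * Assumed (illustrative) packing:
 *   a_panel[p * TILE + i] holds A[i][p]  (4 rows, k columns)
 *   b_panel[p * TILE + j] holds B[p][j]  (k rows, 4 columns)
 *   c is row-major with leading dimension ldc. */
void sgemm_tile_4x4(size_t k, const float *a_panel, const float *b_panel,
                    float *c, size_t ldc)
{
    assert(ldc >= TILE);

    /* The accumulator stays "in registers" for the whole k loop. */
    float acc[TILE][TILE] = {{0.0f}};

    for (size_t p = 0; p < k; p++)
        rank1_update_4x4(acc, a_panel + p * TILE, b_panel + p * TILE);

    /* Write the finished tile back to C once, after the k loop. */
    for (int i = 0; i < TILE; i++)
        for (int j = 0; j < TILE; j++)
            c[(size_t)i * ldc + j] += acc[i][j];
}
```

A full GEMM would call a tile routine like this for every 4x4 block of C.
Keeping the accumulator live across the whole K loop is the property that
makes the outer-product formulation a good fit for accumulator-based hardware.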

Tested on a simulator; there are no new test failures.
Cycle count is reduced by 30-50% compared to the POWER9 version,
depending on the M/N/K sizes.
MMA GCC patch for reference:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8ee2640bfdc62f835ec9740278f948034bc7d9f1
kernel/power/KERNEL.POWER10
kernel/power/cgemm_kernel_power10.S [new file with mode: 0644]
kernel/power/cgemm_logic_power10.S [new file with mode: 0644]
kernel/power/cgemm_macros_power10.S [new file with mode: 0644]
kernel/power/dgemm_kernel_power10.c [new file with mode: 0644]
kernel/power/sgemm_kernel_power10.c [new file with mode: 0644]