review.tizen.org Git - platform/upstream/openblas.git/commit

author	Rajalakshmi Srinivasaraghavan <raji@linux.ibm.com>
	Tue, 14 Apr 2020 19:55:08 +0000 (14:55 -0500)
committer	Rajalakshmi Srinivasaraghavan <raji@linux.ibm.com>
	Tue, 14 Apr 2020 19:55:08 +0000 (14:55 -0500)
commit	7eb55504b1727eebcb0f451fa5b148dbea303b69
tree	1de8d9b68acf46139b1e01dc36664e220aac0b6d
parent	c861b2a7bda3c88ad30aac105e473b46fd940dd7

RFC : Add half precision gemm for bfloat16 in OpenBLAS

This patch adds a matrix multiplication kernel for the bfloat16 data type.
On architectures that don't support bfloat16, the type is defined as unsigned
short (2 bytes).  The default unroll sizes can be changed per architecture, as
is done for SGEMM; for now, 8 and 4 are used for M and N.  The ncopy/tcopy size
can likewise be changed per architecture; for now, size 2 is used.
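
As a rough illustration of the unsigned short fallback described above, the
sketch below stores a bfloat16 value as the upper 16 bits of an IEEE 754
single-precision float.  The names bf16_t, float_to_bf16 and bf16_to_float are
hypothetical and are not taken from this patch; round-to-nearest-even is an
assumption, and the patch itself may convert differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name: a 2-byte container for bfloat16 bits. */
typedef unsigned short bf16_t;

/* Convert float to bfloat16 bits (assumed rounding: nearest, ties to even). */
static bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    /* Keep NaN a NaN: force a quiet-NaN mantissa bit after truncation. */
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return (bf16_t)((bits >> 16) | 0x0040u);

    /* Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits. */
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* Widen bfloat16 bits back to float; this direction is exact. */
static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because bfloat16 keeps the full 8-bit exponent of a float and only shortens
the mantissa, widening is a plain 16-bit shift, which is why a 2-byte integer
type is enough as a storage fallback.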

Added shgemm to kernel/power/KERNEL.POWER9 and tested on powerpc64le and
powerpc64.  For reference, added a small test, compare_sgemm_shgemm.c, that
compares sgemm and shgemm output.
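
The kind of comparison such a test performs could look roughly like the
following.  This is not the file from the patch: it uses naive reference loops
instead of the OpenBLAS sgemm/shgemm entry points, assumes column-major storage
and a float result from the bfloat16 product, and all function names are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef unsigned short bf16_t;  /* hypothetical 2-byte bfloat16 container */

/* Truncating conversion, used here only to prepare bfloat16 inputs. */
static bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reference C = A * B in float; A is m x k, B is k x n, column-major. */
static void ref_gemm_f32(int m, int n, int k,
                         const float *a, const float *b, float *c)
{
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += a[i + p * m] * b[p + j * k];
            c[i + j * m] = acc;
        }
}

/* Same product with bfloat16 inputs, accumulated and stored as float. */
static void ref_gemm_bf16(int m, int n, int k,
                          const bf16_t *a, const bf16_t *b, float *c)
{
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += bf16_to_float(a[i + p * m]) * bf16_to_float(b[p + j * k]);
            c[i + j * m] = acc;
        }
}

/* Largest absolute element-wise difference between two result buffers. */
static float max_abs_diff(const float *x, const float *y, int len)
{
    float worst = 0.0f;
    for (int i = 0; i < len; i++) {
        float d = x[i] - y[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

With inputs that are exactly representable in bfloat16 (small integers, for
example) the two results should match exactly; for general inputs a tolerance
on max_abs_diff is needed, since bfloat16 keeps only 8 significant bits.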

This patch does not cover the OpenBLAS test, benchmark and LAPACK tests for
shgemm.  A complex-type implementation can be discussed and added once this is
approved.

27 files changed:

Makefile.system
Makefile.tail
cmake/prebuild.cmake
cmake/system.cmake
common.h
common_interface.h
common_level3.h
common_macro.h
common_param.h
common_sh.h	[new file with mode: 0644]
driver/level3/Makefile
driver/level3/level3.c
driver/level3/level3_thread.c
driver/others/parameter.c
getarch_2nd.c
interface/Makefile
interface/gemm.c
kernel/Makefile.L3
kernel/generic/gemm_beta.c
kernel/generic/gemm_ncopy_2.c
kernel/generic/gemm_tcopy_2.c
kernel/generic/gemmkernel_2x2.c
kernel/power/KERNEL.POWER9
kernel/setparam-ref.c
lapack/getrf/potrf_parallel.c
param.h
test/compare_sgemm_shgemm.c	[new file with mode: 0644]

Domain: Machine Learning / ML Framework;
