[flang] Implement MATMUL in the runtime
author peter klausler <pklausler@nvidia.com>
Mon, 17 May 2021 21:06:44 +0000 (14:06 -0700)
committer peter klausler <pklausler@nvidia.com>
Tue, 18 May 2021 17:59:52 +0000 (10:59 -0700)
commit 5e1421b22f642a6b34690d0d724e691ba3984836
tree 68247f38710d572e644bac13a2d53ffd69404d57
parent 2919222d8017f2425a85765b95e4b7c6f8e70ca4

Define an API for the transformational intrinsic function MATMUL,
implement it, and add some basic unit tests.  The large number of
possible argument type combinations is covered by a set of
generalized templates that are instantiated for each valid
pair of possible argument types.
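
As a rough illustration of the "one generalized template, instantiated
per type pair" idea, here is a minimal sketch.  It is not the code in
flang/runtime/matmul.cpp: the name MatmulSketch, the use of std::vector,
the row-major layout, and the use of std::common_type_t to pick the
result type are all assumptions made for this example.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical helper, NOT the flang runtime API.
// Multiplies a (rows x n) matrix by an (n x cols) matrix.  Both inputs
// and the result are stored row-major here for simplicity (an
// assumption of this sketch; Fortran arrays are column-major).
template <typename A, typename B>
std::vector<std::common_type_t<A, B>> MatmulSketch(const std::vector<A> &a,
    const std::vector<B> &b, std::size_t rows, std::size_t n,
    std::size_t cols) {
  // The result type is derived from the argument type pair, so each
  // distinct (A, B) pair yields its own instantiation.
  using Result = std::common_type_t<A, B>;
  std::vector<Result> result(rows * cols, Result{});
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t k = 0; k < n; ++k) {
      const Result aik = static_cast<Result>(a[i * n + k]);
      for (std::size_t j = 0; j < cols; ++j) {
        result[i * cols + j] += aik * static_cast<Result>(b[k * cols + j]);
      }
    }
  }
  return result;
}
```

Calling MatmulSketch with a std::vector<int> and a std::vector<double>
instantiates the <int, double> variant and returns a std::vector<double>;
a different argument pair produces a separate instantiation of the same
template body.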

Places where BLAS-2/3 routines could be called for acceleration
are marked with TODOs.  Handling for other special cases (e.g.,
known-shape 3x3 matrices and vectors) is deferred.

Some minor tweaks were made to the recent related implementation
of DOT_PRODUCT to reflect lessons learned.

Differential Revision: https://reviews.llvm.org/D102652
flang/runtime/CMakeLists.txt
flang/runtime/dot-product.cpp
flang/runtime/matmul.cpp [new file with mode: 0644]
flang/runtime/matmul.h [new file with mode: 0644]
flang/runtime/reduction.h
flang/unittests/RuntimeGTest/CMakeLists.txt
flang/unittests/RuntimeGTest/Matmul.cpp [new file with mode: 0644]