[mlir][VectorToGPU] Fix support for i4, col-major operand support
author Christopher Bate <cbate@nvidia.com>
Fri, 17 Jun 2022 17:40:28 +0000 (11:40 -0600)
committer Christopher Bate <cbate@nvidia.com>
Thu, 30 Jun 2022 16:26:59 +0000 (10:26 -0600)
commit 670eee08cecfcfe170fb0e7daa88df8c2a150dbe
tree 4e912348587954713128f31b5859dc50ff9b435d
parent 3a56858cee5c88a3027492ebe4abecc87fb50ee1

For the conversion to nvgpu `mma.sync` and `ldmatrix` pathways, the code
was missing support for the `i4` data type. While fixing this, another
bug was discovered: the number of `ldmatrix` tiles was calculated
incorrectly for certain operand types and configurations. This change
fixes both issues and adds tests.

Differential Revision: https://reviews.llvm.org/D128074
mlir/lib/Conversion/VectorToGPU/NvGpuSupport.cpp
mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
mlir/test/Conversion/VectorToGPU/vector-to-mma-ops-mma-sync.mlir