[MLIR][GPU] Add NvGpu mma.sync path to the VectorToGPU pass
authorChristopher Bate <cbate@nvidia.com>
Tue, 17 May 2022 23:54:29 +0000 (17:54 -0600)
committerChristopher Bate <cbate@nvidia.com>
Fri, 20 May 2022 15:42:55 +0000 (09:42 -0600)
commit1ca772ed951e6412ef006459b56ae9a21691a97c
tree75266e3df8ab3745fab697b6143cf3cbcae07bfd
parent480dcdc8975d895ebb1f35ccf324900c816a0d6d
[MLIR][GPU] Add NvGpu mma.sync path to the VectorToGPU pass

This changes adds the option to lower to NvGpu dialect ops during the
VectorToGPU convsersion pass. Because this transformation reuses
existing VectorToGPU logic, a seperate VectorToNvGpu conversion pass is
not created. The option `use-nvgpu` is added to the VectorToGPU pass.
When this is true, the pass will attempt to convert slices rooted at
`vector.contract` operations into `nvgpu.mma.sync` ops, and
`vector.transfer_read` ops are converted to either `nvgpu.ldmatrix` or
one or more `vector.load` operations.  The specific data loaded will
depend on the thread id within a subgroup (warp). These index
calculations depend on data type and shape of the MMA op
according to the downstream PTX specification. The code for supporting
these details is separated into `NvGpuSupport.cpp|h`.

Differential Revision: https://reviews.llvm.org/D122940
mlir/include/mlir/Conversion/Passes.td
mlir/include/mlir/Conversion/VectorToGPU/VectorToGPU.h
mlir/lib/Conversion/PassDetail.h
mlir/lib/Conversion/VectorToGPU/CMakeLists.txt
mlir/lib/Conversion/VectorToGPU/NvGpuSupport.cpp [new file with mode: 0644]
mlir/lib/Conversion/VectorToGPU/NvGpuSupport.h [new file with mode: 0644]
mlir/lib/Conversion/VectorToGPU/VectorToGPU.cpp
mlir/test/Conversion/VectorToGPU/vector-to-mma-ops-mma-sync.mlir [new file with mode: 0644]