PTX 6.3 extends the `wmma` instruction to support s8/u8/s4/u4/b1 -> s32.
author Artem Belevich <tra@google.com>
Thu, 25 Apr 2019 22:27:57 +0000 (22:27 +0000)
committer Artem Belevich <tra@google.com>
Thu, 25 Apr 2019 22:27:57 +0000 (22:27 +0000)
commit 16737538f4fc4757ae5226e95b177155ed8e13ad
tree c6e9434b7754b9a25d87a91b5db4e91912e24927
parent 8d825b38ed2c3198f0523baef788227832298b9c

All of the new instructions are still handled mostly by tablegen. I've slightly
refactored the code to drive intrinsic/instruction generation from a single
master list of supported variants, so each irregularity has to be implemented
in only one place.

The test generation script wmma.py has been refactored in a similar way.
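
As a rough illustration of the "master list" idea (not the actual code in
wmma.py or the .td files), a generator can expand one table of supported
combinations into every variant. The names `Variant`, `SUPPORTED`,
`all_variants` and `mma_name` are hypothetical, the name format is made up for
the example, and the shape/type table reflects my reading of the PTX 6.3 WMMA
shapes, so it should be checked against the PTX ISA before being relied on.

```python
from collections import namedtuple

# Hypothetical record type for illustration; not taken from wmma.py.
Variant = namedtuple("Variant", ["geom", "ab_type", "acc_type"])

# Master list of supported combinations (assumed, see note above).
# Each row: (A/B element types, geometries, accumulator types).
SUPPORTED = [
    (["f16"], ["m16n16k16", "m32n8k16", "m8n32k16"], ["f16", "f32"]),
    (["s8", "u8"], ["m16n16k16", "m32n8k16", "m8n32k16"], ["s32"]),
    (["s4", "u4"], ["m8n8k32"], ["s32"]),
    (["b1"], ["m8n8k128"], ["s32"]),
]


def all_variants():
    """Expand the master list into one Variant per legal combination."""
    for ab_types, geoms, acc_types in SUPPORTED:
        for ab in ab_types:
            for geom in geoms:
                for acc in acc_types:
                    yield Variant(geom, ab, acc)


def mma_name(v):
    """Build an illustrative name; the format is hypothetical, not PTX syntax."""
    return "wmma.mma.{}.{}.{}".format(v.geom, v.ab_type, v.acc_type)


if __name__ == "__main__":
    for v in all_variants():
        print(mma_name(v))
```

Because every consumer iterates over `all_variants()`, an irregular case such
as b1 only supporting a single shape is encoded once in the table rather than
in each generator.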

Differential Revision: https://reviews.llvm.org/D60015

llvm-svn: 359247
llvm/include/llvm/IR/IntrinsicsNVVM.td
llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/test/CodeGen/NVPTX/wmma.py