TensorCore Support using Intrinsic (#4136)
author Siyuan Feng <Hzfengsy@vip.qq.com>
Thu, 24 Oct 2019 19:04:37 +0000 (12:04 -0700)
committer Leyuan Wang <laurawly@gmail.com>
Thu, 24 Oct 2019 19:04:37 +0000 (12:04 -0700)
commit 324a9607eb563f81e55fdd0c9d078c2f74651817
tree dac7e5e5441ee585354db677c5d25ae7f8d81833
parent 4ab73634c195fb7006e6279d9dbd71ecba33997b

* add tensor core support

* avoid memory bank conflict

* fix thread sync & better performance

* better performance

* add schedule test for conv2d

* extend into BatchMatMul

* support config fragment shape and layout using intrinsic

* add TensorCore tutorial

* add int support and fix lint

* address comment

* add 32*16*8 TensorCore test

* fix wmma include logic
15 files changed:
include/tvm/ir.h
include/tvm/ir_pass.h
python/tvm/build_module.py
src/api/api_pass.cc
src/codegen/build_module.cc
src/codegen/codegen_cuda.cc
src/codegen/codegen_cuda.h
src/pass/infer_fragment.cc [new file with mode: 0644]
src/pass/storage_access.cc
src/pass/storage_sync.cc
src/runtime/thread_storage_scope.h
tests/python/unittest/test_schedule_tensor_core.py [new file with mode: 0644]
topi/python/topi/testing/conv2d_nhwc_python.py
tutorials/optimize/opt_conv_tensorcore.py [new file with mode: 0644]
vta/python/vta/build_module.py