review.tizen.org Git - platform/upstream/llvm.git/commit

[AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads.

This change exploits the fact that LDS reads may be reordered even if they
access the same location.

Prior to this change, the algorithm stopped as soon as any memory access
was encountered between the loads that are expected to be merged, even
though a Read-After-Read conflict cannot affect execution correctness.
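
The difference between the old and new scan can be sketched with a toy
model. This is not the code in SILoadStoreOptimizer.cpp: MemOp, Kind,
findPairableLoad and the allowInterveningLoads flag are hypothetical names
invented for illustration, and treating every store as a barrier is a
simplifying assumption of the sketch rather than a statement about what the
pass does.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Toy model of a memory instruction (hypothetical, not LLVM's MachineInstr).
enum class Kind { Load, Store };

struct MemOp {
  Kind kind;
  int base;    // hypothetical id of the base address register
  int offset;  // hypothetical element offset from the base
};

// Scans forward from ops[first] for a later load on the same base with a
// different offset, i.e. a candidate for merging into one paired read.
// allowInterveningLoads == false models the old behaviour: any memory access
// in between ends the search. allowInterveningLoads == true models the new
// behaviour: unrelated loads are skipped, because Read-After-Read cannot
// change the result. A store always ends the search (assumption of this
// sketch).
std::optional<std::size_t> findPairableLoad(const std::vector<MemOp>& ops,
                                            std::size_t first,
                                            bool allowInterveningLoads) {
  if (first >= ops.size() || ops[first].kind != Kind::Load)
    return std::nullopt;

  for (std::size_t i = first + 1; i < ops.size(); ++i) {
    const MemOp& op = ops[i];
    if (op.kind == Kind::Store)
      return std::nullopt;  // a write may conflict; stop conservatively
    if (op.base == ops[first].base && op.offset != ops[first].offset)
      return i;             // found a load that could be paired
    if (!allowInterveningLoads)
      return std::nullopt;  // old behaviour: any other access is a barrier
  }
  return std::nullopt;
}
```

For the sequence load(base 0, offset 0), load(base 1, offset 0),
load(base 0, offset 1), the old mode finds no partner for the first load,
while the new mode skips the unrelated load and returns index 2.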

Improves the performance of hcBLAS CGEMM manually loop-unrolled kernels by 44%.
An improvement is also expected on any long sequence of reads from LDS.

Differential Revision: https://reviews.llvm.org/D25944

llvm-svn: 285919

author	Alexander Timofeev <Alexander.Timofeev@amd.com>
	Thu, 3 Nov 2016 14:37:13 +0000 (14:37 +0000)
committer	Alexander Timofeev <Alexander.Timofeev@amd.com>
	Thu, 3 Nov 2016 14:37:13 +0000 (14:37 +0000)
commit	f867a40bf60ad813560fe4cc3d2cc100472ffef4
tree	e888ef6d503dc980fc536452f72a71ab5182b7af
parent	73aba6229f7f6cdc1aa5b107518684a95da4851e

llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
llvm/test/CodeGen/AMDGPU/ds_read2.ll