review.tizen.org Git - platform/upstream/llvm.git/commit

author	QingShan Zhang <qshanz@cn.ibm.com>
	Wed, 26 Aug 2020 12:26:21 +0000 (12:26 +0000)
committer	QingShan Zhang <qshanz@cn.ibm.com>
	Wed, 26 Aug 2020 12:33:59 +0000 (12:33 +0000)
commit	ebf3b188c6edcce7e90ddcacbe7c51c90d95b0ac
tree	b8e8af31aac49efbe18773f16aa8d53415c0e3e9
parent	d289a97f91443177b605926668512479c2cee37b

[Scheduling] Implement a new way to cluster loads/stores

Before calling the target hook to determine whether two loads/stores are
clusterable, we put them into different groups to avoid fake clusters caused
by dependencies. Currently, loads/stores are put into the same group if they
have the same predecessor. We assume that if two loads/stores have the same
predecessor, they are unlikely to depend on each other.

However, one SUnit might have several predecessors, and currently we just
pick the first predecessor that has a non-data/non-artificial dependency,
which is too arbitrary. We have been struggling to fix this.

So I am proposing a better implementation:
1. Collect all the loads/stores that have memory info first, to reduce the complexity.
2. Sort these loads/stores so that we can stop the search as early as possible.
3. For each load/store, search in sorted order for the first instruction it has
no dependency with, and check whether the two can be clustered.
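
The three steps above can be sketched roughly as follows. This is a minimal,
standalone illustration and not the code from this commit: the `MemOp` struct,
the `clusterMemOps` function, the sort key (base, then offset, then id) and the
`DependsOn`/`CanCluster` callbacks are all hypothetical stand-ins for the
scheduler's SUnits, its DAG dependency query and the target hook.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load/store seen by the scheduler.
struct MemOp {
  int Id;          // instruction identifier
  int Base;        // identifier of the base operand
  int64_t Offset;  // byte offset from the base
  bool HasMemInfo; // false if base/offset could not be determined
};

// Assumed callbacks: a dependency query and a stand-in for the target hook.
using DepFn = std::function<bool(int, int)>;
using ClusterFn = std::function<bool(const MemOp &, const MemOp &)>;

std::vector<std::pair<int, int>>
clusterMemOps(const std::vector<MemOp> &All, const DepFn &DependsOn,
              const ClusterFn &CanCluster) {
  // Step 1: keep only the loads/stores that have memory info.
  std::vector<MemOp> Ops;
  for (const MemOp &Op : All)
    if (Op.HasMemInfo)
      Ops.push_back(Op);

  // Step 2: sort so that likely cluster partners end up adjacent.
  std::sort(Ops.begin(), Ops.end(), [](const MemOp &A, const MemOp &B) {
    return std::tie(A.Base, A.Offset, A.Id) < std::tie(B.Base, B.Offset, B.Id);
  });

  // Step 3: for each op, take the first later op (in sorted order) that has
  // no dependency with it in either direction, then ask whether the two
  // can be clustered.
  std::vector<std::pair<int, int>> Pairs;
  for (std::size_t I = 0; I + 1 < Ops.size(); ++I) {
    std::size_t J = I + 1;
    while (J < Ops.size() && (DependsOn(Ops[I].Id, Ops[J].Id) ||
                              DependsOn(Ops[J].Id, Ops[I].Id)))
      ++J;
    if (J == Ops.size())
      continue; // every later candidate depends on this op
    if (CanCluster(Ops[I], Ops[J]))
      Pairs.emplace_back(Ops[I].Id, Ops[J].Id);
  }
  return Pairs;
}
```

In this sketch the dependency check is made per candidate pair rather than
through a shared predecessor, so an op with several predecessors needs no
arbitrary choice among them.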

Reviewed By: Jay Foad

Differential Revision: https://reviews.llvm.org/D85517

llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
llvm/lib/CodeGen/MachineScheduler.cpp
llvm/test/CodeGen/AArch64/aarch64-stp-cluster.ll
llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll
llvm/test/CodeGen/AMDGPU/max.i16.ll
llvm/test/CodeGen/AMDGPU/stack-realign.ll