review.tizen.org Git - platform/upstream/llvm.git/commit

author	Roman Lebedev <lebedev.ri@gmail.com>
	Sat, 2 Oct 2021 10:40:09 +0000 (13:40 +0300)
committer	Roman Lebedev <lebedev.ri@gmail.com>
	Sat, 2 Oct 2021 10:40:21 +0000 (13:40 +0300)
commit	acb459574afc344bcb676737496f3fa35b1f04c1
tree	e79287003ac102c1e1ce7b3764f3facfe2d2c7bf
parent	0e71ae6da8f3142f453267d4f1668b0d6d77bec5

[X86][Costmodel] Load/store i8 Stride=4 VF=32 interleaving costs

While we already model this tuple, the load cost is divergent from reality, so fix it.
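
For readers unfamiliar with the tuple, here is a scalar sketch of what an i8 Stride=4 VF=32 interleaved access does: one group covers 4 * 32 = 128 contiguous bytes, which the load splits into four 32-element vectors and the store merges back. The names `kStride`, `kVF`, `Lane`, `deinterleave` and `interleave` are illustrative only and do not appear in the patch; the actual lowering uses AVX2 shuffles, not loops.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative constants (not from the patch): Stride=4, VF=32.
constexpr std::size_t kStride = 4;
constexpr std::size_t kVF = 32;

using Lane = std::array<std::uint8_t, kVF>;
using Group = std::array<std::uint8_t, kStride * kVF>;

// Load side: split 128 contiguous bytes into 4 vectors of 32 bytes,
// where vector m receives every 4th byte starting at offset m.
std::array<Lane, kStride> deinterleave(const Group &src) {
  std::array<Lane, kStride> out{};
  for (std::size_t i = 0; i < kVF; ++i)
    for (std::size_t m = 0; m < kStride; ++m)
      out[m][i] = src[i * kStride + m];
  return out;
}

// Store side: the inverse, merging 4 vectors back into 128 contiguous bytes.
Group interleave(const std::array<Lane, kStride> &lanes) {
  Group dst{};
  for (std::size_t i = 0; i < kVF; ++i)
    for (std::size_t m = 0; m < kStride; ++m)
      dst[i * kStride + m] = lanes[m][i];
  return dst;
}
```

The costs below are estimates for the shuffle sequences that implement these two operations on whole vectors.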

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/zWMhhnPYa - on Intel, `Block RThroughput: =56.0`; on Ryzen, `Block RThroughput: <=24.0`
So pick cost of `56`.

For store we have:
https://godbolt.org/z/vnqqjWx51 - on Intel, `Block RThroughput: =12.0`; on Ryzen, `Block RThroughput: <=4.0`
So pick cost of `12`.
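
The message does not spell out the selection rule, but both picks are consistent with taking the worst reported `Block RThroughput` across the relevant sched models. A minimal sketch of that reading follows; `Measurement` and `pickCost` are hypothetical names and are not part of LLVM, and the inputs are the figures quoted above.

```cpp
#include <algorithm>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical record: one reciprocal-throughput figure per CPU family.
struct Measurement {
  std::string family;
  double rthroughput;
};

// Assumed rule: use the largest (slowest) figure, rounded up to an integer.
unsigned pickCost(const std::vector<Measurement> &ms) {
  double worst = 0.0;
  for (const Measurement &m : ms)
    worst = std::max(worst, m.rthroughput);
  return static_cast<unsigned>(std::ceil(worst));
}
```

With the load figures (56.0 on Intel, at most 24.0 on Ryzen) this yields 56, and with the store figures (12.0 and at most 4.0) it yields 12, matching the costs chosen here.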

I'm directly using the shuffling asm that llc produced,
without any of the manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110971

llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/interleaved-load-i8-stride-4.ll