review.tizen.org Git - platform/upstream/llvm.git/commit

author	Roman Lebedev <lebedev.ri@gmail.com>
	Sat, 2 Oct 2021 10:40:04 +0000 (13:40 +0300)
committer	Roman Lebedev <lebedev.ri@gmail.com>
	Sat, 2 Oct 2021 10:40:20 +0000 (13:40 +0300)
commit	74e4a0e327579bfc3b00f6af0c9fd408c5843e8b
tree	7858028a85b23ad0dbef37abe09804899b90c190
parent	ae08362cb8e60864a0505af47189d6a996cfb5d9

[X86][Costmodel] Load/store i8 Stride=4 VF=8 interleaving costs

While we already model this tuple, the values diverge from reality, so fix them.

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/v7746Wcf7 - for intels `Block RThroughput: =12.0`; for ryzens, `Block RThroughput: <=6.0`
So pick cost of `12`.

For store we have:
https://godbolt.org/z/aEeEohEbP - for intels `Block RThroughput: =4.0`; for ryzens, `Block RThroughput: <=2.0`
So pick cost of `4`.
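
The selection rule used above (take the worst `Block RThroughput` across the relevant sched models) can be sketched as follows. This is only an illustration, not code from the patch: `Measurement`, `pickCost`, `loadCost` and `storeCost` are hypothetical names, and the zen entries use the reported upper bounds (`<=6.0`, `<=2.0`) as if they were exact values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record: one llvm-mca result for one scheduling model.
struct Measurement {
  const char *Model;   // -mcpu name of the sched model
  double RThroughput;  // "Block RThroughput" reported by llvm-mca
};

// Return the worst (largest) reciprocal throughput over all models,
// rounded up to an integer cost.
inline unsigned pickCost(const std::vector<Measurement> &Ms) {
  assert(!Ms.empty() && "need at least one measurement");
  double Worst = 0.0;
  for (const Measurement &M : Ms)
    Worst = std::max(Worst, M.RThroughput);
  return static_cast<unsigned>(std::ceil(Worst));
}

// i8 Stride=4 VF=8 load: 12.0 on the Intel models, at most 6.0 on zen1-3.
inline unsigned loadCost() {
  return pickCost({{"haswell", 12.0}, {"broadwell", 12.0}, {"skylake", 12.0},
                   {"znver1", 6.0},   {"znver2", 6.0},     {"znver3", 6.0}});
}

// i8 Stride=4 VF=8 store: 4.0 on the Intel models, at most 2.0 on zen1-3.
inline unsigned storeCost() {
  return pickCost({{"haswell", 4.0}, {"broadwell", 4.0}, {"skylake", 4.0},
                   {"znver1", 2.0},  {"znver2", 2.0},    {"znver3", 2.0}});
}
```

Under these assumptions `loadCost()` yields 12 and `storeCost()` yields 4, matching the costs picked here. Because the zen numbers are upper bounds, they cannot raise the maximum above the Intel values.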

I'm directly using the shuffling asm that llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110969

llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/interleaved-load-i8-stride-4.ll
llvm/test/Analysis/CostModel/X86/interleaved-store-i8-stride-4.ll