[x86] make SLM extract vector element more expensive than default
author Sanjay Patel <spatel@rotateright.com>
Wed, 27 Nov 2019 18:33:11 +0000 (13:33 -0500)
committer Sanjay Patel <spatel@rotateright.com>
Wed, 27 Nov 2019 19:08:56 +0000 (14:08 -0500)
commit 5c166f1d1969e9c1e5b72aa672add429b9c22b53
tree adf6302c8508cb2d3cf48fcf5e53eab409bfa65f
parent 5c5e860535d8924a3d6eb950bb8a4945df01e9b7

I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc

The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.
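
As a rough illustration of the table-based idea described above, the sketch below looks up a per-subtarget cost for extracting a vector element and falls back to a default when no entry matches. It is not the code from X86TargetTransformInfo.cpp: the enums, the table name, the helper function, and every cost value are hypothetical placeholders, not the numbers taken from Agner's tables.

```cpp
// Illustrative sketch only; all names and cost values are hypothetical.
#include <cstddef>

enum class ElemTy { I8, I16, I32, I64 };
enum class Subtarget { Generic, Silvermont };

struct CostEntry {
  ElemTy Ty;  // scalar element type being extracted
  int Cost;   // placeholder cost, standing in for a reciprocal-throughput estimate
};

// Placeholder overrides for a subtarget where element extraction is slow.
constexpr CostEntry SlowExtractCosts[] = {
    {ElemTy::I8, 4},
    {ElemTy::I16, 4},
    {ElemTy::I32, 4},
    {ElemTy::I64, 7},
};

// Placeholder default used when no subtarget-specific entry applies.
constexpr int DefaultExtractCost = 1;

constexpr int extractElementCost(Subtarget ST, ElemTy Ty) {
  if (ST == Subtarget::Silvermont) {
    // Linear scan is enough for a table this small.
    for (std::size_t I = 0; I < sizeof(SlowExtractCosts) / sizeof(CostEntry); ++I)
      if (SlowExtractCosts[I].Ty == Ty)
        return SlowExtractCosts[I].Cost;
  }
  // No override found: use the default cost.
  return DefaultExtractCost;
}
```

With a table like this, a vectorizer that compares scalar and vector costs would see extraction as more expensive on the slow subtarget only, while other subtargets keep the default.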

This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605

Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.

Differential Revision: https://reviews.llvm.org/D70607
12 files changed:
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/fptosi.ll
llvm/test/Analysis/CostModel/X86/fptoui.ll
llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
llvm/test/Analysis/CostModel/X86/vector-extract.ll
llvm/test/Transforms/LoopVectorize/X86/interleaving.ll
llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
llvm/test/Transforms/SLPVectorizer/X86/hadd.ll
llvm/test/Transforms/SLPVectorizer/X86/hsub.ll
llvm/test/Transforms/SLPVectorizer/X86/sext.ll
llvm/test/Transforms/SLPVectorizer/X86/zext.ll