[TTI] Add DemandedElts to getScalarizationOverhead
authorSimon Pilgrim <llvm-dev@redking.me.uk>
Wed, 29 Apr 2020 10:39:13 +0000 (11:39 +0100)
committerSimon Pilgrim <llvm-dev@redking.me.uk>
Wed, 29 Apr 2020 11:00:38 +0000 (12:00 +0100)
commit090cae8491279126690b9530ce72b7f8fdb1dc9e
tree75854704ef7ddc648da968f0d5cabc27dd03ae40
parent42a56bf63f699a620a57c34474510d9937ebf715
[TTI] Add DemandedElts to getScalarizationOverhead

The improvements to the x86 vector insert/extract element costs in D74976 resulted in the estimated costs for vector initialization and scalarization increasing higher than should be expected. This is particularly noticeable on pre-SSE4 targets where the available of legal INSERT_VECTOR_ELT ops is more limited.

This patch does 2 things:
1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of a ISD::BUILD_VECTOR pattern.
2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs.

This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing.

A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well.

Reviewed By: @craig.topper

Differential Revision: https://reviews.llvm.org/D78216
22 files changed:
llvm/include/llvm/Analysis/TargetTransformInfo.h
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
llvm/include/llvm/CodeGen/BasicTTIImpl.h
llvm/lib/Analysis/TargetTransformInfo.cpp
llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/lib/Target/X86/X86TargetTransformInfo.h
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
llvm/test/Analysis/CostModel/X86/arith-fp.ll
llvm/test/Analysis/CostModel/X86/fptosi.ll
llvm/test/Analysis/CostModel/X86/fptoui.ll
llvm/test/Analysis/CostModel/X86/fround.ll
llvm/test/Analysis/CostModel/X86/intrinsic-cost.ll
llvm/test/Analysis/CostModel/X86/load_store.ll
llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost.ll
llvm/test/Analysis/CostModel/X86/sitofp.ll
llvm/test/Transforms/LoopVectorize/X86/strided_load_cost.ll
llvm/test/Transforms/SLPVectorizer/X86/minimum-sizes.ll
llvm/test/Transforms/SLPVectorizer/X86/resched.ll
llvm/test/Transforms/SLPVectorizer/X86/vectorize-reorder-reuse.ll