[SLP] avoid reduction transform on patterns that the backend can load-combine

I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708
https://bugs.llvm.org/show_bug.cgi?id=43146

We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognize patterns that could be combined later, but not perform the optimization
itself (it's not a vector combine anyway, so it's probably out of scope for SLP).
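
For illustration, the kind of source pattern at stake can be sketched in C++ as
below. This is an assumed example, not code from the patch or its tests; the
function names load64le and load64be are hypothetical. Each function ORs together
eight zero-extended, shifted byte loads, which is the shape a backend can
typically fold into one 64-bit load (byte-swapped in the second case).

```cpp
#include <cstdint>

// Assemble a 64-bit value from 8 consecutive bytes, least-significant byte
// first. Each iteration is a zero-extended byte load, shifted and OR'ed in.
std::uint64_t load64le(const unsigned char *p) {
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
  return v;
}

// Same bytes, most-significant byte first (the byte-swapped variant).
std::uint64_t load64be(const unsigned char *p) {
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v |= static_cast<std::uint64_t>(p[i]) << (8 * (7 - i));
  return v;
}
```

Both functions are endianness-independent at the source level; whether a given
compiler and target actually emit a single load for them is up to the backend.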

Here, we add a scalar cost model adjustment with a conservative pattern match and cost
summation for a multi-instruction sequence that can probably be reduced later.
This should prevent SLP from creating a vector reduction unless that sequence is
extremely cheap.
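
As a rough sketch of the idea (not the code in SLPVectorizer.cpp), the match and
cost summation could look like the following on a toy expression tree. Everything
here is an assumption made for illustration: the Node type, the unit instruction
cost kInstCost, and the rule that a matched sequence is costed as a single load
are hypothetical and do not reflect LLVM's IR classes or TargetTransformInfo.

```cpp
#include <vector>

// Hypothetical miniature IR; these are NOT LLVM's classes.
enum class Op { Load, ZExt, Shl, Or, Other };

struct Node {
  Op op;
  unsigned shiftAmt;                   // used only when op == Op::Shl
  std::vector<const Node *> operands;  // Shl carries its amount in shiftAmt
};

// Assumed unit cost for every scalar instruction.
constexpr int kInstCost = 1;

// Sum the cost of every instruction in the tree (no node sharing assumed).
int treeCost(const Node *n) {
  int c = kInstCost;
  for (const Node *op : n->operands)
    c += treeCost(op);
  return c;
}

// Leaf of the reduction: zext(load), optionally shifted left by a multiple
// of 8 bits.
bool isShiftedByteLoad(const Node *n) {
  if (n->op == Op::Shl) {
    if (n->operands.size() != 1 || n->shiftAmt % 8 != 0)
      return false;
    n = n->operands[0];
  }
  return n->op == Op::ZExt && n->operands.size() == 1 &&
         n->operands[0]->op == Op::Load;
}

// Pattern match: a tree of 'or' whose leaves are all shifted byte loads.
// Address adjacency is deliberately not checked in this sketch.
bool isLoadCombineCandidate(const Node *n, unsigned &numLoads) {
  if (n->op == Op::Or) {
    if (n->operands.size() != 2)
      return false;
    return isLoadCombineCandidate(n->operands[0], numLoads) &&
           isLoadCombineCandidate(n->operands[1], numLoads);
  }
  if (!isShiftedByteLoad(n))
    return false;
  ++numLoads;
  return true;
}

// Adjusted scalar cost: if the whole sequence can probably be combined later,
// assume it collapses to one wide load; otherwise use the plain sum.
int adjustedScalarCost(const Node *root) {
  unsigned numLoads = 0;
  if (isLoadCombineCandidate(root, numLoads) && numLoads >= 2)
    return kInstCost;
  return treeCost(root);
}
```

With a lower scalar cost for matched sequences, a vector reduction has to be
correspondingly cheaper before it looks profitable in this toy model.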

For the x86 tests shown (and discussed in more detail in the bug reports), SDAG
combining will produce a single instruction such as:

  movbe   rax, qword ptr [rdi]

or:

  mov     rax, qword ptr [rdi]

rather than the (half-)vector monstrosity that we currently produce with SLP:

  vpmovzxbq       ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
  vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
  movzx   eax, byte ptr [rdi]
  movzx   ecx, byte ptr [rdi + 5]
  shl     rcx, 40
  movzx   edx, byte ptr [rdi + 6]
  shl     rdx, 48
  or      rdx, rcx
  movzx   ecx, byte ptr [rdi + 7]
  shl     rcx, 56
  or      rcx, rdx
  or      rcx, rax
  vextracti128    xmm1, ymm0, 1
  vpor    xmm0, xmm0, xmm1
  vpshufd xmm1, xmm0, 78          # xmm1 = xmm0[2,3,0,1]
  vpor    xmm0, xmm0, xmm1
  vmovq   rax, xmm0
  or      rax, rcx
  vzeroupper
  ret

Differential Revision: https://reviews.llvm.org/D67841

llvm-svn: 373833

author	Sanjay Patel <spatel@rotateright.com>
	Sat, 5 Oct 2019 18:03:58 +0000 (18:03 +0000)
committer	Sanjay Patel <spatel@rotateright.com>
	Sat, 5 Oct 2019 18:03:58 +0000 (18:03 +0000)
commit	e2321bb4488a81b87742f3343e3bdf8e161aa35b
tree	48e6260a743b8adf2a2866d6250955e09c2ce8a6
parent	9ecacb0d54fb89dc7e6da66d9ecae934ca5c01d4

llvm/include/llvm/Analysis/TargetTransformInfo.h
llvm/lib/Analysis/TargetTransformInfo.cpp
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
llvm/test/Transforms/SLPVectorizer/X86/bad-reduction.ll