review.tizen.org Git - platform/upstream/llvm.git/commit

author	Sanjay Patel <spatel@rotateright.com>
	Sun, 28 Apr 2019 12:23:43 +0000 (12:23 +0000)
committer	Sanjay Patel <spatel@rotateright.com>
	Sun, 28 Apr 2019 12:23:43 +0000 (12:23 +0000)
commit	fb9a5307a94e6f1f850e4d89f79103b123f16279
tree	31c89d47d039aa4f08883d8f130751ce8034f48d
parent	43003f0fec7f03e39b065e484b3153bbceec9d52

[DAGCombiner] try repeated fdiv divisor transform before building estimate

This was originally part of D61028, but it's an independent diff.

If we try the repeated-divisor reciprocal transform before producing an estimate sequence,
then we have an opportunity to use a scalar fdiv. On x86, the trade-off is 1 divss vs. 5
vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen),
full-precision division has a throughput of only 3 cycles, so that is probably the better
default for performance, and it avoids problems caused by x86's inaccurate estimates.
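
For illustration only, here is a minimal C++ sketch of the arithmetic idea behind the
repeated-divisor transform. It is not the DAGCombiner code. The names Vec4,
divide_per_lane and divide_via_reciprocal are hypothetical, and the 4-lane array merely
stands in for a vector register whose lanes are all divided by the same scalar.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4 x float vector value.
using Vec4 = std::array<float, 4>;

// Baseline: every lane performs its own division by the shared divisor y.
Vec4 divide_per_lane(const Vec4& x, float y) {
    Vec4 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = x[i] / y;
    }
    return out;
}

// Repeated-divisor form: one scalar division computes 1/y, then every lane
// multiplies by that reciprocal (conceptually, a splat followed by a vector multiply).
Vec4 divide_via_reciprocal(const Vec4& x, float y) {
    const float recip = 1.0f / y;  // the single full-precision scalar divide
    Vec4 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = x[i] * recip;
    }
    return out;
}
```

The two functions can differ in the last bit for general inputs, because 1/y is rounded
before the multiply. That is why a compiler should only do this rewrite when reciprocal
math is explicitly permitted.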

The last 2 tests show that users still have the option to override the defaults by using
the function attributes for reciprocal estimates, but those patterns are potentially made
faster by converting the vector ops (including ymm ops) to scalar math.

Differential Revision: https://reviews.llvm.org/D61149

llvm-svn: 359398

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/test/CodeGen/X86/fdiv-combine-vec.ll