widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
author Jakub Jelinek <jakub@redhat.com>
Mon, 17 Jan 2022 12:39:05 +0000 (13:39 +0100)
committer Jakub Jelinek <jakub@redhat.com>
Mon, 17 Jan 2022 12:39:05 +0000 (13:39 +0100)
commit 463d9108766dcbb6a1051985e6c840a46897fe10
tree 5e83cb1ed249317106bf8059c1eb133fdf0cb84f
parent 4152e4ad3f3a67ce30f5e0e01d5eba03fcff10b8

C++20:
 #include <compare>
 auto cmp4way(double a, double b)
 {
   return a <=> b;
 }
expands to:
        ucomisd %xmm1, %xmm0
        jp      .L8
        movl    $0, %eax
        jne     .L8
.L2:
        ret
        .p2align 4,,10
        .p2align 3
.L8:
        comisd  %xmm0, %xmm1
        movl    $-1, %eax
        ja      .L2
        ucomisd %xmm1, %xmm0
        setbe   %al
        addl    $1, %eax
        ret
That is 3 comparisons of the same operands.
The following patch improves it to just one comparison:
        comisd  %xmm1, %xmm0
        jp      .L4
        seta    %al
        movl    $0, %edx
        leal    -1(%rax,%rax), %eax
        cmove   %edx, %eax
        ret
.L4:
        movl    $2, %eax
        ret
a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2.
The first comparison is an equality test, which on its own shouldn't
raise exceptions on qNaN operands.  But if the operands aren't equal
(which includes the unordered case), the expansion immediately performs
a < or > comparison, and that raises exceptions even on qNaNs.  So we
can just perform a single comparison that raises exceptions on qNaN.
As the 4 different cases are encoded as
ZF CF PF
1  1  1  a unordered b
0  0  0  a > b
0  1  0  a < b
1  0  0  a == b
we can emit an optimal sequence of comparisons: first jp
for the unordered case, then je for the == case and finally jb
for the < case.
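
The flag table can be checked with a small model.  The sketch below is an
illustration only: Flags, model_comisd, decode_with_jumps and
decode_branchless are hypothetical names, and the model reproduces just
the ZF/CF/PF table above, not the exception behaviour of the real
instruction.  The first decoder follows the jp/je/jb order described in
the text, the second mirrors the seta/leal/cmove listing shown earlier.

```cpp
#include <cmath>

// Hypothetical model of the three flags from the table above.
struct Flags { bool zf, cf, pf; };

// Hypothetical model of one comparison of a against b; only the
// flag outcomes from the table are modelled.
Flags model_comisd(double a, double b)
{
  if (std::isnan(a) || std::isnan(b))
    return {true, true, true};      // a unordered b
  if (a > b)
    return {false, false, false};   // a > b
  if (a < b)
    return {false, true, false};    // a < b
  return {true, false, false};      // a == b
}

// Decode in the order given in the text: jp, then je, then jb.
int decode_with_jumps(Flags f)
{
  if (f.pf) return 2;    // jp: unordered
  if (f.zf) return 0;    // je: equal
  if (f.cf) return -1;   // jb: less
  return 1;              // otherwise greater
}

// Decode the way the improved listing does, after the jp.
int decode_branchless(Flags f)
{
  if (f.pf) return 2;                   // jp .L4
  int above = (!f.cf && !f.zf) ? 1 : 0; // seta: 1 if a > b, else 0
  int r = 2 * above - 1;                // leal -1(%rax,%rax): 1 or -1
  return f.zf ? 0 : r;                  // cmove: 0 when equal
}
```

Both decoders should agree on all four rows of the table, which is why a
single flag-setting comparison is enough.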

The patch pattern recognizes spaceship-like comparisons during
widening_mul if the spaceship optab is implemented, and replaces
those comparisons with comparisons of the result of the .SPACESHIP
internal function, which returns -1/0/1/2 based on the comparison.
This seems to work well both when the -1/0/1/2 value is simply returned
(a single common successor with a PHI) and when the different cases are
handled by various other basic blocks.  The testcases cover both of
those cases, the latter with different function calls in those blocks.

2022-01-17  Jakub Jelinek  <jakub@redhat.com>

PR target/103973
* tree-cfg.h (cond_only_block_p): Declare.
* tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
* tree-cfg.c (cond_only_block_p): ... here.  No longer static.
* optabs.def (spaceship_optab): New optab.
* internal-fn.def (SPACESHIP): New internal function.
* internal-fn.h (expand_SPACESHIP): Declare.
* internal-fn.c (expand_PHI): Formatting fix.
(expand_SPACESHIP): New function.
* tree-ssa-math-opts.c (optimize_spaceship): New function.
(math_opts_dom_walker::after_dom_children): Use it.
* config/i386/i386.md (spaceship<mode>3): New define_expand.
* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
* config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
* doc/md.texi (spaceship@var{m}3): Document.

* gcc.target/i386/pr103973-1.c: New test.
* gcc.target/i386/pr103973-2.c: New test.
* gcc.target/i386/pr103973-3.c: New test.
* gcc.target/i386/pr103973-4.c: New test.
* gcc.target/i386/pr103973-5.c: New test.
* gcc.target/i386/pr103973-6.c: New test.
* gcc.target/i386/pr103973-7.c: New test.
* gcc.target/i386/pr103973-8.c: New test.
* gcc.target/i386/pr103973-9.c: New test.
* gcc.target/i386/pr103973-10.c: New test.
* gcc.target/i386/pr103973-11.c: New test.
* gcc.target/i386/pr103973-12.c: New test.
* gcc.target/i386/pr103973-13.c: New test.
* gcc.target/i386/pr103973-14.c: New test.
* gcc.target/i386/pr103973-15.c: New test.
* gcc.target/i386/pr103973-16.c: New test.
* gcc.target/i386/pr103973-17.c: New test.
* gcc.target/i386/pr103973-18.c: New test.
* gcc.target/i386/pr103973-19.c: New test.
* gcc.target/i386/pr103973-20.c: New test.
* g++.target/i386/pr103973-1.C: New test.
* g++.target/i386/pr103973-2.C: New test.
* g++.target/i386/pr103973-3.C: New test.
* g++.target/i386/pr103973-4.C: New test.
* g++.target/i386/pr103973-5.C: New test.
* g++.target/i386/pr103973-6.C: New test.
* g++.target/i386/pr103973-7.C: New test.
* g++.target/i386/pr103973-8.C: New test.
* g++.target/i386/pr103973-9.C: New test.
* g++.target/i386/pr103973-10.C: New test.
* g++.target/i386/pr103973-11.C: New test.
* g++.target/i386/pr103973-12.C: New test.
* g++.target/i386/pr103973-13.C: New test.
* g++.target/i386/pr103973-14.C: New test.
* g++.target/i386/pr103973-15.C: New test.
* g++.target/i386/pr103973-16.C: New test.
* g++.target/i386/pr103973-17.C: New test.
* g++.target/i386/pr103973-18.C: New test.
* g++.target/i386/pr103973-19.C: New test.
* g++.target/i386/pr103973-20.C: New test.
52 files changed:
gcc/config/i386/i386-expand.c
gcc/config/i386/i386-protos.h
gcc/config/i386/i386.md
gcc/doc/md.texi
gcc/internal-fn.c
gcc/internal-fn.def
gcc/internal-fn.h
gcc/optabs.def
gcc/testsuite/g++.target/i386/pr103973-1.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-10.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-11.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-12.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-13.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-14.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-15.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-16.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-17.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-18.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-19.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-2.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-20.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-3.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-4.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-5.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-6.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-7.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-8.C [new file with mode: 0644]
gcc/testsuite/g++.target/i386/pr103973-9.C [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-10.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-11.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-12.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-13.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-14.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-15.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-16.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-17.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-18.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-19.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-20.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-7.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-8.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr103973-9.c [new file with mode: 0644]
gcc/tree-cfg.c
gcc/tree-cfg.h
gcc/tree-ssa-math-opts.c
gcc/tree-ssa-phiopt.c