Optimize vpermi2w/b to vpunpcklqdq for certain cases.
author liuhongt <hongtao.liu@intel.com>
Fri, 13 May 2022 01:59:13 +0000 (09:59 +0800)
committer liuhongt <hongtao.liu@intel.com>
Tue, 17 May 2022 01:30:22 +0000 (09:30 +0800)
commit 105c56a8cfde6015b989ab22c20c915c1b4e69ec
tree c2d81cf0a848f12031431939e00e20ec56a193ee
parent 1fba0608d12a209a5d76d65bcb1dec1c07bc33e9

The generated assembly changes as follows:
-       vmovq   %xmm0, %xmm2
-       vmovdqa .LC0(%rip), %xmm0
        vmovq   %xmm1, %xmm1
-       vpermi2w        %xmm1, %xmm2, %xmm0
+       vmovq   %xmm0, %xmm0
+       vpunpcklqdq     %xmm1, %xmm0, %xmm0

...

-.LC0:
-       .value  0
-       .value  1
-       .value  2
-       .value  3
-       .value  8
-       .value  9
-       .value  10
-       .value  11
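
Why the replacement is valid can be illustrated with a small scalar model. The sketch below is not part of the patch: the helper names (permute2_words, unpack_low_qwords, concat_permute_equals_unpack) and the sample values are assumptions made for illustration only. It models a two-source word permute with the index vector from .LC0 and compares it with taking the low 64 bits of each source, which is what vpunpcklqdq produces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a two-source 16-bit permute on 128-bit vectors:
   an index in 0..7 selects from a, an index in 8..15 selects from b.
   Hypothetical helper, written only to illustrate the commit.  */
static void
permute2_words (uint16_t dst[8], const uint16_t a[8], const uint16_t b[8],
                const uint16_t idx[8])
{
  /* The model does not support in-place operation.  */
  assert (dst != a && dst != b);
  for (int i = 0; i < 8; i++)
    {
      unsigned sel = idx[i] & 15;
      dst[i] = sel < 8 ? a[sel] : b[sel - 8];
    }
}

/* Scalar model of vpunpcklqdq: the low 64 bits of a followed by the
   low 64 bits of b.  */
static void
unpack_low_qwords (uint16_t dst[8], const uint16_t a[8], const uint16_t b[8])
{
  memcpy (dst, a, 4 * sizeof (uint16_t));
  memcpy (dst + 4, b, 4 * sizeof (uint16_t));
}

/* Returns 1 when both models agree for the index vector stored at .LC0.  */
static int
concat_permute_equals_unpack (void)
{
  /* Arbitrary sample data; the upper four words are never selected.  */
  const uint16_t a[8] = { 0x1111, 0x2222, 0x3333, 0x4444, 9, 9, 9, 9 };
  const uint16_t b[8] = { 0x5555, 0x6666, 0x7777, 0x8888, 7, 7, 7, 7 };
  const uint16_t idx[8] = { 0, 1, 2, 3, 8, 9, 10, 11 };
  uint16_t via_permute[8];
  uint16_t via_unpack[8];

  permute2_words (via_permute, a, b, idx);
  unpack_low_qwords (via_unpack, a, b);
  return memcmp (via_permute, via_unpack, sizeof via_permute) == 0;
}
```

Because the index vector only names elements 0..3 of each source, the permute is a plain concatenation of the two low halves, so the constant-pool load of .LC0 and the permute can be dropped in favour of a single unpack.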

gcc/ChangeLog:

PR target/105033
* config/i386/sse.md (*vec_concatv4si): Extend to ..
(*vec_concat<mode>): .. V16QI and V8HImode.
(*vec_concatv16qi_permt2): New pre_reload define_insn_and_split.
(*vec_concatv8hi_permt2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr105033.c: New test.
gcc/config/i386/sse.md
gcc/testsuite/gcc.target/i386/pr105033.c [new file with mode: 0644]