[CodeGen] Improve handling -Ofast generated code by ComplexDeinterleaving pass
author: Igor Kirillov <igor.kirillov@arm.com>
Mon, 17 Apr 2023 18:24:45 +0000 (18:24 +0000)
committer: Igor Kirillov <igor.kirillov@arm.com>
Wed, 31 May 2023 18:31:38 +0000 (18:31 +0000)
commit: 1a1e76100e3f99c2bf0babcab52da333c12631e2
tree: 7ee4a15b3c41deb4d457a823d9a0296b177d0407
parent: 1ca458f78e26e785b6eca2946a7558d8c39c7490

Code generated with -Ofast can differ significantly from code generated
with -O3 -ffp-contract=fast (add -ffinite-math-only to enable
vectorization). Code compiled with -O3 can be deinterleaved by pattern
matching because the instruction order is preserved. With -Ofast,
however, the computation sequence can change in multiple ways, and the
real and imaginary parts may not even be calculated in parallel.
For more details, refer to the
llvm/test/CodeGen/AArch64/complex-deinterleaving-*-fast.ll and
llvm/test/CodeGen/AArch64/complex-deinterleaving-*-contract.ll tests.
This patch implements a more general approach that handles most
-Ofast cases.
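
As a rough illustration of the reordering described above, the sketch
below computes a*b + c on interleaved complex data in two ways. It is
not taken from the patch or its tests: the function names
mulAddPaired and mulAddReassociated are hypothetical, and the second
ordering is only one example of what reassociation under fast-math
might produce. In the first function every real-part step has a matching
imaginary-part step; in the second the two parts no longer mirror each
other, which is the kind of shape a fixed pattern would miss.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Data layout (assumed for this sketch): {re0, im0, re1, im1, ...}.

// Paired order: the real and imaginary results are built by matching
// steps (multiply, combine, then add c), as in the -O3 case.
std::vector<double> mulAddPaired(const std::vector<double> &a,
                                 const std::vector<double> &b,
                                 const std::vector<double> &c) {
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i + 1 < a.size(); i += 2) {
    double ar = a[i], ai = a[i + 1];
    double br = b[i], bi = b[i + 1];
    double re = ar * br - ai * bi; // real part of a*b
    double im = ar * bi + ai * br; // imaginary part of a*b
    out[i] = re + c[i];
    out[i + 1] = im + c[i + 1];
  }
  return out;
}

// Reassociated order: mathematically the same result, but the addend c
// is folded in at different points for the real and imaginary parts,
// so the two expression trees are no longer symmetric.
std::vector<double> mulAddReassociated(const std::vector<double> &a,
                                       const std::vector<double> &b,
                                       const std::vector<double> &c) {
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i + 1 < a.size(); i += 2) {
    double ar = a[i], ai = a[i + 1];
    double br = b[i], bi = b[i + 1];
    out[i] = (c[i] - ai * bi) + ar * br;         // c combined first
    out[i + 1] = ar * bi + (ai * br + c[i + 1]); // c combined last
  }
  return out;
}
```

With small integer inputs both functions return identical values; with
general floating-point inputs the results may differ in the last bits,
which is why such reordering is only permitted under fast-math flags.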

Differential Revision: https://reviews.llvm.org/D148558
llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp
llvm/test/CodeGen/AArch64/complex-deinterleaving-add-mull-fixed-fast.ll
llvm/test/CodeGen/AArch64/complex-deinterleaving-add-mull-scalable-fast.ll
llvm/test/CodeGen/AArch64/complex-deinterleaving-mixed-cases.ll
llvm/test/CodeGen/AArch64/complex-deinterleaving-multiuses.ll
llvm/test/CodeGen/AArch64/complex-deinterleaving-uniform-cases.ll
llvm/test/CodeGen/Thumb2/mve-complex-deinterleaving-mixed-cases.ll
llvm/test/CodeGen/Thumb2/mve-complex-deinterleaving-uniform-cases.ll