rs6000: __builtin_mma_disassemble_acc() doesn't store elements correctly in LE mode
author Peter Bergner <bergner@linux.ibm.com>
Wed, 22 Jul 2020 16:44:35 +0000 (11:44 -0500)
committer Peter Bergner <bergner@linux.ibm.com>
Wed, 22 Jul 2020 18:36:28 +0000 (13:36 -0500)
commit ae575662833d70cb7d74b9538096c7becc79af14
tree 54c823e58d7e47cc8d6752ac099a147bddabf64e
parent 6e1e0decc9e17a4283d1b5508e892be5215b8ab9

PR96236 shows a problem where we don't store our 512-bit accumulators
correctly in little-endian mode.  The patch below detects when we're doing a
little-endian memory access and stores to the correct memory locations.
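
As a rough illustration of the idea, the sketch below models a 512-bit
accumulator as four 16-byte vectors and picks the destination slot for each
one based on endianness.  It assumes that the "correct memory locations" on
little-endian are the four slots in reverse order; the commit message does not
spell out the mapping, and the names acc_slot and store_acc are hypothetical
helpers, not functions from rs6000-call.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { ACC_VECTORS = 4, VECTOR_BYTES = 16 };

/* Hypothetical helper: memory slot that accumulator vector I is stored to.
   Assumption: little-endian reverses the order of the four vectors.  */
static size_t
acc_slot (size_t i, int little_endian)
{
  assert (i < ACC_VECTORS);
  return little_endian ? (ACC_VECTORS - 1 - i) : i;
}

/* Hypothetical model of disassembling an accumulator into 64 bytes at DST.
   Each 16-byte vector is copied whole; only its slot depends on endianness.  */
static void
store_acc (unsigned char *dst,
           unsigned char acc[ACC_VECTORS][VECTOR_BYTES],
           int little_endian)
{
  for (size_t i = 0; i < ACC_VECTORS; i++)
    memcpy (dst + acc_slot (i, little_endian) * VECTOR_BYTES,
            acc[i], VECTOR_BYTES);
}
```

Under that assumption, vector 0 lands at byte offset 0 in big-endian mode and
at byte offset 48 in little-endian mode.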

2020-07-22  Peter Bergner  <bergner@linux.ibm.com>

gcc/
PR target/96236
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Handle
little-endian memory ordering.

gcc/testsuite/
PR target/96236
* gcc.target/powerpc/mma-double-test.c: Update storing results for
correct little-endian ordering.
* gcc.target/powerpc/mma-single-test.c: Likewise.
gcc/config/rs6000/rs6000-call.c
gcc/testsuite/gcc.target/powerpc/mma-double-test.c
gcc/testsuite/gcc.target/powerpc/mma-single-test.c