agx: Optimize swaps of 2x16 channels
author Alyssa Rosenzweig <alyssa@rosenzweig.io>
Mon, 31 Jul 2023 21:19:56 +0000 (17:19 -0400)
committer Marge Bot <emma+marge@anholt.net>
Fri, 11 Aug 2023 20:31:27 +0000 (20:31 +0000)
commit d459de85b75842135372191af4d9dab2d75c65b3
tree 21c36431a448fe9a607e33b25002d513c7eab270
parent efbdc31ce55ea01c1443a9c244c372a648787b12

We can use extr to swap the low and high halves of a 32-bit register in one
instruction.
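
As a rough illustration of why one extract can replace a swap sequence, here is a
minimal sketch in plain C. It assumes extr behaves like a funnel shift (treat two
32-bit sources as one 64-bit value, shift right, keep the low 32 bits); that
model, and the names extr_model, swap_halves_extr and swap_halves_xor, are
hypothetical and are not taken from the AGX ISA or the Mesa sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a funnel-shift extract: view hi:lo as one 64-bit
 * value, shift it right, and keep the low 32 bits. This is an assumption
 * for illustration, not the real AGX instruction encoding. */
static uint32_t extr_model(uint32_t lo, uint32_t hi, unsigned shift)
{
   assert(shift < 32u);
   uint64_t wide = ((uint64_t)hi << 32) | lo;
   return (uint32_t)(wide >> shift);
}

/* Swap the two 16-bit halves of a 32-bit register with a single extract:
 * passing the same register as both sources makes it a rotate by 16. */
static uint32_t swap_halves_extr(uint32_t reg)
{
   return extr_model(reg, reg, 16);
}

/* Baseline for comparison: the classic three-XOR swap of the two halves. */
static uint32_t swap_halves_xor(uint32_t reg)
{
   uint16_t lo = (uint16_t)(reg & 0xffffu);
   uint16_t hi = (uint16_t)(reg >> 16);

   lo ^= hi;
   hi ^= lo;
   lo ^= hi;

   return ((uint32_t)hi << 16) | lo;
}
```

Under this model both helpers return the same value for any input, but the
extract form is one operation where the XOR form is three.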

No shader-db changes, but it reduces xors on a deqp I'm looking at. Yes, I'm
procrastinating on debugging deqps, how'd you guess?

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>
src/asahi/compiler/agx_lower_parallel_copy.c
src/asahi/compiler/test/test-lower-parallel-copy.cpp