nir/opt_sink: Sink load_local_pixel_agx
This is the AGX version of load_output, which shaders can use for framebuffer
fetch. It is beneficial to sink framebuffer fetch as late as possible, both to
reduce register pressure but also to reduce serialization of overlapping
fragments.
Results on a collection of ubershaders:
total bytes in shared programs: 1468928 -> 1468550 (-0.03%)
bytes in affected programs: 495300 -> 494922 (-0.08%)
helped: 24
HURT: 0
Bytes are helped.
total halfregs in shared programs: 14162 -> 13946 (-1.53%)
halfregs in affected programs: 5148 -> 4932 (-4.20%)
helped: 27
HURT: 0
Halfregs are helped.
total threads in shared programs: 216896 -> 217664 (0.35%)
threads in affected programs: 6912 -> 7680 (11.11%)
helped: 12
HURT: 0
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>