intel/fs: Allow copy propagation between MOVs of mixed sizes
This eliminates some spurious, size-converting moves. For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
Unfortunately, this doesn't clean everything up. Here's a subset of the
"before" assembly:
send(8) g11<1>UW g2<0,1,0>UD 0x02106e02
dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8) g7<4>UB g11<8,8,1>UD { align1 1Q };
mov(8) g12<1>UB g7<32,8,4>UB { align1 1Q };
send(8) g13<1>UW g2<0,1,0>UD 0x02106e03
dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8) g15<1>UW g12<8,8,1>UB { align1 1Q };
mov(8) g8<4>UB g13<8,8,1>UD { align1 1Q };
mov(8) g14<1>UB g8<32,8,4>UB { align1 1Q };
mov(8) g16<1>UW g14<8,8,1>UB { align1 1Q };
xor(8) g17<1>UW g15<8,8,1>UW g16<8,8,1>UW { align1 1Q };
And here's the same subset of the "after" assembly:
send(8) g11<1>UW g2<0,1,0>UD 0x02106e02
dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8) g7<4>UB g11<8,8,1>UD { align1 1Q };
send(8) g13<1>UW g2<0,1,0>UD 0x02106e03
dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8) g15<1>UW g7<32,8,4>UB { align1 1Q };
mov(8) g8<4>UB g13<8,8,1>UD { align1 1Q };
mov(8) g16<1>UW g8<32,8,4>UB { align1 1Q };
xor(8) g17<1>UW g15<8,8,1>UW g16<8,8,1>UW { align1 1Q };
There are a lot of regioning and type restrictions in
fs_visitor::try_copy_propagate, and I'm a little nervious about messing
with them too much.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>