From: Alyssa Rosenzweig
Date: Sat, 29 Jul 2023 22:49:41 +0000 (-0400)
Subject: gallium/u_simple_shaders: Optimize out ffloors
X-Git-Tag: upstream/23.3.3~4885
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=18b2daa1362b738e0c8ad06f2de9535ab79d5d84;p=platform%2Fupstream%2Fmesa.git

f2i(ffloor(x)) can't be optimized to f2i(x) because floor and
round-to-zero conversion differ for negative x. u_blitter only uses this
path with nonnegative x, so we can emit f2i(ftrunc(x)) instead, which NIR
will optimize to f2i(x) for us. This gets rid of the silly ffloor
instructions in blit shaders.

Signed-off-by: Alyssa Rosenzweig
Reviewed-by: Marek Olšák
Part-of:
---

diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c b/src/gallium/auxiliary/util/u_simple_shaders.c
index 35173ca..9ecb041 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -211,7 +211,9 @@ ureg_load_tex(struct ureg_program *ureg, struct ureg_dst out,
       /* Nearest filtering floors and then converts to integer, and then
        * applies clamp to edge as clamp(coord, 0, dim - 1).
        * u_blitter only uses this when the coordinates are in bounds,
-       * so no clamping is needed.
+       * so no clamping is needed and we can use trunc instead of floor. trunc
+       * with f2i will get optimized out in NIR where f2i has round-to-zero
+       * behaviour already.
        */
       unsigned wrmask = tex_target == TGSI_TEXTURE_1D ||
                         tex_target == TGSI_TEXTURE_1D_ARRAY ? TGSI_WRITEMASK_X :
@@ -219,7 +221,7 @@ ureg_load_tex(struct ureg_program *ureg, struct ureg_dst out,
                                                         TGSI_WRITEMASK_XY;

       ureg_MOV(ureg, temp, coord);
-      ureg_FLR(ureg, ureg_writemask(temp, wrmask), ureg_src(temp));
+      ureg_TRUNC(ureg, ureg_writemask(temp, wrmask), ureg_src(temp));
       ureg_F2I(ureg, temp, ureg_src(temp));

       if (load_level_zero)