From 2faf227ec2e22c7a37e0a54783a3f0a0062ac852 Mon Sep 17 00:00:00 2001
From: Kenneth Graunke
Date: Fri, 21 Apr 2017 01:28:13 -0700
Subject: [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

opt_register_coalesce() was optimizing sequences such as:

   mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
   mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
   mov(8) m4.zw:F, vgrf5.xxxy:F

into:

   mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
   mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D

This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy. But the MACH instruction expects
.z to contain attr18.x * attr19.x. Bogus results ensue.

No change in shader-db on Haswell. Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).

Acked-by: Timothy Arceri
Reviewed-by: Jason Ekstrand
---
 src/intel/compiler/brw_vec4.cpp | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 0b92ba7..4bb774b 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct gen_device_info *devinfo,
    if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
       return false;
 
+   /* Don't touch MACH - it uses the accumulator results from an earlier
+    * MUL - so we'd need to reswizzle both. We don't do that, so just
+    * avoid it entirely.
+    */
+   if (opcode == BRW_OPCODE_MACH)
+      return false;
+
    if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
       return false;
 
-- 
2.7.4