From f0b6348ad06c46f3eaa6325e85e5472a292d812d Mon Sep 17 00:00:00 2001 From: Francisco Jerez Date: Mon, 23 Jan 2023 19:36:15 -0800 Subject: [PATCH] intel/eu/gfx8-9: Fix execution with all channels disabled due to HW bug #220160235. This hardware bug is the result of a control flow optimization present in Gfx8-9 meant to prevent the ELSE instruction from disabling all channels and update the control flow stack only to have them re-enabled at the ENDIF instruction executed immediately after it. Instead, on Gfx8-9 an ELSE instruction that would normally have ended up with all channels disabled would pop off the last element of the stack and jump directly to JIP+1 instead of to the ENDIF at JIP, skipping over the ENDIF instruction. In simple cases this would work okay (though it's actual performance benefit is questionable), but in cases where a branch instruction within the IF block (e.g. BREAK or CONTINUE) caused all active channels to jump outside the IF conditional, the optimization would break the JIP chain of "join" instructions by skipping the ENDIF, causing the block of instructions immediately after the ENDIF to execute with all channels disabled until execution reaches the reconvergence point. This issue was observed on SKL in the dEQP-VK.reconvergence.subgroup_uniform_control_flow_elect.compute.nesting4.0.38 test in combination with some Vulkan binding model changes Lionel is working on. In such cases the execution with all channels disabled was leading to corruption of an indirect message descriptor, causing a hang. Unfortunately the hardware bug doesn't provide a recommended workaround. In order to fix the problem we point the JIP of an ELSE instruction to the instruction immediately before the ENDIF -- However that's not expected to work due to the restriction that JIP and UIP must be equal if and only if BranchCtrl is disabled -- So this patch also enables BranchCtrl, which is intended to support join instructions within the "ELSE" block, which in turn disables the optimization described above, which in turn causes us to execute the instruction immediately *before* the ENDIF with all channels disabled -- So in order to avoid further fallout from executing code with all channels disabled we need to insert a NOP before ENDIF instructions that have a matching ELSE instruction. Reviewed-by: Lionel Landwerlin Part-of: --- src/intel/compiler/brw_eu_emit.c | 39 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 7e5caa2..4e46873 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -1612,10 +1612,27 @@ patch_IF_ELSE(struct brw_codegen *p, brw_inst_set_jip(devinfo, if_inst, br * (else_inst - if_inst + 1)); /* The IF instruction's UIP and ELSE's JIP should point to ENDIF */ brw_inst_set_uip(devinfo, if_inst, br * (endif_inst - if_inst)); - brw_inst_set_jip(devinfo, else_inst, br * (endif_inst - else_inst)); + + if (devinfo->ver >= 8 && devinfo->ver < 11) { + /* Set the ELSE instruction to use branch_ctrl with a join + * jump target pointing at the NOP inserted right before + * the ENDIF instruction in order to make sure it is + * executed in all cases, since attempting to do the same + * as on other generations could cause the EU to jump at + * the instruction immediately after the ENDIF due to + * Wa_220160235, which could cause the program to continue + * running with all channels disabled. + */ + brw_inst_set_jip(devinfo, else_inst, br * (endif_inst - else_inst - 1)); + brw_inst_set_branch_control(devinfo, else_inst, true); + } else { + brw_inst_set_jip(devinfo, else_inst, br * (endif_inst - else_inst)); + } + if (devinfo->ver >= 8) { - /* Since we don't set branch_ctrl, the ELSE's JIP and UIP both - * should point to ENDIF. + /* Since we don't set branch_ctrl on Gfx11+, the ELSE's + * JIP and UIP both should point to ENDIF on those + * platforms. */ brw_inst_set_uip(devinfo, else_inst, br * (endif_inst - else_inst)); } @@ -1672,6 +1689,22 @@ brw_ENDIF(struct brw_codegen *p) brw_inst *tmp; bool emit_endif = true; + assert(p->if_stack_depth > 0); + + if (devinfo->ver >= 8 && devinfo->ver < 11 && + brw_inst_opcode(p->isa, &p->store[p->if_stack[ + p->if_stack_depth - 1]]) == BRW_OPCODE_ELSE) { + /* Insert a NOP to be specified as join instruction within the + * ELSE block, which is valid for an ELSE instruction with + * branch_ctrl on. The ELSE instruction will be set to jump + * here instead of to the ENDIF instruction, since attempting to + * do the latter would prevent the ENDIF from being executed in + * some cases due to Wa_220160235, which could cause the program + * to continue running with all channels disabled. + */ + brw_NOP(p); + } + /* In single program flow mode, we can express IF and ELSE instructions * equivalently as ADD instructions that operate on IP. On platforms prior * to Gfx6, flow control instructions cause an implied thread switch, so -- 2.7.4