Use global flag 0.0 to control unstructured simple blocks.
Filter out the simple blocks and replace their if/endif pairs with
control through the global flag.
v2: fix the LuxMark Sala performance regression caused by the extern
flag in a BRA instruction.
v3: fix the compiler_switch regression: LOAD/STORE instructions could
call replaceSrc/replaceDst, which generates 2 extra MOV instructions;
exclude the scalar instructions since they don't have predication.
This patch is somewhat dangerous because it changes the instruction
structure of a block; a sanity check will be added after
emitInstructionStream to assert if the block is illegally modified.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>