The previous commit missed the TYP_BOOL case that the original code handled. The original code, however, skipped this on the assumption that "test" would be used, which means we end up with a "movzx" as well:
movzx rax, byte ptr [rsi+24]
test eax, eax
instead of just
cmp byte ptr [rsi+24], 0
The Intel manual actually recommends against using the "cmp mem, imm" form, but not when avoiding it requires additional instructions. This may warrant further investigation, though.
FX diff shows a 7579-byte improvement without any regressions.
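The narrowing condition this change extends can be sketched in isolation. This is only an illustrative model, not the JIT's code: SmallType, FitsInSketch and CanNarrowToUnsignedByte are hypothetical names standing in for var_types, FitsIn<T> and the check in the diff, and the sketch covers the decision only, not the tree rewriting that follows it.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the JIT's small types; not the real var_types enum.
enum class SmallType { Bool, UByte, Byte };

// Returns true when `value` is representable as T without loss.
// Assumed to play the same role as FitsIn<T> in the diff.
template <typename T>
bool FitsInSketch(int64_t value)
{
    return value >= static_cast<int64_t>(std::numeric_limits<T>::min()) &&
           value <= static_cast<int64_t>(std::numeric_limits<T>::max());
}

// Decides whether a compare of a small memory operand against `imm` can use
// a byte-sized unsigned immediate, i.e. "cmp byte ptr [mem], imm8".
bool CanNarrowToUnsignedByte(SmallType op1Type, int64_t imm)
{
    // Treating Bool like UByte is the case the commit message says was missed.
    const bool unsignedByteLike =
        (op1Type == SmallType::Bool) || (op1Type == SmallType::UByte);
    return unsignedByteLike && FitsInSketch<uint8_t>(imm);
}
```

With this model, a Bool operand compared against 0 qualifies for the single "cmp byte ptr [mem], 0" form, while a UByte operand compared against 256 does not.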
Commit migrated from https://github.com/dotnet/coreclr/commit/9bc2e775a85667fa85a1254fa4138fb98f417543
GenTreeIntCon* op2 = cmp->gtGetOp2()->AsIntCon();
ssize_t op2Value = op2->IconValue();
- if (op1->isMemoryOp() && varTypeIsSmallInt(op1Type))
+ if (op1->isMemoryOp() && varTypeIsSmall(op1Type))
{
//
// If op1's type is small then try to narrow op2 so it has the same type as op1.
// (e.g "cmp ubyte, 200") we also get a smaller instruction encoding.
//
- if ((op1Type == TYP_UBYTE) && FitsIn<UINT8>(op2Value))
+ if (((op1Type == TYP_BOOL) || (op1Type == TYP_UBYTE)) && FitsIn<UINT8>(op2Value))
{
cmp->gtFlags |= GTF_UNSIGNED;
op2->gtType = op1Type;