[AArch64] Optimize more memcmp when the result is tested for [in]equality with 0
We already support the or (xor a, b), (xor c, d) pattern with D136244, but it
should capture more cases than just bcmp, according to the comment on
https://reviews.llvm.org/D136672, so this patch tries to fold a continuous
series of comparisons.
Also add a new call site in LowerSETCC to handle some cases that are folded
into an And during the `Optimized type-legalized selection` stage.
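
As a rough source-level illustration of the or (xor a, b), (xor c, d) pattern
(not the DAG combine itself), the sketch below shows why an equality-only
comparison of two 16-byte buffers can be computed without ordering information.
The function names are hypothetical and exist only for this example; the patch
works on SelectionDAG nodes, not on C++ source.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical reference: the result of memcmp is only tested against 0,
// so only equality matters, not the ordering of the buffers.
bool equal16_memcmp(const unsigned char *p, const unsigned char *q) {
  return std::memcmp(p, q, 16) == 0;
}

// Hypothetical equivalent form: load both halves, xor the matching halves,
// or the differences together, and test the combined value against 0.
bool equal16_or_xor(const unsigned char *p, const unsigned char *q) {
  std::uint64_t a, b, c, d;
  std::memcpy(&a, p, 8);      // low half of the first buffer
  std::memcpy(&b, q, 8);      // low half of the second buffer
  std::memcpy(&c, p + 8, 8);  // high half of the first buffer
  std::memcpy(&d, q + 8, 8);  // high half of the second buffer
  return ((a ^ b) | (c ^ d)) == 0;
}

// The same shape arises from a plain series of equality comparisons.
bool equal_series(std::uint64_t a, std::uint64_t b,
                  std::uint64_t c, std::uint64_t d) {
  return a == b && c == d;
}
```

Both forms return the same answer for every input, because an xor is zero
exactly when its operands are equal and an or is zero exactly when all of its
operands are zero.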
Depends on D136244
Reviewed By: dmgreen, bcl5980
Differential Revision: https://reviews.llvm.org/D137721