Improve loop inversion (#52347)
* Improve loop inversion
When doing loop inversion, we duplicate the condition block to the
top of the loop to create a fall-through zero trip test. Improve
this by redirecting every branch into the condition block that
appears to come from outside the potential loop body so that it
targets the new, duplicated condition block instead. This lets the
loop recognizer find loops that it previously rejected as
"multi-entry".
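The redirect described above can be sketched on a toy control flow
graph. This is only an illustration, not the JIT's code: the Block
struct, the block numbering, and the helper names
(loopEntryTargets, redirectOutsideBranches, makeInvertedCfg) are
all hypothetical, and the loop body is assumed to be known up front.

```cpp
#include <set>
#include <vector>

// Hypothetical toy CFG node; not the JIT's BasicBlock.
struct Block {
    int id;
    std::vector<int> succs; // ids of successor blocks
};

// Collect the distinct loop blocks that are targeted by edges coming
// from outside the loop. More than one target means "multi-entry".
std::set<int> loopEntryTargets(const std::vector<Block>& blocks,
                               const std::set<int>& loop) {
    std::set<int> targets;
    for (const Block& b : blocks) {
        if (loop.count(b.id)) continue; // only look at outside blocks
        for (int s : b.succs) {
            if (loop.count(s)) targets.insert(s);
        }
    }
    return targets;
}

// Retarget every edge into `cond` that originates outside the loop
// so that it goes to the duplicated condition `dupCond` instead.
// Returns the number of edges that were redirected.
int redirectOutsideBranches(std::vector<Block>& blocks,
                            const std::set<int>& loop,
                            int cond, int dupCond) {
    int redirected = 0;
    for (Block& b : blocks) {
        if (b.id == dupCond || loop.count(b.id)) continue;
        for (int& s : b.succs) {
            if (s == cond) {
                s = dupCond;
                ++redirected;
            }
        }
    }
    return redirected;
}

// Assumed example: the condition (3) has already been duplicated
// into block 6, but outside block 5 still branches to block 3.
std::vector<Block> makeInvertedCfg() {
    return {
        {0, {6}},    // head: falls into the duplicated condition
        {1, {2}},    // loop top
        {2, {3}},    // loop body
        {3, {1, 4}}, // original condition, now the bottom test
        {4, {}},     // loop exit
        {5, {3}},    // outside block still branching to the condition
        {6, {1, 4}}, // duplicated condition (zero trip test)
    };
}
```

With the loop taken as blocks {1, 2, 3}, loopEntryTargets reports
two entry targets (blocks 1 and 3) before the redirect and only
block 1 after calling redirectOutsideBranches(cfg, loop, 3, 6).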
There are good diffs where more loops are detected, leading to
better optimization and more loop alignment. There are also
some asm diff regressions.
* Formatting
* Updates
1. Allow the scratch block to be the loop head. Because we introduce
a new block to hold the duplicated condition, the scratch block
becomes a BBJ_NONE, which is fine.
2. Fix an issue on x86 where a catch return, which is a BBJ_ALWAYS
on x86, can't be the "head" block of the loop.