ARM: proc-v7: Force misalignment of early stmia
In an attempt to prevent the problem of CPUn not starting, explicitly
misalign the scratch space used to save registers acros the cache
invalidation.
Notes:
At this stage in the boot process the core is running with its cache
disabled. Before enabling the cache its contents must be explicitly
invalidated, a process that requires quite a few registers that the
caller must preserve. Evidence suggests that something is writing a
block of zeroes over that space at a time when all other cores should
be idle, possibly some kind of write-combiner, and the misalignment is
designed to disrupt any write-coalescing.
In truth, I don't understand why this patch works, and when the failure
is so random it is hard to be certain that this isn't just rolling the
dice again. One interesting test would be to change the "addeq r12, #4"s
to "addeq r12, #0"s determine see if the offset itself is significant or
just the additional code.
See: https://github.com/Hexxeh/rpi-firmware/issues/232
Signed-off-by: Phil Elwell <phil@raspberrypi.com>