Currently we use the link register to branch up high in the early MMU on
syscall entry path. Unfortunately, this trashes the link stack as the
address we are going to is not associated with the earlier mflr.
This patch simply converts us to used the count register (volatile over
syscalls anyway) instead. This is much better at predicting in this
scenario and doesn't trash link stack causing a bunch of additional
branch mispredicts later. Benchmarking this on POWER8 saves a bunch of
cycles on Anton's null syscall benchmark here:
http://ozlabs.org/~anton/junkcode/null_syscall.c
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mflr r10 ; \
ld r12,PACAKBASE(r13) ; \
LOAD_HANDLER(r12, system_call_entry_direct) ; \
- mtlr r12 ; \
+ mtctr r12 ; \
mfspr r12,SPRN_SRR1 ; \
/* Re-use of r13... No spare regs to do this */ \
li r13,MSR_RI ; \
mtmsrd r13,1 ; \
GET_PACA(r13) ; /* get r13 back */ \
- blr ;
+ bctr ;
#else
/* We can branch directly */
#define SYSCALL_PSERIES_2_DIRECT \