sparc64: speed up etrap/rtrap on NG2 and later processors
author Anthony Yznaga <anthony.yznaga@oracle.com>
Fri, 18 Aug 2017 19:40:36 +0000 (12:40 -0700)
committer David S. Miller <davem@davemloft.net>
Sun, 10 Sep 2017 03:20:11 +0000 (20:20 -0700)
commit a7159a87a3836f61a97882e671d2d66bbb96c62e
tree 3f56de47be55b0367bb7588869f9c95ee2516409
parent 5bd0ea9107dca975dc4ba4d9de39b4938d2cb36d

For many sun4v processor types, reading or writing a privileged register
has a latency of 40 to 70 cycles.  Use a combination of the low-latency
allclean, otherw, normalw, and nop instructions in etrap and rtrap to
replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap
performance.  allclean, otherw, and normalw are available on NG2 and
later processors.
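
To illustrate why a single window-management instruction can stand in for several privileged register accesses, here is a minimal C model. It is a sketch, not the kernel code: the struct and the function otherw_slow() are hypothetical, and the semantics of otherw, normalw, and allclean are assumed from the UltraSPARC Architecture description of those instructions (otherw moves CANRESTORE into OTHERWIN, normalw moves it back, allclean sets CLEANWIN to the number of windows minus one).

```c
#include <assert.h>

/* Hypothetical model of the window-state registers involved. The field
 * names mirror the SPARC V9 register names; the struct is illustrative. */
struct winstate {
	unsigned int canrestore;
	unsigned int otherwin;
	unsigned int cleanwin;
};

/* Assumed semantics: OTHERWIN <- CANRESTORE, then CANRESTORE <- 0. */
static void otherw(struct winstate *w)
{
	w->otherwin = w->canrestore;
	w->canrestore = 0;
}

/* Assumed semantics: CANRESTORE <- OTHERWIN, then OTHERWIN <- 0. */
static void normalw(struct winstate *w)
{
	w->canrestore = w->otherwin;
	w->otherwin = 0;
}

/* Assumed semantics: CLEANWIN <- number of register windows - 1. */
static void allclean(struct winstate *w, unsigned int nwindows)
{
	assert(nwindows > 0);
	w->cleanwin = nwindows - 1;
}

/* Same end state as otherw(), but written as one register read and two
 * register writes, each standing in for a high-latency rdpr or wrpr. */
static void otherw_slow(struct winstate *w)
{
	unsigned int tmp = w->canrestore;	/* models rdpr %canrestore */

	w->canrestore = 0;			/* models wrpr to %canrestore */
	w->otherwin = tmp;			/* models wrpr to %otherwin */
}
```

In this model otherw() and otherw_slow() leave the registers in the same state, which is the property that lets one low-latency instruction replace a read and two writes.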

Average ticks to execute the flush windows trap ("ta 0x3"), with and
without this patch, on select platforms:

 CPU            Not patched     Patched    % Change in Latency

 NG2            1762            1558            -11.58
 NG4            3619            3204            -11.47
 M7             3015            2624            -12.97
 SPARC64-X      829             770              -7.12
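
The last column can be reproduced from the two tick counts. A small sketch, assuming the percentage is computed relative to the unpatched figure (the helper name pct_change is hypothetical):

```c
#include <assert.h>

/* Percent change in ticks relative to the unpatched baseline.
 * A negative result means the patched kernel takes fewer ticks. */
static double pct_change(double not_patched, double patched)
{
	assert(not_patched > 0.0);
	return (patched - not_patched) / not_patched * 100.0;
}
```

For the NG2 row, pct_change(1762, 1558) evaluates to roughly -11.58, matching the table.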

Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
arch/sparc/include/asm/trap_block.h
arch/sparc/kernel/etrap_64.S
arch/sparc/kernel/rtrap_64.S
arch/sparc/kernel/setup_64.c
arch/sparc/kernel/vmlinux.lds.S