nfp: bpf: optimize the RMW for stack accesses
author: Jakub Kicinski <jakub.kicinski@netronome.com>
Mon, 23 Oct 2017 18:58:10 +0000 (11:58 -0700)
committer: David S. Miller <davem@davemloft.net>
Tue, 24 Oct 2017 08:38:37 +0000 (17:38 +0900)
commit: 9a90c83c09874a2fd03905ef0f73512c9de18799
tree: daa0ea3071024b29de93719e3357994588d99578
parent: a82b23fb38eaaaad89332b90029fc4cd7c3f2545

Unaligned stack accesses in the 32-64B window require a
read-modify-write cycle.  E.g. reading 8 bytes from address 17
generates:

0:  tmp    = stack[16]
1:  gprLo  = tmp >> 8
2:  tmp    = stack[20]
3:  gprLo |= tmp << 24
4:  tmp    = stack[20]
5:  gprHi  = tmp >> 8
6:  tmp    = stack[24]
7:  gprHi |= tmp << 24

The load on line 4 is unnecessary, because tmp still holds the data
loaded from stack[20] on line 2.
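
The effect can be illustrated with a small stand-alone model.  This is
only a sketch and not the code in jit.c: struct rmw_model, load_word(),
read_u32() and read_u64() are hypothetical names, and the stack is
assumed to be an array of little-endian 32-bit words, matching the
shifts in the listing above.  With reuse enabled, a load is skipped
whenever tmp already holds the requested word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: "tmp" plus a record of which stack word it holds. */
struct rmw_model {
	const uint32_t *stack;	/* stack memory as 4-byte words */
	uint32_t tmp;		/* last word loaded */
	int tmp_addr;		/* byte address held in tmp, -1 if none */
	unsigned int loads;	/* number of loads actually issued */
};

/* Load the aligned word at addr, unless reuse is on and tmp has it. */
static uint32_t load_word(struct rmw_model *m, int addr, int reuse)
{
	assert(addr % 4 == 0);
	if (!reuse || m->tmp_addr != addr) {
		m->tmp = m->stack[addr / 4];
		m->tmp_addr = addr;
		m->loads++;
	}
	return m->tmp;
}

/* Read one unaligned 32-bit value; addr must not be 4-byte aligned. */
static uint32_t read_u32(struct rmw_model *m, int addr, int reuse)
{
	int base = addr & ~3;
	int shift = (addr & 3) * 8;
	uint32_t val;

	assert(shift != 0);
	val = load_word(m, base, reuse) >> shift;
	val |= load_word(m, base + 4, reuse) << (32 - shift);
	return val;
}

/* Read 8 bytes as two 32-bit halves (gprLo, gprHi); report load count. */
static uint64_t read_u64(const uint32_t *stack, int addr, int reuse,
			 unsigned int *loads)
{
	struct rmw_model m = { .stack = stack, .tmp_addr = -1 };
	uint64_t lo = read_u32(&m, addr, reuse);
	uint64_t hi = read_u32(&m, addr + 4, reuse);

	*loads = m.loads;
	return hi << 32 | lo;
}
```

For the read from address 17, the model issues four loads without reuse
and three with it, which corresponds to dropping line 4 of the listing.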

For writes we can optimize away both the loads and the writebacks.
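
One way to picture the write case is the model below.  It is an
assumption about the general idea, not a description of the actual
jit.c change: struct wr_model, flush(), rmw_word(), write_u32() and
write_u64() are hypothetical names, and the stack is again assumed to
be little-endian 32-bit words.  With reuse enabled, a modified word
stays in tmp and is written back only when a different word is needed,
so a word touched twice in a row costs one load and one writeback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: "tmp" with a dirty flag and access counters. */
struct wr_model {
	uint32_t *stack;	/* stack memory as 4-byte words */
	uint32_t tmp;		/* word currently being modified */
	int tmp_addr;		/* byte address held in tmp, -1 if none */
	int dirty;		/* tmp differs from memory */
	unsigned int loads, stores;
};

/* Write tmp back to the stack if it holds unsaved changes. */
static void flush(struct wr_model *m)
{
	if (m->tmp_addr >= 0 && m->dirty) {
		m->stack[m->tmp_addr / 4] = m->tmp;
		m->stores++;
		m->dirty = 0;
	}
}

/* Merge (val & mask) into the aligned word at addr. */
static void rmw_word(struct wr_model *m, int addr, uint32_t val,
		     uint32_t mask, int reuse)
{
	assert(addr % 4 == 0);
	if (!reuse || m->tmp_addr != addr) {
		flush(m);			/* save the previous word */
		m->tmp = m->stack[addr / 4];	/* read */
		m->tmp_addr = addr;
		m->loads++;
	}
	m->tmp = (m->tmp & ~mask) | (val & mask);	/* modify */
	m->dirty = 1;
	if (!reuse)
		flush(m);			/* immediate writeback */
}

/* Write one unaligned 32-bit value; addr must not be 4-byte aligned. */
static void write_u32(struct wr_model *m, int addr, uint32_t val, int reuse)
{
	int base = addr & ~3;
	int shift = (addr & 3) * 8;

	assert(shift != 0);
	rmw_word(m, base, val << shift, 0xffffffffu << shift, reuse);
	rmw_word(m, base + 4, val >> (32 - shift),
		 0xffffffffu >> (32 - shift), reuse);
}

/* Write 8 bytes as two 32-bit halves; report loads and writebacks. */
static void write_u64(uint32_t *stack, int addr, uint64_t val, int reuse,
		      unsigned int *loads, unsigned int *stores)
{
	struct wr_model m = { .stack = stack, .tmp_addr = -1 };

	write_u32(&m, addr, (uint32_t)val, reuse);
	write_u32(&m, addr + 4, (uint32_t)(val >> 32), reuse);
	flush(&m);	/* final writeback of the last word */
	*loads = m.loads;
	*stores = m.stores;
}
```

For an 8 byte write to address 17, this model performs four loads and
four writebacks without reuse, and three of each with it.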

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/netronome/nfp/bpf/jit.c