When receiving traffic, eth_type_trans() is high up on the perf top list,
because it's the first function which access the packet data.
Move the DMA unmap a bit higher, and put a prefetch just after it, so we
have more time to load the data into the cache.
The packet rate increase is about 14% with a tc drop test: 1620 => 1853 kpps
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dma_sync_single_for_cpu(dev->dev.parent, dma_addr,
rx_bytes + MVPP2_MH_SIZE,
DMA_FROM_DEVICE);
+ prefetch(data);
if (bm_pool->frag_size > PAGE_SIZE)
frag_size = 0;