net/mlx5e: Optimize poll ICOSQ completion queue
author Tariq Toukan <tariqt@mellanox.com>
Wed, 29 Mar 2017 08:46:10 +0000 (11:46 +0300)
committer Saeed Mahameed <saeedm@mellanox.com>
Sun, 30 Apr 2017 13:03:16 +0000 (16:03 +0300)
commit 1f5b1e47ee08f6c623db599b6c23ce7c20b79458
tree 22ba10a6a89b11fccd7d07db2804aa647dbf7783
parent a2fa1fe5ad13e7f11b82291fc08bdc654fac741e

UMR operations are the most frequent and most important ones on the
ICOSQ. Check for them first, and add a compiler branch predictor hint.

According to the current design, the ICOSQ CQ can contain at most one
pending CQE per napi. The poll function is optimized accordingly.
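
The two ideas above can be sketched in plain userspace C. This is only
an illustration under stated assumptions, not the driver code: the
demo_* names, the opcode enum and the counters are hypothetical, and
likely() is defined locally on top of the GCC/Clang __builtin_expect()
builtin rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for a branch predictor hint (assumption: GCC/Clang). */
#if defined(__GNUC__)
#define likely(x) __builtin_expect(!!(x), 1)
#else
#define likely(x) (x)
#endif

/* Hypothetical opcodes: the hot one (UMR) and everything else. */
enum demo_opcode { DEMO_OP_OTHER, DEMO_OP_UMR };

/* Hypothetical completion queue that holds at most one pending entry. */
struct demo_cq {
	bool has_cqe;             /* true if one completion is pending */
	enum demo_opcode opcode;  /* opcode of the completed operation */
	unsigned int umr_handled;
	unsigned int other_handled;
};

/*
 * Poll the queue once. No loop is needed because at most one entry
 * can be pending. Returns true if a completion was consumed.
 */
bool demo_poll_ico_cq(struct demo_cq *cq)
{
	assert(cq != NULL);

	if (!cq->has_cqe)
		return false;

	cq->has_cqe = false;  /* consume the single pending entry */

	/* Hot path first, with a hint that it is the expected case. */
	if (likely(cq->opcode == DEMO_OP_UMR)) {
		cq->umr_handled++;
		return true;
	}

	/* Rare path: any other opcode. */
	cq->other_handled++;
	return true;
}
```

Calling demo_poll_ico_cq() on a queue with a pending UMR entry consumes
it and bumps umr_handled; a second call on the now-empty queue returns
false without touching the counters.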

Performance:
Single-stream packet rate was tested with pktgen.
Packets are dropped at the tc level to zoom in on the driver data-path.
A larger gain is expected for larger packet sizes, as BW is higher
and UMR posts are more frequent.

packet size | before    | after     | gain
------------+-----------+-----------+------
64B         | 4,092,370 | 4,113,306 | 0.5%
1024B       | 3,421,435 | 3,633,819 | 6.2%
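
For reference, the gain column appears to be the relative increase of
the "after" rate over the "before" rate. A minimal sketch of that
arithmetic, assuming this definition (gain_percent() is a hypothetical
helper, not part of the patch):

```c
#include <assert.h>

/* Relative gain, in percent, of 'after' over 'before'. */
double gain_percent(double before, double after)
{
	assert(before > 0.0);  /* avoid dividing by zero */
	return (after - before) / before * 100.0;
}
```

With the numbers above, gain_percent(4092370, 4113306) is about 0.51
and gain_percent(3421435, 3633819) is about 6.21, matching the table
after rounding.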

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c