netfilter: nft_set_rbtree: use per-set rwlock to improve the scalability
author: Liping Zhang <zlpnobody@gmail.com>
Sun, 12 Mar 2017 11:38:47 +0000 (19:38 +0800)
committer: Pablo Neira Ayuso <pablo@netfilter.org>
Mon, 13 Mar 2017 18:30:43 +0000 (19:30 +0100)
commit: 03e5fd0e9bcc1f34b7a542786b34b8f771e7c260
tree: f22669fa7302bfa16f0498beeb3ca5f514deada3
parent: 2cb4bbd75bdf9d423b9f6c629f81eb66ee312fac

Karel Rericha reported that, in his test case, ICMP packets passing through
his boxes normally saw about 5ms of latency. But while nft was running,
specifically while listing sets with the interval flag, latency rose to
30-100ms. This was observed at router throughput between 600Mbps and 2Gbps.

This is because a single global spinlock protects all rbtree sets, so
dumping a set inevitably contends with key lookups. But both operations
are _readers_, so the spinlock can be converted to an rwlock to remove the
contention between them. The rwlock is also made per-set, since each set
is independent.

Reported-by: Karel Rericha <karel@unitednetworks.cz>
Tested-by: Karel Rericha <karel@unitednetworks.cz>
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
net/netfilter/nft_set_rbtree.c