net/mlx5: DR, Cache STE shadow memory
author    Yevgeny Kliteynik <kliteyn@nvidia.com>
          Thu, 23 Dec 2021 23:07:30 +0000 (01:07 +0200)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
          Wed, 2 Mar 2022 10:47:59 +0000 (11:47 +0100)
commit    117a5a7f019e92ce357f43917afed426f304e71e
tree      72b4d19234f880b673af69f497b8d9bc8f6c25ea
parent    6b6094db77e65ccde68bff0ce3118bb3aa5d7e10

commit e5b2bc30c21139ae10f0e56989389d0bc7b7b1d6 upstream.

During rule insertion, for each ICM memory chunk we also allocate shadow
memory used for management. This includes the hw_ste, dr_ste and miss list
per entry. Since the scale of these allocations is large, we noticed a
performance hiccup once malloc and free are stressed.
In extreme use cases, when ~1M chunks are freed at once, it can take up to
40 seconds to complete, to the point that the kernel reports a self-detected
stall on the CPU:

 rcu: INFO: rcu_sched self-detected stall on CPU
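
For context, below is a minimal sketch of the per-chunk shadow allocation
pattern described above. It is not the actual dr_icm_pool.c code: the
sketch_* names, the 64-byte STE size and the helper are illustrative
assumptions. It shows why every chunk allocation and free costs several
allocator round-trips:

  #include <linux/types.h>
  #include <linux/errno.h>
  #include <linux/slab.h>
  #include <linux/list.h>

  #define SKETCH_STE_SIZE 64          /* assumed STE size, illustration only */

  /* Illustrative stand-in for the real dr_ste shadow entry. */
  struct sketch_ste {
      u8 *hw_ste;                     /* points into hw_ste_arr below */
      struct list_head miss_list_node;
  };

  struct sketch_chunk {
      int num_of_entries;
      u8 *hw_ste_arr;                 /* hw_ste shadow, one per entry */
      struct sketch_ste *ste_arr;     /* dr_ste shadow, one per entry */
      struct list_head *miss_list;    /* miss list head, one per entry */
  };

  /* Three allocations per chunk on insert (and three frees on delete):
   * at ~1M chunks this is what stresses the allocator. */
  static int sketch_chunk_alloc_shadow(struct sketch_chunk *chunk, int n)
  {
      int i;

      chunk->hw_ste_arr = kvcalloc(n, SKETCH_STE_SIZE, GFP_KERNEL);
      chunk->ste_arr = kvcalloc(n, sizeof(*chunk->ste_arr), GFP_KERNEL);
      chunk->miss_list = kvcalloc(n, sizeof(*chunk->miss_list), GFP_KERNEL);
      if (!chunk->hw_ste_arr || !chunk->ste_arr || !chunk->miss_list) {
          kvfree(chunk->hw_ste_arr);  /* kvfree(NULL) is a no-op */
          kvfree(chunk->ste_arr);
          kvfree(chunk->miss_list);
          return -ENOMEM;
      }

      for (i = 0; i < n; i++) {
          chunk->ste_arr[i].hw_ste = chunk->hw_ste_arr + i * SKETCH_STE_SIZE;
          INIT_LIST_HEAD(&chunk->miss_list[i]);
      }
      chunk->num_of_entries = n;
      return 0;
  }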

To resolve this, increase the reuse of shadow memory: rather than freeing
and re-allocating it for every chunk, cache it for subsequent chunks to
reuse. With this change, the time for the aforementioned use case drops
from ~40 seconds to ~8-10 seconds.
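
The shape of the fix, as another illustrative sketch built on the
structures above (the sketch_* cache and helpers are assumptions, not the
driver's actual API): freed shadow memory is parked on a reuse list instead
of being returned to the allocator, so steady-state insert/delete avoids
malloc/free entirely:

  /* Parked shadow memory, ready for reuse by the next chunk of the
   * same size. */
  struct sketch_shadow {
      struct list_head node;          /* linkage in the cache below */
      u8 *hw_ste_arr;
      struct sketch_ste *ste_arr;
      struct list_head *miss_list;
  };

  struct sketch_cache {
      struct list_head free_shadows;  /* INIT_LIST_HEAD()'d at pool init */
  };

  static struct sketch_shadow *sketch_get_shadow(struct sketch_cache *cache)
  {
      struct sketch_shadow *s;

      if (list_empty(&cache->free_shadows))
          return NULL;                /* slow path: allocate as before */

      /* Fast path: reuse cached shadow memory, no allocator call. */
      s = list_first_entry(&cache->free_shadows, struct sketch_shadow, node);
      list_del(&s->node);
      return s;
  }

  static void sketch_put_shadow(struct sketch_cache *cache,
                                struct sketch_shadow *s)
  {
      /* Park instead of kvfree(): the expensive bulk frees go away. */
      list_add(&s->node, &cache->free_shadows);
  }

The cached memory itself would only be released when the pool is destroyed
(elided here).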

Fixes: 29cf8febd185 ("net/mlx5: DR, ICM pool memory allocator")
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c
drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h