mm: vmalloc: remove a global vmap_blocks xarray
A global vmap_blocks-xarray array can be contented under heavy usage of
the vm_map_ram()/vm_unmap_ram() APIs. The lock_stat shows that a
"vmap_blocks.xa_lock" lock is a second in a top-list when it comes to
contentions:
<snip>
----------------------------------------
class name con-bounces contentions ...
----------------------------------------
vmap_area_lock:
2554079 2554276 ...
--------------
vmap_area_lock
1297948 [<
00000000dd41cbaa>] alloc_vmap_area+0x1c7/0x910
vmap_area_lock
1256330 [<
000000009d927bf3>] free_vmap_block+0x4a/0xe0
vmap_area_lock 1 [<
00000000c95c05a7>] find_vm_area+0x16/0x70
--------------
vmap_area_lock
1738590 [<
00000000dd41cbaa>] alloc_vmap_area+0x1c7/0x910
vmap_area_lock 815688 [<
000000009d927bf3>] free_vmap_block+0x4a/0xe0
vmap_area_lock 1 [<
00000000c1d619d7>] __get_vm_area_node+0xd2/0x170
vmap_blocks.xa_lock: 862689 862698 ...
-------------------
vmap_blocks.xa_lock 378418 [<
00000000625a5626>] vm_map_ram+0x359/0x4a0
vmap_blocks.xa_lock 484280 [<
00000000caa2ef03>] xa_erase+0xe/0x30
-------------------
vmap_blocks.xa_lock 576226 [<
00000000caa2ef03>] xa_erase+0xe/0x30
vmap_blocks.xa_lock 286472 [<
00000000625a5626>] vm_map_ram+0x359/0x4a0
...
<snip>
that is a result of running vm_map_ram()/vm_unmap_ram() in
a loop. The test creates 64(on 64 CPUs system) threads and
each one maps/unmaps 1 page.
After this change the "xa_lock" can be considered as a noise
in the same test condition:
<snip>
...
&xa->xa_lock#1: 10333 10394 ...
--------------
&xa->xa_lock#1 5349 [<
00000000bbbc9751>] xa_erase+0xe/0x30
&xa->xa_lock#1 5045 [<
0000000018def45d>] vm_map_ram+0x3a4/0x4f0
--------------
&xa->xa_lock#1 7326 [<
0000000018def45d>] vm_map_ram+0x3a4/0x4f0
&xa->xa_lock#1 3068 [<
00000000bbbc9751>] xa_erase+0xe/0x30
...
<snip>
Running the test_vmalloc.sh run_test_mask=1024 nr_threads=64 nr_pages=5
shows around ~8 percent of throughput improvement of vm_map_ram() and
vm_unmap_ram() APIs.
This patch does not fix vmap_area_lock/free_vmap_area_lock and
purge_vmap_area_lock bottle-necks, it is rather a separate rework.
Link: https://lkml.kernel.org/r/20230330190639.431589-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>