We encountered a crash: in the packet receive path we read an invalid
VLAN device address, although the VLAN device address saved in the vmcore
was correct. After checking the code, we found a possible data race:

CPU 0:                                CPU 1:
    (RCU read lock)                       (RTNL lock)
    vlan_do_receive()                     register_vlan_dev()
      vlan_find_dev()
        ->__vlan_group_get_device()         ->vlan_group_prealloc_vid()

In vlan_group_prealloc_vid(), we need to make sure that the memset()
in kzalloc() is ordered before the assignment to the VLAN devices array:
=================================
kzalloc()
->memset(object, 0, size)
smp_wmb()
vg->vlan_devices_arrays[pidx][vidx] = array;
==================================
__vlan_group_get_device() depends on this order. Otherwise, a reader on
another CPU may see the new array pointer before the zeroed contents
and fetch a wrong address.

Fix it by adding a paired smp_wmb()/smp_rmb() to enforce the order of
these memory operations.

Signed-off-by: Di Zhu <zhudi21@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
 	if (array == NULL)
 		return -ENOBUFS;
 
+	/* paired with smp_rmb() in __vlan_group_get_device() */
+	smp_wmb();
+
 	vg->vlan_devices_arrays[pidx][vidx] = array;
 	return 0;
 }
 	array = vg->vlan_devices_arrays[pidx]
 				       [vlan_id / VLAN_GROUP_ARRAY_PART_LEN];
+
+	/* paired with smp_wmb() in vlan_group_prealloc_vid() */
+	smp_rmb();
+
 	return array ? array[vlan_id % VLAN_GROUP_ARRAY_PART_LEN] : NULL;
 }
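
For illustration only, the publish/consume ordering that the paired barriers
enforce can be sketched in userspace C11. This is not kernel code: the names
struct group, struct dev, group_prealloc(), group_get(), PART_LEN and N_PARTS
are hypothetical, calloc() stands in for kzalloc(), and the release and
acquire fences are assumed here as rough analogues of smp_wmb() and smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define PART_LEN 8	/* hypothetical stand-in for VLAN_GROUP_ARRAY_PART_LEN */
#define N_PARTS  4	/* hypothetical number of lazily allocated parts */

struct dev { int id; };	/* hypothetical stand-in for struct net_device */

struct group {
	/* Each slot is published by the writer and read without a lock. */
	_Atomic(struct dev **) parts[N_PARTS];
};

/*
 * Writer side, loosely modelled on vlan_group_prealloc_vid().
 * Assumes writers are serialized by the caller (RTNL in the kernel).
 */
static int group_prealloc(struct group *g, unsigned int id)
{
	unsigned int pidx = id / PART_LEN;
	struct dev **array;

	assert(pidx < N_PARTS);
	if (atomic_load_explicit(&g->parts[pidx], memory_order_relaxed))
		return 0;

	array = calloc(PART_LEN, sizeof(*array));
	if (array == NULL)
		return -1;

	/*
	 * Order the zeroing done by calloc() before the pointer store.
	 * Assumed analogue of smp_wmb() in the patch.
	 */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&g->parts[pidx], array, memory_order_relaxed);
	return 0;
}

/* Reader side, loosely modelled on __vlan_group_get_device(). */
static struct dev *group_get(struct group *g, unsigned int id)
{
	unsigned int pidx = id / PART_LEN;
	struct dev **array;

	assert(pidx < N_PARTS);
	array = atomic_load_explicit(&g->parts[pidx], memory_order_relaxed);

	/*
	 * Order the pointer load before the load of the array entry.
	 * Assumed analogue of smp_rmb() in the patch.
	 */
	atomic_thread_fence(memory_order_acquire);
	return array ? array[id % PART_LEN] : NULL;
}
```

With the fences in place, a reader that observes a non-NULL array pointer
also observes the zeroed entries, so group_get() returns either NULL or a
pointer that was actually stored. The sketch covers only publication of the
array itself; storing individual device pointers into a published array
would need its own ordering.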