Stanislav Fomichev says:
====================
Currently there is no way to propagate sk storage from the listener
socket to a newly accepted one. Consider the following use case:

  fd = socket();
  setsockopt(fd, SOL_IP, IP_TOS,...);
  /* ^^^ setsockopt BPF program triggers here and saves something
   * into sk storage of the listener.
   */
  listen(fd, ...);
  while (client = accept(fd)) {
    /* At this point all association between listener
     * socket and newly accepted one is gone. New
     * socket will not have any sk storage attached.
     */
  }

Let's add a new BPF_F_CLONE flag that can be specified when creating
a socket storage map. The flag indicates that the map's contents
should be cloned when the socket is cloned.
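
As a rough illustration of how userspace might request such a map, the
sketch below fills a union bpf_attr for BPF_MAP_CREATE. The helper name
sk_storage_clone_attr() and its parameters are hypothetical and not part
of this series. The fallback value for BPF_F_CLONE, and the requirements
that key_size is sizeof(int), max_entries is 0, and BTF type ids are
supplied, are assumptions about the sk storage map checks that should be
verified against the target kernel.

```c
#include <assert.h>
#include <string.h>
#include <linux/bpf.h>

/* Fallback for UAPI headers that predate this series (assumed value). */
#ifndef BPF_F_CLONE
#define BPF_F_CLONE (1U << 9)
#endif

/*
 * Hypothetical helper: describe a sk storage map whose contents are
 * copied from the listener to each accepted socket.
 */
void sk_storage_clone_attr(union bpf_attr *attr, unsigned int value_size,
                           unsigned int btf_fd, unsigned int key_type_id,
                           unsigned int value_type_id)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));

    attr->map_type = BPF_MAP_TYPE_SK_STORAGE;
    attr->key_size = sizeof(int);      /* userspace key is a socket fd */
    attr->value_size = value_size;
    attr->max_entries = 0;             /* storage is allocated per socket */
    /* BPF_F_NO_PREALLOC must always be present (see v3 below). */
    attr->map_flags = BPF_F_NO_PREALLOC | BPF_F_CLONE;
    attr->btf_fd = btf_fd;             /* BTF describing key and value */
    attr->btf_key_type_id = key_type_id;
    attr->btf_value_type_id = value_type_id;
}
```

The filled attribute would then be passed to
syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr)), which needs
suitable privileges and a loaded BTF object. In a BPF C object the same
flags would normally go into the map definition's map_flags instead.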
v4:
* drop 'goto err' in bpf_sk_storage_clone (Yonghong Song)
* add comment about race with bpf_sk_storage_map_free to the
  bpf_sk_storage_clone side as well (Daniel Borkmann)
v3:
* make sure BPF_F_NO_PREALLOC is always present when creating
  a map (Martin KaFai Lau)
* don't call bpf_sk_storage_free explicitly, rely on
  sk_free_unlock_clone to do the cleanup (Martin KaFai Lau)
v2:
* remove spinlocks around selem_link_map/sk (Martin KaFai Lau)
* BPF_F_CLONE on a map, not selem (Martin KaFai Lau)
* hold a map while cloning (Martin KaFai Lau)
* use BTF maps in selftests (Yonghong Song)
* do proper cleanup in selftests; don't call close(-1) (Yonghong Song)
* export bpf_map_inc_not_zero
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>