Merge branch 'bpf-direct-pkt-access'
Alexei Starovoitov says:
====================
bpf: introduce direct packet access
This set of patches introduce 'direct packet access' from
cls_bpf and act_bpf programs (which are root only).
Current bpf programs use LD_ABS, LD_INS instructions which have
to do 'if (off < skb_headlen)' for every packet access.
It's ok for socket filters, but too slow for XDP, since single
LD_ABS insn consumes 3% of cpu. Therefore we have to amortize the cost
of length check over multiple packet accesses via direct access
to skb->data, data_end pointers.
The existing packet parser typically look like:
if (load_half(skb, offsetof(struct ethhdr, h_proto)) != ETH_P_IP)
return 0;
if (load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol)) != IPPROTO_UDP ||
load_byte(skb, ETH_HLEN) != 0x45)
return 0;
...
with 'direct packet access' the bpf program becomes:
void *data = (void *)(long)skb->data;
void *data_end = (void *)(long)skb->data_end;
struct eth_hdr *eth = data;
struct iphdr *iph = data + sizeof(*eth);
if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end)
return 0;
if (eth->h_proto != htons(ETH_P_IP))
return 0;
if (iph->protocol != IPPROTO_UDP || iph->ihl != 5)
return 0;
...
which is more natural to write and significantly faster.
See patch 6 for performance tests:
21Mpps(old) vs 24Mpps(new) with just 5 loads.
For more complex parsers the performance gain is higher.
The other approach implemented in [1] was adding two new instructions
to interpreter and JITs and was too hard to use from llvm side.
The approach presented here doesn't need any instruction changes,
but the verifier has to work harder to check safety of the packet access.
Patch 1 prepares the code and Patch 2 adds new checks for direct
packet access and all of them are gated with 'env->allow_ptr_leaks'
which is true for root only.
Patch 3 improves search pruning for large programs.
Patch 4 wires in verifier's changes with net/core/filter side.
Patch 5 updates docs
Patches 6 and 7 add tests.
[1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw
====================
Signed-off-by: David S. Miller <davem@davemloft.net>