Registers and calling convention
================================

eBPF has 10 general purpose registers and a read-only frame pointer register,
all of which are 64 bits wide.

The eBPF calling convention is defined as:

 * R0: return value from function calls, and exit value for eBPF programs
 * R1 - R5: arguments for function calls
 * R6 - R9: callee saved registers that function calls will preserve
 * R10: read-only frame pointer to access stack

R0 - R5 are scratch registers and eBPF programs need to spill/fill them if
necessary across calls.

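
The register roles listed above can be sketched in C as follows. The enum and
the helper names are illustrative only and are not taken from any header.

```c
#include <stdbool.h>

/* Register numbers R0..R10; names are illustrative, not an existing API. */
enum bpf_reg {
    R0, R1, R2, R3, R4, R5, R6, R7, R8, R9, R10
};

/* R0 - R5 may be overwritten by a call, so the caller must spill them
 * if their values are still needed afterwards. */
bool reg_is_scratch(enum bpf_reg reg)
{
    return reg <= R5;
}

/* R6 - R9 are preserved by the callee across a call. */
bool reg_is_callee_saved(enum bpf_reg reg)
{
    return reg >= R6 && reg <= R9;
}
```

R10 falls into neither group because it is the read-only frame pointer.
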
Instruction encoding
====================

eBPF has two instruction encodings:

 * the basic instruction encoding, which uses 64 bits to encode an instruction
 * the wide instruction encoding, which appends a second 64-bit immediate value
   (imm64) after the basic instruction for a total of 128 bits.

The basic instruction encoding looks as follows:

  =============  =======  ===============  ====================  ============
  32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
  =============  =======  ===============  ====================  ============
  immediate      offset   source register  destination register  opcode
  =============  =======  ===============  ====================  ============

Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.

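
As a minimal sketch, the field layout in the table can be expressed as a
function that packs the five fields into one 64-bit word, with the opcode in
the least significant byte. The function name is hypothetical, and the order
of the bytes in memory (which depends on host endianness) is not modelled.

```c
#include <stdint.h>

/* Pack the fields of a basic instruction into a 64-bit word following the
 * MSB-to-LSB layout: imm (32), off (16), src (4), dst (4), opcode (8).
 * bpf_encode_basic() is a hypothetical helper, not an existing API. */
uint64_t bpf_encode_basic(uint8_t opcode, uint8_t dst_reg, uint8_t src_reg,
                          int16_t off, int32_t imm)
{
    return ((uint64_t)opcode)
         | ((uint64_t)(dst_reg & 0x0f) << 8)
         | ((uint64_t)(src_reg & 0x0f) << 12)
         | ((uint64_t)(uint16_t)off << 16)
         | ((uint64_t)(uint32_t)imm << 32);
}
```

For example, an instruction with opcode 0xb7, destination register 1 and
immediate 42 leaves the source register and offset fields cleared to zero.
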
Instruction classes
-------------------

The three LSB bits of the 'opcode' field store the instruction class:

  =========  =====  ===============================
  class      value  description
  =========  =====  ===============================
  BPF_LD     0x00   non-standard load operations
  BPF_LDX    0x01   load into register operations
  BPF_ST     0x02   store from immediate operations
  BPF_STX    0x03   store from register operations
  BPF_ALU    0x04   32-bit arithmetic operations
  BPF_JMP    0x05   64-bit jump operations
  BPF_JMP32  0x06   32-bit jump operations
  BPF_ALU64  0x07   64-bit arithmetic operations
  =========  =====  ===============================

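
Extracting the class is a single mask operation. The sketch below uses
hypothetical helper names and simply mirrors the table above.

```c
#include <stdint.h>

/* Mask for the three least significant bits; the name is hypothetical. */
#define INSN_CLASS_MASK 0x07

/* Return the instruction class stored in the low three opcode bits. */
uint8_t insn_class(uint8_t opcode)
{
    return opcode & INSN_CLASS_MASK;
}

/* Map an opcode to the class name used in the table above. */
const char *insn_class_name(uint8_t opcode)
{
    static const char *const names[8] = {
        "BPF_LD", "BPF_LDX", "BPF_ST", "BPF_STX",
        "BPF_ALU", "BPF_JMP", "BPF_JMP32", "BPF_ALU64",
    };

    return names[insn_class(opcode)];
}
```
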
Arithmetic and jump instructions
================================

For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:

  ==============  ======  =================
  4 bits (MSB)    1 bit   3 bits (LSB)
  ==============  ======  =================
  operation code  source  instruction class
  ==============  ======  =================

The 4th bit encodes the source operand:

  ======  =====  ========================================
  source  value  description
  ======  =====  ========================================
  BPF_K   0x00   use 32-bit immediate as source operand
  BPF_X   0x08   use 'src_reg' register as source operand
  ======  =====  ========================================

The four MSB bits store the operation code.

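
The three-way split of the opcode can be sketched with three masks. The
struct and function names below are hypothetical.

```c
#include <stdint.h>

/* The three parts of an arithmetic or jump opcode, kept at their
 * in-place bit positions so they compare directly against the
 * values in the tables (for example BPF_X == 0x08). */
struct alu_jmp_opcode {
    uint8_t op;          /* four MSB bits, mask 0xf0 */
    uint8_t source;      /* 4th bit, mask 0x08: BPF_K or BPF_X */
    uint8_t insn_class;  /* three LSB bits, mask 0x07 */
};

struct alu_jmp_opcode split_alu_jmp_opcode(uint8_t opcode)
{
    struct alu_jmp_opcode parts = {
        .op = opcode & 0xf0,
        .source = opcode & 0x08,
        .insn_class = opcode & 0x07,
    };

    return parts;
}
```
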
Arithmetic instructions
-----------------------

BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
otherwise identical operations.
The code field encodes the operation as below:

  ========  =====  =================================================
  code      value  description
  ========  =====  =================================================
  BPF_ADD   0x00   dst += src
  BPF_SUB   0x10   dst -= src
  BPF_MUL   0x20   dst \*= src
  BPF_DIV   0x30   dst /= src
  BPF_OR    0x40   dst \|= src
  BPF_AND   0x50   dst &= src
  BPF_LSH   0x60   dst <<= src
  BPF_RSH   0x70   dst >>= src
  BPF_NEG   0x80   dst = -dst
  BPF_MOD   0x90   dst %= src
  BPF_XOR   0xa0   dst ^= src
  BPF_MOV   0xb0   dst = src
  BPF_ARSH  0xc0   sign extending shift right
  BPF_END   0xd0   byte swap operations (see separate section below)
  ========  =====  =================================================

``BPF_ADD | BPF_X | BPF_ALU`` means::

  dst_reg = (u32) dst_reg + (u32) src_reg;

``BPF_ADD | BPF_X | BPF_ALU64`` means::

  dst_reg = dst_reg + src_reg

``BPF_XOR | BPF_K | BPF_ALU`` means::

  dst_reg = (u32) dst_reg ^ (u32) imm32

``BPF_XOR | BPF_K | BPF_ALU64`` means::

  dst_reg = dst_reg ^ imm32

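
The difference between the 32-bit and 64-bit forms can be sketched as plain C
functions that take register values and return the new value of the
destination register. Two details are assumptions not stated in the text
above: 32-bit results are zero-extended into the 64-bit register, and
``imm32`` is sign-extended before a 64-bit operation. The function names are
hypothetical.

```c
#include <stdint.h>

/* BPF_ADD | BPF_X | BPF_ALU: 32-bit add, upper 32 bits of the result
 * are zero (assumed zero-extension). */
uint64_t alu32_add_x(uint64_t dst_reg, uint64_t src_reg)
{
    return (uint32_t)((uint32_t)dst_reg + (uint32_t)src_reg);
}

/* BPF_ADD | BPF_X | BPF_ALU64: full 64-bit add. */
uint64_t alu64_add_x(uint64_t dst_reg, uint64_t src_reg)
{
    return dst_reg + src_reg;
}

/* BPF_XOR | BPF_K | BPF_ALU: 32-bit xor with the immediate. */
uint64_t alu32_xor_k(uint64_t dst_reg, int32_t imm32)
{
    return (uint32_t)dst_reg ^ (uint32_t)imm32;
}

/* BPF_XOR | BPF_K | BPF_ALU64: 64-bit xor, imm32 sign-extended
 * (assumption). */
uint64_t alu64_xor_k(uint64_t dst_reg, int32_t imm32)
{
    return dst_reg ^ (uint64_t)(int64_t)imm32;
}
```

Adding 1 to 0xffffffff therefore wraps to 0 in the 32-bit form but carries
into bit 32 in the 64-bit form.
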
Byte swap instructions
----------------------

The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
code field of ``BPF_END``.

The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.

The 1-bit source operand field in the opcode is used to select what byte
order the operation converts from or to:

  =========  =====  =================================================
  source     value  description
  =========  =====  =================================================
  BPF_TO_LE  0x00   convert between host byte order and little endian
  BPF_TO_BE  0x08   convert between host byte order and big endian
  =========  =====  =================================================

The imm field encodes the width of the swap operations. The following widths
are supported: 16, 32 and 64.

``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::

  dst_reg = htole16(dst_reg)

``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::

  dst_reg = htobe64(dst_reg)

``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and
``BPF_TO_BE`` respectively.

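
A portable sketch of the byte swap semantics, written without ``htole16()``
and friends so that it does not depend on a particular libc, might look like
this. The function names are hypothetical, and truncating the register to the
selected width before the swap is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time by inspecting the first byte
 * of a known 16-bit value. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Reverse the lowest width/8 bytes of value. */
static uint64_t reverse_bytes(uint64_t value, unsigned width)
{
    uint64_t out = 0;

    for (unsigned i = 0; i < width / 8; i++) {
        out = (out << 8) | (value & 0xff);
        value >>= 8;
    }
    return out;
}

/* Emulate BPF_END: to_big_endian selects BPF_TO_BE (non-zero) or
 * BPF_TO_LE (zero); width is the imm field (16, 32 or 64).
 * Bits above width are cleared (assumption). */
uint64_t emu_bpf_end(uint64_t dst_reg, int to_big_endian, unsigned width)
{
    uint64_t mask = (width == 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
    uint64_t value = dst_reg & mask;
    int swap = to_big_endian ? host_is_little_endian()
                             : !host_is_little_endian();

    return swap ? reverse_bytes(value, width) : value;
}
```

On a little-endian host, ``BPF_TO_LE`` only truncates the value, while
``BPF_TO_BE`` truncates it and reverses its bytes. On a big-endian host the
roles are exchanged.
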
Jump instructions
-----------------

BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for
otherwise identical operations.
The code field encodes the operation as below:

  ========  =====  =========================  ============
  code      value  description                notes
  ========  =====  =========================  ============
  BPF_JA    0x00   PC += off                  BPF_JMP only
  BPF_JEQ   0x10   PC += off if dst == src
  BPF_JGT   0x20   PC += off if dst > src     unsigned
  BPF_JGE   0x30   PC += off if dst >= src    unsigned
  BPF_JSET  0x40   PC += off if dst & src
  BPF_JNE   0x50   PC += off if dst != src
  BPF_JSGT  0x60   PC += off if dst > src     signed
  BPF_JSGE  0x70   PC += off if dst >= src    signed
  BPF_CALL  0x80   function call
  BPF_EXIT  0x90   function / program return  BPF_JMP only
  BPF_JLT   0xa0   PC += off if dst < src     unsigned
  BPF_JLE   0xb0   PC += off if dst <= src    unsigned
  BPF_JSLT  0xc0   PC += off if dst < src     signed
  BPF_JSLE  0xd0   PC += off if dst <= src    signed
  ========  =====  =========================  ============

The eBPF program needs to store the return value into register R0 before doing a
BPF_EXIT.

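
The signed and unsigned comparisons in the table can be sketched as a function
that reports whether a 64-bit (``BPF_JMP``) jump is taken. The function name
is hypothetical. ``BPF_CALL`` and ``BPF_EXIT`` are not conditional jumps, so
the sketch simply reports them as not taken.

```c
#include <stdbool.h>
#include <stdint.h>

/* Operation codes from the jump table above. */
enum {
    BPF_JA   = 0x00, BPF_JEQ  = 0x10, BPF_JGT  = 0x20, BPF_JGE  = 0x30,
    BPF_JSET = 0x40, BPF_JNE  = 0x50, BPF_JSGT = 0x60, BPF_JSGE = 0x70,
    BPF_JLT  = 0xa0, BPF_JLE  = 0xb0, BPF_JSLT = 0xc0, BPF_JSLE = 0xd0,
};

/* Return true if "PC += off" would be executed for the given code. */
bool jmp64_taken(uint8_t code, uint64_t dst, uint64_t src)
{
    /* Reinterpret the register contents as signed for the JS* codes. */
    int64_t sdst = (int64_t)dst;
    int64_t ssrc = (int64_t)src;

    switch (code) {
    case BPF_JA:   return true;
    case BPF_JEQ:  return dst == src;
    case BPF_JGT:  return dst > src;
    case BPF_JGE:  return dst >= src;
    case BPF_JSET: return (dst & src) != 0;
    case BPF_JNE:  return dst != src;
    case BPF_JSGT: return sdst > ssrc;
    case BPF_JSGE: return sdst >= ssrc;
    case BPF_JLT:  return dst < src;
    case BPF_JLE:  return dst <= src;
    case BPF_JSLT: return sdst < ssrc;
    case BPF_JSLE: return sdst <= ssrc;
    default:       return false; /* BPF_CALL, BPF_EXIT or unknown code */
    }
}
```

A register holding all ones is the largest unsigned value but equals -1 when
signed, so ``BPF_JGT`` against 1 is taken while ``BPF_JSGT`` is not.
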
Load and store instructions
===========================

For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
8-bit 'opcode' field is divided as:

  ============  ======  =================
  3 bits (MSB)  2 bits  3 bits (LSB)
  ============  ======  =================
  mode          size    instruction class
  ============  ======  =================

The size modifier is one of:

  =============  =====  =====================
  size modifier  value  description
  =============  =====  =====================
  BPF_W          0x00   word        (4 bytes)
  BPF_H          0x08   half word   (2 bytes)
  BPF_B          0x10   byte
  BPF_DW         0x18   double word (8 bytes)
  =============  =====  =====================

The mode modifier is one of:

  =============  =====  ====================================
  mode modifier  value  description
  =============  =====  ====================================
  BPF_IMM        0x00   64-bit immediate instructions
  BPF_ABS        0x20   legacy BPF packet access (absolute)
  BPF_IND        0x40   legacy BPF packet access (indirect)
  BPF_MEM        0x60   regular load and store operations
  BPF_ATOMIC     0xc0   atomic operations
  =============  =====  ====================================

Regular load and store operations
---------------------------------

The ``BPF_MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.

``BPF_MEM | <size> | BPF_STX`` means::

  *(size *) (dst_reg + off) = src_reg

``BPF_MEM | <size> | BPF_ST`` means::

  *(size *) (dst_reg + off) = imm32

``BPF_MEM | <size> | BPF_LDX`` means::

  dst_reg = *(size *) (src_reg + off)

Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.

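
The following sketch emulates ``BPF_STX`` and ``BPF_LDX`` with ``BPF_MEM``
against an ordinary byte buffer. The function names are hypothetical, the
size values are the standard ``BPF_W``, ``BPF_H``, ``BPF_B`` and ``BPF_DW``
encodings, and zero-extending narrow loads into the 64-bit register is an
assumption not stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size modifiers, located in opcode bits 3-4 (mask 0x18). */
enum { BPF_W = 0x00, BPF_H = 0x08, BPF_B = 0x10, BPF_DW = 0x18 };

/* Number of bytes accessed for a given load/store opcode. */
size_t bpf_size_bytes(uint8_t opcode)
{
    switch (opcode & 0x18) {
    case BPF_W: return 4;
    case BPF_H: return 2;
    case BPF_B: return 1;
    default:    return 8; /* BPF_DW */
    }
}

/* BPF_MEM | <size> | BPF_STX: *(size *)(dst_reg + off) = src_reg */
void emu_stx(uint8_t *dst_reg, int16_t off, uint8_t opcode, uint64_t src_reg)
{
    uint8_t *addr = dst_reg + off;
    uint8_t  b = (uint8_t)src_reg;
    uint16_t h = (uint16_t)src_reg;
    uint32_t w = (uint32_t)src_reg;

    switch (opcode & 0x18) {
    case BPF_B: memcpy(addr, &b, sizeof(b)); break;
    case BPF_H: memcpy(addr, &h, sizeof(h)); break;
    case BPF_W: memcpy(addr, &w, sizeof(w)); break;
    default:    memcpy(addr, &src_reg, sizeof(src_reg)); break;
    }
}

/* BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *)(src_reg + off),
 * zero-extended to 64 bits (assumption). */
uint64_t emu_ldx(const uint8_t *src_reg, int16_t off, uint8_t opcode)
{
    const uint8_t *addr = src_reg + off;
    uint8_t  b;
    uint16_t h;
    uint32_t w;
    uint64_t dw;

    switch (opcode & 0x18) {
    case BPF_B: memcpy(&b, addr, sizeof(b)); return b;
    case BPF_H: memcpy(&h, addr, sizeof(h)); return h;
    case BPF_W: memcpy(&w, addr, sizeof(w)); return w;
    default:    memcpy(&dw, addr, sizeof(dw)); return dw;
    }
}
```

Using ``memcpy()`` with typed temporaries keeps the sketch independent of host
alignment rules and byte order.
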
Atomic operations
-----------------

Atomic operations are operations that operate on memory and cannot be
interrupted or corrupted by other access to the same memory region
by other eBPF programs or means outside of this specification.

All atomic operations supported by eBPF are encoded as store operations
that use the ``BPF_ATOMIC`` mode modifier as follows:

 * ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
 * ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
 * 8-bit and 16-bit wide atomic operations are not supported.

The imm field is used to encode the actual atomic operation.
Simple atomic operations use a subset of the values defined to encode
arithmetic operations in the imm field to encode the atomic operation:

  ========  =====  ===========
  imm       value  description
  ========  =====  ===========
  BPF_ADD   0x00   atomic add
  BPF_OR    0x40   atomic or
  BPF_AND   0x50   atomic and
  BPF_XOR   0xa0   atomic xor
  ========  =====  ===========

``BPF_ATOMIC | BPF_W | BPF_STX`` with imm = BPF_ADD means::

  *(u32 *)(dst_reg + off16) += src_reg

``BPF_ATOMIC | BPF_DW | BPF_STX`` with imm = BPF_ADD means::

  *(u64 *)(dst_reg + off16) += src_reg

``BPF_XADD`` is a deprecated name for ``BPF_ATOMIC | BPF_ADD``.

In addition to the simple atomic operations, there also is a modifier and
two complex atomic operations:

  ===========  ================  ===========================
  imm          value             description
  ===========  ================  ===========================
  BPF_FETCH    0x01              modifier: return old value
  BPF_XCHG     0xe0 | BPF_FETCH  atomic exchange
  BPF_CMPXCHG  0xf0 | BPF_FETCH  atomic compare and exchange
  ===========  ================  ===========================

The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations. If the ``BPF_FETCH`` flag
is set, then the operation also overwrites ``src_reg`` with the value that
was in memory before it was modified.

The ``BPF_XCHG`` operation atomically exchanges ``src_reg`` with the value
addressed by ``dst_reg + off``.

The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
``dst_reg + off`` with ``R0``. If they match, the value addressed by
``dst_reg + off`` is replaced with ``src_reg``. In either case, the
value that was at ``dst_reg + off`` before the operation is zero-extended
and loaded back to ``R0``.

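
The 64-bit fetch-and-add, exchange and compare-and-exchange semantics map
closely onto C11 ``<stdatomic.h>``. The sketch below uses hypothetical
function names and the default sequentially consistent memory ordering, which
is an assumption, since the text above does not specify an ordering. Each
function returns the value that the instruction would write back to
``src_reg`` (or to ``R0`` for ``BPF_CMPXCHG``).

```c
#include <stdatomic.h>
#include <stdint.h>

/* imm values from the tables above. */
enum {
    BPF_ADD     = 0x00,
    BPF_FETCH   = 0x01,
    BPF_XCHG    = 0xe0 | BPF_FETCH,
    BPF_CMPXCHG = 0xf0 | BPF_FETCH,
};

/* imm = BPF_ADD | BPF_FETCH: add src_reg to memory, return old value. */
uint64_t emu_atomic_fetch_add64(_Atomic uint64_t *addr, uint64_t src_reg)
{
    return atomic_fetch_add(addr, src_reg);
}

/* imm = BPF_XCHG: store src_reg, return the previous memory value. */
uint64_t emu_atomic_xchg64(_Atomic uint64_t *addr, uint64_t src_reg)
{
    return atomic_exchange(addr, src_reg);
}

/* imm = BPF_CMPXCHG: if *addr == r0, store src_reg. The value that was
 * in memory before the operation is returned in either case. */
uint64_t emu_atomic_cmpxchg64(_Atomic uint64_t *addr, uint64_t r0,
                              uint64_t src_reg)
{
    uint64_t expected = r0;

    /* On failure, expected is updated to the current memory value;
     * on success it already equals the old value. */
    atomic_compare_exchange_strong(addr, &expected, src_reg);
    return expected;
}
```
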
Clang can generate atomic instructions by default when ``-mcpu=v3`` is
enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
the atomics features, while keeping a lower ``-mcpu`` version, you can use
``-Xclang -target-feature -Xclang +alu32``.

64-bit immediate instructions
-----------------------------

Instructions with the ``BPF_IMM`` mode modifier use the wide instruction
encoding for an extra imm64 value.

There is currently only one such instruction.

``BPF_LD | BPF_DW | BPF_IMM`` means::

  dst_reg = imm64

Legacy BPF Packet access instructions
-------------------------------------

eBPF has special instructions for access to packet data that have been
carried over from classic BPF to retain the performance of legacy socket
filters running in the eBPF interpreter.

The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
``BPF_IND | <size> | BPF_LD``.

These instructions are used to access packet data and can only be used when
the program context is a pointer to a networking packet. ``BPF_ABS``
accesses packet data at an absolute offset specified by the immediate data
and ``BPF_IND`` accesses packet data at an offset that includes the value of
a register in addition to the immediate data.

These instructions have seven implicit operands:

 * Register R6 is an implicit input that must contain a pointer to a
   sk_buff.
 * Register R0 is an implicit output which contains the data fetched from
   the packet.
 * Registers R1-R5 are scratch registers that are clobbered after a call to
   ``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.

These instructions have an implicit program exit condition as well. When an
eBPF program is trying to access the data beyond the packet boundary, the
program execution will be aborted.

``BPF_ABS | BPF_W | BPF_LD`` means::

  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32))

``BPF_IND | BPF_W | BPF_LD`` means::

  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))

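
As a simplified sketch, the two word-sized loads and their implicit bounds
check can be emulated over a plain byte buffer that stands in for the packet
data. The function names and the ``-1`` return value (which stands in for
aborting the program) are hypothetical, and treating every negative offset as
out of bounds is a simplifying assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian (network order) 32-bit word at offset, which is what
 * ntohl(*(u32 *)...) yields on any host. Returns 0 on success, or -1 if
 * the access would cross the packet boundary. */
static int load_be32(const uint8_t *data, size_t len, uint64_t offset,
                     uint32_t *r0)
{
    const uint8_t *p;

    if (offset > len || len - offset < 4)
        return -1;

    p = data + offset;
    *r0 = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return 0;
}

/* BPF_ABS | BPF_W | BPF_LD: offset is the immediate alone. */
int emu_ld_abs_w(const uint8_t *data, size_t len, uint32_t imm32,
                 uint32_t *r0)
{
    return load_be32(data, len, imm32, r0);
}

/* BPF_IND | BPF_W | BPF_LD: offset is src_reg + imm32. */
int emu_ld_ind_w(const uint8_t *data, size_t len, uint32_t src_reg,
                 int32_t imm32, uint32_t *r0)
{
    int64_t offset = (int64_t)src_reg + imm32;

    if (offset < 0)
        return -1; /* simplification: negative offsets are rejected */
    return load_be32(data, len, (uint64_t)offset, r0);
}
```
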