review.tizen.org Git - platform/kernel/linux-rpi.git/log

libbpf: Support multiple .rodata.* and .data.* BPF maps

Add support for having multiple .rodata and .data data sections ([0]).
.rodata/.data are supported like the usual, but now also
.rodata.<whatever> and .data.<whatever> are also supported. Each such
section will get its own backing BPF_MAP_TYPE_ARRAY, just like
.rodata and .data.

Multiple .bss maps are not supported, as the whole '.bss' name is
confusing and might be deprecated soon, as well as user would need to
specify custom ELF section with SEC() attribute anyway, so might as well
stick to just .data.* and .rodata.* convention.

User-visible map name for such new maps is going to be just their ELF
section names.

[0] https://github.com/libbpf/libbpf/issues/274

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org

bpftool: Improve skeleton generation for data maps without DATASEC type

It can happen that some data sections (e.g., .rodata.cst16, containing
compiler populated string constants) won't have a corresponding BTF
DATASEC type. Now that libbpf supports .rodata.* and .data.* sections,
situation like that will cause invalid BPF skeleton to be generated that
won't compile successfully, as some parts of skeleton would assume
memory-mapped struct definitions for each special data section.

Fix this by generating empty struct definitions for such data sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-7-andrii@kernel.org

bpftool: Support multiple .rodata/.data internal maps in skeleton

Remove the assumption about only single instance of each of .rodata and
.data internal maps. Nothing changes for '.rodata' and '.data' maps, but new
'.rodata.something' map will get 'rodata_something' section in BPF
skeleton for them (as well as having struct bpf_map * field in maps
section with the same field name).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-6-andrii@kernel.org

libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps

Remove internal libbpf assumption that there can be only one .rodata,
.data, and .bss map per BPF object. To achieve that, extend and
generalize the scheme that was used for keeping track of relocation ELF
sections. Now each ELF section has a temporary extra index that keeps
track of logical type of ELF section (relocations, data, read-only data,
BSS). Switch relocation to this scheme, as well as .rodata/.data/.bss
handling.

We don't yet allow multiple .rodata, .data, and .bss sections, but no
libbpf internal code makes an assumption that there can be only one of
each and thus they can be explicitly referenced by a single index. Next
patches will actually allow multiple .rodata and .data sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org

libbpf: Use Elf64-specific types explicitly for dealing with ELF

Minimize the usage of class-agnostic gelf_xxx() APIs from libelf. These
APIs require copying ELF data structures into local GElf_xxx structs and
have a more cumbersome API. BPF ELF file is defined to be always 64-bit
ELF object, even when intended to be run on 32-bit host architectures,
so there is no need to do class-agnostic conversions everywhere. BPF
static linker implementation within libbpf has been using Elf64-specific
types since initial implementation.

Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more
succinct direct access to ELF symbol and relocation records within ELF
data itself and switch all the GElf_xxx usage into Elf64_xxx
equivalents. The only remaining place within libbpf.c that's still using
gelf API is gelf_getclass(), as there doesn't seem to be a direct way to
get underlying ELF bitness.

No functional changes intended.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org

libbpf: Extract ELF processing state into separate struct

Name currently anonymous internal struct that keeps ELF-related state
for bpf_object. Just a bit of clean up, no functional changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org

libbpf: Deprecate btf__finalize_data() and move it into libbpf.c

There isn't a good use case where anyone but libbpf itself needs to call
btf__finalize_data(). It was implemented for internal use and it's not
clear why it was made into public API in the first place. To function, it
requires active ELF data, which is stored inside bpf_object for the
duration of opening phase only. But the only BTF that needs bpf_object's
ELF is that bpf_object's BTF itself, which libbpf fixes up automatically
during bpf_object__open() operation anyways. There is no need for any
additional fix up and no reasonable scenario where it's useful and
appropriate.

Thus, btf__finalize_data() is just an API atavism and is better removed.
So this patch marks it as deprecated immediately (v0.6+) and moves the
code from btf.c into libbpf.c where it's used in the context of
bpf_object opening phase. Such code co-location allows to make code
structure more straightforward and remove bpf_object__section_size() and
bpf_object__variable_offset() internal helpers from libbpf_internal.h,
making them static. Their naming is also adjusted to not create
a wrong illusion that they are some sort of method of bpf_object. They
are internal helpers and are called appropriately.

This is part of libbpf 1.0 effort ([0]).

[0] Closes: https://github.com/libbpf/libbpf/issues/276

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org

Merge branch 'selftests/bpf: Fixes for perf_buffer test'

Jiri Olsa says:

====================

hi,
sending fixes for perf_buffer test on systems
with offline cpus.

v2:
- resend due to delivery issues, no changes

thanks,
jirka
Acked-by: John Fastabend <john.fastabend@gmail.com>
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

selftests/bpf: Use nanosleep tracepoint in perf buffer test

The perf buffer tests triggers trace with nanosleep syscall,
but monitors all syscalls, which results in lot of data in the
buffer and makes it harder to debug. Let's lower the trace
traffic and monitor just nanosleep syscall.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-4-jolsa@kernel.org

selftests/bpf: Fix possible/online index mismatch in perf_buffer test

The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  serial_test_perf_buffer:PASS:nr_cpus 0 nsec
  serial_test_perf_buffer:PASS:nr_on_cpus 0 nsec
  serial_test_perf_buffer:PASS:skel_load 0 nsec
  serial_test_perf_buffer:PASS:attach_kprobe 0 nsec
  serial_test_perf_buffer:PASS:perf_buf__new 0 nsec
  serial_test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #4
  serial_test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  serial_test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  serial_test_perf_buffer:PASS:buf_cnt 0 nsec
  ...
  serial_test_perf_buffer:PASS:fd_check 0 nsec
  serial_test_perf_buffer:PASS:drain_buf 0 nsec
  serial_test_perf_buffer:PASS:consume_buf 0 nsec
  serial_test_perf_buffer:FAIL:cpu_seen cpu 5 not seen
  #88 perf_buffer:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

If the offline cpu is from the middle of the possible set,
we get mismatch with possible and online cpu buffers.

The perf buffer test calls perf_buffer__consume_buffer for
all 'possible' cpus, but the library holds only 'online'
cpu buffers and perf_buffer__consume_buffer returns them
based on index.

Adding extra (online) index to keep track of online buffers,
we need the original (possible) index to trigger trace on
proper cpu.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-3-jolsa@kernel.org

selftests/bpf: Fix perf_buffer test on system with offline cpus

The perf_buffer fails on system with offline cpus:

  # test_progs -t perf_buffer
  test_perf_buffer:PASS:nr_cpus 0 nsec
  test_perf_buffer:PASS:nr_on_cpus 0 nsec
  test_perf_buffer:PASS:skel_load 0 nsec
  test_perf_buffer:PASS:attach_kprobe 0 nsec
  test_perf_buffer:PASS:perf_buf__new 0 nsec
  test_perf_buffer:PASS:epoll_fd 0 nsec
  skipping offline CPU #24
  skipping offline CPU #25
  skipping offline CPU #26
  skipping offline CPU #27
  skipping offline CPU #28
  skipping offline CPU #29
  skipping offline CPU #30
  skipping offline CPU #31
  test_perf_buffer:PASS:perf_buffer__poll 0 nsec
  test_perf_buffer:PASS:seen_cpu_cnt 0 nsec
  test_perf_buffer:FAIL:buf_cnt got 24, expected 32
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Changing the test to check online cpus instead of possible.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021114132.8196-2-jolsa@kernel.org

Merge branch 'bpf: keep track of verifier insn_processed'

Dave Marchevsky says:

====================

This is a followup to discussion around RFC patchset "bpf: keep track of
prog verification stats" [0]. The RFC elaborates on my usecase, but to
summarize: keeping track of verifier stats for programs as they - and
the kernels they run on - change over time can help developers of
individual programs and BPF kernel folks.

The RFC added a verif_stats to the uapi which contained most of the info
which verifier prints currently. Feedback here was to avoid polluting
uapi with stats that might be meaningless after major changes to the
verifier, but that insn_processed or conceptually similar number would
exist in the long term and was safe to expose.

So let's expose just insn_processed via bpf_prog_info and fdinfo for now
and explore good ways of getting more complicated stats in the future.

[0] https://lore.kernel.org/bpf/20210920151112.3770991-1-davemarchevsky@fb.com/

v2->v3:
  * Remove unnecessary check in patch 2's test [Andrii]
  * Go back to adding new u32 in bpf_prog_info (vs using spare bits) [Andrii]
* Rebase + add acks [Andrii, John]

v1->v2:
  * Rename uapi field from insn_processed to verified_insns [Daniel]
  * use 31 bits of existing bitfield space in bpf_prog_info [Daniel]
  * change underlying type from 64-> 32 bits [Daniel]
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

selftests/bpf: Add verif_stats test

verified_insns field was added to response of bpf_obj_get_info_by_fd
call on a prog. Confirm that it's being populated by loading a simple
program and asking for its info.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-3-davemarchevsky@fb.com

bpf: Add verified_insns to bpf_prog_info and fdinfo

This stat is currently printed in the verifier log and not stored
anywhere. To ease consumption of this data, add a field to bpf_prog_aux
so it can be exposed via BPF_OBJ_GET_INFO_BY_FD and fdinfo.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-2-davemarchevsky@fb.com

libbpf: Fix ptr_is_aligned() usages

Currently ptr_is_aligned() takes size, and not alignment, as a
parameter, which may be overly pessimistic e.g. for __i128 on s390,
which must be only 8-byte aligned. Fix by using btf__align_of().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211021104658.624944-2-iii@linux.ibm.com

Merge branch 'Add bpf_skc_to_unix_sock() helper'

Hengqi Chen says:

====================

This patch set adds a new BPF helper bpf_skc_to_unix_sock().
The helper is used in tracing programs to cast a socket
pointer to a unix_sock pointer.

v2->v3:
- Use abstract socket in selftest (Alexei)
- Run checkpatch script over patches (Andrii)

v1->v2:
- Update selftest, remove trailing spaces changes (Song)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Test bpf_skc_to_unix_sock() helper

Add a new test which triggers unix_listen kernel function
to test bpf_skc_to_unix_sock helper.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-3-hengqi.chen@gmail.com

bpf: Add bpf_skc_to_unix_sock() helper

The helper is used in tracing programs to cast a socket
pointer to a unix_sock pointer.
The return value could be NULL if the casting is illegal.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-2-hengqi.chen@gmail.com

samples: bpf: Suppress readelf stderr when probing for BTF support

When compiling bpf samples, the following warning appears:

readelf: Error: Missing knowledge of 32-bit reloc types used in DWARF
sections of machine number 247
readelf: Warning: unable to apply unsupported reloc type 10 to section
.debug_info
readelf: Warning: unable to apply unsupported reloc type 1 to section
.debug_info
readelf: Warning: unable to apply unsupported reloc type 10 to section
.debug_info

Same problem was mentioned in commit 2f0921262ba9 ("selftests/bpf:
suppress readelf stderr when probing for BTF support"), let's use
readelf that supports btf.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021123913.48833-1-pulehui@huawei.com

net: bpf: Switch over to memdup_user()

This patch fixes the following Coccinelle warning:

net/bpf/test_run.c:361:8-15: WARNING opportunity for memdup_user
net/bpf/test_run.c:1055:8-15: WARNING opportunity for memdup_user

Use memdup_user rather than duplicating its implementation
This is a little bit restricted to reduce false positives

Signed-off-by: Qing Wang <wangqing@vivo.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1634556651-38702-1-git-send-email-wangqing@vivo.com

selftests/bpf: Some more atomic tests

Some new verifier tests that hit some important gaps in the parameter
space for atomic ops.

There are already exhaustive tests for the JIT part in
lib/test_bpf.c, but these exercise the verifier too.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211015093318.1273686-1-jackmanb@google.com

Merge branch 'btf_dump fixes for s390'

Ilya Leoshkevich says:

====================

This series along with [1] and [2] fixes all the failures in the
btf_dump testsuite currently present on s390, in particular:

* [1] fixes intermittent build bug causing "failed to encode tag ..."
* error messages.
* [2] fixes missing VAR entries on s390.
* Patch 1 disables Intel-specific code in a testcase.
* Patch 2 fixes an endianness-related bug.
* Patch 3 fixes an alignment-related bug.
* Patch 4 improves overly pessimistic alignment handling.

[1] https://lore.kernel.org/bpf/20211012022521.399302-1-iii@linux.ibm.com/
[2] https://lore.kernel.org/bpf/20211012022637.399365-1-iii@linux.ibm.com/

v1: https://lore.kernel.org/bpf/20211012023218.399568-1-iii@linux.ibm.com/
v1 -> v2:
- Remove redundant local variables, use t->size directly instead.

Best regards,
Ilya
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

libbpf: Fix dumping non-aligned __int128

Non-aligned integers are dumped as bitfields, which is supported for at
most 64-bit integers. Fix by using the same trick as
btf_dump_float_data(): copy non-aligned values to the local buffer.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-4-iii@linux.ibm.com

libbpf: Fix dumping big-endian bitfields

On big-endian arches not only bytes, but also bits are numbered in
reverse order (see e.g. S/390 ELF ABI Supplement, but this is also true
for other big-endian arches as well).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-3-iii@linux.ibm.com

selftests/bpf: Use cpu_number only on arches that have it

cpu_number exists only on Intel and aarch64, so skip the test involing
it on other arches. An alternative would be to replace it with an
exported non-ifdefed primitive-typed percpu variable from the common
code, but there appears to be none.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-2-iii@linux.ibm.com

samples/bpf: Fix application of sizeof to pointer

The coccinelle check report:
"./samples/bpf/xdp_redirect_cpu_user.c:397:32-38:
ERROR: application of sizeof to pointer"
Using the "strlen" to fix it.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: David Yang <davidcomponentone@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211012111649.983253-1-davidcomponentone@gmail.com

bpftool: Remove useless #include to <perf-sys.h> from map_perf_ring.c

The header is no longer needed since the event_pipe implementation
was updated to rely on libbpf's perf_buffer. This makes bpftool free of
dependencies to perf files, and we can update the Makefile accordingly.

Fixes: 9b190f185d2f ("tools/bpftool: switch map event_pipe to libbpf's perf_buffer")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211020094826.16046-1-quentin@isovalent.com

selftests/bpf: Remove duplicated include in cgroup_helpers

Fix following checkincludes.pl warning:
./scripts/checkincludes.pl tools/testing/selftests/bpf/cgroup_helpers.c
tools/testing/selftests/bpf/cgroup_helpers.c: unistd.h is included more
than once.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211012023231.19911-1-wanjiabing@vivo.com

bpf/preload: Clean up .gitignore and "clean-files" target

kernel/bpf/preload/Makefile was recently updated to have it install
libbpf's headers locally instead of pulling them from tools/lib/bpf. But
two items still need to be addressed.

First, the local .gitignore file was not adjusted to ignore the files
generated in the new kernel/bpf/preload/libbpf output directory.

Second, the "clean-files" target is now incorrect. The old artefacts
names were not removed from the target, while the new ones were added
incorrectly. This is because "clean-files" expects names relative to
$(obj), but we passed the absolute path instead. This results in the
output and header-destination directories for libbpf (and their
contents) not being removed from kernel/bpf/preload on "make clean" from
the root of the repository.

This commit fixes both issues. Note that $(userprogs) needs not be added
to "clean-files", because the cleaning infrastructure already accounts
for it.

Cleaning the files properly also prevents make from printing the
following message, for builds coming after a "make clean":
"make[4]: Nothing to be done for 'install_headers'."

v2: Simplify the "clean-files" target.

Fixes: bf60791741d4 ("bpf: preload: Install libbpf headers when building")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211020094647.15564-1-quentin@isovalent.com

libbpf: Migrate internal use of bpf_program__get_prog_info_linear

In preparation for bpf_program__get_prog_info_linear deprecation, move
the single use in libbpf to call bpf_obj_get_info_by_fd directly.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211011082031.4148337-2-davemarchevsky@fb.com

bpf: Silence Coverity warning for find_kfunc_desc_btf

The helper function returns a pointer that in the failure case encodes
an error in the struct btf pointer. The current code lead to Coverity
warning about the use of the invalid pointer:

*** CID 1507963:  Memory - illegal accesses  (USE_AFTER_FREE)
/kernel/bpf/verifier.c: 1788 in find_kfunc_desc_btf()
1782                          return ERR_PTR(-EINVAL);
1783                  }
1784
1785                  kfunc_btf = __find_kfunc_desc_btf(env, offset, btf_modp);
1786                  if (IS_ERR_OR_NULL(kfunc_btf)) {
1787                          verbose(env, "cannot find module BTF for func_id %u\n", func_id);
>>>      CID 1507963:  Memory - illegal accesses  (USE_AFTER_FREE)
>>>      Using freed pointer "kfunc_btf".
1788                          return kfunc_btf ?: ERR_PTR(-ENOENT);
1789                  }
1790                  return kfunc_btf;
1791          }
1792          return btf_vmlinux ?: ERR_PTR(-ENOENT);
1793     }

Daniel suggested the use of ERR_CAST so that the intended use is clear
to Coverity, but on closer look it seems that we never return NULL from
the helper. Andrii noted that since __find_kfunc_desc_btf already logs
errors for all cases except btf_get_by_fd, it is much easier to add
logging for that and remove the IS_ERR check altogether, returning
directly from it.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211009040900.803436-1-memxor@gmail.com

Merge branch 'fixes for bpftool's Makefile'

Quentin Monnet says:

====================

This set contains one fix for bpftool's Makefile, to make sure that the
headers internal to libbpf are installed properly even if we add more
headers to the relevant Makefile variable in the future (although we'd like
to avoid that if possible).

The other patches aim at cleaning up the output from the Makefile, in
particular when running the command "make" another time after bpftool is
built.
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

bpftool: Turn check on zlib from a phony target into a conditional error

One of bpftool's object files depends on zlib. To make sure we do not
attempt to build that object when the library is not available, commit
d66fa3c70e59 ("tools: bpftool: add feature check for zlib") introduced a
feature check to detect whether zlib is present.

This check comes as a rule for which the target ("zdep") is a
nonexistent file (phony target), which means that the Makefile always
attempts to rebuild it. It is mostly harmless. However, one side effect
is that, on running again once bpftool is already built, make considers
that "something" (the recipe for zdep) was executed, and does not print
the usual message "make: Nothing to be done for 'all'", which is a
user-friendly indicator that the build went fine.

Before, with some level of debugging information:

    $ make --debug=m
    [...]
    Reading makefiles...

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

    Updating makefiles....
    Updating goal targets....
     File 'all' does not exist.
           File 'zdep' does not exist.
          Must remake target 'zdep'.
     File 'all' does not exist.
    Must remake target 'all'.
    Successfully remade target file 'all'.

After the patch:

    $ make --debug=m
    [...]

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

    Updating makefiles....
    Updating goal targets....
     File 'all' does not exist.
    Must remake target 'all'.
    Successfully remade target file 'all'.
    make: Nothing to be done for 'all'.

(Note the last line, which is not part of make's debug information.)

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211009210341.6291-4-quentin@isovalent.com

bpftool: Do not FORCE-build libbpf

In bpftool's Makefile, libbpf has a FORCE dependency, to make sure we
rebuild it in case its source files changed. Let's instead make the
rebuild depend on the source files directly, through a call to the
"$(wildcard ...)" function. This avoids descending into libbpf's
directory if there is nothing to update.

Do the same for the bootstrap libbpf version.

This results in a slightly faster operation and less verbose output when
running make a second time in bpftool's directory.

Before:

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

    make[1]: Entering directory '/root/dev/linux/tools/lib/bpf'
    make[1]: Entering directory '/root/dev/linux/tools/lib/bpf'
    make[1]: Nothing to be done for 'install_headers'.
    make[1]: Leaving directory '/root/dev/linux/tools/lib/bpf'
    make[1]: Leaving directory '/root/dev/linux/tools/lib/bpf'

After:

    Auto-detecting system features:
    ...                        libbfd: [ on  ]
    ...        disassembler-four-args: [ on  ]
    ...                          zlib: [ on  ]
    ...                        libcap: [ on  ]
    ...               clang-bpf-co-re: [ on  ]

Other ways to clean up the output could be to pass the "-s" option, or
to redirect the output to >/dev/null, when calling make recursively to
descend into libbpf's directory. However, this would suppress some
useful output if something goes wrong during the build. A better
alternative would be to pass "--no-print-directory" to the recursive
make, but that would still leave us with some noise for
"install_headers". Skipping the descent into libbpf's directory if no
source file has changed works best, and seems the most logical option
overall.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211009210341.6291-3-quentin@isovalent.com

bpftool: Fix install for libbpf's internal header(s)

We recently updated bpftool's Makefile to make it install the headers
from libbpf, instead of pulling them directly from libbpf's directory.
There is also an additional header, internal to libbpf, that needs be
installed. The way that bpftool's Makefile installs that particular
header is currently correct, but would break if we were to modify
$(LIBBPF_INTERNAL_HDRS) to make it point to more than one header.

Use a static pattern rule instead, so that the Makefile can withstand
the addition of other headers to install.

The objective is simply to make the Makefile more robust. It should
_not_ be read as an invitation to import more internal headers from
libbpf into bpftool.

Fixes: f012ade10b34 ("bpftool: Install libbpf headers instead of including the dir")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211009210341.6291-2-quentin@isovalent.com

libbpf: Remove Makefile warnings on out-of-sync netlink.h/if_link.h

Although relying on some definitions from the netlink.h and if_link.h
headers copied into tools/include/uapi/linux/, libbpf does not need
those headers to stay entirely up-to-date with their original versions,
and the warnings emitted by the Makefile when it detects a difference
are usually just noise. Let's remove those warnings.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211010002528.9772-1-quentin@isovalent.com

bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG

Patch set [1] introduced BTF_KIND_TAG to allow tagging
declarations for struct/union, struct/union field, var, func
and func arguments and these tags will be encoded into
dwarf. They are also encoded to btf by llvm for the bpf target.

After BTF_KIND_TAG is introduced, we intended to use it
for kernel __user attributes. But kernel __user is actually
a type attribute. Upstream and internal discussion showed
it is not a good idea to mix declaration attribute and
type attribute. So we proposed to introduce btf_type_tag
as a type attribute and existing btf_tag renamed to
btf_decl_tag ([2]).

This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
other declarations with *_tag to *_decl_tag to make it clear
the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
might be introduced per [3].

[1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
[2] https://reviews.llvm.org/D111588
[3] https://reviews.llvm.org/D111199

Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG")
Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com

bpf, mips: Fix comment on tail call count limiting

In emit_tail_call() of bpf_jit_comp32.c, "blez t2" (t2 <= 0) is
not consistent with the comment "t2 < 0", update the comment to
keep consistency.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/bpf/1633915150-13220-3-git-send-email-yangtiezhu@loongson.cn

bpf, mips: Clean up config options about JIT

The config options MIPS_CBPF_JIT and MIPS_EBPF_JIT are useless, remove
them in arch/mips/Kconfig, and then modify arch/mips/net/Makefile.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/bpf/1633915150-13220-2-git-send-email-yangtiezhu@loongson.cn

selftests/bpf: Skip verifier tests that fail to load with ENOTSUPP

The verifier tests added in commit c48e51c8b07a ("bpf: selftests: Add
selftests for module kfunc support") fail on s390, since the JIT does
not support calling kernel functions. This is most likely an issue for
all the other non-Intel arches, as well as on Intel with
!CONFIG_DEBUG_INFO_BTF or !CONFIG_BPF_JIT.

Trying to check for messages from all the possible add_kfunc_call()
failure cases in test_verifier looks pointless, so do a much simpler
thing instead: just like it's already done in do_prog_test_run(), skip
the tests that fail to load with ENOTSUPP.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211007173329.381754-1-iii@linux.ibm.com

Merge branch 'selftests/bpf: Add parallelism to test_progs'

Yucong Sun says:

====================

This patch series adds "-j" parelell execution to test_progs, with "--debug" to
display server/worker communications. Also, some Tests that often fails in
parallel are marked as serial test, and it will run in sequence after parallel
execution is done.

This patch series also adds a error summary after all tests execution finished.

V6 -> V5:
  * adding error summary logic for non parallel mode too.
  * changed how serial tests are implemented, use main process instead of worker 0.
  * fixed a dozen broken test when running in parallel.

V5 -> V4:
  * change to SOCK_SEQPACKET for close notification.
  * move all debug output to "--debug" mode
  * output log as test finish, and all error logs again after summary line.
  * variable naming / style changes
  * adds serial_test_name() to replace serial test lists.
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

selfetest/bpf: Make some tests serial

Change tests that often fails in parallel execution mode to serial.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-15-fallentree@fb.com

selftests/bpf: Fix pid check in fexit_sleep test

bpf_get_current_pid_tgid() returns u64, whose upper 32 bits are the same
as userspace getpid() return value.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-13-fallentree@fb.com

selftests/bpf: Adding pid filtering for atomics test

This make atomics test able to run in parallel with other tests.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-11-fallentree@fb.com

selftests/bpf: Make cgroup_v1v2 use its own port

This patch change cgroup_v1v2 use a different port, avoid conflict with
other tests.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-8-fallentree@fb.com

selftests/bpf: Fix race condition in enable_stats

In parallel execution mode, this test now need to use atomic operation
to avoid race condition.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-7-fallentree@fb.com

selftests/bpf: Add per worker cgroup suffix

This patch make each worker use a unique cgroup base directory, thus
allowing tests that uses cgroups to run concurrently.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-5-fallentree@fb.com

selftests/bpf: Allow some tests to be executed in sequence

This patch allows tests to define serial_test_name() instead of
test_name(), and this will make test_progs execute those in sequence
after all other tests finished executing concurrently.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-3-fallentree@fb.com

selftests/bpf: Add parallelism to test_progs

This patch adds "-j" mode to test_progs, executing tests in multiple
process.  "-j" mode is optional, and works with all existing test
selection mechanism, as well as "-v", "-l" etc.

In "-j" mode, main process use UDS/SEQPACKET to communicate to each forked
worker, commanding it to run tests and collect logs. After all tests are
finished, a summary is printed. main process use multiple competing
threads to dispatch work to worker, trying to keep them all busy.

The test status will be printed as soon as it is finished, if there are
error logs, it will be printed after the final summary line.

By specifying "--debug", additional debug information on server/worker
communication will be printed.

Example output:
  > ./test_progs -n 15-20 -j
  [   12.801730] bpf_testmod: loading out-of-tree module taints kernel.
  Launching 8 workers.
  #20 btf_split:OK
  #16 btf_endian:OK
  #18 btf_module:OK
  #17 btf_map_in_map:OK
  #19 btf_skc_cls_ingress:OK
  #15 btf_dump:OK
  Summary: 6/20 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-2-fallentree@fb.com

Merge branch 'add support for writable bare tracepoint'

Hou Tao says:

====================

From: Hou Tao <houtao1@huawei.com>

Hi,

The patchset series supports writable context for bare tracepoint.

The main idea comes from patchset "writable contexts for bpf raw
tracepoints" [1], but it only supports normal tracepoint with
associated trace event under tracefs. Now we have one use case
in which we add bare tracepoint in VFS layer, and update
file::f_mode for specific files. The reason using bare tracepoint
is that it doesn't form a ABI and we can change it freely. So
add support for it in BPF.

Comments are always welcome.

[1]: https://lore.kernel.org/lkml/20190426184951.21812-1-mmullins@fb.com

Change log:
v5:
* rebased on bpf-next
* patch 1: add Acked-by tag
* patch 2: handle invalid section name, make prefixes array being const

v4: https://www.spinics.net/lists/bpf/msg47021.html
* rebased on bpf-next
* update patch 2 to add support for writable raw tracepoint attachment
   in attach_raw_tp().
* update patch 3 to add Acked-by tag

v3: https://www.spinics.net/lists/bpf/msg46824.html
  * use raw_tp.w instead of raw_tp_writable as section
    name of writable tp
  * use ASSERT_XXX() instead of CHECK()
  * define a common macro for "/sys/kernel/bpf_testmod"

v2: https://www.spinics.net/lists/bpf/msg46356.html
  * rebase on bpf-next tree
  * address comments from Yonghong Song
  * rename bpf_testmode_test_writable_ctx::ret as early_ret to reflect
    its purpose better.

v1: https://www.spinics.net/lists/bpf/msg46221.html
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

bpf/selftests: Add test for writable bare tracepoint

Add a writable bare tracepoint in bpf_testmod module, and
trigger its calling when reading /sys/kernel/bpf_testmod
with a specific buffer length. The reading will return
the value in writable context if the early return flag
is enabled in writable context.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-4-hotforest@gmail.com

libbpf: Support detecting and attaching of writable tracepoint program

Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com

bpf: Support writable context for bare tracepoint

Commit 9df1c28bb752 ("bpf: add writable context for raw tracepoints")
supports writable context for tracepoint, but it misses the support
for bare tracepoint which has no associated trace event.

Bare tracepoint is defined by DECLARE_TRACE(), so adding a corresponding
DECLARE_TRACE_WRITABLE() macro to generate a definition in __bpf_raw_tp_map
section for bare tracepoint in a similar way to DEFINE_TRACE_WRITABLE().

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-2-hotforest@gmail.com

Merge branch 'install libbpf headers when using the library'

Quentin Monnet says:

====================

Libbpf is used at several locations in the repository. Most of the time,
the tools relying on it build the library in its own directory, and include
the headers from there. This works, but this is not the cleanest approach.
It generates objects outside of the directory of the tool which is being
built, and it also increases the risk that developers include a header file
internal to libbpf, which is not supposed to be exposed to user
applications.

This set adjusts all involved Makefiles to make sure that libbpf is built
locally (with respect to the tool's directory or provided build directory),
and by ensuring that "make install_headers" is run from libbpf's Makefile
to export user headers properly.

This comes at a cost: given that the libbpf was so far mostly compiled in
its own directory by the different components using it, compiling it once
would be enough for all those components. With the new approach, each
component compiles its own version. To mitigate this cost, efforts were
made to reuse the compiled library when possible:

- Make the bpftool version in samples/bpf reuse the library previously
  compiled for the selftests.
- Make the bpftool version in BPF selftests reuse the library previously
  compiled for the selftests.
- Similarly, make resolve_btfids in BPF selftests reuse the same compiled
  library.
- Similarly, make runqslower in BPF selftests reuse the same compiled
  library; and make it rely on the bpftool version also compiled from the
  selftests (instead of compiling its own version).
- runqslower, when compiled independently, needs its own version of
  bpftool: make them share the same compiled libbpf.

As a result:

- Compiling the samples/bpf should compile libbpf just once.
- Compiling the BPF selftests should compile libbpf just once.
- Compiling the kernel (with BTF support) should now lead to compiling
  libbpf twice: one for resolve_btfids, one for kernel/bpf/preload.
- Compiling runqslower individually should compile libbpf just once. Same
  thing for bpftool, resolve_btfids, and kernel/bpf/preload/iterators.

(Not accounting for the boostrap version of libbpf required by bpftool,
which was already placed under a dedicated .../boostrap/libbpf/ directory,
and for which the count remains unchanged.)

A few commits in the series also contain drive-by clean-up changes for
bpftool includes, samples/bpf/.gitignore, or test_bpftool_build.sh. Please
refer to individual commit logs for details.

v4:
  - Make the "libbpf_hdrs" dependency an order-only dependency in
    kernel/bpf/preload/Makefile, samples/bpf/Makefile, and
    tools/bpf/runqslower/Makefile. This is to avoid to unconditionally
    recompile the targets.
  - samples/bpf/.gitignore: prefix objects with a "/" to mark that we
    ignore them when at the root of the samples/bpf/ directory.
  - libbpf: add a commit to make "install_headers" depend on the header
    files, to avoid exporting again if the sources are older than the
    targets. This ensures that applications relying on those headers are
    not rebuilt unnecessarily.
  - bpftool: uncouple the copy of nlattr.h from libbpf target, to have it
    depend on the source header itself. By avoiding to reinstall this
    header every time, we avoid unnecessary builds of bpftool.
  - samples/bpf: Add a new commit to remove the FORCE dependency for
    libbpf, and replace it with a "$(wildcard ...)" on the .c/.h files in
    libbpf's directory. This is to avoid always recompiling libbpf/bpftool.
  - Adjust prefixes in commit subjects.

v3:
  - Remove order-only dependencies on $(LIBBPF_INCLUDE) (or equivalent)
    directories, given that they are created by libbpf's Makefile.
  - Add libbpf as a dependency for bpftool/resolve_btfids/runqslower when
    they are supposed to reuse a libbpf compiled previously. This is to
    avoid having several libbpf versions being compiled simultaneously in
    the same directory with parallel builds. Even if this didn't show up
    during tests, let's remain on the safe side.
  - kernel/bpf/preload/Makefile: Rename libbpf-hdrs (dash) dependency as
    libbpf_hdrs.
  - samples/bpf/.gitignore: Add bpftool/
  - samples/bpf/Makefile: Change "/bin/rm -rf" to "$(RM) -r".
  - samples/bpf/Makefile: Add missing slashes for $(LIBBPF_OUTPUT) and
    $(LIBBPF_DESTDIR) when buildling bpftool
  - samples/bpf/Makefile: Add a dependency to libbpf's headers for
    $(TRACE_HELPERS).
  - bpftool's Makefile: Use $(LIBBPF) instead of equivalent (but longer)
    $(LIBBPF_OUTPUT)libbpf.a
  - BPF iterators' Makefile: build bpftool in .output/bpftool (instead of
    .output/), add and clean up variables.
  - runqslower's Makefile: Add an explicit dependency on libbpf's headers
    to several objects. The dependency is not required (libbpf should have
    been compiled and so the headers exported through other dependencies
    for those targets), but they better mark the logical dependency and
    should help if exporting the headers changed in the future.
  - New commit to add an "install-bin" target to bpftool, to avoid
    installing bash completion when buildling BPF iterators and selftests.

v2: Declare an additional dependency on libbpf's headers for
    iterators/iterators.o in kernel/preload/Makefile to make sure that
    these headers are exported before we compile the object file (and not
    just before we link it).
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

bpftool: Add install-bin target to install binary only

With "make install", bpftool installs its binary and its bash completion
file. Usually, this is what we want. But a few components in the kernel
repository (namely, BPF iterators and selftests) also install bpftool
locally before using it. In such a case, bash completion is not
necessary and is just a useless build artifact.

Let's add an "install-bin" target to bpftool, to offer a way to install
the binary only.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-13-quentin@isovalent.com

selftests/bpf: Better clean up for runqslower in test_bpftool_build.sh

The script test_bpftool_build.sh attempts to build bpftool in the
various supported ways, to make sure nothing breaks.

One of those ways is to run "make tools/bpf" from the root of the kernel
repository. This command builds bpftool, along with the other tools
under tools/bpf, and runqslower in particular. After running the
command and upon a successful bpftool build, the script attempts to
cleanup the generated objects. However, after building with this target
and in the case of runqslower, the files are not cleaned up as expected.

This is because the "tools/bpf" target sets $(OUTPUT) to
.../tools/bpf/runqslower/ when building the tool, causing the object
files to be placed directly under the runqslower directory. But when
running "cd tools/bpf; make clean", the value for $(OUTPUT) is set to
".output" (relative to the runqslower directory) by runqslower's
Makefile, and this is where the Makefile looks for files to clean up.

We cannot easily fix in the root Makefile (where "tools/bpf" is defined)
or in tools/scripts/Makefile.include (setting $(OUTPUT)), where changing
the way the output variables are passed would likely have consequences
elsewhere. We could change runqslower's Makefile to build in the
repository instead of in a dedicated ".output/", but doing so just to
accommodate a test script doesn't sound great. Instead, let's just make
sure that we clean up runqslower properly by adding the correct command
to the script.

This will attempt to clean runqslower twice: the first try with command
"cd tools/bpf; make clean" will search for tools/bpf/runqslower/.output
and fail to clean it (but will still clean the other tools, in
particular bpftool), the second one (added in this commit) sets the
$(OUTPUT) variable like for building with the "tool/bpf" target and
should succeed.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-12-quentin@isovalent.com

samples/bpf: Do not FORCE-recompile libbpf

In samples/bpf/Makefile, libbpf has a FORCE dependency that force it to
be rebuilt. I read this as a way to keep the library up-to-date, given
that we do not have, in samples/bpf, a list of the source files for
libbpf itself. However, a better approach would be to use the
"$(wildcard ...)" function from make, and to have libbpf depend on all
the .c and .h files in its directory. This is what samples/bpf/Makefile
does for bpftool, and also what the BPF selftests' Makefile does for
libbpf.

Let's update the Makefile to avoid rebuilding libbpf all the time (and
bpftool on top of it).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-11-quentin@isovalent.com

samples/bpf: Install libbpf headers when building

API headers from libbpf should not be accessed directly from the source
directory. Instead, they should be exported with "make install_headers".
Make sure that samples/bpf/Makefile installs the headers properly when
building.

The object compiled from and exported by libbpf are now placed into a
subdirectory of sample/bpf/ instead of remaining in tools/lib/bpf/. We
attempt to remove this directory on "make clean". However, the "clean"
target re-enters the samples/bpf/ directory from the root of the
repository ("$(MAKE) -C ../../ M=$(CURDIR) clean"), in such a way that
$(srctree) and $(src) are not defined, making it impossible to use
$(LIBBPF_OUTPUT) and $(LIBBPF_DESTDIR) in the recipe. So we only attempt
to clean $(CURDIR)/libbpf, which is the default value.

Add a dependency on libbpf's headers for the $(TRACE_HELPERS).

We also change the output directory for bpftool, to place the generated
objects under samples/bpf/bpftool/ instead of building in bpftool's
directory directly. Doing so, we make sure bpftool reuses the libbpf
library previously compiled and installed.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-10-quentin@isovalent.com

samples/bpf: Update .gitignore

Update samples/bpf/.gitignore to ignore files generated when building
the samples. Add:

  - vmlinux.h
  - the generated skeleton files (*.skel.h)
  - the samples/bpf/libbpf/ and .../bpftool/ directories, in preparation
    of a future commit which introduces a local output directory for
    building libbpf and bpftool.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-9-quentin@isovalent.com

bpf: iterators: Install libbpf headers when building

API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that bpf/preload/iterators/Makefile
installs the headers properly when building.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-8-quentin@isovalent.com

bpf: preload: Install libbpf headers when building

API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that bpf/preload/Makefile installs the
headers properly when building.

Note that we declare an additional dependency for iterators/iterators.o:
having $(LIBBPF_A) as a dependency to "$(obj)/bpf_preload_umd" is not
sufficient, as it makes it required only at the linking step. But we
need libbpf to be compiled, and in particular its headers to be
exported, before we attempt to compile iterators.o. The issue would not
occur before this commit, because libbpf's headers were not exported and
were always available under tools/lib/bpf.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-7-quentin@isovalent.com

tools/runqslower: Install libbpf headers when building

API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that runqslower installs the
headers properly when building.

We use a libbpf_hdrs target to mark the logical dependency on libbpf's
headers export for a number of object files, even though the headers
should have been exported at this time (since bpftool needs them, and is
required to generate the skeleton or the vmlinux.h).

When descending from a parent Makefile, the specific output directories
for building the library and exporting the headers are configurable with
BPFOBJ_OUTPUT and BPF_DESTDIR, respectively. This is in addition to
OUTPUT, on top of which those variables are constructed by default.

Also adjust the Makefile for the BPF selftests. We pass a number of
variables to the "make" invocation, because we want to point runqslower
to the (target) libbpf shared with other tools, instead of building its
own version. In addition, runqslower relies on (target) bpftool, and we
also want to pass the proper variables to its Makefile so that bpftool
itself reuses the same libbpf.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-6-quentin@isovalent.com

tools/resolve_btfids: Install libbpf headers when building

API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that resolve_btfids installs the
headers properly when building.

When descending from a parent Makefile, the specific output directories
for building the library and exporting the headers are configurable with
LIBBPF_OUT and LIBBPF_DESTDIR, respectively. This is in addition to
OUTPUT, on top of which those variables are constructed by default.

Also adjust the Makefile for the BPF selftests in order to point to the
(target) libbpf shared with other tools, instead of building a version
specific to resolve_btfids. Remove libbpf's order-only dependencies on
the include directories (they are created by libbpf and don't need to
exist beforehand).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-5-quentin@isovalent.com

bpftool: Install libbpf headers instead of including the dir

Bpftool relies on libbpf, therefore it relies on a number of headers
from the library and must be linked against the library. The Makefile
for bpftool exposes these objects by adding tools/lib as an include
directory ("-I$(srctree)/tools/lib"). This is a working solution, but
this is not the cleanest one. The risk is to involuntarily include
objects that are not intended to be exposed by the libbpf.

The headers needed to compile bpftool should in fact be "installed" from
libbpf, with its "install_headers" Makefile target. In addition, there
is one header which is internal to the library and not supposed to be
used by external applications, but that bpftool uses anyway.

Adjust the Makefile in order to install the header files properly before
compiling bpftool. Also copy the additional internal header file
(nlattr.h), but call it out explicitly. Build (and install headers) in a
subdirectory under bpftool/ instead of tools/lib/bpf/. When descending
from a parent Makefile, this is configurable by setting the OUTPUT,
LIBBPF_OUTPUT and LIBBPF_DESTDIR variables.

Also adjust the Makefile for BPF selftests, so as to reuse the (host)
libbpf compiled earlier and to avoid compiling a separate version of the
library just for bpftool.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-4-quentin@isovalent.com

bpftool: Remove unused includes to <bpf/bpf_gen_internal.h>

It seems that the header file was never necessary to compile bpftool,
and it is not part of the headers exported from libbpf. Let's remove the
includes from prog.c and gen.c.

Fixes: d510296d331a ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-3-quentin@isovalent.com

libbpf: Skip re-installing headers file if source is older than target

The "install_headers" target in libbpf's Makefile would unconditionally
export all API headers to the target directory. When those headers are
installed to compile another application, this means that make always
finds newer dependencies for the source files relying on those headers,
and deduces that the targets should be rebuilt.

Avoid that by making "install_headers" depend on the source header
files, and (re-)install them only when necessary.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-2-quentin@isovalent.com

selftests/bpf: Fix btf_dump test under new clang

New clang version changed ([0]) type name in dwarf from "long int" to "long",
this is causing btf_dump tests to fail.

[0] https://github.com/llvm/llvm-project/commit/f6a561c4d6754b13165a49990e8365d819f64c86

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211008173139.1457407-1-fallentree@fb.com

selftests/bpf: Remove SEC("version") from test progs

Since commit 6c4fc209fcf9d ("bpf: remove useless version check for prog
load") these "version" sections, which result in bpf_attr.kern_version
being set, have been unnecessary.

Remove them so that it's obvious to folks using selftests as a guide that
"modern" BPF progs don't need this section.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007231234.2223081-1-davemarchevsky@fb.com

selftests/bpf: Skip the second half of get_branch_snapshot in vm

VMs running on upstream 5.12+ kernel support LBR. However,
bpf_get_branch_snapshot couldn't stop the LBR before too many entries
are flushed. Skip the hit/waste test for VMs before we find a proper fix
for LBR in VM.

Fixes: 025bd7c753aa ("selftests/bpf: Add test for bpf_get_branch_snapshot")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007050231.728496-1-songliubraving@fb.com

bpf, tests: Add more LD_IMM64 tests

This patch adds new tests for the two-instruction LD_IMM64. The new tests
verify the operation with immediate values of different byte patterns.
Mainly intended to cover JITs that want to be clever when loading 64-bit
constants.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211007143006.634308-1-johan.almbladh@anyfinetworks.com

mips, bpf: Optimize loading of 64-bit constants

This patch shaves off a few instructions when loading sparse 64-bit
constants to register. The change is covered by additional tests in
lib/test_bpf.c.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211007142828.634182-1-johan.almbladh@anyfinetworks.com

mips, bpf: Fix Makefile that referenced a removed file

This patch removes a stale Makefile reference to the cBPF JIT that was
removed.

Fixes: ebcbacfa50ec ("mips, bpf: Remove old BPF JIT implementations")
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211007142339.633899-1-johan.almbladh@anyfinetworks.com

bpf, x64: Factor out emission of REX byte in more cases

Introduce a single reg version of maybe_emit_mod() and factor out
common code in more cases.

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211006194135.608932-1-jmeng@fb.com

Merge branch 'libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7'

Hengqi Chen says:

====================

bpf_{map,program}__{prev,next} don't follow the libbpf API naming
convention. Deprecate them and replace them with a new set of APIs
named bpf_object__{prev,next}_{program,map}.

v1->v2: [0]
* Addressed Andrii's comments

[0]: https://patchwork.kernel.org/project/netdevbpf/patch/20210906165456.325999-1-hengqi.chen@gmail.com/
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

libbpf: Deprecate bpf_object__unload() API since v0.6

BPF objects are not reloadable after unload. Users are expected to use
bpf_object__close() to unload and free up resources in one operation.
No need to expose bpf_object__unload() as a public API, deprecate it
([0]). Add bpf_object__unload() as an alias to internal
bpf_object_unload() and replace all bpf_object__unload() uses to avoid
compilation errors.

[0] Closes: https://github.com/libbpf/libbpf/issues/290

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com

selftests/bpf: Switch to new bpf_object__next_{map,program} APIs

Replace deprecated bpf_{map,program}__next APIs with newly added
bpf_object__next_{map,program} APIs, so that no compilation warnings
emit.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-3-hengqi.chen@gmail.com

libbpf: Add API documentation convention guidelines

This adds a section to the documentation for libbpf
naming convention which describes how to document
API features in libbpf, specifically the format of
which API doc comments need to conform to.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211004215644.497327-1-grantseltzer@gmail.com

libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7

Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with
a new set of APIs named bpf_object__{prev,next}_{program,map} which
follow the libbpf API naming convention ([0]). No functionality changes.

[0] Closes: https://github.com/libbpf/libbpf/issues/296

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com

selftest/bpf: Switch recursion test to use htab_map_delete_elem

Currently the recursion test is hooking __htab_map_lookup_elem
function, which is invoked both from bpf_prog and bpf syscall.

But in our kernel build, the __htab_map_lookup_elem gets inlined
within the htab_map_lookup_elem, so it's not trigered and the
test fails.

Fixing this by using htab_map_delete_elem, which is not inlined
for bpf_prog calls (like htab_map_lookup_elem is) and is used
directly as pointer for map_delete_elem, so it won't disappear
by inlining.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/YVnfFTL/3T6jOwHI@krava

bpf: Use $(pound) instead of \# in Makefiles

Recent-ish versions of make do no longer consider number signs ("#") as
comment symbols when they are inserted inside of a macro reference or in
a function invocation. In such cases, the symbols should not be escaped.

There are a few occurrences of "\#" in libbpf's and samples' Makefiles.
In the former, the backslash is harmless, because grep associates no
particular meaning to the escaped symbol and reads it as a regular "#".
In samples' Makefile, recent versions of make will pass the backslash
down to the compiler, making the probe fail all the time and resulting
in the display of a warning about "make headers_install" being required,
even after headers have been installed.

A similar issue has been addressed at some other locations by commit
9564a8cf422d ("Kbuild: fix # escaping in .cmd files for future Make").
Let's address it for libbpf's and samples' Makefiles in the same
fashion, by using a "$(pound)" variable (pulled from
tools/scripts/Makefile.include for libbpf, or re-defined for the
samples).

Reference for the change in make:
https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Fixes: 2f3830412786 ("libbpf: Make libbpf_version.h non-auto-generated")
Fixes: 07c3bbdb1a9b ("samples: bpf: print a warning about headers_install")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006111049.20708-1-quentin@isovalent.com

bpf, arm: Remove dummy bpf_jit_compile stub

The BPF core defines a __weak bpf_jit_compile() dummy function already
which should only be overridden by JITs if they actually implement a
legacy cBPF JIT. Given arm implements an eBPF JIT, this stub is not
needed.

Now that MIPS cBPF JIT is finally gone, the only JIT left that is still
implementing bpf_jit_compile() is the sparc32 one.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Merge branch 'bpf-mips-jit'

Johan Almbladh says:

====================
This is an implementation of an eBPF JIT for MIPS I-V and MIPS32/64 r1-r6.
The new JIT is written from scratch, but uses the same overall structure
as other eBPF JITs.

Before, the MIPS JIT situation looked like this.

  - 32-bit: MIPS32, cBPF-only, tests fail
  - 64-bit: MIPS64r2-r6, eBPF, tests fail, incomplete eBPF ISA support

The new JIT implementation raises the bar to the following level.

  - 32/64-bit: all MIPS ISA, eBPF, all tests pass, full eBPF ISA support

Overview
--------
The implementation supports all 32-bit and 64-bit eBPF instructions
defined as of this writing, including the recently-added atomics. It is
intended to provide good performance for native word size operations,
while also being complete so the JIT never has to fall back to the
interpreter. The new JIT replaces the current cBPF and eBPF JITs for MIPS.

The implementation is divided into separate files as follows. The source
files contains comments describing internal mechanisms and details on
things like eBPF-to-CPU register mappings, so I won't repeat that here.

  - jit_comp.[ch]    code shared between 32-bit and 64-bit JITs
  - jit_comp32.c     32-bit JIT implementation
  - jit_comp64.c     64-bit JIT implementation

Both the 32-bit and 64-bit versions map all eBPF registers to native MIPS
CPU registers. There are also enough unmapped CPU registers available to
allow all eBPF operations implemented natively by the JIT to use only CPU
registers without having to resort to stack scratch space.

Some operations are deemed too complex to implement natively in the JIT.
Those are instead implemented as a function call to a helper that performs
the operation. This is done in the following cases.

  - 64-bit div and mod on a 32-bit CPU
  - 64-bit atomics on a 32-bit CPU
  - 32-bit atomics on a 32-bit CPU that lacks ll/sc instructions

CPU errata workarounds
----------------------
The JIT implements workarounds for R10000, Loongson-2F and Loongson-3 CPU
errata. For the Loongson workarounds, I have used the public information
available on the matter.

Link: https://sourceware.org/legacy-ml/binutils/2009-11/msg00387.html
Testing
-------
During the development of the JIT, I have added a number of new test cases
to the test_bpf.ko test suite to be able to verify correctness of JIT
implementations in a more systematic way. The new additions increase the
test suite roughly three-fold, with many of the new tests being very
extensive and even exhaustive when feasible.

Link: https://lore.kernel.org/bpf/20211001130348.3670534-1-johan.almbladh@anyfinetworks.com/
Link: https://lore.kernel.org/bpf/20210914091842.4186267-1-johan.almbladh@anyfinetworks.com/
Link: https://lore.kernel.org/bpf/20210809091829.810076-1-johan.almbladh@anyfinetworks.com/
The JIT has been tested by running the test_bpf.ko test suite in QEMU with
the following MIPS ISAs, in both big and little endian mode, with and
without JIT hardening enabled.

  MIPS32r2, MIPS32r6, MIPS64r2, MIPS64r6

For the QEMU r2 targets, the correctness of pre-r2 code emitted has been
tested by manually overriding each of the following macros with 0.

  cpu_has_llsc, cpu_has_mips_2, cpu_has_mips_r1, cpu_has_mips_r2

Similarly, CPU errata workaround code has been tested by enabling the
each of the following configurations for the MIPS64r2 targets.

  CONFIG_WAR_R10000
  CONFIG_CPU_LOONGSON3_WORKAROUNDS
  CONFIG_CPU_NOP_WORKAROUNDS
  CONFIG_CPU_JUMP_WORKAROUNDS

The JIT passes all tests in all configurations. Below is the summary for
MIPS32r2 in little endian mode.

  test_bpf: Summary: 1006 PASSED, 0 FAILED, [994/994 JIT'ed]
  test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
  test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED

According to MIPS ISA reference documentation, the result of a 32-bit ALU
arithmetic operation on a 64-bit CPU is unpredictable if an operand
register value is not properly sign-extended to 64 bits. To verify the
code emitted by the JIT, the code generation engine in QEMU was modifed to
flip all low 32 bits if the above condition was not met. With this
trip-wire installed, the kernel booted properly in qemu-system-mips64el
and all test_bpf.ko tests passed.

Remaining features
------------------
While the JIT is complete is terms of eBPF ISA support, this series does
not include support for BPF-to-BPF calls and BPF trampolines. Those
features are planned to be added in another patch series.

The BPF_ST | BPF_NOSPEC instruction currently emits nothing. This is
consistent with the behavior if the MIPS interpreter and the existing
eBPF JIT.

Why not build on the existing eBPF JIT?
---------------------------------------
The existing eBPF JIT was originally written for MIPS64. An effort was
made to add MIPS32 support to it in commit 716850ab104d ("MIPS: eBPF:
Initial eBPF support for MIPS32 architecture."). That turned out to
contain a number of flaws, so eBPF support for MIPS32 was disabled in
commit 36366e367ee9 ("MIPS: BPF: Restore MIPS32 cBPF JIT").

Link: https://lore.kernel.org/bpf/5deaa994.1c69fb81.97561.647e@mx.google.com/
The current eBPF JIT for MIPS64 lacks a lot of functionality regarding
ALU32, JMP32 and atomic operations. It also lacks 32-bit CPU support on a
fundamental level, for example 32-bit CPU register mappings and o32 ABI
calling conventions. For optimization purposes, it tracks register usage
through the program control flow in order to do zero-extension and sign-
extension only when necessary, a static analysis of sorts. In my opinion,
having this kind of complexity in JITs, and for which there is not
adequate test coverage, is a problem. Such analysis should be done by the
verifier, if needed at all. Finally, when I run the BPF test suite
test_bpf.ko on the current JIT, there are errors and warnings.

I believe that an eBPF JIT should strive to be correct, complete and
optimized, and in that order. The JIT runs after the verifer has audited
the program and given its approval. If the JIT then emits code that does
something else, it will undermine the eBPF security model. A simple
implementation is easier to get correct than a complex one. Furthermore,
the real performance hit is not an extra CPU instruction here and there,
but when the JIT bails on an unimplemented eBPF instruction and cause the
whole program to fall back to the interpreter. My reasoning here boils
down to the following.

* The JIT should not contain a static analyzer that tracks branches.

* It is acceptable to emit possibly superfluous sign-/zero-extensions for
  ALU32 and JMP32 operations on a 64-bit MIPS to guarantee correctness.

* The JIT should handle all eBPF instructions on all MIPS CPUs.

I conclude that the current eBPF MIPS JIT is complex, incomplete and
incorrect. For the reasons stated above, I decided to not use the existing
JIT implementation.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

mips, bpf: Remove old BPF JIT implementations

This patch removes the old 32-bit cBPF and 64-bit eBPF JIT implementations.
They are replaced by a new eBPF implementation that supports both 32-bit
and 64-bit MIPS CPUs.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-8-johan.almbladh@anyfinetworks.com

mips, bpf: Enable eBPF JITs

This patch enables the new eBPF JITs for 32-bit and 64-bit MIPS. It also
disables the old cBPF JIT to so cBPF programs are converted to use the
new JIT.

Workarounds for R4000 CPU errata are not implemented by the JIT, so the
JIT is disabled if any of those workarounds are configured.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-7-johan.almbladh@anyfinetworks.com

mips, bpf: Add JIT workarounds for CPU errata

This patch adds workarounds for the following CPU errata to the MIPS
eBPF JIT, if enabled in the kernel configuration.

  - R10000 ll/sc weak ordering
  - Loongson-3 ll/sc weak ordering
  - Loongson-2F jump hang

The Loongson-2F nop errata is implemented in uasm, which the JIT uses,
so no additional mitigations are needed for that.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-6-johan.almbladh@anyfinetworks.com

mips, bpf: Add new eBPF JIT for 64-bit MIPS

This is an implementation on of an eBPF JIT for 64-bit MIPS III-V and
MIPS64r1-r6. It uses the same framework introduced by the 32-bit JIT.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-5-johan.almbladh@anyfinetworks.com

mips, bpf: Add eBPF JIT for 32-bit MIPS

This is an implementation of an eBPF JIT for 32-bit MIPS I-V and MIPS32.
The implementation supports all 32-bit and 64-bit ALU and JMP operations,
including the recently-added atomics. 64-bit div/mod and 64-bit atomics
are implemented using function calls to math64 and atomic64 functions,
respectively. All 32-bit operations are implemented natively by the JIT,
except if the CPU lacks ll/sc instructions.

Register mapping
================
All 64-bit eBPF registers are mapped to native 32-bit MIPS register pairs,
and does not use any stack scratch space for register swapping. This means
that all eBPF register data is kept in CPU registers all the time, and
this simplifies the register management a lot. It also reduces the JIT's
pressure on temporary registers since we do not have to move data around.

Native register pairs are ordered according to CPU endiannes, following
the O32 calling convention for passing 64-bit arguments and return values.
The eBPF return value, arguments and callee-saved registers are mapped to
their native MIPS equivalents.

Since the 32 highest bits in the eBPF FP (frame pointer) register are
always zero, only one general-purpose register is actually needed for the
mapping. The MIPS fp register is used for this purpose. The high bits are
mapped to MIPS register r0. This saves us one CPU register, which is much
needed for temporaries, while still allowing us to treat the R10 (FP)
register just like any other eBPF register in the JIT.

The MIPS gp (global pointer) and at (assembler temporary) registers are
used as internal temporary registers for constant blinding. CPU registers
t6-t9 are used internally by the JIT when constructing more complex 64-bit
operations. This is precisely what is needed - two registers to store an
operand value, and two more as scratch registers when performing the
operation.

The register mapping is shown below.

    R0 - $v1, $v0   return value
    R1 - $a1, $a0   argument 1, passed in registers
    R2 - $a3, $a2   argument 2, passed in registers
    R3 - $t1, $t0   argument 3, passed on stack
    R4 - $t3, $t2   argument 4, passed on stack
    R5 - $t4, $t3   argument 5, passed on stack
    R6 - $s1, $s0   callee-saved
    R7 - $s3, $s2   callee-saved
    R8 - $s5, $s4   callee-saved
    R9 - $s7, $s6   callee-saved
    FP - $r0, $fp   32-bit frame pointer
    AX - $gp, $at   constant-blinding
         $t6 - $t9  unallocated, JIT temporaries

Jump offsets
============
The JIT tries to map all conditional JMP operations to MIPS conditional
PC-relative branches. The MIPS branch offset field is 18 bits, in bytes,
which is equivalent to the eBPF 16-bit instruction offset. However, since
the JIT may emit more than one CPU instruction per eBPF instruction, the
field width may overflow. If that happens, the JIT converts the long
conditional jump to a short PC-relative branch with the condition
inverted, jumping over a long unconditional absolute jmp (j).

This conversion will change the instruction offset mapping used for jumps,
and may in turn result in more branch offset overflows. The JIT therefore
dry-runs the translation until no more branches are converted and the
offsets do not change anymore. There is an upper bound on this of course,
and if the JIT hits that limit, the last two iterations are run with all
branches being converted.

Tail call count
===============
The current tail call count is stored in the 16-byte area of the caller's
stack frame that is reserved for the callee in the o32 ABI. The value is
initialized in the prologue, and propagated to the tail-callee by skipping
the initialization instructions when emitting the tail call.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-4-johan.almbladh@anyfinetworks.com

mips, uasm: Add workaround for Loongson-2F nop CPU errata

This patch implements a workaround for the Loongson-2F nop in generated,
code, if the existing option CONFIG_CPU_NOP_WORKAROUND is set. Before,
the binutils option -mfix-loongson2f-nop was enabled, but no workaround
was done when emitting MIPS code. Now, the nop pseudo instruction is
emitted as "or ax,ax,zero" instead of the default "sll zero,zero,0". This
is consistent with the workaround implemented by binutils.

Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://sourceware.org/legacy-ml/binutils/2009-11/msg00387.html
Link: https://lore.kernel.org/bpf/20211005165408.2305108-3-johan.almbladh@anyfinetworks.com

mips, uasm: Enable muhu opcode for MIPS R6

Enable the 'muhu' instruction, complementing the existing 'mulu', needed
to implement a MIPS32 BPF JIT.

Also fix a typo in the existing definition of 'dmulu'.

Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211005165408.2305108-2-johan.almbladh@anyfinetworks.com

selftests/bpf: Test new btf__add_btf() API

Add a test that validates that btf__add_btf() API is correctly copying
all the types from the source BTF into destination BTF object and
adjusts type IDs and string offsets properly.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211006051107.17921-4-andrii@kernel.org

selftests/bpf: Refactor btf_write selftest to reuse BTF generation logic

Next patch will need to reuse BTF generation logic, which tests every
supported BTF kind, for testing btf__add_btf() APIs. So restructure
existing selftests and make it as a single subtest that uses bulk
VALIDATE_RAW_BTF() macro for raw BTF dump checking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211006051107.17921-3-andrii@kernel.org

libbpf: Add API that copies all BTF types from one BTF object to another

Add a bulk copying api, btf__add_btf(), that speeds up and simplifies
appending entire contents of one BTF object to another one, taking care
of copying BTF type data, adjusting resulting BTF type IDs according to
their new locations in the destination BTF object, as well as copying
and deduplicating all the referenced strings and updating all the string
offsets in new BTF types as appropriate.

This API is intended to be used from tools that are generating and
otherwise manipulating BTFs generically, such as pahole. In pahole's
case, this API is useful for speeding up parallelized BTF encoding, as
it allows pahole to offload all the intricacies of BTF type copying to
libbpf and handle the parallelization aspects of the process.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org

bpf, x64: Save bytes for DIV by reducing reg copies

Instead of unconditionally performing push/pop on %rax/%rdx in case of
division/modulo, we can save a few bytes in case of destination register
being either BPF r0 (%rax) or r3 (%rdx) since the result is written in
there anyway.

Also, we do not need to copy the source to %r11 unless the source is either
%rax, %rdx or an immediate.

For example, before the patch:

  22:   push   %rax
  23:   push   %rdx
  24:   mov    %rsi,%r11
  27:   xor    %edx,%edx
  29:   div    %r11
  2c:   mov    %rax,%r11
  2f:   pop    %rdx
  30:   pop    %rax
  31:   mov    %r11,%rax

After:

  22:   push   %rdx
  23:   xor    %edx,%edx
  25:   div    %rsi
  28:   pop    %rdx

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211002035626.2041910-1-jmeng@fb.com

bpf: Avoid retpoline for bpf_for_each_map_elem

Similarly to 09772d92cd5a ("bpf: avoid retpoline for
lookup/update/delete calls on maps") and 84430d4232c3 ("bpf, verifier:
avoid retpoline for map push/pop/peek operation") avoid indirect call
while calling bpf_for_each_map_elem.

Before (a program fragment):

  ; if (rules_map) {
   142: (15) if r4 == 0x0 goto pc+8
   143: (bf) r3 = r10
  ; bpf_for_each_map_elem(rules_map, process_each_rule, &ctx, 0);
   144: (07) r3 += -24
   145: (bf) r1 = r4
   146: (18) r2 = subprog[+5]
   148: (b7) r4 = 0
   149: (85) call bpf_for_each_map_elem#143680  <-- indirect call via
                                                    helper

After (same program fragment):

   ; if (rules_map) {
    142: (15) if r4 == 0x0 goto pc+8
    143: (bf) r3 = r10
   ; bpf_for_each_map_elem(rules_map, process_each_rule, &ctx, 0);
    144: (07) r3 += -24
    145: (bf) r1 = r4
    146: (18) r2 = subprog[+5]
    148: (b7) r4 = 0
    149: (85) call bpf_for_each_array_elem#170336  <-- direct call

On a benchmark that calls bpf_for_each_map_elem() once and does many
other things (mostly checking fields in skb) with CONFIG_RETPOLINE=y it
makes program faster.

Before:

  ============================================================================
  Benchmark.cpp                                              time/iter iters/s
  ============================================================================
  IngressMatchByRemoteEndpoint                                80.78ns 12.38M
  IngressMatchByRemoteIP                                      80.66ns 12.40M
  IngressMatchByRemotePort                                    80.87ns 12.37M

After:

  ============================================================================
  Benchmark.cpp                                              time/iter iters/s
  ============================================================================
  IngressMatchByRemoteEndpoint                                73.49ns 13.61M
  IngressMatchByRemoteIP                                      71.48ns 13.99M
  IngressMatchByRemotePort                                    70.39ns 14.21M

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211006001838.75607-1-rdna@fb.com

Merge branch 'Support kernel module function calls from eBPF'

Kumar Kartikeya says:

====================

This set enables kernel module function calls, and also modifies verifier logic
to permit invalid kernel function calls as long as they are pruned as part of
dead code elimination. This is done to provide better runtime portability for
BPF objects, which can conditionally disable parts of code that are pruned later
by the verifier (e.g. const volatile vars, kconfig options). libbpf
modifications are made along with kernel changes to support module function
calls.

It also converts TCP congestion control objects to use the module kfunc support
instead of relying on IS_BUILTIN ifdef.

Changelog:
----------
v6 -> v7
v6: https://lore.kernel.org/bpf/20210930062948.1843919-1-memxor@gmail.com

* Let __bpf_check_kfunc_call take kfunc_btf_id_list instead of generating
   callbacks (Andrii)
* Rename it to bpf_check_mod_kfunc_call to reflect usage
* Remove OOM checks (Alexei)
* Remove resolve_btfids invocation for bpf_testmod (Andrii)
* Move fd_array_cnt initialization near fd_array alloc (Andrii)
* Rename helper to btf_find_by_name_kind and pass start_id (Andrii)
* memset when data is NULL in add_data (Alexei)
* Fix other nits

v5 -> v6
v5: https://lore.kernel.org/bpf/20210927145941.1383001-1-memxor@gmail.com

* Rework gen_loader relocation emits
   * Only emit bpf_btf_find_by_name_kind call when required (Alexei)
   * Refactor code to emit ksym var and func relo into separate helpers, this
     will be easier to add future weak/typeless ksym support to (for my followup)
   * Count references for both ksym var and funcs, and avoid calling helpers
     unless required for both of them. This also means we share fds between
     ksym vars for the module BTFs. Also be careful with this when closing
     BTF fd so that we only close one instance of the fd for each ksym

v4 -> v5
v4: https://lore.kernel.org/bpf/20210920141526.3940002-1-memxor@gmail.com

* Address comments from Alexei
   * Use reserved fd_array area in loader map instead of creating a new map
   * Drop selftest testing the 256 kfunc limit, however selftest testing reuse
     of BTF fd for same kfunc in gen_loader and libbpf is kept
* Address comments from Andrii
   * Make --no-fail the default for resolve_btfids, i.e. only fail if we find
     BTF section and cannot process it
   * Use obj->btf_modules array to store index in the fd_array, so that we don't
     have to do any searching to reuse the index, instead only set it the first
     time a module BTF's fd is used
   * Make find_ksym_btf_id to return struct module_btf * in last parameter
   * Improve logging when index becomes bigger than INT16_MAX
   * Add btf__find_by_name_kind_own internal helper to only start searching for
     kfunc ID in module BTF, since find_ksym_btf_id already checks vmlinux BTF
     before iterating over module BTFs.
   * Fix various other nits
* Fixes for failing selftests on BPF CI
* Rearrange/cleanup selftests
   * Avoid testing kfunc limit (Alexei)
   * Do test gen_loader and libbpf BTF fd index dedup with 256 calls
   * Move invalid kfunc failure test to verifier selftest
   * Minimize duplication
* Use consistent bpf_<type>_check_kfunc_call naming for module kfunc callback
* Since we try to add fd using add_data while we can, cherry pick Alexei's
   patch from CO-RE RFC series to align gen_loader data.

v3 -> v4
v3: https://lore.kernel.org/bpf/20210915050943.679062-1-memxor@gmail.com

* Address comments from Alexei
   * Drop MAX_BPF_STACK change, instead move map_fd and BTF fd to BPF array map
     and pass fd_array using BPF_PSEUDO_MAP_IDX_VALUE
* Address comments from Andrii
   * Fix selftest to store to variable for observing function call instead of
     printk and polluting CI logs
* Drop use of raw_tp for testing, instead reuse classifier based prog_test_run
* Drop index + 1 based insn->off convention for kfunc module calls
* Expand selftests to cover more corner cases
* Misc cleanups

v2 -> v3
v2: https://lore.kernel.org/bpf/20210914123750.460750-1-memxor@gmail.com

* Fix issues pointed out by Kernel Test Robot
* Fix find_kfunc_desc to also take offset into consideration when comparing

RFC v1 -> v2
v1: https://lore.kernel.org/bpf/20210830173424.1385796-1-memxor@gmail.com

* Address comments from Alexei
   * Reuse fd_array instead of introducing kfunc_btf_fds array
   * Take btf and module reference as needed, instead of preloading
   * Add BTF_KIND_FUNC relocation support to gen_loader infrastructure
* Address comments from Andrii
   * Drop hashmap in libbpf for finding index of existing BTF in fd_array
   * Preserve invalid kfunc calls only when the symbol is weak
* Adjust verifier selftests
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: selftests: Add selftests for module kfunc support

This adds selftests that tests the success and failure path for modules
kfuncs (in presence of invalid kfunc calls) for both libbpf and
gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can
add module BTF ID set from bpf_testmod.

This also introduces a couple of test cases to verifier selftests for
validating whether we get an error or not depending on if invalid kfunc
call remains after elimination of unreachable instructions.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com

libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations

This change updates the BPF syscall loader to relocate BTF_KIND_FUNC
relocations, with support for weak kfunc relocations. The general idea
is to move map_fds to loader map, and also use the data for storing
kfunc BTF fds. Since both reuse the fd_array parameter, they need to be
kept together.

For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc,
we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more
chances of being <= INT16_MAX than treating data map as a sparse array
and adding fd as needed.

When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse
array model, so that as long as it does remain <= INT16_MAX, we pass an
index relative to the start of fd_array.

We store all ksyms in an array where we try to avoid calling the
bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was
already stored. This also speeds up the loading process compared to
emitting calls in all cases, in later tests.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com

libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0

Preserve these calls as it allows verifier to succeed in loading the
program if they are determined to be unreachable after dead code
elimination during program load. If not, the verifier will fail at
runtime. This is done for ext->is_weak symbols similar to the case for
variable ksyms.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com

libbpf: Support kernel module function calls

This patch adds libbpf support for kernel module function call support.
The fd_array parameter is used during BPF program load to pass module
BTFs referenced by the program. insn->off is set to index into this
array, but starts from 1, because insn->off as 0 is reserved for
btf_vmlinux.

We try to use existing insn->off for a module, since the kernel limits
the maximum distinct module BTFs for kfuncs to 256, and also because
index must never exceed the maximum allowed value that can fit in
insn->off (INT16_MAX). In the future, if kernel interprets signed offset
as unsigned for kfunc calls, this limit can be increased to UINT16_MAX.

Also introduce a btf__find_by_name_kind_own helper to start searching
from module BTF's start id when we know that the BTF ID is not present
in vmlinux BTF (in find_ksym_btf_id).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com

bpf: Enable TCP congestion control kfunc from modules

This commit moves BTF ID lookup into the newly added registration
helper, in a way that the bbr, cubic, and dctcp implementation set up
their sets in the bpf_tcp_ca kfunc_btf_set list, while the ones not
dependent on modules are looked up from the wrapper function.

This lifts the restriction for them to be compiled as built in objects,
and can be loaded as modules if required. Also modify Makefile.modfinal
to call resolve_btfids for each module.

Note that since kernel kfunc_ids never overlap with module kfunc_ids, we
only match the owner for module btf id sets.

See following commits for background on use of:

CONFIG_X86 ifdef:
569c484f9995 (bpf: Limit static tcp-cc functions in the .BTF_ids list to x86)

CONFIG_DYNAMIC_FTRACE ifdef:
7aae231ac93b (bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE)

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-6-memxor@gmail.com