History log of /linux/tools/testing/selftests/bpf/progs/lpm_trie_map.c (Results 1 – 2 of 2)
Revision	Date	Author	Comments
# ae28ed45 01-Oct-2025 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

- Support pulling non-linear xdp data with bpf_xdp_pull_data() kfunc
(Amery Hung)

Applied as a stable branch in bpf-next and net-next trees.

- Support reading skb metadata via bpf_dynptr (Jakub Sitnicki)

Also a stable branch in bpf-next and net-next trees.

- Enforce expected_attach_type for tailcall compatibility (Daniel
Borkmann)

- Replace path-sensitive with path-insensitive live stack analysis in
the verifier (Eduard Zingerman)

This is a significant change in the verification logic. More details,
motivation, and long-term plans are in the cover letter/merge commit.

- Support signed BPF programs (KP Singh)

This is another major feature that took years to materialize.

Algorithm details are in the cover letter/merge commit.

- Add support for may_goto instruction to s390 JIT (Ilya Leoshkevich)

- Add support for may_goto instruction to arm64 JIT (Puranjay Mohan)

- Fix USDT SIB argument handling in libbpf (Jiawei Zhao)

- Allow uprobe-bpf program to change context registers (Jiri Olsa)

- Support signed loads from BPF arena (Kumar Kartikeya Dwivedi and
Puranjay Mohan)

- Allow access to union arguments in tracing programs (Leon Hwang)

- Optimize the rcu_read_lock() + migrate_disable() combination where it's
used in the BPF subsystem (Menglong Dong)

- Introduce bpf_task_work_schedule*() kfuncs to schedule deferred
execution of BPF callback in the context of a specific task using the
kernel’s task_work infrastructure (Mykyta Yatsenko)

- Enforce RCU protection for KF_RCU_PROTECTED kfuncs (Kumar Kartikeya
Dwivedi)

- Add stress test for rqspinlock in NMI (Kumar Kartikeya Dwivedi)

- Improve the precision of tnum multiplier verifier operation
(Nandakumar Edamana)

- Use tnums to improve is_branch_taken() logic (Paul Chaignon)

- Add support for atomic operations in arena in riscv JIT (Pu Lehui)

- Report arena faults to BPF error stream (Puranjay Mohan)

- Search for tracefs at /sys/kernel/tracing first in bpftool (Quentin
Monnet)

- Add bpf_strcasecmp() kfunc (Rong Tao)

- Support lookup_and_delete_elem command in BPF_MAP_STACK_TRACE (Tao
Chen)

* tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (197 commits)
libbpf: Replace AF_ALG with open coded SHA-256
selftests/bpf: Add stress test for rqspinlock in NMI
selftests/bpf: Add test case for different expected_attach_type
bpf: Enforce expected_attach_type for tailcall compatibility
bpftool: Remove duplicate string.h header
bpf: Remove duplicate crypto/sha2.h header
libbpf: Fix error when st-prefix_ops and ops from differ btf
selftests/bpf: Test changing packet data from kfunc
selftests/bpf: Add stacktrace map lookup_and_delete_elem test case
selftests/bpf: Refactor stacktrace_map case with skeleton
bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACE
selftests/bpf: Fix flaky bpf_cookie selftest
selftests/bpf: Test changing packet data from global functions with a kfunc
bpf: Emit struct bpf_xdp_sock type in vmlinux BTF
selftests/bpf: Task_work selftest cleanup fixes
MAINTAINERS: Delete inactive maintainers from AF_XDP
bpf: Mark kfuncs as __noclone
selftests/bpf: Add kprobe multi write ctx attach test
selftests/bpf: Add kprobe write ctx attach test
selftests/bpf: Add uprobe context ip register change test
...

Revision tags: v6.17, v6.17-rc7, v6.17-rc6, v6.17-rc5, v6.17-rc4
# 737433c6 27-Aug-2025 Matt Fleming <mfleming@cloudflare.com>

selftests/bpf: Add LPM trie microbenchmarks

Add benchmarks for the standard set of operations: LOOKUP, INSERT,
UPDATE, DELETE. Also include benchmarks to measure the overhead of the
bench framework itself (NOOP) as well as the overhead of generating keys
(BASELINE). Lastly, this includes a benchmark for FREE (trie_free())
which is known to have terrible performance for maps with many entries.

Benchmarks operate on tries without gaps in the key range, i.e. each
test begins or ends with a trie with valid keys in the range [0,
nr_entries). This is intended to cause maximum branching when traversing
the trie.

LOOKUP, UPDATE, DELETE, and FREE fill a BPF LPM trie from userspace
using bpf_map_update_batch() and run the corresponding benchmark
operation via bpf_loop(). INSERT starts with an empty map and fills it
kernel-side from bpf_loop(). FREE records the time to free a filled LPM
trie by attaching and destroying a BPF prog. NOOP measures the overhead
of the test harness by running an empty function with bpf_loop().
BASELINE is similar to NOOP except that the function generates a key.
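
As a rough sketch of that userspace fill step (hypothetical helper name
and key struct; the layout mirrors the kernel's struct bpf_lpm_trie_key
with a 4-byte data field to match --prefix_len=32):

  #include <errno.h>
  #include <stdlib.h>
  #include <bpf/bpf.h>

  struct lpm_key {
          __u32 prefixlen;
          __u32 data;                      /* 4-byte key */
  };

  /* Hypothetical helper: bulk-load keys [0, nr_entries) in one batch. */
  static int fill_lpm_trie(int map_fd, __u32 nr_entries)
  {
          struct lpm_key *keys = calloc(nr_entries, sizeof(*keys));
          __u32 *vals = calloc(nr_entries, sizeof(*vals));
          __u32 count = nr_entries;
          int err;

          if (!keys || !vals) {
                  err = -ENOMEM;
                  goto out;
          }

          for (__u32 i = 0; i < nr_entries; i++) {
                  keys[i].prefixlen = 32;  /* full-length prefixes */
                  keys[i].data = i;        /* sequential, gap-free keys */
                  vals[i] = i;
          }

          /* A single syscall inserts the whole batch. */
          err = bpf_map_update_batch(map_fd, keys, vals, &count, NULL);
  out:
          free(keys);
          free(vals);
          return err;
  }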

Each operation runs 10,000 times using bpf_loop(). This iteration count
is intentionally independent of the number of entries in the LPM trie,
so the stability of the results isn't affected by trie size.
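
A minimal BPF-side sketch of this bpf_loop() pattern (program and map
names are illustrative, the key layout matches the sketch above, and the
stats the real selftest records are omitted):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct lpm_key {
          __u32 prefixlen;
          __u32 data;
  };

  struct {
          __uint(type, BPF_MAP_TYPE_LPM_TRIE);
          __uint(map_flags, BPF_F_NO_PREALLOC); /* required for LPM tries */
          __uint(max_entries, 10000);
          __type(key, struct lpm_key);
          __type(value, __u32);
  } trie SEC(".maps");

  static long lookup_cb(__u32 i, void *ctx)
  {
          struct lpm_key key = { .prefixlen = 32, .data = i };

          bpf_map_lookup_elem(&trie, &key);
          return 0;                        /* 0 continues the loop */
  }

  SEC("xdp")
  int run_lookups(struct xdp_md *xdp)
  {
          /* 10,000 iterations regardless of nr_entries. */
          bpf_loop(10000, lookup_cb, NULL, 0);
          return XDP_PASS;
  }

  char LICENSE[] SEC("license") = "GPL";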

For those benchmarks that need to reset the LPM trie once it's full
(INSERT) or empty (DELETE), throughput and latency results are scaled by
the fraction of a second the operation actually ran to ignore any time
spent reinitialising the trie.
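For instance, with made-up numbers: if INSERT spends 0.2 s of a 1-second
interval reinitialising the trie and completes 2.0M inserts in the
remaining 0.8 s, the reported throughput is 2.0M / 0.8 = 2.5M ops/s
rather than 2.0M ops/s.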

By default, benchmarks run using sequential keys in the range [0,
nr_entries). BASELINE, LOOKUP, and UPDATE can use random keys via the
--random parameter, but beware that there is a runtime cost involved in
generating random keys. The other benchmarks are prohibited from using
random keys because doing so can skew the results, e.g. when inserting an
existing key or deleting a missing one.
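
To make that cost concrete, a random-keyed callback might look like this
(illustrative sketch; it reuses the trie map and struct lpm_key from the
sketch above and hardcodes nr_entries = 10000 for brevity):

  static long lookup_random_cb(__u32 i, void *ctx)
  {
          struct lpm_key key = {
                  .prefixlen = 32,
                  /* one PRNG call per op is the extra cost vs sequential */
                  .data = bpf_get_prandom_u32() % 10000,
          };

          bpf_map_lookup_elem(&trie, &key);
          return 0;
  }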

All measurements are recorded from within the kernel to eliminate
syscall overhead. Most benchmarks run an XDP program to generate stats
but FREE needs to collect latencies using fentry/fexit on
map_free_deferred() because it's not possible to use fentry directly on
lpm_trie.c since commit c83508da5620 ("bpf: Avoid deadlock caused by
nested kprobe and fentry bpf programs") and there's no way to
create/destroy a map from within an XDP program.
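
A minimal sketch of that FREE measurement (hypothetical program and
variable names; it assumes only one map free is in flight at a time):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  __u64 free_start_ns;
  __u64 free_latency_ns;

  SEC("fentry/map_free_deferred")
  int BPF_PROG(free_enter)
  {
          free_start_ns = bpf_ktime_get_ns();
          return 0;
  }

  SEC("fexit/map_free_deferred")
  int BPF_PROG(free_exit)
  {
          free_latency_ns = bpf_ktime_get_ns() - free_start_ns;
          return 0;
  }

  char LICENSE[] SEC("license") = "GPL";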

Here is example output from an AMD EPYC 9684X 96-Core machine for each
of the benchmarks, using a trie with 10K entries and a 32-bit prefix
length:

$ ./bench lpm-trie-$op \
	--prefix_len=32 \
	--producers=1 \
	--nr_entries=10000

noop: throughput 74.417 ± 0.032 M ops/s ( 74.417M ops/prod), latency 13.438 ns/op
baseline: throughput 70.107 ± 0.171 M ops/s ( 70.107M ops/prod), latency 14.264 ns/op
lookup: throughput 8.467 ± 0.047 M ops/s ( 8.467M ops/prod), latency 118.109 ns/op
insert: throughput 2.440 ± 0.015 M ops/s ( 2.440M ops/prod), latency 409.290 ns/op
update: throughput 2.806 ± 0.042 M ops/s ( 2.806M ops/prod), latency 356.322 ns/op
delete: throughput 4.625 ± 0.011 M ops/s ( 4.625M ops/prod), latency 215.613 ns/op
free: throughput 0.578 ± 0.006 K ops/s ( 0.578K ops/prod), latency 1.730 ms/op

And the same benchmarks using random keys:

$ ./bench lpm-trie-$op \
	--prefix_len=32 \
	--producers=1 \
	--nr_entries=10000 \
	--random

noop: throughput 74.259 ± 0.335 M ops/s ( 74.259M ops/prod), latency 13.466 ns/op
baseline: throughput 35.150 ± 0.144 M ops/s ( 35.150M ops/prod), latency 28.450 ns/op
lookup: throughput 7.119 ± 0.048 M ops/s ( 7.119M ops/prod), latency 140.469 ns/op
insert: N/A
update: throughput 2.736 ± 0.012 M ops/s ( 2.736M ops/prod), latency 365.523 ns/op
delete: N/A
free: N/A

Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20250827140149.1001557-1-matt@readmodwrite.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
