f17b474e | 10-Feb-2026 | Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Support associating BPF program with struct_ops (Amery Hung)
- Switch BPF local storage to rqspinlock and remove recursion detection counters which were causing false positives (Amery Hung)
- Fix live registers marking for indirect jumps (Anton Protopopov)
- Introduce execution context detection BPF helpers (Changwoo Min)
- Improve verifier precision for 32bit sign extension pattern (Cupertino Miranda)
- Optimize BTF type lookup by sorting vmlinux BTF and doing binary search (Donglin Peng)
- Allow states pruning for misc/invalid slots in iterator loops (Eduard Zingerman)
- In preparation for ASAN support in BPF arenas, teach libbpf to move global BPF variables to the end of the region and enable arena kfuncs while holding locks (Emil Tsalapatis)
- Introduce support for implicit arguments in kfuncs and migrate a number of them to new API. This is a prerequisite for cgroup sub-schedulers in sched-ext (Ihor Solodrai)
- Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen)
- Fix ORC stack unwind from kprobe_multi (Jiri Olsa)
- Speed up fentry attach by using single ftrace direct ops in BPF trampolines (Jiri Olsa)
- Require frozen map for calculating map hash (KP Singh)
- Fix lock entry creation in TAS fallback in rqspinlock (Kumar Kartikeya Dwivedi)
- Allow user space to select cpu in lookup/update operations on per-cpu array and hash maps (Leon Hwang)
- Make kfuncs return trusted pointers by default (Matt Bobrowski)
- Introduce "fsession" support, where a single BPF program is executed upon entry to and exit from a traced kernel function (Menglong Dong)
- Allow bpf_timer and bpf_wq use in all program types (Mykyta Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei Starovoitov)
- Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their definition across the tree (Puranjay Mohan)
- Allow BPF arena calls from non-sleepable context (Puranjay Mohan)
- Improve register id comparison logic in the verifier and extend linked registers with negative offsets (Puranjay Mohan)
- In preparation for BPF-OOM introduce kfuncs to access memcg events (Roman Gushchin)
- Use CFI compatible destructor kfunc type (Sami Tolvanen)
- Add bitwise tracking for BPF_END in the verifier (Tianci Cao)
- Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou Tang)
- Make BPF selftests work with 64k page size (Yonghong Song)
* tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits)
  selftests/bpf: Fix outdated test on storage->smap
  selftests/bpf: Choose another percpu variable in bpf for btf_dump test
  selftests/bpf: Remove test_task_storage_map_stress_lookup
  selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
  selftests/bpf: Update task_local_storage/recursion test
  selftests/bpf: Update sk_storage_omem_uncharge test
  bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy}
  bpf: Support lockless unlink when freeing map or local storage
  bpf: Prepare for bpf_selem_unlink_nofail()
  bpf: Remove unused percpu counter from bpf_local_storage_map_free
  bpf: Remove cgroup local storage percpu counter
  bpf: Remove task local storage percpu counter
  bpf: Change local_storage->lock and b->lock to rqspinlock
  bpf: Convert bpf_selem_unlink to failable
  bpf: Convert bpf_selem_link_map to failable
  bpf: Convert bpf_selem_unlink_map to failable
  bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
  selftests/xsk: fix number of Tx frags in invalid packet
  selftests/xsk: properly handle batch ending in the middle of a packet
  bpf: Prevent reentrance into call_rcu_tasks_trace()
  ...

Revision tags: v6.19, v6.19-rc8, v6.19-rc7

d0f5d4f8 | 21-Jan-2026 | Matt Bobrowski <mattbobrowski@google.com>

bpf: Revert "bpf: drop KF_ACQUIRE flag on BPF kfunc bpf_get_root_mem_cgroup()"
This reverts commit e463b6de9da1 ("bpf: drop KF_ACQUIRE flag on BPF kfunc bpf_get_root_mem_cgroup()").
The original commit removed the KF_ACQUIRE flag from bpf_get_root_mem_cgroup() under the assumption that it resulted in simplified usage. This stemmed from the fact that bpf_get_root_mem_cgroup() inherently returns a reference to an object which technically isn't reference counted, so there is no strong requirement to call a matching bpf_put_mem_cgroup() on the returned reference.
Although technically correct, as per the arguments in the thread [0], dropping the KF_ACQUIRE flag and losing reference tracking semantics negatively impacted the usability of bpf_get_root_mem_cgroup() in practice.
[0] https://lore.kernel.org/bpf/878qdx6yut.fsf@linux.dev/
Link: https://lore.kernel.org/bpf/CAADnVQ+6d1Lj4dteAv8u62d7kj3Ze5io6bqM0xeQd-UPk9ZgJQ@mail.gmail.com/
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20260121090001.240166-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
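With KF_ACQUIRE restored, the verifier again tracks the returned pointer as a reference that must be released. A minimal sketch of the resulting usage pattern in a libbpf-based program (the SEC attach point and program body are illustrative; the __ksym externs follow the usual libbpf convention and are not copied from the kernel tree):

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* kfunc declarations via __ksym externs (libbpf convention) */
extern struct mem_cgroup *bpf_get_root_mem_cgroup(void) __ksym;
extern void bpf_put_mem_cgroup(struct mem_cgroup *memcg) __ksym;

SEC("syscall")
int probe(void *ctx)
{
	struct mem_cgroup *memcg = bpf_get_root_mem_cgroup();

	if (!memcg)                 /* KF_RET_NULL: the NULL check is mandatory */
		return 0;

	/* ... use memcg ... */

	bpf_put_mem_cgroup(memcg);  /* KF_ACQUIRE: the release is mandatory */
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

With KF_ACQUIRE dropped, the verifier would not require the bpf_put_mem_cgroup() call; the revert restores the pairing requirement in exchange for the reference-tracking semantics argued for in the thread above.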

Revision tags: v6.19-rc6

e463b6de | 13-Jan-2026 | Matt Bobrowski <mattbobrowski@google.com>

bpf: drop KF_ACQUIRE flag on BPF kfunc bpf_get_root_mem_cgroup()
With the BPF verifier now treating pointers to struct types returned from BPF kfuncs as implicitly trusted by default, there is no need for bpf_get_root_mem_cgroup() to be annotated with the KF_ACQUIRE flag.
bpf_get_root_mem_cgroup() does not acquire any references, but rather simply returns a NULL pointer or a pointer to a struct mem_cgroup object that is valid for the entire lifetime of the kernel.
This simplifies BPF programs using this kfunc by removing the requirement to pair the call with bpf_put_mem_cgroup().
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260113083949.2502978-2-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Revision tags: v6.19-rc5, v6.19-rc4

e40030a4 | 02-Jan-2026 | Alexei Starovoitov <ast@kernel.org>

Merge branch 'bpf-make-kf_trusted_args-default'
Puranjay Mohan says:
==================== bpf: Make KF_TRUSTED_ARGS default
v2: https://lore.kernel.org/all/20251231171118.1174007-1-puranjay@kernel.org/

Changes in v2->v3:
- Fix documentation: add a new section for kfunc parameters (Eduard)
- Remove all occurrences of KF_TRUSTED from comments, etc. (Eduard)
- Fix the netfilter kfuncs to drop dead NULL checks.
- Fix selftests for netfilter kfuncs to check for verification failures and remove the runtime failures that are no longer possible after this change.

v1: https://lore.kernel.org/all/20251224192448.3176531-1-puranjay@kernel.org/

Changes in v1->v2:
- Update kfunc_dynptr_param selftest to use a real pointer that is not ptr_to_stack and not CONST_PTR_TO_DYNPTR, rather than casting 1 (Alexei)
- Thoroughly review all kfuncs to find regressions or missing annotations. (Eduard)
- Fix kfuncs found from the above step.
This series makes trusted arguments the default requirement for all BPF kfuncs, inverting the current opt-in model. Instead of requiring explicit KF_TRUSTED_ARGS flags, kfuncs now require trusted arguments by default and must explicitly opt-out using __nullable/__opt annotations or the KF_RCU flag.
This improves security and type safety by preventing BPF programs from passing untrusted or NULL pointers to kernel functions at verification time, while maintaining flexibility for the small number of kfuncs that legitimately need to accept NULL or RCU pointers.
MOTIVATION
The current opt-in model is error-prone and inconsistent. Most kfuncs already require trusted pointers from sources like KF_ACQUIRE, struct_ops callbacks, or tracepoints. Making trusted arguments the default:
- Prevents NULL pointer dereferences at verification time
- Reduces defensive NULL checks in kernel code
- Provides better error messages for invalid BPF programs
- Aligns with existing patterns (context pointers, struct_ops already trusted)
IMPACT ANALYSIS
Comprehensive analysis of all 304+ kfuncs across 37 kernel files found:
- Most kfuncs (299/304) are already safe and require no changes
- Only 4 kfuncs required fixes (all included in this series)
- 0 regressions found in independent verification
All bpf selftests are passing. The hid_bpf tests are also passing:

# PASSED: 20 / 20 tests passed.
# Totals: pass:20 fail:0 xfail:0 xpass:0 skip:0 error:0
bpf programs in drivers/hid/bpf/progs/ show no regression as shown by veristat:
Done. Processed 24 files, 62 programs. Skipped 0 files, 0 programs.
TECHNICAL DETAILS
The verifier now validates kfunc arguments in this order:
1. NULL check (runs first): rejects NULL unless the parameter has __nullable/__opt
2. Trusted check: rejects untrusted pointers unless the kfunc has KF_RCU
Special cases that bypass trusted checking:
- Context pointers (xdp_md, __sk_buff): handled via KF_ARG_PTR_TO_CTX
- Struct_ops callbacks: pre-marked as PTR_TRUSTED during initialization
- KF_RCU kfuncs: have a separate validation path for RCU pointers
BACKWARD COMPATIBILITY
This affects BPF program verification, not runtime:
- Valid programs passing trusted pointers: continue to work
- Programs with bugs: may now fail verification (preventing runtime crashes)
This series introduces two intentional breaking changes to the BPF verifier's kfunc handling:
1. NULL pointer rejection timing: Kfuncs that previously accepted NULL pointers without KF_TRUSTED_ARGS will now reject NULL at verification time instead of returning runtime errors. This affects netfilter connection tracking functions (bpf_xdp_ct_lookup, bpf_skb_ct_lookup, bpf_xdp_ct_alloc, bpf_skb_ct_alloc), which now enforce their documented "Cannot be NULL" requirements at load time rather than returning -EINVAL at runtime.
2. Fentry/fexit program restrictions: BPF programs using fentry/fexit attachment points can no longer pass their callback arguments directly to kfuncs, as these arguments are not marked as trusted by default. Programs requiring trusted argument semantics should migrate to tp_btf (tracepoint with BTF) attachment points where arguments are guaranteed trusted by the verifier.
Both changes strengthen the verifier's safety guarantees by catching errors earlier in the development cycle and are accompanied by comprehensive test updates demonstrating the new expected behaviors.
====================
Link: https://patch.msgid.link/20260102180038.2708325-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
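On the kernel side, the new default means a kfunc needs no flag at all to require trusted arguments, and opts out per parameter instead. A hedged sketch of what a definition looks like under the new model (the bpf_demo_* names are hypothetical; the __nullable suffix and the BTF_KFUNCS_START/BTF_ID_FLAGS macros are existing kernel conventions):

```c
/* Default: 'task' must be a trusted, non-NULL pointer; the verifier
 * rejects anything else at load time. No KF_TRUSTED_ARGS flag needed. */
__bpf_kfunc void bpf_demo_use(struct task_struct *task)
{
	/* ... */
}

/* Per-parameter opt-out: the __nullable suffix lets a program pass NULL,
 * so the kfunc body must check for it. */
__bpf_kfunc void bpf_demo_use_opt(struct task_struct *task__nullable)
{
	if (!task__nullable)
		return;
	/* ... */
}

BTF_KFUNCS_START(demo_kfunc_ids)
BTF_ID_FLAGS(func, bpf_demo_use)      /* trusted args by default */
BTF_ID_FLAGS(func, bpf_demo_use_opt)
BTF_KFUNCS_END(demo_kfunc_ids)
```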

7646c7af | 02-Jan-2026 | Puranjay Mohan <puranjay@kernel.org>

bpf: Remove redundant KF_TRUSTED_ARGS flag from all kfuncs
Now that KF_TRUSTED_ARGS is the default for all kfuncs, remove the explicit KF_TRUSTED_ARGS flag from all kfunc definitions and remove the flag itself.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260102180038.2708325-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Revision tags: v6.19-rc3

042d4c06 | 23-Dec-2025 | Alexei Starovoitov <ast@kernel.org>

Merge branch 'mm-bpf-kfuncs-to-access-memcg-data'
Roman Gushchin says:
==================== mm: bpf kfuncs to access memcg data
Introduce kfuncs to simplify the access to the memcg data. These kfuncs can be used to accelerate monitoring use cases and for implementing custom OOM policies once BPF OOM is landed.
This patchset was separated out from the BPF OOM patchset to simplify the logistics and accelerate the landing of the part which is useful by itself. No functional changes since BPF OOM v2.
v4:
- refactored memcg vm event and stat item idx checks (by Alexei)

v3:
- dropped redundant kfuncs flags (by Alexei)
- fixed kdocs warnings (by Alexei)
- merged memcg stats access patches into one (by Alexei)
- restored root memcg usage reporting, added a comment
- added checks for enum boundaries
- added Shakeel and JP as co-maintainers (by Shakeel)

v2:
- added mem_cgroup_disabled() checks (by Shakeel B.)
- added special handling of the root memcg in bpf_mem_cgroup_usage() (by Shakeel B.)
- minor fixes in the kselftest (by Shakeel B.)
- added a MAINTAINERS entry (by Shakeel B.)

v1: https://lore.kernel.org/bpf/87ike29s5r.fsf@linux.dev/T/#t
====================
Link: https://patch.msgid.link/20251223044156.208250-1-roman.gushchin@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>

99430ab8 | 23-Dec-2025 | Roman Gushchin <roman.gushchin@linux.dev>

mm: introduce BPF kfuncs to access memcg statistics and events
Introduce BPF kfuncs to conveniently access memcg data:
- bpf_mem_cgroup_vm_events(),
- bpf_mem_cgroup_memory_events(),
- bpf_mem_cgroup_usage(),
- bpf_mem_cgroup_page_state(),
- bpf_mem_cgroup_flush_stats().
These functions are useful for implementing BPF OOM policies, but also can be used to accelerate access to the memcg data. Reading it through cgroupfs is much more expensive, roughly 5x, mostly because of the need to convert the data into the text and back.
JP Kobryn: An experiment was setup to compare the performance of a program that uses the traditional method of reading memory.stat vs a program using the new kfuncs. The control program opens up the root memory.stat file and for 1M iterations reads, converts the string values to numeric data, then seeks back to the beginning. The experimental program sets up the requisite libbpf objects and for 1M iterations invokes a bpf program which uses the kfuncs to fetch all available stats for node_stat_item, memcg_stat_item, and vm_event_item types.
The results showed a significant perf benefit on the experimental side, outperforming the control side by a margin of 93%. In kernel mode, elapsed time was reduced by 80%, while in user mode, over 99% of time was saved.
control: elapsed time
  real  0m38.318s
  user  0m25.131s
  sys   0m13.070s

experiment: elapsed time
  real  0m2.789s
  user  0m0.187s
  sys   0m2.512s
control: perf data
  33.43%  a.out  libc.so.6          [.] __vfscanf_internal
   6.88%  a.out  [kernel.kallsyms]  [k] vsnprintf
   6.33%  a.out  libc.so.6          [.] _IO_fgets
   5.51%  a.out  [kernel.kallsyms]  [k] format_decode
   4.31%  a.out  libc.so.6          [.] __GI_____strtoull_l_internal
   3.78%  a.out  [kernel.kallsyms]  [k] string
   3.53%  a.out  [kernel.kallsyms]  [k] number
   2.71%  a.out  libc.so.6          [.] _IO_sputbackc
   2.41%  a.out  [kernel.kallsyms]  [k] strlen
   1.98%  a.out  a.out              [.] main
   1.70%  a.out  libc.so.6          [.] _IO_getline_info
   1.51%  a.out  libc.so.6          [.] __isoc99_sscanf
   1.47%  a.out  [kernel.kallsyms]  [k] memory_stat_format
   1.47%  a.out  [kernel.kallsyms]  [k] memcpy_orig
   1.41%  a.out  [kernel.kallsyms]  [k] seq_buf_printf

experiment: perf data
  10.55%  memcgstat  bpf_prog_..._query  [k] bpf_prog_16aab2f19fa982a7_query
   6.90%  memcgstat  [kernel.kallsyms]   [k] memcg_page_state_output
   3.55%  memcgstat  [kernel.kallsyms]   [k] _raw_spin_lock
   3.12%  memcgstat  [kernel.kallsyms]   [k] memcg_events
   2.87%  memcgstat  [kernel.kallsyms]   [k] __memcg_slab_post_alloc_hook
   2.73%  memcgstat  [kernel.kallsyms]   [k] kmem_cache_free
   2.70%  memcgstat  [kernel.kallsyms]   [k] entry_SYSRETQ_unsafe_stack
   2.25%  memcgstat  [kernel.kallsyms]   [k] __memcg_slab_free_hook
   2.06%  memcgstat  [kernel.kallsyms]   [k] get_page_from_freelist
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Co-developed-by: JP Kobryn <inwardvessel@gmail.com>
Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
Link: https://lore.kernel.org/r/20251223044156.208250-5-roman.gushchin@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
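As a rough illustration of the BPF side of such a stats reader, the sketch below pairs the kfuncs listed above. The extern signatures are assumptions inferred from the names in this log, not the exact kernel prototypes:

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Assumed signatures; check the kernel headers for the real prototypes. */
extern struct mem_cgroup *bpf_get_root_mem_cgroup(void) __ksym;
extern void bpf_put_mem_cgroup(struct mem_cgroup *memcg) __ksym;
extern void bpf_mem_cgroup_flush_stats(struct mem_cgroup *memcg) __ksym;
extern u64 bpf_mem_cgroup_usage(struct mem_cgroup *memcg) __ksym;

SEC("syscall")
int read_root_usage(void *ctx)
{
	struct mem_cgroup *memcg = bpf_get_root_mem_cgroup();

	if (!memcg)
		return 0;

	/* Flush so per-cpu deltas are folded into the counters, then read
	 * the usage directly as a number: no text formatting round-trip,
	 * which is where the ~5x cgroupfs overhead above comes from. */
	bpf_mem_cgroup_flush_stats(memcg);
	bpf_printk("root memcg usage: %llu", bpf_mem_cgroup_usage(memcg));

	bpf_put_mem_cgroup(memcg);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```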

5c7db323 | 23-Dec-2025 | Roman Gushchin <roman.gushchin@linux.dev>

mm: introduce bpf_get_root_mem_cgroup() BPF kfunc
Introduce a BPF kfunc to get a trusted pointer to the root memory cgroup. It's very handy to traverse the full memcg tree, e.g. for handling a system-wide OOM.
It's possible to obtain this pointer by traversing the memcg tree up from any known memcg, but it's sub-optimal and makes BPF programs more complex and less efficient.
bpf_get_root_mem_cgroup() has KF_ACQUIRE | KF_RET_NULL semantics; in reality, however, it's not necessary to bump the corresponding reference counter: the root memory cgroup is immortal and reference counting is skipped, see css_get(). Once set, root_mem_cgroup is always a valid memcg pointer. It's safe to call bpf_put_mem_cgroup() on the pointer obtained with bpf_get_root_mem_cgroup(); it's effectively a no-op.
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://lore.kernel.org/r/20251223044156.208250-4-roman.gushchin@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

5904db98 | 23-Dec-2025 | Roman Gushchin <roman.gushchin@linux.dev>

mm: introduce BPF kfuncs to deal with memcg pointers
To effectively operate with memory cgroups in BPF there is a need to convert css pointers to memcg pointers. A simple container_of cast, as used in kernel code, can't be used in BPF because from the verifier's point of view it is an out-of-bounds memory access.
Introduce helper get/put kfuncs which can be used to get a refcounted memcg pointer from a css pointer:
- bpf_get_mem_cgroup,
- bpf_put_mem_cgroup.
bpf_get_mem_cgroup() can take both a memcg's css and the corresponding cgroup's "self" css. This allows it to be used with the existing cgroup iterator, which iterates over the cgroup tree, not the memcg tree.
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Link: https://lore.kernel.org/r/20251223044156.208250-3-roman.gushchin@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
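A hedged sketch of how this pairs with the existing cgroup iterator (bpf_iter__cgroup and the cgroup's "self" css are existing kernel concepts; the program body and extern signature are illustrative):

```c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Assumed signature: takes a css pointer, returns a refcounted memcg. */
extern struct mem_cgroup *bpf_get_mem_cgroup(struct cgroup_subsys_state *css) __ksym;
extern void bpf_put_mem_cgroup(struct mem_cgroup *memcg) __ksym;

SEC("iter/cgroup")
int dump_memcg(struct bpf_iter__cgroup *ctx)
{
	struct cgroup *cgrp = ctx->cgroup;
	struct mem_cgroup *memcg;

	if (!cgrp)
		return 0;

	/* The iterator walks the cgroup tree; convert each cgroup's
	 * "self" css into a refcounted memcg pointer. */
	memcg = bpf_get_mem_cgroup(&cgrp->self);
	if (!memcg)
		return 0;

	/* ... read memcg data ... */

	bpf_put_mem_cgroup(memcg);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```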