History log of /linux/kernel/bpf/helpers.c (Results 251 – 275 of 1028)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.3-rc5
# a033907e 01-Apr-2023 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Enable RCU semantics for task kptrs'

David Vernet says:

====================

In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the
'refcount_t rcu_users' field was ex

Merge branch 'Enable RCU semantics for task kptrs'

David Vernet says:

====================

In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the
'refcount_t rcu_users' field was extracted out of a union with the
'struct rcu_head rcu' field. This allows us to use the field for
refcounting struct task_struct with RCU protection, as the RCU callback
no longer flips rcu_users to be nonzero after the callback is scheduled.

This patch set leverages this to do a few things:

1. Marks struct task_struct as RCU safe in the verifier, allowing
referenced kptr tasks stored in maps to be accessed in an RCU
read region without acquiring a reference (with just a NULL check).
2. Makes bpf_task_acquire() a KF_ACQUIRE | KF_RCU | KF_RET_NULL kfunc.
3. Removes bpf_task_kptr_get() and bpf_task_acquire_not_zero(), as
they're now redundant with the above two changes.
4. Updates selftests and documentation accordingly.
---
Changelog:
v1: https://lore.kernel.org/all/20230331005733.406202-1-void@manifault.com/
v1 -> v2:
- Remove testcases validating nested trust inheritance. The first
version used 'struct task_struct __rcu *parent', but because that
field has the __rcu tag it functions differently on gcc and llvm and
causes gcc selftests to fail. Alexei is reworking nested trust,
anyways so let's leave it off for now (Alexei).
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# f85671c6 31-Mar-2023 David Vernet <void@manifault.com>

bpf: Remove now-defunct task kfuncs

In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the
'refcount_t rcu_users' field was extracted out of a union with the
'struct rcu_head rcu' fie

bpf: Remove now-defunct task kfuncs

In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the
'refcount_t rcu_users' field was extracted out of a union with the
'struct rcu_head rcu' field. This allows us to safely perform a
refcount_inc_not_zero() on task->rcu_users when acquiring a reference on
a task struct. A prior patch leveraged this by making struct task_struct
an RCU-protected object in the verifier, and by bpf_task_acquire() to
use the task->rcu_users field for synchronization.

Now that we can use RCU to protect tasks, we no longer need
bpf_task_kptr_get(), or bpf_task_acquire_not_zero(). bpf_task_kptr_get()
is truly completely unnecessary, as we can just use RCU to get the
object. bpf_task_acquire_not_zero() is now equivalent to
bpf_task_acquire().

In addition to these changes, this patch also updates the associated
selftests to no longer use these kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230331195733.699708-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# d02c48fa 31-Mar-2023 David Vernet <void@manifault.com>

bpf: Make struct task_struct an RCU-safe type

struct task_struct objects are a bit interesting in terms of how their
lifetime is protected by refcounts. task structs have two refcount
fields:

1. re

bpf: Make struct task_struct an RCU-safe type

struct task_struct objects are a bit interesting in terms of how their
lifetime is protected by refcounts. task structs have two refcount
fields:

1. refcount_t usage: Protects the memory backing the task struct. When
this refcount drops to 0, the task is immediately freed, without
waiting for an RCU grace period to elapse. This is the field that
most callers in the kernel currently use to ensure that a task
remains valid while it's being referenced, and is what's currently
tracked with bpf_task_acquire() and bpf_task_release().

2. refcount_t rcu_users: A refcount field which, when it drops to 0,
schedules an RCU callback that drops a reference held on the 'usage'
field above (which is acquired when the task is first created). This
field therefore provides a form of RCU protection on the task by
ensuring that at least one 'usage' refcount will be held until an RCU
grace period has elapsed. The qualifier "a form of" is important
here, as a task can remain valid after task->rcu_users has dropped to
0 and the subsequent RCU gp has elapsed.

In terms of BPF, we want to use task->rcu_users to protect tasks that
function as referenced kptrs, and to allow tasks stored as referenced
kptrs in maps to be accessed with RCU protection.

Let's first determine whether we can safely use task->rcu_users to
protect tasks stored in maps. All of the bpf_task* kfuncs can only be
called from tracepoint, struct_ops, or BPF_PROG_TYPE_SCHED_CLS, program
types. For tracepoint and struct_ops programs, the struct task_struct
passed to a program handler will always be trusted, so it will always be
safe to call bpf_task_acquire() with any task passed to a program.
Note, however, that we must update bpf_task_acquire() to be KF_RET_NULL,
as it is possible that the task has exited by the time the program is
invoked, even if the pointer is still currently valid because the main
kernel holds a task->usage refcount. For BPF_PROG_TYPE_SCHED_CLS, tasks
should never be passed as an argument to the any program handlers, so it
should not be relevant.

The second question is whether it's safe to use RCU to access a task
that was acquired with bpf_task_acquire(), and stored in a map. Because
bpf_task_acquire() now uses task->rcu_users, it follows that if the task
is present in the map, that it must have had at least one
task->rcu_users refcount by the time the current RCU cs was started.
Therefore, it's safe to access that task until the end of the current
RCU cs.

With all that said, this patch makes struct task_struct is an
RCU-protected object. In doing so, we also change bpf_task_acquire() to
be KF_ACQUIRE | KF_RCU | KF_RET_NULL, and adjust any selftests as
necessary. A subsequent patch will remove bpf_task_kptr_get(), and
bpf_task_acquire_not_zero() respectively.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230331195733.699708-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# cecdd52a 28-Mar-2023 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-intel-next

Catch up with 6.3-rc cycle...

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


Revision tags: v6.3-rc4
# 496f4f1b 26-Mar-2023 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Don't invoke KPTR_REF destructor on NULL xchg'

David Vernet says:

====================

When a map value is being freed, we loop over all of the fields of the
corresponding BPF object

Merge branch 'Don't invoke KPTR_REF destructor on NULL xchg'

David Vernet says:

====================

When a map value is being freed, we loop over all of the fields of the
corresponding BPF object and issue the appropriate cleanup calls
corresponding to the field's type. If the field is a referenced kptr, we
atomically xchg the value out of the map, and invoke the kptr's
destructor on whatever was there before.

Currently, we always invoke the destructor (or bpf_obj_drop() for a
local kptr) on any kptr, including if no value was xchg'd out of the
map. This means that any function serving as the kptr's KF_RELEASE
destructor must always treat the argument as possibly NULL, and we
invoke unnecessary (and seemingly unsafe) cleanup logic for the local
kptr path as well.

This is an odd requirement -- KF_RELEASE kfuncs that are invoked by BPF
programs do not have this restriction, and the verifier will fail to
load the program if the register containing the to-be-released type has
any untrusted modifiers (e.g. PTR_UNTRUSTED or PTR_MAYBE_NULL). So as to
simplify the expectations required for a KF_RELEASE kfunc, this patch
set updates the KPTR_REF destructor logic to only be invoked when a
non-NULL value is xchg'd out of the map.

Additionally, the patch removes now-unnecessary KF_RELEASE calls from
several kfuncs, and finally, updates the verifier to have KF_RELEASE
automatically imply KF_TRUSTED_ARGS. This restriction was already
implicitly happening because of the aforementioned logic in the verifier
to reject any regs with untrusted modifiers, and to enforce that
KF_RELEASE args are passed with a 0 offset. This change just updates the
behavior to match that of other trusted args. This patch is left to the
end of the series in case it happens to be controversial, as it arguably
is slightly orthogonal to the purpose of the rest of the series.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# fb2211a5 25-Mar-2023 David Vernet <void@manifault.com>

bpf: Remove now-unnecessary NULL checks for KF_RELEASE kfuncs

Now that we're not invoking kfunc destructors when the kptr in a map was
NULL, we no longer require NULL checks in many of our KF_RELEAS

bpf: Remove now-unnecessary NULL checks for KF_RELEASE kfuncs

Now that we're not invoking kfunc destructors when the kptr in a map was
NULL, we no longer require NULL checks in many of our KF_RELEASE kfuncs.
This patch removes those NULL checks.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230325213144.486885-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# e752ab11 20-Mar-2023 Rob Clark <robdclark@chromium.org>

Merge remote-tracking branch 'drm/drm-next' into msm-next

Merge drm-next into msm-next to pick up external clk and PM dependencies
for improved a6xx GPU reset sequence.

Signed-off-by: Rob Clark <ro

Merge remote-tracking branch 'drm/drm-next' into msm-next

Merge drm-next into msm-next to pick up external clk and PM dependencies
for improved a6xx GPU reset sequence.

Signed-off-by: Rob Clark <robdclark@chromium.org>

show more ...


Revision tags: v6.3-rc3
# d26a3a6c 17-Mar-2023 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.3-rc2' into next

Merge with mainline to get of_property_present() and other newer APIs.


# 283b40c5 14-Mar-2023 Martin KaFai Lau <martin.lau@kernel.org>

Merge branch 'bpf: Allow helpers access ptr_to_btf_id.'

Alexei Starovoitov says:

====================

From: Alexei Starovoitov <ast@kernel.org>

Allow code like:
bpf_strncmp(task->comm, 16, "foo")

Merge branch 'bpf: Allow helpers access ptr_to_btf_id.'

Alexei Starovoitov says:

====================

From: Alexei Starovoitov <ast@kernel.org>

Allow code like:
bpf_strncmp(task->comm, 16, "foo");
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

show more ...


# c9267aa8 14-Mar-2023 Alexei Starovoitov <ast@kernel.org>

bpf: Fix bpf_strncmp proto.

bpf_strncmp() doesn't write into its first argument.
Make sure that the verifier knows about it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David Verne

bpf: Fix bpf_strncmp proto.

bpf_strncmp() doesn't write into its first argument.
Make sure that the verifier knows about it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230313235845.61029-2-alexei.starovoitov@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

show more ...


# b3c9a041 13-Mar-2023 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-fixes into drm-misc-fixes

Backmerging to get latest upstream.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# a1eccc57 13-Mar-2023 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next

Backmerging to get v6.3-rc1 and sync with the other DRM trees.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


Revision tags: v6.3-rc2
# 49b5300f 11-Mar-2023 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Support stashing local kptrs with bpf_kptr_xchg'

Dave Marchevsky says:

====================

Local kptrs are kptrs allocated via bpf_obj_new with a type specified in program
BTF. A BP

Merge branch 'Support stashing local kptrs with bpf_kptr_xchg'

Dave Marchevsky says:

====================

Local kptrs are kptrs allocated via bpf_obj_new with a type specified in program
BTF. A BPF program which creates a local kptr has exclusive control of the
lifetime of the kptr, and, prior to terminating, must:

* free the kptr via bpf_obj_drop
* If the kptr is a {list,rbtree} node, add the node to a {list, rbtree},
thereby passing control of the lifetime to the collection

This series adds a third option:

* stash the kptr in a map value using bpf_kptr_xchg

As indicated by the use of "stash" to describe this behavior, the intended use
of this feature is temporary storage of local kptrs. For example, a sched_ext
([0]) scheduler may want to create an rbtree node for each new cgroup on cgroup
init, but to add that node to the rbtree as part of a separate program which
runs on enqueue. Stashing the node in a map_value allows its lifetime to outlive
the execution of the cgroup_init program.

Behavior:

There is no semantic difference between adding a kptr to a graph collection and
"stashing" it in a map. In both cases exclusive ownership of the kptr's lifetime
is passed to some containing data structure, which is responsible for
bpf_obj_drop'ing it when the container goes away.

Since graph collections also expect exclusive ownership of the nodes they
contain, graph nodes cannot be both stashed in a map_value and contained by
their corresponding collection.

Implementation:

Two observations simplify the verifier changes for this feature. First, kptrs
("referenced kptrs" until a recent renaming) require registration of a
dtor function as part of their acquire/release semantics, so that a referenced
kptr which is placed in a map_value is properly released when the map goes away.
We want this exact behavior for local kptrs, but with bpf_obj_drop as the dtor
instead of a per-btf_id dtor.

The second observation is that, in terms of identification, "referenced kptr"
and "local kptr" already don't interfere with one another. Consider the
following example:

struct node_data {
long key;
long data;
struct bpf_rb_node node;
};

struct map_value {
struct node_data __kptr *node;
};

struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, int);
__type(value, struct map_value);
__uint(max_entries, 1);
} some_nodes SEC(".maps");

struct map_value *mapval;
struct node_data *res;
int key = 0;

res = bpf_obj_new(typeof(*res));
if (!res) { /* err handling */ }

mapval = bpf_map_lookup_elem(&some_nodes, &key);
if (!mapval) { /* err handling */ }

res = bpf_kptr_xchg(&mapval->node, res);
if (res)
bpf_obj_drop(res);

The __kptr tag identifies map_value's node as a referenced kptr, while the
PTR_TO_BTF_ID which bpf_obj_new returns - a type in some non-vmlinux,
non-module BTF - identifies res as a local kptr. Type tag on the pointer
indicates referenced kptr, while the type of the pointee indicates local kptr.
So using existing facilities we can tell the verifier about a "referenced kptr"
pointer to a "local kptr" pointee.

When kptr_xchg'ing a kptr into a map_value, the verifier can recognize local
kptr types and treat them like referenced kptrs with a properly-typed
bpf_obj_drop as a dtor.

Other implementation notes:
* We don't need to do anything special to enforce "graph nodes cannot be
both stashed in a map_value and contained by their corresponding collection"
* bpf_kptr_xchg both returns and takes as input a (possibly-null) owning
reference. It does not accept non-owning references as input by virtue
of requiring a ref_obj_id. By definition, if a program has an owning
ref to a node, the node isn't in a collection, so it's safe to pass
ownership via bpf_kptr_xchg.

Summary of patches:

* Patch 1 modifies BTF plumbing to support using bpf_obj_drop as a dtor
* Patch 2 adds verifier plumbing to support MEM_ALLOC-flagged param for
bpf_kptr_xchg
* Patch 3 adds selftests exercising the new behavior

Changelog:

v1 -> v2: https://lore.kernel.org/bpf/20230309180111.1618459-1-davemarchevsky@fb.com/

Patch #s used below refer to the patch's position in v1 unless otherwise
specified.

Patches 1-3 were applied and are not included in v2.
Rebase onto latest bpf-next: "libbpf: Revert poisoning of strlcpy"

Patch 4: "bpf: Support __kptr to local kptrs"
* Remove !btf_is_kernel(btf) check, WARN_ON_ONCE instead (Alexei)

Patch 6: "selftests/bpf: Add local kptr stashing test"
* Add test which stashes 2 nodes and later unstashes one of them using a
separate BPF program (Alexei)
* Fix incorrect runner subtest name for original test (was
"rbtree_add_nodes")
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# c8e18754 11-Mar-2023 Dave Marchevsky <davemarchevsky@fb.com>

bpf: Support __kptr to local kptrs

If a PTR_TO_BTF_ID type comes from program BTF - not vmlinux or module
BTF - it must have been allocated by bpf_obj_new and therefore must be
free'd with bpf_obj_d

bpf: Support __kptr to local kptrs

If a PTR_TO_BTF_ID type comes from program BTF - not vmlinux or module
BTF - it must have been allocated by bpf_obj_new and therefore must be
free'd with bpf_obj_drop. Such a PTR_TO_BTF_ID is considered a "local
kptr" and is tagged with MEM_ALLOC type tag by bpf_obj_new.

This patch adds support for treating __kptr-tagged pointers to "local
kptrs" as having an implicit bpf_obj_drop destructor for referenced kptr
acquire / release semantics. Consider the following example:

struct node_data {
long key;
long data;
struct bpf_rb_node node;
};

struct map_value {
struct node_data __kptr *node;
};

struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, int);
__type(value, struct map_value);
__uint(max_entries, 1);
} some_nodes SEC(".maps");

If struct node_data had a matching definition in kernel BTF, the verifier would
expect a destructor for the type to be registered. Since struct node_data does
not match any type in kernel BTF, the verifier knows that there is no kfunc
that provides a PTR_TO_BTF_ID to this type, and that such a PTR_TO_BTF_ID can
only come from bpf_obj_new. So instead of searching for a registered dtor,
a bpf_obj_drop dtor can be assumed.

This allows the runtime to properly destruct such kptrs in
bpf_obj_free_fields, which enables maps to clean up map_vals w/ such
kptrs when going away.

Implementation notes:
* "kernel_btf" variable is renamed to "kptr_btf" in btf_parse_kptr.
Before this patch, the variable would only ever point to vmlinux or
module BTFs, but now it can point to some program BTF for local kptr
type. It's later used to populate the (btf, btf_id) pair in kptr btf
field.
* It's necessary to btf_get the program BTF when populating btf_field
for local kptr. btf_record_free later does a btf_put.
* Behavior for non-local referenced kptrs is not modified, as
bpf_find_btf_id helper only searches vmlinux and module BTFs for
matching BTF type. If such a type is found, btf_field_kptr's btf will
pass btf_is_kernel check, and the associated release function is
some one-argument dtor. If btf_is_kernel check fails, associated
release function is two-arg bpf_obj_drop_impl. Before this patch
only btf_field_kptr's w/ kernel or module BTFs were created.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230310230743.2320707-2-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# b8fa3e38 10-Mar-2023 Arnaldo Carvalho de Melo <acme@redhat.com>

Merge remote-tracking branch 'acme/perf-tools' into perf-tools-next

To pick up perf-tools fixes just merged upstream.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 23e403b3 09-Mar-2023 Alexei Starovoitov <ast@kernel.org>

Merge branch 'BPF open-coded iterators'

Andrii Nakryiko says:

====================

Add support for open-coded (aka inline) iterators in BPF world. This is a next
evolution of gradually allowing mo

Merge branch 'BPF open-coded iterators'

Andrii Nakryiko says:

====================

Add support for open-coded (aka inline) iterators in BPF world. This is a next
evolution of gradually allowing more powerful and less restrictive looping and
iteration capabilities to BPF programs.

We set up a framework for implementing all kinds of iterators (e.g., cgroup,
task, file, etc, iterators), but this patch set only implements numbers
iterator, which is used to implement ergonomic bpf_for() for-like construct
(see patches #4-#5). We also add bpf_for_each(), which is a generic
foreach-like construct that will work with any kind of open-coded iterator
implementation, as long as we stick with bpf_iter_<type>_{new,next,destroy}()
naming pattern (which we now enforce on the kernel side).

Patch #1 is preparatory refactoring for easier way to check for special kfunc
calls. Patch #2 is adding iterator kfunc registration and validation logic,
which is mostly independent from the rest of open-coded iterator logic, so is
separated out for easier reviewing.

The meat of verifier-side logic is in patch #3. Patch #4 implements numbers
iterator. I kept them separate to have clean reference for how to integrate
new iterator types (now even simpler to do than in v1 of this patch set).
Patch #5 adds bpf_for(), bpf_for_each(), and bpf_repeat() macros to
bpf_misc.h, and also adds yet another pyperf test variant, now with bpf_for()
loop. Patch #6 is verification tests, based on numbers iterator (as the only
available right now). Patch #7 actually tests runtime behavior of numbers
iterator.

Finally, with changes in v2, it's possible and trivial to implement custom
iterators completely in kernel modules, which we showcase and test by adding
a simple iterator returning same number a given number of times to
bpf_testmod. Patch #8 is where all this happens and is tested.

Most of the relevant details are in corresponding commit messages or code
comments.

v4->v5:
- fixing missed inner for() in is_iter_reg_valid_uninit, and fixed return
false (kernel test robot);
- typo fixes and comment/commit description improvements throughout the
patch set;
v3->v4:
- remove unused variable from is_iter_reg_valid_init (kernel test robot);
v2->v3:
- remove special kfunc leftovers for bpf_iter_num_{new,next,destroy};
- add iters/testmod_seq* to DENYLIST.s390x, it doesn't support kfuncs in
modules yet (CI);
v1->v2:
- rebased on latest, dropping previously landed preparatory patches;
- each iterator type now have its own `struct bpf_iter_<type>` which allows
each iterator implementation to use exactly as much stack space as
necessary, allowing to avoid runtime allocations (Alexei);
- reworked how iterator kfuncs are defined, no verifier changes are required
when adding new iterator type;
- added bpf_testmod-based iterator implementation;
- address the rest of feedback, comments, commit message adjustment, etc.

Cc: Tejun Heo <tj@kernel.org>
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 6018e1f4 08-Mar-2023 Andrii Nakryiko <andrii@kernel.org>

bpf: implement numbers iterator

Implement the first open-coded iterator type over a range of integers.

It's public API consists of:
- bpf_iter_num_new() constructor, which accepts [start, end) ra

bpf: implement numbers iterator

Implement the first open-coded iterator type over a range of integers.

It's public API consists of:
- bpf_iter_num_new() constructor, which accepts [start, end) range
(that is, start is inclusive, end is exclusive).
- bpf_iter_num_next() which will keep returning read-only pointer to int
until the range is exhausted, at which point NULL will be returned.
If bpf_iter_num_next() is kept calling after this, NULL will be
persistently returned.
- bpf_iter_num_destroy() destructor, which needs to be called at some
point to clean up iterator state. BPF verifier enforces that iterator
destructor is called at some point before BPF program exits.

Note that `start = end = X` is a valid combination to setup an empty
iterator. bpf_iter_num_new() will return 0 (success) for any such
combination.

If bpf_iter_num_new() detects invalid combination of input arguments, it
returns error, resets iterator state to, effectively, empty iterator, so
any subsequent call to bpf_iter_num_next() will keep returning NULL.

BPF verifier has no knowledge that returned integers are in the
[start, end) value range, as both `start` and `end` are not statically
known and enforced: they are runtime values.

While the implementation is pretty trivial, some care needs to be taken
to avoid overflows and underflows. Subsequent selftests will validate
correctness of [start, end) semantics, especially around extremes
(INT_MIN and INT_MAX).

Similarly to bpf_loop(), we enforce that no more than BPF_MAX_LOOPS can
be specified.

bpf_iter_num_{new,next,destroy}() is a logical evolution from bounded
BPF loops and bpf_loop() helper and is the basis for implementing
ergonomic BPF loops with no statically known or verified bounds.
Subsequent patches implement bpf_for() macro, demonstrating how this can
be wrapped into something that works and feels like a normal for() loop
in C language.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# 36e5e391 07-Mar-2023 Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-03-06

We've added 85 non-merge commits

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-03-06

We've added 85 non-merge commits during the last 13 day(s) which contain
a total of 131 files changed, 7102 insertions(+), 1792 deletions(-).

The main changes are:

1) Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and variable-sized
accesses, from Joanne Koong.

2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF
open-coded iterators allowing for less restrictive looping capabilities,
from Andrii Nakryiko.

3) Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
programs to NULL-check before passing such pointers into kfunc,
from Alexei Starovoitov.

4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
local storage maps, from Kumar Kartikeya Dwivedi.

5) Add BPF verifier support for ST instructions in convert_ctx_access()
which will help new -mcpu=v4 clang flag to start emitting them,
from Eduard Zingerman.

6) Make uprobe attachment Android APK aware by supporting attachment
to functions inside ELF objects contained in APKs via function names,
from Daniel Müller.

7) Add a new flag BPF_F_TIMER_ABS flag for bpf_timer_start() helper
to start the timer with absolute expiration value instead of relative
one, from Tero Kristo.

8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id,
from Tejun Heo.

9) Extend libbpf to support users manually attaching kprobes/uprobes
in the legacy/perf/link mode, from Menglong Dong.

10) Implement workarounds in the mips BPF JIT for DADDI/R4000,
from Jiaxun Yang.

11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT,
from Hengqi Chen.

12) Extend BPF instruction set doc with describing the encoding of BPF
instructions in terms of how bytes are stored under big/little endian,
from Jose E. Marchesi.

13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui.

14) Fix bpf_xdp_query() backwards compatibility on old kernels,
from Yonghong Song.

15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS,
from Florent Revest.

16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache,
from Hou Tao.

17) Fix BPF verifier's check_subprogs to not unnecessarily mark
a subprogram with has_tail_call, from Ilya Leoshkevich.

18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
selftests/bpf: Split test_attach_probe into multi subtests
libbpf: Add support to set kprobe/uprobe attach mode
tools/resolve_btfids: Add /libsubcmd to .gitignore
bpf: add support for fixed-size memory pointer returns for kfuncs
bpf: generalize dynptr_get_spi to be usable for iters
bpf: mark PTR_TO_MEM as non-null register type
bpf: move kfunc_call_arg_meta higher in the file
bpf: ensure that r0 is marked scratched after any function call
bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
bpf: clean up visit_insn()'s instruction processing
selftests/bpf: adjust log_fixup's buffer size for proper truncation
bpf: honor env->test_state_freq flag in is_state_visited()
selftests/bpf: enhance align selftest's expected log matching
bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}
bpf: improve stack slot state printing
selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access()
selftests/bpf: test if pointer type is tracked for BPF_ST_MEM
bpf: allow ctx writes using BPF_ST_MEM instruction
bpf: Use separate RCU callbacks for freeing selem
...
====================

Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


Revision tags: v6.3-rc1
# db55174d 03-Mar-2023 Daniel Borkmann <daniel@iogearbox.net>

Merge branch 'bpf-kptr-rcu'

Alexei Starovoitov says:

====================
v4->v5:
fix typos, add acks.

v3->v4:
- patch 3 got much cleaner after BPF_KPTR_RCU was removed as suggested by David.

- m

Merge branch 'bpf-kptr-rcu'

Alexei Starovoitov says:

====================
v4->v5:
fix typos, add acks.

v3->v4:
- patch 3 got much cleaner after BPF_KPTR_RCU was removed as suggested by David.

- make KF_RCU stronger and require that bpf program checks for NULL
before passing such pointers into kfunc. The prog has to do that anyway
to access fields and it aligns with BTF_TYPE_SAFE_RCU allowlist.

- New patch 6: refactor RCU enforcement in the verifier.
The patches 2,3,6 are part of one feature.
The 2 and 3 alone are incomplete, since RCU pointers are barely useful
without bpf_rcu_read_lock/unlock in GCC compiled kernel.
Even if GCC lands support for btf_type_tag today it will take time
to mandate that version for kernel builds. Hence go with allow list
approach. See patch 6 for details.
This allows to start strict enforcement of TRUSTED | UNTRUSTED
in one part of PTR_TO_BTF_ID accesses.
One step closer to KF_TRUSTED_ARGS by default.

v2->v3:
- Instead of requiring bpf progs to tag fields with __kptr_rcu
teach the verifier to infer RCU properties based on the type.
BPF_KPTR_RCU becomes kernel internal type of struct btf_field.
- Add patch 2 to tag cgroups and dfl_cgrp as trusted.
That bug was spotted by BPF CI on clang compiler kernels,
since patch 3 is doing:
static bool in_rcu_cs(struct bpf_verifier_env *env)
{
return env->cur_state->active_rcu_lock || !env->prog->aux->sleepable;
}
which makes all non-sleepable programs behave like they have implicit
rcu_read_lock around them. Which is the case in practice.
It was fine on gcc compiled kernels where task->cgroup deference was producing
PTR_TO_BTF_ID, but on clang compiled kernels task->cgroup deference was
producing PTR_TO_BTF_ID | MEM_RCU | MAYBE_NULL, which is more correct,
but selftests were failing. Patch 2 fixes this discrepancy.
With few more patches like patch 2 we can make KF_TRUSTED_ARGS default
for kfuncs and helpers.
- Add comment in selftest patch 5 that it's verifier only check.

v1->v2:
Instead of agressively allow dereferenced kptr_rcu pointers into KF_TRUSTED_ARGS
kfuncs only allow them into KF_RCU funcs.
The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free, but the pointers
maybe NULL and pointee object's reference count could have reached zero, hence
kfuncs must do != NULL check and consider refcnt==0 case when accessing such
arguments.
No changes in patch 1.
Patches 2,3,4 adjusted with above behavior.

v1:
The __kptr_ref turned out to be too limited, since any "trusted" pointer access
requires bpf_kptr_xchg() which is impractical when the same pointer needs
to be dereferenced by multiple cpus.
The __kptr "untrusted" only access isn't very useful in practice.
Rename __kptr to __kptr_untrusted with eventual goal to deprecate it,
and rename __kptr_ref to __kptr, since that looks to be more common use of kptrs.
Introduce __kptr_rcu that can be directly dereferenced and used similar
to native kernel C code.
Once bpf_cpumask and task_struct kfuncs are converted to observe RCU GP
when refcnt goes to zero, both __kptr and __kptr_untrusted can be deprecated
and __kptr_rcu can become the only __kptr tag.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


# 20c09d92 03-Mar-2023 Alexei Starovoitov <ast@kernel.org>

bpf: Introduce kptr_rcu.

The life time of certain kernel structures like 'struct cgroup' is protected by RCU.
Hence it's safe to dereference them directly from __kptr tagged pointers in bpf maps.
Th

bpf: Introduce kptr_rcu.

The life time of certain kernel structures like 'struct cgroup' is protected by RCU.
Hence it's safe to dereference them directly from __kptr tagged pointers in bpf maps.
The resulting pointer is MEM_RCU and can be passed to kfuncs that expect KF_RCU.
Derefrence of other kptr-s returns PTR_UNTRUSTED.

For example:
struct map_value {
struct cgroup __kptr *cgrp;
};

SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_get_ancestors, struct cgroup *cgrp_arg, const char *path)
{
struct cgroup *cg, *cg2;

cg = bpf_cgroup_acquire(cgrp_arg); // cg is PTR_TRUSTED and ref_obj_id > 0
bpf_kptr_xchg(&v->cgrp, cg);

cg2 = v->cgrp; // This is new feature introduced by this patch.
// cg2 is PTR_MAYBE_NULL | MEM_RCU.
// When cg2 != NULL, it's a valid cgroup, but its percpu_ref could be zero

if (cg2)
bpf_cgroup_ancestor(cg2, level); // safe to do.
}

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230303041446.3630-4-alexei.starovoitov@gmail.com

show more ...


# f71f8530 02-Mar-2023 Tero Kristo <tero.kristo@linux.intel.com>

bpf: Add support for absolute value BPF timers

Add a new flag BPF_F_TIMER_ABS that can be passed to bpf_timer_start()
to start an absolute value timer instead of the default relative value.
This mak

bpf: Add support for absolute value BPF timers

Add a new flag BPF_F_TIMER_ABS that can be passed to bpf_timer_start()
to start an absolute value timer instead of the default relative value.
This makes the timer expire at an exact point in time, instead of a time
with latencies induced by both the BPF and timer subsystems.

Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Link: https://lore.kernel.org/r/20230302114614.2985072-2-tero.kristo@linux.intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# c501bf55 02-Mar-2023 Tejun Heo <tj@kernel.org>

bpf: Make bpf_get_current_[ancestor_]cgroup_id() available for all program types

These helpers are safe to call from any context and there's no reason to
restrict access to them. Remove them from bp

bpf: Make bpf_get_current_[ancestor_]cgroup_id() available for all program types

These helpers are safe to call from any context and there's no reason to
restrict access to them. Remove them from bpf_trace and filter lists and add
to bpf_base_func_proto() under perfmon_capable().

v2: After consulting with Andrii, relocated in bpf_base_func_proto() so that
they require bpf_capable() but not perfomon_capable() as it doesn't read
from or affect others on the system.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/ZAD8QyoszMZiTzBY@slm.duckdns.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# c45eac53 02-Mar-2023 Joanne Koong <joannelkoong@gmail.com>

bpf: Fix bpf_dynptr_slice{_rdwr} to return NULL instead of 0

Change bpf_dynptr_slice and bpf_dynptr_slice_rdwr to return NULL instead
of 0, in accordance with the codebase guidelines.

Fixes: 66e3a1

bpf: Fix bpf_dynptr_slice{_rdwr} to return NULL instead of 0

Change bpf_dynptr_slice and bpf_dynptr_slice_rdwr to return NULL instead
of 0, in accordance with the codebase guidelines.

Fixes: 66e3a13e7c2c ("bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230302053014.1726219-1-joannelkoong@gmail.com

show more ...


# 7ce60b11 01-Mar-2023 David Vernet <void@manifault.com>

bpf: Fix doxygen comments for dynptr slice kfuncs

In commit 66e3a13e7c2c ("bpf: Add bpf_dynptr_slice and
bpf_dynptr_slice_rdwr"), the bpf_dynptr_slice() and
bpf_dynptr_slice_rdwr() kfuncs were added

bpf: Fix doxygen comments for dynptr slice kfuncs

In commit 66e3a13e7c2c ("bpf: Add bpf_dynptr_slice and
bpf_dynptr_slice_rdwr"), the bpf_dynptr_slice() and
bpf_dynptr_slice_rdwr() kfuncs were added to BPF. These kfuncs included
doxygen headers, but unfortunately those headers are not properly
formatted according to [0], and causes the following warnings during the
docs build:

./kernel/bpf/helpers.c:2225: warning: \
Excess function parameter 'returns' description in 'bpf_dynptr_slice'
./kernel/bpf/helpers.c:2303: warning: \
Excess function parameter 'returns' description in 'bpf_dynptr_slice_rdwr'
...

This patch fixes those doxygen comments.

[0]: https://docs.kernel.org/doc-guide/kernel-doc.html#function-documentation

Fixes: 66e3a13e7c2c ("bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr")
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230301194910.602738-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# c4b5c5ba 01-Mar-2023 Alexei Starovoitov <ast@kernel.org>

Merge branch 'Add skb + xdp dynptrs'

Joanne Koong says:

====================

This patchset is the 2nd in the dynptr series. The 1st can be found here [0].

This patchset adds skb and xdp type dynp

Merge branch 'Add skb + xdp dynptrs'

Joanne Koong says:

====================

This patchset is the 2nd in the dynptr series. The 1st can be found here [0].

This patchset adds skb and xdp type dynptrs, which have two main benefits for
packet parsing:
* allowing operations on sizes that are not statically known at
compile-time (eg variable-sized accesses).
* more ergonomic and less brittle iteration through data (eg does not need
manual if checking for being within bounds of data_end)

When comparing the differences in runtime for packet parsing without dynptrs
vs. with dynptrs, there is no noticeable difference. Patch 9 contains more
details as well as examples of how to use skb and xdp dynptrs.

[0] https://lore.kernel.org/bpf/20220523210712.3641569-1-joannelkoong@gmail.com/
---
Changelog:

v12 = https://lore.kernel.org/bpf/20230226085120.3907863-1-joannelkoong@gmail.com/
v12 -> v13:
* Fix missing { } for case statement

v11 = https://lore.kernel.org/bpf/20230222060747.2562549-1-joannelkoong@gmail.com/
v11 -> v12:
* Change constant mem size checking to use "__szk" kfunc annotation
for slices
* Use autoloading for success selftests

v10 = https://lore.kernel.org/bpf/20230216225524.1192789-1-joannelkoong@gmail.com/
v10 -> v11:
* Reject bpf_dynptr_slice_rdwr() for non-writable progs at load time
instead of runtime
* Add additional patch (__uninit kfunc annotation)
* Expand on documentation
* Add bpf_dynptr_write() calls for persisting writes in tests

v9 = https://lore.kernel.org/bpf/20230127191703.3864860-1-joannelkoong@gmail.com/
v9 -> v10:
* Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr interface
* Add some more tests
* Split up patchset into more parts to make it easier to review

v8 = https://lore.kernel.org/bpf/20230126233439.3739120-1-joannelkoong@gmail.com/
v8 -> v9:
* Fix dynptr_get_type() to check non-stack dynptrs

v7 = https://lore.kernel.org/bpf/20221021011510.1890852-1-joannelkoong@gmail.com/
v7 -> v8:
* Change helpers to kfuncs
* Add 2 new patches (1/5 and 2/5)

v6 = https://lore.kernel.org/bpf/20220907183129.745846-1-joannelkoong@gmail.com/
v6 -> v7
* Change bpf_dynptr_data() to return read-only data slices if the skb prog
is read-only (Martin)
* Add test "skb_invalid_write" to test that writes to rd-only data slices
are rejected

v5 = https://lore.kernel.org/bpf/20220831183224.3754305-1-joannelkoong@gmail.com/
v5 -> v6
* Address kernel test robot errors by static inlining

v4 = https://lore.kernel.org/bpf/20220822235649.2218031-1-joannelkoong@gmail.com/
v4 -> v5
* Address kernel test robot errors for configs w/out CONFIG_NET set
* For data slices, return PTR_TO_MEM instead of PTR_TO_PACKET (Kumar)
* Split selftests into subtests (Andrii)
* Remove insn patching. Use rdonly and rdwr protos for dynptr skb
construction (Andrii)
* bpf_dynptr_data() returns NULL for rd-only dynptrs. There will be a
separate bpf_dynptr_data_rdonly() added later (Andrii and Kumar)

v3 = https://lore.kernel.org/bpf/20220822193442.657638-1-joannelkoong@gmail.com/
v3 -> v4
* Forgot to commit --amend the kernel test robot error fixups

v2 = https://lore.kernel.org/bpf/20220811230501.2632393-1-joannelkoong@gmail.com/
v2 -> v3
* Fix kernel test robot build test errors

v1 = https://lore.kernel.org/bpf/20220726184706.954822-1-joannelkoong@gmail.com/
v1 -> v2
* Return data slices to rd-only skb dynptrs (Martin)
* bpf_dynptr_write allows writes to frags for skb dynptrs, but always
invalidates associated data slices (Martin)
* Use switch casing instead of ifs (Andrii)
* Use 0xFD for experimental kind number in the selftest (Zvi)
* Put selftest conversions w/ dynptrs into new files (Alexei)
* Add new selftest "test_cls_redirect_dynptr.c"
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


1...<<11121314151617181920>>...42