History log of /linux/io_uring/bpf_filter.h (Results 1 – 5 of 5)
Revision Date Author Comments
# c17ee635 23-Feb-2026 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-fixes into drm-misc-fixes

7.0-rc1 was just released, let's merge it to kick the new release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>


Revision tags: v7.0-rc1
# 591beb0e 10-Feb-2026 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'io_uring-bpf-restrictions.4-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring bpf filters from Jens Axboe:
"This adds support for cBPF filters for io_uring, as well as task
inherited restrictions and filters.

seccomp and io_uring don't play along nicely, as most of the
interesting data to filter on resides somewhat out-of-band, in the
submission queue ring.

As a result, things like containers and systemd that apply seccomp
filters, can't filter io_uring operations.

That leaves them with just one choice if filtering is critical -
filter the actual io_uring_setup(2) system call to simply disallow
io_uring. That's rather unfortunate, and has limited us because of it.

io_uring already has some filtering support. It requires the ring to
be set up in a disabled state, and then a filter set can be applied.
This filter set is completely bi-modal - an opcode is either enabled
or it's not. Once a filter set is registered, the ring can be enabled.
This is very restrictive, and it's not useful at all to systemd or
containers which really want both broader and more specific control.

This first adds support for cBPF filters for opcodes, which enables
tighter control over what exactly a specific opcode may do. As
examples, specific support is added for IORING_OP_OPENAT/OPENAT2,
allowing filtering on resolve flags. And another example is added for
IORING_OP_SOCKET, allowing filtering on domain/type/protocol. These
are both common use cases. cBPF was chosen rather than eBPF, because
the latter is often restricted in containers as well.

These filters are run after the init phase of the request, which allows
filters to even dip into data that is being passed in a struct in user
memory, as the init side of requests makes that data stable by bringing
it into the kernel. This allows filtering without needing to copy this
data twice, or have filters etc know about the exact layout of the
user data. The filters get the already copied and sanitized data
passed.

On top of that support is added for per-task filters, meaning that any
ring created with a task that has a per-task filter will get those
filters applied when it's created. These filters are inherited across
fork as well. Once a filter has been registered, any further added
filters may only further restrict what operations are permitted.

Filters cannot change the return value of an operation, they can only
permit or deny it based on the contents"

* tag 'io_uring-bpf-restrictions.4-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring: allow registration of per-task restrictions
io_uring: add task fork hook
io_uring/bpf_filter: add ref counts to struct io_bpf_filter
io_uring/bpf_filter: cache lookup table in ctx->bpf_filters
io_uring/bpf_filter: allow filtering on contents of struct open_how
io_uring/net: allow filtering on IORING_OP_SOCKET data
io_uring: add support for BPF filtering for opcode restrictions



Revision tags: v6.19, v6.19-rc8, v6.19-rc7, v6.19-rc6, v6.19-rc5
# ed82f35b 08-Jan-2026 Jens Axboe <axboe@kernel.dk>

io_uring: allow registration of per-task restrictions

Currently io_uring supports restricting operations on a per-ring basis.
To use those, the ring must be set up in a disabled state by setting
IORING_SETUP_R_DISABLED. Then restrictions can be set for the ring, and
the ring can then be enabled.

This commit adds support for IORING_REGISTER_RESTRICTIONS with ring_fd
== -1, like the other "blind" register opcodes which work on the task
rather than a specific ring. This allows registration of the same kind
of restrictions as can be done on a specific ring, but with the task
itself. Once done, any ring created will inherit these restrictions.
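The task-level registration described above can be sketched as follows. This is a minimal illustration, not code from the patch: it assumes the existing UAPI definitions in <linux/io_uring.h> (struct io_uring_restriction, IORING_RESTRICTION_SQE_OP, IORING_REGISTER_RESTRICTIONS) and a raw io_uring_register(2) call with fd == -1 as the commit message states. The helper names build_allowlist() and register_task_restrictions() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

/*
 * Fill 'res' with one "allow this SQE opcode" entry per element of 'ops'.
 * Returns the number of entries written.
 */
size_t build_allowlist(struct io_uring_restriction *res,
                       const __u8 *ops, size_t n)
{
    assert(res != NULL && ops != NULL);
    memset(res, 0, n * sizeof(*res));
    for (size_t i = 0; i < n; i++) {
        res[i].opcode = IORING_RESTRICTION_SQE_OP;
        res[i].sqe_op = ops[i];
    }
    return n;
}

/*
 * Register the restrictions against the calling task rather than a ring
 * by passing -1 as the ring fd. On kernels without this feature the call
 * is expected to fail, returning -1 with errno set.
 */
long register_task_restrictions(struct io_uring_restriction *res,
                                unsigned int n)
{
    return syscall(__NR_io_uring_register, -1,
                   IORING_REGISTER_RESTRICTIONS, res, n);
}
```

A caller would build the array with build_allowlist() and then pass it to register_task_restrictions(). Since the restrictions are inherited and can only be tightened afterwards, a real program would do this before creating rings or forking children.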

If a restriction filter is registered with a task, then it's inherited
on fork for its children. Children may only further restrict operations,
not extend them.

Inherited restrictions include both the classic
IORING_REGISTER_RESTRICTIONS based restrictions, as well as the BPF
filters that have been registered with the task via
IORING_REGISTER_BPF_FILTER.

Signed-off-by: Jens Axboe <axboe@kernel.dk>



# e7c30675 17-Jan-2026 Jens Axboe <axboe@kernel.dk>

io_uring/bpf_filter: cache lookup table in ctx->bpf_filters

Currently a few pointer dereferences need to be made to both check if
BPF filters are installed, and then also to retrieve the actual filter
for the opcode. Cache the table in ctx->bpf_filters to avoid that.

Add a bit of debug info on ring exit to show if we ever got this wrong.
Small risk of that given that the table is currently only updated in one
spot, but once task forking is enabled, that will add one more spot.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>



# d42eb05e 15-Jan-2026 Jens Axboe <axboe@kernel.dk>

io_uring: add support for BPF filtering for opcode restrictions

Add support for loading classic BPF programs with io_uring to provide
fine-grained filtering of SQE operations. Unlike
IORING_REGISTER_RESTRICTIONS which only allows bitmap-based allow/deny
of opcodes, BPF filters can inspect request attributes and make dynamic
decisions.

The filter is registered via IORING_REGISTER_BPF_FILTER with a struct
io_uring_bpf:

struct io_uring_bpf_filter {
	__u32 opcode; /* io_uring opcode to filter */
	__u32 flags;
	__u32 filter_len; /* number of BPF instructions */
	__u32 resv;
	__u64 filter_ptr; /* pointer to BPF filter */
	__u64 resv2[5];
};

enum {
	IO_URING_BPF_CMD_FILTER = 1,
};

struct io_uring_bpf {
	__u16 cmd_type; /* IO_URING_BPF_* values */
	__u16 cmd_flags; /* none so far */
	__u32 resv;
	union {
		struct io_uring_bpf_filter filter;
	};
};

and the filters get supplied a struct io_uring_bpf_ctx:

struct io_uring_bpf_ctx {
	__u64 user_data;
	__u8 opcode;
	__u8 sqe_flags;
	__u8 pdu_size;
	__u8 pad[5];
};

where it's possible to filter on opcode and sqe_flags, with pdu_size
indicating how much extra data is being passed in beyond the pad field.
This will be used for specific finer grained filtering inside an opcode.
An example of that for sockets is in one of the following patches.
Anything the opcode supports can end up in this struct, populated by
the opcode itself, and hence can be filtered for.
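As an illustration of filtering on sqe_flags, the sketch below builds a four-instruction cBPF program and fills in the registration command. It rests on several assumptions that the text above does not confirm: the structs are local mirrors copied from this commit message rather than taken from a released header, the program uses 32-bit absolute loads (as seccomp requires) on a little-endian machine, and the mask 0x04 is used as the example flag (IOSQE_IO_LINK in current UAPI headers). The names deny_linked_prog and fill_filter_cmd() are hypothetical, and the actual IORING_REGISTER_BPF_FILTER call is left out because its numeric value and argument convention are not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */
#include <linux/types.h>

/* Local mirrors of the structs quoted in the commit message. */
struct io_uring_bpf_filter {
    __u32 opcode;
    __u32 flags;
    __u32 filter_len;
    __u32 resv;
    __u64 filter_ptr;
    __u64 resv2[5];
};

enum { IO_URING_BPF_CMD_FILTER = 1 };

struct io_uring_bpf {
    __u16 cmd_type;
    __u16 cmd_flags;
    __u32 resv;
    union {
        struct io_uring_bpf_filter filter;
    };
};

struct io_uring_bpf_ctx {
    __u64 user_data;
    __u8 opcode;
    __u8 sqe_flags;
    __u8 pdu_size;
    __u8 pad[5];
};

static_assert(sizeof(struct io_uring_bpf_ctx) == 16, "unexpected ctx size");

/* Example flag to reject; assumption: 0x04 is the flag of interest. */
#define EXAMPLE_DENY_FLAG 0x04u
/* Little-endian assumption: in the 32-bit word at offset 8, bits 8..15
 * hold sqe_flags (byte 0 is opcode, byte 2 is pdu_size). */
#define SQE_FLAGS_SHIFT 8

const struct sock_filter deny_linked_prog[] = {
    /* A = 32-bit word starting at the opcode field (offset 8) */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
             offsetof(struct io_uring_bpf_ctx, opcode)),
    /* if (A & flag) fall through to deny, else skip to allow */
    BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K,
             EXAMPLE_DENY_FLAG << SQE_FLAGS_SHIFT, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, 0),   /* deny */
    BPF_STMT(BPF_RET | BPF_K, 1),   /* allow */
};

#define DENY_LINKED_LEN \
    (sizeof(deny_linked_prog) / sizeof(deny_linked_prog[0]))

/* Populate the registration command for one opcode and one program. */
void fill_filter_cmd(struct io_uring_bpf *cmd, __u32 opcode,
                     const struct sock_filter *prog, __u32 len)
{
    memset(cmd, 0, sizeof(*cmd));   /* reserved fields stay zero */
    cmd->cmd_type = IO_URING_BPF_CMD_FILTER;
    cmd->filter.opcode = opcode;
    cmd->filter.filter_len = len;
    cmd->filter.filter_ptr = (__u64)(uintptr_t)prog;
}
```

The filled struct io_uring_bpf would then be handed to the kernel via IORING_REGISTER_BPF_FILTER. Whether byte-sized loads are also accepted, which would remove the endianness dependency, is not something this log states.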

Filters have the following semantics:
- Return 1 to allow the request
- Return 0 to deny the request with -EACCES
- Multiple filters can be stacked per opcode. All filters must
  return 1 for the opcode to be allowed.
- Filters are evaluated in reverse registration order (most recent first)
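The stacking semantics listed above can be modelled in a few lines of userspace C. This is only a model of the described behaviour, not the kernel implementation: the types sketch_ctx and sketch_filter and the functions sketch_push() and sketch_run_filters() are hypothetical, and plain function pointers stand in for cBPF programs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Reduced stand-in for the context a filter gets to inspect. */
struct sketch_ctx {
    unsigned char opcode;
    unsigned char sqe_flags;
};

typedef int (*sketch_filter_fn)(const struct sketch_ctx *ctx);

/* One registered filter; 'next' points at the previously registered one. */
struct sketch_filter {
    sketch_filter_fn run;
    struct sketch_filter *next;
};

/* Register a filter at the head, so the most recent one runs first. */
void sketch_push(struct sketch_filter **head, struct sketch_filter *f,
                 sketch_filter_fn run)
{
    assert(head != NULL && f != NULL && run != NULL);
    f->run = run;
    f->next = *head;
    *head = f;
}

/*
 * Run every stacked filter. The request is allowed (0) only if all
 * filters return 1; the first filter that does not yields -EACCES.
 */
int sketch_run_filters(const struct sketch_filter *head,
                       const struct sketch_ctx *ctx)
{
    for (const struct sketch_filter *f = head; f != NULL; f = f->next) {
        if (f->run(ctx) != 1)
            return -EACCES;
    }
    return 0;
}

/* Two example filters: one permissive, one rejecting any SQE flags. */
int sketch_allow_all(const struct sketch_ctx *ctx)
{
    (void)ctx;
    return 1;
}

int sketch_deny_flagged(const struct sketch_ctx *ctx)
{
    return ctx->sqe_flags == 0;
}
```

Because a later filter can only add another condition that must also pass, stacking in this model can narrow what is permitted but never widen it, matching the "may only further restrict" rule described for inherited filters.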

The implementation uses classic BPF (cBPF) rather than eBPF, as
that's required for containers, and since cBPF filters can be used by
any user in the system.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
