History log of /linux/kernel/bpf/token.c (Results 1 – 23 of 23)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# a23e1966 15-Jul-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.11 merge window.


Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2
# 6f47c7ae 28-May-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.9' into next

Sync up with the mainline to bring in the new cleanup API.


Revision tags: v6.10-rc1
# 60a2f25d 16-May-2024 Tvrtko Ursulin <tursulin@ursulin.net>

Merge drm/drm-next into drm-intel-gt-next

Some display refactoring patches are needed in order to allow conflict-
less merging.

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>


# 594ce0b8 10-Jun-2024 Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Merge topic branches 'clkdev' and 'fixes' into for-linus


Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1
# b228ab57 18-Mar-2024 Andrew Morton <akpm@linux-foundation.org>

Merge branch 'master' into mm-stable


# 79790b68 12-Apr-2024 Thomas Hellström <thomas.hellstrom@linux.intel.com>

Merge drm/drm-next into drm-xe-next

Backmerging drm-next in order to get up-to-date and in particular
to access commit 9ca5facd0400f610f3f7f71aeb7fc0b949a48c67.

Signed-off-by: Thomas Hellström <tho

Merge drm/drm-next into drm-xe-next

Backmerging drm-next in order to get up-to-date and in particular
to access commit 9ca5facd0400f610f3f7f71aeb7fc0b949a48c67.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

show more ...


# 3e5a516f 08-Apr-2024 Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Merge tag 'phy_dp_modes_6.10' into msm-next-lumag

Merge DisplayPort subnode API in order to allow DisplayPort driver to
configure the PHYs either to the DP or eDP mode, depending on hardware
configu

Merge tag 'phy_dp_modes_6.10' into msm-next-lumag

Merge DisplayPort subnode API in order to allow DisplayPort driver to
configure the PHYs either to the DP or eDP mode, depending on hardware
configuration.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

show more ...


# 5add703f 02-Apr-2024 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-intel-next

Catching up on 6.9-rc2

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 0d21364c 02-Apr-2024 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next

Backmerging to get v6.9-rc2 changes into drm-misc-next.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# b7e1e969 26-Mar-2024 Takashi Iwai <tiwai@suse.de>

Merge branch 'topic/sound-devel-6.10' into for-next


# f4566a1e 25-Mar-2024 Ingo Molnar <mingo@kernel.org>

Merge tag 'v6.9-rc1' into sched/core, to pick up fixes and to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 100c8542 05-Apr-2024 Takashi Iwai <tiwai@suse.de>

Merge tag 'asoc-fix-v6.9-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.9

A relatively large set of fixes here, the biggest piece of it is a

Merge tag 'asoc-fix-v6.9-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v6.9

A relatively large set of fixes here, the biggest piece of it is a
series correcting some problems with the delay reporting for Intel SOF
cards but there's a bunch of other things. Everything here is driver
specific except for a fix in the core for an issue with sign extension
handling volume controls.

show more ...


# 36a1818f 25-Mar-2024 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-fixes into drm-misc-fixes

Backmerging to get drm-misc-fixes to the state of v6.9-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 9187210e 13-Mar-2024 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
"Core & protocols:

- Large effort by Eric to lower rtnl_lo

Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
"Core & protocols:

- Large effort by Eric to lower rtnl_lock pressure and remove locks:

- Make commonly used parts of rtnetlink (address, route dumps
etc) lockless, protected by RCU instead of rtnl_lock.

- Add a netns exit callback which already holds rtnl_lock,
allowing netns exit to take rtnl_lock once in the core instead
of once for each driver / callback.

- Remove locks / serialization in the socket diag interface.

- Remove 6 calls to synchronize_rcu() while holding rtnl_lock.

- Remove the dev_base_lock, depend on RCU where necessary.

- Support busy polling on a per-epoll context basis. Poll length and
budget parameters can be set independently of system defaults.

- Introduce struct net_hotdata, to make sure read-mostly global
config variables fit in as few cache lines as possible.

- Add optional per-nexthop statistics to ease monitoring / debug of
ECMP imbalance problems.

- Support TCP_NOTSENT_LOWAT in MPTCP.

- Ensure that IPv6 temporary addresses' preferred lifetimes are long
enough, compared to other configured lifetimes, and at least 2 sec.

- Support forwarding of ICMP Error messages in IPSec, per RFC 4301.

- Add support for the independent control state machine for bonding
per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
control state machine.

- Add "network ID" to MCTP socket APIs to support hosts with multiple
disjoint MCTP networks.

- Re-use the mono_delivery_time skbuff bit for packets which user
space wants to be sent at a specified time. Maintain the timing
information while traversing veth links, bridge etc.

- Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.

- Simplify many places iterating over netdevs by using an xarray
instead of a hash table walk (hash table remains in place, for use
on fastpaths).

- Speed up scanning for expired routes by keeping a dedicated list.

- Speed up "generic" XDP by trying harder to avoid large allocations.

- Support attaching arbitrary metadata to netconsole messages.

Things we sprinkled into general kernel code:

- Enforce VM_IOREMAP flag and range in ioremap_page_range and
introduce VM_SPARSE kind and vm_area_[un]map_pages (used by
bpf_arena).

- Rework selftest harness to enable the use of the full range of ksft
exit code (pass, fail, skip, xfail, xpass).

Netfilter:

- Allow userspace to define a table that is exclusively owned by a
daemon (via netlink socket aliveness) without auto-removing this
table when the userspace program exits. Such table gets marked as
orphaned and a restarting management daemon can re-attach/regain
ownership.

- Speed up element insertions to nftables' concatenated-ranges set
type. Compact a few related data structures.

BPF:

- Add BPF token support for delegating a subset of BPF subsystem
functionality from privileged system-wide daemons such as systemd
through special mount options for userns-bound BPF fs to a trusted
& unprivileged application.

- Introduce bpf_arena which is sparse shared memory region between
BPF program and user space where structures inside the arena can
have pointers to other areas of the arena, and pointers work
seamlessly for both user-space programs and BPF programs.

- Introduce may_goto instruction that is a contract between the
verifier and the program. The verifier allows the program to loop
assuming it's behaving well, but reserves the right to terminate
it.

- Extend the BPF verifier to enable static subprog calls in spin lock
critical sections.

- Support registration of struct_ops types from modules which helps
projects like fuse-bpf that seeks to implement a new struct_ops
type.

- Add support for retrieval of cookies for perf/kprobe multi links.

- Support arbitrary TCP SYN cookie generation / validation in the TC
layer with BPF to allow creating SYN flood handling in BPF
firewalls.

- Add code generation to inline the bpf_kptr_xchg() helper which
improves performance when stashing/popping the allocated BPF
objects.

Wireless:

- Add SPP (signaling and payload protected) AMSDU support.

- Support wider bandwidth OFDMA, as required for EHT operation.

Driver API:

- Major overhaul of the Energy Efficient Ethernet internals to
support new link modes (2.5GE, 5GE), share more code between
drivers (especially those using phylib), and encourage more
uniform behavior. Convert and clean up drivers.

- Define an API for querying per netdev queue statistics from
drivers.

- IPSec: account in global stats for fully offloaded sessions.

- Create a concept of Ethernet PHY Packages at the Device Tree level,
to allow parameterizing the existing PHY package code.

- Enable Rx hashing (RSS) on GTP protocol fields.

Misc:

- Improvements and refactoring all over networking selftests.

- Create uniform module aliases for TC classifiers, actions, and
packet schedulers to simplify creating modprobe policies.

- Address all missing MODULE_DESCRIPTION() warnings in networking.

- Extend the Netlink descriptions in YAML to cover message
encapsulation or "Netlink polymorphism", where interpretation of
nested attributes depends on link type, classifier type or some
other "class type".

Drivers:

- Ethernet high-speed NICs:
- Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
- Intel (100G, ice, idpf):
- support E825-C devices
- nVidia/Mellanox:
- support devices with one port and multiple PCIe links
- Broadcom (bnxt):
- support n-tuple filters
- support configuring the RSS key
- Wangxun (ngbe/txgbe):
- implement irq_domain for TXGBE's sub-interrupts
- Pensando/AMD:
- support XDP
- optimize queue submission and wakeup handling (+17% bps)
- optimize struct layout, saving 28% of memory on queues

- Ethernet NICs embedded and virtual:
- Google cloud vNIC:
- refactor driver to perform memory allocations for new queue
config before stopping and freeing the old queue memory
- Synopsys (stmmac):
- obey queueMaxSDU and implement counters required by 802.1Qbv
- Renesas (ravb):
- support packet checksum offload
- suspend to RAM and runtime PM support

- Ethernet switches:
- nVidia/Mellanox:
- support for nexthop group statistics
- Microchip:
- ksz8: implement PHY loopback
- add support for KSZ8567, a 7-port 10/100Mbps switch

- PTP:
- New driver for RENESAS FemtoClock3 Wireless clock generator.
- Support OCP PTP cards designed and built by Adva.

- CAN:
- Support recvmsg() flags for own, local and remote traffic on CAN
BCM sockets.
- Support for esd GmbH PCIe/402 CAN device family.
- m_can:
- Rx/Tx submission coalescing
- wake on frame Rx

- WiFi:
- Intel (iwlwifi):
- enable signaling and payload protected A-MSDUs
- support wider-bandwidth OFDMA
- support for new devices
- bump FW API to 89 for AX devices; 90 for BZ/SC devices
- MediaTek (mt76):
- mt7915: newer ADIE version support
- mt7925: radio temperature sensor support
- Qualcomm (ath11k):
- support 6 GHz station power modes: Low Power Indoor (LPI),
Standard Power) SP and Very Low Power (VLP)
- QCA6390 & WCN6855: support 2 concurrent station interfaces
- QCA2066 support
- Qualcomm (ath12k):
- refactoring in preparation for Multi-Link Operation (MLO)
support
- 1024 Block Ack window size support
- firmware-2.bin support
- support having multiple identical PCI devices (firmware needs
to have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
- QCN9274: support split-PHY devices
- WCN7850: enable Power Save Mode in station mode
- WCN7850: P2P support
- RealTek:
- rtw88: support for more rtw8811cu and rtw8821cu devices
- rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
- rtlwifi: speed up USB firmware initialization
- rtwl8xxxu:
- RTL8188F: concurrent interface support
- Channel Switch Announcement (CSA) support in AP mode
- Broadcom (brcmfmac):
- per-vendor feature support
- per-vendor SAE password setup
- DMI nvram filename quirk for ACEPC W5 Pro"

* tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits)
nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y
nexthop: Fix out-of-bounds access during attribute validation
nexthop: Only parse NHA_OP_FLAGS for dump messages that require it
nexthop: Only parse NHA_OP_FLAGS for get messages that require it
bpf: move sleepable flag from bpf_prog_aux to bpf_prog
bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes()
selftests/bpf: Add kprobe multi triggering benchmarks
ptp: Move from simple ida to xarray
vxlan: Remove generic .ndo_get_stats64
vxlan: Do not alloc tstats manually
devlink: Add comments to use netlink gen tool
nfp: flower: handle acti_netdevs allocation failure
net/packet: Add getsockopt support for PACKET_COPY_THRESH
net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID
selftests/bpf: Add bpf_arena_htab test.
selftests/bpf: Add bpf_arena_list test.
selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
bpf: Add helper macro bpf_addr_space_cast()
libbpf: Recognize __arena global variables.
bpftool: Recognize arena map type
...

show more ...


Revision tags: v6.8, v6.8-rc7
# 4b2765ae 03-Mar-2024 Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-02-29

We've added 119 non-merge commit

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-02-29

We've added 119 non-merge commits during the last 32 day(s) which contain
a total of 150 files changed, 3589 insertions(+), 995 deletions(-).

The main changes are:

1) Extend the BPF verifier to enable static subprog calls in spin lock
critical sections, from Kumar Kartikeya Dwivedi.

2) Fix confusing and incorrect inference of PTR_TO_CTX argument type
in BPF global subprogs, from Andrii Nakryiko.

3) Larger batch of riscv BPF JIT improvements and enabling inlining
of the bpf_kptr_xchg() for RV64, from Pu Lehui.

4) Allow skeleton users to change the values of the fields in struct_ops
maps at runtime, from Kui-Feng Lee.

5) Extend the verifier's capabilities of tracking scalars when they
are spilled to stack, especially when the spill or fill is narrowing,
from Maxim Mikityanskiy & Eduard Zingerman.

6) Various BPF selftest improvements to fix errors under gcc BPF backend,
from Jose E. Marchesi.

7) Avoid module loading failure when the module trying to register
a struct_ops has its BTF section stripped, from Geliang Tang.

8) Annotate all kfuncs in .BTF_ids section which eventually allows
for automatic kfunc prototype generation from bpftool, from Daniel Xu.

9) Several updates to the instruction-set.rst IETF standardization
document, from Dave Thaler.

10) Shrink the size of struct bpf_map resp. bpf_array,
from Alexei Starovoitov.

11) Initial small subset of BPF verifier prepwork for sleepable bpf_timer,
from Benjamin Tissoires.

12) Fix bpftool to be more portable to musl libc by using POSIX's
basename(), from Arnaldo Carvalho de Melo.

13) Add libbpf support to gcc in CORE macro definitions,
from Cupertino Miranda.

14) Remove a duplicate type check in perf_event_bpf_event,
from Florian Lehner.

15) Fix bpf_spin_{un,}lock BPF helpers to actually annotate them
with notrace correctly, from Yonghong Song.

16) Replace the deprecated bpf_lpm_trie_key 0-length array with flexible
array to fix build warnings, from Kees Cook.

17) Fix resolve_btfids cross-compilation to non host-native endianness,
from Viktor Malik.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
selftests/bpf: Test if shadow types work correctly.
bpftool: Add an example for struct_ops map and shadow type.
bpftool: Generated shadow variables for struct_ops maps.
libbpf: Convert st_ops->data to shadow type.
libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.
bpf: Replace bpf_lpm_trie_key 0-length array with flexible array
bpf, arm64: use bpf_prog_pack for memory management
arm64: patching: implement text_poke API
bpf, arm64: support exceptions
arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT
bpf: add is_async_callback_calling_insn() helper
bpf: introduce in_sleepable() helper
bpf: allow more maps in sleepable bpf programs
selftests/bpf: Test case for lacking CFI stub functions.
bpf: Check cfi_stubs before registering a struct_ops type.
bpf: Clarify batch lookup/lookup_and_delete semantics
bpf, docs: specify which BPF_ABS and BPF_IND fields were zero
bpf, docs: Fix typos in instruction-set.rst
selftests/bpf: update tcp_custom_syncookie to use scalar packet offset
bpf: Shrink size of struct bpf_map/bpf_array.
...
====================

Link: https://lore.kernel.org/r/20240301001625.8800-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


Revision tags: v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2
# 6668e818 27-Jan-2024 Haiyue Wang <haiyue.wang@intel.com>

bpf,token: Use BIT_ULL() to convert the bit mask

Replace the '(1ULL << *)' with the macro BIT_ULL(nr).

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kern

bpf,token: Use BIT_ULL() to convert the bit mask

Replace the '(1ULL << *)' with the macro BIT_ULL(nr).

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240127134901.3698613-1-haiyue.wang@intel.com

show more ...


# 92046e83 27-Jan-2024 Jakub Kicinski <kuba@kernel.org>

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-01-26

We've added 107 non-merge commit

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-01-26

We've added 107 non-merge commits during the last 4 day(s) which contain
a total of 101 files changed, 6009 insertions(+), 1260 deletions(-).

The main changes are:

1) Add BPF token support to delegate a subset of BPF subsystem
functionality from privileged system-wide daemons such as systemd
through special mount options for userns-bound BPF fs to a trusted
& unprivileged application. With addressed changes from Christian
and Linus' reviews, from Andrii Nakryiko.

2) Support registration of struct_ops types from modules which helps
projects like fuse-bpf that seeks to implement a new struct_ops type,
from Kui-Feng Lee.

3) Add support for retrieval of cookies for perf/kprobe multi links,
from Jiri Olsa.

4) Bigger batch of prep-work for the BPF verifier to eventually support
preserving boundaries and tracking scalars on narrowing fills,
from Maxim Mikityanskiy.

5) Extend the tc BPF flavor to support arbitrary TCP SYN cookies to help
with the scenario of SYN floods, from Kuniyuki Iwashima.

6) Add code generation to inline the bpf_kptr_xchg() helper which
improves performance when stashing/popping the allocated BPF objects,
from Hou Tao.

7) Extend BPF verifier to track aligned ST stores as imprecise spilled
registers, from Yonghong Song.

8) Several fixes to BPF selftests around inline asm constraints and
unsupported VLA code generation, from Jose E. Marchesi.

9) Various updates to the BPF IETF instruction set draft document such
as the introduction of conformance groups for instructions,
from Dave Thaler.

10) Fix BPF verifier to make infinite loop detection in is_state_visited()
exact to catch some too lax spill/fill corner cases,
from Eduard Zingerman.

11) Refactor the BPF verifier pointer ALU check to allow ALU explicitly
instead of implicitly for various register types, from Hao Sun.

12) Fix the flaky tc_redirect_dtime BPF selftest due to slowness
in neighbor advertisement at setup time, from Martin KaFai Lau.

13) Change BPF selftests to skip callback tests for the case when the
JIT is disabled, from Tiezhu Yang.

14) Add a small extension to libbpf which allows to auto create
a map-in-map's inner map, from Andrey Grafin.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (107 commits)
selftests/bpf: Add missing line break in test_verifier
bpf, docs: Clarify definitions of various instructions
bpf: Fix error checks against bpf_get_btf_vmlinux().
bpf: One more maintainer for libbpf and BPF selftests
selftests/bpf: Incorporate LSM policy to token-based tests
selftests/bpf: Add tests for LIBBPF_BPF_TOKEN_PATH envvar
libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar
selftests/bpf: Add tests for BPF object load with implicit token
selftests/bpf: Add BPF object loading tests with explicit token passing
libbpf: Wire up BPF token support at BPF object level
libbpf: Wire up token_fd into feature probing logic
libbpf: Move feature detection code into its own file
libbpf: Further decouple feature checking logic from bpf_object
libbpf: Split feature detectors definitions from cached results
selftests/bpf: Utilize string values for delegate_xxx mount options
bpf: Support symbolic BPF FS delegation mount options
bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS
bpf,selinux: Allocate bpf_security_struct per BPF token
selftests/bpf: Add BPF token-enabled tests
libbpf: Add BPF token support to bpf_prog_load() API
...
====================

Link: https://lore.kernel.org/r/20240126215710.19855-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

show more ...


# c8632acf 24-Jan-2024 Andrii Nakryiko <andrii@kernel.org>

Merge branch 'bpf-token'

Andrii Nakryiko says:

====================
BPF token

This patch set is a combination of three BPF token-related patch sets ([0],
[1], [2]) with fixes ([3]) to kernel-side

Merge branch 'bpf-token'

Andrii Nakryiko says:

====================
BPF token

This patch set is a combination of three BPF token-related patch sets ([0],
[1], [2]) with fixes ([3]) to kernel-side token_fd passing APIs incorporated
into relevant patches, bpf_token_capable() changes requested by
Christian Brauner, and necessary libbpf and BPF selftests side adjustments.

This patch set introduces an ability to delegate a subset of BPF subsystem
functionality from privileged system-wide daemon (e.g., systemd or any other
container manager) through special mount options for userns-bound BPF FS to
a *trusted* unprivileged application. Trust is the key here. This
functionality is not about allowing unconditional unprivileged BPF usage.
Establishing trust, though, is completely up to the discretion of respective
privileged application that would create and mount a BPF FS instance with
delegation enabled, as different production setups can and do achieve it
through a combination of different means (signing, LSM, code reviews, etc),
and it's undesirable and infeasible for kernel to enforce any particular way
of validating trustworthiness of particular process.

The main motivation for this work is a desire to enable containerized BPF
applications to be used together with user namespaces. This is currently
impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced
or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF
helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read
arbitrary memory, and it's impossible to ensure that they only read memory of
processes belonging to any given namespace. This means that it's impossible to
have a mechanically verifiable namespace-aware CAP_BPF capability, and as such
another mechanism to allow safe usage of BPF functionality is necessary.

BPF FS delegation mount options and BPF token derived from such BPF FS instance
is such a mechanism. Kernel makes no assumption about what "trusted"
constitutes in any particular case, and it's up to specific privileged
applications and their surrounding infrastructure to decide that. What kernel
provides is a set of APIs to setup and mount special BPF FS instance and
derive BPF tokens from it. BPF FS and BPF token are both bound to its owning
userns and in such a way are constrained inside intended container. Users can
then pass BPF token FD to privileged bpf() syscall commands, like BPF map
creation and BPF program loading, to perform such operations without having
init userns privileges.

This version incorporates feedback and suggestions ([4]) received on earlier
iterations of BPF token approach, and instead of allowing to create BPF tokens
directly assuming capable(CAP_SYS_ADMIN), we instead enhance BPF FS to accept
a few new delegation mount options. If these options are used and BPF FS itself
is properly created, set up, and mounted inside the user namespaced container,
user application is able to derive a BPF token object from BPF FS instance, and
pass that token to bpf() syscall. As explained in patch #3, BPF token itself
doesn't grant access to BPF functionality, but instead allows kernel to do
namespaced capabilities checks (ns_capable() vs capable()) for CAP_BPF,
CAP_PERFMON, CAP_NET_ADMIN, and CAP_SYS_ADMIN, as applicable. So it forms one
half of a puzzle and allows container managers and sys admins to have safe and
flexible configuration options: determining which containers get delegation of
BPF functionality through BPF FS, and then which applications within such
containers are allowed to perform bpf() commands, based on namespaces
capabilities.

Previous attempt at addressing this very same problem ([5]) attempted to
utilize authoritative LSM approach, but was conclusively rejected by upstream
LSM maintainers. BPF token concept is not changing anything about LSM
approach, but can be combined with LSM hooks for very fine-grained security
policy. Some ideas about making BPF token more convenient to use with LSM (in
particular custom BPF LSM programs) was briefly described in recent LSF/MM/BPF
2023 presentation ([6]). E.g., an ability to specify user-provided data
(context), which in combination with BPF LSM would allow implementing a very
dynamic and fine-granular custom security policies on top of BPF token. In the
interest of minimizing API surface area and discussions this was relegated to
follow up patches, as it's not essential to the fundamental concept of
delegatable BPF token.

It should be noted that BPF token is conceptually quite similar to the idea of
/dev/bpf device file, proposed by Song a while ago ([7]). The biggest
difference is the idea of using virtual anon_inode file to hold BPF token and
allowing multiple independent instances of them, each (potentially) with its
own set of restrictions. And also, crucially, BPF token approach is not using
any special stateful task-scoped flags. Instead, bpf() syscall accepts
token_fd parameters explicitly for each relevant BPF command. This addresses
main concerns brought up during the /dev/bpf discussion, and fits better with
overall BPF subsystem design.

Second part of this patch set adds full support for BPF token in libbpf's BPF
object high-level API. Good chunk of the changes rework libbpf feature
detection internals, which are the most affected by BPF token presence.

Besides internal refactorings, libbpf allows to pass location of BPF FS from
which BPF token should be created by libbpf. This can be done explicitly though
a new bpf_object_open_opts.bpf_token_path field. But we also add implicit BPF
token creation logic to BPF object load step, even without any explicit
involvement of the user. If the environment is setup properly, BPF token will
be created transparently and used implicitly. This allows for all existing
application to gain BPF token support by just linking with latest version of
libbpf library. No source code modifications are required. All that under
assumption that privileged container management agent properly set up default
BPF FS instance at /sys/bpf/fs to allow BPF token creation.

libbpf adds support to override default BPF FS location for BPF token creation
through LIBBPF_BPF_TOKEN_PATH envvar knowledge. This allows admins or container
managers to mount BPF token-enabled BPF FS at non-standard location without the
need to coordinate with applications. LIBBPF_BPF_TOKEN_PATH can also be used
to disable BPF token implicit creation by setting it to an empty value.

[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=805707&state=*
[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=810260&state=*
[2] https://patchwork.kernel.org/project/netdevbpf/list/?series=809800&state=*
[3] https://patchwork.kernel.org/project/netdevbpf/patch/20231219053150.336991-1-andrii@kernel.org/
[4] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/
[5] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/
[6] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf
[7] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/

v1->v2:
- disable BPF token creation in init userns, and simplify
bpf_token_capable() logic (Christian);
- use kzalloc/kfree instead of kvzalloc/kvfree (Linus);
- few more selftest cases to validate LSM and BPF token interations.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
====================

Link: https://lore.kernel.org/r/20240124022127.2379740-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


# aeaa97b0 24-Jan-2024 Andrii Nakryiko <andrii@kernel.org>

bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS

It's quite confusing in practice when it's possible to successfully
create a BPF token from BPF FS that didn't have any of delega

bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS

It's quite confusing in practice when it's possible to successfully
create a BPF token from BPF FS that didn't have any of delegate_xxx
mount options set up. While it's not wrong, it's actually more
meaningful to reject BPF_TOKEN_CREATE with specific error code (-ENOENT)
to let user-space know that no token delegation is setup up.

So, instead of creating empty BPF token that will be always ignored
because it doesn't have any of the allow_xxx bits set, reject it with
-ENOENT. If we ever need empty BPF token to be possible, we can support
that with extra flag passed into BPF_TOKEN_CREATE.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-19-andrii@kernel.org

show more ...


# f568a3d4 24-Jan-2024 Andrii Nakryiko <andrii@kernel.org>

bpf,lsm: Add BPF token LSM hooks

Wire up bpf_token_create and bpf_token_free LSM hooks, which allow to
allocate LSM security blob (we add `void *security` field to struct
bpf_token for that), but al

bpf,lsm: Add BPF token LSM hooks

Wire up bpf_token_create and bpf_token_free LSM hooks, which allow to
allocate LSM security blob (we add `void *security` field to struct
bpf_token for that), but also control who can instantiate BPF token.
This follows existing pattern for BPF map and BPF prog.

Also add security_bpf_token_allow_cmd() and security_bpf_token_capable()
LSM hooks that allow LSM implementation to control and negate (if
necessary) BPF token's delegation of a specific bpf_cmd and capability,
respectively.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-12-andrii@kernel.org

show more ...


# caf8f28e 24-Jan-2024 Andrii Nakryiko <andrii@kernel.org>

bpf: Add BPF token support to BPF_PROG_LOAD command

Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag
should be set in prog_flags field when providing prog_token_fd.

Wire through

bpf: Add BPF token support to BPF_PROG_LOAD command

Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag
should be set in prog_flags field when providing prog_token_fd.

Wire through a set of allowed BPF program types and attach types,
derived from BPF FS at BPF token creation time. Then make sure we
perform bpf_token_capable() checks everywhere where it's relevant.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-7-andrii@kernel.org

show more ...


# a177fc2b 24-Jan-2024 Andrii Nakryiko <andrii@kernel.org>

bpf: Add BPF token support to BPF_MAP_CREATE command

Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.
N

bpf: Add BPF token support to BPF_MAP_CREATE command

Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.
New BPF_F_TOKEN_FD flag is added to specify together with BPF token FD
for BPF_MAP_CREATE command.

Wire through a set of allowed BPF map types to BPF token, derived from
BPF FS at BPF token creation time. This, in combination with allowed_cmds
allows to create a narrowly-focused BPF token (controlled by privileged
agent) with a restrictive set of BPF maps that application can attempt
to create.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-5-andrii@kernel.org

show more ...


# 35f96de0 24-Jan-2024 Andrii Nakryiko <andrii@kernel.org>

bpf: Introduce BPF token object

Add new kind of BPF kernel object, BPF token. BPF token is meant to
allow delegating privileged BPF functionality, like loading a BPF
program or creating a BPF map, f

bpf: Introduce BPF token object

Add new kind of BPF kernel object, BPF token. BPF token is meant to
allow delegating privileged BPF functionality, like loading a BPF
program or creating a BPF map, from privileged process to a *trusted*
unprivileged process, all while having a good amount of control over which
privileged operations could be performed using provided BPF token.

This is achieved through mounting BPF FS instance with extra delegation
mount options, which determine what operations are delegatable, and also
constraining it to the owning user namespace (as mentioned in the
previous patch).

BPF token itself is just a derivative from BPF FS and can be created
through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF
FS FD, which can be attained through open() API by opening BPF FS mount
point. Currently, BPF token "inherits" delegated command, map types,
prog type, and attach type bit sets from BPF FS as is. In the future,
having an BPF token as a separate object with its own FD, we can allow
to further restrict BPF token's allowable set of things either at the
creation time or after the fact, allowing the process to guard itself
further from unintentionally trying to load undesired kind of BPF
programs. But for now we keep things simple and just copy bit sets as is.

When BPF token is created from BPF FS mount, we take reference to the
BPF super block's owning user namespace, and then use that namespace for
checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN}
capabilities that are normally only checked against init userns (using
capable()), but now we check them using ns_capable() instead (if BPF
token is provided). See bpf_token_capable() for details.

Such setup means that BPF token in itself is not sufficient to grant BPF
functionality. User namespaced process has to *also* have necessary
combination of capabilities inside that user namespace. So while
previously CAP_BPF was useless when granted within user namespace, now
it gains a meaning and allows container managers and sys admins to have
a flexible control over which processes can and need to use BPF
functionality within the user namespace (i.e., container in practice).
And BPF FS delegation mount options and derived BPF tokens serve as
a per-container "flag" to grant overall ability to use bpf() (plus further
restrict on which parts of bpf() syscalls are treated as namespaced).

Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF)
within the BPF FS owning user namespace, rounding up the ns_capable()
story of BPF token. Also creating BPF token in init user namespace is
currently not supported, given BPF token doesn't have any effect in init
user namespace anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-4-andrii@kernel.org

show more ...