| e815df29 | 23-Apr-2026 |
Hengqi Chen <hengqi.chen@gmail.com> |
LoongArch: BPF: Add fsession support for trampolines
Implement BPF_TRACE_FSESSION support in the LoongArch BPF JIT. The logic here is almost identical to what has been done in the RISC-V JIT.
The key changes are:
- Allocate stack space for function meta and session cookies
- Introduce invoke_bpf() as a wrapper around invoke_bpf_prog() that populates session cookies before each invocation
- Implement the bpf_jit_supports_fsession() callback
Tested-by: Vincent Li <vincent.mc.li@gmail.com> Reviewed-by: Menglong Dong <menglong8.dong@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 6ef04707 | 23-Apr-2026 |
Hengqi Chen <hengqi.chen@gmail.com> |
LoongArch: BPF: Introduce emit_store_stack_imm64() helper
Introduce a helper to store a 64-bit immediate on the trampoline stack. The helper will be used in the next patch. Also refactor the existing code to use this helper.
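A plain-C model of what such a helper does may help make the intent concrete. This is an illustrative sketch only: the names, the buffer-based signature, and the byte-level behavior are assumptions for demonstration; the real helper emits LoongArch instructions (a move-immediate into a scratch register followed by st.d) rather than writing memory directly.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative model: place a 64-bit immediate at a given offset within a
 * stack-frame buffer, in native byte order. The actual JIT helper emits
 * instructions that perform this store at trampoline run time. */
static void store_stack_imm64(uint8_t *stack, int offset, uint64_t imm)
{
    memcpy(stack + offset, &imm, sizeof(imm));
}
```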
Tested-by: Vincent Li <vincent.mc.li@gmail.com> Reviewed-by: Menglong Dong <menglong8.dong@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| c9ebe201 | 22-Apr-2026 |
Tiezhu Yang <yangtiezhu@loongson.cn> |
LoongArch: BPF: Support up to 12 function arguments for trampoline
Currently, the LoongArch bpf trampoline supports up to 8 function arguments. According to the statistics from commit 473e3150e30a ("bpf, x86: allow function arguments up to 12 for TRACING"), there are over 200 functions that accept 9 to 12 arguments, so add 12-argument support to the trampoline.
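The argument placement this implies can be modeled in a few lines of C. This is a simplifying sketch, not the JIT's actual layout: under the LoongArch calling convention the first 8 arguments travel in registers a0-a7, and any further arguments are spilled to the caller's stack, which the extended trampoline must also save and restore. The slot-offset arithmetic below is an assumption for illustration.

```c
/* Hypothetical model of where 0-based argument i of a traced function
 * lives: registers a0-a7 for the first 8, then consecutive 8-byte
 * stack slots for arguments 9..12. */
static int arg_in_register(int i)
{
    return i < 8;
}

static int arg_stack_slot_offset(int i)
{
    return (i - 8) * 8; /* byte offset of the spilled argument */
}
```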
With this patch, the following related testcases passed:
sudo ./test_progs -a tracing_struct/struct_many_args
sudo ./test_progs -a fentry_test/fentry_many_args
sudo ./test_progs -a fexit_test/fexit_many_args
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 0ef8b960 | 22-Apr-2026 |
Tiezhu Yang <yangtiezhu@loongson.cn> |
LoongArch: BPF: Support small struct arguments for trampoline
In the current BPF code, the struct argument size is at most 16 bytes, enforced by the verifier. According to the Procedure Call Standard for LoongArch, struct arguments of up to 16 bytes are passed in the 8 argument registers; that is to say, a struct argument may be passed in a pair of registers if its size is more than 8 bytes and no more than 16 bytes.
Extend the BPF trampoline JIT to support attachment to functions that take small structures (up to 16 bytes) as arguments, saving and restoring a number of "argument registers" rather than a number of arguments.
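The register-pair rule described above can be modeled in plain C. This is a semantics sketch under stated assumptions (a little-endian machine, as LoongArch is, and an illustrative `struct pair` standing in for two argument registers), not the trampoline's actual save/restore code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two consecutive "argument registers", modeled as a 16-byte struct. */
struct pair {
    uint64_t lo;
    uint64_t hi;
};

/* Illustrative model of the LoongArch PCS rule: a struct argument of
 * more than 8 and at most 16 bytes is spread across two argument
 * registers, low bytes first. */
static struct pair split_into_regs(const void *arg, size_t size)
{
    struct pair regs = {0, 0};
    memcpy(&regs, arg, size > 16 ? 16 : size);
    return regs;
}
```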
With this patch, the following related testcases passed:
sudo ./test_progs -a tracing_struct/struct_args
sudo ./test_progs -a tracing_struct/union_args
Link: https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#structures Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 4653682c | 22-Apr-2026 |
Tiezhu Yang <yangtiezhu@loongson.cn> |
LoongArch: BPF: Open code and remove invoke_bpf_mod_ret()
invoke_bpf_mod_ret() is a small wrapper over invoke_bpf_prog(): it checks the return value of invoke_bpf_prog() and returns immediately if invoke_bpf_prog() failed. Open-code and remove it, since it is called only once.
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| ee823fe7 | 22-Apr-2026 |
Tiezhu Yang <yangtiezhu@loongson.cn> |
LoongArch: BPF: Support load-acquire and store-release instructions
Use the LoongArch common memory access instructions with the barrier 'dbar' to support the BPF load-acquire and store-release instructions.
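The semantics the JIT must provide can be expressed with standard C atomics. This is a model only: the commit implements the same ordering with plain LoongArch loads/stores plus 'dbar' barriers, and the exact dbar hint values are an implementation detail of the JIT, not shown here.

```c
#include <stdint.h>

/* Load-acquire: roughly ld.d followed by a dbar barrier on LoongArch. */
static uint64_t load_acquire64(const uint64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

/* Store-release: roughly a dbar barrier followed by st.d on LoongArch. */
static void store_release64(uint64_t *p, uint64_t v)
{
    __atomic_store_n(p, v, __ATOMIC_RELEASE);
}
```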
With this patch, the following testcases passed on LoongArch if the macro CAN_USE_LOAD_ACQ_STORE_REL is usable in bpf selftests:
sudo ./test_progs -t verifier_load_acquire
sudo ./test_progs -t verifier_store_release
sudo ./test_progs -t verifier_precision/bpf_load_acquire
sudo ./test_progs -t verifier_precision/bpf_store_release
sudo ./test_progs -t compute_live_registers/atomic_load_acq_store_rel
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| fc935c19 | 22-Apr-2026 |
Tiezhu Yang <yangtiezhu@loongson.cn> |
LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions
The 8 and 16 bit read-modify-write instructions {amadd/amswap}.{b/h} were newly added in the latest LoongArch Reference Manual. Use them, where possible, to avoid unknown-opcode errors.
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| d9ef13f7 | 16-Apr-2026 |
Xu Kuohai <xukuohai@huawei.com> |
bpf: Pass bpf_verifier_env to JIT
Pass bpf_verifier_env to bpf_int_jit_compile(). The follow-up patch will use env->insn_aux_data in the JIT stage to detect indirect jump targets.
Since bpf_prog_select_runtime() can be called by cbpf and lib/test_bpf.c code without verifier, introduce helper __bpf_prog_select_runtime() to accept the env parameter.
Remove the call to bpf_prog_select_runtime() in bpf_prog_load(), and switch to call __bpf_prog_select_runtime() in the verifier, with env variable passed. The original bpf_prog_select_runtime() is preserved for cbpf and lib/test_bpf.c, where env is NULL.
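The wrapper split described above can be sketched in self-contained C. The types and bodies here are stand-ins, not the kernel's actual definitions: the point is only that the env-aware entry point receives the verifier state, while the legacy entry point keeps the old signature and passes env = NULL for the cbpf and lib/test_bpf.c callers.

```c
#include <stddef.h>

/* Stand-in types for illustration only. */
struct bpf_verifier_env { int dummy; };
struct bpf_prog { const struct bpf_verifier_env *seen_env; };

/* New env-aware entry point; real code would JIT-compile using env here. */
static struct bpf_prog *__bpf_prog_select_runtime(struct bpf_prog *fp, int *err,
                                                  struct bpf_verifier_env *env)
{
    fp->seen_env = env;
    *err = 0;
    return fp;
}

/* Preserved legacy entry point: callers without a verifier get env = NULL. */
static struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
{
    return __bpf_prog_select_runtime(fp, err, NULL);
}
```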
Now all constant blinding calls are moved into the verifier, except for the cbpf and lib/test_bpf.c cases. The instruction arrays are adjusted by the bpf_patch_insn_data() function for normal cases, so there is no need to call adjust_insn_arrays() in bpf_jit_blind_constants(). Remove it.
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8 Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12 Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # v14 Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260416064341.151802-3-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| b254c629 | 16-Mar-2026 |
Tiezhu Yang <yangtiezhu@loongson.cn> |
LoongArch: BPF: Make arch_protect_bpf_trampoline() return 0
Occasionally, "text_copy_cb: operation failed" errors occur when executing the bpf selftests. The reason is that copy_to_kernel_nofault() failed and the ecode of the ESTAT register is 0x4 (PME: Page Modification Exception) because the PTE is not writable. The root cause is that the PTE is set read-only in another place, namely the generic weak version of arch_protect_bpf_trampoline().
There are two ways to fix this race condition: the direct way is to modify the generic weak arch_protect_bpf_trampoline() to add a mutex lock around set_memory_rox(), but the simpler and more proper way is to just make arch_protect_bpf_trampoline() return 0 in the arch-specific code, because LoongArch already uses the BPF prog pack allocator for trampolines.
Here are the trimmed kernel log messages:
copy_to_kernel_nofault: memory access failed, ecode 0x4
copy_to_kernel_nofault: the caller is text_copy_cb+0x50/0xa0
text_copy_cb: operation failed
------------[ cut here ]------------
bpf_prog_pack bug: missing bpf_arch_text_invalidate?
WARNING: kernel/bpf/core.c:1008 at bpf_prog_pack_free+0x200/0x228
...
Call Trace:
[<9000000000248914>] show_stack+0x64/0x188
[<9000000000241308>] dump_stack_lvl+0x6c/0x9c
[<90000000002705bc>] __warn+0x9c/0x200
[<9000000001c428c0>] __report_bug+0xa8/0x1c0
[<9000000001c42b5c>] report_bug+0x64/0x120
[<9000000001c7dcd0>] do_bp+0x270/0x3c0
[<9000000000246f40>] handle_bp+0x120/0x1c0
[<900000000047b030>] bpf_prog_pack_free+0x200/0x228
[<900000000047b2ec>] bpf_jit_binary_pack_free+0x24/0x60
[<900000000026989c>] bpf_jit_free+0x54/0xb0
[<900000000029e10c>] process_one_work+0x184/0x610
[<900000000029ef8c>] worker_thread+0x24c/0x388
[<90000000002a902c>] kthread+0x13c/0x170
[<9000000001c7dfe8>] ret_from_kernel_thread+0x28/0x1c0
[<9000000000246624>] ret_from_kernel_thread_asm+0xc/0x88
---[ end trace 0000000000000000 ]---
Here is a simple shell script to reproduce:
#!/bin/bash
for ((i=1; i<=1000; i++))
do
	echo "Under testing $i ..."
	dmesg -c > /dev/null
	./test_progs -t fentry_attach_stress > /dev/null
	dmesg -t | grep "text_copy_cb: operation failed"
	if [ $? -eq 0 ]; then
		break
	fi
done
Cc: stable@vger.kernel.org Fixes: 4ab17e762b34 ("LoongArch: BPF: Use BPF prog pack allocator") Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 4fdb5dd8 | 10-Feb-2026 |
Hengqi Chen <hengqi.chen@gmail.com> |
LoongArch: BPF: Implement bpf_addr_space_cast instruction
LLVM generates bpf_addr_space_cast instruction while translating pointers between native (zero) address space and __attribute__((address_space(N))). The addr_space=0 is reserved as bpf_arena address space.
rY = addr_space_cast(rX, 0, 1) is processed by the verifier and converted to normal 32-bit move: wX = wY
rY = addr_space_cast(rX, 1, 0) has to be converted by JIT.
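The conversion the JIT must perform can be modeled in C, following the pattern used by other arch JITs for this instruction: keep the low 32 bits of the arena pointer and, if they are non-zero, merge in the 64-bit base of the arena mapping; a NULL arena pointer stays NULL. This is a semantics sketch only (the parameter name `vm_base` is illustrative), not the emitted LoongArch code.

```c
#include <stdint.h>

/* Model of rY = addr_space_cast(rX, 1, 0): low 32 bits of rX are the
 * arena offset; non-zero offsets are rebased onto the 64-bit arena
 * mapping base, while a zero (NULL) pointer is preserved as zero. */
static uint64_t arena_cast(uint64_t rx, uint64_t vm_base)
{
    uint32_t lo = (uint32_t)rx;
    return lo ? (vm_base | lo) : 0;
}
```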
With this, the following test cases passed:
$ ./test_progs -a arena_htab,arena_list,arena_strsearch,verifier_arena,verifier_arena_large
#4/1 arena_htab/arena_htab_llvm:OK
#4/2 arena_htab/arena_htab_asm:OK
#4 arena_htab:OK
#5/1 arena_list/arena_list_1:OK
#5/2 arena_list/arena_list_1000:OK
#5 arena_list:OK
#7/1 arena_strsearch/arena_strsearch:OK
#7 arena_strsearch:OK
#507/1 verifier_arena/basic_alloc1:OK
#507/2 verifier_arena/basic_alloc2:OK
#507/3 verifier_arena/basic_alloc3:OK
#507/4 verifier_arena/basic_reserve1:OK
#507/5 verifier_arena/basic_reserve2:OK
#507/6 verifier_arena/reserve_twice:OK
#507/7 verifier_arena/reserve_invalid_region:OK
#507/8 verifier_arena/iter_maps1:OK
#507/9 verifier_arena/iter_maps2:OK
#507/10 verifier_arena/iter_maps3:OK
#507 verifier_arena:OK
#508/1 verifier_arena_large/big_alloc1:OK
#508/2 verifier_arena_large/access_reserved:OK
#508/3 verifier_arena_large/request_partially_reserved:OK
#508/4 verifier_arena_large/free_reserved:OK
#508/5 verifier_arena_large/big_alloc2:OK
#508 verifier_arena_large:OK
Summary: 5/20 PASSED, 0 SKIPPED, 0 FAILED
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| ef54c517 | 10-Feb-2026 |
Hengqi Chen <hengqi.chen@gmail.com> |
LoongArch: BPF: Implement PROBE_MEM32 pseudo instructions
Add support for `{LDX,STX,ST} | PROBE_MEM32 | {B,H,W,DW}` instructions. They are similar to PROBE_MEM instructions with the following differences:
* PROBE_MEM32 supports store.
* PROBE_MEM32 relies on the verifier to clear the upper 32 bits of the src/dst register.
* PROBE_MEM32 adds the 64-bit kern_vm_start address (which is stored in S6 in the prologue). Due to bpf_arena construction, such S6 + reg + off16 accesses are guaranteed to be within the arena virtual range, so no address check is needed at run time.
* S6 is a free callee-saved register, so it is used to store arena_vm_start.
* PROBE_MEM32 allows ST and STX. If they fault, the store is a nop. When LDX faults, the destination register is zeroed.
To support these on LoongArch, we employ the t2/t3 registers to store the intermediate results of reg_arena + src/dst reg and use the t2/t3 registers as the new src/dst reg. This allows us to reuse most of the existing code.
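The address computation described above can be written out as a small C model. This is a sketch of the semantics, not JIT code: the scratch register (t2 or t3 on LoongArch) receives the arena base plus the 32-bit offset held in the original src/dst register, and then serves as the new base register for the memory access with its 16-bit displacement.

```c
#include <stdint.h>

/* Model: t2/t3 = kern_vm_start + (u32)reg, then access at t2/t3 + off16.
 * The u32 cast mirrors the verifier's guarantee that the upper 32 bits
 * of the src/dst register are cleared for PROBE_MEM32. */
static uint64_t probe_mem32_addr(uint64_t kern_vm_start, uint64_t reg,
                                 int16_t off)
{
    uint64_t scratch = kern_vm_start + (uint32_t)reg;
    return scratch + off;
}
```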
Acked-by: Tiezhu Yang <yangtiezhu@loongson.cn> Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 73721d86 | 31-Dec-2025 |
Chenghao Duan <duanchenghao@kylinos.cn> |
LoongArch: BPF: Enhance the bpf_arch_text_poke() function
Enhance the bpf_arch_text_poke() function to enable accurate location of BPF program entry points.
When modifying the entry point of a BPF program, skip the "move t0, ra" instruction to ensure the jump address is computed and copied correctly.
Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 26138762 | 31-Dec-2025 |
Chenghao Duan <duanchenghao@kylinos.cn> |
LoongArch: BPF: Enable trampoline-based tracing for module functions
Remove the previous restrictions that blocked the tracing of kernel module functions. Fix the issue that previously caused kernel lockups when attempting to trace module functions.
Before entering the trampoline code, the return address register ra holds the address of the next assembly instruction after the 'bl trampoline' instruction, which is the traced function address, and the register t0 holds the parent function's return address. Refine the trampoline return logic to ensure that register data remains correct when returning to both the traced function and the parent function.
Before this patch was applied, the module_attach test in selftests/bpf encountered a deadlock issue. This was caused by an incorrect jump address after the trampoline execution, which resulted in an infinite loop within the module function.
Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| 61319d15 | 31-Dec-2025 |
Chenghao Duan <duanchenghao@kylinos.cn> |
LoongArch: BPF: Adjust the jump offset of tail calls
Call the next bpf prog and skip the first instruction of TCC initialization.
A total of 7 instructions are skipped:
'move t0, ra'            1 inst
'move_imm + jirl'        5 inst
'addid REG_TCC, zero, 0' 1 inst
Relevant test cases: the tailcalls test item in selftests/bpf.
Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| d314e1f4 | 31-Dec-2025 |
Chenghao Duan <duanchenghao@kylinos.cn> |
LoongArch: BPF: Save return address register ra to t0 before trampoline
Modify the build_prologue() function to ensure the return address register ra is saved to t0 before entering trampoline operations. This ensures accurate return-address handling when a BPF program calls another BPF program, preventing errors in the BPF-to-BPF call chain.
Cc: stable@vger.kernel.org Fixes: 677e6123e3d2 ("LoongArch: BPF: Disable trampoline for kernel module function trace") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
| eb71f5c4 | 31-Dec-2025 |
Hengqi Chen <hengqi.chen@gmail.com> |
LoongArch: BPF: Zero-extend bpf_tail_call() index
The bpf_tail_call() index should be treated as a u32 value. Let's zero-extend it to avoid calling the wrong BPF progs. See similar fixes for x86 ([1]) and arm64 ([2]) for more details.
[1]: https://github.com/torvalds/linux/commit/90caccdd8cc0215705f18b92771b449b01e2474a [2]: https://github.com/torvalds/linux/commit/16338a9b3ac30740d49f5dfed81bac0ffa53b9c7
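The bug class being fixed can be modeled in a few lines of C (an illustrative model, not the JIT's bounds-check code): when the tail-call index register carries garbage in its upper 32 bits, an explicit zero-extension before the bounds comparison is what keeps a large or sign-extended value from selecting the wrong program.

```c
#include <stdint.h>

/* Model of the tail-call bounds check: the index is a u32, so the
 * upper half of the 64-bit register must be discarded before comparing
 * against the prog array's max_entries. */
static int tail_call_in_bounds(uint64_t index_reg, uint32_t max_entries)
{
    uint32_t index = (uint32_t)index_reg; /* explicit zero-extension */
    return index < max_entries;
}
```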
Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>