| #
05d2a3da |
| 23-Jun-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v7.2-1-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf inject --aslr' to remap ASLR-randomized addresses in perf.data files, enabling reproducible analysis across runs with different address space layouts
- Refactor evsel out of sample processing paths: store evsel in struct perf_sample and remove the redundant evsel parameter from tool APIs, tracepoint handlers, hist entry iterators, and db-export, simplifying the entire tool callback chain
- Switch architecture detection from string-based perf_env__arch() comparisons to the numeric ELF e_machine field across the codebase (capstone, print_insn, c2c, lock-contention, sort, sample-raw, machine, header), making cross-analysis more robust
- Overhaul ARM CoreSight ETM tests: add deterministic and named_threads workloads, speed up basic and disassembly tests, add process attribution and concurrent threads tests, remove unused workloads and duplicate tests, queue context packets for the frontend decoder
- Add ARM SPE IMPDEF event decoding for Arm Neoverse N1, store MIDR in arm_spe_pkt for per-CPU event mapping, handle missing CPU IDs gracefully
- Refactor libunwind support: remove the libunwind-local backend, make register reading cross-platform, add RISC-V libunwind support, allow dynamic selection between libdw and libunwind unwinding at runtime
- Extensive hardening of perf.data parsing against crafted files: add bounds checks and byte-swap validation for session records, feature sections, header attributes, BPF metadata, auxtrace errors, compressed events, CPU maps, build ID notes, and ELF program headers. Add minimum event size validation and file offset diagnostics
- Fix libdw API contract violations across dwarf-aux, libdw, probe-finder, annotate-data, and debuginfo subsystems. Fix callchain parent update in ORDER_CALLER mode, support DWARF line 0 in inline lists, handle multiple address spaces in callchains
- Fix numerous 'perf sched' bugs: thread reference leaks, memory leaks, heap overflows with cross-machine recordings, NULL dereferences, replace BUG_ON assertions with graceful error handling, bounds-check CPU indices, fix SIGCHLD vs pause() races in sched stats
- Overhaul the build system: move BPF skeleton generation out of Makefile.perf into bpf_skel.mak, decouple pmu-events from the prepare target, make beauty generated C code standalone .o files, compile BPF skeletons with -mcpu=v3, fix continuous rebuilds, various cleanups
- Add 'perf test' JUnit XML reporting with -j/--junit option, split monolithic test suites into sub-tests, add summary reporting, refactor parallel poll loop, fix test failures on musl-based systems
- Fix 'perf c2c' memory leaks in hist entry and format list handling, use-after-free in error paths, bounds-check CPU and node IDs
- Fix 'perf bpf' metadata leaks on duplicate insert and alloc failure, bounds-check array offsets, validate event sizes and func_info fields, add NULL checks
- Fix hwmon PMU: off-by-one null termination on sysfs reads, strlcpy buffer overflow in parse_hwmon_filename(), fd 0 check, empty label reads, scnprintf usage
- Fix symbols subsystem: bounds-check ELF and sysfs build ID note iteration, validate p_filesz, fix 32-bit ELF bswap error, fix signed overflow in size checks, bounds-check .gnu_debuglink section
- Fix tools lib api: null termination in filename__read_int/ull(), uninitialized stack data in filename__write_int(), snprintf truncation in mount_overload()
- Replace libbabeltrace with babeltrace2-ctf-writer for CTF conversion in 'perf data'
- Add RISC-V SDT argument parsing for static tracepoints
- Add 'perf trace --show-cpu' option to display CPU id
- Add 'perf bench sched pipe --write-size' option
- Add a perf-specific .clang-format that overrides some kernel style behaviors
- Update Intel vendor events for Alder Lake, Arrow Lake, Clearwater Forest, Emerald Rapids, Granite Rapids, Grand Ridge, Lunar Lake, Meteor Lake, Panther Lake, Sapphire Rapids, Sierra Forest
- Add IOMMU metrics for AMD and Intel
- Fix AMD event: switch l2_itlb_misses to bp_l1_tlb_miss_l2_tlb_miss.all
- Add AMD IBS improvements: decode Streaming-store and Remote-Socket flags, suppress bogus fields on Zen4+, skip privilege test on Zen6+
- Fix 'perf lock contention' SIGCHLD vs pause() race, allow 'mmap_lock' in -L filter, enable end-timestamp for cgroup aggregation, fix non-atomic data updates
- Fix 'perf stat' false NMI watchdog warning in aggregation modes, bounds-check CPU index in topology callbacks, add aggr_nr metric parser support for uncore scaling
- Fix 'perf timechart' memory leaks, CPU bounds checking, use-after-free on corrupted callchains
- Fix 'perf inject' itrace branch stack synthesis, fix synthesized sample size with branch stacks
- Fix DSO heap overflow on decompressed paths, uninitialized pathname on fallback, set proper error codes
- Fix various snprintf/scnprintf usages to prevent buffer overflows and truncation across the codebase
- Fix off-by-one stack buffer overflow in kallsyms__parse()
- Fix 'perf kwork' memory management, address sanitizer issues, bounds-check work->cpu
- Fix 'perf tpebs' concurrent stop races and PID reuse hazards
- Add O_CLOEXEC to open() calls and use mkostemp() for temporary files to prevent file descriptor leaks to child processes
- Fix s390 Python extension TEXTREL by compiling as PIC
- Fix build with ASAN for jitdump
- Fix build failure due to btf_vlen() return type change
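The parsing-hardening item above (bounds checks, byte-swap validation, and minimum event size validation) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `struct rec_header`, `bswap16()`, and `rec_size_checked()` are hypothetical and are not taken from the perf sources; only the general type/misc/size header layout is borrowed as a model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record header (illustrative, not perf's own type). */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

/* Swap the two bytes of a 16-bit value. */
static uint16_t bswap16(uint16_t v)
{
	return (uint16_t)((v >> 8) | (v << 8));
}

/*
 * Validate the record that starts at offset 'off' in a buffer of 'len'
 * bytes. 'swapped' says the file was written with the opposite byte order.
 * Returns the record size on success, or 0 if the record is malformed.
 */
size_t rec_size_checked(const unsigned char *buf, size_t len, size_t off,
			bool swapped)
{
	struct rec_header hdr;
	uint16_t size;

	/* The header itself must fit in what is left of the buffer. */
	if (off > len || len - off < sizeof(hdr))
		return 0;

	memcpy(&hdr, buf + off, sizeof(hdr));
	size = swapped ? bswap16(hdr.size) : hdr.size;

	/* Minimum size check: a record cannot be smaller than its header. */
	if (size < sizeof(hdr))
		return 0;

	/* Bounds check: the record must not run past the end of the data. */
	if (size > len - off)
		return 0;

	return size;
}
```

The size is checked only after byte-swapping, so a crafted file cannot pass validation in one byte order and be consumed in the other.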
* tag 'perf-tools-for-v7.2-1-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (343 commits)
  perf bpf: Fix up build failure due to change of btf_vlen() return type
  perf dso: Set standard errno on decompression failure
  perf bpf: Validate array presence before casting BPF prog info pointers
  perf c2c: Fix hist entry and format list leaks in c2c_he_free()
  perf c2c: Free format list entries when c2c_hists__init() fails
  perf cs-etm: Bounds-check CPU in cs_etm__get_queue()
  perf cs-etm: Require full global header in auxtrace_info size check
  perf cs-etm: Validate num_cpu before metadata allocation
  perf machine: Use snprintf() for guestmount path construction
  perf machine: Propagate machine__init() error to callers
  perf trace: Guard __probe_ip suppression with evsel__is_probe()
  perf evsel: Add lazy-initialized probe type detection helpers
  perf evsel: Add no-libtraceevent stubs for evsel__field() and evsel__common_field()
  perf cs-etm: Reject CPU IDs that would overflow signed comparison
  perf c2c: Free format list entries when releasing c2c hist entries
  perf bpf: Bounds-check array offsets in bpil_offs_to_addr()
  perf bpf: Reject oversized BPF metadata events that truncate header.size
  perf bpf: Validate func_info_rec_size and sub_id in synthesize_bpf_prog_name()
  perf sched: Replace (void*)1 sentinel with proper runtime allocation
  perf hwmon: Fix fd check to accept fd 0 in hwmon_pmu__describe_items()
  ...
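Several of the fixes listed in this merge share one pattern: reading a small text file safely. The sketch below combines three of them under stated assumptions: treating fd 0 as a valid descriptor, reserving one byte so the null terminator never lands past the buffer, and opening with O_CLOEXEC. The function `read_int_file()` is hypothetical and is not the actual `filename__read_int()` from tools/lib/api.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read a decimal integer from a small text file (sysfs-style).
 * Hypothetical helper for illustration only.
 * Returns 0 on success and stores the result in *value, -1 on failure.
 */
int read_int_file(const char *path, int *value)
{
	char buf[64], *end;
	ssize_t n;
	long v;
	int fd;

	/* O_CLOEXEC keeps the descriptor from leaking into child processes. */
	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)	/* fd 0 is valid, so test "< 0" rather than "<= 0" */
		return -1;

	/* Leave room for the terminator: read at most sizeof(buf) - 1 bytes. */
	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	if (n <= 0)	/* read error or empty file */
		return -1;
	buf[n] = '\0';	/* n <= sizeof(buf) - 1, so this stays in bounds */

	errno = 0;
	v = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -1;

	*value = (int)v;
	return 0;
}
```

Reading `sizeof(buf)` bytes and then writing `buf[n] = '\0'` is the classic off-by-one that this kind of fix removes.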
|
| #
9e1e9d66 |
| 16-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'trace-rtla-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull RTLA updates from Steven Rostedt:
- Simplify option parsing
Auto-generate getopt_long() optstring for short options from long options array, avoiding the need to specify it manually and reducing the surface for mistakes.
- Add unit tests
Implement unit tests (make unit-tests) using libcheck, next to existing runtime tests (make check). Currently, three functions from utils.c are tested.
- Add --stack-format option
Stack pointer decoding (with the -s/--stack option) previously stopped at the first unresolvable pointer. A new option makes this configurable: decoding can also skip unresolvable pointers or display everything.
- Unify number of CPUs into one global variable
Use one global variable, nr_cpus, to store the number of CPUs instead of retrieving it and passing it around in multiple places.
- Fix behavior in various corner cases
Make RTLA behave correctly in several corner cases: memory allocation failure, invalid value read from kernel side, thread creation failure, malformed time value input, and read/write failure or interruption by signal.
- Improve string handling
Simplify several places in the code that handle strings, including parsing of action arguments. A few new helper functions and variables are added for that purpose.
- Get rid of magic numbers
A few places that handle paths use a magic number of 1024. Replace it with MAX_PATH and the ARRAY_SIZE() macro.
- Unify threshold handling
Code that handles the response to a latency threshold is duplicated between tools, which has led to bugs in the past. Unify it into a new helper as much as possible.
- Fix segfault on SIGINT during cleanup
The SIGINT handler touches dynamically allocated memory. Detach the handler before freeing that memory during cleanup to prevent a segmentation fault and the discarding of output buffers. Also, properly document SIGINT handling while at it.
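The corner-case item above mentions read/write failure and interruption by a signal. A common way to handle both is a loop that retries on EINTR and continues after short writes. The sketch below is an illustration under that assumption; `write_all()` is a hypothetical name and is not claimed to match the RTLA code.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write all 'len' bytes of 'data' to 'fd'.
 * Retries when a signal interrupts the call and continues after short
 * writes. Returns 0 on success, -1 on error with errno set.
 * Hypothetical helper for illustration only.
 */
int write_all(int fd, const void *data, size_t len)
{
	const char *p = data;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: try again */
			return -1;		/* real I/O error */
		}
		if (n == 0) {
			/* No progress on a non-empty write: give up. */
			errno = EIO;
			return -1;
		}
		p += n;			/* advance past what was written */
		len -= (size_t)n;
	}
	return 0;
}
```

A caller that checks only whether `write()` returned a negative value silently loses data on a short write; looping until `len` reaches zero closes that gap.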
* tag 'trace-rtla-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (28 commits)
  Documentation/rtla: Document SIGINT behavior
  rtla: Fix segfault on multiple SIGINTs
  rtla/utils: Fix loop condition in PID validation
  rtla/utils: Fix resource leak in set_comm_sched_attr()
  rtla/trace: Fix I/O handling in save_trace_to_file()
  rtla/trace: Fix write loop in trace_event_save_hist()
  rtla/timerlat: Simplify RTLA_NO_BPF environment variable check
  rtla: Use str_has_prefix() for option prefix check
  rtla: Enforce exact match for time unit suffixes
  rtla: Use str_has_prefix() for prefix checks
  rtla: Add str_has_prefix() helper function
  rtla: Handle pthread_create() failure properly
  rtla/timerlat: Add bounds check for softirq vector
  rtla: Simplify code by caching string lengths
  rtla: Replace magic number with MAX_PATH
  rtla: Introduce common_threshold_handler() helper
  rtla/actions: Simplify argument parsing
  rtla: Use strdup() to simplify code
  rtla: Exit on memory allocation failures during initialization
  tools/rtla: Remove unneeded nr_cpus from for_each_monitored_cpu
  ...
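The option-parsing change in this merge derives the getopt_long() short-option string from the long options array. One possible shape for such a generator is sketched below. `build_optstring()` is a hypothetical helper written for illustration, and the rule that only alphanumeric `val` entries become short options is an assumption, not something the merge message states.

```c
#include <assert.h>
#include <ctype.h>
#include <getopt.h>
#include <stddef.h>

/*
 * Build a getopt_long() optstring from a long-options array terminated
 * by an all-zero entry. Hypothetical helper for illustration only.
 *
 * An entry contributes a short option when its 'flag' is NULL and its
 * 'val' is an ASCII letter or digit. required_argument appends ":" and
 * optional_argument appends "::".
 *
 * Returns 0 on success, -1 if 'buf' is too small.
 */
int build_optstring(const struct option *opts, char *buf, size_t len)
{
	size_t pos = 0;

	assert(opts && buf && len > 0);

	for (; opts->name; opts++) {
		int c = opts->val;
		size_t colons = 0;

		/* Skip long-only options (flag set or non-character val). */
		if (opts->flag || c <= 0 || c > 127 || !isalnum(c))
			continue;

		if (opts->has_arg == required_argument)
			colons = 1;
		else if (opts->has_arg == optional_argument)
			colons = 2;

		/* Option char + colons + terminating NUL must fit. */
		if (pos + 1 + colons + 1 > len)
			return -1;

		buf[pos++] = (char)c;
		while (colons--)
			buf[pos++] = ':';
	}
	buf[pos] = '\0';
	return 0;
}
```

With entries for `--cpus` (required argument, 'c'), `--debug` (no argument, 'D'), and `--trace` (optional argument, 't'), this sketch produces the string `c:Dt::`, which can be passed directly to getopt_long() alongside the same array.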
|