#
7685b334 |
| 24-Jan-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf-tools updates from Namhyung Kim: "There are a lot of changes in the perf tools
Merge tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf-tools updates from Namhyung Kim: "There are a lot of changes in the perf tools in this cycle.
build:
- Use generic syscall table to generate syscall numbers on supported archs
- This also enables to get rid of libaudit which was used for syscall numbers
- Remove python2 support as it's deprecated for years
- Fix issues on static build with libzstd
perf record:
- Intel-PT supports "aux-action" config term to pause or resume tracing in the aux-buffer. Users can start the intel_pt event as "started-paused" and configure other events to control the Intel-PT tracing:
# perf record --kcore -e intel_pt/aux-action=start-paused/ \ -e syscalls:sys_enter_newuname/aux-action=resume/ \ -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname
This requires kernel support (which was added in v6.13)
perf lock:
- 'perf lock contention' command has an ability to symbolize locks in dynamically allocated objects using slab cache name when it runs with BPF. Those dynamic locks would have "&" prefix in the name to distinguish them from ordinary (static) locks
# perf lock con -abl -E 5 sleep 1 contended total wait max wait avg wait address symbol
2 1.95 us 1.77 us 975 ns ffff9d5e852d3498 &task_struct (mutex) 1 1.18 us 1.18 us 1.18 us ffff9d5e852d3538 &task_struct (mutex) 4 1.12 us 354 ns 279 ns ffff9d5e841ca800 &kmalloc-cg-512 (mutex) 2 859 ns 617 ns 429 ns ffffffffa41c3620 delayed_uprobe_lock (mutex) 3 691 ns 388 ns 230 ns ffffffffa41c0940 pack_mutex (mutex)
This also requires kernel/BPF support (which was added in v6.13)
perf ftrace:
- 'perf ftrace latency' command gets a couple of options to support linear buckets instead of exponential. Also it's possible to specify max and min latency for the linear buckets:
# perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100 \ --min-latency=200 --max-latency=800 -- sleep 1 # DURATION | COUNT | GRAPH | 0 - 200 ns | 186 | ### | 200 - 300 ns | 256 | ##### | 300 - 400 ns | 364 | ####### | 400 - 500 ns | 223 | #### | 500 - 600 ns | 111 | ## | 600 - 700 ns | 41 | | 700 - 800 ns | 141 | ## | 800 - ... ns | 169 | ### |
# statistics (in nsec) total time: 2162212 avg time: 967 max time: 16817 min time: 132 count: 2236
- As you can see in the above example, it nows shows the statistics at the end so that users can see the avg/max/min latencies easily
- 'perf ftrace profile' command has --graph-opts option like 'perf ftrace trace' so that it can control the tracing behaviors in the same way. For example, it can limit the function call depth or threshold
perf script:
- Improve physical memory resolution in 'mem-phys-addr' script by parsing /proc/iomem file
# perf script mem-phys-addr -- find / ... Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-85f7fffff : System RAM 8929 69.7 547600000-54785d23f : Kernel data 1240 9.7 546a00000-5474bdfff : Kernel rodata 490 3.8 5480ce000-5485fffff : Kernel bss 121 0.9 0-fff : Reserved 3860 30.1 100000-89c01fff : System RAM 18 0.1 8a22c000-8df6efff : System RAM 5 0.0
Others:
- 'perf test' gets --runs-per-test option to run the test cases repeatedly. This would be helpful to see if it's flaky
- Add 'parse_events' method to Python perf extension module, so that users can use the same event parsing logic in the python code. One more step towards implementing perf tools in Python. :)
- Support opening tracepoint events without libtraceevent. This will be helpful if it won't use the tracing data like in 'perf stat'
- Update ARM Neoverse N2/V2 JSON events and metrics"
* tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (176 commits) perf test: Update event_groups test to use instructions perf bench: Fix undefined behavior in cmpworker() perf annotate: Prefer passing evsel to evsel->core.idx perf lock: Rename fields in lock_type_table perf lock: Add percpu-rwsem for type filter perf lock: Fix parse_lock_type which only retrieve one lock flag perf lock: Fix return code for functions in __cmd_contention perf hist: Fix width calculation in hpp__fmt() perf hist: Fix bogus profiles when filters are enabled perf hist: Deduplicate cmp/sort/collapse code perf test: Improve verbose documentation perf test: Add a runs-per-test flag perf test: Fix parallel/sequential option documentation perf test: Send list output to stdout rather than stderr perf test: Rename functions and variables for better clarity perf tools: Expose quiet/verbose variables in Makefile.perf perf config: Add a function to set one variable in .perfconfig perf test perftool_testsuite: Return correct value for skipping perf test perftool_testsuite: Add missing description perf test record+probe_libc_inet_pton: Make test resilient ...
show more ...
|
#
4c02c7e0 |
| 09-Jan-2025 |
Charlie Jenkins <charlie@rivosinc.com> |
perf tools powerpc: Use generic syscall table scripts
Use the generic scripts to generate headers from the syscall table instead of the custom ones for powerpc.
Signed-off-by: Charlie Jenkins <char
perf tools powerpc: Use generic syscall table scripts
Use the generic scripts to generate headers from the syscall table instead of the custom ones for powerpc.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-14-7543b5293098@rivosinc.com Link: https://lore.kernel.org/lkml/20250110100505.78d81450@canb.auug.org.au [ Stephen Rothwell noticed on linux-next that the powerpc build for perf was broken and ...] Link: https://lore.kernel.org/lkml/20250109-perf_powerpc_spu-v1-1-c097fc43737e@rivosinc.com [ ... Charlie fixed it up and asked for it to be squashed to avoid breaking bisection. ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|