#
b50ecc5a |
| 26-Nov-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim: "perf record:
- Enable leader sampling fo
Merge tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim: "perf record:
- Enable leader sampling for inherited task events. It was supported only for system-wide events but the kernel started to support such a setup since v6.12.
This is to reduce the number of PMU interrupts. The samples of the leader event will contain counts of other events and no samples will be generated for the other member events.
$ perf record -e '{cycles,instructions}:S' ${MYPROG}
perf report:
- Fix --branch-history option to display more branch-related information like prediction, abort and cycles which is available on Intel machines.
$ perf record -bg -- perf test -w brstack
$ perf report --branch-history ... # # Overhead Source:Line Symbol Shared Object Predicted Abort Cycles IPC [IPC Coverage] # ........ ........................ .............. .................... ......... ..... ...... .................... # 8.17% copy_page_64.S:19 [k] copy_page [kernel.kallsyms] 50.0% 0 5 - - | ---xas_load xarray.h:171 | |--5.68%--xas_load xarray.c:245 (cycles:1) | xas_load xarray.c:242 | xas_load xarray.h:1260 (cycles:1) | xas_descend xarray.c:146 | xas_load xarray.c:244 (cycles:2) | xas_load xarray.c:245 | xas_descend xarray.c:218 (cycles:10) ...
perf stat:
- Add HWMON PMU support.
The HWMON provides various system information like CPU/GPU temperature, fan speed and so on. Expose them as PMU events so that users can see the values using perf stat commands.
$ perf stat -e temp_cpu,fan1 true
Performance counter stats for 'true':
60.00 'C temp_cpu 0 rpm fan1
0.000745382 seconds time elapsed
0.000883000 seconds user 0.000000000 seconds sys
- Display metric threshold in JSON output.
Some metrics define thresholds to classify value ranges. It used to be in a different color but it won't work for JSON.
Add "metric-threshold" field to the JSON that can be one of "good", "less good", "nearly bad" and "bad".
# perf stat -a -M TopdownL1 -j true {"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"} {"metric-value" : "29.212267", "metric-unit" : "% tma_frontend_bound", "metric-threshold" : "bad"} {"metric-value" : "7.138972", "metric-unit" : "% tma_bad_speculation", "metric-threshold" : "good"} {"metric-value" : "20.422759", "metric-unit" : "% tma_retiring", "metric-threshold" : "good"} {"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, } ...
perf sched:
- Add -P/--pre-migrations option for 'timehist' sub-command to track time a task waited on a run-queue before migrating to a different CPU.
$ perf sched timehist -P time cpu task name wait time sch delay run time pre-mig time [tid/pid] (msec) (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- --------- 585940.535527 [0000] perf[584885] 0.000 0.000 0.000 0.000 585940.535535 [0000] migration/0[20] 0.000 0.002 0.008 0.000 585940.535559 [0001] perf[584885] 0.000 0.000 0.000 0.000 585940.535563 [0001] migration/1[25] 0.000 0.001 0.004 0.000 585940.535678 [0002] perf[584885] 0.000 0.000 0.000 0.000 585940.535686 [0002] migration/2[31] 0.000 0.002 0.008 0.000 585940.535905 [0001] <idle> 0.000 0.000 0.342 0.000 585940.535938 [0003] perf[584885] 0.000 0.000 0.000 0.000 585940.537048 [0001] sleep[584886] 0.000 0.019 1.142 0.001 585940.537749 [0002] <idle> 0.000 0.000 2.062 0.000 ...
Build:
- Make libunwind opt-in (LIBUNWIND=1) rather than opt-out.
The perf tools are generally built with libelf and libdw which has unwinder functionality. The libunwind support predates it and no need to have duplicate unwinders by default.
- Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify it's using libdw for handling DWARF information.
Internals:
- Do not set exclude_guest bit in the perf_event_attr by default.
This was causing a trouble in AMD IBS PMU as it doesn't support the bit. The bit will be set when it's needed later by the fallback logic. Also update the missing feature detection logic to make sure not clear supported bits unnecessarily.
- Run perf test in parallel by default and mark flaky tests "exclusive" to run them serially at the end. Some test numbers are changed but the test can complete in less than half the time.
JSON vendor events:
- Add AMD Zen 5 events and metrics.
- Add i.MX91 and i.MX95 DDR metrics
- Fix HiSilicon HIP08 Topdown metric name.
- Support compat events on PowerPC"
* tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (232 commits) perf tests: Fix hwmon parsing with PMU name test perf hwmon_pmu: Ensure hwmon key union is zeroed before use perf tests hwmon_pmu: Remove double evlist__delete() perf/test: fix perf ftrace test on s390 perf bpf-filter: Return -ENOMEM directly when pfi allocation fails perf test: Correct hwmon test PMU detection perf: Remove unused del_perf_probe_events() perf pmu: Move pmu_metrics_table__find and remove ARM override perf jevents: Add map_for_cpu() perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str perf header: Avoid transitive PMU includes perf arm64 header: Use cpu argument in get_cpuid perf header: Refactor get_cpuid to take a CPU for ARM perf header: Move is_cpu_online to numa bench perf jevents: fix breakage when do perf stat on system metric perf test: Add missing __exit calls in tool/hwmon tests perf tests: Make leader sampling test work without branch event perf util: Remove kernel version deadcode perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM isn't resolved perf test shell trace_exit_race: Show what went wrong in verbose mode ...
show more ...
|
#
9ea671d1 |
| 13-Oct-2024 |
Athira Rajeev <atrajeev@linux.vnet.ibm.com> |
tools/perf/tests: Remove duplicate evlist__delete in tests/tool_pmu.c
The testcase for tool_pmu failed in powerpc as below:
./perf test -v "Parsing without PMU name" 8: Tool PMU
tools/perf/tests: Remove duplicate evlist__delete in tests/tool_pmu.c
The testcase for tool_pmu failed in powerpc as below:
./perf test -v "Parsing without PMU name" 8: Tool PMU : 8.1: Parsing without PMU name : FAILED!
This happens when parse_events results in either skip or fail of an event. Because the code invokes evlist__delete(evlist) and "goto out".
ret = parse_events(evlist, str, &err); if (ret) { evlist__delete(evlist);
But in the "out" section also evlist__delete happens.
out: evlist__delete(evlist); return ret;
Hence remove the duplicate evlist__delete from the first path in the testcase
With the change: # ./perf test -v "Parsing without PMU name" 8: Tool PMU : 8.1: Parsing without PMU name : Ok
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241013170732.71339-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|
#
d94d86ce |
| 13-Oct-2024 |
Athira Rajeev <atrajeev@linux.vnet.ibm.com> |
tools/perf/tests: Fix compilation error with strncpy in tests/tool_pmu
perf fails to compile on systems with GCC version11 as below:
In file included from /usr/include/string.h:519,
tools/perf/tests: Fix compilation error with strncpy in tests/tool_pmu
perf fails to compile on systems with GCC version11 as below:
In file included from /usr/include/string.h:519, from /home/athir/perf-tools-next/tools/include/linux/bitmap.h:5, from /home/athir/perf-tools-next/tools/perf/util/pmu.h:5, from /home/athir/perf-tools-next/tools/perf/util/evsel.h:14, from /home/athir/perf-tools-next/tools/perf/util/evlist.h:14, from tests/tool_pmu.c:3: In function ‘strncpy’, inlined from ‘do_test’ at tests/tool_pmu.c:25:3: /usr/include/bits/string_fortified.h:95:10: error: ‘__builtin_strncpy’ specified bound 128 equals destination size [-Werror=stringop-truncation] 95 | return __builtin___strncpy_chk (__dest, __src, __len, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 96 | __glibc_objsize (__dest)); | ~~~~~~~~~~~~~~~~~~~~~~~~~
The compile error is from strncpy refernce in do_test: strncpy(str, tool_pmu__event_to_str(ev), sizeof(str));
This behaviour is not observed with GCC version 8, but observed with GCC version 11 . This is message from gcc for detecting truncation while using strncpu. Use snprintf instead of strncpy here to be safe.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241013173742.71882-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|