940007ce | 22-Jul-2024 |
James Clark <james.clark@arm.com> |
perf: cs-etm: Only save valid trace IDs into files
This isn't a bug because Perf always masks with CORESIGHT_TRACE_ID_VAL_MASK before using these values, but to avoid it looking like it could be, ma
perf: cs-etm: Only save valid trace IDs into files
This isn't a bug because Perf always masks with CORESIGHT_TRACE_ID_VAL_MASK before using these values, but to avoid it looking like it could be, make an effort to not save bad values.
Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240722101202.26915-6-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
19c3e4db | 22-Jul-2024 |
James Clark <james.clark@arm.com> |
perf: cs-etm: Create decoders based on the trace ID mappings
Now that each queue has a unique set of trace ID mappings, use this list to create the decoders. In unformatted mode just add a single ma
perf: cs-etm: Create decoders based on the trace ID mappings
Now that each queue has a unique set of trace ID mappings, use this list to create the decoders. In unformatted mode just add a single mapping so only one decoder is made.
Previously each queue would have a decoder created for each traced CPU on the system but this won't work anymore because CPUs can have overlapping trace IDs.
This also means that the CORESIGHT_TRACE_ID_UNUSED_FLAG isn't needed any more. If mappings aren't added then decoders aren't created, rather than needing a flag to suppress creation.
Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240722101202.26915-5-james.clark@linaro.org Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
e3123079 | 01-May-2024 |
James Clark <james.clark@arm.com> |
perf cs-etm: Improve version detection and error reporting
When the config validation functions are warning about ETMv3, they do it based on "not ETMv4". If the drivers aren't all loaded or the hard
perf cs-etm: Improve version detection and error reporting
When the config validation functions are warning about ETMv3, they do it based on "not ETMv4". If the drivers aren't all loaded or the hardware doesn't support Coresight it will appear as "not ETMv4" and then Perf will print the error message "... not supported in ETMv3 ..." which is wrong and confusing.
cs_etm_is_etmv4() is also misnamed because it also returns true for ETE because ETE has a superset of the ETMv4 metadata files. Although this was always done in the correct order so it wasn't a bug.
Improve all this by making a single get version function which also handles not present as a separate case. Change the ETMv3 error message to only print when ETMv3 is detected, and add a new error message for the not present case.
Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Leo Yan <leo.yan@linux.dev> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240501135753.508022-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
bc5e0e1b | 01-May-2024 |
James Clark <james.clark@arm.com> |
perf cs-etm: Remove repeated fetches of the ETM PMU
Most functions already have cs_etm_pmu, so it's a bit neater to pass it through rather than itr only to convert it again.
Signed-off-by: James Cl
perf cs-etm: Remove repeated fetches of the ETM PMU
Most functions already have cs_etm_pmu, so it's a bit neater to pass it through rather than itr only to convert it again.
Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240501135753.508022-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
78efa7b4 | 14-Oct-2023 |
Leo Yan <leo.yan@linaro.org> |
perf cs-etm: Respect timestamp option
When users pass the option '--timestamp' or '-T' in the record command, all events will set the PERF_SAMPLE_TIME bit in the attribution. In this case, the AUX
perf cs-etm: Respect timestamp option
When users pass the option '--timestamp' or '-T' in the record command, all events will set the PERF_SAMPLE_TIME bit in the attribution. In this case, the AUX event will record the kernel timestamp, but it doesn't mean Arm CoreSight enables timestamp packets in its hardware tracing.
If the option '--timestamp' or '-T' is set, this patch always enables Arm CoreSight timestamp, as a result, the bit 28 in event's config is to be set.
Before:
# perf record -e cs_etm// --per-thread --timestamp -- ls # perf script --header-only ... # event : name = cs_etm//, , id = { 69 }, type = 12, size = 136, config = 0, { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1 ...
After:
# perf record -e cs_etm// --per-thread --timestamp -- ls # perf script --header-only ... # event : name = cs_etm//, , id = { 49 }, type = 12, size = 136, config = 0x10000000, { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1 ...
Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231014074159.1667880-3-leo.yan@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|
f8ccc2d5 | 14-Oct-2023 |
Leo Yan <leo.yan@linaro.org> |
perf cs-etm: Validate timestamp tracing in per-thread mode
So far, it's impossible to validate timestamp trace in Arm CoreSight when the perf is in the per-thread mode. E.g. for the command:
per
perf cs-etm: Validate timestamp tracing in per-thread mode
So far, it's impossible to validate timestamp trace in Arm CoreSight when the perf is in the per-thread mode. E.g. for the command:
perf record -e cs_etm/timestamp/ --per-thread -- ls
The command enables config 'timestamp' for 'cs_etm' event in the per-thread mode. In this case, the function cs_etm_validate_config() directly bails out and skips validation.
Given profiled process can be scheduled on any CPUs in the per-thread mode, this patch validates timestamp tracing for all CPUs when detect the CPU map is empty.
Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231014074159.1667880-2-leo.yan@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|
0197da7a | 12-Oct-2023 |
Ian Rogers <irogers@google.com> |
perf pmu: Lazily compute default config
The default config is computed during creation of the PMU and may do things like scanning sysfs, when the PMU may just be used as part of scanning. Change def
perf pmu: Lazily compute default config
The default config is computed during creation of the PMU and may do things like scanning sysfs, when the PMU may just be used as part of scanning. Change default_config to perf_event_attr_init_default, a callback that is used when a default config needs initializing. This avoids holding onto the memory for a perf_event_attr and copying.
On a tigerlake laptop running the pmu-scan benchmark:
Before: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 28.780 usec (+- 0.503 usec) Average PMU scanning took: 283.480 usec (+- 18.471 usec) Number of openat syscalls: 30,227
After: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 27.880 usec (+- 0.169 usec) Average PMU scanning took: 245.260 usec (+- 15.758 usec) Number of openat syscalls: 28,914
Over 3 runs it is a nearly 12% reduction in execution time and a 4.3% of openat calls.
Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|
672bd213 | 12-Oct-2023 |
Ian Rogers <irogers@google.com> |
perf arm-spe: Move PMU initialization from default config code
Avoid setting PMU values in arm_spe_pmu_default_config, move to perf_pmu__arch_init.
Signed-off-by: Ian Rogers <irogers@google.com> Re
perf arm-spe: Move PMU initialization from default config code
Avoid setting PMU values in arm_spe_pmu_default_config, move to perf_pmu__arch_init.
Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
show more ...
|
ff382c1c | 06-Jun-2023 |
Leo Yan <leo.yan@linaro.org> |
perf parse-regs: Move out arch specific header from util/perf_regs.h
util/perf_regs.h includes another perf_regs.h:
#include <perf_regs.h>
Here it includes architecture specific header, for exam
perf parse-regs: Move out arch specific header from util/perf_regs.h
util/perf_regs.h includes another perf_regs.h:
#include <perf_regs.h>
Here it includes architecture specific header, for example, if we build arm64 target, the header tools/perf/arch/arm64/include/perf_regs.h is included.
We use this implicit way to include architecture specific header, which is not directive; furthermore, util/perf_regs.c is coupled with the architecture specific definitions.
This patch moves out arch specific header from util/perf_regs.h for generalizing the 'util' folder, as a result, the source files in 'arch' folder explicitly include architecture's perf_regs.h.
Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eric Lin <eric.lin@sifive.com> Cc: Fangrui Song <maskray@google.com> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20230606014559.21783-7-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|