| b70aa410 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Expose SIMD information in other operations
The other operations contain SME data processing, ASE (Advanced SIMD) and floating-point operations. Expose this information in the records.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| d67835cd | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Report GCS in record
Report GCS related info in records.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| d4b61de4 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Report memset and memcpy in records
Expose memset and memcpy related info in records.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| 6d47c32c | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Report associated info for SVE / SME operations
SVE / SME operations can be predicated, or can be gather loads / scatter stores. Save the relevant information in the record.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| f3b9bed7 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Report extended memory operations in records
Extended memory operations include atomic (AT), acquire/release (AR), and exclusive (EXCL) operations. Save the relevant information in the records.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| c462dc70 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Report MTE allocation tag in record
Save MTE tag info in memory record.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| 77e4291e | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Report register access in record
Record register access info for load / store operations.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| cdc1aff1 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Introduce data processing macro for SVE operations
Introduce the ARM_SPE_OP_DP (data processing) macro as associated information for SVE operations. For SVE register access, only ARM_SPE_OP_SVE is set; for SVE data processing, both ARM_SPE_OP_SVE and ARM_SPE_OP_DP are set together.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
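
The distinction between SVE register access and SVE data processing can be sketched as two predicates over a flags word. This is only an illustration: the `SKETCH_*` names and their bit positions are hypothetical stand-ins for ARM_SPE_OP_SVE and ARM_SPE_OP_DP, whose real values are not given in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
enum sketch_op {
	SKETCH_OP_SVE = 1u << 8, /* stand-in for ARM_SPE_OP_SVE */
	SKETCH_OP_DP  = 1u << 9, /* stand-in for ARM_SPE_OP_DP */
};

/* The two flags must not share a bit, or the predicates below overlap. */
static_assert((SKETCH_OP_SVE & SKETCH_OP_DP) == 0, "flags must be distinct");

/* SVE register access: the SVE flag is set and the DP flag is clear. */
static bool sketch_is_sve_reg_access(uint32_t op)
{
	return (op & SKETCH_OP_SVE) && !(op & SKETCH_OP_DP);
}

/* SVE data processing: both flags are set together. */
static bool sketch_is_sve_data_processing(uint32_t op)
{
	return (op & SKETCH_OP_SVE) && (op & SKETCH_OP_DP);
}
```

Under these assumptions, a consumer of the record would test both flags rather than SVE alone when it needs to tell the two cases apart.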
|
| b64bf913 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Consolidate operation types
Consolidate operation types as follows:
(a) Extract the second-level types into separate enums.
(b) The second-level types for memory and SIMD operations are classified by modules. E.g., an operation may relate to general register, SIMD/FP, SVE, etc.
(c) The associated information gives the details, e.g., whether an operation is a load or a store, whether it is an atomic operation, etc.
Start the enum items for the second-level types from 8 to accommodate more entries within a 32-bit integer.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
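
A rough sketch of the two-level layout follows, assuming the first-level types occupy the low 8 bits and the second-level types start at bit 8 as the message states. All `SKETCH_*` names, the mask, and the individual bit assignments are hypothetical; the actual enums in the perf sources may differ.

```c
#include <assert.h>
#include <stdint.h>

/* First-level operation types (hypothetical names) in the low bits. */
enum sketch_op_type {
	SKETCH_OP_OTHER  = 1u << 0,
	SKETCH_OP_LDST   = 1u << 1,
	SKETCH_OP_BRANCH = 1u << 2,
};

/* Assumed mask covering the bits reserved for first-level types. */
#define SKETCH_FIRST_LEVEL_MASK 0xffu

/* Second-level types for memory operations, starting from bit 8. */
enum sketch_ldst_type {
	/* (b) classification by module */
	SKETCH_LDST_GP      = 1u << 8,  /* general register */
	SKETCH_LDST_SIMD_FP = 1u << 9,
	SKETCH_LDST_SVE     = 1u << 10,
	/* (c) associated information */
	SKETCH_LDST_STORE   = 1u << 11,
	SKETCH_LDST_ATOMIC  = 1u << 12,
};

/* Second-level bits must stay clear of the first-level range. */
static_assert((SKETCH_LDST_GP & SKETCH_FIRST_LEVEL_MASK) == 0,
	      "second-level types start above the first-level bits");

/* Extract the first-level type from a combined 32-bit value. */
static uint32_t sketch_first_level(uint32_t op)
{
	return op & SKETCH_FIRST_LEVEL_MASK;
}

/* Extract the second-level type and associated information. */
static uint32_t sketch_second_level(uint32_t op)
{
	return op & ~SKETCH_FIRST_LEVEL_MASK;
}
```

Keeping both levels in one 32-bit value lets a single field describe, for example, "load/store, SVE, store" without a separate sub-type member.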
|
| c7c198b3 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Remove unused operation types
Remove unused SVE operation types. These operations will be reintroduced in subsequent refactoring, but with a different format.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| c4cfe1bc | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Decode SME data processing packet
For SME data processing, decode its Effective vector length or Tile Size (ETS), and print whether it is a floating-point operation.
After:
  . 00000000: 49 00 SME-OTHER ETS 1024 FP
  . 00000002: b2 18 3c d7 83 00 80 ff ff VA 0xffff800083d73c18
  . 0000000b: 9a 00 00 LAT 0 XLAT
  . 0000000e: 43 00 DATA-SOURCE 0
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| 876294a6 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Decode ASE and FP fields in other operation
Add a check for the other operation, which prevents incorrect classification. Parse the ASE and FP fields.
After:
  . 0000002f: 48 06 OTHER ASE FP INSN-OTHER
  . 00000031: b2 08 80 48 01 08 00 ff ff VA 0xffff000801488008
  . 0000003a: 9a 00 00 LAT 0 XLAT
  . 0000003d: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| c8bf2a05 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro
Rename the macro to SPE_OP_PKT_OTHER_SUBCLASS_SVE to unify naming.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| b4eaece3 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Decode GCS operation
Decode a load or store from a GCS operation and the associated "common" field.
After:
  . 00000000: 49 44 LD GCS COMM
  . 00000002: b2 18 3c d7 83 00 80 ff ff VA 0xffff800083d73c18
  . 0000000b: 9a 00 00 LAT 0 XLAT
  . 0000000e: 43 00 DATA-SOURCE 0
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| b61ca721 | 12-Nov-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Unify operation naming
Rename the extended subclass and the SVE/SME register access subclass, so that the naming is consistent across all subclasses.
Add a log string "SVE-SME-REG" for the SVE/SME register access, which is easier to parse.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| d5105689 | 12-Sep-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Set HITM flag
Since FEAT_SPEv1p4, Arm SPE provides two extra events: "Cache data modified" and "Data snooped".
Set the snoop mode as:
- If both the "Cache data modified" event and the "Data snooped" event are set, which indicates a load operation that snooped from an outside cache and hit a modified copy, set the HITM flag to inspect false sharing.
- If the snooped event bit is not set, and the snooped event is supported by the hardware, set NONE mode (no snoop operation).
- If the snooped event bit is not set, and either the event is not supported or the event information is absent from the metadata, set NA mode (not available).
Don't set any mode for only the "Cache data modified" event, as it hits a local modified copy.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
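
The snoop-mode rules above can be read as a small decision function. The sketch below is one possible reading, not the code from the commit: the event bit positions, the enum, and the function name are hypothetical, and the "snooped but not modified" case is returned as a separate value because the message does not describe it. A "Cache data modified" event on its own never yields HITM here; it falls through to the NONE/NA rules because the snooped bit is clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits; the real payload layout is not shown here. */
#define SKETCH_EV_SNOOPED (UINT64_C(1) << 0) /* "Data snooped" */
#define SKETCH_EV_HITM    (UINT64_C(1) << 1) /* "Cache data modified" */

static_assert((SKETCH_EV_SNOOPED & SKETCH_EV_HITM) == 0,
	      "event bits must be distinct");

enum sketch_snoop {
	SKETCH_SNOOP_NA,    /* not available */
	SKETCH_SNOOP_NONE,  /* no snoop operation */
	SKETCH_SNOOP_HITM,  /* snooped and hit a modified copy */
	SKETCH_SNOOP_OTHER, /* snooped, not modified: not covered by the message */
};

/*
 * events:           event bits of the sampled operation
 * has_metadata:     whether event capability info exists in the metadata
 * supported_events: events the hardware reports as supported
 */
static enum sketch_snoop sketch_snoop_mode(uint64_t events, bool has_metadata,
					   uint64_t supported_events)
{
	if (events & SKETCH_EV_SNOOPED)
		return (events & SKETCH_EV_HITM) ? SKETCH_SNOOP_HITM
						 : SKETCH_SNOOP_OTHER;

	/* Snooped bit clear: NONE only if the hardware can report the event. */
	if (has_metadata && (supported_events & SKETCH_EV_SNOOPED))
		return SKETCH_SNOOP_NONE;

	return SKETCH_SNOOP_NA;
}
```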
|
| 786e7e7a | 12-Sep-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Fill memory levels for FEAT_SPEv1p4
Starting with FEAT_SPEv1p4, Arm SPE provides information on Level 2 data cache and recently fetched events. This patch fills in the memory levels for these new events.
The recently fetched events are matched to the line-fill buffer (LFB). In general, the latency for accessing the LFB is higher than accessing the L1 cache but lower than accessing the L2 cache. Thus, it is placed between the L1 cache and the L2 cache in the memory hierarchy information.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
| e44e2b2b | 12-Sep-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm_spe: Decode event types for new features
Decode new event types introduced by FEAT_SPEv1p4, FEAT_SPE_SME and FEAT_SPE_SME.
The printed event names don't strictly follow the naming in the Arm ARM. For example, the "Cache data modified" event is shown as "HITM", and the "Data snooped" event is printed as "SNOOPED". Shorter names are easier to read while preserving core meanings.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
| 2cc2f258 | 04-Mar-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm-spe: Support previous branch target (PBT) address
When FEAT_SPE_PBT is implemented, the previous branch target address (named PBT) before the sampled operation will be recorded.
This commit first introduces a 'prev_br_tgt' field in the record for saving the PBT address in the decoder.
If the current operation is a branch instruction, combining it with PBT creates a chain of two consecutive branches. The branch stack stores branches in descending order, meaning a newer branch is stored in a lower entry in the stack. Arm SPE stores the latest branch in the first entry of the branch stack, and the previous branch coming from PBT is stored in the second entry.
Otherwise, if the current operation is not a branch, the last branch will be saved for PBT only. PBT lacks associated information such as branch source address, branch type, and events. The branch entry fills zeros for the corresponding fields and only sets its target address.
After:
  perf script -f --itrace=bl -F flags,addr,brstack
  jcc ffff800080187914 0xffff8000801878fc/0xffff800080187914/P/-/-/1/COND/- 0x0/0xffff8000801878f8/-/-/-/0//-
  jcc ffff8000802d12d8 0xffff8000802d12f8/0xffff8000802d12d8/P/-/-/1/COND/- 0x0/0xffff8000802d12ec/-/-/-/0//-
  jcc ffff8000813fe200 0xffff8000813fe20c/0xffff8000813fe200/P/-/-/1/COND/- 0x0/0xffff8000813fe200/-/-/-/0//-
  jcc ffff8000813fe200 0xffff8000813fe20c/0xffff8000813fe200/P/-/-/1/COND/- 0x0/0xffff8000813fe200/-/-/-/0//-
  jmp ffff800081410980 0xffff800081419108/0xffff800081410980/P/-/-/1//- 0x0/0xffff800081419104/-/-/-/0//-
  return ffff80008036e064 0xffff80008141ba84/0xffff80008036e064/P/-/-/1/RET/- 0x0/0xffff80008141ba60/-/-/-/0//-
  jcc ffff8000803d54f0 0xffff8000803d54e8/0xffff8000803d54f0/P/-/-/1/COND/- 0x0/0xffff8000803d54e0/-/-/-/0//-
  jmp ffff80008015e468 0xffff8000803d46dc/0xffff80008015e468/P/-/-/1//- 0x0/0xffff8000803d46c8/-/-/-/0//-
  jmp ffff8000806e2d50 0xffff80008040f710/0xffff8000806e2d50/P/-/-/1//- 0x0/0xffff80008040f6e8/-/-/-/0//-
  jcc ffff800080721704 0xffff8000807216b4/0xffff800080721704/P/-/-/1/COND/- 0x0/0xffff8000807216ac/-/-/-/0//-
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-13-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
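
The ordering of branch stack entries described in this message can be sketched as follows. Only the `prev_br_tgt` field name comes from the commit message; the structs, the other field names, and the helper are hypothetical, and the sketch assumes every record carries a valid PBT address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified branch entry: only the source and target addresses. */
struct sketch_branch {
	uint64_t from;
	uint64_t to;
};

/* Simplified decoder record (hypothetical apart from prev_br_tgt). */
struct sketch_record {
	bool     is_branch;   /* is the sampled operation a branch? */
	uint64_t from_ip;     /* source address of the sampled branch */
	uint64_t to_ip;       /* target address of the sampled branch */
	uint64_t prev_br_tgt; /* previous branch target (PBT) */
};

#define SKETCH_MAX_ENTRIES 2

static_assert(SKETCH_MAX_ENTRIES >= 2, "need room for the branch and PBT");

/* Fill entries newest first and return how many entries were written. */
static size_t sketch_fill_brstack(const struct sketch_record *rec,
				  struct sketch_branch out[SKETCH_MAX_ENTRIES])
{
	size_t n = 0;

	/* The sampled branch, if any, is the newest: it takes entry 0. */
	if (rec->is_branch) {
		out[n].from = rec->from_ip;
		out[n].to = rec->to_ip;
		n++;
	}

	/* PBT has only a target address, so the source is left as zero. */
	out[n].from = 0;
	out[n].to = rec->prev_br_tgt;
	n++;

	return n;
}
```

With the first sample line of the output above, entry 0 would hold 0xffff8000801878fc/0xffff800080187914 and entry 1 would hold 0x0/0xffff8000801878f8.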
|
| 5c1b1583 | 04-Mar-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm-spe: Fill branch operations and events to record
The newly added branch operations and events are filled into the record; the information will be consumed when synthesizing samples.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-10-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| faf22605 | 04-Mar-2025 |
Leo Yan <leo.yan@arm.com> |
perf arm-spe: Decode transactional event
Bit[16] in an event payload indicates that an operation is in transactional state. Decode the bit.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-9-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
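
Testing bit[16] of an event payload could look like the minimal sketch below. The macro and function names are hypothetical; only the bit position comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit[16] of the event payload: operation is in transactional state. */
#define SKETCH_EV_TRANSACTIONAL (UINT64_C(1) << 16)

static_assert(SKETCH_EV_TRANSACTIONAL == 0x10000, "expected bit[16]");

/* Return true if the payload marks the operation as transactional. */
static bool sketch_in_transaction(uint64_t payload)
{
	return (payload & SKETCH_EV_TRANSACTIONAL) != 0;
}
```

A payload needs at least three bytes for this bit to be present, so a decoder would normally check the payload size before testing it.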
|