perf-mem.txt - OpenGrok cross reference for /linux/tools/perf/Documentation/perf-mem.txt

Lines Matching +full:spe +full:- +full:pmu
1 perf-mem(1)
5 ----
6 perf-mem - Profile memory accesses
9 --------
14 -----------
20 and stores are sampled. Use the -t option to limit to loads or stores.
22 Note that on Intel systems the memory latency reported is the use-latency,
26 On Arm64 this uses SPE to sample load and store operations, therefore hardware
27 and kernel support is required. See linkperf:perf-arm-spe[1] for a setup guide.
28 Due to the statistical nature of SPE sampling, not every memory operation will
31 On AMD this uses the IBS Op PMU to sample load-store operations.
34 --------------
35 -f::
36 --force::
39 -t::
40 --type=<type>::
43 -v::
44 --verbose::
47 -p::
48 --phys-data::
51 --data-page-size::
55 --------------
59 -e::
60 --event <event>::
61 	Event selector. Use 'perf mem record -e list' to list available events.
63 -K::
64 --all-kernel::
67 -U::
68 --all-user::
71 --ldlat <n>::
76 	- /sys/bus/event_source/devices/ibs_op/caps/ldlat file contains '1'.
77 	- Supported latency values are 128 to 2048 (both inclusive).
78 	- A latency value that is a multiple of 128 incurs slightly less profiling
80 	- Load latency filtering is disabled by default.
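
The AMD `--ldlat` constraints listed above can be checked before recording. The following is a minimal sketch, not part of perf: the function names are hypothetical, and only the sysfs path and the 128 to 2048 range come from the text.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by perf) that mirror the AMD
# --ldlat constraints listed above.

ldlat_in_range() {
    # Succeeds when $1 is a plain decimal integer in 128..2048 (inclusive).
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 128 ] && [ "$1" -le 2048 ]
}

ldlat_is_multiple_of_128() {
    # Multiples of 128 are the values described as slightly cheaper to profile.
    [ $(( $1 % 128 )) -eq 0 ]
}

ibs_ldlat_supported() {
    # The default path is the caps file named in the text; $1 may override
    # it (an assumption added here so the check can be exercised in tests).
    caps="${1:-/sys/bus/event_source/devices/ibs_op/caps/ldlat}"
    [ -r "$caps" ] && [ "$(cat "$caps")" = "1" ]
}
```

A script might call `ibs_ldlat_supported && ldlat_in_range 256` before passing `--ldlat 256` to `perf mem record`.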
83 --------------
84 -i::
85 --input=<file>::
88 -C::
89 --cpu=<cpu>::
91         comma-separated list with no space: 0,1. Ranges of CPUs are specified with -
92 	like 0-2. Default is to monitor all CPUs.
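
To illustrate the list syntax accepted by `-C`, the sketch below expands a CPU list into individual CPU numbers. It is an illustration only, not perf's own parser, and `expand_cpu_list` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): expands a CPU list such as
# "0,1" or "0-2,4" into space-separated CPU numbers.
expand_cpu_list() {
    out=""
    old_ifs=$IFS
    IFS=,                      # split the argument on commas
    for item in $1; do
        case "$item" in
            *-*) first=${item%-*}; last=${item#*-} ;;   # range, e.g. 0-2
            *)   first=$item; last=$item ;;             # single CPU
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            out="${out:+$out }$cpu"
            cpu=$((cpu + 1))
        done
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}
```

For example, `expand_cpu_list 0-2` prints `0 1 2`, the same set of CPUs that `-C 0-2` selects.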
94 -D::
95 --dump-raw-samples::
99 -s::
100 --sort=<key>::
101 	Group result by given key(s) - multiple keys can be specified
106 	- symbol_daddr: name of data symbol being accessed at the time of sample
107 	- symbol_iaddr: name of code symbol being executed on at the time of sample
108 	- dso_daddr: name of library or module containing the data being accessed
110 	- locked: whether the bus was locked at the time of the sample
111 	- tlb: type of tlb access for the data at the time of the sample
112 	- mem: type of memory access for the data at the time of the sample
113 	- snoop: type of snoop (if any) for the data at the time of the sample
114 	- dcacheline: the cacheline the data address is on at the time of the sample
115 	- phys_daddr: physical address of data being accessed at the time of sample
116 	- data_page_size: the page size of the data being accessed at the time of sample
117 	- blocked: reason for blocked load access for the data at the time of the sample
122 -F::
123 --fields=::
124 	Specify output field - multiple keys can be specified in CSV format.
125 	Please see linkperf:perf-report[1] for details.
130 	- op: operation in the sample instruction (load, store, prefetch, ...)
131 	- cache: location in CPU cache (L1, L2, ...) where the sample hit
132 	- mem: location in memory or other places the sample hit
133 	- dtlb: location in Data TLB (L1, L2) where the sample hit
134 	- snoop: snoop result for the sampled data access
138 -T::
139 --type-profile::
140 	Show data-type profile result instead of code symbols.  This requires
144 -U::
145 --hide-unresolved::
148 -x::
149 --field-separator=<separator>::
150 	Specify the field separator used when dumping raw samples (-D option). By default,
157 --------------------
158 Unlike linkperf:perf-report[1], which calculates overhead from the actual
159 sample period, perf-mem overhead is calculated using sample weight. E.g.
163   $ perf script -F period,data_src,weight,ip,sym
167   $ perf report -F overhead,symbol
171   $ perf mem report -F overhead,symbol
176 ----------------------
180 behave differently when it's used by -F/--fields or -s/--sort.
185   $ perf mem report -F mem,snoop
187   # ------ Memory -------  --- Snoop ----
196   $ perf mem report -s mem,snoop
210 --------
211 linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]