1 perf-amd-ibs(1)
5 ----
6 perf-amd-ibs - Support for AMD Instruction-Based Sampling (IBS) with perf tool
9 --------
11 'perf record' -e ibs_op//
12 'perf record' -e ibs_fetch//
15 -----------
17 Instruction-Based Sampling (IBS) provides precise Instruction Pointer (IP)
20 execution (micro-op execution to be precise) with details like d-cache
21 hit/miss, d-TLB hit/miss, cache miss latency, load/store data source, branch
23 with details like i-cache hit/miss, i-TLB hit/miss, fetch latency etc. IBS is
24 per-smt-thread i.e. each SMT hardware thread contains standalone IBS units.
39 IBS VS. REGULAR CORE PMU
40 ------------------------
42 IBS gives samples with precise IP, i.e. the IP recorded with an IBS sample has
43 no skid, whereas the IP recorded by the regular core PMU will have some skid
44 (sample was generated at IP X but perf would record it at IP X+n). Hence,
45 the regular core PMU might not help for profiling with instruction-level
47 question. On the other hand, regular core PMU has its own advantages like
51 Three regular core PMU events are internally forwarded to IBS Op PMU when
54 -e cpu-cycles:p becomes -e ibs_op//
55 -e r076:p becomes -e ibs_op//
56 -e r0C1:p becomes -e ibs_op/cnt_ctl=1/
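
The forwarding listed above can be written down as a small lookup. This is only an illustrative sketch: `ibs_equivalent` is a hypothetical shell helper, not part of perf, and it simply restates the three mappings from this page, matching the event names exactly as they are spelled here.

```shell
# Hypothetical helper (not part of perf): print the IBS Op event that a
# precise core PMU event is forwarded to, per the three mappings above.
ibs_equivalent() {
    case "$1" in
        cpu-cycles:p|r076:p) echo 'ibs_op//' ;;
        r0C1:p)              echo 'ibs_op/cnt_ctl=1/' ;;
        *)                   echo "not forwarded: $1" >&2; return 1 ;;
    esac
}

ibs_equivalent cpu-cycles:p   # prints ibs_op//
ibs_equivalent r0C1:p         # prints ibs_op/cnt_ctl=1/
```

Any other event name makes the helper return a non-zero status, which mirrors the fact that only these three events are listed as forwarded.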
59 --------
64 System-wide profile, cycles event, sampling period: 100000
66 # perf record -e ibs_op// -c 100000 -a
68 Per-cpu profile (cpu10), cycles event, sampling period: 100000
70 # perf record -e ibs_op// -c 100000 -C 10
72 Per-cpu profile (cpu10), cycles event, sampling freq: 1000
74 # perf record -e ibs_op// -F 1000 -C 10
76 System-wide profile, uOps event, sampling period: 100000
78 # perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a
82 # perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a --raw-samples
84 System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onward)
86 # perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
88 System-wide profile, cycles event, sampling period: 100000, LdLat filtering (Zen5
91 # perf record -e ibs_op/ldlat=128/ -c 100000 -a
99 # perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
103 # perf record -e ibs_op/cnt_ctl=1/ -c 100000 -- ls
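
The ibs_op examples above differ only in the comma-separated terms between the slashes. As a rough sketch, a hypothetical helper `ibs_op_event` (not a perf feature) can assemble that string from the terms shown on this page (cnt_ctl, l3missonly, ldlat). It does no validation, and whether a given term is accepted depends on the processor generation, as noted in the examples.

```shell
# Hypothetical helper (not part of perf): join the given terms with commas
# and wrap them as an ibs_op event. The body runs in a subshell so the IFS
# change does not leak into the caller.
ibs_op_event() (
    IFS=,
    printf 'ibs_op/%s/\n' "$*"
)

ibs_op_event                          # prints ibs_op//
ibs_op_event cnt_ctl=1 l3missonly=1   # prints ibs_op/cnt_ctl=1,l3missonly=1/

# Possible use on a machine with IBS support (not executed here):
#   perf record -e "$(ibs_op_event ldlat=128)" -c 100000 -a
```

Calling the helper with no arguments yields the plain cycles-based event, matching the first examples in this section.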
114 Raw dump of IBS registers when profiled with --raw-samples
116 # perf report -D
142 https://lore.kernel.org/r/20220921063638.2489-1-kprateek.nayak@amd.com
149 System-wide profile, fetch ops event, sampling period: 100000
151 # perf record -e ibs_fetch// -c 100000 -a
153 System-wide profile, fetch ops event, sampling period: 100000, Random enable
155 # perf record -e ibs_fetch/rand_en=1/ -c 100000 -a
164 ---------------------
170 # perf mem record -c 100000 -- make
176 # perf mem report -F overhead,cache,snoop,comm
181 # ---------- Cache ----------- --- Snoop ----
182 # Overhead L1 L2 L1-buf Other HitM Other Command
191 0.28% 0.1% 0.0% 0.0% 0.2% 0.1% 0.2% pkg-config
199 # perf mem report -s mem,snoop
209 25.08% 338 core, same node Any cache hit HitM
212 6.39% 101 core, same node Any cache hit N/A
220 --------
222 linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1],
223 linkperf:perf-mem[1], linkperf:perf-c2c[1]