.\" Copyright (c) 2012 Hiren Panchasara <hiren.panchasara@gmail.com>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd October 18, 2012
.Dt PMC.SANDYBRIDGEXEON 3
.Os
.Sh NAME
.Nm pmc.sandybridgexeon
.Nd measurement events for
.Tn Intel
.Tn Sandy Bridge Xeon
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Sandy Bridge Xeon"
CPUs contain PMCs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
These CPUs may contain up to two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
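.Pp
A counter's width matters mainly when two raw readings are compared, because a
counter that is w bits wide wraps to zero after reaching 2^w - 1.
The following C sketch illustrates only that arithmetic.
It does not call
.Xr pmc_cpuinfo 3 ,
and the helper names
.Fn pmc_max_count
and
.Fn pmc_delta
are hypothetical, not part of libpmc.
It assumes the counter wrapped at most once between the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value a counter of the given width can hold (hypothetical helper). */
static uint64_t
pmc_max_count(unsigned width)
{
	assert(width >= 1 && width <= 64);
	return (width == 64 ? UINT64_MAX : ((uint64_t)1 << width) - 1);
}

/*
 * Events elapsed between two raw readings of a free-running counter,
 * assuming it wrapped at most once between them.  Unsigned subtraction
 * followed by masking to the counter width handles the wrap.
 */
static uint64_t
pmc_delta(uint64_t before, uint64_t after, unsigned width)
{
	return ((after - before) & pmc_max_count(width));
}
```

With a width of 48 bits (an example value only; the real width has to be
obtained at run time), a reading of 0xFFFFFFFFFFF0 followed by a reading of
0x10 gives a delta of 0x20.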
.Pp
Intel Sandy Bridge Xeon PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Volume 3B: System Programming Guide, Part 2"
.%N "Order Number: 253669-043US"
.%D August 2012
.%Q "Intel Corporation"
.Re
.Ss SANDYBRIDGE XEON FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss SANDYBRIDGE XEON PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li rsp= Ns Ar value
Configure the Off-core Response bits.
.Bl -tag -width indent
.It Li REQ_DMND_DATA_RD
Counts the number of demand and DCU prefetch data reads of full and partial
cachelines as well as demand data page table entry cacheline reads.
Does not count L2 data read prefetches or instruction fetches.
.It Li REQ_DMND_RFO
Counts the number of demand and DCU prefetch reads for ownership (RFO)
requests generated by a write to data cacheline.
Does not count L2 RFO prefetches.
.It Li REQ_DMND_IFETCH
Counts the number of demand and DCU prefetch instruction cacheline reads.
Does not count L2 code read prefetches.
.It Li REQ_WB
Counts the number of writeback (modified to exclusive) transactions.
.It Li REQ_PF_DATA_RD
Counts the number of data cacheline reads generated by L2 prefetchers.
.It Li REQ_PF_RFO
Counts the number of RFO requests generated by L2 prefetchers.
.It Li REQ_PF_IFETCH
Counts the number of code reads generated by L2 prefetchers.
.It Li REQ_PF_LLC_DATA_RD
L2 prefetcher to L3 for loads.
.It Li REQ_PF_LLC_RFO
RFO requests generated by L2 prefetcher.
.It Li REQ_PF_LLC_IFETCH
L2 prefetcher to L3 for instruction fetches.
.It Li REQ_BUS_LOCKS
Bus lock and split lock requests.
.It Li REQ_STRM_ST
Streaming store requests.
.It Li REQ_OTHER
Any other request that crosses IDI, including I/O.
.It Li RES_ANY
Catch-all value for any response type.
.It Li RES_SUPPLIER_NO_SUPP
No Supplier Information available.
.It Li RES_SUPPLIER_LLC_HITM
M-state initial lookup stat in L3.
.It Li RES_SUPPLIER_LLC_HITE
E-state.
.It Li RES_SUPPLIER_LLC_HITS
S-state.
.It Li RES_SUPPLIER_LLC_HITF
F-state.
.It Li RES_SUPPLIER_LOCAL
Local DRAM Controller.
.It Li RES_SNOOP_SNP_NONE
No details on snoop-related information.
.It Li RES_SNOOP_SNP_NO_NEEDED
No snoop was needed to satisfy the request.
.It Li RES_SNOOP_SNP_MISS
A snoop was needed and it missed all snooped caches:
.Bl -dash -compact
.It
For LLC Hit, ReslHitl was returned by all cores.
.It
For LLC Miss, Rspl was returned by all sockets and data was returned from
DRAM.
.El
.It Li RES_SNOOP_HIT_NO_FWD
A snoop was needed and it hits in at least one snooped cache.
Hit denotes a cache-line was valid before snoop effect.
This includes:
.Bl -dash -compact
.It
Snoop Hit w/ Invalidation (LLC Hit, RFO)
.It
Snoop Hit, Left Shared (LLC Hit/Miss, IFetch/Data_RD)
.It
Snoop Hit w/ Invalidation and No Forward (LLC Miss, RFO Hit S)
.El
.Pp
In the LLC Miss case, data is returned from DRAM.
.It Li RES_SNOOP_HIT_FWD
A snoop was needed and data was forwarded from a remote socket.
This includes:
.Bl -dash -compact
.It
Snoop Forward Clean, Left Shared (LLC Hit/Miss, IFetch/Data_RD/RFT).
.El
.It Li RES_SNOOP_HITM
A snoop was needed and it HitM-ed in local or remote cache.
HitM denotes a cache-line was in modified state before effect as a result
of snoop.
This includes:
.Bl -dash -compact
.It
Snoop HitM w/ WB (LLC miss, IFetch/Data_RD)
.It
Snoop Forward Modified w/ Invalidation (LLC Hit/Miss, RFO)
.It
Snoop MtoS (LLC Hit, IFetch/Data_RD).
.El
.It Li RES_NON_DRAM
Target was non-DRAM system address.
This includes MMIO transactions.
.El
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
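.Pp
The interaction of the
.Dq Li cmask ,
.Dq Li inv
and
.Dq Li edge
qualifiers can be easier to follow as a small software model.
The C sketch below is such a model, not a libpmc interface; the function name
.Fn model_count
and its parameters are hypothetical.
It assumes that a thresholded counter advances by one per qualifying cycle,
that the condition starts out de-asserted, and that with no threshold
(cmask of 0) the raw events are simply summed and the other two qualifiers
are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Software model (not a libpmc API) of the cmask, inv and edge qualifiers.
 * events[i] is the number of raw events observed in cycle i.
 */
static uint64_t
model_count(const unsigned *events, size_t ncycles, unsigned cmask,
    bool inv, bool edge)
{
	uint64_t total = 0;
	bool prev = false;	/* assumption: condition starts de-asserted */

	assert(events != NULL || ncycles == 0);

	for (size_t i = 0; i < ncycles; i++) {
		if (cmask == 0) {
			/* No threshold: this model just sums raw events. */
			total += events[i];
			continue;
		}
		/* cmask: events >= cmask; inv flips it to events < cmask. */
		bool cond = (events[i] >= cmask) != inv;

		/* edge: count only false-to-true transitions. */
		if (edge ? (cond && !prev) : cond)
			total++;
		prev = cond;
	}
	return (total);
}
```

For per-cycle event counts of 0, 2, 3, 0, 1, 4 and a cmask of 2, this model
reports 3 qualifying cycles, or 2 when edge is also set, since the condition
becomes true twice.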
.Pp
If neither the
.Dq Li os
nor the
.Dq Li usr
qualifier is specified, the default is to enable both.
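.Pp
Each event in this manual page is identified by an event code and a unit mask,
and the qualifiers above correspond to further fields of the same hardware
event-select register.
The C sketch below packs these fields using the IA32_PERFEVTSELx bit layout
given in the Intel manual cited above.
It is an illustration only and is not taken from libpmc or
.Xr hwpmc 4 ;
the names
.Vt struct evt_quals
and
.Fn evtsel_encode
are hypothetical, and the enable and interrupt bits are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EVTSEL_USR	(1u << 16)	/* count at privilege levels 1-3 */
#define EVTSEL_OS	(1u << 17)	/* count at privilege level 0 */
#define EVTSEL_EDGE	(1u << 18)	/* edge detect */
#define EVTSEL_INV	(1u << 23)	/* invert the cmask comparison */

/* Hypothetical container for the qualifiers described above. */
struct evt_quals {
	bool usr, os, edge, inv;
	unsigned cmask;
};

static uint32_t
evtsel_encode(unsigned event, unsigned umask, struct evt_quals q)
{
	uint32_t v;

	assert(event <= 0xFF && umask <= 0xFF && q.cmask <= 0xFF);

	/* Neither usr nor os given: enable both, the documented default. */
	if (!q.usr && !q.os)
		q.usr = q.os = true;

	v = (uint32_t)event;			/* bits 0-7: event select */
	v |= (uint32_t)umask << 8;		/* bits 8-15: unit mask */
	v |= (uint32_t)q.cmask << 24;		/* bits 24-31: counter mask */
	if (q.usr)
		v |= EVTSEL_USR;
	if (q.os)
		v |= EVTSEL_OS;
	if (q.edge)
		v |= EVTSEL_EDGE;
	if (q.inv)
		v |= EVTSEL_INV;
	return (v);
}
```

Under these assumptions, event 03H with umask 01H and no qualifiers encodes
to 0x00030103, because both privilege bits are set by default.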
.Ss Event Specifiers (Programmable PMCs)
Sandy Bridge Xeon programmable PMCs support the following events:
.Bl -tag -width indent
.It Li LD_BLOCKS.DATA_UNKNOWN
.Pq Event 03H , Umask 01H
Blocked loads due to store buffer blocks with unknown data.
.It Li LD_BLOCKS.STORE_FORWARD
.Pq Event 03H , Umask 02H
Loads blocked by overlapping with store buffer that cannot
be forwarded.
.It Li LD_BLOCKS.NO_SR
.Pq Event 03H , Umask 08H
Number of split loads blocked due to resource not available.
.It Li LD_BLOCKS.ALL_BLOCK
.Pq Event 03H , Umask 10H
Number of cases where any load is blocked but has no
DCU miss.
.It Li MISALIGN_MEM_REF.LOADS
.Pq Event 05H , Umask 01H
Speculative cache-line split load uops dispatched to
L1D.
.It Li MISALIGN_MEM_REF.STORES
.Pq Event 05H , Umask 02H
Speculative cache-line split store-address uops
dispatched to L1D.
.It Li LD_BLOCKS_PARTIAL.ADDRESS_ALIAS
.Pq Event 07H , Umask 01H
False dependencies in MOB due to partial compare on
address.
.It Li LD_BLOCKS_PARTIAL.ALL_STALL_BLOCK
.Pq Event 07H , Umask 08H
The number of times that load operations are temporarily
blocked because of older stores, with addresses that are
not yet known.
A load operation may incur more than one block of this type.
.It Li TLB_LOAD_MISSES.MISS_CAUSES_A_WALK
.Pq Event 08H , Umask 01H
Misses in all TLB levels that cause a page walk of any
page size.
.It Li TLB_LOAD_MISSES.WALK_COMPLETED
.Pq Event 08H , Umask 02H
Misses in all TLB levels that caused a completed page walk
of any page size.
.It Li DTLB_LOAD_MISSES.WALK_DURATION
.Pq Event 08H , Umask 04H
Cycles PMH is busy with a walk.
.It Li DTLB_LOAD_MISSES.STLB_HIT
.Pq Event 08H , Umask 10H
Number of cache load STLB hits.
No page walk.
.It Li INT_MISC.RECOVERY_CYCLES
.Pq Event 0DH , Umask 03H
Cycles waiting to recover after Machine Clears or EClear.
Set Cmask = 1.
.It Li INT_MISC.RAT_STALL_CYCLES
.Pq Event 0DH , Umask 40H
Cycles RAT external stall is sent to IDQ for this thread.
.It Li UOPS_ISSUED.ANY
.Pq Event 0EH , Umask 01H
Increments each cycle by the number of uops issued by the
RAT to RS.
Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles
of this core.
.It Li FP_COMP_OPS_EXE.X87
.Pq Event 10H , Umask 01H
Counts number of X87 uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_PACKED_DOUBLE
.Pq Event 10H , Umask 10H
Counts number of SSE* double precision FP packed
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_SCALAR_SINGLE
.Pq Event 10H , Umask 20H
Counts number of SSE* single precision FP scalar
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_PACKED_SINGLE
.Pq Event 10H , Umask 40H
Counts number of SSE* single precision FP packed
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE
.Pq Event 10H , Umask 80H
Counts number of SSE* double precision FP scalar
uops executed.
.It Li SIMD_FP_256.PACKED_SINGLE
.Pq Event 11H , Umask 01H
Counts 256-bit packed single-precision floating-point
instructions.
.It Li SIMD_FP_256.PACKED_DOUBLE
.Pq Event 11H , Umask 02H
Counts 256-bit packed double-precision floating-point
instructions.
.It Li ARITH.FPU_DIV_ACTIVE
.Pq Event 14H , Umask 01H
Cycles that the divider is active, includes INT and FP.
Set Edge = 1 and Cmask = 1 to count the number of
divides.
.It Li INSTS_WRITTEN_TO_IQ.INSTS
.Pq Event 17H , Umask 01H
Counts the number of instructions written into the
IQ every cycle.
.It Li L2_RQSTS.DEMAND_DATA_RD_HIT
.Pq Event 24H , Umask 01H
Demand Data Read requests that hit L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_DATA_RD
.Pq Event 24H , Umask 03H
Counts any demand and L1 HW prefetch data load
requests to L2.
.It Li L2_RQSTS.RFO_HITS
.Pq Event 24H , Umask 04H
Counts the number of store RFO requests that
hit the L2 cache.
.It Li L2_RQSTS.RFO_MISS
.Pq Event 24H , Umask 08H
Counts the number of store RFO requests that
miss the L2 cache.
.It Li L2_RQSTS.ALL_RFO
.Pq Event 24H , Umask 0CH
Counts all L2 store RFO requests.
.It Li L2_RQSTS.CODE_RD_HIT
.Pq Event 24H , Umask 10H
Number of instruction fetches that hit the L2
cache.
.It Li L2_RQSTS.CODE_RD_MISS
.Pq Event 24H , Umask 20H
Number of instruction fetches that missed the L2
cache.
.It Li L2_RQSTS.ALL_CODE_RD
.Pq Event 24H , Umask 30H
Counts all L2 code requests.
.It Li L2_RQSTS.PF_HIT
.Pq Event 24H , Umask 40H
Requests from L2 Hardware prefetcher that hit L2.
.It Li L2_RQSTS.PF_MISS
.Pq Event 24H , Umask 80H
Requests from L2 Hardware prefetcher that missed
L2.
.It Li L2_RQSTS.ALL_PF
.Pq Event 24H , Umask C0H
Any requests from L2 Hardware prefetchers.
.It Li L2_STORE_LOCK_RQSTS.MISS
.Pq Event 27H , Umask 01H
RFOs that miss cache lines.
.It Li L2_STORE_LOCK_RQSTS.HIT_E
.Pq Event 27H , Umask 04H
RFOs that hit cache lines in E state.
.It Li L2_STORE_LOCK_RQSTS.HIT_M
.Pq Event 27H , Umask 08H
RFOs that hit cache lines in M state.
.It Li L2_STORE_LOCK_RQSTS.ALL
.Pq Event 27H , Umask 0FH
RFOs that access cache lines in any state.
.It Li L2_L1D_WB_RQSTS.MISS
.Pq Event 28H , Umask 01H
Not rejected writebacks from L1D to L2 cache lines
that missed L2.
.It Li L2_L1D_WB_RQSTS.HIT_S
.Pq Event 28H , Umask 02H
Not rejected writebacks from L1D to L2 cache lines
in S state.
.It Li L2_L1D_WB_RQSTS.HIT_E
.Pq Event 28H , Umask 04H
Not rejected writebacks from L1D to L2 cache lines
in E state.
.It Li L2_L1D_WB_RQSTS.HIT_M
.Pq Event 28H , Umask 08H
Not rejected writebacks from L1D to L2 cache lines
in M state.
.It Li L2_L1D_WB_RQSTS.ALL
.Pq Event 28H , Umask 0FH
Not rejected writebacks from L1D to L2 cache.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
This event counts requests originating from the
core that reference
a cache line in the last level cache.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
This event counts each cache miss condition for
references to the last level cache.
.It Li CPU_CLK_UNHALTED.THREAD_P
.Pq Event 3CH , Umask 00H
Counts the number of thread cycles while the
thread is not in a halt state.
The thread enters the halt state when it is running the HLT
instruction.
The core frequency may change from time to time due to power or thermal
throttling.
.It Li CPU_CLK_THREAD_UNHALTED.REF_XCLK
.Pq Event 3CH , Umask 01H
Increments at the frequency of XCLK (100 MHz)
when not halted.
.It Li L1D_PEND_MISS.PENDING
.Pq Event 48H , Umask 01H
Increments the number of outstanding L1D misses
every cycle.
Set Cmask = 1 and Edge = 1 to count occurrences.
.It Li DTLB_STORE_MISSES.MISS_CAUSES_A_WALK
.Pq Event 49H , Umask 01H
Miss in all TLB levels causes a page walk of
any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_COMPLETED
.Pq Event 49H , Umask 02H
Miss in all TLB levels causes a page walk that
completes, for any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_DURATION
.Pq Event 49H , Umask 04H
Cycles PMH is busy with this walk.
.It Li DTLB_STORE_MISSES.STLB_HIT
.Pq Event 49H , Umask 10H
Store operations that miss the first TLB level
but hit the second and do not cause page walks.
.It Li LOAD_HIT_PRE.SW_PF
.Pq Event 4CH , Umask 01H
Non-SW-prefetch load dispatches that hit fill
buffer allocated for S/W prefetch.
.It Li LOAD_HIT_PRE.HW_PF
.Pq Event 4CH , Umask 02H
Non-SW-prefetch load dispatches that hit fill
buffer allocated for H/W prefetch.
.It Li HW_PRE_REQ.DL1_MISS
.Pq Event 4EH , Umask 02H
Hardware Prefetch requests that miss the L1D cache.
A request is counted each time it accesses the cache
and misses it, including when a block is applicable or when it hits the
fill buffer, for example.
.It Li L1D.REPLACEMENT
.Pq Event 51H , Umask 01H
Counts the number of lines brought into the
L1 data cache.
.It Li L1D.ALLOCATED_IN_M
.Pq Event 51H , Umask 02H
Counts the number of allocations of modified
L1D cache lines.
.It Li L1D.EVICTION
.Pq Event 51H , Umask 04H
Counts the number of modified lines evicted
from the L1 data cache due to replacement.
.It Li L1D.ALL_M_REPLACEMENT
.Pq Event 51H , Umask 08H
Cache lines in M state evicted out of L1D due
to Snoop HitM or dirty line replacement.
.It Li PARTIAL_RAT_STALLS.FLAGS_MERGE_UOP
.Pq Event 59H , Umask 0CH
Increments the number of flags-merge uops in
flight each cycle.
Set Cmask = 1 to count cycles.
.It Li PARTIAL_RAT_STALLS.SLOW_LEA_WINDOW
.Pq Event 59H , Umask 0FH
Cycles with at least one slow LEA uop allocated.
.It Li PARTIAL_RAT_STALLS.MUL_SINGLE_UOP
.Pq Event 59H , Umask 40H
Number of Multiply packed/scalar single precision
uops allocated.
.It Li RESOURCE_STALLS2.ALL_FL_EMPTY
.Pq Event 5BH , Umask 0CH
Cycles stalled due to free list empty.
.It Li RESOURCE_STALLS2.ALL_PRF_CONTROL
.Pq Event 5BH , Umask 0FH
Cycles stalled due to control structures full for
physical registers.
.It Li RESOURCE_STALLS2.BOB_FULL
.Pq Event 5BH , Umask 40H
Cycles Allocator is stalled due to Branch Order Buffer.
.It Li RESOURCE_STALLS2.OOO_RSRC
.Pq Event 5BH , Umask 4FH
Cycles stalled due to out of order resources full.
.It Li CPL_CYCLES.RING0
.Pq Event 5CH , Umask 01H
Unhalted core cycles when the thread is in ring 0.
.It Li CPL_CYCLES.RING123
.Pq Event 5CH , Umask 02H
Unhalted core cycles when the thread is not in ring 0.
.It Li RS_EVENTS.EMPTY_CYCLES
.Pq Event 5EH , Umask 01H
Cycles the RS is empty for the thread.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD
.Pq Event 60H , Umask 01H
Offcore outstanding Demand Data Read
transactions in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO
.Pq Event 60H , Umask 04H
Offcore outstanding RFO store transactions in SQ to
uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD
.Pq Event 60H , Umask 08H
Offcore outstanding cacheable data read
transactions in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION
.Pq Event 63H , Umask 01H
Cycles in which the L1D and L2 are locked, due to a
UC lock or split lock.
.It Li LOCK_CYCLES.CACHE_LOCK_DURATION
.Pq Event 63H , Umask 02H
Cycles in which the L1D is locked.
.It Li IDQ.EMPTY
.Pq Event 79H , Umask 02H
Counts cycles the IDQ is empty.
.It Li IDQ.MITE_UOPS
.Pq Event 79H , Umask 04H
Increments each cycle by the number of uops delivered to IDQ
from MITE path.
Set Cmask = 1 to count cycles.
.It Li IDQ.DSB_UOPS
.Pq Event 79H , Umask 08H
Increments each cycle by the number of uops delivered to IDQ
from DSB path.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_DSB_UOPS
.Pq Event 79H , Umask 10H
Increments each cycle by the number of uops delivered to IDQ
when MS busy by DSB.
Set Cmask = 1 to count cycles MS is busy.
Set Cmask = 1 and Edge = 1 to count MS activations.
.It Li IDQ.MS_MITE_UOPS
.Pq Event 79H , Umask 20H
Increments each cycle by the number of uops delivered to IDQ
when MS is busy by MITE.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_UOPS
.Pq Event 79H , Umask 30H
Increments each cycle by the number of uops delivered to IDQ
from MS by either DSB or MITE.
Set Cmask = 1 to count cycles.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
Number of Instruction Cache, Streaming Buffer and
Victim Cache Misses.
Includes UC accesses.
.It Li ITLB_MISSES.MISS_CAUSES_A_WALK
.Pq Event 85H , Umask 01H
Misses in all ITLB levels that cause page walks.
.It Li ITLB_MISSES.WALK_COMPLETED
.Pq Event 85H , Umask 02H
Misses in all ITLB levels that cause completed page
walks.
.It Li ITLB_MISSES.WALK_DURATION
.Pq Event 85H , Umask 04H
Cycles PMH is busy with a walk.
.It Li ITLB_MISSES.STLB_HIT
.Pq Event 85H , Umask 10H
Number of cache load STLB hits.
No page walk.
.It Li ILD_STALL.LCP
.Pq Event 87H , Umask 01H
Stalls caused by changing prefix length of the
instruction.
.It Li ILD_STALL.IQ_FULL
.Pq Event 87H , Umask 04H
Stall cycles due to the IQ being full.
.It Li BR_INST_EXEC.NONTAKEN_COND
.Pq Event 88H , Umask 41H
Count conditional near branch instructions that were executed (but not
necessarily retired) and not taken.
.It Li BR_INST_EXEC.TAKEN_COND
.Pq Event 88H , Umask 81H
Count conditional near branch instructions that were executed (but not
necessarily retired) and taken.
.It Li BR_INST_EXEC.DIRECT_JMP
.Pq Event 88H , Umask 82H
Count all unconditional near branch instructions excluding calls and
indirect branches.
.It Li BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 88H , Umask 84H
Count executed indirect near branch instructions that are neither calls
nor returns.
.It Li BR_INST_EXEC.RETURN_NEAR
.Pq Event 88H , Umask 88H
Count indirect near branches that have a return mnemonic.
.It Li BR_INST_EXEC.DIRECT_NEAR_CALL
.Pq Event 88H , Umask 90H
Count executed unconditional near call branch instructions, excluding
non-call branches.
.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL
.Pq Event 88H , Umask A0H
Count executed indirect near calls, including both register and memory
indirect.
.It Li BR_INST_EXEC.ALL_BRANCHES
.Pq Event 88H , Umask FFH
Counts all near executed branches (not necessarily retired).
.It Li BR_MISP_EXEC.NONTAKEN_COND
.Pq Event 89H , Umask 41H
Count conditional near branch instructions mispredicted as nontaken.
.It Li BR_MISP_EXEC.TAKEN_COND
.Pq Event 89H , Umask 81H
Count conditional near branch instructions mispredicted as taken.
.It Li BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 89H , Umask 84H
Count mispredicted indirect near branch instructions that are neither calls
nor returns.
.It Li BR_MISP_EXEC.RETURN_NEAR
.Pq Event 89H , Umask 88H
Count mispredicted indirect near branches that have a return mnemonic.
.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL
.Pq Event 89H , Umask 90H
Count executed mispredicted unconditional near call branch instructions,
excluding non-call branches.
.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL
.Pq Event 89H , Umask A0H
Count executed mispredicted indirect near calls, including both register
and memory indirect.
.It Li BR_MISP_EXEC.ALL_BRANCHES
.Pq Event 89H , Umask FFH
Counts all mispredicted near executed branches (not necessarily retired).
.It Li IDQ_UOPS_NOT_DELIVERED.CORE
.Pq Event 9CH , Umask 01H
Count number of non-delivered uops to RAT per
thread.
.It Li UOPS_DISPATCHED_PORT.PORT_0
.Pq Event A1H , Umask 01H
Cycles in which a uop is dispatched on port 0.
.It Li UOPS_DISPATCHED_PORT.PORT_1
.Pq Event A1H , Umask 02H
Cycles in which a uop is dispatched on port 1.
.It Li UOPS_DISPATCHED_PORT.PORT_2_LD
.Pq Event A1H , Umask 04H
Cycles in which a load uop is dispatched on port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_2_STA
.Pq Event A1H , Umask 08H
Cycles in which a store address uop is dispatched on
port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_2
.Pq Event A1H , Umask 0CH
Cycles in which a uop is dispatched on port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_3_LD
.Pq Event A1H , Umask 10H
Cycles in which a load uop is dispatched on port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_3_STA
.Pq Event A1H , Umask 20H
Cycles in which a store address uop is dispatched on
port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_3
.Pq Event A1H , Umask 30H
Cycles in which a uop is dispatched on port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_4
.Pq Event A1H , Umask 40H
Cycles in which a uop is dispatched on port 4.
.It Li UOPS_DISPATCHED_PORT.PORT_5
.Pq Event A1H , Umask 80H
Cycles in which a uop is dispatched on port 5.
.It Li RESOURCE_STALLS.ANY
.Pq Event A2H , Umask 01H
Cycles Allocation is stalled due to Resource Related
reason.
.It Li RESOURCE_STALLS.LB
.Pq Event A2H , Umask 02H
Counts the cycles of stall due to lack of load buffers.
.It Li RESOURCE_STALLS.RS
.Pq Event A2H , Umask 04H
Cycles stalled due to no eligible RS entry available.
.It Li RESOURCE_STALLS.SB
.Pq Event A2H , Umask 08H
Cycles stalled due to no store buffers available
(not including draining from sync).
.It Li RESOURCE_STALLS.ROB
.Pq Event A2H , Umask 10H
Cycles stalled due to re-order buffer full.
.It Li RESOURCE_STALLS.FCSW
.Pq Event A2H , Umask 20H
Cycles stalled due to writing the FPU control word.
.It Li RESOURCE_STALLS.MXCSR
.Pq Event A2H , Umask 40H
Cycles stalled due to the MXCSR register rename
occurring too close to a previous MXCSR rename.
.It Li RESOURCE_STALLS.OTHER
.Pq Event A2H , Umask 80H
Cycles stalled while execution was stalled due to
other resource issues.
.It Li CYCLE_ACTIVITY.CYCLES_L2_PENDING
.Pq Event A3H , Umask 01H
Cycles with pending L2 miss loads.
Set AnyThread to count per core.
.It Li CYCLE_ACTIVITY.CYCLES_L1D_PENDING
.Pq Event A3H , Umask 02H
Cycles with pending L1 cache miss loads.
Set AnyThread to count per core.
.It Li CYCLE_ACTIVITY.CYCLES_NO_DISPATCH
.Pq Event A3H , Umask 04H
Cycles of dispatch stalls.
Set AnyThread to count per core.
.It Li DSB2MITE_SWITCHES.COUNT
.Pq Event ABH , Umask 01H
Number of DSB to MITE switches.
.It Li DSB2MITE_SWITCHES.PENALTY_CYCLES
.Pq Event ABH , Umask 02H
Cycles DSB to MITE switches caused delay.
.It Li DSB_FILL.OTHER_CANCEL
.Pq Event ACH , Umask 02H
Cases of cancelling valid DSB fill not because of
exceeding way limit.
.It Li DSB_FILL.EXCEED_DSB_LINES
.Pq Event ACH , Umask 08H
DSB Fill encountered > 3 DSB lines.
.It Li DSB_FILL.ALL_CANCEL
.Pq Event ACH , Umask 0AH
Cases of cancelling valid Decode Stream Buffer
(DSB) fill not because of exceeding way limit.
.It Li ITLB.ITLB_FLUSH
.Pq Event AEH , Umask 01H
Counts the number of ITLB flushes, includes
4k/2M/4M pages.
.It Li OFFCORE_REQUESTS.DEMAND_DATA_RD
.Pq Event B0H , Umask 01H
Demand data read requests sent to uncore.
.It Li OFFCORE_REQUESTS.DEMAND_RFO
.Pq Event B0H , Umask 04H
Demand RFO read requests sent to uncore, including
regular RFOs, locks, ItoM.
.It Li OFFCORE_REQUESTS.ALL_DATA_RD
.Pq Event B0H , Umask 08H
Data read requests sent to uncore (demand and
prefetch).
.It Li UOPS_DISPATCHED.THREAD
.Pq Event B1H , Umask 01H
Counts total number of uops to be dispatched per-thread
each cycle.
Set Cmask = 1, Inv = 1 to count stall cycles.
.It Li UOPS_DISPATCHED.CORE
.Pq Event B1H , Umask 02H
Counts total number of uops to be dispatched per-core
each cycle.
.It Li OFFCORE_REQUESTS_BUFFER.SQ_FULL
.Pq Event B2H , Umask 01H
Offcore requests buffer cannot take more entries
for this thread core.
.It Li AGU_BYPASS_CANCEL.COUNT
.Pq Event B6H , Umask 01H
Counts executed load operations with all the
following traits: 1. addressing of the format [base +
offset], 2. the offset is between 1 and 2047, 3. the
address specified in the base register is in one page
and the address [base+offset] is in another page.
.It Li OFF_CORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Off-core Response Performance Monitoring; PMC0 only.
Requires programming MSR 01A6H.
.It Li OFF_CORE_RESPONSE_1
.Pq Event BBH , Umask 01H
Off-core Response Performance Monitoring; PMC3 only.
Requires programming MSR 01A7H.
.It Li TLB_FLUSH.DTLB_THREAD
.Pq Event BDH , Umask 01H
DTLB flush attempts of the thread-specific entries.
.It Li TLB_FLUSH.STLB_ANY
.Pq Event BDH , Umask 20H
Count number of STLB flush attempts.
.It Li L1D_BLOCKS.BANK_CONFLICT_CYCLES
.Pq Event BFH , Umask 05H
Cycles when dispatched loads are cancelled due to
L1D bank conflicts with other load ports.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
Number of instructions at retirement.
.It Li INST_RETIRED.ALL
.Pq Event C0H , Umask 01H
Precise instruction retired event with HW to reduce
effect of PEBS shadow in IP distribution.
.It Li OTHER_ASSISTS.ITLB_MISS_RETIRED
.Pq Event C1H , Umask 02H
Instructions that experienced an ITLB miss.
.It Li OTHER_ASSISTS.AVX_STORE
.Pq Event C1H , Umask 08H
Number of assists associated with 256-bit AVX
store operations.
.It Li OTHER_ASSISTS.AVX_TO_SSE
.Pq Event C1H , Umask 10H
Number of transitions from AVX-256 to legacy SSE
when penalty applicable.
.It Li OTHER_ASSISTS.SSE_TO_AVX
.Pq Event C1H , Umask 20H
Number of transitions from SSE to AVX-256 when
penalty applicable.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 01H
Counts the number of micro-ops retired.
Use cmask=1 and invert to count active cycles or stalled
cycles.
.It Li UOPS_RETIRED.RETIRE_SLOTS
.Pq Event C2H , Umask 02H
Counts the number of retirement slots used each
cycle.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
Counts the number of machine clears due to
memory order conflicts.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 04H
Counts the number of times that a program writes
to a code section.
.It Li MACHINE_CLEARS.MASKMOV
.Pq Event C3H , Umask 20H
Counts the number of executed AVX masked load
operations that refer to an illegal address range
with the mask bits set to 0.
.It Li BR_INST_RETIRED.ALL_BRANCH
.Pq Event C4H , Umask 00H
Branch instructions at retirement.
.It Li BR_INST_RETIRED.CONDITIONAL
.Pq Event C4H , Umask 01H
Counts the number of conditional branch
instructions retired.
.It Li BR_INST_RETIRED.NEAR_CALL
.Pq Event C4H , Umask 02H
Direct and indirect near call instructions retired.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 04H
Counts the number of branch instructions retired.
.It Li BR_INST_RETIRED.NEAR_RETURN
.Pq Event C4H , Umask 08H
Counts the number of near return instructions
retired.
.It Li BR_INST_RETIRED.NOT_TAKEN
.Pq Event C4H , Umask 10H
Counts the number of not taken branch instructions
retired.
.It Li BR_INST_RETIRED.NEAR_TAKEN
.Pq Event C4H , Umask 20H
Number of near taken branches retired.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask 40H
Number of far branches retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
Mispredicted branch instructions at retirement.
.It Li BR_MISP_RETIRED.CONDITIONAL
.Pq Event C5H , Umask 01H
Mispredicted conditional branch instructions retired.
.It Li BR_MISP_RETIRED.NEAR_CALL
.Pq Event C5H , Umask 02H
Direct and indirect mispredicted near call
instructions retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 04H
Mispredicted macro branch instructions retired.
.It Li BR_MISP_RETIRED.NOT_TAKEN
.Pq Event C5H , Umask 10H
Mispredicted not taken branch instructions retired.
.It Li BR_MISP_RETIRED.TAKEN
.Pq Event C5H , Umask 20H
Mispredicted taken branch instructions retired.
.It Li FP_ASSIST.X87_OUTPUT
.Pq Event CAH , Umask 02H
Number of X87 assists due to output value.
.It Li FP_ASSIST.X87_INPUT
.Pq Event CAH , Umask 04H
Number of X87 assists due to input value.
.It Li FP_ASSIST.SIMD_OUTPUT
.Pq Event CAH , Umask 08H
Number of SIMD FP assists due to output values.
.It Li FP_ASSIST.SIMD_INPUT
.Pq Event CAH , Umask 10H
Number of SIMD FP assists due to input values.
.It Li FP_ASSIST.ANY
.Pq Event CAH , Umask 1EH
Cycles with any input/output SSE* or FP assists.
.It Li ROB_MISC_EVENTS.LBR_INSERTS
.Pq Event CCH , Umask 20H
Count cases of saving new LBR records by
hardware.
.It Li MEM_TRANS_RETIRED.LOAD_LATENCY
.Pq Event CDH , Umask 01H
Sample loads with specified latency threshold.
PMC3 only.
.It Li MEM_TRANS_RETIRED.PRECISE_STORE
.Pq Event CDH , Umask 02H
Sample stores and collect precise store operation
via PEBS record.
PMC3 only.
871.It Li MEM_UOP_RETIRED.LOADS
872.Pq Event D0H , Umask 01H
873Qualify retired memory uops that are loads.
874Combine with umask 10H, 20H, 40H, 80H.
875.It Li MEM_UOP_RETIRED.STORES
876.Pq Event D0H , Umask 02H
877Qualify retired memory uops that are stores.
878Combine with umask 10H, 20H, 40H, 80H.
879.It Li MEM_UOP_RETIRED.STLB_MISS
880.Pq Event D0H , Umask 10H
881Qualify retired memory uops with STLB miss.
882Must be combined with umask 01H or 02H to produce counts.
883.It Li MEM_UOP_RETIRED.LOCK
884.Pq Event D0H , Umask 20H
885Qualify retired memory uops with lock.
886Must be combined with umask 01H or 02H to produce counts.
887.It Li MEM_UOP_RETIRED.SPLIT
888.Pq Event D0H , Umask 40H
889Qualify retired memory uops with line split.
890Must be combined with umask 01H or 02H to produce counts.
891.It Li MEM_UOP_RETIRED.ALL
892.Pq Event D0H , Umask 80H
893Qualify any retired memory uops.
894Must be combined with umask 01H or 02H to produce counts.
895.It Li MEM_LOAD_UOPS_RETIRED.L1_HIT
896.Pq Event D1H , Umask 01H
897Retired load uops with L1 cache hits as data
898sources.
899.It Li MEM_LOAD_UOPS_RETIRED.L2_HIT
900.Pq Event D1H , Umask 02H
901Retired load uops with L2 cache hits as data
902sources.
903.It Li MEM_LOAD_UOPS_RETIRED.LLC_HIT
904.Pq Event D1H , Umask 04H
905Retired load uops whose data source was an LLC hit
906with no snoop required.
907.It Li MEM_LOAD_UOPS_RETIRED.LLC_MISS
908.Pq Event D1H , Umask 20H
909Retired load uops whose data source missed the
910LLC (excluding unknown data sources).
911.It Li MEM_LOAD_UOPS_RETIRED.HIT_LFB
912.Pq Event D1H , Umask 40H
913Retired load uops that missed L1 but hit the line fill
914buffer (LFB) because of a preceding miss to the same
915cache line whose data was not yet ready.
916.It Li MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
917.Pq Event D4H , Umask 02H
918Retired load uops for which the data source that
919serviced the load is unknown.
920.It Li BACLEARS.ANY
921.Pq Event E6H , Umask 01H
922Counts the number of times the front end is
923resteered, mainly when the BPU cannot provide a
924correct prediction and this is corrected by other
925branch handling mechanisms at the front end.
926.It Li L2_TRANS.DEMAND_DATA_RD
927.Pq Event F0H , Umask 01H
928Demand Data Read requests that access L2 cache.
929.It Li L2_TRANS.RFO
930.Pq Event F0H , Umask 02H
931RFO requests that access L2 cache.
932.It Li L2_TRANS.CODE_RD
933.Pq Event F0H , Umask 04H
934L2 cache accesses when fetching instructions.
935.It Li L2_TRANS.ALL_PF
936.Pq Event F0H , Umask 08H
937L2 or LLC HW prefetches that access L2 cache.
938.It Li L2_TRANS.L1D_WB
939.Pq Event F0H , Umask 10H
940L1D writebacks that access L2 cache.
941.It Li L2_TRANS.L2_FILL
942.Pq Event F0H , Umask 20H
943L2 fill requests that access L2 cache.
944.It Li L2_TRANS.L2_WB
945.Pq Event F0H , Umask 40H
946L2 writebacks that access L2 cache.
947.It Li L2_TRANS.ALL_REQUESTS
948.Pq Event F0H , Umask 80H
949Transactions accessing L2 pipe.
950.It Li L2_LINES_IN.I
951.Pq Event F1H , Umask 01H
952L2 cache lines in I state filling L2.
953.It Li L2_LINES_IN.S
954.Pq Event F1H , Umask 02H
955L2 cache lines in S state filling L2.
956.It Li L2_LINES_IN.E
957.Pq Event F1H , Umask 04H
958L2 cache lines in E state filling L2.
959.It Li L2_LINES_IN.ALL
960.Pq Event F1H , Umask 07H
961L2 cache lines filling L2.
962.It Li L2_LINES_OUT.DEMAND_CLEAN
963.Pq Event F2H , Umask 01H
964Clean L2 cache lines evicted by demand.
965.It Li L2_LINES_OUT.DEMAND_DIRTY
966.Pq Event F2H , Umask 02H
967Dirty L2 cache lines evicted by demand.
968.It Li L2_LINES_OUT.PF_CLEAN
969.Pq Event F2H , Umask 04H
970Clean L2 cache lines evicted by L2 prefetch.
971.It Li L2_LINES_OUT.PF_DIRTY
972.Pq Event F2H , Umask 08H
973Dirty L2 cache lines evicted by L2 prefetch.
974.It Li L2_LINES_OUT.DIRTY_ALL
975.Pq Event F2H , Umask 0AH
976Dirty L2 cache lines filling the L2.
977.It Li SQ_MISC.SPLIT_LOCK
978.Pq Event F4H , Umask 10H
979Split locks in SQ.
980.El
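.Sh EXAMPLES
The MEM_UOP_RETIRED entries above state that a type umask (01H for loads,
02H for stores) has to be combined with a qualifier umask (10H, 20H, 40H
or 80H) before the event produces counts.
The following is a minimal sketch of that arithmetic, not part of
.Lb libpmc .
The macro and function names are hypothetical, the assignment of each
qualifier to a bit follows Intel's published event tables and should be
treated as an assumption, and the placement of the event select in bits 7:0
and the umask in bits 15:8 reflects the architectural IA32_PERFEVTSELx
layout.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Event select code shared by all MEM_UOP_RETIRED sub-events (Event D0H). */
#define MEM_UOP_RETIRED_EVENT	0xD0u

/* Type bits: which kind of retired memory uop is counted. */
#define MEM_UOP_LOADS		0x01u
#define MEM_UOP_STORES		0x02u

/*
 * Qualifier bits.  Hypothetical names; the bit assignment is assumed
 * from Intel's event tables.
 */
#define MEM_UOP_STLB_MISS	0x10u
#define MEM_UOP_LOCK		0x20u
#define MEM_UOP_SPLIT		0x40u
#define MEM_UOP_ALL		0x80u

/* OR a type (low nibble) with a qualifier (high nibble) into one umask. */
static inline uint8_t
mem_uop_umask(uint8_t type, uint8_t qualifier)
{
	/* At least one type bit, and nothing outside 01H|02H. */
	assert(type != 0 && (type & ~0x03u) == 0);
	/* At least one qualifier bit, and nothing outside F0H. */
	assert(qualifier != 0 && (qualifier & ~0xF0u) == 0);
	return ((uint8_t)(type | qualifier));
}

/* Low 16 bits of an event select value: umask in 15:8, event in 7:0. */
static inline uint16_t
mem_uop_evtsel(uint8_t umask)
{
	return ((uint16_t)(((unsigned)umask << 8) | MEM_UOP_RETIRED_EVENT));
}
```
.Ed
.Pp
Under these assumptions, counting all retired loads combines 01H with 80H
to give umask 81H, and counting stores that missed the STLB combines 02H
with 10H to give umask 12H.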
981.Sh SEE ALSO
982.Xr pmc 3 ,
983.Xr pmc.atom 3 ,
984.Xr pmc.core 3 ,
985.Xr pmc.corei7 3 ,
986.Xr pmc.corei7uc 3 ,
987.Xr pmc.haswelluc 3 ,
988.Xr pmc.iaf 3 ,
989.Xr pmc.ivybridge 3 ,
990.Xr pmc.ivybridgexeon 3 ,
991.Xr pmc.k7 3 ,
992.Xr pmc.k8 3 ,
993.Xr pmc.sandybridge 3 ,
994.Xr pmc.sandybridgeuc 3 ,
995.Xr pmc.soft 3 ,
996.Xr pmc.tsc 3 ,
997.Xr pmc.ucf 3 ,
998.Xr pmc.westmere 3 ,
999.Xr pmc.westmereuc 3 ,
1000.Xr pmc_cpuinfo 3 ,
1001.Xr pmclog 3 ,
1002.Xr hwpmc 4
1003.Sh HISTORY
1004The
1005.Nm pmc
1006library first appeared in
1007.Fx 6.0 .
1008.Sh AUTHORS
1009.An -nosplit
1010The
1011.Lb libpmc
1012library was written by
1013.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
1014The support for the Sandy Bridge Xeon
1015microarchitecture was written by
1016.An Hiren Panchasara Aq Mt hiren.panchasara@gmail.com .
1017