.\" Copyright (c) 2012 Hiren Panchasara <hiren.panchasara@gmail.com>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd October 18, 2012
.Dt PMC.SANDYBRIDGEXEON 3
.Os
.Sh NAME
.Nm pmc.sandybridgexeon
.Nd measurement events for
.Tn Intel
.Tn Sandy Bridge Xeon
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Sandy Bridge Xeon"
CPUs contain PMCs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
These CPUs may contain up to two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Sandy Bridge Xeon PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Volume 3B: System Programming Guide, Part 2"
.%N "Order Number: 253669-043US"
.%D August 2012
.%Q "Intel Corporation"
.Re
.Ss SANDYBRIDGE XEON FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss SANDYBRIDGE XEON PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li rsp= Ns Ar value
Configure the Off-core Response bits.
.Bl -tag -width indent
.It Li REQ_DMND_DATA_RD
Counts the number of demand and DCU prefetch data reads of full and partial
cachelines as well as demand data page table entry cacheline reads.
Does not count L2 data read prefetches or instruction fetches.
.It Li REQ_DMND_RFO
Counts the number of demand and DCU prefetch reads for ownership (RFO)
requests generated by a write to data cacheline.
Does not count L2 RFO prefetches.
.It Li REQ_DMND_IFETCH
Counts the number of demand and DCU prefetch instruction cacheline reads.
Does not count L2 code read prefetches.
.It Li REQ_WB
Counts the number of writeback (modified to exclusive) transactions.
.It Li REQ_PF_DATA_RD
Counts the number of data cacheline reads generated by L2 prefetchers.
.It Li REQ_PF_RFO
Counts the number of RFO requests generated by L2 prefetchers.
.It Li REQ_PF_IFETCH
Counts the number of code reads generated by L2 prefetchers.
.It Li REQ_PF_LLC_DATA_RD
L2 prefetcher to L3 for loads.
.It Li REQ_PF_LLC_RFO
RFO requests generated by L2 prefetcher.
.It Li REQ_PF_LLC_IFETCH
L2 prefetcher to L3 for instruction fetches.
.It Li REQ_BUS_LOCKS
Bus lock and split lock requests.
.It Li REQ_STRM_ST
Streaming store requests.
.It Li REQ_OTHER
Any other request that crosses IDI, including I/O.
.It Li RES_ANY
Catch-all value for any response type.
.It Li RES_SUPPLIER_NO_SUPP
No supplier information available.
.It Li RES_SUPPLIER_LLC_HITM
M-state initial lookup stat in L3.
.It Li RES_SUPPLIER_LLC_HITE
E-state.
.It Li RES_SUPPLIER_LLC_HITS
S-state.
.It Li RES_SUPPLIER_LLC_HITF
F-state.
.It Li RES_SUPPLIER_LOCAL
Local DRAM Controller.
.It Li RES_SNOOP_SNP_NONE
No details on snoop-related information.
.It Li RES_SNOOP_SNP_NO_NEEDED
No snoop was needed to satisfy the request.
.It Li RES_SNOOP_SNP_MISS
A snoop was needed and it missed all snooped caches:
.Bl -dash -compact
.It
For LLC Hit, ReslHitl was returned by all cores.
.It
For LLC Miss, Rspl was returned by all sockets and data was returned from
DRAM.
.El
.It Li RES_SNOOP_HIT_NO_FWD
A snoop was needed and it hit in at least one snooped cache.
Hit denotes that a cache line was valid before the snoop took effect.
This includes:
.Bl -dash -compact
.It
Snoop Hit w/ Invalidation (LLC Hit, RFO)
.It
Snoop Hit, Left Shared (LLC Hit/Miss, IFetch/Data_RD)
.It
Snoop Hit w/ Invalidation and No Forward (LLC Miss, RFO Hit S)
.El
.Pp
In the LLC Miss case, data is returned from DRAM.
.It Li RES_SNOOP_HIT_FWD
A snoop was needed and data was forwarded from a remote socket.
This includes:
.Bl -dash -compact
.It
Snoop Forward Clean, Left Shared (LLC Hit/Miss, IFetch/Data_RD/RFT).
.El
.It Li RES_SNOOP_HITM
A snoop was needed and it HitM-ed in local or remote cache.
HitM denotes that a cache line was in modified state before the snoop
took effect.
This includes:
.Bl -dash -compact
.It
Snoop HitM w/ WB (LLC miss, IFetch/Data_RD)
.It
Snoop Forward Modified w/ Invalidation (LLC Hit/Miss, RFO)
.It
Snoop MtoS (LLC Hit, IFetch/Data_RD).
.El
.It Li RES_NON_DRAM
Target was non-DRAM system address.
This includes MMIO transactions.
.El
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
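.Pp
The qualifiers above are appended to an event name to form an event
specifier string.
The following C sketch assembles one.
It assumes the comma-separated specifier syntax accepted by
.Xr pmc_allocate 3
and a
.Ql +
separator between
.Dq Li rsp
keywords, neither of which is spelled out in this page;
.Fn iap_build_spec
is a hypothetical helper and not part of libpmc.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: join an event name and a list of qualifier
 * strings with commas, e.g. "EVENT,cmask=1,inv".
 * Returns the length that was (or would have been) written, in the
 * manner of snprintf(3), or -1 on a formatting error.
 */
int
iap_build_spec(char *buf, size_t len, const char *event,
    const char *const quals[], size_t nquals)
{
    size_t i, off;
    int n;

    n = snprintf(buf, len, "%s", event);
    if (n < 0)
        return (-1);
    off = (size_t)n;

    for (i = 0; i < nquals; i++) {
        /* Once the buffer is full, keep counting without writing. */
        n = snprintf(off < len ? buf + off : NULL,
            off < len ? len - off : 0, ",%s", quals[i]);
        if (n < 0)
            return (-1);
        off += (size_t)n;
    }
    return ((int)off);
}
```
.Ed
.Pp
For example, passing the event
.Li OFF_CORE_RESPONSE_0
with the qualifiers
.Li rsp=REQ_DMND_DATA_RD+RES_SUPPLIER_LOCAL
and
.Li usr
yields one comma-joined string that could then be handed to
.Xr pmc_allocate 3 .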
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
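.Pp
These qualifiers correspond to fields of the
.Tn Intel
IA32_PERFEVTSELx event select register.
The C sketch below shows that correspondence under stated assumptions:
the bit positions follow the architectural layout given in the manual
cited above, the enable and interrupt bits that the kernel manages are
left clear, and
.Vt struct iap_quals
and
.Fn iap_evtsel
are illustrative names that do not exist in libpmc.
.Bd -literal -offset indent
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the qualifiers; not a libpmc type. */
struct iap_quals {
    uint8_t event;  /* event select code, e.g. 0x0E */
    uint8_t umask;  /* unit mask, e.g. 0x01 */
    uint8_t cmask;  /* cmask= value, 0 when absent */
    bool    edge;
    bool    inv;
    bool    os;
    bool    usr;
};

/* Pack the qualifiers into an IA32_PERFEVTSELx-style value. */
uint64_t
iap_evtsel(struct iap_quals q)
{
    uint64_t v;

    /* With neither os nor usr given, count at all privilege levels. */
    if (!q.os && !q.usr)
        q.os = q.usr = true;

    v = (uint64_t)q.event | ((uint64_t)q.umask << 8);
    if (q.usr)
        v |= 1ULL << 16;            /* USR: privilege levels 1-3 */
    if (q.os)
        v |= 1ULL << 17;            /* OS: privilege level 0 */
    if (q.edge)
        v |= 1ULL << 18;            /* edge detect */
    if (q.inv)
        v |= 1ULL << 23;            /* invert the cmask comparison */
    v |= (uint64_t)q.cmask << 24;   /* counter mask threshold */
    return (v);
}
```
.Ed
.Pp
With event 0EH, umask 01H,
.Li cmask=1
and
.Li inv ,
and neither
.Li os
nor
.Li usr
given, the sketch sets both privilege bits, matching the default
described above.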
.Ss Event Specifiers (Programmable PMCs)
Sandy Bridge Xeon programmable PMCs support the following events:
.Bl -tag -width indent
.It Li LD_BLOCKS.DATA_UNKNOWN
.Pq Event 03H , Umask 01H
Blocked loads due to store buffer blocks with unknown data.
.It Li LD_BLOCKS.STORE_FORWARD
.Pq Event 03H , Umask 02H
Loads blocked by overlapping with store buffer that cannot
be forwarded.
.It Li LD_BLOCKS.NO_SR
.Pq Event 03H , Umask 08H
Number of split loads blocked due to resource not available.
.It Li LD_BLOCKS.ALL_BLOCK
.Pq Event 03H , Umask 10H
Number of cases where any load is blocked but has no
DCU miss.
.It Li MISALIGN_MEM_REF.LOADS
.Pq Event 05H , Umask 01H
Speculative cache-line split load uops dispatched to
L1D.
.It Li MISALIGN_MEM_REF.STORES
.Pq Event 05H , Umask 02H
Speculative cache-line split store-address uops
dispatched to L1D.
.It Li LD_BLOCKS_PARTIAL.ADDRESS_ALIAS
.Pq Event 07H , Umask 01H
False dependencies in MOB due to partial compare on
address.
.It Li LD_BLOCKS_PARTIAL.ALL_STALL_BLOCK
.Pq Event 07H , Umask 08H
The number of times that load operations are temporarily
blocked because of older stores, with addresses that are
not yet known.
A load operation may incur more than one
block of this type.
.It Li TLB_LOAD_MISSES.MISS_CAUSES_A_WALK
.Pq Event 08H , Umask 01H
Misses in all TLB levels that cause a page walk of any
page size.
.It Li TLB_LOAD_MISSES.WALK_COMPLETED
.Pq Event 08H , Umask 02H
Misses in all TLB levels that caused a completed page walk
of any size.
.It Li DTLB_LOAD_MISSES.WALK_DURATION
.Pq Event 08H , Umask 04H
Cycles PMH is busy with a walk.
.It Li DTLB_LOAD_MISSES.STLB_HIT
.Pq Event 08H , Umask 10H
Number of cache load STLB hits.
No page walk.
.It Li INT_MISC.RECOVERY_CYCLES
.Pq Event 0DH , Umask 03H
Cycles waiting to recover after Machine Clears or EClear.
Set Cmask = 1.
.It Li INT_MISC.RAT_STALL_CYCLES
.Pq Event 0DH , Umask 40H
Cycles RAT external stall is sent to IDQ for this thread.
.It Li UOPS_ISSUED.ANY
.Pq Event 0EH , Umask 01H
Increments each cycle by the number of uops issued by the
RAT to RS.
Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles
of this core.
.It Li FP_COMP_OPS_EXE.X87
.Pq Event 10H , Umask 01H
Counts number of X87 uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_PACKED_DOUBLE
.Pq Event 10H , Umask 10H
Counts number of SSE* double precision FP packed
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_SCALAR_SINGLE
.Pq Event 10H , Umask 20H
Counts number of SSE* single precision FP scalar
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_PACKED_SINGLE
.Pq Event 10H , Umask 40H
Counts number of SSE* single precision FP packed
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE
.Pq Event 10H , Umask 80H
Counts number of SSE* double precision FP scalar
uops executed.
.It Li SIMD_FP_256.PACKED_SINGLE
.Pq Event 11H , Umask 01H
Counts 256-bit packed single-precision floating-point
instructions.
.It Li SIMD_FP_256.PACKED_DOUBLE
.Pq Event 11H , Umask 02H
Counts 256-bit packed double-precision floating-point
instructions.
.It Li ARITH.FPU_DIV_ACTIVE
.Pq Event 14H , Umask 01H
Cycles that the divider is active, includes INT and FP.
Set 'edge = 1, cmask = 1' to count the number of
divides.
.It Li INSTS_WRITTEN_TO_IQ.INSTS
.Pq Event 17H , Umask 01H
Counts the number of instructions written into the
IQ every cycle.
.It Li L2_RQSTS.DEMAND_DATA_RD_HIT
.Pq Event 24H , Umask 01H
Demand Data Read requests that hit L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_DATA_RD
.Pq Event 24H , Umask 03H
Counts any demand and L1 HW prefetch data load
requests to L2.
.It Li L2_RQSTS.RFO_HITS
.Pq Event 24H , Umask 04H
Counts the number of store RFO requests that
hit the L2 cache.
.It Li L2_RQSTS.RFO_MISS
.Pq Event 24H , Umask 08H
Counts the number of store RFO requests that
miss the L2 cache.
.It Li L2_RQSTS.ALL_RFO
.Pq Event 24H , Umask 0CH
Counts all L2 store RFO requests.
.It Li L2_RQSTS.CODE_RD_HIT
.Pq Event 24H , Umask 10H
Number of instruction fetches that hit the L2
cache.
.It Li L2_RQSTS.CODE_RD_MISS
.Pq Event 24H , Umask 20H
Number of instruction fetches that missed the L2
cache.
.It Li L2_RQSTS.ALL_CODE_RD
.Pq Event 24H , Umask 30H
Counts all L2 code requests.
.It Li L2_RQSTS.PF_HIT
.Pq Event 24H , Umask 40H
Requests from L2 Hardware prefetcher that hit L2.
.It Li L2_RQSTS.PF_MISS
.Pq Event 24H , Umask 80H
Requests from L2 Hardware prefetcher that missed
L2.
.It Li L2_RQSTS.ALL_PF
.Pq Event 24H , Umask C0H
Any requests from L2 Hardware prefetchers.
.It Li L2_STORE_LOCK_RQSTS.MISS
.Pq Event 27H , Umask 01H
RFOs that miss cache lines.
.It Li L2_STORE_LOCK_RQSTS.HIT_E
.Pq Event 27H , Umask 04H
RFOs that hit cache lines in E state.
.It Li L2_STORE_LOCK_RQSTS.HIT_M
.Pq Event 27H , Umask 08H
RFOs that hit cache lines in M state.
.It Li L2_STORE_LOCK_RQSTS.ALL
.Pq Event 27H , Umask 0FH
RFOs that access cache lines in any state.
.It Li L2_L1D_WB_RQSTS.MISS
.Pq Event 28H , Umask 01H
Non-rejected writebacks from L1D to L2 cache lines
that missed L2.
.It Li L2_L1D_WB_RQSTS.HIT_S
.Pq Event 28H , Umask 02H
Non-rejected writebacks from L1D to L2 cache lines
in S state.
.It Li L2_L1D_WB_RQSTS.HIT_E
.Pq Event 28H , Umask 04H
Non-rejected writebacks from L1D to L2 cache lines
in E state.
.It Li L2_L1D_WB_RQSTS.HIT_M
.Pq Event 28H , Umask 08H
Non-rejected writebacks from L1D to L2 cache lines
in M state.
.It Li L2_L1D_WB_RQSTS.ALL
.Pq Event 28H , Umask 0FH
Non-rejected writebacks from L1D to L2 cache.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
This event counts requests originating from the
core that reference a cache line in the last level cache.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
This event counts each cache miss condition for
references to the last level cache.
.It Li CPU_CLK_UNHALTED.THREAD_P
.Pq Event 3CH , Umask 00H
Counts the number of thread cycles while the
thread is not in a halt state.
The thread enters the halt state when it is running the HLT
instruction.
The core frequency may change from
time to time due to power or thermal throttling.
.It Li CPU_CLK_THREAD_UNHALTED.REF_XCLK
.Pq Event 3CH , Umask 01H
Increments at the frequency of XCLK (100 MHz)
when not halted.
.It Li L1D_PEND_MISS.PENDING
.Pq Event 48H , Umask 01H
Increments by the number of outstanding L1D misses
every cycle.
Set Cmask = 1 and Edge = 1 to count occurrences.
.It Li DTLB_STORE_MISSES.MISS_CAUSES_A_WALK
.Pq Event 49H , Umask 01H
Miss in all TLB levels causes a page walk of
any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_COMPLETED
.Pq Event 49H , Umask 02H
Miss in all TLB levels causes a page walk that
completes of any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_DURATION
.Pq Event 49H , Umask 04H
Cycles PMH is busy with this walk.
.It Li DTLB_STORE_MISSES.STLB_HIT
.Pq Event 49H , Umask 10H
Store operations that miss the first TLB level
but hit the second and do not cause page walks.
.It Li LOAD_HIT_PRE.SW_PF
.Pq Event 4CH , Umask 01H
Non-SW-prefetch load dispatches that hit fill
buffer allocated for S/W prefetch.
.It Li LOAD_HIT_PRE.HW_PF
.Pq Event 4CH , Umask 02H
Non-SW-prefetch load dispatches that hit fill
buffer allocated for H/W prefetch.
.It Li HW_PRE_REQ.DL1_MISS
.Pq Event 4EH , Umask 02H
Hardware prefetch requests that miss the L1D
cache.
A request is counted each time it accesses the cache
and misses it, including if a block is applicable or
if it hits the fill buffer, for example.
.It Li L1D.REPLACEMENT
.Pq Event 51H , Umask 01H
Counts the number of lines brought into the
L1 data cache.
.It Li L1D.ALLOCATED_IN_M
.Pq Event 51H , Umask 02H
Counts the number of allocations of modified
L1D cache lines.
.It Li L1D.EVICTION
.Pq Event 51H , Umask 04H
Counts the number of modified lines evicted
from the L1 data cache due to replacement.
.It Li L1D.ALL_M_REPLACEMENT
.Pq Event 51H , Umask 08H
Cache lines in M state evicted out of L1D due
to Snoop HitM or dirty line replacement.
.It Li PARTIAL_RAT_STALLS.FLAGS_MERGE_UOP
.Pq Event 59H , Umask 0CH
Increments by the number of flags-merge uops in
flight each cycle.
Set Cmask = 1 to count cycles.
.It Li PARTIAL_RAT_STALLS.SLOW_LEA_WINDOW
.Pq Event 59H , Umask 0FH
Cycles with at least one slow LEA uop allocated.
.It Li PARTIAL_RAT_STALLS.MUL_SINGLE_UOP
.Pq Event 59H , Umask 40H
Number of Multiply packed/scalar single precision
uops allocated.
.It Li RESOURCE_STALLS2.ALL_FL_EMPTY
.Pq Event 5BH , Umask 0CH
Cycles stalled due to free list empty.
.It Li RESOURCE_STALLS2.ALL_PRF_CONTROL
.Pq Event 5BH , Umask 0FH
Cycles stalled due to control structures full for
physical registers.
.It Li RESOURCE_STALLS2.BOB_FULL
.Pq Event 5BH , Umask 40H
Cycles Allocator is stalled due to Branch Order Buffer.
.It Li RESOURCE_STALLS2.OOO_RSRC
.Pq Event 5BH , Umask 4FH
Cycles stalled due to out of order resources full.
.It Li CPL_CYCLES.RING0
.Pq Event 5CH , Umask 01H
Unhalted core cycles when the thread is in ring 0.
.It Li CPL_CYCLES.RING123
.Pq Event 5CH , Umask 02H
Unhalted core cycles when the thread is not in ring
0.
.It Li RS_EVENTS.EMPTY_CYCLES
.Pq Event 5EH , Umask 01H
Cycles the RS is empty for the thread.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD
.Pq Event 60H , Umask 01H
Offcore outstanding Demand Data Read
transactions in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO
.Pq Event 60H , Umask 04H
Offcore outstanding RFO store transactions in SQ to
uncore.
Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD
.Pq Event 60H , Umask 08H
Offcore outstanding cacheable data read
transactions in SQ to uncore.
Set Cmask = 1 to count cycles.
.It Li LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION
.Pq Event 63H , Umask 01H
Cycles in which the L1D and L2 are locked, due to a
UC lock or split lock.
.It Li LOCK_CYCLES.CACHE_LOCK_DURATION
.Pq Event 63H , Umask 02H
Cycles in which the L1D is locked.
.It Li IDQ.EMPTY
.Pq Event 79H , Umask 02H
Counts cycles the IDQ is empty.
.It Li IDQ.MITE_UOPS
.Pq Event 79H , Umask 04H
Increments each cycle by the number of uops delivered to IDQ
from MITE path.
Set Cmask = 1 to count cycles.
.It Li IDQ.DSB_UOPS
.Pq Event 79H , Umask 08H
Increments each cycle by the number of uops delivered to IDQ
from DSB path.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_DSB_UOPS
.Pq Event 79H , Umask 10H
Increments each cycle by the number of uops delivered to IDQ
when MS is busy by DSB.
Set Cmask = 1 to count cycles MS is busy.
Set Cmask = 1 and Edge = 1 to count MS activations.
.It Li IDQ.MS_MITE_UOPS
.Pq Event 79H , Umask 20H
Increments each cycle by the number of uops delivered to IDQ
when MS is busy by MITE.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_UOPS
.Pq Event 79H , Umask 30H
Increments each cycle by the number of uops delivered to IDQ
from MS by either DSB or MITE.
Set Cmask = 1 to count cycles.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
Number of Instruction Cache, Streaming Buffer and
Victim Cache Misses.
Includes UC accesses.
.It Li ITLB_MISSES.MISS_CAUSES_A_WALK
.Pq Event 85H , Umask 01H
Misses in all ITLB levels that cause page walks.
.It Li ITLB_MISSES.WALK_COMPLETED
.Pq Event 85H , Umask 02H
Misses in all ITLB levels that cause completed page
walks.
.It Li ITLB_MISSES.WALK_DURATION
.Pq Event 85H , Umask 04H
Cycles PMH is busy with a walk.
.It Li ITLB_MISSES.STLB_HIT
.Pq Event 85H , Umask 10H
Number of cache load STLB hits.
No page walk.
.It Li ILD_STALL.LCP
.Pq Event 87H , Umask 01H
Stalls caused by changing prefix length of the
instruction.
.It Li ILD_STALL.IQ_FULL
.Pq Event 87H , Umask 04H
Stall cycles due to the IQ being full.
.It Li BR_INST_EXEC.COND
.Pq Event 88H , Umask 01H
Qualify conditional near branch instructions
executed, but not necessarily retired.
.It Li BR_INST_EXEC.DIRECT_JMP
.Pq Event 88H , Umask 02H
Qualify all unconditional near branch instructions
excluding calls and indirect branches.
.It Li BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 88H , Umask 04H
Qualify executed indirect near branch instructions
that are neither calls nor returns.
.It Li BR_INST_EXEC.RETURN_NEAR
.Pq Event 88H , Umask 08H
Qualify indirect near branches that have a return
mnemonic.
.It Li BR_INST_EXEC.DIRECT_NEAR_CALL
.Pq Event 88H , Umask 10H
Qualify unconditional near call branch instructions,
excluding non call branch, executed.
.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL
.Pq Event 88H , Umask 20H
Qualify indirect near calls, including both register
and memory indirect, executed.
.It Li BR_INST_EXEC.NONTAKEN
.Pq Event 88H , Umask 40H
Qualify non-taken near branches executed.
.It Li BR_INST_EXEC.TAKEN
.Pq Event 88H , Umask 80H
Qualify taken near branches executed.
Must combine with 01H, 02H, 04H, 08H, 10H, 20H.
.It Li BR_INST_EXEC.ALL_BRANCHES
.Pq Event 88H , Umask FFH
Counts all near executed branches (not necessarily
retired).
.It Li BR_MISP_EXEC.COND
.Pq Event 89H , Umask 01H
Qualify conditional near branch instructions
mispredicted.
.It Li BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 89H , Umask 04H
Qualify mispredicted indirect near branch
instructions that are neither calls nor returns.
.It Li BR_MISP_EXEC.RETURN_NEAR
.Pq Event 89H , Umask 08H
Qualify mispredicted indirect near branches that
have a return mnemonic.
.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL
.Pq Event 89H , Umask 10H
Qualify mispredicted unconditional near call branch
instructions, excluding non call branch, executed.
.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL
.Pq Event 89H , Umask 20H
Qualify mispredicted indirect near calls, including
both register and memory indirect, executed.
.It Li BR_MISP_EXEC.NONTAKEN
.Pq Event 89H , Umask 40H
Qualify mispredicted non-taken near branches
executed.
.It Li BR_MISP_EXEC.TAKEN
.Pq Event 89H , Umask 80H
Qualify mispredicted taken near branches executed.
Must combine with 01H, 02H, 04H, 08H, 10H, 20H.
.It Li BR_MISP_EXEC.ALL_BRANCHES
.Pq Event 89H , Umask FFH
Counts all near executed branches (not necessarily
retired).
.It Li IDQ_UOPS_NOT_DELIVERED.CORE
.Pq Event 9CH , Umask 01H
Count number of non-delivered uops to RAT per
thread.
.It Li UOPS_DISPATCHED_PORT.PORT_0
.Pq Event A1H , Umask 01H
Cycles in which a uop is dispatched on port 0.
.It Li UOPS_DISPATCHED_PORT.PORT_1
.Pq Event A1H , Umask 02H
Cycles in which a uop is dispatched on port 1.
.It Li UOPS_DISPATCHED_PORT.PORT_2_LD
.Pq Event A1H , Umask 04H
Cycles in which a load uop is dispatched on port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_2_STA
.Pq Event A1H , Umask 08H
Cycles in which a store address uop is dispatched on
port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_2
.Pq Event A1H , Umask 0CH
Cycles in which a uop is dispatched on port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_3_LD
.Pq Event A1H , Umask 10H
Cycles in which a load uop is dispatched on port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_3_STA
.Pq Event A1H , Umask 20H
Cycles in which a store address uop is dispatched on
port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_3
.Pq Event A1H , Umask 30H
Cycles in which a uop is dispatched on port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_4
.Pq Event A1H , Umask 40H
Cycles in which a uop is dispatched on port 4.
.It Li UOPS_DISPATCHED_PORT.PORT_5
.Pq Event A1H , Umask 80H
Cycles in which a uop is dispatched on port 5.
.It Li RESOURCE_STALLS.ANY
.Pq Event A2H , Umask 01H
Cycles Allocation is stalled due to Resource Related
reason.
.It Li RESOURCE_STALLS.LB
.Pq Event A2H , Umask 02H
Counts the cycles of stall due to lack of load buffers.
.It Li RESOURCE_STALLS.RS
.Pq Event A2H , Umask 04H
Cycles stalled due to no eligible RS entry available.
.It Li RESOURCE_STALLS.SB
.Pq Event A2H , Umask 08H
Cycles stalled due to no store buffers available (not
including draining from sync).
.It Li RESOURCE_STALLS.ROB
.Pq Event A2H , Umask 10H
Cycles stalled due to re-order buffer full.
.It Li RESOURCE_STALLS.FCSW
.Pq Event A2H , Umask 20H
Cycles stalled due to writing the FPU control word.
.It Li RESOURCE_STALLS.MXCSR
.Pq Event A2H , Umask 40H
Cycles stalled due to the MXCSR register rename
occurring too close to a previous MXCSR rename.
.It Li RESOURCE_STALLS.OTHER
.Pq Event A2H , Umask 80H
Cycles stalled while execution was stalled due to
other resource issues.
.It Li CYCLE_ACTIVITY.CYCLES_L2_PENDING
.Pq Event A3H , Umask 01H
Cycles with pending L2 miss loads.
Set AnyThread to count per core.
.It Li CYCLE_ACTIVITY.CYCLES_L1D_PENDING
.Pq Event A3H , Umask 02H
Cycles with pending L1 cache miss loads.
Set AnyThread to count per core.
.It Li CYCLE_ACTIVITY.CYCLES_NO_DISPATCH
.Pq Event A3H , Umask 04H
Cycles of dispatch stalls.
Set AnyThread to count per core.
.It Li DSB2MITE_SWITCHES.COUNT
.Pq Event ABH , Umask 01H
Number of DSB to MITE switches.
.It Li DSB2MITE_SWITCHES.PENALTY_CYCLES
.Pq Event ABH , Umask 02H
Cycles DSB to MITE switches caused delay.
.It Li DSB_FILL.OTHER_CANCEL
.Pq Event ACH , Umask 02H
Cases of cancelling valid DSB fill not because of
exceeding way limit.
.It Li DSB_FILL.EXCEED_DSB_LINES
.Pq Event ACH , Umask 08H
DSB Fill encountered > 3 DSB lines.
.It Li DSB_FILL.ALL_CANCEL
.Pq Event ACH , Umask 0AH
Cases of cancelling valid Decode Stream Buffer
(DSB) fill not because of exceeding way limit.
.It Li ITLB.ITLB_FLUSH
.Pq Event AEH , Umask 01H
Counts the number of ITLB flushes, includes
4k/2M/4M pages.
.It Li OFFCORE_REQUESTS.DEMAND_DATA_RD
.Pq Event B0H , Umask 01H
Demand data read requests sent to uncore.
.It Li OFFCORE_REQUESTS.DEMAND_RFO
.Pq Event B0H , Umask 04H
Demand RFO read requests sent to uncore, including
regular RFOs, locks, ItoM.
.It Li OFFCORE_REQUESTS.ALL_DATA_RD
.Pq Event B0H , Umask 08H
Data read requests sent to uncore (demand and
prefetch).
.It Li UOPS_DISPATCHED.THREAD
.Pq Event B1H , Umask 01H
Counts total number of uops to be dispatched
per-thread each cycle.
Set Cmask = 1, INV = 1 to count stall cycles.
.It Li UOPS_DISPATCHED.CORE
.Pq Event B1H , Umask 02H
Counts total number of uops to be dispatched
per-core each cycle.
.It Li OFFCORE_REQUESTS_BUFFER.SQ_FULL
.Pq Event B2H , Umask 01H
Offcore requests buffer cannot take more entries
for this thread core.
.It Li AGU_BYPASS_CANCEL.COUNT
.Pq Event B6H , Umask 01H
Counts executed load operations with all the
following traits:
.Bl -enum -compact
.It
addressing of the format [base + offset],
.It
the offset is between 1 and 2047,
.It
the address specified in the base register is in one page
and the address [base + offset] is in another page.
.El
.It Li OFF_CORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Off-core Response Performance Monitoring; PMC0 only.
Requires programming MSR 01A6H.
.It Li OFF_CORE_RESPONSE_1
.Pq Event BBH , Umask 01H
Off-core Response Performance Monitoring; PMC3 only.
Requires programming MSR 01A7H.
.It Li TLB_FLUSH.DTLB_THREAD
.Pq Event BDH , Umask 01H
DTLB flush attempts of the thread-specific entries.
.It Li TLB_FLUSH.STLB_ANY
.Pq Event BDH , Umask 20H
Count number of STLB flush attempts.
.It Li L1D_BLOCKS.BANK_CONFLICT_CYCLES
.Pq Event BFH , Umask 05H
Cycles when dispatched loads are cancelled due to
L1D bank conflicts with other load ports.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
Number of instructions at retirement.
.It Li INST_RETIRED.ALL
.Pq Event C0H , Umask 01H
Precise instruction retired event with HW to reduce
effect of PEBS shadow in IP distribution.
.It Li OTHER_ASSISTS.ITLB_MISS_RETIRED
.Pq Event C1H , Umask 02H
Instructions that experienced an ITLB miss.
.It Li OTHER_ASSISTS.AVX_STORE
.Pq Event C1H , Umask 08H
Number of assists associated with 256-bit AVX
store operations.
.It Li OTHER_ASSISTS.AVX_TO_SSE
.Pq Event C1H , Umask 10H
Number of transitions from AVX-256 to legacy SSE
when penalty applicable.
.It Li OTHER_ASSISTS.SSE_TO_AVX
.Pq Event C1H , Umask 20H
Number of transitions from SSE to AVX-256 when
penalty applicable.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 01H
Counts the number of micro-ops retired.
Use cmask=1 and invert to count active cycles or stalled
cycles.
.It Li UOPS_RETIRED.RETIRE_SLOTS
.Pq Event C2H , Umask 02H
Counts the number of retirement slots used each
cycle.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
Counts the number of machine clears due to
memory order conflicts.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 04H
Counts the number of times that a program writes
to a code section.
.It Li MACHINE_CLEARS.MASKMOV
.Pq Event C3H , Umask 20H
Counts the number of executed AVX masked load
operations that refer to an illegal address range
with the mask bits set to 0.
.It Li BR_INST_RETIRED.ALL_BRANCH
.Pq Event C4H , Umask 00H
Branch instructions at retirement.
.It Li BR_INST_RETIRED.CONDITIONAL
.Pq Event C4H , Umask 01H
Counts the number of conditional branch
instructions retired.
.It Li BR_INST_RETIRED.NEAR_CALL
.Pq Event C4H , Umask 02H
Direct and indirect near call instructions retired.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 04H
Counts the number of branch instructions retired.
.It Li BR_INST_RETIRED.NEAR_RETURN
.Pq Event C4H , Umask 08H
Counts the number of near return instructions
retired.
.It Li BR_INST_RETIRED.NOT_TAKEN
.Pq Event C4H , Umask 10H
Counts the number of not taken branch instructions
retired.
.It Li BR_INST_RETIRED.NEAR_TAKEN
.Pq Event C4H , Umask 20H
Number of near taken branches retired.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask 40H
Number of far branches retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
Mispredicted branch instructions at retirement.
.It Li BR_MISP_RETIRED.CONDITIONAL
.Pq Event C5H , Umask 01H
Mispredicted conditional branch instructions retired.
.It Li BR_MISP_RETIRED.NEAR_CALL
.Pq Event C5H , Umask 02H
Direct and indirect mispredicted near call
instructions retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 04H
Mispredicted macro branch instructions retired.
.It Li BR_MISP_RETIRED.NOT_TAKEN
.Pq Event C5H , Umask 10H
Mispredicted not taken branch instructions retired.
.It Li BR_MISP_RETIRED.TAKEN
.Pq Event C5H , Umask 20H
Mispredicted taken branch instructions retired.
.It Li FP_ASSIST.X87_OUTPUT
.Pq Event CAH , Umask 02H
Number of X87 assists due to output value.
.It Li FP_ASSIST.X87_INPUT
.Pq Event CAH , Umask 04H
Number of X87 assists due to input value.
.It Li FP_ASSIST.SIMD_OUTPUT
.Pq Event CAH , Umask 08H
Number of SIMD FP assists due to output values.
.It Li FP_ASSIST.SIMD_INPUT
.Pq Event CAH , Umask 10H
Number of SIMD FP assists due to input values.
.It Li FP_ASSIST.ANY
.Pq Event CAH , Umask 1EH
Cycles with any input/output SSE* or FP assists.
.It Li ROB_MISC_EVENTS.LBR_INSERTS
.Pq Event CCH , Umask 20H
Count cases of saving new LBR records by
hardware.
.It Li MEM_TRANS_RETIRED.LOAD_LATENCY
.Pq Event CDH , Umask 01H
Sample loads with specified latency threshold.
PMC3 only.
.It Li MEM_TRANS_RETIRED.PRECISE_STORE
.Pq Event CDH , Umask 02H
Sample stores and collect precise store operation
via PEBS record.
PMC3 only.
.It Li MEM_UOP_RETIRED.LOADS
.Pq Event D0H , Umask 01H
Qualify retired memory uops that are loads.
Combine with umask 10H, 20H, 40H, 80H.
.It Li MEM_UOP_RETIRED.STORES
.Pq Event D0H , Umask 02H
Qualify retired memory uops that are stores.
Combine with umask 10H, 20H, 40H, 80H.
.It Li MEM_UOP_RETIRED.STLB_MISS
.Pq Event D0H , Umask 10H
Qualify retired memory uops with STLB miss.
Must combine with umask 01H or 02H to produce counts.
.It Li MEM_UOP_RETIRED.LOCK
.Pq Event D0H , Umask 20H
Qualify retired memory uops with lock.
Must combine with umask 01H or 02H to produce counts.
.It Li MEM_UOP_RETIRED.SPLIT
.Pq Event D0H , Umask 40H
Qualify retired memory uops with line split.
Must combine with umask 01H or 02H to produce counts.
.It Li MEM_UOP_RETIRED.ALL
.Pq Event D0H , Umask 80H
Qualify any retired memory uops.
Must combine with umask 01H or 02H to produce counts.
.It Li MEM_LOAD_UOPS_RETIRED.L1_HIT
.Pq Event D1H , Umask 01H
Retired load uops with L1 cache hits as data
sources.
.It Li MEM_LOAD_UOPS_RETIRED.L2_HIT
.Pq Event D1H , Umask 02H
Retired load uops with L2 cache hits as data
sources.
.It Li MEM_LOAD_UOPS_RETIRED.LLC_HIT
.Pq Event D1H , Umask 04H
Retired load uops whose data source was an LLC hit
with no snoop required.
.It Li MEM_LOAD_UOPS_RETIRED.LLC_MISS
.Pq Event D1H , Umask 20H
Retired load uops whose data source missed the LLC
(excluding unknown data sources).
.It Li MEM_LOAD_UOPS_RETIRED.HIT_LFB
.Pq Event D1H , Umask 40H
Retired load uops that missed L1 but hit the line fill buffer
because a preceding miss to the same cache line
had not yet returned its data.
.It Li MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
.Pq Event D4H , Umask 02H
Retired load uops for which the data source
that serviced the load is unknown.
.It Li BACLEARS.ANY
.Pq Event E6H , Umask 01H
Counts the number of times the front end is
resteered, mainly when the BPU cannot provide a
correct prediction and this is corrected by other
branch handling mechanisms at the front end.
.It Li L2_TRANS.DEMAND_DATA_RD
.Pq Event F0H , Umask 01H
Demand Data Read requests that access L2 cache.
.It Li L2_TRANS.RFO
.Pq Event F0H , Umask 02H
RFO requests that access L2 cache.
.It Li L2_TRANS.CODE_RD
.Pq Event F0H , Umask 04H
L2 cache accesses when fetching instructions.
.It Li L2_TRANS.ALL_PF
.Pq Event F0H , Umask 08H
L2 or LLC HW prefetches that access L2 cache.
.It Li L2_TRANS.L1D_WB
.Pq Event F0H , Umask 10H
L1D writebacks that access L2 cache.
.It Li L2_TRANS.L2_FILL
.Pq Event F0H , Umask 20H
L2 fill requests that access L2 cache.
.It Li L2_TRANS.L2_WB
.Pq Event F0H , Umask 40H
L2 writebacks that access L2 cache.
.It Li L2_TRANS.ALL_REQUESTS
.Pq Event F0H , Umask 80H
Transactions accessing L2 pipe.
.It Li L2_LINES_IN.I
.Pq Event F1H , Umask 01H
L2 cache lines in I state filling L2.
.It Li L2_LINES_IN.S
.Pq Event F1H , Umask 02H
L2 cache lines in S state filling L2.
.It Li L2_LINES_IN.E
.Pq Event F1H , Umask 04H
L2 cache lines in E state filling L2.
.It Li L2_LINES_IN.ALL
.Pq Event F1H , Umask 07H
L2 cache lines filling L2.
.It Li L2_LINES_OUT.DEMAND_CLEAN
.Pq Event F2H , Umask 01H
Clean L2 cache lines evicted by demand.
.It Li L2_LINES_OUT.DEMAND_DIRTY
.Pq Event F2H , Umask 02H
Dirty L2 cache lines evicted by demand.
.It Li L2_LINES_OUT.PF_CLEAN
.Pq Event F2H , Umask 04H
Clean L2 cache lines evicted by L2 prefetch.
.It Li L2_LINES_OUT.PF_DIRTY
.Pq Event F2H , Umask 08H
Dirty L2 cache lines evicted by L2 prefetch.
.It Li L2_LINES_OUT.DIRTY_ALL
.Pq Event F2H , Umask 0AH
Dirty L2 cache lines evicted from the L2,
by demand or by L2 prefetch.
.It Li SQ_MISC.SPLIT_LOCK
.Pq Event F4H , Umask 10H
Split locks in SQ.
.El
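.Sh EXAMPLES
Several entries above give a umask that is the bitwise OR of other
umasks for the same event, or ask for umasks to be combined.
The following C fragment is a minimal sketch of that arithmetic.
It is an illustration only and not part of the library:
the helper name
.Fn evtsel_low16
is hypothetical, and the bit layout it uses (event code in bits 0-7,
umask in bits 8-15) is assumed from the low 16 bits of an
IA32_PERFEVTSELx register.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Umask bits copied from the event list in this manual page. */
enum {
	L2_LINES_IN_I = 0x01,
	L2_LINES_IN_S = 0x02,
	L2_LINES_IN_E = 0x04,
	L2_LINES_OUT_DEMAND_DIRTY = 0x02,
	L2_LINES_OUT_PF_DIRTY = 0x08,
	MEM_UOP_RETIRED_STORES = 0x02
};

/*
 * Hypothetical helper, not a libpmc function: put the event code in
 * bits 0-7 and the umask in bits 8-15 of the returned value.
 */
uint16_t
evtsel_low16(uint8_t event, uint8_t umask)
{
	return ((uint16_t)(((unsigned)umask << 8) | event));
}

/* The combined umasks in the list equal the OR of their parts. */
static_assert((L2_LINES_IN_I | L2_LINES_IN_S | L2_LINES_IN_E) == 0x07,
    "L2_LINES_IN.ALL");
static_assert((L2_LINES_OUT_DEMAND_DIRTY | L2_LINES_OUT_PF_DIRTY) == 0x0A,
    "L2_LINES_OUT.DIRTY_ALL");
```
.Ed
.Pp
Under these assumptions,
.Fn evtsel_low16 0xF1 0x07
yields 0x07F1 for L2_LINES_IN.ALL, and combining the
MEM_UOP_RETIRED.STORES umask with the 10H qualifier gives umask 12H
for event D0H.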
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.corei7 3 ,
.Xr pmc.corei7uc 3 ,
.Xr pmc.haswelluc 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.ivybridge 3 ,
.Xr pmc.ivybridgexeon 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.sandybridge 3 ,
.Xr pmc.sandybridgeuc 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc.ucf 3 ,
.Xr pmc.westmere 3 ,
.Xr pmc.westmereuc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
.An -nosplit
The
.Lb libpmc
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
The support for the Sandy Bridge Xeon
microarchitecture was written by
.An Hiren Panchasara Aq Mt hiren.panchasara@gmail.com .