.\" Copyright (c) 2012 Hiren Panchasara <hiren.panchasara@gmail.com>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd October 18, 2012
.Dt PMC.SANDYBRIDGEXEON 3
.Os
.Sh NAME
.Nm pmc.sandybridgexeon
.Nd measurement events for
.Tn Intel
.Tn Sandy Bridge Xeon
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Sandy Bridge Xeon"
CPUs contain PMCs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
These CPUs may contain up to two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Sandy Bridge Xeon PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Volume 3B: System Programming Guide, Part 2"
.%N "Order Number: 253669-043US"
.%D August 2012
.%Q "Intel Corporation"
.Re
.Ss SANDYBRIDGE XEON FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss SANDYBRIDGE XEON PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li rsp= Ns Ar value
Configure the Off-core Response bits.
.Bl -tag -width indent
.It Li REQ_DMND_DATA_RD
Counts the number of demand and DCU prefetch data reads of full and partial
cachelines as well as demand data page table entry cacheline reads. Does not
count L2 data read prefetches or instruction fetches.
.It Li REQ_DMND_RFO
Counts the number of demand and DCU prefetch reads for ownership (RFO)
requests generated by a write to data cacheline. Does not count L2 RFO
prefetches.
.It Li REQ_DMND_IFETCH
Counts the number of demand and DCU prefetch instruction cacheline reads.
Does not count L2 code read prefetches.
.It Li REQ_WB
Counts the number of writeback (modified to exclusive) transactions.
.It Li REQ_PF_DATA_RD
Counts the number of data cacheline reads generated by L2 prefetchers.
.It Li REQ_PF_RFO
Counts the number of RFO requests generated by L2 prefetchers.
.It Li REQ_PF_IFETCH
Counts the number of code reads generated by L2 prefetchers.
.It Li REQ_PF_LLC_DATA_RD
L2 prefetcher to L3 for loads.
.It Li REQ_PF_LLC_RFO
RFO requests generated by L2 prefetcher.
.It Li REQ_PF_LLC_IFETCH
L2 prefetcher to L3 for instruction fetches.
.It Li REQ_BUS_LOCKS
Bus lock and split lock requests.
.It Li REQ_STRM_ST
Streaming store requests.
.It Li REQ_OTHER
Any other request that crosses IDI, including I/O.
.It Li RES_ANY
Catch all value for any response types.
.It Li RES_SUPPLIER_NO_SUPP
No Supplier Information available.
.It Li RES_SUPPLIER_LLC_HITM
M-state initial lookup stat in L3.
.It Li RES_SUPPLIER_LLC_HITE
E-state.
.It Li RES_SUPPLIER_LLC_HITS
S-state.
.It Li RES_SUPPLIER_LLC_HITF
F-state.
.It Li RES_SUPPLIER_LOCAL
Local DRAM Controller.
.It Li RES_SNOOP_SNP_NONE
No details on snoop-related information.
.It Li RES_SNOOP_SNP_NO_NEEDED
No snoop was needed to satisfy the request.
.It Li RES_SNOOP_SNP_MISS
A snoop was needed and it missed all snooped caches:
-For LLC Hit, ReslHitl was returned by all cores
-For LLC Miss, Rspl was returned by all sockets and data was returned from
DRAM.
.It Li RES_SNOOP_HIT_NO_FWD
A snoop was needed and it hits in at least one snooped cache. Hit denotes a
cache-line was valid before snoop effect.
This includes:
-Snoop Hit w/ Invalidation (LLC Hit, RFO)
-Snoop Hit, Left Shared (LLC Hit/Miss, IFetch/Data_RD)
-Snoop Hit w/ Invalidation and No Forward (LLC Miss, RFO Hit S)
In the LLC Miss case, data is returned from DRAM.
.It Li RES_SNOOP_HIT_FWD
A snoop was needed and data was forwarded from a remote socket.
This includes:
-Snoop Forward Clean, Left Shared (LLC Hit/Miss, IFetch/Data_RD/RFT).
.It Li RES_SNOOP_HITM
A snoop was needed and it HitM-ed in local or remote cache. HitM denotes a
cache-line was in modified state before effect as a result of snoop. This
includes:
-Snoop HitM w/ WB (LLC miss, IFetch/Data_RD)
-Snoop Forward Modified w/ Invalidation (LLC Hit/Miss, RFO)
-Snoop MtoS (LLC Hit, IFetch/Data_RD).
.It Li RES_NON_DRAM
Target was non-DRAM system address. This includes MMIO transactions.
.El
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers are specified, the default is to enable both.
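.Pp
The qualifiers above correspond to fields of the architectural
IA32_PERFEVTSELx event-select register described in the Intel manual
referenced earlier.
The following C fragment is a minimal sketch of that correspondence,
packing an event code, unit mask and qualifiers into one value.
The names
.Li evsel_quals
and
.Li evsel_encode
are hypothetical and are not part of the PMC library; the bit positions
are assumed from the Intel documentation, and the fragment does not claim
to mirror what
.Xr hwpmc 4
does internally.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the qualifiers described above. */
struct evsel_quals {
	uint8_t	event;	/* event code, e.g. 0x0E */
	uint8_t	umask;	/* unit mask, e.g. 0x01 */
	uint8_t	cmask;	/* "cmask=value"; 0 means no threshold */
	bool	edge;	/* "edge" qualifier */
	bool	inv;	/* "inv" qualifier */
	bool	os;	/* "os": privilege level 0 */
	bool	usr;	/* "usr": privilege levels 1, 2 or 3 */
};

/*
 * Pack the qualifiers using the assumed IA32_PERFEVTSELx layout.
 * Illustration only: the enable and interrupt bits are left clear.
 */
static uint32_t
evsel_encode(const struct evsel_quals *q)
{
	uint32_t v;
	bool os, usr;

	assert(q != NULL);
	os = q->os;
	usr = q->usr;
	/* With neither "os" nor "usr" given, count in both modes. */
	if (!os && !usr)
		os = usr = true;

	v = (uint32_t)q->event;			/* bits 7:0 */
	v |= (uint32_t)q->umask << 8;		/* bits 15:8 */
	if (usr)
		v |= UINT32_C(1) << 16;		/* USR */
	if (os)
		v |= UINT32_C(1) << 17;		/* OS */
	if (q->edge)
		v |= UINT32_C(1) << 18;		/* edge detect */
	if (q->inv)
		v |= UINT32_C(1) << 23;		/* invert cmask compare */
	v |= (uint32_t)q->cmask << 24;		/* bits 31:24 */
	return (v);
}
```
.Ed
.Pp
Under these assumptions, event 0EH with unit mask 01H,
.Dq Li cmask=1
and
.Dq Li inv ,
with neither
.Dq Li os
nor
.Dq Li usr
given, would encode as 0183010EH.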
.Ss Event Specifiers (Programmable PMCs)
Sandy Bridge Xeon programmable PMCs support the following events:
.Bl -tag -width indent
.It Li LD_BLOCKS.DATA_UNKNOWN
.Pq Event 03H , Umask 01H
Blocked loads due to store buffer blocks with unknown data.
.It Li LD_BLOCKS.STORE_FORWARD
.Pq Event 03H , Umask 02H
Loads blocked by overlapping with store buffer that cannot
be forwarded.
.It Li LD_BLOCKS.NO_SR
.Pq Event 03H , Umask 08H
Number of split loads blocked due to resource not available.
.It Li LD_BLOCKS.ALL_BLOCK
.Pq Event 03H , Umask 10H
Number of cases where any load is blocked but has no
DCU miss.
.It Li MISALIGN_MEM_REF.LOADS
.Pq Event 05H , Umask 01H
Speculative cache-line split load uops dispatched to
L1D.
.It Li MISALIGN_MEM_REF.STORES
.Pq Event 05H , Umask 02H
Speculative cache-line split store-address uops
dispatched to L1D.
.It Li LD_BLOCKS_PARTIAL.ADDRESS_ALIAS
.Pq Event 07H , Umask 01H
False dependencies in MOB due to partial compare on
address.
.It Li LD_BLOCKS_PARTIAL.ALL_STALL_BLOCK
.Pq Event 07H , Umask 08H
The number of times that load operations are temporarily
blocked because of older stores, with addresses that are
not yet known. A load operation may incur more than one
block of this type.
.It Li TLB_LOAD_MISSES.MISS_CAUSES_A_WALK
.Pq Event 08H , Umask 01H
Misses in all TLB levels that cause a page walk of any
page size.
.It Li TLB_LOAD_MISSES.WALK_COMPLETED
.Pq Event 08H , Umask 02H
Misses in all TLB levels that caused a completed page walk
of any size.
.It Li DTLB_LOAD_MISSES.WALK_DURATION
.Pq Event 08H , Umask 04H
Cycles PMH is busy with a walk.
.It Li DTLB_LOAD_MISSES.STLB_HIT
.Pq Event 08H , Umask 10H
Number of cache load STLB hits. No page walk.
.It Li INT_MISC.RECOVERY_CYCLES
.Pq Event 0DH , Umask 03H
Cycles waiting to recover after Machine Clears or EClear.
Set Cmask = 1.
.It Li INT_MISC.RAT_STALL_CYCLES
.Pq Event 0DH , Umask 40H
Cycles RAT external stall is sent to IDQ for this thread.
.It Li UOPS_ISSUED.ANY
.Pq Event 0EH , Umask 01H
Increments each cycle by the number of uops issued by the
RAT to RS.
Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles
of this core.
.It Li FP_COMP_OPS_EXE.X87
.Pq Event 10H , Umask 01H
Counts number of X87 uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_PACKED_DOUBLE
.Pq Event 10H , Umask 10H
Counts number of SSE* double precision FP packed
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_SCALAR_SINGLE
.Pq Event 10H , Umask 20H
Counts number of SSE* single precision FP scalar
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_PACKED_SINGLE
.Pq Event 10H , Umask 40H
Counts number of SSE* single precision FP packed
uops executed.
.It Li FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE
.Pq Event 10H , Umask 80H
Counts number of SSE* double precision FP scalar
uops executed.
.It Li SIMD_FP_256.PACKED_SINGLE
.Pq Event 11H , Umask 01H
Counts 256-bit packed single-precision floating-point
instructions.
.It Li SIMD_FP_256.PACKED_DOUBLE
.Pq Event 11H , Umask 02H
Counts 256-bit packed double-precision floating-point
instructions.
.It Li ARITH.FPU_DIV_ACTIVE
.Pq Event 14H , Umask 01H
Cycles that the divider is active, includes INT and FP.
Set Edge = 1, Cmask = 1 to count the number of
divides.
.It Li INSTS_WRITTEN_TO_IQ.INSTS
.Pq Event 17H , Umask 01H
Counts the number of instructions written into the
IQ every cycle.
.It Li L2_RQSTS.DEMAND_DATA_RD_HIT
.Pq Event 24H , Umask 01H
Demand Data Read requests that hit L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_DATA_RD
.Pq Event 24H , Umask 03H
Counts any demand and L1 HW prefetch data load
requests to L2.
.It Li L2_RQSTS.RFO_HITS
.Pq Event 24H , Umask 04H
Counts the number of store RFO requests that
hit the L2 cache.
.It Li L2_RQSTS.RFO_MISS
.Pq Event 24H , Umask 08H
Counts the number of store RFO requests that
miss the L2 cache.
.It Li L2_RQSTS.ALL_RFO
.Pq Event 24H , Umask 0CH
Counts all L2 store RFO requests.
.It Li L2_RQSTS.CODE_RD_HIT
.Pq Event 24H , Umask 10H
Number of instruction fetches that hit the L2
cache.
.It Li L2_RQSTS.CODE_RD_MISS
.Pq Event 24H , Umask 20H
Number of instruction fetches that missed the L2
cache.
.It Li L2_RQSTS.ALL_CODE_RD
.Pq Event 24H , Umask 30H
Counts all L2 code requests.
.It Li L2_RQSTS.PF_HIT
.Pq Event 24H , Umask 40H
Requests from L2 Hardware prefetcher that hit L2.
.It Li L2_RQSTS.PF_MISS
.Pq Event 24H , Umask 80H
Requests from L2 Hardware prefetcher that missed
L2.
.It Li L2_RQSTS.ALL_PF
.Pq Event 24H , Umask C0H
Any requests from L2 Hardware prefetchers.
.It Li L2_STORE_LOCK_RQSTS.MISS
.Pq Event 27H , Umask 01H
RFOs that miss cache lines.
.It Li L2_STORE_LOCK_RQSTS.HIT_E
.Pq Event 27H , Umask 04H
RFOs that hit cache lines in E state.
.It Li L2_STORE_LOCK_RQSTS.HIT_M
.Pq Event 27H , Umask 08H
RFOs that hit cache lines in M state.
.It Li L2_STORE_LOCK_RQSTS.ALL
.Pq Event 27H , Umask 0FH
RFOs that access cache lines in any state.
.It Li L2_L1D_WB_RQSTS.MISS
.Pq Event 28H , Umask 01H
Not rejected writebacks from L1D to L2 cache lines
that missed L2.
.It Li L2_L1D_WB_RQSTS.HIT_S
.Pq Event 28H , Umask 02H
Not rejected writebacks from L1D to L2 cache lines
in S state.
.It Li L2_L1D_WB_RQSTS.HIT_E
.Pq Event 28H , Umask 04H
Not rejected writebacks from L1D to L2 cache lines
in E state.
.It Li L2_L1D_WB_RQSTS.HIT_M
.Pq Event 28H , Umask 08H
Not rejected writebacks from L1D to L2 cache lines
in M state.
.It Li L2_L1D_WB_RQSTS.ALL
.Pq Event 28H , Umask 0FH
Not rejected writebacks from L1D to L2 cache.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
This event counts requests originating from the
core that reference
a cache line in the last level cache.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
This event counts each cache miss condition for
references to the last level cache.
.It Li CPU_CLK_UNHALTED.THREAD_P
.Pq Event 3CH , Umask 00H
Counts the number of thread cycles while the
thread is not in a halt state. The thread enters
the halt state when it is running the HLT
instruction. The core frequency may change from
time to time due to power or thermal throttling.
.It Li CPU_CLK_THREAD_UNHALTED.REF_XCLK
.Pq Event 3CH , Umask 01H
Increments at the frequency of XCLK (100 MHz)
when not halted.
.It Li L1D_PEND_MISS.PENDING
.Pq Event 48H , Umask 01H
Increments the number of outstanding L1D misses
every cycle.
Set Cmask = 1 and Edge = 1 to count occurrences.
.It Li DTLB_STORE_MISSES.MISS_CAUSES_A_WALK
.Pq Event 49H , Umask 01H
Miss in all TLB levels causes a page walk of
any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_COMPLETED
.Pq Event 49H , Umask 02H
Miss in all TLB levels causes a page walk that
completes of any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_DURATION
.Pq Event 49H , Umask 04H
Cycles PMH is busy with this walk.
.It Li DTLB_STORE_MISSES.STLB_HIT
.Pq Event 49H , Umask 10H
Store operations that miss the first TLB level
but hit the second and do not cause page walks.
.It Li LOAD_HIT_PRE.SW_PF
.Pq Event 4CH , Umask 01H
Not SW-prefetch load dispatches that hit fill
buffer allocated for S/W prefetch.
.It Li LOAD_HIT_PRE.HW_PF
.Pq Event 4CH , Umask 02H
Not SW-prefetch load dispatches that hit fill
buffer allocated for H/W prefetch.
.It Li HW_PRE_REQ.DL1_MISS
.Pq Event 4EH , Umask 02H
Hardware Prefetch requests that miss the L1D
cache.
A request is counted each time
it accesses the cache and misses, including when
a block is applicable or when it hits the fill
buffer, for example.
.It Li L1D.REPLACEMENT
.Pq Event 51H , Umask 01H
Counts the number of lines brought into the
L1 data cache.
.It Li L1D.ALLOCATED_IN_M
.Pq Event 51H , Umask 02H
Counts the number of allocations of modified
L1D cache lines.
.It Li L1D.EVICTION
.Pq Event 51H , Umask 04H
Counts the number of modified lines evicted
from the L1 data cache due to replacement.
.It Li L1D.ALL_M_REPLACEMENT
.Pq Event 51H , Umask 08H
Cache lines in M state evicted out of L1D due
to Snoop HitM or dirty line replacement.
.It Li PARTIAL_RAT_STALLS.FLAGS_MERGE_UOP
.Pq Event 59H , Umask 0CH
Increments the number of flags-merge uops in
flight each cycle.
Set Cmask = 1 to count cycles.
.It Li PARTIAL_RAT_STALLS.SLOW_LEA_WINDOW
.Pq Event 59H , Umask 0FH
Cycles with at least one slow LEA uop allocated.
.It Li PARTIAL_RAT_STALLS.MUL_SINGLE_UOP
.Pq Event 59H , Umask 40H
Number of Multiply packed/scalar single precision
uops allocated.
.It Li RESOURCE_STALLS2.ALL_FL_EMPTY
.Pq Event 5BH , Umask 0CH
Cycles stalled due to free list empty.
.It Li RESOURCE_STALLS2.ALL_PRF_CONTROL
.Pq Event 5BH , Umask 0FH
Cycles stalled due to control structures full for
physical registers.
.It Li RESOURCE_STALLS2.BOB_FULL
.Pq Event 5BH , Umask 40H
Cycles Allocator is stalled due to Branch Order Buffer.
.It Li RESOURCE_STALLS2.OOO_RSRC
.Pq Event 5BH , Umask 4FH
Cycles stalled due to out of order resources full.
.It Li CPL_CYCLES.RING0
.Pq Event 5CH , Umask 01H
Unhalted core cycles when the thread is in ring 0.
.It Li CPL_CYCLES.RING123
.Pq Event 5CH , Umask 02H
Unhalted core cycles when the thread is not in ring
0.
.It Li RS_EVENTS.EMPTY_CYCLES
.Pq Event 5EH , Umask 01H
Cycles the RS is empty for the thread.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD
.Pq Event 60H , Umask 01H
Offcore outstanding Demand Data Read
transactions in SQ to uncore. Set Cmask = 1 to count
cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO
.Pq Event 60H , Umask 04H
Offcore outstanding RFO store transactions in SQ to
uncore. Set Cmask = 1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD
.Pq Event 60H , Umask 08H
Offcore outstanding cacheable data read
transactions in SQ to uncore. Set Cmask = 1 to count
cycles.
.It Li LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION
.Pq Event 63H , Umask 01H
Cycles in which the L1D and L2 are locked, due to a
UC lock or split lock.
.It Li LOCK_CYCLES.CACHE_LOCK_DURATION
.Pq Event 63H , Umask 02H
Cycles in which the L1D is locked.
.It Li IDQ.EMPTY
.Pq Event 79H , Umask 02H
Counts cycles the IDQ is empty.
.It Li IDQ.MITE_UOPS
.Pq Event 79H , Umask 04H
Increments each cycle by the number of uops delivered to IDQ
from MITE path.
Set Cmask = 1 to count cycles.
.It Li IDQ.DSB_UOPS
.Pq Event 79H , Umask 08H
Increments each cycle by the number of uops delivered to IDQ
from DSB path.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_DSB_UOPS
.Pq Event 79H , Umask 10H
Increments each cycle by the number of uops delivered to IDQ
when MS busy by DSB. Set Cmask = 1 to count
cycles MS is busy. Set Cmask = 1 and Edge = 1 to
count MS activations.
.It Li IDQ.MS_MITE_UOPS
.Pq Event 79H , Umask 20H
Increments each cycle by the number of uops delivered to IDQ
when MS is busy by MITE. Set Cmask = 1 to count
cycles.
.It Li IDQ.MS_UOPS
.Pq Event 79H , Umask 30H
Increments each cycle by the number of uops delivered to IDQ
from MS by either DSB or MITE. Set Cmask = 1 to
count cycles.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
Number of Instruction Cache, Streaming Buffer and
Victim Cache Misses. Includes UC accesses.
.It Li ITLB_MISSES.MISS_CAUSES_A_WALK
.Pq Event 85H , Umask 01H
Misses in all ITLB levels that cause page walks.
.It Li ITLB_MISSES.WALK_COMPLETED
.Pq Event 85H , Umask 02H
Misses in all ITLB levels that cause completed page
walks.
.It Li ITLB_MISSES.WALK_DURATION
.Pq Event 85H , Umask 04H
Cycles PMH is busy with a walk.
.It Li ITLB_MISSES.STLB_HIT
.Pq Event 85H , Umask 10H
Number of cache load STLB hits. No page walk.
.It Li ILD_STALL.LCP
.Pq Event 87H , Umask 01H
Stalls caused by changing prefix length of the
instruction.
.It Li ILD_STALL.IQ_FULL
.Pq Event 87H , Umask 04H
Stall cycles because the IQ is full.
.It Li BR_INST_EXEC.NONTAKEN_COND
.Pq Event 88H , Umask 41H
Count conditional near branch instructions that were executed (but not
necessarily retired) and not taken.
.It Li BR_INST_EXEC.TAKEN_COND
.Pq Event 88H , Umask 81H
Count conditional near branch instructions that were executed (but not
necessarily retired) and taken.
.It Li BR_INST_EXEC.DIRECT_JMP
.Pq Event 88H , Umask 82H
Count all unconditional near branch instructions excluding calls and
indirect branches.
.It Li BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 88H , Umask 84H
Count executed indirect near branch instructions that are neither calls nor
returns.
.It Li BR_INST_EXEC.RETURN_NEAR
.Pq Event 88H , Umask 88H
Count indirect near branches that have a return mnemonic.
.It Li BR_INST_EXEC.DIRECT_NEAR_CALL
.Pq Event 88H , Umask 90H
Count unconditional near call branch instructions, excluding non call
branch, executed.
.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL
.Pq Event 88H , Umask A0H
Count indirect near calls, including both register and memory indirect,
executed.
.It Li BR_INST_EXEC.ALL_BRANCHES
.Pq Event 88H , Umask FFH
Counts all near executed branches (not necessarily retired).
.It Li BR_MISP_EXEC.NONTAKEN_COND
.Pq Event 89H , Umask 41H
Count conditional near branch instructions mispredicted as nontaken.
.It Li BR_MISP_EXEC.TAKEN_COND
.Pq Event 89H , Umask 81H
Count conditional near branch instructions mispredicted as taken.
.It Li BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 89H , Umask 84H
Count mispredicted indirect near branch instructions that are neither calls
nor returns.
.It Li BR_MISP_EXEC.RETURN_NEAR
.Pq Event 89H , Umask 88H
Count mispredicted indirect near branches that have a return mnemonic.
.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL
.Pq Event 89H , Umask 90H
Count mispredicted unconditional near call branch instructions, excluding
non call branch, executed.
.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL
.Pq Event 89H , Umask A0H
Count mispredicted indirect near calls, including both register and memory
indirect, executed.
.It Li BR_MISP_EXEC.ALL_BRANCHES
.Pq Event 89H , Umask FFH
Counts all mispredicted near executed branches (not necessarily retired).
.It Li IDQ_UOPS_NOT_DELIVERED.CORE
.Pq Event 9CH , Umask 01H
Count number of non-delivered uops to RAT per
thread.
.It Li UOPS_DISPATCHED_PORT.PORT_0
.Pq Event A1H , Umask 01H
Cycles in which a uop is dispatched on port 0.
.It Li UOPS_DISPATCHED_PORT.PORT_1
.Pq Event A1H , Umask 02H
Cycles in which a uop is dispatched on port 1.
.It Li UOPS_DISPATCHED_PORT.PORT_2_LD
.Pq Event A1H , Umask 04H
Cycles in which a load uop is dispatched on port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_2_STA
.Pq Event A1H , Umask 08H
Cycles in which a store address uop is dispatched on
port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_2
.Pq Event A1H , Umask 0CH
Cycles in which a uop is dispatched on port 2.
.It Li UOPS_DISPATCHED_PORT.PORT_3_LD
.Pq Event A1H , Umask 10H
Cycles in which a load uop is dispatched on port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_3_STA
.Pq Event A1H , Umask 20H
Cycles in which a store address uop is dispatched on
port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_3
.Pq Event A1H , Umask 30H
Cycles in which a uop is dispatched on port 3.
.It Li UOPS_DISPATCHED_PORT.PORT_4
.Pq Event A1H , Umask 40H
Cycles in which a uop is dispatched on port 4.
.It Li UOPS_DISPATCHED_PORT.PORT_5
.Pq Event A1H , Umask 80H
Cycles in which a uop is dispatched on port 5.
.It Li RESOURCE_STALLS.ANY
.Pq Event A2H , Umask 01H
Cycles Allocation is stalled due to Resource Related
reason.
.It Li RESOURCE_STALLS.LB
.Pq Event A2H , Umask 02H
Counts the cycles of stall due to lack of load buffers.
.It Li RESOURCE_STALLS.RS
.Pq Event A2H , Umask 04H
Cycles stalled due to no eligible RS entry available.
.It Li RESOURCE_STALLS.SB
.Pq Event A2H , Umask 08H
Cycles stalled due to no store buffers available (not
including draining from sync).
.It Li RESOURCE_STALLS.ROB
.Pq Event A2H , Umask 10H
Cycles stalled due to re-order buffer full.
.It Li RESOURCE_STALLS.FCSW
.Pq Event A2H , Umask 20H
Cycles stalled due to writing the FPU control word.
.It Li RESOURCE_STALLS.MXCSR
.Pq Event A2H , Umask 40H
Cycles stalled due to the MXCSR register rename
occurring too close to a previous MXCSR rename.
.It Li RESOURCE_STALLS.OTHER
.Pq Event A2H , Umask 80H
Cycles stalled while execution was stalled due to
other resource issues.
.It Li CYCLE_ACTIVITY.CYCLES_L2_PENDING
.Pq Event A3H , Umask 01H
Cycles with pending L2 miss loads. Set AnyThread
to count per core.
.It Li CYCLE_ACTIVITY.CYCLES_L1D_PENDING
.Pq Event A3H , Umask 02H
Cycles with pending L1 cache miss loads. Set
AnyThread to count per core.
.It Li CYCLE_ACTIVITY.CYCLES_NO_DISPATCH
.Pq Event A3H , Umask 04H
Cycles of dispatch stalls. Set AnyThread to count per
core.
.It Li DSB2MITE_SWITCHES.COUNT
.Pq Event ABH , Umask 01H
Number of DSB to MITE switches.
.It Li DSB2MITE_SWITCHES.PENALTY_CYCLES
.Pq Event ABH , Umask 02H
Cycles DSB to MITE switches caused delay.
.It Li DSB_FILL.OTHER_CANCEL
.Pq Event ACH , Umask 02H
Cases of cancelling valid DSB fill not because of
exceeding way limit.
.It Li DSB_FILL.EXCEED_DSB_LINES
.Pq Event ACH , Umask 08H
DSB Fill encountered > 3 DSB lines.
.It Li DSB_FILL.ALL_CANCEL
.Pq Event ACH , Umask 0AH
Cases of cancelling valid Decode Stream Buffer
(DSB) fill not because of exceeding way limit.
.It Li ITLB.ITLB_FLUSH
.Pq Event AEH , Umask 01H
Counts the number of ITLB flushes, includes
4k/2M/4M pages.
.It Li OFFCORE_REQUESTS.DEMAND_DATA_RD
.Pq Event B0H , Umask 01H
Demand data read requests sent to uncore.
.It Li OFFCORE_REQUESTS.DEMAND_RFO
.Pq Event B0H , Umask 04H
Demand RFO read requests sent to uncore, including
regular RFOs, locks, ItoM.
.It Li OFFCORE_REQUESTS.ALL_DATA_RD
.Pq Event B0H , Umask 08H
Data read requests sent to uncore (demand and
prefetch).
.It Li UOPS_DISPATCHED.THREAD
.Pq Event B1H , Umask 01H
Counts total number of uops to be dispatched
per-thread each cycle. Set Cmask = 1, Inv = 1 to count
stall cycles.
.It Li UOPS_DISPATCHED.CORE
.Pq Event B1H , Umask 02H
Counts total number of uops to be dispatched
per-core each cycle.
.It Li OFFCORE_REQUESTS_BUFFER.SQ_FULL
.Pq Event B2H , Umask 01H
Offcore requests buffer cannot take more entries
for this thread core.
.It Li AGU_BYPASS_CANCEL.COUNT
.Pq Event B6H , Umask 01H
Counts executed load operations with all the
following traits: 1. addressing of the format [base +
offset], 2. the offset is between 1 and 2047, 3. the
address specified in the base register is in one page
and the address [base+offset] is in another page.
.It Li OFF_CORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Off-core Response Performance
Monitoring; PMC0 only. Requires programming MSR 01A6H.
.It Li OFF_CORE_RESPONSE_1
.Pq Event BBH , Umask 01H
Off-core Response Performance
Monitoring; PMC3 only. Requires programming MSR 01A7H.
.It Li TLB_FLUSH.DTLB_THREAD
.Pq Event BDH , Umask 01H
DTLB flush attempts of the thread-specific entries.
.It Li TLB_FLUSH.STLB_ANY
.Pq Event BDH , Umask 20H
Count number of STLB flush attempts.
.It Li L1D_BLOCKS.BANK_CONFLICT_CYCLES
.Pq Event BFH , Umask 05H
Cycles when dispatched loads are cancelled due to
L1D bank conflicts with other load ports.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
Number of instructions at retirement.
.It Li INST_RETIRED.ALL
.Pq Event C0H , Umask 01H
Precise instruction retired event with HW to reduce
effect of PEBS shadow in IP distribution.
.It Li OTHER_ASSISTS.ITLB_MISS_RETIRED
.Pq Event C1H , Umask 02H
Instructions that experienced an ITLB miss.
.It Li OTHER_ASSISTS.AVX_STORE
.Pq Event C1H , Umask 08H
Number of assists associated with 256-bit AVX
store operations.
.It Li OTHER_ASSISTS.AVX_TO_SSE
.Pq Event C1H , Umask 10H
Number of transitions from AVX-256 to legacy SSE
when penalty applicable.
.It Li OTHER_ASSISTS.SSE_TO_AVX
.Pq Event C1H , Umask 20H
Number of transitions from SSE to AVX-256 when
penalty applicable.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 01H
Counts the number of micro-ops retired. Use
cmask=1 and invert to count active cycles or stalled
cycles.
.It Li UOPS_RETIRED.RETIRE_SLOTS
.Pq Event C2H , Umask 02H
Counts the number of retirement slots used each
cycle.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
Counts the number of machine clears due to
memory order conflicts.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 04H
Counts the number of times that a program writes
to a code section.
.It Li MACHINE_CLEARS.MASKMOV
.Pq Event C3H , Umask 20H
Counts the number of executed AVX masked load
operations that refer to an illegal address range
with the mask bits set to 0.
.It Li BR_INST_RETIRED.ALL_BRANCH
.Pq Event C4H , Umask 00H
Branch instructions at retirement.
.It Li BR_INST_RETIRED.CONDITIONAL
.Pq Event C4H , Umask 01H
Counts the number of conditional branch
instructions retired.
.It Li BR_INST_RETIRED.NEAR_CALL
.Pq Event C4H , Umask 02H
Direct and indirect near call instructions retired.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 04H
Counts the number of branch instructions retired.
.It Li BR_INST_RETIRED.NEAR_RETURN
.Pq Event C4H , Umask 08H
Counts the number of near return instructions
retired.
.It Li BR_INST_RETIRED.NOT_TAKEN
.Pq Event C4H , Umask 10H
Counts the number of not taken branch instructions
retired.
.It Li BR_INST_RETIRED.NEAR_TAKEN
.Pq Event C4H , Umask 20H
Number of near taken branches retired.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask 40H
Number of far branches retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
Mispredicted branch instructions at retirement.
.It Li BR_MISP_RETIRED.CONDITIONAL
.Pq Event C5H , Umask 01H
Mispredicted conditional branch instructions retired.
.It Li BR_MISP_RETIRED.NEAR_CALL
.Pq Event C5H , Umask 02H
Direct and indirect mispredicted near call
instructions retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 04H
Mispredicted macro branch instructions retired.
.It Li BR_MISP_RETIRED.NOT_TAKEN
.Pq Event C5H , Umask 10H
Mispredicted not taken branch instructions retired.
.It Li BR_MISP_RETIRED.TAKEN
.Pq Event C5H , Umask 20H
Mispredicted taken branch instructions retired.
.It Li FP_ASSIST.X87_OUTPUT
.Pq Event CAH , Umask 02H
Number of X87 assists due to output value.
.It Li FP_ASSIST.X87_INPUT
.Pq Event CAH , Umask 04H
Number of X87 assists due to input value.
.It Li FP_ASSIST.SIMD_OUTPUT
.Pq Event CAH , Umask 08H
Number of SIMD FP assists due to output values.
.It Li FP_ASSIST.SIMD_INPUT
.Pq Event CAH , Umask 10H
Number of SIMD FP assists due to input values.
.It Li FP_ASSIST.ANY
.Pq Event CAH , Umask 1EH
Cycles with any input/output SSE* or FP assists.
.It Li ROB_MISC_EVENTS.LBR_INSERTS
.Pq Event CCH , Umask 20H
Count cases of saving new LBR records by
hardware.
.It Li MEM_TRANS_RETIRED.LOAD_LATENCY
.Pq Event CDH , Umask 01H
Sample loads with specified latency threshold.
PMC3 only.
.It Li MEM_TRANS_RETIRED.PRECISE_STORE
.Pq Event CDH , Umask 02H
Sample stores and collect precise store operation
via PEBS record. PMC3 only.
.It Li MEM_UOP_RETIRED.LOADS
.Pq Event D0H , Umask 01H
Qualify retired memory uops that are loads.
Combine with umask 10H, 20H, 40H, 80H.
.It Li MEM_UOP_RETIRED.STORES
.Pq Event D0H , Umask 02H
Qualify retired memory uops that are stores.
Combine with umask 10H, 20H, 40H, 80H.
.It Li MEM_UOP_RETIRED.STLB_MISS
.Pq Event D0H , Umask 10H
Qualify retired memory uops with STLB miss. Must
combine with umask 01H, 02H, to produce counts.
.It Li MEM_UOP_RETIRED.LOCK
.Pq Event D0H , Umask 20H
Qualify retired memory uops with lock. Must
combine with umask 01H, 02H, to produce counts.
.It Li MEM_UOP_RETIRED.SPLIT
.Pq Event D0H , Umask 40H
Qualify retired memory uops with line split. Must
combine with umask 01H, 02H, to produce counts.
.It Li MEM_UOP_RETIRED.ALL
.Pq Event D0H , Umask 80H
Qualify any retired memory uops. Must combine
with umask 01H, 02H, to produce counts.
.It Li MEM_LOAD_UOPS_RETIRED.L1_HIT
.Pq Event D1H , Umask 01H
Retired load uops with L1 cache hits as data
sources.
.It Li MEM_LOAD_UOPS_RETIRED.L2_HIT
.Pq Event D1H , Umask 02H
Retired load uops with L2 cache hits as data
sources.
.It Li MEM_LOAD_UOPS_RETIRED.LLC_HIT
.Pq Event D1H , Umask 04H
Retired load uops whose data source was a hit
in the LLC with no snoop required.
.It Li MEM_LOAD_UOPS_RETIRED.LLC_MISS
.Pq Event D1H , Umask 20H
Retired load uops whose data source missed the
LLC (excluding unknown data source).
.It Li MEM_LOAD_UOPS_RETIRED.HIT_LFB
.Pq Event D1H , Umask 40H
Retired load uops that missed L1 but hit the line fill buffer
because of a preceding miss to the same cache line
whose data was not yet ready.
.It Li MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
.Pq Event D4H , Umask 02H
Retired load uops for which the cache data source
that serviced the load is unknown.
.It Li BACLEARS.ANY
.Pq Event E6H , Umask 01H
Counts the number of times the front end is re-steered,
mainly when the BPU cannot provide a
correct prediction and this is corrected by other
branch handling mechanisms at the front end.
.It Li L2_TRANS.DEMAND_DATA_RD
.Pq Event F0H , Umask 01H
Demand Data Read requests that access L2 cache.
.It Li L2_TRANS.RFO
.Pq Event F0H , Umask 02H
RFO requests that access L2 cache.
.It Li L2_TRANS.CODE_RD
.Pq Event F0H , Umask 04H
L2 cache accesses when fetching instructions.
.It Li L2_TRANS.ALL_PF
.Pq Event F0H , Umask 08H
L2 or LLC HW prefetches that access L2 cache.
.It Li L2_TRANS.L1D_WB
.Pq Event F0H , Umask 10H
L1D writebacks that access L2 cache.
.It Li L2_TRANS.L2_FILL
.Pq Event F0H , Umask 20H
L2 fill requests that access L2 cache.
.It Li L2_TRANS.L2_WB
.Pq Event F0H , Umask 40H
L2 writebacks that access L2 cache.
.It Li L2_TRANS.ALL_REQUESTS
.Pq Event F0H , Umask 80H
Transactions accessing L2 pipe.
.It Li L2_LINES_IN.I
.Pq Event F1H , Umask 01H
L2 cache lines in I state filling L2.
.It Li L2_LINES_IN.S
.Pq Event F1H , Umask 02H
L2 cache lines in S state filling L2.
.It Li L2_LINES_IN.E
.Pq Event F1H , Umask 04H
L2 cache lines in E state filling L2.
.It Li L2_LINES_IN.ALL
.Pq Event F1H , Umask 07H
L2 cache lines filling L2.
.It Li L2_LINES_OUT.DEMAND_CLEAN
.Pq Event F2H , Umask 01H
Clean L2 cache lines evicted by demand.
.It Li L2_LINES_OUT.DEMAND_DIRTY
.Pq Event F2H , Umask 02H
Dirty L2 cache lines evicted by demand.
.It Li L2_LINES_OUT.PF_CLEAN
.Pq Event F2H , Umask 04H
Clean L2 cache lines evicted by L2 prefetch.
.It Li L2_LINES_OUT.PF_DIRTY
.Pq Event F2H , Umask 08H
Dirty L2 cache lines evicted by L2 prefetch.
.It Li L2_LINES_OUT.DIRTY_ALL
.Pq Event F2H , Umask 0AH
Dirty L2 cache lines evicted, by demand or by L2 prefetch
(umask 0AH combines 02H and 08H).
.It Li SQ_MISC.SPLIT_LOCK
.Pq Event F4H , Umask 10H
Split locks in SQ.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.corei7 3 ,
.Xr pmc.corei7uc 3 ,
.Xr pmc.haswelluc 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.ivybridge 3 ,
.Xr pmc.ivybridgexeon 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.sandybridge 3 ,
.Xr pmc.sandybridgeuc 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc.ucf 3 ,
.Xr pmc.westmere 3 ,
.Xr pmc.westmereuc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
.An -nosplit
The
.Lb libpmc
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
The support for the Sandy Bridge Xeon
microarchitecture was written by
.An Hiren Panchasara Aq Mt hiren.panchasara@gmail.com .