.\" Copyright (c) 2010 Fabien Thomas.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 24, 2010
.Dt PMC.COREI7 3
.Os
.Sh NAME
.Nm pmc.corei7
.Nd measurement events for
.Tn Intel
.Tn Core i7 and Xeon 5500
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Core i7"
CPUs contain PMCs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
These CPUs may contain up to three classes of PMCs, including:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Core i7 and Xeon 5500 PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Volume 3B: System Programming Guide, Part 2"
.%N "Order Number: 253669-033US"
.%D December 2009
.%Q "Intel Corporation"
.Re
.Ss COREI7 AND XEON 5500 FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
Not all CPUs in this family implement fixed-function counters.
.Ss COREI7 AND XEON 5500 PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li rsp= Ns Ar value
Configure the Off-core Response bits.
.Bl -tag -width indent
.It Li DMND_DATA_RD
Counts the number of demand and DCU prefetch data reads of full
and partial cachelines as well as demand data page table entry
cacheline reads.
Does not count L2 data read prefetches or instruction fetches.
.It Li DMND_RFO
Counts the number of demand and DCU prefetch reads for ownership
(RFO) requests generated by a write to a data cacheline.
Does not count L2 RFO.
.It Li DMND_IFETCH
Counts the number of demand and DCU prefetch instruction cacheline
reads.
Does not count L2 code read prefetches.
.It Li WB
Counts the number of writeback (modified to exclusive) transactions.
.It Li PF_DATA_RD
Counts the number of data cacheline reads generated by L2 prefetchers.
.It Li PF_RFO
Counts the number of RFO requests generated by L2 prefetchers.
.It Li PF_IFETCH
Counts the number of code reads generated by L2 prefetchers.
.It Li OTHER
Counts any of the following transaction types: L3 invalidate,
I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, fences,
lock, unlock, and split lock.
.It Li UNCORE_HIT
L3 Hit: local or remote home requests that hit the L3 cache in the uncore
with no coherency actions required (snooping).
.It Li OTHER_CORE_HIT_SNP
L3 Hit: local or remote home requests that hit the L3 cache in the uncore
and were serviced by another core with a cross core snoop where no modified
copies were found (clean).
.It Li OTHER_CORE_HITM
L3 Hit: local or remote home requests that hit the L3 cache in the uncore
and were serviced by another core with a cross core snoop where modified
copies were found (HITM).
.It Li REMOTE_CACHE_FWD
L3 Miss: local home requests that missed the L3 cache and were serviced
by forwarded data following a cross package snoop where no modified
copies were found.
Remote home requests are not counted.
.It Li REMOTE_DRAM
L3 Miss: remote home requests that missed the L3 cache and were serviced
by remote DRAM.
.It Li LOCAL_DRAM
L3 Miss: local home requests that missed the L3 cache and were serviced
by local DRAM.
.It Li NON_DRAM
Non-DRAM requests that were serviced by IOH.
.El
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
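.Pp
The way the cmask, inv and edge qualifiers combine can be illustrated
with a small software model written in C.
The sketch below is an illustration only and is not part of libpmc:
the names struct qualifiers and model_count() are hypothetical.
It also makes two assumptions that this manual does not state, namely
that inv has no effect when cmask is absent, and that edge without cmask
treats a cycle with at least one event as the asserted condition.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the qualifiers; not a libpmc type. */
struct qualifiers {
    unsigned cmask; /* 0 means no threshold was given */
    bool inv;       /* invert the cmask comparison */
    bool edge;      /* count false-to-true transitions only */
};

/*
 * events[i] is the number of events observed in cycle i.
 * Returns the value the modelled counter would hold after ncycles.
 */
static uint64_t
model_count(const unsigned *events, size_t ncycles, struct qualifiers q)
{
    uint64_t total = 0;
    bool prev = false;  /* condition in the previous cycle */

    assert(events != NULL || ncycles == 0);
    for (size_t i = 0; i < ncycles; i++) {
        bool cond;

        if (q.cmask == 0) {
            /* Assumption: without cmask, inv has no effect. */
            cond = events[i] > 0;
        } else {
            cond = events[i] >= q.cmask;
            if (q.inv)
                cond = !cond;   /* now: events[i] < cmask */
        }

        if (q.edge) {
            if (cond && !prev)
                total++;        /* one count per rising edge */
        } else if (q.cmask == 0) {
            total += events[i]; /* plain event counting */
        } else if (cond) {
            total++;            /* one count per qualifying cycle */
        }
        prev = cond;
    }
    return total;
}
```
.Ed
For the per-cycle event sequence 0, 2, 3, 0, 0, 1, 4, 4 the model yields
14 with no qualifiers, 4 with cmask=2, and 3 with cmask=1 and inv
(the number of cycles with no events).
Adding edge to either of the last two configurations yields 2, the number
of times the respective condition became true.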
.Ss Event Specifiers (Programmable PMCs)
Core i7 and Xeon 5500 programmable PMCs support the following events:
.Bl -tag -width indent
.It Li SB_DRAIN.ANY
.Pq Event 04H , Umask 07H
Counts the number of store buffer drains.
.It Li STORE_BLOCKS.AT_RET
.Pq Event 06H , Umask 04H
Counts number of loads delayed with at-Retirement block code.
The following loads need to be executed at retirement and wait for all
senior stores on the same thread to be drained: load splitting across
4K boundary (page split), load accessing uncacheable
(UC or USWC) memory, load lock, and load with page table in UC or USWC
memory region.
.It Li STORE_BLOCKS.L1D_BLOCK
.Pq Event 06H , Umask 08H
Counts cacheable loads delayed with L1D block code.
.It Li PARTIAL_ADDRESS_ALIAS
.Pq Event 07H , Umask 01H
Counts false dependencies due to partial address aliasing.
.It Li DTLB_LOAD_MISSES.ANY
.Pq Event 08H , Umask 01H
Counts all load misses that cause a page walk.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED
.Pq Event 08H , Umask 02H
Counts number of completed page walks due to load miss in the STLB.
.It Li DTLB_LOAD_MISSES.STLB_HIT
.Pq Event 08H , Umask 10H
Counts the number of cache load STLB hits.
.It Li DTLB_LOAD_MISSES.PDE_MISS
.Pq Event 08H , Umask 20H
Counts the number of DTLB cache load misses where the low part of the
linear to physical address translation was missed.
.It Li DTLB_LOAD_MISSES.LARGE_WALK_COMPLETED
.Pq Event 08H , Umask 80H
Counts number of completed large page walks due to load miss in the STLB.
.It Li MEM_INST_RETIRED.LOADS
.Pq Event 0BH , Umask 01H
Counts the number of instructions with an architecturally-visible load
retired on the architected path.
Used in conjunction with the ld_lat facility.
.It Li MEM_INST_RETIRED.STORES
.Pq Event 0BH , Umask 02H
Counts the number of instructions with an architecturally-visible store
retired on the architected path.
Used in conjunction with the ld_lat facility.
.It Li MEM_INST_RETIRED.LATENCY_ABOVE_THRESHOLD
.Pq Event 0BH , Umask 10H
Counts the number of instructions exceeding the latency specified with
the ld_lat facility.
Used in conjunction with the ld_lat facility.
.It Li MEM_STORE_RETIRED.DTLB_MISS
.Pq Event 0CH , Umask 01H
The event counts the number of retired stores that missed the DTLB.
The DTLB miss is not counted if the store operation causes a fault.
Does not count prefetches.
Counts both primary and secondary misses to the TLB.
.It Li UOPS_ISSUED.ANY
.Pq Event 0EH , Umask 01H
Counts the number of uops issued by the Register Allocation Table to the
Reservation Station, i.e. the uops issued from the front end to the back
end.
.It Li UOPS_ISSUED.STALLED_CYCLES
.Pq Event 0EH , Umask 01H
Counts the number of cycles in which no uops are issued by the Register
Allocation Table to the Reservation Station, i.e. from the front end to the
back end.
Set invert=1, cmask=1.
.It Li UOPS_ISSUED.FUSED
.Pq Event 0EH , Umask 02H
Counts the number of fused uops that were issued from the Register
Allocation Table to the Reservation Station.
.It Li MEM_UNCORE_RETIRED.L3_DATA_MISS_UNKNOWN
.Pq Event 0FH , Umask 01H
Counts number of memory load instructions retired where the memory reference
missed L3 and data source is unknown.
Available only for CPUID signature 06_2EH.
.It Li MEM_UNCORE_RETIRED.OTHER_CORE_L2_HITM
.Pq Event 0FH , Umask 02H
Counts number of memory load instructions retired where the memory reference
hit modified data in a sibling core residing on the same socket.
.It Li MEM_UNCORE_RETIRED.REMOTE_CACHE_LOCAL_HOME_HIT
.Pq Event 0FH , Umask 08H
Counts number of memory load instructions retired where the memory reference
missed the L1, L2 and L3 caches and HIT in a remote socket's cache.
Only counts locally homed lines.
.It Li MEM_UNCORE_RETIRED.REMOTE_DRAM
.Pq Event 0FH , Umask 10H
Counts number of memory load instructions retired where the memory reference
missed the L1, L2 and L3 caches and was remotely homed.
This includes both DRAM access and HITM in a remote socket's cache
for remotely homed lines.
.It Li MEM_UNCORE_RETIRED.LOCAL_DRAM
.Pq Event 0FH , Umask 20H
Counts number of memory load instructions retired where the memory reference
missed the L1, L2 and L3 caches and required a local socket memory
reference.
This includes locally homed cachelines that were in a modified
state in another socket.
.It Li MEM_UNCORE_RETIRED.UNCACHEABLE
.Pq Event 0FH , Umask 80H
Counts number of memory load instructions retired where the memory reference
missed the L1, L2 and L3 caches in order to perform I/O.
Available only for CPUID signature 06_2EH.
.It Li FP_COMP_OPS_EXE.X87
.Pq Event 10H , Umask 01H
Counts the number of FP computational uops executed.
This includes FADDs, FSUBs, FCOMs, FMULs, integer MULs and IMULs, FDIVs,
FPREMs, FSQRTs, integer DIVs, and IDIVs.
This event does not distinguish an FADD used in the middle of a
transcendental flow from a separate FADD instruction.
.It Li FP_COMP_OPS_EXE.MMX
.Pq Event 10H , Umask 02H
Counts number of MMX uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP
.Pq Event 10H , Umask 04H
Counts number of SSE and SSE2 FP uops executed.
.It Li FP_COMP_OPS_EXE.SSE2_INTEGER
.Pq Event 10H , Umask 08H
Counts number of SSE2 integer uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_PACKED
.Pq Event 10H , Umask 10H
Counts number of SSE FP packed uops executed.
.It Li FP_COMP_OPS_EXE.SSE_FP_SCALAR
.Pq Event 10H , Umask 20H
Counts number of SSE FP scalar uops executed.
.It Li FP_COMP_OPS_EXE.SSE_SINGLE_PRECISION
.Pq Event 10H , Umask 40H
Counts number of SSE* FP single precision uops executed.
.It Li FP_COMP_OPS_EXE.SSE_DOUBLE_PRECISION
.Pq Event 10H , Umask 80H
Counts number of SSE* FP double precision uops executed.
.It Li SIMD_INT_128.PACKED_MPY
.Pq Event 12H , Umask 01H
Counts number of 128-bit SIMD integer multiply operations.
.It Li SIMD_INT_128.PACKED_SHIFT
.Pq Event 12H , Umask 02H
Counts number of 128-bit SIMD integer shift operations.
.It Li SIMD_INT_128.PACK
.Pq Event 12H , Umask 04H
Counts number of 128-bit SIMD integer pack operations.
.It Li SIMD_INT_128.UNPACK
.Pq Event 12H , Umask 08H
Counts number of 128-bit SIMD integer unpack operations.
.It Li SIMD_INT_128.PACKED_LOGICAL
.Pq Event 12H , Umask 10H
Counts number of 128-bit SIMD integer logical operations.
.It Li SIMD_INT_128.PACKED_ARITH
.Pq Event 12H , Umask 20H
Counts number of 128-bit SIMD integer arithmetic operations.
.It Li SIMD_INT_128.SHUFFLE_MOVE
.Pq Event 12H , Umask 40H
Counts number of 128-bit SIMD integer shuffle and move operations.
.It Li LOAD_DISPATCH.RS
.Pq Event 13H , Umask 01H
Counts number of loads dispatched from the Reservation Station that bypass
the Memory Order Buffer.
.It Li LOAD_DISPATCH.RS_DELAYED
.Pq Event 13H , Umask 02H
Counts the number of delayed RS dispatches at the stage latch.
If an RS dispatch cannot bypass to LB, it has another chance to dispatch
from the
one-cycle delayed staging latch before it is written into the LB.
.It Li LOAD_DISPATCH.MOB
.Pq Event 13H , Umask 04H
Counts the number of loads dispatched from the Reservation Station to the
Memory Order Buffer.
.It Li LOAD_DISPATCH.ANY
.Pq Event 13H , Umask 07H
Counts all loads dispatched from the Reservation Station.
.It Li ARITH.CYCLES_DIV_BUSY
.Pq Event 14H , Umask 01H
Counts the number of cycles the divider is busy executing divide or square
root operations.
The divide can be integer, X87 or Streaming SIMD Extensions (SSE).
The square root operation can be either X87 or SSE.
Set 'edge=1, invert=1, cmask=1' to count the number of divides.
Count may be incorrect when SMT is on.
.It Li ARITH.MUL
.Pq Event 14H , Umask 02H
Counts the number of multiply operations executed.
This includes integer as well as floating point multiply operations but
excludes DPPS mul and MPSAD.
Count may be incorrect when SMT is on.
.It Li INST_QUEUE_WRITES
.Pq Event 17H , Umask 01H
Counts the number of instructions written into the instruction queue every
cycle.
.It Li INST_DECODED.DEC0
.Pq Event 18H , Umask 01H
Counts number of instructions that require decoder 0 to be decoded.
Usually, this means that the instruction maps to more than 1 uop.
.It Li TWO_UOP_INSTS_DECODED
.Pq Event 19H , Umask 01H
An instruction that generates two uops was decoded.
.It Li INST_QUEUE_WRITE_CYCLES
.Pq Event 1EH , Umask 01H
This event counts the number of cycles during which instructions are written
to the instruction queue.
Dividing the number of instructions written to the instruction queue
(INST_QUEUE_WRITES) by this counter yields the average number of
instructions decoded each cycle.
If this number is less than four and the pipe stalls, this indicates that
the decoder is failing to
decode enough instructions per cycle to sustain the 4-wide pipeline.
If SSE* instructions that are 6 bytes or longer arrive one after another,
then front end throughput may limit execution speed.
In such case,
.It Li LSD_OVERFLOW
.Pq Event 20H , Umask 01H
Counts number of loops that cannot stream from the instruction queue.
.It Li L2_RQSTS.LD_HIT
.Pq Event 24H , Umask 01H
Counts number of loads that hit the L2 cache.
L2 loads include both L1D demand misses as well as L1D prefetches.
L2 loads can be rejected for various reasons.
Only non-rejected loads are counted.
.It Li L2_RQSTS.LD_MISS
.Pq Event 24H , Umask 02H
Counts the number of loads that miss the L2 cache.
L2 loads include both L1D demand misses as well as L1D prefetches.
.It Li L2_RQSTS.LOADS
.Pq Event 24H , Umask 03H
Counts all L2 load requests.
L2 loads include both L1D demand misses as well as L1D prefetches.
.It Li L2_RQSTS.RFO_HIT
.Pq Event 24H , Umask 04H
Counts the number of store RFO requests that hit the L2 cache.
L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.
Count includes WC memory requests, where the data is not fetched but the
permission to write the line is required.
.It Li L2_RQSTS.RFO_MISS
.Pq Event 24H , Umask 08H
Counts the number of store RFO requests that miss the L2 cache.
L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.
.It Li L2_RQSTS.RFOS
.Pq Event 24H , Umask 0CH
Counts all L2 store RFO requests.
L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.
.It Li L2_RQSTS.IFETCH_HIT
.Pq Event 24H , Umask 10H
Counts number of instruction fetches that hit the L2 cache.
L2 instruction fetches include both L1I demand misses as well as L1I instruction
prefetches.
.It Li L2_RQSTS.IFETCH_MISS
.Pq Event 24H , Umask 20H
Counts number of instruction fetches that miss the L2 cache.
L2 instruction fetches include both L1I demand misses as well as L1I instruction
prefetches.
.It Li L2_RQSTS.IFETCHES
.Pq Event 24H , Umask 30H
Counts all instruction fetches.
L2 instruction fetches include both L1I demand misses as well as L1I instruction prefetches.
.It Li L2_RQSTS.PREFETCH_HIT
.Pq Event 24H , Umask 40H
Counts L2 prefetch hits for both code and data.
.It Li L2_RQSTS.PREFETCH_MISS
.Pq Event 24H , Umask 80H
Counts L2 prefetch misses for both code and data.
.It Li L2_RQSTS.PREFETCHES
.Pq Event 24H , Umask C0H
Counts all L2 prefetches for both code and data.
.It Li L2_RQSTS.MISS
.Pq Event 24H , Umask AAH
Counts all L2 misses for both code and data.
.It Li L2_RQSTS.REFERENCES
.Pq Event 24H , Umask FFH
Counts all L2 requests for both code and data.
.It Li L2_DATA_RQSTS.DEMAND.I_STATE
.Pq Event 26H , Umask 01H
Counts number of L2 data demand loads where the cache line to be loaded is
in the I (invalid) state, i.e. a cache miss.
L2 demand loads are both L1D demand misses and L1D prefetches.
.It Li L2_DATA_RQSTS.DEMAND.S_STATE
.Pq Event 26H , Umask 02H
Counts number of L2 data demand loads where the cache line to be loaded is
in the S (shared) state.
L2 demand loads are both L1D demand misses and L1D prefetches.
.It Li L2_DATA_RQSTS.DEMAND.E_STATE
.Pq Event 26H , Umask 04H
Counts number of L2 data demand loads where the cache line to be loaded is
in the E (exclusive) state.
L2 demand loads are both L1D demand misses and L1D prefetches.
.It Li L2_DATA_RQSTS.DEMAND.M_STATE
.Pq Event 26H , Umask 08H
Counts number of L2 data demand loads where the cache line to be loaded is
in the M (modified) state.
L2 demand loads are both L1D demand misses and L1D prefetches.
.It Li L2_DATA_RQSTS.DEMAND.MESI
.Pq Event 26H , Umask 0FH
Counts all L2 data demand requests.
L2 demand loads are both L1D demand misses and L1D prefetches.
.It Li L2_DATA_RQSTS.PREFETCH.I_STATE
.Pq Event 26H , Umask 10H
Counts number of L2 prefetch data loads where the cache line to be loaded is
in the I (invalid) state, i.e. a cache miss.
.It Li L2_DATA_RQSTS.PREFETCH.S_STATE
.Pq Event 26H , Umask 20H
Counts number of L2 prefetch data loads where the cache line to be loaded is
in the S (shared) state.
A prefetch RFO will miss on an S state line, while a prefetch read will
hit on an S state line.
.It Li L2_DATA_RQSTS.PREFETCH.E_STATE
.Pq Event 26H , Umask 40H
Counts number of L2 prefetch data loads where the cache line to be loaded is
in the E (exclusive) state.
.It Li L2_DATA_RQSTS.PREFETCH.M_STATE
.Pq Event 26H , Umask 80H
Counts number of L2 prefetch data loads where the cache line to be loaded is
in the M (modified) state.
.It Li L2_DATA_RQSTS.PREFETCH.MESI
.Pq Event 26H , Umask F0H
Counts all L2 prefetch requests.
.It Li L2_DATA_RQSTS.ANY
.Pq Event 26H , Umask FFH
Counts all L2 data requests.
.It Li L2_WRITE.RFO.I_STATE
.Pq Event 27H , Umask 01H
Counts number of L2 demand store RFO requests where the cache line to be
loaded is in the I (invalid) state, i.e. a cache miss.
The L1D prefetcher does not issue an RFO prefetch.
This is a demand RFO request.
.It Li L2_WRITE.RFO.S_STATE
.Pq Event 27H , Umask 02H
Counts number of L2 store RFO requests where the cache line to be loaded is
in the S (shared) state.
The L1D prefetcher does not issue an RFO prefetch.
This is a demand RFO request.
.It Li L2_WRITE.RFO.M_STATE
.Pq Event 27H , Umask 08H
Counts number of L2 store RFO requests where the cache line to be loaded is
in the M (modified) state.
The L1D prefetcher does not issue an RFO prefetch.
This is a demand RFO request.
.It Li L2_WRITE.RFO.HIT
.Pq Event 27H , Umask 0EH
Counts number of L2 store RFO requests where the cache line to be loaded is
in the S, E, or M state.
The L1D prefetcher does not issue an RFO prefetch.
This is a demand RFO request.
.It Li L2_WRITE.RFO.MESI
.Pq Event 27H , Umask 0FH
Counts all L2 store RFO requests.
The L1D prefetcher does not issue an RFO prefetch.
This is a demand RFO request.
.It Li L2_WRITE.LOCK.I_STATE
.Pq Event 27H , Umask 10H
Counts number of L2 demand lock RFO requests where the cache line to be
loaded is in the I (invalid) state, i.e. a cache miss.
.It Li L2_WRITE.LOCK.S_STATE
.Pq Event 27H , Umask 20H
Counts number of L2 lock RFO requests where the cache line to be loaded is
in the S (shared) state.
.It Li L2_WRITE.LOCK.E_STATE
.Pq Event 27H , Umask 40H
Counts number of L2 demand lock RFO requests where the cache line to be
loaded is in the E (exclusive) state.
.It Li L2_WRITE.LOCK.M_STATE
.Pq Event 27H , Umask 80H
Counts number of L2 demand lock RFO requests where the cache line to be
loaded is in the M (modified) state.
.It Li L2_WRITE.LOCK.HIT
.Pq Event 27H , Umask E0H
Counts number of L2 demand lock RFO requests where the cache line to be
loaded is in either the S, E, or M state.
.It Li L2_WRITE.LOCK.MESI
.Pq Event 27H , Umask F0H
Counts all L2 demand lock RFO requests.
.It Li L1D_WB_L2.I_STATE
.Pq Event 28H , Umask 01H
Counts number of L1 writebacks to the L2 where the cache line to be written
is in the I (invalid) state, i.e. a cache miss.
.It Li L1D_WB_L2.S_STATE
.Pq Event 28H , Umask 02H
Counts number of L1 writebacks to the L2 where the cache line to be written
is in the S (shared) state.
.It Li L1D_WB_L2.E_STATE
.Pq Event 28H , Umask 04H
Counts number of L1 writebacks to the L2 where the cache line to be written
is in the E (exclusive) state.
.It Li L1D_WB_L2.M_STATE
.Pq Event 28H , Umask 08H
Counts number of L1 writebacks to the L2 where the cache line to be written
is in the M (modified) state.
.It Li L1D_WB_L2.MESI
.Pq Event 28H , Umask 0FH
Counts all L1 writebacks to the L2.
.It Li L3_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
This event counts requests originating from the core that reference a cache
line in the last level cache.
The event count includes speculative traffic but excludes cache line fills
due to an L2 hardware prefetch.
Because of cache hierarchy, cache sizes, and other implementation-specific
characteristics, comparing values to estimate performance differences is not
recommended.
See Table A-1.
.It Li L3_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
This event counts each cache miss condition for references to the last level
cache.
The event count may include speculative traffic but excludes cache
line fills due to L2 hardware prefetches.
Because of cache hierarchy, cache sizes, and other implementation-specific
characteristics, comparing values to estimate performance differences is not
recommended.
See Table A-1.
.It Li CPU_CLK_UNHALTED.THREAD_P
.Pq Event 3CH , Umask 00H
Counts the number of thread cycles while the thread is not in a halt state.
The thread enters the halt state when it is running the HLT instruction.
The core frequency may change from time to time due to power or thermal
throttling.
See Table A-1.
.It Li CPU_CLK_UNHALTED.REF_P
.Pq Event 3CH , Umask 01H
Increments at the frequency of TSC when not halted.
See Table A-1.
.It Li L1D_CACHE_LD.I_STATE
.Pq Event 40H , Umask 01H
Counts L1 data cache read requests where the cache line to be loaded is in
the I (invalid) state, i.e. the read request missed the cache.
Counter 0, 1 only.
.It Li L1D_CACHE_LD.S_STATE
.Pq Event 40H , Umask 02H
Counts L1 data cache read requests where the cache line to be loaded is in
the S (shared) state.
Counter 0, 1 only.
.It Li L1D_CACHE_LD.E_STATE
.Pq Event 40H , Umask 04H
Counts L1 data cache read requests where the cache line to be loaded is in
the E (exclusive) state.
Counter 0, 1 only.
.It Li L1D_CACHE_LD.M_STATE
.Pq Event 40H , Umask 08H
Counts L1 data cache read requests where the cache line to be loaded is in
the M (modified) state.
Counter 0, 1 only.
.It Li L1D_CACHE_LD.MESI
.Pq Event 40H , Umask 0FH
Counts L1 data cache read requests.
Counter 0, 1 only.
.It Li L1D_CACHE_ST.S_STATE
.Pq Event 41H , Umask 02H
Counts L1 data cache store RFO requests where the cache line to be loaded is
in the S (shared) state.
Counter 0, 1 only.
.It Li L1D_CACHE_ST.E_STATE
.Pq Event 41H , Umask 04H
Counts L1 data cache store RFO requests where the cache line to be loaded is
in the E (exclusive) state.
Counter 0, 1 only.
.It Li L1D_CACHE_ST.M_STATE
.Pq Event 41H , Umask 08H
Counts L1 data cache store RFO requests where the cache line to be loaded is
in the M (modified) state.
Counter 0, 1 only.
.It Li L1D_CACHE_LOCK.HIT
.Pq Event 42H , Umask 01H
Counts retired load locks that hit in the L1 data cache or hit in an already
allocated fill buffer.
The lock portion of the load lock transaction must hit in the L1D.
The initial load will pull the lock into the L1 data cache.
Counter 0, 1 only.
.It Li L1D_CACHE_LOCK.S_STATE
.Pq Event 42H , Umask 02H
Counts L1 data cache retired load locks that hit the target cache line in
the shared state.
Counter 0, 1 only.
.It Li L1D_CACHE_LOCK.E_STATE
.Pq Event 42H , Umask 04H
Counts L1 data cache retired load locks that hit the target cache line in
the exclusive state.
Counter 0, 1 only.
.It Li L1D_CACHE_LOCK.M_STATE
.Pq Event 42H , Umask 08H
Counts L1 data cache retired load locks that hit the target cache line in
the modified state.
Counter 0, 1 only.
.It Li L1D_ALL_REF.ANY
.Pq Event 43H , Umask 01H
Counts all references (uncached, speculated and retired) to the L1 data
cache, including all loads and stores with any memory types.
The event counts memory accesses only when they are actually performed.
For example, a load blocked by unknown store address and later performed
is only counted once.
The event does not include non-memory accesses, such as I/O accesses.
Counter 0, 1 only.
.It Li L1D_ALL_REF.CACHEABLE
.Pq Event 43H , Umask 02H
Counts all data reads and writes (speculated and retired) from cacheable
memory, including locked operations.
Counter 0, 1 only.
.It Li DTLB_MISSES.ANY
.Pq Event 49H , Umask 01H
Counts the number of misses in the STLB that cause a page walk.
.It Li DTLB_MISSES.WALK_COMPLETED
.Pq Event 49H , Umask 02H
Counts number of misses in the STLB which resulted in a completed page walk.
.It Li DTLB_MISSES.STLB_HIT
.Pq Event 49H , Umask 10H
Counts the number of DTLB first level misses that hit in the second level TLB.
This event is only relevant if the core contains multiple DTLB levels.
.It Li DTLB_MISSES.PDE_MISS
.Pq Event 49H , Umask 20H
Counts the number of DTLB misses caused by the low part of the address.
This includes references to 2M pages, because 2M pages do not use the PDE.
.It Li DTLB_MISSES.LARGE_WALK_COMPLETED
.Pq Event 49H , Umask 80H
Counts number of misses in the STLB which resulted in a completed page walk
for large pages.
.It Li LOAD_HIT_PRE
.Pq Event 4CH , Umask 01H
Counts load operations sent to the L1 data cache while a previous SSE
prefetch instruction to the same cache line has started prefetching but has
not yet finished.
.It Li L1D_PREFETCH.REQUESTS
.Pq Event 4EH , Umask 01H
Counts number of hardware prefetch requests dispatched out of the prefetch
FIFO.
.It Li L1D_PREFETCH.MISS
.Pq Event 4EH , Umask 02H
Counts number of hardware prefetch requests that miss the L1D.
There are two prefetchers in the L1D:
a streamer, which predicts that lines sequentially after this one should be
fetched, and the IP prefetcher, which remembers access patterns for the
current instruction.
The streamer prefetcher stops on an L1D hit, while the IP prefetcher does not.
.It Li L1D_PREFETCH.TRIGGERS
.Pq Event 4EH , Umask 04H
Counts number of prefetch requests triggered by the Finite State Machine and
pushed into the prefetch FIFO.
Some of the prefetch requests are dropped due to overwrites or competition
between the IP index prefetcher and streamer prefetcher.
The prefetch FIFO contains 4 entries.
.It Li L1D.REPL
.Pq Event 51H , Umask 01H
Counts the number of lines brought into the L1 data cache.
Counter 0, 1 only.
.It Li L1D.M_REPL
.Pq Event 51H , Umask 02H
Counts the number of modified lines brought into the L1 data cache.
Counter 0, 1 only.
.It Li L1D.M_EVICT
.Pq Event 51H , Umask 04H
Counts the number of modified lines evicted from the L1 data cache due to
replacement.
Counter 0, 1 only.
.It Li L1D.M_SNOOP_EVICT
.Pq Event 51H , Umask 08H
Counts the number of modified lines evicted from the L1 data cache due to
snoop HITM intervention.
Counter 0, 1 only.
.It Li L1D_CACHE_PREFETCH_LOCK_FB_HIT
.Pq Event 52H , Umask 01H
Counts the number of cacheable load lock speculated instructions accepted
into the fill buffer.
.It Li L1D_CACHE_LOCK_FB_HIT
.Pq Event 53H , Umask 01H
Counts the number of cacheable load lock speculated or retired instructions
accepted into the fill buffer.
.It Li CACHE_LOCK_CYCLES.L1D_L2
.Pq Event 63H , Umask 01H
Cycle count during which the L1D and L2 are locked.
A lock is asserted when there is a locked memory access, due to uncacheable
memory, a locked operation that spans two cache lines, or a page walk from an
uncacheable page table.
Counter 0, 1 only.
L1D and L2 locks have a very high performance penalty and it is highly
recommended to avoid such accesses.
.It Li CACHE_LOCK_CYCLES.L1D
.Pq Event 63H , Umask 02H
Counts the number of cycles during which a cache line in the L1 data cache
unit is locked.
Counter 0, 1 only.
.It Li IO_TRANSACTIONS
.Pq Event 6CH , Umask 01H
Counts the number of completed I/O transactions.
.It Li L1I.HITS
.Pq Event 80H , Umask 01H
Counts all instruction fetches that hit the L1 instruction cache.
.It Li L1I.MISSES
.Pq Event 80H , Umask 02H
Counts all instruction fetches that miss the L1I cache.
This includes instruction cache misses, streaming buffer misses, victim cache
misses and uncacheable fetches.
An instruction fetch miss is counted only once and not once for every cycle
it is outstanding.
.It Li L1I.READS
.Pq Event 80H , Umask 03H
Counts all instruction fetches, including uncacheable fetches that bypass
the L1I.
.It Li L1I.CYCLES_STALLED
.Pq Event 80H , Umask 04H
Cycle counts for which an instruction fetch stalls due to an L1I cache miss,
ITLB miss or ITLB fault.
.It Li LARGE_ITLB.HIT
.Pq Event 82H , Umask 01H
Counts number of large ITLB hits.
.It Li ITLB_MISSES.ANY
.Pq Event 85H , Umask 01H
Counts the number of misses in all levels of the ITLB that cause a page
walk.
.It Li ITLB_MISSES.WALK_COMPLETED
.Pq Event 85H , Umask 02H
Counts number of misses in all levels of the ITLB which resulted in a
completed page walk.
.It Li ILD_STALL.LCP
.Pq Event 87H , Umask 01H
Counts cycles in which the Instruction Length Decoder stalls due to
length-changing prefixes: 66, 67, or REX.W (for EM64T) instructions that
change the length of the decoded instruction.
.It Li ILD_STALL.MRU
.Pq Event 87H , Umask 02H
Counts Instruction Length Decoder stall cycles due to Branch Prediction Unit
(BPU) Most Recently Used (MRU) bypass.
.It Li ILD_STALL.IQ_FULL
.Pq Event 87H , Umask 04H
Stall cycles due to a full instruction queue.
.It Li ILD_STALL.REGEN
.Pq Event 87H , Umask 08H
Counts the number of regen stalls.
.It Li ILD_STALL.ANY
.Pq Event 87H , Umask 0FH
Counts any cycles the Instruction Length Decoder is stalled.
.It Li BR_INST_EXEC.COND
.Pq Event 88H , Umask 01H
Counts the number of conditional near branch instructions executed, but not
necessarily retired.
.It Li BR_INST_EXEC.DIRECT
.Pq Event 88H , Umask 02H
Counts all unconditional near branch instructions excluding calls and
indirect branches.
.It Li BR_INST_EXEC.INDIRECT_NON_CALL
.Pq Event 88H , Umask 04H
Counts the number of executed indirect near branch instructions that are not
calls.
.It Li BR_INST_EXEC.NON_CALLS
.Pq Event 88H , Umask 07H
Counts all non-call near branch instructions executed, but not necessarily
retired.
.It Li BR_INST_EXEC.RETURN_NEAR
.Pq Event 88H , Umask 08H
Counts indirect near branches that have a return mnemonic.
.It Li BR_INST_EXEC.DIRECT_NEAR_CALL
.Pq Event 88H , Umask 10H
Counts executed unconditional near call branch instructions, excluding
non-call branches.
.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL
.Pq Event 88H , Umask 20H
Counts executed indirect near calls, including both register and memory
indirect.
.It Li BR_INST_EXEC.NEAR_CALLS
.Pq Event 88H , Umask 30H
Counts all near call branches executed, but not necessarily retired.
.It Li BR_INST_EXEC.TAKEN
.Pq Event 88H , Umask 40H
Counts taken near branches executed, but not necessarily retired.
.It Li BR_INST_EXEC.ANY
.Pq Event 88H , Umask 7FH
Counts all near executed branches (not necessarily retired).
This includes only instructions and not micro-op branches.
Frequent branching is not necessarily a major performance issue.
However, frequent branch mispredictions may be a problem.
.It Li BR_MISP_EXEC.COND
.Pq Event 89H , Umask 01H
Counts the number of mispredicted conditional near branch instructions
executed, but not necessarily retired.
.It Li BR_MISP_EXEC.DIRECT
.Pq Event 89H , Umask 02H
Counts mispredicted macro unconditional near branch instructions, excluding
calls and indirect branches (should always be 0).
.It Li BR_MISP_EXEC.INDIRECT_NON_CALL
.Pq Event 89H , Umask 04H
Counts the number of executed mispredicted indirect near branch instructions
that are not calls.
.It Li BR_MISP_EXEC.NON_CALLS
.Pq Event 89H , Umask 07H
Counts mispredicted non-call near branches executed, but not necessarily
retired.
.It Li BR_MISP_EXEC.RETURN_NEAR
.Pq Event 89H , Umask 08H
Counts mispredicted indirect branches that have a near return mnemonic.
.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL
.Pq Event 89H , Umask 10H
Counts mispredicted non-indirect near calls executed (should always be 0).
.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL
.Pq Event 89H , Umask 20H
Counts mispredicted indirect near calls executed, including both register
and memory indirect.
.It Li BR_MISP_EXEC.NEAR_CALLS
.Pq Event 89H , Umask 30H
Counts all mispredicted near call branches executed, but not necessarily
retired.
.It Li BR_MISP_EXEC.TAKEN
.Pq Event 89H , Umask 40H
Counts executed mispredicted near branches that are taken, but not
necessarily retired.
.It Li BR_MISP_EXEC.ANY
.Pq Event 89H , Umask 7FH
Counts the number of mispredicted near branch instructions that were
executed, but not necessarily retired.
.It Li RESOURCE_STALLS.ANY
.Pq Event A2H , Umask 01H
Counts the number of Allocator resource related stalls.
This includes register renaming buffer entries and memory buffer entries.
In addition to resource related stalls, this event counts some other events.
These include stalls arising during branch misprediction recovery, such as
when retirement of the mispredicted branch is delayed, and stalls arising
while the store buffer is draining from synchronizing operations.
Does not include stalls due to SuperQ (off core) queue full, too many cache
misses, etc.
.It Li RESOURCE_STALLS.LOAD
.Pq Event A2H , Umask 02H
Counts the stall cycles due to the lack of a load buffer for a load
operation.
.It Li RESOURCE_STALLS.RS_FULL
.Pq Event A2H , Umask 04H
This event counts the number of cycles when the number of instructions in
the pipeline waiting for execution reaches the limit the processor can handle.
A high count of this event indicates that there are long latency
operations in the pipe (possibly load and store operations that miss the L2
cache, or instructions dependent upon instructions further down the pipeline
that have yet to retire).
When the RS is full, new instructions cannot enter the reservation station
and start execution.
.It Li RESOURCE_STALLS.STORE
.Pq Event A2H , Umask 08H
This event counts the number of cycles that a resource related stall will
occur due to the number of store instructions reaching the limit of the
pipeline (i.e. all store buffers are used).
The stall ends when a store instruction commits its data to the cache or
memory.
.It Li RESOURCE_STALLS.ROB_FULL
.Pq Event A2H , Umask 10H
Counts the stall cycles due to the re-order buffer being full.
.It Li RESOURCE_STALLS.FPCW
.Pq Event A2H , Umask 20H
Counts the number of cycles while execution was stalled due to writing the
floating-point unit (FPU) control word.
.It Li RESOURCE_STALLS.MXCSR
.Pq Event A2H , Umask 40H
Stalls due to the MXCSR register rename occurring too close to a previous
MXCSR rename.
The MXCSR provides control and status for the MMX registers.
.It Li RESOURCE_STALLS.OTHER
.Pq Event A2H , Umask 80H
Counts the number of cycles while execution was stalled due to other
resource issues.
.It Li MACRO_INSTS.FUSIONS_DECODED
.Pq Event A6H , Umask 01H
Counts the number of instructions decoded that are macro-fused but not
necessarily executed or retired.
.It Li BACLEAR_FORCE_IQ
.Pq Event A7H , Umask 01H
Counts number of times a BACLEAR was forced by the Instruction Queue.
The IQ is also responsible for providing conditional branch prediction
direction based on a static scheme and dynamic data provided by the L2 Branch
Prediction Unit.
If the conditional branch target is not found in the Target Array and the IQ
predicts that the branch is taken, then the IQ will force
the Branch Address Calculator to issue a BACLEAR.
Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble
in the instruction fetch pipeline.
.It Li LSD.UOPS
.Pq Event A8H , Umask 01H
Counts the number of micro-ops delivered by the loop stream detector.
Use cmask=1 and invert to count cycles.
.It Li ITLB_FLUSH
.Pq Event AEH , Umask 01H
Counts the number of ITLB flushes.
.It Li OFFCORE_REQUESTS.L1D_WRITEBACK
.Pq Event B0H , Umask 40H
Counts number of L1D writebacks to the uncore.
.It Li UOPS_EXECUTED.PORT0
.Pq Event B1H , Umask 01H
Counts number of Uops executed that were issued on port 0.
Port 0 handles integer arithmetic, SIMD and FP add Uops.
.It Li UOPS_EXECUTED.PORT1
.Pq Event B1H , Umask 02H
Counts number of Uops executed that were issued on port 1.
Port 1 handles integer arithmetic, SIMD, integer shift, FP multiply and
FP divide Uops.
.It Li UOPS_EXECUTED.PORT2_CORE
.Pq Event B1H , Umask 04H
Counts number of Uops executed that were issued on port 2.
Port 2 handles the load Uops.
This is a core count only and cannot be collected per thread.
.It Li UOPS_EXECUTED.PORT3_CORE
.Pq Event B1H , Umask 08H
Counts number of Uops executed that were issued on port 3.
Port 3 handles store Uops.
This is a core count only and cannot be collected per thread.
.It Li UOPS_EXECUTED.PORT4_CORE
.Pq Event B1H , Umask 10H
Counts number of Uops executed that were issued on port 4.
Port 4 handles the value to be stored for the store Uops issued on port 3.
This is a core count only and cannot be collected per thread.
.It Li UOPS_EXECUTED.CORE_ACTIVE_CYCLES_NO_PORT5
.Pq Event B1H , Umask 1FH
Counts cycles when the Uops executed were issued from any port except
port 5.
Use Cmask=1 for active cycles and Cmask=0 for weighted cycles.
Use Cmask=1, Invert=1 to count P0-4 stalled cycles.
Use Cmask=1, Edge=1, Invert=1 to count P0-4 stalls.
.It Li UOPS_EXECUTED.PORT5
.Pq Event B1H , Umask 20H
Counts number of Uops executed that were issued on port 5.
.It Li UOPS_EXECUTED.CORE_ACTIVE_CYCLES
.Pq Event B1H , Umask 3FH
Counts cycles when the Uops are executing.
Use Cmask=1 for active cycles and Cmask=0 for weighted cycles.
Use Cmask=1, Invert=1 to count P0-4 stalled cycles.
Use Cmask=1, Edge=1, Invert=1 to count P0-4 stalls.
.It Li UOPS_EXECUTED.PORT015
.Pq Event B1H , Umask 40H
Counts number of Uops executed that were issued on port 0, 1, or 5.
Use Cmask=1, Invert=1 to count stall cycles.
.It Li UOPS_EXECUTED.PORT234
.Pq Event B1H , Umask 80H
Counts number of Uops executed that were issued on port 2, 3, or 4.
9751fa7f10bSFabien Thomas.It Li OFFCORE_REQUESTS_SQ_FULL
9761fa7f10bSFabien Thomas.Pq Event B2H , Umask 01H
9771fa7f10bSFabien ThomasCounts number of cycles the SQ is full to handle off-core requests.
9781fa7f10bSFabien Thomas.It Li OFF_CORE_RESPONSE_0
9791fa7f10bSFabien Thomas.Pq Event B7H , Umask 01H
9801fa7f10bSFabien Thomassee Section 30.6.1.3, Off-core Response Performance Monitoring in the
9811fa7f10bSFabien ThomasProcessor Core
9821fa7f10bSFabien ThomasRequires programming MSR 01A6H
.It Li SNOOP_RESPONSE.HIT
.Pq Event B8H , Umask 01H
Counts HIT snoop response sent by this thread in response to a snoop
request.
.It Li SNOOP_RESPONSE.HITE
.Pq Event B8H , Umask 02H
Counts HIT E snoop response sent by this thread in response to a snoop
request.
.It Li SNOOP_RESPONSE.HITM
.Pq Event B8H , Umask 04H
Counts HIT M snoop response sent by this thread in response to a snoop
request.
.It Li OFF_CORE_RESPONSE_1
.Pq Event BBH , Umask 01H
See Section 30.6.1.3, Off-core Response Performance Monitoring in the
Processor Core.
Requires programming MSR 01A7H.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 01H
See Table A-1.
Notes: INST_RETIRED.ANY is counted by a designated fixed counter.
INST_RETIRED.ANY_P is counted by a programmable counter and is an
architectural performance event.
Event is supported if CPUID.A.EBX[1] = 0.
Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not
count as retired instructions.
.It Li INST_RETIRED.X87
.Pq Event C0H , Umask 02H
Counts the number of floating point computational operations retired:
floating point computational operations executed by the assist handler and
sub-operations of complex floating point instructions like transcendental
instructions.
.It Li INST_RETIRED.MMX
.Pq Event C0H , Umask 04H
Counts the number of MMX instructions retired.
.It Li UOPS_RETIRED.ANY
.Pq Event C2H , Umask 01H
Counts the number of micro-ops retired (macro-fused=1, micro-fused=2,
others=1; maximum count of 8 per cycle).
Most instructions are composed of one or two micro-ops.
Some instructions are decoded into longer sequences such as repeat instructions,
floating point transcendental instructions, and assists.
Use Cmask=1 and Invert to count active cycles or stalled cycles.
.It Li UOPS_RETIRED.RETIRE_SLOTS
.Pq Event C2H , Umask 02H
Counts the number of retirement slots used each cycle.
.It Li UOPS_RETIRED.MACRO_FUSED
.Pq Event C2H , Umask 04H
Counts number of macro-fused uops retired.
.It Li MACHINE_CLEARS.CYCLES
.Pq Event C3H , Umask 01H
Counts the cycles machine clear is asserted.
.It Li MACHINE_CLEARS.MEM_ORDER
.Pq Event C3H , Umask 02H
Counts the number of machine clears due to memory order conflicts.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 04H
Counts the number of times that a program writes to a code section.
Self-modifying code causes a severe penalty in all Intel 64 and IA-32
processors.
The modified cache line is written back to the L2 and L3 caches.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
See Table A-1.
.It Li BR_INST_RETIRED.CONDITIONAL
.Pq Event C4H , Umask 01H
Counts the number of conditional branch instructions retired.
.It Li BR_INST_RETIRED.NEAR_CALL
.Pq Event C4H , Umask 02H
Counts the number of direct and indirect near unconditional calls retired.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 04H
Counts the number of branch instructions retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
See Table A-1.
.It Li BR_MISP_RETIRED.NEAR_CALL
.Pq Event C5H , Umask 02H
Counts mispredicted direct and indirect near unconditional retired calls.
.It Li SSEX_UOPS_RETIRED.PACKED_SINGLE
.Pq Event C7H , Umask 01H
Counts SIMD packed single-precision floating point Uops retired.
.It Li SSEX_UOPS_RETIRED.SCALAR_SINGLE
.Pq Event C7H , Umask 02H
Counts SIMD scalar single-precision floating point Uops retired.
.It Li SSEX_UOPS_RETIRED.PACKED_DOUBLE
.Pq Event C7H , Umask 04H
Counts SIMD packed double-precision floating point Uops retired.
.It Li SSEX_UOPS_RETIRED.SCALAR_DOUBLE
.Pq Event C7H , Umask 08H
Counts SIMD scalar double-precision floating point Uops retired.
.It Li SSEX_UOPS_RETIRED.VECTOR_INTEGER
.Pq Event C7H , Umask 10H
Counts 128-bit SIMD vector integer Uops retired.
.It Li ITLB_MISS_RETIRED
.Pq Event C8H , Umask 20H
Counts the number of retired instructions that missed the ITLB when the
instruction was fetched.
.It Li MEM_LOAD_RETIRED.L1D_HIT
.Pq Event CBH , Umask 01H
Counts number of retired loads that hit the L1 data cache.
.It Li MEM_LOAD_RETIRED.L2_HIT
.Pq Event CBH , Umask 02H
Counts number of retired loads that hit the L2 data cache.
.It Li MEM_LOAD_RETIRED.L3_UNSHARED_HIT
.Pq Event CBH , Umask 04H
Counts number of retired loads that hit their own, unshared lines in the L3
cache.
.It Li MEM_LOAD_RETIRED.OTHER_CORE_L2_HIT_HITM
.Pq Event CBH , Umask 08H
Counts number of retired loads that hit in a sibling core's L2 (on die core).
Since the L3 is inclusive of all cores on the package, this is an L3 hit.
This counts both clean and modified hits.
.It Li MEM_LOAD_RETIRED.L3_MISS
.Pq Event CBH , Umask 10H
Counts number of retired loads that miss the L3 cache.
The load was satisfied by a remote socket, local memory or an IOH.
.It Li MEM_LOAD_RETIRED.HIT_LFB
.Pq Event CBH , Umask 40H
Counts number of retired loads that miss the L1D and whose address is located
in an allocated line fill buffer and will soon be committed to cache.
This is counting secondary L1D misses.
.It Li MEM_LOAD_RETIRED.DTLB_MISS
.Pq Event CBH , Umask 80H
Counts the number of retired loads that missed the DTLB.
The DTLB miss is not counted if the load operation causes a fault.
This event counts loads from cacheable memory only.
The event does not count loads by software prefetches.
Counts both primary and secondary misses to the TLB.
.It Li FP_MMX_TRANS.TO_FP
.Pq Event CCH , Umask 01H
Counts the first floating-point instruction following any MMX instruction.
You can use this event to estimate the penalties for the transitions between
floating-point and MMX technology states.
.It Li FP_MMX_TRANS.TO_MMX
.Pq Event CCH , Umask 02H
Counts the first MMX instruction following a floating-point instruction.
You can use this event to estimate the penalties for the transitions between
floating-point and MMX technology states.
.It Li FP_MMX_TRANS.ANY
.Pq Event CCH , Umask 03H
Counts all transitions from floating point to MMX instructions and from MMX
instructions to floating point instructions.
You can use this event to estimate the penalties for the transitions between
floating-point and MMX technology states.
.It Li MACRO_INSTS.DECODED
.Pq Event D0H , Umask 01H
Counts the number of instructions decoded (but not necessarily executed or
retired).
.It Li UOPS_DECODED.MS
.Pq Event D1H , Umask 02H
Counts the number of Uops decoded by the Microcode Sequencer, MS.
The MS delivers uops when the instruction is more than 4 uops long or a microcode
assist is occurring.
.It Li UOPS_DECODED.ESP_FOLDING
.Pq Event D1H , Umask 04H
Counts number of stack pointer (ESP) instructions decoded: push, pop, call,
ret, etc.
ESP instructions do not generate a Uop to increment or decrement ESP.
Instead, they update an ESP_Offset register that keeps track of the
delta to the current value of the ESP register.
.It Li UOPS_DECODED.ESP_SYNC
.Pq Event D1H , Umask 08H
Counts number of stack pointer (ESP) sync operations where an ESP
instruction is corrected by adding the ESP offset register to the current
value of the ESP register.
.It Li RAT_STALLS.FLAGS
.Pq Event D2H , Umask 01H
Counts the number of cycles during which execution stalled due to several
reasons, one of which is a partial flag register stall.
A partial register stall may occur when two conditions are met:
1) an instruction modifies some, but not all, of the flags in the flag
register and 2) the next instruction, which depends on flags, depends on
flags that were not modified by this instruction.
.It Li RAT_STALLS.REGISTERS
.Pq Event D2H , Umask 02H
This event counts the number of cycles instruction execution latency became
longer than the defined latency because the instruction used a register that
was partially written by a previous instruction.
.It Li RAT_STALLS.ROB_READ_PORT
.Pq Event D2H , Umask 04H
Counts the number of cycles when ROB read port stalls occurred, which did
not allow new micro-ops to enter the out-of-order pipeline.
Note that, at this stage in the pipeline, additional stalls may occur at
the same cycle and prevent the stalled micro-ops from entering the pipe.
In such a case, micro-ops retry entering the execution pipe in the next
cycle and the ROB-read port stall is counted again.
.It Li RAT_STALLS.SCOREBOARD
.Pq Event D2H , Umask 08H
Counts the cycles where we stall due to microarchitecturally required
serialization.
Microcode scoreboarding stalls.
.It Li RAT_STALLS.ANY
.Pq Event D2H , Umask 0FH
Counts all Register Allocation Table stall cycles due to:
cycles when ROB read port stalls occurred, which did not allow new micro-ops
to enter the execution pipe;
cycles when partial register stalls occurred;
cycles when flag stalls occurred;
cycles when floating-point unit (FPU) status word stalls occurred.
To count each of these conditions separately use the events:
RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and
RAT_STALLS.FPSW.
.It Li SEG_RENAME_STALLS
.Pq Event D4H , Umask 01H
Counts the number of stall cycles due to the lack of renaming resources for
the ES, DS, FS, and GS segment registers.
If a segment is renamed but not retired and a second update to the same
segment occurs, a stall occurs in the front-end of the pipeline until the
renamed segment retires.
.It Li ES_REG_RENAMES
.Pq Event D5H , Umask 01H
Counts the number of times the ES segment register is renamed.
.It Li UOP_UNFUSION
.Pq Event DBH , Umask 01H
Counts unfusion events due to floating point exception to a fused uop.
.It Li BR_INST_DECODED
.Pq Event E0H , Umask 01H
Counts the number of branch instructions decoded.
.It Li BPU_MISSED_CALL_RET
.Pq Event E5H , Umask 01H
Counts number of times the Branch Prediction Unit missed predicting a call
or return branch.
.It Li BACLEAR.CLEAR
.Pq Event E6H , Umask 01H
Counts the number of times the front end is resteered, mainly when the
Branch Prediction Unit cannot provide a correct prediction and this is
corrected by the Branch Address Calculator at the front end.
This can occur if the code has many branches such that they cannot be
consumed by the BPU.
Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble
in the instruction fetch pipeline.
The effect on total execution time depends on the surrounding code.
.It Li BACLEAR.BAD_TARGET
.Pq Event E6H , Umask 02H
Counts number of Branch Address Calculator clears (BACLEAR) asserted due to
conditional branch instructions in which there was a target hit but the
direction was wrong.
Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in
the instruction fetch pipeline.
.It Li BPU_CLEARS.EARLY
.Pq Event E8H , Umask 01H
Counts early (normal) Branch Prediction Unit clears: BPU predicted a taken
branch after incorrectly assuming that it was not taken.
The BPU clear leads to a 2 cycle bubble in the Front End.
.It Li BPU_CLEARS.LATE
.Pq Event E8H , Umask 02H
Counts late Branch Prediction Unit clears due to Most Recently Used conflicts.
The BPU clear leads to a 3 cycle bubble in the Front End.
.It Li L2_TRANSACTIONS.LOAD
.Pq Event F0H , Umask 01H
Counts L2 load operations due to HW prefetch or demand loads.
.It Li L2_TRANSACTIONS.RFO
.Pq Event F0H , Umask 02H
Counts L2 RFO operations due to HW prefetch or demand RFOs.
.It Li L2_TRANSACTIONS.IFETCH
.Pq Event F0H , Umask 04H
Counts L2 instruction fetch operations due to HW prefetch or demand ifetch.
.It Li L2_TRANSACTIONS.PREFETCH
.Pq Event F0H , Umask 08H
Counts L2 prefetch operations.
.It Li L2_TRANSACTIONS.L1D_WB
.Pq Event F0H , Umask 10H
Counts L1D writeback operations to the L2.
.It Li L2_TRANSACTIONS.FILL
.Pq Event F0H , Umask 20H
Counts L2 cache line fill operations due to load, RFO, L1D writeback or
prefetch.
.It Li L2_TRANSACTIONS.WB
.Pq Event F0H , Umask 40H
Counts L2 writeback operations to the L3.
.It Li L2_TRANSACTIONS.ANY
.Pq Event F0H , Umask 80H
Counts all L2 cache operations.
.It Li L2_LINES_IN.S_STATE
.Pq Event F1H , Umask 02H
Counts the number of cache lines allocated in the L2 cache in the S (shared)
state.
.It Li L2_LINES_IN.E_STATE
.Pq Event F1H , Umask 04H
Counts the number of cache lines allocated in the L2 cache in the E
(exclusive) state.
.It Li L2_LINES_IN.ANY
.Pq Event F1H , Umask 07H
Counts the number of cache lines allocated in the L2 cache.
.It Li L2_LINES_OUT.DEMAND_CLEAN
.Pq Event F2H , Umask 01H
Counts L2 clean cache lines evicted by a demand request.
.It Li L2_LINES_OUT.DEMAND_DIRTY
.Pq Event F2H , Umask 02H
Counts L2 dirty (modified) cache lines evicted by a demand request.
.It Li L2_LINES_OUT.PREFETCH_CLEAN
.Pq Event F2H , Umask 04H
Counts L2 clean cache lines evicted by a prefetch request.
.It Li L2_LINES_OUT.PREFETCH_DIRTY
.Pq Event F2H , Umask 08H
Counts L2 modified cache lines evicted by a prefetch request.
.It Li L2_LINES_OUT.ANY
.Pq Event F2H , Umask 0FH
Counts all L2 cache lines evicted for any reason.
.It Li SQ_MISC.SPLIT_LOCK
.Pq Event F4H , Umask 10H
Counts the number of SQ lock splits across a cache line.
.It Li SQ_FULL_STALL_CYCLES
.Pq Event F6H , Umask 01H
Counts cycles the Super Queue is full.
Neither of the threads on this core will be able to access the uncore.
.It Li FP_ASSIST.ALL
.Pq Event F7H , Umask 01H
Counts the number of floating point operations executed that required
micro-code assist intervention.
Assists are required in the following cases:
SSE instructions (denormal input when the DAZ flag is off, or underflow
result when the FTZ flag is off);
x87 instructions (NaN or denormal loaded to a register or used as input
from memory, division by 0, or underflow output).
.It Li FP_ASSIST.OUTPUT
.Pq Event F7H , Umask 02H
Counts number of floating point micro-code assists when the output value
(destination register) is invalid.
.It Li FP_ASSIST.INPUT
.Pq Event F7H , Umask 04H
Counts number of floating point micro-code assists when the input value (one
of the source operands to an FP instruction) is invalid.
.It Li SIMD_INT_64.PACKED_MPY
.Pq Event FDH , Umask 01H
Counts number of SIMD integer 64 bit packed multiply operations.
.It Li SIMD_INT_64.PACKED_SHIFT
.Pq Event FDH , Umask 02H
Counts number of SIMD integer 64 bit packed shift operations.
.It Li SIMD_INT_64.PACK
.Pq Event FDH , Umask 04H
Counts number of SIMD integer 64 bit pack operations.
.It Li SIMD_INT_64.UNPACK
.Pq Event FDH , Umask 08H
Counts number of SIMD integer 64 bit unpack operations.
.It Li SIMD_INT_64.PACKED_LOGICAL
.Pq Event FDH , Umask 10H
Counts number of SIMD integer 64 bit logical operations.
.It Li SIMD_INT_64.PACKED_ARITH
.Pq Event FDH , Umask 20H
Counts number of SIMD integer 64 bit arithmetic operations.
.It Li SIMD_INT_64.SHUFFLE_MOVE
.Pq Event FDH , Umask 40H
Counts number of SIMD integer 64 bit shuffle or move operations.
.El
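Several entries above suggest combining an event with Cmask, Invert and Edge
settings, for example Cmask=1, Invert=1 on UOPS_EXECUTED.CORE_ACTIVE_CYCLES
to count stalled cycles.
The following is a minimal sketch of how such a combination maps onto the raw
event-select value, assuming the architectural IA32_PERFEVTSELx bit layout
(event select in bits 0-7, unit mask in bits 8-15, edge detect in bit 18,
invert in bit 23, counter mask in bits 24-31).
The helper name evtsel_encode() is hypothetical and is not part of libpmc; the
USR, OS and enable bits are deliberately left for the caller to add.

```c
#include <stdint.h>

/* Bit positions assumed from the architectural IA32_PERFEVTSELx layout. */
#define EVTSEL_EDGE	(UINT32_C(1) << 18)	/* count rising edges only */
#define EVTSEL_INV	(UINT32_C(1) << 23)	/* invert the Cmask comparison */

/*
 * Hypothetical helper (not a libpmc function): combine an event code,
 * unit mask and the Cmask/Edge/Invert settings into one 32-bit value.
 */
static uint32_t
evtsel_encode(uint8_t event, uint8_t umask, uint8_t cmask, int edge, int inv)
{
	uint32_t v;

	v = (uint32_t)event;			/* bits 0-7: event select */
	v |= (uint32_t)umask << 8;		/* bits 8-15: unit mask */
	v |= (uint32_t)cmask << 24;		/* bits 24-31: counter mask */
	if (edge)
		v |= EVTSEL_EDGE;
	if (inv)
		v |= EVTSEL_INV;
	return (v);
}
```

Under these assumptions, evtsel_encode(0xB1, 0x3F, 1, 0, 1) describes
UOPS_EXECUTED.CORE_ACTIVE_CYCLES with Cmask=1, Invert=1 (stalled cycles), and
passing edge=1 as well describes the count of stalls rather than stalled cycles.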
.Ss Event Specifiers (Programmable PMCs)
Core i7 and Xeon 5500 programmable PMCs support the following events as of
the June 2009 document (removed in December 2009):
.Bl -tag -width indent
.It Li SB_FORWARD.ANY
.Pq Event 02H , Umask 01H
Counts the number of store forwards.
.It Li LOAD_BLOCK.STD
.Pq Event 03H , Umask 01H
Counts the number of loads blocked by a preceding store with unknown data.
.It Li LOAD_BLOCK.ADDRESS_OFFSET
.Pq Event 03H , Umask 04H
Counts the number of loads blocked by a preceding store address.
.It Li LOAD_BLOCK.ADDRESS_OFFSET
.Pq Event 01H , Umask 04H
Counts the cycles of store buffer drains.
.It Li MISALIGN_MEM_REF.LOAD
.Pq Event 05H , Umask 01H
Counts the number of misaligned load references.
.It Li MISALIGN_MEM_REF.STORE
.Pq Event 05H , Umask 02H
Counts the number of misaligned store references.
.It Li MISALIGN_MEM_REF.ANY
.Pq Event 05H , Umask 03H
Counts the number of misaligned memory references.
.It Li STORE_BLOCKS.NOT_STA
.Pq Event 06H , Umask 01H
This event counts the number of load operations delayed by preceding
stores whose addresses are known but whose data is unknown, and by preceding
stores that conflict with the load but which incompletely overlap the load.
.It Li STORE_BLOCKS.STA
.Pq Event 06H , Umask 02H
This event counts load operations delayed by preceding stores whose
addresses are unknown (STA block).
.It Li STORE_BLOCKS.ANY
.Pq Event 06H , Umask 0FH
Counts all loads delayed due to store blocks.
.It Li MEMORY_DISAMBIGURATION.RESET
.Pq Event 09H , Umask 01H
Counts memory disambiguation reset cycles.
.It Li MEMORY_DISAMBIGURATION.SUCCESS
.Pq Event 09H , Umask 02H
Counts the number of loads for which memory disambiguation succeeded.
.It Li MEMORY_DISAMBIGURATION.WATCHDOG
.Pq Event 09H , Umask 04H
Counts the number of times the memory disambiguation watchdog kicked in.
.It Li MEMORY_DISAMBIGURATION.WATCH_CYCLES
.Pq Event 09H , Umask 08H
Counts the cycles that the memory disambiguation watchdog is active.
Set Invert=1, Cmask=1.
.It Li HW_INT.RCV
.Pq Event 1DH , Umask 01H
Number of interrupts received.
.It Li HW_INT.CYCLES_MASKED
.Pq Event 1DH , Umask 02H
Number of cycles interrupts are masked.
.It Li HW_INT.CYCLES_PENDING_AND_MASKED
.Pq Event 1DH , Umask 04H
Number of cycles interrupts are pending and masked.
.It Li HW_INT.CYCLES_PENDING_AND_MASKED
.Pq Event 04H , Umask 04H
Counts number of L2 store RFO requests where the cache line to be loaded is
in the E (exclusive) state.
The L1D prefetcher does not issue a RFO prefetch.
This is a demand RFO request.
.It Li HW_INT.CYCLES_PENDING_AND_MASKED
.Pq Event 27H , Umask 04H
LONGEST_LAT_CACHE.MISS
.It Li UOPS_DECODED.DEC0
.Pq Event 3DH , Umask 01H
Counts micro-ops decoded by decoder 0.
.It Li UOPS_DECODED.DEC0
.Pq Event 01H , Umask 01H
Counts L1 data cache store RFO requests where the cache line to be loaded is
in the I state.
Counter 0, 1 only.
.It Li L1D_CACHE_ST.MESI
.Pq Event 41H , Umask 0FH
Counts L1 data cache store RFO requests.
Counter 0, 1 only.
.It Li DTLB_MISSES.PDE_MISS
.Pq Event 49H , Umask 20H
Number of DTLB cache misses where the low part of the linear to physical
address translation was missed.
.It Li DTLB_MISSES.PDP_MISS
.Pq Event 49H , Umask 40H
Number of DTLB misses where the high part of the linear to physical address
translation was missed.
.It Li DTLB_MISSES.LARGE_WALK_COMPLETED
.Pq Event 49H , Umask 80H
Counts number of completed large page walks due to misses in the STLB.
.It Li SSE_MEM_EXEC.NTA
.Pq Event 4BH , Umask 01H
Counts number of SSE NTA prefetch/weakly-ordered instructions which missed
the L1 data cache.
.It Li SSE_MEM_EXEC.STREAMING_STORES
.Pq Event 4BH , Umask 08H
Counts number of SSE non-temporal stores.
.It Li SFENCE_CYCLES
.Pq Event 4DH , Umask 01H
Counts store fence cycles.
.It Li EPT.EPDE_MISS
.Pq Event 4FH , Umask 02H
Counts Extended Page Directory Entry misses.
The Extended Page Directory cache is used by Virtual Machine operating
systems while the guest operating systems use the standard TLB caches.
.It Li EPT.EPDPE_HIT
.Pq Event 4FH , Umask 04H
Counts Extended Page Directory Pointer Entry hits.
.It Li EPT.EPDPE_MISS
.Pq Event 4FH , Umask 08H
Counts Extended Page Directory Pointer Entry misses.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_DATA
.Pq Event 60H , Umask 01H
Counts weighted cycles of offcore demand data read requests.
Does not include L2 prefetch requests.
Counter 0.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_CODE
.Pq Event 60H , Umask 02H
Counts weighted cycles of offcore demand code read requests.
Does not include L2 prefetch requests.
Counter 0.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND.RFO
.Pq Event 60H , Umask 04H
Counts weighted cycles of offcore demand RFO requests.
Does not include L2 prefetch requests.
Counter 0.
.It Li OFFCORE_REQUESTS_OUTSTANDING.ANY.READ
.Pq Event 60H , Umask 08H
Counts weighted cycles of offcore read requests of any kind.
Includes L2 prefetch requests.
Counter 0.
.It Li IFU_IVC.FULL
.Pq Event 81H , Umask 01H
Instruction fetch unit victim cache full.
.It Li IFU_IVC.L1I_EVICTION
.Pq Event 81H , Umask 02H
L1 instruction cache evictions.
.It Li L1I_OPPORTUNISTIC_HITS
.Pq Event 83H , Umask 01H
Opportunistic hits in streaming.
.It Li ITLB_MISSES.WALK_CYCLES
.Pq Event 85H , Umask 04H
Counts ITLB miss page walk cycles.
.It Li ITLB_MISSES.PMH_BUSY_CYCLES
.Pq Event 85H , Umask 04H
Counts PMH busy cycles.
.It Li ITLB_MISSES.STLB_HIT
.Pq Event 85H , Umask 10H
Counts the number of ITLB misses that hit in the second level TLB.
.It Li ITLB_MISSES.PDE_MISS
.Pq Event 85H , Umask 20H
Number of ITLB misses where the low part of the linear to physical address
translation was missed.
.It Li ITLB_MISSES.PDP_MISS
.Pq Event 85H , Umask 40H
Number of ITLB misses where the high part of the linear to physical address
translation was missed.
.It Li ITLB_MISSES.LARGE_WALK_COMPLETED
.Pq Event 85H , Umask 80H
Counts number of completed large page walks due to misses in the STLB.
.It Li OFFCORE_REQUESTS.DEMAND.READ_DATA
.Pq Event B0H , Umask 01H
Counts number of offcore demand data read requests.
Does not count L2 prefetch requests.
.It Li OFFCORE_REQUESTS.DEMAND.READ_CODE
.Pq Event B0H , Umask 02H
Counts number of offcore demand code read requests.
Does not count L2 prefetch requests.
.It Li OFFCORE_REQUESTS.DEMAND.RFO
.Pq Event B0H , Umask 04H
Counts number of offcore demand RFO requests.
Does not count L2 prefetch requests.
.It Li OFFCORE_REQUESTS.ANY.READ
.Pq Event B0H , Umask 08H
Counts number of offcore read requests.
Includes L2 prefetch requests.
.It Li OFFCORE_REQUESTS.ANY.RFO
.Pq Event B0H , Umask 10H
Counts number of offcore RFO requests.
Includes L2 prefetch requests.
.It Li OFFCORE_REQUESTS.UNCACHED_MEM
.Pq Event B0H , Umask 20H
Counts number of offcore uncached memory requests.
.It Li OFFCORE_REQUESTS.ANY
.Pq Event B0H , Umask 80H
Counts all offcore requests.
.It Li SNOOPQ_REQUESTS_OUTSTANDING.DATA
.Pq Event B3H , Umask 01H
Counts weighted cycles of snoopq requests for data.
Counter 0 only.
Use cmask=1 to count cycles in which the queue is not empty.
.It Li SNOOPQ_REQUESTS_OUTSTANDING.INVALIDATE
.Pq Event B3H , Umask 02H
Counts weighted cycles of snoopq invalidate requests.
Counter 0 only.
Use cmask=1 to count cycles in which the queue is not empty.
.It Li SNOOPQ_REQUESTS_OUTSTANDING.CODE
.Pq Event B3H , Umask 04H
Counts weighted cycles of snoopq requests for code.
Counter 0 only.
Use cmask=1 to count cycles in which the queue is not empty.
.It Li PIC_ACCESSES.TPR_READS
.Pq Event BAH , Umask 01H
Counts number of TPR reads.
.It Li PIC_ACCESSES.TPR_WRITES
.Pq Event BAH , Umask 02H
Counts number of TPR writes.
.It Li MACHINE_CLEARS.FUSION_ASSIST
.Pq Event C3H , Umask 10H
Counts the number of macro-fusion assists.
.It Li BOGUS_BR
.Pq Event E4H , Umask 01H
Counts the number of bogus branches.
.It Li L2_HW_PREFETCH.HIT
.Pq Event F3H , Umask 01H
Counts L2 HW prefetcher detector hits.
.It Li L2_HW_PREFETCH.ALLOC
.Pq Event F3H , Umask 02H
Counts L2 HW prefetcher allocations.
.It Li L2_HW_PREFETCH.DATA_TRIGGER
.Pq Event F3H , Umask 04H
Counts the number of times the L2 HW data prefetcher was triggered.
.It Li L2_HW_PREFETCH.CODE_TRIGGER
.Pq Event F3H , Umask 08H
Counts the number of times the L2 HW code prefetcher was triggered.
.It Li L2_HW_PREFETCH.DCA_TRIGGER
.Pq Event F3H , Umask 10H
Counts the number of times the L2 HW DCA prefetcher was triggered.
.It Li L2_HW_PREFETCH.KICK_START
.Pq Event F3H , Umask 20H
Counts the number of times the L2 HW prefetcher was kick-started.
.It Li SQ_MISC.PROMOTION
.Pq Event F4H , Umask 01H
Counts the number of L2 secondary misses that hit the Super Queue.
.It Li SQ_MISC.PROMOTION_POST_GO
.Pq Event F4H , Umask 02H
Counts the number of L2 secondary misses during the Super Queue filling L2.
.It Li SQ_MISC.LRU_HINTS
.Pq Event F4H , Umask 04H
Counts number of Super Queue LRU hints sent to L3.
.It Li SQ_MISC.FILL_DROPPED
.Pq Event F4H , Umask 08H
Counts the number of SQ L2 fills dropped due to L2 busy.
.It Li SEGMENT_REG_LOADS
.Pq Event F8H , Umask 01H
Counts number of segment register loads.
.El
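.Sh EXAMPLES
The following is a minimal sketch, not taken from the library sources, of how
the
.Dq Event
and
.Dq Umask
values listed above, together with a
.Dq cmask
qualifier, can be packed into an Intel performance event select value.
The helper name
.Fn evtsel_encode
and the
.Dv EVTSEL_*
macros are hypothetical and exist only for illustration.
The sketch assumes the architectural register layout: event select in
bits 0-7, unit mask in bits 8-15, the USR, OS and EN flags in bits 16, 17
and 22, and the counter mask in bits 24-31.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions of the event select fields used below. */
#define EVTSEL_UMASK_SHIFT	8
#define EVTSEL_USR		(1u << 16)	/* count in user mode */
#define EVTSEL_OS		(1u << 17)	/* count in kernel mode */
#define EVTSEL_EN		(1u << 22)	/* enable the counter */
#define EVTSEL_CMASK_SHIFT	24

/*
 * Hypothetical helper: pack an event code, a unit mask and a counter
 * mask into one 32-bit event select value.  Each field is 8 bits wide.
 */
static uint32_t
evtsel_encode(unsigned event, unsigned umask, unsigned cmask)
{
	assert(event <= 0xFF && umask <= 0xFF && cmask <= 0xFF);
	return ((uint32_t)event |
	    ((uint32_t)umask << EVTSEL_UMASK_SHIFT) |
	    ((uint32_t)cmask << EVTSEL_CMASK_SHIFT) |
	    EVTSEL_USR | EVTSEL_OS | EVTSEL_EN);
}
```
.Ed
.Pp
Under these assumptions,
.Li SNOOPQ_REQUESTS_OUTSTANDING.DATA
.Pq Event B3H , Umask 01H
with cmask=1 encodes as 0x014301B3.
Applications normally pass event names and qualifiers to
.Xr pmc_allocate 3
rather than building raw values themselves.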
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.amd 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.corei7uc 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc.ucf 3 ,
.Xr pmc.westmere 3 ,
.Xr pmc.westmereuc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .