.\" Copyright (c) 2014 Hiren Panchasara <hiren@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd April 6, 2017
.Dt PMC.ATOMSILVERMONT 3
.Os
.Sh NAME
.Nm pmc.atomsilvermont
.Nd measurement events for
.Tn Intel
.Tn Atom Silvermont
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn Atom Silvermont
CPUs contain PMCs conforming to version 3 of the
.Tn Intel
performance measurement architecture.
These CPUs contain two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Atom Silvermont PMCs are documented in
.Rs
.%B "Intel 64 and IA-32 Intel(R) Architecture Software Developer's Manual"
.%T "Combined Volumes"
.%N "Order Number 325462-050US"
.%D February 2014
.%Q "Intel Corporation"
.Re
.Ss ATOM SILVERMONT FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss ATOM SILVERMONT PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li any
Count matching events seen on any logical processor in a package.
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
.Pp
Events that require core-specificity to be specified use an
additional qualifier
.Dq Li core= Ns Ar core ,
where argument
.Ar core
is one of:
.Bl -tag -width indent
.It Li all
Measure event conditions on all cores.
.It Li this
Measure event conditions on this core.
.El
.Pp
The default is
.Dq Li this .
.Pp
Events that require an agent qualifier to be specified use an
additional qualifier
.Dq Li agent= Ns Ar agent ,
where argument
.Ar agent
is one of:
.Bl -tag -width indent
.It Li this
Measure events associated with this bus agent.
.It Li any
Measure events caused by any bus agent.
.El
.Pp
The default is
.Dq Li this .
.Pp
Events that require a hardware prefetch qualifier to be specified use an
additional qualifier
.Dq Li prefetch= Ns Ar prefetch ,
where argument
.Ar prefetch
is one of:
.Bl -tag -width "exclude"
.It Li both
Include all prefetches.
.It Li only
Only count hardware prefetches.
.It Li exclude
Exclude hardware prefetches.
.El
.Pp
The default is
.Dq Li both .
.Pp
Events that require a cache coherence qualifier to be specified use an
additional qualifier
.Dq Li cachestate= Ns Ar state ,
where argument
.Ar state
contains one or more of the following letters:
.Bl -tag -width indent
.It Li e
Count cache lines in the exclusive state.
.It Li i
Count cache lines in the invalid state.
.It Li m
Count cache lines in the modified state.
.It Li s
Count cache lines in the shared state.
.El
.Pp
The default is
.Dq Li eims .
.Pp
Events that require a snoop response qualifier to be specified use an
additional qualifier
.Dq Li snoopresponse= Ns Ar response ,
where argument
.Ar response
consists of one or more of the following keywords separated by
.Dq +
signs:
.Bl -tag -width indent
.It Li clean
Measure CLEAN responses.
.It Li hit
Measure HIT responses.
.It Li hitm
Measure HITM responses.
.El
.Pp
The default is to measure all the above responses.
.Pp
Events that require a snoop type qualifier use an additional qualifier
.Dq Li snooptype= Ns Ar type ,
where argument
.Ar type
is one of the following keywords:
.Bl -tag -width indent
.It Li cmp2i
Measure CMP2I snoops.
.It Li cmp2s
Measure CMP2S snoops.
.El
.Pp
The default is to measure both snoops.
.Ss Event Specifiers (Programmable PMCs)
Atom Silvermont programmable PMCs support the following events:
.Bl -tag -width indent
.It Li REHABQ.LD_BLOCK_ST_FORWARD
.Pq Event 03H , Umask 01H
The number of retired loads that were
prohibited from receiving forwarded data from the store
because of address mismatch.
.It Li REHABQ.LD_BLOCK_STD_NOTREADY
.Pq Event 03H , Umask 02H
The cases where a forward was technically possible,
but did not occur because the store data was not available
at the right time.
.It Li REHABQ.ST_SPLITS
.Pq Event 03H , Umask 04H
The number of retired stores that experienced
cache line boundary splits.
.It Li REHABQ.LD_SPLITS
.Pq Event 03H , Umask 08H
The number of retired loads that experienced
cache line boundary splits.
.It Li REHABQ.LOCK
.Pq Event 03H , Umask 10H
The number of retired memory operations with lock semantics.
These are either implicit locked instructions such as the
XCHG instruction or instructions with an explicit LOCK
prefix (0xF0).
.It Li REHABQ.STA_FULL
.Pq Event 03H , Umask 20H
The number of retired stores that are delayed
because there is not a store address buffer available.
.It Li REHABQ.ANY_LD
.Pq Event 03H , Umask 40H
The number of load uops reissued from Rehabq.
.It Li REHABQ.ANY_ST
.Pq Event 03H , Umask 80H
The number of store uops reissued from Rehabq.
.It Li MEM_UOPS_RETIRED.L1_MISS_LOADS
.Pq Event 04H , Umask 01H
The number of load ops retired that miss in the L1
data cache.
Note that prefetch misses will not be counted.
.It Li MEM_UOPS_RETIRED.L2_HIT_LOADS
.Pq Event 04H , Umask 02H
The number of load micro-ops retired that hit L2.
.It Li MEM_UOPS_RETIRED.L2_MISS_LOADS
.Pq Event 04H , Umask 04H
The number of load micro-ops retired that missed L2.
.It Li MEM_UOPS_RETIRED.DTLB_MISS_LOADS
.Pq Event 04H , Umask 08H
The number of load ops retired that had a DTLB miss.
.It Li MEM_UOPS_RETIRED.UTLB_MISS
.Pq Event 04H , Umask 10H
The number of load ops retired that had a UTLB miss.
.It Li MEM_UOPS_RETIRED.HITM
.Pq Event 04H , Umask 20H
The number of load ops retired that got data
from the other core or from the other module.
.It Li MEM_UOPS_RETIRED.ALL_LOADS
.Pq Event 04H , Umask 40H
The number of load ops retired.
.It Li MEM_UOP_RETIRED.ALL_STORES
.Pq Event 04H , Umask 80H
The number of store ops retired.
.It Li PAGE_WALKS.D_SIDE_CYCLES
.Pq Event 05H , Umask 01H
Every cycle when a D-side (walks due to a load) page walk is in progress.
Page walk duration divided by number of page walks is the average duration of
page walks.
The edge trigger bit must be cleared to count cycles.
Set the edge qualifier to count the number of page walks instead.
.It Li PAGE_WALKS.I_SIDE_CYCLES
.Pq Event 05H , Umask 02H
Every cycle when an I-side (walks due to an instruction fetch) page walk is in
progress.
Page walk duration divided by number of page walks is the average duration of
page walks.
.It Li PAGE_WALKS.WALKS
.Pq Event 05H , Umask 03H
The number of times a data (D) page walk or an instruction (I) page walk is
completed or started.
Since a page walk implies a TLB miss, the number of TLB misses can be counted
by counting the number of page walks.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
The number of L2 cache misses.
L3 is not supported in Silvermont microarchitecture.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
The number of requests originating from the core that
reference a cache line in the L2 cache.
L3 is not supported in Silvermont microarchitecture.
.It Li L2_REJECT_XQ.ALL
.Pq Event 30H , Umask 00H
The number of demand and prefetch
transactions that the L2 XQ rejects due to a full or near full
condition which likely indicates back pressure from the IDI link.
The XQ may reject transactions from the L2Q (non-cacheable
requests), BBS (L2 misses) and WOB (L2 write-back victims).
.It Li CORE_REJECT_L2Q.ALL
.Pq Event 31H , Umask 00H
The number of demand and L1 prefetcher
requests rejected by the L2Q due to a full or nearly full condition which
likely indicates back pressure from L2Q.
It also counts requests that would have gone directly to the XQ, but are
rejected due to a full or nearly full condition, indicating back pressure from
the IDI link.
The L2Q may also reject transactions from a core to ensure fairness between
cores, or to delay a core's dirty eviction when the address conflicts with
incoming external snoops.
(Note that L2 prefetcher requests that are dropped are not counted by this
event).
.It Li CPU_CLK_UNHALTED.CORE_P
.Pq Event 3CH , Umask 00H
The number of core cycles while the core is not in a halt state.
The core enters the halt state when it is running the HLT instruction.
In mobile systems the core frequency may change from time to time.
For this reason this event may have a changing ratio with regard to time.
.It Li CPU_CLK_UNHALTED.REF_P
.Pq Event 3CH , Umask 01H
The number of reference cycles that the core is not in a halt state.
The core enters the halt state when it is running the HLT instruction.
In mobile systems the core frequency may change from time to time.
This event is not affected by core frequency changes but counts as if the core
is running at the maximum frequency all the time.
.It Li ICACHE.HIT
.Pq Event 80H , Umask 01H
The number of instruction fetches from the instruction cache.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
The number of instruction fetches that miss the instruction cache or produce
memory requests.
This includes uncacheable fetches.
An instruction fetch miss is counted only once and not once for every cycle
it is outstanding.
.It Li ICACHE.ACCESSES
.Pq Event 80H , Umask 03H
The number of instruction fetches, including uncacheable fetches.
.It Li NIP_STALL.ICACHE_MISS
.Pq Event B6H , Umask 04H
The number of cycles the NIP stalls because of an icache miss.
This is a cumulative count of cycles the NIP stalled for all
icache misses.
.It Li OFFCORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Requires MSR_OFFCORE_RESP0 to specify request type and response.
.It Li OFFCORE_RESPONSE_1
.Pq Event B7H , Umask 02H
Requires MSR_OFFCORE_RESP1 to specify request type and response.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
The number of instructions that retire execution.
For instructions that consist of multiple micro-ops, this event counts the
retirement of the last micro-op of the instruction.
The counter continues counting during hardware interrupts, traps, and inside
interrupt handlers.
.It Li UOPS_RETIRED.MS
.Pq Event C2H , Umask 01H
The number of micro-ops retired that were supplied from MSROM.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 10H
The number of micro-ops retired.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 01H
The number of times that a program writes to a code section.
Self-modifying code causes a severe penalty in all Intel
architecture processors.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
The number of times that the pipeline was cleared due to memory
ordering issues.
.It Li MACHINE_CLEARS.FP_ASSIST
.Pq Event C3H , Umask 04H
The number of times that the pipeline stalled due to FP operations
needing assists.
.It Li MACHINE_CLEARS.ALL
.Pq Event C3H , Umask 08H
The number of times that the pipeline stalled due to any cause
(including SMC, MO, FP assist, etc).
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
The number of branch instructions retired.
.It Li BR_INST_RETIRED.JCC
.Pq Event C4H , Umask 7EH
The number of branch instructions retired that were conditional
jumps.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask BFH
The number of far branch instructions retired.
.It Li BR_INST_RETIRED.NON_RETURN_IND
.Pq Event C4H , Umask EBH
The number of branch instructions retired that were near indirect
call or near indirect jmp.
.It Li BR_INST_RETIRED.RETURN
.Pq Event C4H , Umask F7H
The number of near RET branch instructions retired.
.It Li BR_INST_RETIRED.CALL
.Pq Event C4H , Umask F9H
The number of near CALL branch instructions retired.
.It Li BR_INST_RETIRED.IND_CALL
.Pq Event C4H , Umask FBH
The number of near indirect CALL branch instructions retired.
.It Li BR_INST_RETIRED.REL_CALL
.Pq Event C4H , Umask FDH
The number of near relative CALL branch instructions retired.
.It Li BR_INST_RETIRED.TAKEN_JCC
.Pq Event C4H , Umask FEH
The number of branch instructions retired that were conditional
jumps and predicted taken.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
The number of mispredicted branch instructions retired.
.It Li BR_MISP_RETIRED.JCC
.Pq Event C5H , Umask 7EH
The number of mispredicted branch instructions retired that were
conditional jumps.
.It Li BR_MISP_RETIRED.FAR
.Pq Event C5H , Umask BFH
The number of mispredicted far branch instructions retired.
.It Li BR_MISP_RETIRED.NON_RETURN_IND
.Pq Event C5H , Umask EBH
The number of mispredicted branch instructions retired that were
near indirect call or near indirect jmp.
.It Li BR_MISP_RETIRED.RETURN
.Pq Event C5H , Umask F7H
The number of mispredicted near RET branch instructions retired.
.It Li BR_MISP_RETIRED.CALL
.Pq Event C5H , Umask F9H
The number of mispredicted near CALL branch instructions retired.
.It Li BR_MISP_RETIRED.IND_CALL
.Pq Event C5H , Umask FBH
The number of mispredicted near indirect CALL branch instructions
retired.
.It Li BR_MISP_RETIRED.REL_CALL
.Pq Event C5H , Umask FDH
The number of mispredicted near relative CALL branch instructions
retired.
.It Li BR_MISP_RETIRED.TAKEN_JCC
.Pq Event C5H , Umask FEH
The number of mispredicted branch instructions retired that were
conditional jumps and predicted taken.
.It Li NO_ALLOC_CYCLES.ROB_FULL
.Pq Event CAH , Umask 01H
The number of cycles when no uops are allocated and the ROB is full
(less than 2 entries available).
.It Li NO_ALLOC_CYCLES.RAT_STALL
.Pq Event CAH , Umask 20H
The number of cycles when no uops are allocated and a RAT stall is
asserted.
.It Li NO_ALLOC_CYCLES.ALL
.Pq Event CAH , Umask 3FH
The number of cycles when the front-end does not provide any
instructions to be allocated for any reason.
.It Li NO_ALLOC_CYCLES.NOT_DELIVERED
.Pq Event CAH , Umask 50H
The number of cycles when the front-end does not provide any
instructions to be allocated but the back end is not stalled.
.It Li RS_FULL_STALL.MEC
.Pq Event CBH , Umask 01H
The number of cycles the allocation pipeline stalled because
the RS for the MEC cluster is full.
.It Li RS_FULL_STALL.ALL
.Pq Event CBH , Umask 1FH
The number of cycles that the allocation pipeline stalled because
any one of the RSs is full.
.It Li CYCLES_DIV_BUSY.ANY
.Pq Event CDH , Umask 01H
The number of cycles the divider is busy.
.It Li BACLEARS.ALL
.Pq Event E6H , Umask 01H
The number of baclears for any type of branch.
.It Li BACLEARS.RETURN
.Pq Event E6H , Umask 08H
The number of baclears for return branches.
.It Li BACLEARS.COND
.Pq Event E6H , Umask 10H
The number of baclears for conditional branches.
.It Li MS_DECODED.MS_ENTRY
.Pq Event E7H , Umask 01H
The number of times the MSROM starts a flow of uops.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.core2 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
.An -nosplit
The
.Lb libpmc
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
The support for the Atom Silvermont
microarchitecture was written by
.An Hiren Panchasara Aq Mt hiren@FreeBSD.org .
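.Sh EXAMPLES
The following is a minimal sketch, not part of
.Lb libpmc ,
of how an event name and the common qualifiers from the
.Sx Event Qualifiers
section might be combined into one event specifier string.
It assumes the comma-separated specifier form described in
.Xr pmc 3 .
The names
.Li build_event_spec ,
.Li append_qual
and
.Li struct spec_quals
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the common qualifiers of this manual page. */
struct spec_quals {
	bool usr;	/* count at privilege levels 1, 2 or 3 */
	bool os;	/* count at privilege level 0 */
	bool any;	/* any logical processor in the package */
	bool edge;	/* count de-asserted to asserted transitions */
	int cmask;	/* emit "cmask=N" when greater than zero */
	bool inv;	/* invert the cmask comparison */
};

/* Append ",text" to the NUL-terminated string in buf; -1 if it will not fit. */
static int
append_qual(char *buf, size_t len, const char *text)
{
	size_t used = strlen(buf);
	int n = snprintf(buf + used, len - used, ",%s", text);

	return ((n < 0 || (size_t)n >= len - used) ? -1 : 0);
}

/*
 * Hypothetical helper: write "EVENT[,qualifier...]" into buf.
 * Returns 0 on success and -1 if the result does not fit in len bytes.
 */
int
build_event_spec(char *buf, size_t len, const char *event,
    const struct spec_quals *q)
{
	char tmp[32];
	int n;

	assert(buf != NULL && event != NULL && q != NULL);

	n = snprintf(buf, len, "%s", event);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	if (q->usr && append_qual(buf, len, "usr") != 0)
		return (-1);
	if (q->os && append_qual(buf, len, "os") != 0)
		return (-1);
	if (q->any && append_qual(buf, len, "any") != 0)
		return (-1);
	if (q->edge && append_qual(buf, len, "edge") != 0)
		return (-1);
	if (q->cmask > 0) {
		snprintf(tmp, sizeof(tmp), "cmask=%d", q->cmask);
		if (append_qual(buf, len, tmp) != 0)
			return (-1);
	}
	if (q->inv && append_qual(buf, len, "inv") != 0)
		return (-1);
	return (0);
}
```

With the event
.Li INST_RETIRED.ANY_P
and the
.Li usr ,
.Li cmask=2
and
.Li inv
qualifiers set, the helper produces
.Dq Li INST_RETIRED.ANY_P,usr,cmask=2,inv .
The order in which the qualifiers are emitted is a choice of this sketch.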