.\" Copyright (c) 2014 Hiren Panchasara <hiren@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd April 6, 2017
.Dt PMC.ATOMSILVERMONT 3
.Os
.Sh NAME
.Nm pmc.atomsilvermont
.Nd measurement events for
.Tn Intel
.Tn Atom Silvermont
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn Atom Silvermont
CPUs contain PMCs conforming to version 3 of the
.Tn Intel
performance measurement architecture.
These CPUs contain two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Atom Silvermont PMCs are documented in
.Rs
.%B "Intel 64 and IA-32 Intel(R) Architecture Software Developer's Manual"
.%T "Combined Volumes"
.%N "Order Number 325462-050US"
.%D February 2014
.%Q "Intel Corporation"
.Re
.Ss ATOM SILVERMONT FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss ATOM SILVERMONT PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li any
Count matching events seen on any logical processor in a package.
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
.Pp
Events that require core-specificity to be specified use an
additional qualifier
.Dq Li core= Ns Ar core ,
where argument
.Ar core
is one of:
.Bl -tag -width indent
.It Li all
Measure event conditions on all cores.
.It Li this
Measure event conditions on this core.
.El
.Pp
The default is
.Dq Li this .
.Pp
Events that require an agent qualifier to be specified use an
additional qualifier
.Dq Li agent= Ns Ar agent ,
where argument
.Ar agent
is one of:
.Bl -tag -width indent
.It Li this
Measure events associated with this bus agent.
.It Li any
Measure events caused by any bus agent.
.El
.Pp
The default is
.Dq Li this .
.Pp
Events that require a hardware prefetch qualifier to be specified use an
additional qualifier
.Dq Li prefetch= Ns Ar prefetch ,
where argument
.Ar prefetch
is one of:
.Bl -tag -width "exclude"
.It Li both
Include all prefetches.
.It Li only
Only count hardware prefetches.
.It Li exclude
Exclude hardware prefetches.
.El
.Pp
The default is
.Dq Li both .
.Pp
Events that require a cache coherence qualifier to be specified use an
additional qualifier
.Dq Li cachestate= Ns Ar state ,
where argument
.Ar state
contains one or more of the following letters:
.Bl -tag -width indent
.It Li e
Count cache lines in the exclusive state.
.It Li i
Count cache lines in the invalid state.
.It Li m
Count cache lines in the modified state.
.It Li s
Count cache lines in the shared state.
.El
.Pp
The default is
.Dq Li eims .
.Pp
Events that require a snoop response qualifier to be specified use an
additional qualifier
.Dq Li snoopresponse= Ns Ar response ,
where argument
.Ar response
consists of the following keywords separated by
.Dq +
signs:
.Bl -tag -width indent
.It Li clean
Measure CLEAN responses.
.It Li hit
Measure HIT responses.
.It Li hitm
Measure HITM responses.
.El
.Pp
The default is to measure all the above responses.
.Pp
Events that require a snoop type qualifier use an additional qualifier
.Dq Li snooptype= Ns Ar type ,
where argument
.Ar type
is one of the following keywords:
.Bl -tag -width indent
.It Li cmp2i
Measure CMP2I snoops.
.It Li cmp2s
Measure CMP2S snoops.
.El
.Pp
The default is to measure both snoop types.
.Ss Event Specifiers (Programmable PMCs)
Atom Silvermont programmable PMCs support the following events:
.Bl -tag -width indent
.It Li REHABQ.LD_BLOCK_ST_FORWARD
.Pq Event 03H , Umask 01H
The number of retired loads that were
prohibited from receiving forwarded data from the store
because of address mismatch.
.It Li REHABQ.LD_BLOCK_STD_NOTREADY
.Pq Event 03H , Umask 02H
The number of cases where a forward was technically possible,
but did not occur because the store data was not available
at the right time.
.It Li REHABQ.ST_SPLITS
.Pq Event 03H , Umask 04H
The number of retired stores that experienced
cache line boundary splits.
.It Li REHABQ.LD_SPLITS
.Pq Event 03H , Umask 08H
The number of retired loads that experienced
cache line boundary splits.
.It Li REHABQ.LOCK
.Pq Event 03H , Umask 10H
The number of retired memory operations with lock semantics.
These are either implicit locked instructions such as the
XCHG instruction or instructions with an explicit LOCK
prefix (0xF0).
.It Li REHABQ.STA_FULL
.Pq Event 03H , Umask 20H
The number of retired stores that are delayed
because there is not a store address buffer available.
.It Li REHABQ.ANY_LD
.Pq Event 03H , Umask 40H
The number of load uops reissued from Rehabq.
.It Li REHABQ.ANY_ST
.Pq Event 03H , Umask 80H
The number of store uops reissued from Rehabq.
.It Li MEM_UOPS_RETIRED.L1_MISS_LOADS
.Pq Event 04H , Umask 01H
The number of load ops retired that miss in the L1
data cache.
Note that prefetch misses will not be counted.
.It Li MEM_UOPS_RETIRED.L2_HIT_LOADS
.Pq Event 04H , Umask 02H
The number of load micro-ops retired that hit L2.
.It Li MEM_UOPS_RETIRED.L2_MISS_LOADS
.Pq Event 04H , Umask 04H
The number of load micro-ops retired that missed L2.
.It Li MEM_UOPS_RETIRED.DTLB_MISS_LOADS
.Pq Event 04H , Umask 08H
The number of load ops retired that had a DTLB miss.
.It Li MEM_UOPS_RETIRED.UTLB_MISS
.Pq Event 04H , Umask 10H
The number of load ops retired that had a UTLB miss.
.It Li MEM_UOPS_RETIRED.HITM
.Pq Event 04H , Umask 20H
The number of load ops retired that got data
from the other core or from the other module.
.It Li MEM_UOPS_RETIRED.ALL_LOADS
.Pq Event 04H , Umask 40H
The number of load ops retired.
.It Li MEM_UOP_RETIRED.ALL_STORES
.Pq Event 04H , Umask 80H
The number of store ops retired.
.It Li PAGE_WALKS.D_SIDE_CYCLES
.Pq Event 05H , Umask 01H
The number of cycles during which a D-side page walk (a walk due to a load)
is in progress.
Page walk duration divided by the number of page walks is the average
duration of a page walk.
The edge trigger bit must be cleared to count cycles.
Set the
.Dq Li edge
qualifier to count the number of page walks.
.It Li PAGE_WALKS.I_SIDE_CYCLES
.Pq Event 05H , Umask 02H
The number of cycles during which an I-side page walk (a walk due to an
instruction fetch) is in progress.
Page walk duration divided by the number of page walks is the average
duration of a page walk.
.It Li PAGE_WALKS.WALKS
.Pq Event 05H , Umask 03H
The number of times a data (D) page walk or an instruction (I) page walk is
completed or started.
Since a page walk implies a TLB miss, the number of TLB misses can be counted
by counting the number of page walks.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
The number of L2 cache misses.
L3 is not supported in Silvermont microarchitecture.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
The number of requests originating from the core that
reference a cache line in the L2 cache.
L3 is not supported in Silvermont microarchitecture.
.It Li L2_REJECT_XQ.ALL
.Pq Event 30H , Umask 00H
The number of demand and prefetch
transactions that the L2 XQ rejects due to a full or near full
condition which likely indicates back pressure from the IDI link.
The XQ may reject transactions from the L2Q (non-cacheable
requests), BBS (L2 misses) and WOB (L2 write-back victims).
.It Li CORE_REJECT_L2Q.ALL
.Pq Event 31H , Umask 00H
The number of demand and L1 prefetcher
requests rejected by the L2Q due to a full or nearly full condition which
likely indicates back pressure from L2Q.
It also counts requests that would have gone directly to the XQ, but are
rejected due to a full or nearly full condition, indicating back pressure from
the IDI link.
The L2Q may also reject transactions from a core to ensure fairness between
cores, or to delay a core's dirty eviction when the address conflicts with
incoming external snoops.
(Note that L2 prefetcher requests that are dropped are not counted by this
event.)
.It Li CPU_CLK_UNHALTED.CORE_P
.Pq Event 3CH , Umask 00H
The number of core cycles while the core is not in a halt state.
The core enters the halt state when it is running the HLT instruction.
In mobile systems the core frequency may change from time to time.
For this reason this event may have a changing ratio with regard to time.
.It Li CPU_CLK_UNHALTED.REF_P
.Pq Event 3CH , Umask 01H
The number of reference cycles that the core is not in a halt state.
The core enters the halt state when it is running the HLT instruction.
In mobile systems the core frequency may change from time to time.
This event is not affected by core frequency changes but counts as if the core
is running at the maximum frequency all the time.
.It Li ICACHE.HIT
.Pq Event 80H , Umask 01H
The number of instruction fetches from the instruction cache.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
The number of instruction fetches that miss the instruction cache or produce
memory requests.
This includes uncacheable fetches.
An instruction fetch miss is counted only once and not once for every cycle
it is outstanding.
.It Li ICACHE.ACCESSES
.Pq Event 80H , Umask 03H
The number of instruction fetches, including uncacheable fetches.
.It Li NIP_STALL.ICACHE_MISS
.Pq Event B6H , Umask 04H
The number of cycles the NIP stalls because of an icache miss.
This is a cumulative count of cycles the NIP stalled for all
icache misses.
.It Li OFFCORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Requires MSR_OFFCORE_RESP0 to specify request type and response.
.It Li OFFCORE_RESPONSE_1
.Pq Event B7H , Umask 02H
Requires MSR_OFFCORE_RESP1 to specify request type and response.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
The number of instructions that retire execution.
For instructions that consist of multiple micro-ops, this event counts the
retirement of the last micro-op of the instruction.
The counter continues counting during hardware interrupts, traps, and inside
interrupt handlers.
.It Li UOPS_RETIRED.MS
.Pq Event C2H , Umask 01H
The number of micro-ops retired that were supplied from MSROM.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 10H
The number of micro-ops retired.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 01H
The number of times that a program writes to a code section.
Self-modifying code causes a severe penalty in all Intel
architecture processors.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
The number of times that the pipeline was cleared due to memory
ordering issues.
.It Li MACHINE_CLEARS.FP_ASSIST
.Pq Event C3H , Umask 04H
The number of times that the pipeline stalled due to FP operations
needing assists.
.It Li MACHINE_CLEARS.ALL
.Pq Event C3H , Umask 08H
The number of times that the pipeline stalled due to any cause
(including SMC, MO, FP assist, etc).
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
The number of branch instructions retired.
.It Li BR_INST_RETIRED.JCC
.Pq Event C4H , Umask 7EH
The number of branch instructions retired that were conditional
jumps.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask BFH
The number of far branch instructions retired.
.It Li BR_INST_RETIRED.NON_RETURN_IND
.Pq Event C4H , Umask EBH
The number of branch instructions retired that were near indirect
call or near indirect jmp.
.It Li BR_INST_RETIRED.RETURN
.Pq Event C4H , Umask F7H
The number of near RET branch instructions retired.
.It Li BR_INST_RETIRED.CALL
.Pq Event C4H , Umask F9H
The number of near CALL branch instructions retired.
.It Li BR_INST_RETIRED.IND_CALL
.Pq Event C4H , Umask FBH
The number of near indirect CALL branch instructions retired.
.It Li BR_INST_RETIRED.REL_CALL
.Pq Event C4H , Umask FDH
The number of near relative CALL branch instructions retired.
.It Li BR_INST_RETIRED.TAKEN_JCC
.Pq Event C4H , Umask FEH
The number of branch instructions retired that were conditional
jumps and predicted taken.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
The number of mispredicted branch instructions retired.
.It Li BR_MISP_RETIRED.JCC
.Pq Event C5H , Umask 7EH
The number of mispredicted branch instructions retired that were
conditional jumps.
.It Li BR_MISP_RETIRED.FAR
.Pq Event C5H , Umask BFH
The number of mispredicted far branch instructions retired.
.It Li BR_MISP_RETIRED.NON_RETURN_IND
.Pq Event C5H , Umask EBH
The number of mispredicted branch instructions retired that were
near indirect call or near indirect jmp.
.It Li BR_MISP_RETIRED.RETURN
.Pq Event C5H , Umask F7H
The number of mispredicted near RET branch instructions retired.
.It Li BR_MISP_RETIRED.CALL
.Pq Event C5H , Umask F9H
The number of mispredicted near CALL branch instructions retired.
.It Li BR_MISP_RETIRED.IND_CALL
.Pq Event C5H , Umask FBH
The number of mispredicted near indirect CALL branch instructions
retired.
.It Li BR_MISP_RETIRED.REL_CALL
.Pq Event C5H , Umask FDH
The number of mispredicted near relative CALL branch instructions
retired.
.It Li BR_MISP_RETIRED.TAKEN_JCC
.Pq Event C5H , Umask FEH
The number of mispredicted branch instructions retired that were
conditional jumps and predicted taken.
.It Li NO_ALLOC_CYCLES.ROB_FULL
.Pq Event CAH , Umask 01H
The number of cycles when no uops are allocated and the ROB is full
(less than 2 entries available).
.It Li NO_ALLOC_CYCLES.RAT_STALL
.Pq Event CAH , Umask 20H
The number of cycles when no uops are allocated and a RAT stall is
asserted.
.It Li NO_ALLOC_CYCLES.ALL
.Pq Event CAH , Umask 3FH
The number of cycles when the front-end does not provide any
instructions to be allocated for any reason.
.It Li NO_ALLOC_CYCLES.NOT_DELIVERED
.Pq Event CAH , Umask 50H
The number of cycles when the front-end does not provide any
instructions to be allocated but the back end is not stalled.
.It Li RS_FULL_STALL.MEC
.Pq Event CBH , Umask 01H
The number of cycles the allocation pipeline stalled because
the RS for the MEC cluster is full.
.It Li RS_FULL_STALL.ALL
.Pq Event CBH , Umask 1FH
The number of cycles that the allocation pipeline stalled because
any one of the RSs is full.
.It Li CYCLES_DIV_BUSY.ANY
.Pq Event CDH , Umask 01H
The number of cycles the divider is busy.
.It Li BACLEARS.ALL
.Pq Event E6H , Umask 01H
The number of baclears for any type of branch.
.It Li BACLEARS.RETURN
.Pq Event E6H , Umask 08H
The number of baclears for return branches.
.It Li BACLEARS.COND
.Pq Event E6H , Umask 10H
The number of baclears for conditional branches.
.It Li MS_DECODED.MS_ENTRY
.Pq Event E7H , Umask 01H
The number of times the MSROM starts a flow of uops.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.core2 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
.An -nosplit
The
.Lb libpmc
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
The support for the Atom Silvermont
microarchitecture was written by
.An Hiren Panchasara Aq Mt hiren@FreeBSD.org .