.\" Copyright (c) 2014 Hiren Panchasara <hiren@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 20, 2014
.Dt PMC.ATOMSILVERMONT 3
.Os
.Sh NAME
.Nm pmc.atomsilvermont
.Nd measurement events for
.Tn Intel
.Tn Atom Silvermont
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn Atom Silvermont
CPUs contain PMCs conforming to version 3 of the
.Tn Intel
performance measurement architecture.
These CPUs contain two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Atom Silvermont PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Combined Volumes"
.%N "Order Number 325462-050US"
.%D February 2014
.%Q "Intel Corporation"
.Re
.Ss ATOM SILVERMONT FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss ATOM SILVERMONT PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li any
Count matching events seen on any logical processor in a package.
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither the
.Dq Li os
nor the
.Dq Li usr
qualifier is specified, the default is to enable both.
.Pp
Events that require core-specificity to be specified use an
additional qualifier
.Dq Li core= Ns Ar core ,
where argument
.Ar core
is one of:
.Bl -tag -width indent
.It Li all
Measure event conditions on all cores.
.It Li this
Measure event conditions on this core.
.El
.Pp
The default is
.Dq Li this .
.Pp
Events that require an agent qualifier to be specified use an
additional qualifier
.Dq Li agent= Ns Ar agent ,
where argument
.Ar agent
is one of:
.Bl -tag -width indent
.It Li this
Measure events associated with this bus agent.
.It Li any
Measure events caused by any bus agent.
.El
.Pp
The default is
.Dq Li this .
.Pp
Events that require a hardware prefetch qualifier to be specified use an
additional qualifier
.Dq Li prefetch= Ns Ar prefetch ,
where argument
.Ar prefetch
is one of:
.Bl -tag -width "exclude"
.It Li both
Include all prefetches.
.It Li only
Only count hardware prefetches.
.It Li exclude
Exclude hardware prefetches.
.El
.Pp
The default is
.Dq Li both .
.Pp
Events that require a cache coherence qualifier to be specified use an
additional qualifier
.Dq Li cachestate= Ns Ar state ,
where argument
.Ar state
contains one or more of the following letters:
.Bl -tag -width indent
.It Li e
Count cache lines in the exclusive state.
.It Li i
Count cache lines in the invalid state.
.It Li m
Count cache lines in the modified state.
.It Li s
Count cache lines in the shared state.
.El
.Pp
The default is
.Dq Li eims .
.Pp
Events that require a snoop response qualifier to be specified use an
additional qualifier
.Dq Li snoopresponse= Ns Ar response ,
where argument
.Ar response
consists of one or more of the following keywords separated by
.Dq +
signs:
.Bl -tag -width indent
.It Li clean
Measure CLEAN responses.
.It Li hit
Measure HIT responses.
.It Li hitm
Measure HITM responses.
.El
.Pp
The default is to measure all the above responses.
.Pp
Events that require a snoop type qualifier use an additional qualifier
.Dq Li snooptype= Ns Ar type ,
where argument
.Ar type
is one of the following keywords:
.Bl -tag -width indent
.It Li cmp2i
Measure CMP2I snoops.
.It Li cmp2s
Measure CMP2S snoops.
.El
.Pp
The default is to measure both snoop types.
.Ss Event Specifiers (Programmable PMCs)
Atom Silvermont programmable PMCs support the following events:
.Bl -tag -width indent
.It Li REHABQ.LD_BLOCK_ST_FORWARD
.Pq Event 03H , Umask 01H
The number of retired loads that were
prohibited from receiving forwarded data from the store
because of address mismatch.
.It Li REHABQ.LD_BLOCK_STD_NOTREADY
.Pq Event 03H , Umask 02H
The cases where a forward was technically possible,
but did not occur because the store data was not available
at the right time.
.It Li REHABQ.ST_SPLITS
.Pq Event 03H , Umask 04H
The number of retired stores that experienced
cache line boundary splits.
.It Li REHABQ.LD_SPLITS
.Pq Event 03H , Umask 08H
The number of retired loads that experienced
cache line boundary splits.
.It Li REHABQ.LOCK
.Pq Event 03H , Umask 10H
The number of retired memory operations with lock semantics.
These are either implicit locked instructions such as the
XCHG instruction or instructions with an explicit LOCK
prefix (0xF0).
.It Li REHABQ.STA_FULL
.Pq Event 03H , Umask 20H
The number of retired stores that are delayed
because there is not a store address buffer available.
.It Li REHABQ.ANY_LD
.Pq Event 03H , Umask 40H
The number of load uops reissued from Rehabq.
.It Li REHABQ.ANY_ST
.Pq Event 03H , Umask 80H
The number of store uops reissued from Rehabq.
.It Li MEM_UOPS_RETIRED.L1_MISS_LOADS
.Pq Event 04H , Umask 01H
The number of load ops retired that missed in the L1
data cache.
Note that prefetch misses will not be counted.
.It Li MEM_UOPS_RETIRED.L2_HIT_LOADS
.Pq Event 04H , Umask 02H
The number of load micro-ops retired that hit L2.
.It Li MEM_UOPS_RETIRED.L2_MISS_LOADS
.Pq Event 04H , Umask 04H
The number of load micro-ops retired that missed L2.
.It Li MEM_UOPS_RETIRED.DTLB_MISS_LOADS
.Pq Event 04H , Umask 08H
The number of load ops retired that had a DTLB miss.
.It Li MEM_UOPS_RETIRED.UTLB_MISS
.Pq Event 04H , Umask 10H
The number of load ops retired that had a UTLB miss.
.It Li MEM_UOPS_RETIRED.HITM
.Pq Event 04H , Umask 20H
The number of load ops retired that got data
from the other core or from the other module.
.It Li MEM_UOPS_RETIRED.ALL_LOADS
.Pq Event 04H , Umask 40H
The number of load ops retired.
.It Li MEM_UOP_RETIRED.ALL_STORES
.Pq Event 04H , Umask 80H
The number of store ops retired.
.It Li PAGE_WALKS.D_SIDE_CYCLES
.Pq Event 05H , Umask 01H
The number of cycles during which a D-side page walk (a walk due to a
load) is in progress.
Page walk duration divided by the number of page walks gives the
average duration of a page walk.
The edge trigger bit must be cleared to count cycles; set
.Dq Li edge
to count the number of page walks instead.
.It Li PAGE_WALKS.I_SIDE_CYCLES
.Pq Event 05H , Umask 02H
The number of cycles during which an I-side page walk (a walk due to an
instruction fetch) is in progress.
Page walk duration divided by the number of page walks gives the
average duration of a page walk.
.It Li PAGE_WALKS.WALKS
.Pq Event 05H , Umask 03H
The number of times a data (D) page walk or an instruction (I)
page walk is completed or started.
Since a page walk implies a TLB miss, the number of TLB misses can be
counted by counting the number of page walks.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
The number of L2 cache misses.
L3 is not supported in Silvermont microarchitecture.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
The number of requests originating from the core that
reference a cache line in the L2 cache.
L3 is not supported in Silvermont microarchitecture.
.It Li L2_REJECT_XQ.ALL
.Pq Event 30H , Umask 00H
The number of demand and prefetch
transactions that the L2 XQ rejects due to a full or near full
condition which likely indicates back pressure from the IDI link.
The XQ may reject transactions from the L2Q (non-cacheable
requests), BBS (L2 misses) and WOB (L2 write-back victims).
.It Li CORE_REJECT_L2Q.ALL
.Pq Event 31H , Umask 00H
The number of demand and L1 prefetcher
requests rejected by the L2Q due to a full or nearly full
condition which likely indicates back pressure from L2Q.
It also counts requests that would have gone directly to
the XQ, but are rejected due to a full or nearly full condition,
indicating back pressure from the IDI link.
The L2Q may also
reject transactions from a core to ensure fairness between
cores, or to delay a core's dirty eviction when the address
conflicts with incoming external snoops.
(Note that L2 prefetcher
requests that are dropped are not counted by this event.)
.It Li CPU_CLK_UNHALTED.CORE_P
.Pq Event 3CH , Umask 00H
The number of core cycles while the core is not in a halt
state.
The core enters the halt state when it is running
the HLT instruction.
In mobile systems the core frequency
may change from time to time.
For this reason this event
may have a changing ratio with regard to time.
.It Li CPU_CLK_UNHALTED.REF_P
.Pq Event 3CH , Umask 01H
The number of reference cycles while the core is not in a halt
state.
The core enters the halt state when it is running
the HLT instruction.
In mobile systems the core frequency may change from time to time.
This event is not affected by core frequency changes but counts
as if the core is running at the maximum frequency all the time.
.It Li ICACHE.HIT
.Pq Event 80H , Umask 01H
The number of instruction fetches from the instruction cache.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
The number of instruction fetches that miss the
instruction cache or produce memory requests.
This includes uncacheable fetches.
An instruction fetch miss is counted only
once and not once for every cycle it is outstanding.
.It Li ICACHE.ACCESSES
.Pq Event 80H , Umask 03H
The number of instruction fetches, including uncacheable fetches.
.It Li NIP_STALL.ICACHE_MISS
.Pq Event B6H , Umask 04H
The number of cycles the NIP stalls because of an icache miss.
This is a cumulative count of cycles the NIP stalled for all
icache misses.
.It Li OFFCORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Requires MSR_OFFCORE_RESP0 to specify request type and response.
.It Li OFFCORE_RESPONSE_1
.Pq Event B7H , Umask 02H
Requires MSR_OFFCORE_RESP1 to specify request type and response.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
The number of instructions that retire execution.
For instructions
that consist of multiple micro-ops, this event counts the
retirement of the last micro-op of the instruction.
The counter
continues counting during hardware interrupts, traps, and inside
interrupt handlers.
.It Li UOPS_RETIRED.MS
.Pq Event C2H , Umask 01H
The number of micro-ops retired that were supplied from MSROM.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 10H
The number of micro-ops retired.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 01H
The number of times that a program writes to a code section.
Self-modifying code causes a severe penalty in all Intel
architecture processors.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
The number of times that the pipeline was cleared due to memory
ordering issues.
.It Li MACHINE_CLEARS.FP_ASSIST
.Pq Event C3H , Umask 04H
The number of times that the pipeline stalled due to FP operations
needing assists.
.It Li MACHINE_CLEARS.ALL
.Pq Event C3H , Umask 08H
The number of times that the pipeline stalled due to any cause
(including SMC, MO, FP assist, etc.).
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
The number of branch instructions retired.
.It Li BR_INST_RETIRED.JCC
.Pq Event C4H , Umask 7EH
The number of branch instructions retired that were conditional
jumps.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask BFH
The number of far branch instructions retired.
.It Li BR_INST_RETIRED.NON_RETURN_IND
.Pq Event C4H , Umask EBH
The number of branch instructions retired that were near indirect
call or near indirect jmp.
.It Li BR_INST_RETIRED.RETURN
.Pq Event C4H , Umask F7H
The number of near RET branch instructions retired.
.It Li BR_INST_RETIRED.CALL
.Pq Event C4H , Umask F9H
The number of near CALL branch instructions retired.
.It Li BR_INST_RETIRED.IND_CALL
.Pq Event C4H , Umask FBH
The number of near indirect CALL branch instructions retired.
.It Li BR_INST_RETIRED.REL_CALL
.Pq Event C4H , Umask FDH
The number of near relative CALL branch instructions retired.
.It Li BR_INST_RETIRED.TAKEN_JCC
.Pq Event C4H , Umask FEH
The number of branch instructions retired that were conditional
jumps and predicted taken.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
The number of mispredicted branch instructions retired.
.It Li BR_MISP_RETIRED.JCC
.Pq Event C5H , Umask 7EH
The number of mispredicted branch instructions retired that were
conditional jumps.
.It Li BR_MISP_RETIRED.FAR
.Pq Event C5H , Umask BFH
The number of mispredicted far branch instructions retired.
.It Li BR_MISP_RETIRED.NON_RETURN_IND
.Pq Event C5H , Umask EBH
The number of mispredicted branch instructions retired that were
near indirect call or near indirect jmp.
.It Li BR_MISP_RETIRED.RETURN
.Pq Event C5H , Umask F7H
The number of mispredicted near RET branch instructions retired.
.It Li BR_MISP_RETIRED.CALL
.Pq Event C5H , Umask F9H
The number of mispredicted near CALL branch instructions retired.
.It Li BR_MISP_RETIRED.IND_CALL
.Pq Event C5H , Umask FBH
The number of mispredicted near indirect CALL branch instructions
retired.
.It Li BR_MISP_RETIRED.REL_CALL
.Pq Event C5H , Umask FDH
The number of mispredicted near relative CALL branch instructions
retired.
.It Li BR_MISP_RETIRED.TAKEN_JCC
.Pq Event C5H , Umask FEH
The number of mispredicted branch instructions retired that were
conditional jumps and predicted taken.
.It Li NO_ALLOC_CYCLES.ROB_FULL
.Pq Event CAH , Umask 01H
The number of cycles when no uops are allocated and the ROB is full
(less than 2 entries available).
.It Li NO_ALLOC_CYCLES.RAT_STALL
.Pq Event CAH , Umask 20H
The number of cycles when no uops are allocated and a RAT stall is
asserted.
.It Li NO_ALLOC_CYCLES.ALL
.Pq Event CAH , Umask 3FH
The number of cycles when the front-end does not provide any
instructions to be allocated for any reason.
.It Li NO_ALLOC_CYCLES.NOT_DELIVERED
.Pq Event CAH , Umask 50H
The number of cycles when the front-end does not provide any
instructions to be allocated but the back end is not stalled.
.It Li RS_FULL_STALL.MEC
.Pq Event CBH , Umask 01H
The number of cycles the allocation pipeline stalled because
the RS for the MEC cluster is full.
.It Li RS_FULL_STALL.ALL
.Pq Event CBH , Umask 1FH
The number of cycles the allocation pipeline stalled because
any one of the RSs is full.
.It Li CYCLES_DIV_BUSY.ANY
.Pq Event CDH , Umask 01H
The number of cycles the divider is busy.
.It Li BACLEARS.ALL
.Pq Event E6H , Umask 01H
The number of baclears for any type of branch.
.It Li BACLEARS.RETURN
.Pq Event E6H , Umask 08H
The number of baclears for return branches.
.It Li BACLEARS.COND
.Pq Event E6H , Umask 10H
The number of baclears for conditional branches.
.It Li MS_DECODED.MS_ENTRY
.Pq Event E7H , Umask 01H
The number of times the MSROM starts a flow of uops.
.El
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.core2 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
.An -nosplit
The
.Lb libpmc
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
The support for the Atom Silvermont
microarchitecture was written by
.An Hiren Panchasara Aq Mt hiren@FreeBSD.org .