lockstat.1 (10b3b54548f2290bbe8d8f88c59c28d12b7a635d) lockstat.1 (a10f58d076db56056d729664690a563b4627e85d)
1'\" te
2.\" CDDL HEADER START
3.\"
4.\" The contents of this file are subject to the terms of the
1'\" te
2.\" CDDL HEADER START
3.\"
4.\" The contents of this file are subject to the terms of the
5.\" Common Development and Distribution License (the "License").
5.\" Common Development and Distribution License (the "License").
6.\" You may not use this file except in compliance with the License.
7.\"
8.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
9.\" or http://www.opensolaris.org/os/licensing.
10.\" See the License for the specific language governing permissions
11.\" and limitations under the License.
12.\"
13.\" When distributing Covered Code, include this CDDL HEADER in each
14.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
15.\" If applicable, add the following below this CDDL HEADER, with the
16.\" fields enclosed by brackets "[]" replaced with your own identifying
17.\" information: Portions Copyright [yyyy] [name of copyright owner]
18.\"
19.\" CDDL HEADER END
20.\" Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved.
6.\" You may not use this file except in compliance with the License.
7.\"
8.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
9.\" or http://www.opensolaris.org/os/licensing.
10.\" See the License for the specific language governing permissions
11.\" and limitations under the License.
12.\"
13.\" When distributing Covered Code, include this CDDL HEADER in each
14.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
15.\" If applicable, add the following below this CDDL HEADER, with the
16.\" fields enclosed by brackets "[]" replaced with your own identifying
17.\" information: Portions Copyright [yyyy] [name of copyright owner]
18.\"
19.\" CDDL HEADER END
20.\" Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved.
21.TH lockstat 1M "28 Feb 2008" "SunOS 5.11" "System Administration Commands"
22.SH NAME
23lockstat \- report kernel lock and profiling statistics
24.SH SYNOPSIS
25.LP
26.nf
27\fBlockstat\fR [\fB-ACEHI\fR] [\fB-e\fR \fIevent_list\fR] [\fB-i\fR \fIrate\fR]
28 [\fB-b\fR | \fB-t\fR | \fB-h\fR | \fB-s\fR \fIdepth\fR] [\fB-n\fR \fInrecords\fR]
29 [\fB-l\fR \fIlock\fR [, \fIsize\fR]] [\fB-d\fR \fIduration\fR]
30 [\fB-f\fR \fIfunction\fR [, \fIsize\fR]] [\fB-T\fR] [\fB-ckgwWRpP\fR] [\fB-D\fR \fIcount\fR]
31 [\fB-o\fR \fIfilename\fR] [\fB-x\fR \fIopt\fR [=val]] \fIcommand\fR [\fIargs\fR]
32.fi
33
34.SH DESCRIPTION
35.sp
36.LP
37The \fBlockstat\fR utility gathers and displays kernel locking and profiling statistics. \fBlockstat\fR allows you to specify which events to watch (for example, spin on adaptive mutex, block on read access to rwlock due to waiting writers, and so forth) how much
38data to gather for each event, and how to display the data. By default, \fBlockstat\fR monitors all lock contention events, gathers frequency and timing data about those events, and displays the data in decreasing frequency order, so that the most common events appear first.
39.sp
40.LP
41\fBlockstat\fR gathers data until the specified command completes. For example, to gather statistics for a fixed-time interval, use \fBsleep\fR(1) as
42the command, as follows:
43.sp
44.LP
45\fBexample#\fR \fBlockstat\fR \fBsleep\fR \fB5\fR
46.sp
47.LP
48When the \fB-I\fR option is specified, \fBlockstat\fR establishes a per-processor high-level periodic interrupt source to gather profiling data. The interrupt handler simply generates a \fBlockstat\fR event whose caller is the interrupted PC (program counter).
49The profiling event is just like any other \fBlockstat\fR event, so all of the normal \fBlockstat\fR options are applicable.
50.sp
51.LP
52\fBlockstat\fR relies on DTrace to modify the running kernel's text to intercept events of interest. This imposes a small but measurable overhead on all system activity, so access to \fBlockstat\fR is restricted to super-user by default. The system administrator
53can permit other users to use \fBlockstat\fR by granting them additional DTrace privileges. Refer to the \fISolaris Dynamic Tracing Guide\fR for more information about DTrace security features.
54.SH OPTIONS
55.sp
56.LP
57The following options are supported:
58.SS "Event Selection"
59.sp
60.LP
61If no event selection options are specified, the default is \fB-C\fR.
62.sp
63.ne 2
64.mk
65.na
66\fB\fB-A\fR\fR
67.ad
68.sp .6
69.RS 4n
70Watch all lock events. \fB-A\fR is equivalent to \fB-CH\fR.
71.RE
72
73.sp
74.ne 2
75.mk
76.na
77\fB\fB-C\fR\fR
78.ad
79.sp .6
80.RS 4n
81Watch contention events.
82.RE
83
84.sp
85.ne 2
86.mk
87.na
88\fB\fB-E\fR\fR
89.ad
90.sp .6
91.RS 4n
21.\"
22.\" $FreeBSD$
23.\"
24.Dd October 24, 2013
25.Dt LOCKSTAT 1
26.Os
27.Sh NAME
28.Nm lockstat
29.Nd report kernel lock and profiling statistics
30.Sh SYNOPSIS
31.Nm
32.Op Fl ACEHI
33.Op Fl e Ar event-list
34.Op Fl i Ar rate
35.Op Fl b | t | h | s Ar depth
36.Op Fl n Ar num-records
37.Op Fl l Ar lock Oo Ns , Ns Ar size Oc
38.Op Fl d Ar duration
39.Op Fl f Ar function Oo Ns , Ns Ar size Oc
40.Op Fl T
41.Op Fl ckgwWRpP
42.Op Fl D Ar count
43.Op Fl o filename
44.Op Fl x Ar opt Oo Ns = Ns Ar val Oc
45.Ar command
46.Op Oo Ar args Oc
47.Sh DESCRIPTION
48The
49.Nm
50utility gathers and displays kernel locking and profiling statistics.
51.Nm
52allows you to specify which events to watch (for example, spin on adaptive
53mutex, block on read access to rwlock due to waiting writers, and so forth), how
54much data to gather for each event, and how to display the data.
55By default,
56.Nm
57monitors all lock contention events, gathers frequency and timing data about
58those events, and displays the data in decreasing frequency order, so that the
59most common events appear first.
60.Pp
61.Nm
62gathers data until the specified command completes.
63For example, to gather statistics for a fixed-time interval, use
64.Xr sleep 1
65as the command, as follows:
66.Pp
67.Dl # lockstat sleep 5
68.Pp
69When the
70.Fl I
71option is specified,
72.Nm lockstat
73establishes a per-processor high-level periodic interrupt source to gather
74profiling data.
75The interrupt handler simply generates a
76.Nm
77event whose caller is the interrupted PC (program counter).
78The profiling event is just like any other
79.Nm lockstat
80event, so all of the normal
81.Nm lockstat
82options are applicable.
83.Pp
84.Nm
85relies on DTrace to modify the running kernel's text to intercept events of
86interest.
87This imposes a small but measurable overhead on all system activity, so access
88to
89.Nm
90is restricted to super-user by default.
91.Sh OPTIONS
92The following options are supported:
93.Ss Event Selection
94If no event selection options are specified, the default is
95.Fl C .
96.Bl -tag -width indent
97.It Fl A
98Watch all lock events.
99.Fl A
100is equivalent to
101.Fl CH .
102.It Fl C
103Watch contention events.
104.It Fl E
92Watch error events.
105Watch error events.
93.RE
94
95.sp
96.ne 2
97.mk
98.na
99\fB\fB\fR\fB-e\fR \fIevent_list\fR\fR
100.ad
101.sp .6
102.RS 4n
103Only watch the specified events. \fIevent\fR \fIlist\fR is a comma-separated list of events or ranges of events such as 1,4-7,35. Run \fBlockstat\fR with no arguments to get a brief description of all events.
104.RE
105
106.sp
107.ne 2
108.mk
109.na
110\fB\fB-H\fR\fR
111.ad
112.sp .6
113.RS 4n
106.It Fl e Ar event-list
107Only watch the specified events.
108.Ar event-list
109is a comma-separated list of events or ranges of events such as 1,4-7,35.
110Run
111.Nm
112with no arguments to get a brief description of all events.
113.It Fl H
114Watch hold events.
114Watch hold events.
115.RE
116
117.sp
118.ne 2
119.mk
120.na
121\fB\fB-I\fR\fR
122.ad
123.sp .6
124.RS 4n
115.It Fl I
125Watch profiling interrupt events.
116Watch profiling interrupt events.
126.RE
127
128.sp
129.ne 2
130.mk
131.na
132\fB\fB\fR\fB-i\fR \fIrate\fR\fR
133.ad
134.sp .6
135.RS 4n
136Interrupt rate (per second) for \fB-I\fR. The default is 97 Hz, so that profiling doesn't run in lockstep with the clock interrupt (which runs at 100 Hz).
137.RE
138
139.SS "Data Gathering"
140.sp
141.ne 2
142.mk
143.na
144\fB\fB-x\fR \fIarg\fR[=\fIval\fR]\fR
145.ad
146.sp .6
147.RS 4n
148Enable or modify a DTrace runtime option or D compiler option. The list of options is found in the \fI\fR. Boolean options are enabled by specifying their name. Options with values are set by separating the option name and
149value with an equals sign (=).
150.RE
151
152.SS "Data Gathering (Mutually Exclusive)"
153.sp
154.ne 2
155.mk
156.na
157\fB\fB-b\fR\fR
158.ad
159.sp .6
160.RS 4n
117.It Fl i Ar rate
118Interrupt rate (per second) for
119.Fl I .
120The default is 97 Hz, so that profiling doesn't run in lockstep with the clock
121interrupt (which runs at 100 Hz).
122.El
123.Ss Data Gathering
124.Bl -tag -width indent
125.It Fl x Ar arg Oo Ns = Ns Ar val Oc
126Enable or modify a
127.Xr dtrace 1
128runtime option or D compiler option.
129Boolean options are enabled by specifying their name.
130Options with values are set by separating the option name and value with an
131equals sign.
132.El
133.Ss "Data Gathering (Mutually Exclusive)"
134.Bl -tag -width indent
135.It Fl b
161Basic statistics: lock, caller, number of events.
136Basic statistics: lock, caller, number of events.
162.RE
163
164.sp
165.ne 2
166.mk
167.na
168\fB\fB-h\fR\fR
169.ad
170.sp .6
171.RS 4n
172Histogram: Timing plus time-distribution histograms.
173.RE
174
175.sp
176.ne 2
177.mk
178.na
179\fB\fB\fR\fB-s\fR \fIdepth\fR\fR
180.ad
181.sp .6
182.RS 4n
183Stack trace: Histogram plus stack traces up to \fIdepth\fR frames deep.
184.RE
185
186.sp
187.ne 2
188.mk
189.na
190\fB\fB-t\fR\fR
191.ad
192.sp .6
193.RS 4n
194Timing: Basic plus timing for all events [default].
195.RE
196
197.SS "Data Filtering"
198.sp
199.ne 2
200.mk
201.na
202\fB\fB\fR\fB-d\fR \fIduration\fR\fR
203.ad
204.sp .6
205.RS 4n
206Only watch events longer than \fIduration\fR.
207.RE
208
209.sp
210.ne 2
211.mk
212.na
213\fB\fB\fR\fB-f\fR \fIfunc[,size]\fR\fR
214.ad
215.sp .6
216.RS 4n
217Only watch events generated by \fIfunc\fR, which can be specified as a symbolic name or hex address. \fIsize\fR defaults to the \fBELF\fR symbol size if available, or \fB1\fR if not.
218.RE
219
220.sp
221.ne 2
222.mk
223.na
224\fB\fB\fR\fB-l\fR \fIlock[,size]\fR\fR
225.ad
226.sp .6
227.RS 4n
228Only watch \fIlock\fR, which can be specified as a symbolic name or hex address. \fBsize\fR defaults to the \fBELF\fR symbol size or \fB1\fR if the symbol size is not available.
229.RE
230
231.sp
232.ne 2
233.mk
234.na
235\fB\fB\fR\fB-n\fR \fInrecords\fR\fR
236.ad
237.sp .6
238.RS 4n
137.It Fl h
138Histogram: timing plus time-distribution histograms.
139.It Fl s Ar depth
140Stack trace: histogram plus stack traces up to
141.Ar depth
142frames deep.
143.It Fl t
144Timing: Basic plus timing for all events (default).
145.El
146.Ss "Data Filtering"
147.Bl -tag -width indent
148.It Fl d Ar duration
149Only watch events longer than
150.Ar duration .
151.It Fl f Ar func Ns Oo Ns , Ns Ar size Oc Ns
152Only watch events generated by
153.Ar func ,
154which can be specified as a symbolic name or hex address.
155.Ar size
156defaults to the ELF symbol size if available, or 1 if not.
157.It Fl l Ar lock Ns Oo Ns , Ns Ar size Oc Ns
158Only watch
159.Ar lock ,
160which can be specified as a symbolic name or hex address.
161.Ar size
162defaults to the ELF symbol size or 1 if the symbol size is not available.
163.It Fl n Ar num-records
239Maximum number of data records.
164Maximum number of data records.
240.RE
241
242.sp
243.ne 2
244.mk
245.na
246\fB\fB-T\fR\fR
247.ad
248.sp .6
249.RS 4n
250Trace (rather than sample) events [off by default].
251.RE
252
253.SS "Data Reporting"
254.sp
255.ne 2
256.mk
257.na
258\fB\fB-c\fR\fR
259.ad
260.sp .6
261.RS 4n
262Coalesce lock data for lock arrays (for example, \fBpse_mutex[]\fR).
263.RE
264
265.sp
266.ne 2
267.mk
268.na
269\fB\fB\fR\fB-D\fR \fIcount\fR\fR
270.ad
271.sp .6
272.RS 4n
273Only display the top \fIcount\fR events of each type.
274.RE
275
276.sp
277.ne 2
278.mk
279.na
280\fB\fB-g\fR\fR
281.ad
282.sp .6
283.RS 4n
284Show total events generated by function. For example, if \fBfoo()\fR calls \fBbar()\fR in a loop, the work done by \fBbar()\fR counts as work generated by \fBfoo()\fR (along with any work done by \fBfoo()\fR itself).
285The \fB-g\fR option works by counting the total number of stack frames in which each function appears. This implies two things: (1) the data reported by \fB-g\fR can be misleading if the stack traces are not deep enough, and (2) functions that are called recursively might show
286greater than 100% activity. In light of issue (1), the default data gathering mode when using \fB-g\fR is \fB-s\fR \fB50\fR.
287.RE
288
289.sp
290.ne 2
291.mk
292.na
293\fB\fB-k\fR\fR
294.ad
295.sp .6
296.RS 4n
165.It Fl T
166Trace (rather than sample) events.
167This is off by default.
168.El
169.Ss Data Reporting
170.Bl -tag -width indent
171.It Fl c
172Coalesce lock data for lock arrays.
173.It Fl D Ar count
174Only display the top
175.Ar count
176events of each type.
177.It Fl g
178Show total events generated by function.
179For example, if
180.Fn foo
181calls
182.Fn bar
183in a loop, the work done by
184.Fn bar
185counts as work generated by
186.Fn foo
187(along with any work done by
188.Fn foo
189itself).
190The
191.Fl g
192option works by counting the total number of stack frames in which each function
193appears.
194This implies two things: (1) the data reported by
195.Fl g
196can be misleading if the stack traces are not deep enough, and (2) functions
197that are called recursively might show greater than 100% activity.
198In light of issue (1), the default data gathering mode when using
199.Fl g
200is
201.Fl s 50 .
202.It Fl k
297Coalesce PCs within functions.
203Coalesce PCs within functions.
298.RE
299
300.sp
301.ne 2
302.mk
303.na
304\fB\fB\fR\fB-o\fR \fIfilename\fR\fR
305.ad
306.sp .6
307.RS 4n
308Direct output to \fIfilename\fR.
309.RE
310
311.sp
312.ne 2
313.mk
314.na
315\fB\fB-P\fR\fR
316.ad
317.sp .6
318.RS 4n
204.It Fl o Ar filename
205Direct output to
206.Ar filename .
207.It Fl P
319Sort data by (\fIcount * time\fR) product.
208Sort data by (\fIcount * time\fR) product.
320.RE
321
322.sp
323.ne 2
324.mk
325.na
326\fB\fB-p\fR\fR
327.ad
328.sp .6
329.RS 4n
209.It Fl p
330Parsable output format.
210Parsable output format.
331.RE
332
333.sp
334.ne 2
335.mk
336.na
337\fB\fB-R\fR\fR
338.ad
339.sp .6
340.RS 4n
211.It Fl R
341Display rates (events per second) rather than counts.
212Display rates (events per second) rather than counts.
342.RE
343
344.sp
345.ne 2
346.mk
347.na
348\fB\fB-W\fR\fR
349.ad
350.sp .6
351.RS 4n
213.It Fl W
352Whichever: distinguish events only by caller, not by lock.
214Whichever: distinguish events only by caller, not by lock.
353.RE
354
355.sp
356.ne 2
357.mk
358.na
359\fB\fB-w\fR\fR
360.ad
361.sp .6
362.RS 4n
215.It Fl w
363Wherever: distinguish events only by lock, not by caller.
216Wherever: distinguish events only by lock, not by caller.
364.RE
365
366.SH DISPLAY FORMATS
367.sp
368.LP
217.El
218.Sh DISPLAY FORMATS
369The following headers appear over various columns of data.
219The following headers appear over various columns of data.
370.sp
371.ne 2
372.mk
373.na
374\fB\fBCount\fR or \fBops/s\fR\fR
375.ad
376.sp .6
377.RS 4n
378Number of times this event occurred, or the rate (times per second) if \fB-R\fR was specified.
379.RE
380
381.sp
382.ne 2
383.mk
384.na
385\fB\fBindv\fR\fR
386.ad
387.sp .6
388.RS 4n
220.Bl -tag -width indent
221.It Count or ops/s
222Number of times this event occurred, or the rate (times per second) if
223.Fl R
224was specified.
225.It indv
389Percentage of all events represented by this individual event.
226Percentage of all events represented by this individual event.
390.RE
391
392.sp
393.ne 2
394.mk
395.na
396\fB\fBgenr\fR\fR
397.ad
398.sp .6
399.RS 4n
227.It genr
400Percentage of all events generated by this function.
228Percentage of all events generated by this function.
401.RE
402
403.sp
404.ne 2
405.mk
406.na
407\fB\fBcuml\fR\fR
408.ad
409.sp .6
410.RS 4n
229.It cuml
411Cumulative percentage; a running total of the individuals.
230Cumulative percentage; a running total of the individuals.
412.RE
413
414.sp
415.ne 2
416.mk
417.na
418\fB\fBrcnt\fR\fR
419.ad
420.sp .6
421.RS 4n
422Average reference count. This will always be \fB1\fR for exclusive locks (mutexes, spin locks, rwlocks held as writer) but can be greater than \fB1\fR for shared locks (rwlocks held as reader).
423.RE
424
425.sp
426.ne 2
427.mk
428.na
429\fB\fBnsec\fR\fR
430.ad
431.sp .6
432.RS 4n
433Average duration of the events in nanoseconds, as appropriate for the event. For the profiling event, duration means interrupt latency.
434.RE
435
436.sp
437.ne 2
438.mk
439.na
440\fB\fBLock\fR\fR
441.ad
442.sp .6
443.RS 4n
231.It rcnt
232Average reference count.
233This will always be 1 for exclusive locks (mutexes,
234spin locks, rwlocks held as writer) but can be greater than 1 for shared locks
235(rwlocks held as reader).
236.It nsec
237Average duration of the events in nanoseconds, as appropriate for the event.
238For the profiling event, duration means interrupt latency.
239.It Lock
444Address of the lock; displayed symbolically if possible.
240Address of the lock; displayed symbolically if possible.
445.RE
446
447.sp
448.ne 2
449.mk
450.na
451\fB\fBCPU+PIL\fR\fR
452.ad
453.sp .6
454.RS 4n
455\fBCPU\fR plus processor interrupt level (\fBPIL\fR). For example, if \fBCPU\fR 4 is interrupted while at \fBPIL\fR 6, this will be reported as \fBcpu[4]+6\fR.
456.RE
457
458.sp
459.ne 2
460.mk
461.na
462\fB\fBCaller\fR\fR
463.ad
464.sp .6
465.RS 4n
241.It CPU+PIL
242CPU plus processor interrupt level (PIL).
243For example, if CPU 4 is interrupted while at PIL 6, this will be reported as
244cpu[4]+6.
245.It Caller
466Address of the caller; displayed symbolically if possible.
246Address of the caller; displayed symbolically if possible.
467.RE
247.El
248.Sh EXAMPLES
249.Bl -tag -width 0n
250.It Example 1 Measuring Kernel Lock Contention
251.Pp
252.Li # lockstat sleep 5
253.Bd -literal
254Adaptive mutex spin: 41411 events in 5.011 seconds (8263 events/sec)
468
255
469.SH EXAMPLES
470.LP
471\fBExample 1 \fRMeasuring Kernel Lock Contention
472.sp
473.in +2
474.nf
475example# \fBlockstat sleep 5\fR
476Adaptive mutex spin: 2210 events in 5.055 seconds (437 events/sec)
477.fi
478.in -2
479.sp
480
481.sp
482.in +2
483.nf
484Count indv cuml rcnt nsec Lock Caller
485------------------------------------------------------------------------
486 269 12% 12% 1.00 2160 service_queue background+0xdc
487 249 11% 23% 1.00 86 service_queue qenable_locked+0x64
488 228 10% 34% 1.00 131 service_queue background+0x15c
489 68 3% 37% 1.00 79 0x30000024070 untimeout+0x1c
490 59 3% 40% 1.00 384 0x300066fa8e0 background+0xb0
491 43 2% 41% 1.00 30 rqcred_lock svc_getreq+0x3c
492 42 2% 43% 1.00 341 0x30006834eb8 background+0xb0
493 41 2% 45% 1.00 135 0x30000021058 untimeout+0x1c
494 40 2% 47% 1.00 39 rqcred_lock svc_getreq+0x260
495 37 2% 49% 1.00 2372 0x300068e83d0 hmestart+0x1c4
496 36 2% 50% 1.00 77 0x30000021058 timeout_common+0x4
497 36 2% 52% 1.00 354 0x300066fa120 background+0xb0
498 32 1% 53% 1.00 97 0x30000024070 timeout_common+0x4
499 31 1% 55% 1.00 2923 0x300069883d0 hmestart+0x1c4
500 29 1% 56% 1.00 366 0x300066fb290 background+0xb0
501 28 1% 57% 1.00 117 0x3000001e040 untimeout+0x1c
502 25 1% 59% 1.00 93 0x3000001e040 timeout_common+0x4
503 22 1% 60% 1.00 25 0x30005161110 sync_stream_buf+0xdc
504 21 1% 60% 1.00 291 0x30006834eb8 putq+0xa4
505 19 1% 61% 1.00 43 0x3000515dcb0 mdf_alloc+0xc
506 18 1% 62% 1.00 456 0x30006834eb8 qenable+0x8
507 18 1% 63% 1.00 61 service_queue queuerun+0x168
508 17 1% 64% 1.00 268 0x30005418ee8 vmem_free+0x3c
256Count indv cuml rcnt nsec Lock Caller
257-------------------------------------------------------------------------------
25813750 33% 33% 0.00 72 vm_page_queue_free_mtx vm_page_free_toq+0x12e
25913648 33% 66% 0.00 66 vm_page_queue_free_mtx vm_page_alloc+0x138
260 4023 10% 76% 0.00 51 vm_dom+0x80 vm_page_dequeue+0x68
261 2672 6% 82% 0.00 186 vm_dom+0x80 vm_page_enqueue+0x63
262 618 1% 84% 0.00 31 0xfffff8000cd83a88 qsyncvp+0x37
263 506 1% 85% 0.00 164 0xfffff8000cb3f098 vputx+0x5a
264 477 1% 86% 0.00 69 0xfffff8000c7eb180 uma_dbg_getslab+0x5b
265 288 1% 87% 0.00 77 0xfffff8000cd8b000 vn_finished_write+0x29
266 263 1% 88% 0.00 103 0xfffff8000cbad448 vinactive+0xdc
267 259 1% 88% 0.00 53 0xfffff8000cd8b000 vfs_ref+0x24
268 237 1% 89% 0.00 20 0xfffff8000cbad448 vfs_hash_get+0xcc
269 233 1% 89% 0.00 22 0xfffff8000bfd9480 uma_dbg_getslab+0x5b
270 223 1% 90% 0.00 20 0xfffff8000cb3f098 cache_lookup+0x561
271 193 0% 90% 0.00 16 0xfffff8000cb40ba8 vref+0x27
272 175 0% 91% 0.00 34 0xfffff8000cbad448 vputx+0x5a
273 169 0% 91% 0.00 51 0xfffff8000cd8b000 vfs_unbusy+0x27
274 164 0% 92% 0.00 31 0xfffff8000cb40ba8 vputx+0x5a
509[...]
510
275[...]
276
511R/W reader blocked by writer: 76 events in 5.055 seconds (15 events/sec)
277Adaptive mutex block: 10 events in 5.011 seconds (2 events/sec)
512
278
513Count indv cuml rcnt nsec Lock Caller
514------------------------------------------------------------------------
515 23 30% 30% 1.00 22590137 0x300098ba358 ufs_dirlook+0xd0
516 17 22% 53% 1.00 5820995 0x3000ad815e8 find_bp+0x10
517 13 17% 70% 1.00 2639918 0x300098ba360 ufs_iget+0x198
518 4 5% 75% 1.00 3193015 0x300098ba360 ufs_getattr+0x54
519 3 4% 79% 1.00 7953418 0x3000ad817c0 find_bp+0x10
520 3 4% 83% 1.00 935211 0x3000ad815e8 find_read_lof+0x14
521 2 3% 86% 1.00 16357310 0x300073a4720 find_bp+0x10
522 2 3% 88% 1.00 2072433 0x300073a4720 find_read_lof+0x14
523 2 3% 91% 1.00 1606153 0x300073a4370 find_bp+0x10
524 1 1% 92% 1.00 2656909 0x300107e7400 ufs_iget+0x198
279Count indv cuml rcnt nsec Lock Caller
280-------------------------------------------------------------------------------
281 3 30% 30% 0.00 17592 vm_page_queue_free_mtx vm_page_alloc+0x138
282 2 20% 50% 0.00 20528 vm_dom+0x80 vm_page_enqueue+0x63
283 2 20% 70% 0.00 55502 0xfffff8000cb40ba8 vputx+0x5a
284 1 10% 80% 0.00 12007 vm_page_queue_free_mtx vm_page_free_toq+0x12e
285 1 10% 90% 0.00 9125 0xfffff8000cbad448 vfs_hash_get+0xcc
286 1 10% 100% 0.00 7864 0xfffff8000cd83a88 qsyncvp+0x37
287-------------------------------------------------------------------------------
525[...]
288[...]
526.fi
527.in -2
528.sp
289.Ed
290.It Example 2 Measuring Hold Times
291.Pp
292.Li # lockstat -H -D 10 sleep 1
293.Bd -literal
294Adaptive mutex hold: 109589 events in 1.039 seconds (105526 events/sec)
529
295
530.LP
531\fBExample 2 \fRMeasuring Hold Times
532.sp
533.in +2
534.nf
535example# \fBlockstat -H -D 10 sleep 1\fR
536Adaptive mutex spin: 513 events
537.fi
538.in -2
539.sp
540
541.sp
542.in +2
543.nf
544Count indv cuml rcnt nsec Lock Caller
545-------------------------------------------------------------------------
546 480 5% 5% 1.00 1136 0x300007718e8 putnext+0x40
547 286 3% 9% 1.00 666 0x3000077b430 getf+0xd8
548 271 3% 12% 1.00 537 0x3000077b430 msgio32+0x2fc
549 270 3% 15% 1.00 3670 0x300007718e8 strgetmsg+0x3d4
550 270 3% 18% 1.00 1016 0x300007c38b0 getq_noenab+0x200
551 264 3% 20% 1.00 1649 0x300007718e8 strgetmsg+0xa70
552 216 2% 23% 1.00 6251 tcp_mi_lock tcp_snmp_get+0xfc
553 206 2% 25% 1.00 602 thread_free_lock clock+0x250
554 138 2% 27% 1.00 485 0x300007c3998 putnext+0xb8
555 138 2% 28% 1.00 3706 0x300007718e8 strrput+0x5b8
556-------------------------------------------------------------------------
296Count indv cuml rcnt nsec Lock Caller
297-------------------------------------------------------------------------------
298 8998 8% 8% 0.00 617 0xfffff8000c7eb180 uma_dbg_getslab+0xd4
299 5901 5% 14% 0.00 917 vm_page_queue_free_mtx vm_object_terminate+0x16a
300 5040 5% 18% 0.00 902 vm_dom+0x80 vm_page_free_toq+0x88
301 4884 4% 23% 0.00 1056 vm_page_queue_free_mtx vm_page_alloc+0x44e
302 4664 4% 27% 0.00 759 vm_dom+0x80 vm_fault_hold+0x1a13
303 4011 4% 31% 0.00 888 vm_dom vm_page_advise+0x11b
304 4010 4% 34% 0.00 957 vm_dom+0x80 _vm_page_deactivate+0x5c
305 3743 3% 38% 0.00 582 0xfffff8000cf04838 pmap_is_prefaultable+0x158
306 2254 2% 40% 0.00 952 vm_dom vm_page_free_toq+0x88
307 1639 1% 41% 0.00 591 0xfffff800d60065b8 trap_pfault+0x1f7
308-------------------------------------------------------------------------------
557[...]
309[...]
558.fi
559.in -2
560.sp
561
310
562.LP
563\fBExample 3 \fRMeasuring Hold Times for Stack Traces Containing a Specific Function
564.sp
565.in +2
566.nf
567example# \fBlockstat -H -f tcp_rput_data -s 50 -D 10 sleep 1\fR
568Adaptive mutex spin: 11 events in 1.023 seconds (11
569events/sec)
570.fi
571.in -2
572.sp
311R/W writer hold: 64314 events in 1.039 seconds (61929 events/sec)
573
312
574.sp
575.in +2
576.nf
577-------------------------------------------------------------------------
578Count indv cuml rcnt nsec Lock Caller
313Count indv cuml rcnt nsec Lock Caller
579 9 82% 82% 1.00 2540 0x30000031380 tcp_rput_data+0x2b90
314-------------------------------------------------------------------------------
315 7421 12% 12% 0.00 2994 pvh_global_lock pmap_page_is_mapped+0xb6
316 4668 7% 19% 0.00 3313 pvh_global_lock pmap_enter+0x9ae
317 1639 3% 21% 0.00 733 0xfffff80168d10200 vm_object_deallocate+0x683
318 1639 3% 24% 0.00 3061 0xfffff80168d10200 unlock_and_deallocate+0x2b
319 1639 3% 26% 0.00 2966 0xfffff80168d10200 vm_fault_hold+0x16ee
320 1567 2% 29% 0.00 733 0xfffff80168d10200 vm_fault_hold+0x19bc
321 821 1% 30% 0.00 786 0xfffff801eb0cc000 vm_object_madvise+0x32d
322 649 1% 31% 0.00 4918 0xfffff80191105300 vm_fault_hold+0x16ee
323 648 1% 32% 0.00 8112 0xfffff80191105300 unlock_and_deallocate+0x2b
324 647 1% 33% 0.00 1261 0xfffff80191105300 vm_object_deallocate+0x683
325-------------------------------------------------------------------------------
326.Ed
327.It Example 3 Measuring Hold Times for Stack Traces Containing a Specific Function
328.Pp
329.Li # lockstat -H -f tcp_input -s 50 -D 10 sleep 1
330.Bd -literal
331Adaptive mutex hold: 68 events in 1.026 seconds (66 events/sec)
580
332
581 nsec ------ Time Distribution ------ count Stack
582 256 |@@@@@@@@@@@@@@@@ 5 tcp_rput_data+0x2b90
583 512 |@@@@@@ 2 putnext+0x78
584 1024 |@@@ 1 ip_rput+0xec4
585 2048 | 0 _c_putnext+0x148
586 4096 | 0 hmeread+0x31c
587 8192 | 0 hmeintr+0x36c
588 16384 |@@@ 1
589sbus_intr_wrapper+0x30
590[...]
333-------------------------------------------------------------------------------
334Count indv cuml rcnt nsec Lock Caller
335 32 47% 47% 0.00 1631 0xfffff800686f50d8 tcp_do_segment+0x284b
591
336
337 nsec ------ Time Distribution ------ count Stack
338 1024 |@@@@@@@@@@ 11 tcp_input+0xf54
339 2048 |@@@@@@@@@@@@@ 14 ip_input+0xc8
340 4096 |@@@@@ 6 swi_net+0x192
341 8192 | 1 intr_event_execute_handlers+0x93
342 ithread_loop+0xa6
343 fork_exit+0x84
344 0xffffffff808cf9ee
345-------------------------------------------------------------------------------
592Count indv cuml rcnt nsec Lock Caller
346Count indv cuml rcnt nsec Lock Caller
593 1 9% 91% 1.00 1036 0x30000055380 freemsg+0x44
347 29 43% 90% 0.00 4851 0xfffff800686f50d8 sowakeup+0xf8
594
348
595 nsec ------ Time Distribution ------ count Stack
596 1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1 freemsg+0x44
597 tcp_rput_data+0x2fd0
598 putnext+0x78
599 ip_rput+0xec4
600 _c_putnext+0x148
601 hmeread+0x31c
602 hmeintr+0x36c
603
604sbus_intr_wrapper+0x30
605-------------------------------------------------------------------------
349 nsec ------ Time Distribution ------ count Stack
350 4096 |@@@@@@@@@@@@@@@ 15 tcp_do_segment+0x2423
351 8192 |@@@@@@@@@@@@ 12 tcp_input+0xf54
352 16384 |@@ 2 ip_input+0xc8
353 swi_net+0x192
354 intr_event_execute_handlers+0x93
355 ithread_loop+0xa6
356 fork_exit+0x84
357 0xffffffff808cf9ee
358-------------------------------------------------------------------------------
606[...]
359[...]
607.fi
608.in -2
609.sp
610
611.LP
612\fBExample 4 \fRBasic Kernel Profiling
613.sp
614.LP
615For basic profiling, we don't care whether the profiling interrupt sampled \fBfoo()\fR\fB+0x4c\fR or \fBfoo()\fR\fB+0x78\fR; we care only that it sampled somewhere in \fBfoo()\fR, so we use \fB-k\fR. The \fBCPU\fR and \fBPIL\fR aren't relevant to basic profiling because we are measuring the system as a whole, not a particular \fBCPU\fR or interrupt level, so we use \fB-W\fR.
616
617.sp
618.in +2
619.nf
620example# \fBlockstat -kIW -D 20 ./polltest\fR
621Profiling interrupt: 82 events in 0.424 seconds (194
622events/sec)
623.fi
624.in -2
625.sp
626
627.sp
628.in +2
629.nf
630Count indv cuml rcnt nsec Hottest CPU+PIL Caller
631-----------------------------------------------------------------------
632 8 10% 10% 1.00 698 cpu[1] utl0
633 6 7% 17% 1.00 299 cpu[0] read
634 5 6% 23% 1.00 124 cpu[1] getf
635 4 5% 28% 1.00 327 cpu[0] fifo_read
636 4 5% 33% 1.00 112 cpu[1] poll
637 4 5% 38% 1.00 212 cpu[1] uiomove
638 4 5% 43% 1.00 361 cpu[1] mutex_tryenter
639 3 4% 46% 1.00 682 cpu[0] write
640 3 4% 50% 1.00 89 cpu[0] pcache_poll
641 3 4% 54% 1.00 118 cpu[1] set_active_fd
642 3 4% 57% 1.00 105 cpu[0] syscall_trap32
643 3 4% 61% 1.00 640 cpu[1] (usermode)
644 2 2% 63% 1.00 127 cpu[1] fifo_poll
645 2 2% 66% 1.00 300 cpu[1] fifo_write
646 2 2% 68% 1.00 669 cpu[0] releasef
647 2 2% 71% 1.00 112 cpu[1] bt_getlowbit
648 2 2% 73% 1.00 247 cpu[1] splx
649 2 2% 76% 1.00 503 cpu[0] mutex_enter
650 2 2% 78% 1.00 467 cpu[0]+10 disp_lock_enter
651 2 2% 80% 1.00 139 cpu[1] default_copyin
652-----------------------------------------------------------------------
653[...]
654.fi
655.in -2
656.sp
657
658.LP
659\fBExample 5 \fRGenerated-load Profiling
660.sp
661.LP
662In the example above, 5% of the samples were in \fBpoll()\fR. This tells us how much time was spent inside \fBpoll()\fR itself, but tells us nothing about how much work was \fBgenerated\fR by \fBpoll()\fR; that is, how much time we spent
663in functions called by \fBpoll()\fR. To determine that, we use the \fB-g\fR option. The example below shows that although \fBpolltest\fR spends only 5% of its time in \fBpoll()\fR itself, \fBpoll()\fR-induced work accounts for 34% of
664the load.
665
666.sp
667.LP
668Note that the functions that generate the profiling interrupt (\fBlockstat_intr()\fR, \fBcyclic_fire()\fR, and so forth) appear in every stack trace, and therefore are considered to have generated 100% of the load. This illustrates an important point: the generated
669load percentages do \fBnot\fR add up to 100% because they are not independent. If 72% of all stack traces contain both \fBfoo()\fR and \fBbar()\fR, then both \fBfoo()\fR and \fBbar()\fR are 72% load generators.
670
671.sp
672.in +2
673.nf
674example# \fBlockstat -kgIW -D 20 ./polltest\fR
675Profiling interrupt: 80 events in 0.412 seconds (194 events/sec)
676.fi
677.in -2
678.sp
679
680.sp
681.in +2
682.nf
683Count genr cuml rcnt nsec Hottest CPU+PIL Caller
684-------------------------------------------------------------------------
685 80 100% ---- 1.00 310 cpu[1] lockstat_intr
686 80 100% ---- 1.00 310 cpu[1] cyclic_fire
687 80 100% ---- 1.00 310 cpu[1] cbe_level14
688 80 100% ---- 1.00 310 cpu[1] current_thread
689 27 34% ---- 1.00 176 cpu[1] poll
690 20 25% ---- 1.00 221 cpu[0] write
691 19 24% ---- 1.00 249 cpu[1] read
692 17 21% ---- 1.00 232 cpu[0] write32
693 17 21% ---- 1.00 207 cpu[1] pcache_poll
694 14 18% ---- 1.00 319 cpu[0] fifo_write
695 13 16% ---- 1.00 214 cpu[1] read32
696 10 12% ---- 1.00 208 cpu[1] fifo_read
697 10 12% ---- 1.00 787 cpu[1] utl0
698 9 11% ---- 1.00 178 cpu[0] pcacheset_resolve
699 9 11% ---- 1.00 262 cpu[0] uiomove
700 7 9% ---- 1.00 506 cpu[1] (usermode)
701 5 6% ---- 1.00 195 cpu[1] fifo_poll
702 5 6% ---- 1.00 136 cpu[1] syscall_trap32
703 4 5% ---- 1.00 139 cpu[0] releasef
704 3 4% ---- 1.00 277 cpu[1] polllock
705-------------------------------------------------------------------------
706[...]
707.fi
708.in -2
709.sp
710
711.LP
712\fBExample 6 \fRGathering Lock Contention and Profiling Data for a Specific Module
713.sp
714.LP
715In this example we use the \fB-f\fR option not to specify a single function, but rather to specify the entire text space of the \fBsbus\fR module. We gather both lock contention and profiling statistics so that contention can be correlated with overall load on the
716module.
717
718.sp
719.in +2
720.nf
721example# \fBmodinfo | grep sbus\fR
72224 102a8b6f b8b4 59 1 sbus (SBus (sysio) nexus driver)
723.fi
724.in -2
725.sp
726
727.sp
728.in +2
729.nf
730example# \fBlockstat -kICE -f 0x102a8b6f,0xb8b4 sleep 10\fR
731Adaptive mutex spin: 39 events in 10.042 seconds (4 events/sec)
732.fi
733.in -2
734.sp
735
736.sp
737.in +2
738.nf
739Count indv cuml rcnt nsec Lock Caller
740-------------------------------------------------------------------------
741 15 38% 38% 1.00 206 0x30005160528 sync_stream_buf
742 7 18% 56% 1.00 14 0x30005160d18 sync_stream_buf
743 6 15% 72% 1.00 27 0x300060c3118 sync_stream_buf
744 5 13% 85% 1.00 24 0x300060c3510 sync_stream_buf
745 2 5% 90% 1.00 29 0x300060c2d20 sync_stream_buf
746 2 5% 95% 1.00 24 0x30005161cf8 sync_stream_buf
747 1 3% 97% 1.00 21 0x30005161110 sync_stream_buf
748 1 3% 100% 1.00 23 0x30005160130 sync_stream_buf
749[...]
750
751Adaptive mutex block: 9 events in 10.042 seconds (1 events/sec)
752
753Count indv cuml rcnt nsec Lock Caller
754-------------------------------------------------------------------------
755 4 44% 44% 1.00 156539 0x30005160528 sync_stream_buf
756 2 22% 67% 1.00 763516 0x30005160d18 sync_stream_buf
757 1 11% 78% 1.00 462130 0x300060c3510 sync_stream_buf
758 1 11% 89% 1.00 288749 0x30005161110 sync_stream_buf
759 1 11% 100% 1.00 1015374 0x30005160130 sync_stream_buf
760[...]
761
762Profiling interrupt: 229 events in 10.042 seconds (23 events/sec)
763
764Count indv cuml rcnt nsec Hottest CPU+PIL Caller
765
766-------------------------------------------------------------------------
767 89 39% 39% 1.00 426 cpu[0]+6 sync_stream_buf
768 64 28% 67% 1.00 398 cpu[0]+6 sbus_intr_wrapper
769 23 10% 77% 1.00 324 cpu[0]+6 iommu_dvma_kaddr_load
770 21 9% 86% 1.00 512 cpu[0]+6 iommu_tlb_flush
771 14 6% 92% 1.00 342 cpu[0]+6 iommu_dvma_unload
772 13 6% 98% 1.00 306 cpu[1] iommu_dvma_sync
773 5 2% 100% 1.00 389 cpu[1] iommu_dma_bindhdl
774-------------------------------------------------------------------------
775[...]
776.fi
777.in -2
778.sp
779
780.LP
781\fBExample 7 \fRDetermining the Average PIL (processor interrupt level) for a CPU
782.sp
783.in +2
784.nf
785example# \fBlockstat -Iw -l cpu[3] ./testprog\fR
786
787Profiling interrupt: 14791 events in 152.463 seconds (97 events/sec)
788
789Count indv cuml rcnt nsec CPU+PIL Hottest Caller
790
791-----------------------------------------------------------------------
79213641 92% 92% 1.00 253 cpu[3] (usermode)
793 579 4% 96% 1.00 325 cpu[3]+6 ip_ocsum+0xe8
794 375 3% 99% 1.00 411 cpu[3]+10 splx
795 154 1% 100% 1.00 527 cpu[3]+4 fas_intr_svc+0x80
796 41 0% 100% 1.00 293 cpu[3]+13 send_mondo+0x18
797 1 0% 100% 1.00 266 cpu[3]+12 zsa_rxint+0x400
798-----------------------------------------------------------------------
799[...]
800.fi
801.in -2
802.sp
803
804.LP
805\fBExample 8 \fRDetermining which Subsystem is Causing the System to be Busy
806.sp
807.in +2
808.nf
809example# \fBlockstat -s 10 -I sleep 20\fR
810
811Profiling interrupt: 4863 events in 47.375 seconds (103 events/sec)
812
813Count indv cuml rcnt nsec CPU+PIL Caller
814
815-----------------------------------------------------------------------
8161929 40% 40% 0.00 3215 cpu[0] usec_delay+0x78
817 nsec ------ Time Distribution ------ count Stack
818 4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1872 ata_wait+0x90
819 8192 | 27 acersb_get_intr_status+0x34
82016384 | 29 ata_set_feature+0x124
82132768 | 1 ata_disk_start+0x15c
822 ata_hba_start+0xbc
823 ghd_waitq_process_and \e
824 _mutex_hold+0x70
825 ghd_waitq_process_and \e
826 _mutex_exit+0x4
827 ghd_transport+0x12c
828 ata_disk_tran_start+0x108
829-----------------------------------------------------------------------
830[...]
831.fi
832.in -2
833.sp
834
835.SH ATTRIBUTES
836.sp
837.LP
838See \fBattributes\fR(5) for descriptions of the following attributes:
839.sp
840
841.sp
842.TS
843tab() box;
844cw(2.75i) |cw(2.75i)
845lw(2.75i) |lw(2.75i)
846.
847ATTRIBUTE TYPEATTRIBUTE VALUE
848_
849AvailabilitySUNWdtrc
850.TE
851
852.SH SEE ALSO
853.sp
854.LP
855\fBdtrace\fR(1M), \fBplockstat\fR(1M), \fBattributes\fR(5), \fBlockstat\fR(7D), \fBmutex\fR(9F), \fBrwlock\fR(9F)
856.sp
857.LP
858\fISolaris Dynamic Tracing Guide\fR
859.SH NOTES
860.sp
861.LP
862The profiling support provided by \fBlockstat\fR \fB-I\fR replaces the old (and undocumented) \fB/usr/bin/kgmon\fR and \fB/dev/profile\fR.
863.sp
864.LP
865Tail-call elimination can affect call sites. For example, if \fBfoo()\fR\fB+0x50\fR calls \fBbar()\fR and the last thing \fBbar()\fR does is call \fBmutex_exit()\fR, the compiler can arrange for \fBbar()\fR to
866branch to \fBmutex_exit()\fRwith a return address of \fBfoo()\fR\fB+0x58\fR. Thus, the \fBmutex_exit()\fR in \fBbar()\fR will appear as though it occurred at \fBfoo()\fR\fB+0x58\fR.
867.sp
868.LP
869The \fBPC\fR in the stack frame in which an interrupt occurs can be bogus because, between function calls, the compiler is free to use the return address register for local storage.
870.sp
871.LP
872When using the \fB-I\fR and \fB-s\fR options together, the interrupted PC will usually not appear anywhere in the stack since the interrupt handler is entered asynchronously, not by a function call from that \fBPC\fR.
873.sp
874.LP
875The \fBlockstat\fR technology is provided on an as-is basis. The format and content of \fBlockstat\fR output reflect the current Solaris kernel implementation and are therefore subject to change in future releases.
360.Ed
361.El
362.Sh SEE ALSO
363.Xr dtrace 1 ,
364.Xr ksyms 4 ,
365.Xr locking 9
366.Sh NOTES
367Tail-call elimination can affect call sites.
368For example, if
369.Fn foo Ns +0x50
370calls
371.Fn bar
372and the last thing
373.Fn bar
374does is call
375.Fn mtx_unlock ,
376the compiler can arrange for
377.Fn bar
378to branch to
379.Fn mtx_unlock
380with a return address of
381.Fn foo Ns +0x58.
382Thus, the
383.Fn mtx_unlock
384in
385.Fn bar
386will appear as though it occurred at
387.Fn foo Ns +0x58.
388.Pp
389The PC in the stack frame in which an interrupt occurs can be bogus because,
390between function calls, the compiler is free to use the return address register
391for local storage.
392.Pp
393When using the
394.Fl I
395and
396.Fl s
397options together, the interrupted PC will usually not appear anywhere in the
398stack since the interrupt handler is entered asynchronously, not by a function
399call from that PC.