.\" Copyright (c) 2003-2007 Joseph Koshy
.\" Copyright (c) 2007 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed.  in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd April 23, 2007
.Os
.Dt PMCSTAT 8
.Sh NAME
.Nm pmcstat
.Nd "performance measurement with performance monitoring hardware"
.Sh SYNOPSIS
.Nm
.Op Fl C
.Op Fl D Ar pathname
.Op Fl E
.Op Fl G Ar pathname
.Op Fl M Ar mapfilename
.Op Fl N
.Op Fl O Ar logfilename
.Op Fl P Ar event-spec
.Op Fl R Ar logfilename
.Op Fl S Ar event-spec
.Op Fl W
.Op Fl c Ar cpu-spec
.Op Fl d
.Op Fl g
.Op Fl k Ar kerneldir
.Op Fl n Ar rate
.Op Fl o Ar outputfile
.Op Fl p Ar event-spec
.Op Fl q
.Op Fl r Ar fsroot
.Op Fl s Ar event-spec
.Op Fl t Ar process-spec
.Op Fl v
.Op Fl w Ar secs
.Op Fl z Ar graphdepth
.Op Ar command Op Ar args
.Sh DESCRIPTION
The
.Nm
utility measures system performance using the facilities provided by
.Xr hwpmc 4 .
.Pp
The
.Nm
utility can measure both hardware events seen by the system as a
whole, and those seen when a specified set of processes is executing
on the system's CPUs.
If a specific set of processes is being targeted (for example,
if the
.Fl t Ar process-spec
option is specified, or if a command line is specified using
.Ar command ) ,
then measurement continues until
.Ar command
exits, or until all target processes specified by the
.Fl t Ar process-spec
options exit, or until the
.Nm
utility is interrupted by the user.
If a specific set of processes is not targeted for measurement, then
.Nm
will perform system-wide measurements until interrupted by the
user.
.Pp
A given invocation of
.Nm
can mix allocations of system-mode and process-mode PMCs, of both
counting and sampling flavors.
The values of all counting PMCs are printed in human-readable form
at regular intervals by
.Nm .
The output of sampling PMCs may be configured to go to a log file for
subsequent offline analysis, or, at the expense of greater
overhead, may be configured to be printed in text form on the fly.
.Pp
Hardware events to measure are specified to
.Nm
using event specifier strings
.Ar event-spec .
The syntax of these event specifiers is machine dependent and is
documented in
.Xr pmc 3 .
.Pp
A process-mode PMC may be configured to be inheritable by the target
process' current and future children.
.Sh OPTIONS
The following options are available:
.Bl -tag -width indent
.It Fl C
Toggle between showing cumulative or incremental counts for
subsequent counting mode PMCs specified on the command line.
The default is to show incremental counts.
.It Fl D Ar pathname
Create files with per-program samples in the directory named
by
.Ar pathname .
The default is to create these files in the current directory.
.It Fl E
Toggle showing per-process counts at the time a tracked process
exits for subsequent process-mode PMCs specified on the command line.
This option is useful for mapping the performance characteristics of a
complex pipeline of processes when used in conjunction with the
.Fl d
option.
The default is not to enable per-process tracking.
.It Fl G Ar pathname
Print callchain information to file
.Ar pathname .
If argument
.Ar pathname
is a
.Dq Li - ,
this information is sent to the output file specified by the
.Fl o
option.
.It Fl M Ar mapfilename
Write the mapping between executable objects encountered in the event
log and the abbreviated pathnames used for
.Xr gprof 1
profiles to file
.Ar mapfilename .
If this option is not specified, mapping information is not written.
Argument
.Ar mapfilename
may be a
.Dq Li - ,
in which case this mapping information is sent to the output
file configured by the
.Fl o
option.
.It Fl N
Toggle capturing callchain information for subsequent sampling PMCs.
The default is for sampling PMCs to capture callchain information.
.It Fl O Ar logfilename
Send logging output to file
.Ar logfilename .
If
.Ar logfilename
is of the form
.Ar hostname Ns : Ns Ar port ,
where
.Ar hostname
does not start with a
.Ql \&.
or a
.Ql / ,
then
.Nm
will open a network socket to host
.Ar hostname
on port
.Ar port .
.Pp
If the
.Fl O
option is not specified and one of the logging options is requested,
then
.Nm
will print a textual form of the logged events to the configured
output file.
.It Fl P Ar event-spec
Allocate a process mode sampling PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl R Ar logfilename
Perform offline analysis using sampling data in file
.Ar logfilename .
.It Fl S Ar event-spec
Allocate a system mode sampling PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl W
Toggle logging the incremental counts seen by the threads of a
tracked process each time they are scheduled on a CPU.
This is an experimental feature intended to help analyse the
dynamic behaviour of processes in the system.
It may incur substantial overhead if enabled.
The default is for this feature to be disabled.
.It Fl c Ar cpu-spec
Set the CPUs for subsequent system mode PMCs specified on the
command line to
.Ar cpu-spec .
Argument
.Ar cpu-spec
is a comma-separated list of CPU numbers, or the literal
.Sq *
denoting all CPUs.
The default is to allocate system mode PMCs on all active CPUs in
the system.
.It Fl d
Toggle between process mode PMCs measuring events for the target
process' current and future children or only measuring events for
the target process.
The default is to measure events for the target process alone.
.It Fl g
Produce profiles in a format compatible with
.Xr gprof 1 .
A separate profile file is generated for each executable object
encountered.
Profile files are placed in sub-directories named by their PMC
event name.
.It Fl k Ar kerneldir
Set the pathname of the kernel directory to argument
.Ar kerneldir .
This directory specifies where
.Nm
should look for the kernel and its modules.
The default is
.Pa /boot/kernel .
.It Fl n Ar rate
Set the default sampling rate for subsequent sampling mode
PMCs specified on the command line.
The default is to configure PMCs to sample the CPU's instruction
pointer every 65536 events.
.It Fl o Ar outputfile
Send counter readings and textual representations of logged data
to file
.Ar outputfile .
The default is to send output to
.Pa stderr
when collecting live data and to
.Pa stdout
when processing a pre-existing logfile.
.It Fl p Ar event-spec
Allocate a process mode counting PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl q
Decrease verbosity.
.It Fl r Ar fsroot
Set the top of the filesystem hierarchy under which executables
are located to argument
.Ar fsroot .
The default is
.Pa / .
.It Fl s Ar event-spec
Allocate a system mode counting PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl t Ar process-spec
Attach process mode PMCs to the processes named by argument
.Ar process-spec .
Argument
.Ar process-spec
may be a non-negative integer denoting a specific process id, or a
regular expression for selecting processes based on their command names.
.It Fl v
Increase verbosity.
.It Fl w Ar secs
Print the values of all counting mode PMCs every
.Ar secs
seconds.
The argument
.Ar secs
may be a fractional value.
The default interval is 5 seconds.
.It Fl z Ar graphdepth
When printing system-wide callgraphs, limit callgraphs to the depth
specified by argument
.Ar graphdepth .
.El
.Pp
If
.Ar command
is specified, it is executed using
.Xr execvp 3 .
.Sh EXAMPLES
To perform system-wide statistical sampling on an AMD Athlon CPU, with
samples taken every 32768 instruction retirals and written
to file
.Pa sample.stat ,
use:
.Dl "pmcstat -O sample.stat -n 32768 -S k7-retired-instructions"
.Pp
To execute
.Nm mozilla
and measure the number of data cache misses suffered
by it and its children every 12 seconds on an AMD Athlon, use:
.Dl "pmcstat -d -w 12 -p k7-dc-misses mozilla"
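.Pp
Because
.Ar command
is executed using
.Xr execvp 3 ,
a shell pipeline has to be wrapped in an explicit shell invocation.
As a sketch (the pipeline shown is only an illustrative placeholder, and
the availability of the
.Dq instructions
event on the CPU is assumed), per-process instruction counts for each
stage of a pipeline could be shown as each process exits with:
.Dl "pmcstat -d -E -p instructions sh -c 'sort /etc/services | uniq | wc -l'"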
.Pp
To measure processor instructions retired for all processes named
.Dq emacs ,
use:
.Dl "pmcstat -t '^emacs$' -p instructions"
.Pp
To count instruction TLB misses on CPUs 0 and 2 on an Intel
Pentium Pro/Pentium III SMP system, use:
.Dl "pmcstat -c 0,2 -s p6-itlb-miss"
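.Pp
As a further sketch, assuming the
.Dq instructions
event is available on the CPU and using an illustrative output pathname,
cumulative system-wide instruction counts could be written to a file
every half second with:
.Dl "pmcstat -C -w 0.5 -o /tmp/counts.txt -s instructions"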
.Pp
To collect profiling information for a specific process with pid 1234
based on instruction cache misses seen by it use:
.Dl "pmcstat -P ic-misses -t 1234 -O /tmp/sample.out"
.Pp
To perform system-wide sampling on all configured processors
based on processor instructions retired use:
.Dl "pmcstat -S instructions -O /tmp/sample.out"
If callgraph capture is not desired use:
.Dl "pmcstat -N -S instructions -O /tmp/sample.out"
.Pp
To send the generated event log to a remote machine use:
.Dl "pmcstat -S instructions -O remotehost:port"
On the remote machine, the sample log can be collected using
.Xr nc 1 :
.Dl "nc -l remotehost port > /tmp/sample.out"
.Pp
To generate
.Xr gprof 1
compatible profiles from a sample file use:
.Dl "pmcstat -R /tmp/sample.out -g"
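.Pp
The
.Fl D
and
.Fl M
options can be combined with
.Fl g .
As a sketch, assuming the illustrative directory
.Pa /tmp/profiles
already exists, the following would place the profiles under that
directory and record the mapping of executable names in
.Pa /tmp/sample.map :
.Dl "pmcstat -R /tmp/sample.out -g -D /tmp/profiles -M /tmp/sample.map"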
.Pp
To print a system-wide profile with callgraphs to file
.Pa "foo.graph"
use:
.Dl "pmcstat -R /tmp/sample.out -G foo.graph"
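.Pp
If the kernel and the executables that were sampled are not installed in
their default locations, the
.Fl k
and
.Fl r
options can point
.Nm
at them.
As a sketch (the pathnames
.Pa /boot/kernel.old
and
.Pa /mnt/image
are hypothetical), a callgraph limited to a depth of 8 could be
produced with:
.Dl "pmcstat -R /tmp/sample.out -k /boot/kernel.old -r /mnt/image -z 8 -G foo.graph"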
.Sh DIAGNOSTICS
.Ex -std
.Sh COMPATIBILITY
Due to the limitations of the
.Pa gmon.out
file format,
.Xr gprof 1
compatible profiles generated by the
.Fl g
option do not contain information about calls that cross executable
boundaries.
The generated
.Pa gmon.out
files are also only meaningful for native executables.
.Sh SEE ALSO
.Xr gprof 1 ,
.Xr nc 1 ,
.Xr execvp 3 ,
.Xr pmc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4 ,
.Xr pmccontrol 8 ,
.Xr sysctl 8
.Sh HISTORY
The
.Nm
utility first appeared in
.Fx 6.0 .
It is
.Ud
.Sh AUTHORS
.An Joseph Koshy Aq jkoshy@FreeBSD.org
.Sh BUGS
The
.Nm
utility cannot yet analyse
.Xr hwpmc 4
logs generated by non-native architectures.
378