.\" Copyright (c) 2003-2007 Joseph Koshy.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed.  in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd April 23, 2007
.Os
.Dt PMCSTAT 8
.Sh NAME
.Nm pmcstat
.Nd "performance measurement with performance monitoring hardware"
.Sh SYNOPSIS
.Nm
.Op Fl C
.Op Fl D Ar pathname
.Op Fl E
.Op Fl M Ar mapfilename
.Op Fl O Ar logfilename
.Op Fl P Ar event-spec
.Op Fl R Ar logfilename
.Op Fl S Ar event-spec
.Op Fl W
.Op Fl c Ar cpu-spec
.Op Fl d
.Op Fl g
.Op Fl k Ar kerneldir
.Op Fl n Ar rate
.Op Fl o Ar outputfile
.Op Fl p Ar event-spec
.Op Fl q
.Op Fl r Ar fsroot
.Op Fl s Ar event-spec
.Op Fl t Ar process-spec
.Op Fl v
.Op Fl w Ar secs
.Op Ar command Op Ar args
.Sh DESCRIPTION
The
.Nm
utility measures system performance using the facilities provided by
.Xr hwpmc 4 .
.Pp
The
.Nm
utility can measure both hardware events seen by the system as a
whole, and those seen when a specified set of processes are executing
on the system's CPUs.
If a specific set of processes is being targeted (for example,
if the
.Fl t Ar process-spec
option is specified, or if a command line is specified using
.Ar command ) ,
then measurement occurs till
.Ar command
exits, or till all target processes specified by the
.Fl t Ar process-spec
options exit, or till the
.Nm
utility is interrupted by the user.
If a specific set of processes is not targeted for measurement, then
.Nm
will perform system-wide measurements till interrupted by the
user.
.Pp
A given invocation of
.Nm
can mix allocations of system-mode and process-mode PMCs, of both
counting and sampling flavors.
The values of all counting PMCs are printed in human readable form
at regular intervals by
.Nm .
The output of sampling PMCs may be configured to go to a log file for
subsequent offline analysis, or, at the expense of greater
overhead, may be configured to be printed in text form on the fly.
.Pp
Hardware events to measure are specified to
.Nm
using event specifier strings
.Ar event-spec .
The syntax of these event specifiers is machine dependent and is
documented in
.Xr pmc 3 .
.Pp
A process-mode PMC may be configured to be inheritable by the target
process' current and future children.
.Sh OPTIONS
The following options are available:
.Bl -tag -width indent
.It Fl C
Toggle between showing cumulative or incremental counts for
subsequent counting mode PMCs specified on the command line.
The default is to show incremental counts.
.It Fl D Ar pathname
Create files with per-program samples in the directory named
by
.Ar pathname .
The default is to create these files in the current directory.
.It Fl E
Toggle showing per-process counts at the time a tracked process
exits for subsequent process-mode PMCs specified on the command line.
This option is useful for mapping the performance characteristics of a
complex pipeline of processes when used in conjunction with the
.Fl d
option.
The default is not to enable per-process tracking.
.It Fl M Ar mapfilename
Write the mapping between executable objects encountered in the event
log and the abbreviated pathnames used for
.Xr gprof 1
profiles to file
.Ar mapfilename .
If this option is not specified, mapping information is not written.
Argument
.Ar mapfilename
may be a
.Dq Li -
in which case this mapping information is sent to the output
file configured by the
.Fl o
option.
.It Fl O Ar logfilename
Send logging output to file
.Ar logfilename .
If
.Ar logfilename
is of the form
.Ar hostname Ns : Ns Ar port ,
where
.Ar hostname
does not start with a
.Ql \&.
or a
.Ql / ,
then
.Nm
will open a network socket to host
.Ar hostname
on port
.Ar port .
.Pp
If the
.Fl O
option is not specified and one of the logging options is requested,
then
.Nm
will print a textual form of the logged events to the configured
output file.
.It Fl P Ar event-spec
Allocate a process mode sampling PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl R Ar logfilename
Perform offline analysis using sampling data in file
.Ar logfilename .
.It Fl S Ar event-spec
Allocate a system mode sampling PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl W
Toggle logging the incremental counts seen by the threads of a
tracked process each time they are scheduled on a CPU.
This is an experimental feature intended to help analyse the
dynamic behaviour of processes in the system.
It may incur substantial overhead if enabled.
The default is for this feature to be disabled.
.It Fl c Ar cpu-spec
Set the CPUs for subsequent system mode PMCs specified on the
command line to
.Ar cpu-spec .
Argument
.Ar cpu-spec
is a comma-separated list of CPU numbers, or the literal
.Sq *
denoting all CPUs.
The default is to allocate system mode PMCs on all CPUs.
.It Fl d
Toggle between process mode PMCs measuring events for the target
process' current and future children or only measuring events for
the target process.
The default is to measure events for the target process alone.
.It Fl g
Produce flat execution profiles in a format compatible with
.Xr gprof 1 .
A separate profile file is generated for each executable object
encountered.
Profile files are placed in sub-directories named by their PMC
event name.
.It Fl k Ar kerneldir
Set the pathname of the kernel directory to argument
.Ar kerneldir .
This directory specifies where
.Nm
should look for the kernel and its modules.
The default is
.Pa /boot/kernel .
.It Fl n Ar rate
Set the default sampling rate for subsequent sampling mode
PMCs specified on the command line.
The default is to configure PMCs to sample the CPU's instruction
pointer every 65536 events.
.It Fl o Ar outputfile
Send counter readings and textual representations of logged data
to file
.Ar outputfile .
The default is to send output to
.Pa stderr .
.It Fl p Ar event-spec
Allocate a process mode counting PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl q
Decrease verbosity.
.It Fl r Ar fsroot
Set the top of the filesystem hierarchy under which executables
are located to argument
.Ar fsroot .
The default is
.Pa / .
.It Fl s Ar event-spec
Allocate a system mode counting PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl t Ar process-spec
Attach process mode PMCs to the processes named by argument
.Ar process-spec .
Argument
.Ar process-spec
may be a non-negative integer denoting a specific process id, or a
regular expression for selecting processes based on their command names.
.It Fl v
Increase verbosity.
.It Fl w Ar secs
Print the values of all counting mode PMCs every
.Ar secs
seconds.
The argument
.Ar secs
may be a fractional value.
The default interval is 5 seconds.
.El
.Pp
If
.Ar command
is specified, it is executed using
.Xr execvp 3 .
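.Pp
The rule given above for the argument to
.Fl O
can be illustrated with a small
.Xr sh 1
sketch.
The function name
.Li log_target_kind
is hypothetical and is not part of
.Nm .
The sketch also assumes that both the host name and the port must be
non-empty, which this manual page does not state.
.Bd -literal -offset indent
```sh
# Hypothetical helper, not part of pmcstat: report whether an argument
# to -O would be read as a network target or as a plain pathname,
# following the rule described in this manual page.
log_target_kind() {
    case "$1" in
        .*|/*) echo file ;;    # a leading '.' or '/' always means a pathname
        ?*:?*) echo socket ;;  # assumed: non-empty host, ':', non-empty port
        *)     echo file ;;    # anything else is an ordinary file name
    esac
}
```
.Ed
.Pp
Under these assumptions,
.Li "log_target_kind remotehost:9000"
prints
.Dq socket ,
while
.Li "log_target_kind ./remotehost:9000"
prints
.Dq file ,
which shows how a pathname containing a colon can still be used as a
log file.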
.Sh EXAMPLES
To perform system-wide statistical sampling on an AMD Athlon CPU with
samples taken every 32768 instruction retirals and data being sampled
to file
.Pa sample.stat ,
use:
.Dl "pmcstat -O sample.stat -n 32768 -S k7-retired-instructions"
.Pp
To execute
.Nm mozilla
and measure the number of data cache misses suffered
by it and its children every 12 seconds on an AMD Athlon, use:
.Dl "pmcstat -d -w 12 -p k7-dc-misses mozilla"
.Pp
To measure processor instructions retired for all processes named
.Dq emacs ,
use:
.Dl "pmcstat -t '^emacs$' -p instructions"
.Pp
To count instruction TLB misses on CPUs 0 and 2 on an Intel
Pentium Pro/Pentium III SMP system, use:
.Dl "pmcstat -c 0,2 -s p6-itlb-miss"
.Pp
To perform system-wide sampling on all configured processors
based on processor instructions retired, use:
.Dl "pmcstat -S instructions -O /tmp/sample.out"
.Pp
To send the generated event log to a remote machine, use:
.Dl "pmcstat -S instructions -O remotehost:port"
On the remote machine, the sample log can be collected using
.Xr nc 1 :
.Dl "nc -l remotehost port > /tmp/sample.out"
.Pp
To generate
.Xr gprof 1
compatible flat profiles from a sample file, use:
.Dl "pmcstat -R /tmp/sample.out -g"
.Sh DIAGNOSTICS
.Ex -std
.Sh SEE ALSO
.Xr gprof 1 ,
.Xr nc 1 ,
.Xr execvp 3 ,
.Xr pmc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4 ,
.Xr pmccontrol 8 ,
.Xr sysctl 8
.Sh HISTORY
The
.Nm
utility first appeared in
.Fx 6.0 .
It is
.Ud
.Sh AUTHORS
.An Joseph Koshy Aq jkoshy@FreeBSD.org
.Sh BUGS
The
.Nm
utility cannot yet analyse
.Xr hwpmc 4
logs generated by non-native architectures.