.\" Copyright (c) 2003-2007 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. In no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.\" such damage.
.\"
.\" $FreeBSD$
.\"
.Dd April 22, 2007
.Os
.Dt PMCSTAT 8
.Sh NAME
.Nm pmcstat
.Nd "performance measurement with performance monitoring hardware"
.Sh SYNOPSIS
.Nm
.Op Fl C
.Op Fl D Ar pathname
.Op Fl E
.Op Fl M Ar mapfilename
.Op Fl O Ar logfilename
.Op Fl P Ar event-spec
.Op Fl R Ar logfilename
.Op Fl S Ar event-spec
.Op Fl W
.Op Fl c Ar cpu-spec
.Op Fl d
.Op Fl g
.Op Fl k Ar kerneldir
.Op Fl n Ar rate
.Op Fl o Ar outputfile
.Op Fl p Ar event-spec
.Op Fl q
.Op Fl r Ar fsroot
.Op Fl s Ar event-spec
.Op Fl t Ar pid
.Op Fl v
.Op Fl w Ar secs
.Op Ar command Op Ar args
.Sh DESCRIPTION
The
.Nm
utility measures system performance using the facilities provided by
.Xr hwpmc 4 .
.Pp
The
.Nm
utility can measure both hardware events seen by the system as a
whole, and those seen when a specified process is executing on the
system's CPUs.
If a specific process is being targeted (for example,
if the
.Fl t Ar pid
option is specified, or if a command line is specified using
.Ar command ) ,
then measurement occurs till the target process exits or
the
.Nm
utility is interrupted by the user.
If a specific process is not targeted for measurement, then
.Nm
will perform system-wide measurements till interrupted by the
user.
.Pp
A given invocation of
.Nm
can mix allocations of system-mode and process-mode PMCs, of both
counting and sampling flavors.
The values of all counting PMCs are printed in human readable form
at regular intervals by
.Nm .
The output of sampling PMCs may be configured to go to a log file for
subsequent offline analysis, or, at the expense of greater
overhead, may be configured to be printed in text form on the fly.
.Pp
Hardware events to measure are specified to
.Nm
using event specifier strings
.Ar event-spec .
The syntax of these event specifiers is machine dependent and is
documented in
.Xr pmc 3 .
.Pp
A process-mode PMC may be configured to be inheritable by the target
process' current and future children.
.Sh OPTIONS
The following options are available:
.Bl -tag -width indent
.It Fl C
Toggle between showing cumulative or incremental counts for
subsequent counting mode PMCs specified on the command line.
The default is to show incremental counts.
.It Fl D Ar pathname
Create files with per-program samples in the directory named
by
.Ar pathname .
The default is to create these files in the current directory.
.It Fl E
Toggle showing per-process counts at the time a tracked process
exits for subsequent process-mode PMCs specified on the command line.
This option is useful for mapping the performance characteristics of a
complex pipeline of processes when used in conjunction with the
.Fl d
option.
The default is not to enable per-process tracking.
.It Fl M Ar mapfilename
Write the mapping between executable objects encountered in the event
log and the abbreviated pathnames used for
.Xr gprof 1
profiles to file
.Ar mapfilename .
If this option is not specified, mapping information is not written.
Argument
.Ar mapfilename
may be a
.Dq Li -
in which case this mapping information is sent to the output
file configured by the
.Fl o
option.
.It Fl O Ar logfilename
Send logging output to file
.Ar logfilename .
If
.Ar logfilename
is of the form
.Ar hostname Ns : Ns Ar port ,
where
.Ar hostname
does not start with a
.Ql \&.
or a
.Ql / ,
then
.Nm
will open a network socket to host
.Ar hostname
on port
.Ar port .
.Pp
If the
.Fl O
option is not specified and one of the logging options is requested,
then
.Nm
will print a textual form of the logged events to the configured
output file.
.It Fl P Ar event-spec
Allocate a process mode sampling PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl R Ar logfilename
Perform offline analysis using sampling data in file
.Ar logfilename .
.It Fl S Ar event-spec
Allocate a system mode sampling PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl W
Toggle logging the incremental counts seen by the threads of a
tracked process each time they are scheduled on a CPU.
This is an experimental feature intended to help analyse the
dynamic behaviour of processes in the system.
It may incur substantial overhead if enabled.
The default is for this feature to be disabled.
.It Fl c Ar cpu-spec
Set the CPUs for subsequent system mode PMCs specified on the
command line to
.Ar cpu-spec .
Argument
.Ar cpu-spec
is a comma-separated list of CPU numbers, or the literal
.Sq *
denoting all CPUs.
The default is to allocate system mode PMCs on all CPUs.
.It Fl d
Toggle between process mode PMCs measuring events for the target
process' current and future children or only measuring events for
the target process.
The default is to measure events for the target process alone.
.It Fl g
Produce flat execution profiles in a format compatible with
.Xr gprof 1 .
A separate profile file is generated for each executable object
encountered.
Profile files are placed in sub-directories named by their PMC
event name.
.It Fl k Ar kerneldir
Set the pathname of the kernel directory to argument
.Ar kerneldir .
This directory specifies where
.Nm
should look for the kernel and its modules.
The default is
.Pa /boot/kernel .
.It Fl n Ar rate
Set the default sampling rate for subsequent sampling mode
PMCs specified on the command line.
The default is to configure PMCs to sample the CPU's instruction
pointer every 65536 events.
.It Fl o Ar outputfile
Send counter readings and textual representations of logged data
to file
.Ar outputfile .
The default is to send output to
.Pa stderr .
.It Fl p Ar event-spec
Allocate a process mode counting PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl q
Decrease verbosity.
.It Fl r Ar fsroot
Set the top of the filesystem hierarchy under which executables
are located to argument
.Ar fsroot .
The default is
.Pa / .
.It Fl s Ar event-spec
Allocate a system mode counting PMC measuring hardware events
specified in
.Ar event-spec .
.It Fl t Ar pid
Attach all process mode PMCs allocated to the process with PID
.Ar pid .
The option is not allowed in conjunction with specifying a
command using
.Ar command .
.It Fl v
Increase verbosity.
.It Fl w Ar secs
Print the values of all counting mode PMCs every
.Ar secs
seconds.
The argument
.Ar secs
may be a fractional value.
The default interval is 5 seconds.
.El
.Pp
If
.Ar command
is specified, it is executed using
.Xr execvp 3 .
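.Pp
The rule by which the
.Fl O
option tells a network destination from a file can be sketched in Python.
This is an illustration only, not the code that
.Nm
uses.
The helper name
.Li classify_log_destination
is hypothetical, and two details are assumptions: the argument is split
at the first colon, and an empty host or port is treated as a plain filename.

```python
def classify_log_destination(logfilename):
    """Return ("socket", host, port) or ("file", path) following the -O rule.

    Hypothetical helper for illustration only; it is not part of pmcstat.
    """
    # Assumption: split at the first colon into host and port parts.
    host, sep, port = logfilename.partition(":")
    # A leading "." or "/" marks a pathname, even if it contains a colon.
    if sep and host and port and not host.startswith((".", "/")):
        return ("socket", host, port)
    # Assumption: anything else, including an empty host or port, is a file.
    return ("file", logfilename)


print(classify_log_destination("remotehost:9000"))
print(classify_log_destination("./logs:today"))
print(classify_log_destination("/tmp/sample.out"))
```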
.Sh EXAMPLES
To perform system-wide statistical sampling on an AMD Athlon CPU with
samples taken every 32768 instruction retirals and data being sampled
to file
.Pa sample.stat ,
use:
.Dl "pmcstat -O sample.stat -n 32768 -S k7-retired-instructions"
.Pp
To execute
.Nm mozilla
and measure the number of data cache misses suffered
by it and its children every 12 seconds on an AMD Athlon, use:
.Dl "pmcstat -d -w 12 -p k7-dc-misses mozilla"
.Pp
To count instruction tlb-misses on CPUs 0 and 2 on an Intel
Pentium Pro/Pentium III SMP system use:
.Dl "pmcstat -c 0,2 -s p6-itlb-miss"
.Pp
To perform system-wide sampling on all configured processors
based on processor instructions retired use:
.Dl "pmcstat -S instructions -O /tmp/sample.out"
.Pp
To send the generated event log to a remote machine use:
.Dl "pmcstat -S instructions -O remotehost:port"
On the remote machine, the sample log can be collected using
.Xr nc 1 :
.Dl "nc -l remotehost port > /tmp/sample.out"
.Pp
To generate
.Xr gprof 1
compatible flat profiles from a sample file use:
.Dl "pmcstat -R /tmp/sample.out -g"
.Sh DIAGNOSTICS
.Ex -std
.Sh SEE ALSO
.Xr gprof 1 ,
.Xr nc 1 ,
.Xr execvp 3 ,
.Xr pmc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4 ,
.Xr pmccontrol 8 ,
.Xr sysctl 8
.Sh HISTORY
The
.Nm
utility first appeared in
.Fx 6.0 .
It is
.Ud
.Sh AUTHORS
.An Joseph Koshy Aq jkoshy@FreeBSD.org
.Sh BUGS
The
.Nm
utility cannot yet analyse
.Xr hwpmc 4
logs generated by non-native architectures.