.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd November 24, 2008
.Dt PMC 3
.Os
.Sh NAME
.Nm pmc
.Nd library for accessing hardware performance monitoring counters
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
The
.Lb libpmc
provides a programming interface that allows applications to use
hardware performance counters to gather performance data about
specific processes or for the system as a whole.
The library is implemented using the lower-level facilities offered by
the
.Xr hwpmc 4
driver.
.Ss Key Concepts
Performance monitoring counters (PMCs) are represented by the library
using a software abstraction.
These
.Dq abstract
PMCs can have two scopes:
.Bl -bullet
.It
System scope.
These PMCs measure events in a whole-system manner, i.e., independent
of the currently executing thread.
System scope PMCs are allocated on specific CPUs and do not
migrate between CPUs.
Non-privileged processes are allowed to allocate system scope PMCs if the
.Xr hwpmc 4
sysctl tunable
.Va security.bsd.unprivileged_syspmcs
is non-zero.
.It
Process scope.
These PMCs only measure hardware events when the processes they are
attached to are executing on a CPU.
In an SMP system, process scope PMCs migrate between CPUs along with
their target processes.
.El
.Pp
Orthogonal to PMC scope, PMCs may be allocated in one of two
operational modes:
.Bl -bullet
.It
Counting PMCs measure events according to their scope
(system or process).
The application needs to explicitly read these counters
to retrieve their value.
.It
Sampling PMCs cause the CPU to be periodically interrupted
and information about its state of execution to be collected.
Sampling PMCs are used to profile specific processes and kernel
threads or to profile the system as a whole.
.El
.Pp
The scope and operational mode for a software PMC are specified at
PMC allocation time.
An application is allowed to allocate multiple PMCs subject
to availability of hardware resources.
.Pp
The library uses human-readable strings to name the event being
measured by hardware.
The syntax used for specifying a hardware event along with additional
event specific qualifiers (if any) is described in detail in section
.Sx "EVENT SPECIFIERS"
below.
.Pp
PMCs are associated with the process that allocated them and
will be automatically reclaimed by the system when the process exits.
Additionally, process-scope PMCs have to be attached to one or more
target processes before they can perform measurements.
A process-scope PMC may be attached to those target processes
that its owner process would otherwise be permitted to debug.
An owner process may attach PMCs to itself allowing
it to measure its own behavior.
Additionally, on some machine architectures, such self-attached PMCs
may be read cheaply using specialized instructions supported by the
processor.
.Pp
Certain kinds of PMCs require that a log file be configured before
they may be started.
These include:
.Bl -bullet
.It
System scope sampling PMCs.
.It
Process scope sampling PMCs.
.It
Process scope counting PMCs that have been configured to report PMC
readings on process context switches or process exits.
.El
.Pp
Up to one log file may be configured per owner process.
Events logged to a log file may be subsequently analyzed using the
.Xr pmclog 3
family of functions.
.Ss Supported CPUs
The CPUs known to the PMC library are named by the
.Vt "enum pmc_cputype"
enumeration.
Supported CPUs include:
.Pp
.Bl -tag -width "Li PMC_CPU_INTEL_CORE2" -compact
.It Li PMC_CPU_AMD_K7
.Tn "AMD Athlon"
CPUs.
.It Li PMC_CPU_AMD_K8
.Tn "AMD Athlon64"
CPUs.
.It Li PMC_CPU_INTEL_ATOM
.Tn Intel
.Tn Atom
CPUs and other CPUs conforming to version 3 of the
.Tn Intel
performance measurement architecture.
.It Li PMC_CPU_INTEL_CORE
.Tn Intel
.Tn Core Solo
and
.Tn Core Duo
CPUs, and other CPUs conforming to version 1 of the
.Tn Intel
performance measurement architecture.
.It Li PMC_CPU_INTEL_CORE2
.Tn Intel
.Tn "Core2 Solo" ,
.Tn "Core2 Duo"
and
.Tn "Core2 Extreme"
CPUs, and other CPUs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
.It Li PMC_CPU_INTEL_P5
.Tn Intel
.Tn "Pentium"
CPUs.
.It Li PMC_CPU_INTEL_P6
.Tn Intel
.Tn "Pentium Pro"
CPUs.
.It Li PMC_CPU_INTEL_PII
.Tn "Intel Pentium II"
CPUs.
.It Li PMC_CPU_INTEL_PIII
.Tn "Intel Pentium III"
CPUs.
.It Li PMC_CPU_INTEL_PIV
.Tn "Intel Pentium 4"
CPUs.
.It Li PMC_CPU_INTEL_PM
.Tn "Intel Pentium M"
CPUs.
.El
.Ss Supported PMCs
PMCs supported by this library are named by the
.Vt enum pmc_class
enumeration.
Supported PMC kinds include:
.Pp
.Bl -tag -width "Li PMC_CLASS_IAF" -compact
.It Li PMC_CLASS_IAF
Fixed function hardware counters present in CPUs conforming to the
.Tn Intel
performance measurement architecture version 2 and later.
.It Li PMC_CLASS_IAP
Programmable hardware counters present in CPUs conforming to the
.Tn Intel
performance measurement architecture version 1 and later.
.It Li PMC_CLASS_K7
Programmable hardware counters present in
.Tn "AMD Athlon"
CPUs.
.It Li PMC_CLASS_K8
Programmable hardware counters present in
.Tn "AMD Athlon64"
CPUs.
.It Li PMC_CLASS_P4
Programmable hardware counters present in
.Tn "Intel Pentium 4"
CPUs.
.It Li PMC_CLASS_P5
Programmable hardware counters present in
.Tn Intel
.Tn Pentium
CPUs.
.It Li PMC_CLASS_P6
Programmable hardware counters present in
.Tn Intel
.Tn "Pentium Pro" ,
.Tn "Pentium II" ,
.Tn "Pentium III" ,
.Tn "Celeron" ,
and
.Tn "Pentium M"
CPUs.
.It Li PMC_CLASS_TSC
The timestamp counter on i386 and amd64 architecture CPUs.
.It Li PMC_CLASS_SOFT
Software events.
.El
.Ss PMC Capabilities
Capabilities of performance monitoring hardware are denoted using
the
.Vt "enum pmc_caps"
enumeration.
Supported capabilities include:
.Pp
.Bl -tag -width "Li PMC_CAP_INTERRUPT" -compact
.It Li PMC_CAP_CASCADE
The ability to cascade counters.
.It Li PMC_CAP_EDGE
The ability to count negated to asserted transitions of the hardware
conditions being probed for.
.It Li PMC_CAP_INTERRUPT
The ability to interrupt the CPU.
.It Li PMC_CAP_INVERT
The ability to invert the sense of the hardware conditions being
measured.
.It Li PMC_CAP_PRECISE
The ability to perform precise sampling.
.It Li PMC_CAP_QUALIFIER
The hardware allows monitored events to be further qualified in some
system dependent way.
.It Li PMC_CAP_READ
The ability to read from performance counters.
.It Li PMC_CAP_SYSTEM
The ability to restrict counting of hardware events to when the CPU is
running privileged code.
.It Li PMC_CAP_THRESHOLD
The ability to ignore simultaneous hardware events below a
programmable threshold.
.It Li PMC_CAP_USER
The ability to restrict counting of hardware events to those when the
CPU is running unprivileged code.
.It Li PMC_CAP_WRITE
The ability to write to performance counters.
.El
.Ss CPU Naming Conventions
CPUs are named using small integers from zero up to, but
excluding, the value returned by function
.Fn pmc_ncpu .
On platforms supporting sparsely numbered CPUs not all the numbers in
this range will denote valid CPUs.
Operations on non-existent CPUs will return an error.
.Ss Functional Grouping of the API
This section contains a brief overview of the available functionality
in the PMC library.
Each function listed here is described further in its own manual page.
.Bl -tag -width 2n
.It Administration
.Bl -tag -width 6n -compact
.It Fn pmc_disable , Fn pmc_enable
Administratively disable (enable) specific performance monitoring
counter hardware.
Counters that are disabled will not be available to applications to
use.
.El
.It "Convenience Functions"
.Bl -tag -width 6n -compact
.It Fn pmc_event_names_of_class
Returns a list of event names supported by a given PMC type.
.It Fn pmc_name_of_capability
Convert a
.Dv PMC_CAP_*
flag to a human-readable string.
.It Fn pmc_name_of_class
Convert a
.Dv PMC_CLASS_*
constant to a human-readable string.
.It Fn pmc_name_of_cputype
Return a human-readable name for a CPU type.
.It Fn pmc_name_of_disposition
Return a human-readable string describing a PMC's disposition.
.It Fn pmc_name_of_event
Convert a numeric event code to a human-readable string.
.It Fn pmc_name_of_mode
Convert a
.Dv PMC_MODE_*
constant to a human-readable name.
.It Fn pmc_name_of_state
Return a human-readable string describing a PMC's current state.
.El
.It "Library Initialization"
.Bl -tag -width 6n -compact
.It Fn pmc_init
Initialize the library.
This function must be called before any other library function.
.El
.It "Log File Handling"
.Bl -tag -width 6n -compact
.It Fn pmc_configure_logfile
Configure a log file for
.Xr hwpmc 4
to write logged events to.
.It Fn pmc_flush_logfile
Flush all pending log data in
.Xr hwpmc 4 Ns Ap s
buffers.
.It Fn pmc_close_logfile
Flush all pending log data and close
.Xr hwpmc 4 Ns Ap s
side of the stream.
.It Fn pmc_writelog
Append arbitrary user data to the current log file.
.El
.It "PMC Management"
.Bl -tag -width 6n -compact
.It Fn pmc_allocate , Fn pmc_release
Allocate (free) a PMC.
.It Fn pmc_attach , Fn pmc_detach
Attach (detach) a process scope PMC to a target.
.It Fn pmc_read , Fn pmc_write , Fn pmc_rw
Read (write) a value from (to) a PMC.
.It Fn pmc_start , Fn pmc_stop
Start (stop) a software PMC.
.It Fn pmc_set
Set the reload value for a sampling PMC.
.El
.It "Queries"
.Bl -tag -width 6n -compact
.It Fn pmc_capabilities
Retrieve the capabilities for a given PMC.
.It Fn pmc_cpuinfo
Retrieve information about the CPUs and PMC hardware present in the
system.
.It Fn pmc_get_driver_stats
Retrieve statistics maintained by
.Xr hwpmc 4 .
.It Fn pmc_ncpu
Determine the greatest possible CPU number on the system.
.It Fn pmc_npmc
Return the number of hardware PMCs present in a given CPU.
.It Fn pmc_pmcinfo
Return information about the state of a given CPU's PMCs.
.It Fn pmc_width
Determine the width of a hardware counter in bits.
.El
.It "x86 Architecture Specific API"
.Bl -tag -width 6n -compact
.It Fn pmc_get_msr
Returns the processor model specific register number
associated with
.Fa pmc .
Applications may then use the x86
.Ic RDPMC
instruction to directly read the contents of the PMC.
.El
.El
.Ss Signal Handling Requirements
Applications using PMCs are required to handle the following signals:
.Bl -tag -width ".Dv SIGBUS"
.It Dv SIGBUS
When the
.Xr hwpmc 4
module is unloaded using
.Xr kldunload 8 ,
processes that have PMCs allocated to them will be sent a
.Dv SIGBUS
signal.
.It Dv SIGIO
The
.Xr hwpmc 4
driver will send a PMC owning process a
.Dv SIGIO
signal if:
.Bl -bullet
.It
Any process-mode PMC allocated by it loses all its
target processes.
.It
The driver encounters an error when writing log data to a
configured log file.
This error may be retrieved by a subsequent call to
.Fn pmc_flush_logfile .
.El
.El
.Ss Typical Program Flow
.Bl -enum
.It
An application would first invoke function
.Fn pmc_init
to allow the library to initialize itself.
.It
Signal handling would then be set up.
.It
Next the application would allocate the PMCs it desires using function
.Fn pmc_allocate .
.It
Initial values for PMCs may be set using function
.Fn pmc_set .
.It
If a log file is necessary for the PMCs to work, it would
be configured using function
.Fn pmc_configure_logfile .
.It
Process scope PMCs would then be attached to their target processes
using function
.Fn pmc_attach .
.It
The PMCs would then be started using function
.Fn pmc_start .
.It
Once started, the values of counting PMCs may be read using function
.Fn pmc_read .
For PMCs that write events to the log file, this logged data would be
read and parsed using the
.Xr pmclog 3
family of functions.
.It
PMCs are stopped using function
.Fn pmc_stop ,
and process scope PMCs are detached from their targets using
function
.Fn pmc_detach .
.It
Before the process exits, it may release its PMCs using function
.Fn pmc_release .
Any configured log file may be closed using function
.Fn pmc_configure_logfile .
.El
.Sh EVENT SPECIFIERS
Event specifiers are strings comprising an event name, followed by
optional parameters modifying the semantics of the hardware event
being probed.
Event names are PMC architecture dependent, but the PMC library defines
machine independent aliases for commonly used events.
.Pp
Event specifier spellings are case-insensitive, and space characters,
periods, underscores and hyphens are considered equivalent to each other.
Thus the event specifiers
.Qq "Example Event" ,
.Qq "example-event" ,
and
.Qq "EXAMPLE_EVENT"
are equivalent.
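.Pp
The spelling rules above can be illustrated with a minimal C sketch.
The functions
.Fn canon_char
and
.Fn event_names_equivalent
are hypothetical helpers written for this illustration only.
They are not part of the
.Nm pmc
library and make no claim about how the library itself parses event
specifiers; they compare event names only and ignore any optional
parameters.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libpmc: fold one character to a
 * canonical form.  Letters fold to lower case; space, period,
 * underscore and hyphen all fold to a hyphen.
 */
int
canon_char(unsigned char c)
{
	if (c == ' ' || c == '.' || c == '_' || c == '-')
		return ('-');
	return (tolower(c));
}

/*
 * Hypothetical helper, not part of libpmc: return 1 if the two event
 * names are equivalent under the spelling rules, 0 otherwise.
 */
int
event_names_equivalent(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	while (*a != 0 && *b != 0) {
		if (canon_char((unsigned char)*a) !=
		    canon_char((unsigned char)*b))
			return (0);
		a++;
		b++;
	}
	/* Equivalent only if both strings ended at the same point. */
	return (*a == 0 && *b == 0);
}
```
.Ed
.Pp
Under these assumptions, a call such as
.Fn event_names_equivalent "\*qExample Event\*q" "\*qEXAMPLE_EVENT\*q"
returns 1, while two names that differ in any other character, or in
length, compare as different.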
.Ss PMC Architecture Dependent Events
PMC architecture dependent event specifiers are described in the
following manual pages:
.Bl -column " PMC_CLASS_TSC " "MANUAL PAGE "
.It Em "PMC Class" Ta Em "Manual Page"
.It Li PMC_CLASS_IAF Ta Xr pmc.iaf 3
.It Li PMC_CLASS_IAP Ta Xr pmc.atom 3 , Xr pmc.core 3 , Xr pmc.core2 3
.It Li PMC_CLASS_K7 Ta Xr pmc.k7 3
.It Li PMC_CLASS_K8 Ta Xr pmc.k8 3
.It Li PMC_CLASS_P4 Ta Xr pmc.p4 3
.It Li PMC_CLASS_P5 Ta Xr pmc.p5 3
.It Li PMC_CLASS_P6 Ta Xr pmc.p6 3
.It Li PMC_CLASS_TSC Ta Xr pmc.tsc 3
.El
.Ss Event Name Aliases
Event name aliases are PMC-independent names for commonly used events.
The following aliases are known to this version of the
.Nm pmc
library:
.Bl -tag -width indent
.It Li branches
Measure the number of branches retired.
.It Li branch-mispredicts
Measure the number of retired branches that were mispredicted.
.It Li cycles
Measure processor cycles.
This event is implemented using the processor's Time Stamp Counter
register.
.It Li dc-misses
Measure the number of data cache misses.
.It Li ic-misses
Measure the number of instruction cache misses.
.It Li instructions
Measure the number of instructions retired.
.It Li interrupts
Measure the number of interrupts seen.
.It Li unhalted-cycles
Measure the number of cycles the processor is not in a halted
or sleep state.
.El
.Sh COMPATIBILITY
The interface between the
.Nm pmc
library and the
.Xr hwpmc 4
driver is intended to be private to the implementation and may
change.
In order to ease forward compatibility with future versions of the
.Xr hwpmc 4
driver, applications are urged to dynamically link with the
.Nm pmc
library.
.Pp
The
.Nm pmc
API is
.Ud
.Sh SEE ALSO
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.core2 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4 ,
.Xr pmccontrol 8 ,
.Xr pmcstat 8
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .