.\"
.\" Copyright (c) 1998, 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 22, 1998
.Dt DEVSTAT 9
.Os
.Sh NAME
.Nm devstat ,
.Nm devstat_add_entry ,
.Nm devstat_end_transaction ,
.Nm devstat_end_transaction_bio ,
.Nm devstat_remove_entry ,
.Nm devstat_start_transaction
.Nd kernel interface for keeping device statistics
.Sh SYNOPSIS
.In sys/devicestat.h
.Ft void
.Fo devstat_add_entry
.Fa "struct devstat *ds"
.Fa "const char *dev_name"
.Fa "int unit_number"
.Fa "u_int32_t block_size"
.Fa "devstat_support_flags flags"
.Fa "devstat_type_flags device_type"
.Fa "devstat_priority priority"
.Fc
.Ft void
.Fn devstat_remove_entry "struct devstat *ds"
.Ft void
.Fn devstat_start_transaction "struct devstat *ds"
.Ft void
.Fo devstat_end_transaction
.Fa "struct devstat *ds"
.Fa "u_int32_t bytes"
.Fa "devstat_tag_type tag_type"
.Fa "devstat_trans_flags flags"
.Fc
.Ft void
.Fo devstat_end_transaction_bio
.Fa "struct devstat *ds"
.Fa "struct bio *bp"
.Fc
.Sh DESCRIPTION
The
.Nm
subsystem is an interface for recording device statistics.
The idea is to keep reasonably detailed
statistics while using as little CPU time as possible to record them.
Thus, no statistical calculations are performed in the kernel
portion of the
.Nm
code.
Instead, that is left for user programs to handle.
.Pp
.Fn devstat_add_entry
registers a device with the
.Nm
subsystem.
The caller is expected to have already allocated \fBand zeroed\fR
the devstat structure before calling this function.
.Fn devstat_add_entry
takes several arguments:
.Bl -tag -width device_type
.It ds
The
.Va devstat
structure, allocated and zeroed by the client.
.It dev_name
The device name, e.g.\& da, cd or sa.
.It unit_number
Device unit number.
.It block_size
Block size of the device, if supported.
If the device does not support a
block size, or if the block size is unknown at the time the device is added
to the
.Nm
list, it should be set to 0.
.It flags
Flags indicating operations supported or not supported by the device.
See below for details.
.It device_type
The device type.
This is broken into three sections: base device type
(e.g.\& direct access, CDROM, sequential access), interface type (IDE, SCSI
or other) and a pass-through flag to indicate pass-through devices.
See below for a complete list of types.
.It priority
The device priority.
The priority is used to determine how devices are
sorted within
.Nm devstat Ns 's
list of devices.
Devices are sorted first by priority (highest to lowest),
and then by attach order.
See below for a complete list of available
priorities.
.El
.Pp
.Fn devstat_remove_entry
removes a device from the
.Nm
subsystem.
It takes the devstat structure for the device in question as
an argument.
The
.Nm
generation number is incremented and the number of devices is decremented.
.Pp
.Fn devstat_start_transaction
registers the start of a transaction with the
.Nm
subsystem.
The busy count is incremented with each transaction start.
When a device goes from idle to busy, the system uptime is recorded in the
.Va start_time
field of the
.Va devstat
structure.
.Pp
.Fn devstat_end_transaction
registers the end of a transaction with the
.Nm
subsystem.
It takes four arguments:
.Bl -tag -width tag_type
.It ds
The
.Va devstat
structure for the device in question.
.It bytes
The number of bytes transferred in this transaction.
.It tag_type
Transaction tag type.
See below for tag types.
.It flags
Transaction flags indicating whether the transaction was a read, a write,
a free, or a transaction in which no data was transferred.
.El
.Pp
.Fn devstat_end_transaction_bio
is a wrapper for
.Fn devstat_end_transaction
that pulls all the information from a
.Va "struct bio"
that is ready for
.Fn biodone .
.Pp
The
.Va devstat
structure is composed of the following fields:
.Bl -tag -width dev_creation_time
.It dev_links
Each
.Va devstat
structure is placed in a linked list when it is registered.
The
.Va dev_links
field contains a pointer to the next entry in the list of
.Va devstat
structures.
.It device_number
The device number is a unique identifier for each device.
The device
number is incremented for each new device that is registered.
The device
number is currently only a 32-bit integer, but it could be enlarged if
someone has a system with more than four billion device arrival events.
.It device_name
The device name is a text string given by the registering driver to
identify itself
(e.g.\&
.Dq da ,
.Dq cd ,
.Dq sa ,
etc.).
.It unit_number
The unit number identifies the particular instance of the peripheral driver
in question.
.It bytes_written
This is the number of bytes that have been written to the device.
This number is currently an unsigned 64-bit integer.
This will hopefully
eliminate the counter wrap that would come very quickly on some systems if
32-bit integers were used.
.It bytes_read
This is the number of bytes that have been read from the device.
.It bytes_freed
This is the number of bytes that have been freed/erased on the device.
.It num_reads
This is the number of reads from the device.
.It num_writes
This is the number of writes to the device.
.It num_frees
This is the number of free/erase operations on the device.
.It num_other
This is the number of transactions to the device which are neither reads nor
writes.
For instance,
.Tn SCSI
drivers often send a test unit ready command to
.Tn SCSI
devices.
The test unit ready command does not read or write any data.
It merely causes the device to return its status.
.It busy_count
This is the current number of outstanding transactions for the device.
This should never go below zero, and on an idle device it should be zero.
If either one of these conditions is not true, it indicates a problem in
the way
.Fn devstat_start_transaction
and
.Fn devstat_end_transaction
are being called in client code.
There should be exactly one
transaction start event and exactly one transaction end event for each
transaction.
.It block_size
This is the block size of the device, if the device has a block size.
.It tag_types
This is an array of counters to record the number of various tag types that
are sent to a device.
See below for a list of tag types.
.It dev_creation_time
This is the time, as reported by
.Fn getmicrotime ,
that the device was registered.
.It busy_time
This is the amount of time that the device busy count has been greater than
zero.
This is only updated when the busy count returns to zero.
.It start_time
This is the time, as reported by
.Fn getmicrouptime ,
that the device busy count went from zero to one.
.It last_comp_time
This is the time, as reported by
.Fn getmicrouptime ,
that a transaction last completed.
It is used along with
.Va start_time
to calculate the device busy time.
.It flags
These flags indicate which statistics measurements are supported by a
particular device.
These flags are primarily intended to serve as an aid
to userland programs that decipher the statistics.
.It device_type
This is the device type.
It consists of three parts: the device type
(e.g.\&
direct access, CDROM, sequential access, etc.), the interface (IDE,
SCSI or other) and whether or not the device in question is a pass-through
driver.
See below for a complete list of device types.
.It priority
This is the priority.
This is the first parameter used to determine where
to insert a device in the
.Nm
list.
The second parameter is attach order.
See below for a list of
available priorities.
.El
.Pp
Each device is given a device type.
Pass-through devices have the same
underlying device type and interface as the device they provide an
interface for, but they also have the pass-through flag set.
The base
device types are identical to the
.Tn SCSI
device type numbers, so with
.Tn SCSI
peripherals, the device type returned from an inquiry is usually ORed with
the
.Tn SCSI
interface type and the pass-through flag if appropriate.
The device type
flags are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_TYPE_DIRECT     = 0x000,
	DEVSTAT_TYPE_SEQUENTIAL = 0x001,
	DEVSTAT_TYPE_PRINTER    = 0x002,
	DEVSTAT_TYPE_PROCESSOR  = 0x003,
	DEVSTAT_TYPE_WORM       = 0x004,
	DEVSTAT_TYPE_CDROM      = 0x005,
	DEVSTAT_TYPE_SCANNER    = 0x006,
	DEVSTAT_TYPE_OPTICAL    = 0x007,
	DEVSTAT_TYPE_CHANGER    = 0x008,
	DEVSTAT_TYPE_COMM       = 0x009,
	DEVSTAT_TYPE_ASC0       = 0x00a,
	DEVSTAT_TYPE_ASC1       = 0x00b,
	DEVSTAT_TYPE_STORARRAY  = 0x00c,
	DEVSTAT_TYPE_ENCLOSURE  = 0x00d,
	DEVSTAT_TYPE_FLOPPY     = 0x00e,
	DEVSTAT_TYPE_MASK       = 0x00f,
	DEVSTAT_TYPE_IF_SCSI    = 0x010,
	DEVSTAT_TYPE_IF_IDE     = 0x020,
	DEVSTAT_TYPE_IF_OTHER   = 0x030,
	DEVSTAT_TYPE_IF_MASK    = 0x0f0,
	DEVSTAT_TYPE_PASS       = 0x100
} devstat_type_flags;
.Ed
.Pp
Devices have a priority associated with them, which controls roughly where
they are placed in the
.Nm
list.
The priorities are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_PRIORITY_MIN   = 0x000,
	DEVSTAT_PRIORITY_OTHER = 0x020,
	DEVSTAT_PRIORITY_PASS  = 0x030,
	DEVSTAT_PRIORITY_FD    = 0x040,
	DEVSTAT_PRIORITY_WFD   = 0x050,
	DEVSTAT_PRIORITY_TAPE  = 0x060,
	DEVSTAT_PRIORITY_CD    = 0x090,
	DEVSTAT_PRIORITY_DISK  = 0x110,
	DEVSTAT_PRIORITY_ARRAY = 0x120,
	DEVSTAT_PRIORITY_MAX   = 0xfff
} devstat_priority;
.Ed
.Pp
Each device has flags associated with it that indicate which operations are
supported or not supported.
The
.Va devstat_support_flags
values are as follows:
.Bl -tag -width DEVSTAT_NO_ORDERED_TAGS
.It DEVSTAT_ALL_SUPPORTED
Every statistic type is supported by the device.
.It DEVSTAT_NO_BLOCKSIZE
This device does not have a block size.
.It DEVSTAT_NO_ORDERED_TAGS
This device does not support ordered tags.
.It DEVSTAT_BS_UNAVAILABLE
This device supports a block size, but it is currently unavailable.
This
flag is most often used with removable media drives.
.El
.Pp
Transactions to a device fall into one of four categories, which are
represented in the
.Va flags
passed into
.Fn devstat_end_transaction .
The transaction types are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_NO_DATA = 0x00,
	DEVSTAT_READ    = 0x01,
	DEVSTAT_WRITE   = 0x02,
	DEVSTAT_FREE    = 0x03
} devstat_trans_flags;
.Ed
.Pp
There are four possible values for the
.Va tag_type
argument to
.Fn devstat_end_transaction :
.Bl -tag -width DEVSTAT_TAG_ORDERED
.It DEVSTAT_TAG_SIMPLE
The transaction had a simple tag.
.It DEVSTAT_TAG_HEAD
The transaction had a head of queue tag.
.It DEVSTAT_TAG_ORDERED
The transaction had an ordered tag.
.It DEVSTAT_TAG_NONE
The device doesn't support tags.
.El
.Pp
The tag type values correspond to the lower four bits of the
.Tn SCSI
tag definitions.
In CAM, for instance, the
.Va tag_action
from the CCB is ANDed with 0xf to determine the tag type to pass in to
.Fn devstat_end_transaction .
.Pp
There is a macro,
.Dv DEVSTAT_VERSION ,
that is defined in
.In sys/devicestat.h .
This is the current version of the
.Nm
subsystem, and it should be incremented each time a change is made that
would require recompilation of userland programs that access
.Nm
statistics.
Userland programs use this version, via the
.Va kern.devstat.version
.Nm sysctl
variable, to determine whether they are in sync with the kernel
.Nm
structures.
.Sh SEE ALSO
.Xr systat 1 ,
.Xr devstat 3 ,
.Xr iostat 8 ,
.Xr rpc.rstatd 8 ,
.Xr vmstat 8
.Sh HISTORY
The
.Nm
statistics system appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq ken@FreeBSD.org
.Sh BUGS
There may be a need for
.Fn spl
protection around some of the
.Nm
list manipulation code to ensure, for example, that the list of devices
is not changed while someone is fetching the
.Va kern.devstat.all
.Nm sysctl
variable.
.Pp
It is impossible with the current
.Nm
architecture to accurately measure time per transaction.
The only feasible
way to accurately measure time per transaction would be to record a
timestamp for every transaction.
This measurement is probably not
worthwhile for most people as it would adversely affect the performance of
the system and cost space to store the timestamps for individual
transactions.