.\"
.\" Copyright (c) 1998, 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 22, 1998
.Dt DEVSTAT 9
.Os
.Sh NAME
.Nm devstat ,
.Nm devstat_add_entry ,
.Nm devstat_end_transaction ,
.Nm devstat_end_transaction_bio ,
.Nm devstat_remove_entry ,
.Nm devstat_start_transaction
.Nd kernel interface for keeping device statistics
.Sh SYNOPSIS
.In sys/devicestat.h
.Ft void
.Fo devstat_add_entry
.Fa "struct devstat *ds"
.Fa "const char *dev_name"
.Fa "int unit_number"
.Fa "u_int32_t block_size"
.Fa "devstat_support_flags flags"
.Fa "devstat_type_flags device_type"
.Fa "devstat_priority priority"
.Fc
.Ft void
.Fn devstat_remove_entry "struct devstat *ds"
.Ft void
.Fn devstat_start_transaction "struct devstat *ds"
.Ft void
.Fo devstat_end_transaction
.Fa "struct devstat *ds"
.Fa "u_int32_t bytes"
.Fa "devstat_tag_type tag_type"
.Fa "devstat_trans_flags flags"
.Fc
.Ft void
.Fo devstat_end_transaction_bio
.Fa "struct devstat *ds"
.Fa "struct bio *bp"
.Fc
.Sh DESCRIPTION
The devstat subsystem is an interface for recording device
statistics, as its name implies.
The idea is to keep reasonably detailed
statistics while utilizing a minimum amount of CPU time to record them.
Thus, no statistical calculations are actually performed in the kernel
portion of the
.Nm
code.
Instead, that is left for user programs to handle.
.Pp
.Fn devstat_add_entry
registers a device with the
.Nm
subsystem.
The caller is expected to have already allocated
.Sy and zeroed
the devstat structure before calling this function.
.Fn devstat_add_entry
takes several arguments:
.Bl -tag -width device_type
.It ds
The
.Va devstat
structure, allocated and zeroed by the client.
.It dev_name
The device name, e.g.\& da, cd, sa.
.It unit_number
Device unit number.
.It block_size
Block size of the device, if supported.
If the device does not support a
block size, or if the block size is unknown at the time the device is added
to the
.Nm
list, it should be set to 0.
.It flags
Flags indicating operations supported or not supported by the device.
See below for details.
.It device_type
The device type.
This is broken into three sections: base device type
(e.g.\& direct access, CDROM, sequential access), interface type (IDE, SCSI
or other) and a pass-through flag to indicate pass-through devices.
See below for a complete list of types.
.It priority
The device priority.
The priority is used to determine how devices are
sorted within
.Nm devstat Ns 's
list of devices.
Devices are sorted first by priority (highest to lowest),
and then by attach order.
See below for a complete list of available
priorities.
.El
.Pp
.Fn devstat_remove_entry
removes a device from the
.Nm
subsystem.
It takes the devstat structure for the device in question as
an argument.
The
.Nm
generation number is incremented and the number of devices is decremented.
.Pp
.Fn devstat_start_transaction
registers the start of a transaction with the
.Nm
subsystem.
The busy count is incremented with each transaction start.
When a device goes from idle to busy, the system uptime is recorded in the
.Va start_time
field of the
.Va devstat
structure.
.Pp
.Fn devstat_end_transaction
registers the end of a transaction with the
.Nm
subsystem.
It takes four arguments:
.Bl -tag -width tag_type
.It ds
The
.Va devstat
structure for the device in question.
.It bytes
The number of bytes transferred in this transaction.
.It tag_type
Transaction tag type.
See below for tag types.
.It flags
Transaction flags indicating whether the transaction was a read, a write,
a free, or a transaction in which no data was transferred.
.El
.Pp
.Fn devstat_end_transaction_bio
is a wrapper for
.Fn devstat_end_transaction
which pulls all the information from a
.Va "struct bio"
which is ready for
.Fn biodone .
.Pp
The
.Va devstat
structure is composed of the following fields:
.Bl -tag -width dev_creation_time
.It dev_links
Each
.Va devstat
structure is placed in a linked list when it is registered.
The
.Va dev_links
field contains a pointer to the next entry in the list of
.Va devstat
structures.
.It device_number
The device number is a unique identifier for each device.
The device
number is incremented for each new device that is registered.
The device
number is currently only a 32-bit integer, but it could be enlarged if
someone has a system with more than four billion device arrival events.
.It device_name
The device name is a text string given by the registering driver to
identify itself
(e.g.\&
.Dq da ,
.Dq cd ,
.Dq sa ,
etc.).
.It unit_number
The unit number identifies the particular instance of the peripheral driver
in question.
.It bytes_written
This is the number of bytes that have been written to the device.
This number is currently an unsigned 64-bit integer.
This will hopefully
eliminate the counter wrap that would come very quickly on some systems if
32-bit integers were used.
.It bytes_read
This is the number of bytes that have been read from the device.
.It bytes_freed
This is the number of bytes that have been freed/erased on the device.
.It num_reads
This is the number of reads from the device.
.It num_writes
This is the number of writes to the device.
.It num_frees
This is the number of free/erase operations on the device.
.It num_other
This is the number of transactions to the device which are neither reads nor
writes.
For instance,
.Tn SCSI
drivers often send a test unit ready command to
.Tn SCSI
devices.
The test unit ready command does not read or write any data.
It merely causes the device to return its status.
.It busy_count
This is the current number of outstanding transactions for the device.
This should never go below zero, and on an idle device it should be zero.
If either one of these conditions is not true, it indicates a problem in
the way
.Fn devstat_start_transaction
and
.Fn devstat_end_transaction
are being called in client code.
There should be one and only one
transaction start event and one transaction end event for each transaction.
.It block_size
This is the block size of the device, if the device has a block size.
.It tag_types
This is an array of counters to record the number of various tag types that
are sent to a device.
See below for a list of tag types.
.It dev_creation_time
This is the time, as reported by
.Fn getmicrotime ,
that the device was registered.
.It busy_time
This is the amount of time that the device busy count has been greater than
zero.
This is only updated when the busy count returns to zero.
.It start_time
This is the time, as reported by
.Fn getmicrouptime ,
that the device busy count went from zero to one.
.It last_comp_time
This is the time, as reported by
.Fn getmicrouptime ,
that a transaction last completed.
It is used along with
.Va start_time
to calculate the device busy time.
.It flags
These flags indicate which statistics measurements are supported by a
particular device.
These flags are primarily intended to serve as an aid
to userland programs that decipher the statistics.
.It device_type
This is the device type.
It consists of three parts: the device type
(e.g.\& direct access, CDROM, sequential access, etc.), the interface (IDE,
SCSI or other) and whether or not the device in question is a pass-through
driver.
See below for a complete list of device types.
.It priority
This is the priority.
This is the first parameter used to determine where
to insert a device in the
.Nm
list.
The second parameter is attach order.
See below for a list of
available priorities.
.El
.Pp
Each device is given a device type.
Pass-through devices have the same
underlying device type and interface as the device they provide an
interface for, but they also have the pass-through flag set.
The base
device types are identical to the
.Tn SCSI
device type numbers, so with
.Tn SCSI
peripherals, the device type returned from an inquiry is usually ORed with
the
.Tn SCSI
interface type and the pass-through flag if appropriate.
The device type
flags are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_TYPE_DIRECT     = 0x000,
	DEVSTAT_TYPE_SEQUENTIAL = 0x001,
	DEVSTAT_TYPE_PRINTER    = 0x002,
	DEVSTAT_TYPE_PROCESSOR  = 0x003,
	DEVSTAT_TYPE_WORM       = 0x004,
	DEVSTAT_TYPE_CDROM      = 0x005,
	DEVSTAT_TYPE_SCANNER    = 0x006,
	DEVSTAT_TYPE_OPTICAL    = 0x007,
	DEVSTAT_TYPE_CHANGER    = 0x008,
	DEVSTAT_TYPE_COMM       = 0x009,
	DEVSTAT_TYPE_ASC0       = 0x00a,
	DEVSTAT_TYPE_ASC1       = 0x00b,
	DEVSTAT_TYPE_STORARRAY  = 0x00c,
	DEVSTAT_TYPE_ENCLOSURE  = 0x00d,
	DEVSTAT_TYPE_FLOPPY     = 0x00e,
	DEVSTAT_TYPE_MASK       = 0x00f,
	DEVSTAT_TYPE_IF_SCSI    = 0x010,
	DEVSTAT_TYPE_IF_IDE     = 0x020,
	DEVSTAT_TYPE_IF_OTHER   = 0x030,
	DEVSTAT_TYPE_IF_MASK    = 0x0f0,
	DEVSTAT_TYPE_PASS       = 0x100
} devstat_type_flags;
.Ed
.Pp
Devices have a priority associated with them, which controls roughly where
they are placed in the
.Nm
list.
The priorities are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_PRIORITY_MIN   = 0x000,
	DEVSTAT_PRIORITY_OTHER = 0x020,
	DEVSTAT_PRIORITY_PASS  = 0x030,
	DEVSTAT_PRIORITY_FD    = 0x040,
	DEVSTAT_PRIORITY_WFD   = 0x050,
	DEVSTAT_PRIORITY_TAPE  = 0x060,
	DEVSTAT_PRIORITY_CD    = 0x090,
	DEVSTAT_PRIORITY_DISK  = 0x110,
	DEVSTAT_PRIORITY_ARRAY = 0x120,
	DEVSTAT_PRIORITY_MAX   = 0xfff
} devstat_priority;
.Ed
.Pp
Each device has associated with it flags to indicate what operations are
supported or not supported.
The
.Va devstat_support_flags
values are as follows:
.Bl -tag -width DEVSTAT_NO_ORDERED_TAGS
.It DEVSTAT_ALL_SUPPORTED
Every statistic type is supported by the device.
.It DEVSTAT_NO_BLOCKSIZE
This device does not have a block size.
.It DEVSTAT_NO_ORDERED_TAGS
This device does not support ordered tags.
.It DEVSTAT_BS_UNAVAILABLE
This device supports a block size, but it is currently unavailable.
This
flag is most often used with removable media drives.
.El
.Pp
Transactions to a device fall into one of four categories, which are
represented in the
.Va flags
passed into
.Fn devstat_end_transaction .
The transaction types are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_NO_DATA = 0x00,
	DEVSTAT_READ    = 0x01,
	DEVSTAT_WRITE   = 0x02,
	DEVSTAT_FREE    = 0x03
} devstat_trans_flags;
.Ed
.Pp
There are four possible values for the
.Va tag_type
argument to
.Fn devstat_end_transaction :
.Bl -tag -width DEVSTAT_TAG_ORDERED
.It DEVSTAT_TAG_SIMPLE
The transaction had a simple tag.
.It DEVSTAT_TAG_HEAD
The transaction had a head of queue tag.
.It DEVSTAT_TAG_ORDERED
The transaction had an ordered tag.
.It DEVSTAT_TAG_NONE
The device does not support tags.
.El
.Pp
The tag type values correspond to the lower four bits of the
.Tn SCSI
tag definitions.
In CAM, for instance, the
.Va tag_action
from the CCB is ANDed with 0xf to determine the tag type to pass in to
.Fn devstat_end_transaction .
.Pp
There is a macro,
.Dv DEVSTAT_VERSION ,
that is defined in
.In sys/devicestat.h .
This is the current version of the
.Nm
subsystem, and it should be incremented each time a change is made that
would require recompilation of userland programs that access
.Nm
statistics.
Userland programs use this version, via the
.Va kern.devstat.version
.Nm sysctl
variable, to determine whether they are in sync with the kernel
.Nm
structures.
.Sh SEE ALSO
.Xr systat 1 ,
.Xr devstat 3 ,
.Xr iostat 8 ,
.Xr rpc.rstatd 8 ,
.Xr vmstat 8
.Sh HISTORY
The
.Nm
statistics system appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq ken@FreeBSD.org
.Sh BUGS
There may be a need for
.Fn spl
protection around some of the
.Nm
list manipulation code to ensure, for example, that the list of devices
is not changed while someone is fetching the
.Va kern.devstat.all
.Nm sysctl
variable.
.Pp
It is impossible with the current
.Nm
architecture to accurately measure time per transaction.
The only feasible
way to accurately measure time per transaction would be to record a
timestamp for every transaction.
This measurement is probably not
worthwhile for most people as it would adversely affect the performance of
the system and cost space to store the timestamps for individual
transactions.
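.Sh EXAMPLES
The following is a minimal userland sketch, not kernel code, of the flag
arithmetic described in this page: composing a
.Va device_type
value for a
.Tn SCSI
device, optionally with the pass-through flag, and reducing a tag action to
its lower four bits.
The enum values are local copies of those listed in the device type table;
the helper names
.Fn example_device_type ,
.Fn example_tag_type
and
.Fn example_selfcheck ,
and the sample tag action 0x21, are assumptions made for illustration and
are not part of the
.Nm
interface.

```c
#include <assert.h>

/*
 * Local copies of a few device type flags from the table in this page.
 * Kernel code would include <sys/devicestat.h> instead of redefining them.
 */
enum {
	DEVSTAT_TYPE_DIRECT  = 0x000,
	DEVSTAT_TYPE_CDROM   = 0x005,
	DEVSTAT_TYPE_MASK    = 0x00f,
	DEVSTAT_TYPE_IF_SCSI = 0x010,
	DEVSTAT_TYPE_IF_MASK = 0x0f0,
	DEVSTAT_TYPE_PASS    = 0x100
};

/*
 * Hypothetical helper: OR the base type from a SCSI inquiry with the SCSI
 * interface type and, if requested, the pass-through flag.
 */
static unsigned int
example_device_type(unsigned int inquiry_type, int is_pass)
{
	unsigned int type;

	type = (inquiry_type & DEVSTAT_TYPE_MASK) | DEVSTAT_TYPE_IF_SCSI;
	if (is_pass)
		type |= DEVSTAT_TYPE_PASS;
	return (type);
}

/*
 * Hypothetical helper: keep only the lower four bits of a tag action,
 * which is the part the tag type values correspond to.
 */
static unsigned int
example_tag_type(unsigned int tag_action)
{
	return (tag_action & 0xf);
}

/* Hypothetical self-check exercising both helpers. */
static void
example_selfcheck(void)
{
	unsigned int type = example_device_type(DEVSTAT_TYPE_CDROM, 1);

	/* Each part can be recovered with the matching mask. */
	assert((type & DEVSTAT_TYPE_MASK) == DEVSTAT_TYPE_CDROM);
	assert((type & DEVSTAT_TYPE_IF_MASK) == DEVSTAT_TYPE_IF_SCSI);
	assert((type & DEVSTAT_TYPE_PASS) != 0);
	/* 0x21 is an assumed sample tag action; only the low nibble is kept. */
	assert(example_tag_type(0x21) == 0x1);
}
```

Because the three parts of the device type occupy disjoint bit ranges, a
consumer can separate them again with
.Dv DEVSTAT_TYPE_MASK ,
.Dv DEVSTAT_TYPE_IF_MASK
and
.Dv DEVSTAT_TYPE_PASS .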