.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source. A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2023 Oxide Computer Company
.\"
.Dd January 26, 2023
.Dt INTRO 9F
.Os
.Sh NAME
.Nm Intro
.Nd Introduction to kernel and device driver functions
.Sh SYNOPSIS
.In sys/ddi.h
.In sys/sunddi.h
.Sh DESCRIPTION
Section 9F of the manual describes functions that are used for device
drivers, kernel modules, and the implementation of the kernel itself.
This page first provides an overview of the use of kernel functions and of the
portions of the manual that are specific to the kernel.
After that, we have grouped together most functions that are available by use,
with some brief commentary and introduction.
.Pp
Most manual pages are similar to those in other sections.
They have common fields such as the NAME, a SYNOPSIS to show which header files
to include and prototypes, an extended DESCRIPTION discussing its use, and the
common combination of RETURN VALUES and ERRORS.
Some manuals will have examples and additional manuals to reference in the SEE
ALSO section.
.Ss RETURN VALUES and ERRORS
One major difference when programming in the kernel versus userland is that
there is no equivalent to
.Va errno .
Instead, there are a few common patterns that are used throughout the kernel
that we'll discuss.
While there are common patterns, please be aware that due to the natural
evolution of the system, you will need to read the specifics of that
section in each manual page.
.Bl -bullet
.It
Many functions will return a specific DDI
.Pq Device Driver Interface
value, which is commonly one of
.Dv DDI_SUCCESS
or
.Dv DDI_FAILURE ,
indicating success and failure respectively.
Some functions will return additional error codes to indicate why something
failed.
In general, when checking a response code, it is always preferred to compare
whether it equals or does not equal
.Dv DDI_SUCCESS
as there can be many different error cases and additional ones can be added over
time.
.It
Many routines explicitly return
.Sy 0
on success and will return an explicit error number on failure.
.Xr Intro 2
has a list of error numbers.
.It
There are classes of functions that return either a pointer or a boolean type,
either the C99
.Vt bool
or the system's traditional type
.Vt boolean_t .
In these cases, sometimes a more detailed error is provided via an additional
argument such as an
.Vt "int *" .
Absent such an argument, there is generally no more detailed information
available.
.El
.Ss CONTEXT
The CONTEXT section of a manual page describes the times at which a function
may be called.
In general, there are three different contexts that come up:
.Bl -tag -width Ds
.It Sy User
User context implies that the thread of execution is operating because a user
thread has entered the kernel for an operation.
When an application issues a system call such as
.Xr open 2 ,
.Xr read 2 ,
.Xr write 2 ,
or
.Xr ioctl 2
then we are said to be in user context.
When in user context, one can copy in or out data from a user's address space.
When writing a character or block device driver, the majority of the time that a
character device operation such as the corresponding
.Xr open 9E ,
.Xr read 9E ,
.Xr write 9E ,
and
.Xr ioctl 9E
entry point is called, it is executing in user context.
It is possible to call those entry points through the kernel's layered device
interface, so drivers cannot assume those entry points will always have a user
process present, strictly speaking.
.It Sy Interrupt
Interrupt context refers to when the operating system is handling an interrupt
.Po
see
.Sx Interrupt Related Functions
.Pc
and executing a registered interrupt handler.
Interrupt context is split into two different sets: high-level and low-level
interrupts.
Most device drivers will only ever handle low-level interrupts.
To determine whether an interrupt is considered high level or not, you should
pass the interrupt handle to the
.Xr ddi_intr_get_pri 9F
function and compare the resulting priority with
.Xr ddi_intr_get_hilevel_pri 9F .
.Pp
When executing high-level interrupts, the thread may only execute a limited
number of functions.
In particular, it may call
.Xr ddi_intr_trigger_softint 9F ,
.Xr mutex_enter 9F ,
and
.Xr mutex_exit 9F .
It is critical that the mutex being used be properly initialized with the
driver's interrupt priority.
The system will transparently pick the correct implementation of a mutex based
on the interrupt type.
Aside from the above, one must not block while in high-level interrupt context.
.Pp
On the other hand, when a thread is not in high-level interrupt context, most of
these restrictions are lifted.
Kernel memory may be allocated
.Po
if using a non-blocking allocation such as
.Dv KM_NOSLEEP
or
.Dv KM_NOSLEEP_LAZY
.Pc ,
and many of the other documented functions may be called.
.Pp
Regardless of whether a thread is in high-level or low-level interrupt context,
it will never have a user context associated with it and therefore cannot use
routines like
.Xr ddi_copyin 9F
or
.Xr ddi_copyout 9F .
.It Sy Kernel
Kernel context refers to all other times in the kernel.
Whenever the kernel is executing something on a thread that is not associated
with a user process, then one is in kernel context.
The most common situations for writers of kernel modules are things like timeout
callbacks, such as
.Xr timeout 9F
or
.Xr ddi_periodic_add 9F ,
cases where the kernel is invoking a driver's device operation routines such as
.Xr attach 9E
and
.Xr detach 9E ,
or many of the device driver's registered callbacks from frameworks such as
.Xr mac 9E ,
.Xr usba_hcdi 9E ,
and various portions of SCSI, USB, and block devices.
.It Sy Framework-specific Contexts
Some manuals will discuss more specific constraints about when they can be used.
For example, some functions may only be called while executing a specific entry
point like
.Xr attach 9E .
Another example of this is that the
.Xr mac_transceiver_info_set_present 9F
function is only meant to be used while executing a networking driver's
.Xr mct_info 9E
entry point.
.El
.Ss PARAMETERS
In kernel manual pages
.Pq section 9 ,
each function and entry point description generally has a separate list
of parameters which are arguments to the function.
The parameters section describes the basic purpose of each argument and
should explain where such things often come from and any constraints on
their values.
.Sh INTERFACES
Functions below are organized into categories that describe their purpose.
Individual functions are documented in their own manual pages.
For each of these areas, we discuss the high-level concepts behind it and
provide a brief discussion of how to get started with it.
Note, some deprecated functions or older frameworks are not listed here.
.Pp
Every function listed below has its own manual page in section 9F and
can be read with
.Xr man 1 .
In addition, some corresponding concepts are documented in section 9 and
some groups of functions are present to support a specific type of
device driver, which is discussed more in section 9E.
.Ss Logging Functions
Throughout the kernel, there is often a need to log messages that go either
to the system log or to the console.
These kinds of messages can be logged with the
.Xr cmn_err 9F
function or one of its more specific variants that operate in the
context of a device
.Po
.Xr dev_err 9F
.Pc
or a zone
.Po
.Xr zcmn_err 9F
.Pc .
.Pp
The console should be used sparingly.
While a notice may be found there, one should assume that it may be
missed either due to overflow, not being connected to say a serial
console at the time, or some other reason.
While the system log is better than the console, folks need to take care
not to spam the log.
If someone logged every time a network packet was generated or
received, the system could quickly run out of space and it would be harder
to find useful messages when debugging bizarre behavior.
It's also important to remember that only system administrators and
privileged users can actually see this log.
Where possible and appropriate, use programmatic errors in routines that
allow it.
.Pp
The system also supports a structured event log called a system event
that is processed by
.Xr syseventd 8 .
This is used by the OS to provide notifications for things like device
insertion and removal or the change of a data link.
These are driven by the
.Xr ddi_log_sysevent 9F
function and allow arbitrary additional structured metadata in the form
of a
.Vt nvlist_t .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cmn_err 9F Ta Xr dev_err 9F
.It Xr vcmn_err 9F Ta Xr vzcmn_err 9F
.It Xr zcmn_err 9F Ta Xr ddi_log_sysevent 9F
.El
.Ss Memory Allocation
At the heart of most device drivers is memory allocation.
The primary kernel allocator is called
.Qq kmem
.Pq kernel memory
and it is based on the
.Qq vmem
.Pq virtual memory
subsystem.
Most of the time, device drivers should use
.Xr kmem_alloc 9F
and
.Xr kmem_zalloc 9F
to allocate memory and free it with
.Xr kmem_free 9F .
Based on the original kmem and subsequent vmem papers, the kernel is
internally using object caches and magazines to allow high-throughput
allocation in a multi-CPU environment.
.Pp
When allocating memory, an important choice must be made: whether or not
to block for memory.
If one opts to perform a sleeping allocation, then the caller can be
guaranteed that the allocation will succeed, but it may take some time
and the thread will be blocked during that entire duration.
This is the
.Dv KM_SLEEP
flag.
On the other hand, there are many circumstances where this is not
appropriate, especially because a thread that is inside a memory
allocation function cannot currently be cancelled.
If the thread corresponds to a user process, then it will not be
killable.
.Pp
Given that there are many situations where this is not appropriate, the
kernel offers an allocation mode where it will not block for memory to
be available:
.Dv KM_NOSLEEP
and
.Dv KM_NOSLEEP_LAZY .
These allocations can fail and return
.Dv NULL
when they do fail.
Even though these are said to be no sleep operations, that does not mean
that the caller will never end up temporarily blocked due to mutex
contention or due to trying a bit more aggressively to reclaim memory in
the case of
.Dv KM_NOSLEEP .
Unless operating in special circumstances, using
.Dv KM_NOSLEEP_LAZY
should be preferred to
.Dv KM_NOSLEEP .
.Pp
If a device driver has its own complex object that has more significant
set up and tear down costs, then the kmem cache function family should
be considered.
To use a kmem cache, it must first be created using the
.Xr kmem_cache_create 9F
function, which requires specifying the size, alignment, and
constructors and destructors.
Individual objects are allocated from the cache with the
.Xr kmem_cache_alloc 9F
function.
An important constraint when using the caches is that when an object is
freed with
.Xr kmem_cache_free 9F ,
it is the caller's responsibility to ensure that the object is returned
to its constructed state prior to freeing it.
If the object is reused prior to the kernel reclaiming the memory for
other uses, then the constructor will not be called again.
Most device drivers do not need to create a kmem cache for their
own allocations.
.Pp
If you are writing a device driver that is trying to interact with the
networking, STREAMS, or USB subsystems, then they are generally using
the
.Vt mblk_t
data structure which is managed through a different set of APIs, though
they are leveraging kmem under the hood.
.Pp
The vmem set of interfaces allows for the management of abstract regions
of integers, generally representing memory or some other object, each
with an offset and length.
While it is not common that a device driver needs to do its own such
management,
.Xr vmem_create 9F
and
.Xr vmem_alloc 9F
are what to reach for when the need arises.
Rather than using vmem, if one needs to model a set of integers where
each is a valid identifier, for example if you need to allocate every integer
between 0 and 1000 as a distinct identifier, instead use
.Xr id_space_create 9F
which is discussed in
.Sx Identifier Management .
For more information on vmem, see
.Xr vmem 9 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kmem_alloc 9F Ta Xr kmem_cache_alloc 9F
.It Xr kmem_cache_create 9F Ta Xr kmem_cache_destroy 9F
.It Xr kmem_cache_free 9F Ta Xr kmem_cache_set_move 9F
.It Xr kmem_free 9F Ta Xr kmem_zalloc 9F
.It Xr vmem_add 9F Ta Xr vmem_alloc 9F
.It Xr vmem_contains 9F Ta Xr vmem_create 9F
.It Xr vmem_destroy 9F Ta Xr vmem_free 9F
.It Xr vmem_size 9F Ta Xr vmem_walk 9F
.It Xr vmem_xalloc 9F Ta Xr vmem_xcreate 9F
.It Xr vmem_xfree 9F Ta Xr bufcall 9F
.It Xr esbbcall 9F Ta Xr qbufcall 9F
.It Xr qunbufcall 9F Ta Xr unbufcall 9F
.El
.Ss String and libc Analogues
The kernel has many analogues for classic libc functions that deal with
string processing, memory copying, and related tasks.
For the most part, these behave similarly to their userland analogues,
but there can be some differences in return values and, for example, in
the set of supported format characters in the case of
.Xr snprintf 9F
and related functions.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ASSERT 9F Ta Xr bcmp 9F
.It Xr bzero 9F Ta Xr bcopy 9F
.It Xr ddi_strdup 9F Ta Xr ddi_strtol 9F
.It Xr ddi_strtoll 9F Ta Xr ddi_strtoul 9F
.It Xr ddi_strtoull 9F Ta Xr ddi_ffs 9F
.It Xr ddi_fls 9F Ta Xr max 9F
.It Xr memchr 9F Ta Xr memcmp 9F
.It Xr memcpy 9F Ta Xr memmove 9F
.It Xr memset 9F Ta Xr min 9F
.It Xr numtos 9F Ta Xr snprintf 9F
.It Xr sprintf 9F Ta Xr stoi 9F
.It Xr strcasecmp 9F Ta Xr strcat 9F
.It Xr strchr 9F Ta Xr strcmp 9F
.It Xr strcpy 9F Ta Xr strdup 9F
.It Xr strfree 9F Ta Xr string 9F
.It Xr strlcat 9F Ta Xr strlcpy 9F
.It Xr strlen 9F Ta Xr strlog 9F
.It Xr strncasecmp 9F Ta Xr strncat 9F
.It Xr strncmp 9F Ta Xr strncpy 9F
.It Xr strnlen 9F Ta Xr strqget 9F
.It Xr strqset 9F Ta Xr strrchr 9F
.It Xr strspn 9F Ta Xr swab 9F
.It Xr vsnprintf 9F Ta Xr va_arg 9F
.It Xr va_copy 9F Ta Xr va_end 9F
.It Xr va_start 9F Ta Xr vsprintf 9F
.El
.Ss Tree Data Structures
These functions provide access to an intrusive self-balancing binary
tree that is generally used throughout illumos.
The primary type here is the
.Vt avl_tree_t .
Structures can be present in multiple trees and there are built-in
walkers for the data structure in
.Xr mdb 1 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr avl_add 9F Ta Xr avl_create 9F
.It Xr avl_destroy_nodes 9F Ta Xr avl_destroy 9F
.It Xr avl_find 9F Ta Xr avl_first 9F
.It Xr avl_insert_here 9F Ta Xr avl_insert 9F
.It Xr avl_is_empty 9F Ta Xr avl_last 9F
.It Xr avl_nearest 9F Ta Xr AVL_NEXT 9F
.It Xr avl_numnodes 9F Ta Xr AVL_PREV 9F
.It Xr avl_remove 9F Ta Xr avl_swap 9F
.El
.Ss Linked Lists
These functions provide a standard, intrusive doubly-linked list whose
type is the
.Vt list_t .
This list implementation is used extensively throughout illumos, has
debugging support through
.Xr mdb 1
walkers, and is generally recommended rather than creating your own
list.
Due to its intrusive nature, a given structure can be present on
multiple lists.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr list_create 9F Ta Xr list_destroy 9F
.It Xr list_head 9F Ta Xr list_insert_after 9F
.It Xr list_insert_before 9F Ta Xr list_insert_head 9F
.It Xr list_insert_tail 9F Ta Xr list_is_empty 9F
.It Xr list_link_active 9F Ta Xr list_link_init 9F
.It Xr list_link_replace 9F Ta Xr list_move_tail 9F
.It Xr list_next 9F Ta Xr list_prev 9F
.It Xr list_remove_head 9F Ta Xr list_remove_tail 9F
.It Xr list_remove 9F Ta Xr list_tail 9F
.El
.Ss Name-Value Pairs
The kernel often uses the
.Vt nvlist_t
data structure to pass around a list of typed name-value pairs.
This data structure is used in diverse areas, particularly because of
its ability to be serialized in different formats that are suitable not
only for use between userland and the kernel, but also persistently to a
file.
.Pp
A
.Vt nvlist_t
structure is initialized with the
.Xr nvlist_alloc 9F
function and can operate with two different degrees of uniqueness: a
mode where only names are unique, or one where every name is qualified by
its type.
The former means that if the list has an integer named
.Dq foo
and one then adds a string, array, or any other value with the same name,
it will be replaced.
However, if we were using the name and type as unique, then the value would
only be replaced if both the pair's type and the name
.Dq foo
matched a pair that was already present.
Otherwise, the two different entries would co-exist.
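.Pp
The difference between the two uniqueness modes can be illustrated with a
small userland model.
This is only a sketch under stated assumptions: the sk_ types and functions
below are hypothetical and are not part of the nvlist interfaces, they track
only a pair's name and type, they use a fixed capacity, and they replace a
matching pair in place.
With the real interfaces, the mode is chosen by the flag passed to
.Xr nvlist_alloc 9F ,
either
.Dv NV_UNIQUE_NAME
or
.Dv NV_UNIQUE_NAME_TYPE .

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of nvlist uniqueness; not the real implementation. */
typedef enum { SK_UNIQUE_NAME, SK_UNIQUE_NAME_TYPE } sk_mode_t;
typedef enum { SK_TYPE_INT32, SK_TYPE_STRING } sk_type_t;

typedef struct {
	const char *skp_name;
	sk_type_t skp_type;
} sk_pair_t;

#define	SK_MAX_PAIRS	8

typedef struct {
	sk_mode_t skl_mode;
	size_t skl_npairs;
	sk_pair_t skl_pairs[SK_MAX_PAIRS];
} sk_list_t;

static void
sk_list_init(sk_list_t *list, sk_mode_t mode)
{
	memset(list, 0, sizeof (*list));
	list->skl_mode = mode;
}

/*
 * Add a pair. An existing pair is replaced when its name matches and either
 * the list is unique by name alone or the types match as well. Returns 0 on
 * success and -1 if the model's fixed capacity is exhausted.
 */
static int
sk_list_add(sk_list_t *list, const char *name, sk_type_t type)
{
	assert(list != NULL && name != NULL);

	for (size_t i = 0; i < list->skl_npairs; i++) {
		sk_pair_t *pair = &list->skl_pairs[i];

		if (strcmp(pair->skp_name, name) != 0)
			continue;
		if (list->skl_mode == SK_UNIQUE_NAME ||
		    pair->skp_type == type) {
			pair->skp_type = type;	/* replace the old pair */
			return (0);
		}
	}

	if (list->skl_npairs == SK_MAX_PAIRS)
		return (-1);

	/* No pair conflicted under the list's mode, so the entries co-exist. */
	list->skl_pairs[list->skl_npairs].skp_name = name;
	list->skl_pairs[list->skl_npairs].skp_type = type;
	list->skl_npairs++;
	return (0);
}
```

In this model, adding an integer
.Dq foo
and then a string
.Dq foo
leaves one pair in a name-unique list and two pairs in a list that is unique
by name and type.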
.Pp
When constructing an nvlist, it is normally backed by the normal kmem
allocator and may either use sleeping or non-sleeping allocations.
It is also possible to use a custom allocator, though that generally has
not been necessary in the kernel.
.Pp
Specific keys and values can be looked up directly with the
nvlist_lookup family of functions, but the entire list can be iterated
as well, which is especially useful when trying to validate that no
unknown keys are present in the list.
The iteration API
.Xr nvlist_next_nvpair 9F
allows one to get the key's name, the type of the pair's value, and
the value itself.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr nv_alloc_fini 9F Ta Xr nv_alloc_init 9F
.It Xr nvlist_add_boolean_array 9F Ta Xr nvlist_add_boolean_value 9F
.It Xr nvlist_add_boolean 9F Ta Xr nvlist_add_byte_array 9F
.It Xr nvlist_add_byte 9F Ta Xr nvlist_add_int16_array 9F
.It Xr nvlist_add_int16 9F Ta Xr nvlist_add_int32_array 9F
.It Xr nvlist_add_int32 9F Ta Xr nvlist_add_int64_array 9F
.It Xr nvlist_add_int64 9F Ta Xr nvlist_add_int8_array 9F
.It Xr nvlist_add_int8 9F Ta Xr nvlist_add_nvlist_array 9F
.It Xr nvlist_add_nvlist 9F Ta Xr nvlist_add_nvpair 9F
.It Xr nvlist_add_string_array 9F Ta Xr nvlist_add_string 9F
.It Xr nvlist_add_uint16_array 9F Ta Xr nvlist_add_uint16 9F
.It Xr nvlist_add_uint32_array 9F Ta Xr nvlist_add_uint32 9F
.It Xr nvlist_add_uint64_array 9F Ta Xr nvlist_add_uint64 9F
.It Xr nvlist_add_uint8_array 9F Ta Xr nvlist_add_uint8 9F
.It Xr nvlist_alloc 9F Ta Xr nvlist_dup 9F
.It Xr nvlist_exists 9F Ta Xr nvlist_free 9F
.It Xr nvlist_lookup_boolean_array 9F Ta Xr nvlist_lookup_boolean_value 9F
.It Xr nvlist_lookup_boolean 9F Ta Xr nvlist_lookup_byte_array 9F
.It Xr nvlist_lookup_byte 9F Ta Xr nvlist_lookup_int16_array 9F
.It Xr nvlist_lookup_int16 9F Ta Xr nvlist_lookup_int32_array 9F
.It Xr nvlist_lookup_int32 9F Ta Xr nvlist_lookup_int64_array 9F
.It Xr nvlist_lookup_int64 9F Ta Xr nvlist_lookup_int8_array 9F
.It Xr nvlist_lookup_int8 9F Ta Xr nvlist_lookup_nvlist_array 9F
.It Xr nvlist_lookup_nvlist 9F Ta Xr nvlist_lookup_nvpair 9F
.It Xr nvlist_lookup_pairs 9F Ta Xr nvlist_lookup_string_array 9F
.It Xr nvlist_lookup_string 9F Ta Xr nvlist_lookup_uint16_array 9F
.It Xr nvlist_lookup_uint16 9F Ta Xr nvlist_lookup_uint32_array 9F
.It Xr nvlist_lookup_uint32 9F Ta Xr nvlist_lookup_uint64_array 9F
.It Xr nvlist_lookup_uint64 9F Ta Xr nvlist_lookup_uint8_array 9F
.It Xr nvlist_lookup_uint8 9F Ta Xr nvlist_merge 9F
.It Xr nvlist_next_nvpair 9F Ta Xr nvlist_pack 9F
.It Xr nvlist_remove_all 9F Ta Xr nvlist_remove 9F
.It Xr nvlist_size 9F Ta Xr nvlist_t 9F
.It Xr nvlist_unpack 9F Ta Xr nvlist_xalloc 9F
.It Xr nvlist_xdup 9F Ta Xr nvlist_xpack 9F
.It Xr nvlist_xunpack 9F Ta Xr nvpair_name 9F
.It Xr nvpair_type 9F Ta Xr nvpair_value_boolean_array 9F
.It Xr nvpair_value_byte_array 9F Ta Xr nvpair_value_byte 9F
.It Xr nvpair_value_int16_array 9F Ta Xr nvpair_value_int16 9F
.It Xr nvpair_value_int32_array 9F Ta Xr nvpair_value_int32 9F
.It Xr nvpair_value_int64_array 9F Ta Xr nvpair_value_int64 9F
.It Xr nvpair_value_int8_array 9F Ta Xr nvpair_value_int8 9F
.It Xr nvpair_value_nvlist_array 9F Ta Xr nvpair_value_nvlist 9F
.It Xr nvpair_value_string_array 9F Ta Xr nvpair_value_string 9F
.It Xr nvpair_value_uint16_array 9F Ta Xr nvpair_value_uint16 9F
.It Xr nvpair_value_uint32_array 9F Ta Xr nvpair_value_uint32 9F
.It Xr nvpair_value_uint64_array 9F Ta Xr nvpair_value_uint64 9F
.It Xr nvpair_value_uint8_array 9F Ta Xr nvpair_value_uint8 9F
.El
.Ss Identifier Management
A common challenge in the kernel is the management of a series of
different IDs.
There are three different families of routines for managing identifiers
presented here, but we recommend the use of the
.Xr id_space_create 9F
and
.Xr id_alloc 9F
family for new use cases.
The ID space can cover all or a subset of the 32-bit integer space and
provides different allocation strategies for this.
.Pp
Due to the current implementation, callers should generally prefer the
non-sleeping variants because the sleeping ones are not cancellable
.Po
currently this is backed by vmem, but this should not be assumed and may
change in the future
.Pc .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr id_alloc_nosleep 9F Ta Xr id_alloc_specific_nosleep 9F
.It Xr id_alloc 9F Ta Xr id_allocff_nosleep 9F
.It Xr id_allocff 9F Ta Xr id_free 9F
.It Xr id_space_create 9F Ta Xr id_space_destroy 9F
.It Xr id_space_extend 9F Ta Xr id_space 9F
.It Xr id32_alloc 9F Ta Xr id32_free 9F
.It Xr id32_lookup 9F Ta Xr rmalloc_wait 9F
.It Xr rmalloc 9F Ta Xr rmallocmap_wait 9F
.It Xr rmallocmap 9F Ta Xr rmfree 9F
.It Xr rmfreemap 9F Ta
.El
.Ss Bit Manipulation Routines
Many device drivers that are working with registers often need to get a
specific range of bits out of an integer.
These functions provide safe ways to set
.Pq bitset
and extract
.Pq bitx
bit ranges, as well
as modify an integer to remove a set of bits entirely
.Pq bitdel .
Using these functions is preferred to constructing manual masks and
shifts particularly when a programming manual for a device is specified
in ranges of bits.
On debug builds, these provide extra checking to try and catch
programmer error.
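.Pp
The operations described above can be sketched in portable userland C.
This is a minimal illustration, not the kernel's implementation: the
sketch_ functions are hypothetical names, they assume that a range is given
as an inclusive high bit and low bit, and they use
.Xr assert 3C
as a stand-in for the extra checking done on debug builds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userland stand-ins for the bitx, bitset, and bitdel families.
 * All ranges are inclusive: [high:low].
 */

/* Build a right-aligned mask that is (high - low + 1) bits wide. */
static uint32_t
sketch_mask32(unsigned high, unsigned low)
{
	unsigned width = high - low + 1;

	/* Shifting a 32-bit value by 32 is undefined, so special-case it. */
	return (width == 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1));
}

/* Extract bits [high:low] of reg, shifted down to bit 0. */
static uint32_t
sketch_bitx32(uint32_t reg, unsigned high, unsigned low)
{
	assert(high < 32 && low <= high);
	return ((reg >> low) & sketch_mask32(high, low));
}

/* Return reg with bits [high:low] replaced by val. */
static uint32_t
sketch_bitset32(uint32_t reg, unsigned high, unsigned low, uint32_t val)
{
	uint32_t mask;

	assert(high < 32 && low <= high);
	mask = sketch_mask32(high, low);
	assert((val & ~mask) == 0);	/* val must fit in the range */
	return ((reg & ~(mask << low)) | (val << low));
}

/* Remove bits [high:low] from reg, shifting the bits above them down. */
static uint64_t
sketch_bitdel64(uint64_t reg, unsigned high, unsigned low)
{
	uint64_t lower, upper;

	assert(high < 64 && low <= high);
	lower = reg & ((UINT64_C(1) << low) - 1);
	/* Shifting a 64-bit value by 64 is undefined, so special-case it. */
	upper = (high == 63) ? 0 : (reg >> (high + 1));
	return ((upper << low) | lower);
}
```

For example, if a device manual describes a field in bits 15:8 of a 32-bit
register, the field would be read with a call shaped like
sketch_bitx32(reg, 15, 8) rather than with a hand-written shift and mask.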
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bitdel64 9F Ta Xr bitset8 9F
.It Xr bitset16 9F Ta Xr bitset32 9F
.It Xr bitset64 9F Ta Xr bitx8 9F
.It Xr bitx16 9F Ta Xr bitx32 9F
.It Xr bitx64 9F Ta
.El
.Ss Synchronization Primitives
The kernel provides a set of basic synchronization primitives that can
be used by the system.
These include mutexes, condition variables, reader/writer locks, and
semaphores.
When creating mutexes and reader/writer locks, the kernel requires that
one pass in the interrupt priority if the lock will be used in
interrupt context.
This is required so the kernel can determine the correct underlying type
of lock to use.
This ensures that if for some reason a mutex needs to be used in
high-level interrupt context, the kernel will use a spin lock, but
otherwise can use the standard adaptive mutex that might block.
For developers familiar with other operating systems, this is somewhat
different in that the consumer does not generally need to figure out
this level of detail, which is why no separate interface for choosing
the lock type is present.
.Pp
In addition, condition variables provide variants that wait while also
detecting that a signal has been delivered.
These variants are particularly useful when writing character device
operations for device drivers as they allow users the chance to cancel an
operation and not be blocked indefinitely on something that may not
occur.
These _sig variants should generally be preferred where applicable.
.Pp
The kernel also provides memory barrier primitives.
See the
.Sx Memory Barriers
section for more information.
There is no need to use manual memory barriers when using the
synchronization primitives.
The synchronization primitives ensure that the appropriate barriers are
present to ensure coherency while the lock is held.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cv_broadcast 9F Ta Xr cv_destroy 9F
.It Xr cv_init 9F Ta Xr cv_reltimedwait_sig 9F
.It Xr cv_reltimedwait 9F Ta Xr cv_signal 9F
.It Xr cv_timedwait_sig 9F Ta Xr cv_timedwait 9F
.It Xr cv_wait_sig 9F Ta Xr cv_wait 9F
.It Xr ddi_enter_critical 9F Ta Xr ddi_exit_critical 9F
.It Xr mutex_destroy 9F Ta Xr mutex_enter 9F
.It Xr mutex_exit 9F Ta Xr mutex_init 9F
.It Xr mutex_owned 9F Ta Xr mutex_tryenter 9F
.It Xr rw_destroy 9F Ta Xr rw_downgrade 9F
.It Xr rw_enter 9F Ta Xr rw_exit 9F
.It Xr rw_init 9F Ta Xr rw_read_locked 9F
.It Xr rw_tryenter 9F Ta Xr rw_tryupgrade 9F
.It Xr sema_destroy 9F Ta Xr sema_init 9F
.It Xr sema_p_sig 9F Ta Xr sema_p 9F
.It Xr sema_tryp 9F Ta Xr sema_v 9F
.It Xr semaphore 9F Ta
.El
.Ss Atomic Operations
This group of functions provides a general way to perform atomic
operations on integers of different sizes and explicit types.
The
.Xr atomic_ops 9F
manual page describes the different classes of functions in more detail,
but there are functions that take care of using the CPU's instructions
for addition, compare and swap, and more.
If data is being protected and only accessed under a synchronization
primitive such as a mutex or reader-writer lock, then there isn't a
reason to use an atomic operation for that data, generally speaking.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr atomic_add_8_nv 9F Ta Xr atomic_add_8 9F
.It Xr atomic_add_16_nv 9F Ta Xr atomic_add_16 9F
.It Xr atomic_add_32_nv 9F Ta Xr atomic_add_32 9F
.It Xr atomic_add_64_nv 9F Ta Xr atomic_add_64 9F
.It Xr atomic_add_char_nv 9F Ta Xr atomic_add_char 9F
.It Xr atomic_add_int_nv 9F Ta Xr atomic_add_int 9F
.It Xr atomic_add_long_nv 9F Ta Xr atomic_add_long 9F
.It Xr atomic_add_ptr_nv 9F Ta Xr atomic_add_ptr 9F
.It Xr atomic_add_short_nv 9F Ta Xr atomic_add_short 9F
.It Xr atomic_and_8_nv 9F Ta Xr atomic_and_8 9F
.It Xr atomic_and_16_nv 9F Ta Xr atomic_and_16 9F
.It Xr atomic_and_32_nv 9F Ta Xr atomic_and_32 9F
.It Xr atomic_and_64_nv 9F Ta Xr atomic_and_64 9F
.It Xr atomic_and_uchar_nv 9F Ta Xr atomic_and_uchar 9F
.It Xr atomic_and_uint_nv 9F Ta Xr atomic_and_uint 9F
.It Xr atomic_and_ulong_nv 9F Ta Xr atomic_and_ulong 9F
.It Xr atomic_and_ushort_nv 9F Ta Xr atomic_and_ushort 9F
.It Xr atomic_cas_16 9F Ta Xr atomic_cas_32 9F
.It Xr atomic_cas_64 9F Ta Xr atomic_cas_8 9F
.It Xr atomic_cas_ptr 9F Ta Xr atomic_cas_uchar 9F
.It Xr atomic_cas_uint 9F Ta Xr atomic_cas_ulong 9F
.It Xr atomic_cas_ushort 9F Ta Xr atomic_clear_long_excl 9F
.It Xr atomic_dec_8_nv 9F Ta Xr atomic_dec_8 9F
.It Xr atomic_dec_16_nv 9F Ta Xr atomic_dec_16 9F
.It Xr atomic_dec_32_nv 9F Ta Xr atomic_dec_32 9F
.It Xr atomic_dec_64_nv 9F Ta Xr atomic_dec_64 9F
.It Xr atomic_dec_ptr_nv 9F Ta Xr atomic_dec_ptr 9F
.It Xr atomic_dec_uchar_nv 9F Ta Xr atomic_dec_uchar 9F
.It Xr atomic_dec_uint_nv 9F Ta Xr atomic_dec_uint 9F
.It Xr atomic_dec_ulong_nv 9F Ta Xr atomic_dec_ulong 9F
.It Xr atomic_dec_ushort_nv 9F Ta Xr atomic_dec_ushort 9F
.It Xr atomic_inc_8_nv 9F Ta Xr atomic_inc_8 9F
.It Xr atomic_inc_16_nv 9F Ta Xr atomic_inc_16 9F
.It Xr atomic_inc_32_nv 9F Ta Xr atomic_inc_32 9F
.It Xr atomic_inc_64_nv 9F Ta Xr atomic_inc_64 9F
.It Xr atomic_inc_ptr_nv 9F Ta Xr atomic_inc_ptr 9F
.It Xr atomic_inc_uchar_nv 9F Ta Xr atomic_inc_uchar 9F
.It Xr atomic_inc_uint_nv 9F Ta Xr atomic_inc_uint 9F
.It Xr atomic_inc_ulong_nv 9F Ta Xr atomic_inc_ulong 9F
.It Xr atomic_inc_ushort_nv 9F Ta Xr atomic_inc_ushort 9F
.It Xr atomic_or_8_nv 9F Ta Xr atomic_or_8 9F
.It Xr atomic_or_16_nv 9F Ta Xr atomic_or_16 9F
.It Xr atomic_or_32_nv 9F Ta Xr atomic_or_32 9F
.It Xr atomic_or_64_nv 9F Ta Xr atomic_or_64 9F
.It Xr atomic_or_uchar_nv 9F Ta Xr atomic_or_uchar 9F
.It Xr atomic_or_uint_nv 9F Ta Xr atomic_or_uint 9F
.It Xr atomic_or_ulong_nv 9F Ta Xr atomic_or_ulong 9F
.It Xr atomic_or_ushort_nv 9F Ta Xr atomic_or_ushort 9F
.It Xr atomic_set_long_excl 9F Ta Xr atomic_swap_8 9F
.It Xr atomic_swap_16 9F Ta Xr atomic_swap_32 9F
.It Xr atomic_swap_64 9F Ta Xr atomic_swap_ptr 9F
.It Xr atomic_swap_uchar 9F Ta Xr atomic_swap_uint 9F
.It Xr atomic_swap_ulong 9F Ta Xr atomic_swap_ushort 9F
.El
.Ss Memory Barriers
The kernel provides general purpose memory barriers that can be used
when required.
In general, when using items described in the
.Sx Synchronization Primitives
section, these are not required.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr membar_consumer 9F Ta Xr membar_enter 9F
.It Xr membar_exit 9F Ta Xr membar_producer 9F
.El
.Ss Virtual Memory and Pages
All platforms that the operating system supports have some form of
virtual memory which is managed in units of pages.
The page size varies between architectures and platforms.
For example, the smallest x86 page size is 4 KiB while SPARC
traditionally used 8 KiB pages.
These functions can be used to convert between pages and bytes.
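.Pp
The arithmetic behind these conversions can be sketched in userland C.
This is an illustration only: the sketch_ names are hypothetical, and the
4 KiB page size is an assumption hard-coded for the example, whereas the
kernel functions use the page size of the running system.
The sketch models a byte-to-page conversion that rounds down, one that
rounds up, and a page-to-byte conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, the smallest x86 page size. */
#define	SKETCH_PAGESHIFT	12
#define	SKETCH_PAGESIZE		(UINT64_C(1) << SKETCH_PAGESHIFT)
#define	SKETCH_PAGEOFFSET	(SKETCH_PAGESIZE - 1)

/* Bytes to pages, rounding down: a partial trailing page is not counted. */
static uint64_t
sketch_btop(uint64_t bytes)
{
	return (bytes >> SKETCH_PAGESHIFT);
}

/* Bytes to pages, rounding up: a partial trailing page counts as one. */
static uint64_t
sketch_btopr(uint64_t bytes)
{
	/* Add the partial page separately so large values cannot overflow. */
	return ((bytes >> SKETCH_PAGESHIFT) +
	    ((bytes & SKETCH_PAGEOFFSET) != 0 ? 1 : 0));
}

/* Pages to bytes. */
static uint64_t
sketch_ptob(uint64_t pages)
{
	/* The result must be representable in 64 bits. */
	assert(pages <= (UINT64_MAX >> SKETCH_PAGESHIFT));
	return (pages << SKETCH_PAGESHIFT);
}
```

Under these assumptions, a buffer of 8191 bytes spans one whole page when
rounding down and needs two pages when rounding up.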
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr btop 9F Ta Xr btopr 9F
.It Xr ddi_btop 9F Ta Xr ddi_btopr 9F
.It Xr ddi_ptob 9F Ta Xr ptob 9F
.El
.Ss Module and Device Framework
These functions are used as part of implementing kernel modules and
registering device drivers with the various kernel frameworks.
There are also functions here that are suitable for use in the
.Xr dev_ops 9S ,
.Xr cb_ops 9S ,
etc.
structures and for interrogating module information.
.Pp
The
.Xr mod_install 9F
and
.Xr mod_remove 9F
functions are used during a driver's
.Xr _init 9E
and
.Xr _fini 9E
functions.
.Pp
There are two different ways that drivers often manage their instance
state which is created during
.Xr attach 9E .
The first is the use of
.Xr ddi_set_driver_private 9F
and
.Xr ddi_get_driver_private 9F .
This stores a driver-specific value on the
.Vt dev_info_t
structure which allows it to be used during other operations.
Some device driver frameworks may use this themselves, making this
unavailable to the driver.
.Pp
The other path is to use the soft state suite of functions which
dynamically grows to cover the number of instances of a device that
exist.
The soft state is generally initialized in the
.Xr _init 9E
entry point with
.Xr ddi_soft_state_init 9F
and then instances are allocated and freed during
.Xr attach 9E
and
.Xr detach 9E
with
.Xr ddi_soft_state_zalloc 9F
and
.Xr ddi_soft_state_free 9F ,
and then retrieved with
.Xr ddi_get_soft_state 9F .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_get_driver_private 9F Ta Xr ddi_get_soft_state 9F
.It Xr ddi_modclose 9F Ta Xr ddi_modopen 9F
.It Xr ddi_modsym 9F Ta Xr ddi_no_info 9F
.It Xr ddi_report_dev 9F Ta Xr ddi_set_driver_private 9F
.It Xr ddi_soft_state_fini 9F Ta Xr ddi_soft_state_free 9F
.It Xr ddi_soft_state_init 9F Ta Xr ddi_soft_state_zalloc 9F
.It Xr mod_info 9F Ta Xr mod_install 9F
.It Xr mod_modname 9F Ta Xr mod_remove 9F
.It Xr nochpoll 9F Ta Xr nodev 9F
.It Xr nulldev 9F Ta
.El
.Ss Device Tree Information
Devices are organized into a tree that is partially seeded by the
platform based on information discovered at boot and augmented with
additional information at runtime.
Every instance of a device driver is given a
.Vt "dev_info_t *"
.Pq device information
data structure which corresponds to information about an instance and
has a place in the tree.
When a driver requests an operation, such as allocating memory for DMA,
that request is passed up the tree and modified.
The same is true for other things like interrupts, event notifications,
or properties.
.Pp
There are many different informational properties about a device driver.
For example,
.Xr ddi_driver_name 9F
returns the name of the device driver,
.Xr ddi_get_name 9F
returns the name of the node in the tree,
.Xr ddi_get_parent 9F
returns a node's parent, and
.Xr ddi_get_instance 9F
returns the instance number of a specific driver.
.Pp
There are a series of properties that exist on the tree, the exact set
of which depends on the class of the device and is often documented in a
specific device class's manual.
For example, the
.Dq reg
property is used for PCI and PCIe devices to describe the various base
address registers, their types, and related information, which are
documented in
.Xr pci 5 .
806.Pp 807When getting a property one can constrain it to the current instance or 808you can ask for a parent to try to look up the property. 809Which mode is appropriate depends on the specific class of driver, its 810parent, and the property. 811.Pp 812Using a 813.Vt "dev_info_t *" 814pointer has to be done carefully. 815When a device driver is in any of its 816.Xr dev_ops 9S , 817.Xr cb_ops 9S , 818or similar callback functions that it has registered with the kernel, 819then it can always safely use its own 820.Vt "dev_info_t" 821and those of any parents it discovers through 822.Xr ddi_get_parent 9F . 823However, it cannot assume the validity of any siblings or children 824unless there are other circumstances that guarantee that they will not 825disappear. 826In the broader kernel, one should not assume that it is safe to use a 827given 828.Vt "dev_info_t *" 829structure without the appropriate NDI 830.Pq nexus driver interface 831hold having been applied. 832.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 833.It Xr ddi_binding_name 9F Ta Xr ddi_dev_is_sid 9F 834.It Xr ddi_driver_major 9F Ta Xr ddi_driver_name 9F 835.It Xr ddi_get_devstate 9F Ta Xr ddi_get_instance 9F 836.It Xr ddi_get_name 9F Ta Xr ddi_get_parent 9F 837.It Xr ddi_getlongprop_buf 9F Ta Xr ddi_getlongprop 9F 838.It Xr ddi_getprop 9F Ta Xr ddi_getproplen 9F 839.It Xr ddi_node_name 9F Ta Xr ddi_prop_create 9F 840.It Xr ddi_prop_exists 9F Ta Xr ddi_prop_free 9F 841.It Xr ddi_prop_get_int 9F Ta Xr ddi_prop_get_int64 9F 842.It Xr ddi_prop_lookup_byte_array 9F Ta Xr ddi_prop_lookup_int_array 9F 843.It Xr ddi_prop_lookup_int64_array 9F Ta Xr ddi_prop_lookup_string_array 9F 844.It Xr ddi_prop_lookup_string 9F Ta Xr ddi_prop_lookup 9F 845.It Xr ddi_prop_modify 9F Ta Xr ddi_prop_op 9F 846.It Xr ddi_prop_remove_all 9F Ta Xr ddi_prop_remove 9F 847.It Xr ddi_prop_undefine 9F Ta Xr ddi_prop_update_byte_array 9F 848.It Xr ddi_prop_update_int_array 9F Ta Xr 
ddi_prop_update_int 9F
.It Xr ddi_prop_update_int64_array 9F Ta Xr ddi_prop_update_int64 9F
.It Xr ddi_prop_update_string_array 9F Ta Xr ddi_prop_update_string 9F
.It Xr ddi_prop_update 9F Ta Xr ddi_root_node 9F
.It Xr ddi_slaveonly 9F Ta
.El
.Ss Copying Data to and from Userland
The kernel operates in a different context from userland.
One does not simply access user memory.
This is enforced either by the architecture's memory model, where user
address space isn't even present in the kernel's virtual address space,
or by architectural mechanisms such as Supervisor Mode Access Prevention
.Pq SMAP
on x86.
.Pp
To facilitate accessing memory, the kernel provides a few routines that
can be used.
In most contexts the main things to use are
.Xr ddi_copyin 9F
and
.Xr ddi_copyout 9F .
These will safely dereference addresses and ensure that the address is
appropriate depending on whether this is coming from the user or kernel.
When operating with the kernel's
.Vt uio_t
structure, which is mostly used when processing read and write
requests,
.Xr uiomove 9F
is the go-to function instead.
.Pp
When reading data from userland into the kernel, there is another
concern: the data model.
The most common place this comes up is in an
.Xr ioctl 9E
handler or other places where the kernel is operating on data that isn't
fixed size.
Particularly in C, though this applies to other languages, structures
and unions vary in their size and alignment requirements between 32-bit
and 64-bit processes.
The same even applies if one uses pointers or the
.Vt long ,
.Vt size_t ,
or similar types in C.
In supported 32-bit and 64-bit environments these types are 4 and 8
bytes respectively.
To account for this, when data is not fixed size between all data
models, the driver must look at the data model of the process it is
copying data from.
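.Pp
The size and alignment difference described above can be made concrete
with a small userland sketch.
The structure below is a hypothetical ioctl argument, written twice to
model the two layouts; the
.Vt sketch_req32_t
and
.Vt sketch_req64_t
names are assumptions for illustration, and the 4-byte packing is used
here only to imitate a 32-bit environment where 64-bit integers are
4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 32-bit view: pointers and size_t are modeled as 4-byte integers, and
 * the packing pragma limits member alignment to 4 bytes so that the
 * 64-bit member is placed the way a 32-bit x86 process would place it.
 */
#pragma pack(push, 4)
typedef struct sketch_req32 {
    uint32_t    sr_flags;
    uint64_t    sr_offset;  /* lands at offset 4 */
    uint32_t    sr_buf;     /* 4-byte user pointer */
    uint32_t    sr_len;     /* 4-byte size_t */
} sketch_req32_t;
#pragma pack(pop)

/*
 * 64-bit view: pointers and size_t are modeled as 8-byte integers and
 * the 64-bit member is explicitly 8-byte aligned.
 */
typedef struct sketch_req64 {
    uint32_t    sr_flags;
    _Alignas(8) uint64_t sr_offset; /* padded out to offset 8 */
    uint64_t    sr_buf;
    uint64_t    sr_len;
} sketch_req64_t;

/* Same members, same order, yet neither size nor offsets agree. */
_Static_assert(sizeof (sketch_req32_t) == 20, "32-bit view is 20 bytes");
_Static_assert(sizeof (sketch_req64_t) == 32, "64-bit view is 32 bytes");
```

Only the first member sits at the same offset in both views, which is
why a structure of this shape cannot simply be copied in with one size
for every caller.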
896.Pp 897The simplest way to solve this problem is to try to make the data 898structure the same across the different models. 899It's not sufficient to just use the same structure definition and fixed 900size types as the alignment and padding between the two can vary. 901For example, the alignment of a 64-bit integer like a 902.Vt uint64_t 903can change between a 32-bit and 64-bit data model. 904One way to check for the data structures being identical is to leverage 905the 906.Xr ctfdiff 1 907program, generally with the 908.Fl I 909option. 910.Pp 911However, there are times when a structure simply can't be the same, such 912as when we're encoding a pointer into the structure or a type like the 913.Vt size_t . 914When this happens, the most natural way to accomplish this is to use the 915.Xr ddi_model_convert_from 9F 916function which can determine the appropriate model from the ioctl's 917arguments. 918This provides a natural way to copy a structure in and out in the 919appropriate data model and convert it at those points to the kernel's 920native form. 921.Pp 922An alternate way to approach the data model is to use the 923.Xr STRUCT_DECL 9F 924functions, but as this requires wrapping every access to every member, 925often times the 926.Xr ddi_model_convert_from 9F 927approach and taking care of converting values and ensuring that limits 928aren't exceeded at the end is preferred. 
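.Pp
The convert-at-the-boundary approach described above can be modeled in
userland as follows.
This is a sketch, not kernel code: the
.Vt sketch_model_t
tags, the structures, and both helper functions are hypothetical, and a
plain memory buffer stands in for data that a driver would obtain with a
copyin-style routine after consulting
.Xr ddi_model_convert_from 9F .

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model tags; the kernel has its own data model constants. */
typedef enum sketch_model {
    SKETCH_MODEL_ILP32,
    SKETCH_MODEL_LP64
} sketch_model_t;

/* 32-bit view of a hypothetical request: pointer and length are 4 bytes. */
typedef struct sketch_io32 {
    uint32_t    si_buf;
    uint32_t    si_len;
} sketch_io32_t;

/* Native 64-bit view that the rest of the driver would operate on. */
typedef struct sketch_io {
    uint64_t    si_buf;
    uint64_t    si_len;
} sketch_io_t;

/*
 * Convert the raw bytes of a caller's request into the native form.
 * Returns 0 on success, or -1 if the model is unknown or the raw buffer
 * is too short for the structure that the model implies.
 */
static int
sketch_io_import(const void *raw, size_t rawlen, sketch_model_t model,
    sketch_io_t *out)
{
    switch (model) {
    case SKETCH_MODEL_ILP32: {
        sketch_io32_t io32;

        if (rawlen < sizeof (io32))
            return (-1);
        memcpy(&io32, raw, sizeof (io32));
        out->si_buf = io32.si_buf;  /* widen 4 -> 8 bytes */
        out->si_len = io32.si_len;
        return (0);
    }
    case SKETCH_MODEL_LP64:
        if (rawlen < sizeof (*out))
            return (-1);
        memcpy(out, raw, sizeof (*out));
        return (0);
    default:
        return (-1);
    }
}

/*
 * Convert back for a 32-bit caller. Returns -1 if a value does not fit
 * in the caller's 4-byte fields, which models the limit check that has
 * to happen at the end.
 */
static int
sketch_io_export32(const sketch_io_t *in, sketch_io32_t *out)
{
    if (in->si_buf > UINT32_MAX || in->si_len > UINT32_MAX)
        return (-1);
    out->si_buf = (uint32_t)in->si_buf;
    out->si_len = (uint32_t)in->si_len;
    return (0);
}
```

The design point is that only these two functions know about the 32-bit
layout; everything between them works on the native structure.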
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bp_copyin 9F Ta Xr bp_copyout 9F
.It Xr copyin 9F Ta Xr copyout 9F
.It Xr ddi_copyin 9F Ta Xr ddi_copyout 9F
.It Xr ddi_model_convert_from 9F Ta Xr SIZEOF_PTR 9F
.It Xr SIZEOF_STRUCT 9F Ta Xr STRUCT_BUF 9F
.It Xr STRUCT_DECL 9F Ta Xr STRUCT_FADDR 9F
.It Xr STRUCT_FGET 9F Ta Xr STRUCT_FGETP 9F
.It Xr STRUCT_FSET 9F Ta Xr STRUCT_FSETP 9F
.It Xr STRUCT_HANDLE 9F Ta Xr STRUCT_INIT 9F
.It Xr STRUCT_SET_HANDLE 9F Ta Xr STRUCT_SIZE 9F
.It Xr uiomove 9F Ta Xr ureadc 9F
.It Xr uwritec 9F Ta
.El
.Ss Device Register Setup and Access
The kernel abstracts out accessing registers on a device on behalf of
drivers.
This allows a similar set of interfaces to be used whether the registers
are found within a PCI BAR, use I/O ports, are memory mapped, or use
some other scheme.
Devices with registers all have a
.Dq reg
property that is set up by their parent device, generally a kernel
framework as is the case for PCIe devices, and the meaning is a contract
between the two.
Register sets are identified by a numeric ID, the meaning of which
varies by device type.
For example, the first BAR of a PCI device is defined as register set 1.
On the other hand, the AMD GPIO controller might have three register sets
because of how the hardware design splits them up.
The meaning of the registers and their semantics is still
device-specific.
The kernel doesn't know how to interpret the actual registers of, say, a
PCIe device, just that they exist.
.Pp
To begin with register setup, one often first looks at the number of
register sets that exist and their size.
Most PCI-based device drivers will skip calling
.Xr ddi_dev_nregs 9F
and will just move straight to calling
.Xr ddi_dev_regsize 9F
to determine the size of a register set that they are interested in.
To actually map the registers, a device driver will call
.Xr ddi_regs_map_setup 9F
which requires both a register set and a series of attributes and
returns an access handle that is used to actually read and write the
registers.
When setting up registers, one must have a corresponding
.Vt ddi_device_acc_attr_t
structure which is used to define what endianness the register set is
in, whether any kind of reordering is allowed
.Po
if in doubt specify
.Dv DDI_STRICTORDER_ACC
.Pc ,
and whether any particular error handling is being used.
The structure and all of its different options are described in
.Xr ddi_device_acc_attr 9S .
.Pp
Once a register handle is obtained, then it's easy to read and write the
register space.
Functions are organized based on the size of the access.
Most situations call for the use of the
.Xr ddi_get8 9F ,
.Xr ddi_get16 9F ,
.Xr ddi_get32 9F ,
and
.Xr ddi_get64 9F
functions to read a register and the
.Xr ddi_put8 9F ,
.Xr ddi_put16 9F ,
.Xr ddi_put32 9F ,
and
.Xr ddi_put64 9F
functions to set a register value.
While there are the ddi_io_ and ddi_mem_ families of functions below,
these are not generally needed and are mostly present for
compatibility.
The kernel will automatically perform the appropriate type of register
read for the device type in question.
.Pp
Once a register set is no longer being used, the
.Xr ddi_regs_map_free 9F
function should be used to release resources.
In most cases, this happens while executing the
.Xr detach 9E
entry point.
1016.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 1017.It Xr ddi_dev_nregs 9F Ta Xr ddi_dev_regsize 9F 1018.It Xr ddi_device_copy 9F Ta Xr ddi_device_zero 9F 1019.It Xr ddi_regs_map_free 9F Ta Xr ddi_regs_map_setup 9F 1020.It Xr ddi_get8 9F Ta Xr ddi_get16 9F 1021.It Xr ddi_get32 9F Ta Xr ddi_get64 9F 1022.It Xr ddi_io_get8 9F Ta Xr ddi_io_get16 9F 1023.It Xr ddi_io_get32 9F Ta Xr ddi_io_put8 9F 1024.It Xr ddi_io_put16 9F Ta Xr ddi_io_put32 9F 1025.It Xr ddi_io_rep_get8 9F Ta Xr ddi_io_rep_get16 9F 1026.It Xr ddi_io_rep_get32 9F Ta Xr ddi_io_rep_put8 9F 1027.It Xr ddi_io_rep_put16 9F Ta Xr ddi_io_rep_put32 9F 1028.It Xr ddi_map_regs 9F Ta Xr ddi_mem_get8 9F 1029.It Xr ddi_mem_get16 9F Ta Xr ddi_mem_get32 9F 1030.It Xr ddi_mem_get64 9F Ta Xr ddi_mem_put8 9F 1031.It Xr ddi_mem_put16 9F Ta Xr ddi_mem_put32 9F 1032.It Xr ddi_mem_put64 9F Ta Xr ddi_mem_rep_get8 9F 1033.It Xr ddi_mem_rep_get16 9F Ta Xr ddi_mem_rep_get32 9F 1034.It Xr ddi_mem_rep_get64 9F Ta Xr ddi_mem_rep_put8 9F 1035.It Xr ddi_mem_rep_put16 9F Ta Xr ddi_mem_rep_put32 9F 1036.It Xr ddi_mem_rep_put64 9F Ta Xr ddi_peek8 9F 1037.It Xr ddi_peek16 9F Ta Xr ddi_peek32 9F 1038.It Xr ddi_peek64 9F Ta Xr ddi_poke8 9F 1039.It Xr ddi_poke16 9F Ta Xr ddi_poke32 9F 1040.It Xr ddi_poke64 9F Ta Xr ddi_put8 9F 1041.It Xr ddi_put16 9F Ta Xr ddi_put32 9F 1042.It Xr ddi_put64 9F Ta Xr ddi_rep_get8 9F 1043.It Xr ddi_rep_get16 9F Ta Xr ddi_rep_get32 9F 1044.It Xr ddi_rep_get64 9F Ta Xr ddi_rep_put8 9F 1045.It Xr ddi_rep_put16 9F Ta Xr ddi_rep_put32 9F 1046.It Xr ddi_rep_put64 9F Ta 1047.El 1048.Ss DMA Related Functions 1049Most high-performance devices provide first-class support for DMA 1050.Pq direct memory access . 1051DMA allows a transfer between a device and memory to occur 1052asynchronously and generally without a thread's specific involvement. 1053Today, most DMA is provided directly by devices and the corresponding 1054device scheme. 
Take PCI and PCI Express for example.
The idea of DMA is built into the PCIe standard, so basic support for it
exists and there isn't a lot of special programming required.
However, this hasn't always been true, and there are still some cases
where there is a 3rd party DMA engine.
If we consider the PCIe example, the PCIe device directly performs reads
and writes to main memory on its own.
However, in the 3rd party case, there is a distinct controller that is
neither the device nor memory that facilitates this, which is called a
DMA engine.
DMA engines are not something that needs to be thought about on most
platforms that illumos is present on; however, they still exist in some
embedded and related contexts.
.Pp
The first thing that a driver needs to do to set up DMA is to understand
the constraints of the device and bus.
These constraints are described in a series of attributes in the
.Vt ddi_dma_attr_t
structure which is defined in
.Xr ddi_dma_attr 9S .
The reason that attributes exist is that different devices, and
sometimes different memory uses with a device, have different
requirements for memory.
A simple example of this is that not all devices can accept memory
addresses that are 64 bits wide and may have to be constrained to the
lower 32 bits of memory.
Another common constraint is how this memory is chunked up.
Some devices may require that all of the DMA memory be contiguous, while
others can allow that to be broken up into, say, up to 4 or 8 different
regions.
.Pp
When memory is allocated for DMA it isn't immediately mapped into the
kernel's address space.
The addresses that describe a DMA address are defined in a DMA cookie,
several of which may make up a request.
However, those addresses are always physical addresses or addresses that
are virtualized by an IOMMU.
There are some cases where the kernel or a driver needs to be able to
access that memory, such as memory that represents a networking packet.
The IP stack will expect to be able to actually read the data it's
given.
.Pp
To begin with allocating DMA memory, a driver first fills out its
attribute structure.
Once that's ready, the DMA allocation process can begin.
This starts off by a driver calling
.Xr ddi_dma_alloc_handle 9F .
This handle is used through the lifetime of a given DMA memory buffer,
but it can be used across multiple operations that a device or the
kernel may perform.
The next step is to actually request that the kernel allocate some
amount of memory in the kernel for this DMA request.
This phase actually allocates addresses in virtual address space for the
activity and also requires a register attribute object that is discussed
in
.Sx Device Register Setup and Access .
Armed with this a driver can now call
.Xr ddi_dma_mem_alloc 9F
to specify how much memory it is looking for.
If this is successful, a virtual address, the actual length of the
region, and an access handle will be returned.
.Pp
At this point, the virtual address region is present.
Most drivers will access this virtual address range directly and will
ignore the register access handle.
The side effect of this is that they will handle all endianness issues
with the memory region themselves.
If the driver would prefer to go through the handle, then it can use the
register access functions discussed earlier.
.Pp
Before the memory can be programmed into the device, it must be bound to
a series of physical addresses or addresses virtualized by an IOMMU.
While the kernel presents the illusion of a single consistent virtual
address range for applications, the physical reality can be quite
different.
1131When the driver is ready it calls 1132.Xr ddi_dma_addr_bind_handle 9F 1133to create the mapping to well known physical addresses. 1134.Pp 1135These addresses are stored in a series of cookies. 1136A driver can determine the number of cookies for a given request by 1137utilizing its DMA handle and calling 1138.Xr ddi_dma_ncookies 9F 1139and then pairing that with 1140.Xr ddi_dma_cookie_get 9F . 1141These DMA cookies will not change and can be used time and time again 1142until 1143.Xr ddi_dma_unbind_handle 9F 1144is called. 1145With this information in hand, a physical device can be programmed with 1146these addresses and let loose to perform I/O. 1147.Pp 1148When performing I/O to and from a device, synchronization is a vitally 1149important thing which ensures that the actual state in memory is 1150coherent with the rest of the CPU's internal structures such as caches. 1151In general, a given DMA request is only going in one direction: for a 1152device or for the local CPU. 1153In either case, the 1154.Xr ddi_dma_sync 9F 1155function must be called after the kernel is done writing to a region of 1156DMA memory and before it triggers the device or the kernel must call it 1157after the device has told it that some activity has completed that it is 1158going to check. 1159.Pp 1160Some DMA operations utilize what are called DMA windows. 1161The most common consumer is something like a disk device where DMA 1162operations to a given series of sectors can be split up into different 1163chunks where as long as all the transfers are performed, the 1164intermediate states are acceptable. 1165Put another way, because of how SCSI and SAS commands are designed, 1166block devices can basically take a given I/O request and break it into 1167multiple independent I/Os that will equate to the same final item. 1168.Pp 1169When a device supports this mode of operation and it is opted into, then 1170a DMA allocation may result in the use of DMA windows. 
1171This allows for cases where the kernel can't perform a DMA allocation 1172for the entire request, but instead can allocate a partial region and 1173then walk through each part one at a time. 1174This is uncommon outside of block devices and usually also is related to 1175calling 1176.Xr ddi_dma_buf_bind_handle 9F . 1177.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 1178.It Xr ddi_dma_addr_bind_handle 9F Ta Xr ddi_dma_alloc_handle 9F 1179.It Xr ddi_dma_buf_bind_handle 9F Ta Xr ddi_dma_burstsizes 9F 1180.It Xr ddi_dma_cookie_get 9F Ta Xr ddi_dma_cookie_iter 9F 1181.It Xr ddi_dma_cookie_one 9F Ta Xr ddi_dma_free_handle 9F 1182.It Xr ddi_dma_getwin 9F Ta Xr ddi_dma_mem_alloc 9F 1183.It Xr ddi_dma_mem_free 9F Ta Xr ddi_dma_ncookies 9F 1184.It Xr ddi_dma_nextcookie 9F Ta Xr ddi_dma_numwin 9F 1185.It Xr ddi_dma_set_sbus64 9F Ta Xr ddi_dma_sync 9F 1186.It Xr ddi_dma_unbind_handle 9F Ta Xr ddi_dmae_1stparty 9F 1187.It Xr ddi_dmae_alloc 9F Ta Xr ddi_dmae_disable 9F 1188.It Xr ddi_dmae_enable 9F Ta Xr ddi_dmae_getattr 9F 1189.It Xr ddi_dmae_getcnt 9F Ta Xr ddi_dmae_prog 9F 1190.It Xr ddi_dmae_release 9F Ta Xr ddi_dmae_stop 9F 1191.It Xr ddi_dmae 9F Ta 1192.El 1193.Ss Interrupt Handler Related Functions 1194Interrupts are a central part of the role of device drivers and one of 1195the things that's important to get right. 1196Interrupts come in different types: fixed, MSI, and MSI-X. 1197The kinds that are available depend on the device and the rest of the 1198system. 1199For example, MSI and MSI-X interrupts are generally specific to PCI and 1200PCI Express devices. 1201To begin the interrupt allocation process, the first thing a driver 1202needs to do is to discover what type of interrupts it supports with 1203.Xr ddi_intr_get_supported_types 9F . 1204Then, the driver should work through the supported types, preferring 1205MSI-X, then MSI, and finally fixed interrupts, and try to allocate 1206interrupts. 
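.Pp
The preference order described above, MSI-X, then MSI, then fixed, can
be sketched as a small selection routine.
This is a userland model under stated assumptions: the
.Dv SKETCH_INTR_*
flags and the
.Fn sketch_next_intr_type
helper are hypothetical names, and a driver would instead use the
constants documented with
.Xr ddi_intr_get_supported_types 9F
and attempt a real allocation for each candidate.

```c
#include <assert.h>

/*
 * Stand-in bit flags for the interrupt types. These names are
 * hypothetical and exist only for this sketch.
 */
#define SKETCH_INTR_FIXED   0x1
#define SKETCH_INTR_MSI     0x2
#define SKETCH_INTR_MSIX    0x4

/*
 * Walk the preferred order (MSI-X, MSI, fixed), skipping types that are
 * not in the supported mask or that the caller has already tried and
 * failed to allocate. Returns the next type to try, or 0 when there is
 * nothing left to try.
 */
static int
sketch_next_intr_type(int supported, int already_failed)
{
    static const int order[] = {
        SKETCH_INTR_MSIX, SKETCH_INTR_MSI, SKETCH_INTR_FIXED
    };
    int candidates = supported & ~already_failed;
    unsigned int i;

    for (i = 0; i < sizeof (order) / sizeof (order[0]); i++) {
        if ((candidates & order[i]) != 0)
            return (order[i]);
    }
    return (0);
}
```

A caller would loop, adding each type whose allocation fails to the
.Fa already_failed
mask, until an allocation succeeds or the function returns 0.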
.Pp
Drivers first need to know how many interrupts they require.
For example, a networking driver may want to have an interrupt made
available for each ring that it has.
To discover the number of interrupts available, the driver should call
.Xr ddi_intr_get_navail 9F .
If there are sufficient interrupts, it can proceed to actually
allocate the interrupts with
.Xr ddi_intr_alloc 9F .
When allocating interrupts, callers need to check to see how many
interrupts the system actually gave them.
Just because an interrupt is allocated does not mean that it will fire
or be ready to use; there are a series of additional steps that the
driver must take.
.Pp
To enable the interrupt, the driver should first get the interrupt
capabilities with
.Xr ddi_intr_get_cap 9F
and the priority of the interrupt with
.Xr ddi_intr_get_pri 9F .
The priority must be used while creating mutexes and related
synchronization primitives that will be used in the interrupt
handler.
At this point, the driver can go ahead and register the functions that
will be called with each allocated interrupt with the
.Xr ddi_intr_add_handler 9F
function.
The arguments can vary for each allocated interrupt.
It is common to have an interrupt-specific data structure passed in one
of the arguments or an interrupt number, while the other argument is
generally the driver's instance-specific data structure.
.Pp
At this point, the last step for the interrupt to be made active from
the kernel's perspective is to enable it.
This will use either the
.Xr ddi_intr_block_enable 9F
or
.Xr ddi_intr_enable 9F
function depending on the interrupt's capabilities.
The reason that these are different is that some interrupt types
.Pq MSI
require that all interrupts in a group be enabled and disabled at the
same time.
1250This is indicated with the 1251.Dv DDI_INTR_FLAG_BLOCK 1252flag found in the interrupt's capabilities. 1253Once that is called, interrupts that are generated by a device will be 1254delivered to the registered function. 1255.Pp 1256It's important to note that there is often device-specific interrupt 1257setup that is required. 1258While the kernel takes care of updating any pieces of the processor's 1259interrupt controller, I/O crossbar, or the PCI MSI and MSI-X 1260capabilities, many devices have device-specific registers that are used 1261to manage, set up, and acknowledge interrupts. 1262These registers or other controls are often capable of separately 1263masking interrupts and are generally what should be used if there are 1264times that you need to separately enable or disable interrupts such as 1265to poll an I/O ring. 1266.Pp 1267When unwinding interrupts, one needs to work in the reverse order here. 1268Until 1269.Xr ddi_intr_block_disable 9F 1270or 1271.Xr ddi_intr_disable 9F 1272is called, one should assume that their interrupt handler will be 1273called. 1274Due to cases where an interrupt is shared between multiple devices, this 1275can happen even if the device is quiesced! 1276Only after that is done is it safe to then free the interrupts with a 1277call to 1278.Xr ddi_intr_free 9F . 
1279.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 1280.It Xr ddi_add_intr 9F Ta Xr ddi_add_softintr 9F 1281.It Xr ddi_get_iblock_cookie 9F Ta Xr ddi_get_soft_iblock_cookie 9F 1282.It Xr ddi_intr_add_handler 9F Ta Xr ddi_intr_add_softint 9F 1283.It Xr ddi_intr_alloc 9F Ta Xr ddi_intr_block_disable 9F 1284.It Xr ddi_intr_block_enable 9F Ta Xr ddi_intr_clr_mask 9F 1285.It Xr ddi_intr_disable 9F Ta Xr ddi_intr_dup_handler 9F 1286.It Xr ddi_intr_enable 9F Ta Xr ddi_intr_free 9F 1287.It Xr ddi_intr_get_cap 9F Ta Xr ddi_intr_get_hilevel_pri 9F 1288.It Xr ddi_intr_get_navail 9F Ta Xr ddi_intr_get_nintrs 9F 1289.It Xr ddi_intr_get_pending 9F Ta Xr ddi_intr_get_pri 9F 1290.It Xr ddi_intr_get_softint_pri 9F Ta Xr ddi_intr_get_supported_types 9F 1291.It Xr ddi_intr_hilevel 9F Ta Xr ddi_intr_remove_handler 9F 1292.It Xr ddi_intr_remove_softint 9F Ta Xr ddi_intr_set_cap 9F 1293.It Xr ddi_intr_set_mask 9F Ta Xr ddi_intr_set_nreq 9F 1294.It Xr ddi_intr_set_pri 9F Ta Xr ddi_intr_set_softint_pri 9F 1295.It Xr ddi_intr_trigger_softint 9F Ta Xr ddi_remove_intr 9F 1296.It Xr ddi_remove_softintr 9F Ta Xr ddi_trigger_softintr 9F 1297.El 1298.Ss Minor Nodes 1299For a device driver to be accessed by a program in user space 1300.Pq or with the kernel layered device interface 1301then it must create a minor node. 1302Minor nodes are created under 1303.Pa /devices 1304.Pq Xr devfs 4FS 1305and are tied to the instance of a device driver via its 1306.Vt dev_info_t . 1307The 1308.Xr devfsadm 8 1309daemon and the 1310.Pa /dev 1311file system 1312.Po 1313sdev, 1314.Xr dev 4FS 1315.Pc 1316are responsible for creating a coherent set of names that user programs 1317access. 1318Drivers create these minor nodes using the 1319.Xr ddi_create_minor_node 9F 1320function listed below. 1321.Pp 1322In UNIX tradition, character, block, and STREAMS device special files 1323are identified by a major and minor number. 
All instances of a given driver share the same major number, which means
that a device driver must coordinate the minor number space across
.Em all
instances.
While a minor node is created with a fixed minor number, it is possible
to change the minor number while processing an
.Xr open 9E
call, allowing subsequent character device operations to uniquely
identify a particular caller.
This is usually referred to as a driver that
.Dq clones .
.Pp
When drivers aren't performing cloning, then usually the minor number
used when creating the minor node is some fixed offset or multiple of
the driver's instance number.
When a driver is cloning and needs to allocate and manage a minor number
space, usually an ID space is leveraged whose IDs are in the range from
0 through
.Dv MAXMIN32 .
There are several different strategies for tracking data structures as
they relate to minor numbers.
Sometimes, the soft state functionality is used.
Others might keep an AVL tree around or tie the data to some other data
structure.
The method chosen often depends on the specifics of the implementation
and its broader context.
.Pp
The
.Vt dev_t
type represents the combined major and minor number.
It can be taken apart with the
.Xr getmajor 9F
and
.Xr getminor 9F
functions and then reconstructed with the
.Xr makedevice 9F
function.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_create_minor_node 9F Ta Xr ddi_remove_minor_node 9F
.It Xr getmajor 9F Ta Xr getminor 9F
.It Xr devfs_clean 9F Ta Xr makedevice 9F
.El
.Ss Accessing Time, Delays, and Periodic Events
The kernel provides a number of ways to understand time in the system.
In particular it provides a few different clocks and time measurements:
.Bl -tag -width Ds
.It High-resolution monotonic time
The kernel provides access to a high-resolution monotonic clock that is
tracked in nanoseconds.
This clock is perfect for measuring durations and is accessed via
.Xr gethrtime 9F .
Unlike the real-time clock, this clock is not subject to adjustments by
a time synchronization daemon and is the preferred clock that drivers
should be using for tracking events.
The high-resolution clock is consistent across CPUs, meaning that you
may call
.Xr gethrtime 9F
on one CPU and the value will be consistent with what is returned on
another, even if a thread is migrated to another CPU.
.Pp
The high-resolution clock is implemented using architecture- and
platform-specific means.
For example, on x86 it is generally backed by the TSC
.Pq time stamp counter .
.It Real-time
The real-time clock tracks time as humans perceive it.
This clock is accessed using
.Xr ddi_get_time 9F .
If the system is running a time synchronization daemon that leverages
the network time protocol, then this time may be in sync with other
systems
.Pq subject to some amount of variance ;
however, it is critical that this is not assumed.
.Pp
In general, this time should not be used by drivers for any purpose.
It can jump around and drift, and most aspects of the kernel are not
based on the real-time clock.
For any device timing activities, the high-resolution clock should be
used.
.It Tick-based monotonic time
The kernel has a running periodic function that fires based on the rate
dictated by the
.Va hz
variable, generally operating at 100 or 1000 Hz.
The current number of ticks since boot is accessible through the
.Xr ddi_get_lbolt 9F
function.
When functions operate in units of ticks, this is what they are
tracking.
This value can be converted to and from microseconds using the
.Xr drv_usectohz 9F
and
.Xr drv_hztousec 9F
functions.
.Pp
In general, drivers should prefer the high-resolution monotonic clock
for tracking events internally.
.El
.Pp
With these different timing mechanisms, the kernel provides a few
different ways to delay execution or to get a callback after some
amount of time passes.
.Pp
The
.Xr delay 9F
and
.Xr drv_usecwait 9F
functions are used to block the execution of the current thread.
.Xr delay 9F
can be used in conditions where sleeping and blocking are allowed,
whereas
.Xr drv_usecwait 9F
is a busy-wait, which is appropriate for some device drivers,
particularly when in high-level interrupt context.
.Pp
The kernel also allows a function to be called after some time has
elapsed.
This callback occurs on a different thread and will be executed in
.Sy kernel
context.
A timeout can be scheduled in the future with the
.Xr timeout 9F
function and cancelled with the
.Xr untimeout 9F
function.
There is also a STREAMS-specific version,
.Xr qtimeout 9F ,
that can be used when circumstances require it.
.Pp
These are all considered one-shot events.
That is, they will only happen once after being scheduled.
If instead, a driver requires periodic behavior, such as needing
something to occur every second, then it should use the
.Xr ddi_periodic_add 9F
function to establish that.
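.Pp
Since several of these interfaces take their delay in ticks, the
relationship between microseconds and ticks mentioned earlier in this
section is worth sketching.
The following userland model is illustrative only: the
.Fn sketch_usectohz
and
.Fn sketch_hztousec
helpers are hypothetical, the tick rate is passed explicitly, the rate
is assumed to divide one million evenly, and rounding partial ticks up
is an assumption of the sketch rather than a documented guarantee.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MICROSEC 1000000ULL

/*
 * Model of converting microseconds to ticks for a given tick rate.
 * A partial tick is rounded up so that a non-zero delay never becomes
 * zero ticks. The rate must divide one million evenly (for example,
 * 100 or 1000).
 */
static uint64_t
sketch_usectohz(uint64_t usec, uint64_t hz)
{
    uint64_t usec_per_tick;

    assert(hz > 0 && SKETCH_MICROSEC % hz == 0);
    usec_per_tick = SKETCH_MICROSEC / hz;
    return (usec / usec_per_tick + (usec % usec_per_tick != 0));
}

/* Model of converting ticks back to microseconds. */
static uint64_t
sketch_hztousec(uint64_t ticks, uint64_t hz)
{
    assert(hz > 0 && SKETCH_MICROSEC % hz == 0);
    return (ticks * (SKETCH_MICROSEC / hz));
}
```

At a rate of 100, one tick is 10000 microseconds, so a request for 10001
microseconds becomes two ticks in this model; tick-based delays are
therefore much coarser than the high-resolution clock.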
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr delay 9F Ta Xr ddi_get_lbolt 9F
.It Xr ddi_get_lbolt64 9F Ta Xr ddi_get_time 9F
.It Xr ddi_periodic_add 9F Ta Xr ddi_periodic_delete 9F
.It Xr drv_hztousec 9F Ta Xr drv_usectohz 9F
.It Xr drv_usecwait 9F Ta Xr gethrtime 9F
.It Xr qtimeout 9F Ta Xr quntimeout 9F
.It Xr timeout 9F Ta Xr untimeout 9F
.El
.Ss Task Queues
A task queue provides an asynchronous processing mechanism that can be
used by drivers and the broader system.
A task queue can be created with
.Xr ddi_taskq_create 9F
and sized with a given number of threads and a relative priority of those
threads.
Once created, tasks can be dispatched to the queue with
.Xr ddi_taskq_dispatch 9F .
The different functions and arguments dispatched do not need to be the
same and can vary from invocation to invocation.
However, it is the caller's responsibility to ensure that any referenced
memory is valid until the task queue is done processing it.
It is possible to create a barrier for a task queue by using the
.Xr ddi_taskq_wait 9F
function.
.Pp
While task queues are a flexible mechanism for handling and processing
events that occur in a well defined context, they do not have an
inherent backpressure mechanism built in.
This means it is possible to add events to a task queue faster than they
can be processed.
For high-volume events, this must be considered before just dispatching
an event.
Do not rely on a non-sleeping allocation in the task queue dispatch
context.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_taskq_create 9F Ta Xr ddi_taskq_destroy 9F
.It Xr ddi_taskq_dispatch 9F Ta Xr ddi_taskq_resume 9F
.It Xr ddi_taskq_suspend 9F Ta Xr ddi_taskq_suspended 9F
.It Xr ddi_taskq_wait 9F Ta
.El
.Ss Credential Management and Privileges
Not everything in the system has the same power to impact it.
To determine the permissions and context of a caller, the
.Vt cred_t
data structure encapsulates a number of different things including the
traditional user and group IDs, but also the zone that one is operating
in the context of and the associated privileges that the caller has.
While this concept is more often thought of due to userland processes being
associated with specific users, these same principles apply to different
threads in the kernel.
Not all kernel threads are allowed to indiscriminately do what they
want; they can be constrained by the same privilege model that processes
are, which is discussed in
.Xr privileges 7 .
.Pp
Most operations that device drivers implement are given a credential.
However, from within the kernel, a credential can be obtained that
refers to a specific zone, the current process, or a generic kernel
credential.
.Pp
It is up to drivers and the kernel writ large to check whether a given
credential is authorized to perform a given operation.
This is encapsulated by the various privilege checks that exist.
The most common check used is
.Xr drv_priv 9F
which checks for
.Dv PRIV_SYS_DEVICES .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr CRED 9F Ta Xr crdup 9F
.It Xr crfree 9F Ta Xr crget 9F
.It Xr crgetgid 9F Ta Xr crgetgroups 9F
.It Xr crgetngroups 9F Ta Xr crgetrgid 9F
.It Xr crgetruid 9F Ta Xr crgetsgid 9F
.It Xr crgetsuid 9F Ta Xr crgetuid 9F
.It Xr crgetzoneid 9F Ta Xr crhold 9F
.It Xr ddi_get_cred 9F Ta Xr drv_priv 9F
.It Xr kcred 9F Ta Xr priv_getbyname 9F
.It Xr priv_policy_choice 9F Ta Xr priv_policy_only 9F
.It Xr priv_policy 9F Ta Xr zone_kcred 9F
.El
.Ss Device ID Management
Device IDs are a means of establishing a unique ID for a device in the
kernel.
These unique IDs are generally tied to something from the device's
hardware such as a serial number or similar, but can also be fabricated
and stored on the device.
These device IDs are used by other subsystems like ZFS to record
information about a device as the actual
.Pa /devices
path that a device resides at may change because it is moved around in
the system.
.Pp
Device drivers, particularly those that represent block devices,
should first call
.Xr ddi_devid_init 9F
to initialize the device ID data structure.
After that is done, it is then safe to call
.Xr ddi_devid_register 9F
to notify the kernel about the ID.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_devid_compare 9F Ta Xr ddi_devid_free 9F
.It Xr ddi_devid_get 9F Ta Xr ddi_devid_init 9F
.It Xr ddi_devid_register 9F Ta Xr ddi_devid_sizeof 9F
.It Xr ddi_devid_str_decode 9F Ta Xr ddi_devid_str_encode 9F
.It Xr ddi_devid_str_free 9F Ta Xr ddi_devid_unregister 9F
.It Xr ddi_devid_valid 9F Ta
.El
.Ss Message Block Functions
The
.Vt "mblk_t"
data structure is used to chain together messages which are used throughout
the kernel for different subsystems including all of networking,
terminals, STREAMS, USB, and more.
.Pp
Message blocks are chained together by a series of two different
pointers:
.Fa b_cont
and
.Fa b_next .
When a message is split across multiple data buffers, they are linked by
the
.Fa b_cont
pointer.
However, multiple distinct messages can be chained together and linked
by the
.Fa b_next
pointer.
Let's look at this in the context of a series of networking packets.
If we had a chain of, say, 10 UDP packets that we were given, each UDP
packet is considered an independent message and would be linked from one
to the next based on the order they should be transmitted with the
.Fa b_next
pointer.
However, an individual message may be entirely in one message block, in
which case its
.Fa b_cont
pointer would be
.Dv NULL ,
but if, say, the packet were split into a 100 byte data buffer that
contained the headers and then a 1000 byte data buffer that contained
the actual packet data, those two would be linked together by
.Fa b_cont .
A continued message would never have its next pointer used to link it to
a wholly different message.
Visually you might see this as:
.Bd -literal
 +---------------+
 | UDP Message 0 |
 | Bytes 0-1100  |
 | b_cont     ---+--> NULL
 | b_next  +     |
 +---------|-----+
           |
           v
 +---------------+    +----------------+
 | UDP Message 1 |    | UDP Message 1+ |
 | Bytes 0-100   |    | Bytes 100-1100 |
 | b_cont     ---+--> | b_cont     ----+->NULL
 | b_next  +     |    | b_next     ----+->NULL
 +---------|-----+    +----------------+
           |
          ...
           |
           v
 +---------------+
 | UDP Message 9 |
 | Bytes 0-1100  |
 | b_cont     ---+--> NULL
 | b_next     ---+--> NULL
 +---------------+
.Ed
.Pp
Message blocks all have an associated data block which contains the
actual data that is present.
Multiple message blocks can share the same data block as well.
The data block has a notion of a type, which is generally
.Dv M_DATA
which signifies that they operate on data.
.Pp
To allocate message blocks, one generally uses the
.Xr allocb 9F
function to create one; however, you can also create message blocks
using your own source of data through functions like
.Xr desballoc 9F .
This is generally used when one wants to use memory that was originally
used for DMA to pass data back into the kernel, such as in a networking
device driver.
When this happens, a callback function will be called once the last user
of the data block is done with it.
.Pp
The functions listed below often end in either
.Dq msg
or
.Dq b
to indicate, respectively, that they operate on an entire message and
follow the
.Fa b_cont
pointer, or that they operate on a single message block and do not.
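.Pp
To make the difference between
.Fa b_cont
and
.Fa b_next
concrete, here is a small userland model of the two kinds of traversal.
The
.Vt model_mblk_t
type is a hypothetical stand-in that carries only the fields discussed
in this section; it is not the kernel's
.Vt mblk_t ,
and the
.Sy model_
functions are illustrations rather than DDI routines.
.Bd -literal -offset indent
```c
#include <stddef.h>

/* Hypothetical stand-in for mblk_t with only the fields discussed here. */
typedef struct model_mblk {
	struct model_mblk *b_next;	/* next message in the chain */
	struct model_mblk *b_cont;	/* next block of the same message */
	unsigned char *b_rptr;		/* first valid byte of data */
	unsigned char *b_wptr;		/* one past the last valid byte */
} model_mblk_t;

/* "b"-style: length of one block only; b_cont is not followed. */
static size_t
model_blen(const model_mblk_t *mp)
{
	return ((size_t)(mp->b_wptr - mp->b_rptr));
}

/* "msg"-style: length of a whole message, following b_cont. */
static size_t
model_msglen(const model_mblk_t *mp)
{
	size_t len = 0;

	for (; mp != NULL; mp = mp->b_cont)
		len += model_blen(mp);
	return (len);
}

/* Count the distinct messages in a chain, following b_next. */
static size_t
model_nmsgs(const model_mblk_t *mp)
{
	size_t n = 0;

	for (; mp != NULL; mp = mp->b_next)
		n++;
	return (n);
}
```
.Ed
.Pp
With the split packet from the example, the block-level length of the
header block is 100 bytes while the message-level length is 1100 bytes.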
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr adjmsg 9F Ta Xr allocb 9F
.It Xr copyb 9F Ta Xr copymsg 9F
.It Xr datamsg 9F Ta Xr desballoc 9F
.It Xr desballoca 9F Ta Xr dupb 9F
.It Xr dupmsg 9F Ta Xr esballoc 9F
.It Xr esballoca 9F Ta Xr freeb 9F
.It Xr freemsg 9F Ta Xr linkb 9F
.It Xr mcopymsg 9F Ta Xr msgdsize 9F
.It Xr msgpullup 9F Ta Xr msgsize 9F
.It Xr pullupmsg 9F Ta Xr rmvb 9F
.It Xr testb 9F Ta Xr unlinkb 9F
.El
.Ss Upgradable Firmware Modules
The UFM
.Pq Upgradable Firmware Module
subsystem is used to grant the system observability into firmware that
exists persistently on a device.
These functions are intended for use by drivers that are participating in
the kernel's UFM framework, which is discussed in
.Xr ddi_ufm 9E .
.Pp
The
.Xr ddi_ufm_init 9F
and
.Xr ddi_ufm_fini 9F
functions are used to indicate support of the subsystem to the kernel.
The driver is required to use the
.Xr ddi_ufm_update 9F
function to indicate both that it is ready to receive UFM requests and
to indicate that any data that the kernel may have previously received
has changed.
Once that's completed, then the other functions listed here are
generally used as part of implementing specific callback functions that
are registered.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_ufm_fini 9F Ta Xr ddi_ufm_image_set_desc 9F
.It Xr ddi_ufm_image_set_misc 9F Ta Xr ddi_ufm_image_set_nslots 9F
.It Xr ddi_ufm_init 9F Ta Xr ddi_ufm_slot_set_attrs 9F
.It Xr ddi_ufm_slot_set_imgsize 9F Ta Xr ddi_ufm_slot_set_misc 9F
.It Xr ddi_ufm_slot_set_version 9F Ta Xr ddi_ufm_update 9F
.El
.Ss Firmware Loading
Some hardware devices have firmware that is not stored as part of the
device itself and must instead be sent to the device each time it is
powered on.
These routines help drivers that need to do this read such data from
well-known locations in the operating system's file system.
To begin with, a driver should call
.Xr firmware_open 9F
to open a handle to the firmware file.
At that point, one can determine the size of the file with the
.Xr firmware_get_size 9F
function and allocate an appropriately sized memory buffer to read it in.
Callers should always check what the size of the returned file is and
should not just blindly pass that size off to the kernel memory
allocator.
For example, if a file was over 100 MiB in size, then one should not
assume that they're going to just blindly allocate 100 MiB of kernel
memory and should instead perform incremental reads and sends to a
device that are smaller in size.
.Pp
A driver can then go through and perform arbitrary reads of the firmware
file through the
.Xr firmware_read 9F
interface until they have read everything that they need.
Once complete, the corresponding handle needs to be released through the
.Xr firmware_close 9F
function.
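.Pp
The advice above about incremental reads can be sketched as a simple
chunking loop.
This is a userland model only: the
.Sy model_
names, the callback type, and the in-memory image are all hypothetical.
In a driver, each piece would instead be filled by a bounded
.Xr firmware_read 9F
call into a fixed-size buffer before being sent to the device.
.Bd -literal -offset indent
```c
#include <stddef.h>

/* Hypothetical callback that hands one chunk to the device. */
typedef int (*model_send_f)(const unsigned char *buf, size_t len, void *arg);

/* Example sink: records how many chunks and bytes were sent. */
typedef struct model_fw_stats {
	size_t mfs_chunks;
	size_t mfs_bytes;
} model_fw_stats_t;

static int
model_count_send(const unsigned char *buf, size_t len, void *arg)
{
	model_fw_stats_t *st = arg;

	(void) buf;
	st->mfs_chunks++;
	st->mfs_bytes += len;
	return (0);
}

/*
 * Walk an image of size bytes in pieces of at most chunk bytes.
 * Returns 0 on success, or -1 on a bad argument or the first failed send.
 */
static int
model_fw_stream(const unsigned char *image, size_t size, size_t chunk,
    model_send_f send, void *arg)
{
	size_t off = 0;

	if (chunk == 0 || send == NULL)
		return (-1);
	while (off < size) {
		size_t len = size - off;

		if (len > chunk)
			len = chunk;
		if (send(image + off, len, arg) != 0)
			return (-1);
		off += len;
	}
	return (0);
}
```
.Ed
.Pp
The memory needed at any one time is bounded by the chunk size rather
than by the size of the firmware file.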
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr firmware_close 9F Ta Xr firmware_get_size 9F
.It Xr firmware_open 9F Ta Xr firmware_read 9F
.El
.Ss Fault Management Handling
These functions allow device drivers to harden themselves against errors
that might occur while interfacing with devices and tie into the broader
fault management architecture.
.Pp
To begin, a driver must declare which capabilities it implements during
its
.Xr attach 9E
function by calling
.Xr ddi_fm_init 9F .
The set of capabilities it receives back may be less than what was
requested because the capabilities are dependent on the overall chain of
drivers present.
.Pp
If
.Dv DDI_FM_EREPORT_CAPABLE
was negotiated, then the driver is expected to generate error events
when certain conditions occur using the
.Xr ddi_fm_ereport_post 9F
function or the more specific
.Xr pci_ereport_post 9F
function.
If a caller has negotiated
.Dv DDI_FM_ACCCHK_CAPABLE ,
then it is allowed to set up its register attributes to indicate that it
will check for errors on the register handle after using functions like
.Xr ddi_get8 9F
and
.Xr ddi_set8 9F
by calling
.Xr ddi_fm_acc_err_get 9F
and reacting accordingly.
Similarly, if a driver has negotiated
.Dv DDI_FM_DMACHK_CAPABLE ,
then it will use
.Xr ddi_check_dma_handle 9F
to check the results of DMA activity and handle the results
appropriately.
Similar to register accesses, the DMA attributes must be updated to
indicate that error handling is anticipated on this handle.
The
.Xr ddi_fm_init 9F
manual page has an overview of the other types of flags that can be
negotiated and how they are used.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_check_acc_handle 9F Ta Xr ddi_check_dma_handle 9F
.It Xr ddi_dev_report_fault 9F Ta Xr ddi_fm_acc_err_clear 9F
.It Xr ddi_fm_acc_err_get 9F Ta Xr ddi_fm_capable 9F
.It Xr ddi_fm_dma_err_clear 9F Ta Xr ddi_fm_dma_err_get 9F
.It Xr ddi_fm_ereport_post 9F Ta Xr ddi_fm_fini 9F
.It Xr ddi_fm_handler_register 9F Ta Xr ddi_fm_handler_unregister 9F
.It Xr ddi_fm_init 9F Ta Xr ddi_fm_service_impact 9F
.It Xr pci_ereport_post 9F Ta Xr pci_ereport_setup 9F
.It Xr pci_ereport_teardown 9F Ta
.El
.Ss SCSI and SAS Device Driver Functions
These functions are for use by SCSI and SAS device drivers that leverage
the kernel's frameworks.
Other device drivers should not use these.
For more background on these, some of the general concepts are discussed
in
.Xr iport 9 ,
.Xr phymap 9 ,
and
.Xr tgtmap 9 .
.Pp
Device drivers register initially with the kernel by using the
.Xr scsi_hba_init 9F
function and then, in their attach routine, register specific instances,
using functions like
.Xr scsi_hba_iport_register 9F
or instead
.Xr scsi_hba_tran_alloc 9F
and
.Xr scsi_hba_attach_setup 9F .
New drivers are encouraged to use the target map and iports framework to
simplify the device driver writing process.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr makecom_g0_s 9F Ta Xr makecom_g0 9F
.It Xr makecom_g1 9F Ta Xr makecom_g5 9F
.It Xr makecom 9F Ta Xr sas_phymap_create 9F
.It Xr sas_phymap_destroy 9F Ta Xr sas_phymap_lookup_ua 9F
.It Xr sas_phymap_lookup_uapriv 9F Ta Xr sas_phymap_phy_add 9F
.It Xr sas_phymap_phy_rem 9F Ta Xr sas_phymap_phy2ua 9F
.It Xr sas_phymap_phys_free 9F Ta Xr sas_phymap_phys_next 9F
.It Xr sas_phymap_ua_free 9F Ta Xr sas_phymap_ua2phys 9F
.It Xr sas_phymap_uahasphys 9F Ta Xr scsi_abort 9F
.It Xr scsi_address_device 9F Ta Xr scsi_alloc_consistent_buf 9F
.It Xr scsi_cname 9F Ta Xr scsi_destroy_pkt 9F
.It Xr scsi_device_hba_private_get 9F Ta Xr scsi_device_hba_private_set 9F
.It Xr scsi_device_unit_address 9F Ta Xr scsi_dmafree 9F
.It Xr scsi_dmaget 9F Ta Xr scsi_dname 9F
.It Xr scsi_errmsg 9F Ta Xr scsi_ext_sense_fields 9F
.It Xr scsi_find_sense_descr 9F Ta Xr scsi_free_consistent_buf 9F
.It Xr scsi_free_wwnstr 9F Ta Xr scsi_get_device_type_scsi_options 9F
.It Xr scsi_get_device_type_string 9F Ta Xr scsi_hba_attach_setup 9F
.It Xr scsi_hba_detach 9F Ta Xr scsi_hba_fini 9F
.It Xr scsi_hba_init 9F Ta Xr scsi_hba_iport_exist 9F
.It Xr scsi_hba_iport_find 9F Ta Xr scsi_hba_iport_register 9F
.It Xr scsi_hba_iport_unit_address 9F Ta Xr scsi_hba_iportmap_create 9F
.It Xr scsi_hba_iportmap_destroy 9F Ta Xr scsi_hba_iportmap_iport_add 9F
.It Xr scsi_hba_iportmap_iport_remove 9F Ta Xr scsi_hba_lookup_capstr 9F
.It Xr scsi_hba_pkt_alloc 9F Ta Xr scsi_hba_pkt_comp 9F
.It Xr scsi_hba_pkt_free 9F Ta Xr scsi_hba_probe 9F
.It Xr scsi_hba_tgtmap_create 9F Ta Xr scsi_hba_tgtmap_destroy 9F
.It Xr scsi_hba_tgtmap_scan_luns 9F Ta Xr scsi_hba_tgtmap_set_add 9F
.It Xr scsi_hba_tgtmap_set_begin 9F Ta Xr scsi_hba_tgtmap_set_end 9F
.It Xr scsi_hba_tgtmap_set_flush 9F Ta Xr scsi_hba_tgtmap_tgt_add 9F
.It Xr scsi_hba_tgtmap_tgt_remove 9F Ta Xr scsi_hba_tran_alloc 9F
.It Xr scsi_hba_tran_free 9F Ta Xr scsi_ifgetcap 9F
.It Xr scsi_ifsetcap 9F Ta Xr scsi_init_pkt 9F
.It Xr scsi_log 9F Ta Xr scsi_mname 9F
.It Xr scsi_pktalloc 9F Ta Xr scsi_pktfree 9F
.It Xr scsi_poll 9F Ta Xr scsi_probe 9F
.It Xr scsi_resalloc 9F Ta Xr scsi_reset_notify 9F
.It Xr scsi_reset 9F Ta Xr scsi_resfree 9F
.It Xr scsi_rname 9F Ta Xr scsi_sense_asc 9F
.It Xr scsi_sense_ascq 9F Ta Xr scsi_sense_cmdspecific_uint64 9F
.It Xr scsi_sense_info_uint64 9F Ta Xr scsi_sense_key 9F
.It Xr scsi_setup_cdb 9F Ta Xr scsi_slave 9F
.It Xr scsi_sname 9F Ta Xr scsi_sync_pkt 9F
.It Xr scsi_transport 9F Ta Xr scsi_unprobe 9F
.It Xr scsi_unslave 9F Ta Xr scsi_validate_sense 9F
.It Xr scsi_vu_errmsg 9F Ta Xr scsi_wwn_to_wwnstr 9F
.It Xr scsi_wwnstr_to_wwn 9F Ta
.El
.Ss Block Device Buffer Handling
Block devices operate with a data structure called the
.Vt struct buf
which is described in
.Xr buf 9S .
This structure is used to represent a given block request and is used
heavily in block devices, the SCSI/SAS framework, and the blkdev
framework.
The functions described here are used to manipulate these structures in
various ways such as copying them around, indicating error conditions,
or indicating when the I/O operation is done.
By default, this memory is not mapped into the kernel's address space so
several functions such as
.Xr bp_mapin 9F
are present to allow for that to happen when required.
.Pp
To initially obtain a
.Vt struct buf ,
drivers should begin by calling
.Xr getrbuf 9F
at which point, the caller can fill in the structure.
Once that's done, the
.Xr physio 9F
function can be used to actually perform the I/O and wait until it's
complete.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bioclone 9F Ta Xr biodone 9F
.It Xr bioerror 9F Ta Xr biofini 9F
.It Xr bioinit 9F Ta Xr biomodified 9F
.It Xr bioreset 9F Ta Xr biosize 9F
.It Xr biowait 9F Ta Xr bp_mapin 9F
.It Xr bp_mapout 9F Ta Xr clrbuf 9F
.It Xr disksort 9F Ta Xr freerbuf 9F
.It Xr geterror 9F Ta Xr getrbuf 9F
.It Xr minphys 9F Ta Xr physio 9F
.El
.Ss Networking Device Driver Functions
These functions are for networking device drivers that implement the MAC
(GLDv3) interfaces.
The full framework and how to use it is described in
.Xr mac 9E .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr mac_alloc 9F Ta Xr mac_fini_ops 9F
.It Xr mac_free 9F Ta Xr mac_hcksum_get 9F
.It Xr mac_hcksum_set 9F Ta Xr mac_init_ops 9F
.It Xr mac_link_update 9F Ta Xr mac_lso_get 9F
.It Xr mac_maxsdu_update 9F Ta Xr mac_prop_info_set_default_fec 9F
.It Xr mac_prop_info_set_default_link_flowctrl 9F Ta Xr mac_prop_info_set_default_str 9F
.It Xr mac_prop_info_set_default_uint32 9F Ta Xr mac_prop_info_set_default_uint64 9F
.It Xr mac_prop_info_set_default_uint8 9F Ta Xr mac_prop_info_set_perm 9F
.It Xr mac_prop_info_set_range_uint32 9F Ta Xr mac_prop_info 9F
.It Xr mac_register 9F Ta Xr mac_ring_rx 9F
.It Xr mac_rx 9F Ta Xr mac_transceiver_info_set_present 9F
.It Xr mac_transceiver_info_set_usable 9F Ta Xr mac_transceiver_info 9F
.It Xr mac_tx_ring_update 9F Ta Xr mac_tx_update 9F
.It Xr mac_unregister 9F Ta
.El
.Ss USB Device Driver Functions
These functions are designed for USB device drivers.
To first initialize with the kernel, a device driver must call
.Xr usb_client_attach 9F
and then
.Xr usb_get_dev_data 9F .
The latter call is required to get access to the USB-level
descriptors about the device which describe what kinds of USB endpoints
.Pq control, bulk, interrupt, or isochronous
exist on the device as well as how many different interfaces and
configurations are present.
.Pp
Once a given configuration, sometimes the default, is selected, then the
driver can proceed to opening up what the USB architecture calls a pipe,
which provides a way to send requests to a specific USB endpoint.
First, specific endpoints can be looked up using the
.Xr usb_lookup_ep_data 9F
function which gets information from the parsed descriptors and then
that gets filled into an extended descriptor with
.Xr usb_ep_xdescr_fill 9F .
With that in hand, a pipe can be opened with
.Xr usb_pipe_xopen 9F .
.Pp
Once a pipe has been opened, which most often happens in a driver's
.Xr attach 9E
entry point, then requests can be allocated and submitted.
There is a different allocation for each type of request
.Po
e.g.
.Xr usb_alloc_bulk_req 9F
.Pc
and a different submission function for each type as well.
Each request structure has a corresponding page in section 9S that
describes the structure, its members, and how to work with it.
.Pp
One other major concern for USB devices, which isn't as common with
other types of devices, is that they can be yanked out and reinserted
at any time.
To help determine when this happens, the kernel offers the
.Xr usb_register_event_cbs 9F
function which allows a driver to register for callbacks when a device
is disconnected, reconnected, or around checkpoint suspend/resume
behavior.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr usb_alloc_bulk_req 9F Ta Xr usb_alloc_ctrl_req 9F
.It Xr usb_alloc_intr_req 9F Ta Xr usb_alloc_isoc_req 9F
.It Xr usb_alloc_request 9F Ta Xr usb_client_attach 9F
.It Xr usb_client_detach 9F Ta Xr usb_clr_feature 9F
.It Xr usb_create_pm_components 9F Ta Xr usb_ep_xdescr_fill 9F
.It Xr usb_free_bulk_req 9F Ta Xr usb_free_ctrl_req 9F
.It Xr usb_free_descr_tree 9F Ta Xr usb_free_dev_data 9F
.It Xr usb_free_intr_req 9F Ta Xr usb_free_isoc_req 9F
.It Xr usb_get_addr 9F Ta Xr usb_get_alt_if 9F
.It Xr usb_get_cfg 9F Ta Xr usb_get_current_frame_number 9F
.It Xr usb_get_dev_data 9F Ta Xr usb_get_if_number 9F
.It Xr usb_get_max_pkts_per_isoc_request 9F Ta Xr usb_get_status 9F
.It Xr usb_get_string_descr 9F Ta Xr usb_handle_remote_wakeup 9F
.It Xr usb_lookup_ep_data 9F Ta Xr usb_owns_device 9F
.It Xr usb_parse_data 9F Ta Xr usb_pipe_bulk_xfer 9F
.It Xr usb_pipe_close 9F Ta Xr usb_pipe_ctrl_xfer_wait 9F
.It Xr usb_pipe_ctrl_xfer 9F Ta Xr usb_pipe_drain_reqs 9F
.It Xr usb_pipe_get_max_bulk_transfer_size 9F Ta Xr usb_pipe_get_private 9F
.It Xr usb_pipe_get_state 9F Ta Xr usb_pipe_intr_xfer 9F
.It Xr usb_pipe_isoc_xfer 9F Ta Xr usb_pipe_open 9F
.It Xr usb_pipe_reset 9F Ta Xr usb_pipe_set_private 9F
.It Xr usb_pipe_stop_intr_polling 9F Ta Xr usb_pipe_stop_isoc_polling 9F
.It Xr usb_pipe_xopen 9F Ta Xr usb_print_descr_tree 9F
.It Xr usb_register_hotplug_cbs 9F Ta Xr usb_reset_device 9F
.It Xr usb_set_alt_if 9F Ta Xr usb_set_cfg 9F
.It Xr usb_unregister_hotplug_cbs 9F Ta
.El
.Ss PCI Device Driver Functions
These functions are specific to PCI and PCI Express based device
drivers and are intended to be used to get access to PCI configuration
space.
For normal PCI base address registers
.Pq BARs
instead see
.Sx Register Setup and Access .
.Pp
To access PCI configuration space, a device driver should first call
.Xr pci_config_setup 9F .
Generally, drivers will call this in their
.Xr attach 9E
entry point and then tear down the configuration space access with the
.Xr pci_config_teardown 9F
entry point in
.Xr detach 9E .
After setting up access to configuration space, the returned handle can
be used in all of the various configuration space routines to get and
set specific sized values in configuration space.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr pci_config_get8 9F Ta Xr pci_config_get16 9F
.It Xr pci_config_get32 9F Ta Xr pci_config_get64 9F
.It Xr pci_config_put8 9F Ta Xr pci_config_put16 9F
.It Xr pci_config_put32 9F Ta Xr pci_config_put64 9F
.It Xr pci_config_setup 9F Ta Xr pci_config_teardown 9F
.It Xr pci_report_pmcap 9F Ta Xr pci_restore_config_regs 9F
.It Xr pci_save_config_regs 9F Ta
.El
.Ss USB Host Controller Interface Functions
These routines are used for device drivers which implement the USB
host controller interfaces described in
.Xr usba_hcdi 9E .
Other types of device drivers and modules should not call these
functions.
In particular, if one is writing a device driver for a USB device, these
are not the routines you're looking for and you want to see
.Sx USB Device Driver Functions .
These are what the
.Xr ehci 4D
or
.Xr xhci 4D
drivers use to provide services that USB drivers use via the kernel USB
architecture.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr usba_alloc_hcdi_ops 9F Ta Xr usba_free_hcdi_ops 9F
.It Xr usba_hcdi_cb 9F Ta Xr usba_hcdi_dup_intr_req 9F
.It Xr usba_hcdi_dup_isoc_req 9F Ta Xr usba_hcdi_get_device_private 9F
.It Xr usba_hcdi_register 9F Ta Xr usba_hcdi_unregister 9F
.It Xr usba_hubdi_bind_root_hub 9F Ta Xr usba_hubdi_cb_ops 9F
.It Xr usba_hubdi_close 9F Ta Xr usba_hubdi_dev_ops 9F
.It Xr usba_hubdi_ioctl 9F Ta Xr usba_hubdi_open 9F
.It Xr usba_hubdi_root_hub_power 9F Ta Xr usba_hubdi_unbind_root_hub 9F
.El
.Ss Functions for PCMCIA Drivers
These functions exist for older PCMCIA device drivers.
These should not otherwise be used by the system.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr csx_AccessConfigurationRegister 9F Ta Xr csx_ConvertSize 9F
.It Xr csx_ConvertSpeed 9F Ta Xr csx_CS_DDI_Info 9F
.It Xr csx_DeregisterClient 9F Ta Xr csx_DupHandle 9F
.It Xr csx_Error2Text 9F Ta Xr csx_Event2Text 9F
.It Xr csx_FreeHandle 9F Ta Xr csx_Get16 9F
.It Xr csx_Get32 9F Ta Xr csx_Get64 9F
.It Xr csx_Get8 9F Ta Xr csx_GetEventMask 9F
.It Xr csx_GetFirstClient 9F Ta Xr csx_GetFirstTuple 9F
.It Xr csx_GetHandleOffset 9F Ta Xr csx_GetMappedAddr 9F
.It Xr csx_GetNextClient 9F Ta Xr csx_GetNextTuple 9F
.It Xr csx_GetStatus 9F Ta Xr csx_GetTupleData 9F
.It Xr csx_MakeDeviceNode 9F Ta Xr csx_MapLogSocket 9F
.It Xr csx_MapMemPage 9F Ta Xr csx_ModifyConfiguration 9F
.It Xr csx_ModifyWindow 9F Ta Xr csx_Parse_CISTPL_BATTERY 9F
.It Xr csx_Parse_CISTPL_BYTEORDER 9F Ta Xr csx_Parse_CISTPL_CFTABLE_ENTRY 9F
.It Xr csx_Parse_CISTPL_CONFIG 9F Ta Xr csx_Parse_CISTPL_DATE 9F
.It Xr csx_Parse_CISTPL_DEVICE_A 9F Ta Xr csx_Parse_CISTPL_DEVICE_OA 9F
.It Xr csx_Parse_CISTPL_DEVICE_OC 9F Ta Xr csx_Parse_CISTPL_DEVICE 9F
.It Xr csx_Parse_CISTPL_DEVICEGEO_A 9F Ta Xr csx_Parse_CISTPL_DEVICEGEO 9F
.It Xr csx_Parse_CISTPL_FORMAT 9F Ta Xr csx_Parse_CISTPL_FUNCE 9F
.It Xr csx_Parse_CISTPL_FUNCID 9F Ta Xr csx_Parse_CISTPL_GEOMETRY 9F
.It Xr csx_Parse_CISTPL_JEDEC_A 9F Ta Xr csx_Parse_CISTPL_JEDEC_C 9F
.It Xr csx_Parse_CISTPL_LINKTARGET 9F Ta Xr csx_Parse_CISTPL_LONGLINK_A 9F
.It Xr csx_Parse_CISTPL_LONGLINK_C 9F Ta Xr csx_Parse_CISTPL_LONGLINK_MFC 9F
.It Xr csx_Parse_CISTPL_MANFID 9F Ta Xr csx_Parse_CISTPL_ORG 9F
.It Xr csx_Parse_CISTPL_SPCL 9F Ta Xr csx_Parse_CISTPL_SWIL 9F
.It Xr csx_Parse_CISTPL_VERS_1 9F Ta Xr csx_Parse_CISTPL_VERS_2 9F
.It Xr csx_ParseTuple 9F Ta Xr csx_Put16 9F
.It Xr csx_Put32 9F Ta Xr csx_Put64 9F
.It Xr csx_Put8 9F Ta Xr csx_RegisterClient 9F
.It Xr csx_ReleaseConfiguration 9F Ta Xr csx_ReleaseIO 9F
.It Xr csx_ReleaseIRQ 9F Ta Xr csx_ReleaseSocketMask 9F
.It Xr csx_ReleaseWindow 9F Ta Xr csx_RemoveDeviceNode 9F
.It Xr csx_RepGet16 9F Ta Xr csx_RepGet32 9F
.It Xr csx_RepGet64 9F Ta Xr csx_RepGet8 9F
.It Xr csx_RepPut16 9F Ta Xr csx_RepPut32 9F
.It Xr csx_RepPut64 9F Ta Xr csx_RepPut8 9F
.It Xr csx_RequestConfiguration 9F Ta Xr csx_RequestIO 9F
.It Xr csx_RequestIRQ 9F Ta Xr csx_RequestSocketMask 9F
.It Xr csx_RequestWindow 9F Ta Xr csx_ResetFunction 9F
.It Xr csx_SetEventMask 9F Ta Xr csx_SetHandleOffset 9F
.It Xr csx_ValidateCIS 9F Ta
.El
.Ss STREAMS related functions
These functions are meant to be used when interacting with STREAMS
devices or when implementing one.
When a STREAMS driver is opened, it receives messages on a queue which
are then processed and can be sent back.
As different queues are often linked together, the most common thing is
to process a message and then pass the message onto the next queue using
the
.Xr putnext 9F
function.
.Pp
STREAMS messages are passed around using message blocks, which use the
.Vt mblk_t
type.
See
.Sx Message Block Functions
for more about the data structure and the functions that manipulate
message blocks.
.Pp
These functions should generally not be used when implementing a
networking device driver today.
See
.Xr mac 9E
instead.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr backq 9F Ta Xr bcanput 9F
.It Xr bcanputnext 9F Ta Xr canput 9F
.It Xr canputnext 9F Ta Xr enableok 9F
.It Xr flushband 9F Ta Xr flushq 9F
.It Xr freezestr 9F Ta Xr getq 9F
.It Xr insq 9F Ta Xr merror 9F
.It Xr mexchange 9F Ta Xr noenable 9F
.It Xr put 9F Ta Xr putbq 9F
.It Xr putctl 9F Ta Xr putctl1 9F
.It Xr putnext 9F Ta Xr putnextctl 9F
.It Xr putnextctl1 9F Ta Xr putq 9F
.It Xr mt-streams 9F Ta Xr qassociate 9F
.It Xr qenable 9F Ta Xr qprocsoff 9F
.It Xr qprocson 9F Ta Xr qreply 9F
.It Xr qsize 9F Ta Xr qwait_sig 9F
.It Xr qwait 9F Ta Xr qwriter 9F
.It Xr OTHERQ 9F Ta Xr RD 9F
.It Xr rmvq 9F Ta Xr SAMESTR 9F
.It Xr unfreezestr 9F Ta Xr WR 9F
.El
.Ss STREAMS ioctls
The following functions are used when a STREAMS-based device driver is
processing its
.Xr ioctl 9E
entry point.
Unlike with character and block devices, STREAMS ioctls are passed around
in message blocks, and copying data in and out of userland works
differently because STREAMS ioctls are generally processed in
.Sy kernel
context.
This means that the normal functions like
.Xr ddi_copyin 9F
and
.Xr ddi_copyout 9F
cannot be used.
Instead, when a message block has a type of
.Dv M_IOCTL ,
then these routines can often be used to convert the structure into one
that asks for data to be copied in, copied out, or to finally
acknowledge the ioctl as successful or to terminate the processing in
error.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr mcopyin 9F Ta Xr mcopyout 9F
.It Xr mioc2ack 9F Ta Xr miocack 9F
.It Xr miocnak 9F Ta Xr miocpullup 9F
.It Xr mkiocb 9F Ta
.El
.Ss chpoll(9E) Related Functions
These functions are present in service of the
.Xr chpoll 9E
interface which is used to support the traditional
.Xr poll 2
and
.Xr select 3C
interfaces as well as event ports through the
.Xr port_get 3C
interface.
See
.Xr chpoll 9E
for the specific cases this should be called.
If a device driver does not implement the
.Xr chpoll 9E
character device entry point, then these functions should not be used.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr pollhead_clean 9F Ta Xr pollwakeup 9F
.El
.Ss Kernel Statistics
The kernel statistics or kstat framework provides an easy way of
exporting statistical information to be consumed outside of the kernel.
Users can interface with this data via
.Xr kstat 8
and the corresponding kstat library discussed in
.Xr kstat 3KSTAT .
.Pp
Kernel statistics are grouped using a tuple of four identifiers,
separated by colons when using
.Xr kstat 8 .
These are, in order, the statistic module name, instance, a name
which covers a group of statistics, and an individual name for a
statistic.
In addition, kernel statistics have a class which is used to group
similarly named groups of statistics together across devices.
When using
.Xr kstat_create 9F ,
drivers specify the first three parts of the tuple and the class.
The naming of individual statistics, the last part of the tuple, varies
based upon the type of the statistic.
For the most part, drivers will use the kstat type
.Dv KSTAT_TYPE_NAMED ,
which allows multiple name-value pairs to exist within the statistic.
For example, the kernel's layer 2 networking framework,
.Xr mac 9E ,
creates a kstat with the driver's name and instance and names it
.Dq mac .
Within this named group, there are statistics for all of the different
individual stats that the kernel and devices track such as bytes
transmitted and received, the state and speed of the link, and
advertised and enabled capabilities.
.Pp
A device driver can initialize a kstat with the
.Xr kstat_create 9F
function.
It will not be made accessible to users until the
.Xr kstat_install 9F
function is called.
The device driver must perform additional initialization of the kstat
before proceeding and calling
.Xr kstat_install 9F .
The kstat structure that drivers see is discussed in
.Xr kstat 9S .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kstat_create 9F Ta Xr kstat_delete 9F
.It Xr kstat_install 9F Ta Xr kstat_named_init 9F
.It Xr kstat_named_setstr 9F Ta Xr kstat_queue 9F
.It Xr kstat_runq_back_to_waitq 9F Ta Xr kstat_runq_enter 9F
.It Xr kstat_runq_exit 9F Ta Xr kstat_waitq_enter 9F
.It Xr kstat_waitq_exit 9F Ta Xr kstat_waitq_to_runq 9F
.El
.Ss NDI Events
These functions are used to allow a device driver to register for
certain events that might occur to its device or a parent in the tree
and receive a callback when they occur.
2233A good example of this is when a device has been removed from the system 2234such as someone just pulling out a USB device or NVMe U.2 device. 2235The event handlers work by first getting a cookie that names the type of 2236event with 2237.Xr ddi_get_eventcookie 9F 2238and then registering the callback with 2239.Xr ddi_add_event_handler 9F . 2240.Pp 2241The 2242.Xr ddi_cb_register 9F 2243function is used to register for other classes of events, such as when 2244participating in dynamic interrupt sharing. 2245.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2246.It Xr ddi_add_event_handler 9F Ta Xr ddi_cb_register 9F 2247.It Xr ddi_cb_unregister 9F Ta Xr ddi_get_eventcookie 9F 2248.It Xr ddi_remove_event_handler 9F Ta 2249.El 2250.Ss Layered Device Interfaces 2251The LDI 2252.Pq Layered Device Interface 2253provides a mechanism for a driver to open up another device in the 2254kernel and begin calling basic operations on the device as though the 2255calling driver were a normal user process. 2256Through the LDI, drivers can perform equivalents to the basic file 2257.Xr read 2 2258and 2259.Xr write 2 2260calls, look up properties on the device, perform networking style calls 2261such as 2262.Xr getmsg 2 2263and 2264.Xr putmsg 2 , 2265and register callbacks to be called when something happens to the 2266underlying device. 2267For example, the ZFS file system uses the LDI to open and operate on 2268block devices. 2269.Pp 2270Before opening a device itself, callers must obtain a notion of their 2271identity which is used when making subsequent calls. 2272The simplest form is often to use the device's 2273.Vt dev_info_t 2274and call 2275.Xr ldi_ident_from_dip 9F ; 2276however, there are also methods available based upon having a 2277.Vt dev_t 2278or a STREAMS 2279.Vt struct queue .
2280.Pp 2281Once that identity is established, there are several ways to open a 2282device such as 2283.Xr ldi_open_by_dev 9F , 2284.Xr ldi_open_by_devid 9F , 2285or 2286.Xr ldi_open_by_name 9F . 2287Once an LDI device has been opened, then all of the other functions may 2288be used to operate on the device; however, consumers of the LDI must 2289think carefully about what kind of device they are opening. 2290While a kernel pseudo-device driver cannot disappear while it is open, 2291when the device represents an actual piece of hardware, it is possible 2292for it to be physically removed and no longer be accessible. 2293Consumers should not assume that a layered device will always be 2294present. 2295.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2296.It Xr ldi_add_event_handler 9F Ta Xr ldi_aread 9F 2297.It Xr ldi_awrite 9F Ta Xr ldi_close 9F 2298.It Xr ldi_devmap 9F Ta Xr ldi_dump 9F 2299.It Xr ldi_ev_finalize 9F Ta Xr ldi_ev_get_cookie 9F 2300.It Xr ldi_ev_get_type 9F Ta Xr ldi_ev_notify 9F 2301.It Xr ldi_ev_register_callbacks 9F Ta Xr ldi_ev_remove_callbacks 9F 2302.It Xr ldi_get_dev 9F Ta Xr ldi_get_devid 9F 2303.It Xr ldi_get_eventcookie 9F Ta Xr ldi_get_minor_name 9F 2304.It Xr ldi_get_otyp 9F Ta Xr ldi_get_size 9F 2305.It Xr ldi_getmsg 9F Ta Xr ldi_ident_from_dev 9F 2306.It Xr ldi_ident_from_dip 9F Ta Xr ldi_ident_from_stream 9F 2307.It Xr ldi_ident_release 9F Ta Xr ldi_ioctl 9F 2308.It Xr ldi_open_by_dev 9F Ta Xr ldi_open_by_devid 9F 2309.It Xr ldi_open_by_name 9F Ta Xr ldi_poll 9F 2310.It Xr ldi_prop_exists 9F Ta Xr ldi_prop_get_int 9F 2311.It Xr ldi_prop_get_int64 9F Ta Xr ldi_prop_lookup_byte_array 9F 2312.It Xr ldi_prop_lookup_int_array 9F Ta Xr ldi_prop_lookup_int64_array 9F 2313.It Xr ldi_prop_lookup_string_array 9F Ta Xr ldi_prop_lookup_string 9F 2314.It Xr ldi_putmsg 9F Ta Xr ldi_read 9F 2315.It Xr ldi_remove_event_handler 9F Ta Xr ldi_strategy 9F 2316.It Xr ldi_write 9F Ta 2317.El 2318.Ss Signal 
Manipulation 2319These utility functions all relate to understanding whether or not a 2320process can receive a signal and actually delivering one to a process 2321from a driver. 2322This interface is specific to device drivers and should not be used by 2323the broader kernel. 2324These interfaces are not recommended and should only be used after 2325consultation. 2326.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2327.It Xr ddi_can_receive_sig 9F Ta Xr proc_ref 9F 2328.It Xr proc_signal 9F Ta Xr proc_unref 9F 2329.El 2330.Ss Getting at Surrounding Context 2331These functions allow a driver to better understand its current context. 2332For example, some drivers have to deal with providing polled I/O or take 2333special care as part of creating a kernel crash dump. 2334These cases may need to call the 2335.Xr ddi_in_panic 9F 2336function. 2337The other functions generally provide a way to get at information such as 2338the process ID or other information from the system; however, this 2339generally should not be needed or used. 2340Almost all values exposed by, say, 2341.Xr drv_getparm 9F 2342have more usable first-class methods of getting at the data. 2343.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2344.It Xr ddi_get_kt_did 9F Ta Xr ddi_get_pid 9F 2345.It Xr ddi_in_panic 9F Ta Xr drv_getparm 9F 2346.El 2347.Ss Driver Memory Mapping 2348These functions are present for device drivers that implement the 2349.Xr devmap 9E 2350or 2351.Xr segmap 9E 2352entry points. 2353The 2354.Xr ddi_umem_alloc 9F 2355routines are used to allocate and lock memory that can later be used as 2356part of passing this memory to userland through the mapping entry 2357points.
2358.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2359.It Xr ddi_devmap_segmap 9F Ta Xr ddi_mmap_get_model 9F 2360.It Xr ddi_segmap_setup 9F Ta Xr ddi_segmap 9F 2361.It Xr ddi_umem_alloc 9F Ta Xr ddi_umem_free 9F 2362.It Xr ddi_umem_iosetup 9F Ta Xr ddi_umem_lock 9F 2363.It Xr ddi_umem_unlock 9F Ta Xr ddi_unmap_regs 9F 2364.It Xr devmap_default_access 9F Ta Xr devmap_devmem_setup 9F 2365.It Xr devmap_do_ctxmgt 9F Ta Xr devmap_load 9F 2366.It Xr devmap_set_ctx_timeout 9F Ta Xr devmap_setup 9F 2367.It Xr devmap_umem_setup 9F Ta Xr devmap_unload 9F 2368.El 2369.Ss UTF-8, UTF-16, UTF-32, and Code Set Utilities 2370These routines provide the ability to work with text in 2371different encodings and code sets. 2372Generally the kernel does not assume that much about the type of the text 2373that it is operating on, though some subsystems will require that the 2374names of things be ASCII only. 2375.Pp 2376The primary other locales that the system supports are generally UTF-8 2377based and so the kernel provides a set of routines to deal with UTF-8 2378and Unicode normalization. 2379However, there are still cases where different character encodings are 2380required or conversion between UTF-8 and some other type is required. 2381This is provided by the kernel iconv framework, which provides a 2382subset of the traditional userland iconv conversions. 2383.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2384.It Xr kiconv_close 9F Ta Xr kiconv_open 9F 2385.It Xr kiconv 9F Ta Xr kiconvstr 9F 2386.It Xr u8_strcmp 9F Ta Xr u8_textprep_str 9F 2387.It Xr u8_validate 9F Ta Xr uconv_u16tou32 9F 2388.It Xr uconv_u16tou8 9F Ta Xr uconv_u32tou16 9F 2389.It Xr uconv_u32tou8 9F Ta Xr uconv_u8tou16 9F 2390.It Xr uconv_u8tou32 9F Ta 2391.El 2392.Ss Raw I/O Port Access 2393This group of functions provides raw access to I/O ports on architectures
2395These functions do not allow any coordination with other callers nor is 2396the validity of the port assured in any way. 2397In general, device drivers should use the normal register access 2398routines to access I/O ports. 2399See 2400.Sx Device Register Setup and Access 2401for more information on the preferred way to set up and access registers. 2402.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2403.It Xr inb 9F Ta Xr inw 9F 2404.It Xr inl 9F Ta Xr outb 9F 2405.It Xr outw 9F Ta Xr outl 9F 2406.El 2407.Ss Power Management 2408These functions are used to raise and lower the internal power levels of 2409a device driver or to indicate to the kernel that the device is busy and 2410therefore cannot have its power changed. 2411See 2412.Xr power 9E 2413for additional information. 2414.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2415.It Xr ddi_removing_power 9F Ta Xr pm_busy_component 9F 2416.It Xr pm_idle_component 9F Ta Xr pm_lower_power 9F 2417.It Xr pm_power_has_changed 9F Ta Xr pm_raise_power 9F 2418.It Xr pm_trans_check 9F Ta 2419.El 2420.Ss Network Packet Hooks 2421These functions are intended to be used by device drivers that wish to 2422inspect and potentially modify packets along their path through the 2423networking stack. 2424The most common use case is for implementing something like a network 2425firewall. 2426Otherwise, if looking to add support for a new protocol or other network 2427processing feature, one is better off more directly integrating with the 2428networking stack. 2429.Pp 2430To get started, drivers generally will need to first use 2431.Xr net_protocol_lookup 9F 2432to get a handle to say that they're interested in looking at IPv4 or 2433IPv6 traffic and then can allocate an actual hook object with 2434.Xr hook_alloc 9F .
2435After filling out the hook, the hook can be inserted into the actual 2436system with 2437.Xr net_hook_register 9F . 2438.Pp 2439Hooks operate in the context of a networking stack. 2440Every networking stack in the system is independent and therefore has 2441its own set of interfaces, routing tables, settings, and related state. 2442Most zones have their own networking stack. 2443This is the exclusive-IP option that is described in 2444.Xr zoneadm 8 . 2445.Pp 2446Drivers can register to get a callback for every netstack in the system 2447and be notified when they are created and destroyed. 2448This is done by calling the 2449.Xr net_instance_alloc 9F 2450function, filling out its data structure, and then finally calling 2451.Xr net_instance_register 9F . 2452Like other callback interfaces, the moment the callback functions are 2453registered, drivers need to expect that they're going to be called. 2454.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister" 2455.It Xr hook_alloc 9F Ta Xr hook_free 9F 2456.It Xr net_event_notify_register 9F Ta Xr net_event_notify_unregister 9F 2457.It Xr net_getifname 9F Ta Xr net_getlifaddr 9F 2458.It Xr net_getmtu 9F Ta Xr net_getnetid 9F 2459.It Xr net_getpmtuenabled 9F Ta Xr net_hook_register 9F 2460.It Xr net_hook_unregister 9F Ta Xr net_inject_alloc 9F 2461.It Xr net_inject_free 9F Ta Xr net_inject 9F 2462.It Xr net_instance_alloc 9F Ta Xr net_instance_free 9F 2463.It Xr net_instance_notify_register 9F Ta Xr net_instance_notify_unregister 9F 2464.It Xr net_instance_protocol_unregister 9F Ta Xr net_instance_register 9F 2465.It Xr net_instance_unregister 9F Ta Xr net_ispartialchecksum 9F 2466.It Xr net_isvalidchecksum 9F Ta Xr net_kstat_create 9F 2467.It Xr net_kstat_delete 9F Ta Xr net_lifgetnext 9F 2468.It Xr net_netidtozonid 9F Ta Xr net_phygetnext 9F 2469.It Xr net_phylookup 9F Ta Xr net_protocol_lookup 9F 2470.It Xr net_protocol_notify_register 9F Ta Xr net_protocol_release 9F 2471.It
Xr net_protocol_walk 9F Ta Xr net_routeto 9F 2472.It Xr net_zoneidtonetid 9F Ta Xr netinfo 9F 2473.El 2474.Sh SEE ALSO 2475.Xr Intro 2 , 2476.Xr Intro 9 , 2477.Xr Intro 9E , 2478.Xr Intro 9S 2479.Rs 2480.%T illumos Developer's Guide 2481.%U https://www.illumos.org/books/dev/ 2482.Re 2483.Rs 2484.%T Writing Device Drivers 2485.%U https://www.illumos.org/books/wdd/ 2486.Re 2487