.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source. A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2023 Oxide Computer Company
.\" Copyright 2023 Peter Tribble
.\"
.Dd July 17, 2023
.Dt INTRO 9F
.Os
.Sh NAME
.Nm Intro
.Nd Introduction to kernel and device driver functions
.Sh SYNOPSIS
.In sys/ddi.h
.In sys/sunddi.h
.Sh DESCRIPTION
Section 9F of the manual describes functions that are used for device
drivers, kernel modules, and the implementation of the kernel itself.
This page first provides an overview of the use of kernel functions and of the
portions of the manual that are specific to the kernel.
After that, most of the available functions are grouped by use,
with some brief commentary and introduction.
.Pp
Most manual pages are similar to those in other sections.
They have common fields such as the NAME, a SYNOPSIS to show which header files
to include and prototypes, an extended DESCRIPTION discussing its use, and the
common combination of RETURN VALUES and ERRORS.
Some manuals will have examples and additional manuals to reference in the SEE
ALSO section.
.Ss RETURN VALUES and ERRORS
One major difference when programming in the kernel versus userland is that
there is no equivalent to
.Va errno .
Instead, there are a few common patterns that are used throughout the kernel
that we'll discuss.
While there are common patterns, please be aware that due to the natural
evolution of the system, you will need to read the specifics of the
section in each manual page.
.Bl -bullet
.It
Many functions will return a specific DDI
.Pq Device Driver Interface
value, which is commonly one of
.Dv DDI_SUCCESS
or
.Dv DDI_FAILURE ,
indicating success and failure respectively.
Some functions will return additional error codes to indicate why something
failed.
In general, when checking a response code, it is always preferred to compare
whether the value equals or does not equal
.Dv DDI_SUCCESS ,
as there can be many different error cases and additional ones can be added over
time.
.It
Many routines explicitly return
.Sy 0
on success and an explicit error number on failure.
.Xr Intro 2
has a list of error numbers.
.It
There are classes of functions that return either a pointer or a boolean type,
either the C99
.Vt bool
or the system's traditional type
.Vt boolean_t .
In these cases, sometimes a more detailed error is provided via an additional
argument such as a
.Vt "int *" .
Absent such an argument, there is generally no more detailed information
available.
.El
.Ss CONTEXT
The CONTEXT section of a manual page describes the times at which a function
may be called.
In general, there are three different contexts that come up:
.Bl -tag -width Ds
.It Sy User
User context implies that the thread of execution is operating because a user
thread has entered the kernel for an operation.
When an application issues a system call such as
.Xr open 2 ,
.Xr read 2 ,
.Xr write 2 ,
or
.Xr ioctl 2
then we are said to be in user context.
When in user context, one can copy in or out data from a user's address space.
When writing a character or block device driver, the majority of the time that a
character device operation such as the corresponding
.Xr open 9E ,
.Xr read 9E ,
.Xr write 9E ,
or
.Xr ioctl 9E
entry point is called, it is executing in user context.
It is possible to call those entry points through the kernel's layered device
interface, so drivers cannot assume those entry points will always have a user
process present, strictly speaking.
.It Sy Interrupt
Interrupt context refers to when the operating system is handling an interrupt
.Po
See
.Sx Interrupt Related Functions
.Pc
and executing a registered interrupt handler.
Interrupt context is split into two different sets: high-level and low-level
interrupts.
Most device drivers are always going to be executing low-level interrupts.
To determine whether an interrupt is considered high level or not, you should
pass the interrupt handle to the
.Xr ddi_intr_get_pri 9F
function and compare the resulting priority with
.Xr ddi_intr_get_hilevel_pri 9F .
.Pp
When executing high-level interrupts, the thread may only execute a limited
number of functions.
In particular, it may call
.Xr ddi_intr_trigger_softint 9F ,
.Xr mutex_enter 9F ,
and
.Xr mutex_exit 9F .
It is critical that the mutex being used be properly initialized with the
driver's interrupt priority.
The system will transparently pick the correct implementation of a mutex based
on the interrupt type.
Aside from the above, one must not block while in high-level interrupt context.
.Pp
On the other hand, when a thread is not in high-level interrupt context, most of
these restrictions are lifted.
Kernel memory may be allocated
.Po
if using a non-blocking allocation such as
.Dv KM_NOSLEEP
or
.Dv KM_NOSLEEP_LAZY
.Pc ,
and many of the other documented functions may be called.
.Pp
Regardless of whether a thread is in high-level or low-level interrupt context,
it will never have a user context associated with it and therefore cannot use
routines like
.Xr ddi_copyin 9F
or
.Xr ddi_copyout 9F .
.It Sy Kernel
Kernel context refers to all other times in the kernel.
Whenever the kernel is executing something on a thread that is not associated
with a user process, then one is in kernel context.
The most common situations for writers of kernel modules are things like timeout
callbacks, such as
.Xr timeout 9F
or
.Xr ddi_periodic_add 9F ,
cases where the kernel is invoking a driver's device operation routines such as
.Xr attach 9E
and
.Xr detach 9E ,
or many of the device driver's registered callbacks from frameworks such as the
.Xr mac 9E ,
.Xr usba_hcdi 9E ,
and various portions of SCSI, USB, and block devices.
.It Sy Framework-specific Contexts
Some manuals will discuss more specific constraints about when they can be used.
For example, some functions may only be called while executing a specific entry
point like
.Xr attach 9E .
Another example of this is that the
.Xr mac_transceiver_info_set_present 9F
function is only meant to be used while executing a networking driver's
.Xr mct_info 9E
entry point.
.El
.Ss PARAMETERS
In kernel manual pages
.Pq section 9 ,
each function and entry point description generally has a separate list
of parameters which are arguments to the function.
The parameters section describes the basic purpose of each argument and
should explain where such things often come from and any constraints on
their values.
.Sh INTERFACES
Functions below are organized into categories that describe their purpose.
Individual functions are documented in their own manual pages.
For each of these areas, we discuss the high-level concepts behind it and
provide a brief discussion of how to get started with it.
Note, some deprecated functions or older frameworks are not listed here.
.Pp
Every function listed below has its own manual page in section 9F and
can be read with
.Xr man 1 .
In addition, some corresponding concepts are documented in section 9 and
some groups of functions are present to support a specific type of
device driver, which is discussed more in section 9E.
.Ss Logging Functions
Throughout the kernel there is often a need to log messages that end up
either in the system log or on the console.
These kinds of messages can be logged with the
.Xr cmn_err 9F
function or one of its more specific variants that operate in the
context of a device
.Po
.Xr dev_err 9F
.Pc
or a zone
.Po
.Xr zcmn_err 9F
.Pc .
.Pp
The console should be used sparingly.
While a notice may be found there, one should assume that it may be
missed either due to overflow, not being connected to, say, a serial
console at the time, or some other reason.
While the system log is better than the console, folks need to take care
not to spam the log.
Imagine if someone logged every time a network packet was generated or
received: you could quickly run out of space and make it harder
to find useful messages for bizarre behavior.
It's also important to remember that only system administrators and
privileged users can actually see this log.
Where possible and appropriate, use programmatic errors in routines that
allow it.
.Pp
The system also supports a structured event log called a system event
that is processed by
.Xr syseventd 8 .
This is used by the OS to provide notifications for things like device
insertion and removal or the change of a data link.
These are driven by the
.Xr ddi_log_sysevent 9F
function and allow arbitrary additional structured metadata in the form
of a
.Vt nvlist_t .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cmn_err 9F Ta Xr dev_err 9F
.It Xr vcmn_err 9F Ta Xr vzcmn_err 9F
.It Xr zcmn_err 9F Ta Xr ddi_log_sysevent 9F
.El
.Ss Memory Allocation
At the heart of most device drivers is memory allocation.
The primary kernel allocator is called
.Qq kmem
.Pq kernel memory
and it is based on the
.Qq vmem
.Pq virtual memory
subsystem.
Most of the time, device drivers should use
.Xr kmem_alloc 9F
and
.Xr kmem_zalloc 9F
to allocate memory and free it with
.Xr kmem_free 9F .
Based on the original kmem and subsequent vmem papers, the kernel is
internally using object caches and magazines to allow high-throughput
allocation in a multi-CPU environment.
.Pp
When allocating memory, an important choice must be made: whether or not
to block for memory.
If one opts to perform a sleeping allocation, then the caller can be
guaranteed that the allocation will succeed, but it may take some time
and the thread will be blocked during that entire duration.
This is the
.Dv KM_SLEEP
flag.
On the other hand, there are many circumstances where this is not
appropriate, especially because a thread that is inside a memory
allocation function cannot currently be cancelled.
If the thread corresponds to a user process, then it will not be
killable.
.Pp
Given that there are many situations where this is not appropriate, the
kernel offers an allocation mode where it will not block for memory to
be available:
.Dv KM_NOSLEEP
and
.Dv KM_NOSLEEP_LAZY .
These allocations can fail and return
.Dv NULL
when they do fail.
Even though these are said to be no sleep operations, that does not mean
that the caller may not end up temporarily blocked due to mutex
contention or due to trying a bit more aggressively to reclaim memory in
the case of
.Dv KM_NOSLEEP .
Unless operating in special circumstances, using
.Dv KM_NOSLEEP_LAZY
should be preferred to
.Dv KM_NOSLEEP .
.Pp
If a device driver has its own complex object that has more significant
set up and tear down costs, then the kmem cache function family should
be considered.
To use a kmem cache, it must first be created using the
.Xr kmem_cache_create 9F
function, which requires specifying the size, alignment, and
constructors and destructors.
Individual objects are allocated from the cache with the
.Xr kmem_cache_alloc 9F
function.
An important constraint when using the caches is that when an object is
freed with
.Xr kmem_cache_free 9F ,
it is the caller's responsibility to ensure that the object is returned
to its constructed state prior to freeing it.
If the object is reused prior to the kernel reclaiming the memory for
other uses, then the constructor will not be called again.
Most device drivers do not need to create a kmem cache for their
own allocations.
.Pp
If you are writing a device driver that is trying to interact with the
networking, STREAMS, or USB subsystems, then they are generally using
the
.Vt mblk_t
data structure which is managed through a different set of APIs, though
they are leveraging kmem under the hood.
.Pp
The vmem set of interfaces allows for the management of abstract regions
of integers, generally representing memory or some other object, each
with an offset and length.
While it is not common that a device driver needs to do its own such
management,
.Xr vmem_create 9F
and
.Xr vmem_alloc 9F
are what to reach for when the need arises.
Rather than using vmem, if one needs to model a set of integers where
each is a valid identifier, that is, you need to allocate every integer
between 0 and 1000 as a distinct identifier, use
.Xr id_space_create 9F
instead, which is discussed in
.Sx Identifier Management .
For more information on vmem, see
.Xr vmem 9 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kmem_alloc 9F Ta Xr kmem_cache_alloc 9F
.It Xr kmem_cache_create 9F Ta Xr kmem_cache_destroy 9F
.It Xr kmem_cache_free 9F Ta Xr kmem_cache_set_move 9F
.It Xr kmem_free 9F Ta Xr kmem_zalloc 9F
.It Xr vmem_add 9F Ta Xr vmem_alloc 9F
.It Xr vmem_contains 9F Ta Xr vmem_create 9F
.It Xr vmem_destroy 9F Ta Xr vmem_free 9F
.It Xr vmem_size 9F Ta Xr vmem_walk 9F
.It Xr vmem_xalloc 9F Ta Xr vmem_xcreate 9F
.It Xr vmem_xfree 9F Ta Xr bufcall 9F
.It Xr esbbcall 9F Ta Xr qbufcall 9F
.It Xr qunbufcall 9F Ta Xr unbufcall 9F
.El
.Ss String and libc Analogues
The kernel has many analogues for classic libc functions that deal with
string processing, memory copying, and related tasks.
For the most part, these behave similarly to their userland analogues,
but there can be some differences in return values and, for example, in
the set of supported format characters in the case of
.Xr snprintf 9F
and related functions.
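.Pp
Because return values can differ, it is worth checking them explicitly.
The following is a minimal userland sketch, using the standard C snprintf
rather than the kernel routine, that detects truncation by comparing the
return value against the buffer size.
The helper name ex_format_name is hypothetical, and the return type and
supported format characters of the kernel version should be confirmed in its
own manual page before carrying this pattern into a driver.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: format "<driver><instance>" into buf.
 * Returns 0 if the whole string fit and -1 if it was truncated
 * or could not be formatted.
 */
static int
ex_format_name(char *buf, size_t len, const char *driver, int instance)
{
	int ret = snprintf(buf, len, "%s%d", driver, instance);

	/* A negative value is an error; ret >= len means truncation. */
	if (ret < 0 || (size_t)ret >= len)
		return (-1);
	return (0);
}
```

Treating truncation as a failure, rather than silently using the shortened
string, keeps names that are later compared or looked up from colliding.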
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ASSERT 9F Ta Xr bcmp 9F
.It Xr bzero 9F Ta Xr bcopy 9F
.It Xr ddi_strdup 9F Ta Xr ddi_strtol 9F
.It Xr ddi_strtoll 9F Ta Xr ddi_strtoul 9F
.It Xr ddi_strtoull 9F Ta Xr ddi_ffs 9F
.It Xr ddi_fls 9F Ta Xr max 9F
.It Xr memchr 9F Ta Xr memcmp 9F
.It Xr memcpy 9F Ta Xr memmove 9F
.It Xr memset 9F Ta Xr min 9F
.It Xr numtos 9F Ta Xr snprintf 9F
.It Xr sprintf 9F Ta Xr stoi 9F
.It Xr strcasecmp 9F Ta Xr strcat 9F
.It Xr strchr 9F Ta Xr strcmp 9F
.It Xr strcpy 9F Ta Xr strdup 9F
.It Xr strfree 9F Ta Xr string 9F
.It Xr strlcat 9F Ta Xr strlcpy 9F
.It Xr strlen 9F Ta Xr strlog 9F
.It Xr strncasecmp 9F Ta Xr strncat 9F
.It Xr strncmp 9F Ta Xr strncpy 9F
.It Xr strnlen 9F Ta Xr strqget 9F
.It Xr strqset 9F Ta Xr strrchr 9F
.It Xr strspn 9F Ta Xr swab 9F
.It Xr vsnprintf 9F Ta Xr va_arg 9F
.It Xr va_copy 9F Ta Xr va_end 9F
.It Xr va_start 9F Ta Xr vsprintf 9F
.El
.Ss Tree Data Structures
These functions provide access to an intrusive self-balancing binary
tree that is generally used throughout illumos.
The primary type here is the
.Vt avl_tree_t .
Structures can be present in multiple trees and there are built-in
walkers for the data structure in
.Xr mdb 1 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr avl_add 9F Ta Xr avl_create 9F
.It Xr avl_destroy_nodes 9F Ta Xr avl_destroy 9F
.It Xr avl_find 9F Ta Xr avl_first 9F
.It Xr avl_insert_here 9F Ta Xr avl_insert 9F
.It Xr avl_is_empty 9F Ta Xr avl_last 9F
.It Xr avl_nearest 9F Ta Xr AVL_NEXT 9F
.It Xr avl_numnodes 9F Ta Xr AVL_PREV 9F
.It Xr avl_remove 9F Ta Xr avl_swap 9F
.El
.Ss Linked Lists
These functions provide a standard, intrusive doubly-linked list whose
type is the
.Vt list_t .
This list implementation is used extensively throughout illumos, has
debugging support through
.Xr mdb 1
walkers, and is generally recommended rather than creating your own
list.
Due to its intrusive nature, a given structure can be present on
multiple lists.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr list_create 9F Ta Xr list_destroy 9F
.It Xr list_head 9F Ta Xr list_insert_after 9F
.It Xr list_insert_before 9F Ta Xr list_insert_head 9F
.It Xr list_insert_tail 9F Ta Xr list_is_empty 9F
.It Xr list_link_active 9F Ta Xr list_link_init 9F
.It Xr list_link_replace 9F Ta Xr list_move_tail 9F
.It Xr list_next 9F Ta Xr list_prev 9F
.It Xr list_remove_head 9F Ta Xr list_remove_tail 9F
.It Xr list_remove 9F Ta Xr list_tail 9F
.El
.Ss Name-Value Pairs
The kernel often uses the
.Vt nvlist_t
data structure to pass around a list of typed name-value pairs.
This data structure is used in diverse areas, particularly because of
its ability to be serialized in different formats that are suitable not
only for use between userland and the kernel, but also persistently to a
file.
.Pp
A
.Vt nvlist_t
structure is initialized with the
.Xr nvlist_alloc 9F
function and can operate with two different degrees of uniqueness: a
mode where only names are unique, or one where every name is qualified
by a type.
The former means that if there is an integer named
.Dq foo
and a string, array, or any other value is then added with the same
name, the integer will be replaced.
However, if we were using the name and type as unique, then the value would
only be replaced if both the pair's type and the name
.Dq foo
matched a pair that was already present.
Otherwise, the two different entries would co-exist.
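.Pp
The two uniqueness modes can be illustrated with a small userland model.
This is only a sketch of the replacement rule described above and is not how
the kernel implements it: every ex_ name is hypothetical, values are omitted so
that only the name and type of each pair are tracked, and a replaced pair is
overwritten in place for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for a pair's type and for the two uniqueness modes. */
typedef enum { EX_TYPE_INT32, EX_TYPE_STRING } ex_type_t;
typedef enum { EX_UNIQUE_NAME, EX_UNIQUE_NAME_TYPE } ex_mode_t;

typedef struct {
	const char *ep_name;
	ex_type_t ep_type;
} ex_pair_t;

#define	EX_MAX_PAIRS	8

typedef struct {
	ex_mode_t el_mode;
	size_t el_npairs;
	ex_pair_t el_pairs[EX_MAX_PAIRS];
} ex_list_t;

/* Does an existing pair collide with a new (name, type) in this mode? */
static bool
ex_collides(const ex_list_t *l, const ex_pair_t *p, const char *name,
    ex_type_t type)
{
	if (strcmp(p->ep_name, name) != 0)
		return (false);
	return (l->el_mode == EX_UNIQUE_NAME || p->ep_type == type);
}

/*
 * Add a pair, replacing any colliding entry. Returns false only when
 * the fixed-size toy list is full.
 */
static bool
ex_add(ex_list_t *l, const char *name, ex_type_t type)
{
	for (size_t i = 0; i < l->el_npairs; i++) {
		if (ex_collides(l, &l->el_pairs[i], name, type)) {
			l->el_pairs[i].ep_type = type;
			return (true);
		}
	}
	if (l->el_npairs == EX_MAX_PAIRS)
		return (false);
	l->el_pairs[l->el_npairs].ep_name = name;
	l->el_pairs[l->el_npairs].ep_type = type;
	l->el_npairs++;
	return (true);
}
```

Adding an integer and then a string named foo leaves one pair in a list that
is unique by name, and two pairs in a list that is unique by name and type.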
.Pp
When constructing an nvlist, it is normally backed by the normal kmem
allocator and may either use sleeping or non-sleeping allocations.
It is also possible to use a custom allocator, though that generally has
not been necessary in the kernel.
.Pp
Specific keys and values can be looked up directly with the
nvlist_lookup family of functions, but the entire list can be iterated
as well, which is especially useful when trying to validate that no
unknown keys are present in the list.
The iteration API
.Xr nvlist_next_nvpair 9F
allows one to get the key's name, the type of the pair's value, and
then the value itself.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr nv_alloc_fini 9F Ta Xr nv_alloc_init 9F
.It Xr nvlist_add_boolean_array 9F Ta Xr nvlist_add_boolean_value 9F
.It Xr nvlist_add_boolean 9F Ta Xr nvlist_add_byte_array 9F
.It Xr nvlist_add_byte 9F Ta Xr nvlist_add_int16_array 9F
.It Xr nvlist_add_int16 9F Ta Xr nvlist_add_int32_array 9F
.It Xr nvlist_add_int32 9F Ta Xr nvlist_add_int64_array 9F
.It Xr nvlist_add_int64 9F Ta Xr nvlist_add_int8_array 9F
.It Xr nvlist_add_int8 9F Ta Xr nvlist_add_nvlist_array 9F
.It Xr nvlist_add_nvlist 9F Ta Xr nvlist_add_nvpair 9F
.It Xr nvlist_add_string_array 9F Ta Xr nvlist_add_string 9F
.It Xr nvlist_add_uint16_array 9F Ta Xr nvlist_add_uint16 9F
.It Xr nvlist_add_uint32_array 9F Ta Xr nvlist_add_uint32 9F
.It Xr nvlist_add_uint64_array 9F Ta Xr nvlist_add_uint64 9F
.It Xr nvlist_add_uint8_array 9F Ta Xr nvlist_add_uint8 9F
.It Xr nvlist_alloc 9F Ta Xr nvlist_dup 9F
.It Xr nvlist_exists 9F Ta Xr nvlist_free 9F
.It Xr nvlist_lookup_boolean_array 9F Ta Xr nvlist_lookup_boolean_value 9F
.It Xr nvlist_lookup_boolean 9F Ta Xr nvlist_lookup_byte_array 9F
.It Xr nvlist_lookup_byte 9F Ta Xr nvlist_lookup_int16_array 9F
.It Xr nvlist_lookup_int16 9F Ta Xr nvlist_lookup_int32_array 9F
.It Xr nvlist_lookup_int32 9F Ta Xr nvlist_lookup_int64_array 9F
.It Xr nvlist_lookup_int64 9F Ta Xr nvlist_lookup_int8_array 9F
.It Xr nvlist_lookup_int8 9F Ta Xr nvlist_lookup_nvlist_array 9F
.It Xr nvlist_lookup_nvlist 9F Ta Xr nvlist_lookup_nvpair 9F
.It Xr nvlist_lookup_pairs 9F Ta Xr nvlist_lookup_string_array 9F
.It Xr nvlist_lookup_string 9F Ta Xr nvlist_lookup_uint16_array 9F
.It Xr nvlist_lookup_uint16 9F Ta Xr nvlist_lookup_uint32_array 9F
.It Xr nvlist_lookup_uint32 9F Ta Xr nvlist_lookup_uint64_array 9F
.It Xr nvlist_lookup_uint64 9F Ta Xr nvlist_lookup_uint8_array 9F
.It Xr nvlist_lookup_uint8 9F Ta Xr nvlist_merge 9F
.It Xr nvlist_next_nvpair 9F Ta Xr nvlist_pack 9F
.It Xr nvlist_remove_all 9F Ta Xr nvlist_remove 9F
.It Xr nvlist_size 9F Ta Xr nvlist_t 9F
.It Xr nvlist_unpack 9F Ta Xr nvlist_xalloc 9F
.It Xr nvlist_xdup 9F Ta Xr nvlist_xpack 9F
.It Xr nvlist_xunpack 9F Ta Xr nvpair_name 9F
.It Xr nvpair_type 9F Ta Xr nvpair_value_boolean_array 9F
.It Xr nvpair_value_byte_array 9F Ta Xr nvpair_value_byte 9F
.It Xr nvpair_value_int16_array 9F Ta Xr nvpair_value_int16 9F
.It Xr nvpair_value_int32_array 9F Ta Xr nvpair_value_int32 9F
.It Xr nvpair_value_int64_array 9F Ta Xr nvpair_value_int64 9F
.It Xr nvpair_value_int8_array 9F Ta Xr nvpair_value_int8 9F
.It Xr nvpair_value_nvlist_array 9F Ta Xr nvpair_value_nvlist 9F
.It Xr nvpair_value_string_array 9F Ta Xr nvpair_value_string 9F
.It Xr nvpair_value_uint16_array 9F Ta Xr nvpair_value_uint16 9F
.It Xr nvpair_value_uint32_array 9F Ta Xr nvpair_value_uint32 9F
.It Xr nvpair_value_uint64_array 9F Ta Xr nvpair_value_uint64 9F
.It Xr nvpair_value_uint8_array 9F Ta Xr nvpair_value_uint8 9F
.El
.Ss Identifier Management
A common challenge in the kernel is the management of a series of
different IDs.
There are three different families of routines for managing identifiers
presented here, but we recommend the use of the
.Xr id_space_create 9F
and
.Xr id_alloc 9F
family for new use cases.
The ID space can cover all or a subset of the 32-bit integer space and
provides different allocation strategies for this.
.Pp
Due to the current implementation, callers should generally prefer the
non-sleeping variants because the sleeping ones are not cancellable
.Po
currently this is backed by vmem, but this should not be assumed and may
change in the future
.Pc .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr id_alloc_nosleep 9F Ta Xr id_alloc_specific_nosleep 9F
.It Xr id_alloc 9F Ta Xr id_allocff_nosleep 9F
.It Xr id_allocff 9F Ta Xr id_free 9F
.It Xr id_space_create 9F Ta Xr id_space_destroy 9F
.It Xr id_space_extend 9F Ta Xr id_space 9F
.It Xr id32_alloc 9F Ta Xr id32_free 9F
.It Xr id32_lookup 9F Ta Xr rmalloc_wait 9F
.It Xr rmalloc 9F Ta Xr rmallocmap_wait 9F
.It Xr rmallocmap 9F Ta Xr rmfree 9F
.It Xr rmfreemap 9F Ta
.El
.Ss Bit Manipulation Routines
Many device drivers that are working with registers often need to get a
specific range of bits out of an integer.
These functions provide safe ways to set
.Pq bitset
and extract
.Pq bitx
bit ranges, as well
as modify an integer to remove a set of bits entirely
.Pq bitdel .
Using these functions is preferred to constructing manual masks and
shifts particularly when a programming manual for a device is specified
in ranges of bits.
On debug builds, these provide extra checking to try and catch
programmer error.
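.Pp
To make the idea of inclusive bit ranges concrete, here is a minimal userland
sketch of range extraction and range replacement.
The ex_ functions are hypothetical stand-ins written for this illustration,
and their argument order is an assumption; they perform none of the debug
checking mentioned above, so the listed functions should be used in real
drivers.

```c
#include <stdint.h>

/*
 * Build a mask covering bits high..low, inclusive. The caller must
 * ensure that 63 >= high >= low, as nothing is checked here.
 */
static uint64_t
ex_mask64(unsigned int high, unsigned int low)
{
	unsigned int width = high - low + 1;
	uint64_t mask = (width == 64) ? UINT64_MAX : ((1ULL << width) - 1);

	return (mask << low);
}

/* Extract bits high..low of reg, shifted down so that low becomes bit 0. */
static uint64_t
ex_bitx64(uint64_t reg, unsigned int high, unsigned int low)
{
	return ((reg & ex_mask64(high, low)) >> low);
}

/* Return reg with bits high..low replaced by the low-order bits of val. */
static uint64_t
ex_bitset64(uint64_t reg, unsigned int high, unsigned int low, uint64_t val)
{
	uint64_t mask = ex_mask64(high, low);

	return ((reg & ~mask) | ((val << low) & mask));
}
```

With a register description that says a field lives in bits 15:8, the call
reads the same way as the programming manual: ex_bitx64(reg, 15, 8).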
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bitdel64 9F Ta Xr bitset8 9F
.It Xr bitset16 9F Ta Xr bitset32 9F
.It Xr bitset64 9F Ta Xr bitx8 9F
.It Xr bitx16 9F Ta Xr bitx32 9F
.It Xr bitx64 9F Ta
.El
.Ss Synchronization Primitives
The kernel provides a set of basic synchronization primitives that can
be used by the system.
These include mutexes, condition variables, reader/writer locks, and
semaphores.
When creating mutexes and reader/writer locks, the kernel requires that
one pass in the interrupt priority if the lock will be used in
interrupt context.
This is required so the kernel can determine the correct underlying type
of lock to use.
This ensures that if for some reason a mutex needs to be used in
high-level interrupt context, the kernel will use a spin lock, but
otherwise can use the standard adaptive mutex that might block.
For developers familiar with other operating systems, this is somewhat
different in that the consumer generally does not need to figure out
this level of detail, which is why no separate interface for it is present.
.Pp
In addition, condition variables provide variants that can wait while
also detecting that a signal has been delivered.
These variants are particularly useful when writing character device
operations for device drivers as it allows users the chance to cancel an
operation and not be blocked indefinitely on something that may not
occur.
These _sig variants should generally be preferred where applicable.
.Pp
The kernel also provides memory barrier primitives.
See the
.Sx Memory Barriers
section for more information.
There is no need to use manual memory barriers when using the
synchronization primitives.
The synchronization primitives ensure that the appropriate barriers are
present to ensure coherency while the lock is held.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cv_broadcast 9F Ta Xr cv_destroy 9F
.It Xr cv_init 9F Ta Xr cv_reltimedwait_sig 9F
.It Xr cv_reltimedwait 9F Ta Xr cv_signal 9F
.It Xr cv_timedwait_sig 9F Ta Xr cv_timedwait 9F
.It Xr cv_wait_sig 9F Ta Xr cv_wait 9F
.It Xr ddi_enter_critical 9F Ta Xr ddi_exit_critical 9F
.It Xr mutex_destroy 9F Ta Xr mutex_enter 9F
.It Xr mutex_exit 9F Ta Xr mutex_init 9F
.It Xr mutex_owned 9F Ta Xr mutex_tryenter 9F
.It Xr rw_destroy 9F Ta Xr rw_downgrade 9F
.It Xr rw_enter 9F Ta Xr rw_exit 9F
.It Xr rw_init 9F Ta Xr rw_read_locked 9F
.It Xr rw_tryenter 9F Ta Xr rw_tryupgrade 9F
.It Xr sema_destroy 9F Ta Xr sema_init 9F
.It Xr sema_p_sig 9F Ta Xr sema_p 9F
.It Xr sema_tryp 9F Ta Xr sema_v 9F
.It Xr semaphore 9F Ta
.El
.Ss Atomic Operations
This group of functions provides a general way to perform atomic
operations on integers of different sizes and explicit types.
The
.Xr atomic_ops 9F
manual page describes the different classes of functions in more detail,
but there are functions that take care of using the CPU's instructions
for addition, compare and swap, and more.
If data is being protected and only accessed under a synchronization
primitive such as a mutex or reader-writer lock, then there isn't a
reason to use an atomic operation for that data, generally speaking.
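.Pp
Compare and swap is usually used in a retry loop.
The following userland sketch uses C11 atomics to show that shape; it is not
kernel code.
The ex_cas_32 wrapper is a hypothetical stand-in that follows the common
convention of returning the value that was present before the operation, and
ex_inc_capped is an invented example consumer.
The exact semantics of the kernel functions should be taken from their own
manual pages.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Userland stand-in for a 32-bit compare-and-swap. It returns the value
 * that was in *target before the operation, whether or not the swap
 * took place.
 */
static uint32_t
ex_cas_32(_Atomic uint32_t *target, uint32_t expected, uint32_t newval)
{
	/* On failure, expected is overwritten with the current value. */
	(void) atomic_compare_exchange_strong(target, &expected, newval);
	return (expected);
}

/*
 * Increment *counter unless it has already reached limit. Returns the
 * resulting value of the counter as seen by this caller.
 */
static uint32_t
ex_inc_capped(_Atomic uint32_t *counter, uint32_t limit)
{
	uint32_t old, seen;

	seen = atomic_load(counter);
	do {
		old = seen;
		if (old >= limit)
			return (old);
		/* Retry with the freshly observed value if we lost a race. */
		seen = ex_cas_32(counter, old, old + 1);
	} while (seen != old);

	return (old + 1);
}
```

If the counter were only ever touched while holding a mutex, a plain
increment under that lock would do the same job without any of this.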
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr atomic_add_8_nv 9F Ta Xr atomic_add_8 9F
.It Xr atomic_add_16_nv 9F Ta Xr atomic_add_16 9F
.It Xr atomic_add_32_nv 9F Ta Xr atomic_add_32 9F
.It Xr atomic_add_64_nv 9F Ta Xr atomic_add_64 9F
.It Xr atomic_add_char_nv 9F Ta Xr atomic_add_char 9F
.It Xr atomic_add_int_nv 9F Ta Xr atomic_add_int 9F
.It Xr atomic_add_long_nv 9F Ta Xr atomic_add_long 9F
.It Xr atomic_add_ptr_nv 9F Ta Xr atomic_add_ptr 9F
.It Xr atomic_add_short_nv 9F Ta Xr atomic_add_short 9F
.It Xr atomic_and_8_nv 9F Ta Xr atomic_and_8 9F
.It Xr atomic_and_16_nv 9F Ta Xr atomic_and_16 9F
.It Xr atomic_and_32_nv 9F Ta Xr atomic_and_32 9F
.It Xr atomic_and_64_nv 9F Ta Xr atomic_and_64 9F
.It Xr atomic_and_uchar_nv 9F Ta Xr atomic_and_uchar 9F
.It Xr atomic_and_uint_nv 9F Ta Xr atomic_and_uint 9F
.It Xr atomic_and_ulong_nv 9F Ta Xr atomic_and_ulong 9F
.It Xr atomic_and_ushort_nv 9F Ta Xr atomic_and_ushort 9F
.It Xr atomic_cas_16 9F Ta Xr atomic_cas_32 9F
.It Xr atomic_cas_64 9F Ta Xr atomic_cas_8 9F
.It Xr atomic_cas_ptr 9F Ta Xr atomic_cas_uchar 9F
.It Xr atomic_cas_uint 9F Ta Xr atomic_cas_ulong 9F
.It Xr atomic_cas_ushort 9F Ta Xr atomic_clear_long_excl 9F
.It Xr atomic_dec_8_nv 9F Ta Xr atomic_dec_8 9F
.It Xr atomic_dec_16_nv 9F Ta Xr atomic_dec_16 9F
.It Xr atomic_dec_32_nv 9F Ta Xr atomic_dec_32 9F
.It Xr atomic_dec_64_nv 9F Ta Xr atomic_dec_64 9F
.It Xr atomic_dec_ptr_nv 9F Ta Xr atomic_dec_ptr 9F
.It Xr atomic_dec_uchar_nv 9F Ta Xr atomic_dec_uchar 9F
.It Xr atomic_dec_uint_nv 9F Ta Xr atomic_dec_uint 9F
.It Xr atomic_dec_ulong_nv 9F Ta Xr atomic_dec_ulong 9F
.It Xr atomic_dec_ushort_nv 9F Ta Xr atomic_dec_ushort 9F
.It Xr atomic_inc_8_nv 9F Ta Xr atomic_inc_8 9F
.It Xr atomic_inc_16_nv 9F Ta Xr atomic_inc_16 9F
.It Xr atomic_inc_32_nv 9F Ta Xr atomic_inc_32 9F
.It Xr atomic_inc_64_nv 9F Ta Xr atomic_inc_64 9F
.It Xr atomic_inc_ptr_nv 9F Ta Xr atomic_inc_ptr 9F
.It Xr atomic_inc_uchar_nv 9F Ta Xr atomic_inc_uchar 9F
.It Xr atomic_inc_uint_nv 9F Ta Xr atomic_inc_uint 9F
.It Xr atomic_inc_ulong_nv 9F Ta Xr atomic_inc_ulong 9F
.It Xr atomic_inc_ushort_nv 9F Ta Xr atomic_inc_ushort 9F
.It Xr atomic_or_8_nv 9F Ta Xr atomic_or_8 9F
.It Xr atomic_or_16_nv 9F Ta Xr atomic_or_16 9F
.It Xr atomic_or_32_nv 9F Ta Xr atomic_or_32 9F
.It Xr atomic_or_64_nv 9F Ta Xr atomic_or_64 9F
.It Xr atomic_or_uchar_nv 9F Ta Xr atomic_or_uchar 9F
.It Xr atomic_or_uint_nv 9F Ta Xr atomic_or_uint 9F
.It Xr atomic_or_ulong_nv 9F Ta Xr atomic_or_ulong 9F
.It Xr atomic_or_ushort_nv 9F Ta Xr atomic_or_ushort 9F
.It Xr atomic_set_long_excl 9F Ta Xr atomic_swap_8 9F
.It Xr atomic_swap_16 9F Ta Xr atomic_swap_32 9F
.It Xr atomic_swap_64 9F Ta Xr atomic_swap_ptr 9F
.It Xr atomic_swap_uchar 9F Ta Xr atomic_swap_uint 9F
.It Xr atomic_swap_ulong 9F Ta Xr atomic_swap_ushort 9F
.El
.Ss Memory Barriers
The kernel provides general purpose memory barriers that can be used
when required.
In general, when using items described in the
.Sx Synchronization Primitives
section, these are not required.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr membar_consumer 9F Ta Xr membar_enter 9F
.It Xr membar_exit 9F Ta Xr membar_producer 9F
.El
.Ss Virtual Memory and Pages
All platforms that the operating system supports have some form of
virtual memory which is managed in units of pages.
The page size varies between architectures and platforms.
For example, the smallest x86 page size is 4 KiB while SPARC
traditionally used 8 KiB pages.
These functions can be used to convert between pages and bytes.
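.Pp
The arithmetic behind these conversions can be sketched in userland as
follows.
The ex_ functions and the fixed 4 KiB page size are assumptions made purely
for illustration; real code should call the listed functions so that the
platform's actual page size is used.

```c
#include <stddef.h>

/* Assumed page size for illustration only; do not hard-code this. */
#define	EX_PAGESHIFT	12
#define	EX_PAGESIZE	((size_t)1 << EX_PAGESHIFT)

/* Bytes to pages, truncating any partial page. */
static size_t
ex_btop(size_t bytes)
{
	return (bytes >> EX_PAGESHIFT);
}

/* Bytes to pages, rounding a partial page up to a whole one. */
static size_t
ex_btopr(size_t bytes)
{
	return ((bytes >> EX_PAGESHIFT) +
	    ((bytes & (EX_PAGESIZE - 1)) != 0));
}

/* Pages to bytes. */
static size_t
ex_ptob(size_t pages)
{
	return (pages << EX_PAGESHIFT);
}
```

The rounding variant is the one to reach for when sizing a buffer, since a
request of one byte more than a page still needs two whole pages.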
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr btop 9F Ta Xr btopr 9F
.It Xr ddi_btop 9F Ta Xr ddi_btopr 9F
.It Xr ddi_ptob 9F Ta Xr ptob 9F
.El
.Ss Module and Device Framework
These functions are used as part of implementing kernel modules and
registering device drivers with the various kernel frameworks.
There are also functions here that are suitable for use in the
.Xr dev_ops 9S ,
.Xr cb_ops 9S ,
etc.
structures and for interrogating module information.
.Pp
The
.Xr mod_install 9F
and
.Xr mod_remove 9F
functions are used during a driver's
.Xr _init 9E
and
.Xr _fini 9E
functions.
.Pp
There are two different ways that drivers often manage their instance
state which is created during
.Xr attach 9E .
The first is the use of
.Xr ddi_set_driver_private 9F
and
.Xr ddi_get_driver_private 9F .
This stores a driver-specific value on the
.Vt dev_info_t
structure which allows it to be used during other operations.
Some device driver frameworks may use this themselves, making this
unavailable to the driver.
.Pp
The other path is to use the soft state suite of functions, which
dynamically grows to cover the number of instances of a device that
exist.
The soft state is generally initialized in the
.Xr _init 9E
entry point with
.Xr ddi_soft_state_init 9F
and then instances are allocated and freed during
.Xr attach 9E
and
.Xr detach 9E
with
.Xr ddi_soft_state_zalloc 9F
and
.Xr ddi_soft_state_free 9F ,
and then retrieved with
.Xr ddi_get_soft_state 9F .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_get_driver_private 9F Ta Xr ddi_get_soft_state 9F
.It Xr ddi_modclose 9F Ta Xr ddi_modopen 9F
.It Xr ddi_modsym 9F Ta Xr ddi_no_info 9F
.It Xr ddi_report_dev 9F Ta Xr ddi_set_driver_private 9F
.It Xr ddi_soft_state_fini 9F Ta Xr ddi_soft_state_free 9F
.It Xr ddi_soft_state_init 9F Ta Xr ddi_soft_state_zalloc 9F
.It Xr mod_info 9F Ta Xr mod_install 9F
.It Xr mod_modname 9F Ta Xr mod_remove 9F
.It Xr nochpoll 9F Ta Xr nodev 9F
.It Xr nulldev 9F Ta
.El
.Ss Device Tree Information
Devices are organized into a tree that is partially seeded by the
platform based on information discovered at boot and augmented with
additional information at runtime.
Every instance of a device driver is given a
.Vt "dev_info_t *"
.Pq device information
data structure which corresponds to information about an instance and
has a place in the tree.
When a driver requests an operation, such as allocating memory for DMA,
that request is passed up the tree and modified.
The same is true for other things like interrupts, event notifications,
or properties.
.Pp
There are many different informational properties about a device driver.
For example,
.Xr ddi_driver_name 9F
returns the name of the device driver,
.Xr ddi_get_name 9F
returns the name of the node in the tree,
.Xr ddi_get_parent 9F
returns a node's parent, and
.Xr ddi_get_instance 9F
returns the instance number of a specific driver.
.Pp
There are a series of properties that exist on the tree, the exact set
of which depends on the class of the device and is often documented in a
specific device class's manual.
For example, the
.Dq reg
property is used for PCI and PCIe devices to describe the various base
address registers, their types, and related information, which is
documented in
.Xr pci 5 .
.Pp
When getting a property, one can constrain the lookup to the current
instance or allow it to be passed up to the device's parents.
Which mode is appropriate depends on the specific class of driver, its
parent, and the property.
.Pp
Using a
.Vt "dev_info_t *"
pointer has to be done carefully.
When a device driver is in any of its
.Xr dev_ops 9S ,
.Xr cb_ops 9S ,
or similar callback functions that it has registered with the kernel,
then it can always safely use its own
.Vt "dev_info_t"
and those of any parents it discovers through
.Xr ddi_get_parent 9F .
However, it cannot assume the validity of any siblings or children
unless there are other circumstances that guarantee that they will not
disappear.
In the broader kernel, one should not assume that it is safe to use a
given
.Vt "dev_info_t *"
structure without the appropriate NDI
.Pq nexus driver interface
hold having been applied.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_binding_name 9F Ta Xr ddi_dev_is_sid 9F
.It Xr ddi_driver_major 9F Ta Xr ddi_driver_name 9F
.It Xr ddi_get_devstate 9F Ta Xr ddi_get_instance 9F
.It Xr ddi_get_name 9F Ta Xr ddi_get_parent 9F
.It Xr ddi_getlongprop_buf 9F Ta Xr ddi_getlongprop 9F
.It Xr ddi_getprop 9F Ta Xr ddi_getproplen 9F
.It Xr ddi_node_name 9F Ta Xr ddi_prop_create 9F
.It Xr ddi_prop_exists 9F Ta Xr ddi_prop_free 9F
.It Xr ddi_prop_get_int 9F Ta Xr ddi_prop_get_int64 9F
.It Xr ddi_prop_lookup_byte_array 9F Ta Xr ddi_prop_lookup_int_array 9F
.It Xr ddi_prop_lookup_int64_array 9F Ta Xr ddi_prop_lookup_string_array 9F
.It Xr ddi_prop_lookup_string 9F Ta Xr ddi_prop_lookup 9F
.It Xr ddi_prop_modify 9F Ta Xr ddi_prop_op 9F
.It Xr ddi_prop_remove_all 9F Ta Xr ddi_prop_remove 9F
.It Xr ddi_prop_undefine 9F Ta Xr ddi_prop_update_byte_array 9F
.It Xr ddi_prop_update_int_array 9F Ta Xr ddi_prop_update_int 9F
.It Xr ddi_prop_update_int64_array 9F Ta Xr ddi_prop_update_int64 9F
.It Xr ddi_prop_update_string_array 9F Ta Xr ddi_prop_update_string 9F
.It Xr ddi_prop_update 9F Ta Xr ddi_root_node 9F
.It Xr ddi_slaveonly 9F Ta
.El
.Ss Copying Data to and from Userland
The kernel operates in a different context from userland.
One does not simply access user memory.
This is enforced either by the architecture's memory model, where user
address space isn't even present in the kernel's virtual address space,
or by architectural mechanisms such as Supervisor Mode Access Prevention
.Pq SMAP
on x86.
.Pp
To facilitate accessing memory, the kernel provides a few routines that
can be used.
In most contexts the main thing to use is
.Xr ddi_copyin 9F
and
.Xr ddi_copyout 9F .
These will safely dereference addresses and ensure that the address is
appropriate depending on whether this is coming from the user or kernel.
When operating with the kernel's
.Vt uio_t
structure, which is mostly used when processing read and write
requests,
.Xr uiomove 9F
is the go-to function instead.
.Pp
When reading data from userland into the kernel, there is another
concern: the data model.
The most common place this comes up is in an
.Xr ioctl 9E
handler or other places where the kernel is operating on data that isn't
fixed size.
Particularly in C, though this applies to other languages, structures
and unions vary in their size and alignment requirements between 32-bit
and 64-bit processes.
The same even applies if one uses pointers or the
.Vt long ,
.Vt size_t ,
or similar types in C.
In supported 32-bit and 64-bit environments these types are 4 and 8
bytes respectively.
To account for this, when data is not fixed size between all data
models, the driver must look at the data model of the process it is
copying data from.
.Pp
The simplest way to solve this problem is to try to make the data
structure the same across the different models.
It's not sufficient to just use the same structure definition and fixed
size types as the alignment and padding between the two can vary.
For example, the alignment of a 64-bit integer like a
.Vt uint64_t
can change between a 32-bit and 64-bit data model.
One way to check for the data structures being identical is to leverage
the
.Xr ctfdiff 1
program, generally with the
.Fl I
option.
.Pp
However, there are times when a structure simply can't be the same, such
as when we're encoding a pointer into the structure or a type like
.Vt size_t .
When this happens, the most natural way to handle it is to use the
.Xr ddi_model_convert_from 9F
function which can determine the appropriate model from the ioctl's
arguments.
This provides a natural way to copy a structure in and out in the
appropriate data model and convert it at those points to the kernel's
native form.
.Pp
An alternate way to approach the data model is to use the
.Xr STRUCT_DECL 9F
functions, but as this requires wrapping every access to every member,
the
.Xr ddi_model_convert_from 9F
approach, which leaves converting values and checking that limits
aren't exceeded until the end, is often preferred.
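.Pp
The conversion step itself can be sketched as follows.
This is a userland model and not driver code: the
.Vt ex_ioc_t
and
.Vt ex_ioc32_t
structures and both helper functions are hypothetical, and the sketch
assumes that the native form is the 64-bit one.
In a driver, the 32-bit layout would be selected based on the result of
.Xr ddi_model_convert_from 9F
and the data itself would be moved with
.Xr ddi_copyin 9F
and
.Xr ddi_copyout 9F .

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument in the native form. */
typedef struct ex_ioc {
	void	*ei_buf;
	size_t	ei_len;
} ex_ioc_t;

/* The same argument as a 32-bit process lays it out. */
typedef struct ex_ioc32 {
	uint32_t	ei_buf;
	uint32_t	ei_len;
} ex_ioc32_t;

/* Widen a 32-bit request into the native form; this cannot fail. */
void
ex_ioc_from32(const ex_ioc32_t *in, ex_ioc_t *out)
{
	out->ei_buf = (void *)(uintptr_t)in->ei_buf;
	out->ei_len = in->ei_len;
}

/*
 * Narrow a native result for a 32-bit caller. Values that do not fit in
 * the 32-bit fields are rejected rather than silently truncated.
 */
bool
ex_ioc_to32(const ex_ioc_t *in, ex_ioc32_t *out)
{
	if ((uintptr_t)in->ei_buf > UINT32_MAX || in->ei_len > UINT32_MAX)
		return (false);

	out->ei_buf = (uint32_t)(uintptr_t)in->ei_buf;
	out->ei_len = (uint32_t)in->ei_len;
	return (true);
}
```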
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bp_copyin 9F Ta Xr bp_copyout 9F
.It Xr copyin 9F Ta Xr copyout 9F
.It Xr ddi_copyin 9F Ta Xr ddi_copyout 9F
.It Xr ddi_model_convert_from 9F Ta Xr SIZEOF_PTR 9F
.It Xr SIZEOF_STRUCT 9F Ta Xr STRUCT_BUF 9F
.It Xr STRUCT_DECL 9F Ta Xr STRUCT_FADDR 9F
.It Xr STRUCT_FGET 9F Ta Xr STRUCT_FGETP 9F
.It Xr STRUCT_FSET 9F Ta Xr STRUCT_FSETP 9F
.It Xr STRUCT_HANDLE 9F Ta Xr STRUCT_INIT 9F
.It Xr STRUCT_SET_HANDLE 9F Ta Xr STRUCT_SIZE 9F
.It Xr uiomove 9F Ta Xr ureadc 9F
.It Xr uwritec 9F Ta
.El
.Ss Device Register Setup and Access
The kernel abstracts out accessing registers on a device on behalf of
drivers.
This allows a similar set of interfaces to be used whether the registers
are found within a PCI BAR, utilizing I/O ports, memory mapped
registers, or some other scheme.
Devices with registers all have a
.Dq reg
property that is set up by their parent device, generally a kernel
framework as is the case for PCIe devices, and the meaning is a contract
between the two.
Register sets are identified by a numeric ID, which varies based on the
device type.
For example, the first BAR of a PCI device is defined as register set 1.
On the other hand, the AMD GPIO controller might have three register sets
because of how the hardware design splits them up.
The meaning of the registers and their semantics is still
device-specific.
The kernel doesn't know how to interpret the actual registers of, say, a
PCIe device; it only knows that they exist.
.Pp
To begin register setup, one often first looks at the number of
register sets that exist and their size.
Most PCI-based device drivers will skip calling
.Xr ddi_dev_nregs 9F
and will just move straight to calling
.Xr ddi_dev_regsize 9F
to determine the size of a register set that they are interested in.
To actually map the registers, a device driver will call
.Xr ddi_regs_map_setup 9F
which requires both a register set and a series of attributes and
returns an access handle that is used to actually read and write the
registers.
When setting up registers, one must have a corresponding
.Vt ddi_device_acc_attr_t
structure which is used to define what endianness the register set is
in, whether any kind of reordering is allowed
.Po
if in doubt specify
.Dv DDI_STRICTORDER_ACC
.Pc ,
and whether any particular error handling is being used.
The structure and all of its different options are described in
.Xr ddi_device_acc_attr 9S .
.Pp
Once a register handle is obtained, it's easy to read and write the
register space.
Functions are organized based on the size of the access.
Most situations call for the use of the
.Xr ddi_get8 9F ,
.Xr ddi_get16 9F ,
.Xr ddi_get32 9F ,
and
.Xr ddi_get64 9F
functions to read a register and the
.Xr ddi_put8 9F ,
.Xr ddi_put16 9F ,
.Xr ddi_put32 9F ,
and
.Xr ddi_put64 9F
functions to set a register value.
While there are the ddi_io_ and ddi_mem_ families of functions below,
these are rarely needed and are mostly present for compatibility.
The kernel will automatically perform the appropriate type of register
read for the device type in question.
.Pp
Once a register set is no longer being used, the
.Xr ddi_regs_map_free 9F
function should be used to release resources.
In most cases, this happens while executing the
.Xr detach 9E
entry point.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_dev_nregs 9F Ta Xr ddi_dev_regsize 9F
.It Xr ddi_device_copy 9F Ta Xr ddi_device_zero 9F
.It Xr ddi_regs_map_free 9F Ta Xr ddi_regs_map_setup 9F
.It Xr ddi_get8 9F Ta Xr ddi_get16 9F
.It Xr ddi_get32 9F Ta Xr ddi_get64 9F
.It Xr ddi_io_get8 9F Ta Xr ddi_io_get16 9F
.It Xr ddi_io_get32 9F Ta Xr ddi_io_put8 9F
.It Xr ddi_io_put16 9F Ta Xr ddi_io_put32 9F
.It Xr ddi_io_rep_get8 9F Ta Xr ddi_io_rep_get16 9F
.It Xr ddi_io_rep_get32 9F Ta Xr ddi_io_rep_put8 9F
.It Xr ddi_io_rep_put16 9F Ta Xr ddi_io_rep_put32 9F
.It Xr ddi_map_regs 9F Ta Xr ddi_mem_get8 9F
.It Xr ddi_mem_get16 9F Ta Xr ddi_mem_get32 9F
.It Xr ddi_mem_get64 9F Ta Xr ddi_mem_put8 9F
.It Xr ddi_mem_put16 9F Ta Xr ddi_mem_put32 9F
.It Xr ddi_mem_put64 9F Ta Xr ddi_mem_rep_get8 9F
.It Xr ddi_mem_rep_get16 9F Ta Xr ddi_mem_rep_get32 9F
.It Xr ddi_mem_rep_get64 9F Ta Xr ddi_mem_rep_put8 9F
.It Xr ddi_mem_rep_put16 9F Ta Xr ddi_mem_rep_put32 9F
.It Xr ddi_mem_rep_put64 9F Ta Xr ddi_peek8 9F
.It Xr ddi_peek16 9F Ta Xr ddi_peek32 9F
.It Xr ddi_peek64 9F Ta Xr ddi_poke8 9F
.It Xr ddi_poke16 9F Ta Xr ddi_poke32 9F
.It Xr ddi_poke64 9F Ta Xr ddi_put8 9F
.It Xr ddi_put16 9F Ta Xr ddi_put32 9F
.It Xr ddi_put64 9F Ta Xr ddi_rep_get8 9F
.It Xr ddi_rep_get16 9F Ta Xr ddi_rep_get32 9F
.It Xr ddi_rep_get64 9F Ta Xr ddi_rep_put8 9F
.It Xr ddi_rep_put16 9F Ta Xr ddi_rep_put32 9F
.It Xr ddi_rep_put64 9F Ta
.El
.Ss DMA Related Functions
Most high-performance devices provide first-class support for DMA
.Pq direct memory access .
DMA allows a transfer between a device and memory to occur
asynchronously and generally without a thread's specific involvement.
Today, most DMA is provided directly by devices and the corresponding
device scheme.
Take PCI and PCI Express for example.
The idea of DMA is built into the PCIe standard, so basic support for
it exists and little special programming is required.
However, this hasn't always been true, and there are still cases that
rely on a 3rd party DMA engine.
If we consider the PCIe example, the PCIe device directly performs reads
and writes to main memory on its own.
However, in the 3rd party case, there is a distinct controller that is
neither the device nor memory that facilitates this, which is called a
DMA engine.
DMA engines are not something that needs to be thought about on most
platforms that illumos is present on; however, they still exist in some
embedded and related contexts.
.Pp
The first thing that a driver needs to do to set up DMA is to understand
the constraints of the device and bus.
These constraints are described in a series of attributes in the
.Vt ddi_dma_attr_t
structure which is defined in
.Xr ddi_dma_attr 9S .
The reason that attributes exist is because different devices, and
sometimes different memory uses with a device, have different
requirements for memory.
A simple example of this is that not all devices can accept memory
addresses that are 64 bits wide and may have to be constrained to the
lower 32 bits of memory.
Another common constraint is how this memory is chunked up.
Some devices may require that all of the DMA memory be contiguous, while
others can allow that to be broken up into, say, up to 4 or 8 different
regions.
.Pp
When memory is allocated for DMA it isn't immediately mapped into the
kernel's address space.
The addresses that describe a DMA region are defined in DMA cookies,
several of which may make up a request.
However, those addresses are always physical addresses or addresses that
are virtualized by an IOMMU.
There are some cases where the kernel or a driver needs to be able to
access that memory, such as memory that represents a networking packet.
The IP stack will expect to be able to actually read the data it's
given.
.Pp
To begin allocating DMA memory, a driver first fills out its
attribute structure.
Once that's ready, the DMA allocation process can begin.
This starts off by a driver calling
.Xr ddi_dma_alloc_handle 9F .
This handle is used throughout the lifetime of a given DMA memory buffer,
but it can be used across multiple operations that a device or the
kernel may perform.
The next step is to actually request that the kernel allocate some
amount of memory in the kernel for this DMA request.
This phase actually allocates addresses in virtual address space for the
activity and also requires a register attribute object that is discussed
in
.Sx Device Register Setup and Access .
Armed with this a driver can now call
.Xr ddi_dma_mem_alloc 9F
to specify how much memory it is looking for.
If this is successful, a virtual address, the actual length of the
region, and an access handle will be returned.
.Pp
At this point, the virtual address region is present.
Most drivers will access this virtual address range directly and will
ignore the register access handle.
The side effect of this is that they will handle all endianness issues
with the memory region themselves.
If the driver would prefer to go through the handle, then it can use the
register access functions discussed earlier.
.Pp
Before the memory can be programmed into the device, it must be bound to
a series of physical addresses or addresses virtualized by an IOMMU.
While the kernel presents the illusion of a single consistent virtual
address range for applications, the physical reality can be quite
different.
When the driver is ready it calls
.Xr ddi_dma_addr_bind_handle 9F
to create the mapping to well-known physical addresses.
.Pp
These addresses are stored in a series of cookies.
A driver can determine the number of cookies for a given request by
utilizing its DMA handle and calling
.Xr ddi_dma_ncookies 9F
and then pairing that with
.Xr ddi_dma_cookie_get 9F .
These DMA cookies will not change and can be used time and time again
until
.Xr ddi_dma_unbind_handle 9F
is called.
With this information in hand, a physical device can be programmed with
these addresses and let loose to perform I/O.
.Pp
When performing I/O to and from a device, synchronization is a vitally
important thing which ensures that the actual state in memory is
coherent with the rest of the CPU's internal structures such as caches.
In general, a given DMA request is only going in one direction: for a
device or for the local CPU.
In either case, the
.Xr ddi_dma_sync 9F
function must be called: either after the kernel is done writing to a
region of DMA memory and before it triggers the device, or after the
device has indicated that some activity has completed and before the
kernel checks the result.
.Pp
Some DMA operations utilize what are called DMA windows.
The most common consumer is something like a disk device where DMA
operations to a given series of sectors can be split up into different
chunks where as long as all the transfers are performed, the
intermediate states are acceptable.
Put another way, because of how SCSI and SAS commands are designed,
block devices can basically take a given I/O request and break it into
multiple independent I/Os that will equate to the same final item.
.Pp
When a device supports this mode of operation and it is opted into, then
a DMA allocation may result in the use of DMA windows.
This allows for cases where the kernel can't perform a DMA allocation
for the entire request, but instead can allocate a partial region and
then walk through each part one at a time.
This is uncommon outside of block devices and is usually related to
calling
.Xr ddi_dma_buf_bind_handle 9F .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_dma_addr_bind_handle 9F Ta Xr ddi_dma_alloc_handle 9F
.It Xr ddi_dma_buf_bind_handle 9F Ta Xr ddi_dma_burstsizes 9F
.It Xr ddi_dma_cookie_get 9F Ta Xr ddi_dma_cookie_iter 9F
.It Xr ddi_dma_cookie_one 9F Ta Xr ddi_dma_free_handle 9F
.It Xr ddi_dma_getwin 9F Ta Xr ddi_dma_mem_alloc 9F
.It Xr ddi_dma_mem_free 9F Ta Xr ddi_dma_ncookies 9F
.It Xr ddi_dma_nextcookie 9F Ta Xr ddi_dma_numwin 9F
.It Xr ddi_dma_set_sbus64 9F Ta Xr ddi_dma_sync 9F
.It Xr ddi_dma_unbind_handle 9F Ta Xr ddi_dmae_1stparty 9F
.It Xr ddi_dmae_alloc 9F Ta Xr ddi_dmae_disable 9F
.It Xr ddi_dmae_enable 9F Ta Xr ddi_dmae_getattr 9F
.It Xr ddi_dmae_getcnt 9F Ta Xr ddi_dmae_prog 9F
.It Xr ddi_dmae_release 9F Ta Xr ddi_dmae_stop 9F
.It Xr ddi_dmae 9F Ta
.El
.Ss Interrupt Handler Related Functions
Interrupts are a central part of the role of device drivers and one of
the things that's important to get right.
Interrupts come in different types: fixed, MSI, and MSI-X.
The kinds that are available depend on the device and the rest of the
system.
For example, MSI and MSI-X interrupts are generally specific to PCI and
PCI Express devices.
To begin the interrupt allocation process, the first thing a driver
needs to do is to discover what types of interrupts are supported with
.Xr ddi_intr_get_supported_types 9F .
Then, the driver should work through the supported types, preferring
MSI-X, then MSI, and finally fixed interrupts, and try to allocate
interrupts.
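.Pp
The preference order described above can be sketched as follows.
This is a self-contained model rather than driver code: the
.Dv EX_INTR_
flags stand in for the kernel's interrupt type flags, and the
allocation callback is a hypothetical placeholder for whatever
allocation logic the driver performs for a given type.

```c
#include <stddef.h>

/*
 * Stand-in flags for this example; a driver would use the kernel's
 * DDI_INTR_TYPE_* values instead.
 */
#define	EX_INTR_FIXED	0x1
#define	EX_INTR_MSI	0x2
#define	EX_INTR_MSIX	0x4

/* Hypothetical allocation hook: returns 0 if 'type' could be allocated. */
typedef int (*ex_alloc_f)(int type, void *arg);

/*
 * Walk the supported types from most to least preferred and return the
 * first one that allocates successfully, or 0 if none of them do.
 */
int
ex_intr_pick(int supported, ex_alloc_f alloc, void *arg)
{
	static const int order[] = {
		EX_INTR_MSIX, EX_INTR_MSI, EX_INTR_FIXED
	};
	size_t i;

	for (i = 0; i < sizeof (order) / sizeof (order[0]); i++) {
		if ((supported & order[i]) == 0)
			continue;
		if (alloc(order[i], arg) == 0)
			return (order[i]);
	}

	return (0);
}

/* Sample hook: 'arg' points to a mask of types whose allocation fails. */
int
ex_alloc_refuse(int type, void *arg)
{
	int refused = *(int *)arg;

	return ((type & refused) != 0 ? -1 : 0);
}
```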
.Pp
Drivers first need to know how many interrupts they require.
For example, a networking driver may want to have an interrupt made
available for each ring that it has.
To discover the number of interrupts available, the driver should call
.Xr ddi_intr_get_navail 9F .
If there are sufficient interrupts, it can proceed to actually
allocate the interrupts with
.Xr ddi_intr_alloc 9F .
When allocating interrupts, callers need to check to see how many
interrupts the system actually gave them.
Just because an interrupt is allocated does not mean that it will fire
or be ready to use; there is a series of additional steps that the
driver must take.
.Pp
To enable the interrupt, the driver should first get the interrupt
capabilities with
.Xr ddi_intr_get_cap 9F
and the priority of the interrupt with
.Xr ddi_intr_get_pri 9F .
The priority must be used while creating mutexes and related
synchronization primitives that will be used during the interrupt
handler.
At this point, the driver can go ahead and register the functions that
will be called with each allocated interrupt with the
.Xr ddi_intr_add_handler 9F
function.
The arguments can vary for each allocated interrupt.
It is common to have an interrupt-specific data structure passed in one
of the arguments or an interrupt number, while the other argument is
generally the driver's instance-specific data structure.
.Pp
At this point, the last step for the interrupt to be made active from
the kernel's perspective is to enable it.
This will use either the
.Xr ddi_intr_block_enable 9F
or
.Xr ddi_intr_enable 9F
functions depending on the interrupt's capabilities.
The reason that these are different is because some interrupt types
.Pq MSI
require that all interrupts in a group be enabled and disabled at the
same time.
This is indicated with the
.Dv DDI_INTR_FLAG_BLOCK
flag found in the interrupt's capabilities.
Once that is called, interrupts that are generated by a device will be
delivered to the registered function.
.Pp
It's important to note that there is often device-specific interrupt
setup that is required.
While the kernel takes care of updating any pieces of the processor's
interrupt controller, I/O crossbar, or the PCI MSI and MSI-X
capabilities, many devices have device-specific registers that are used
to manage, set up, and acknowledge interrupts.
These registers or other controls are often capable of separately
masking interrupts and are generally what should be used if there are
times that you need to separately enable or disable interrupts such as
to poll an I/O ring.
.Pp
When unwinding interrupts, one needs to work in the reverse order here.
Until
.Xr ddi_intr_block_disable 9F
or
.Xr ddi_intr_disable 9F
is called, one should assume that their interrupt handler will be
called.
Due to cases where an interrupt is shared between multiple devices, this
can happen even if the device is quiesced!
Only after that is done is it safe to then free the interrupts with a
call to
.Xr ddi_intr_free 9F .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_add_intr 9F Ta Xr ddi_add_softintr 9F
.It Xr ddi_get_iblock_cookie 9F Ta Xr ddi_get_soft_iblock_cookie 9F
.It Xr ddi_intr_add_handler 9F Ta Xr ddi_intr_add_softint 9F
.It Xr ddi_intr_alloc 9F Ta Xr ddi_intr_block_disable 9F
.It Xr ddi_intr_block_enable 9F Ta Xr ddi_intr_clr_mask 9F
.It Xr ddi_intr_disable 9F Ta Xr ddi_intr_dup_handler 9F
.It Xr ddi_intr_enable 9F Ta Xr ddi_intr_free 9F
.It Xr ddi_intr_get_cap 9F Ta Xr ddi_intr_get_hilevel_pri 9F
.It Xr ddi_intr_get_navail 9F Ta Xr ddi_intr_get_nintrs 9F
.It Xr ddi_intr_get_pending 9F Ta Xr ddi_intr_get_pri 9F
.It Xr ddi_intr_get_softint_pri 9F Ta Xr ddi_intr_get_supported_types 9F
.It Xr ddi_intr_hilevel 9F Ta Xr ddi_intr_remove_handler 9F
.It Xr ddi_intr_remove_softint 9F Ta Xr ddi_intr_set_cap 9F
.It Xr ddi_intr_set_mask 9F Ta Xr ddi_intr_set_nreq 9F
.It Xr ddi_intr_set_pri 9F Ta Xr ddi_intr_set_softint_pri 9F
.It Xr ddi_intr_trigger_softint 9F Ta Xr ddi_remove_intr 9F
.It Xr ddi_remove_softintr 9F Ta Xr ddi_trigger_softintr 9F
.El
.Ss Minor Nodes
For a device driver to be accessed by a program in user space
.Pq or with the kernel layered device interface ,
it must create a minor node.
Minor nodes are created under
.Pa /devices
.Pq Xr devfs 4FS
and are tied to the instance of a device driver via its
.Vt dev_info_t .
The
.Xr devfsadm 8
daemon and the
.Pa /dev
file system
.Po
sdev,
.Xr dev 4FS
.Pc
are responsible for creating a coherent set of names that user programs
access.
Drivers create these minor nodes using the
.Xr ddi_create_minor_node 9F
function listed below.
.Pp
In UNIX tradition, character, block, and STREAMS device special files
are identified by a major and minor number.
All instances of a given driver share the same major number, which means
that a device driver must coordinate the minor number space across
.Em all
instances.
While a minor node is created with a fixed minor number, it is possible
to change the minor number while processing an
.Xr open 9E
call, allowing subsequent character device operations to uniquely
identify a particular caller.
This is usually referred to as a driver that
.Dq clones .
.Pp
When drivers aren't performing cloning, then usually the minor number
used when creating the minor node is some fixed offset or multiple of
the driver's instance number.
When a driver clones and needs to allocate and manage a minor number
space, an ID space is usually leveraged whose IDs are in the
range from 0 through
.Dv MAXMIN32 .
There are several different strategies for tracking data structures as
they relate to minor numbers.
Sometimes, the soft state functionality is used.
Others might keep an AVL tree around or tie the data to some other data
structure.
The method chosen often depends on the specifics of the implementation
and its broader context.
.Pp
The
.Vt dev_t
type represents the combined major and minor number.
It can be taken apart with the
.Xr getmajor 9F
and
.Xr getminor 9F
functions and then reconstructed with the
.Xr makedevice 9F
function.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_create_minor_node 9F Ta Xr ddi_remove_minor_node 9F
.It Xr getmajor 9F Ta Xr getminor 9F
.It Xr devfs_clean 9F Ta Xr makedevice 9F
.El
.Ss Accessing Time, Delays, and Periodic Events
The kernel provides a number of ways to understand time in the system.
In particular it provides a few different clocks and time measurements:
.Bl -tag -width Ds
.It High-resolution monotonic time
The kernel provides access to a high-resolution monotonic clock that is
tracked in nanoseconds.
This clock is perfect for measuring durations and is accessed via
.Xr gethrtime 9F .
Unlike the real-time clock, this clock is not subject to adjustments by
a time synchronization daemon and is the preferred clock that drivers
should be using for tracking events.
The high-resolution clock is consistent across CPUs, meaning that you
may call
.Xr gethrtime 9F
on one CPU and the value will be consistent with what is returned on
any other CPU, even if a thread is migrated between them.
.Pp
The high-resolution clock is implemented using architecture- and
platform-specific means.
For example, on x86 it is generally backed by the TSC
.Pq time stamp counter .
.It Real-time
The real-time clock tracks time as humans perceive it.
This clock is accessed using
.Xr ddi_get_time 9F .
If the system is running a time synchronization daemon that leverages
the network time protocol, then this time may be in sync with other
systems
.Pq subject to some amount of variance ;
however, it is critical that this is not assumed.
.Pp
In general, this time should not be used by drivers for any purpose.
It can jump around, drift, and most aspects in the kernel are not based
on the real-time clock.
For any device timing activities, the high-resolution clock should be
used.
.It Tick-based monotonic time
The kernel has a running periodic function that fires based on the rate
dictated by the
.Va hz
variable, generally operating at 100 or 1000 Hz.
The current number of ticks since boot is accessible through the
.Xr ddi_get_lbolt 9F
function.
When functions operate in units of ticks, this is what they are
tracking.
This value can be converted to and from microseconds using the
.Xr drv_usectohz 9F
and
.Xr drv_hztousec 9F
functions.
.Pp
In general, drivers should prefer the high-resolution monotonic clock
for tracking events internally.
.El
.Pp
With these different timing mechanisms, the kernel provides a few
different ways to delay execution or to get a callback after some
amount of time passes.
.Pp
The
.Xr delay 9F
and
.Xr drv_usecwait 9F
functions are used to block the execution of the current thread.
.Xr delay 9F
can be used in conditions where sleeping and blocking is allowed,
whereas
.Xr drv_usecwait 9F
is a busy-wait, which is appropriate for some device drivers,
particularly when in high-level interrupt context.
.Pp
The kernel also allows a function to be called after some time has
elapsed.
This callback occurs on a different thread and will be executed in
.Sy kernel
context.
A timeout can be scheduled in the future with the
.Xr timeout 9F
function and cancelled with the
.Xr untimeout 9F
function.
There is also a STREAMS-specific version,
.Xr qtimeout 9F ,
that can be used when circumstances require it.
.Pp
These are all considered one-shot events.
That is, they will only happen once after being scheduled.
If instead, a driver requires periodic behavior, such as needing
something to occur every second, then it should use the
.Xr ddi_periodic_add 9F
function to establish that.
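.Pp
The relationship between ticks and microseconds described earlier in
this section reduces to simple arithmetic, modeled here in userland.
The
.Sy ex_
functions are hypothetical and take the tick rate as an argument.
The sketch assumes that the rate divides evenly into one million and
that conversions to ticks round up; both are assumptions of this
example rather than statements about the kernel's implementation, and
drivers should call
.Xr drv_usectohz 9F
and
.Xr drv_hztousec 9F .

```c
#include <stdint.h>

#define	EX_MICROSEC	1000000

/*
 * Microseconds to ticks at a rate of 'hz' ticks per second. Rounding up
 * means a short, non-zero interval never turns into zero ticks.
 */
int64_t
ex_usectohz(int64_t usec, int hz)
{
	int64_t usec_per_tick = EX_MICROSEC / hz;

	return ((usec + usec_per_tick - 1) / usec_per_tick);
}

/* Ticks to microseconds at a rate of 'hz' ticks per second. */
int64_t
ex_hztousec(int64_t ticks, int hz)
{
	return (ticks * (EX_MICROSEC / hz));
}
```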
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr delay 9F Ta Xr ddi_get_lbolt 9F
.It Xr ddi_get_lbolt64 9F Ta Xr ddi_get_time 9F
.It Xr ddi_periodic_add 9F Ta Xr ddi_periodic_delete 9F
.It Xr drv_hztousec 9F Ta Xr drv_usectohz 9F
.It Xr drv_usecwait 9F Ta Xr gethrtime 9F
.It Xr qtimeout 9F Ta Xr quntimeout 9F
.It Xr timeout 9F Ta Xr untimeout 9F
.El
.Ss Task Queues
A task queue provides an asynchronous processing mechanism that can be
used by drivers and the broader system.
A task queue can be created with
.Xr ddi_taskq_create 9F
and sized with a given number of threads and a relative priority of those
threads.
Once created, tasks can be dispatched to the queue with
.Xr ddi_taskq_dispatch 9F .
The different functions and arguments dispatched do not need to be the
same and can vary from invocation to invocation.
However, it is the caller's responsibility to ensure that any referenced
memory is valid until the task queue is done processing.
It is possible to create a barrier for a task queue by using the
.Xr ddi_taskq_wait 9F
function.
.Pp
While task queues are a flexible mechanism for handling and processing
events that occur in a well-defined context, they do not have an
inherent backpressure mechanism built in.
This means it is possible to add events to a task queue faster than they
can be processed.
For high-volume events, this must be considered before just dispatching
an event.
Do not rely on a non-sleeping allocation in the task queue dispatch
context.
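.Pp
One possible way to bound the amount of outstanding work, sketched here
as a userland model using C11 atomics, is to reserve a slot before
dispatching and to release it when the task finishes.
The
.Vt ex_tq_limit_t
type and its functions are hypothetical and are not part of the task
queue interfaces; a driver would build the equivalent from the kernel's
own atomic or locking primitives and pick a limit suited to its
workload.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Tracks how many tasks are outstanding against a fixed ceiling. */
typedef struct ex_tq_limit {
	atomic_uint	etl_pending;
	unsigned int	etl_max;
} ex_tq_limit_t;

void
ex_tq_limit_init(ex_tq_limit_t *l, unsigned int max)
{
	atomic_init(&l->etl_pending, 0);
	l->etl_max = max;
}

/*
 * Try to reserve a slot before dispatching. Returns false when the
 * ceiling has been reached, in which case the caller should drop,
 * coalesce, or defer the event rather than dispatch it.
 */
bool
ex_tq_reserve(ex_tq_limit_t *l)
{
	unsigned int cur = atomic_load(&l->etl_pending);

	while (cur < l->etl_max) {
		/* On failure 'cur' is reloaded and the limit rechecked. */
		if (atomic_compare_exchange_weak(&l->etl_pending, &cur,
		    cur + 1))
			return (true);
	}

	return (false);
}

/* Called at the end of the task function to give the slot back. */
void
ex_tq_release(ex_tq_limit_t *l)
{
	atomic_fetch_sub(&l->etl_pending, 1);
}
```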
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_taskq_create 9F Ta Xr ddi_taskq_destroy 9F
.It Xr ddi_taskq_dispatch 9F Ta Xr ddi_taskq_resume 9F
.It Xr ddi_taskq_suspend 9F Ta Xr ddi_taskq_suspended 9F
.It Xr ddi_taskq_wait 9F Ta
.El
.Ss Credential Management and Privileges
Not everything in the system has the same power to impact it.
To determine the permissions and context of a caller, the
.Vt cred_t
data structure encapsulates a number of different things including the
traditional user and group IDs, but also the zone that one is operating
in the context of and the associated privileges that the caller has.
While this concept is more often thought of due to userland processes being
associated with specific users, these same principles apply to different
threads in the kernel.
Not all kernel threads are allowed to indiscriminately do what they
want; they can be constrained by the same privilege model that processes
are, which is discussed in
.Xr privileges 7 .
.Pp
Most operations that device drivers implement are given a credential.
However, from within the kernel, a credential can be obtained that
refers to a specific zone, the current process, or a generic kernel
credential.
.Pp
It is up to drivers and the kernel writ large to check whether a given
credential is authorized to perform a given operation.
This is encapsulated by the various privilege checks that exist.
The most common check used is
.Xr drv_priv 9F
which checks for
.Dv PRIV_SYS_DEVICES .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr CRED 9F Ta Xr crdup 9F
.It Xr crfree 9F Ta Xr crget 9F
.It Xr crgetgid 9F Ta Xr crgetgroups 9F
.It Xr crgetngroups 9F Ta Xr crgetrgid 9F
.It Xr crgetruid 9F Ta Xr crgetsgid 9F
.It Xr crgetsuid 9F Ta Xr crgetuid 9F
.It Xr crgetzoneid 9F Ta Xr crhold 9F
.It Xr ddi_get_cred 9F Ta Xr drv_priv 9F
.It Xr kcred 9F Ta Xr priv_getbyname 9F
.It Xr priv_policy_choice 9F Ta Xr priv_policy_only 9F
.It Xr priv_policy 9F Ta Xr zone_kcred 9F
.El
.Ss Device ID Management
Device IDs are a means of establishing a unique ID for a device in the
kernel.
These unique IDs are generally tied to something from the device's
hardware such as a serial number, but can also be fabricated
and stored on the device.
These device IDs are used by other subsystems like ZFS to record
information about a device because the actual
.Pa /devices
path that a device resides at may change when it is moved around in
the system.
.Pp
Device drivers, particularly those that represent block devices,
should first call
.Xr ddi_devid_init 9F
to initialize the device ID data structure.
After that is done, it is then safe to call
.Xr ddi_devid_register 9F
to notify the kernel about the ID.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_devid_compare 9F Ta Xr ddi_devid_free 9F
.It Xr ddi_devid_get 9F Ta Xr ddi_devid_init 9F
.It Xr ddi_devid_register 9F Ta Xr ddi_devid_sizeof 9F
.It Xr ddi_devid_str_decode 9F Ta Xr ddi_devid_str_encode 9F
.It Xr ddi_devid_str_free 9F Ta Xr ddi_devid_unregister 9F
.It Xr ddi_devid_valid 9F Ta
.El
.Ss Message Block Functions
The
.Vt "mblk_t"
data structure is used to chain together messages which are used throughout
the kernel for different subsystems including all of networking,
terminals, STREAMS, USB, and more.
.Pp
Message blocks are chained together by two different
pointers:
.Fa b_cont
and
.Fa b_next .
When a message is split across multiple data buffers, they are linked by
the
.Fa b_cont
pointer.
However, multiple distinct messages can be chained together and linked
by the
.Fa b_next
pointer.
Let's look at this in the context of a series of networking packets.
If we were given a chain of, say, 10 UDP packets, each UDP
packet is considered an independent message and would be linked from one
to the next, in the order they should be transmitted, with the
.Fa b_next
pointer.
An individual message may be entirely in one message block, in
which case its
.Fa b_cont
pointer would be
.Dv NULL ,
but if, say, the packet were split into a 100 byte data buffer that
contained the headers and then a 1000 byte data buffer that contained
the actual packet data, those two would be linked together by
.Fa b_cont .
A continued message would never have its next pointer used to link it to
a wholly different message.
Visually you might see this as:
.Bd -literal
  +---------------+
  | UDP Message 0 |
  | Bytes 0-1100  |
  | b_cont     ---+--> NULL
  | b_next  +     |
  +---------|-----+
            |
            v
  +---------------+    +----------------+
  | UDP Message 1 |    | UDP Message 1+ |
  | Bytes 0-100   |    | Bytes 100-1100 |
  | b_cont     ---+--> | b_cont     ----+->NULL
  | b_next  +     |    | b_next     ----+->NULL
  +---------|-----+    +----------------+
            |
           ...
            |
            v
  +---------------+
  | UDP Message 9 |
  | Bytes 0-1100  |
  | b_cont     ---+--> NULL
  | b_next     ---+--> NULL
  +---------------+
.Ed
.Pp
Message blocks all have an associated data block which contains the
actual data that is present.
Multiple message blocks can share the same data block as well.
The data block has a notion of a type, which is generally
.Dv M_DATA
which signifies that they operate on data.
.Pp
To allocate message blocks, one generally uses the
.Xr allocb 9F
function to create one; however, you can also create message blocks
using your own source of data through functions like
.Xr desballoc 9F .
This is generally used when one wants to use memory that was originally
used for DMA to pass data back into the kernel, such as in a networking
device driver.
When this happens, a callback function will be called once the last user
of the data block is done with it.
.Pp
The functions listed below often end in either
.Dq msg
or
.Dq b
to indicate that they will operate on an entire message and follow the
.Fa b_cont
pointer or they will not respectively.
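.Pp
The difference between following
.Fa b_cont
and following
.Fa b_next
can be sketched with a small userland model.
This is not the kernel's
.Vt mblk_t :
the
.Vt model_mblk_t
type and the
.Fn model_*
functions are hypothetical and keep only the two link pointers plus the
.Fa b_rptr
and
.Fa b_wptr
read and write pointers.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for mblk_t that keeps only a few members.
 * The real structure is defined by the kernel and has more.
 */
typedef struct model_mblk {
	struct model_mblk *b_next;	/* next independent message */
	struct model_mblk *b_cont;	/* next block of the same message */
	unsigned char *b_rptr;		/* first valid byte in this block */
	unsigned char *b_wptr;		/* one past the last valid byte */
} model_mblk_t;

/* Bytes in a single block; b_cont is not followed. */
size_t
model_blksize(const model_mblk_t *mp)
{
	return ((size_t)(mp->b_wptr - mp->b_rptr));
}

/* Bytes in one whole message, following b_cont but never b_next. */
size_t
model_msgsize(const model_mblk_t *mp)
{
	size_t total = 0;

	for (; mp != NULL; mp = mp->b_cont)
		total += model_blksize(mp);
	return (total);
}

/* Number of independent messages in a chain linked by b_next. */
size_t
model_chain_count(const model_mblk_t *mp)
{
	size_t count = 0;

	for (; mp != NULL; mp = mp->b_next)
		count++;
	return (count);
}
```

In this model, a packet split into a 100 byte header block and a 1000
byte payload block reports 100 bytes per block for its first block and
1100 bytes for the message as a whole.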
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr adjmsg 9F Ta Xr allocb 9F
.It Xr copyb 9F Ta Xr copymsg 9F
.It Xr datamsg 9F Ta Xr desballoc 9F
.It Xr desballoca 9F Ta Xr dupb 9F
.It Xr dupmsg 9F Ta Xr esballoc 9F
.It Xr esballoca 9F Ta Xr freeb 9F
.It Xr freemsg 9F Ta Xr linkb 9F
.It Xr mcopymsg 9F Ta Xr msgdsize 9F
.It Xr msgpullup 9F Ta Xr msgsize 9F
.It Xr pullupmsg 9F Ta Xr rmvb 9F
.It Xr testb 9F Ta Xr unlinkb 9F
.El
.Ss Upgradable Firmware Modules
The UFM
.Pq Upgradable Firmware Module
subsystem is used to grant the system observability into firmware that
exists persistently on a device.
These functions are intended for use by drivers that are participating in
the kernel's UFM framework, which is discussed in
.Xr ddi_ufm 9E .
.Pp
The
.Xr ddi_ufm_init 9F
and
.Xr ddi_ufm_fini 9F
functions are used to indicate support of the subsystem to the kernel.
The driver is required to use the
.Xr ddi_ufm_update 9F
function to indicate both that it is ready to receive UFM requests and
to indicate that any data that the kernel may have previously received
has changed.
Once that's completed, then the other functions listed here are
generally used as part of implementing specific callback functions that
are registered.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_ufm_fini 9F Ta Xr ddi_ufm_image_set_desc 9F
.It Xr ddi_ufm_image_set_misc 9F Ta Xr ddi_ufm_image_set_nslots 9F
.It Xr ddi_ufm_init 9F Ta Xr ddi_ufm_slot_set_attrs 9F
.It Xr ddi_ufm_slot_set_imgsize 9F Ta Xr ddi_ufm_slot_set_misc 9F
.It Xr ddi_ufm_slot_set_version 9F Ta Xr ddi_ufm_update 9F
.El
.Ss Firmware Loading
Some hardware devices have firmware that is not stored as part of the
device itself and must instead be sent to the device each time it is
powered on.
These routines help drivers that need to do this read such data
from well-known locations in the operating system's file system.
To begin with, a driver should call
.Xr firmware_open 9F
to open a handle to the firmware file.
At that point, one can determine the size of the file with the
.Xr firmware_get_size 9F
function and allocate an appropriately sized memory buffer to read it in.
Callers should always check what the size of the returned file is and
should not just blindly pass that size off to the kernel memory
allocator.
For example, if a file was over 100 MiB in size, then one should not
assume that they're going to just blindly allocate 100 MiB of kernel
memory and should instead perform incremental reads and sends to a
device that are smaller in size.
.Pp
A driver can then go through and perform arbitrary reads of the firmware
file through the
.Xr firmware_read 9F
interface until it has read everything that it needs.
Once complete, the corresponding handle needs to be released through the
.Xr firmware_close 9F
function.
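.Pp
The incremental read and send pattern described above can be sketched
as a userland model.
This is an illustration under stated assumptions rather than driver
code: the
.Vt fw_model_t
type, the
.Fn fw_model_*
functions, and the 4096 byte chunk size are hypothetical, and the two
helper functions only stand in for reading from the firmware file and
for sending a chunk to a device.

```c
#include <stddef.h>
#include <string.h>

#define	FW_CHUNK	4096	/* assumed per-transfer limit, in bytes */

/* Hypothetical in-memory stand-in for an open firmware file and device. */
typedef struct fw_model {
	const unsigned char *fm_data;	/* file contents */
	size_t fm_size;			/* file size in bytes */
	size_t fm_sent;			/* bytes handed to the device */
	size_t fm_nchunks;		/* number of transfers made */
} fw_model_t;

/* Stand-in for reading len bytes at offset off; returns 0 on success. */
static int
fw_model_read(const fw_model_t *fm, size_t off, void *buf, size_t len)
{
	if (off > fm->fm_size || len > fm->fm_size - off)
		return (-1);
	memcpy(buf, fm->fm_data + off, len);
	return (0);
}

/* Stand-in for sending one chunk to the device. */
static void
fw_model_send(fw_model_t *fm, const void *buf, size_t len)
{
	(void) buf;
	fm->fm_sent += len;
	fm->fm_nchunks++;
}

/*
 * Transfer the whole image through one small bounce buffer rather than a
 * single allocation of fm_size bytes. Returns 0 on success, -1 on error.
 * The buffer lives on the stack only because this is a userland model.
 */
int
fw_model_load(fw_model_t *fm)
{
	unsigned char buf[FW_CHUNK];
	size_t off = 0;

	while (off < fm->fm_size) {
		size_t len = fm->fm_size - off;

		if (len > sizeof (buf))
			len = sizeof (buf);
		if (fw_model_read(fm, off, buf, len) != 0)
			return (-1);
		fw_model_send(fm, buf, len);
		off += len;
	}
	return (0);
}
```

The memory needed stays fixed at the chunk size no matter how large the
firmware file turns out to be.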
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr firmware_close 9F Ta Xr firmware_get_size 9F
.It Xr firmware_open 9F Ta Xr firmware_read 9F
.El
.Ss Fault Management Handling
These functions allow device drivers to harden themselves against errors
that might occur while interfacing with devices and tie into the broader
fault management architecture.
.Pp
To begin, a driver must declare which capabilities it implements during
its
.Xr attach 9E
function by calling
.Xr ddi_fm_init 9F .
The set of capabilities it receives back may be less than what was
requested because the capabilities are dependent on the overall chain of
drivers present.
.Pp
If
.Dv DDI_FM_EREPORT_CAPABLE
was negotiated, then the driver is expected to generate error events
when certain conditions occur using the
.Xr ddi_fm_ereport_post 9F
function or the more specific
.Xr pci_ereport_post 9F
function.
If a caller has negotiated
.Dv DDI_FM_ACCCHK_CAPABLE ,
then it is allowed to set up its register attributes to indicate that it
will check for errors on the register handle after using functions like
.Xr ddi_get8 9F
and
.Xr ddi_set8 9F
by calling
.Xr ddi_fm_acc_err_get 9F
and reacting accordingly.
Similarly, if a driver has negotiated
.Dv DDI_FM_DMACHK_CAPABLE ,
then it will use
.Xr ddi_check_dma_handle 9F
to check the results of DMA activity and handle the results
appropriately.
Similar to register accesses, the DMA attributes must be updated to
indicate that error handling is anticipated on this handle.
The
.Xr ddi_fm_init 9F
manual page has an overview of the other types of flags that can be
negotiated and how they are used.
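.Pp
Since the negotiated capabilities may be a subset of what was requested,
a driver should test each capability against the result before relying
on it.
The following userland model sketches that check.
It is illustrative only: the
.Dv MODEL_FM_*
names, their values, the
.Vt fm_model_t
type, and the
.Fn fm_model_*
functions are hypothetical, and masking the request against the
parent's capabilities is an assumption about how the subset arises, not
a description of the real implementation.

```c
#include <stdbool.h>

/*
 * Illustrative capability bits. The names mirror the DDI_FM_*_CAPABLE
 * flags, but these definitions and values are placeholders for this model.
 */
#define	MODEL_FM_EREPORT_CAPABLE	0x1
#define	MODEL_FM_ACCCHK_CAPABLE		0x2
#define	MODEL_FM_DMACHK_CAPABLE		0x4

/* What a driver would record after negotiation. */
typedef struct fm_model {
	int fmm_requested;
	int fmm_granted;
} fm_model_t;

/*
 * Model of negotiation: the result is limited to what the parent chain
 * supports, so it can be a subset of the request.
 */
void
fm_model_negotiate(fm_model_t *fmm, int requested, int parent_caps)
{
	fmm->fmm_requested = requested;
	fmm->fmm_granted = requested & parent_caps;
}

/* Test one capability against what was granted, not what was requested. */
bool
fm_model_has(const fm_model_t *fmm, int cap)
{
	return ((fmm->fmm_granted & cap) == cap);
}
```

A driver following this pattern would only mark its register or DMA
attributes for error checking when the matching capability was actually
granted.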
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_check_acc_handle 9F Ta Xr ddi_check_dma_handle 9F
.It Xr ddi_dev_report_fault 9F Ta Xr ddi_fm_acc_err_clear 9F
.It Xr ddi_fm_acc_err_get 9F Ta Xr ddi_fm_capable 9F
.It Xr ddi_fm_dma_err_clear 9F Ta Xr ddi_fm_dma_err_get 9F
.It Xr ddi_fm_ereport_post 9F Ta Xr ddi_fm_fini 9F
.It Xr ddi_fm_handler_register 9F Ta Xr ddi_fm_handler_unregister 9F
.It Xr ddi_fm_init 9F Ta Xr ddi_fm_service_impact 9F
.It Xr pci_ereport_post 9F Ta Xr pci_ereport_setup 9F
.It Xr pci_ereport_teardown 9F Ta
.El
.Ss SCSI and SAS Device Driver Functions
These functions are for use by SCSI and SAS device drivers that leverage
the kernel's frameworks.
Other device drivers should not use these.
For more background on these, some of the general concepts are discussed
in
.Xr iport 9 ,
.Xr phymap 9 ,
and
.Xr tgtmap 9 .
.Pp
Device drivers register initially with the kernel by using the
.Xr scsi_hba_init 9F
function and then, in their attach routine, register specific instances,
using functions like
.Xr scsi_hba_iport_register 9F
or instead
.Xr scsi_hba_tran_alloc 9F
and
.Xr scsi_hba_attach_setup 9F .
New drivers are encouraged to use the target map and iports framework to
simplify the device driver writing process.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr makecom_g0_s 9F Ta Xr makecom_g0 9F
.It Xr makecom_g1 9F Ta Xr makecom_g5 9F
.It Xr makecom 9F Ta Xr sas_phymap_create 9F
.It Xr sas_phymap_destroy 9F Ta Xr sas_phymap_lookup_ua 9F
.It Xr sas_phymap_lookup_uapriv 9F Ta Xr sas_phymap_phy_add 9F
.It Xr sas_phymap_phy_rem 9F Ta Xr sas_phymap_phy2ua 9F
.It Xr sas_phymap_phys_free 9F Ta Xr sas_phymap_phys_next 9F
.It Xr sas_phymap_ua_free 9F Ta Xr sas_phymap_ua2phys 9F
.It Xr sas_phymap_uahasphys 9F Ta Xr scsi_abort 9F
.It Xr scsi_address_device 9F Ta Xr scsi_alloc_consistent_buf 9F
.It Xr scsi_cname 9F Ta Xr scsi_destroy_pkt 9F
.It Xr scsi_device_hba_private_get 9F Ta Xr scsi_device_hba_private_set 9F
.It Xr scsi_device_unit_address 9F Ta Xr scsi_dmafree 9F
.It Xr scsi_dmaget 9F Ta Xr scsi_dname 9F
.It Xr scsi_errmsg 9F Ta Xr scsi_ext_sense_fields 9F
.It Xr scsi_find_sense_descr 9F Ta Xr scsi_free_consistent_buf 9F
.It Xr scsi_free_wwnstr 9F Ta Xr scsi_get_device_type_scsi_options 9F
.It Xr scsi_get_device_type_string 9F Ta Xr scsi_hba_attach_setup 9F
.It Xr scsi_hba_detach 9F Ta Xr scsi_hba_fini 9F
.It Xr scsi_hba_init 9F Ta Xr scsi_hba_iport_exist 9F
.It Xr scsi_hba_iport_find 9F Ta Xr scsi_hba_iport_register 9F
.It Xr scsi_hba_iport_unit_address 9F Ta Xr scsi_hba_iportmap_create 9F
.It Xr scsi_hba_iportmap_destroy 9F Ta Xr scsi_hba_iportmap_iport_add 9F
.It Xr scsi_hba_iportmap_iport_remove 9F Ta Xr scsi_hba_lookup_capstr 9F
.It Xr scsi_hba_pkt_alloc 9F Ta Xr scsi_hba_pkt_comp 9F
.It Xr scsi_hba_pkt_free 9F Ta Xr scsi_hba_probe 9F
.It Xr scsi_hba_tgtmap_create 9F Ta Xr scsi_hba_tgtmap_destroy 9F
.It Xr scsi_hba_tgtmap_scan_luns 9F Ta Xr scsi_hba_tgtmap_set_add 9F
.It Xr scsi_hba_tgtmap_set_begin 9F Ta Xr scsi_hba_tgtmap_set_end 9F
.It Xr scsi_hba_tgtmap_set_flush 9F Ta Xr scsi_hba_tgtmap_tgt_add 9F
.It Xr scsi_hba_tgtmap_tgt_remove 9F Ta Xr scsi_hba_tran_alloc 9F
.It Xr scsi_hba_tran_free 9F Ta Xr scsi_ifgetcap 9F
.It Xr scsi_ifsetcap 9F Ta Xr scsi_init_pkt 9F
.It Xr scsi_log 9F Ta Xr scsi_mname 9F
.It Xr scsi_pktalloc 9F Ta Xr scsi_pktfree 9F
.It Xr scsi_poll 9F Ta Xr scsi_probe 9F
.It Xr scsi_resalloc 9F Ta Xr scsi_reset_notify 9F
.It Xr scsi_reset 9F Ta Xr scsi_resfree 9F
.It Xr scsi_rname 9F Ta Xr scsi_sense_asc 9F
.It Xr scsi_sense_ascq 9F Ta Xr scsi_sense_cmdspecific_uint64 9F
.It Xr scsi_sense_info_uint64 9F Ta Xr scsi_sense_key 9F
.It Xr scsi_setup_cdb 9F Ta Xr scsi_slave 9F
.It Xr scsi_sname 9F Ta Xr scsi_sync_pkt 9F
.It Xr scsi_transport 9F Ta Xr scsi_unprobe 9F
.It Xr scsi_unslave 9F Ta Xr scsi_validate_sense 9F
.It Xr scsi_vu_errmsg 9F Ta Xr scsi_wwn_to_wwnstr 9F
.It Xr scsi_wwnstr_to_wwn 9F Ta
.El
.Ss Block Device Buffer Handling
Block devices operate with a data structure called the
.Vt struct buf
which is described in
.Xr buf 9S .
This structure is used to represent a given block request and is used
heavily in block devices, the SCSI/SAS framework, and the blkdev
framework.
The functions described here are used to manipulate these structures in
various ways such as copying them around, indicating error conditions,
or indicating when the I/O operation is done.
By default, this memory is not mapped into the kernel's address space so
several functions such as
.Xr bp_mapin 9F
are present to allow for that to happen when required.
.Pp
To initially obtain a
.Vt struct buf ,
drivers should begin by calling
.Xr getrbuf 9F ,
at which point the caller can fill in the structure.
Once that's done, the
.Xr physio 9F
function can be used to actually perform the I/O and wait until it's
complete.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bioclone 9F Ta Xr biodone 9F
.It Xr bioerror 9F Ta Xr biofini 9F
.It Xr bioinit 9F Ta Xr biomodified 9F
.It Xr bioreset 9F Ta Xr biosize 9F
.It Xr biowait 9F Ta Xr bp_mapin 9F
.It Xr bp_mapout 9F Ta Xr clrbuf 9F
.It Xr disksort 9F Ta Xr freerbuf 9F
.It Xr geterror 9F Ta Xr getrbuf 9F
.It Xr minphys 9F Ta Xr physio 9F
.El
.Ss Networking Device Driver Functions
These functions are for networking device drivers that implement the MAC
(GLDv3) interfaces.
The full framework and how to use it is described in
.Xr mac 9E .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr mac_alloc 9F Ta Xr mac_fini_ops 9F
.It Xr mac_free 9F Ta Xr mac_hcksum_get 9F
.It Xr mac_hcksum_set 9F Ta Xr mac_init_ops 9F
.It Xr mac_link_update 9F Ta Xr mac_lso_get 9F
.It Xr mac_maxsdu_update 9F Ta Xr mac_prop_info_set_default_fec 9F
.It Xr mac_prop_info_set_default_link_flowctrl 9F Ta Xr mac_prop_info_set_default_str 9F
.It Xr mac_prop_info_set_default_uint32 9F Ta Xr mac_prop_info_set_default_uint64 9F
.It Xr mac_prop_info_set_default_uint8 9F Ta Xr mac_prop_info_set_perm 9F
.It Xr mac_prop_info_set_range_uint32 9F Ta Xr mac_prop_info 9F
.It Xr mac_register 9F Ta Xr mac_rx_ring 9F
.It Xr mac_rx 9F Ta Xr mac_transceiver_info_set_present 9F
.It Xr mac_transceiver_info_set_usable 9F Ta Xr mac_transceiver_info 9F
.It Xr mac_tx_ring_update 9F Ta Xr mac_tx_update 9F
.It Xr mac_unregister 9F Ta
.El
.Ss USB Device Driver Functions
These functions are designed for USB device drivers.
To first initialize with the kernel, a device driver must call
.Xr usb_client_attach 9F
and then
.Xr usb_get_dev_data 9F .
The latter call is required to get access to the USB-level
descriptors about the device which describe what kinds of USB endpoints
.Pq control, bulk, interrupt, or isochronous
exist on the device as well as how many different interfaces and
configurations are present.
.Pp
Once a given configuration, sometimes the default, is selected, then the
driver can proceed to opening up what the USB architecture calls a pipe,
which provides a way to send requests to a specific USB endpoint.
First, specific endpoints can be looked up using the
.Xr usb_lookup_ep_data 9F
function which gets information from the parsed descriptors and then
that gets filled into an extended descriptor with
.Xr usb_ep_xdescr_fill 9F .
With that in hand, a pipe can be opened with
.Xr usb_pipe_xopen 9F .
.Pp
Once a pipe has been opened, which most often happens in a driver's
.Xr attach 9E
entry point, then requests can be allocated and submitted.
There is a different allocation for each type of request
.Po
e.g.
.Xr usb_alloc_bulk_req 9F
.Pc
and a different submission function for each type as well.
Each request structure has a corresponding page in section 9S that
describes the structure, its members, and how to work with it.
.Pp
One other major concern for USB devices, which isn't as common with
other types of devices, is that they can be yanked out and reinserted
at any time.
To help determine when this happens, the kernel offers the
.Xr usb_register_event_cbs 9F
function which allows a driver to register for callbacks when a device
is disconnected, reconnected, or around checkpoint suspend/resume
behavior.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr usb_alloc_bulk_req 9F Ta Xr usb_alloc_ctrl_req 9F
.It Xr usb_alloc_intr_req 9F Ta Xr usb_alloc_isoc_req 9F
.It Xr usb_alloc_request 9F Ta Xr usb_client_attach 9F
.It Xr usb_client_detach 9F Ta Xr usb_clr_feature 9F
.It Xr usb_create_pm_components 9F Ta Xr usb_ep_xdescr_fill 9F
.It Xr usb_free_bulk_req 9F Ta Xr usb_free_ctrl_req 9F
.It Xr usb_free_descr_tree 9F Ta Xr usb_free_dev_data 9F
.It Xr usb_free_intr_req 9F Ta Xr usb_free_isoc_req 9F
.It Xr usb_get_addr 9F Ta Xr usb_get_alt_if 9F
.It Xr usb_get_cfg 9F Ta Xr usb_get_current_frame_number 9F
.It Xr usb_get_dev_data 9F Ta Xr usb_get_if_number 9F
.It Xr usb_get_max_pkts_per_isoc_request 9F Ta Xr usb_get_status 9F
.It Xr usb_get_string_descr 9F Ta Xr usb_handle_remote_wakeup 9F
.It Xr usb_lookup_ep_data 9F Ta Xr usb_owns_device 9F
.It Xr usb_parse_data 9F Ta Xr usb_pipe_bulk_xfer 9F
.It Xr usb_pipe_close 9F Ta Xr usb_pipe_ctrl_xfer_wait 9F
.It Xr usb_pipe_ctrl_xfer 9F Ta Xr usb_pipe_drain_reqs 9F
.It Xr usb_pipe_get_max_bulk_transfer_size 9F Ta Xr usb_pipe_get_private 9F
.It Xr usb_pipe_get_state 9F Ta Xr usb_pipe_intr_xfer 9F
.It Xr usb_pipe_isoc_xfer 9F Ta Xr usb_pipe_open 9F
.It Xr usb_pipe_reset 9F Ta Xr usb_pipe_set_private 9F
.It Xr usb_pipe_stop_intr_polling 9F Ta Xr usb_pipe_stop_isoc_polling 9F
.It Xr usb_pipe_xopen 9F Ta Xr usb_print_descr_tree 9F
.It Xr usb_register_hotplug_cbs 9F Ta Xr usb_reset_device 9F
.It Xr usb_set_alt_if 9F Ta Xr usb_set_cfg 9F
.It Xr usb_unregister_hotplug_cbs 9F Ta
.El
.Ss PCI Device Driver Functions
These functions are specific to PCI and PCI Express based device
drivers and are intended to be used to get access to PCI configuration
space.
For normal PCI base address registers
.Pq BARs ,
see
.Sx Register Setup and Access
instead.
.Pp
To access PCI configuration space, a device driver should first call
.Xr pci_config_setup 9F .
Generally, drivers will call this in their
.Xr attach 9E
entry point and then tear down the configuration space access with the
.Xr pci_config_teardown 9F
function in
.Xr detach 9E .
After setting up access to configuration space, the returned handle can
be used in all of the various configuration space routines to get and
set specific sized values in configuration space.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr pci_config_get8 9F Ta Xr pci_config_get16 9F
.It Xr pci_config_get32 9F Ta Xr pci_config_get64 9F
.It Xr pci_config_put8 9F Ta Xr pci_config_put16 9F
.It Xr pci_config_put32 9F Ta Xr pci_config_put64 9F
.It Xr pci_config_setup 9F Ta Xr pci_config_teardown 9F
.It Xr pci_report_pmcap 9F Ta Xr pci_restore_config_regs 9F
.It Xr pci_save_config_regs 9F Ta
.El
.Ss USB Host Controller Interface Functions
These routines are used for device drivers which implement the USB
host controller interfaces described in
.Xr usba_hcdi 9E .
Other types of device drivers and modules should not call these
functions.
In particular, if one is writing a device driver for a USB device, these
are not the routines you're looking for and you want to see
.Sx USB Device Driver Functions .
These are what the
.Xr ehci 4D
or
.Xr xhci 4D
drivers use to provide services that USB drivers use via the kernel USB
architecture.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr usba_alloc_hcdi_ops 9F Ta Xr usba_free_hcdi_ops 9F
.It Xr usba_hcdi_cb 9F Ta Xr usba_hcdi_dup_intr_req 9F
.It Xr usba_hcdi_dup_isoc_req 9F Ta Xr usba_hcdi_get_device_private 9F
.It Xr usba_hcdi_register 9F Ta Xr usba_hcdi_unregister 9F
.It Xr usba_hubdi_bind_root_hub 9F Ta Xr usba_hubdi_cb_ops 9F
.It Xr usba_hubdi_close 9F Ta Xr usba_hubdi_dev_ops 9F
.It Xr usba_hubdi_ioctl 9F Ta Xr usba_hubdi_open 9F
.It Xr usba_hubdi_root_hub_power 9F Ta Xr usba_hubdi_unbind_root_hub 9F
.El
.Ss Functions for PCMCIA Drivers
These functions exist for older PCMCIA device drivers.
These should not otherwise be used by the system.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr csx_AccessConfigurationRegister 9F Ta Xr csx_ConvertSize 9F
.It Xr csx_ConvertSpeed 9F Ta Xr csx_CS_DDI_Info 9F
.It Xr csx_DeregisterClient 9F Ta Xr csx_DupHandle 9F
.It Xr csx_Error2Text 9F Ta Xr csx_Event2Text 9F
.It Xr csx_FreeHandle 9F Ta Xr csx_Get16 9F
.It Xr csx_Get32 9F Ta Xr csx_Get64 9F
.It Xr csx_Get8 9F Ta Xr csx_GetEventMask 9F
.It Xr csx_GetFirstClient 9F Ta Xr csx_GetFirstTuple 9F
.It Xr csx_GetHandleOffset 9F Ta Xr csx_GetMappedAddr 9F
.It Xr csx_GetNextClient 9F Ta Xr csx_GetNextTuple 9F
.It Xr csx_GetStatus 9F Ta Xr csx_GetTupleData 9F
.It Xr csx_MakeDeviceNode 9F Ta Xr csx_MapLogSocket 9F
.It Xr csx_MapMemPage 9F Ta Xr csx_ModifyConfiguration 9F
.It Xr csx_ModifyWindow 9F Ta Xr csx_Parse_CISTPL_BATTERY 9F
.It Xr csx_Parse_CISTPL_BYTEORDER 9F Ta Xr csx_Parse_CISTPL_CFTABLE_ENTRY 9F
.It Xr csx_Parse_CISTPL_CONFIG 9F Ta Xr csx_Parse_CISTPL_DATE 9F
.It Xr csx_Parse_CISTPL_DEVICE_A 9F Ta Xr csx_Parse_CISTPL_DEVICE_OA 9F
.It Xr csx_Parse_CISTPL_DEVICE_OC 9F Ta Xr csx_Parse_CISTPL_DEVICE 9F
.It Xr csx_Parse_CISTPL_DEVICEGEO_A 9F Ta Xr csx_Parse_CISTPL_DEVICEGEO 9F
.It Xr csx_Parse_CISTPL_FORMAT 9F Ta Xr csx_Parse_CISTPL_FUNCE 9F
.It Xr csx_Parse_CISTPL_FUNCID 9F Ta Xr csx_Parse_CISTPL_GEOMETRY 9F
.It Xr csx_Parse_CISTPL_JEDEC_A 9F Ta Xr csx_Parse_CISTPL_JEDEC_C 9F
.It Xr csx_Parse_CISTPL_LINKTARGET 9F Ta Xr csx_Parse_CISTPL_LONGLINK_A 9F
.It Xr csx_Parse_CISTPL_LONGLINK_C 9F Ta Xr csx_Parse_CISTPL_LONGLINK_MFC 9F
.It Xr csx_Parse_CISTPL_MANFID 9F Ta Xr csx_Parse_CISTPL_ORG 9F
.It Xr csx_Parse_CISTPL_SPCL 9F Ta Xr csx_Parse_CISTPL_SWIL 9F
.It Xr csx_Parse_CISTPL_VERS_1 9F Ta Xr csx_Parse_CISTPL_VERS_2 9F
.It Xr csx_ParseTuple 9F Ta Xr csx_Put16 9F
.It Xr csx_Put32 9F Ta Xr csx_Put64 9F
.It Xr csx_Put8 9F Ta Xr csx_RegisterClient 9F
.It Xr csx_ReleaseConfiguration 9F Ta Xr csx_ReleaseIO 9F
.It Xr csx_ReleaseIRQ 9F Ta Xr csx_ReleaseSocketMask 9F
.It Xr csx_ReleaseWindow 9F Ta Xr csx_RemoveDeviceNode 9F
.It Xr csx_RepGet16 9F Ta Xr csx_RepGet32 9F
.It Xr csx_RepGet64 9F Ta Xr csx_RepGet8 9F
.It Xr csx_RepPut16 9F Ta Xr csx_RepPut32 9F
.It Xr csx_RepPut64 9F Ta Xr csx_RepPut8 9F
.It Xr csx_RequestConfiguration 9F Ta Xr csx_RequestIO 9F
.It Xr csx_RequestIRQ 9F Ta Xr csx_RequestSocketMask 9F
.It Xr csx_RequestWindow 9F Ta Xr csx_ResetFunction 9F
.It Xr csx_SetEventMask 9F Ta Xr csx_SetHandleOffset 9F
.It Xr csx_ValidateCIS 9F Ta
.El
.Ss STREAMS related functions
These functions are meant to be used when interacting with STREAMS
devices or when implementing one.
When a STREAMS driver is opened, it receives messages on a queue which
are then processed and can be sent back.
As different queues are often linked together, the most common thing is
to process a message and then pass the message onto the next queue using
the
.Xr putnext 9F
function.
.Pp
STREAMS messages are passed around using message blocks, which use the
.Vt mblk_t
type.
See
.Sx Message Block Functions
for more about the data structure and the functions that manipulate
message blocks.
.Pp
These functions should generally not be used when implementing a
networking device driver today.
See
.Xr mac 9E
instead.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr backq 9F Ta Xr bcanput 9F
.It Xr bcanputnext 9F Ta Xr canput 9F
.It Xr canputnext 9F Ta Xr enableok 9F
.It Xr flushband 9F Ta Xr flushq 9F
.It Xr freezestr 9F Ta Xr getq 9F
.It Xr insq 9F Ta Xr merror 9F
.It Xr mexchange 9F Ta Xr noenable 9F
.It Xr put 9F Ta Xr putbq 9F
.It Xr putctl 9F Ta Xr putctl1 9F
.It Xr putnext 9F Ta Xr putnextctl 9F
.It Xr putnextctl1 9F Ta Xr putq 9F
.It Xr mt-streams 9F Ta Xr qassociate 9F
.It Xr qenable 9F Ta Xr qprocsoff 9F
.It Xr qprocson 9F Ta Xr qreply 9F
.It Xr qsize 9F Ta Xr qwait_sig 9F
.It Xr qwait 9F Ta Xr qwriter 9F
.It Xr OTHERQ 9F Ta Xr RD 9F
.It Xr rmvq 9F Ta Xr SAMESTR 9F
.It Xr unfreezestr 9F Ta Xr WR 9F
.El
.Ss STREAMS ioctls
The following functions are used when a STREAMS-based device driver is
processing its
.Xr ioctl 9E
entry point.
Unlike with character and block devices, STREAMS ioctls are passed around
in message blocks, and copying data in and out of userland works
differently because STREAMS ioctls are generally processed in
.Sy kernel
context.
This means that the normal functions like
.Xr ddi_copyin 9F
and
.Xr ddi_copyout 9F
cannot be used.
Instead, when a message block has a type of
.Dv M_IOCTL ,
then these routines can often be used to convert the structure into one
that asks for data to be copied in, copied out, or to finally
acknowledge the ioctl as successful or to terminate the processing in
error.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr mcopyin 9F Ta Xr mcopyout 9F
.It Xr mioc2ack 9F Ta Xr miocack 9F
.It Xr miocnak 9F Ta Xr miocpullup 9F
.It Xr mkiocb 9F Ta
.El
.Ss chpoll(9E) Related Functions
These functions are present in service of the
.Xr chpoll 9E
interface which is used to support the traditional
.Xr poll 2
and
.Xr select 3C
interfaces as well as event ports through the
.Xr port_get 3C
interface.
See
.Xr chpoll 9E
for the specific cases in which these should be called.
If a device driver does not implement the
.Xr chpoll 9E
character device entry point, then these functions should not be used.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr pollhead_clean 9F Ta Xr pollwakeup 9F
.El
.Ss Kernel Statistics
The kernel statistics or kstat framework provides an easy way of
exporting statistical information to be consumed outside of the kernel.
Users can interface with this data via
.Xr kstat 8
and the corresponding kstat library discussed in
.Xr kstat 3KSTAT .
.Pp
Kernel statistics are grouped using a tuple of four identifiers,
separated by colons when using
.Xr kstat 8 .
These are, in order, the statistic module name, instance, a name
which covers a group of statistics, and an individual name for a
statistic.
In addition, kernel statistics have a class which is used to group
similarly named groups of statistics together across devices.
When using
.Xr kstat_create 9F ,
drivers specify the first three parts of the tuple and the class.
The naming of individual statistics, the last part of the tuple, varies
based upon the type of the statistic.
For the most part, drivers will use the kstat type
.Dv KSTAT_TYPE_NAMED ,
which allows multiple name-value pairs to exist within the statistic.
For example, the kernel's layer 2 networking framework,
.Xr mac 9E ,
creates a kstat with the driver's name and instance and names it
.Dq mac .
Within this named group, there are statistics for all of the different
individual stats that the kernel and devices track such as bytes
transmitted and received, the state and speed of the link, and
advertised and enabled capabilities.
.Pp
A device driver can initialize a kstat with the
.Xr kstat_create 9F
function.
It will not be made accessible to users until the
.Xr kstat_install 9F
function is called.
The device driver must perform additional initialization of the kstat
before proceeding and calling
.Xr kstat_install 9F .
The kstat structure that drivers see is discussed in
.Xr kstat 9S .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kstat_create 9F Ta Xr kstat_delete 9F
.It Xr kstat_install 9F Ta Xr kstat_named_init 9F
.It Xr kstat_named_setstr 9F Ta Xr kstat_queue 9F
.It Xr kstat_runq_back_to_waitq 9F Ta Xr kstat_runq_enter 9F
.It Xr kstat_runq_exit 9F Ta Xr kstat_waitq_enter 9F
.It Xr kstat_waitq_exit 9F Ta Xr kstat_waitq_to_runq 9F
.El
.Ss NDI Events
These functions are used to allow a device driver to register for
certain events that might occur to its device or a parent in the tree
and receive a callback function when they occur.
A good example of this is when a device has been removed from the system,
such as someone just pulling out a USB device or NVMe U.2 device.
The event handlers work by first getting a cookie that names the type of
event with
.Xr ddi_get_eventcookie 9F
and then registering the callback with
.Xr ddi_add_event_handler 9F .
.Pp
The
.Xr ddi_cb_register 9F
function is used to register a callback for other classes of events,
such as those delivered when participating in dynamic interrupt sharing.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_add_event_handler 9F Ta Xr ddi_cb_register 9F
.It Xr ddi_cb_unregister 9F Ta Xr ddi_get_eventcookie 9F
.It Xr ddi_remove_event_handler 9F Ta
.El
.Ss Layered Device Interfaces
The LDI
.Pq Layered Device Interface
provides a mechanism for a driver to open up another device in the
kernel and begin calling basic operations on the device as though the
calling driver were a normal user process.
Through the LDI, drivers can perform equivalents to the basic file
.Xr read 2
and
.Xr write 2
calls, look up properties on the device, perform networking style calls
such as
.Xr getmsg 2
and
.Xr putmsg 2 ,
and register callbacks to be called when something happens to the
underlying device.
For example, the ZFS file system uses the LDI to open and operate on
block devices.
.Pp
Before opening a device itself, callers must obtain a notion of their
identity which is used when making subsequent calls.
The simplest form is often to use the device's
.Vt dev_info_t
and call
.Xr ldi_ident_from_dip 9F ;
however, there are also methods available based upon having a
.Vt dev_t
or a STREAMS
.Vt struct queue .
.Pp
Once that identity is established, there are several ways to open a
device such as
.Xr ldi_open_by_dev 9F ,
.Xr ldi_open_by_devid 9F ,
or
.Xr ldi_open_by_name 9F .
Once an LDI device has been opened, then all of the other functions may
be used to operate on the device; however, consumers of the LDI must
think carefully about what kind of device they are opening.
While a kernel pseudo-device driver cannot disappear while it is open,
when the device represents an actual piece of hardware, it is possible
for it to be physically removed and no longer be accessible.
Consumers should not assume that a layered device will always be
present.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ldi_add_event_handler 9F Ta Xr ldi_aread 9F
.It Xr ldi_awrite 9F Ta Xr ldi_close 9F
.It Xr ldi_devmap 9F Ta Xr ldi_dump 9F
.It Xr ldi_ev_finalize 9F Ta Xr ldi_ev_get_cookie 9F
.It Xr ldi_ev_get_type 9F Ta Xr ldi_ev_notify 9F
.It Xr ldi_ev_register_callbacks 9F Ta Xr ldi_ev_remove_callbacks 9F
.It Xr ldi_get_dev 9F Ta Xr ldi_get_devid 9F
.It Xr ldi_get_eventcookie 9F Ta Xr ldi_get_minor_name 9F
.It Xr ldi_get_otyp 9F Ta Xr ldi_get_size 9F
.It Xr ldi_getmsg 9F Ta Xr ldi_ident_from_dev 9F
.It Xr ldi_ident_from_dip 9F Ta Xr ldi_ident_from_stream 9F
.It Xr ldi_ident_release 9F Ta Xr ldi_ioctl 9F
.It Xr ldi_open_by_dev 9F Ta Xr ldi_open_by_devid 9F
.It Xr ldi_open_by_name 9F Ta Xr ldi_poll 9F
.It Xr ldi_prop_exists 9F Ta Xr ldi_prop_get_int 9F
.It Xr ldi_prop_get_int64 9F Ta Xr ldi_prop_lookup_byte_array 9F
.It Xr ldi_prop_lookup_int_array 9F Ta Xr ldi_prop_lookup_int64_array 9F
.It Xr ldi_prop_lookup_string_array 9F Ta Xr ldi_prop_lookup_string 9F
.It Xr ldi_putmsg 9F Ta Xr ldi_read 9F
.It Xr ldi_remove_event_handler 9F Ta Xr ldi_strategy 9F
.It Xr ldi_write 9F Ta
.El
.Ss Signal Manipulation
These utility functions all relate to understanding whether or not a
process can receive a signal and actually delivering one to a process
from a driver.
This interface is specific to device drivers and should not be used by
the broader kernel.
These interfaces are not recommended and should only be used after
consultation.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_can_receive_sig 9F Ta Xr proc_ref 9F
.It Xr proc_signal 9F Ta Xr proc_unref 9F
.El
.Ss Getting at Surrounding Context
These functions allow a driver to better understand its current context.
For example, some drivers have to deal with providing polled I/O or take
special care as part of creating a kernel crash dump.
These cases may need to call the
.Xr ddi_in_panic 9F
function.
The other functions generally provide a way to get at information such as
the process ID or other information from the system; however, this
generally should not be needed or used.
Almost all values exposed by, say,
.Xr drv_getparm 9F
have more usable first-class methods of getting at the data.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_get_kt_did 9F Ta Xr ddi_get_pid 9F
.It Xr ddi_in_panic 9F Ta Xr drv_getparm 9F
.El
.Ss Driver Memory Mapping
These functions are present for device drivers that implement the
.Xr devmap 9E
or
.Xr segmap 9E
entry points.
The
.Xr ddi_umem_alloc 9F
routines are used to allocate and lock memory that can later be passed
to userland through the mapping entry points.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_devmap_segmap 9F Ta Xr ddi_mmap_get_model 9F
.It Xr ddi_segmap_setup 9F Ta Xr ddi_segmap 9F
.It Xr ddi_umem_alloc 9F Ta Xr ddi_umem_free 9F
.It Xr ddi_umem_iosetup 9F Ta Xr ddi_umem_lock 9F
.It Xr ddi_umem_unlock 9F Ta Xr ddi_unmap_regs 9F
.It Xr devmap_default_access 9F Ta Xr devmap_devmem_setup 9F
.It Xr devmap_do_ctxmgt 9F Ta Xr devmap_load 9F
.It Xr devmap_set_ctx_timeout 9F Ta Xr devmap_setup 9F
.It Xr devmap_umem_setup 9F Ta Xr devmap_unload 9F
.El
.Ss UTF-8, UTF-16, UTF-32, and Code Set Utilities
These routines provide the ability to work with text in
different encodings and code sets.
Generally the kernel does not assume much about the type of the text
that it is operating on, though some subsystems will require that the
names of things be ASCII only.
.Pp
The other primary locales that the system supports are generally UTF-8
based and so the kernel provides a set of routines to deal with UTF-8
and Unicode normalization.
However, there are still cases where different character encodings are
required or conversion between UTF-8 and some other type is required.
This is provided by the kernel iconv framework, which provides a
subset of the traditional userland iconv conversions.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kiconv_close 9F Ta Xr kiconv_open 9F
.It Xr kiconv 9F Ta Xr kiconvstr 9F
.It Xr u8_strcmp 9F Ta Xr u8_textprep_str 9F
.It Xr u8_validate 9F Ta Xr uconv_u16tou32 9F
.It Xr uconv_u16tou8 9F Ta Xr uconv_u32tou16 9F
.It Xr uconv_u32tou8 9F Ta Xr uconv_u8tou16 9F
.It Xr uconv_u8tou32 9F Ta
.El
.Ss Raw I/O Port Access
This group of functions provides raw access to I/O ports on architectures
that support them.
These functions do not allow any coordination with other callers nor is
the validity of the port assured in any way.
In general, device drivers should use the normal register access
routines to access I/O ports.
See
.Sx Device Register Setup and Access
for more information on the preferred way to set up and access registers.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr inb 9F Ta Xr inw 9F
.It Xr inl 9F Ta Xr outb 9F
.It Xr outw 9F Ta Xr outl 9F
.El
.Ss Power Management
These functions are used to raise and lower the internal power levels of
a device driver or to indicate to the kernel that the device is busy and
therefore cannot have its power changed.
See
.Xr power 9E
for additional information.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_removing_power 9F Ta Xr pm_busy_component 9F
.It Xr pm_idle_component 9F Ta Xr pm_lower_power 9F
.It Xr pm_power_has_changed 9F Ta Xr pm_raise_power 9F
.It Xr pm_trans_check 9F Ta
.El
.Ss Network Packet Hooks
These functions are intended to be used by device drivers that wish to
inspect and potentially modify packets along their path through the
networking stack.
The most common use case is for implementing something like a network
firewall.
Otherwise, if looking to add support for a new protocol or other network
processing feature, one is better off more directly integrating with the
networking stack.
.Pp
To get started, drivers generally will need to first use
.Xr net_protocol_lookup 9F
to get a handle to say that they're interested in looking at IPv4 or
IPv6 traffic and then can allocate an actual hook object with
.Xr hook_alloc 9F .
After filling out the hook, the hook can be inserted into the actual
system with
.Xr net_hook_register 9F .
.Pp
Hooks operate in the context of a networking stack.
Every networking stack in the system is independent and therefore has
its own set of interfaces, routing tables, settings, and related state.
Most zones have their own networking stack.
This is the exclusive-IP option that is described in
.Xr zoneadm 8 .
.Pp
Drivers can register to get a callback for every netstack in the system
and be notified when they are created and destroyed.
This is done by calling the
.Xr net_instance_alloc 9F
function, filling out its data structure, and then finally calling
.Xr net_instance_register 9F .
Like other callback interfaces, the moment the callback functions are
registered, drivers need to expect that they're going to be called.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr hook_alloc 9F Ta Xr hook_free 9F
.It Xr net_event_notify_register 9F Ta Xr net_event_notify_unregister 9F
.It Xr net_getifname 9F Ta Xr net_getlifaddr 9F
.It Xr net_getmtu 9F Ta Xr net_getnetid 9F
.It Xr net_getpmtuenabled 9F Ta Xr net_hook_register 9F
.It Xr net_hook_unregister 9F Ta Xr net_inject_alloc 9F
.It Xr net_inject_free 9F Ta Xr net_inject 9F
.It Xr net_instance_alloc 9F Ta Xr net_instance_free 9F
.It Xr net_instance_notify_register 9F Ta Xr net_instance_notify_unregister 9F
.It Xr net_instance_protocol_unregister 9F Ta Xr net_instance_register 9F
.It Xr net_instance_unregister 9F Ta Xr net_ispartialchecksum 9F
.It Xr net_isvalidchecksum 9F Ta Xr net_kstat_create 9F
.It Xr net_kstat_delete 9F Ta Xr net_lifgetnext 9F
.It Xr net_netidtozonid 9F Ta Xr net_phygetnext 9F
.It Xr net_phylookup 9F Ta Xr net_protocol_lookup 9F
.It Xr net_protocol_notify_register 9F Ta Xr net_protocol_release 9F
.It Xr net_protocol_walk 9F Ta Xr net_routeto 9F
.It Xr net_zoneidtonetid 9F Ta Xr netinfo 9F
.El
.Sh SEE ALSO
.Xr Intro 2 ,
.Xr Intro 9 ,
.Xr Intro 9E ,
.Xr Intro 9S
.Rs
.%T illumos Developer's Guide
.%U https://www.illumos.org/books/dev/
.Re
.Rs
.%T Writing Device Drivers
.%U https://www.illumos.org/books/wdd/
.Re