.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source.  A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2023 Oxide Computer Company
.\"
.Dd January 26, 2023
.Dt INTRO 9F
.Os
.Sh NAME
.Nm Intro
.Nd Introduction to kernel and device driver functions
.Sh SYNOPSIS
.In sys/ddi.h
.In sys/sunddi.h
.Sh DESCRIPTION
Section 9F of the manual describes functions that are used by device
drivers, kernel modules, and the implementation of the kernel itself.
This page first provides an overview of the use of kernel functions and of the
portions of the manual that are specific to the kernel.
After that, most of the available functions are grouped together by use,
with some brief commentary and introduction.
.Pp
Most manual pages are similar to those in other sections.
They have common fields such as the NAME, a SYNOPSIS to show which header files
to include and the function prototypes, an extended DESCRIPTION discussing the
function's use, and the common combination of RETURN VALUES and ERRORS.
Some manuals will have examples and additional manuals to reference in the SEE
ALSO section.
.Ss RETURN VALUES and ERRORS
One major difference when programming in the kernel versus userland is that
there is no equivalent to
.Va errno .
Instead, there are a few common patterns that are used throughout the kernel
that we'll discuss.
While there are common patterns, please be aware that due to the natural
evolution of the system, you will need to read the specifics of these sections
in each function's manual page.
.Bl -bullet
.It
Many functions will return a specific DDI
.Pq Device Driver Interface
value, which is commonly one of
.Dv DDI_SUCCESS
or
.Dv DDI_FAILURE ,
indicating success and failure respectively.
Some functions will return additional error codes to indicate why something
failed.
In general, when checking a return code, it is always preferred to compare
whether it equals or does not equal
.Dv DDI_SUCCESS ,
as there can be many different error cases and additional ones can be added over
time.
.It
Many routines explicitly return
.Sy 0
on success and an explicit error number on failure.
.Xr Intro 2
has a list of error numbers.
.It
There are classes of functions that return either a pointer or a boolean type,
either the C99
.Vt bool
or the system's traditional type
.Vt boolean_t .
In these cases, sometimes a more detailed error is provided via an additional
argument such as an
.Vt "int *" .
Absent such an argument, there is generally no more detailed information
available.
.El
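.Pp
The three conventions above can be sketched in ordinary C.
This is an illustration only, written so that it compiles outside the kernel:
the values of
.Dv DDI_SUCCESS
and
.Dv DDI_FAILURE
are defined locally as assumptions rather than taken from the kernel headers,
and the
.Dv XX_DDI_EBUSY
code and every function whose name begins with xx_ are hypothetical names
invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins so the sketch compiles outside the kernel (assumed). */
#define	DDI_SUCCESS	0
#define	DDI_FAILURE	(-1)
#define	XX_DDI_EBUSY	(-2)	/* hypothetical additional failure code */

/* Pattern 1: a DDI-style return value with more than one failure code. */
int
xx_ddi_op(int state)
{
	if (state == 0)
		return (DDI_SUCCESS);
	return (state == 1 ? DDI_FAILURE : XX_DDI_EBUSY);
}

/* Pattern 2: 0 on success, otherwise an error number as in Intro(2). */
int
xx_errno_op(const char *name)
{
	return (name == NULL ? EINVAL : 0);
}

/* Pattern 3: a boolean result, with optional detail through an int pointer. */
bool
xx_lookup(int key, int *errp)
{
	if (key < 0) {
		if (errp != NULL)
			*errp = ERANGE;
		return (false);
	}
	return (true);
}

/*
 * Compare against DDI_SUCCESS rather than enumerating failures, so that
 * failure codes added later are still treated as failures.
 */
bool
xx_ddi_ok(int state)
{
	return (xx_ddi_op(state) == DDI_SUCCESS);
}

/* Usage: each call site checks the convention that its callee follows. */
void
xx_example(void)
{
	int err = 0;

	assert(xx_ddi_ok(0));
	assert(!xx_ddi_ok(2));
	assert(xx_errno_op(NULL) == EINVAL);
	assert(!xx_lookup(-1, &err) && err == ERANGE);
}
```

The point of xx_ddi_ok() is that a failure code the caller has never heard of
is still handled as a failure.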
.Ss CONTEXT
The CONTEXT section of a manual page describes the circumstances in which a
function may be called.
In general, there are three different contexts that come up, in addition to
constraints that are specific to a framework:
.Bl -tag -width Ds
.It Sy User
User context implies that the thread of execution is operating because a user
thread has entered the kernel for an operation.
When an application issues a system call such as
.Xr open 2 ,
.Xr read 2 ,
.Xr write 2 ,
or
.Xr ioctl 2
then we are said to be in user context.
When in user context, one can copy in or out data from a user's address space.
When writing a character or block device driver, the majority of the time that a
character device operation such as the corresponding
.Xr open 9E ,
.Xr read 9E ,
.Xr write 9E ,
or
.Xr ioctl 9E
entry point is called, it is executing in user context.
It is possible to call those entry points through the kernel's layered device
interface, so, strictly speaking, drivers cannot assume those entry points will
always have a user process present.
.It Sy Interrupt
Interrupt context refers to when the operating system is handling an interrupt
.Po
see
.Sx Interrupt Related Functions
.Pc
and executing a registered interrupt handler.
Interrupt context is split into two different sets: high-level and low-level
interrupts.
Most device drivers will always be executing in low-level interrupt context.
To determine whether an interrupt is considered high level or not, you should
pass the interrupt handle to the
.Xr ddi_intr_get_pri 9F
function and compare the resulting priority with
.Xr ddi_intr_get_hilevel_pri 9F .
.Pp
While executing a high-level interrupt, the thread may only call a limited
number of functions.
In particular, it may call
.Xr ddi_intr_trigger_softint 9F ,
.Xr mutex_enter 9F ,
and
.Xr mutex_exit 9F .
It is critical that the mutex being used be properly initialized with the
driver's interrupt priority.
The system will transparently pick the correct implementation of a mutex based
on the interrupt type.
Aside from the above, one must not block while in high-level interrupt context.
.Pp
On the other hand, when a thread is not in high-level interrupt context, most of
these restrictions are lifted.
Kernel memory may be allocated
.Po
if using a non-blocking allocation such as
.Dv KM_NOSLEEP
or
.Dv KM_NOSLEEP_LAZY
.Pc ,
and many of the other documented functions may be called.
.Pp
Regardless of whether a thread is in high-level or low-level interrupt context,
it will never have a user context associated with it and therefore cannot use
routines like
.Xr ddi_copyin 9F
or
.Xr ddi_copyout 9F .
.It Sy Kernel
Kernel context refers to all other times in the kernel.
Whenever the kernel is executing something on a thread that is not associated
with a user process, then one is in kernel context.
The most common situations for writers of kernel modules are timeout
callbacks, such as those registered with
.Xr timeout 9F
or
.Xr ddi_periodic_add 9F ,
cases where the kernel is invoking a driver's device operation routines such as
.Xr attach 9E
and
.Xr detach 9E ,
and many of the device driver's registered callbacks from frameworks such as
.Xr mac 9E ,
.Xr usba_hcdi 9E ,
and various portions of the SCSI, USB, and block device frameworks.
.It Sy Framework-specific Contexts
Some manuals will discuss more specific constraints about when they can be used.
For example, some functions may only be called while executing a specific entry
point like
.Xr attach 9E .
Another example of this is that the
.Xr mac_transceiver_info_set_present 9F
function is only meant to be used while executing a networking driver's
.Xr mct_info 9E
entry point.
.El
.Ss PARAMETERS
In kernel manual pages
.Pq section 9 ,
each function and entry point description generally has a separate list
of parameters which are arguments to the function.
The parameters section describes the basic purpose of each argument and
should explain where such things often come from and any constraints on
their values.
.Sh INTERFACES
Functions below are organized into categories that describe their purpose.
Individual functions are documented in their own manual pages.
For each of these areas, we discuss the high-level concepts behind it and
provide a brief discussion of how to get started with it.
Note that some deprecated functions and older frameworks are not listed here.
.Pp
Every function listed below has its own manual page in section 9F and
can be read with
.Xr man 1 .
In addition, some corresponding concepts are documented in section 9 and
some groups of functions are present to support a specific type of
device driver, which is discussed more in section 9E.
.Ss Logging Functions
Throughout the kernel there is often a need to log messages that go
either to the system log or to the console.
Such messages can be logged with the
.Xr cmn_err 9F
function or one of its more specific variants that operate in the
context of a device
.Po
.Xr dev_err 9F
.Pc
or a zone
.Po
.Xr zcmn_err 9F
.Pc .
.Pp
The console should be used sparingly.
While a notice may be found there, one should assume that it may be
missed due to overflow, no one being connected to, say, a serial
console at the time, or some other reason.
While the system log is better than the console, care must be taken
not to spam the log.
Imagine if someone logged every time a network packet was generated or
received: the log could quickly run out of space, and it would be harder
to find the messages that explain unusual behavior.
It's also important to remember that only system administrators and
privileged users can actually see this log.
Where possible and appropriate, return programmatic errors from routines
that allow it.
.Pp
The system also supports a structured event log called a system event
that is processed by
.Xr syseventd 8 .
This is used by the OS to provide notifications for things like device
insertion and removal or the change of a data link.
These are driven by the
.Xr ddi_log_sysevent 9F
function and allow arbitrary additional structured metadata in the form
of a
.Vt nvlist_t .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cmn_err 9F Ta Xr dev_err 9F
.It Xr vcmn_err 9F Ta Xr vzcmn_err 9F
.It Xr zcmn_err 9F Ta Xr ddi_log_sysevent 9F
.El
.Ss Memory Allocation
At the heart of most device drivers is memory allocation.
The primary kernel allocator is called
.Qq kmem
.Pq kernel memory
and it is based on the
.Qq vmem
.Pq virtual memory
subsystem.
Most of the time, device drivers should use
.Xr kmem_alloc 9F
and
.Xr kmem_zalloc 9F
to allocate memory and free it with
.Xr kmem_free 9F .
Based on the original kmem and subsequent vmem papers, the kernel
internally uses object caches and magazines to allow high-throughput
allocation in a multi-CPU environment.
.Pp
When allocating memory, an important choice must be made: whether or not
to block for memory.
If one opts to perform a sleeping allocation, then the caller is
guaranteed that the allocation will succeed, but it may take some time
and the thread will be blocked during that entire duration.
This is the
.Dv KM_SLEEP
flag.
On the other hand, there are many circumstances where this is not
appropriate, especially because a thread that is inside a memory
allocation function cannot currently be cancelled.
If the thread corresponds to a user process, then it will not be
killable while it is blocked.
.Pp
For these situations, the
kernel offers allocation modes where it will not block for memory to
become available:
.Dv KM_NOSLEEP
and
.Dv KM_NOSLEEP_LAZY .
These allocations can fail, and they return
.Dv NULL
when they do.
Even though these are said to be no-sleep operations, that does not mean
that the caller cannot end up temporarily blocked due to mutex
contention or, in the case of
.Dv KM_NOSLEEP ,
due to trying a bit more aggressively to reclaim memory.
Unless operating in special circumstances, using
.Dv KM_NOSLEEP_LAZY
should be preferred to
.Dv KM_NOSLEEP .
.Pp
If a device driver has its own complex object that has more significant
set up and tear down costs, then the kmem cache function family should
be considered.
To use a kmem cache, it must first be created using the
.Xr kmem_cache_create 9F
function, which requires specifying the size, alignment, and
constructors and destructors.
Individual objects are allocated from the cache with the
.Xr kmem_cache_alloc 9F
function.
An important constraint when using the caches is that when an object is
freed with
.Xr kmem_cache_free 9F ,
it is the caller's responsibility to ensure that the object is returned
to its constructed state prior to freeing it.
If the object is reused before the kernel reclaims the memory for
other uses, then the constructor will not be called again.
Most device drivers do not need to create a kmem cache for their
own allocations.
.Pp
If you are writing a device driver that interacts with the
networking, STREAMS, or USB subsystems, then note that those subsystems
generally use the
.Vt mblk_t
data structure, which is managed through a different set of APIs, though
they leverage kmem under the hood.
.Pp
The vmem set of interfaces allows for the management of abstract regions
of integers, generally representing memory or some other object, each
with an offset and length.
While it is not common that a device driver needs to do its own such
management,
.Xr vmem_create 9F
and
.Xr vmem_alloc 9F
are what to reach for when the need arises.
Rather than using vmem, if one needs to model a set of integers where
each is a valid identifier, that is, if you need to allocate every integer
between 0 and 1000 as a distinct identifier, instead use
.Xr id_space_create 9F
which is discussed in
.Sx Identifier Management .
For more information on vmem, see
.Xr vmem 9 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kmem_alloc 9F Ta Xr kmem_cache_alloc 9F
.It Xr kmem_cache_create 9F Ta Xr kmem_cache_destroy 9F
.It Xr kmem_cache_free 9F Ta Xr kmem_cache_set_move 9F
.It Xr kmem_free 9F Ta Xr kmem_zalloc 9F
.It Xr vmem_add 9F Ta Xr vmem_alloc 9F
.It Xr vmem_contains 9F Ta Xr vmem_create 9F
.It Xr vmem_destroy 9F Ta Xr vmem_free 9F
.It Xr vmem_size 9F Ta Xr vmem_walk 9F
.It Xr vmem_xalloc 9F Ta Xr vmem_xcreate 9F
.It Xr vmem_xfree 9F Ta Xr bufcall 9F
.It Xr esbbcall 9F Ta Xr qbufcall 9F
.It Xr qunbufcall 9F Ta Xr unbufcall 9F
.El
.Ss String and libc Analogues
The kernel has many analogues for classic libc functions that deal with
string processing, memory copying, and related tasks.
For the most part, these behave similarly to their userland analogues,
but there can be some differences in return values and, for example, in
the set of supported format characters in the case of
.Xr snprintf 9F
and related functions.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ASSERT 9F Ta Xr bcmp 9F
.It Xr bzero 9F Ta Xr bcopy 9F
.It Xr ddi_strdup 9F Ta Xr ddi_strtol 9F
.It Xr ddi_strtoll 9F Ta Xr ddi_strtoul 9F
.It Xr ddi_strtoull 9F Ta Xr ddi_ffs 9F
.It Xr ddi_fls 9F Ta Xr max 9F
.It Xr memchr 9F Ta Xr memcmp 9F
.It Xr memcpy 9F Ta Xr memmove 9F
.It Xr memset 9F Ta Xr min 9F
.It Xr numtos 9F Ta Xr snprintf 9F
.It Xr sprintf 9F Ta Xr stoi 9F
.It Xr strcasecmp 9F Ta Xr strcat 9F
.It Xr strchr 9F Ta Xr strcmp 9F
.It Xr strcpy 9F Ta Xr strdup 9F
.It Xr strfree 9F Ta Xr string 9F
.It Xr strlcat 9F Ta Xr strlcpy 9F
.It Xr strlen 9F Ta Xr strlog 9F
.It Xr strncasecmp 9F Ta Xr strncat 9F
.It Xr strncmp 9F Ta Xr strncpy 9F
.It Xr strnlen 9F Ta Xr strqget 9F
.It Xr strqset 9F Ta Xr strrchr 9F
.It Xr strspn 9F Ta Xr swab 9F
.It Xr vsnprintf 9F Ta Xr va_arg 9F
.It Xr va_copy 9F Ta Xr va_end 9F
.It Xr va_start 9F Ta Xr vsprintf 9F
.El
.Ss Tree Data Structures
These functions provide access to an intrusive self-balancing binary
tree that is generally used throughout illumos.
The primary type here is the
.Vt avl_tree_t .
Structures can be present in multiple trees and there are built-in
walkers for the data structure in
.Xr mdb 1 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr avl_add 9F Ta Xr avl_create 9F
.It Xr avl_destroy_nodes 9F Ta Xr avl_destroy 9F
.It Xr avl_find 9F Ta Xr avl_first 9F
.It Xr avl_insert_here 9F Ta Xr avl_insert 9F
.It Xr avl_is_empty 9F Ta Xr avl_last 9F
.It Xr avl_nearest 9F Ta Xr AVL_NEXT 9F
.It Xr avl_numnodes 9F Ta Xr AVL_PREV 9F
.It Xr avl_remove 9F Ta Xr avl_swap 9F
.El
.Ss Linked Lists
These functions provide a standard, intrusive doubly-linked list whose
type is the
.Vt list_t .
This list implementation is used extensively throughout illumos, has
debugging support through
.Xr mdb 1
walkers, and is generally recommended rather than creating your own
list.
Due to its intrusive nature, a given structure can be present on
multiple lists.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr list_create 9F Ta Xr list_destroy 9F
.It Xr list_head 9F Ta Xr list_insert_after 9F
.It Xr list_insert_before 9F Ta Xr list_insert_head 9F
.It Xr list_insert_tail 9F Ta Xr list_is_empty 9F
.It Xr list_link_active 9F Ta Xr list_link_init 9F
.It Xr list_link_replace 9F Ta Xr list_move_tail 9F
.It Xr list_next 9F Ta Xr list_prev 9F
.It Xr list_remove_head 9F Ta Xr list_remove_tail 9F
.It Xr list_remove 9F Ta Xr list_tail 9F
.El
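.Pp
An intrusive list keeps its linkage inside the object itself, and the list
only needs to know the byte offset of that linkage within the object.
The following sketch shows the idea in ordinary C that compiles outside the
kernel.
It is not the kernel's
.Vt list_t
implementation: the types xl_node_t, xl_list_t, and xl_dev_t and all of the
xl_ functions are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link structure embedded in each object. */
typedef struct xl_node {
	struct xl_node *xn_next;
	struct xl_node *xn_prev;
} xl_node_t;

/* Hypothetical list head: remembers where the link lives in each object. */
typedef struct xl_list {
	size_t xl_offset;
	xl_node_t xl_head;	/* sentinel; points at itself when empty */
} xl_list_t;

/* An object that can sit on two lists at once, one link per list. */
typedef struct xl_dev {
	int xd_id;
	xl_node_t xd_all_link;
	xl_node_t xd_busy_link;
} xl_dev_t;

void
xl_create(xl_list_t *l, size_t offset)
{
	l->xl_offset = offset;
	l->xl_head.xn_next = l->xl_head.xn_prev = &l->xl_head;
}

void
xl_insert_tail(xl_list_t *l, void *obj)
{
	xl_node_t *n;

	assert(obj != NULL);
	n = (xl_node_t *)((char *)obj + l->xl_offset);
	n->xn_prev = l->xl_head.xn_prev;
	n->xn_next = &l->xl_head;
	l->xl_head.xn_prev->xn_next = n;
	l->xl_head.xn_prev = n;
}

/* Return the first object, or NULL if the list is empty. */
void *
xl_head(xl_list_t *l)
{
	if (l->xl_head.xn_next == &l->xl_head)
		return (NULL);
	return ((char *)l->xl_head.xn_next - l->xl_offset);
}

/* Return the object after obj, or NULL at the end of the list. */
void *
xl_next(xl_list_t *l, void *obj)
{
	xl_node_t *n = (xl_node_t *)((char *)obj + l->xl_offset);

	if (n->xn_next == &l->xl_head)
		return (NULL);
	return ((char *)n->xn_next - l->xl_offset);
}
```

Because each list is created with the offset of a different embedded link, for
example offsetof(xl_dev_t, xd_all_link) and offsetof(xl_dev_t, xd_busy_link),
the same xl_dev_t can be on both lists without any extra allocation.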
.Ss Name-Value Pairs
The kernel often uses the
.Vt nvlist_t
data structure to pass around a list of typed name-value pairs.
This data structure is used in diverse areas, particularly because of
its ability to be serialized in different formats that are suitable not
only for use between userland and the kernel, but also for persistent
storage in a file.
.Pp
An
.Vt nvlist_t
structure is initialized with the
.Xr nvlist_alloc 9F
function and can operate with two different degrees of uniqueness: a
mode where only names are unique, or a mode where every name is qualified
by its type.
The former means that if the list has an integer named
.Dq foo
and a string, array, or any other value with the same name is then added,
the integer will be replaced.
However, if the name and type were used together as the unique key, then the
value would only be replaced if both the pair's type and the name
.Dq foo
matched a pair that was already present.
Otherwise, the two different entries would co-exist.
.Pp
When constructing an nvlist, it is normally backed by the normal kmem
allocator and may use either sleeping or non-sleeping allocations.
It is also possible to use a custom allocator, though that generally has
not been necessary in the kernel.
.Pp
Specific keys and values can be looked up directly with the
nvlist_lookup family of functions, but the entire list can be iterated
as well, which is especially useful when trying to validate that no
unknown keys are present in the list.
The iteration API,
.Xr nvlist_next_nvpair 9F ,
yields each pair in turn, from which one can get the key's name, the type
of the pair's value, and then the value itself.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr nv_alloc_fini 9F Ta Xr nv_alloc_init 9F
.It Xr nvlist_add_boolean_array 9F Ta Xr nvlist_add_boolean_value 9F
.It Xr nvlist_add_boolean 9F Ta Xr nvlist_add_byte_array 9F
.It Xr nvlist_add_byte 9F Ta Xr nvlist_add_int16_array 9F
.It Xr nvlist_add_int16 9F Ta Xr nvlist_add_int32_array 9F
.It Xr nvlist_add_int32 9F Ta Xr nvlist_add_int64_array 9F
.It Xr nvlist_add_int64 9F Ta Xr nvlist_add_int8_array 9F
.It Xr nvlist_add_int8 9F Ta Xr nvlist_add_nvlist_array 9F
.It Xr nvlist_add_nvlist 9F Ta Xr nvlist_add_nvpair 9F
.It Xr nvlist_add_string_array 9F Ta Xr nvlist_add_string 9F
.It Xr nvlist_add_uint16_array 9F Ta Xr nvlist_add_uint16 9F
.It Xr nvlist_add_uint32_array 9F Ta Xr nvlist_add_uint32 9F
.It Xr nvlist_add_uint64_array 9F Ta Xr nvlist_add_uint64 9F
.It Xr nvlist_add_uint8_array 9F Ta Xr nvlist_add_uint8 9F
.It Xr nvlist_alloc 9F Ta Xr nvlist_dup 9F
.It Xr nvlist_exists 9F Ta Xr nvlist_free 9F
.It Xr nvlist_lookup_boolean_array 9F Ta Xr nvlist_lookup_boolean_value 9F
.It Xr nvlist_lookup_boolean 9F Ta Xr nvlist_lookup_byte_array 9F
.It Xr nvlist_lookup_byte 9F Ta Xr nvlist_lookup_int16_array 9F
.It Xr nvlist_lookup_int16 9F Ta Xr nvlist_lookup_int32_array 9F
.It Xr nvlist_lookup_int32 9F Ta Xr nvlist_lookup_int64_array 9F
.It Xr nvlist_lookup_int64 9F Ta Xr nvlist_lookup_int8_array 9F
.It Xr nvlist_lookup_int8 9F Ta Xr nvlist_lookup_nvlist_array 9F
.It Xr nvlist_lookup_nvlist 9F Ta Xr nvlist_lookup_nvpair 9F
.It Xr nvlist_lookup_pairs 9F Ta Xr nvlist_lookup_string_array 9F
.It Xr nvlist_lookup_string 9F Ta Xr nvlist_lookup_uint16_array 9F
.It Xr nvlist_lookup_uint16 9F Ta Xr nvlist_lookup_uint32_array 9F
.It Xr nvlist_lookup_uint32 9F Ta Xr nvlist_lookup_uint64_array 9F
.It Xr nvlist_lookup_uint64 9F Ta Xr nvlist_lookup_uint8_array 9F
.It Xr nvlist_lookup_uint8 9F Ta Xr nvlist_merge 9F
.It Xr nvlist_next_nvpair 9F Ta Xr nvlist_pack 9F
.It Xr nvlist_remove_all 9F Ta Xr nvlist_remove 9F
.It Xr nvlist_size 9F Ta Xr nvlist_t 9F
.It Xr nvlist_unpack 9F Ta Xr nvlist_xalloc 9F
.It Xr nvlist_xdup 9F Ta Xr nvlist_xpack 9F
.It Xr nvlist_xunpack 9F Ta Xr nvpair_name 9F
.It Xr nvpair_type 9F Ta Xr nvpair_value_boolean_array 9F
.It Xr nvpair_value_byte_array 9F Ta Xr nvpair_value_byte 9F
.It Xr nvpair_value_int16_array 9F Ta Xr nvpair_value_int16 9F
.It Xr nvpair_value_int32_array 9F Ta Xr nvpair_value_int32 9F
.It Xr nvpair_value_int64_array 9F Ta Xr nvpair_value_int64 9F
.It Xr nvpair_value_int8_array 9F Ta Xr nvpair_value_int8 9F
.It Xr nvpair_value_nvlist_array 9F Ta Xr nvpair_value_nvlist 9F
.It Xr nvpair_value_string_array 9F Ta Xr nvpair_value_string 9F
.It Xr nvpair_value_uint16_array 9F Ta Xr nvpair_value_uint16 9F
.It Xr nvpair_value_uint32_array 9F Ta Xr nvpair_value_uint32 9F
.It Xr nvpair_value_uint64_array 9F Ta Xr nvpair_value_uint64 9F
.It Xr nvpair_value_uint8_array 9F Ta Xr nvpair_value_uint8 9F
.El
.Ss Identifier Management
A common challenge in the kernel is the management of a series of
different IDs.
There are three different families of routines for managing identifiers
presented here, but we recommend the use of the
.Xr id_space_create 9F
and
.Xr id_alloc 9F
family for new use cases.
The ID space can cover all or a subset of the 32-bit integer space and
provides different allocation strategies for this.
.Pp
Due to the current implementation, callers should generally prefer the
non-sleeping variants because the sleeping ones are not cancellable
.Po
currently this is backed by vmem, but this should not be assumed and may
change in the future
.Pc .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr id_alloc_nosleep 9F Ta Xr id_alloc_specific_nosleep 9F
.It Xr id_alloc 9F Ta Xr id_allocff_nosleep 9F
.It Xr id_allocff 9F Ta Xr id_free 9F
.It Xr id_space_create 9F Ta Xr id_space_destroy 9F
.It Xr id_space_extend 9F Ta Xr id_space 9F
.It Xr id32_alloc 9F Ta Xr id32_free 9F
.It Xr id32_lookup 9F Ta Xr rmalloc_wait 9F
.It Xr rmalloc 9F Ta Xr rmallocmap_wait 9F
.It Xr rmallocmap 9F Ta Xr rmfree 9F
.It Xr rmfreemap 9F Ta
.El
.Ss Bit Manipulation Routines
Many device drivers that are working with registers often need to get a
specific range of bits out of an integer.
These functions provide safe ways to set
.Pq bitset
and extract
.Pq bitx
bit ranges, as well
as modify an integer to remove a set of bits entirely
.Pq bitdel .
Using these functions is preferred to constructing manual masks and
shifts, particularly when a programming manual for a device is specified
in ranges of bits.
On debug builds, these provide extra checking to try to catch
programmer error.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bitdel64 9F Ta Xr bitset8 9F
.It Xr bitset16 9F Ta Xr bitset32 9F
.It Xr bitset64 9F Ta Xr bitx8 9F
.It Xr bitx16 9F Ta Xr bitx32 9F
.It Xr bitx64 9F Ta
.El
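.Pp
To make the three operations concrete, here is a sketch of range-based
extract, set, and delete written as ordinary C that compiles outside the
kernel.
The xb_ functions are hypothetical stand-ins, not the kernel routines; their
argument order, the inclusive high and low bit numbering, and the choice that
deleting a range shifts the higher bits down are assumptions of the sketch.
The assertions play the role of the extra checking mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering (high - low + 1) low-order bits; handles a width of 64. */
static uint64_t
xb_mask(unsigned int high, unsigned int low)
{
	unsigned int width = high - low + 1;

	return (width == 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1);
}

/* Extract bits high..low (inclusive), shifted down to bit 0. */
uint64_t
xb_extract(uint64_t reg, unsigned int high, unsigned int low)
{
	assert(high >= low && high < 64);
	return ((reg >> low) & xb_mask(high, low));
}

/* Return reg with bits high..low replaced by val; val must fit the range. */
uint64_t
xb_set(uint64_t reg, unsigned int high, unsigned int low, uint64_t val)
{
	uint64_t mask;

	assert(high >= low && high < 64);
	mask = xb_mask(high, low);
	assert((val & ~mask) == 0);
	return ((reg & ~(mask << low)) | (val << low));
}

/* Remove bits high..low and shift everything above them down. */
uint64_t
xb_del(uint64_t reg, unsigned int high, unsigned int low)
{
	uint64_t lower, upper;

	assert(high >= low && high < 64);
	lower = (low == 0) ? 0 : (reg & xb_mask(low - 1, 0));
	upper = (high == 63) ? 0 : (reg >> (high + 1));
	return ((upper << low) | lower);
}
```

With a register value of 0xABCD, xb_extract(0xABCD, 11, 8) reads the field
documented as bits 11:8 without the caller writing a mask or shift by hand.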
.Ss Synchronization Primitives
The kernel provides a set of basic synchronization primitives that can
be used by the system.
These include mutexes, condition variables, reader/writer locks, and
semaphores.
When creating mutexes and reader/writer locks, the kernel requires that
one pass in the interrupt priority if the lock will be used in
interrupt context.
This is required so the kernel can determine the correct underlying type
of lock to use.
This ensures that if for some reason a mutex needs to be used in
high-level interrupt context, the kernel will use a spin lock, but
otherwise can use the standard adaptive mutex that might block.
For developers familiar with other operating systems, this is somewhat
different in that the consumer generally does not need to work out
this level of detail, which is why it is not part of the interface.
.Pp
In addition, condition variables provide variants for waiting that also
detect that a signal has been delivered.
These variants, whose names end in _sig, are particularly useful when
writing character device operations for device drivers, as they allow users
the chance to cancel an
operation and not be blocked indefinitely on something that may not
occur.
They should generally be preferred where applicable.
.Pp
The kernel also provides memory barrier primitives.
See the
.Sx Memory Barriers
section for more information.
There is no need to use manual memory barriers when using the
synchronization primitives.
The synchronization primitives ensure that the appropriate barriers are
present to maintain coherency while the lock is held.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cv_broadcast 9F Ta Xr cv_destroy 9F
.It Xr cv_init 9F Ta Xr cv_reltimedwait_sig 9F
.It Xr cv_reltimedwait 9F Ta Xr cv_signal 9F
.It Xr cv_timedwait_sig 9F Ta Xr cv_timedwait 9F
.It Xr cv_wait_sig 9F Ta Xr cv_wait 9F
.It Xr ddi_enter_critical 9F Ta Xr ddi_exit_critical 9F
.It Xr mutex_destroy 9F Ta Xr mutex_enter 9F
.It Xr mutex_exit 9F Ta Xr mutex_init 9F
.It Xr mutex_owned 9F Ta Xr mutex_tryenter 9F
.It Xr rw_destroy 9F Ta Xr rw_downgrade 9F
.It Xr rw_enter 9F Ta Xr rw_exit 9F
.It Xr rw_init 9F Ta Xr rw_read_locked 9F
.It Xr rw_tryenter 9F Ta Xr rw_tryupgrade 9F
.It Xr sema_destroy 9F Ta Xr sema_init 9F
.It Xr sema_p_sig 9F Ta Xr sema_p 9F
.It Xr sema_tryp 9F Ta Xr sema_v 9F
.It Xr semaphore 9F Ta
.El
.Ss Atomic Operations
This group of functions provides a general way to perform atomic
operations on integers of different sizes and explicit types.
The
.Xr atomic_ops 9F
manual page describes the different classes of functions in more detail,
but there are functions that take care of using the CPU's instructions
for addition, compare and swap, and more.
If data is being protected and only accessed under a synchronization
primitive such as a mutex or reader-writer lock, then there isn't a
reason to use an atomic operation for that data, generally speaking.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr atomic_add_8_nv 9F Ta Xr atomic_add_8 9F
.It Xr atomic_add_16_nv 9F Ta Xr atomic_add_16 9F
.It Xr atomic_add_32_nv 9F Ta Xr atomic_add_32 9F
.It Xr atomic_add_64_nv 9F Ta Xr atomic_add_64 9F
.It Xr atomic_add_char_nv 9F Ta Xr atomic_add_char 9F
.It Xr atomic_add_int_nv 9F Ta Xr atomic_add_int 9F
.It Xr atomic_add_long_nv 9F Ta Xr atomic_add_long 9F
.It Xr atomic_add_ptr_nv 9F Ta Xr atomic_add_ptr 9F
.It Xr atomic_add_short_nv 9F Ta Xr atomic_add_short 9F
.It Xr atomic_and_8_nv 9F Ta Xr atomic_and_8 9F
.It Xr atomic_and_16_nv 9F Ta Xr atomic_and_16 9F
.It Xr atomic_and_32_nv 9F Ta Xr atomic_and_32 9F
.It Xr atomic_and_64_nv 9F Ta Xr atomic_and_64 9F
.It Xr atomic_and_uchar_nv 9F Ta Xr atomic_and_uchar 9F
.It Xr atomic_and_uint_nv 9F Ta Xr atomic_and_uint 9F
.It Xr atomic_and_ulong_nv 9F Ta Xr atomic_and_ulong 9F
.It Xr atomic_and_ushort_nv 9F Ta Xr atomic_and_ushort 9F
.It Xr atomic_cas_16 9F Ta Xr atomic_cas_32 9F
.It Xr atomic_cas_64 9F Ta Xr atomic_cas_8 9F
.It Xr atomic_cas_ptr 9F Ta Xr atomic_cas_uchar 9F
.It Xr atomic_cas_uint 9F Ta Xr atomic_cas_ulong 9F
.It Xr atomic_cas_ushort 9F Ta Xr atomic_clear_long_excl 9F
.It Xr atomic_dec_8_nv 9F Ta Xr atomic_dec_8 9F
.It Xr atomic_dec_16_nv 9F Ta Xr atomic_dec_16 9F
.It Xr atomic_dec_32_nv 9F Ta Xr atomic_dec_32 9F
.It Xr atomic_dec_64_nv 9F Ta Xr atomic_dec_64 9F
.It Xr atomic_dec_ptr_nv 9F Ta Xr atomic_dec_ptr 9F
.It Xr atomic_dec_uchar_nv 9F Ta Xr atomic_dec_uchar 9F
.It Xr atomic_dec_uint_nv 9F Ta Xr atomic_dec_uint 9F
.It Xr atomic_dec_ulong_nv 9F Ta Xr atomic_dec_ulong 9F
.It Xr atomic_dec_ushort_nv 9F Ta Xr atomic_dec_ushort 9F
.It Xr atomic_inc_8_nv 9F Ta Xr atomic_inc_8 9F
.It Xr atomic_inc_16_nv 9F Ta Xr atomic_inc_16 9F
.It Xr atomic_inc_32_nv 9F Ta Xr atomic_inc_32 9F
.It Xr atomic_inc_64_nv 9F Ta Xr atomic_inc_64 9F
.It Xr atomic_inc_ptr_nv 9F Ta Xr atomic_inc_ptr 9F
.It Xr atomic_inc_uchar_nv 9F Ta Xr atomic_inc_uchar 9F
.It Xr atomic_inc_uint_nv 9F Ta Xr atomic_inc_uint 9F
.It Xr atomic_inc_ulong_nv 9F Ta Xr atomic_inc_ulong 9F
.It Xr atomic_inc_ushort_nv 9F Ta Xr atomic_inc_ushort 9F
.It Xr atomic_or_8_nv 9F Ta Xr atomic_or_8 9F
.It Xr atomic_or_16_nv 9F Ta Xr atomic_or_16 9F
.It Xr atomic_or_32_nv 9F Ta Xr atomic_or_32 9F
.It Xr atomic_or_64_nv 9F Ta Xr atomic_or_64 9F
.It Xr atomic_or_uchar_nv 9F Ta Xr atomic_or_uchar 9F
.It Xr atomic_or_uint_nv 9F Ta Xr atomic_or_uint 9F
.It Xr atomic_or_ulong_nv 9F Ta Xr atomic_or_ulong 9F
.It Xr atomic_or_ushort_nv 9F Ta Xr atomic_or_ushort 9F
.It Xr atomic_set_long_excl 9F Ta Xr atomic_swap_8 9F
.It Xr atomic_swap_16 9F Ta Xr atomic_swap_32 9F
.It Xr atomic_swap_64 9F Ta Xr atomic_swap_ptr 9F
.It Xr atomic_swap_uchar 9F Ta Xr atomic_swap_uint 9F
.It Xr atomic_swap_ulong 9F Ta Xr atomic_swap_ushort 9F
.El
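.Pp
The two most common shapes of atomic use, an increment that returns the new
value and a compare-and-swap retry loop, can be sketched as follows.
This is a userland analogue built on the C11 stdatomic.h interfaces so that it
compiles outside the kernel; it does not use the kernel functions listed
above, and xa_inc_nv() and xa_max() are hypothetical names invented for the
example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Increment and return the new value, in the style of an "_nv" variant. */
uint32_t
xa_inc_nv(_Atomic uint32_t *target)
{
	assert(target != NULL);
	return (atomic_fetch_add(target, 1) + 1);
}

/*
 * Raise *target to at least val using a compare-and-swap loop.
 * Returns the value that was observed before any update was made.
 */
uint32_t
xa_max(_Atomic uint32_t *target, uint32_t val)
{
	uint32_t old;

	assert(target != NULL);
	old = atomic_load(target);
	while (old < val &&
	    !atomic_compare_exchange_weak(target, &old, val)) {
		/* A failed exchange refreshed old with the current value. */
	}
	return (old);
}
```

The retry loop is the part worth noting: a compare-and-swap only succeeds if
no other thread changed the value in between, so the caller re-evaluates its
condition against the freshly observed value each time.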
.Ss Memory Barriers
The kernel provides general purpose memory barriers that can be used
when required.
In general, when using items described in the
.Sx Synchronization Primitives
section, these are not required.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr membar_consumer 9F Ta Xr membar_enter 9F
.It Xr membar_exit 9F Ta Xr membar_producer 9F
.El
.Ss Virtual Memory and Pages
All platforms that the operating system supports have some form of
virtual memory which is managed in units of pages.
The page size varies between architectures and platforms.
For example, the smallest x86 page size is 4 KiB while SPARC
traditionally used 8 KiB pages.
These functions can be used to convert between pages and bytes.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr btop 9F Ta Xr btopr 9F
.It Xr ddi_btop 9F Ta Xr ddi_btopr 9F
.It Xr ddi_ptob 9F Ta Xr ptob 9F
.El
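.Pp
The arithmetic behind these conversions is small enough to show directly.
The sketch below assumes a fixed 4 KiB page so that it compiles outside the
kernel; real code should call the functions above, which use the platform's
actual page size.
The xp_ names and the XP_ constants are hypothetical and exist only for the
example.

```c
#include <assert.h>
#include <stddef.h>

#define	XP_PAGESHIFT	12			/* assumed: 4 KiB pages */
#define	XP_PAGESIZE	((size_t)1 << XP_PAGESHIFT)

/* Bytes to pages, rounding down. */
size_t
xp_btop(size_t bytes)
{
	return (bytes >> XP_PAGESHIFT);
}

/* Bytes to pages, rounding up so that a partial page counts as one. */
size_t
xp_btopr(size_t bytes)
{
	return ((bytes >> XP_PAGESHIFT) +
	    ((bytes & (XP_PAGESIZE - 1)) != 0));
}

/* Pages to bytes; the page count must not overflow when shifted. */
size_t
xp_ptob(size_t pages)
{
	assert(pages <= ((size_t)-1 >> XP_PAGESHIFT));
	return (pages << XP_PAGESHIFT);
}
```

The rounding direction is the usual source of mistakes: sizing a buffer that
must hold a given number of bytes calls for the rounding-up form, while
counting how many whole pages fit in a region calls for the rounding-down
form.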
.Ss Module and Device Framework
These functions are used as part of implementing kernel modules and
registering device drivers with the various kernel frameworks.
There are also functions here that are suitable for use in the
.Xr dev_ops 9S ,
.Xr cb_ops 9S ,
etc.
structures and for interrogating module information.
.Pp
The
.Xr mod_install 9F
and
.Xr mod_remove 9F
functions are used during a driver's
.Xr _init 9E
and
.Xr _fini 9E
functions.
.Pp
There are two different ways that drivers often manage their instance
state, which is created during
.Xr attach 9E .
The first is the use of
.Xr ddi_set_driver_private 9F
and
.Xr ddi_get_driver_private 9F .
This stores a driver-specific value on the
.Vt dev_info_t
structure which allows it to be used during other operations.
Some device driver frameworks may use this themselves, making this
unavailable to the driver.
.Pp
The other path is to use the soft state suite of functions, which
dynamically grows to cover the number of instances of a device that
exist.
The soft state is generally initialized in the
.Xr _init 9E
entry point with
.Xr ddi_soft_state_init 9F .
Instances are then allocated and freed during
.Xr attach 9E
and
.Xr detach 9E
with
.Xr ddi_soft_state_zalloc 9F
and
.Xr ddi_soft_state_free 9F ,
and retrieved with
.Xr ddi_get_soft_state 9F .
761.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
762.It Xr ddi_get_driver_private 9F Ta Xr ddi_get_soft_state 9F
763.It Xr ddi_modclose 9F Ta Xr ddi_modopen 9F
764.It Xr ddi_modsym 9F Ta Xr ddi_no_info 9F
765.It Xr ddi_report_dev 9F Ta Xr ddi_set_driver_private 9F
766.It Xr ddi_soft_state_fini 9F Ta Xr ddi_soft_state_free 9F
767.It Xr ddi_soft_state_init 9F Ta Xr ddi_soft_state_zalloc 9F
768.It Xr mod_info 9F Ta Xr mod_install 9F
769.It Xr mod_modname 9F Ta Xr mod_remove 9F
770.It Xr nochpoll 9F Ta Xr nodev 9F
771.It Xr nulldev 9F Ta
772.El
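.Pp
To make the idea of soft state more concrete, the following userland sketch
models it as a table of per-instance structures that grows on demand.
It is only an illustration of the concept: the sketch_ss_ names, the doubling
growth policy, and the return values are assumptions of this sketch and do
not describe how the kernel implements the soft state functions.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct sketch_softstate {
	size_t	ss_size;	/* size of each per-instance structure */
	size_t	ss_nitems;	/* number of slots in ss_items */
	void	**ss_items;	/* slot is NULL until an instance allocates */
} sketch_softstate_t;

/* Set up an empty table; would happen once, at module load time. */
int
sketch_ss_init(sketch_softstate_t *ss, size_t size, size_t nitems)
{
	size_t i;

	assert(size > 0);
	if (nitems == 0)
		nitems = 1;
	ss->ss_items = malloc(nitems * sizeof (void *));
	if (ss->ss_items == NULL)
		return (-1);
	for (i = 0; i < nitems; i++)
		ss->ss_items[i] = NULL;
	ss->ss_size = size;
	ss->ss_nitems = nitems;
	return (0);
}

/* Allocate zeroed state for an instance, growing the table if needed. */
int
sketch_ss_zalloc(sketch_softstate_t *ss, size_t inst)
{
	if (inst >= ss->ss_nitems) {
		size_t i, n = ss->ss_nitems;
		void **items;

		while (n <= inst)
			n *= 2;
		items = realloc(ss->ss_items, n * sizeof (void *));
		if (items == NULL)
			return (-1);
		for (i = ss->ss_nitems; i < n; i++)
			items[i] = NULL;
		ss->ss_items = items;
		ss->ss_nitems = n;
	}
	if (ss->ss_items[inst] != NULL)
		return (-1);	/* this instance already has state */
	ss->ss_items[inst] = calloc(1, ss->ss_size);
	return (ss->ss_items[inst] != NULL ? 0 : -1);
}

/* Look up an instance's state; NULL if it was never allocated. */
void *
sketch_ss_get(const sketch_softstate_t *ss, size_t inst)
{
	return (inst < ss->ss_nitems ? ss->ss_items[inst] : NULL);
}

/* Release one instance's state, as would happen during detach. */
void
sketch_ss_free(sketch_softstate_t *ss, size_t inst)
{
	if (inst < ss->ss_nitems) {
		free(ss->ss_items[inst]);
		ss->ss_items[inst] = NULL;
	}
}

/* Tear down the whole table, as would happen at module unload time. */
void
sketch_ss_fini(sketch_softstate_t *ss)
{
	size_t i;

	for (i = 0; i < ss->ss_nitems; i++)
		free(ss->ss_items[i]);
	free(ss->ss_items);
	ss->ss_items = NULL;
	ss->ss_nitems = 0;
}
```
.Ed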
773.Ss Device Tree Information
774Devices are organized into a tree that is partially seeded by the
775platform based on information discovered at boot and augmented with
776additional information at runtime.
777Every instance of a device driver is given a
778.Vt "dev_info_t *"
779.Pq device information
780data structure which corresponds to information about an instance and
781has a place in the tree.
782When a driver requests an operation such as allocating memory for DMA,
783that request is passed up the tree and modified.
784The same is true for other things like interrupts, event notifications,
785or properties.
786.Pp
787There are many different informational properties about a device driver.
788For example,
789.Xr ddi_driver_name 9F
790returns the name of the device driver,
791.Xr ddi_get_name 9F
792returns the name of the node in the tree,
793.Xr ddi_get_parent 9F
794returns a node's parent, and
795.Xr ddi_get_instance 9F
796returns the instance number of a specific driver.
797.Pp
798There are a series of properties that exist on the tree, the exact set
799of which depends on the class of the device and is often documented in a
800specific device class's manual.
801For example, the
802.Dq reg
803property is used for PCI and PCIe devices to describe the various base
804address registers, their types, and related information, which are documented in
805.Xr pci 5 .
806.Pp
807When getting a property, one can constrain the lookup to the current instance
808or ask that a parent also try to look up the property.
809Which mode is appropriate depends on the specific class of driver, its
810parent, and the property.
811.Pp
812Using a
813.Vt "dev_info_t *"
814pointer has to be done carefully.
815When a device driver is in any of its
816.Xr dev_ops 9S ,
817.Xr cb_ops 9S ,
818or similar callback functions that it has registered with the kernel,
819then it can always safely use its own
820.Vt "dev_info_t"
821and those of any parents it discovers through
822.Xr ddi_get_parent 9F .
823However, it cannot assume the validity of any siblings or children
824unless there are other circumstances that guarantee that they will not
825disappear.
826In the broader kernel, one should not assume that it is safe to use a
827given
828.Vt "dev_info_t *"
829structure without the appropriate NDI
830.Pq nexus driver interface
831hold having been applied.
832.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
833.It Xr ddi_binding_name 9F Ta Xr ddi_dev_is_sid 9F
834.It Xr ddi_driver_major 9F Ta Xr ddi_driver_name 9F
835.It Xr ddi_get_devstate 9F Ta Xr ddi_get_instance 9F
836.It Xr ddi_get_name 9F Ta Xr ddi_get_parent 9F
837.It Xr ddi_getlongprop_buf 9F Ta Xr ddi_getlongprop 9F
838.It Xr ddi_getprop 9F Ta Xr ddi_getproplen 9F
839.It Xr ddi_node_name 9F Ta Xr ddi_prop_create 9F
840.It Xr ddi_prop_exists 9F Ta Xr ddi_prop_free 9F
841.It Xr ddi_prop_get_int 9F Ta Xr ddi_prop_get_int64 9F
842.It Xr ddi_prop_lookup_byte_array 9F Ta Xr ddi_prop_lookup_int_array 9F
843.It Xr ddi_prop_lookup_int64_array 9F Ta Xr ddi_prop_lookup_string_array 9F
844.It Xr ddi_prop_lookup_string 9F Ta Xr ddi_prop_lookup 9F
845.It Xr ddi_prop_modify 9F Ta Xr ddi_prop_op 9F
846.It Xr ddi_prop_remove_all 9F Ta Xr ddi_prop_remove 9F
847.It Xr ddi_prop_undefine 9F Ta Xr ddi_prop_update_byte_array 9F
848.It Xr ddi_prop_update_int_array 9F Ta Xr ddi_prop_update_int 9F
849.It Xr ddi_prop_update_int64_array 9F Ta Xr ddi_prop_update_int64 9F
850.It Xr ddi_prop_update_string_array 9F Ta Xr ddi_prop_update_string 9F
851.It Xr ddi_prop_update 9F Ta Xr ddi_root_node 9F
852.It Xr ddi_slaveonly 9F Ta
853.El
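.Pp
The difference between constraining a property lookup to the current
instance and letting a parent try to answer it can be sketched in userland as
a walk up a small tree.
The structures, the sketch_prop_get_int function, and the
SKETCH_PROP_DONTPASS flag are hypothetical and exist only for this
illustration; the kernel's property interfaces are the ones listed above.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct sketch_prop {
	const char	*sp_name;
	int		sp_value;
} sketch_prop_t;

typedef struct sketch_node {
	const struct sketch_node	*sn_parent;	/* NULL at the root */
	const sketch_prop_t		*sn_props;
	size_t				sn_nprops;
} sketch_node_t;

/* Hypothetical flag: only consult the starting node, never its parents. */
#define	SKETCH_PROP_DONTPASS	0x1

/*
 * Look up an integer property by name. The search starts at the given
 * node and, unless SKETCH_PROP_DONTPASS is set, continues through each
 * parent in turn. If nothing matches, the caller's default is returned.
 */
int
sketch_prop_get_int(const sketch_node_t *node, unsigned int flags,
    const char *name, int defval)
{
	assert(name != NULL);
	for (; node != NULL; node = node->sn_parent) {
		size_t i;

		for (i = 0; i < node->sn_nprops; i++) {
			if (strcmp(node->sn_props[i].sp_name, name) == 0)
				return (node->sn_props[i].sp_value);
		}
		if ((flags & SKETCH_PROP_DONTPASS) != 0)
			break;
	}
	return (defval);
}
```
.Ed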
854.Ss Copying Data to and from Userland
855The kernel operates in a different context from userland.
856One does not simply access user memory.
857This is enforced either by the architecture's memory model, where user
858address space isn't even present in the kernel's virtual address space
859or by architectural mechanisms such as Supervisor Mode Access Protect
860.Pq SMAP
861on x86.
862.Pp
863To facilitate accessing memory, the kernel provides a few routines that
864can be used.
865In most contexts the main thing to use is
866.Xr ddi_copyin 9F
867and
868.Xr ddi_copyout 9F .
869These will safely dereference addresses and ensure that the address is
870appropriate depending on whether this is coming from the user or kernel.
871When operating with the kernel's
872.Vt uio_t
873structure, which is mostly used when processing read and write
874requests,
875.Xr uiomove 9F
876is the go-to function instead.
877.Pp
878When reading data from userland into the kernel, there is another
879concern: the data model.
880The most common place this comes up is in an
881.Xr ioctl 9E
882handler or other places where the kernel is operating on data that isn't
883fixed size.
884Particularly in C, though this applies to other languages, structures
885and unions vary in their size and alignment requirements between 32-bit
886and 64-bit processes.
887The same even applies if one uses pointers or the
888.Vt long ,
889.Vt size_t ,
890or similar types in C.
891In supported 32-bit and 64-bit environments these types are 4 and 8
892bytes respectively.
893To account for this, when data is not fixed size between all data
894models, the driver must look at the data model of the process it is
895copying data from.
896.Pp
897The simplest way to solve this problem is to try to make the data
898structure the same across the different models.
899It's not sufficient to just use the same structure definition and fixed
900size types as the alignment and padding between the two can vary.
901For example, the alignment of a 64-bit integer like a
902.Vt uint64_t
903can change between a 32-bit and 64-bit data model.
904One way to check for the data structures being identical is to leverage
905the
906.Xr ctfdiff 1
907program, generally with the
908.Fl I
909option.
910.Pp
911However, there are times when a structure simply can't be the same, such
912as when we're encoding a pointer into the structure or a type like the
913.Vt size_t .
914When this happens, the most natural way to accomplish this is to use the
915.Xr ddi_model_convert_from 9F
916function which can determine the appropriate model from the ioctl's
917arguments.
918This provides a natural way to copy a structure in and out in the
919appropriate data model and convert it at those points to the kernel's
920native form.
921.Pp
922An alternate way to approach the data model is to use the
923.Xr STRUCT_DECL 9F
924functions, but as this requires wrapping every access to every member,
925often times the
926.Xr ddi_model_convert_from 9F
927approach and taking care of converting values and ensuring that limits
928aren't exceeded at the end is preferred.
929.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
930.It Xr bp_copyin 9F Ta Xr bp_copyout 9F
931.It Xr copyin 9F Ta Xr copyout 9F
932.It Xr ddi_copyin 9F Ta Xr ddi_copyout 9F
933.It Xr ddi_model_convert_from 9F Ta Xr SIZEOF_PTR 9F
934.It Xr SIZEOF_STRUCT 9F Ta Xr STRUCT_BUF 9F
935.It Xr STRUCT_DECL 9F Ta Xr STRUCT_FADDR 9F
936.It Xr STRUCT_FGET 9F Ta Xr STRUCT_FGETP 9F
937.It Xr STRUCT_FSET 9F Ta Xr STRUCT_FSETP 9F
938.It Xr STRUCT_HANDLE 9F Ta Xr STRUCT_INIT 9F
939.It Xr STRUCT_SET_HANDLE 9F Ta Xr STRUCT_SIZE 9F
940.It Xr uiomove 9F Ta Xr ureadc 9F
941.It Xr uwritec 9F Ta
942.El
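.Pp
The approach of copying a structure in using the caller's data model and
then converting it to the kernel's native form can be sketched as follows.
This is a userland illustration under stated assumptions: memcpy stands in
for the copy from user memory, and the sketch_ types, the model enumeration,
and the length limit are all hypothetical.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data model of the calling process. */
typedef enum sketch_model {
	SKETCH_MODEL_ILP32,
	SKETCH_MODEL_NATIVE
} sketch_model_t;

#define	SKETCH_MAX_LEN	4096	/* arbitrary limit imposed by this sketch */

/* Layout a 32-bit caller would use: pointers and sizes are 4 bytes. */
typedef struct sketch_req32 {
	uint32_t	sr_buf;
	uint32_t	sr_len;
} sketch_req32_t;

/* Native layout used by the rest of the (pretend) driver. */
typedef struct sketch_req {
	uintptr_t	sr_buf;
	size_t		sr_len;
} sketch_req_t;

/*
 * Copy a request in, using the layout that matches the caller's data
 * model, and widen it to the native form. Limits are validated once, on
 * the native form, after conversion.
 */
int
sketch_req_copyin(const void *src, sketch_model_t model, sketch_req_t *req)
{
	sketch_req32_t req32;

	assert(src != NULL && req != NULL);
	switch (model) {
	case SKETCH_MODEL_ILP32:
		memcpy(&req32, src, sizeof (req32));
		req->sr_buf = (uintptr_t)req32.sr_buf;
		req->sr_len = (size_t)req32.sr_len;
		break;
	case SKETCH_MODEL_NATIVE:
		memcpy(req, src, sizeof (*req));
		break;
	default:
		return (-1);
	}
	if (req->sr_len > SKETCH_MAX_LEN)
		return (-1);
	return (0);
}
```
.Ed
.Pp
Converting at the boundary means the remainder of the driver only ever deals
with the native structure.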
943.Ss Device Register Setup and Access
944The kernel abstracts out accessing registers on a device on behalf of
945drivers.
946This allows a similar set of interfaces to be used whether the registers
947are found within a PCI BAR, utilizing I/O ports, memory mapped
948registers, or some other scheme.
949Devices with registers all have a
950.Dq reg
951property that is set up by their parent device, generally a kernel
952framework as is the case for PCIe devices, and the meaning is a contract
953between the two.
954Register sets are identified by a numeric ID, which varies based on the device
955type.
956For example, the first BAR of a PCI device is defined as register set 1.
957On the other hand, the AMD GPIO controller might have three register sets
958because of how the hardware design splits them up.
959The meaning of the registers and their semantics is still
960device-specific.
961The kernel doesn't know how to interpret the actual registers of, say, a PCIe
962device, just that they exist.
963.Pp
964To begin with register setup, one often first looks at the number of
965register sets that exist and their size.
966Most PCI-based device drivers will skip calling
967.Xr ddi_dev_nregs 9F
968and will just move straight to calling
969.Xr ddi_dev_regsize 9F
970to determine the size of a register set that they are interested in.
971To actually map the registers, a device driver will call
972.Xr ddi_regs_map_setup 9F
973which requires both a register set and a series of attributes and
974returns an access handle that is used to actually read and write the
975registers.
976When setting up registers, one must have a corresponding
977.Vt ddi_device_acc_attr_t
978structure which is used to define what endianness the register set is
979in, whether any kind of reordering is allowed
980.Po
981if in doubt specify
982.Dv DDI_STRICTORDER_ACC
983.Pc ,
984and whether any particular error handling is being used.
985The structure and all of its different options are described in
986.Xr ddi_device_acc_attr 9S .
987.Pp
988Once a register handle is obtained, then it's easy to read and write the
989register space.
990Functions are organized based on the size of the access.
991Most situations call for the use of the
992.Xr ddi_get8 9F ,
993.Xr ddi_get16 9F ,
994.Xr ddi_get32 9F ,
995and
996.Xr ddi_get64 9F
997functions to read a register and the
998.Xr ddi_put8 9F ,
999.Xr ddi_put16 9F ,
1000.Xr ddi_put32 9F ,
1001and
1002.Xr ddi_put64 9F
1003functions to set a register value.
1004While there are the ddi_io_ and ddi_mem_ families of functions below,
1005these are rarely needed and are mostly present for
1006compatibility.
1007The kernel will automatically perform the appropriate type of register
1008read for the device type in question.
1009.Pp
1010Once a register set is no longer being used, the
1011.Xr ddi_regs_map_free 9F
1012function should be used to release resources.
1013In most cases, this happens while executing the
1014.Xr detach 9E
1015entry point.
1016.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1017.It Xr ddi_dev_nregs 9F Ta Xr ddi_dev_regsize 9F
1018.It Xr ddi_device_copy 9F Ta Xr ddi_device_zero 9F
1019.It Xr ddi_regs_map_free 9F Ta Xr ddi_regs_map_setup 9F
1020.It Xr ddi_get8 9F Ta Xr ddi_get16 9F
1021.It Xr ddi_get32 9F Ta Xr ddi_get64 9F
1022.It Xr ddi_io_get8 9F Ta Xr ddi_io_get16 9F
1023.It Xr ddi_io_get32 9F Ta Xr ddi_io_put8 9F
1024.It Xr ddi_io_put16 9F Ta Xr ddi_io_put32 9F
1025.It Xr ddi_io_rep_get8 9F Ta Xr ddi_io_rep_get16 9F
1026.It Xr ddi_io_rep_get32 9F Ta Xr ddi_io_rep_put8 9F
1027.It Xr ddi_io_rep_put16 9F Ta Xr ddi_io_rep_put32 9F
1028.It Xr ddi_map_regs 9F Ta Xr ddi_mem_get8 9F
1029.It Xr ddi_mem_get16 9F Ta Xr ddi_mem_get32 9F
1030.It Xr ddi_mem_get64 9F Ta Xr ddi_mem_put8 9F
1031.It Xr ddi_mem_put16 9F Ta Xr ddi_mem_put32 9F
1032.It Xr ddi_mem_put64 9F Ta Xr ddi_mem_rep_get8 9F
1033.It Xr ddi_mem_rep_get16 9F Ta Xr ddi_mem_rep_get32 9F
1034.It Xr ddi_mem_rep_get64 9F Ta Xr ddi_mem_rep_put8 9F
1035.It Xr ddi_mem_rep_put16 9F Ta Xr ddi_mem_rep_put32 9F
1036.It Xr ddi_mem_rep_put64 9F Ta Xr ddi_peek8 9F
1037.It Xr ddi_peek16 9F Ta Xr ddi_peek32 9F
1038.It Xr ddi_peek64 9F Ta Xr ddi_poke8 9F
1039.It Xr ddi_poke16 9F Ta Xr ddi_poke32 9F
1040.It Xr ddi_poke64 9F Ta Xr ddi_put8 9F
1041.It Xr ddi_put16 9F Ta Xr ddi_put32 9F
1042.It Xr ddi_put64 9F Ta Xr ddi_rep_get8 9F
1043.It Xr ddi_rep_get16 9F Ta Xr ddi_rep_get32 9F
1044.It Xr ddi_rep_get64 9F Ta Xr ddi_rep_put8 9F
1045.It Xr ddi_rep_put16 9F Ta Xr ddi_rep_put32 9F
1046.It Xr ddi_rep_put64 9F Ta
1047.El
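.Pp
The role that the access attributes play can be illustrated with a userland
sketch in which a handle records the endianness of a register set and the
accessors assemble values byte by byte.
The handle structure and the sketch_get32 and sketch_put32 functions are
hypothetical, and unlike the real accessors they take a byte offset rather
than an address within the mapping.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum sketch_endian {
	SKETCH_LE_ACC,	/* registers are little-endian */
	SKETCH_BE_ACC	/* registers are big-endian */
} sketch_endian_t;

/* Hypothetical access handle: where the registers are and how to read them. */
typedef struct sketch_acc_handle {
	uint8_t		*sa_base;
	size_t		sa_len;
	sketch_endian_t	sa_endian;
} sketch_acc_handle_t;

/* A 32-bit access must be aligned and fall entirely inside the mapping. */
static void
sketch_check(const sketch_acc_handle_t *h, size_t off)
{
	assert(h->sa_len >= 4 && off <= h->sa_len - 4 && (off % 4) == 0);
}

/* Read a 32-bit register, honoring the handle's endianness. */
uint32_t
sketch_get32(const sketch_acc_handle_t *h, size_t off)
{
	const uint8_t *p;

	sketch_check(h, off);
	p = h->sa_base + off;
	if (h->sa_endian == SKETCH_LE_ACC) {
		return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
	}
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Write a 32-bit register, honoring the handle's endianness. */
void
sketch_put32(const sketch_acc_handle_t *h, size_t off, uint32_t val)
{
	uint8_t *p;
	int i;

	sketch_check(h, off);
	p = h->sa_base + off;
	for (i = 0; i < 4; i++) {
		int shift = (h->sa_endian == SKETCH_LE_ACC) ?
		    8 * i : 8 * (3 - i);
		p[i] = (uint8_t)(val >> shift);
	}
}
```
.Ed
.Pp
Because the endianness lives in the handle, the calling code is identical
regardless of the byte order of the host or the device.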
1048.Ss DMA Related Functions
1049Most high-performance devices provide first-class support for DMA
1050.Pq direct memory access .
1051DMA allows a transfer between a device and memory to occur
1052asynchronously and generally without a thread's specific involvement.
1053Today, most DMA is provided directly by devices and the corresponding
1054device scheme.
1055Take PCI and PCI Express for example.
1056The idea of DMA is built into the PCIe standard, so basic support for it
1057exists and there isn't a lot of special
1058programming required.
1059However, this hasn't always been true, and there are still some cases
1060where there is a 3rd party DMA engine.
1061If we consider the PCIe example, the PCIe device directly performs reads
1062and writes to main memory on its own.
1063However, in the 3rd party case, there is a distinct controller that is
1064neither the device nor memory that facilitates this, which is called a
1065DMA engine.
1066DMA engines are not something that needs to be thought
1067about on most platforms that illumos runs on; however, they still
1068exist in some embedded and related contexts.
1069.Pp
1070The first thing that a driver needs to do to set up DMA is to understand
1071the constraints of the device and bus.
1072These constraints are described in a series of attributes in the
1073.Vt ddi_dma_attr_t
1074structure which is defined in
1075.Xr ddi_dma_attr 9S .
1076The reason that attributes exist is because different devices, and
1077sometimes different memory uses with a device, have different
1078requirements for memory.
1079A simple example of this is that not all devices can accept memory
1080addresses that are 64 bits wide; such devices may have to be constrained to the
1081lower 32 bits of the address space.
1082Another common constraint is how this memory is chunked up.
1083Some devices may require that all of the DMA memory be contiguous, while
1084others can allow that to be broken up into say up to 4 or 8 different
1085regions.
1086.Pp
1087When memory is allocated for DMA it isn't immediately mapped into the
1088kernel's address space.
1089The addresses that describe a DMA address are defined in a DMA cookie,
1090several of which may make up a request.
1091However, those addresses are always physical addresses or addresses that
1092are virtualized by an IOMMU.
1093There are some cases where the kernel or a driver needs to be able to
1094access that memory, such as memory that represents a networking packet.
1095The IP stack will expect to be able to actually read the data it's
1096given.
1097.Pp
1098To begin with allocating DMA memory, a driver first fills out its
1099attribute structure.
1100Once that's ready, the DMA allocation process can begin.
1101This starts off by a driver calling
1102.Xr ddi_dma_alloc_handle 9F .
1103This handle is used through the lifetime of a given DMA memory buffer,
1104but it can be used across multiple operations that a device or the
1105kernel may perform.
1106The next step is to actually request that the kernel allocate some
1107amount of memory in the kernel for this DMA request.
1108This phase actually allocates addresses in virtual address space for the
1109activity and also requires a register attribute object that is discussed
1110in
1111.Sx Device Register Setup and Access .
1112Armed with this a driver can now call
1113.Xr ddi_dma_mem_alloc 9F
1114to specify how much memory they are looking for.
1115If this is successful, a virtual address, the actual length of the
1116region, and an access handle will be returned.
1117.Pp
1118At this point, the virtual address region is present.
1119Most drivers will access this virtual address range directly and will
1120ignore the register access handle.
1121The side effect of this is that they will handle all endianness issues
1122with the memory region themselves.
1123If the driver would prefer to go through the handle, then it can use the
1124register access functions discussed earlier.
1125.Pp
1126Before the memory can be programmed into the device, it must be bound to
1127a series of physical addresses or addresses virtualized by an IOMMU.
1128While the kernel presents the illusion of a single consistent virtual
1129address range for applications, the physical reality can be quite
1130different.
1131When the driver is ready it calls
1132.Xr ddi_dma_addr_bind_handle 9F
1133to create the mapping to well-known physical addresses.
1134.Pp
1135These addresses are stored in a series of cookies.
1136A driver can determine the number of cookies for a given request by
1137utilizing its DMA handle and calling
1138.Xr ddi_dma_ncookies 9F
1139and then pairing that with
1140.Xr ddi_dma_cookie_get 9F .
1141These DMA cookies will not change and can be used time and time again
1142until
1143.Xr ddi_dma_unbind_handle 9F
1144is called.
1145With this information in hand, a physical device can be programmed with
1146these addresses and let loose to perform I/O.
1147.Pp
1148When performing I/O to and from a device, synchronization is a vitally
1149important thing which ensures that the actual state in memory is
1150coherent with the rest of the CPU's internal structures such as caches.
1151In general, a given DMA request only goes in one direction: it is destined
1152for the device or for the local CPU.
1153In either case, the
1154.Xr ddi_dma_sync 9F
1155function must be called at the right point: either after the kernel is done writing to a region of
1156DMA memory and before it triggers the device, or
1157after the device has indicated that some activity has completed and before the kernel
1158examines the results.
1159.Pp
1160Some DMA operations utilize what are called DMA windows.
1161The most common consumer is something like a disk device where DMA
1162operations to a given series of sectors can be split up into different
1163chunks where as long as all the transfers are performed, the
1164intermediate states are acceptable.
1165Put another way, because of how SCSI and SAS commands are designed,
1166block devices can basically take a given I/O request and break it into
1167multiple independent I/Os that will equate to the same final item.
1168.Pp
1169When a device supports this mode of operation and it is opted into, then
1170a DMA allocation may result in the use of DMA windows.
1171This allows for cases where the kernel can't perform a DMA allocation
1172for the entire request, but instead can allocate a partial region and
1173then walk through each part one at a time.
1174This is uncommon outside of block devices and usually also is related to
1175calling
1176.Xr ddi_dma_buf_bind_handle 9F .
1177.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1178.It Xr ddi_dma_addr_bind_handle 9F Ta Xr ddi_dma_alloc_handle 9F
1179.It Xr ddi_dma_buf_bind_handle 9F Ta Xr ddi_dma_burstsizes 9F
1180.It Xr ddi_dma_cookie_get 9F Ta Xr ddi_dma_cookie_iter 9F
1181.It Xr ddi_dma_cookie_one 9F Ta Xr ddi_dma_free_handle 9F
1182.It Xr ddi_dma_getwin 9F Ta Xr ddi_dma_mem_alloc 9F
1183.It Xr ddi_dma_mem_free 9F Ta Xr ddi_dma_ncookies 9F
1184.It Xr ddi_dma_nextcookie 9F Ta Xr ddi_dma_numwin 9F
1185.It Xr ddi_dma_set_sbus64 9F Ta Xr ddi_dma_sync 9F
1186.It Xr ddi_dma_unbind_handle 9F Ta Xr ddi_dmae_1stparty 9F
1187.It Xr ddi_dmae_alloc 9F Ta Xr ddi_dmae_disable 9F
1188.It Xr ddi_dmae_enable 9F Ta Xr ddi_dmae_getattr 9F
1189.It Xr ddi_dmae_getcnt 9F Ta Xr ddi_dmae_prog 9F
1190.It Xr ddi_dmae_release 9F Ta Xr ddi_dmae_stop 9F
1191.It Xr ddi_dmae 9F Ta
1192.El
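.Pp
As an illustration of what DMA attributes constrain, the following userland
sketch checks a list of cookies against an address range and a maximum
number of scatter/gather entries.
The structures here are simplified, hypothetical stand-ins that only loosely
mirror the kinds of limits discussed above; in the kernel it is the DMA
framework, not the driver, that enforces the attributes when binding.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device constraints. */
typedef struct sketch_dma_attr {
	uint64_t	sd_addr_lo;	/* lowest usable address */
	uint64_t	sd_addr_hi;	/* highest usable address, inclusive */
	size_t		sd_sgllen;	/* max scatter/gather entries */
} sketch_dma_attr_t;

/* Hypothetical cookie: one contiguous region the device would access. */
typedef struct sketch_dma_cookie {
	uint64_t	sc_addr;
	uint64_t	sc_size;
} sketch_dma_cookie_t;

/*
 * Return true only if the cookie list fits within the device's limits:
 * there are not too many cookies and every byte of every cookie lies
 * within the allowed address range.
 */
bool
sketch_dma_cookies_ok(const sketch_dma_attr_t *attr,
    const sketch_dma_cookie_t *cookies, size_t ncookies)
{
	size_t i;

	assert(attr->sd_addr_lo <= attr->sd_addr_hi);
	if (ncookies == 0 || ncookies > attr->sd_sgllen)
		return (false);
	for (i = 0; i < ncookies; i++) {
		uint64_t first = cookies[i].sc_addr;
		uint64_t last;

		if (cookies[i].sc_size == 0)
			return (false);
		last = first + (cookies[i].sc_size - 1);
		if (last < first)
			return (false);	/* address arithmetic wrapped */
		if (first < attr->sd_addr_lo || last > attr->sd_addr_hi)
			return (false);
	}
	return (true);
}
```
.Ed
.Pp
For example, a device limited to 32-bit addresses and two scatter/gather
entries would reject both a cookie that crosses the 4 GiB boundary and a
request that is split into three cookies.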
1193.Ss Interrupt Handler Related Functions
1194Interrupts are a central part of the role of device drivers and one of
1195the things that's important to get right.
1196Interrupts come in different types: fixed, MSI, and MSI-X.
1197The kinds that are available depend on the device and the rest of the
1198system.
1199For example, MSI and MSI-X interrupts are generally specific to PCI and
1200PCI Express devices.
1201To begin the interrupt allocation process, the first thing a driver
1202needs to do is to discover what type of interrupts it supports with
1203.Xr ddi_intr_get_supported_types 9F .
1204Then, the driver should work through the supported types, preferring
1205MSI-X, then MSI, and finally fixed interrupts, and try to allocate
1206interrupts.
1207.Pp
1208Drivers first need to know how many interrupts they require.
1209For example, a networking driver may want to have an interrupt made
1210available for each ring that it has.
1211To discover the number of interrupts available, the driver should call
1212.Xr ddi_intr_get_navail 9F .
1213If there are sufficient interrupts, it can proceed to actually
1214allocate the interrupts with
1215.Xr ddi_intr_alloc 9F .
1216When allocating interrupts, callers need to check to see how many
1217interrupts the system actually gave them.
1218Just because an interrupt is allocated does not mean that it will fire
1219or be ready to use; there are a series of additional steps that the
1220driver must take.
1221.Pp
1222To enable the interrupt, the driver should first
1223get the interrupt capabilities with
1224.Xr ddi_intr_get_cap 9F
1225and the priority of the interrupt with
1226.Xr ddi_intr_get_pri 9F .
1227The priority must be used while creating mutexes and related
1228synchronization primitives that will be used during the interrupt
1229handler.
1230At this point, the driver can go ahead and register the functions that
1231will be called with each allocated interrupt with the
1232.Xr ddi_intr_add_handler 9F
1233function.
1234The arguments can vary for each allocated interrupt.
1235It is common to have an interrupt-specific data structure passed in one
1236of the arguments or an interrupt number, while the other argument is
1237generally the driver's instance-specific data structure.
1238.Pp
1239At this point, the last step for the interrupt to be made active from
1240the kernel's perspective is to enable it.
1241This will use either the
1242.Xr ddi_intr_block_enable 9F
1243or
1244.Xr ddi_intr_enable 9F
1245functions depending on the interrupt's capabilities.
1246The reason that these are different is because some interrupt types
1247.Pq MSI
1248require that all interrupts in a group be enabled and disabled at the
1249same time.
1250This is indicated with the
1251.Dv DDI_INTR_FLAG_BLOCK
1252flag found in the interrupt's capabilities.
1253Once that is called, interrupts that are generated by a device will be
1254delivered to the registered function.
1255.Pp
1256It's important to note that there is often device-specific interrupt
1257setup that is required.
1258While the kernel takes care of updating any pieces of the processor's
1259interrupt controller, I/O crossbar, or the PCI MSI and MSI-X
1260capabilities, many devices have device-specific registers that are used
1261to manage, set up, and acknowledge interrupts.
1262These registers or other controls are often capable of separately
1263masking interrupts and are generally what should be used if there are
1264times that you need to separately enable or disable interrupts such as
1265to poll an I/O ring.
1266.Pp
1267When unwinding interrupts, one needs to work in the reverse order here.
1268Until
1269.Xr ddi_intr_block_disable 9F
1270or
1271.Xr ddi_intr_disable 9F
1272is called, one should assume that their interrupt handler will be
1273called.
1274Due to cases where an interrupt is shared between multiple devices, this
1275can happen even if the device is quiesced!
1276Only after that is done is it safe to then free the interrupts with a
1277call to
1278.Xr ddi_intr_free 9F .
1279.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1280.It Xr ddi_add_intr 9F Ta Xr ddi_add_softintr 9F
1281.It Xr ddi_get_iblock_cookie 9F Ta Xr ddi_get_soft_iblock_cookie 9F
1282.It Xr ddi_intr_add_handler 9F Ta Xr ddi_intr_add_softint 9F
1283.It Xr ddi_intr_alloc 9F Ta Xr ddi_intr_block_disable 9F
1284.It Xr ddi_intr_block_enable 9F Ta Xr ddi_intr_clr_mask 9F
1285.It Xr ddi_intr_disable 9F Ta Xr ddi_intr_dup_handler 9F
1286.It Xr ddi_intr_enable 9F Ta Xr ddi_intr_free 9F
1287.It Xr ddi_intr_get_cap 9F Ta Xr ddi_intr_get_hilevel_pri 9F
1288.It Xr ddi_intr_get_navail 9F Ta Xr ddi_intr_get_nintrs 9F
1289.It Xr ddi_intr_get_pending 9F Ta Xr ddi_intr_get_pri 9F
1290.It Xr ddi_intr_get_softint_pri 9F Ta Xr ddi_intr_get_supported_types 9F
1291.It Xr ddi_intr_hilevel 9F Ta Xr ddi_intr_remove_handler 9F
1292.It Xr ddi_intr_remove_softint 9F Ta Xr ddi_intr_set_cap 9F
1293.It Xr ddi_intr_set_mask 9F Ta Xr ddi_intr_set_nreq 9F
1294.It Xr ddi_intr_set_pri 9F Ta Xr ddi_intr_set_softint_pri 9F
1295.It Xr ddi_intr_trigger_softint 9F Ta Xr ddi_remove_intr 9F
1296.It Xr ddi_remove_softintr 9F Ta Xr ddi_trigger_softintr 9F
1297.El
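.Pp
The preference order described above, MSI-X first, then MSI, then fixed
interrupts, can be sketched as a loop over the supported types.
This is a userland illustration only: the SKETCH_INTR_ constants, the
allocation callback, and the fake allocator are hypothetical and stand in for
the real discovery and allocation calls.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type bits for this sketch. */
#define	SKETCH_INTR_FIXED	0x1
#define	SKETCH_INTR_MSI		0x2
#define	SKETCH_INTR_MSIX	0x4

/* Returns how many interrupts were actually allocated, or < 0 on failure. */
typedef int (*sketch_intr_alloc_f)(int type, int want, void *arg);

/*
 * Walk the supported types from most to least preferred and stop at the
 * first one that yields any interrupts. The caller must look at *countp,
 * since fewer interrupts than requested may have been handed out.
 */
int
sketch_intr_pick(int supported, int want, sketch_intr_alloc_f alloc,
    void *arg, int *typep, int *countp)
{
	static const int order[] = {
		SKETCH_INTR_MSIX, SKETCH_INTR_MSI, SKETCH_INTR_FIXED
	};
	size_t i;

	assert(want > 0 && alloc != NULL);
	for (i = 0; i < sizeof (order) / sizeof (order[0]); i++) {
		int n;

		if ((supported & order[i]) == 0)
			continue;
		n = alloc(order[i], want, arg);
		if (n > 0) {
			*typep = order[i];
			*countp = n;
			return (0);
		}
	}
	return (-1);
}

/* Fake system state for demonstration: interrupts available per type. */
typedef struct sketch_avail {
	int	sa_msix;
	int	sa_msi;
	int	sa_fixed;
} sketch_avail_t;

/* Fake allocator: hands out at most what the fake system has available. */
int
sketch_fake_alloc(int type, int want, void *arg)
{
	const sketch_avail_t *av = arg;
	int have;

	switch (type) {
	case SKETCH_INTR_MSIX:
		have = av->sa_msix;
		break;
	case SKETCH_INTR_MSI:
		have = av->sa_msi;
		break;
	case SKETCH_INTR_FIXED:
		have = av->sa_fixed;
		break;
	default:
		return (-1);
	}
	if (have <= 0)
		return (-1);
	return (want < have ? want : have);
}
```
.Ed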
1298.Ss Minor Nodes
1299For a device driver to be accessed by a program in user space
1300.Pq or with the kernel layered device interface ,
1301it must create a minor node.
1302Minor nodes are created under
1303.Pa /devices
1304.Pq Xr devfs 4FS
1305and are tied to the instance of a device driver via its
1306.Vt dev_info_t .
1307The
1308.Xr devfsadm 8
1309daemon and the
1310.Pa /dev
1311file system
1312.Po
1313sdev,
1314.Xr dev 4FS
1315.Pc
1316are responsible for creating a coherent set of names that user programs
1317access.
1318Drivers create these minor nodes using the
1319.Xr ddi_create_minor_node 9F
1320function listed below.
1321.Pp
1322In UNIX tradition, character, block, and STREAMS device special files
1323are identified by a major and minor number.
1324All instances of a given driver share the same major number, which means
1325that a device driver must coordinate the minor number space across
1326.Em all
1327instances.
1328While a minor node is created with a fixed minor number, it is possible
1329to change the minor number while processing an
1330.Xr open 9E
1331call, allowing subsequent character device operations to uniquely
1332identify a particular caller.
1333This is usually referred to as a driver that
1334.Dq clones .
1335.Pp
1336When drivers aren't performing cloning, then usually the minor number
1337used when creating the minor node is some fixed offset or multiple of
1338the driver's instance number.
1339When a driver clones and needs to allocate and manage a minor number
1340space, an ID space is usually leveraged, with IDs generally in the
1341range from 0 through
1342.Dv MAXMIN32 .
1343There are several different strategies for tracking data structures as
1344they relate to minor numbers.
1345Sometimes, the soft state functionality is used.
1346Others might keep an AVL tree around or tie the data to some other data
1347structure.
1348The method chosen often depends on the specifics of the implementation
1349and its broader context.
1350.Pp
1351The
1352.Vt dev_t
1353structure represents the combined major and minor number.
1354It can be taken apart with the
1355.Xr getmajor 9F
1356and
1357.Xr getminor 9F
1358functions and then reconstructed with the
1359.Xr makedevice 9F
1360function.
1361.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1362.It Xr ddi_create_minor_node 9F Ta Xr ddi_remove_minor_node 9F
1363.It Xr getmajor 9F Ta Xr getminor 9F
1364.It Xr devfs_clean 9F Ta Xr makedevice 9F
1365.El
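.Pp
The relationship between major numbers, minor numbers, and instances can be
illustrated with a userland sketch that packs the two numbers into a single
value and derives minor numbers as a fixed multiple of the instance.
The field widths, the four minors per instance, and all sketch_ names are
assumptions chosen for illustration; drivers should treat
.Vt dev_t
as opaque and use the functions listed above.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths for this sketch only. */
#define	SKETCH_NBITSMINOR	18
#define	SKETCH_NBITSMAJOR	14
#define	SKETCH_MAXMIN		((UINT32_C(1) << SKETCH_NBITSMINOR) - 1)
#define	SKETCH_MAXMAJ		((UINT32_C(1) << SKETCH_NBITSMAJOR) - 1)

/* Assumed policy: each instance owns four consecutive minor numbers. */
#define	SKETCH_MINORS_PER_INST	4

typedef uint32_t sketch_dev_t;

/* Combine a major and minor number into one value. */
sketch_dev_t
sketch_makedevice(uint32_t major, uint32_t minor)
{
	assert(major <= SKETCH_MAXMAJ && minor <= SKETCH_MAXMIN);
	return ((major << SKETCH_NBITSMINOR) | minor);
}

/* Extract the major number. */
uint32_t
sketch_getmajor(sketch_dev_t dev)
{
	return (dev >> SKETCH_NBITSMINOR);
}

/* Extract the minor number. */
uint32_t
sketch_getminor(sketch_dev_t dev)
{
	return (dev & SKETCH_MAXMIN);
}

/* Minor number for sub-device 'sub' of instance 'inst'. */
uint32_t
sketch_inst_to_minor(uint32_t inst, uint32_t sub)
{
	assert(sub < SKETCH_MINORS_PER_INST);
	assert(inst <= (SKETCH_MAXMIN - sub) / SKETCH_MINORS_PER_INST);
	return (inst * SKETCH_MINORS_PER_INST + sub);
}

/* Recover the instance that owns a minor number. */
uint32_t
sketch_minor_to_inst(uint32_t minor)
{
	return (minor / SKETCH_MINORS_PER_INST);
}
```
.Ed
.Pp
Because every instance shares one major number, a scheme like this is what
keeps the minor numbers of different instances from colliding.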
1366.Ss Accessing Time, Delays, and Periodic Events
1367The kernel provides a number of ways to understand time in the system.
1368In particular it provides a few different clocks and time measurements:
1369.Bl -tag -width Ds
1370.It High-resolution monotonic time
1371The kernel provides access to a high-resolution monotonic clock that is
1372tracked in nanoseconds.
1373This clock is perfect for measuring durations and is accessed via
1374.Xr gethrtime 9F .
1375Unlike the real-time clock, this clock is not subject to adjustments by
1376a time synchronization daemon and is the preferred clock that drivers
1377should be using for tracking events.
1378The high-resolution clock is consistent across CPUs, meaning that you
1379may call
1380.Xr gethrtime 9F
1381on one CPU and the value will be consistent with what is returned, even
1382if a thread is migrated to another CPU.
1383.Pp
1384The high-resolution clock is implemented using an architecture and
1385platform-specific means.
1386For example, on x86 it is generally backed by the TSC
1387.Pq time stamp counter .
1388.It Real-time
1389The real-time clock tracks time as humans perceive it.
1390This clock is accessed using
1391.Xr ddi_get_time 9F .
1392If the system is running a time synchronization daemon that leverages
1393the network time protocol, then this time may be in sync with other
1394systems
1395.Pq subject to some amount of variance ;
1396however, it is critical that this is not assumed.
1397.Pp
1398In general, this time should not be used by drivers for any purpose.
1399It can jump around, drift, and most aspects in the kernel are not based
1400on the real-time clock.
1401For any device timing activities, the high-resolution clock should be
1402used.
1403.It Tick-based monotonic time
1404The kernel has a running periodic function that fires based on the rate
1405dictated by the
1406.Va hz
1407variable, generally operating at 100 or 1000 Hz.
1408The current number of ticks since boot is accessible through the
1409.Xr ddi_get_lbolt 9F
1410function.
1411When functions operate in units of ticks, this is what they are
1412tracking.
1413This value can be converted to and from microseconds using the
1414.Xr drv_usectohz 9F
1415and
1416.Xr drv_hztousec 9F
1417functions.
1418.Pp
1419In general, drivers should prefer the high-resolution monotonic clock
1420for tracking events internally.
1421.El
1422.Pp
1423With these different timing mechanisms, the kernel provides a few
1424different ways to delay execution or to get a callback after some
1425amount of time passes.
1426.Pp
1427The
1428.Xr delay 9F
1429and
1430.Xr drv_usecwait 9F
1431functions are used to block the execution of the current thread.
1432.Xr delay 9F
1433can be used in conditions where sleeping and blocking is allowed,
1434whereas
1435.Xr drv_usecwait 9F
1436is a busy-wait, which is appropriate for some device drivers,
1437particularly when in high-level interrupt context.
1438.Pp
1439The kernel also allows a function to be called after some time has
1440elapsed.
1441This callback occurs on a different thread and will be executed in
1442.Sy kernel
1443context.
1444A timeout can be scheduled in the future with the
1445.Xr timeout 9F
1446function and cancelled with the
1447.Xr untimeout 9F
1448function.
There is also a STREAMS-specific version, the
.Xr qtimeout 9F
function, that can be used when circumstances require it.
1453.Pp
1454These are all considered one-shot events.
1455That is, they will only happen once after being scheduled.
If a driver instead requires periodic behavior, such as needing
1457something to occur every second, then it should use the
1458.Xr ddi_periodic_add 9F
1459function to establish that.
1460.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1461.It Xr delay 9F Ta Xr ddi_get_lbolt 9F
1462.It Xr ddi_get_lbolt64 9F Ta Xr ddi_get_time 9F
1463.It Xr ddi_periodic_add 9F Ta Xr ddi_periodic_delete 9F
1464.It Xr drv_hztousec 9F Ta Xr drv_usectohz 9F
1465.It Xr drv_usecwait 9F Ta Xr gethrtime 9F
1466.It Xr qtimeout 9F Ta Xr quntimeout 9F
1467.It Xr timeout 9F Ta Xr untimeout 9F
1468.El
1469.Ss Task Queues
1470A task queue provides an asynchronous processing mechanism that can be
1471used by drivers and the broader system.
1472A task queue can be created with
1473.Xr ddi_taskq_create 9F
1474and sized with a given number of threads and a relative priority of those
1475threads.
1476Once created, tasks can be dispatched to the queue with
1477.Xr ddi_taskq_dispatch 9F .
1478The different functions and arguments dispatched do not need to be the
1479same and can vary from invocation to invocation.
However, it is the caller's responsibility to ensure that any referenced
memory remains valid until the task queue is done processing.
1482It is possible to create a barrier for a task queue by using the
1483.Xr ddi_taskq_wait 9F
1484function.
1485.Pp
1486While task queues are a flexible mechanism for handling and processing
1487events that occur in a well defined context, they do not have an
1488inherent backpressure mechanism built in.
1489This means it is possible to add events to a task queue faster than they
1490can be processed.
1491For high-volume events, this must be considered before just dispatching
1492an event.
1493Do not rely on a non-sleeping allocation in the task queue dispatch
1494context.
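.Pp
One way a caller might bound the amount of outstanding work itself is
sketched below.
This is a userland model under stated assumptions, not part of the DDI:
the type
.Vt model_tq_gate_t
and the functions
.Fn model_gate_enter
and
.Fn model_gate_exit
are hypothetical.
A real driver would need to protect the counter with a mutex or atomic
operations and would perform its actual dispatch where the comment indicates.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical gate that caps the number of tasks in flight. */
typedef struct model_tq_gate {
	size_t mtg_outstanding;	/* tasks dispatched but not yet finished */
	size_t mtg_limit;	/* most tasks allowed in flight at once */
} model_tq_gate_t;

/*
 * Called before dispatching. Returns false when the caller should drop,
 * coalesce, or defer the event rather than queue more work.
 */
bool
model_gate_enter(model_tq_gate_t *gate)
{
	assert(gate != NULL && gate->mtg_limit > 0);
	if (gate->mtg_outstanding >= gate->mtg_limit)
		return (false);
	gate->mtg_outstanding++;
	/* A real driver would dispatch its task to the queue here. */
	return (true);
}

/* Called at the end of the task function once the work is done. */
void
model_gate_exit(model_tq_gate_t *gate)
{
	assert(gate != NULL && gate->mtg_outstanding > 0);
	gate->mtg_outstanding--;
}
```
.Ed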
1495.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1496.It Xr ddi_taskq_create 9F Ta Xr ddi_taskq_destroy 9F
1497.It Xr ddi_taskq_dispatch 9F Ta Xr ddi_taskq_resume 9F
1498.It Xr ddi_taskq_suspend 9F Ta Xr ddi_taskq_suspended 9F
.It Xr ddi_taskq_wait 9F Ta
1500.El
1501.Ss Credential Management and Privileges
1502Not everything in the system has the same power to impact it.
1503To determine the permissions and context of a caller, the
1504.Vt cred_t
1505data structure encapsulates a number of different things including the
1506traditional user and group IDs, but also the zone that one is operating
1507in the context of and the associated privileges that the caller has.
1508While this concept is more often thought of due to userland processes being
1509associated with specific users, these same principles apply to different
1510threads in the kernel.
Not all kernel threads are allowed to indiscriminately do what they
want; they can be constrained by the same privilege model that processes
1513are, which is discussed in
1514.Xr privileges 7 .
1515.Pp
1516Most operations that device drivers implement are given a credential.
1517However, from within the kernel, a credential can be obtained that
1518refers to a specific zone, the current process, or a generic kernel
1519credential.
1520.Pp
It is up to drivers and the kernel writ large to check whether a given
1522credential is authorized to perform a given operation.
1523This is encapsulated by the various privilege checks that exist.
1524The most common check used is
1525.Xr drv_priv 9F
1526which checks for
1527.Dv PRIV_SYS_DEVICES .
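.Pp
The usual shape of such a check, failing early and handing the error back
to the caller, can be sketched in userland as follows.
Everything here is a hypothetical stand-in: the
.Vt model_cred_t
type replaces the opaque
.Vt cred_t ,
a single bit replaces the real named privilege, and
.Fn model_drv_priv
only mimics the convention of returning 0 on success and
.Er EPERM
on failure.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical privilege bit; real privileges are named, not numbered. */
#define	MODEL_PRIV_SYS_DEVICES	0x1u

/* Hypothetical stand-in for the opaque cred_t. */
typedef struct model_cred {
	uint32_t mc_privs;	/* set of privileges held by the caller */
} model_cred_t;

/* Returns 0 when the privilege is held and EPERM otherwise. */
int
model_drv_priv(const model_cred_t *cr)
{
	assert(cr != NULL);
	return ((cr->mc_privs & MODEL_PRIV_SYS_DEVICES) != 0 ? 0 : EPERM);
}

/* An entry point would typically fail early when the check fails. */
int
model_ioctl_reset(const model_cred_t *cr)
{
	int ret = model_drv_priv(cr);

	if (ret != 0)
		return (ret);
	/* Privileged work would happen here. */
	return (0);
}
```
.Ed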
1528.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1529.It Xr CRED 9F Ta Xr crdup 9F
1530.It Xr crfree 9F Ta Xr crget 9F
1531.It Xr crgetgid 9F Ta Xr crgetgroups 9F
1532.It Xr crgetngroups 9F Ta Xr crgetrgid 9F
1533.It Xr crgetruid 9F Ta Xr crgetsgid 9F
1534.It Xr crgetsuid 9F Ta Xr crgetuid 9F
1535.It Xr crgetzoneid 9F Ta Xr crhold 9F
1536.It Xr ddi_get_cred 9F Ta Xr drv_priv 9F
1537.It Xr kcred 9F Ta Xr priv_getbyname 9F
1538.It Xr priv_policy_choice 9F Ta Xr priv_policy_only 9F
1539.It Xr priv_policy 9F Ta Xr zone_kcred 9F
1540.El
1541.Ss Device ID Management
1542Device IDs are a means of establishing a unique ID for a device in the
1543kernel.
1544These unique IDs are generally tied to something from the device's
1545hardware such as a serial number or related, but can also be fabricated
1546and stored on the device.
1547These device IDs are used by other subsystems like ZFS to record
1548information about a device as the actual
1549.Pa /devices
1550path that a device resides at may change because it is moved around in
1551the system.
1552.Pp
1553For device drivers, particularly those that represent block devices,
1554they should first call
1555.Xr ddi_devid_init 9F
1556to initialize the device ID data structure.
1557After that is done, it is then safe to call
1558.Xr ddi_devid_register 9F
1559to notify the kernel about the ID.
1560.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1561.It Xr ddi_devid_compare 9F Ta Xr ddi_devid_free 9F
1562.It Xr ddi_devid_get 9F Ta Xr ddi_devid_init 9F
1563.It Xr ddi_devid_register 9F Ta Xr ddi_devid_sizeof 9F
1564.It Xr ddi_devid_str_decode 9F Ta Xr ddi_devid_str_encode 9F
1565.It Xr ddi_devid_str_free 9F Ta Xr ddi_devid_unregister 9F
1566.It Xr ddi_devid_valid 9F Ta
1567.El
1568.Ss Message Block Functions
1569The
1570.Vt "mblk_t"
data structure is used to chain together messages, which are used throughout
1572the kernel for different subsystems including all of networking,
1573terminals, STREAMS, USB, and more.
1574.Pp
1575Message blocks are chained together by a series of two different
1576pointers:
1577.Fa b_cont
1578and
1579.Fa b_next .
1580When a message is split across multiple data buffers, they are linked by
1581the
1582.Fa b_cont
1583pointer.
1584However, multiple distinct messages can be chained together and linked
1585by the
1586.Fa b_next
1587pointer.
1588Let's look at this in the context of a series of networking packets.
1589If we had a chain of say 10 UDP packets that we were given, each UDP
1590packet is considered an independent message and would be linked from one
1591to the next based on the order they should be transmitted with the
1592.Fa b_next
1593pointer.
1594However, an individual message may be entirely in one message block, in
1595which case its
1596.Fa b_cont
1597pointer would be
1598.Dv NULL ,
1599but if say the packet were split into a 100 byte data buffer that
1600contained the headers and then a 1000 byte data buffer that contained
1601the actual packet data, those two would be linked together by
1602.Fa b_cont .
1603A continued message would never have its next pointer used to link it to
1604a wholly different message.
1605Visually you might see this as:
1606.Bd -literal
1607  +---------------+
1608  | UDP Message 0 |
1609  | Bytes 0-1100  |
1610  | b_cont     ---+--> NULL
1611  | b_next  +     |
1612  +---------|-----+
1613            |
1614            v
1615  +---------------+    +----------------+
1616  | UDP Message 1 |    | UDP Message 1+ |
1617  | Bytes 0-100   |    | Bytes 100-1100 |
1618  | b_cont     ---+--> | b_cont     ----+->NULL
1619  | b_next  +     |    | b_next     ----+->NULL
1620  +---------|-----+    +----------------+
1621            |
1622           ...
1623            |
1624            v
1625  +---------------+
1626  | UDP Message 9 |
1627  | Bytes 0-1100  |
1628  | b_cont     ---+--> NULL
1629  | b_next     ---+--> NULL
1630  +---------------+
1631.Ed
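.Pp
The two directions of linkage in the diagram can be walked as in the
following userland sketch.
The
.Vt model_mblk_t
type is a hypothetical, cut-down stand-in for
.Vt mblk_t
that keeps only
.Fa b_next ,
.Fa b_cont ,
and the read and write pointers
.Fa b_rptr
and
.Fa b_wptr ;
the helper functions are likewise hypothetical and are not the kernel's
routines.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for mblk_t with only four members. */
typedef struct model_mblk {
	struct model_mblk *b_next;	/* next message in the chain */
	struct model_mblk *b_cont;	/* next block of this message */
	unsigned char *b_rptr;		/* first valid byte in this block */
	unsigned char *b_wptr;		/* one past the last valid byte */
} model_mblk_t;

/* Total bytes in one message: follow b_cont, never b_next. */
size_t
model_msg_bytes(const model_mblk_t *mp)
{
	size_t len = 0;

	for (; mp != NULL; mp = mp->b_cont) {
		assert(mp->b_wptr >= mp->b_rptr);
		len += (size_t)(mp->b_wptr - mp->b_rptr);
	}
	return (len);
}

/* Number of distinct messages in a chain: follow b_next only. */
size_t
model_chain_count(const model_mblk_t *mp)
{
	size_t count = 0;

	for (; mp != NULL; mp = mp->b_next)
		count++;
	return (count);
}
```
.Ed
.Pp
In this model, a message held in one 1100 byte block and a message split into
100 byte and 1000 byte blocks both report a size of 1100 bytes.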
1632.Pp
1633Message blocks all have an associated data block which contains the
1634actual data that is present.
1635Multiple message blocks can share the same data block as well.
1636The data block has a notion of a type, which is generally
.Dv M_DATA ,
which signifies that the message carries data.
1639.Pp
1640To allocate message blocks, one generally uses the
1641.Xr allocb 9F
1642function to create one; however, you can also create message blocks
1643using your own source of data through functions like
1644.Xr desballoc 9F .
1645This is generally used when one wants to use memory that was originally
1646used for DMA to pass data back into the kernel, such as in a networking
1647device driver.
1648When this happens, a callback function will be called once the last user
1649of the data block is done with it.
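.Pp
The idea that the callback runs only when the last user lets go can be
modelled with a simple reference count.
The sketch below is a userland illustration under stated assumptions: the
.Vt model_dblk_t
type and the
.Fn model_dblk_init ,
.Fn model_dblk_hold ,
and
.Fn model_dblk_rele
functions are hypothetical and do not describe how the kernel tracks data
blocks internally.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data block that remembers how many users it has. */
typedef struct model_dblk {
	unsigned int md_ref;		/* message blocks sharing this data */
	void (*md_free)(void *);	/* run when the last user is done */
	void *md_arg;			/* argument handed to md_free */
} model_dblk_t;

/* Set up a data block with one user and a caller-supplied callback. */
void
model_dblk_init(model_dblk_t *db, void (*func)(void *), void *arg)
{
	assert(db != NULL && func != NULL);
	db->md_ref = 1;
	db->md_free = func;
	db->md_arg = arg;
}

/* Another message block starts sharing the same data. */
void
model_dblk_hold(model_dblk_t *db)
{
	assert(db != NULL && db->md_ref > 0);
	db->md_ref++;
}

/* One user is done; only the last release runs the callback. */
void
model_dblk_rele(model_dblk_t *db)
{
	assert(db != NULL && db->md_ref > 0);
	if (--db->md_ref == 0)
		db->md_free(db->md_arg);
}

/* Example callback: count how many times the buffer was returned. */
void
model_count_free(void *arg)
{
	(*(unsigned int *)arg)++;
}
```
.Ed
.Pp
A driver-style callback would typically return the DMA buffer to the
driver's own pool at this point rather than count invocations.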
1650.Pp
1651The functions listed below often end in either
1652.Dq msg
1653or
1654.Dq b
1655to indicate that they will operate on an entire message and follow the
1656.Fa b_cont
1657pointer or they will not respectively.
1658.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1659.It Xr adjmsg 9F Ta Xr allocb 9F
1660.It Xr copyb 9F Ta Xr copymsg 9F
1661.It Xr datamsg 9F Ta Xr desballoc 9F
1662.It Xr desballoca 9F Ta Xr dupb 9F
1663.It Xr dupmsg 9F Ta Xr esballoc 9F
1664.It Xr esballoca 9F Ta Xr freeb 9F
1665.It Xr freemsg 9F Ta Xr linkb 9F
1666.It Xr mcopymsg 9F Ta Xr msgdsize 9F
1667.It Xr msgpullup 9F Ta Xr msgsize 9F
1668.It Xr pullupmsg 9F Ta Xr rmvb 9F
1669.It Xr testb 9F Ta Xr unlinkb 9F
1670.El
1671.Ss Upgradable Firmware Modules
1672The UFM
1673.Pq Upgradable Firmware Module
1674subsystem is used to grant the system observability into firmware that
1675exists persistently on a device.
1676These functions are intended for use by drivers that are participating in
1677the kernel's UFM framework, which is discussed in
1678.Xr ddi_ufm 9E .
1679.Pp
1680The
.Xr ddi_ufm_init 9F
and
.Xr ddi_ufm_fini 9F
1684functions are used to indicate support of the subsystem to the kernel.
1685The driver is required to use the
1686.Xr ddi_ufm_update 9F
1687function to indicate both that it is ready to receive UFM requests and
1688to indicate that any data that the kernel may have previously received
1689has changed.
1690Once that's completed, then the other functions listed here are
1691generally used as part of implementing specific callback functions that
1692are registered.
1693.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1694.It Xr ddi_ufm_fini 9F Ta Xr ddi_ufm_image_set_desc 9F
1695.It Xr ddi_ufm_image_set_misc 9F Ta Xr ddi_ufm_image_set_nslots 9F
1696.It Xr ddi_ufm_init 9F Ta Xr ddi_ufm_slot_set_attrs 9F
1697.It Xr ddi_ufm_slot_set_imgsize 9F Ta Xr ddi_ufm_slot_set_misc 9F
1698.It Xr ddi_ufm_slot_set_version 9F Ta Xr ddi_ufm_update 9F
1699.El
1700.Ss Firmware Loading
1701Some hardware devices have firmware that is not stored as part of the
1702device itself and must instead be sent to the device each time it is
1703powered on.
These routines help drivers that need to do this read such data
from well-known locations in the file system.
1706To begin with, a driver should call
1707.Xr firmware_open 9F
1708to open a handle to the firmware file.
1709At that point, one can determine the size of the file with the
1710.Xr firmware_get_size 9F
1711function and allocate the appropriate sized memory buffer to read it in.
1712Callers should always check what the size of the returned file is and
1713should not just blindly pass that size off to the kernel memory
1714allocator.
1715For example, if a file was over 100 MiB in size, then one should not
1716assume that they're going to just blindly allocate 100 MiB of kernel
1717memory and should instead perform incremental reads and sends to a
1718device that are smaller in size.
1719.Pp
1720A driver can then go through and perform arbitrary reads of the firmware
1721file through the
1722.Xr firmware_read 9F
1723interface until they have read everything that they need.
1724Once complete, the corresponding handle needs to be released through the
1725.Xr firmware_close 9F
1726function.
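.Pp
The incremental read-and-send pattern described above can be sketched in
userland as follows.
This is a model only: the
.Fn model_fw_stream
function and the
.Vt model_send_f
callback type are hypothetical, the firmware file is represented by an
in-memory image, and the copy from that image stands in for a read of the
file at a given offset.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback that hands one chunk to the device. */
typedef int (*model_send_f)(const uint8_t *buf, size_t len, void *arg);

/*
 * Walk an image of 'size' bytes in pieces of at most 'buflen' bytes.
 * The memcpy() stands in for a read of the firmware file at 'off'.
 * Returns 0 on success or the first non-zero value from 'send'.
 */
int
model_fw_stream(const uint8_t *image, size_t size, uint8_t *buf,
    size_t buflen, model_send_f send, void *arg)
{
	size_t off = 0;

	assert(image != NULL && buf != NULL && buflen > 0 && send != NULL);
	while (off < size) {
		size_t len = size - off;
		int ret;

		if (len > buflen)
			len = buflen;
		(void) memcpy(buf, image + off, len);
		if ((ret = send(buf, len, arg)) != 0)
			return (ret);
		off += len;
	}
	return (0);
}

/* Example callback: add up how many bytes were sent. */
int
model_count_send(const uint8_t *buf, size_t len, void *arg)
{
	(void) buf;
	*(size_t *)arg += len;
	return (0);
}
```
.Ed
.Pp
The bounce buffer has a fixed size chosen by the caller, so the amount of
memory needed does not grow with the size of the firmware file.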
1727.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1728.It Xr firmware_close 9F Ta Xr firmware_get_size 9F
1729.It Xr firmware_open 9F Ta Xr firmware_read 9F
1730.El
1731.Ss Fault Management Handling
1732These functions allow device drivers to harden themselves against errors
1733that might occur while interfacing with devices and tie into the broader
1734fault management architecture.
1735.Pp
1736To begin, a driver must declare which capabilities it implements during
1737its
1738.Xr attach 9E
1739function by calling
1740.Xr ddi_fm_init 9F .
1741The set of capabilities it receives back may be less than what was
1742requested because the capabilities are dependent on the overall chain of
1743drivers present.
1744.Pp
1745If
1746.Dv DDI_FM_EREPORT_CAPABLE
1747was negotiated, then the driver is expected to generate error events
1748when certain conditions occur using the
1749.Xr ddi_fm_ereport_post 9F
1750function or the more specific
1751.Xr pci_ereport_post 9F
1752function.
1753If a caller has negotiated
1754.Dv DDI_FM_ACCCHK_CAPABLE ,
1755then it is allowed to set up its register attributes to indicate that it
1756will check for errors on the register handle after using functions like
1757.Xr ddi_get8 9F
1758and
1759.Xr ddi_set8 9F
1760by calling
1761.Xr ddi_fm_acc_err_get 9F
1762and reacting accordingly.
1763Similarly, if a driver has negotiated
1764.Dv DDI_FM_DMACHK_CAPABLE ,
1765then it will use
1766.Xr ddi_check_dma_handle 9F
1767to check the results of DMA activity and handle the results
1768appropriately.
1769Similar to register accesses, the DMA attributes must be updated to set
1770that error handling is anticipated on this handle.
1771The
1772.Xr ddi_fm_init 9F
1773manual page has an overview of the other types of flags that can be
1774negotiated and how they are used.
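.Pp
Because the negotiated capabilities can be a subset of the request, a driver
should test what it was granted before relying on a feature.
The following userland sketch illustrates that pattern.
The flag names and values are hypothetical, and treating negotiation as a
plain intersection of the request with what the parent offers is a
simplifying assumption of this sketch, not a description of the framework's
internals.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability bits; the real values come from the DDI. */
#define	MODEL_FM_EREPORT	0x1
#define	MODEL_FM_ACCCHK		0x2
#define	MODEL_FM_DMACHK		0x4

/*
 * Model of negotiation: the result is limited to what both the driver
 * asked for and what the drivers above it are able to provide.
 */
int
model_fm_negotiate(int requested, int parent)
{
	return (requested & parent);
}

/* Test the negotiated result, not the request, before relying on it. */
bool
model_fm_has(int negotiated, int cap)
{
	assert(cap != 0);
	return ((negotiated & cap) == cap);
}
```
.Ed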
1775.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1776.It Xr ddi_check_acc_handle 9F Ta Xr ddi_check_dma_handle 9F
1777.It Xr ddi_dev_report_fault 9F Ta Xr ddi_fm_acc_err_clear 9F
1778.It Xr ddi_fm_acc_err_get 9F Ta Xr ddi_fm_capable 9F
1779.It Xr ddi_fm_dma_err_clear 9F Ta Xr ddi_fm_dma_err_get 9F
1780.It Xr ddi_fm_ereport_post 9F Ta Xr ddi_fm_fini 9F
1781.It Xr ddi_fm_handler_register 9F Ta Xr ddi_fm_handler_unregister 9F
1782.It Xr ddi_fm_init 9F Ta Xr ddi_fm_service_impact 9F
1783.It Xr pci_ereport_post 9F Ta Xr pci_ereport_setup 9F
1784.It Xr pci_ereport_teardown 9F Ta
1785.El
1786.Ss SCSI and SAS Device Driver Functions
1787These functions are for use by SCSI and SAS device drivers that leverage
1788the kernel's frameworks.
1789Other device drivers should not use these.
1790For more background on these, some of the general concepts are discussed
1791in
1792.Xr iport 9 ,
1793.Xr phymap 9 ,
1794and
1795.Xr tgtmap 9 .
1796.Pp
1797Device drivers register initially with the kernel by using the
.Xr scsi_hba_init 9F
1799function and then, in their attach routine, register specific instances,
1800using functions like
1801.Xr scsi_hba_iport_register 9F
1802or instead
1803.Xr scsi_hba_tran_alloc 9F
1804and
1805.Xr scsi_hba_attach_setup 9F .
1806New drivers are encouraged to use the target map and iports framework to
1807simplify the device driver writing process.
1808.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1809.It Xr makecom_g0_s 9F Ta Xr makecom_g0 9F
1810.It Xr makecom_g1 9F Ta Xr makecom_g5 9F
1811.It Xr makecom 9F Ta Xr sas_phymap_create 9F
1812.It Xr sas_phymap_destroy 9F Ta Xr sas_phymap_lookup_ua 9F
1813.It Xr sas_phymap_lookup_uapriv 9F Ta Xr sas_phymap_phy_add 9F
1814.It Xr sas_phymap_phy_rem 9F Ta Xr sas_phymap_phy2ua 9F
1815.It Xr sas_phymap_phys_free 9F Ta Xr sas_phymap_phys_next 9F
1816.It Xr sas_phymap_ua_free 9F Ta Xr sas_phymap_ua2phys 9F
1817.It Xr sas_phymap_uahasphys 9F Ta Xr scsi_abort 9F
1818.It Xr scsi_address_device 9F Ta Xr scsi_alloc_consistent_buf 9F
1819.It Xr scsi_cname 9F Ta Xr scsi_destroy_pkt 9F
1820.It Xr scsi_device_hba_private_get 9F Ta Xr scsi_device_hba_private_set 9F
1821.It Xr scsi_device_unit_address 9F Ta Xr scsi_dmafree 9F
1822.It Xr scsi_dmaget 9F Ta Xr scsi_dname 9F
1823.It Xr scsi_errmsg 9F Ta Xr scsi_ext_sense_fields 9F
1824.It Xr scsi_find_sense_descr 9F Ta Xr scsi_free_consistent_buf 9F
1825.It Xr scsi_free_wwnstr 9F Ta Xr scsi_get_device_type_scsi_options 9F
1826.It Xr scsi_get_device_type_string 9F Ta Xr scsi_hba_attach_setup 9F
1827.It Xr scsi_hba_detach 9F Ta Xr scsi_hba_fini 9F
1828.It Xr scsi_hba_init 9F Ta Xr scsi_hba_iport_exist 9F
1829.It Xr scsi_hba_iport_find 9F Ta Xr scsi_hba_iport_register 9F
1830.It Xr scsi_hba_iport_unit_address 9F Ta Xr scsi_hba_iportmap_create 9F
1831.It Xr scsi_hba_iportmap_destroy 9F Ta Xr scsi_hba_iportmap_iport_add 9F
1832.It Xr scsi_hba_iportmap_iport_remove 9F Ta Xr scsi_hba_lookup_capstr 9F
1833.It Xr scsi_hba_pkt_alloc 9F Ta Xr scsi_hba_pkt_comp 9F
1834.It Xr scsi_hba_pkt_free 9F Ta Xr scsi_hba_probe 9F
1835.It Xr scsi_hba_tgtmap_create 9F Ta Xr scsi_hba_tgtmap_destroy 9F
1836.It Xr scsi_hba_tgtmap_scan_luns 9F Ta Xr scsi_hba_tgtmap_set_add 9F
1837.It Xr scsi_hba_tgtmap_set_begin 9F Ta Xr scsi_hba_tgtmap_set_end 9F
1838.It Xr scsi_hba_tgtmap_set_flush 9F Ta Xr scsi_hba_tgtmap_tgt_add 9F
1839.It Xr scsi_hba_tgtmap_tgt_remove 9F Ta Xr scsi_hba_tran_alloc 9F
1840.It Xr scsi_hba_tran_free 9F Ta Xr scsi_ifgetcap 9F
1841.It Xr scsi_ifsetcap 9F Ta Xr scsi_init_pkt 9F
1842.It Xr scsi_log 9F Ta Xr scsi_mname 9F
1843.It Xr scsi_pktalloc 9F Ta Xr scsi_pktfree 9F
1844.It Xr scsi_poll 9F Ta Xr scsi_probe 9F
1845.It Xr scsi_resalloc 9F Ta Xr scsi_reset_notify 9F
1846.It Xr scsi_reset 9F Ta Xr scsi_resfree 9F
1847.It Xr scsi_rname 9F Ta Xr scsi_sense_asc 9F
1848.It Xr scsi_sense_ascq 9F Ta Xr scsi_sense_cmdspecific_uint64 9F
1849.It Xr scsi_sense_info_uint64 9F Ta Xr scsi_sense_key 9F
1850.It Xr scsi_setup_cdb 9F Ta Xr scsi_slave 9F
1851.It Xr scsi_sname 9F Ta Xr scsi_sync_pkt 9F
1852.It Xr scsi_transport 9F Ta Xr scsi_unprobe 9F
1853.It Xr scsi_unslave 9F Ta Xr scsi_validate_sense 9F
1854.It Xr scsi_vu_errmsg 9F Ta Xr scsi_wwn_to_wwnstr 9F
.It Xr scsi_wwnstr_to_wwn 9F Ta
1856.El
1857.Ss Block Device Buffer Handling
1858Block devices operate with a data structure called the
1859.Vt struct buf
1860which is described in
1861.Xr buf 9S .
1862This structure is used to represent a given block request and is used
1863heavily in block devices, the SCSI/SAS framework, and the blkdev
1864framework.
1865The functions described here are used to manipulate these structures in
1866various ways such as copying them around, indicating error conditions,
1867or indicating when the I/O operation is done.
1868By default, this memory is not mapped into the kernel's address space so
1869several functions such as
1870.Xr bp_mapin 9F
1871are present to allow for that to happen when required.
1872.Pp
1873To initially obtain a
1874.Vt struct buf ,
1875drivers should begin by calling
.Xr getrbuf 9F ,
at which point the caller can fill in the structure.
1878Once that's done, the
1879.Xr physio 9F
1880function can be used to actually perform the I/O and wait until it's
1881complete.
1882.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1883.It Xr bioclone 9F Ta Xr biodone 9F
1884.It Xr bioerror 9F Ta Xr biofini 9F
1885.It Xr bioinit 9F Ta Xr biomodified 9F
1886.It Xr bioreset 9F Ta Xr biosize 9F
1887.It Xr biowait 9F Ta Xr bp_mapin 9F
1888.It Xr bp_mapout 9F Ta Xr clrbuf 9F
1889.It Xr disksort 9F Ta Xr freerbuf 9F
1890.It Xr geterror 9F Ta Xr getrbuf 9F
1891.It Xr minphys 9F Ta Xr physio 9F
1892.El
1893.Ss Networking Device Driver Functions
These functions are for networking device drivers that implement the MAC
(GLDv3) interfaces.
1896The full framework and how to use it is described in
1897.Xr mac 9E .
1898.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1899.It Xr mac_alloc 9F Ta Xr mac_fini_ops 9F
1900.It Xr mac_free 9F Ta Xr mac_hcksum_get 9F
1901.It Xr mac_hcksum_set 9F Ta Xr mac_init_ops 9F
1902.It Xr mac_link_update 9F Ta Xr mac_lso_get 9F
1903.It Xr mac_maxsdu_update 9F Ta Xr mac_prop_info_set_default_fec 9F
1904.It Xr mac_prop_info_set_default_link_flowctrl 9F Ta Xr mac_prop_info_set_default_str 9F
1905.It Xr mac_prop_info_set_default_uint32 9F Ta Xr mac_prop_info_set_default_uint64 9F
1906.It Xr mac_prop_info_set_default_uint8 9F Ta Xr mac_prop_info_set_perm 9F
1907.It Xr mac_prop_info_set_range_uint32 9F Ta Xr mac_prop_info 9F
1908.It Xr mac_register 9F Ta Xr mac_ring_rx 9F
1909.It Xr mac_rx 9F Ta Xr mac_transceiver_info_set_present 9F
1910.It Xr mac_transceiver_info_set_usable 9F Ta Xr mac_transceiver_info 9F
1911.It Xr mac_tx_ring_update 9F Ta Xr mac_tx_update 9F
1912.It Xr mac_unregister 9F Ta
1913.El
1914.Ss USB Device Driver Functions
1915These functions are designed for USB device drivers.
1916To first initialize with the kernel, a device driver must call
1917.Xr usb_client_attach 9F
1918and then
1919.Xr usb_get_dev_data 9F .
1920The latter call is required to get access to the USB-level
1921descriptors about the device which describe what kinds of USB endpoints
1922.Pq control, bulk, interrupt, or isochronous
1923exist on the device as well as how many different interfaces and
1924configurations are present.
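.Pp
As an illustration of what the endpoint descriptors encode, the sketch below
decodes the transfer type and direction from the two standard descriptor
fields
.Fa bmAttributes
and
.Fa bEndpointAddress .
It is a userland model: the
.Vt model_ep_type_t
type and the functions are hypothetical, and a driver would normally rely on
the parsed data returned by the kernel rather than decode these bytes by hand.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum model_ep_type {
	MODEL_EP_CONTROL = 0,
	MODEL_EP_ISOCH = 1,
	MODEL_EP_BULK = 2,
	MODEL_EP_INTR = 3
} model_ep_type_t;

/* The transfer type is the low two bits of bmAttributes. */
model_ep_type_t
model_ep_type(uint8_t bmAttributes)
{
	return ((model_ep_type_t)(bmAttributes & 0x03));
}

/* Bit 7 of bEndpointAddress is set for device-to-host (IN) endpoints. */
bool
model_ep_is_in(uint8_t bEndpointAddress)
{
	return ((bEndpointAddress & 0x80) != 0);
}

/* Human-readable name for a transfer type. */
const char *
model_ep_type_name(model_ep_type_t type)
{
	static const char *names[] = {
		"control", "isochronous", "bulk", "interrupt"
	};

	assert((unsigned int)type <= (unsigned int)MODEL_EP_INTR);
	return (names[type]);
}
```
.Ed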
1925.Pp
1926Once a given configuration, sometimes the default, is selected, then the
1927driver can proceed to opening up what the USB architecture calls a pipe,
1928which provides a way to send requests to a specific USB endpoint.
1929First, specific endpoints can be looked up using the
1930.Xr usb_lookup_ep_data 9F
1931function which gets information from the parsed descriptors and then
1932that gets filled into an extended descriptor with
1933.Xr usb_ep_xdescr_fill 9F .
1934With that in hand, a pipe can be opened with
1935.Xr usb_pipe_xopen 9F .
1936.Pp
1937Once a pipe has been opened, which most often happens in a driver's
1938.Xr attach 9E
1939entry point, then requests can be allocated and submitted.
1940There is a different allocation for each type of request
1941.Po
1942e.g.
1943.Xr usb_alloc_bulk_req 9F
1944.Pc
1945and a different submission function for each type as well.
1946Each request structure has a corresponding page in section 9S that
1947describes the structure, its members, and how to work with it.
1948.Pp
1949One other major concern for USB devices, which isn't as common with
1950other types of devices, is that they can be yanked out and reinserted
1951at any time.
1952To help determine when this happens, the kernel offers the
1953.Xr usb_register_event_cbs 9F
1954function which allows a driver to register for callbacks when a device
1955is disconnected, reconnected, or around checkpoint suspend/resume
1956behavior.
1957.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1958.It Xr usb_alloc_bulk_req 9F Ta Xr usb_alloc_ctrl_req 9F
1959.It Xr usb_alloc_intr_req 9F Ta Xr usb_alloc_isoc_req 9F
1960.It Xr usb_alloc_request 9F Ta Xr usb_client_attach 9F
1961.It Xr usb_client_detach 9F Ta Xr usb_clr_feature 9F
1962.It Xr usb_create_pm_components 9F Ta Xr usb_ep_xdescr_fill 9F
1963.It Xr usb_free_bulk_req 9F Ta Xr usb_free_ctrl_req 9F
1964.It Xr usb_free_descr_tree 9F Ta Xr usb_free_dev_data 9F
1965.It Xr usb_free_intr_req 9F Ta Xr usb_free_isoc_req 9F
1966.It Xr usb_get_addr 9F Ta Xr usb_get_alt_if 9F
1967.It Xr usb_get_cfg 9F Ta Xr usb_get_current_frame_number 9F
1968.It Xr usb_get_dev_data 9F Ta Xr usb_get_if_number 9F
1969.It Xr usb_get_max_pkts_per_isoc_request 9F Ta Xr usb_get_status 9F
1970.It Xr usb_get_string_descr 9F Ta Xr usb_handle_remote_wakeup 9F
1971.It Xr usb_lookup_ep_data 9F Ta Xr usb_owns_device 9F
1972.It Xr usb_parse_data 9F Ta Xr usb_pipe_bulk_xfer 9F
1973.It Xr usb_pipe_close 9F Ta Xr usb_pipe_ctrl_xfer_wait 9F
1974.It Xr usb_pipe_ctrl_xfer 9F Ta Xr usb_pipe_drain_reqs 9F
1975.It Xr usb_pipe_get_max_bulk_transfer_size 9F Ta Xr usb_pipe_get_private 9F
1976.It Xr usb_pipe_get_state 9F Ta Xr usb_pipe_intr_xfer 9F
1977.It Xr usb_pipe_isoc_xfer 9F Ta Xr usb_pipe_open 9F
1978.It Xr usb_pipe_reset 9F Ta Xr usb_pipe_set_private 9F
1979.It Xr usb_pipe_stop_intr_polling 9F Ta Xr usb_pipe_stop_isoc_polling 9F
1980.It Xr usb_pipe_xopen 9F Ta Xr usb_print_descr_tree 9F
1981.It Xr usb_register_hotplug_cbs 9F Ta Xr usb_reset_device 9F
1982.It Xr usb_set_alt_if 9F Ta Xr usb_set_cfg 9F
1983.It Xr usb_unregister_hotplug_cbs 9F Ta
1984.El
1985.Ss PCI Device Driver Functions
1986These functions are specific for PCI and PCI Express based device
1987drivers and are intended to be used to get access to PCI configuration
1988space.
1989For normal PCI base address registers
1990.Pq BARs
1991instead see
1992.Sx Register Setup and Access .
1993.Pp
1994To access PCI configuration space, a device driver should first call
1995.Xr pci_config_setup 9F .
1996Generally, drivers will call this in their
1997.Xr attach 9E
1998entry point and then tear down the configuration space access with the
1999.Xr pci_config_teardown 9F
function in
2001.Xr detach 9E .
2002After setting up access to configuration space, the returned handle can
2003be used in all of the various configuration space routines to get and
2004set specific sized values in configuration space.
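.Pp
To show what getting specific sized values means in practice, the following
userland sketch reads 8-bit, 16-bit, and 32-bit values from an in-memory
snapshot of configuration space.
The
.Fn model_cfg_get8 ,
.Fn model_cfg_get16 ,
and
.Fn model_cfg_get32
functions are hypothetical and take a byte array where the real routines take
the handle returned during setup; the offsets of the vendor and device ID
registers and the little-endian layout follow the PCI specification.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	MODEL_CFG_SIZE		256	/* conventional PCI config space */
#define	MODEL_CFG_VENDOR_ID	0x00	/* 16-bit vendor ID register */
#define	MODEL_CFG_DEVICE_ID	0x02	/* 16-bit device ID register */

/* Read one byte from a snapshot of configuration space. */
uint8_t
model_cfg_get8(const uint8_t *cfg, size_t off)
{
	assert(cfg != NULL && off < MODEL_CFG_SIZE);
	return (cfg[off]);
}

/* Read a 16-bit value; configuration space is little-endian. */
uint16_t
model_cfg_get16(const uint8_t *cfg, size_t off)
{
	assert(off % 2 == 0);
	return ((uint16_t)(model_cfg_get8(cfg, off) |
	    (model_cfg_get8(cfg, off + 1) << 8)));
}

/* Read a 32-bit value built from two 16-bit halves. */
uint32_t
model_cfg_get32(const uint8_t *cfg, size_t off)
{
	assert(off % 4 == 0);
	return ((uint32_t)model_cfg_get16(cfg, off) |
	    ((uint32_t)model_cfg_get16(cfg, off + 2) << 16));
}
```
.Ed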
2005.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2006.It Xr pci_config_get8 9F Ta Xr pci_config_get16 9F
2007.It Xr pci_config_get32 9F Ta Xr pci_config_get64 9F
2008.It Xr pci_config_put8 9F Ta Xr pci_config_put16 9F
2009.It Xr pci_config_put32 9F Ta Xr pci_config_put64 9F
2010.It Xr pci_config_setup 9F Ta Xr pci_config_teardown 9F
2011.It Xr pci_report_pmcap 9F Ta Xr pci_restore_config_regs 9F
2012.It Xr pci_save_config_regs 9F Ta
2013.El
2014.Ss USB Host Controller Interface Functions
2015These routines are used for device drivers which implement the USB
2016host controller interfaces described in
2017.Xr usba_hcdi 9E .
Other types of device drivers and modules should not call these
2019functions.
2020In particular, if one is writing a device driver for a USB device, these
2021are not the routines you're looking for and you want to see
2022.Sx USB Device Driver Functions .
2023These are what the
2024.Xr ehci 4D
2025or
2026.Xr xhci 4D
2027drivers use to provide services that USB drivers use via the kernel USB
2028architecture.
2029.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2030.It Xr usba_alloc_hcdi_ops 9F Ta Xr usba_free_hcdi_ops 9F
2031.It Xr usba_hcdi_cb 9F Ta Xr usba_hcdi_dup_intr_req 9F
2032.It Xr usba_hcdi_dup_isoc_req 9F Ta Xr usba_hcdi_get_device_private 9F
2033.It Xr usba_hcdi_register 9F Ta Xr usba_hcdi_unregister 9F
2034.It Xr usba_hubdi_bind_root_hub 9F Ta Xr usba_hubdi_cb_ops 9F
2035.It Xr usba_hubdi_close 9F Ta Xr usba_hubdi_dev_ops 9F
2036.It Xr usba_hubdi_ioctl 9F Ta Xr usba_hubdi_open 9F
2037.It Xr usba_hubdi_root_hub_power 9F Ta Xr usba_hubdi_unbind_root_hub 9F
2038.El
2039.Ss Functions for PCMCIA Drivers
2040These functions exist for older PCMCIA device drivers.
2041These should not otherwise be used by the system.
2042.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2043.It Xr csx_AccessConfigurationRegister 9F Ta Xr csx_ConvertSize 9F
2044.It Xr csx_ConvertSpeed 9F Ta Xr csx_CS_DDI_Info 9F
2045.It Xr csx_DeregisterClient 9F Ta Xr csx_DupHandle 9F
2046.It Xr csx_Error2Text 9F Ta Xr csx_Event2Text 9F
2047.It Xr csx_FreeHandle 9F Ta Xr csx_Get16 9F
2048.It Xr csx_Get32 9F Ta Xr csx_Get64 9F
2049.It Xr csx_Get8 9F Ta Xr csx_GetEventMask 9F
2050.It Xr csx_GetFirstClient 9F Ta Xr csx_GetFirstTuple 9F
2051.It Xr csx_GetHandleOffset 9F Ta Xr csx_GetMappedAddr 9F
2052.It Xr csx_GetNextClient 9F Ta Xr csx_GetNextTuple 9F
2053.It Xr csx_GetStatus 9F Ta Xr csx_GetTupleData 9F
2054.It Xr csx_MakeDeviceNode 9F Ta Xr csx_MapLogSocket 9F
2055.It Xr csx_MapMemPage 9F Ta Xr csx_ModifyConfiguration 9F
2056.It Xr csx_ModifyWindow 9F Ta Xr csx_Parse_CISTPL_BATTERY 9F
2057.It Xr csx_Parse_CISTPL_BYTEORDER 9F Ta Xr csx_Parse_CISTPL_CFTABLE_ENTRY 9F
2058.It Xr csx_Parse_CISTPL_CONFIG 9F Ta Xr csx_Parse_CISTPL_DATE 9F
2059.It Xr csx_Parse_CISTPL_DEVICE_A 9F Ta Xr csx_Parse_CISTPL_DEVICE_OA 9F
2060.It Xr csx_Parse_CISTPL_DEVICE_OC 9F Ta Xr csx_Parse_CISTPL_DEVICE 9F
2061.It Xr csx_Parse_CISTPL_DEVICEGEO_A 9F Ta Xr csx_Parse_CISTPL_DEVICEGEO 9F
2062.It Xr csx_Parse_CISTPL_FORMAT 9F Ta Xr csx_Parse_CISTPL_FUNCE 9F
2063.It Xr csx_Parse_CISTPL_FUNCID 9F Ta Xr csx_Parse_CISTPL_GEOMETRY 9F
2064.It Xr csx_Parse_CISTPL_JEDEC_A 9F Ta Xr csx_Parse_CISTPL_JEDEC_C 9F
2065.It Xr csx_Parse_CISTPL_LINKTARGET 9F Ta Xr csx_Parse_CISTPL_LONGLINK_A 9F
2066.It Xr csx_Parse_CISTPL_LONGLINK_C 9F Ta Xr csx_Parse_CISTPL_LONGLINK_MFC 9F
2067.It Xr csx_Parse_CISTPL_MANFID 9F Ta Xr csx_Parse_CISTPL_ORG 9F
2068.It Xr csx_Parse_CISTPL_SPCL 9F Ta Xr csx_Parse_CISTPL_SWIL 9F
2069.It Xr csx_Parse_CISTPL_VERS_1 9F Ta Xr csx_Parse_CISTPL_VERS_2 9F
2070.It Xr csx_ParseTuple 9F Ta Xr csx_Put16 9F
2071.It Xr csx_Put32 9F Ta Xr csx_Put64 9F
2072.It Xr csx_Put8 9F Ta Xr csx_RegisterClient 9F
2073.It Xr csx_ReleaseConfiguration 9F Ta Xr csx_ReleaseIO 9F
2074.It Xr csx_ReleaseIRQ 9F Ta Xr csx_ReleaseSocketMask 9F
2075.It Xr csx_ReleaseWindow 9F Ta Xr csx_RemoveDeviceNode 9F
2076.It Xr csx_RepGet16 9F Ta Xr csx_RepGet32 9F
2077.It Xr csx_RepGet64 9F Ta Xr csx_RepGet8 9F
2078.It Xr csx_RepPut16 9F Ta Xr csx_RepPut32 9F
2079.It Xr csx_RepPut64 9F Ta Xr csx_RepPut8 9F
2080.It Xr csx_RequestConfiguration 9F Ta Xr csx_RequestIO 9F
2081.It Xr csx_RequestIRQ 9F Ta Xr csx_RequestSocketMask 9F
2082.It Xr csx_RequestWindow 9F Ta Xr csx_ResetFunction 9F
2083.It Xr csx_SetEventMask 9F Ta Xr csx_SetHandleOffset 9F
2084.It Xr csx_ValidateCIS 9F Ta
2085.El
2086.Ss STREAMS related functions
2087These functions are meant to be used when interacting with STREAMS
2088devices or when implementing one.
2089When a STREAMS driver is opened, it receives messages on a queue which
2090are then processed and can be sent back.
2091As different queues are often linked together, the most common thing is
2092to process a message and then pass the message onto the next queue using
2093the
2094.Xr putnext 9F
2095function.
2096.Pp
2097STREAMS messages are passed around using message blocks, which use the
2098.Vt mblk_t
2099type.
2100See
2101.Sx Message Block Functions
for more about the data structure and the functions that manipulate
message blocks.
2104.Pp
2105These functions should generally not be used when implementing a
2106networking device driver today.
2107See
2108.Xr mac 9E
2109instead.
2110.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2111.It Xr backq 9F Ta Xr bcanput 9F
2112.It Xr bcanputnext 9F Ta Xr canput 9F
2113.It Xr canputnext 9F Ta Xr enableok 9F
2114.It Xr flushband 9F Ta Xr flushq 9F
2115.It Xr freezestr 9F Ta Xr getq 9F
2116.It Xr insq 9F Ta Xr merror 9F
2117.It Xr mexchange 9F Ta Xr noenable 9F
2118.It Xr put 9F Ta Xr putbq 9F
2119.It Xr putctl 9F Ta Xr putctl1 9F
2120.It Xr putnext 9F Ta Xr putnextctl 9F
2121.It Xr putnextctl1 9F Ta Xr putq 9F
2122.It Xr mt-streams 9F Ta Xr qassociate 9F
2123.It Xr qenable 9F Ta Xr qprocsoff 9F
2124.It Xr qprocson 9F Ta Xr qreply 9F
2125.It Xr qsize 9F Ta Xr qwait_sig 9F
2126.It Xr qwait 9F Ta Xr qwriter 9F
2127.It Xr OTHERQ 9F Ta Xr RD 9F
2128.It Xr rmvq 9F Ta Xr SAMESTR 9F
2129.It Xr unfreezestr 9F Ta Xr WR 9F
2130.El
2131.Ss STREAMS ioctls
2132The following functions are used when a STREAMS-based device driver is
2133processing its
2134.Xr ioctl 9E
2135entry point.
Unlike character and block devices, STREAMS ioctls are passed around in
message blocks, and data cannot be copied directly in and out of userland
because STREAMS ioctls are generally processed in
2139.Sy kernel
2140context.
2141This means that the normal functions like
2142.Xr ddi_copyin 9F
2143and
2144.Xr ddi_copyout 9F
2145cannot be used.
2146Instead, when a message block has a type of
2147.Dv M_IOCTL ,
2148then these routines can often be used to convert the structure into one
2149that asks for data to be copied in, copied out, or to finally
2150acknowledge the ioctl as successful or to terminate the processing in
2151error.
2152.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2153.It Xr mcopyin 9F Ta Xr mcopyout 9F
2154.It Xr mioc2ack 9F Ta Xr miocack 9F
2155.It Xr miocnak 9F Ta Xr miocpullup 9F
2156.It Xr mkiocb 9F Ta
2157.El
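.Pp
As a minimal sketch of the simpler, non-transparent case, the fragment below
shows how a write-side put procedure might hand an
.Dv M_IOCTL
message to a helper that acknowledges or rejects it.
The command
.Dv XX_IOC_SETVAL ,
the variable
.Va xx_val ,
and the function
.Fn xx_ioctl
are hypothetical names chosen for illustration, and locking is omitted.
Transparent ioctls, which need
.Xr mcopyin 9F
or
.Xr mcopyout 9F ,
are not covered here.

```c
#include <sys/types.h>
#include <sys/errno.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical ioctl command and driver state, for illustration only. */
#define	XX_IOC_SETVAL	(('x' << 8) | 1)
static uint32_t xx_val;

/*
 * Assumed to be called from the driver's write-side put procedure once it
 * has determined that the message type is M_IOCTL.
 */
static void
xx_ioctl(queue_t *wq, mblk_t *mp)
{
	struct iocblk *iocp = (struct iocblk *)mp->b_rptr;
	int error;

	switch (iocp->ioc_cmd) {
	case XX_IOC_SETVAL:
		/* Ensure the payload is present and contiguous in b_cont. */
		error = miocpullup(mp, sizeof (xx_val));
		if (error != 0) {
			miocnak(wq, mp, 0, error);
			return;
		}
		bcopy(mp->b_cont->b_rptr, &xx_val, sizeof (xx_val));
		/* Success: no data is returned to the caller. */
		miocack(wq, mp, 0, 0);
		return;
	default:
		/* Unknown command: terminate the ioctl in error. */
		miocnak(wq, mp, 0, EINVAL);
		return;
	}
}
```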
2158.Ss chpoll(9E) Related Functions
2159These functions are present in service of the
2160.Xr chpoll 9E
2161interface which is used to support the traditional
2162.Xr poll 2
2163and
2164.Xr select 3C
2165interfaces as well as event ports through the
2166.Xr port_get 3C
2167interface.
2168See
2169.Xr chpoll 9E
2170for the specific cases in which these functions should be called.
2171If a device driver does not implement the
2172.Xr chpoll 9E
2173character device entry point, then these functions should not be used.
2174.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2175.It Xr pollhead_clean 9F Ta Xr pollwakeup 9F
2176.El
2177.Ss Kernel Statistics
2178The kernel statistics or kstat framework provides an easy way of
2179exporting statistical information to be consumed outside of the kernel.
2180Users can interface with this data via
2181.Xr kstat 8
2182and the corresponding kstat library discussed in
2183.Xr kstat 3KSTAT .
2184.Pp
2185Kernel statistics are grouped using a tuple of four identifiers,
2186separated by colons when using
2187.Xr kstat 8 .
2188These are, in order, the statistic module name, instance, a name
2189which covers a group of statistics, and an individual name for a
2190statistic.
2191In addition, kernel statistics have a class which is used to group
2192similarly named groups of statistics together across devices.
2193When using
2194.Xr kstat_create 9F ,
2195drivers specify the first three parts of the tuple and the class.
2196The naming of individual statistics, the last part of the tuple, varies
2197based upon the type of the statistic.
2198For the most part, drivers will use the kstat type
2199.Dv KSTAT_TYPE_NAMED ,
2200which allows multiple name-value pairs to exist within the statistic.
2201For example, the kernel's layer 2 networking framework,
2202.Xr mac 9E ,
2203creates a kstat with the driver's name and instance and names it
2204.Dq mac .
2205Within this named group, there are statistics for all of the different
2206individual stats that the kernel and devices track such as bytes
2207transmitted and received, the state and speed of the link, and
2208advertised and enabled capabilities.
2209.Pp
2210A device driver can initialize a kstat with the
2211.Xr kstat_create 9F
2212function.
2213It will not be made accessible to users until the
2214.Xr kstat_install 9F
2215function is called.
2216The device driver must perform additional initialization of the kstat
2217before proceeding and calling
2218.Xr kstat_install 9F .
2219The kstat structure that drivers see is discussed in
2220.Xr kstat 9S .
2221.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2222.It Xr kstat_create 9F Ta Xr kstat_delete 9F
2223.It Xr kstat_install 9F Ta Xr kstat_named_init 9F
2224.It Xr kstat_named_setstr 9F Ta Xr kstat_queue 9F
2225.It Xr kstat_runq_back_to_waitq 9F Ta Xr kstat_runq_enter 9F
2226.It Xr kstat_runq_exit 9F Ta Xr kstat_waitq_enter 9F
2227.It Xr kstat_waitq_exit 9F Ta Xr kstat_waitq_to_runq 9F
2228.El
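.Pp
The create, initialize, and install sequence described above might look
roughly like the following sketch.
The structure
.Vt xx_stats_t ,
the kstat name
.Dq xxstats ,
the statistic names, and the function
.Fn xx_kstat_init
are hypothetical; a real driver would choose names and a class appropriate to
its device.

```c
#include <sys/types.h>
#include <sys/kstat.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical set of named statistics for an example driver. */
typedef struct xx_stats {
	kstat_named_t	xs_rx_frames;
	kstat_named_t	xs_tx_frames;
} xx_stats_t;

static kstat_t *
xx_kstat_init(dev_info_t *dip)
{
	kstat_t *ksp;
	xx_stats_t *xs;

	/* Module, instance, name, and class; the kstat is not yet visible. */
	ksp = kstat_create(ddi_driver_name(dip), ddi_get_instance(dip),
	    "xxstats", "net", KSTAT_TYPE_NAMED,
	    sizeof (xx_stats_t) / sizeof (kstat_named_t), 0);
	if (ksp == NULL)
		return (NULL);

	/* Name each individual statistic before installing the kstat. */
	xs = ksp->ks_data;
	kstat_named_init(&xs->xs_rx_frames, "rx_frames", KSTAT_DATA_UINT64);
	kstat_named_init(&xs->xs_tx_frames, "tx_frames", KSTAT_DATA_UINT64);

	/* From this point on, users can read the statistics. */
	kstat_install(ksp);
	return (ksp);
}
```

The returned pointer would later be passed to
.Xr kstat_delete 9F ,
typically when the driver detaches.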
2229.Ss NDI Events
2230These functions are used to allow a device driver to register for
2231certain events that might occur to its device or a parent in the tree
2232and receive a callback when they occur.
2233A good example of this is when a device has been removed from the system
2234such as someone just pulling out a USB device or NVMe U.2 device.
2235The event handlers work by first getting a cookie that names the type of
2236event with
2237.Xr ddi_get_eventcookie 9F
2238and then registering the callback with
2239.Xr ddi_add_event_handler 9F .
2240.Pp
2241The
2242.Xr ddi_cb_register 9F
2243function is used to register for other classes of events, such as when
2244participating in dynamic interrupt sharing.
2245.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2246.It Xr ddi_add_event_handler 9F Ta Xr ddi_cb_register 9F
2247.It Xr ddi_cb_unregister 9F Ta Xr ddi_get_eventcookie 9F
2248.It Xr ddi_remove_event_handler 9F Ta
2249.El
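.Pp
The cookie-then-handler sequence can be sketched as follows.
The structure
.Vt xx_state_t
and the functions
.Fn xx_remove_cb
and
.Fn xx_watch_removal
are hypothetical, and whether the
.Dv DDI_DEVI_REMOVE_EVENT
event is offered at all depends on the parent nexus driver, so treat this as
an illustration of the calling pattern rather than a recipe for any particular
bus.

```c
#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical per-instance driver state. */
typedef struct xx_state {
	dev_info_t		*xs_dip;
	ddi_callback_id_t	xs_remove_id;
	boolean_t		xs_gone;
} xx_state_t;

/* Invoked by the framework when the event fires; arg is our state. */
static void
xx_remove_cb(dev_info_t *dip, ddi_eventcookie_t cookie, void *arg,
    void *impl_data)
{
	xx_state_t *xs = arg;

	xs->xs_gone = B_TRUE;
}

static int
xx_watch_removal(xx_state_t *xs)
{
	ddi_eventcookie_t cookie;

	/* Step 1: turn the event name into a cookie. */
	if (ddi_get_eventcookie(xs->xs_dip, DDI_DEVI_REMOVE_EVENT,
	    &cookie) != DDI_SUCCESS) {
		return (DDI_FAILURE);
	}

	/* Step 2: register the callback; it may fire immediately. */
	return (ddi_add_event_handler(xs->xs_dip, cookie, xx_remove_cb,
	    xs, &xs->xs_remove_id));
}
```

The saved identifier would be handed to
.Xr ddi_remove_event_handler 9F
when the driver no longer wants the callback.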
2250.Ss Layered Device Interfaces
2251The LDI
2252.Pq Layered Device Interface
2253provides a mechanism for a driver to open up another device in the
2254kernel and begin calling basic operations on the device as though the
2255calling driver were a normal user process.
2256Through the LDI, drivers can perform equivalents to the basic file
2257.Xr read 2
2258and
2259.Xr write 2
2260calls, look up properties on the device, perform networking style calls
2261like
2262.Xr getmsg 2
2263and
2264.Xr putmsg 2 ,
2265and register callbacks to be called when something happens to the
2266underlying device.
2267For example, the ZFS file system uses the LDI to open and operate on
2268block devices.
2269.Pp
2270Before opening a device itself, callers must obtain a notion of their
2271identity which is used when making subsequent calls.
2272The simplest form is often to use the device's
2273.Vt dev_info_t
2274and call
2275.Xr ldi_ident_from_dip 9F ;
2276however, there are also methods available based upon having a
2277.Vt dev_t
2278or a STREAMS
2279.Vt struct queue .
2280.Pp
2281Once that identity is established, there are several ways to open a
2282device such as
2283.Xr ldi_open_by_dev 9F ,
2284.Xr ldi_open_by_devid 9F ,
2285or
2286.Xr ldi_open_by_name 9F .
2287Once an LDI device has been opened, then all of the other functions may
2288be used to operate on the device; however, consumers of the LDI must
2289think carefully about what kind of device they are opening.
2290While a kernel pseudo-device driver cannot disappear while it is open,
2291when the device represents an actual piece of hardware, it is possible
2292for it to be physically removed and no longer be accessible.
2293Consumers should not assume that a layered device will always be
2294present.
2295.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2296.It Xr ldi_add_event_handler 9F Ta Xr ldi_aread 9F
2297.It Xr ldi_awrite 9F Ta Xr ldi_close 9F
2298.It Xr ldi_devmap 9F Ta Xr ldi_dump 9F
2299.It Xr ldi_ev_finalize 9F Ta Xr ldi_ev_get_cookie 9F
2300.It Xr ldi_ev_get_type 9F Ta Xr ldi_ev_notify 9F
2301.It Xr ldi_ev_register_callbacks 9F Ta Xr ldi_ev_remove_callbacks 9F
2302.It Xr ldi_get_dev 9F Ta Xr ldi_get_devid 9F
2303.It Xr ldi_get_eventcookie 9F Ta Xr ldi_get_minor_name 9F
2304.It Xr ldi_get_otyp 9F Ta Xr ldi_get_size 9F
2305.It Xr ldi_getmsg 9F Ta Xr ldi_ident_from_dev 9F
2306.It Xr ldi_ident_from_dip 9F Ta Xr ldi_ident_from_stream 9F
2307.It Xr ldi_ident_release 9F Ta Xr ldi_ioctl 9F
2308.It Xr ldi_open_by_dev 9F Ta Xr ldi_open_by_devid 9F
2309.It Xr ldi_open_by_name 9F Ta Xr ldi_poll 9F
2310.It Xr ldi_prop_exists 9F Ta Xr ldi_prop_get_int 9F
2311.It Xr ldi_prop_get_int64 9F Ta Xr ldi_prop_lookup_byte_array 9F
2312.It Xr ldi_prop_lookup_int_array 9F Ta Xr ldi_prop_lookup_int64_array 9F
2313.It Xr ldi_prop_lookup_string_array 9F Ta Xr ldi_prop_lookup_string 9F
2314.It Xr ldi_putmsg 9F Ta Xr ldi_read 9F
2315.It Xr ldi_remove_event_handler 9F Ta Xr ldi_strategy 9F
2316.It Xr ldi_write 9F Ta
2317.El
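.Pp
As a rough sketch of the identity-then-open sequence described above, a driver
might wrap the calls as shown here.
The function
.Fn xx_open_lower
and its parameters are hypothetical, the use of
.Va kcred
and of read-write access is an assumption made for the example, and a real
consumer would also need to handle removal of the underlying device.

```c
#include <sys/types.h>
#include <sys/file.h>
#include <sys/cred.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/sunldi.h>

/*
 * Obtain an LDI identity from our dev_info_t and use it to open the device
 * at 'path'. On success the caller owns both *lip and *lhp.
 */
static int
xx_open_lower(dev_info_t *dip, char *path, ldi_ident_t *lip,
    ldi_handle_t *lhp)
{
	int ret;

	/* Establish who we are before opening anything. */
	ret = ldi_ident_from_dip(dip, lip);
	if (ret != 0)
		return (ret);

	/* Open the target device by its /devices or /dev path. */
	ret = ldi_open_by_name(path, FREAD | FWRITE, kcred, lhp, *lip);
	if (ret != 0) {
		ldi_ident_release(*lip);
		return (ret);
	}

	return (0);
}
```

Teardown would mirror this in reverse: call
.Xr ldi_close 9F
on the handle with the same flags and credentials, and then
.Xr ldi_ident_release 9F
on the identity.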
2318.Ss Signal Manipulation
2319These utility functions all relate to understanding whether or not a
2320process can receive a signal and actually delivering one to a process
2321from a driver.
2322This interface is specific to device drivers and should not be used by
2323the broader kernel.
2324These interfaces are not recommended and should only be used after
2325consultation.
2326.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2327.It Xr ddi_can_receive_sig 9F Ta Xr proc_ref 9F
2328.It Xr proc_signal 9F Ta Xr proc_unref 9F
2329.El
2330.Ss Getting at Surrounding Context
2331These functions allow a driver to better understand its current context.
2332For example, some drivers have to deal with providing polled I/O or take
2333special care as part of creating a kernel crash dump.
2334These cases may need to call the
2335.Xr ddi_in_panic 9F
2336function.
2337The other functions generally provide a way to get at information such as
2338the process ID or other information from the system; however, this
2339generally should not be needed or used.
2340Almost all values exposed by, say,
2341.Xr drv_getparm 9F
2342have more usable first-class methods of getting at the data.
2343.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2344.It Xr ddi_get_kt_did 9F Ta Xr ddi_get_pid 9F
2345.It Xr ddi_in_panic 9F Ta Xr drv_getparm 9F
2346.El
2347.Ss Driver Memory Mapping
2348These functions are present for device drivers that implement the
2349.Xr devmap 9E
2350or
2351.Xr segmap 9E
2352entry points.
2353The
2354.Xr ddi_umem_alloc 9F
2355routines are used to allocate and lock memory that can later be used as
2356part of passing this memory to userland through the mapping entry
2357points.
2358.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2359.It Xr ddi_devmap_segmap 9F Ta Xr ddi_mmap_get_model 9F
2360.It Xr ddi_segmap_setup 9F Ta Xr ddi_segmap 9F
2361.It Xr ddi_umem_alloc 9F Ta Xr ddi_umem_free 9F
2362.It Xr ddi_umem_iosetup 9F Ta Xr ddi_umem_lock 9F
2363.It Xr ddi_umem_unlock 9F Ta Xr ddi_unmap_regs 9F
2364.It Xr devmap_default_access 9F Ta Xr devmap_devmem_setup 9F
2365.It Xr devmap_do_ctxmgt 9F Ta Xr devmap_load 9F
2366.It Xr devmap_set_ctx_timeout 9F Ta Xr devmap_setup 9F
2367.It Xr devmap_umem_setup 9F Ta Xr devmap_unload 9F
2368.El
2369.Ss UTF-8, UTF-16, UTF-32, and Code Set Utilities
2370These routines provide the ability to work with and deal with text in
2371different encodings and code sets.
2372Generally the kernel does not assume much about the type of the text
2373that it is operating on, though some subsystems will require that the
2374names of things be ASCII only.
2375.Pp
2376The primary other locales that the system supports are generally UTF-8
2377based and so the kernel provides a set of routines to deal with UTF-8
2378and Unicode normalization.
2379However, there are still cases where different character encodings are
2380required or conversion between UTF-8 and some other type is required.
2381This is provided by the kernel iconv framework, which provides a
2382subset of the traditional userland iconv conversions.
2383.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2384.It Xr kiconv_close 9F Ta Xr kiconv_open 9F
2385.It Xr kiconv 9F Ta Xr kiconvstr 9F
2386.It Xr u8_strcmp 9F Ta Xr u8_textprep_str 9F
2387.It Xr u8_validate 9F Ta Xr uconv_u16tou32 9F
2388.It Xr uconv_u16tou8 9F Ta Xr uconv_u32tou16 9F
2389.It Xr uconv_u32tou8 9F Ta Xr uconv_u8tou16 9F
2390.It Xr uconv_u8tou32 9F Ta
2391.El
2392.Ss Raw I/O Port Access
2393This group of functions provides raw access to I/O ports on architectures
2394that support them.
2395These functions do not allow any coordination with other callers nor is
2396the validity of the port assured in any way.
2397In general, device drivers should use the normal register access
2398routines to access I/O ports.
2399See
2400.Sx Device Register Setup and Access
2401for more information on the preferred way to set up and access registers.
2402.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2403.It Xr inb 9F Ta Xr inw 9F
2404.It Xr inl 9F Ta Xr outb 9F
2405.It Xr outw 9F Ta Xr outl 9F
2406.El
2407.Ss Power Management
2408These functions are used to raise and lower the internal power levels of
2409a device driver or to indicate to the kernel that the device is busy and
2410therefore cannot have its power changed.
2411See
2412.Xr power 9E
2413for additional information.
2414.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2415.It Xr ddi_removing_power 9F Ta Xr pm_busy_component 9F
2416.It Xr pm_idle_component 9F Ta Xr pm_lower_power 9F
2417.It Xr pm_power_has_changed 9F Ta Xr pm_raise_power 9F
2418.It Xr pm_trans_check 9F Ta
2419.El
2420.Ss Network Packet Hooks
2421These functions are intended to be used by device drivers that wish to
2422inspect and potentially modify packets along their path through the
2423networking stack.
2424The most common use case is for implementing something like a network
2425firewall.
2426Otherwise, if looking to add support for a new protocol or other network
2427processing feature, one is better off more directly integrating with the
2428networking stack.
2429.Pp
2430To get started, drivers generally will need to first use
2431.Xr net_protocol_lookup 9F
2432to get a handle to say that they're interested in looking at IPv4 or
2433IPv6 traffic and then can allocate an actual hook object with
2434.Xr hook_alloc 9F .
2435After filling out the hook, the hook can be inserted into the actual
2436system with
2437.Xr net_hook_register 9F .
2438.Pp
2439Hooks operate in the context of a networking stack.
2440Every networking stack in the system is independent and therefore has
2441its own set of interfaces, routing tables, settings, and related state.
2442Most zones have their own networking stack.
2443This is the exclusive-IP option that is described in
2444.Xr zoneadm 8 .
2445.Pp
2446Drivers can register to get a callback for every netstack in the system
2447and be notified when they are created and destroyed.
2448This is done by calling the
2449.Xr net_instance_alloc 9F
2450function, filling out its data structure, and then finally calling
2451.Xr net_instance_register 9F .
2452Like other callback interfaces, the moment the callback functions are
2453registered, drivers need to expect that they're going to be called.
2454.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2455.It Xr hook_alloc 9F Ta Xr hook_free 9F
2456.It Xr net_event_notify_register 9F Ta Xr net_event_notify_unregister 9F
2457.It Xr net_getifname 9F Ta Xr net_getlifaddr 9F
2458.It Xr net_getmtu 9F Ta Xr net_getnetid 9F
2459.It Xr net_getpmtuenabled 9F Ta Xr net_hook_register 9F
2460.It Xr net_hook_unregister 9F Ta Xr net_inject_alloc 9F
2461.It Xr net_inject_free 9F Ta Xr net_inject 9F
2462.It Xr net_instance_alloc 9F Ta Xr net_instance_free 9F
2463.It Xr net_instance_notify_register 9F Ta Xr net_instance_notify_unregister 9F
2464.It Xr net_instance_protocol_unregister 9F Ta Xr net_instance_register 9F
2465.It Xr net_instance_unregister 9F Ta Xr net_ispartialchecksum 9F
2466.It Xr net_isvalidchecksum 9F Ta Xr net_kstat_create 9F
2467.It Xr net_kstat_delete 9F Ta Xr net_lifgetnext 9F
2468.It Xr net_netidtozonid 9F Ta Xr net_phygetnext 9F
2469.It Xr net_phylookup 9F Ta Xr net_protocol_lookup 9F
2470.It Xr net_protocol_notify_register 9F Ta Xr net_protocol_release 9F
2471.It Xr net_protocol_walk 9F Ta Xr net_routeto 9F
2472.It Xr net_zoneidtonetid 9F Ta Xr netinfo 9F
2473.El
2474.Sh SEE ALSO
2475.Xr Intro 2 ,
2476.Xr Intro 9 ,
2477.Xr Intro 9E ,
2478.Xr Intro 9S
2479.Rs
2480.%T illumos Developer's Guide
2481.%U https://www.illumos.org/books/dev/
2482.Re
2483.Rs
2484.%T Writing Device Drivers
2485.%U https://www.illumos.org/books/wdd/
2486.Re
2487