.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source.  A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2023 Oxide Computer Company
.\" Copyright 2023 Peter Tribble
.\"
.Dd July 17, 2023
.Dt INTRO 9F
.Os
.Sh NAME
.Nm Intro
.Nd Introduction to kernel and device driver functions
.Sh SYNOPSIS
.In sys/ddi.h
.In sys/sunddi.h
.Sh DESCRIPTION
Section 9F of the manual describes functions that are used for device
drivers, kernel modules, and the implementation of the kernel itself.
This page first provides an overview of the use of kernel functions and of the
portions of the manual that are specific to the kernel.
After that, we have grouped together most functions that are available by use,
with some brief commentary and introduction.
.Pp
Most manual pages are similar to those in other sections.
They have common fields such as the NAME, a SYNOPSIS to show which header files
to include and prototypes, an extended DESCRIPTION discussing its use, and the
common combination of RETURN VALUES and ERRORS.
Some manuals will have examples and additional manuals to reference in the SEE
ALSO section.
.Ss RETURN VALUES and ERRORS
One major difference when programming in the kernel versus userland is that
there is no equivalent to
.Va errno .
Instead, there are a few common patterns that are used throughout the kernel
that we'll discuss.
While there are common patterns, please be aware that due to the natural
evolution of the system, you will need to read the specifics of this
section in each manual page.
.Bl -bullet
.It
Many functions will return a specific DDI
.Pq Device Driver Interface
value, which is commonly one of
.Dv DDI_SUCCESS
or
.Dv DDI_FAILURE ,
indicating success and failure respectively.
Some functions will return additional error codes to indicate why something
failed.
In general, when checking a return value it is preferable to test whether
it equals or does not equal
.Dv DDI_SUCCESS ,
as there can be many different error cases and additional ones can be added over
time.
.It
Many routines explicitly return
.Sy 0
on success and an explicit error number on failure.
.Xr Intro 2
has a list of error numbers.
.It
There are classes of functions that return either a pointer or a boolean type,
either the C99
.Vt bool
or the system's traditional type
.Vt boolean_t .
In these cases, sometimes a more detailed error is provided via an additional
argument such as a
.Vt "int *" .
Absent such an argument, there is generally no more detailed information
available.
.El
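.Pp
The first two patterns above can be sketched as follows.
This is an illustrative fragment rather than code from any particular driver:
the
.Fn xx_setup
function and its arguments are hypothetical.
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/errno.h>
#include <sys/kmem.h>
#include <sys/nvpair.h>

/* Hypothetical helper; 'state' comes from ddi_soft_state_init(9F). */
static int
xx_setup(void *state, int inst, nvlist_t **nvlp)
{
	int ret;

	/* DDI-style return: compare against DDI_SUCCESS only. */
	if (ddi_soft_state_zalloc(state, inst) != DDI_SUCCESS)
		return (ENOMEM);

	/* errno-style return: 0 on success, an error number otherwise. */
	ret = nvlist_alloc(nvlp, NV_UNIQUE_NAME, KM_SLEEP);
	if (ret != 0) {
		ddi_soft_state_free(state, inst);
		return (ret);
	}

	return (0);
}
.Ed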
.Ss CONTEXT
The CONTEXT section of a manual page describes the contexts in which the
function may be called.
In general, there are three different contexts that come up:
.Bl -tag -width Ds
.It Sy User
User context implies that the thread of execution is operating because a user
thread has entered the kernel for an operation.
When an application issues a system call such as
.Xr open 2 ,
.Xr read 2 ,
.Xr write 2 ,
or
.Xr ioctl 2
then we are said to be in user context.
When in user context, one can copy in or out data from a user's address space.
When writing a character or block device driver, the majority of the time that a
character device operation such as the corresponding
.Xr open 9E ,
.Xr read 9E ,
.Xr write 9E ,
and
.Xr ioctl 9E
entry point is called, it is executing in user context.
It is possible to call those entry points through the kernel's layered device
interface, so drivers cannot assume those entry points will always have a user
process present, strictly speaking.
.It Sy Interrupt
Interrupt context refers to when the operating system is handling an interrupt
.Po
see
.Sx Interrupt Related Functions
.Pc
and executing a registered interrupt handler.
Interrupt context is split into two different sets: high-level and low-level
interrupts.
Most device drivers will only ever execute in low-level interrupt context.
To determine whether an interrupt is considered high level or not, you should
pass the interrupt handle to the
.Xr ddi_intr_get_pri 9F
function and compare the resulting priority with
.Xr ddi_intr_get_hilevel_pri 9F .
.Pp
When executing high-level interrupts, the thread may only execute a limited
number of functions.
In particular, it may call
.Xr ddi_intr_trigger_softint 9F ,
.Xr mutex_enter 9F ,
and
.Xr mutex_exit 9F .
It is critical that the mutex being used be properly initialized with the
driver's interrupt priority.
The system will transparently pick the correct implementation of a mutex based
on the interrupt type.
Aside from the above, one must not block while in high-level interrupt context.
.Pp
On the other hand, when a thread is not in high-level interrupt context, most of
these restrictions are lifted.
Kernel memory may be allocated
.Po
if using a non-blocking allocation such as
.Dv KM_NOSLEEP
or
.Dv KM_NOSLEEP_LAZY
.Pc ,
and many of the other documented functions may be called.
.Pp
Regardless of whether a thread is in high-level or low-level interrupt context,
it will never have a user context associated with it and therefore cannot use
routines like
.Xr ddi_copyin 9F
or
.Xr ddi_copyout 9F .
.It Sy Kernel
Kernel context refers to all other times in the kernel.
Whenever the kernel is executing something on a thread that is not associated
with a user process, then one is in kernel context.
The most common situations for writers of kernel modules are timeout
callbacks, such as
.Xr timeout 9F
or
.Xr ddi_periodic_add 9F ,
cases where the kernel is invoking a driver's device operation routines such as
.Xr attach 9E
and
.Xr detach 9E ,
or many of the device driver's registered callbacks from frameworks such as
.Xr mac 9E ,
.Xr usba_hcdi 9E ,
and various portions of SCSI, USB, and block devices.
.It Sy Framework-specific Contexts
Some manuals will discuss more specific constraints about when they can be used.
For example, some functions may only be called while executing a specific entry
point like
.Xr attach 9E .
Another example of this is that the
.Xr mac_transceiver_info_set_present 9F
function is only meant to be used while executing a networking driver's
.Xr mct_info 9E
entry point.
.El
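.Pp
As a rough sketch of the high-level interrupt check described under
.Sy Interrupt ,
a driver might do something like the following once it has an interrupt handle.
The
.Fn xx_intr_check
name and its output arguments are hypothetical.
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>

/*
 * Hypothetical helper: fetch the handle's priority and report
 * whether it falls in the high-level range.
 */
static int
xx_intr_check(ddi_intr_handle_t hdl, uint_t *prip, boolean_t *highp)
{
	if (ddi_intr_get_pri(hdl, prip) != DDI_SUCCESS)
		return (DDI_FAILURE);

	*highp = (*prip >= ddi_intr_get_hilevel_pri()) ? B_TRUE : B_FALSE;
	return (DDI_SUCCESS);
}
.Ed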
.Ss PARAMETERS
In kernel manual pages
.Pq section 9 ,
each function and entry point description generally has a separate list
of parameters which are arguments to the function.
The parameters section describes the basic purpose of each argument and
should explain where such things often come from and any constraints on
their values.
.Sh INTERFACES
Functions below are organized into categories that describe their purpose.
Individual functions are documented in their own manual pages.
For each of these areas, we discuss the high-level concepts behind it and
provide a brief discussion of how to get started with it.
Note, some deprecated functions or older frameworks are not listed here.
.Pp
Every function listed below has its own manual page in section 9F and
can be read with
.Xr man 1 .
In addition, some corresponding concepts are documented in section 9 and
some groups of functions are present to support a specific type of
device driver, which is discussed more in section 9E.
.Ss Logging Functions
Throughout the kernel there is often a need to log messages that go
either to the system log or to the console.
Such messages can be logged with the
.Xr cmn_err 9F
function or one of its more specific variants that operate in the
context of a device
.Po
.Xr dev_err 9F
.Pc
or a zone
.Po
.Xr zcmn_err 9F
.Pc .
.Pp
The console should be used sparingly.
While a notice may be found there, one should assume that it may be
missed due to overflow, because nothing such as a serial console was
connected at the time, or for some other reason.
While the system log is better than the console, take care not to spam
the log.
If a driver logged every time a network packet was generated or
received, the log could quickly run out of space and it would be harder
to find the messages that explain unusual behavior.
It's also important to remember that only system administrators and
privileged users can actually see this log.
Where possible and appropriate, return programmatic errors from routines
that allow it.
.Pp
The system also supports a structured event log called a system event
that is processed by
.Xr syseventd 8 .
This is used by the OS to provide notifications for things like device
insertion and removal or the change of a data link.
These are driven by the
.Xr ddi_log_sysevent 9F
function and allow arbitrary additional structured metadata in the form
of a
.Vt nvlist_t .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cmn_err 9F Ta Xr dev_err 9F
.It Xr vcmn_err 9F Ta Xr vzcmn_err 9F
.It Xr zcmn_err 9F Ta Xr ddi_log_sysevent 9F
.El
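.Pp
As a small illustration of the logging guidance above, the following sketch
logs a device-scoped note to the system log only.
The
.Fn xx_report_link
function is hypothetical; the leading
.Sq \&!
in the format string is the
.Xr cmn_err 9F
convention for keeping a message off the console.
.Bd -literal -offset indent
#include <sys/cmn_err.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical helper called on a link state change. */
static void
xx_report_link(dev_info_t *dip, boolean_t up)
{
	/* A leading '!' sends the message to the system log only. */
	dev_err(dip, CE_NOTE, "!link is %s", up ? "up" : "down");
}
.Ed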
.Ss Memory Allocation
At the heart of most device drivers is memory allocation.
The primary kernel allocator is called
.Qq kmem
.Pq kernel memory
and it is based on the
.Qq vmem
.Pq virtual memory
subsystem.
Most of the time, device drivers should use
.Xr kmem_alloc 9F
and
.Xr kmem_zalloc 9F
to allocate memory and free it with
.Xr kmem_free 9F .
Based on the original kmem and subsequent vmem papers, the kernel is
internally using object caches and magazines to allow high-throughput
allocation in a multi-CPU environment.
.Pp
When allocating memory, an important choice must be made: whether or not
to block for memory.
If one opts to perform a sleeping allocation, then the caller can be
guaranteed that the allocation will succeed, but it may take some time
and the thread will be blocked during that entire duration.
This is the
.Dv KM_SLEEP
flag.
On the other hand, there are many circumstances where this is not
appropriate, especially because a thread that is inside a memory
allocation function cannot currently be cancelled.
If the thread corresponds to a user process, then it will not be
killable.
.Pp
Given that there are many situations where this is not appropriate, the
kernel offers allocation modes where it will not block for memory to
be available:
.Dv KM_NOSLEEP
and
.Dv KM_NOSLEEP_LAZY .
These allocations can fail, in which case they return
.Dv NULL .
Even though these are said to be no sleep operations, that does not mean
that the caller will never be temporarily blocked, whether due to mutex
contention or due to trying a bit more aggressively to reclaim memory in
the case of
.Dv KM_NOSLEEP .
Unless operating in special circumstances, using
.Dv KM_NOSLEEP_LAZY
should be preferred to
.Dv KM_NOSLEEP .
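.Pp
A minimal sketch of a non-sleeping allocation and its matching free is shown
below.
The
.Vt xx_buf_t
type and both functions are hypothetical, and the length is assumed to be
non-zero.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/kmem.h>

/* Hypothetical driver buffer. */
typedef struct xx_buf {
	size_t	xb_len;
	uint8_t	*xb_data;
} xx_buf_t;

static xx_buf_t *
xx_buf_alloc(size_t len)
{
	xx_buf_t *xb;

	/* Non-sleeping allocations may fail, so check for NULL. */
	xb = kmem_zalloc(sizeof (*xb), KM_NOSLEEP_LAZY);
	if (xb == NULL)
		return (NULL);

	xb->xb_data = kmem_alloc(len, KM_NOSLEEP_LAZY);
	if (xb->xb_data == NULL) {
		kmem_free(xb, sizeof (*xb));
		return (NULL);
	}

	xb->xb_len = len;
	return (xb);
}

static void
xx_buf_free(xx_buf_t *xb)
{
	/* kmem_free() must be given the original allocation size. */
	kmem_free(xb->xb_data, xb->xb_len);
	kmem_free(xb, sizeof (*xb));
}
.Ed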
.Pp
If a device driver has its own complex object that has more significant
set up and tear down costs, then the kmem cache function family should
be considered.
To use a kmem cache, it must first be created using the
.Xr kmem_cache_create 9F
function, which requires specifying the size, alignment, and
constructors and destructors.
Individual objects are allocated from the cache with the
.Xr kmem_cache_alloc 9F
function.
An important constraint when using the caches is that when an object is
freed with
.Xr kmem_cache_free 9F ,
it is the caller's responsibility to ensure that the object is returned
to its constructed state prior to freeing it.
If the object is reused prior to the kernel reclaiming the memory for
other uses, then the constructor will not be called again.
Most device drivers do not need to create a kmem cache for their
own allocations.
.Pp
If you are writing a device driver that interacts with the
networking, STREAMS, or USB subsystems, then those subsystems generally
use the
.Vt mblk_t
data structure, which is managed through a different set of APIs, though
they are leveraging kmem under the hood.
.Pp
The vmem set of interfaces allows for the management of abstract regions
of integers, generally representing memory or some other object, each
with an offset and length.
While it is not common that a device driver needs to do its own such
management,
.Xr vmem_create 9F
and
.Xr vmem_alloc 9F
are what to reach for when the need arises.
Rather than using vmem, if one needs to model a set of integers where
each is a valid identifier, that is, you need to allocate every integer
between 0 and 1000 as a distinct identifier, instead use
.Xr id_space_create 9F
which is discussed in
.Sx Identifier Management .
For more information on vmem, see
.Xr vmem 9 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kmem_alloc 9F Ta Xr kmem_cache_alloc 9F
.It Xr kmem_cache_create 9F Ta Xr kmem_cache_destroy 9F
.It Xr kmem_cache_free 9F Ta Xr kmem_cache_set_move 9F
.It Xr kmem_free 9F Ta Xr kmem_zalloc 9F
.It Xr vmem_add 9F Ta Xr vmem_alloc 9F
.It Xr vmem_contains 9F Ta Xr vmem_create 9F
.It Xr vmem_destroy 9F Ta Xr vmem_free 9F
.It Xr vmem_size 9F Ta Xr vmem_walk 9F
.It Xr vmem_xalloc 9F Ta Xr vmem_xcreate 9F
.It Xr vmem_xfree 9F Ta Xr bufcall 9F
.It Xr esbbcall 9F Ta Xr qbufcall 9F
.It Xr qunbufcall 9F Ta Xr unbufcall 9F
.El
.Ss String and libc Analogues
The kernel has many analogues for classic libc functions that deal with
string processing, memory copying, and related tasks.
For the most part, these behave similarly to their userland analogues,
but there can be some differences in return values and, for example, in
the set of supported format characters in the case of
.Xr snprintf 9F
and related functions.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ASSERT 9F Ta Xr bcmp 9F
.It Xr bzero 9F Ta Xr bcopy 9F
.It Xr ddi_strdup 9F Ta Xr ddi_strtol 9F
.It Xr ddi_strtoll 9F Ta Xr ddi_strtoul 9F
.It Xr ddi_strtoull 9F Ta Xr ddi_ffs 9F
.It Xr ddi_fls 9F Ta Xr max 9F
.It Xr memchr 9F Ta Xr memcmp 9F
.It Xr memcpy 9F Ta Xr memmove 9F
.It Xr memset 9F Ta Xr min 9F
.It Xr numtos 9F Ta Xr snprintf 9F
.It Xr sprintf 9F Ta Xr stoi 9F
.It Xr strcasecmp 9F Ta Xr strcat 9F
.It Xr strchr 9F Ta Xr strcmp 9F
.It Xr strcpy 9F Ta Xr strdup 9F
.It Xr strfree 9F Ta Xr string 9F
.It Xr strlcat 9F Ta Xr strlcpy 9F
.It Xr strlen 9F Ta Xr strlog 9F
.It Xr strncasecmp 9F Ta Xr strncat 9F
.It Xr strncmp 9F Ta Xr strncpy 9F
.It Xr strnlen 9F Ta Xr strqget 9F
.It Xr strqset 9F Ta Xr strrchr 9F
.It Xr strspn 9F Ta Xr swab 9F
.It Xr vsnprintf 9F Ta Xr va_arg 9F
.It Xr va_copy 9F Ta Xr va_end 9F
.It Xr va_start 9F Ta Xr vsprintf 9F
.El
.Ss Tree Data Structures
These functions provide access to an intrusive self-balancing binary
tree that is generally used throughout illumos.
The primary type here is the
.Vt avl_tree_t .
Structures can be present in multiple trees and there are built-in
walkers for the data structure in
.Xr mdb 1 .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr avl_add 9F Ta Xr avl_create 9F
.It Xr avl_destroy_nodes 9F Ta Xr avl_destroy 9F
.It Xr avl_find 9F Ta Xr avl_first 9F
.It Xr avl_insert_here 9F Ta Xr avl_insert 9F
.It Xr avl_is_empty 9F Ta Xr avl_last 9F
.It Xr avl_nearest 9F Ta Xr AVL_NEXT 9F
.It Xr avl_numnodes 9F Ta Xr AVL_PREV 9F
.It Xr avl_remove 9F Ta Xr avl_swap 9F
.El
.Ss Linked Lists
These functions provide a standard, intrusive doubly-linked list whose
type is the
.Vt list_t .
This list implementation is used extensively throughout illumos, has
debugging support through
.Xr mdb 1
walkers, and is generally recommended rather than creating your own
list.
Due to its intrusive nature, a given structure can be present on
multiple lists.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr list_create 9F Ta Xr list_destroy 9F
.It Xr list_head 9F Ta Xr list_insert_after 9F
.It Xr list_insert_before 9F Ta Xr list_insert_head 9F
.It Xr list_insert_tail 9F Ta Xr list_is_empty 9F
.It Xr list_link_active 9F Ta Xr list_link_init 9F
.It Xr list_link_replace 9F Ta Xr list_move_tail 9F
.It Xr list_next 9F Ta Xr list_prev 9F
.It Xr list_remove_head 9F Ta Xr list_remove_tail 9F
.It Xr list_remove 9F Ta Xr list_tail 9F
.El
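.Pp
To make the intrusive nature of the list concrete, the following sketch embeds
a
.Vt list_node_t
in a hypothetical
.Vt xx_req_t
structure, then creates, walks, and drains a list.
The names are illustrative, and
.Dv offsetof
is assumed to be provided by
.In sys/sysmacros.h .
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/list.h>
#include <sys/sysmacros.h>	/* assumed source of offsetof() */

/* Hypothetical request; the linkage lives inside the structure. */
typedef struct xx_req {
	list_node_t	xr_link;
	uint32_t	xr_id;
} xx_req_t;

static uint32_t
xx_list_demo(list_t *l, xx_req_t *a, xx_req_t *b)
{
	xx_req_t *r;
	uint32_t sum = 0;

	list_create(l, sizeof (xx_req_t), offsetof(xx_req_t, xr_link));
	list_insert_tail(l, a);
	list_insert_tail(l, b);

	/* Walk the list from head to tail. */
	for (r = list_head(l); r != NULL; r = list_next(l, r))
		sum += r->xr_id;

	/* A list must be empty before it is destroyed. */
	while ((r = list_remove_head(l)) != NULL)
		;
	list_destroy(l);

	return (sum);
}
.Ed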
.Ss Name-Value Pairs
The kernel often uses the
.Vt nvlist_t
data structure to pass around a list of typed name-value pairs.
This data structure is used in diverse areas, particularly because of
its ability to be serialized in different formats that are suitable not
only for use between userland and the kernel, but also persistently to a
file.
.Pp
A
.Vt nvlist_t
structure is initialized with the
.Xr nvlist_alloc 9F
function and can operate with two different degrees of uniqueness: a
mode where names alone are unique, or a mode where every name is
qualified by its type.
In the former, if the list holds an integer named
.Dq foo
and a string, array, or any other value with the same name is added, the
integer will be replaced.
However, if the name and type together are used as the unique key, then
the value would only be replaced if both the pair's type and the name
.Dq foo
matched a pair that was already present.
Otherwise, the two different entries would co-exist.
.Pp
When constructing an nvlist, it is normally backed by the normal kmem
allocator and may either use sleeping or non-sleeping allocations.
It is also possible to use a custom allocator, though that generally has
not been necessary in the kernel.
.Pp
Specific keys and values can be looked up directly with the
nvlist_lookup family of functions, but the entire list can be iterated
as well, which is especially useful when trying to validate that no
unknown keys are present in the list.
The iteration API
.Xr nvlist_next_nvpair 9F
allows one to get the key's name, the type of the pair's value, and then
the value itself.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr nv_alloc_fini 9F Ta Xr nv_alloc_init 9F
.It Xr nvlist_add_boolean_array 9F Ta Xr nvlist_add_boolean_value 9F
.It Xr nvlist_add_boolean 9F Ta Xr nvlist_add_byte_array 9F
.It Xr nvlist_add_byte 9F Ta Xr nvlist_add_int16_array 9F
.It Xr nvlist_add_int16 9F Ta Xr nvlist_add_int32_array 9F
.It Xr nvlist_add_int32 9F Ta Xr nvlist_add_int64_array 9F
.It Xr nvlist_add_int64 9F Ta Xr nvlist_add_int8_array 9F
.It Xr nvlist_add_int8 9F Ta Xr nvlist_add_nvlist_array 9F
.It Xr nvlist_add_nvlist 9F Ta Xr nvlist_add_nvpair 9F
.It Xr nvlist_add_string_array 9F Ta Xr nvlist_add_string 9F
.It Xr nvlist_add_uint16_array 9F Ta Xr nvlist_add_uint16 9F
.It Xr nvlist_add_uint32_array 9F Ta Xr nvlist_add_uint32 9F
.It Xr nvlist_add_uint64_array 9F Ta Xr nvlist_add_uint64 9F
.It Xr nvlist_add_uint8_array 9F Ta Xr nvlist_add_uint8 9F
.It Xr nvlist_alloc 9F Ta Xr nvlist_dup 9F
.It Xr nvlist_exists 9F Ta Xr nvlist_free 9F
.It Xr nvlist_lookup_boolean_array 9F Ta Xr nvlist_lookup_boolean_value 9F
.It Xr nvlist_lookup_boolean 9F Ta Xr nvlist_lookup_byte_array 9F
.It Xr nvlist_lookup_byte 9F Ta Xr nvlist_lookup_int16_array 9F
.It Xr nvlist_lookup_int16 9F Ta Xr nvlist_lookup_int32_array 9F
.It Xr nvlist_lookup_int32 9F Ta Xr nvlist_lookup_int64_array 9F
.It Xr nvlist_lookup_int64 9F Ta Xr nvlist_lookup_int8_array 9F
.It Xr nvlist_lookup_int8 9F Ta Xr nvlist_lookup_nvlist_array 9F
.It Xr nvlist_lookup_nvlist 9F Ta Xr nvlist_lookup_nvpair 9F
.It Xr nvlist_lookup_pairs 9F Ta Xr nvlist_lookup_string_array 9F
.It Xr nvlist_lookup_string 9F Ta Xr nvlist_lookup_uint16_array 9F
.It Xr nvlist_lookup_uint16 9F Ta Xr nvlist_lookup_uint32_array 9F
.It Xr nvlist_lookup_uint32 9F Ta Xr nvlist_lookup_uint64_array 9F
.It Xr nvlist_lookup_uint64 9F Ta Xr nvlist_lookup_uint8_array 9F
.It Xr nvlist_lookup_uint8 9F Ta Xr nvlist_merge 9F
.It Xr nvlist_next_nvpair 9F Ta Xr nvlist_pack 9F
.It Xr nvlist_remove_all 9F Ta Xr nvlist_remove 9F
.It Xr nvlist_size 9F Ta Xr nvlist_t 9F
.It Xr nvlist_unpack 9F Ta Xr nvlist_xalloc 9F
.It Xr nvlist_xdup 9F Ta Xr nvlist_xpack 9F
.It Xr nvlist_xunpack 9F Ta Xr nvpair_name 9F
.It Xr nvpair_type 9F Ta Xr nvpair_value_boolean_array 9F
.It Xr nvpair_value_byte_array 9F Ta Xr nvpair_value_byte 9F
.It Xr nvpair_value_int16_array 9F Ta Xr nvpair_value_int16 9F
.It Xr nvpair_value_int32_array 9F Ta Xr nvpair_value_int32 9F
.It Xr nvpair_value_int64_array 9F Ta Xr nvpair_value_int64 9F
.It Xr nvpair_value_int8_array 9F Ta Xr nvpair_value_int8 9F
.It Xr nvpair_value_nvlist_array 9F Ta Xr nvpair_value_nvlist 9F
.It Xr nvpair_value_string_array 9F Ta Xr nvpair_value_string 9F
.It Xr nvpair_value_uint16_array 9F Ta Xr nvpair_value_uint16 9F
.It Xr nvpair_value_uint32_array 9F Ta Xr nvpair_value_uint32 9F
.It Xr nvpair_value_uint64_array 9F Ta Xr nvpair_value_uint64 9F
.It Xr nvpair_value_uint8_array 9F Ta Xr nvpair_value_uint8 9F
.El
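.Pp
The validation use of iteration mentioned above might look like the following
sketch, which rejects any pair other than a 32-bit unsigned value named
.Dq mtu .
The key name and the
.Fn xx_validate
function are hypothetical.
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/errno.h>
#include <sys/nvpair.h>

/* Hypothetical check that only known, correctly typed keys exist. */
static int
xx_validate(nvlist_t *nvl)
{
	nvpair_t *nvp;

	for (nvp = nvlist_next_nvpair(nvl, NULL); nvp != NULL;
	    nvp = nvlist_next_nvpair(nvl, nvp)) {
		if (strcmp(nvpair_name(nvp), "mtu") == 0 &&
		    nvpair_type(nvp) == DATA_TYPE_UINT32) {
			continue;
		}

		/* Unknown key or unexpected type. */
		return (EINVAL);
	}

	return (0);
}
.Ed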
.Ss Identifier Management
A common challenge in the kernel is the management of a series of
different IDs.
There are three different families of routines for managing identifiers
presented here, but we recommend the use of the
.Xr id_space_create 9F
and
.Xr id_alloc 9F
family for new use cases.
The ID space can cover all or a subset of the 32-bit integer space and
provides different allocation strategies for this.
.Pp
Due to the current implementation, callers should generally prefer the
non-sleeping variants because the sleeping ones are not cancellable
.Po
currently this is backed by vmem, but this should not be assumed and may
change in the future
.Pc .
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr id_alloc_nosleep 9F Ta Xr id_alloc_specific_nosleep 9F
.It Xr id_alloc 9F Ta Xr id_allocff_nosleep 9F
.It Xr id_allocff 9F Ta Xr id_free 9F
.It Xr id_space_create 9F Ta Xr id_space_destroy 9F
.It Xr id_space_extend 9F Ta Xr id_space 9F
.It Xr id32_alloc 9F Ta Xr id32_free 9F
.It Xr id32_lookup 9F Ta Xr rmalloc_wait 9F
.It Xr rmalloc 9F Ta Xr rmallocmap_wait 9F
.It Xr rmallocmap 9F Ta Xr rmfree 9F
.It Xr rmfreemap 9F Ta
.El
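.Pp
A minimal sketch of the recommended ID space family follows.
The space name, the range, and the function names are hypothetical, and the
upper bound passed to
.Xr id_space_create 9F
is assumed here to be exclusive; consult that manual page before relying on
the exact range semantics.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/id_space.h>

static id_space_t *xx_ids;	/* hypothetical per-driver ID space */

static void
xx_ids_init(void)
{
	/* IDs 1 through 1000, assuming an exclusive upper bound. */
	xx_ids = id_space_create("xx_ids", 1, 1001);
}

static id_t
xx_id_get(void)
{
	/* Non-sleeping variant; returns -1 if no ID is available. */
	return (id_alloc_nosleep(xx_ids));
}

static void
xx_id_put(id_t id)
{
	id_free(xx_ids, id);
}

static void
xx_ids_fini(void)
{
	id_space_destroy(xx_ids);
}
.Ed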
.Ss Bit Manipulation Routines
Device drivers that work with registers often need to get a
specific range of bits out of an integer.
These functions provide safe ways to set
.Pq bitset
and extract
.Pq bitx
bit ranges, as well
as modify an integer to remove a set of bits entirely
.Pq bitdel .
Using these functions is preferred to constructing manual masks and
shifts, particularly when a programming manual for a device is specified
in ranges of bits.
On debug builds, these provide extra checking to try to catch
programmer error.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr bitdel64 9F Ta Xr bitset8 9F
.It Xr bitset16 9F Ta Xr bitset32 9F
.It Xr bitset64 9F Ta Xr bitx8 9F
.It Xr bitx16 9F Ta Xr bitx32 9F
.It Xr bitx64 9F Ta
.El
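.Pp
As an illustration, suppose a device manual describes a hypothetical 32-bit
register whose bits 15:8 hold a queue depth.
Extracting and updating that field could be sketched as follows; the register
layout and function names are invented for the example.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/bitext.h>

/* Extract bits 15:8 (inclusive) of the hypothetical register. */
static uint32_t
xx_get_depth(uint32_t reg)
{
	return (bitx32(reg, 15, 8));
}

/* Return 'reg' with bits 15:8 replaced by 'depth'. */
static uint32_t
xx_set_depth(uint32_t reg, uint32_t depth)
{
	return (bitset32(reg, 15, 8, depth));
}
.Ed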
.Ss Synchronization Primitives
The kernel provides a set of basic synchronization primitives that can
be used by the system.
These include mutexes, condition variables, reader/writer locks, and
semaphores.
When creating mutexes and reader/writer locks, the kernel requires that
one pass in the interrupt priority if the lock will be used in
interrupt context.
This is required so the kernel can determine the correct underlying type
of lock to use.
This ensures that if for some reason a mutex needs to be used in
high-level interrupt context, the kernel will use a spin lock, but
otherwise can use the standard adaptive mutex that might block.
For developers familiar with other operating systems, this is somewhat
different in that the consumer does not generally need to figure out
this level of detail, which is why no explicit spin lock interface is
present.
.Pp
In addition, condition variables provide _sig variants that wait and
also detect that a signal has been delivered.
These variants are particularly useful when writing character device
operations for device drivers as they allow users the chance to cancel an
operation and not be blocked indefinitely on something that may not
occur.
These _sig variants should generally be preferred where applicable.
.Pp
The kernel also provides memory barrier primitives.
See the
.Sx Memory Barriers
section for more information.
There is no need to use manual memory barriers when using the
synchronization primitives.
The synchronization primitives guarantee that the appropriate barriers are
present to maintain coherency while the lock is held.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr cv_broadcast 9F Ta Xr cv_destroy 9F
.It Xr cv_init 9F Ta Xr cv_reltimedwait_sig 9F
.It Xr cv_reltimedwait 9F Ta Xr cv_signal 9F
.It Xr cv_timedwait_sig 9F Ta Xr cv_timedwait 9F
.It Xr cv_wait_sig 9F Ta Xr cv_wait 9F
.It Xr ddi_enter_critical 9F Ta Xr ddi_exit_critical 9F
.It Xr mutex_destroy 9F Ta Xr mutex_enter 9F
.It Xr mutex_exit 9F Ta Xr mutex_init 9F
.It Xr mutex_owned 9F Ta Xr mutex_tryenter 9F
.It Xr rw_destroy 9F Ta Xr rw_downgrade 9F
.It Xr rw_enter 9F Ta Xr rw_exit 9F
.It Xr rw_init 9F Ta Xr rw_read_locked 9F
.It Xr rw_tryenter 9F Ta Xr rw_tryupgrade 9F
.It Xr sema_destroy 9F Ta Xr sema_init 9F
.It Xr sema_p_sig 9F Ta Xr sema_p 9F
.It Xr sema_tryp 9F Ta Xr sema_v 9F
.It Xr semaphore 9F Ta
.El
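.Pp
The two points above, initializing a mutex with the interrupt priority and
preferring a signal-aware wait, can be sketched together.
The
.Vt xx_t
structure and function names are hypothetical, and
.Fa intr_pri
is assumed to have been obtained from
.Xr ddi_intr_get_pri 9F .
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/errno.h>
#include <sys/ksynch.h>

/* Hypothetical per-instance state. */
typedef struct xx {
	kmutex_t	xx_lock;
	kcondvar_t	xx_cv;
	boolean_t	xx_ready;
} xx_t;

static void
xx_sync_init(xx_t *xx, uint_t intr_pri)
{
	/* The priority lets the kernel pick the right mutex type. */
	mutex_init(&xx->xx_lock, NULL, MUTEX_DRIVER,
	    DDI_INTR_PRI(intr_pri));
	cv_init(&xx->xx_cv, NULL, CV_DRIVER, NULL);
}

static int
xx_wait_ready(xx_t *xx)
{
	int ret = 0;

	mutex_enter(&xx->xx_lock);
	while (!xx->xx_ready) {
		/* cv_wait_sig() returns 0 when a signal arrived. */
		if (cv_wait_sig(&xx->xx_cv, &xx->xx_lock) == 0) {
			ret = EINTR;
			break;
		}
	}
	mutex_exit(&xx->xx_lock);

	return (ret);
}
.Ed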
.Ss Atomic Operations
This group of functions provides a general way to perform atomic
operations on integers of different sizes and explicit types.
The
.Xr atomic_ops 9F
manual page describes the different classes of functions in more detail,
but there are functions that take care of using the CPU's instructions
for addition, compare and swap, and more.
If data is being protected and only accessed under a synchronization
primitive such as a mutex or reader-writer lock, then there isn't a
reason to use an atomic operation for that data, generally speaking.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr atomic_add_8_nv 9F Ta Xr atomic_add_8 9F
.It Xr atomic_add_16_nv 9F Ta Xr atomic_add_16 9F
.It Xr atomic_add_32_nv 9F Ta Xr atomic_add_32 9F
.It Xr atomic_add_64_nv 9F Ta Xr atomic_add_64 9F
.It Xr atomic_add_char_nv 9F Ta Xr atomic_add_char 9F
.It Xr atomic_add_int_nv 9F Ta Xr atomic_add_int 9F
.It Xr atomic_add_long_nv 9F Ta Xr atomic_add_long 9F
.It Xr atomic_add_ptr_nv 9F Ta Xr atomic_add_ptr 9F
.It Xr atomic_add_short_nv 9F Ta Xr atomic_add_short 9F
.It Xr atomic_and_8_nv 9F Ta Xr atomic_and_8 9F
.It Xr atomic_and_16_nv 9F Ta Xr atomic_and_16 9F
.It Xr atomic_and_32_nv 9F Ta Xr atomic_and_32 9F
.It Xr atomic_and_64_nv 9F Ta Xr atomic_and_64 9F
.It Xr atomic_and_uchar_nv 9F Ta Xr atomic_and_uchar 9F
.It Xr atomic_and_uint_nv 9F Ta Xr atomic_and_uint 9F
.It Xr atomic_and_ulong_nv 9F Ta Xr atomic_and_ulong 9F
.It Xr atomic_and_ushort_nv 9F Ta Xr atomic_and_ushort 9F
.It Xr atomic_cas_16 9F Ta Xr atomic_cas_32 9F
.It Xr atomic_cas_64 9F Ta Xr atomic_cas_8 9F
.It Xr atomic_cas_ptr 9F Ta Xr atomic_cas_uchar 9F
.It Xr atomic_cas_uint 9F Ta Xr atomic_cas_ulong 9F
.It Xr atomic_cas_ushort 9F Ta Xr atomic_clear_long_excl 9F
.It Xr atomic_dec_8_nv 9F Ta Xr atomic_dec_8 9F
.It Xr atomic_dec_16_nv 9F Ta Xr atomic_dec_16 9F
.It Xr atomic_dec_32_nv 9F Ta Xr atomic_dec_32 9F
.It Xr atomic_dec_64_nv 9F Ta Xr atomic_dec_64 9F
.It Xr atomic_dec_ptr_nv 9F Ta Xr atomic_dec_ptr 9F
.It Xr atomic_dec_uchar_nv 9F Ta Xr atomic_dec_uchar 9F
.It Xr atomic_dec_uint_nv 9F Ta Xr atomic_dec_uint 9F
.It Xr atomic_dec_ulong_nv 9F Ta Xr atomic_dec_ulong 9F
.It Xr atomic_dec_ushort_nv 9F Ta Xr atomic_dec_ushort 9F
.It Xr atomic_inc_8_nv 9F Ta Xr atomic_inc_8 9F
.It Xr atomic_inc_16_nv 9F Ta Xr atomic_inc_16 9F
.It Xr atomic_inc_32_nv 9F Ta Xr atomic_inc_32 9F
.It Xr atomic_inc_64_nv 9F Ta Xr atomic_inc_64 9F
.It Xr atomic_inc_ptr_nv 9F Ta Xr atomic_inc_ptr 9F
.It Xr atomic_inc_uchar_nv 9F Ta Xr atomic_inc_uchar 9F
.It Xr atomic_inc_uint_nv 9F Ta Xr atomic_inc_uint 9F
.It Xr atomic_inc_ulong_nv 9F Ta Xr atomic_inc_ulong 9F
.It Xr atomic_inc_ushort_nv 9F Ta Xr atomic_inc_ushort 9F
.It Xr atomic_or_8_nv 9F Ta Xr atomic_or_8 9F
.It Xr atomic_or_16_nv 9F Ta Xr atomic_or_16 9F
.It Xr atomic_or_32_nv 9F Ta Xr atomic_or_32 9F
.It Xr atomic_or_64_nv 9F Ta Xr atomic_or_64 9F
.It Xr atomic_or_uchar_nv 9F Ta Xr atomic_or_uchar 9F
.It Xr atomic_or_uint_nv 9F Ta Xr atomic_or_uint 9F
.It Xr atomic_or_ulong_nv 9F Ta Xr atomic_or_ulong 9F
.It Xr atomic_or_ushort_nv 9F Ta Xr atomic_or_ushort 9F
.It Xr atomic_set_long_excl 9F Ta Xr atomic_swap_8 9F
.It Xr atomic_swap_16 9F Ta Xr atomic_swap_32 9F
.It Xr atomic_swap_64 9F Ta Xr atomic_swap_ptr 9F
.It Xr atomic_swap_uchar 9F Ta Xr atomic_swap_uint 9F
.It Xr atomic_swap_ulong 9F Ta Xr atomic_swap_ushort 9F
.El
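.Pp
A common use of these functions is a reference count that is not otherwise
protected by a lock.
The following is only a sketch; the counter and function names are
hypothetical.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/atomic.h>

static volatile uint32_t xx_refcnt;	/* hypothetical counter */

static void
xx_hold(void)
{
	atomic_inc_32(&xx_refcnt);
}

/* Returns B_TRUE when the caller dropped the last reference. */
static boolean_t
xx_rele(void)
{
	/* The _nv variants return the new value after the update. */
	return (atomic_dec_32_nv(&xx_refcnt) == 0 ? B_TRUE : B_FALSE);
}
.Ed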
.Ss Memory Barriers
The kernel provides general purpose memory barriers that can be used
when required.
In general, when using items described in the
.Sx Synchronization Primitives
section, these are not required.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr membar_consumer 9F Ta Xr membar_enter 9F
.It Xr membar_exit 9F Ta Xr membar_producer 9F
.El
.Ss Virtual Memory and Pages
All platforms that the operating system supports have some form of
virtual memory which is managed in units of pages.
The page size varies between architectures and platforms.
For example, the smallest x86 page size is 4 KiB while SPARC
traditionally used 8 KiB pages.
These functions can be used to convert between pages and bytes.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr btop 9F Ta Xr btopr 9F
.It Xr ddi_btop 9F Ta Xr ddi_btopr 9F
.It Xr ddi_ptob 9F Ta Xr ptob 9F
.El
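.Pp
For instance, rounding a byte length up to a whole number of pages could be
sketched as follows.
The
.Fn xx_round_to_pages
name is hypothetical.
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Round 'len' bytes up to a multiple of the system page size. */
static unsigned long
xx_round_to_pages(unsigned long len)
{
	/* btopr() rounds up to whole pages; ptob() converts back. */
	return (ptob(btopr(len)));
}
.Ed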
.Ss Module and Device Framework
These functions are used as part of implementing kernel modules and
registering device drivers with the various kernel frameworks.
There are also functions here that are suitable for use in the
.Xr dev_ops 9S ,
.Xr cb_ops 9S ,
etc.
structures and for interrogating module information.
.Pp
The
.Xr mod_install 9F
and
.Xr mod_remove 9F
functions are used during a driver's
.Xr _init 9E
and
.Xr _fini 9E
functions.
.Pp
There are two different ways that drivers often manage their instance
state, which is created during
.Xr attach 9E .
The first is the use of
.Xr ddi_set_driver_private 9F
and
.Xr ddi_get_driver_private 9F .
This stores a driver-specific value on the
.Vt dev_info_t
structure which allows it to be used during other operations.
Some device driver frameworks may use this themselves, making this
unavailable to the driver.
.Pp
The other path is to use the soft state suite of functions, which
dynamically grows to cover the number of instances of a device that
exist.
The soft state is generally initialized in the
.Xr _init 9E
entry point with
.Xr ddi_soft_state_init 9F
and then instances are allocated and freed during
.Xr attach 9E
and
.Xr detach 9E
with
.Xr ddi_soft_state_zalloc 9F
and
.Xr ddi_soft_state_free 9F ,
and then retrieved with
.Xr ddi_get_soft_state 9F .
762.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
763.It Xr ddi_get_driver_private 9F Ta Xr ddi_get_soft_state 9F
764.It Xr ddi_modclose 9F Ta Xr ddi_modopen 9F
765.It Xr ddi_modsym 9F Ta Xr ddi_no_info 9F
766.It Xr ddi_report_dev 9F Ta Xr ddi_set_driver_private 9F
767.It Xr ddi_soft_state_fini 9F Ta Xr ddi_soft_state_free 9F
768.It Xr ddi_soft_state_init 9F Ta Xr ddi_soft_state_zalloc 9F
769.It Xr mod_info 9F Ta Xr mod_install 9F
770.It Xr mod_modname 9F Ta Xr mod_remove 9F
771.It Xr nochpoll 9F Ta Xr nodev 9F
772.It Xr nulldev 9F Ta
773.El
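.Pp
The way that soft state grows to cover whichever instance numbers are in use can be pictured with the following userland sketch.
It is a simplified model under stated assumptions, not the kernel's implementation: the model_ss_* names are hypothetical, there is no locking, and the C library allocator stands in for kernel memory allocation.

```c
#include <stdlib.h>

/* Hypothetical table of per-instance state, indexed by instance number. */
typedef struct model_ss {
	size_t	ss_size;	/* size of one instance's state */
	size_t	ss_nitems;	/* number of slots currently present */
	void	**ss_items;	/* one pointer per instance number */
} model_ss_t;

void
model_ss_init(model_ss_t *ss, size_t size)
{
	ss->ss_size = size;
	ss->ss_nitems = 0;
	ss->ss_items = NULL;
}

/* Allocate zeroed state for an instance, growing the table if needed. */
int
model_ss_zalloc(model_ss_t *ss, size_t inst)
{
	if (inst >= ss->ss_nitems) {
		size_t n = inst + 1;
		void **items = realloc(ss->ss_items, n * sizeof (void *));

		if (items == NULL)
			return (-1);
		for (size_t i = ss->ss_nitems; i < n; i++)
			items[i] = NULL;
		ss->ss_items = items;
		ss->ss_nitems = n;
	}
	if (ss->ss_items[inst] != NULL)
		return (-1);	/* instance already has state */
	ss->ss_items[inst] = calloc(1, ss->ss_size);
	return (ss->ss_items[inst] != NULL ? 0 : -1);
}

/* Look up an instance's state; NULL if none has been allocated. */
void *
model_ss_get(const model_ss_t *ss, size_t inst)
{
	return (inst < ss->ss_nitems ? ss->ss_items[inst] : NULL);
}

/* Release an instance's state, leaving its slot empty. */
void
model_ss_free(model_ss_t *ss, size_t inst)
{
	if (inst < ss->ss_nitems) {
		free(ss->ss_items[inst]);
		ss->ss_items[inst] = NULL;
	}
}
```
.Pp
In this model, allocation corresponds to what a driver does during
.Xr attach 9E ,
lookup to what it does in its other entry points, and freeing to what it does during
.Xr detach 9E .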
774.Ss Device Tree Information
775Devices are organized into a tree that is partially seeded by the
776platform based on information discovered at boot and augmented with
777additional information at runtime.
778Every instance of a device driver is given a
779.Vt "dev_info_t *"
780.Pq device information
781data structure which corresponds to information about an instance and
782has a place in the tree.
When a driver requests an operation, such as allocating memory for DMA,
that request is passed up the tree and modified.
785The same is true for other things like interrupts, event notifications,
786or properties.
787.Pp
788There are many different informational properties about a device driver.
789For example,
790.Xr ddi_driver_name 9F
791returns the name of the device driver,
792.Xr ddi_get_name 9F
793returns the name of the node in the tree,
794.Xr ddi_get_parent 9F
795returns a node's parent, and
796.Xr ddi_get_instance 9F
797returns the instance number of a specific driver.
798.Pp
There are a series of properties that exist on the tree, the exact set
of which depends on the class of the device; they are often documented
in a specific device class's manual.
802For example, the
803.Dq reg
property is used for PCI and PCIe devices to describe the various base
address registers, their types, and related information, which is
documented in
806.Xr pci 5 .
807.Pp
When getting a property, one can constrain the lookup to the current
instance or ask for a parent to try to look up the property.
810Which mode is appropriate depends on the specific class of driver, its
811parent, and the property.
812.Pp
813Using a
814.Vt "dev_info_t *"
815pointer has to be done carefully.
816When a device driver is in any of its
817.Xr dev_ops 9S ,
818.Xr cb_ops 9S ,
819or similar callback functions that it has registered with the kernel,
820then it can always safely use its own
821.Vt "dev_info_t"
822and those of any parents it discovers through
823.Xr ddi_get_parent 9F .
824However, it cannot assume the validity of any siblings or children
825unless there are other circumstances that guarantee that they will not
826disappear.
827In the broader kernel, one should not assume that it is safe to use a
828given
829.Vt "dev_info_t *"
830structure without the appropriate NDI
831.Pq nexus driver interface
832hold having been applied.
833.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
834.It Xr ddi_binding_name 9F Ta Xr ddi_dev_is_sid 9F
835.It Xr ddi_driver_major 9F Ta Xr ddi_driver_name 9F
836.It Xr ddi_get_devstate 9F Ta Xr ddi_get_instance 9F
837.It Xr ddi_get_name 9F Ta Xr ddi_get_parent 9F
838.It Xr ddi_getlongprop_buf 9F Ta Xr ddi_getlongprop 9F
839.It Xr ddi_getprop 9F Ta Xr ddi_getproplen 9F
840.It Xr ddi_node_name 9F Ta Xr ddi_prop_create 9F
841.It Xr ddi_prop_exists 9F Ta Xr ddi_prop_free 9F
842.It Xr ddi_prop_get_int 9F Ta Xr ddi_prop_get_int64 9F
843.It Xr ddi_prop_lookup_byte_array 9F Ta Xr ddi_prop_lookup_int_array 9F
844.It Xr ddi_prop_lookup_int64_array 9F Ta Xr ddi_prop_lookup_string_array 9F
845.It Xr ddi_prop_lookup_string 9F Ta Xr ddi_prop_lookup 9F
846.It Xr ddi_prop_modify 9F Ta Xr ddi_prop_op 9F
847.It Xr ddi_prop_remove_all 9F Ta Xr ddi_prop_remove 9F
848.It Xr ddi_prop_undefine 9F Ta Xr ddi_prop_update_byte_array 9F
849.It Xr ddi_prop_update_int_array 9F Ta Xr ddi_prop_update_int 9F
850.It Xr ddi_prop_update_int64_array 9F Ta Xr ddi_prop_update_int64 9F
851.It Xr ddi_prop_update_string_array 9F Ta Xr ddi_prop_update_string 9F
852.It Xr ddi_prop_update 9F Ta Xr ddi_root_node 9F
853.It Xr ddi_slaveonly 9F Ta
854.El
855.Ss Copying Data to and from Userland
856The kernel operates in a different context from userland.
857One does not simply access user memory.
This is enforced either by the architecture's memory model, where user
address space isn't even present in the kernel's virtual address space,
or by architectural mechanisms such as Supervisor Mode Access Prevention
.Pq SMAP
on x86.
863.Pp
864To facilitate accessing memory, the kernel provides a few routines that
865can be used.
866In most contexts the main thing to use is
867.Xr ddi_copyin 9F
868and
869.Xr ddi_copyout 9F .
870These will safely dereference addresses and ensure that the address is
871appropriate depending on whether this is coming from the user or kernel.
When operating with the kernel's
.Vt uio_t
structure, which is mostly used when processing read and write
requests,
.Xr uiomove 9F
is the go-to function instead.
878.Pp
879When reading data from userland into the kernel, there is another
880concern: the data model.
881The most common place this comes up is in an
882.Xr ioctl 9E
883handler or other places where the kernel is operating on data that isn't
884fixed size.
885Particularly in C, though this applies to other languages, structures
886and unions vary in the size and alignment requirements between 32-bit
887and 64-bit processes.
888The same even applies if one uses pointers or the
889.Vt long ,
890.Vt size_t ,
891or similar types in C.
892In supported 32-bit and 64-bit environments these types are 4 and 8
893bytes respectively.
894To account for this, when data is not fixed size between all data
895models, the driver must look at the data model of the process it is
896copying data from.
897.Pp
898The simplest way to solve this problem is to try to make the data
899structure the same across the different models.
900It's not sufficient to just use the same structure definition and fixed
901size types as the alignment and padding between the two can vary.
902For example, the alignment of a 64-bit integer like a
903.Vt uint64_t
904can change between a 32-bit and 64-bit data model.
905One way to check for the data structures being identical is to leverage
906the
907.Xr ctfdiff 1
908program, generally with the
909.Fl I
910option.
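.Pp
One way to keep a structure identical across data models is to use fixed-width types and to spell out every piece of padding.
The sketch below uses a hypothetical structure,
.Vt model_stable_t ,
that is not part of any real interface.
The claim that its layout matches for 32-bit and 64-bit callers holds for common ABIs, but it is an assumption that should still be verified with a tool such as
.Xr ctfdiff 1 .

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Without ms_pad, a 32-bit x86 compiler may place ms_value at offset 4
 * while a 64-bit compiler places it at offset 8. Making the padding an
 * explicit member removes that freedom.
 */
typedef struct model_stable {
	uint32_t	ms_flags;
	uint32_t	ms_pad;		/* explicit padding, always zero */
	uint64_t	ms_value;
} model_stable_t;

/* Fail the build if the layout is not the one both sides agreed on. */
_Static_assert(offsetof(model_stable_t, ms_value) == 8, "ms_value moved");
_Static_assert(sizeof (model_stable_t) == 16, "unexpected size");
```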
911.Pp
912However, there are times when a structure simply can't be the same, such
913as when we're encoding a pointer into the structure or a type like the
914.Vt size_t .
915When this happens, the most natural way to accomplish this is to use the
916.Xr ddi_model_convert_from 9F
917function which can determine the appropriate model from the ioctl's
918arguments.
919This provides a natural way to copy a structure in and out in the
920appropriate data model and convert it at those points to the kernel's
921native form.
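.Pp
The shape of that conversion can be sketched in userland as follows.
This is an illustration under stated assumptions, not driver code: the
.Vt model_req_t
and
.Vt model_req32_t
structures and the
.Vt model_kind_t
enumeration are hypothetical.
In a real handler the data model would come from
.Xr ddi_model_convert_from 9F
and the raw bytes from
.Xr ddi_copyin 9F .

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument in the kernel's native form. */
typedef struct model_req {
	void	*mr_buf;
	size_t	mr_len;
} model_req_t;

/* The same argument as laid out by a 32-bit (ILP32) process. */
typedef struct model_req32 {
	uint32_t	mr_buf;		/* 32-bit user pointer */
	uint32_t	mr_len;		/* 32-bit size_t */
} model_req32_t;

/* Stand-in for the data model reported for the calling process. */
typedef enum { MODEL_NATIVE, MODEL_ILP32 } model_kind_t;

/*
 * Convert a freshly copied-in argument to native form. Returns 0 on
 * success and -1 if the data model is not recognised.
 */
int
model_req_to_native(model_kind_t kind, const void *raw, model_req_t *out)
{
	switch (kind) {
	case MODEL_ILP32: {
		const model_req32_t *r32 = raw;

		/* Widen each member explicitly at the boundary. */
		out->mr_buf = (void *)(uintptr_t)r32->mr_buf;
		out->mr_len = r32->mr_len;
		return (0);
	}
	case MODEL_NATIVE:
		*out = *(const model_req_t *)raw;
		return (0);
	default:
		return (-1);
	}
}
```
.Pp
The reverse conversion on the way out is where limits matter: a native length or pointer that does not fit in 32 bits has to be rejected rather than truncated.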
922.Pp
An alternate way to approach the data model is to use the
.Xr STRUCT_DECL 9F
functions.
However, as this requires wrapping every access to every member, the
.Xr ddi_model_convert_from 9F
approach, combined with taking care to convert values and ensure that
limits aren't exceeded at the end, is often preferred.
930.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
931.It Xr bp_copyin 9F Ta Xr bp_copyout 9F
932.It Xr copyin 9F Ta Xr copyout 9F
933.It Xr ddi_copyin 9F Ta Xr ddi_copyout 9F
934.It Xr ddi_model_convert_from 9F Ta Xr SIZEOF_PTR 9F
935.It Xr SIZEOF_STRUCT 9F Ta Xr STRUCT_BUF 9F
936.It Xr STRUCT_DECL 9F Ta Xr STRUCT_FADDR 9F
937.It Xr STRUCT_FGET 9F Ta Xr STRUCT_FGETP 9F
938.It Xr STRUCT_FSET 9F Ta Xr STRUCT_FSETP 9F
939.It Xr STRUCT_HANDLE 9F Ta Xr STRUCT_INIT 9F
940.It Xr STRUCT_SET_HANDLE 9F Ta Xr STRUCT_SIZE 9F
941.It Xr uiomove 9F Ta Xr ureadc 9F
942.It Xr uwritec 9F Ta
943.El
944.Ss Device Register Setup and Access
945The kernel abstracts out accessing registers on a device on behalf of
946drivers.
947This allows a similar set of interfaces to be used whether the registers
948are found within a PCI BAR, utilizing I/O ports, memory mapped
949registers, or some other scheme.
950Devices with registers all have a
.Dq reg
952property that is set up by their parent device, generally a kernel
953framework as is the case for PCIe devices, and the meaning is a contract
954between the two.
Register sets are identified by a numeric ID, which varies based on the
device type.
957For example, the first BAR of a PCI device is defined as register set 1.
958On the other hand, the AMD GPIO controller might have three register sets
959because of how the hardware design splits them up.
960The meaning of the registers and their semantics is still
961device-specific.
The kernel doesn't know how to interpret the actual registers of, say, a
PCIe device, just that they exist.
964.Pp
965To begin with register setup, one often first looks at the number of
966register sets that exist and their size.
967Most PCI-based device drivers will skip calling
968.Xr ddi_dev_nregs 9F
969and will just move straight to calling
970.Xr ddi_dev_regsize 9F
971to determine the size of a register set that they are interested in.
972To actually map the registers, a device driver will call
973.Xr ddi_regs_map_setup 9F
974which requires both a register set and a series of attributes and
975returns an access handle that is used to actually read and write the
976registers.
977When setting up registers, one must have a corresponding
978.Vt ddi_device_acc_attr_t
979structure which is used to define what endianness the register set is
980in, whether any kind of reordering is allowed
981.Po
982if in doubt specify
983.Dv DDI_STRICTORDER_ACC
984.Pc ,
985and whether any particular error handling is being used.
986The structure and all of its different options are described in
987.Xr ddi_device_acc_attr 9S .
988.Pp
989Once a register handle is obtained, then it's easy to read and write the
990register space.
991Functions are organized based on the size of the access.
Most situations call for the use of the
993.Xr ddi_get8 9F ,
994.Xr ddi_get16 9F ,
995.Xr ddi_get32 9F ,
996and
997.Xr ddi_get64 9F
998functions to read a register and the
999.Xr ddi_put8 9F ,
1000.Xr ddi_put16 9F ,
1001.Xr ddi_put32 9F ,
1002and
1003.Xr ddi_put64 9F
1004functions to set a register value.
While there are the ddi_io_ and ddi_mem_ families of functions below,
these are rarely needed and are mostly present for compatibility.
1008The kernel will automatically perform the appropriate type of register
1009read for the device type in question.
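.Pp
The effect of declaring a register set's endianness once and then using sized accessors can be pictured with the following userland model.
It is only a sketch: the
.Vt model_regs_t
type and the model_get32 and model_put32 functions are hypothetical stand-ins that operate on ordinary memory, and they are not the DDI access functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the byte order a driver would declare. */
typedef enum { MODEL_LE, MODEL_BE } model_endian_t;

/* A fake "register set": plain memory plus its declared byte order. */
typedef struct model_regs {
	uint8_t		*mr_base;
	model_endian_t	mr_endian;
} model_regs_t;

/* Read a 32-bit register at a byte offset in the declared byte order. */
uint32_t
model_get32(const model_regs_t *regs, size_t off)
{
	const uint8_t *p = regs->mr_base + off;

	if (regs->mr_endian == MODEL_LE) {
		return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
		    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
	}
	return ((uint32_t)p[3] | (uint32_t)p[2] << 8 |
	    (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24);
}

/* Write a 32-bit register at a byte offset in the declared byte order. */
void
model_put32(model_regs_t *regs, size_t off, uint32_t val)
{
	uint8_t *p = regs->mr_base + off;
	int le = (regs->mr_endian == MODEL_LE);

	p[le ? 0 : 3] = (uint8_t)(val & 0xff);
	p[le ? 1 : 2] = (uint8_t)((val >> 8) & 0xff);
	p[le ? 2 : 1] = (uint8_t)((val >> 16) & 0xff);
	p[le ? 3 : 0] = (uint8_t)(val >> 24);
}
```
.Pp
Because the byte order lives with the handle rather than with each call site, the rest of such a driver can work purely in host-order values.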
1010.Pp
1011Once a register set is no longer being used, the
1012.Xr ddi_regs_map_free 9F
1013function should be used to release resources.
1014In most cases, this happens while executing the
1015.Xr detach 9E
1016entry point.
1017.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1018.It Xr ddi_dev_nregs 9F Ta Xr ddi_dev_regsize 9F
1019.It Xr ddi_device_copy 9F Ta Xr ddi_device_zero 9F
1020.It Xr ddi_regs_map_free 9F Ta Xr ddi_regs_map_setup 9F
1021.It Xr ddi_get8 9F Ta Xr ddi_get16 9F
1022.It Xr ddi_get32 9F Ta Xr ddi_get64 9F
1023.It Xr ddi_io_get8 9F Ta Xr ddi_io_get16 9F
1024.It Xr ddi_io_get32 9F Ta Xr ddi_io_put8 9F
1025.It Xr ddi_io_put16 9F Ta Xr ddi_io_put32 9F
1026.It Xr ddi_io_rep_get8 9F Ta Xr ddi_io_rep_get16 9F
1027.It Xr ddi_io_rep_get32 9F Ta Xr ddi_io_rep_put8 9F
1028.It Xr ddi_io_rep_put16 9F Ta Xr ddi_io_rep_put32 9F
1029.It Xr ddi_map_regs 9F Ta Xr ddi_mem_get8 9F
1030.It Xr ddi_mem_get16 9F Ta Xr ddi_mem_get32 9F
1031.It Xr ddi_mem_get64 9F Ta Xr ddi_mem_put8 9F
1032.It Xr ddi_mem_put16 9F Ta Xr ddi_mem_put32 9F
1033.It Xr ddi_mem_put64 9F Ta Xr ddi_mem_rep_get8 9F
1034.It Xr ddi_mem_rep_get16 9F Ta Xr ddi_mem_rep_get32 9F
1035.It Xr ddi_mem_rep_get64 9F Ta Xr ddi_mem_rep_put8 9F
1036.It Xr ddi_mem_rep_put16 9F Ta Xr ddi_mem_rep_put32 9F
1037.It Xr ddi_mem_rep_put64 9F Ta Xr ddi_peek8 9F
1038.It Xr ddi_peek16 9F Ta Xr ddi_peek32 9F
1039.It Xr ddi_peek64 9F Ta Xr ddi_poke8 9F
1040.It Xr ddi_poke16 9F Ta Xr ddi_poke32 9F
1041.It Xr ddi_poke64 9F Ta Xr ddi_put8 9F
1042.It Xr ddi_put16 9F Ta Xr ddi_put32 9F
1043.It Xr ddi_put64 9F Ta Xr ddi_rep_get8 9F
1044.It Xr ddi_rep_get16 9F Ta Xr ddi_rep_get32 9F
1045.It Xr ddi_rep_get64 9F Ta Xr ddi_rep_put8 9F
1046.It Xr ddi_rep_put16 9F Ta Xr ddi_rep_put32 9F
1047.It Xr ddi_rep_put64 9F Ta
1048.El
1049.Ss DMA Related Functions
1050Most high-performance devices provide first-class support for DMA
1051.Pq direct memory access .
1052DMA allows a transfer between a device and memory to occur
1053asynchronously and generally without a thread's specific involvement.
1054Today, most DMA is provided directly by devices and the corresponding
1055device scheme.
1056Take PCI and PCI Express for example.
The idea of DMA is built into the PCIe standard, so basic support for it
exists and there isn't a lot of special programming required.
However, this hasn't always been true, and there are still some cases
where there is a 3rd party DMA engine.
1062If we consider the PCIe example, the PCIe device directly performs reads
1063and writes to main memory on its own.
1064However, in the 3rd party case, there is a distinct controller that is
1065neither the device nor memory that facilitates this, which is called a
1066DMA engine.
For the most part, DMA engines are not something that needs to be thought
about on the platforms that illumos is present on; however, they still
exist in some embedded and related contexts.
1070.Pp
1071The first thing that a driver needs to do to set up DMA is to understand
1072the constraints of the device and bus.
1073These constraints are described in a series of attributes in the
1074.Vt ddi_dma_attr_t
1075structure which is defined in
1076.Xr ddi_dma_attr 9S .
1077The reason that attributes exist is because different devices, and
1078sometimes different memory uses with a device, have different
1079requirements for memory.
A simple example of this is that not all devices can accept memory
addresses that are 64 bits wide; such devices may have to be constrained
to memory that is addressable with 32 bits.
Another common constraint is how this memory is chunked up.
Some devices may require that all of the DMA memory be contiguous, while
others can allow that to be broken up into, say, up to 4 or 8 different
regions.
1087.Pp
1088When memory is allocated for DMA it isn't immediately mapped into the
1089kernel's address space.
1090The addresses that describe a DMA address are defined in a DMA cookie,
1091several of which may make up a request.
1092However, those addresses are always physical addresses or addresses that
1093are virtualized by an IOMMU.
There are some cases where the kernel or a driver needs to be able to
1095access that memory, such as memory that represents a networking packet.
1096The IP stack will expect to be able to actually read the data it's
1097given.
1098.Pp
1099To begin with allocating DMA memory, a driver first fills out its
1100attribute structure.
1101Once that's ready, the DMA allocation process can begin.
1102This starts off by a driver calling
1103.Xr ddi_dma_alloc_handle 9F .
1104This handle is used through the lifetime of a given DMA memory buffer,
1105but it can be used across multiple operations that a device or the
1106kernel may perform.
1107The next step is to actually request that the kernel allocate some
1108amount of memory in the kernel for this DMA request.
1109This phase actually allocates addresses in virtual address space for the
1110activity and also requires a register attribute object that is discussed
1111in
1112.Sx Device Register Setup and Access .
1113Armed with this a driver can now call
1114.Xr ddi_dma_mem_alloc 9F
to specify how much memory it is looking for.
1116If this is successful, a virtual address, the actual length of the
1117region, and an access handle will be returned.
1118.Pp
1119At this point, the virtual address region is present.
1120Most drivers will access this virtual address range directly and will
1121ignore the register access handle.
1122The side effect of this is that they will handle all endianness issues
1123with the memory region themselves.
1124If the driver would prefer to go through the handle, then it can use the
1125register access functions discussed earlier.
1126.Pp
1127Before the memory can be programmed into the device, it must be bound to
1128a series of physical addresses or addresses virtualized by an IOMMU.
1129While the kernel presents the illusion of a single consistent virtual
1130address range for applications, the physical reality can be quite
1131different.
1132When the driver is ready it calls
1133.Xr ddi_dma_addr_bind_handle 9F
1134to create the mapping to well known physical addresses.
1135.Pp
1136These addresses are stored in a series of cookies.
1137A driver can determine the number of cookies for a given request by
1138utilizing its DMA handle and calling
1139.Xr ddi_dma_ncookies 9F
1140and then pairing that with
1141.Xr ddi_dma_cookie_get 9F .
1142These DMA cookies will not change and can be used time and time again
1143until
1144.Xr ddi_dma_unbind_handle 9F
1145is called.
1146With this information in hand, a physical device can be programmed with
1147these addresses and let loose to perform I/O.
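.Pp
The relationship between a bound buffer, its cookies, and a device's limit on the number of regions can be pictured with the following userland model.
It is a sketch under stated assumptions and not how the kernel builds cookies: the
.Vt model_cookie_t
type, the 4 KiB page size, and the model_build_cookies function are all hypothetical.
The
.Fa max
argument plays the role that a scatter/gather limit in the DMA attributes would play.

```c
#include <stddef.h>
#include <stdint.h>

#define	MODEL_PAGESIZE	4096u	/* assumed page size for this model */

/* Hypothetical cookie: one contiguous region as the device sees it. */
typedef struct model_cookie {
	uint64_t	mc_addr;
	size_t		mc_size;
} model_cookie_t;

/*
 * Merge a list of page addresses into cookies, coalescing pages that are
 * adjacent. Returns the number of cookies used, or 0 if the buffer cannot
 * be described within the limit of max cookies.
 */
size_t
model_build_cookies(const uint64_t *pages, size_t npages,
    model_cookie_t *cookies, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < npages; i++) {
		if (n > 0 && cookies[n - 1].mc_addr +
		    cookies[n - 1].mc_size == pages[i]) {
			/* Extends the previous region; no new cookie. */
			cookies[n - 1].mc_size += MODEL_PAGESIZE;
			continue;
		}
		if (n == max)
			return (0);
		cookies[n].mc_addr = pages[i];
		cookies[n].mc_size = MODEL_PAGESIZE;
		n++;
	}
	return (n);
}
```
.Pp
A device that demands fully contiguous memory corresponds to a limit of one cookie in this model, which is why such a request can fail where a more permissive device would succeed.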
1148.Pp
1149When performing I/O to and from a device, synchronization is a vitally
1150important thing which ensures that the actual state in memory is
1151coherent with the rest of the CPU's internal structures such as caches.
In general, a given DMA request is only going in one direction: for the
device or for the local CPU.
In either case, the
.Xr ddi_dma_sync 9F
function must be called: either after the kernel is done writing to a
region of DMA memory and before it triggers the device, or after the
device has indicated that some activity has completed and before the
kernel examines the result.
1160.Pp
1161Some DMA operations utilize what are called DMA windows.
1162The most common consumer is something like a disk device where DMA
1163operations to a given series of sectors can be split up into different
1164chunks where as long as all the transfers are performed, the
1165intermediate states are acceptable.
1166Put another way, because of how SCSI and SAS commands are designed,
1167block devices can basically take a given I/O request and break it into
1168multiple independent I/Os that will equate to the same final item.
1169.Pp
1170When a device supports this mode of operation and it is opted into, then
1171a DMA allocation may result in the use of DMA windows.
1172This allows for cases where the kernel can't perform a DMA allocation
1173for the entire request, but instead can allocate a partial region and
1174then walk through each part one at a time.
This is uncommon outside of block devices and is usually associated with
calling
1177.Xr ddi_dma_buf_bind_handle 9F .
1178.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1179.It Xr ddi_dma_addr_bind_handle 9F Ta Xr ddi_dma_alloc_handle 9F
1180.It Xr ddi_dma_buf_bind_handle 9F Ta Xr ddi_dma_burstsizes 9F
1181.It Xr ddi_dma_cookie_get 9F Ta Xr ddi_dma_cookie_iter 9F
1182.It Xr ddi_dma_cookie_one 9F Ta Xr ddi_dma_free_handle 9F
1183.It Xr ddi_dma_getwin 9F Ta Xr ddi_dma_mem_alloc 9F
1184.It Xr ddi_dma_mem_free 9F Ta Xr ddi_dma_ncookies 9F
1185.It Xr ddi_dma_nextcookie 9F Ta Xr ddi_dma_numwin 9F
1186.It Xr ddi_dma_set_sbus64 9F Ta Xr ddi_dma_sync 9F
1187.It Xr ddi_dma_unbind_handle 9F Ta Xr ddi_dmae_1stparty 9F
1188.It Xr ddi_dmae_alloc 9F Ta Xr ddi_dmae_disable 9F
1189.It Xr ddi_dmae_enable 9F Ta Xr ddi_dmae_getattr 9F
1190.It Xr ddi_dmae_getcnt 9F Ta Xr ddi_dmae_prog 9F
1191.It Xr ddi_dmae_release 9F Ta Xr ddi_dmae_stop 9F
1192.It Xr ddi_dmae 9F Ta
1193.El
1194.Ss Interrupt Handler Related Functions
1195Interrupts are a central part of the role of device drivers and one of
1196the things that's important to get right.
1197Interrupts come in different types: fixed, MSI, and MSI-X.
1198The kinds that are available depend on the device and the rest of the
1199system.
1200For example, MSI and MSI-X interrupts are generally specific to PCI and
1201PCI Express devices.
1202To begin the interrupt allocation process, the first thing a driver
1203needs to do is to discover what type of interrupts it supports with
1204.Xr ddi_intr_get_supported_types 9F .
1205Then, the driver should work through the supported types, preferring
1206MSI-X, then MSI, and finally fixed interrupts, and try to allocate
1207interrupts.
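.Pp
The fallback order described above can be sketched as a small userland model.
It is illustrative only: the
.Dv MODEL_INTR_*
flag values, the
.Vt model_alloc_f
callback type, and the function names are hypothetical, and the callback stands in for the whole allocation sequence that a real driver performs for a given type.

```c
#include <stddef.h>

/* Hypothetical bit flags standing in for the supported-types mask. */
#define	MODEL_INTR_FIXED	0x1
#define	MODEL_INTR_MSI		0x2
#define	MODEL_INTR_MSIX		0x4

/* Hypothetical allocation hook: returns 0 on success, -1 on failure. */
typedef int (*model_alloc_f)(int type, void *arg);

/*
 * Try each supported type in order of preference and return the first
 * type whose allocation succeeds, or 0 if none could be set up.
 */
int
model_setup_intrs(int supported, model_alloc_f alloc, void *arg)
{
	static const int order[] = {
		MODEL_INTR_MSIX, MODEL_INTR_MSI, MODEL_INTR_FIXED
	};

	for (size_t i = 0; i < sizeof (order) / sizeof (order[0]); i++) {
		if ((supported & order[i]) == 0)
			continue;
		if (alloc(order[i], arg) == 0)
			return (order[i]);
	}
	return (0);
}

/* Example hook for this sketch: pretend MSI-X allocation always fails. */
int
model_alloc_no_msix(int type, void *arg)
{
	(void) arg;
	return (type == MODEL_INTR_MSIX ? -1 : 0);
}
```
.Pp
Falling through to the next type on failure, rather than giving up, is what lets the same driver work on systems where the preferred interrupt type is advertised but cannot actually be allocated.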
1208.Pp
Drivers first need to know how many interrupts they require.
1210For example, a networking driver may want to have an interrupt made
1211available for each ring that it has.
1212To discover the number of interrupts available, the driver should call
1213.Xr ddi_intr_get_navail 9F .
1214If there are sufficient interrupts, it can proceed to actually
1215allocate the interrupts with
1216.Xr ddi_intr_alloc 9F .
1217When allocating interrupts, callers need to check to see how many
1218interrupts the system actually gave them.
Just because an interrupt is allocated does not mean that it will fire
or be ready to use; there are a series of additional steps that the
driver must take.
1222.Pp
To enable the interrupt, the driver should first
get the interrupt capabilities with
1225.Xr ddi_intr_get_cap 9F
1226and the priority of the interrupt with
1227.Xr ddi_intr_get_pri 9F .
1228The priority must be used while creating mutexes and related
1229synchronization primitives that will be used during the interrupt
1230handler.
1231At this point, the driver can go ahead and register the functions that
1232will be called with each allocated interrupt with the
1233.Xr ddi_intr_add_handler 9F
1234function.
1235The arguments can vary for each allocated interrupt.
1236It is common to have an interrupt-specific data structure passed in one
1237of the arguments or an interrupt number, while the other argument is
1238generally the driver's instance-specific data structure.
1239.Pp
1240At this point, the last step for the interrupt to be made active from
1241the kernel's perspective is to enable it.
1242This will use either the
1243.Xr ddi_intr_block_enable 9F
1244or
1245.Xr ddi_intr_enable 9F
1246functions depending on the interrupt's capabilities.
1247The reason that these are different is because some interrupt types
1248.Pq MSI
1249require that all interrupts in a group be enabled and disabled at the
1250same time.
1251This is indicated with the
1252.Dv DDI_INTR_FLAG_BLOCK
1253flag found in the interrupt's capabilities.
1254Once that is called, interrupts that are generated by a device will be
1255delivered to the registered function.
1256.Pp
1257It's important to note that there is often device-specific interrupt
1258setup that is required.
1259While the kernel takes care of updating any pieces of the processor's
1260interrupt controller, I/O crossbar, or the PCI MSI and MSI-X
1261capabilities, many devices have device-specific registers that are used
1262to manage, set up, and acknowledge interrupts.
1263These registers or other controls are often capable of separately
1264masking interrupts and are generally what should be used if there are
1265times that you need to separately enable or disable interrupts such as
1266to poll an I/O ring.
1267.Pp
1268When unwinding interrupts, one needs to work in the reverse order here.
1269Until
1270.Xr ddi_intr_block_disable 9F
1271or
1272.Xr ddi_intr_disable 9F
1273is called, one should assume that their interrupt handler will be
1274called.
1275Due to cases where an interrupt is shared between multiple devices, this
1276can happen even if the device is quiesced!
1277Only after that is done is it safe to then free the interrupts with a
1278call to
1279.Xr ddi_intr_free 9F .
1280.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1281.It Xr ddi_add_intr 9F Ta Xr ddi_add_softintr 9F
1282.It Xr ddi_get_iblock_cookie 9F Ta Xr ddi_get_soft_iblock_cookie 9F
1283.It Xr ddi_intr_add_handler 9F Ta Xr ddi_intr_add_softint 9F
1284.It Xr ddi_intr_alloc 9F Ta Xr ddi_intr_block_disable 9F
1285.It Xr ddi_intr_block_enable 9F Ta Xr ddi_intr_clr_mask 9F
1286.It Xr ddi_intr_disable 9F Ta Xr ddi_intr_dup_handler 9F
1287.It Xr ddi_intr_enable 9F Ta Xr ddi_intr_free 9F
1288.It Xr ddi_intr_get_cap 9F Ta Xr ddi_intr_get_hilevel_pri 9F
1289.It Xr ddi_intr_get_navail 9F Ta Xr ddi_intr_get_nintrs 9F
1290.It Xr ddi_intr_get_pending 9F Ta Xr ddi_intr_get_pri 9F
1291.It Xr ddi_intr_get_softint_pri 9F Ta Xr ddi_intr_get_supported_types 9F
1292.It Xr ddi_intr_hilevel 9F Ta Xr ddi_intr_remove_handler 9F
1293.It Xr ddi_intr_remove_softint 9F Ta Xr ddi_intr_set_cap 9F
1294.It Xr ddi_intr_set_mask 9F Ta Xr ddi_intr_set_nreq 9F
1295.It Xr ddi_intr_set_pri 9F Ta Xr ddi_intr_set_softint_pri 9F
1296.It Xr ddi_intr_trigger_softint 9F Ta Xr ddi_remove_intr 9F
1297.It Xr ddi_remove_softintr 9F Ta Xr ddi_trigger_softintr 9F
1298.El
1299.Ss Minor Nodes
1300For a device driver to be accessed by a program in user space
.Pq or with the kernel layered device interface ,
it must create a minor node.
1303Minor nodes are created under
1304.Pa /devices
1305.Pq Xr devfs 4FS
1306and are tied to the instance of a device driver via its
1307.Vt dev_info_t .
1308The
1309.Xr devfsadm 8
1310daemon and the
1311.Pa /dev
1312file system
1313.Po
1314sdev,
1315.Xr dev 4FS
1316.Pc
1317are responsible for creating a coherent set of names that user programs
1318access.
1319Drivers create these minor nodes using the
1320.Xr ddi_create_minor_node 9F
1321function listed below.
1322.Pp
1323In UNIX tradition, character, block, and STREAMS device special files
1324are identified by a major and minor number.
1325All instances of a given driver share the same major number, which means
1326that a device driver must coordinate the minor number space across
1327.Em all
1328instances.
1329While a minor node is created with a fixed minor number, it is possible
1330to change the minor number while processing an
1331.Xr open 9E
1332call, allowing subsequent character device operations to uniquely
1333identify a particular caller.
1334This is usually referred to as a driver that
1335.Dq clones .
1336.Pp
1337When drivers aren't performing cloning, then usually the minor number
1338used when creating the minor node is some fixed offset or multiple of
1339the driver's instance number.
When a driver clones and needs to allocate and manage a minor number
space, an ID space is usually leveraged whose IDs are in the
range from 0 through
1343.Dv MAXMIN32 .
1344There are several different strategies for tracking data structures as
1345they relate to minor numbers.
1346Sometimes, the soft state functionality is used.
1347Others might keep an AVL tree around or tie the data to some other data
1348structure.
The method chosen often depends on the specifics of the implementation
1350and its broader context.
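.Pp
For the non-cloning case, deriving minor numbers from the instance number can be sketched as follows.
This is a userland illustration only: the
.Dv MODEL_MINORS_PER_INST
constant and the function names are hypothetical, and the choice of 16 minors per instance is arbitrary.

```c
/* Hypothetical: each instance reserves this many minor numbers. */
#define	MODEL_MINORS_PER_INST	16u

/* Build a minor number from an instance and a per-instance unit. */
unsigned int
model_minor(unsigned int inst, unsigned int unit)
{
	return (inst * MODEL_MINORS_PER_INST + unit);
}

/* Recover the instance number that a minor number belongs to. */
unsigned int
model_minor_to_inst(unsigned int minor)
{
	return (minor / MODEL_MINORS_PER_INST);
}

/* Recover the per-instance unit from a minor number. */
unsigned int
model_minor_to_unit(unsigned int minor)
{
	return (minor % MODEL_MINORS_PER_INST);
}
```
.Pp
With a scheme like this, a character device entry point can take the minor number it is handed and find both the driver instance and the unit within it without any additional lookup table.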
1351.Pp
1352The
1353.Vt dev_t
type represents the combined major and minor number.
1355It can be taken apart with the
1356.Xr getmajor 9F
1357and
1358.Xr getminor 9F
1359functions and then reconstructed with the
1360.Xr makedevice 9F
1361function.
1362.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1363.It Xr ddi_create_minor_node 9F Ta Xr ddi_remove_minor_node 9F
1364.It Xr getmajor 9F Ta Xr getminor 9F
1365.It Xr devfs_clean 9F Ta Xr makedevice 9F
1366.El
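.Pp
The idea of packing a major and a minor number into one value can be pictured with the following userland model.
It is a sketch, not the kernel's encoding: the
.Vt model_dev_t
type and the model_* functions are hypothetical, and the 18-bit minor field is an assumption chosen purely for illustration.
Drivers should always use the functions listed above rather than relying on any particular layout.

```c
#include <stdint.h>

#define	MODEL_MINOR_BITS	18	/* assumed width of the minor field */
#define	MODEL_MINOR_MASK	((1u << MODEL_MINOR_BITS) - 1)

/* Hypothetical combined device number for this model. */
typedef uint32_t model_dev_t;

/* Combine a major and a minor number into one value. */
model_dev_t
model_makedevice(uint32_t major, uint32_t minor)
{
	return ((major << MODEL_MINOR_BITS) | (minor & MODEL_MINOR_MASK));
}

/* Extract the major number. */
uint32_t
model_getmajor(model_dev_t dev)
{
	return (dev >> MODEL_MINOR_BITS);
}

/* Extract the minor number. */
uint32_t
model_getminor(model_dev_t dev)
{
	return (dev & MODEL_MINOR_MASK);
}
```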
1367.Ss Accessing Time, Delays, and Periodic Events
1368The kernel provides a number of ways to understand time in the system.
1369In particular it provides a few different clocks and time measurements:
1370.Bl -tag -width Ds
1371.It High-resolution monotonic time
1372The kernel provides access to a high-resolution monotonic clock that is
1373tracked in nanoseconds.
1374This clock is perfect for measuring durations and is accessed via
1375.Xr gethrtime 9F .
1376Unlike the real-time clock, this clock is not subject to adjustments by
1377a time synchronization daemon and is the preferred clock that drivers
1378should be using for tracking events.
1379The high-resolution clock is consistent across CPUs, meaning that you
1380may call
1381.Xr gethrtime 9F
1382on one CPU and the value will be consistent with what is returned, even
1383if a thread is migrated to another CPU.
1384.Pp
The high-resolution clock is implemented using architecture- and
platform-specific means.
1387For example, on x86 it is generally backed by the TSC
1388.Pq time stamp counter .
1389.It Real-time
1390The real-time clock tracks time as humans perceive it.
1391This clock is accessed using
1392.Xr ddi_get_time 9F .
1393If the system is running a time synchronization daemon that leverages
1394the network time protocol, then this time may be in sync with other
1395systems
1396.Pq subject to some amount of variance ;
1397however, it is critical that this is not assumed.
1398.Pp
1399In general, this time should not be used by drivers for any purpose.
1400It can jump around, drift, and most aspects in the kernel are not based
1401on the real-time clock.
1402For any device timing activities, the high-resolution clock should be
1403used.
1404.It Tick-based monotonic time
1405The kernel has a running periodic function that fires based on the rate
1406dictated by the
1407.Va hz
variable, generally operating at 100 or 1000 Hz.
1409The current number of ticks since boot is accessible through the
1410.Xr ddi_get_lbolt 9F
1411function.
1412When functions operate in units of ticks, this is what they are
1413tracking.
1414This value can be converted to and from microseconds using the
1415.Xr drv_usectohz 9F
1416and
1417.Xr drv_hztousec 9F
1418functions.
1419.Pp
1420In general, drivers should prefer the high-resolution monotonic clock
1421for tracking events internally.
1422.El
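.Pp
The relationship between ticks and microseconds can be pictured with the following userland sketch.
It is a model rather than the kernel's code: the function names are hypothetical, the tick rate is passed in explicitly instead of being read from
.Va hz ,
and rounding up when converting to ticks is an assumption of this model, made so that a requested delay is never shortened.

```c
#include <stdint.h>

/* Microseconds to ticks at a given tick rate, rounding up. */
int64_t
model_usectohz(int64_t usec, int64_t hz)
{
	int64_t usec_per_tick = 1000000 / hz;

	return ((usec + usec_per_tick - 1) / usec_per_tick);
}

/* Ticks to microseconds at a given tick rate. */
int64_t
model_hztousec(int64_t ticks, int64_t hz)
{
	return (ticks * (1000000 / hz));
}
```
.Pp
At a tick rate of 100, one tick is 10 milliseconds, so anything timed in ticks has at best that granularity; this is one reason the high-resolution clock is preferred for measuring durations.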
1423.Pp
1424With these different timing mechanisms, the kernel provides a few
1425different ways to delay execution or to get a callback after some
1426amount of time passes.
1427.Pp
1428The
1429.Xr delay 9F
1430and
1431.Xr drv_usecwait 9F
1432functions are used to block the execution of the current thread.
1433.Xr delay 9F
can be used in conditions where sleeping and blocking are allowed, whereas
1436.Xr drv_usecwait 9F
1437is a busy-wait, which is appropriate for some device drivers,
1438particularly when in high-level interrupt context.
1439.Pp
1440The kernel also allows a function to be called after some time has
1441elapsed.
1442This callback occurs on a different thread and will be executed in
1443.Sy kernel
1444context.
1445A timeout can be scheduled in the future with the
1446.Xr timeout 9F
1447function and cancelled with the
1448.Xr untimeout 9F
1449function.
There is also a STREAMS-specific version,
.Xr qtimeout 9F ,
that can be used when circumstances require it.
1454.Pp
1455These are all considered one-shot events.
1456That is, they will only happen once after being scheduled.
1457If instead, a driver requires periodic behavior, such as needing
1458something to occur every second, then it should use the
1459.Xr ddi_periodic_add 9F
1460function to establish that.
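.Pp
As a minimal sketch of how these pieces fit together, the following
fragment sleeps briefly, measures the elapsed time, schedules a one-shot
callback, and establishes a periodic one.
The
.Vt xx_state_t
structure and every function prefixed with xx_ are hypothetical names
invented for illustration; they are not part of the system.
.Bd -literal
#include <sys/types.h>
#include <sys/time.h>
#include <sys/conf.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical per-instance state for an imaginary "xx" driver. */
typedef struct xx_state {
	timeout_id_t	xx_tid;
	ddi_periodic_t	xx_periodic;
	hrtime_t	xx_last_sleep;	/* nanoseconds */
} xx_state_t;

static void
xx_oneshot(void *arg)
{
	/* Runs once, in kernel context, roughly 500 ms after scheduling. */
}

static void
xx_every_second(void *arg)
{
	/* Runs about once per second until the periodic is deleted. */
}

static void
xx_timers_start(xx_state_t *xx)
{
	hrtime_t start = gethrtime();

	/* Sleep for roughly 10 ms; only legal where blocking is allowed. */
	delay(drv_usectohz(10 * 1000));
	xx->xx_last_sleep = gethrtime() - start;

	xx->xx_tid = timeout(xx_oneshot, xx, drv_usectohz(500 * 1000));
	xx->xx_periodic = ddi_periodic_add(xx_every_second, xx, NANOSEC,
	    DDI_IPL_0);
}

static void
xx_timers_stop(xx_state_t *xx)
{
	(void) untimeout(xx->xx_tid);
	ddi_periodic_delete(xx->xx_periodic);
}
.Ed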
1461.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1462.It Xr delay 9F Ta Xr ddi_get_lbolt 9F
1463.It Xr ddi_get_lbolt64 9F Ta Xr ddi_get_time 9F
1464.It Xr ddi_periodic_add 9F Ta Xr ddi_periodic_delete 9F
1465.It Xr drv_hztousec 9F Ta Xr drv_usectohz 9F
1466.It Xr drv_usecwait 9F Ta Xr gethrtime 9F
1467.It Xr qtimeout 9F Ta Xr quntimeout 9F
1468.It Xr timeout 9F Ta Xr untimeout 9F
1469.El
1470.Ss Task Queues
1471A task queue provides an asynchronous processing mechanism that can be
1472used by drivers and the broader system.
1473A task queue can be created with
1474.Xr ddi_taskq_create 9F
1475and sized with a given number of threads and a relative priority of those
1476threads.
1477Once created, tasks can be dispatched to the queue with
1478.Xr ddi_taskq_dispatch 9F .
1479The different functions and arguments dispatched do not need to be the
1480same and can vary from invocation to invocation.
However, it is the caller's responsibility to ensure that any referenced
memory remains valid until the task queue is done processing it.
1483It is possible to create a barrier for a task queue by using the
1484.Xr ddi_taskq_wait 9F
1485function.
1486.Pp
1487While task queues are a flexible mechanism for handling and processing
1488events that occur in a well defined context, they do not have an
1489inherent backpressure mechanism built in.
1490This means it is possible to add events to a task queue faster than they
1491can be processed.
1492For high-volume events, this must be considered before just dispatching
1493an event.
1494Do not rely on a non-sleeping allocation in the task queue dispatch
1495context.
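.Pp
The life cycle described above might look like the following sketch.
The function names prefixed with xx_ and the queue name are hypothetical;
a real driver would normally create the queue in
.Xr attach 9E
and destroy it in
.Xr detach 9E
rather than doing both in one function.
.Bd -literal
#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static void
xx_task(void *arg)
{
	/* Whatever arg points to must stay valid until this returns. */
}

static int
xx_taskq_example(dev_info_t *dip, void *arg)
{
	ddi_taskq_t *tq;

	/* One worker thread at the default priority. */
	tq = ddi_taskq_create(dip, "xx_taskq", 1, TASKQ_DEFAULTPRI, 0);
	if (tq == NULL)
		return (DDI_FAILURE);

	if (ddi_taskq_dispatch(tq, xx_task, arg, DDI_SLEEP) !=
	    DDI_SUCCESS) {
		ddi_taskq_destroy(tq);
		return (DDI_FAILURE);
	}

	/* Barrier: returns once everything dispatched so far has run. */
	ddi_taskq_wait(tq);
	ddi_taskq_destroy(tq);
	return (DDI_SUCCESS);
}
.Ed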
1496.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1497.It Xr ddi_taskq_create 9F Ta Xr ddi_taskq_destroy 9F
1498.It Xr ddi_taskq_dispatch 9F Ta Xr ddi_taskq_resume 9F
1499.It Xr ddi_taskq_suspend 9F Ta Xr ddi_taskq_suspended 9F
.It Xr ddi_taskq_wait 9F Ta
1501.El
1502.Ss Credential Management and Privileges
1503Not everything in the system has the same power to impact it.
1504To determine the permissions and context of a caller, the
1505.Vt cred_t
1506data structure encapsulates a number of different things including the
1507traditional user and group IDs, but also the zone that one is operating
1508in the context of and the associated privileges that the caller has.
1509While this concept is more often thought of due to userland processes being
1510associated with specific users, these same principles apply to different
1511threads in the kernel.
Not all kernel threads are allowed to indiscriminately do what they
want; they can be constrained by the same privilege model that processes
1514are, which is discussed in
1515.Xr privileges 7 .
1516.Pp
1517Most operations that device drivers implement are given a credential.
1518However, from within the kernel, a credential can be obtained that
1519refers to a specific zone, the current process, or a generic kernel
1520credential.
1521.Pp
It is up to drivers and the kernel writ large to check whether a given
1523credential is authorized to perform a given operation.
1524This is encapsulated by the various privilege checks that exist.
1525The most common check used is
1526.Xr drv_priv 9F
1527which checks for
1528.Dv PRIV_SYS_DEVICES .
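.Pp
As an illustration, an
.Xr ioctl 9E
entry point could gate a sensitive command as sketched below.
The driver prefix xx_ and the
.Dv XX_IOC_RESET
command are hypothetical.
.Bd -literal
#include <sys/types.h>
#include <sys/cred.h>
#include <sys/errno.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

#define	XX_IOC_RESET	(('x' << 8) | 1)	/* hypothetical command */

static int
xx_ioctl(dev_t dev, int cmd, intptr_t arg, int mode, cred_t *credp,
    int *rvalp)
{
	int ret;

	switch (cmd) {
	case XX_IOC_RESET:
		/* drv_priv() returns 0 if permitted and EPERM if not. */
		if ((ret = drv_priv(credp)) != 0)
			return (ret);
		/* ... perform the privileged operation here ... */
		return (0);
	default:
		return (ENOTTY);
	}
}
.Ed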
1529.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1530.It Xr CRED 9F Ta Xr crdup 9F
1531.It Xr crfree 9F Ta Xr crget 9F
1532.It Xr crgetgid 9F Ta Xr crgetgroups 9F
1533.It Xr crgetngroups 9F Ta Xr crgetrgid 9F
1534.It Xr crgetruid 9F Ta Xr crgetsgid 9F
1535.It Xr crgetsuid 9F Ta Xr crgetuid 9F
1536.It Xr crgetzoneid 9F Ta Xr crhold 9F
1537.It Xr ddi_get_cred 9F Ta Xr drv_priv 9F
1538.It Xr kcred 9F Ta Xr priv_getbyname 9F
1539.It Xr priv_policy_choice 9F Ta Xr priv_policy_only 9F
1540.It Xr priv_policy 9F Ta Xr zone_kcred 9F
1541.El
1542.Ss Device ID Management
1543Device IDs are a means of establishing a unique ID for a device in the
1544kernel.
1545These unique IDs are generally tied to something from the device's
1546hardware such as a serial number or related, but can also be fabricated
1547and stored on the device.
1548These device IDs are used by other subsystems like ZFS to record
1549information about a device as the actual
1550.Pa /devices
1551path that a device resides at may change because it is moved around in
1552the system.
1553.Pp
1554For device drivers, particularly those that represent block devices,
1555they should first call
1556.Xr ddi_devid_init 9F
1557to initialize the device ID data structure.
1558After that is done, it is then safe to call
1559.Xr ddi_devid_register 9F
1560to notify the kernel about the ID.
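.Pp
A minimal sketch of that sequence follows.
It asks the kernel to fabricate an ID with
.Dv DEVID_FAB ,
which is an assumption made to keep the example short; a driver with
access to a hardware serial number would pass a different type and its
own data.
The xx_ names are hypothetical.
.Bd -literal
#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static int
xx_devid_setup(dev_info_t *dip, ddi_devid_t *devidp)
{
	/* For DEVID_FAB the length must be 0 and the data NULL. */
	if (ddi_devid_init(dip, DEVID_FAB, 0, NULL, devidp) !=
	    DDI_SUCCESS)
		return (DDI_FAILURE);

	if (ddi_devid_register(dip, *devidp) != DDI_SUCCESS) {
		ddi_devid_free(*devidp);
		return (DDI_FAILURE);
	}
	return (DDI_SUCCESS);
}

static void
xx_devid_teardown(dev_info_t *dip, ddi_devid_t devid)
{
	ddi_devid_unregister(dip);
	ddi_devid_free(devid);
}
.Ed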
1561.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1562.It Xr ddi_devid_compare 9F Ta Xr ddi_devid_free 9F
1563.It Xr ddi_devid_get 9F Ta Xr ddi_devid_init 9F
1564.It Xr ddi_devid_register 9F Ta Xr ddi_devid_sizeof 9F
1565.It Xr ddi_devid_str_decode 9F Ta Xr ddi_devid_str_encode 9F
1566.It Xr ddi_devid_str_free 9F Ta Xr ddi_devid_unregister 9F
1567.It Xr ddi_devid_valid 9F Ta
1568.El
1569.Ss Message Block Functions
1570The
1571.Vt "mblk_t"
data structure is used to chain together messages, which are used
throughout the kernel by different subsystems including all of networking,
terminals, STREAMS, USB, and more.
1575.Pp
1576Message blocks are chained together by a series of two different
1577pointers:
1578.Fa b_cont
1579and
1580.Fa b_next .
1581When a message is split across multiple data buffers, they are linked by
1582the
1583.Fa b_cont
1584pointer.
1585However, multiple distinct messages can be chained together and linked
1586by the
1587.Fa b_next
1588pointer.
1589Let's look at this in the context of a series of networking packets.
1590If we had a chain of say 10 UDP packets that we were given, each UDP
1591packet is considered an independent message and would be linked from one
1592to the next based on the order they should be transmitted with the
1593.Fa b_next
1594pointer.
1595However, an individual message may be entirely in one message block, in
1596which case its
1597.Fa b_cont
1598pointer would be
1599.Dv NULL ,
1600but if say the packet were split into a 100 byte data buffer that
1601contained the headers and then a 1000 byte data buffer that contained
1602the actual packet data, those two would be linked together by
1603.Fa b_cont .
1604A continued message would never have its next pointer used to link it to
1605a wholly different message.
1606Visually you might see this as:
1607.Bd -literal
1608  +---------------+
1609  | UDP Message 0 |
1610  | Bytes 0-1100  |
1611  | b_cont     ---+--> NULL
1612  | b_next  +     |
1613  +---------|-----+
1614            |
1615            v
1616  +---------------+    +----------------+
1617  | UDP Message 1 |    | UDP Message 1+ |
1618  | Bytes 0-100   |    | Bytes 100-1100 |
1619  | b_cont     ---+--> | b_cont     ----+->NULL
1620  | b_next  +     |    | b_next     ----+->NULL
1621  +---------|-----+    +----------------+
1622            |
1623           ...
1624            |
1625            v
1626  +---------------+
1627  | UDP Message 9 |
1628  | Bytes 0-1100  |
1629  | b_cont     ---+--> NULL
1630  | b_next     ---+--> NULL
1631  +---------------+
1632.Ed
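.Pp
Walking such a chain takes two nested loops: the outer one follows
.Fa b_next
from message to message and the inner one follows
.Fa b_cont
across the blocks of a single message.
The sketch below counts messages and bytes; the function name is
hypothetical.
.Bd -literal
#include <sys/types.h>
#include <sys/stream.h>

static void
xx_count_chain(mblk_t *chain, uint_t *nmsgs, size_t *nbytes)
{
	mblk_t *mp, *bp;

	*nmsgs = 0;
	*nbytes = 0;
	for (mp = chain; mp != NULL; mp = mp->b_next) {
		(*nmsgs)++;
		/* Valid data in a block lies between b_rptr and b_wptr. */
		for (bp = mp; bp != NULL; bp = bp->b_cont)
			*nbytes += bp->b_wptr - bp->b_rptr;
	}
}
.Ed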
1633.Pp
1634Message blocks all have an associated data block which contains the
1635actual data that is present.
1636Multiple message blocks can share the same data block as well.
1637The data block has a notion of a type, which is generally
1638.Dv M_DATA
1639which signifies that they operate on data.
1640.Pp
1641To allocate message blocks, one generally uses the
1642.Xr allocb 9F
1643function to create one; however, you can also create message blocks
1644using your own source of data through functions like
1645.Xr desballoc 9F .
1646This is generally used when one wants to use memory that was originally
1647used for DMA to pass data back into the kernel, such as in a networking
1648device driver.
1649When this happens, a callback function will be called once the last user
1650of the data block is done with it.
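.Pp
The following sketch shows one way a driver might loan a receive buffer
upstream with
.Xr desballoc 9F .
The
.Vt xx_rxbuf_t
structure and the xx_ functions are hypothetical, and the example assumes
the buffer was already set up for DMA elsewhere.
Note that the
.Vt frtn_t
must remain valid until the callback has run, which is why it is embedded
in the buffer's own tracking structure.
.Bd -literal
#include <sys/types.h>
#include <sys/stream.h>

/* Hypothetical tracking structure for one receive buffer. */
typedef struct xx_rxbuf {
	uchar_t		*rb_va;		/* kernel address of the buffer */
	size_t		rb_len;		/* total size of the buffer */
	boolean_t	rb_loaned;	/* currently held by the stack? */
	frtn_t		rb_frtn;	/* must outlive the mblk_t */
} xx_rxbuf_t;

static void
xx_rxbuf_recycle(caddr_t arg)
{
	xx_rxbuf_t *rb = (xx_rxbuf_t *)arg;

	/* The last user is done; the buffer may be given to hardware. */
	rb->rb_loaned = B_FALSE;
}

static mblk_t *
xx_rxbuf_loan(xx_rxbuf_t *rb, size_t pktlen)
{
	mblk_t *mp;

	rb->rb_frtn.free_func = xx_rxbuf_recycle;
	rb->rb_frtn.free_arg = (caddr_t)rb;
	mp = desballoc(rb->rb_va, rb->rb_len, BPRI_MED, &rb->rb_frtn);
	if (mp != NULL) {
		rb->rb_loaned = B_TRUE;
		mp->b_wptr += pktlen;	/* mark the received bytes valid */
	}
	return (mp);
}
.Ed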
1651.Pp
1652The functions listed below often end in either
1653.Dq msg
1654or
1655.Dq b
1656to indicate that they will operate on an entire message and follow the
1657.Fa b_cont
1658pointer or they will not respectively.
1659.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1660.It Xr adjmsg 9F Ta Xr allocb 9F
1661.It Xr copyb 9F Ta Xr copymsg 9F
1662.It Xr datamsg 9F Ta Xr desballoc 9F
1663.It Xr desballoca 9F Ta Xr dupb 9F
1664.It Xr dupmsg 9F Ta Xr esballoc 9F
1665.It Xr esballoca 9F Ta Xr freeb 9F
1666.It Xr freemsg 9F Ta Xr linkb 9F
1667.It Xr mcopymsg 9F Ta Xr msgdsize 9F
1668.It Xr msgpullup 9F Ta Xr msgsize 9F
1669.It Xr pullupmsg 9F Ta Xr rmvb 9F
1670.It Xr testb 9F Ta Xr unlinkb 9F
1671.El
1672.Ss Upgradable Firmware Modules
1673The UFM
1674.Pq Upgradable Firmware Module
1675subsystem is used to grant the system observability into firmware that
1676exists persistently on a device.
1677These functions are intended for use by drivers that are participating in
1678the kernel's UFM framework, which is discussed in
1679.Xr ddi_ufm 9E .
1680.Pp
1681The
1682.Xr ddi_ufm_init 9F
1683and
1684.Xr ddi_ufm_fini 9F
1685functions are used to indicate support of the subsystem to the kernel.
1686The driver is required to use the
1687.Xr ddi_ufm_update 9F
1688function to indicate both that it is ready to receive UFM requests and
1689to indicate that any data that the kernel may have previously received
1690has changed.
1691Once that's completed, then the other functions listed here are
1692generally used as part of implementing specific callback functions that
1693are registered.
1694.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1695.It Xr ddi_ufm_fini 9F Ta Xr ddi_ufm_image_set_desc 9F
1696.It Xr ddi_ufm_image_set_misc 9F Ta Xr ddi_ufm_image_set_nslots 9F
1697.It Xr ddi_ufm_init 9F Ta Xr ddi_ufm_slot_set_attrs 9F
1698.It Xr ddi_ufm_slot_set_imgsize 9F Ta Xr ddi_ufm_slot_set_misc 9F
1699.It Xr ddi_ufm_slot_set_version 9F Ta Xr ddi_ufm_update 9F
1700.El
1701.Ss Firmware Loading
1702Some hardware devices have firmware that is not stored as part of the
1703device itself and must instead be sent to the device each time it is
1704powered on.
These routines help drivers that need to do this read such data from
well-known locations in the file system.
1707To begin with, a driver should call
1708.Xr firmware_open 9F
1709to open a handle to the firmware file.
1710At that point, one can determine the size of the file with the
1711.Xr firmware_get_size 9F
1712function and allocate the appropriate sized memory buffer to read it in.
1713Callers should always check what the size of the returned file is and
1714should not just blindly pass that size off to the kernel memory
1715allocator.
1716For example, if a file was over 100 MiB in size, then one should not
1717assume that they're going to just blindly allocate 100 MiB of kernel
1718memory and should instead perform incremental reads and sends to a
1719device that are smaller in size.
1720.Pp
1721A driver can then go through and perform arbitrary reads of the firmware
1722file through the
1723.Xr firmware_read 9F
1724interface until they have read everything that they need.
1725Once complete, the corresponding handle needs to be released through the
1726.Xr firmware_close 9F
1727function.
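.Pp
The incremental approach recommended above could be sketched as follows.
The driver name, image name, chunk size, and the
.Fn xx_send_chunk
function are all hypothetical stand-ins for whatever a real device
requires.
.Bd -literal
#include <sys/types.h>
#include <sys/sysmacros.h>
#include <sys/kmem.h>
#include <sys/firmload.h>

#define	XX_FW_CHUNK	(64 * 1024)	/* hypothetical transfer size */

/* Hypothetical: hands one chunk to the device, returns 0 or an errno. */
extern int xx_send_chunk(void *xx, const void *buf, size_t len);

static int
xx_load_firmware(void *xx)
{
	firmware_handle_t fh;
	off_t size, off;
	void *buf;
	int ret;

	if ((ret = firmware_open("xx", "xx-fw.bin", &fh)) != 0)
		return (ret);

	size = firmware_get_size(fh);
	buf = kmem_alloc(XX_FW_CHUNK, KM_SLEEP);

	/* Read and send a bounded amount at a time. */
	for (off = 0; off < size; off += XX_FW_CHUNK) {
		size_t len = MIN(size - off, XX_FW_CHUNK);

		if ((ret = firmware_read(fh, off, buf, len)) != 0)
			break;
		if ((ret = xx_send_chunk(xx, buf, len)) != 0)
			break;
	}

	kmem_free(buf, XX_FW_CHUNK);
	(void) firmware_close(fh);
	return (ret);
}
.Ed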
1728.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1729.It Xr firmware_close 9F Ta Xr firmware_get_size 9F
1730.It Xr firmware_open 9F Ta Xr firmware_read 9F
1731.El
1732.Ss Fault Management Handling
1733These functions allow device drivers to harden themselves against errors
1734that might occur while interfacing with devices and tie into the broader
1735fault management architecture.
1736.Pp
1737To begin, a driver must declare which capabilities it implements during
1738its
1739.Xr attach 9E
1740function by calling
1741.Xr ddi_fm_init 9F .
1742The set of capabilities it receives back may be less than what was
1743requested because the capabilities are dependent on the overall chain of
1744drivers present.
1745.Pp
1746If
1747.Dv DDI_FM_EREPORT_CAPABLE
1748was negotiated, then the driver is expected to generate error events
1749when certain conditions occur using the
1750.Xr ddi_fm_ereport_post 9F
1751function or the more specific
1752.Xr pci_ereport_post 9F
1753function.
1754If a caller has negotiated
1755.Dv DDI_FM_ACCCHK_CAPABLE ,
1756then it is allowed to set up its register attributes to indicate that it
1757will check for errors on the register handle after using functions like
1758.Xr ddi_get8 9F
1759and
1760.Xr ddi_put8 9F
1761by calling
1762.Xr ddi_fm_acc_err_get 9F
1763and reacting accordingly.
1764Similarly, if a driver has negotiated
1765.Dv DDI_FM_DMACHK_CAPABLE ,
1766then it will use
1767.Xr ddi_check_dma_handle 9F
1768to check the results of DMA activity and handle the results
1769appropriately.
1770Similar to register accesses, the DMA attributes must be updated to set
1771that error handling is anticipated on this handle.
1772The
1773.Xr ddi_fm_init 9F
1774manual page has an overview of the other types of flags that can be
1775negotiated and how they are used.
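.Pp
A rough sketch of negotiating capabilities and then checking a register
handle is shown below.
The xx_ names are hypothetical, and the choice of
.Dv DDI_SERVICE_DEGRADED
is only an example of how a driver might react.
.Bd -literal
#include <sys/types.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/ddifm.h>

static void
xx_fm_init(dev_info_t *dip, int *capsp)
{
	ddi_iblock_cookie_t iblk;

	/* Ask for what the driver supports; less may be granted. */
	*capsp = DDI_FM_EREPORT_CAPABLE | DDI_FM_ACCCHK_CAPABLE;
	ddi_fm_init(dip, capsp, &iblk);
}

static int
xx_check_acc(dev_info_t *dip, ddi_acc_handle_t hdl, int caps)
{
	ddi_fm_error_t de;

	if (!DDI_FM_ACC_ERR_CAP(caps))
		return (DDI_FM_OK);

	ddi_fm_acc_err_get(hdl, &de, DDI_FME_VERSION);
	if (de.fme_status != DDI_FM_OK) {
		ddi_fm_service_impact(dip, DDI_SERVICE_DEGRADED);
		ddi_fm_acc_err_clear(hdl, DDI_FME_VERSION);
	}
	return (de.fme_status);
}
.Ed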
1776.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1777.It Xr ddi_check_acc_handle 9F Ta Xr ddi_check_dma_handle 9F
1778.It Xr ddi_dev_report_fault 9F Ta Xr ddi_fm_acc_err_clear 9F
1779.It Xr ddi_fm_acc_err_get 9F Ta Xr ddi_fm_capable 9F
1780.It Xr ddi_fm_dma_err_clear 9F Ta Xr ddi_fm_dma_err_get 9F
1781.It Xr ddi_fm_ereport_post 9F Ta Xr ddi_fm_fini 9F
1782.It Xr ddi_fm_handler_register 9F Ta Xr ddi_fm_handler_unregister 9F
1783.It Xr ddi_fm_init 9F Ta Xr ddi_fm_service_impact 9F
1784.It Xr pci_ereport_post 9F Ta Xr pci_ereport_setup 9F
1785.It Xr pci_ereport_teardown 9F Ta
1786.El
1787.Ss SCSI and SAS Device Driver Functions
1788These functions are for use by SCSI and SAS device drivers that leverage
1789the kernel's frameworks.
1790Other device drivers should not use these.
1791For more background on these, some of the general concepts are discussed
1792in
1793.Xr iport 9 ,
1794.Xr phymap 9 ,
1795and
1796.Xr tgtmap 9 .
1797.Pp
1798Device drivers register initially with the kernel by using the
1799.Xr scsi_hba_init 9F
1800function and then, in their attach routine, register specific instances,
1801using functions like
1802.Xr scsi_hba_iport_register 9F
1803or instead
1804.Xr scsi_hba_tran_alloc 9F
1805and
1806.Xr scsi_hba_attach_setup 9F .
1807New drivers are encouraged to use the target map and iports framework to
1808simplify the device driver writing process.
1809.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1810.It Xr makecom_g0_s 9F Ta Xr makecom_g0 9F
1811.It Xr makecom_g1 9F Ta Xr makecom_g5 9F
1812.It Xr makecom 9F Ta Xr sas_phymap_create 9F
1813.It Xr sas_phymap_destroy 9F Ta Xr sas_phymap_lookup_ua 9F
1814.It Xr sas_phymap_lookup_uapriv 9F Ta Xr sas_phymap_phy_add 9F
1815.It Xr sas_phymap_phy_rem 9F Ta Xr sas_phymap_phy2ua 9F
1816.It Xr sas_phymap_phys_free 9F Ta Xr sas_phymap_phys_next 9F
1817.It Xr sas_phymap_ua_free 9F Ta Xr sas_phymap_ua2phys 9F
1818.It Xr sas_phymap_uahasphys 9F Ta Xr scsi_abort 9F
1819.It Xr scsi_address_device 9F Ta Xr scsi_alloc_consistent_buf 9F
1820.It Xr scsi_cname 9F Ta Xr scsi_destroy_pkt 9F
1821.It Xr scsi_device_hba_private_get 9F Ta Xr scsi_device_hba_private_set 9F
1822.It Xr scsi_device_unit_address 9F Ta Xr scsi_dmafree 9F
1823.It Xr scsi_dmaget 9F Ta Xr scsi_dname 9F
1824.It Xr scsi_errmsg 9F Ta Xr scsi_ext_sense_fields 9F
1825.It Xr scsi_find_sense_descr 9F Ta Xr scsi_free_consistent_buf 9F
1826.It Xr scsi_free_wwnstr 9F Ta Xr scsi_get_device_type_scsi_options 9F
1827.It Xr scsi_get_device_type_string 9F Ta Xr scsi_hba_attach_setup 9F
1828.It Xr scsi_hba_detach 9F Ta Xr scsi_hba_fini 9F
1829.It Xr scsi_hba_init 9F Ta Xr scsi_hba_iport_exist 9F
1830.It Xr scsi_hba_iport_find 9F Ta Xr scsi_hba_iport_register 9F
1831.It Xr scsi_hba_iport_unit_address 9F Ta Xr scsi_hba_iportmap_create 9F
1832.It Xr scsi_hba_iportmap_destroy 9F Ta Xr scsi_hba_iportmap_iport_add 9F
1833.It Xr scsi_hba_iportmap_iport_remove 9F Ta Xr scsi_hba_lookup_capstr 9F
1834.It Xr scsi_hba_pkt_alloc 9F Ta Xr scsi_hba_pkt_comp 9F
1835.It Xr scsi_hba_pkt_free 9F Ta Xr scsi_hba_probe 9F
1836.It Xr scsi_hba_tgtmap_create 9F Ta Xr scsi_hba_tgtmap_destroy 9F
1837.It Xr scsi_hba_tgtmap_scan_luns 9F Ta Xr scsi_hba_tgtmap_set_add 9F
1838.It Xr scsi_hba_tgtmap_set_begin 9F Ta Xr scsi_hba_tgtmap_set_end 9F
1839.It Xr scsi_hba_tgtmap_set_flush 9F Ta Xr scsi_hba_tgtmap_tgt_add 9F
1840.It Xr scsi_hba_tgtmap_tgt_remove 9F Ta Xr scsi_hba_tran_alloc 9F
1841.It Xr scsi_hba_tran_free 9F Ta Xr scsi_ifgetcap 9F
1842.It Xr scsi_ifsetcap 9F Ta Xr scsi_init_pkt 9F
1843.It Xr scsi_log 9F Ta Xr scsi_mname 9F
1844.It Xr scsi_pktalloc 9F Ta Xr scsi_pktfree 9F
1845.It Xr scsi_poll 9F Ta Xr scsi_probe 9F
1846.It Xr scsi_resalloc 9F Ta Xr scsi_reset_notify 9F
1847.It Xr scsi_reset 9F Ta Xr scsi_resfree 9F
1848.It Xr scsi_rname 9F Ta Xr scsi_sense_asc 9F
1849.It Xr scsi_sense_ascq 9F Ta Xr scsi_sense_cmdspecific_uint64 9F
1850.It Xr scsi_sense_info_uint64 9F Ta Xr scsi_sense_key 9F
1851.It Xr scsi_setup_cdb 9F Ta Xr scsi_slave 9F
1852.It Xr scsi_sname 9F Ta Xr scsi_sync_pkt 9F
1853.It Xr scsi_transport 9F Ta Xr scsi_unprobe 9F
1854.It Xr scsi_unslave 9F Ta Xr scsi_validate_sense 9F
1855.It Xr scsi_vu_errmsg 9F Ta Xr scsi_wwn_to_wwnstr 9F
.It Xr scsi_wwnstr_to_wwn 9F Ta
1857.El
1858.Ss Block Device Buffer Handling
1859Block devices operate with a data structure called the
1860.Vt struct buf
1861which is described in
1862.Xr buf 9S .
1863This structure is used to represent a given block request and is used
1864heavily in block devices, the SCSI/SAS framework, and the blkdev
1865framework.
1866The functions described here are used to manipulate these structures in
1867various ways such as copying them around, indicating error conditions,
1868or indicating when the I/O operation is done.
1869By default, this memory is not mapped into the kernel's address space so
1870several functions such as
1871.Xr bp_mapin 9F
1872are present to allow for that to happen when required.
1873.Pp
1874To initially obtain a
1875.Vt struct buf ,
1876drivers should begin by calling
1877.Xr getrbuf 9F
1878at which point, the caller can fill in the structure.
1879Once that's done, the
1880.Xr physio 9F
1881function can be used to actually perform the I/O and wait until it's
1882complete.
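.Pp
For a character device's
.Xr read 9E
entry point, that flow might be sketched as follows.
The xx_ functions are hypothetical, and the strategy routine here simply
fails every request because there is no real hardware behind it.
.Bd -literal
#include <sys/types.h>
#include <sys/buf.h>
#include <sys/uio.h>
#include <sys/cred.h>
#include <sys/kmem.h>
#include <sys/errno.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static int
xx_strategy(struct buf *bp)
{
	/* A real driver would start the transfer described by bp here. */
	bioerror(bp, ENXIO);
	bp->b_resid = bp->b_bcount;	/* nothing was transferred */
	biodone(bp);
	return (0);
}

static int
xx_read(dev_t dev, struct uio *uiop, cred_t *credp)
{
	struct buf *bp;
	int ret;

	bp = getrbuf(KM_SLEEP);
	/* physio() calls xx_strategy() and waits for biodone(). */
	ret = physio(xx_strategy, bp, dev, B_READ, minphys, uiop);
	freerbuf(bp);
	return (ret);
}
.Ed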
1883.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1884.It Xr bioclone 9F Ta Xr biodone 9F
1885.It Xr bioerror 9F Ta Xr biofini 9F
1886.It Xr bioinit 9F Ta Xr biomodified 9F
1887.It Xr bioreset 9F Ta Xr biosize 9F
1888.It Xr biowait 9F Ta Xr bp_mapin 9F
1889.It Xr bp_mapout 9F Ta Xr clrbuf 9F
1890.It Xr disksort 9F Ta Xr freerbuf 9F
1891.It Xr geterror 9F Ta Xr getrbuf 9F
1892.It Xr minphys 9F Ta Xr physio 9F
1893.El
1894.Ss Networking Device Driver Functions
These functions are for networking device drivers that implement the MAC
(GLDv3) interfaces.
1897The full framework and how to use it is described in
1898.Xr mac 9E .
1899.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1900.It Xr mac_alloc 9F Ta Xr mac_fini_ops 9F
1901.It Xr mac_free 9F Ta Xr mac_hcksum_get 9F
1902.It Xr mac_hcksum_set 9F Ta Xr mac_init_ops 9F
1903.It Xr mac_link_update 9F Ta Xr mac_lso_get 9F
1904.It Xr mac_maxsdu_update 9F Ta Xr mac_prop_info_set_default_fec 9F
1905.It Xr mac_prop_info_set_default_link_flowctrl 9F Ta Xr mac_prop_info_set_default_str 9F
1906.It Xr mac_prop_info_set_default_uint32 9F Ta Xr mac_prop_info_set_default_uint64 9F
1907.It Xr mac_prop_info_set_default_uint8 9F Ta Xr mac_prop_info_set_perm 9F
1908.It Xr mac_prop_info_set_range_uint32 9F Ta Xr mac_prop_info 9F
1909.It Xr mac_register 9F Ta Xr mac_rx 9F
1910.It Xr mac_rx_ring 9F Ta Xr mac_transceiver_info_set_present 9F
1911.It Xr mac_transceiver_info_set_usable 9F Ta Xr mac_transceiver_info 9F
1912.It Xr mac_tx_ring_update 9F Ta Xr mac_tx_update 9F
1913.It Xr mac_unregister 9F Ta
1914.El
1915.Ss USB Device Driver Functions
1916These functions are designed for USB device drivers.
1917To first initialize with the kernel, a device driver must call
1918.Xr usb_client_attach 9F
1919and then
1920.Xr usb_get_dev_data 9F .
1921The latter call is required to get access to the USB-level
1922descriptors about the device which describe what kinds of USB endpoints
1923.Pq control, bulk, interrupt, or isochronous
1924exist on the device as well as how many different interfaces and
1925configurations are present.
1926.Pp
1927Once a given configuration, sometimes the default, is selected, then the
1928driver can proceed to opening up what the USB architecture calls a pipe,
1929which provides a way to send requests to a specific USB endpoint.
1930First, specific endpoints can be looked up using the
1931.Xr usb_lookup_ep_data 9F
1932function which gets information from the parsed descriptors and then
1933that gets filled into an extended descriptor with
1934.Xr usb_ep_xdescr_fill 9F .
1935With that in hand, a pipe can be opened with
1936.Xr usb_pipe_xopen 9F .
1937.Pp
1938Once a pipe has been opened, which most often happens in a driver's
1939.Xr attach 9E
1940entry point, then requests can be allocated and submitted.
1941There is a different allocation for each type of request
1942.Po
1943e.g.
1944.Xr usb_alloc_bulk_req 9F
1945.Pc
1946and a different submission function for each type as well.
1947Each request structure has a corresponding page in section 9S that
1948describes the structure, its members, and how to work with it.
1949.Pp
1950One other major concern for USB devices, which isn't as common with
1951other types of devices, is that they can be yanked out and reinserted
1952at any time.
1953To help determine when this happens, the kernel offers the
1954.Xr usb_register_event_cbs 9F
1955function which allows a driver to register for callbacks when a device
1956is disconnected, reconnected, or around checkpoint suspend/resume
1957behavior.
1958.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
1959.It Xr usb_alloc_bulk_req 9F Ta Xr usb_alloc_ctrl_req 9F
1960.It Xr usb_alloc_intr_req 9F Ta Xr usb_alloc_isoc_req 9F
1961.It Xr usb_alloc_request 9F Ta Xr usb_client_attach 9F
1962.It Xr usb_client_detach 9F Ta Xr usb_clr_feature 9F
1963.It Xr usb_create_pm_components 9F Ta Xr usb_ep_xdescr_fill 9F
1964.It Xr usb_free_bulk_req 9F Ta Xr usb_free_ctrl_req 9F
1965.It Xr usb_free_descr_tree 9F Ta Xr usb_free_dev_data 9F
1966.It Xr usb_free_intr_req 9F Ta Xr usb_free_isoc_req 9F
1967.It Xr usb_get_addr 9F Ta Xr usb_get_alt_if 9F
1968.It Xr usb_get_cfg 9F Ta Xr usb_get_current_frame_number 9F
1969.It Xr usb_get_dev_data 9F Ta Xr usb_get_if_number 9F
1970.It Xr usb_get_max_pkts_per_isoc_request 9F Ta Xr usb_get_status 9F
1971.It Xr usb_get_string_descr 9F Ta Xr usb_handle_remote_wakeup 9F
1972.It Xr usb_lookup_ep_data 9F Ta Xr usb_owns_device 9F
1973.It Xr usb_parse_data 9F Ta Xr usb_pipe_bulk_xfer 9F
1974.It Xr usb_pipe_close 9F Ta Xr usb_pipe_ctrl_xfer_wait 9F
1975.It Xr usb_pipe_ctrl_xfer 9F Ta Xr usb_pipe_drain_reqs 9F
1976.It Xr usb_pipe_get_max_bulk_transfer_size 9F Ta Xr usb_pipe_get_private 9F
1977.It Xr usb_pipe_get_state 9F Ta Xr usb_pipe_intr_xfer 9F
1978.It Xr usb_pipe_isoc_xfer 9F Ta Xr usb_pipe_open 9F
1979.It Xr usb_pipe_reset 9F Ta Xr usb_pipe_set_private 9F
1980.It Xr usb_pipe_stop_intr_polling 9F Ta Xr usb_pipe_stop_isoc_polling 9F
1981.It Xr usb_pipe_xopen 9F Ta Xr usb_print_descr_tree 9F
1982.It Xr usb_register_hotplug_cbs 9F Ta Xr usb_reset_device 9F
1983.It Xr usb_set_alt_if 9F Ta Xr usb_set_cfg 9F
1984.It Xr usb_unregister_hotplug_cbs 9F Ta
1985.El
1986.Ss PCI Device Driver Functions
1987These functions are specific for PCI and PCI Express based device
1988drivers and are intended to be used to get access to PCI configuration
1989space.
1990For normal PCI base address registers
1991.Pq BARs
1992instead see
1993.Sx Register Setup and Access .
1994.Pp
1995To access PCI configuration space, a device driver should first call
1996.Xr pci_config_setup 9F .
1997Generally, drivers will call this in their
1998.Xr attach 9E
1999entry point and then tear down the configuration space access with the
2000.Xr pci_config_teardown 9F
2001entry point in
2002.Xr detach 9E .
2003After setting up access to configuration space, the returned handle can
2004be used in all of the various configuration space routines to get and
2005set specific sized values in configuration space.
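.Pp
For example, reading the vendor and device IDs could look like the sketch
below.
The function name is hypothetical, and the handle is torn down
immediately only to keep the example short; drivers usually keep it for
the lifetime of the instance.
.Bd -literal
#include <sys/types.h>
#include <sys/pci.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

static int
xx_read_pci_ids(dev_info_t *dip, uint16_t *vidp, uint16_t *didp)
{
	ddi_acc_handle_t cfg;

	if (pci_config_setup(dip, &cfg) != DDI_SUCCESS)
		return (DDI_FAILURE);

	*vidp = pci_config_get16(cfg, PCI_CONF_VENID);
	*didp = pci_config_get16(cfg, PCI_CONF_DEVID);

	pci_config_teardown(&cfg);
	return (DDI_SUCCESS);
}
.Ed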
2006.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2007.It Xr pci_config_get8 9F Ta Xr pci_config_get16 9F
2008.It Xr pci_config_get32 9F Ta Xr pci_config_get64 9F
2009.It Xr pci_config_put8 9F Ta Xr pci_config_put16 9F
2010.It Xr pci_config_put32 9F Ta Xr pci_config_put64 9F
2011.It Xr pci_config_setup 9F Ta Xr pci_config_teardown 9F
2012.It Xr pci_report_pmcap 9F Ta Xr pci_restore_config_regs 9F
2013.It Xr pci_save_config_regs 9F Ta
2014.El
2015.Ss USB Host Controller Interface Functions
2016These routines are used for device drivers which implement the USB
2017host controller interfaces described in
2018.Xr usba_hcdi 9E .
Other types of device drivers and modules should not call these
2020functions.
2021In particular, if one is writing a device driver for a USB device, these
2022are not the routines you're looking for and you want to see
2023.Sx USB Device Driver Functions .
2024These are what the
2025.Xr ehci 4D
2026or
2027.Xr xhci 4D
2028drivers use to provide services that USB drivers use via the kernel USB
2029architecture.
2030.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2031.It Xr usba_alloc_hcdi_ops 9F Ta Xr usba_free_hcdi_ops 9F
2032.It Xr usba_hcdi_cb 9F Ta Xr usba_hcdi_dup_intr_req 9F
2033.It Xr usba_hcdi_dup_isoc_req 9F Ta Xr usba_hcdi_get_device_private 9F
2034.It Xr usba_hcdi_register 9F Ta Xr usba_hcdi_unregister 9F
2035.It Xr usba_hubdi_bind_root_hub 9F Ta Xr usba_hubdi_cb_ops 9F
2036.It Xr usba_hubdi_close 9F Ta Xr usba_hubdi_dev_ops 9F
2037.It Xr usba_hubdi_ioctl 9F Ta Xr usba_hubdi_open 9F
2038.It Xr usba_hubdi_root_hub_power 9F Ta Xr usba_hubdi_unbind_root_hub 9F
2039.El
2040.Ss Functions for PCMCIA Drivers
2041These functions exist for older PCMCIA device drivers.
2042These should not otherwise be used by the system.
2043.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2044.It Xr csx_AccessConfigurationRegister 9F Ta Xr csx_ConvertSize 9F
2045.It Xr csx_ConvertSpeed 9F Ta Xr csx_CS_DDI_Info 9F
2046.It Xr csx_DeregisterClient 9F Ta Xr csx_DupHandle 9F
2047.It Xr csx_Error2Text 9F Ta Xr csx_Event2Text 9F
2048.It Xr csx_FreeHandle 9F Ta Xr csx_Get16 9F
2049.It Xr csx_Get32 9F Ta Xr csx_Get64 9F
2050.It Xr csx_Get8 9F Ta Xr csx_GetEventMask 9F
2051.It Xr csx_GetFirstClient 9F Ta Xr csx_GetFirstTuple 9F
2052.It Xr csx_GetHandleOffset 9F Ta Xr csx_GetMappedAddr 9F
2053.It Xr csx_GetNextClient 9F Ta Xr csx_GetNextTuple 9F
2054.It Xr csx_GetStatus 9F Ta Xr csx_GetTupleData 9F
2055.It Xr csx_MakeDeviceNode 9F Ta Xr csx_MapLogSocket 9F
2056.It Xr csx_MapMemPage 9F Ta Xr csx_ModifyConfiguration 9F
2057.It Xr csx_ModifyWindow 9F Ta Xr csx_Parse_CISTPL_BATTERY 9F
2058.It Xr csx_Parse_CISTPL_BYTEORDER 9F Ta Xr csx_Parse_CISTPL_CFTABLE_ENTRY 9F
2059.It Xr csx_Parse_CISTPL_CONFIG 9F Ta Xr csx_Parse_CISTPL_DATE 9F
2060.It Xr csx_Parse_CISTPL_DEVICE_A 9F Ta Xr csx_Parse_CISTPL_DEVICE_OA 9F
2061.It Xr csx_Parse_CISTPL_DEVICE_OC 9F Ta Xr csx_Parse_CISTPL_DEVICE 9F
2062.It Xr csx_Parse_CISTPL_DEVICEGEO_A 9F Ta Xr csx_Parse_CISTPL_DEVICEGEO 9F
2063.It Xr csx_Parse_CISTPL_FORMAT 9F Ta Xr csx_Parse_CISTPL_FUNCE 9F
2064.It Xr csx_Parse_CISTPL_FUNCID 9F Ta Xr csx_Parse_CISTPL_GEOMETRY 9F
2065.It Xr csx_Parse_CISTPL_JEDEC_A 9F Ta Xr csx_Parse_CISTPL_JEDEC_C 9F
2066.It Xr csx_Parse_CISTPL_LINKTARGET 9F Ta Xr csx_Parse_CISTPL_LONGLINK_A 9F
2067.It Xr csx_Parse_CISTPL_LONGLINK_C 9F Ta Xr csx_Parse_CISTPL_LONGLINK_MFC 9F
2068.It Xr csx_Parse_CISTPL_MANFID 9F Ta Xr csx_Parse_CISTPL_ORG 9F
2069.It Xr csx_Parse_CISTPL_SPCL 9F Ta Xr csx_Parse_CISTPL_SWIL 9F
2070.It Xr csx_Parse_CISTPL_VERS_1 9F Ta Xr csx_Parse_CISTPL_VERS_2 9F
2071.It Xr csx_ParseTuple 9F Ta Xr csx_Put16 9F
2072.It Xr csx_Put32 9F Ta Xr csx_Put64 9F
2073.It Xr csx_Put8 9F Ta Xr csx_RegisterClient 9F
2074.It Xr csx_ReleaseConfiguration 9F Ta Xr csx_ReleaseIO 9F
2075.It Xr csx_ReleaseIRQ 9F Ta Xr csx_ReleaseSocketMask 9F
2076.It Xr csx_ReleaseWindow 9F Ta Xr csx_RemoveDeviceNode 9F
2077.It Xr csx_RepGet16 9F Ta Xr csx_RepGet32 9F
2078.It Xr csx_RepGet64 9F Ta Xr csx_RepGet8 9F
2079.It Xr csx_RepPut16 9F Ta Xr csx_RepPut32 9F
2080.It Xr csx_RepPut64 9F Ta Xr csx_RepPut8 9F
2081.It Xr csx_RequestConfiguration 9F Ta Xr csx_RequestIO 9F
2082.It Xr csx_RequestIRQ 9F Ta Xr csx_RequestSocketMask 9F
2083.It Xr csx_RequestWindow 9F Ta Xr csx_ResetFunction 9F
2084.It Xr csx_SetEventMask 9F Ta Xr csx_SetHandleOffset 9F
2085.It Xr csx_ValidateCIS 9F Ta
2086.El
2087.Ss STREAMS related functions
2088These functions are meant to be used when interacting with STREAMS
2089devices or when implementing one.
2090When a STREAMS driver is opened, it receives messages on a queue which
2091are then processed and can be sent back.
2092As different queues are often linked together, the most common thing is
2093to process a message and then pass the message onto the next queue using
2094the
2095.Xr putnext 9F
2096function.
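.Pp
A minimal sketch of a write-side put procedure that passes messages along
is shown below.
The function name is hypothetical, and the sketch assumes that the module
also has a service routine, not shown, that drains anything queued with
.Xr putq 9F .
.Bd -literal
#include <sys/types.h>
#include <sys/stream.h>
#include <sys/ddi.h>

static int
xx_wput(queue_t *q, mblk_t *mp)
{
	if (canputnext(q)) {
		/* The next queue has room; hand the message on. */
		putnext(q, mp);
	} else {
		/* Flow controlled; hold the message on our own queue. */
		(void) putq(q, mp);
	}
	return (0);
}
.Ed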
2097.Pp
2098STREAMS messages are passed around using message blocks, which use the
2099.Vt mblk_t
2100type.
2101See
2102.Sx Message Block Functions
for more about the data structure and the functions that manipulate
message blocks.
2105.Pp
2106These functions should generally not be used when implementing a
2107networking device driver today.
2108See
2109.Xr mac 9E
2110instead.
2111.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
2112.It Xr backq 9F Ta Xr bcanput 9F
2113.It Xr bcanputnext 9F Ta Xr canput 9F
2114.It Xr canputnext 9F Ta Xr enableok 9F
2115.It Xr flushband 9F Ta Xr flushq 9F
2116.It Xr freezestr 9F Ta Xr getq 9F
2117.It Xr insq 9F Ta Xr merror 9F
2118.It Xr mexchange 9F Ta Xr noenable 9F
2119.It Xr put 9F Ta Xr putbq 9F
2120.It Xr putctl 9F Ta Xr putctl1 9F
2121.It Xr putnext 9F Ta Xr putnextctl 9F
2122.It Xr putnextctl1 9F Ta Xr putq 9F
2123.It Xr mt-streams 9F Ta Xr qassociate 9F
2124.It Xr qenable 9F Ta Xr qprocsoff 9F
2125.It Xr qprocson 9F Ta Xr qreply 9F
2126.It Xr qsize 9F Ta Xr qwait_sig 9F
2127.It Xr qwait 9F Ta Xr qwriter 9F
2128.It Xr OTHERQ 9F Ta Xr RD 9F
2129.It Xr rmvq 9F Ta Xr SAMESTR 9F
2130.It Xr unfreezestr 9F Ta Xr WR 9F
2131.El
2132.Ss STREAMS ioctls
2133The following functions are used when a STREAMS-based device driver is
2134processing its
2135.Xr ioctl 9E
2136entry point.
Unlike character and block devices, STREAMS ioctls are passed around in
message blocks.
Copying data in and out of userland also works differently, because STREAMS
ioctls are generally processed in
2140.Sy kernel
2141context.
2142This means that the normal functions like
2143.Xr ddi_copyin 9F
2144and
2145.Xr ddi_copyout 9F
2146cannot be used.
2147Instead, when a message block has a type of
2148.Dv M_IOCTL ,
2149then these routines can often be used to convert the structure into one
2150that asks for data to be copied in, copied out, or to finally
2151acknowledge the ioctl as successful or to terminate the processing in
2152error.
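.Pp
The following is a minimal sketch of the acknowledge and terminate
cases only; it does not show copying data in or out.
The function
.Fn xx_ioctl
and the command
.Dv XX_IOC_RESET
are hypothetical names invented for this illustration.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/errno.h>
#include <sys/stream.h>
#include <sys/strsun.h>

/* Hypothetical ioctl command number, for illustration only. */
#define	XX_IOC_RESET	(('x' << 8) | 1)

/* Called from a write-side put procedure for M_IOCTL messages. */
static void
xx_ioctl(queue_t *wq, mblk_t *mp)
{
	struct iocblk *iocp = (struct iocblk *)mp->b_rptr;

	switch (iocp->ioc_cmd) {
	case XX_IOC_RESET:
		/* Nothing to copy; acknowledge with no payload. */
		miocack(wq, mp, 0, 0);
		break;
	default:
		/* Unknown command; terminate the ioctl in error. */
		miocnak(wq, mp, 0, EINVAL);
		break;
	}
}
.Ed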
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr mcopyin 9F Ta Xr mcopyout 9F
.It Xr mioc2ack 9F Ta Xr miocack 9F
.It Xr miocnak 9F Ta Xr miocpullup 9F
.It Xr mkiocb 9F Ta
.El
.Ss chpoll(9E) Related Functions
These functions are present in service of the
.Xr chpoll 9E
interface, which is used to support the traditional
.Xr poll 2
and
.Xr select 3C
interfaces as well as event ports through the
.Xr port_get 3C
interface.
See
.Xr chpoll 9E
for the specific cases in which these functions should be called.
If a device driver does not implement the
.Xr chpoll 9E
character device entry point, then these functions should not be used.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr pollhead_clean 9F Ta Xr pollwakeup 9F
.El
.Ss Kernel Statistics
The kernel statistics or kstat framework provides an easy way of
exporting statistical information to be consumed outside of the kernel.
Users can interface with this data via
.Xr kstat 8
and the corresponding kstat library discussed in
.Xr kstat 3KSTAT .
.Pp
Kernel statistics are grouped using a tuple of four identifiers,
separated by colons when using
.Xr kstat 8 .
These are, in order, the statistic module name, instance, a name
which covers a group of statistics, and an individual name for a
statistic.
In addition, kernel statistics have a class which is used to group
similarly named groups of statistics together across devices.
When using
.Xr kstat_create 9F ,
drivers specify the first three parts of the tuple and the class.
The naming of individual statistics, the last part of the tuple, varies
based upon the type of the statistic.
For the most part, drivers will use the kstat type
.Dv KSTAT_TYPE_NAMED ,
which allows multiple name-value pairs to exist within the statistic.
For example, the kernel's layer 2 networking framework,
.Xr mac 9E ,
creates a kstat with the driver's name and instance and names it
.Dq mac .
Within this named group, there are statistics for all of the different
individual stats that the kernel and devices track such as bytes
transmitted and received, the state and speed of the link, and
advertised and enabled capabilities.
.Pp
A device driver can initialize a kstat with the
.Xr kstat_create 9F
function.
It will not be made accessible to users until the
.Xr kstat_install 9F
function is called.
The device driver must perform additional initialization of the kstat
before proceeding and calling
.Xr kstat_install 9F .
The kstat structure that drivers see is discussed in
.Xr kstat 9S .
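.Pp
The create, initialize, and install sequence can be sketched as
follows.
This is an illustration under stated assumptions rather than a
definitive implementation: the
.Vt xx_stats_t
structure, the
.Fn xx_kstat_init
function, the group name
.Dq xxstats ,
and the choice of the
.Dq misc
class are all hypothetical.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/kstat.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical set of named statistics for a driver. */
typedef struct xx_stats {
	kstat_named_t	xs_rx_packets;
	kstat_named_t	xs_tx_packets;
} xx_stats_t;

static kstat_t *
xx_kstat_init(dev_info_t *dip)
{
	kstat_t *ksp;
	xx_stats_t *xs;

	/* First three parts of the tuple, then the class and type. */
	ksp = kstat_create(ddi_driver_name(dip), ddi_get_instance(dip),
	    "xxstats", "misc", KSTAT_TYPE_NAMED,
	    sizeof (xx_stats_t) / sizeof (kstat_named_t), 0);
	if (ksp == NULL)
		return (NULL);

	/* Name each statistic; this is the last part of the tuple. */
	xs = ksp->ks_data;
	kstat_named_init(&xs->xs_rx_packets, "rx_packets",
	    KSTAT_DATA_UINT64);
	kstat_named_init(&xs->xs_tx_packets, "tx_packets",
	    KSTAT_DATA_UINT64);

	/* Only now does the kstat become visible to users. */
	kstat_install(ksp);
	return (ksp);
}
.Ed
.Pp
The returned
.Vt kstat_t
would later be passed to
.Xr kstat_delete 9F
when the driver detaches.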
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kstat_create 9F Ta Xr kstat_delete 9F
.It Xr kstat_install 9F Ta Xr kstat_named_init 9F
.It Xr kstat_named_setstr 9F Ta Xr kstat_queue 9F
.It Xr kstat_runq_back_to_waitq 9F Ta Xr kstat_runq_enter 9F
.It Xr kstat_runq_exit 9F Ta Xr kstat_waitq_enter 9F
.It Xr kstat_waitq_exit 9F Ta Xr kstat_waitq_to_runq 9F
.El
.Ss NDI Events
These functions are used to allow a device driver to register for
certain events that might occur to its device or a parent in the tree
and receive a callback when they occur.
A good example of this is when a device has been removed from the system
such as someone just pulling out a USB device or NVMe U.2 device.
The event handlers work by first getting a cookie that names the type of
event with
.Xr ddi_get_eventcookie 9F
and then registering the callback with
.Xr ddi_add_event_handler 9F .
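.Pp
A minimal sketch of those two steps for a removal event follows.
The functions
.Fn xx_remove_cb
and
.Fn xx_register_remove
are hypothetical, and whether a given parent nexus actually delivers
.Dv DDI_DEVI_REMOVE_EVENT
is an assumption that must be checked for the bus in question.
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical callback invoked when the event is delivered. */
static void
xx_remove_cb(dev_info_t *dip, ddi_eventcookie_t cookie, void *arg,
    void *impl_data)
{
	/* For example, mark the driver's soft state as disconnected. */
}

static int
xx_register_remove(dev_info_t *dip, void *arg, ddi_callback_id_t *idp)
{
	ddi_eventcookie_t cookie;

	/* Step 1: translate the event name into a cookie. */
	if (ddi_get_eventcookie(dip, DDI_DEVI_REMOVE_EVENT, &cookie) !=
	    DDI_SUCCESS)
		return (DDI_FAILURE);

	/* Step 2: register the callback for that cookie. */
	return (ddi_add_event_handler(dip, cookie, xx_remove_cb, arg,
	    idp));
}
.Ed
.Pp
The identifier stored through
.Fa idp
is what would later be passed to
.Xr ddi_remove_event_handler 9F .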
.Pp
The
.Xr ddi_cb_register 9F
function is used to register for other classes of events such as when
participating in dynamic interrupt sharing.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_add_event_handler 9F Ta Xr ddi_cb_register 9F
.It Xr ddi_cb_unregister 9F Ta Xr ddi_get_eventcookie 9F
.It Xr ddi_remove_event_handler 9F Ta
.El
.Ss Layered Device Interfaces
The LDI
.Pq Layered Device Interface
provides a mechanism for a driver to open up another device in the
kernel and begin calling basic operations on the device as though the
calling driver were a normal user process.
Through the LDI, drivers can perform equivalents to the basic file
.Xr read 2
and
.Xr write 2
calls, look up properties on the device, perform networking style calls
such as
.Xr getmsg 2
and
.Xr putmsg 2 ,
and register callbacks to be called when something happens to the
underlying device.
For example, the ZFS file system uses the LDI to open and operate on
block devices.
.Pp
Before opening a device itself, callers must obtain a notion of their
identity which is used when making subsequent calls.
The simplest form is often to use the device's
.Vt dev_info_t
and call
.Xr ldi_ident_from_dip 9F ;
however, there are also methods available based upon having a
.Vt dev_t
or a STREAMS
.Vt struct queue .
.Pp
Once that identity is established, there are several ways to open a
device such as
.Xr ldi_open_by_dev 9F ,
.Xr ldi_open_by_devid 9F ,
or
.Xr ldi_open_by_name 9F .
Once an LDI device has been opened, all of the other functions may
be used to operate on the device; however, consumers of the LDI must
think carefully about what kind of device they are opening.
While a kernel pseudo-device driver cannot disappear while it is open,
when the device represents an actual piece of hardware, it is possible
for it to be physically removed and no longer be accessible.
Consumers should not assume that a layered device will always be
present.
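.Pp
The identity and open steps can be sketched as follows.
The function
.Fn xx_open_lower
is hypothetical, and the use of
.Va kcred
and of the
.Dv FREAD | FWRITE
open flags are assumptions chosen for illustration; a real consumer
would pick the credential and flags appropriate to its use.
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/file.h>
#include <sys/cred.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>
#include <sys/sunldi.h>

static int
xx_open_lower(dev_info_t *dip, char *path, ldi_ident_t *lip,
    ldi_handle_t *lhp)
{
	int ret;

	/* Establish an identity based on our own dev_info_t. */
	ret = ldi_ident_from_dip(dip, lip);
	if (ret != 0)
		return (ret);

	/* Open the target device by its path in the file system. */
	ret = ldi_open_by_name(path, FREAD | FWRITE, kcred, lhp, *lip);
	if (ret != 0) {
		ldi_ident_release(*lip);
		return (ret);
	}

	return (0);
}
.Ed
.Pp
Teardown would mirror this by calling
.Xr ldi_close 9F
on the handle and then
.Xr ldi_ident_release 9F
on the identity.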
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ldi_add_event_handler 9F Ta Xr ldi_aread 9F
.It Xr ldi_awrite 9F Ta Xr ldi_close 9F
.It Xr ldi_devmap 9F Ta Xr ldi_dump 9F
.It Xr ldi_ev_finalize 9F Ta Xr ldi_ev_get_cookie 9F
.It Xr ldi_ev_get_type 9F Ta Xr ldi_ev_notify 9F
.It Xr ldi_ev_register_callbacks 9F Ta Xr ldi_ev_remove_callbacks 9F
.It Xr ldi_get_dev 9F Ta Xr ldi_get_devid 9F
.It Xr ldi_get_eventcookie 9F Ta Xr ldi_get_minor_name 9F
.It Xr ldi_get_otyp 9F Ta Xr ldi_get_size 9F
.It Xr ldi_getmsg 9F Ta Xr ldi_ident_from_dev 9F
.It Xr ldi_ident_from_dip 9F Ta Xr ldi_ident_from_stream 9F
.It Xr ldi_ident_release 9F Ta Xr ldi_ioctl 9F
.It Xr ldi_open_by_dev 9F Ta Xr ldi_open_by_devid 9F
.It Xr ldi_open_by_name 9F Ta Xr ldi_poll 9F
.It Xr ldi_prop_exists 9F Ta Xr ldi_prop_get_int 9F
.It Xr ldi_prop_get_int64 9F Ta Xr ldi_prop_lookup_byte_array 9F
.It Xr ldi_prop_lookup_int_array 9F Ta Xr ldi_prop_lookup_int64_array 9F
.It Xr ldi_prop_lookup_string_array 9F Ta Xr ldi_prop_lookup_string 9F
.It Xr ldi_putmsg 9F Ta Xr ldi_read 9F
.It Xr ldi_remove_event_handler 9F Ta Xr ldi_strategy 9F
.It Xr ldi_write 9F Ta
.El
.Ss Signal Manipulation
These utility functions all relate to understanding whether or not a
process can receive a signal and actually delivering one to a process
from a driver.
This interface is specific to device drivers and should not be used by
the broader kernel.
These interfaces are not recommended and should only be used after
consultation.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_can_receive_sig 9F Ta Xr proc_ref 9F
.It Xr proc_signal 9F Ta Xr proc_unref 9F
.El
.Ss Getting at Surrounding Context
These functions allow a driver to better understand its current context.
For example, some drivers have to deal with providing polled I/O or take
special care as part of creating a kernel crash dump.
These cases may need to call the
.Xr ddi_in_panic 9F
function.
The other functions generally provide a way to get at information such as
the process ID or other information from the system; however, this
generally should not be needed or used.
Almost all values exposed by, say,
.Xr drv_getparm 9F
have more usable first-class methods of getting at the data.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_get_kt_did 9F Ta Xr ddi_get_pid 9F
.It Xr ddi_in_panic 9F Ta Xr drv_getparm 9F
.El
.Ss Driver Memory Mapping
These functions are present for device drivers that implement the
.Xr devmap 9E
or
.Xr segmap 9E
entry points.
The
.Xr ddi_umem_alloc 9F
routines are used to allocate and lock memory that can later be used as
part of passing this memory to userland through the mapping entry
points.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_devmap_segmap 9F Ta Xr ddi_mmap_get_model 9F
.It Xr ddi_segmap_setup 9F Ta Xr ddi_segmap 9F
.It Xr ddi_umem_alloc 9F Ta Xr ddi_umem_free 9F
.It Xr ddi_umem_iosetup 9F Ta Xr ddi_umem_lock 9F
.It Xr ddi_umem_unlock 9F Ta Xr ddi_unmap_regs 9F
.It Xr devmap_default_access 9F Ta Xr devmap_devmem_setup 9F
.It Xr devmap_do_ctxmgt 9F Ta Xr devmap_load 9F
.It Xr devmap_set_ctx_timeout 9F Ta Xr devmap_setup 9F
.It Xr devmap_umem_setup 9F Ta Xr devmap_unload 9F
.El
.Ss UTF-8, UTF-16, UTF-32, and Code Set Utilities
These routines provide the ability to work with text in
different encodings and code sets.
Generally the kernel does not assume that much about the type of the text
that it is operating on, though some subsystems will require that the
names of things be ASCII only.
.Pp
The primary other locales that the system supports are generally UTF-8
based and so the kernel provides a set of routines to deal with UTF-8
and Unicode normalization.
However, there are still cases where different character encodings are
required or conversion between UTF-8 and some other type is required.
This is provided by the kernel iconv framework, which provides a
subset of the traditional userland iconv conversions.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr kiconv_close 9F Ta Xr kiconv_open 9F
.It Xr kiconv 9F Ta Xr kiconvstr 9F
.It Xr u8_strcmp 9F Ta Xr u8_textprep_str 9F
.It Xr u8_validate 9F Ta Xr uconv_u16tou32 9F
.It Xr uconv_u16tou8 9F Ta Xr uconv_u32tou16 9F
.It Xr uconv_u32tou8 9F Ta Xr uconv_u8tou16 9F
.It Xr uconv_u8tou32 9F Ta
.El
.Ss Raw I/O Port Access
This group of functions provides raw access to I/O ports on architectures
that support them.
These functions do not allow any coordination with other callers nor is
the validity of the port assured in any way.
In general, device drivers should use the normal register access
routines to access I/O ports.
See
.Sx Device Register Setup and Access
for more information on the preferred way to set up and access registers.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr inb 9F Ta Xr inw 9F
.It Xr inl 9F Ta Xr outb 9F
.It Xr outw 9F Ta Xr outl 9F
.El
.Ss Power Management
These functions are used to raise and lower the internal power levels of
a device driver or to indicate to the kernel that the device is busy and
therefore cannot have its power changed.
See
.Xr power 9E
for additional information.
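.Pp
As a rough sketch of how these calls are commonly combined, a driver
might mark a component busy and bring it to full power before starting
I/O.
The function
.Fn xx_start_io ,
the use of component 0, and the level
.Dv XX_PWR_FULL
are hypothetical; actual components and levels come from the driver's
own power management configuration.
.Bd -literal -offset indent
#include <sys/ddi.h>
#include <sys/sunddi.h>

/* Hypothetical power level for the device's only component. */
#define	XX_PWR_FULL	1

static int
xx_start_io(dev_info_t *dip)
{
	/* Tell the kernel the component is in use. */
	if (pm_busy_component(dip, 0) != DDI_SUCCESS)
		return (DDI_FAILURE);

	/* Make sure the component is powered up before touching it. */
	if (pm_raise_power(dip, 0, XX_PWR_FULL) != DDI_SUCCESS) {
		(void) pm_idle_component(dip, 0);
		return (DDI_FAILURE);
	}

	/*
	 * The I/O would be issued here.  Once it completes, the
	 * driver calls pm_idle_component(dip, 0) so that the
	 * component may be powered down again.
	 */
	return (DDI_SUCCESS);
}
.Ed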
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr ddi_removing_power 9F Ta Xr pm_busy_component 9F
.It Xr pm_idle_component 9F Ta Xr pm_lower_power 9F
.It Xr pm_power_has_changed 9F Ta Xr pm_raise_power 9F
.It Xr pm_trans_check 9F Ta
.El
.Ss Network Packet Hooks
These functions are intended to be used by device drivers that wish to
inspect and potentially modify packets along their path through the
networking stack.
The most common use case is for implementing something like a network
firewall.
Otherwise, if looking to add support for a new protocol or other network
processing feature, one is better off more directly integrating with the
networking stack.
.Pp
To get started, drivers generally will need to first use
.Xr net_protocol_lookup 9F
to get a handle to say that they're interested in looking at IPv4 or
IPv6 traffic and then can allocate an actual hook object with
.Xr hook_alloc 9F .
After filling out the hook, the hook can be inserted into the actual
system with
.Xr net_hook_register 9F .
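.Pp
The lookup, allocate, and register steps might look roughly like the
following for inbound IPv4 traffic.
This is a sketch under stated assumptions: the functions
.Fn xx_hook_cb
and
.Fn xx_hook_v4_in
and the hook name are hypothetical, the choice of the
.Dv NH_PHYSICAL_IN
event is only an example, and the exact structure members and callback
return values should be confirmed against
.Xr hook_t 9S .
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/hook.h>
#include <sys/neti.h>

/* Hypothetical callback; returning 0 is assumed to pass the packet. */
static int
xx_hook_cb(hook_event_token_t tok, hook_data_t data, void *arg)
{
	return (0);
}

static hook_t *
xx_hook_v4_in(netid_t netid, net_handle_t *nhp)
{
	hook_t *hp;

	/* Ask for a handle on IPv4 within the given netstack. */
	*nhp = net_protocol_lookup(netid, NHF_INET);
	if (*nhp == NULL)
		return (NULL);

	hp = hook_alloc(HOOK_VERSION);
	if (hp == NULL) {
		(void) net_protocol_release(*nhp);
		return (NULL);
	}

	/* Fill out the hook before inserting it. */
	hp->h_func = xx_hook_cb;
	hp->h_name = "xx_v4_in";
	hp->h_hint = HH_NONE;
	hp->h_arg = NULL;

	if (net_hook_register(*nhp, NH_PHYSICAL_IN, hp) != 0) {
		hook_free(hp);
		(void) net_protocol_release(*nhp);
		return (NULL);
	}

	return (hp);
}
.Ed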
.Pp
Hooks operate in the context of a networking stack.
Every networking stack in the system is independent and therefore has
its own set of interfaces, routing tables, settings, and related state.
Most zones have their own networking stack.
This is the exclusive-IP option that is described in
.Xr zoneadm 8 .
.Pp
Drivers can register to get a callback for every netstack in the system
and be notified when they are created and destroyed.
This is done by calling the
.Xr net_instance_alloc 9F
function, filling out its data structure, and then finally calling
.Xr net_instance_register 9F .
Like other callback interfaces, the moment the callback functions are
registered, drivers need to expect that they're going to be called.
.Bl -column -offset indent "net_instance_protocol_unregister" "net_instance_protocol_unregister"
.It Xr hook_alloc 9F Ta Xr hook_free 9F
.It Xr net_event_notify_register 9F Ta Xr net_event_notify_unregister 9F
.It Xr net_getifname 9F Ta Xr net_getlifaddr 9F
.It Xr net_getmtu 9F Ta Xr net_getnetid 9F
.It Xr net_getpmtuenabled 9F Ta Xr net_hook_register 9F
.It Xr net_hook_unregister 9F Ta Xr net_inject_alloc 9F
.It Xr net_inject_free 9F Ta Xr net_inject 9F
.It Xr net_instance_alloc 9F Ta Xr net_instance_free 9F
.It Xr net_instance_notify_register 9F Ta Xr net_instance_notify_unregister 9F
.It Xr net_instance_protocol_unregister 9F Ta Xr net_instance_register 9F
.It Xr net_instance_unregister 9F Ta Xr net_ispartialchecksum 9F
.It Xr net_isvalidchecksum 9F Ta Xr net_kstat_create 9F
.It Xr net_kstat_delete 9F Ta Xr net_lifgetnext 9F
.It Xr net_netidtozonid 9F Ta Xr net_phygetnext 9F
.It Xr net_phylookup 9F Ta Xr net_protocol_lookup 9F
.It Xr net_protocol_notify_register 9F Ta Xr net_protocol_release 9F
.It Xr net_protocol_walk 9F Ta Xr net_routeto 9F
.It Xr net_zoneidtonetid 9F Ta Xr netinfo 9F
.El
.Sh SEE ALSO
.Xr Intro 2 ,
.Xr Intro 9 ,
.Xr Intro 9E ,
.Xr Intro 9S
.Rs
.%T illumos Developer's Guide
.%U https://www.illumos.org/books/dev/
.Re
.Rs
.%T Writing Device Drivers
.%U https://www.illumos.org/books/wdd/
.Re