.\"
.\" Copyright (c) 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd July 27, 2021
.Dt PCI 4
.Os
.Sh NAME
.Nm pci
.Nd generic PCI/PCIe bus driver
.Sh SYNOPSIS
To compile the PCI bus driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd device pci
.Ed
.Pp
To compile in support for Single Root I/O Virtualization
.Pq SR-IOV :
.Bd -ragged -offset indent
.Cd options PCI_IOV
.Ed
.Pp
To compile in support for native PCI-express HotPlug:
.Bd -ragged -offset indent
.Cd options PCI_HP
.Ed
.Sh DESCRIPTION
The
.Nm
driver provides support for
.Tn PCI
and
.Tn PCIe
devices in the kernel and limited access to
.Tn PCI
devices for userland.
.Pp
The
.Nm
driver provides a
.Pa /dev/pci
character device that can be used by userland programs to read and write
.Tn PCI
configuration registers.
Programs can also use this device to get a list of all
.Tn PCI
devices, or all
.Tn PCI
devices that match various patterns.
.Pp
Since the
.Nm
driver provides a write interface for
.Tn PCI
configuration registers, system administrators should exercise caution when
granting access to the
.Nm
device.
If used improperly, this driver can allow userland applications to
crash a machine or cause data loss.
In particular, the driver only allows operations on the opened
.Pa /dev/pci
to modify system state if the file descriptor was opened for writing.
For instance, the
.Dv PCIOCREAD
and
.Dv PCIOCBARMMAP
operations require a writeable descriptor, because reading a configuration
register or a BAR could have function-specific side effects.
.Pp
The
.Nm
driver implements the
.Tn PCI
bus in the kernel.
It enumerates any devices on the
.Tn PCI
bus and gives
.Tn PCI
client drivers the chance to attach to them.
It assigns resources to children when the BIOS does not.
It takes care of routing interrupts when necessary.
It reprobes the unattached
.Tn PCI
children when
.Tn PCI
client drivers are dynamically
loaded at runtime.
The
.Nm
driver also includes support for PCI-PCI bridges,
various platform-specific Host-PCI bridges,
and basic support for
.Tn PCI
VGA adapters.
.Sh IOCTLS
The following
.Xr ioctl 2
calls are supported by the
.Nm
driver.
They are defined in the header file
.In sys/pciio.h .
.Bl -tag -width 012345678901234
.It PCIOCGETCONF
This
.Xr ioctl 2
takes a
.Va pci_conf_io
structure.
It allows the user to retrieve information on all
.Tn PCI
devices in the system, or on
.Tn PCI
devices matching patterns supplied by the user.
The call may set
.Va errno
to any value specified in either
.Xr copyin 9
or
.Xr copyout 9 .
The
.Va pci_conf_io
structure consists of a number of fields:
.Bl -tag -width match_buf_len
.It pat_buf_len
The length, in bytes, of the buffer filled with user-supplied patterns.
.It num_patterns
The number of user-supplied patterns.
.It patterns
Pointer to a buffer filled with user-supplied patterns.
.Va patterns
is a pointer to
.Va num_patterns
.Va pci_match_conf
structures.
The
.Va pci_match_conf
structure consists of the following elements:
.Bl -tag -width pd_vendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pd_name
.Tn PCI
device driver name.
.It pd_unit
.Tn PCI
device driver unit number.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It flags
The flags describe which of the fields the kernel should match against.
A device must match all specified fields in order to be returned.
The match flags are enumerated in the
.Va pci_getconf_flags
enumeration.
The flag names indicate the field each one selects, so they are not
described individually here.
.El
.It match_buf_len
Length of the
.Va matches
buffer allocated by the user to hold the results of the
.Dv PCIOCGETCONF
query.
.It num_matches
Number of matches returned by the kernel.
.It matches
Buffer containing matching devices returned by the kernel.
The items in this buffer are of type
.Va pci_conf ,
which consists of the following items:
.Bl -tag -width pc_subvendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pc_hdr
.Tn PCI
header type.
.It pc_subvendor
.Tn PCI
subvendor ID.
.It pc_subdevice
.Tn PCI
subdevice ID.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It pc_subclass
.Tn PCI
device subclass.
.It pc_progif
.Tn PCI
device programming interface.
.It pc_revid
.Tn PCI
revision ID.
.It pd_name
Driver name.
.It pd_unit
Driver unit number.
.El
.It offset
The offset is passed in by the user to tell the kernel where it should
start traversing the device list.
The value passed out by the kernel
points to the record immediately after the last one returned.
The user may
pass the value returned by the kernel in subsequent calls to the
.Dv PCIOCGETCONF
ioctl.
If the user does not intend to use the offset, it must be set to zero.
.It generation
.Tn PCI
configuration generation.
This value only needs to be set if the offset is set.
The kernel will compare the current generation number of its internal
device list to the generation passed in by the user to determine whether
its device list has changed since the user last called the
.Dv PCIOCGETCONF
ioctl.
If the device list has changed, a status of
.Va PCI_GETCONF_LIST_CHANGED
will be passed back.
.It status
The status tells the user the disposition of the request for a device list.
The possible status values are:
.Bl -ohang
.It PCI_GETCONF_LAST_DEVICE
This means that there are no more devices in the PCI device list matching
the specified criteria after the
ones returned in the
.Va matches
buffer.
.It PCI_GETCONF_LIST_CHANGED
This status tells the user that the
.Tn PCI
device list has changed since the last call to the
.Dv PCIOCGETCONF
ioctl, and that
.Va offset
and
.Va generation
must be reset to zero to start over at the beginning of the list.
.It PCI_GETCONF_MORE_DEVS
This tells the user that the supplied buffer was not large enough to hold all
of the remaining devices in the device list that match the specified criteria.
.It PCI_GETCONF_ERROR
This indicates a general error while servicing the user's request.
If the
.Va pat_buf_len
is not equal to
.Va num_patterns
times
.Fn sizeof "struct pci_match_conf" ,
.Va errno
will be set to
.Er EINVAL .
.El
.El
.It PCIOCREAD
This
.Xr ioctl 2
reads the
.Tn PCI
configuration registers specified by the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure consists of the following fields:
.Bl -tag -width pi_width
.It pi_sel
A
.Va pcisel
structure which specifies the domain, bus, slot and function the user would
like to query.
If the specified bus is not found,
.Va errno
will be set to
.Er ENODEV
and -1 returned from the ioctl.
.It pi_reg
The
.Tn PCI
configuration registers the user would like to access.
.It pi_width
The width, in bytes, of the data the user would like to read.
This value may be 1, 2, or 4.
3-byte reads and reads larger than 4 bytes are
not supported.
If an invalid width is passed,
.Va errno
will be set to
.Er EINVAL .
.It pi_data
The data returned by the kernel.
.El
.It PCIOCWRITE
This
.Xr ioctl 2
allows users to write to the
.Tn PCI
configuration registers specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above.
The limitations on data width described for
reading registers, above, also apply to writing
.Tn PCI
configuration registers.
.It PCIOCATTACHED
This
.Xr ioctl 2
allows users to query if a driver is attached to the
.Tn PCI
device specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above; however, the
.Va pi_reg
and
.Va pi_width
fields are not used.
The status of the device is stored in the
.Va pi_data
field.
A value of 0 indicates no driver is attached, while a value larger than 0
indicates that a driver is attached.
.It PCIOCBARMMAP
This
.Xr ioctl 2
command allows a userspace process to
.Xr mmap 2
a memory-mapped PCI BAR into its address space.
The input parameters and results are passed in the
.Va pci_bar_mmap
structure, which has the following fields:
.Bl -tag -width "Vt struct pcisel pbm_sel"
.It Vt uint64_t pbm_map_base
Reports the established mapping base to the caller.
If the
.Dv PCIIO_BAR_MMAP_FIXED
flag is specified, this field must be filled in before the call
with the desired address for the mapping.
.It Vt uint64_t pbm_map_length
Reports the mapped length of the BAR, in bytes.
Its
.Vt uint64_t
value is always a multiple of the machine page size.
.It Vt uint64_t pbm_bar_length
Reports the length of the BAR as exposed by the device.
.It Vt int pbm_bar_off
Reports the offset from the mapped base to the start of the
first register in the BAR.
.It Vt struct pcisel pbm_sel
Should be filled before the call.
Describes the device to operate on.
.It Vt int pbm_reg
The BAR index to mmap.
.It Vt int pbm_flags
Flags which augment the operation.
See below.
.It Vt int pbm_memattr
The caching attribute for the mapping.
Typical values are
.Dv VM_MEMATTR_UNCACHEABLE
for control register BARs, and
.Dv VM_MEMATTR_WRITE_COMBINING
for frame buffers.
Regular memory-like BARs should be mapped with the
.Dv VM_MEMATTR_DEFAULT
attribute.
.El
.Pp
Currently defined flags are:
.Bl -tag -width PCIIO_BAR_MMAP_ACTIVATE
.It PCIIO_BAR_MMAP_FIXED
The resulting mapping must be established at the address
specified by the
.Va pbm_map_base
member; otherwise the operation fails.
.It PCIIO_BAR_MMAP_EXCL
Must be used together with
.Dv PCIIO_BAR_MMAP_FIXED .
If the specified base contains already established mappings, the
operation fails instead of implicitly unmapping them.
.It PCIIO_BAR_MMAP_RW
The requested mapping allows both reading and writing.
Without the flag, a read-only mapping is established.
Note that it is common for the device registers to have side-effects
even on reads.
.It PCIIO_BAR_MMAP_ACTIVATE
(Unimplemented) If the BAR is not activated, activate it in the course
of mapping.
Currently, an attempt to mmap an inactive BAR results in an error.
.El
.El
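.Pp
The paging protocol of
.Dv PCIOCGETCONF
(resume at the returned offset, restart when the list changes) can be
sketched in C as follows.
This is an illustration only, not the kernel's implementation: the ex_
names, the reduced structure and the simulated kernel side are all
hypothetical, and a real program would instead fill in a
.Va pci_conf_io
structure and call
.Xr ioctl 2
on
.Pa /dev/pci .

```c
#include <stddef.h>

/* Hypothetical stand-ins for the status values defined in <sys/pciio.h>. */
enum ex_status { EX_LAST_DEVICE, EX_LIST_CHANGED, EX_MORE_DEVS, EX_ERROR };

/* Hypothetical, reduced mirror of the paging fields of struct pci_conf_io. */
struct ex_conf_io {
	size_t		match_buf_len;	/* bytes available in matches */
	unsigned	num_matches;	/* set by the simulated kernel */
	int		*matches;	/* ints stand in for struct pci_conf */
	unsigned	offset;		/* where to resume in the device list */
	unsigned	generation;	/* list generation seen by the caller */
	enum ex_status	status;
};

/* Simulated kernel side; a real program would issue the ioctl instead. */
void
ex_getconf(const int *devs, unsigned ndevs, unsigned cur_gen,
    struct ex_conf_io *io)
{
	unsigned room = (unsigned)(io->match_buf_len / sizeof(int));
	unsigned i;

	io->num_matches = 0;
	/* The generation is only checked when the caller resumes a scan. */
	if (io->offset != 0 && io->generation != cur_gen) {
		io->status = EX_LIST_CHANGED;
		return;
	}
	for (i = io->offset; i < ndevs && io->num_matches < room; i++)
		io->matches[io->num_matches++] = devs[i];
	io->offset = i;
	io->generation = cur_gen;
	io->status = (i < ndevs) ? EX_MORE_DEVS : EX_LAST_DEVICE;
}

/* Caller side: collect every device using a two-record result buffer. */
int
ex_list_all(const int *devs, unsigned ndevs, unsigned cur_gen,
    int *out, unsigned outcap)
{
	int buf[2];
	struct ex_conf_io io = { sizeof(buf), 0, buf, 0, 0, EX_ERROR };
	unsigned i, n = 0;

	for (;;) {
		ex_getconf(devs, ndevs, cur_gen, &io);
		if (io.status == EX_ERROR)
			return (-1);
		if (io.status == EX_LIST_CHANGED) {
			/* Start over from the beginning of the list. */
			io.offset = 0;
			io.generation = 0;
			n = 0;
			continue;
		}
		for (i = 0; i < io.num_matches && n < outcap; i++)
			out[n++] = io.matches[i];
		if (io.status == EX_LAST_DEVICE)
			return ((int)n);
	}
}
```

With a two-record buffer and a list of five devices, the caller loop above
makes three requests: two that end with the more-devices status and one that
ends with the last-device status.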
.Sh LOADER TUNABLES
Tunables can be set at the
.Xr loader 8
prompt before booting the kernel, or stored in
.Xr loader.conf 5 .
The current value of these tunables can be examined at runtime via
.Xr sysctl 8
nodes of the same name.
Unless otherwise specified,
each of these tunables is a boolean that can be enabled by setting the
tunable to a non-zero value.
.Bl -tag -width indent
.It Va hw.pci.clear_bars Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resources.
This forces the
.Tn PCI
bus driver to allocate resource ranges for memory and I/O port resources
from scratch.
.It Va hw.pci.clear_buses Pq Defaults to 0
Ignore any firmware-assigned bus number registers in PCI-PCI bridges.
This forces the
.Tn PCI
bus driver and PCI-PCI bridge driver to allocate bus numbers for secondary
buses behind PCI-PCI bridges.
.It Va hw.pci.clear_pcib Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resource windows in PCI-PCI
bridges.
This forces the PCI-PCI bridge driver to allocate memory and I/O port resources
for resource windows from scratch.
.Pp
By default the PCI-PCI bridge driver will allocate windows that
contain the firmware-assigned resources of devices behind the bridge.
In addition, the PCI-PCI bridge driver will suballocate from existing window
regions when possible to satisfy a resource request.
As a result,
both
.Va hw.pci.clear_bars
and
.Va hw.pci.clear_pcib
must be enabled to fully ignore firmware-supplied resource assignments.
.It Va hw.pci.default_vgapci_unit Pq Defaults to -1
By default,
the first
.Tn PCI
VGA adapter encountered by the system is assumed to be the boot display device.
This tunable can be set to choose a specific VGA adapter by specifying the
unit number of the associated
.Va vgapci Ns Ar X
device.
.It Va hw.pci.do_power_nodriver Pq Defaults to 0
Place devices into a low power state
.Pq D3
when a suitable device driver is not found.
Can be set to one of the following values:
.Bl -tag -width indent
.It 3
Powers down all
.Tn PCI
devices without a device driver.
.It 2
Powers down most devices without a device driver.
PCI devices with the display, memory, and base peripheral device classes
are not powered down.
.It 1
Similar to a setting of 2 except that storage controllers are also not
powered down.
.It 0
All devices are left fully powered.
.El
.Pp
A
.Tn PCI
device must support power management to be powered down.
Placing a device into a low power state may not reduce power consumption.
.It Va hw.pci.do_power_resume Pq Defaults to 1
Place
.Tn PCI
devices into the fully powered state when resuming either the system or an
individual device.
Setting this to zero is discouraged as the system will not attempt to power
up non-powered PCI devices after a suspend.
.It Va hw.pci.do_power_suspend Pq Defaults to 1
Place
.Tn PCI
devices into a low power state when suspending either the system or individual
devices.
Normally the D3 state is used as the low power state,
but firmware may override the desired power state during a system suspend.
.It Va hw.pci.enable_ari Pq Defaults to 1
Enable support for PCI-express Alternative RID Interpretation.
This is often used in conjunction with SR-IOV.
.It Va hw.pci.enable_io_modes Pq Defaults to 1
Enable memory or I/O port decoding in a PCI device's command register if it has
firmware-assigned memory or I/O port resources.
The firmware
.Pq BIOS
in some systems does not enable memory or I/O port decoding for some devices
even when it has assigned resources to the device.
This enables decoding for such resources during bus probe.
.It Va hw.pci.enable_msi Pq Defaults to 1
Enable support for Message Signalled Interrupts
.Pq MSI .
MSI interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_msix Pq Defaults to 1
Enable support for extended Message Signalled Interrupts
.Pq MSI-X .
MSI-X interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_pcie_hp Pq Defaults to 1
Enable support for native PCI-express HotPlug.
.It Va hw.pci.honor_msi_blacklist Pq Defaults to 1
MSI and MSI-X interrupts are disabled for certain chipsets known to have
broken MSI and MSI-X implementations when this tunable is set.
It can be set to zero to permit use of MSI and MSI-X interrupts if the
chipset match is a false positive.
.It Va hw.pci.iov_max_config Pq Defaults to 1MB
The maximum amount of memory permitted for the configuration parameters
used when creating Virtual Functions via SR-IOV.
This tunable can also be changed at runtime via
.Xr sysctl 8 .
.It Va hw.pci.realloc_bars Pq Defaults to 0
Attempt to allocate a new resource range during the initial device scan
for any memory or I/O port resources with firmware-assigned ranges that
conflict with another active resource.
.It Va hw.pci.usb_early_takeover Pq Defaults to 1 on Tn amd64 and Tn i386
Disable legacy device emulation of USB devices during the initial device
scan.
Set this tunable to zero to use USB devices via legacy emulation when
using a custom kernel without USB controller drivers.
.It Va hw.pci<D>.<B>.<S>.INT<P>.irq
These tunables can be used to override the interrupt routing for legacy
PCI INTx interrupts.
Unlike other tunables in this list,
these do not have corresponding sysctl nodes.
The tunable name includes the address of the PCI device as well as the
pin of the desired INTx IRQ to override:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <P>
The interrupt pin of the PCI slot to override.
One of
.Ql A ,
.Ql B ,
.Ql C ,
or
.Ql D .
.El
.Pp
The value of the tunable is the raw IRQ value to use for the INTx interrupt
pin identified by the tunable name.
Mapping of IRQ values to platform interrupt sources is machine dependent.
.El
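.Pp
The four levels of
.Va hw.pci.do_power_nodriver
can be read as a small decision function.
The C sketch below is a hypothetical illustration of the table given for
that tunable, not the driver's code: the ex_ names are invented, the
constants are the standard PCI base class codes, values outside 0 to 3 are
assumed to behave like 0, and the requirement that the device support power
management is not modelled.

```c
#include <stdbool.h>

/* Base class codes as assigned by the PCI specification. */
#define EX_PCIC_STORAGE		0x01
#define EX_PCIC_DISPLAY		0x03
#define EX_PCIC_MEMORY		0x05
#define EX_PCIC_BASEPERIPH	0x08

/*
 * Hypothetical helper: would a driverless device of base class "cls" be
 * powered down at the given hw.pci.do_power_nodriver level?
 */
bool
ex_powers_down(int level, int cls)
{
	switch (level) {
	case 3:
		/* Every driverless device is powered down. */
		return (true);
	case 2:
	case 1:
		/* Display, memory and base peripherals stay powered. */
		if (cls == EX_PCIC_DISPLAY || cls == EX_PCIC_MEMORY ||
		    cls == EX_PCIC_BASEPERIPH)
			return (false);
		/* Level 1 additionally spares storage controllers. */
		if (level == 1 && cls == EX_PCIC_STORAGE)
			return (false);
		return (true);
	default:
		/* 0 (and, by assumption, anything else): leave powered. */
		return (false);
	}
}
```

For example, a driverless storage controller is powered down at level 2 but
left powered at level 1.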
.Sh DEVICE WIRING
You can wire a device unit to a given location with
.Xr device.hints 5 .
Entries of the form
.Va hint.<name>.<unit>.at="pci<B>:<S>:<F>"
or
.Va hint.<name>.<unit>.at="pci<D>:<B>:<S>:<F>"
will force the driver
.Va name
to probe and attach at unit
.Va unit
for any PCI device found to match the specification, where:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
Defaults to 0 if unspecified.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <F>
The function of the PCI device in decimal.
.El
.Pp
The code to do the matching requires an exact string match.
Do not specify the angle brackets
.Pq < >
in the hints file.
Wiring multiple devices to the same
.Va name
and
.Va unit
produces undefined results.
.Ss Examples
Given the following lines in
.Pa /boot/device.hints :
.Bd -literal -offset indent
hint.nvme.3.at="pci6:0:0"
hint.igb.8.at="pci14:0:0"
.Ed
.Pp
If there is a device that supports
.Xr igb 4
at PCI bus 14 slot 0 function 0,
then it will be assigned igb8 for probe and attach.
Likewise, if there is an
.Xr nvme 4
card at PCI bus 6 slot 0 function 0,
then it will be assigned nvme3 for probe and attach.
If another type of card is in either of these locations, the name and
unit of that card will be the default names and will be unaffected by
these hints.
If other igb or nvme cards are located elsewhere, they will be
assigned their unit numbers sequentially, skipping the unit numbers
that have 'at' hints.
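.Pp
Because the match is an exact string comparison, a tool that generates
wiring hints has to emit plain decimal numbers with no padding.
The C sketch below shows one way to format such a line in the form used by
the examples above; ex_format_hint() and its convention of passing a
negative domain to select the short form are hypothetical and not part of
any system interface.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a wiring hint for driver "name", unit "unit".
 * A negative domain selects the short pci<B>:<S>:<F> form.
 * Returns 0 on success, or -1 if the line does not fit in buf.
 */
int
ex_format_hint(char *buf, size_t len, const char *name, int unit,
    int domain, int bus, int slot, int func)
{
	int n;

	if (domain < 0)
		n = snprintf(buf, len, "hint.%s.%d.at=\"pci%d:%d:%d\"",
		    name, unit, bus, slot, func);
	else
		n = snprintf(buf, len, "hint.%s.%d.at=\"pci%d:%d:%d:%d\"",
		    name, unit, domain, bus, slot, func);
	/* snprintf reports the untruncated length; reject truncation. */
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

Called with the name nvme, unit 3, a negative domain and location 6, 0, 0,
the helper produces the first hint line of the example above.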
.Sh FILES
.Bl -tag -width /dev/pci -compact
.It Pa /dev/pci
Character device for the
.Nm
driver.
.El
.Sh SEE ALSO
.Xr pciconf 8
.Sh HISTORY
The
.Nm
driver (not the kernel's
.Tn PCI
support code) first appeared in
.Fx 2.2 ,
and was written by Stefan Esser and Garrett Wollman.
Support for device listing and matching was re-implemented by
Kenneth Merry, and first appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq Mt ken@FreeBSD.org
.Sh BUGS
It is not possible for users to specify an accurate offset into the device
list without calling the
.Dv PCIOCGETCONF
ioctl at least once, since they have no way of knowing the current generation
number otherwise.
This probably is not a serious problem, though, since
users can easily narrow their search by specifying a pattern or patterns
for the kernel to match against.