.\"
.\" Copyright (c) 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 17, 2019
.Dt PCI 4
.Os
.Sh NAME
.Nm pci
.Nd generic PCI bus driver
.Sh SYNOPSIS
To compile the PCI bus driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd device pci
.Ed
.Pp
To compile in support for Single Root I/O Virtualization
.Pq SR-IOV :
.Bd -ragged -offset indent
.Cd options PCI_IOV
.Ed
.Pp
To compile in support for native PCI-express HotPlug:
.Bd -ragged -offset indent
.Cd options PCI_HP
.Ed
.Sh DESCRIPTION
The
.Nm
driver provides support for
.Tn PCI
devices in the kernel and limited access to
.Tn PCI
devices for userland.
.Pp
The
.Nm
driver provides a
.Pa /dev/pci
character device that can be used by userland programs to read and write
.Tn PCI
configuration registers.
Programs can also use this device to get a list of all
.Tn PCI
devices, or all
.Tn PCI
devices that match various patterns.
.Pp
Since the
.Nm
driver provides a write interface for
.Tn PCI
configuration registers, system administrators should exercise caution when
granting access to the
.Nm
device.
If used improperly, this driver can allow userland applications to
crash a machine or cause data loss.
.Pp
The
.Nm
driver implements the
.Tn PCI
bus in the kernel.
It enumerates any devices on the
.Tn PCI
bus and gives
.Tn PCI
client drivers the chance to attach to them.
It assigns resources to children when the BIOS does not.
It takes care of routing interrupts when necessary.
It reprobes the unattached
.Tn PCI
children when
.Tn PCI
client drivers are dynamically
loaded at runtime.
The
.Nm
driver also includes support for PCI-PCI bridges,
various platform-specific Host-PCI bridges,
and basic support for
.Tn PCI
VGA adapters.
.Sh IOCTLS
The following
.Xr ioctl 2
calls are supported by the
.Nm
driver.
They are defined in the header file
.In sys/pciio.h .
.Bl -tag -width 012345678901234
.It PCIOCGETCONF
This
.Xr ioctl 2
takes a
.Va pci_conf_io
structure.
It allows the user to retrieve information on all
.Tn PCI
devices in the system, or on
.Tn PCI
devices matching patterns supplied by the user.
The call may set
.Va errno
to any value specified in either
.Xr copyin 9
or
.Xr copyout 9 .
The
.Va pci_conf_io
structure consists of a number of fields:
.Bl -tag -width match_buf_len
.It pat_buf_len
The length, in bytes, of the buffer filled with user-supplied patterns.
.It num_patterns
The number of user-supplied patterns.
.It patterns
Pointer to a buffer filled with user-supplied patterns.
.Va patterns
is a pointer to
.Va num_patterns
.Va pci_match_conf
structures.
The
.Va pci_match_conf
structure consists of the following elements:
.Bl -tag -width pd_vendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pd_name
.Tn PCI
device driver name.
.It pd_unit
.Tn PCI
device driver unit number.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It flags
The flags describe which of the fields the kernel should match against.
A device must match all specified fields in order to be returned.
The match flags are enumerated in the
.Va pci_getconf_flags
enumeration.
The flag names are intended to be self-explanatory and are not
described in detail here.
.El
.It match_buf_len
Length of the
.Va matches
buffer allocated by the user to hold the results of the
.Dv PCIOCGETCONF
query.
.It num_matches
Number of matches returned by the kernel.
.It matches
Buffer containing matching devices returned by the kernel.
The items in this buffer are of type
.Va pci_conf ,
which consists of the following items:
.Bl -tag -width pc_subvendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pc_hdr
.Tn PCI
header type.
.It pc_subvendor
.Tn PCI
subvendor ID.
.It pc_subdevice
.Tn PCI
subdevice ID.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It pc_subclass
.Tn PCI
device subclass.
.It pc_progif
.Tn PCI
device programming interface.
.It pc_revid
.Tn PCI
revision ID.
.It pd_name
Driver name.
.It pd_unit
Driver unit number.
.El
.It offset
The offset is passed in by the user to tell the kernel where it should
start traversing the device list.
The value passed out by the kernel
points to the record immediately after the last one returned.
The user may
pass the value returned by the kernel in subsequent calls to the
.Dv PCIOCGETCONF
ioctl.
If the user does not intend to use the offset, it must be set to zero.
.It generation
.Tn PCI
configuration generation.
This value only needs to be set if the offset is set.
The kernel will compare the current generation number of its internal
device list to the generation passed in by the user to determine whether
its device list has changed since the user last called the
.Dv PCIOCGETCONF
ioctl.
If the device list has changed, a status of
.Va PCI_GETCONF_LIST_CHANGED
will be passed back.
.It status
The status tells the user the disposition of the request for a device list.
The possible status values are:
.Bl -ohang
.It PCI_GETCONF_LAST_DEVICE
This means that there are no more devices in the PCI device list matching
the specified criteria after the
ones returned in the
.Va matches
buffer.
.It PCI_GETCONF_LIST_CHANGED
This status tells the user that the
.Tn PCI
device list has changed since the last call to the
.Dv PCIOCGETCONF
ioctl and that the
.Va offset
and
.Va generation
must be reset to zero to start over at the beginning of the list.
.It PCI_GETCONF_MORE_DEVS
This tells the user that the supplied buffer was not large enough to hold
all of the remaining devices in the device list that match the criteria.
.It PCI_GETCONF_ERROR
This indicates a general error while servicing the user's request.
If the
.Va pat_buf_len
is not equal to
.Va num_patterns
times
.Fn sizeof "struct pci_match_conf" ,
.Va errno
will be set to
.Er EINVAL .
.El
.El
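.Pp
The way a caller is expected to react to each
.Va status
value can be sketched as a small decision function.
This is an illustrative sketch only: the enum, structure and function names
below are hypothetical stand-ins chosen so the fragment compiles on any system.
A real program would use the definitions from
.In sys/pciio.h
and issue the
.Dv PCIOCGETCONF
ioctl between calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the status values named in this section. */
enum demo_status {
	DEMO_LAST_DEVICE,	/* no more matching devices */
	DEMO_LIST_CHANGED,	/* device list changed between calls */
	DEMO_MORE_DEVS,		/* match buffer was too small */
	DEMO_ERROR		/* general error */
};

/* What the caller should do after one PCIOCGETCONF call. */
enum demo_action {
	DEMO_DONE,		/* all matches have been consumed */
	DEMO_CALL_AGAIN,	/* repeat the ioctl with the current cursor */
	DEMO_FAIL		/* give up and inspect errno */
};

/* Cursor fields the caller carries from one call to the next. */
struct demo_cursor {
	uint32_t offset;
	uint32_t generation;
};

static enum demo_action
demo_next_action(enum demo_status status, struct demo_cursor *cur)
{
	assert(cur != NULL);
	switch (status) {
	case DEMO_LAST_DEVICE:
		return (DEMO_DONE);
	case DEMO_MORE_DEVS:
		/* Keep the offset and generation the kernel passed back. */
		return (DEMO_CALL_AGAIN);
	case DEMO_LIST_CHANGED:
		/* Start over at the beginning of the list. */
		cur->offset = 0;
		cur->generation = 0;
		return (DEMO_CALL_AGAIN);
	case DEMO_ERROR:
	default:
		return (DEMO_FAIL);
	}
}
```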
.It PCIOCREAD
This
.Xr ioctl 2
reads the
.Tn PCI
configuration registers specified by the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure consists of the following fields:
.Bl -tag -width pi_width
.It pi_sel
A
.Va pcisel
structure which specifies the domain, bus, slot and function the user would
like to query.
If the specified bus is not found,
.Va errno
will be set to
.Er ENODEV
and -1 returned from the ioctl.
.It pi_reg
The
.Tn PCI
configuration registers the user would like to access.
.It pi_width
The width, in bytes, of the data the user would like to read.
This value
may be either 1, 2, or 4.
3-byte reads and reads larger than 4 bytes are
not supported.
If an invalid width is passed,
.Va errno
will be set to
.Er EINVAL .
.It pi_data
The data returned by the kernel.
.El
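.Pp
A caller can apply the width rule before issuing the ioctl.
The fragment below is a minimal sketch of such a pre-flight check.
The function name is hypothetical, and it models only the width restriction
stated above, not any other validation the kernel may perform.

```c
#include <errno.h>

/*
 * Mirror the documented width rule: only 1-, 2- and 4-byte accesses are
 * accepted.  Returns 0 when the width is acceptable and EINVAL otherwise,
 * which is the errno value the text says an invalid width produces.
 */
static int
demo_check_width(int width)
{
	switch (width) {
	case 1:
	case 2:
	case 4:
		return (0);
	default:
		return (EINVAL);
	}
}
```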
.It PCIOCWRITE
This
.Xr ioctl 2
allows users to write to the
.Tn PCI
configuration registers specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above.
The limitations on data width described for
reading registers, above, also apply to writing
.Tn PCI
configuration registers.
.It PCIOCATTACHED
This
.Xr ioctl 2
allows users to query if a driver is attached to the
.Tn PCI
device specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above; however, the
.Va pi_reg
and
.Va pi_width
fields are not used.
The status of the device is stored in the
.Va pi_data
field.
A value of 0 indicates no driver is attached, while a value larger than 0
indicates that a driver is attached.
.It PCIOCBARMMAP
This
.Xr ioctl 2
command allows a userspace process to
.Xr mmap 2
a memory-mapped PCI BAR into its address space.
The input parameters and results are passed in the
.Va pci_bar_mmap
structure, which has the following fields:
.Bl -tag -width "struct pcisel pbm_sel"
.It Vt uint64_t pbm_map_base
Reports the established mapping base to the caller.
If the
.Va PCIIO_BAR_MMAP_FIXED
flag is specified, then this field must be filled before the call
with the desired address for the mapping.
.It Vt uint64_t pbm_map_length
Reports the mapped length of the BAR, in bytes.
Its
.Vt uint64_t
value is always a multiple of the machine page size.
.It Vt int64_t pbm_bar_length
Reports the length of the BAR as exposed by the device.
.It Vt int pbm_bar_off
Reports the offset from the mapped base to the start of the
first register in the BAR.
.It Vt struct pcisel pbm_sel
Should be filled before the call.
Describes the device to operate on.
.It Vt int pbm_reg
The BAR index to mmap.
.It Vt int pbm_flags
Flags that modify the operation.
See below.
.It Vt int pbm_memattr
The caching attribute for the mapping.
Typical values are
.Dv VM_MEMATTR_UNCACHEABLE
for control register BARs, and
.Dv VM_MEMATTR_WRITE_COMBINING
for frame buffers.
A regular memory-like BAR should be mapped with the
.Dv VM_MEMATTR_DEFAULT
attribute.
.El
.Pp
Currently defined flags are:
.Bl -tag -width PCIIO_BAR_MMAP_ACTIVATE
.It PCIIO_BAR_MMAP_FIXED
The resulting mapping must be established at the address
specified by the
.Va pbm_map_base
member; otherwise the call fails.
.It PCIIO_BAR_MMAP_EXCL
Must be used together with
.Dv PCIIO_BAR_MMAP_FIXED .
If the specified base contains already established mappings, the
operation fails instead of implicitly unmapping them.
.It PCIIO_BAR_MMAP_RW
The requested mapping allows both reading and writing.
Without the flag, a read-only mapping is established.
Note that it is common for device registers to have side effects
even on reads.
.It PCIIO_BAR_MMAP_ACTIVATE
(Unimplemented) If the BAR is not activated, activate it in the course
of mapping.
Currently, an attempt to mmap an inactive BAR results in an error.
.El
.El
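.Pp
Two of the
.Dv PCIOCBARMMAP
rules described above lend themselves to a short sketch: the requirement that
the exclusive flag is combined with the fixed flag, and the use of
.Va pbm_bar_off
to locate the first register inside the mapping.
The macro values and function names below are hypothetical stand-ins so that
the fragment is self-contained.
Real programs should use the flag definitions from
.In sys/pciio.h .

```c
#include <stdint.h>

/* Hypothetical stand-in bit values for the flags named in this section. */
#define	DEMO_MMAP_FIXED	0x01
#define	DEMO_MMAP_EXCL	0x02
#define	DEMO_MMAP_RW	0x04

/* Returns 1 if the flag combination obeys "EXCL requires FIXED", else 0. */
static int
demo_flags_ok(int flags)
{
	if ((flags & DEMO_MMAP_EXCL) != 0 && (flags & DEMO_MMAP_FIXED) == 0)
		return (0);
	return (1);
}

/*
 * Address of the first BAR register inside an established mapping:
 * the reported mapping base plus the reported offset.
 */
static uint64_t
demo_first_register(uint64_t map_base, int bar_off)
{
	return (map_base + (uint64_t)bar_off);
}
```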
.Sh LOADER TUNABLES
Tunables can be set at the
.Xr loader 8
prompt before booting the kernel, or stored in
.Xr loader.conf 5 .
The current value of these tunables can be examined at runtime via
.Xr sysctl 8
nodes of the same name.
Unless otherwise specified,
each of these tunables is a boolean that can be enabled by setting the
tunable to a non-zero value.
.Bl -tag -width indent
.It Va hw.pci.clear_bars Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resources.
This forces the
.Tn PCI
bus driver to allocate resource ranges for memory and I/O port resources
from scratch.
.It Va hw.pci.clear_buses Pq Defaults to 0
Ignore any firmware-assigned bus number registers in PCI-PCI bridges.
This forces the
.Tn PCI
bus driver and PCI-PCI bridge driver to allocate bus numbers for secondary
buses behind PCI-PCI bridges.
.It Va hw.pci.clear_pcib Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resource windows in PCI-PCI
bridges.
This forces the PCI-PCI bridge driver to allocate memory and I/O port resources
for resource windows from scratch.
.Pp
By default the PCI-PCI bridge driver will allocate windows that
contain the firmware-assigned resources of devices behind the bridge.
In addition, the PCI-PCI bridge driver will suballocate from existing window
regions when possible to satisfy a resource request.
As a result,
both
.Va hw.pci.clear_bars
and
.Va hw.pci.clear_pcib
must be enabled to fully ignore firmware-supplied resource assignments.
.It Va hw.pci.default_vgapci_unit Pq Defaults to -1
By default,
the first
.Tn PCI
VGA adapter encountered by the system is assumed to be the boot display device.
This tunable can be set to choose a specific VGA adapter by specifying the
unit number of the associated
.Va vgapci Ns Ar X
device.
.It Va hw.pci.do_power_nodriver Pq Defaults to 0
Place devices into a low power state
.Pq D3
when a suitable device driver is not found.
Can be set to one of the following values:
.Bl -tag -width indent
.It 3
Powers down all
.Tn PCI
devices without a device driver.
.It 2
Powers down most devices without a device driver.
PCI devices with the display, memory, and base peripheral device classes
are not powered down.
.It 1
Similar to a setting of 2 except that storage controllers are also not
powered down.
.It 0
All devices are left fully powered.
.El
.Pp
A
.Tn PCI
device must support power management to be powered down.
Placing a device into a low power state may not reduce power consumption.
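.Pp
The four
.Va hw.pci.do_power_nodriver
levels can be read as a per-class policy.
The following sketch encodes that reading.
The function and macro names are hypothetical, the class numbers are the
standard PCI base class codes, and the treatment of values outside 0 to 3 is
an assumption of this sketch rather than documented behavior.
Whether a device supports power management is not modeled.

```c
/* Standard PCI base class codes referenced by the policy. */
#define	DEMO_CLASS_STORAGE	0x01
#define	DEMO_CLASS_DISPLAY	0x03
#define	DEMO_CLASS_MEMORY	0x05
#define	DEMO_CLASS_BASEPERIPH	0x08

/*
 * Returns 1 if a driverless device with the given base class would be
 * powered down at the given level, and 0 if it would stay powered.
 */
static int
demo_power_down(int level, int base_class)
{
	if (level <= 0)
		return (0);	/* level 0: everything stays powered */
	if (level >= 3)
		return (1);	/* level 3: everything is powered down */

	/* Levels 1 and 2 spare display, memory and base peripherals. */
	switch (base_class) {
	case DEMO_CLASS_DISPLAY:
	case DEMO_CLASS_MEMORY:
	case DEMO_CLASS_BASEPERIPH:
		return (0);
	case DEMO_CLASS_STORAGE:
		/* Level 1 additionally spares storage controllers. */
		return (level == 2);
	default:
		return (1);
	}
}
```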
.It Va hw.pci.do_power_resume Pq Defaults to 1
Place
.Tn PCI
devices into the fully powered state when resuming either the system or an
individual device.
Setting this to zero is discouraged as the system will not attempt to power
up non-powered PCI devices after a suspend.
.It Va hw.pci.do_power_suspend Pq Defaults to 1
Place
.Tn PCI
devices into a low power state when suspending either the system or individual
devices.
Normally the D3 state is used as the low power state,
but firmware may override the desired power state during a system suspend.
.It Va hw.pci.enable_ari Pq Defaults to 1
Enable support for PCI-express Alternative RID Interpretation.
This is often used in conjunction with SR-IOV.
.It Va hw.pci.enable_io_modes Pq Defaults to 1
Enable memory or I/O port decoding in a PCI device's command register if it has
firmware-assigned memory or I/O port resources.
The firmware
.Pq BIOS
in some systems does not enable memory or I/O port decoding for some devices
even when it has assigned resources to the device.
This enables decoding for such resources during bus probe.
.It Va hw.pci.enable_msi Pq Defaults to 1
Enable support for Message Signalled Interrupts
.Pq MSI .
MSI interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_msix Pq Defaults to 1
Enable support for extended Message Signalled Interrupts
.Pq MSI-X .
MSI-X interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_pcie_hp Pq Defaults to 1
Enable support for native PCI-express HotPlug.
.It Va hw.pci.honor_msi_blacklist Pq Defaults to 1
MSI and MSI-X interrupts are disabled for certain chipsets known to have
broken MSI and MSI-X implementations when this tunable is set.
It can be set to zero to permit use of MSI and MSI-X interrupts if the
chipset match is a false positive.
.It Va hw.pci.iov_max_config Pq Defaults to 1MB
The maximum amount of memory permitted for the configuration parameters
used when creating Virtual Functions via SR-IOV.
This tunable can also be changed at runtime via
.Xr sysctl 8 .
.It Va hw.pci.realloc_bars Pq Defaults to 0
Attempt to allocate a new resource range during the initial device scan
for any memory or I/O port resources with firmware-assigned ranges that
conflict with another active resource.
.It Va hw.pci.usb_early_takeover Pq Defaults to 1 on Tn amd64 and Tn i386
Disable legacy device emulation of USB devices during the initial device
scan.
Set this tunable to zero to use USB devices via legacy emulation when
using a custom kernel without USB controller drivers.
.It Va hw.pci<D>.<B>.<S>.INT<P>.irq
These tunables can be used to override the interrupt routing for legacy
PCI INTx interrupts.
Unlike other tunables in this list,
these do not have corresponding sysctl nodes.
The tunable name includes the address of the PCI device as well as the
pin of the desired INTx IRQ to override:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <P>
The interrupt pin of the PCI slot to override.
One of
.Ql A ,
.Ql B ,
.Ql C ,
or
.Ql D .
.El
.Pp
The value of the tunable is the raw IRQ value to use for the INTx interrupt
pin identified by the tunable name.
Mapping of IRQ values to platform interrupt sources is machine dependent.
.El
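.Pp
The name of an INTx override tunable is assembled from the decimal domain,
bus and slot numbers and the pin letter, as described above.
As a rough illustration, the helper below formats such a name.
The function name is hypothetical, and the input checks are assumptions of
this sketch.

```c
#include <stdio.h>

/*
 * Format the INTx override tunable name for a device location.
 * Returns 0 on success, or -1 if the pin is not 'A'..'D', a number is
 * negative, or the buffer is too small to hold the name.
 */
static int
demo_intx_tunable(char *buf, size_t len, int domain, int bus, int slot,
    char pin)
{
	int n;

	if (pin < 'A' || pin > 'D' || domain < 0 || bus < 0 || slot < 0)
		return (-1);
	n = snprintf(buf, len, "hw.pci%d.%d.%d.INT%c.irq",
	    domain, bus, slot, pin);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (0);
}
```

For domain 0, bus 2, slot 31 and pin A, the helper produces the name
hw.pci0.2.31.INTA.irq, which would then be assigned the raw IRQ number at the
loader prompt or in
.Xr loader.conf 5 .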
.Sh DEVICE WIRING
You can wire the device unit at a given location with
.Xr device.hints 5 .
Entries of the form
.Va hint.<name>.<unit>.at="pci<B>:<S>:<F>"
or
.Va hint.<name>.<unit>.at="pci<D>:<B>:<S>:<F>"
will force the driver
.Va name
to probe and attach at unit
.Va unit
for any PCI device found to match the specification, where:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
Defaults to 0 if unspecified.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <F>
The function of the PCI device in decimal.
.El
.Pp
The code to do the matching requires an exact string match.
Do not specify the angle brackets
.Pq < >
in the hints file.
Wiring multiple devices to the same
.Va name
and
.Va unit
produces undefined results.
.Ss Examples
Given the following lines in
.Pa /boot/device.hints :
.Bd -literal -offset indent
hint.nvme.3.at="pci6:0:0"
hint.igb.8.at="pci14:0:0"
.Ed
.Pp
If there is a device that supports
.Xr igb 4
at PCI bus 14 slot 0 function 0,
then it will be assigned igb8 for probe and attach.
Likewise, if there is an
.Xr nvme 4
card at PCI bus 6 slot 0 function 0,
then it will be assigned nvme3 for probe and attach.
If another type of card is in either of these locations, the name and
unit of that card will be the default names and will be unaffected by
these hints.
If other igb or nvme cards are located elsewhere, they will be
assigned their unit numbers sequentially, skipping the unit numbers
that have 'at' hints.
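.Pp
Because the location is compared as an exact string, a hint written with
extra characters such as a leading zero does not match.
The sketch below illustrates that behavior under stated assumptions: the
function name is hypothetical, and treating the three-field form as
domain 0 follows the default described above rather than the kernel source.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the "at" hint string names the given device location.
 * Both accepted spellings are generated and compared with strcmp(), so
 * only an exact string match succeeds.
 */
static int
demo_hint_matches(const char *at, int domain, int bus, int slot, int func)
{
	char shortform[64], longform[64];

	snprintf(shortform, sizeof(shortform), "pci%d:%d:%d",
	    bus, slot, func);
	snprintf(longform, sizeof(longform), "pci%d:%d:%d:%d",
	    domain, bus, slot, func);
	if (strcmp(at, longform) == 0)
		return (1);
	/* The short form omits the domain, which then defaults to 0. */
	return (domain == 0 && strcmp(at, shortform) == 0);
}
```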
.Sh FILES
.Bl -tag -width /dev/pci -compact
.It Pa /dev/pci
Character device for the
.Nm
driver.
.El
.Sh SEE ALSO
.Xr pciconf 8
.Sh HISTORY
The
.Nm
driver (not the kernel's
.Tn PCI
support code) first appeared in
.Fx 2.2 ,
and was written by Stefan Esser and Garrett Wollman.
Support for device listing and matching was re-implemented by
Kenneth Merry, and first appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq Mt ken@FreeBSD.org
.Sh BUGS
It is not possible for users to specify an accurate offset into the device
list without calling the
.Dv PCIOCGETCONF
ioctl at least once, since they have no way of knowing the current generation
number otherwise.
This probably is not a serious problem, though, since
users can easily narrow their search by specifying a pattern or patterns
for the kernel to match against.