.\"
.\" Copyright (c) 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd August 31, 2025
.Dt PCI 4
.Os
.Sh NAME
.Nm pci
.Nd generic PCI/PCIe bus driver
.Sh SYNOPSIS
To compile the PCI bus driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd device pci
.Ed
.Pp
To compile in support for Single Root I/O Virtualization
.Pq SR-IOV :
.Bd -ragged -offset indent
.Cd options PCI_IOV
.Ed
.Pp
To compile in support for native PCI-express HotPlug:
.Bd -ragged -offset indent
.Cd options PCI_HP
.Ed
.Sh DESCRIPTION
The
.Nm
driver provides support for
.Tn PCI
and
.Tn PCIe
devices in the kernel and limited access to
.Tn PCI
devices for userland.
.Pp
The
.Nm
driver provides a
.Pa /dev/pci
character device that can be used by userland programs to read and write
.Tn PCI
configuration registers.
Programs can also use this device to get a list of all
.Tn PCI
devices, or all
.Tn PCI
devices that match various patterns.
.Pp
Since the
.Nm
driver provides a write interface for
.Tn PCI
configuration registers, system administrators should exercise caution when
granting access to the
.Nm
device.
If used improperly, this driver can allow userland applications to
crash a machine or cause data loss.
In particular, the driver only allows operations on the opened
.Pa /dev/pci
to modify system state if the file descriptor was opened for writing.
For instance, the
.Dv PCIOCREAD
and
.Dv PCIOCBARMMAP
operations require a writeable descriptor, because reading a configuration
register or accessing a BAR could have function-specific side effects.
.Pp
The
.Nm
driver implements the
.Tn PCI
bus in the kernel.
It enumerates any devices on the
.Tn PCI
bus and gives
.Tn PCI
client drivers the chance to attach to them.
It assigns resources to children when the BIOS does not.
It takes care of routing interrupts when necessary.
It reprobes the unattached
.Tn PCI
children when
.Tn PCI
client drivers are dynamically
loaded at runtime.
The
.Nm
driver also includes support for PCI-PCI bridges,
various platform-specific Host-PCI bridges,
and basic support for
.Tn PCI
VGA adapters.
.Sh IOCTLS
The following
.Xr ioctl 2
calls are supported by the
.Nm
driver.
They are defined in the header file
.In sys/pciio.h .
.Bl -tag -width 012345678901234
.It PCIOCGETCONF
This
.Xr ioctl 2
takes a
.Va pci_conf_io
structure.
It allows the user to retrieve information on all
.Tn PCI
devices in the system, or on
.Tn PCI
devices matching patterns supplied by the user.
The call may set
.Va errno
to any value specified in either
.Xr copyin 9
or
.Xr copyout 9 .
The
.Va pci_conf_io
structure consists of a number of fields:
.Bl -tag -width match_buf_len
.It pat_buf_len
The length, in bytes, of the buffer filled with user-supplied patterns.
.It num_patterns
The number of user-supplied patterns.
.It patterns
Pointer to a buffer filled with user-supplied patterns.
.Va patterns
is a pointer to
.Va num_patterns
.Va pci_match_conf
structures.
The
.Va pci_match_conf
structure consists of the following elements:
.Bl -tag -width pd_vendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pd_name
.Tn PCI
device driver name.
.It pd_unit
.Tn PCI
device driver unit number.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It flags
The flags describe which of the fields the kernel should match against.
A device must match all specified fields in order to be returned.
The match flags are enumerated in the
.Va pci_getconf_flags
enumeration.
Hopefully the flag values are obvious enough that they do not need to
be described in detail.
.El
.It match_buf_len
Length of the
.Va matches
buffer allocated by the user to hold the results of the
.Dv PCIOCGETCONF
query.
.It num_matches
Number of matches returned by the kernel.
.It matches
Buffer containing matching devices returned by the kernel.
The items in this buffer are of type
.Va pci_conf ,
which consists of the following items:
.Bl -tag -width pc_subvendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pc_hdr
.Tn PCI
header type.
.It pc_subvendor
.Tn PCI
subvendor ID.
.It pc_subdevice
.Tn PCI
subdevice ID.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It pc_subclass
.Tn PCI
device subclass.
.It pc_progif
.Tn PCI
device programming interface.
.It pc_revid
.Tn PCI
revision ID.
.It pd_name
Driver name.
.It pd_unit
Driver unit number.
.It pd_numa_domain
Driver NUMA domain.
.It pc_reported_len
Length of the valid portion of the encompassing
.Vt pci_conf
structure.
This should always be equivalent to the offset of the
.Va pc_spare
member.
.It pc_spare
Reserved for future use.
.El
.It offset
The offset is passed in by the user to tell the kernel where it should
start traversing the device list.
The value passed out by the kernel
points to the record immediately after the last one returned.
The user may
pass the value returned by the kernel in subsequent calls to the
.Dv PCIOCGETCONF
ioctl.
If the user does not intend to use the offset, it must be set to zero.
.It generation
.Tn PCI
configuration generation.
This value only needs to be set if the offset is set.
The kernel will compare the current generation number of its internal
device list to the generation passed in by the user to determine whether
its device list has changed since the user last called the
.Dv PCIOCGETCONF
ioctl.
If the device list has changed, a status of
.Va PCI_GETCONF_LIST_CHANGED
will be passed back.
.It status
The status reports the disposition of the request for a device list.
The possible status values are:
.Bl -ohang
.It PCI_GETCONF_LAST_DEVICE
This means that there are no more devices in the PCI device list matching
the specified criteria after the
ones returned in the
.Va matches
buffer.
.It PCI_GETCONF_LIST_CHANGED
This status tells the user that the
.Tn PCI
device list has changed since the last call to the
.Dv PCIOCGETCONF
ioctl and that the caller must reset the
.Va offset
and
.Va generation
to zero to start over at the beginning of the list.
.It PCI_GETCONF_MORE_DEVS
This tells the user that the supplied buffer was not large enough to hold
all of the remaining devices in the device list that match the specified
criteria.
.It PCI_GETCONF_ERROR
This indicates a general error while servicing the user's request.
If the
.Va pat_buf_len
is not equal to
.Va num_patterns
times
.Fn sizeof "struct pci_match_conf" ,
.Va errno
will be set to
.Er EINVAL .
.El
.El
.It PCIOCREAD
This
.Xr ioctl 2
reads the
.Tn PCI
configuration registers specified by the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure consists of the following fields:
.Bl -tag -width pi_width
.It pi_sel
A
.Va pcisel
structure which specifies the domain, bus, slot and function the user would
like to query.
If the specified bus is not found,
.Va errno
is set to
.Er ENODEV
and the ioctl returns -1.
.It pi_reg
The
.Tn PCI
configuration registers the user would like to access.
.It pi_width
The width, in bytes, of the data the user would like to read.
This value may be 1, 2, or 4.
3-byte reads and reads larger than 4 bytes are
not supported.
If an invalid width is passed,
.Va errno
is set to
.Er EINVAL .
.It pi_data
The data returned by the kernel.
.El
.It PCIOCWRITE
This
.Xr ioctl 2
allows users to write to the
.Tn PCI
configuration registers specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above.
The limitations on data width described for
reading registers, above, also apply to writing
.Tn PCI
configuration registers.
.It PCIOCATTACHED
This
.Xr ioctl 2
allows users to query if a driver is attached to the
.Tn PCI
device specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above; however, the
.Va pi_reg
and
.Va pi_width
fields are not used.
The status of the device is stored in the
.Va pi_data
field.
A value of 0 indicates no driver is attached, while a value larger than 0
indicates that a driver is attached.
.It PCIOCBARMMAP
This
.Xr ioctl 2
command allows a userspace process to
.Xr mmap 2
a memory-mapped PCI BAR into its address space.
The input parameters and results are passed in the
.Va pci_bar_mmap
structure, which has the following fields:
.Bl -tag -width Vt struct pcise pbm_sel
.It Vt void *pbm_map_base
Reports the established mapping base to the caller.
If the
.Va PCIIO_BAR_MMAP_FIXED
flag was specified, then this field must be filled before the call
with the desired address for the mapping.
.It Vt size_t pbm_map_length
Reports the mapped length of the BAR, in bytes.
Its
.Vt size_t
value is always a multiple of the machine page size.
.It Vt uint64_t pbm_bar_length
Reports the length of the BAR as exposed by the device.
.It Vt int pbm_bar_off
Reports the offset from the mapped base to the start of the
first register in the BAR.
.It Vt struct pcisel pbm_sel
Should be filled before the call.
Describes the device to operate on.
.It Vt int pbm_reg
The BAR index to mmap.
.It Vt int pbm_flags
Flags which augment the operation.
See below.
.It Vt int pbm_memattr
The caching attribute for the mapping.
Typical values are
.Dv VM_MEMATTR_UNCACHEABLE
for control register BARs, and
.Dv VM_MEMATTR_WRITE_COMBINING
for frame buffers.
Regular memory-like BARs should be mapped with the
.Dv VM_MEMATTR_DEFAULT
attribute.
.El
.Pp
Currently defined flags are:
.Bl -tag -width PCIIO_BAR_MMAP_ACTIVATE
.It PCIIO_BAR_MMAP_FIXED
The resulting mapping should be established at the address
specified by the
.Va pbm_map_base
member; otherwise the operation fails.
.It PCIIO_BAR_MMAP_EXCL
Must be used together with
.Dv PCIIO_BAR_MMAP_FIXED .
If the specified base contains already established mappings, the
operation fails instead of implicitly unmapping them.
.It PCIIO_BAR_MMAP_RW
The requested mapping allows both reading and writing.
Without the flag, a read-only mapping is established.
Note that it is common for the device registers to have side effects
even on reads.
.It PCIIO_BAR_MMAP_ACTIVATE
(Unimplemented) If the BAR is not activated, activate it in the course
of mapping.
Currently, an attempt to mmap an inactive BAR results in an error.
.El
.It PCIOCBARIO
This
.Xr ioctl 2
command allows users to read from and write to BARs.
The I/O request parameters are passed in a
.Va struct pci_bar_ioreq
structure, which has the following fields:
.Bl -tag
.It Vt struct pcisel pbi_sel
Describes the device to operate on.
.It Vt int pbi_op
The operation to perform.
Currently supported values are
.Dv PCIBARIO_READ
and
.Dv PCIBARIO_WRITE .
.It Vt uint32_t pbi_bar
The index of the BAR on which to operate.
.It Vt uint32_t pbi_offset
The offset into the BAR at which to operate.
.It Vt uint32_t pbi_width
The size, in bytes, of the I/O operation.
1-byte, 2-byte, 4-byte and 8-byte operations are supported.
.It Vt uint32_t pbi_value
For reads, the value is returned in this field.
For writes, the caller specifies the value to be written in this field.
.Pp
Note that this operation maps and unmaps the corresponding resource and
so is relatively expensive for memory BARs.
The
.Dv PCIOCBARMMAP
.Xr ioctl 2
can be used to create a persistent userspace mapping for such BARs instead.
.El
.El
.Sh LOADER TUNABLES
Tunables can be set at the
.Xr loader 8
prompt before booting the kernel, or stored in
.Xr loader.conf 5 .
The current value of these tunables can be examined at runtime via
.Xr sysctl 8
nodes of the same name.
Unless otherwise specified,
each of these tunables is a boolean that can be enabled by setting the
tunable to a non-zero value.
.Bl -tag -width indent
.It Va hw.pci.clear_bars Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resources.
This forces the
.Tn PCI
bus driver to allocate resource ranges for memory and I/O port resources
from scratch.
.It Va hw.pci.clear_buses Pq Defaults to 0
Ignore any firmware-assigned bus number registers in PCI-PCI bridges.
This forces the
.Tn PCI
bus driver and PCI-PCI bridge driver to allocate bus numbers for secondary
buses behind PCI-PCI bridges.
.It Va hw.pci.clear_pcib Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resource windows in PCI-PCI
bridges.
This forces the PCI-PCI bridge driver to allocate memory and I/O port resources
for resource windows from scratch.
.Pp
By default the PCI-PCI bridge driver will allocate windows that
contain the firmware-assigned resources of devices behind the bridge.
In addition, the PCI-PCI bridge driver will suballocate from existing window
regions when possible to satisfy a resource request.
As a result,
both
.Va hw.pci.clear_bars
and
.Va hw.pci.clear_pcib
must be enabled to fully ignore firmware-supplied resource assignments.
.It Va hw.pci.default_vgapci_unit Pq Defaults to -1
By default,
the first
.Tn PCI
VGA adapter encountered by the system is assumed to be the boot display device.
This tunable can be set to choose a specific VGA adapter by specifying the
unit number of the associated
.Va vgapci Ns Ar X
device.
.It Va hw.pci.do_power_nodriver Pq Defaults to 0
Place devices into a low power state
.Pq D3
when a suitable device driver is not found.
Can be set to one of the following values:
.Bl -tag -width indent
.It 3
Powers down all
.Tn PCI
devices without a device driver.
.It 2
Powers down most devices without a device driver.
PCI devices with the display, memory, and base peripheral device classes
are not powered down.
.It 1
Similar to a setting of 2 except that storage controllers are also not
powered down.
.It 0
All devices are left fully powered.
.El
.Pp
A
.Tn PCI
device must support power management to be powered down.
Placing a device into a low power state may not reduce power consumption.
.It Va hw.pci.do_power_resume Pq Defaults to 1
Place
.Tn PCI
devices into the fully powered state when resuming either the system or an
individual device.
Setting this to zero is discouraged as the system will not attempt to power
up non-powered PCI devices after a suspend.
.It Va hw.pci.do_power_suspend Pq Defaults to 1
Place
.Tn PCI
devices into a low power state when suspending either the system or individual
devices.
Normally the D3 state is used as the low power state,
but firmware may override the desired power state during a system suspend.
.It Va hw.pci.enable_ari Pq Defaults to 1
Enable support for PCI-express Alternative RID Interpretation.
This is often used in conjunction with SR-IOV.
.It Va hw.pci.enable_io_modes Pq Defaults to 1
Enable memory or I/O port decoding in a PCI device's command register if it has
firmware-assigned memory or I/O port resources.
The firmware
.Pq BIOS
in some systems does not enable memory or I/O port decoding for some devices
even when it has assigned resources to the device.
This enables decoding for such resources during bus probe.
.It Va hw.pci.enable_msi Pq Defaults to 1
Enable support for Message Signalled Interrupts
.Pq MSI .
MSI interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_msix Pq Defaults to 1
Enable support for extended Message Signalled Interrupts
.Pq MSI-X .
MSI-X interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_pcie_ei Pq Defaults to 0
Enable support for PCI-express Electromechanical Interlock.
.It Va hw.pci.enable_pcie_hp Pq Defaults to 1
Enable support for native PCI-express HotPlug.
.It Va hw.pci.honor_msi_blacklist Pq Defaults to 1
MSI and MSI-X interrupts are disabled for certain chipsets known to have
broken MSI and MSI-X implementations when this tunable is set.
It can be set to zero to permit use of MSI and MSI-X interrupts if the
chipset match is a false positive.
.It Va hw.pci.iov_max_config Pq Defaults to 1MB
The maximum amount of memory permitted for the configuration parameters
used when creating Virtual Functions via SR-IOV.
This tunable can also be changed at runtime via
.Xr sysctl 8 .
.It Va hw.pci.realloc_bars Pq Defaults to 0
Attempt to allocate a new resource range during the initial device scan
for any memory or I/O port resources with firmware-assigned ranges that
conflict with another active resource.
.It Va hw.pci.usb_early_takeover Pq Defaults to 1 on Tn amd64 and Tn i386
Disable legacy device emulation of USB devices during the initial device
scan.
Set this tunable to zero to use USB devices via legacy emulation when
using a custom kernel without USB controller drivers.
.It Va hw.pci<D>.<B>.<S>.INT<P>.irq
These tunables can be used to override the interrupt routing for legacy
PCI INTx interrupts.
Unlike other tunables in this list,
these do not have corresponding sysctl nodes.
The tunable name includes the address of the PCI device as well as the
pin of the desired INTx IRQ to override:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <P>
The interrupt pin of the PCI slot to override.
One of
.Ql A ,
.Ql B ,
.Ql C ,
or
.Ql D .
.El
.Pp
The value of the tunable is the raw IRQ value to use for the INTx interrupt
pin identified by the tunable name.
Mapping of IRQ values to platform interrupt sources is machine dependent.
.El
.Sh DEVICE WIRING
You can wire the device unit at a given location with
.Xr device.hints 5 .
.Ss BSF Based Wiring
Devices may be wired to a Bus / Slot / Function (BSF) address.
This is the form reported by
.Xr pciconf 8 .
Entries of the form
.Va hint.<name>.<unit>.at="pci<B>:<S>:<F>"
or
.Va hint.<name>.<unit>.at="pci<D>:<B>:<S>:<F>"
will force the driver
.Va name
to probe and attach at unit
.Va unit
for any PCI device found to match the specification, where:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
Defaults to 0 if unspecified.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <F>
The function of the PCI device in decimal.
.El
.Pp
The code to do the matching requires an exact string match.
Do not specify the angle brackets
.Pq < >
in the hints file.
Wiring multiple devices to the same
.Va name
and
.Va unit
produces undefined results.
.Ss Examples
Given the following lines in
.Pa /boot/device.hints :
.Bd -literal
hint.nvme.3.at="pci6:0:0"
hint.igb.8.at="pci14:0:0"
.Ed
.Pp
If there is a device that supports
.Xr igb 4
at PCI bus 14 slot 0 function 0,
then it will be assigned igb8 for probe and attach.
Likewise, if there is an
.Xr nvme 4
device at PCI bus 6 slot 0 function 0,
then it will be assigned nvme3 for probe and attach.
If another type of card is in either of these locations, the name and
unit of that card will be the default names and will be unaffected by
these hints.
If other igb or nvme cards are located elsewhere, they will be
assigned their unit numbers sequentially, skipping the unit numbers
that have 'at' hints.
.Ss Location Based Wiring
While it is simple to determine where a device sits for BSF wiring, the
bus number is not invariant.
Any number of changes to the devices within the system can cause
this value to vary from boot to boot.
The UEFI Standard defines a device path that is based only on the invariant
parts of the address: the root complex (domain), the slot number, and the
function.
These paths are hard to construct by hand; use the
.Xr devctl 8
.Sq Cm getpath
command with a
.Sq Ar UEFI
locator to obtain them.
The above example could also be expressed as
.Bd -literal
hint.nvme.3.at="PciRoot(0x2)/Pci(0x1,0x3)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)"
hint.igb.8.at="PciRoot(0x1)/Pci(0x2,0x2)/Pci(0x0,0x0)/Pci(0x0,0x0)"
.Ed
.Pp
The advantage of this notation is that you can specify the exact location
of a device.
For deployments of multiple systems with the same configuration, this can be
helpful in managing the devices.
However, even slight variation in motherboards can cause the path to change
substantially.
UEFI device paths are also less natural to think about, since little else
reports them.
.Sh FILES
.Bl -tag -width /dev/pci -compact
.It Pa /dev/pci
Character device for the
.Nm
driver.
.El
.Sh SEE ALSO
.Xr device.hints 5 ,
.Xr pciconf 8
.Sh HISTORY
The
.Nm
driver (not the kernel's
.Tn PCI
support code) first appeared in
.Fx 2.2 ,
and was written by Stefan Esser and Garrett Wollman.
Support for device listing and matching was re-implemented by
Kenneth Merry, and first appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq Mt ken@FreeBSD.org
.Sh BUGS
It is not possible for users to specify an accurate offset into the device
list without calling the
.Dv PCIOCGETCONF
ioctl at least once, since they have no way of knowing the current generation
number otherwise.
This probably is not a serious problem, though, since
users can easily narrow their search by specifying a pattern or patterns
for the kernel to match against.
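.Sh EXAMPLES
The string layouts described under LOADER TUNABLES and DEVICE WIRING must
match exactly, so it can help to generate them rather than type them by hand.
The following is a minimal sketch, not part of the driver or of any system
library.
The helper names pci_wire_hint and pci_intx_tunable are hypothetical and are
introduced here only for illustration; the sketch assumes that all address
components are written in decimal, as stated above, and that a domain of 0 is
left out of the BSF form.
.Bd -literal
```c
#include <stdio.h>

/*
 * Hypothetical helpers, not part of any FreeBSD API.
 * They only format strings in the layouts described in this page.
 */

/*
 * Format hint.<name>.<unit>.at="pci<D>:<B>:<S>:<F>" into buf.
 * The domain is left out when it is 0, the documented default.
 * Returns the snprintf(3) result.
 */
static int
pci_wire_hint(char *buf, size_t len, const char *name, int unit,
    unsigned domain, unsigned bus, unsigned slot, unsigned func)
{
    const char q = '"';    /* quote character around the location */

    if (domain == 0)
        return (snprintf(buf, len, "hint.%s.%d.at=%cpci%u:%u:%u%c",
            name, unit, q, bus, slot, func, q));
    return (snprintf(buf, len, "hint.%s.%d.at=%cpci%u:%u:%u:%u%c",
        name, unit, q, domain, bus, slot, func, q));
}

/*
 * Format hw.pci<D>.<B>.<S>.INT<P>.irq into buf.
 * Returns -1 if pin is not one of 'A' through 'D'.
 */
static int
pci_intx_tunable(char *buf, size_t len, unsigned domain, unsigned bus,
    unsigned slot, char pin)
{
    if (pin < 'A' || pin > 'D')
        return (-1);
    return (snprintf(buf, len, "hw.pci%u.%u.%u.INT%c.irq",
        domain, bus, slot, pin));
}
```
.Ed
.Pp
Calling the first helper with the name igb, unit 8, domain 0, bus 14, slot 0
and function 0 yields the igb hint line shown in the BSF wiring example.
The output still has to be placed in
.Pa /boot/device.hints
or
.Xr loader.conf 5
by the administrator.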