.\"
.\" Copyright (c) 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 17, 2019
.Dt PCI 4
.Os
.Sh NAME
.Nm pci
.Nd generic PCI bus driver
.Sh SYNOPSIS
To compile the PCI bus driver into the kernel,
place the following line in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd device pci
.Ed
.Pp
To compile in support for Single Root I/O Virtualization
.Pq SR-IOV :
.Bd -ragged -offset indent
.Cd options PCI_IOV
.Ed
.Pp
To compile in support for native PCI-express HotPlug:
.Bd -ragged -offset indent
.Cd options PCI_HP
.Ed
.Sh DESCRIPTION
The
.Nm
driver provides support for
.Tn PCI
devices in the kernel and limited access to
.Tn PCI
devices for userland.
.Pp
The
.Nm
driver provides a
.Pa /dev/pci
character device that can be used by userland programs to read and write
.Tn PCI
configuration registers.
Programs can also use this device to get a list of all
.Tn PCI
devices, or all
.Tn PCI
devices that match various patterns.
.Pp
Since the
.Nm
driver provides a write interface for
.Tn PCI
configuration registers, system administrators should exercise caution when
granting access to the
.Nm
device.
If used improperly, this driver can allow userland applications to
crash a machine or cause data loss.
.Pp
The
.Nm
driver implements the
.Tn PCI
bus in the kernel.
It enumerates any devices on the
.Tn PCI
bus and gives
.Tn PCI
client drivers the chance to attach to them.
It assigns resources to children when the BIOS does not.
It takes care of routing interrupts when necessary.
It reprobes the unattached
.Tn PCI
children when
.Tn PCI
client drivers are dynamically
loaded at runtime.
The
.Nm
driver also includes support for PCI-PCI bridges,
various platform-specific Host-PCI bridges,
and basic support for
.Tn PCI
VGA adapters.
.Sh IOCTLS
The following
.Xr ioctl 2
calls are supported by the
.Nm
driver.
They are defined in the header file
.In sys/pciio.h .
.Bl -tag -width 012345678901234
.It PCIOCGETCONF
This
.Xr ioctl 2
takes a
.Va pci_conf_io
structure.
It allows the user to retrieve information on all
.Tn PCI
devices in the system, or on
.Tn PCI
devices matching patterns supplied by the user.
The call may set
.Va errno
to any value specified in either
.Xr copyin 9
or
.Xr copyout 9 .
The
.Va pci_conf_io
structure consists of a number of fields:
.Bl -tag -width match_buf_len
.It pat_buf_len
The length, in bytes, of the buffer filled with user-supplied patterns.
.It num_patterns
The number of user-supplied patterns.
.It patterns
Pointer to a buffer filled with user-supplied patterns.
.Va patterns
is a pointer to
.Va num_patterns
.Va pci_match_conf
structures.
The
.Va pci_match_conf
structure consists of the following elements:
.Bl -tag -width pd_vendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pd_name
.Tn PCI
device driver name.
.It pd_unit
.Tn PCI
device driver unit number.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It flags
The flags describe which of the fields the kernel should match against.
A device must match all specified fields in order to be returned.
The match flags are enumerated in the
.Va pci_getconf_flags
enumeration.
The flag names indicate the field each one selects,
so they are not described in detail here.
.El
.It match_buf_len
Length of the
.Va matches
buffer allocated by the user to hold the results of the
.Dv PCIOCGETCONF
query.
.It num_matches
Number of matches returned by the kernel.
.It matches
Buffer containing matching devices returned by the kernel.
The items in this buffer are of type
.Va pci_conf ,
which consists of the following items:
.Bl -tag -width pc_subvendor
.It pc_sel
.Tn PCI
domain, bus, slot and function.
.It pc_hdr
.Tn PCI
header type.
.It pc_subvendor
.Tn PCI
subvendor ID.
.It pc_subdevice
.Tn PCI
subdevice ID.
.It pc_vendor
.Tn PCI
vendor ID.
.It pc_device
.Tn PCI
device ID.
.It pc_class
.Tn PCI
device class.
.It pc_subclass
.Tn PCI
device subclass.
.It pc_progif
.Tn PCI
device programming interface.
.It pc_revid
.Tn PCI
revision ID.
.It pd_name
Driver name.
.It pd_unit
Driver unit number.
.El
.It offset
The offset is passed in by the user to tell the kernel where it should
start traversing the device list.
The value passed out by the kernel
points to the record immediately after the last one returned.
The user may
pass the value returned by the kernel in subsequent calls to the
.Dv PCIOCGETCONF
ioctl.
If the user does not intend to use the offset, it must be set to zero.
.It generation
.Tn PCI
configuration generation.
This value only needs to be set if the offset is set.
The kernel will compare the current generation number of its internal
device list to the generation passed in by the user to determine whether
its device list has changed since the user last called the
.Dv PCIOCGETCONF
ioctl.
If the device list has changed, a status of
.Va PCI_GETCONF_LIST_CHANGED
will be passed back.
.It status
The status tells the user the disposition of the request for a device list.
The possible status values are:
.Bl -ohang
.It PCI_GETCONF_LAST_DEVICE
This means that there are no more devices in the
.Tn PCI
device list matching
the specified criteria after the
ones returned in the
.Va matches
buffer.
.It PCI_GETCONF_LIST_CHANGED
This status tells the user that the
.Tn PCI
device list has changed since the last call to the
.Dv PCIOCGETCONF
ioctl, and that the
.Va offset
and
.Va generation
must be reset to zero to start over at the beginning of the list.
.It PCI_GETCONF_MORE_DEVS
This tells the user that the buffer was not large enough to hold all of the
remaining devices in the device list that match the specified criteria.
.It PCI_GETCONF_ERROR
This indicates a general error while servicing the user's request.
If the
.Va pat_buf_len
is not equal to
.Va num_patterns
times
.Fn sizeof "struct pci_match_conf" ,
.Va errno
will be set to
.Er EINVAL .
.El
.El
.It PCIOCREAD
This
.Xr ioctl 2
reads the
.Tn PCI
configuration registers specified by the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure consists of the following fields:
.Bl -tag -width pi_width
.It pi_sel
A
.Va pcisel
structure which specifies the domain, bus, slot and function the user would
like to query.
If the specified bus is not found,
.Va errno
will be set to
.Er ENODEV
and -1 returned from the ioctl.
.It pi_reg
The
.Tn PCI
configuration register the user would like to access.
.It pi_width
The width, in bytes, of the data the user would like to read.
This value
may be either 1, 2, or 4.
3-byte reads and reads larger than 4 bytes are
not supported.
If an invalid width is passed,
.Va errno
will be set to
.Er EINVAL .
.It pi_data
The data returned by the kernel.
.El
.It PCIOCWRITE
This
.Xr ioctl 2
allows users to write to the
.Tn PCI
configuration registers specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above.
The limitations on data width described for
reading registers, above, also apply to writing
.Tn PCI
configuration registers.
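.Pp
As a minimal sketch, not taken from any existing program, a single
configuration register read with
.Dv PCIOCREAD
might look like the following.
The helper names
.Fn pci_cfg_width_ok
and
.Fn pci_cfg_read
are hypothetical.
The field names of
.Va pci_io
are assumed to match
.In sys/pciio.h ,
and the ioctl portion is only compiled on
.Fx .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/pciio.h>
#endif

/* Returns 1 if width is one of the supported access sizes (1, 2 or 4). */
static int
pci_cfg_width_ok(int width)
{
	return (width == 1 || width == 2 || width == 4);
}

#ifdef __FreeBSD__
/*
 * Hypothetical helper: read one configuration register through an
 * already opened /dev/pci descriptor.
 * Returns 0 and stores the value in *val, or -1 with errno set.
 */
static int
pci_cfg_read(int fd, const struct pcisel *sel, int reg, int width,
    uint32_t *val)
{
	struct pci_io io;

	assert(sel != NULL && val != NULL);
	if (!pci_cfg_width_ok(width)) {
		/* Mirror the error the driver reports for a bad width. */
		errno = EINVAL;
		return (-1);
	}
	io.pi_sel = *sel;	/* domain, bus, slot and function */
	io.pi_reg = reg;	/* register offset to read */
	io.pi_width = width;	/* 1, 2 or 4 bytes */
	io.pi_data = 0;
	if (ioctl(fd, PCIOCREAD, &io) == -1)
		return (-1);
	*val = io.pi_data;
	return (0);
}
#endif
```
.Ed
.Pp
A write would follow the same pattern, filling
.Va pi_data
before issuing
.Dv PCIOCWRITE .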
.It PCIOCATTACHED
This
.Xr ioctl 2
allows users to query if a driver is attached to the
.Tn PCI
device specified in the passed-in
.Va pci_io
structure.
The
.Va pci_io
structure is described above; however, the
.Va pi_reg
and
.Va pi_width
fields are not used.
The status of the device is stored in the
.Va pi_data
field.
A value of 0 indicates no driver is attached, while a value larger than 0
indicates that a driver is attached.
.It PCIOCBARMMAP
This
.Xr ioctl 2
command allows a userspace process to
.Xr mmap 2
a memory-mapped PCI BAR into its address space.
The input parameters and results are passed in the
.Va pci_bar_mmap
structure, which has the following fields:
.Bl -tag -width Vt struct pcise pbm_sel
.It Vt uint64_t pbm_map_base
Reports the established mapping base to the caller.
If the
.Va PCIIO_BAR_MMAP_FIXED
flag was specified, then this field must be filled before the call
with the desired address for the mapping.
.It Vt uint64_t pbm_map_length
Reports the mapped length of the BAR, in bytes.
Its
.Vt uint64_t
value is always a multiple of the machine page size.
.It Vt int64_t pbm_bar_length
Reports the length of the BAR as exposed by the device.
.It Vt int pbm_bar_off
Reports the offset from the mapped base to the start of the
first register in the BAR.
.It Vt struct pcisel pbm_sel
Should be filled before the call.
Describes the device to operate on.
.It Vt int pbm_reg
The BAR index to mmap.
.It Vt int pbm_flags
Flags which augment the operation.
See below.
.It Vt int pbm_memattr
The caching attribute for the mapping.
Typical values are
.Dv VM_MEMATTR_UNCACHEABLE
for control register BARs, and
.Dv VM_MEMATTR_WRITE_COMBINING
for frame buffers.
A regular memory-like BAR should be mapped with the
.Dv VM_MEMATTR_DEFAULT
attribute.
.El
.Pp
Currently defined flags are:
.Bl -tag -width PCIIO_BAR_MMAP_ACTIVATE
.It PCIIO_BAR_MMAP_FIXED
The resulting mapping must be established at the address
specified by the
.Va pbm_map_base
member; otherwise the operation fails.
.It PCIIO_BAR_MMAP_EXCL
Must be used together with
.Dv PCIIO_BAR_MMAP_FIXED .
If the specified base contains already established mappings, the
operation fails instead of implicitly unmapping them.
.It PCIIO_BAR_MMAP_RW
The requested mapping allows both reading and writing.
Without the flag, a read-only mapping is established.
Note that it is common for device registers to have side effects
even on reads.
.It PCIIO_BAR_MMAP_ACTIVATE
(Unimplemented) If the BAR is not activated, activate it in the course
of mapping.
Currently, an attempt to mmap an inactive BAR results in an error.
.El
.El
.Sh LOADER TUNABLES
Tunables can be set at the
.Xr loader 8
prompt before booting the kernel, or stored in
.Xr loader.conf 5 .
The current value of these tunables can be examined at runtime via
.Xr sysctl 8
nodes of the same name.
Unless otherwise specified,
each of these tunables is a boolean that can be enabled by setting the
tunable to a non-zero value.
.Bl -tag -width indent
.It Va hw.pci.clear_bars Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resources.
This forces the
.Tn PCI
bus driver to allocate resource ranges for memory and I/O port resources
from scratch.
.It Va hw.pci.clear_buses Pq Defaults to 0
Ignore any firmware-assigned bus number registers in PCI-PCI bridges.
This forces the
.Tn PCI
bus driver and PCI-PCI bridge driver to allocate bus numbers for secondary
buses behind PCI-PCI bridges.
.It Va hw.pci.clear_pcib Pq Defaults to 0
Ignore any firmware-assigned memory and I/O port resource windows in PCI-PCI
bridges.
This forces the PCI-PCI bridge driver to allocate memory and I/O port resources
for resource windows from scratch.
.Pp
By default the PCI-PCI bridge driver will allocate windows that
contain the firmware-assigned resources of devices behind the bridge.
In addition, the PCI-PCI bridge driver will suballocate from existing window
regions when possible to satisfy a resource request.
As a result,
both
.Va hw.pci.clear_bars
and
.Va hw.pci.clear_pcib
must be enabled to fully ignore firmware-supplied resource assignments.
.It Va hw.pci.default_vgapci_unit Pq Defaults to -1
By default,
the first
.Tn PCI
VGA adapter encountered by the system is assumed to be the boot display device.
This tunable can be set to choose a specific VGA adapter by specifying the
unit number of the associated
.Va vgapci Ns Ar X
device.
.It Va hw.pci.do_power_nodriver Pq Defaults to 0
Place devices into a low power state
.Pq D3
when a suitable device driver is not found.
Can be set to one of the following values:
.Bl -tag -width indent
.It 3
Powers down all
.Tn PCI
devices without a device driver.
.It 2
Powers down most devices without a device driver.
PCI devices with the display, memory, and base peripheral device classes
are not powered down.
.It 1
Similar to a setting of 2 except that storage controllers are also not
powered down.
.It 0
All devices are left fully powered.
.El
.Pp
A
.Tn PCI
device must support power management to be powered down.
Placing a device into a low power state may not reduce power consumption.
.It Va hw.pci.do_power_resume Pq Defaults to 1
Place
.Tn PCI
devices into the fully powered state when resuming either the system or an
individual device.
Setting this to zero is discouraged as the system will not attempt to power
up non-powered PCI devices after a suspend.
.It Va hw.pci.do_power_suspend Pq Defaults to 1
Place
.Tn PCI
devices into a low power state when suspending either the system or individual
devices.
Normally the D3 state is used as the low power state,
but firmware may override the desired power state during a system suspend.
.It Va hw.pci.enable_ari Pq Defaults to 1
Enable support for PCI-express Alternative RID Interpretation.
This is often used in conjunction with SR-IOV.
.It Va hw.pci.enable_io_modes Pq Defaults to 1
Enable memory or I/O port decoding in a PCI device's command register if it has
firmware-assigned memory or I/O port resources.
The firmware
.Pq BIOS
in some systems does not enable memory or I/O port decoding for some devices
even when it has assigned resources to the device.
This enables decoding for such resources during bus probe.
.It Va hw.pci.enable_msi Pq Defaults to 1
Enable support for Message Signalled Interrupts
.Pq MSI .
MSI interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_msix Pq Defaults to 1
Enable support for extended Message Signalled Interrupts
.Pq MSI-X .
MSI-X interrupts can be disabled by setting this tunable to 0.
.It Va hw.pci.enable_pcie_hp Pq Defaults to 1
Enable support for native PCI-express HotPlug.
.It Va hw.pci.honor_msi_blacklist Pq Defaults to 1
MSI and MSI-X interrupts are disabled for certain chipsets known to have
broken MSI and MSI-X implementations when this tunable is set.
It can be set to zero to permit use of MSI and MSI-X interrupts if the
chipset match is a false positive.
.It Va hw.pci.iov_max_config Pq Defaults to 1MB
The maximum amount of memory permitted for the configuration parameters
used when creating Virtual Functions via SR-IOV.
This tunable can also be changed at runtime via
.Xr sysctl 8 .
.It Va hw.pci.realloc_bars Pq Defaults to 0
Attempt to allocate a new resource range during the initial device scan
for any memory or I/O port resources with firmware-assigned ranges that
conflict with another active resource.
.It Va hw.pci.usb_early_takeover Pq Defaults to 1 on Tn amd64 and Tn i386
Disable legacy device emulation of USB devices during the initial device
scan.
Set this tunable to zero to use USB devices via legacy emulation when
using a custom kernel without USB controller drivers.
.It Va hw.pci<D>.<B>.<S>.INT<P>.irq
These tunables can be used to override the interrupt routing for legacy
PCI INTx interrupts.
Unlike other tunables in this list,
these do not have corresponding sysctl nodes.
The tunable name includes the address of the PCI device as well as the
pin of the desired INTx IRQ to override:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <P>
The interrupt pin of the PCI slot to override.
One of
.Ql A ,
.Ql B ,
.Ql C ,
or
.Ql D .
.El
.Pp
The value of the tunable is the raw IRQ value to use for the INTx interrupt
pin identified by the tunable name.
Mapping of IRQ values to platform interrupt sources is machine dependent.
.El
.Sh DEVICE WIRING
You can wire the device unit at a given location with
.Pa device.hints .
Entries of the form
.Va hint.<name>.<unit>.at="pci<B>:<S>:<F>"
or
.Va hint.<name>.<unit>.at="pci<D>:<B>:<S>:<F>"
will force the driver
.Va name
to probe and attach at unit
.Va unit
for any PCI device found to match the specification, where:
.Bl -tag -width indent
.It <D>
The domain
.Pq or segment
of the PCI device in decimal.
Defaults to 0 if unspecified.
.It <B>
The bus address of the PCI device in decimal.
.It <S>
The slot of the PCI device in decimal.
.It <F>
The function of the PCI device in decimal.
.El
.Pp
The code to do the matching requires an exact string match.
Do not specify the angle brackets
.Pq < >
in the hints file.
Wiring multiple devices to the same
.Va name
and
.Va unit
produces undefined results.
.Ss Examples
Given the following lines in
.Pa /boot/device.hints :
.Bd -literal -offset indent
hint.nvme.3.at="pci6:0:0"
hint.igb.8.at="pci14:0:0"
.Ed
.Pp
If there is a device that supports
.Xr igb 4
at PCI bus 14 slot 0 function 0,
then it will be assigned igb8 for probe and attach.
Likewise, if there is an
.Xr nvme 4
card at PCI bus 6 slot 0 function 0,
then it will be assigned nvme3 for probe and attach.
If another type of card is in either of these locations, the name and
unit of that card will be the default names and will be unaffected by
these hints.
If other igb or nvme cards are located elsewhere, they will be
assigned their unit numbers sequentially, skipping the unit numbers
that have
.Ql at
hints.
.Sh FILES
.Bl -tag -width /dev/pci -compact
.It Pa /dev/pci
Character device for the
.Nm
driver.
.El
.Sh SEE ALSO
.Xr pciconf 8
.Sh HISTORY
The
.Nm
driver (not the kernel's
.Tn PCI
support code) first appeared in
.Fx 2.2 ,
and was written by Stefan Esser and Garrett Wollman.
Support for device listing and matching was re-implemented by
Kenneth Merry, and first appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq Mt ken@FreeBSD.org
.Sh BUGS
It is not possible for users to specify an accurate offset into the device
list without calling the
.Dv PCIOCGETCONF
ioctl at least once, since they have no way of knowing the current generation
number otherwise.
This probably is not a serious problem, though, since
users can easily narrow their search by specifying a pattern or patterns
for the kernel to match against.
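.Pp
The interplay of
.Va offset ,
.Va generation
and
.Va status
can be illustrated with a minimal sketch of a
.Dv PCIOCGETCONF
loop that matches on vendor ID.
The helper names
.Fn pat_buf_bytes
and
.Fn pci_foreach_vendor
are hypothetical, the buffer size of 32 entries is arbitrary, and the
.Dv PCI_GETCONF_MATCH_VENDOR
flag name is assumed to be the one defined in
.In sys/pciio.h .
The ioctl portion is only compiled on
.Fx .
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __FreeBSD__
#include <string.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/pciio.h>
#endif

/* Value pat_buf_len must have for num_patterns entries of elem_size bytes. */
static size_t
pat_buf_bytes(size_t num_patterns, size_t elem_size)
{
	return (num_patterns * elem_size);
}

#ifdef __FreeBSD__
/*
 * Hypothetical helper: call visit() for each device with the given
 * vendor ID.  Returns 0 on success and -1 on error.  If the device
 * list changes mid-scan the scan restarts, so visit() may see a
 * device more than once.
 */
static int
pci_foreach_vendor(int fd, uint16_t vendor,
    void (*visit)(const struct pci_conf *))
{
	struct pci_conf buf[32];
	struct pci_match_conf pat;
	struct pci_conf_io pc;
	uint32_t i;

	assert(visit != NULL);
	memset(&pat, 0, sizeof(pat));
	pat.pc_vendor = vendor;
	pat.flags = PCI_GETCONF_MATCH_VENDOR;

	memset(&pc, 0, sizeof(pc));	/* offset and generation start at 0 */
	pc.num_patterns = 1;
	pc.pat_buf_len = pat_buf_bytes(1, sizeof(pat));
	pc.patterns = &pat;
	pc.match_buf_len = sizeof(buf);
	pc.matches = buf;

	for (;;) {
		if (ioctl(fd, PCIOCGETCONF, &pc) == -1)
			return (-1);
		if (pc.status == PCI_GETCONF_ERROR)
			return (-1);
		if (pc.status == PCI_GETCONF_LIST_CHANGED) {
			/* Start over at the beginning of the list. */
			pc.offset = 0;
			pc.generation = 0;
			continue;
		}
		for (i = 0; i < pc.num_matches; i++)
			visit(&buf[i]);
		if (pc.status == PCI_GETCONF_LAST_DEVICE)
			return (0);
		/* PCI_GETCONF_MORE_DEVS: reuse the returned offset. */
	}
}
#endif
```
.Ed
.Pp
Because the first call passes an offset of zero, no generation number
needs to be known in advance.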