.Dd August 20, 2025
.Dt IFLIB 4
.Os
.Sh NAME
.Nm iflib
.Nd Network Interface Driver Framework
.Sh SYNOPSIS
.Cd "device pci"
.Cd "device iflib"
.Sh DESCRIPTION
.Nm
is a framework for network interface drivers for
.Fx .
It is designed to remove a large amount of the boilerplate that is often
needed for modern network interface devices, allowing driver authors to
focus on the specific code needed for their hardware.
This allows for a shared set of
.Xr sysctl 8
names, rather than each driver naming them individually.
.Sh SYSCTL VARIABLES
These variables must be set before loading the driver, either via
.Xr loader.conf 5
or through the use of
.Xr kenv 1 .
They are all prefixed by
.Va dev.X.Y.iflib\&.
where X is the driver name, and Y is the instance number.
.Bl -tag -width indent
.It Va override_nrxds
Override the number of RX descriptors for each queue.
The value is a comma-separated list of positive integers.
Some drivers only use a single value, but others may use more.
These numbers must be powers of two, and zero means to use the default.
Individual drivers may have additional restrictions on allowable values.
Defaults to all zeros.
.It Va override_ntxds
Override the number of TX descriptors for each queue.
The value is a comma-separated list of positive integers.
Some drivers only use a single value, but others may use more.
These numbers must be powers of two, and zero means to use the default.
Individual drivers may have additional restrictions on allowable values.
Defaults to all zeros.
.It Va override_qs_enable
When set, allows the number of transmit and receive queues to be different.
If not set, the lower of the number of TX or RX queues will be used for both.
.It Va override_nrxqs
Set the number of RX queues.
If zero, the number of RX queues is derived from the number of cores on the
socket connected to the controller.
Defaults to 0.
.It Va override_ntxqs
Set the number of TX queues.
If zero, the number of TX queues is derived from the number of cores on the
socket connected to the controller.
.It Va disable_msix
Disables MSI-X interrupts for the device.
.It Va core_offset
Specifies a starting core offset to assign queues to.
If the value is unspecified or 65535, cores are assigned sequentially across
controllers.
.It Va separate_txrx
Requests that RX and TX queues not be paired on the same core.
If this is zero or not set, an RX and TX queue pair will be assigned to each
core.
When set to a non-zero value, TX queues are assigned to cores following the
last RX queue.
.It Va simple_tx
When set to one, iflib uses a simple transmit routine with no queuing at all.
By default, iflib uses a highly optimized, lockless transmit queue called
mp_ring.
This performs well when there are more CPU cores than NIC
queues and prevents lock contention for transmit resources.
Unfortunately, mp_ring incurs unnecessary overhead on workloads where
resource contention is not a problem (well-behaved applications on
systems where there are as many NIC queues as CPU cores).
Note that when this is enabled, the tx_abdicate sysctl is no longer
applicable and is ignored.
Defaults to zero.
.El
.Pp
These
.Xr sysctl 8
variables can be changed at any time:
.Bl -tag -width indent
.It Va tx_abdicate
Controls how the transmit ring is serviced.
If set to zero, when a frame is submitted to the transmission ring, the same
task that is submitting it will service the ring unless there's already a
task servicing the TX ring.
This ensures that whenever there is a pending transmission,
the transmit ring is being serviced.
This results in higher transmit throughput.
If set to a non-zero value, the task returns immediately and the transmit
ring is serviced by a different task.
This returns control to the caller faster and, under high receive load,
may result in fewer dropped RX frames.
.It Va rx_budget
Sets the maximum number of frames to be received at a time.
Zero (the default) selects the built-in value, currently 16.
.El
.Pp
There are also some global sysctls which can change behaviour for all drivers,
and may be changed at any time.
.Bl -tag -width indent
.It Va net.iflib.min_tx_latency
If this is set to a non-zero value, iflib will avoid any attempt to combine
multiple transmits, and notify the hardware as quickly as possible of
new descriptors.
This will lower the maximum throughput, but will also lower transmit latency.
.It Va net.iflib.no_tx_batch
Some NICs allow processing completed transmit descriptors in batches.
Doing so usually increases the transmit throughput by reducing the number of
transmit interrupts.
Setting this to a non-zero value will disable the use of this feature.
.El
.Pp
These
.Xr sysctl 8
variables are read-only:
.Bl -tag -width indent
.It Va driver_version
A string indicating the internal version of the driver.
.El
.Pp
There are a number of queue state
.Xr sysctl 8
variables as well:
.Bl -tag -width indent
.It Va txqZ
The following are repeated for each transmit queue, where Z is the transmit
queue instance number:
.Bl -tag -width indent
.It Va r_abdications
Number of consumer abdications in the MP ring for this queue.
An abdication occurs on every ring submission when tx_abdicate is true.
.It Va r_restarts
Number of consumer restarts in the MP ring for this queue.
A restart occurs when an attempt to drain a non-empty ring fails,
and the ring is already in the STALLED state.
.It Va r_stalls
Number of consumer stalls in the MP ring for this queue.
A stall occurs when an attempt to drain a non-empty ring fails.
.It Va r_starts
Number of normal consumer starts in the MP ring for this queue.
A start occurs when the MP ring transitions from IDLE to BUSY.
.It Va r_drops
Number of drops in the MP ring for this queue.
A drop occurs when there is an attempt to add an entry to an MP ring with
no available space.
.It Va r_enqueues
Number of entries which have been enqueued to the MP ring for this queue.
.It Va ring_state
MP (soft) ring state.
This provides a snapshot of the current MP ring state, including the producer
head and tail indexes, the consumer index, and the state.
The state is one of "IDLE", "BUSY",
"STALLED", or "ABDICATED".
.It Va txq_cleaned
The total number of transmit descriptors which have been reclaimed.
.It Va txq_processed
The number of transmit descriptors which have been processed, but may not yet
have been reclaimed.
.It Va txq_in_use
Descriptors which have been added to the transmit queue,
but have not yet been cleaned.
This value will include both untransmitted descriptors and descriptors
which have been processed.
.It Va txq_cidx_processed
The transmit queue consumer index of the next descriptor to process.
.It Va txq_cidx
The transmit queue consumer index of the oldest descriptor to reclaim.
.It Va txq_pidx
The transmit queue producer index where the next descriptor to transmit will
be inserted.
.It Va no_tx_dma_setup
Number of times DMA mapping a transmit mbuf failed for reasons other than
.Er EFBIG .
.It Va txd_encap_efbig
Number of times DMA mapping a transmit mbuf failed due to requiring too many
segments.
.It Va tx_map_failed
Number of times DMA mapping a transmit mbuf failed for any reason
(sum of no_tx_dma_setup and txd_encap_efbig).
.It Va no_desc_avail
Number of times a descriptor couldn't be added to the transmit ring because
the transmit ring was full.
.It Va mbuf_defrag_failed
Number of times both
.Xr m_collapse 9
and
.Xr m_defrag 9
failed after DMA mapping a transmit mbuf resulted in an
.Er EFBIG
error.
.It Va m_pullups
Number of times
.Xr m_pullup 9
was called attempting to parse a header.
.It Va mbuf_defrag
Number of times
.Xr m_defrag 9
was called.
.El
.It Va rxqZ
The following are repeated for each receive queue, where Z is the
receive queue instance number:
.Bl -tag -width indent
.It Va rxq_fl0.credits
Credits currently available in the receive ring.
.It Va rxq_fl0.cidx
Current receive ring consumer index.
.It Va rxq_fl0.pidx
Current receive ring producer index.
.El
.El
.Pp
Additional OIDs useful for driver and iflib development are exposed when the
INVARIANTS and/or WITNESS options are enabled in the kernel.
.Sh SEE ALSO
.Xr iflib 9
.Sh HISTORY
This framework was introduced in
.Fx 11.0 .
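.Sh EXAMPLES
The following is a minimal sketch of how the variables described above might
be set, not a recommended configuration.
It assumes a hypothetical device named
.Ql ix0
(driver name ix, instance number 0); the values are illustrative only, and
the driver in use may restrict them further.
Variables that must be set before the driver loads can be placed in
.Xr loader.conf 5 :
.Bd -literal -offset indent
# Assumed device: ix0. Substitute the actual driver name and instance.
# Request 2048 RX and TX descriptors per queue (must be powers of two).
dev.ix.0.iflib.override_nrxds="2048"
dev.ix.0.iflib.override_ntxds="2048"
# Allow different queue counts, then request 4 RX and 2 TX queues.
dev.ix.0.iflib.override_qs_enable="1"
dev.ix.0.iflib.override_nrxqs="4"
dev.ix.0.iflib.override_ntxqs="2"
.Ed
.Pp
Variables that can be changed at any time can be set with
.Xr sysctl 8 :
.Bd -literal -offset indent
# Service the transmit ring from a separate task on the assumed ix0.
sysctl dev.ix.0.iflib.tx_abdicate=1
# Favour transmit latency over throughput for all iflib drivers.
sysctl net.iflib.min_tx_latency=1
.Ed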