.\" SPDX-License-Identifier: BSD-2-Clause
.\"
.\" Copyright (c) 2015-2024 Amazon.com, Inc. or its affiliates.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\"
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
.\" A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
.\" OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
.\" OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd November 14, 2024
.Dt ENA 4
.Os
.Sh NAME
.Nm ena
.Nd AWS EC2 Elastic Network Adapter (ENA) driver
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in the
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device ena"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
if_ena_load="YES"
.Ed
.Sh DESCRIPTION
The ENA is a networking interface designed to make good use of modern CPU
features and system architectures.
.Pp
The ENA device exposes a lightweight management interface with a
minimal set of memory mapped registers and extendable command set
through an Admin Queue.
.Pp
The driver supports a range of ENA devices, is link-speed independent
(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
a negotiated and extendable feature set.
.Pp
Some ENA devices support SR-IOV.
This driver is used for both the SR-IOV Physical Function (PF) and Virtual
Function (VF) devices.
.Pp
The ENA devices enable high speed and low overhead network traffic
processing by providing multiple Tx/Rx queue pairs (the maximum number
is advertised by the device via the Admin Queue), a dedicated MSI-X
interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized
data placement.
.Pp
When RSS is enabled, each Tx/Rx queue pair is bound to a corresponding
CPU core and its NUMA domain.
The order of those bindings is based on the RSS bucket mapping.
For builds with RSS support disabled, the
CPU and NUMA management is left to the kernel.
Receive-side scaling (RSS) is supported for multi-core scaling.
.Pp
The
.Nm
driver and its corresponding devices implement health
monitoring mechanisms such as watchdog, enabling the device and driver
to recover in a manner transparent to the application, as well as
debug logs.
.Pp
Some of the ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds.
.Pp
Support for the
.Xr netmap 4
framework is provided by the
.Nm
driver.
The kernel must be built with the DEV_NETMAP option to be able to use this
feature.
.Sh HARDWARE
The
.Nm
driver supports the following PCI vendor ID/device IDs:
.Pp
.Bl -bullet -compact
.It
1d0f:0ec2 - ENA PF
.It
1d0f:1ec2 - ENA PF with LLQ support
.It
1d0f:ec20 - ENA VF
.It
1d0f:ec21 - ENA VF with LLQ support
.El
.Sh LOADER TUNABLES
The
.Nm
driver's behavior can be changed using run-time or boot-time sysctl
arguments.
The boot-time arguments can be set at the
.Xr loader 8
prompt before booting the kernel, or stored in
.Xr loader.conf 5 .
The run-time arguments can be set using the
.Xr sysctl 8
command.
.Pp
Boot-time tunables:
.Bl -tag -width indent
.It Va hw.ena.enable_9k_mbufs
Use 9k mbufs for the Rx descriptors.
The default is 0.
If the node value is set to 1, 9k mbufs will be used for the Rx buffers.
If set to 0, page size mbufs will be used instead.
.Pp
Using 9k buffers for Rx can improve Rx throughput, but in low memory conditions
it might increase allocation time, as the system has to look for 3 contiguous
pages.
This can further lead to OS instability, together with ENA driver reset and NVMe
timeouts.
If network performance is critical and memory capacity is sufficient, the 9k
mbufs can be used.
.It Va hw.ena.force_large_llq_header
Force the driver to use large (224 bytes) or regular (96 bytes) LLQ header size.
The default value is 2, in which case the recommended LLQ header size is used.
If the node value is set to 0, the regular size LLQ header will be used, which
is 96B.
In some cases, the packet header can be bigger than this (for example,
IPv6 with multiple extensions).
In such a situation, the large LLQ header size, which is 224B, should be used,
and can be forced by setting this node value to 1.
Using the large LLQ header size will take effect only if the device supports
both LLQ and large LLQ headers.
Otherwise, it will fall back to the no LLQ mode or regular header size.
.Pp
Increasing LLQ header size reduces the size of the Tx queue by half, so it may
affect the number of dropped Tx packets.
.El
.Pp
Run-time tunables:
.Bl -tag -width indent
.It Va hw.ena.log_level
Controls extra logging verbosity of the driver.
The default is 2.
The higher the logging level, the more logs will be printed out.
0 means all extra logs are disabled and only error logs will be printed out.
The default value (2) reports errors and warnings, and is verbose about driver
operation.
.Pp
The possible levels are:
.Pp
.Bl -bullet -compact
.It
0 - ENA_ERR - Enable driver error messages and ena_com error logs.
.It
1 - ENA_WARN - Enable logs for non-critical errors.
.It
2 - ENA_INFO - Make the driver more verbose about its actions.
.It
3 - ENA_DBG - Enable debug logs.
.El
.Pp
NOTE: In order to enable logging on the Tx/Rx data path, the driver must be
compiled with the ENA_LOG_IO_ENABLE compilation flag.
.Pp
Example:
To enable logs for errors and warnings, the following command should be used:
.Bd -literal -offset indent
sysctl hw.ena.log_level=1
.Ed
.It Va dev.ena.X.io_queues_nb
Number of the currently allocated and used IO queues.
The default is max_num_io_queues.
Controls the number of IO queue pairs (Tx/Rx).
As this call has to reallocate the queues, it will reset the interface and
restart all the queues.
This means that everything currently held in the queues will be lost, leading
to potential packet drops.
.Pp
This call can fail if the system isn't able to provide the driver with enough
resources.
In that situation, the driver will try to revert to the previous number of IO
queues.
If this also fails, the device reset will be triggered.
.Pp
Example:
To use only 2 Tx and Rx queues for the device ena1, the following command should
be used:
.Bd -literal -offset indent
sysctl dev.ena.1.io_queues_nb=2
.Ed
.It Va dev.ena.X.rx_queue_size
Size of the Rx queue.
The default is 1024.
Controls the number of IO descriptors for each Rx queue.
The user may want to increase the Rx queue size if they observe a high number of
Rx drops in the driver's statistics.
For performance reasons, the Rx queue size must be a power of 2.
.Pp
This call can fail if the system isn't able to provide the driver with enough
resources.
In that situation, the driver will try to revert to the previous number of
descriptors.
If this also fails, the device reset will be triggered.
.Pp
Example:
To increase Rx ring size to 8K descriptors for the device ena0, the following
command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rx_queue_size=8192
.Ed
.It Va dev.ena.X.buf_ring_size
Size of the Tx buffer ring (drbr).
The default is 4096.
Input must be a power of 2.
Controls the number of mbufs that can be held in the Tx buffer ring.
The drbr is used as a multiple-producer, single-consumer lockless ring for
buffering extra mbufs coming from the stack in case the Tx procedure is busy
sending the packets, or the Tx ring is full.
Increasing the size of the buffer ring may reduce the number of Tx packets being
dropped in case of a big Tx burst, which cannot be handled by the IO queue
immediately.
Each Tx queue has its own drbr.
.Pp
It is recommended to keep the drbr at no less than the default size, but it can
be reduced if the system lacks the resources.
This call can fail if the system is not able to provide the driver with enough
resources.
In that situation, the driver will try to revert to the previous drbr size
and trigger the device reset.
.Pp
Example:
To set drbr size for interface ena0 to 2048, the following command should
be used:
.Bd -literal -offset indent
sysctl dev.ena.0.buf_ring_size=2048
.Ed
.It Va dev.ena.X.eni_metrics.sample_interval
Interval in seconds for updating ENI metrics.
The default is 0.
Determines how often (if ever) the ENI metrics should be updated.
The ENI metrics are updated asynchronously by a timer service in order to
avoid overloading the admin queue with sysctl node reads.
The value in this node controls the interval between issuing admin commands to
the device, which will update the ENI metrics values.
.Pp
If some application is periodically monitoring the eni_metrics, then the ENI
metrics interval can be adjusted accordingly.
Value 0 turns off the update completely.
Value 1 is the minimum interval and is equal to 1 second.
The maximum allowed update interval is 1 hour.
.Pp
Example:
To update ENI metrics for the device ena1 every 10 seconds, the following
command should be used:
.Bd -literal -offset indent
sysctl dev.ena.1.eni_metrics.sample_interval=10
.Ed
.It Va dev.ena.X.rss.indir_table_size
RSS indirection table size.
The default is 128.
Returns the number of entries in the RSS indirection table.
.Pp
Example:
To read the RSS indirection table size, the following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rss.indir_table_size
.Ed
.It Va dev.ena.X.rss.indir_table
RSS indirection table mapping.
The default is x:y key-pairs of indir_table_size length.
Updates selected indices of the RSS indirection table.
.Pp
The entry string consists of one or more x:y key-pairs, where x stands for
the table index and y for its new value.
Table indices that don't need to be
updated can be omitted from the string and will retain their existing values.
.Pp
If an index is entered more than once, the last value is used.
.Pp
Example:
To update two selected indices in the RSS indirection table, e.g. setting index
0 to queue 5 and then index 5 to queue 0, the following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rss.indir_table="0:5 5:0"
.Ed
.It Va dev.ena.X.rss.key
RSS hash key.
The default is a 40-byte randomly generated hash key.
Controls the RSS Toeplitz hash algorithm key value.
.Pp
Only available when the driver is compiled without the kernel side RSS support.
.Pp
Example:
To change the RSS hash key value to
.Pp
0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
.br
0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
.br
0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
.br
0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
.br
0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
.Pp
the following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rss.key=6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b477cb2da38030f20c6a42b73bbeac01fa
.Ed
.El
.Sh DIAGNOSTICS
.Ss Device initialization phase
.Bl -diag
.It ena%d: failed to init mmio read less
.Pp
Error occurred during initialization of the mmio register read request.
.It ena%d: Can not reset device
.Pp
Device could not be reset.
.br
Device may not be responding or is already being reset.
.It ena%d: device version is too low
.Pp
Version of the controller is too old and it is not supported by the driver.
.It ena%d: Invalid dma width value %d
.Pp
The controller is unable to request dma transaction width.
.br
Device stopped responding or it demanded an invalid value.
.It ena%d: Can not initialize ena admin queue with device
.Pp
Initialization of the Admin Queue failed.
.br
Device may not be responding or there was a problem with initialization of
the resources.
.It ena%d: Cannot get attribute for ena device rc: %d
.Pp
Failed to get attributes of the device from the controller.
.It ena%d: Cannot configure aenq groups rc: %d
.Pp
Errors occurred when trying to configure AENQ groups.
.El
.Ss Driver initialization/shutdown phase
.Bl -diag
.It ena%d: PCI resource allocation failed!
.It ena%d: failed to pmap registers bar
.It ena%d: can not allocate ifnet structure
.It ena%d: Error with network interface setup
.It ena%d: Failed to enable and set the admin interrupts
.It ena%d: Error, MSI-X is already enabled
.It ena%d: Failed to enable MSIX, vectors %d rc %d
.It ena%d: Not enough number of MSI-X allocated: %d
.It ena%d: Error with MSI-X enablement
.It ena%d: could not allocate irq vector: %d
.It ena%d: unable to allocate bus resource: registers!
.It ena%d: unable to allocate bus resource: msix!
.Pp
Resource allocation failed when initializing the device.
.br
Driver will not be attached.
.It ena%d: ENA device init failed (err: %d)
.It ena%d: Cannot initialize device
.Pp
Device initialization failed.
.br
Driver will not be attached.
.It ena%d: failed to register interrupt handler for irq %ju: %d
.Pp
Error occurred when trying to register Admin Queue interrupt handler.
.It ena%d: Cannot setup mgmnt queue intr
.Pp
Error occurred during configuration of the Admin Queue interrupts.
.It ena%d: Enable MSI-X failed
.Pp
Configuration of the MSI-X for Admin Queue failed.
.br
There may be a lack of resources, or the interrupts could not be configured.
.br
Driver will not be attached.
.It ena%d: VLAN is in use, detach first
.Pp
VLANs are being used when trying to detach the driver.
.br
VLANs must be detached first, and then the detach routine has to be called
again.
.It ena%d: Unmapped RX DMA tag associations
.It ena%d: Unmapped TX DMA tag associations
.Pp
Error occurred when trying to destroy RX/TX DMA tag.
.It ena%d: Cannot init indirect table
.It ena%d: Cannot fill indirect table
.It ena%d: Cannot fill hash function
.It ena%d: Cannot fill hash control
.It ena%d: WARNING: RSS was not properly initialized, it will affect bandwidth
.Pp
Error occurred during initialization of one of RSS resources.
.br
The device will work with reduced performance because all RX packets will be
passed to queue 0 and there will be no hash information.
.It ena%d: LLQ is not supported. Fallback to host mode policy.
.It ena%d: Failed to configure the device mode. Fallback to host mode policy.
.It ena%d: unable to allocate LLQ bar resource. Fallback to host mode policy.
.Pp
Error occurred during Low-latency Queue mode setup.
.br
The device will work, but without the LLQ performance gain.
.It ena%d: failed to enable write combining.
.Pp
Error occurred while setting the Write Combining mode, required for the LLQ.
.It ena%d: failed to tear down irq: %d
.It ena%d: dev has no parent while releasing res for irq: %d
.Pp
Release of the interrupts failed.
.El
.Ss Additional diagnostic
.Bl -diag
.It ena%d: Invalid MTU setting. new_mtu: %d max_mtu: %d min mtu: %d
.Pp
Requested MTU value is not supported and will not be set.
.It ena%d: Failed to set MTU to %d
.Pp
This message appears when either the MTU change feature is not supported, or a
device communication error has occurred.
.It ena%d: Keep alive watchdog timeout.
.Pp
Device stopped responding and will be reset.
.It ena%d: Found a Tx that wasn't completed on time, qid %d, index %d.
.Pp
Packet was pushed to the NIC but not sent within the given time limit.
.br
It may be caused by a hang of the IO queue.
.It ena%d: The number of lost tx completion is above the threshold (%d > %d). Reset the device
.Pp
If too many Tx packets were not completed on time, the device is going to be
reset.
.br
It may be caused by a hung queue or device.
.It ena%d: Trigger reset is on
.Pp
Device will be reset.
.br
Reset is triggered either by watchdog or if too many TX packets were not
completed on time.
.It ena%d: device reset scheduled but trigger_reset is off
.Pp
Reset task has been triggered, but the driver did not request it.
.br
Device reset will not be performed.
.It ena%d: Device reset failed
.Pp
Error occurred while trying to reset the device.
.It ena%d: Cannot initialize device
.It ena%d: Error, mac address are different
.It ena%d: Error, device max mtu is smaller than ifp MTU
.It ena%d: Validation of device parameters failed
.It ena%d: Enable MSI-X failed
.It ena%d: Failed to create I/O queues
.It ena%d: Reset attempt failed. Can not reset the device
.Pp
Error occurred while trying to restore the device after reset.
.It ena%d: Device reset completed successfully, Driver info: %s
.Pp
Device has been correctly restored after reset and is ready to use.
.It ena%d: Allocation for Tx Queue %u failed
.It ena%d: Allocation for Rx Queue %u failed
.It ena%d: Unable to create Rx DMA map for buffer %d
.It ena%d: Failed to create io TX queue #%d rc: %d
.It ena%d: Failed to get TX queue handlers. TX queue num %d rc: %d
.It ena%d: Failed to create io RX queue[%d] rc: %d
.It ena%d: Failed to get RX queue handlers. RX queue num %d rc: %d
.It ena%d: could not allocate irq vector: %d
.It ena%d: failed to register interrupt handler for irq %ju: %d
.Pp
IO resources initialization failed.
.br
Interface will not be brought up.
.It ena%d: LRO[%d] Initialization failed!
.Pp
Initialization of the LRO for the RX ring failed.
.It ena%d: failed to alloc buffer for rx queue
.It ena%d: failed to add buffer for rx queue %d
.It ena%d: refilled rx qid %d with only %d mbufs (from %d)
.Pp
Allocation of resources used on RX path failed.
.br
If this happens during initialization of the IO queue, the interface will not
be brought up.
.It ena%d: NULL mbuf in rx_info
.Pp
Error occurred while assembling mbuf from descriptors.
.It ena%d: tx_info doesn't have valid mbuf
.It ena%d: Invalid req_id: %hu
.It ena%d: failed to prepare tx bufs
.Pp
Error occurred while preparing a packet for transmission.
.It ena%d: ioctl promisc/allmulti
.Pp
IOCTL request for the device to work in promiscuous/allmulti mode.
.br
See
.Xr ifconfig 8
for more details.
.El
.Sh SUPPORT
If an issue is identified with the released source code with a supported
adapter, please email the specific information related to the issue to
.Aq Mt akiyano@amazon.com ,
.Aq Mt osamaabb@amazon.com
and
.Aq Mt darinzon@amazon.com .
.Sh SEE ALSO
.Xr netmap 4 ,
.Xr vlan 4 ,
.Xr ifconfig 8
.Sh HISTORY
The
.Nm
driver first appeared in
.Fx 11.1 .
.Sh AUTHORS
The
.Nm
driver was developed by Amazon and originally written by
.An Semihalf .