.\" SPDX-License-Identifier: BSD-2-Clause
.\"
.\" Copyright (c) 2015-2023 Amazon.com, Inc. or its affiliates.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\"
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
.\" A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
.\" OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
.\" OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd June 4, 2021
.Dt ENA 4
.Os
.Sh NAME
.Nm ena
.Nd "FreeBSD kernel driver for Elastic Network Adapter (ENA) family"
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following line in the
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device ena"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
if_ena_load="YES"
.Ed
.Sh DESCRIPTION
The ENA is a networking interface designed to make good use of modern CPU
features and system architectures.
.Pp
The ENA device exposes a lightweight management interface with a
minimal set of memory mapped registers and an extendable command set
through an Admin Queue.
.Pp
The driver supports a range of ENA devices, is link-speed independent
(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
a negotiated and extendable feature set.
.Pp
Some ENA devices support SR-IOV.
This driver is used for both the SR-IOV Physical Function (PF) and Virtual
Function (VF) devices.
.Pp
The ENA devices enable high speed and low overhead network traffic
processing by providing multiple Tx/Rx queue pairs (the maximum number
is advertised by the device via the Admin Queue), a dedicated MSI-X
interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized
data placement.
.Pp
Receive-side scaling (RSS) is supported for multi-core scaling.
When RSS is enabled, each Tx/Rx queue pair is bound to a corresponding
CPU core and its NUMA domain.
The order of those bindings is based on the RSS bucket mapping.
For builds with RSS support disabled, the
CPU and NUMA management is left to the kernel.
.Pp
The
.Nm
driver and its corresponding devices implement health
monitoring mechanisms such as a watchdog, which enables the device and driver
to recover in a manner transparent to the application, as well as
debug logs.
.Pp
Some of the ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds of latency.
.Pp
Support for the
.Xr netmap 4
framework is provided by the
.Nm
driver.
The kernel must be built with the DEV_NETMAP option to use this feature.
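.Pp
As a minimal sketch, assuming the kernel configuration syntax shown in the
.Xr netmap 4
synopsis, the option is typically enabled by adding the following line to the
kernel configuration file (stock kernels may already include it):
.Bd -ragged -offset indent
.Cd "device netmap"
.Ed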
.Sh HARDWARE
Supported PCI vendor ID/device IDs:
.Pp
.Bl -bullet -compact
.It
1d0f:0ec2 - ENA PF
.It
1d0f:1ec2 - ENA PF with LLQ support
.It
1d0f:ec20 - ENA VF
.It
1d0f:ec21 - ENA VF with LLQ support
.El
.Sh LOADER TUNABLES
The
.Nm
driver's behavior can be changed using run-time or boot-time sysctl
arguments.
The boot-time arguments can be set at the
.Xr loader 8
prompt before booting the kernel, or stored in
.Xr loader.conf 5 .
The run-time arguments can be set using the
.Xr sysctl 8
command.
.Pp
Boot-time tunables:
.Bl -tag -width indent
.It Va hw.ena.enable_9k_mbufs
Use 9k mbufs for the Rx descriptors.
The default is 0.
If the node value is set to 1, 9k mbufs will be used for the Rx buffers.
If set to 0, page size mbufs will be used instead.
.Pp
Using 9k buffers for Rx can improve Rx throughput, but in low memory conditions
it might increase allocation time, as the system has to look for 3 contiguous
pages.
This can further lead to OS instability, together with ENA driver resets and
NVMe timeouts.
If network performance is critical and memory capacity is sufficient, the 9k
mbufs can be used.
.It Va hw.ena.force_large_llq_headers
Force the driver to use large LLQ headers (224 bytes).
The default is 0.
If the node value is set to 0, the regular size LLQ header will be used, which
is 96 bytes.
In some cases, the packet header can be bigger than this (for example,
IPv6 with multiple extension headers).
In such a situation, the large LLQ headers should be used by setting this node
value to 1.
This will take effect only if the device supports both LLQ and large LLQ
headers.
Otherwise, it will fall back to the no LLQ mode or the regular header size.
.Pp
Increasing the LLQ header size reduces the size of the Tx queue by half, so it
may affect the number of dropped Tx packets.
.El
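.Pp
As an illustration only, both boot-time tunables could be set in
.Xr loader.conf 5
as follows.
The values shown are examples rather than recommendations, and the quoting
follows the usual
.Xr loader.conf 5
convention:
.Bd -literal -offset indent
hw.ena.enable_9k_mbufs="1"
hw.ena.force_large_llq_headers="1"
.Ed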
.Pp
Run-time tunables:
.Bl -tag -width indent
.It Va hw.ena.log_level
Controls extra logging verbosity of the driver.
The default is 2.
The higher the logging level, the more logs will be printed out.
0 means all extra logs are disabled and only error logs will be printed out.
The default value (2) reports errors and warnings, and is verbose about driver
operation.
.Pp
The possible values are:
.Pp
.Bl -bullet -compact
.It
0 - ENA_ERR  - Enable driver error messages and ena_com error logs.
.It
1 - ENA_WARN - Enable logs for non-critical errors.
.It
2 - ENA_INFO - Make the driver more verbose about its actions.
.It
3 - ENA_DBG  - Enable debug logs.
.El
.Pp
NOTE: In order to enable logging on the Tx/Rx data path, the driver must be
compiled with the ENA_LOG_IO_ENABLE compilation flag.
.Pp
Example:
To enable logs for errors and warnings, the following command should be used:
.Bd -literal -offset indent
sysctl hw.ena.log_level=1
.Ed
.It Va dev.ena.X.io_queues_nb
Number of the currently allocated and used IO queues.
The default is max_num_io_queues.
Controls the number of IO queue pairs (Tx/Rx).
As this call has to reallocate the queues, it will reset the interface and
restart all the queues.
This means that everything currently held in the queues will be lost, leading
to potential packet drops.
.Pp
This call can fail if the system isn't able to provide the driver with enough
resources.
In that situation, the driver will try to revert to the previous number of IO
queues.
If this also fails, a device reset will be triggered.
.Pp
Example:
To use only 2 Tx and Rx queues for the device ena1, the following command should
be used:
.Bd -literal -offset indent
sysctl dev.ena.1.io_queues_nb=2
.Ed
.It Va dev.ena.X.rx_queue_size
Size of the Rx queue.
The default is 1024.
Controls the number of IO descriptors for each Rx queue.
The user may want to increase the Rx queue size if they observe a high number of
Rx drops in the driver's statistics.
For performance reasons, the Rx queue size must be a power of 2.
.Pp
This call can fail if the system isn't able to provide the driver with enough
resources.
In that situation, the driver will try to revert to the previous number of
descriptors.
If this also fails, a device reset will be triggered.
.Pp
Example:
To increase the Rx ring size to 8K descriptors for the device ena0, the
following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rx_queue_size=8192
.Ed
.It Va dev.ena.X.buf_ring_size
Size of the Tx buffer ring (drbr).
The default is 4096.
Input must be a power of 2.
Controls the number of mbufs that can be held in the Tx buffer ring.
The drbr is used as a multiple-producer, single-consumer lockless ring for
buffering extra mbufs coming from the stack in case the Tx procedure is busy
sending the packets, or the Tx ring is full.
Increasing the size of the buffer ring may reduce the number of Tx packets being
dropped in case of a big Tx burst, which cannot be handled by the IO queue
immediately.
Each Tx queue has its own drbr.
.Pp
It is recommended to keep the drbr at the default size or larger, but if
the system lacks the resources, it can be reduced.
This call can fail if the system is not able to provide the driver with enough
resources.
In that situation, the driver will try to revert to the previous size of the
drbr and trigger a device reset.
.Pp
Example:
To set the drbr size for interface ena0 to 2048, the following command should
be used:
.Bd -literal -offset indent
sysctl dev.ena.0.buf_ring_size=2048
.Ed
.It Va dev.ena.X.eni_metrics.sample_interval
Interval in seconds for updating ENI metrics.
The default is 0.
Determines how often (if ever) the ENI metrics should be updated.
The ENI metrics are updated asynchronously by a timer service in order to
avoid overloading the admin queue with sysctl node reads.
The value in this node controls the interval between issuing admin commands to
the device, which will update the ENI metrics values.
.Pp
If an application periodically monitors the ENI metrics, the interval can be
adjusted accordingly.
Value 0 turns off the update completely.
Value 1 is the minimum interval and is equal to 1 second.
The maximum allowed update interval is 1 hour (3600 seconds).
.Pp
Example:
To update ENI metrics for the device ena1 every 10 seconds, the following
command should be used:
.Bd -literal -offset indent
sysctl dev.ena.1.eni_metrics.sample_interval=10
.Ed
.It Va dev.ena.X.rss.indir_table_size
RSS indirection table size.
The default is 128.
Returns the number of entries in the RSS indirection table.
.Pp
Example:
To read the RSS indirection table size, the following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rss.indir_table_size
.Ed
.It Va dev.ena.X.rss.indir_table
RSS indirection table mapping.
The default is x:y pairs of indir_table_size length.
Updates selected indices of the RSS indirection table.
.Pp
The entry string consists of one or more x:y pairs, where x stands for
the table index and y for its new value.
Table indices that don't need to be
updated can be omitted from the string and will retain their existing values.
.Pp
If an index is entered more than once, the last value is used.
.Pp
Example:
To update two selected indices in the RSS indirection table, e.g. setting index
0 to queue 5 and then index 5 to queue 0, the following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rss.indir_table="0:5 5:0"
.Ed
.It Va dev.ena.X.rss.key
RSS hash key.
The default is a randomly generated, 40-byte hash key.
Controls the RSS Toeplitz hash algorithm key value.
.Pp
Only available when the driver is compiled without kernel-side RSS support.
.Pp
Example:
To change the RSS hash key value to
.Pp
0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
.br
0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
.br
0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
.br
0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
.br
0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
.Pp
the following command should be used:
.Bd -literal -offset indent
sysctl dev.ena.0.rss.key=6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b477cb2da38030f20c6a42b73bbeac01fa
.Ed
.El
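.Pp
Values set with
.Xr sysctl 8
are not preserved across reboots.
As a sketch, assuming the driver is attached before
.Xr sysctl.conf 5
is processed at boot, run-time tunables could be made persistent with entries
such as the following (the interface number and values are examples only):
.Bd -literal -offset indent
hw.ena.log_level=1
dev.ena.0.rx_queue_size=8192
.Ed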
.Sh DIAGNOSTICS
.Ss Device initialization phase
.Bl -diag
.It ena%d: failed to init mmio read less
.Pp
Error occurred during initialization of the mmio register read request.
.It ena%d: Can not reset device
.Pp
Device could not be reset.
.br
Device may not be responding or is already being reset.
.It ena%d: device version is too low
.Pp
Version of the controller is too old and it is not supported by the driver.
.It ena%d: Invalid dma width value %d
.Pp
The controller is unable to request the DMA transaction width.
.br
The device stopped responding or requested an invalid value.
.It ena%d: Can not initialize ena admin queue with device
.Pp
Initialization of the Admin Queue failed.
.br
Device may not be responding or there was a problem with initialization of
the resources.
.It ena%d: Cannot get attribute for ena device rc: %d
.Pp
Failed to get attributes of the device from the controller.
.It ena%d: Cannot configure aenq groups rc: %d
.Pp
Errors occurred when trying to configure AENQ groups.
.El
.Ss Driver initialization/shutdown phase
.Bl -diag
.It ena%d: PCI resource allocation failed!
.It ena%d: failed to pmap registers bar
.It ena%d: can not allocate ifnet structure
.It ena%d: Error with network interface setup
.It ena%d: Failed to enable and set the admin interrupts
.It ena%d: Error, MSI-X is already enabled
.It ena%d: Failed to enable MSIX, vectors %d rc %d
.It ena%d: Not enough number of MSI-X allocated: %d
.It ena%d: Error with MSI-X enablement
.It ena%d: could not allocate irq vector: %d
.It ena%d: unable to allocate bus resource: registers!
.It ena%d: unable to allocate bus resource: msix!
.Pp
Resource allocation failed when initializing the device.
.br
Driver will not be attached.
.It ena%d: ENA device init failed (err: %d)
.It ena%d: Cannot initialize device
.Pp
Device initialization failed.
.br
Driver will not be attached.
.It ena%d: failed to register interrupt handler for irq %ju: %d
.Pp
Error occurred when trying to register the Admin Queue interrupt handler.
.It ena%d: Cannot setup mgmnt queue intr
.Pp
Error occurred during configuration of the Admin Queue interrupts.
.It ena%d: Enable MSI-X failed
.Pp
Configuration of the MSI-X for the Admin Queue failed.
.br
Resources may be lacking, or the interrupts could not be configured.
.br
Driver will not be attached.
.It ena%d: VLAN is in use, detach first
.Pp
VLANs are being used when trying to detach the driver.
.br
VLANs must be detached first, and then the detach routine has to be called
again.
.It ena%d: Unmapped RX DMA tag associations
.It ena%d: Unmapped TX DMA tag associations
.Pp
Error occurred when trying to destroy the RX/TX DMA tag.
.It ena%d: Cannot init indirect table
.It ena%d: Cannot fill indirect table
.It ena%d: Cannot fill hash function
.It ena%d: Cannot fill hash control
.It ena%d: WARNING: RSS was not properly initialized, it will affect bandwidth
.Pp
Error occurred during initialization of one of the RSS resources.
.br
The device will work with reduced performance because all RX packets will be
passed to queue 0 and there will be no hash information.
.It ena%d: LLQ is not supported. Fallback to host mode policy.
.It ena%d: Failed to configure the device mode. Fallback to host mode policy.
.It ena%d: unable to allocate LLQ bar resource. Fallback to host mode policy.
.Pp
Error occurred during Low-latency Queue mode setup.
.br
The device will work, but without the LLQ performance gain.
.It ena%d: failed to enable write combining.
.Pp
Error occurred while setting the Write Combining mode, required for the LLQ.
.It ena%d: failed to tear down irq: %d
.It ena%d: dev has no parent while releasing res for irq: %d
.Pp
Release of the interrupts failed.
.El
.Ss Additional diagnostic
.Bl -diag
.It ena%d: Invalid MTU setting. new_mtu: %d max_mtu: %d min mtu: %d
.Pp
Requested MTU value is not supported and will not be set.
.It ena%d: Failed to set MTU to %d
.Pp
This message appears when either the MTU change feature is not supported, or a
device communication error has occurred.
.It ena%d: Keep alive watchdog timeout.
.Pp
Device stopped responding and will be reset.
.It ena%d: Found a Tx that wasn't completed on time, qid %d, index %d.
.Pp
Packet was pushed to the NIC but not sent within the given time limit.
.br
It may be caused by a hang of the IO queue.
.It ena%d: The number of lost tx completion is above the threshold (%d > %d). Reset the device
.Pp
If too many Tx packets are not completed on time, the device is reset.
.br
It may be caused by a hung queue or device.
.It ena%d: Trigger reset is on
.Pp
Device will be reset.
.br
Reset is triggered either by the watchdog or if too many TX packets were not
completed on time.
.It ena%d: device reset scheduled but trigger_reset is off
.Pp
Reset task has been triggered, but the driver did not request it.
.br
Device reset will not be performed.
.It ena%d: Device reset failed
.Pp
Error occurred while trying to reset the device.
.It ena%d: Cannot initialize device
.It ena%d: Error, mac address are different
.It ena%d: Error, device max mtu is smaller than ifp MTU
.It ena%d: Validation of device parameters failed
.It ena%d: Enable MSI-X failed
.It ena%d: Failed to create I/O queues
.It ena%d: Reset attempt failed. Can not reset the device
.Pp
Error occurred while trying to restore the device after reset.
.It ena%d: Device reset completed successfully, Driver info: %s
.Pp
Device has been correctly restored after reset and is ready to use.
.It ena%d: Allocation for Tx Queue %u failed
.It ena%d: Allocation for Rx Queue %u failed
.It ena%d: Unable to create Rx DMA map for buffer %d
.It ena%d: Failed to create io TX queue #%d rc: %d
.It ena%d: Failed to get TX queue handlers. TX queue num %d rc: %d
.It ena%d: Failed to create io RX queue[%d] rc: %d
.It ena%d: Failed to get RX queue handlers. RX queue num %d rc: %d
.It ena%d: could not allocate irq vector: %d
.It ena%d: failed to register interrupt handler for irq %ju: %d
.Pp
IO resources initialization failed.
.br
Interface will not be brought up.
.It ena%d: LRO[%d] Initialization failed!
.Pp
Initialization of the LRO for the RX ring failed.
.It ena%d: failed to alloc buffer for rx queue
.It ena%d: failed to add buffer for rx queue %d
.It ena%d: refilled rx qid %d with only %d mbufs (from %d)
.Pp
Allocation of resources used on the RX path failed.
.br
If this happens during initialization of the IO queue, the interface will not be
brought up.
.It ena%d: NULL mbuf in rx_info
.Pp
Error occurred while assembling an mbuf from descriptors.
.It ena%d: tx_info doesn't have valid mbuf
.It ena%d: Invalid req_id: %hu
.It ena%d: failed to prepare tx bufs
.Pp
Error occurred while preparing a packet for transmission.
.It ena%d: ioctl promisc/allmulti
.Pp
IOCTL request for the device to work in promiscuous/allmulti mode.
.br
See
.Xr ifconfig 8
for more details.
.El
.Sh SUPPORT
If an issue is identified with the released source code with a supported
adapter, please email the specific information related to the issue to
.Aq Mt akiyano@amazon.com ,
.Aq Mt osamaabb@amazon.com
and
.Aq Mt darinzon@amazon.com .
.Sh SEE ALSO
.Xr netmap 4 ,
.Xr vlan 4 ,
.Xr ifconfig 8
.Sh HISTORY
The
.Nm
driver first appeared in
.Fx 11.1 .
.Sh AUTHORS
The
.Nm
driver was developed by Amazon and originally written by
.An Semihalf .
535