.. SPDX-License-Identifier: GPL-2.0+

=================================================================
Linux Base Driver for the Intel(R) Ethernet Controller 800 Series
=================================================================

Intel ice Linux driver.
Copyright(c) 2018-2021 Intel Corporation.

Contents
========

- Overview
- Identifying Your Adapter
- Important Notes
- Additional Features & Configurations
- Performance Optimization


The associated Virtual Function (VF) driver for this driver is iavf.

Driver information can be obtained using ethtool and lspci.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel adapter. All hardware requirements listed apply to use
with Linux.

This driver supports XDP (Express Data Path) and AF_XDP zero-copy. Note that
XDP is blocked for frame sizes larger than 3KB.


Identifying Your Adapter
========================
For information on how to identify your adapter, and for the latest Intel
network drivers, refer to the Intel Support website:
https://www.intel.com/support


Important Notes
===============

Packet drops may occur under receive stress
-------------------------------------------
Devices based on the Intel(R) Ethernet Controller 800 Series are designed to
tolerate a limited amount of system latency during PCIe and DMA transactions.
If these transactions take longer than the tolerated latency, it can impact the
length of time the packets are buffered in the device and associated memory,
which may result in dropped packets. These packet drops typically do not have
a noticeable impact on throughput and performance under standard workloads.

If these packet drops appear to affect your workload, the following may improve
the situation:

1) Make sure that your system's physical memory is in a high-performance
   configuration, as recommended by the platform vendor. A common
   recommendation is for all channels to be populated with a single DIMM
   module.
2) In your system's BIOS/UEFI settings, select the "Performance" profile.
3) Your distribution may provide tools like "tuned," which can help tweak
   kernel settings to achieve better standard settings for different workloads.


Configuring SR-IOV for improved network security
------------------------------------------------
In a virtualized environment, on Intel(R) Ethernet Network Adapters that
support SR-IOV, the virtual function (VF) may be subject to malicious behavior.
Software-generated layer two frames, like IEEE 802.3x (link flow control), IEEE
802.1Qbb (priority based flow-control), and others of this type, are not
expected and can throttle traffic between the host and the virtual switch,
reducing performance. To resolve this issue, and to ensure isolation from
unintended traffic streams, configure all SR-IOV enabled ports for VLAN tagging
from the administrative interface on the PF. This configuration allows
unexpected, and potentially malicious, frames to be dropped.

See "Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports" later in this
README for configuration instructions.


Do not unload port driver if VF with active VM is bound to it
-------------------------------------------------------------
Do not unload a port's driver if a Virtual Function (VF) with an active Virtual
Machine (VM) is bound to it. Doing so will cause the port to appear to hang.
Once the VM shuts down, or otherwise releases the VF, the command will
complete.


Additional Features and Configurations
======================================

ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest ethtool
version is required for this functionality. Download it at:
https://kernel.org/pub/software/network/ethtool/

NOTE: The rx_bytes value of ethtool does not match the rx_bytes value of
Netdev, due to the 4-byte CRC being stripped by the device. The difference
between the two rx_bytes values will be 4 x the number of Rx packets. For
example, if Rx packets are 10 and Netdev (software statistics) displays
rx_bytes as "X", then ethtool (hardware statistics) will display rx_bytes as
"X+40" (4 bytes CRC x 10 packets).
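
The relationship in the note above is plain arithmetic, and can be sketched as
follows. The helper name ``expected_hw_rx_bytes`` is hypothetical (it is not
part of ethtool or the driver); it only adds 4 bytes of CRC per received packet
to the software counter.

```shell
# Hypothetical helper (not part of ethtool): estimate the hardware rx_bytes
# value from the software (netdev) rx_bytes value and the Rx packet count.
expected_hw_rx_bytes() {
    sw_rx_bytes=$1
    rx_packets=$2
    # The device strips a 4-byte CRC from every received packet.
    echo $((sw_rx_bytes + 4 * rx_packets))
}

# 10 packets totalling 1000 bytes in the software statistics:
expected_hw_rx_bytes 1000 10    # prints 1040
```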


Viewing Link Messages
---------------------
Link messages will not be displayed to the console if the distribution is
restricting system messages. In order to see network driver link messages on
your console, set the console log level to 8 by entering the following::

  # dmesg -n 8

NOTE: This setting is not saved across reboots.


Dynamic Device Personalization
------------------------------
Dynamic Device Personalization (DDP) allows you to change the packet processing
pipeline of a device by applying a profile package to the device at runtime.
Profiles can be used to, for example, add support for new protocols, change
existing protocols, or change default settings. DDP profiles can also be rolled
back without rebooting the system.

The DDP package loads during device initialization. The driver looks for
``intel/ice/ddp/ice.pkg`` in your firmware root (typically ``/lib/firmware/``
or ``/lib/firmware/updates/``) and checks that it contains a valid DDP package
file.

NOTE: Your distribution should likely have provided the latest DDP file, but if
ice.pkg is missing, you can find it in the linux-firmware repository or from
intel.com.

If the driver is unable to load the DDP package, the device will enter Safe
Mode. Safe Mode disables advanced and performance features and supports only
basic traffic and minimal functionality, such as updating the NVM or
downloading a new driver or DDP package. Safe Mode only applies to the affected
physical function and does not impact any other PFs. See the "Intel(R) Ethernet
Adapters and Devices User Guide" for more details on DDP and Safe Mode.

NOTES:

- If you encounter issues with the DDP package file, you may need to download
  an updated driver or DDP package file. See the log messages for more
  information.

- The ice.pkg file is a symbolic link to the default DDP package file.

- You cannot update the DDP package if any PF drivers are already loaded. To
  overwrite a package, unload all PFs and then reload the driver with the new
  package.

- Only the first loaded PF per device can download a package for that device.

You can install specific DDP package files for different physical devices in
the same system. To install a specific DDP package file:

1. Download the DDP package file you want for your device.

2. Rename the file to ice-xxxxxxxxxxxxxxxx.pkg, where 'xxxxxxxxxxxxxxxx' is the
   unique 64-bit PCI Express device serial number (in hex) of the device you
   want the package downloaded on. The filename must include the complete
   serial number (including leading zeros) and be all lowercase. For example,
   if the 64-bit serial number is b887a3ffffca0568, then the file name would be
   ice-b887a3ffffca0568.pkg.

   To find the serial number from the PCI bus address, you can use the
   following command::

     # lspci -vv -s af:00.0 | grep -i Serial
     Capabilities: [150 v1] Device Serial Number b8-87-a3-ff-ff-ca-05-68

   You can use the following command to format the serial number without the
   dashes::

     # lspci -vv -s af:00.0 | grep -i Serial | awk '{print $7}' | sed s/-//g
     b887a3ffffca0568

3. Copy the renamed DDP package file to
   ``/lib/firmware/updates/intel/ice/ddp/``. If the directory does not yet
   exist, create it before copying the file.

4. Unload all of the PFs on the device.

5. Reload the driver with the new package.

NOTE: The presence of a device-specific DDP package file overrides the loading
of the default DDP package file (ice.pkg).
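
The renaming rule in step 2 can be sketched as a small shell function. The
name ``ddp_pkg_name`` is hypothetical; the function only builds the file name
from a serial number in the form printed by lspci, and does not copy or load
anything.

```shell
# Hypothetical helper: build the device-specific DDP package file name from
# the serial number as printed by lspci (for example b8-87-a3-ff-ff-ca-05-68).
ddp_pkg_name() {
    # Drop the dashes and force lowercase, as the file name requires.
    serial=$(printf '%s' "$1" | sed 's/-//g' | tr 'A-F' 'a-f')
    # Reject anything that is not purely hexadecimal.
    case $serial in
        *[!0-9a-f]*) echo "not a hex serial number: $1" >&2; return 1 ;;
    esac
    # A 64-bit serial number is exactly 16 hex digits, leading zeros included.
    if [ ${#serial} -ne 16 ]; then
        echo "expected 16 hex digits, got ${#serial}" >&2
        return 1
    fi
    printf 'ice-%s.pkg\n' "$serial"
}

ddp_pkg_name B8-87-A3-FF-FF-CA-05-68    # prints ice-b887a3ffffca0568.pkg
```

Checking the length up front catches a serial number that lost its leading
zeros before the file is copied into the firmware directory.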


Intel(R) Ethernet Flow Director
-------------------------------
The Intel Ethernet Flow Director performs the following tasks:

- Directs receive packets according to their flows to different queues
- Enables tight control on routing a flow in the platform
- Matches flows and CPU cores for flow affinity

NOTE: This driver supports the following flow types:

- IPv4
- TCPv4
- UDPv4
- SCTPv4
- IPv6
- TCPv6
- UDPv6
- SCTPv6

Each flow type supports valid combinations of IP addresses (source or
destination) and UDP/TCP/SCTP ports (source and destination). You can supply
only a source IP address, a source IP address and a destination port, or any
combination of one or more of these four parameters.

NOTE: This driver allows you to filter traffic based on a user-defined flexible
two-byte pattern and offset by using the ethtool user-def and mask fields. Only
L3 and L4 flow types are supported for user-defined flexible filters. For a
given flow type, you must clear all Intel Ethernet Flow Director filters before
changing the input set (for that flow type).


Flow Director Filters
---------------------
Flow Director filters are used to direct traffic that matches specified
characteristics. They are enabled through ethtool's ntuple interface. To enable
or disable the Intel Ethernet Flow Director and these filters::

  # ethtool -K <ethX> ntuple <off|on>

NOTE: When you disable ntuple filters, all the user programmed filters are
flushed from the driver cache and hardware. All needed filters must be re-added
when ntuple is re-enabled.

To display all of the active filters::

  # ethtool -u <ethX>

To add a new filter::

  # ethtool -U <ethX> flow-type <type> src-ip <ip> [m <ip_mask>] dst-ip <ip> \
  [m <ip_mask>] src-port <port> [m <port_mask>] dst-port <port> [m <port_mask>] \
  action <queue>

  Where:
    <ethX> - the Ethernet device to program
    <type> - can be ip4, tcp4, udp4, sctp4, ip6, tcp6, udp6, sctp6
    <ip> - the IP address to match on
    <ip_mask> - the IPv4 address to mask on
              NOTE: These filters use inverted masks.
    <port> - the port number to match on
    <port_mask> - the 16-bit integer for masking
              NOTE: These filters use inverted masks.
    <queue> - the queue to direct traffic toward (-1 discards the
              matched traffic)
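
Because these filters use inverted masks, a conventional netmask has to be
flipped before it is passed after ``m``. The sketch below assumes that
"inverted" means the bits set in the mask are the ones ignored during matching,
which is consistent with the ``0.255.255.255`` subnet example in this section.
The helper name ``invert_ip4_mask`` is hypothetical, and whether a given
partial mask is accepted is still up to the driver (see the notes at the end of
this section).

```shell
# Hypothetical helper: convert a conventional IPv4 netmask into the inverted
# form used after "m" (assumption: set bits are ignored when matching).
invert_ip4_mask() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into four positional parameters
    IFS=$old_ifs
    printf '%d.%d.%d.%d\n' $((255 - $1)) $((255 - $2)) $((255 - $3)) $((255 - $4))
}

invert_ip4_mask 255.0.0.0        # prints 0.255.255.255
invert_ip4_mask 255.255.255.0    # prints 0.0.0.255
```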

To delete a filter::

  # ethtool -U <ethX> delete <N>

  Where <N> is the filter ID displayed when printing all the active filters,
  and may also have been specified using "loc <N>" when adding the filter.

EXAMPLES:

To add a filter that directs packets to queue 2::

  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
  192.168.10.2 src-port 2000 dst-port 2001 action 2 [loc 1]

To set a filter using only the source and destination IP address::

  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
  192.168.10.2 action 2 [loc 1]

To set a filter based on a user-defined pattern and offset::

  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
  192.168.10.2 user-def 0x4FFFF action 2 [loc 1]

  where the value of the user-def field contains the offset (4 bytes) and
  the pattern (0xffff).

To match TCP traffic sent from 192.168.0.1, port 5300, directed to 192.168.0.5,
port 80, and then send it to queue 7::

  # ethtool -U enp130s0 flow-type tcp4 src-ip 192.168.0.1 dst-ip 192.168.0.5 \
  src-port 5300 dst-port 80 action 7

To add a TCPv4 filter with a partial mask for a source IP subnet::

  # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.0.0 m 0.255.255.255 dst-ip \
  192.168.5.12 src-port 12600 dst-port 31 action 12

NOTES:

For each flow-type, the programmed filters must all have the same matching
input set. For example, issuing the following two commands is acceptable::

  # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
  # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.5 src-port 55 action 10

Issuing the next two commands, however, is not acceptable, since the first
specifies src-ip and the second specifies dst-ip::

  # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
  # ethtool -U enp130s0 flow-type ip4 dst-ip 192.168.0.5 src-port 55 action 10

The second command will fail with an error. You may program multiple filters
with the same fields, using different values, but, on one device, you may not
program two tcp4 filters with different matching fields.

The ice driver does not support matching on a subportion of a field, thus
partial mask fields are not supported.


Flex Byte Flow Director Filters
-------------------------------
The driver also supports matching user-defined data within the packet payload.
This flexible data is specified using the "user-def" field of the ethtool
command in the following way:

.. table::

    ============================== ============================
    ``31    28    24    20    16`` ``15    12    8    4    0``
    ``offset into packet payload`` ``2 bytes of flexible data``
    ============================== ============================

For example,

::

  ... user-def 0x4FFFF ...

tells the filter to look 4 bytes into the payload and match that value against
0xFFFF. The offset is based on the beginning of the payload, and not the
beginning of the packet. Thus

::

  flow-type tcp4 ... user-def 0x8BEAF ...

would match TCP/IPv4 packets which have the value 0xBEAF 8 bytes into the
TCP/IPv4 payload.

Note that ICMP headers are parsed as 4 bytes of header and 4 bytes of payload.
Thus to match the first byte of the payload, you must actually add 4 bytes to
the offset. Also note that ip4 filters match both ICMP frames and raw
(unknown) ip4 frames, where the payload will be the L3 payload of the IP4
frame.

The maximum offset is 64. The hardware will only read up to 64 bytes of data
from the payload. The offset must be even because the flexible data is 2 bytes
long and must be aligned to byte 0 of the packet payload.

The user-defined flexible offset is also considered part of the input set and
cannot be programmed separately for multiple filters of the same type. However,
the flexible data is not part of the input set and multiple filters may use the
same offset but match against different data.
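
The layout in the table above (offset in the upper 16 bits, flexible data in
the lower 16 bits) can be sketched as a small shell function. The name
``flex_user_def`` is hypothetical; the checks simply mirror the rules stated in
this section (even offset, at most 64, two bytes of data).

```shell
# Hypothetical helper: pack an offset and a 2-byte pattern into a user-def value.
flex_user_def() {
    offset=$1
    pattern=$2
    # The offset must be even and no larger than the documented maximum of 64.
    if [ "$offset" -lt 0 ] || [ "$offset" -gt 64 ] || [ $(($offset % 2)) -ne 0 ]; then
        echo "offset must be an even number between 0 and 64" >&2
        return 1
    fi
    # The flexible data is exactly two bytes wide.
    if [ $(($pattern)) -lt 0 ] || [ $(($pattern)) -gt 65535 ]; then
        echo "pattern must fit in 16 bits" >&2
        return 1
    fi
    # Offset goes in bits 31..16, flexible data in bits 15..0.
    printf '0x%X\n' $(( ($offset << 16) | $pattern ))
}

flex_user_def 4 0xFFFF    # prints 0x4FFFF
flex_user_def 8 0xBEAF    # prints 0x8BEAF
```

The two calls reproduce the ``user-def`` values used in the examples above.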


RSS Hash Flow
-------------
RSS Hash Flow allows you to set the hash bytes per flow type and any
combination of one or more options for Receive Side Scaling (RSS) hash byte
configuration.

::

  # ethtool -N <ethX> rx-flow-hash <type> <option>

  Where <type> is:
    tcp4  signifying TCP over IPv4
    udp4  signifying UDP over IPv4
    tcp6  signifying TCP over IPv6
    udp6  signifying UDP over IPv6
  And <option> is one or more of:
    s     Hash on the IP source address of the Rx packet.
    d     Hash on the IP destination address of the Rx packet.
    f     Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet.
    n     Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet.
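
As an illustration, the options can be combined into a single string. In TCP
and UDP headers, bytes 0 and 1 hold the source port and bytes 2 and 3 hold the
destination port, so ``sdfn`` hashes on addresses and ports together. The
sketch below is a dry run: the hypothetical helper ``rss_hash_cmd`` validates
its arguments and prints the command rather than running it, and ``eth0`` is
only an example interface name.

```shell
# Hypothetical dry-run helper: validate the arguments and print the ethtool
# command instead of running it, so it can be reviewed first.
rss_hash_cmd() {
    dev=$1
    type=$2
    opts=$3
    case $type in
        tcp4|udp4|tcp6|udp6) ;;
        *) echo "unsupported flow type: $type" >&2; return 1 ;;
    esac
    case $opts in
        ''|*[!sdfn]*) echo "options must be one or more of s, d, f, n" >&2; return 1 ;;
    esac
    echo "ethtool -N $dev rx-flow-hash $type $opts"
}

rss_hash_cmd eth0 udp4 sdfn    # prints: ethtool -N eth0 rx-flow-hash udp4 sdfn
```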


Accelerated Receive Flow Steering (aRFS)
----------------------------------------
Devices based on the Intel(R) Ethernet Controller 800 Series support
Accelerated Receive Flow Steering (aRFS) on the PF. aRFS is a load-balancing
mechanism that allows you to direct packets to the same CPU where an
application is running or consuming the packets in that flow.

NOTES:

- aRFS requires that ntuple filtering is enabled via ethtool.
- aRFS support is limited to the following packet types:

    - TCP over IPv4 and IPv6
    - UDP over IPv4 and IPv6
    - Nonfragmented packets

- aRFS only supports Flow Director filters, which consist of the
  source/destination IP addresses and source/destination ports.
- aRFS and ethtool's ntuple interface both use the device's Flow Director. aRFS
  and ntuple features can coexist, but you may encounter unexpected results if
  there's a conflict between aRFS and ntuple requests. See "Intel(R) Ethernet
  Flow Director" for additional information.

To set up aRFS:

1. Enable the Intel Ethernet Flow Director and ntuple filters using ethtool.

::

   # ethtool -K <ethX> ntuple on

2. Set up the number of entries in the global flow table. For example:

::

   # NUM_RPS_ENTRIES=16384
   # echo $NUM_RPS_ENTRIES > /proc/sys/net/core/rps_sock_flow_entries

3. Set up the number of entries in the per-queue flow table. For example:

::

   # IFACE=<ethX>
   # NUM_RX_QUEUES=64
   # for file in /sys/class/net/$IFACE/queues/rx-*/rps_flow_cnt; do
   >   echo $(($NUM_RPS_ENTRIES/$NUM_RX_QUEUES)) > "$file"
   > done

4. Disable the IRQ balance daemon (this is only a temporary stop of the service
   until the next reboot).

::

   # systemctl stop irqbalance

5. Configure the interrupt affinity.

   See ``/Documentation/core-api/irq/irq-affinity.rst``
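
The per-queue value written in step 3 is the global table size divided evenly
across the Rx queues. A minimal sketch of that calculation, using a
hypothetical helper name and the example values from steps 2 and 3:

```shell
# Hypothetical helper: split the global RPS flow table evenly across Rx queues.
rps_flow_cnt_per_queue() {
    num_rps_entries=$1
    num_rx_queues=$2
    if [ "$num_rx_queues" -le 0 ]; then
        echo "number of Rx queues must be positive" >&2
        return 1
    fi
    echo $((num_rps_entries / num_rx_queues))
}

rps_flow_cnt_per_queue 16384 64    # prints 256
```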


To disable aRFS using ethtool::

  # ethtool -K <ethX> ntuple off

NOTE: This command will disable ntuple filters and clear any aRFS filters in
software and hardware.

Example Use Case:

1. Set the server application on the desired CPU (e.g., CPU 4).

::

   # taskset -c 4 netserver

2. Use netperf to route traffic from the client to CPU 4 on the server with
   aRFS configured. This example uses TCP over IPv4.

::

   # netperf -H <Host IPv4 Address> -t TCP_STREAM


Enabling Virtual Functions (VFs)
--------------------------------
Use sysfs to enable virtual functions (VF).

For example, you can create 4 VFs as follows::

  # echo 4 > /sys/class/net/<ethX>/device/sriov_numvfs

To disable VFs, write 0 to the same file::

  # echo 0 > /sys/class/net/<ethX>/device/sriov_numvfs

The maximum number of VFs for the ice driver is 256 total (all ports). To check
how many VFs each PF supports, use the following command::

  # cat /sys/class/net/<ethX>/device/sriov_totalvfs
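
The two sysfs files can be combined into a guarded helper. This is a sketch,
not a supported tool: the function name ``set_num_vfs`` and its optional third
argument (an alternative sysfs root, handy for trying the logic on a scratch
directory) are assumptions made for illustration. It also resets the count to
0 first, on the assumption that the PCI core refuses to change one nonzero VF
count directly to another.

```shell
# Hypothetical helper: set the VF count only if the PF supports that many.
# Usage: set_num_vfs <ethX> <count> [sysfs_root]
set_num_vfs() {
    iface=$1
    want=$2
    root=${3:-/sys/class/net}
    dev="$root/$iface/device"
    total=$(cat "$dev/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, but $iface supports at most $total" >&2
        return 1
    fi
    # Assumption: a nonzero VF count must be cleared before a new one is set.
    echo 0 > "$dev/sriov_numvfs" && echo "$want" > "$dev/sriov_numvfs"
}

# Example (requires root and an SR-IOV capable PF):
#   set_num_vfs <ethX> 4
```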

Note: You cannot use SR-IOV when link aggregation (LAG)/bonding is active, and
vice versa. To enforce this, the driver checks for this mutual exclusion.


Displaying VF Statistics on the PF
----------------------------------
Use the following command to display the statistics for the PF and its VFs::

  # ip -s link show dev <ethX>

NOTE: The output of this command can be very large due to the maximum number of
possible VFs.

The PF driver will display a subset of the statistics for the PF and for all
VFs that are configured. The PF will always print a statistics block for each
of the possible VFs, and it will show zero for all unconfigured VFs.


Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports
--------------------------------------------------------
To configure VLAN tagging for the ports on an SR-IOV enabled adapter, use the
following command. The VLAN configuration should be done before the VF driver
is loaded or the VM is booted. The VF is not aware of the VLAN tag being
inserted on transmit and removed on received frames (sometimes called "port
VLAN" mode).

::

  # ip link set dev <ethX> vf <id> vlan <vlan id>

For example, the following will configure PF eth0 and the first VF on VLAN 10::

  # ip link set dev eth0 vf 0 vlan 10


Enabling a VF link if the port is disconnected
----------------------------------------------
If the physical function (PF) link is down, you can force link up (from the
host PF) on any virtual functions (VF) bound to the PF.

For example, to force link up on VF 0 bound to PF eth0::

  # ip link set eth0 vf 0 state enable

Note: If the command does not work, it may not be supported by your system.


Setting the MAC Address for a VF
--------------------------------
To change the MAC address for the specified VF::

  # ip link set <ethX> vf <vf id> mac <address>

For example::

  # ip link set <ethX> vf 0 mac 00:01:02:03:04:05

This setting lasts until the PF is reloaded.

NOTE: Assigning a MAC address for a VF from the host will disable any
subsequent requests to change the MAC address from within the VM. This is a
security feature. The VM is not aware of this restriction, so if this is
attempted in the VM, it will trigger MDD events.


Trusted VFs and VF Promiscuous Mode
-----------------------------------
This feature allows you to designate a particular VF as trusted and allows that
trusted VF to request selective promiscuous mode on the Physical Function (PF).

To set a VF as trusted or untrusted, enter the following command in the
hypervisor::

  # ip link set dev <ethX> vf 1 trust [on|off]

NOTE: It's important to set the VF to trusted before setting promiscuous mode.
If the VF is not trusted, the PF will ignore promiscuous mode requests from the
VF. If the VF becomes trusted after the VF driver is loaded, you must make a
new request to set the VF to promiscuous.

Once the VF is designated as trusted, use the following commands in the VM to
set the VF to promiscuous mode.

For promiscuous all::

  # ip link set <ethX> promisc on
  Where <ethX> is a VF interface in the VM

For promiscuous multicast::

  # ip link set <ethX> allmulticast on
  Where <ethX> is a VF interface in the VM

NOTE: By default, the ethtool private flag vf-true-promisc-support is set to
"off," meaning that promiscuous mode for the VF will be limited. To set the
promiscuous mode for the VF to true promiscuous and allow the VF to see all
ingress traffic, use the following command::

  # ethtool --set-priv-flags <ethX> vf-true-promisc-support on

The vf-true-promisc-support private flag does not enable promiscuous mode;
rather, it designates which type of promiscuous mode (limited or true) you will
get when you enable promiscuous mode using the ip link commands above. Note
that this is a global setting that affects the entire device. However, the
vf-true-promisc-support private flag is only exposed to the first PF of the
device. The PF remains in limited promiscuous mode regardless of the
vf-true-promisc-support setting.

Next, add a VLAN interface on the VF interface. For example::

  # ip link add link eth2 name eth2.100 type vlan id 100

Note that the order in which you set the VF to promiscuous mode and add the
VLAN interface does not matter (you can do either first). The result in this
example is that the VF will get all traffic that is tagged with VLAN 100.


Malicious Driver Detection (MDD) for VFs
----------------------------------------
Some Intel Ethernet devices use Malicious Driver Detection (MDD) to detect
malicious traffic from the VF and disable Tx/Rx queues or drop the offending
packet until a VF driver reset occurs. You can view MDD messages in the PF's
system log using the dmesg command.

- If the PF driver logs MDD events from the VF, confirm that the correct VF
  driver is installed.
- To restore functionality, you can manually reload the VF or VM or enable
  automatic VF resets.
- When automatic VF resets are enabled, the PF driver will immediately reset
  the VF and reenable queues when it detects MDD events on the receive path.
- If automatic VF resets are disabled, the PF will not automatically reset the
  VF when it detects MDD events.

To enable or disable automatic VF resets, use the following command::

  # ethtool --set-priv-flags <ethX> mdd-auto-reset-vf on|off


MAC and VLAN Anti-Spoofing Feature for VFs
------------------------------------------
When a malicious driver on a Virtual Function (VF) interface attempts to send a
spoofed packet, it is dropped by the hardware and not transmitted.

NOTE: This feature can be disabled for a specific VF::

  # ip link set <ethX> vf <vf id> spoofchk {off|on}


Jumbo Frames
------------
Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU)
to a value larger than the default value of 1500.

Use the ifconfig command to increase the MTU size. For example, enter the
following where <ethX> is the interface name::

  # ifconfig <ethX> mtu 9000 up

Alternatively, you can use the ip command as follows::

  # ip link set mtu 9000 dev <ethX>
  # ip link set up dev <ethX>

This setting is not saved across reboots.

NOTE: The maximum MTU setting for jumbo frames is 9702. This corresponds to the
maximum jumbo frame size of 9728 bytes.
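
The 26-byte gap between the two figures can be accounted for by per-frame
overhead. The breakdown below is an assumption chosen because it adds up to the
documented numbers (Ethernet header, frame check sequence, and room for two
VLAN tags); this document does not itself state how the driver derives the
value, and the names used are illustrative only.

```shell
# Sketch under an assumption: frame overhead = Ethernet header + FCS + 2 VLAN tags.
ETH_HEADER_LEN=14   # destination MAC + source MAC + EtherType
ETH_FCS_LEN=4       # frame check sequence (CRC)
VLAN_TAG_LEN=4      # one 802.1Q or 802.1ad tag

# Hypothetical helper: frame size on the wire for a given MTU.
mtu_to_frame_size() {
    echo $(($1 + ETH_HEADER_LEN + ETH_FCS_LEN + 2 * VLAN_TAG_LEN))
}

mtu_to_frame_size 9702    # prints 9728
```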

NOTE: This driver will attempt to use multiple page-sized buffers to receive
each jumbo packet. This should help to avoid buffer starvation issues when
allocating receive packets.

NOTE: Packet loss may have a greater impact on throughput when you use jumbo
frames. If you observe a drop in performance after enabling jumbo frames,
enabling flow control may mitigate the issue.


Speed and Duplex Configuration
------------------------------
In addressing speed and duplex configuration issues, you need to distinguish
between copper-based adapters and fiber-based adapters.

In the default mode, an Intel(R) Ethernet Network Adapter using copper
connections will attempt to auto-negotiate with its link partner to determine
the best setting. If the adapter cannot establish link with the link partner
using auto-negotiation, you may need to manually configure the adapter and link
partner to identical settings to establish link and pass packets. This should
only be needed when attempting to link with an older switch that does not
support auto-negotiation or one that has been forced to a specific speed or
duplex mode. Your link partner must match the setting you choose. 1 Gbps speeds
and higher cannot be forced. Use the autonegotiation advertising setting to
manually set devices for 1 Gbps and higher.

Speed, duplex, and autonegotiation advertising are configured through the
ethtool utility. For the latest version, download and install ethtool from the
following website:

   https://kernel.org/pub/software/network/ethtool/

To see the speed configurations your device supports, run the following::

  # ethtool <ethX>

Caution: Only experienced network administrators should force speed and duplex
or change autonegotiation advertising manually. The settings at the switch must
always match the adapter settings. Adapter performance may suffer or your
adapter may not operate if you configure the adapter differently from your
switch.


Data Center Bridging (DCB)
--------------------------
NOTE: The kernel assumes that TC0 is available, and will disable Priority Flow
Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is
enabled when setting up DCB on your switch.

DCB is a configurable Quality of Service (QoS) implementation in hardware. It
uses the VLAN priority tag (802.1p) to filter traffic. That means that there
are 8 different priorities that traffic can be filtered into. It also enables
priority flow control (802.1Qbb) which can limit or eliminate the number of
dropped packets during network stress. Bandwidth can be allocated to each of
these priorities, which is enforced at the hardware level (802.1Qaz).

DCB is normally configured on the network using the DCBX protocol (802.1Qaz), a
specialization of LLDP (802.1AB). The ice driver supports the following
mutually exclusive variants of DCBX support:

1) Firmware-based LLDP Agent
2) Software-based LLDP Agent

In firmware-based mode, firmware intercepts all LLDP traffic and handles DCBX
negotiation transparently for the user. In this mode, the adapter operates in
"willing" DCBX mode, receiving DCB settings from the link partner (typically a
switch). The local user can only query the negotiated DCB configuration. For
information on configuring DCBX parameters on a switch, please consult the
switch manufacturer's documentation.

In software-based mode, LLDP traffic is forwarded to the network stack and user
space, where a software agent can handle it. In this mode, the adapter can
operate in either "willing" or "nonwilling" DCBX mode and DCB configuration can
be both queried and set locally. This mode requires the FW-based LLDP Agent to
be disabled.

NOTE:

- You can enable and disable the firmware-based LLDP Agent using an ethtool
  private flag. Refer to the "FW-LLDP (Firmware Link Layer Discovery Protocol)"
  section in this README for more information.
- In software-based DCBX mode, you can configure DCB parameters using software
  LLDP/DCBX agents that interface with the Linux kernel's DCB Netlink API. We
  recommend using OpenLLDP as the DCBX agent when running in software mode. For
  more information, see the OpenLLDP man pages and
  https://github.com/intel/openlldp.
- The driver implements the DCB netlink interface layer to allow the user space
  to communicate with the driver and query DCB configuration for the port.
- iSCSI with DCB is not supported.


FW-LLDP (Firmware Link Layer Discovery Protocol)
------------------------------------------------
Use ethtool to change FW-LLDP settings. The FW-LLDP setting is per port and
persists across boots.

To enable LLDP::

  # ethtool --set-priv-flags <ethX> fw-lldp-agent on

To disable LLDP::

  # ethtool --set-priv-flags <ethX> fw-lldp-agent off

To check the current LLDP setting::

  # ethtool --show-priv-flags <ethX>

NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting to
take effect. If "LLDP Agent" is set to disabled, you cannot enable it from the
OS.


Flow Control
------------
Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable
receiving and transmitting pause frames for ice. When transmit is enabled,
pause frames are generated when the receive packet buffer crosses a predefined
threshold. When receive is enabled, the transmit unit will halt for the time
delay specified when a pause frame is received.

NOTE: You must have a flow control capable link partner.

Flow Control is disabled by default.

Use ethtool to change the flow control settings.

To enable or disable Rx or Tx Flow Control::

  # ethtool -A <ethX> rx <on|off> tx <on|off>

Note: This command only enables or disables Flow Control if auto-negotiation is
disabled. If auto-negotiation is enabled, this command changes the parameters
used for auto-negotiation with the link partner.

Note: Flow Control auto-negotiation is part of link auto-negotiation. Depending
on your device, you may not be able to change the auto-negotiation setting.

NOTE:

- The ice driver requires flow control on both the port and link partner. If
  flow control is disabled on one of the sides, the port may appear to hang
  under heavy traffic.
- You may encounter issues with link-level flow control (LFC) after disabling
  DCB. The LFC status may show as enabled but traffic is not paused. To resolve
  this issue, disable and reenable LFC using ethtool::

   # ethtool -A <ethX> rx off tx off
   # ethtool -A <ethX> rx on tx on
798
799
NAPI
----

This driver supports NAPI (Rx polling mode).

See :ref:`Documentation/networking/napi.rst <napi>` for more information.

MACVLAN
-------
This driver supports MACVLAN. Kernel support for MACVLAN can be tested by
checking if the MACVLAN driver is loaded. You can run 'lsmod | grep macvlan' to
see if the MACVLAN driver is loaded or run 'modprobe macvlan' to try to load
the MACVLAN driver.

NOTE:

- In passthru mode, you can only set up one MACVLAN device. It will inherit the
  MAC address of the underlying PF (Physical Function) device.


IEEE 802.1ad (QinQ) Support
---------------------------
The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN
IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as
"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks
allow L2 tunneling and the ability to segregate traffic within a particular
VLAN ID, among other uses.

NOTES:

- Receive checksum offloads and VLAN acceleration are not supported for 802.1ad
  (QinQ) packets.

- 0x88A8 traffic will not be received unless VLAN stripping is disabled with
  the following command::

    # ethtool -K <ethX> rxvlan off

- 0x88A8/0x8100 double VLANs cannot be used with 0x8100 or 0x8100/0x8100 VLANs
  configured on the same port. 0x88A8/0x8100 traffic will not be received if
  0x8100 VLANs are configured.

- The VF can only transmit 0x88A8/0x8100 (i.e., 802.1ad/802.1Q) traffic if:

    1) The VF is not assigned a port VLAN.
    2) spoofchk is disabled from the PF. If you enable spoofchk, the VF will
       not transmit 0x88A8/0x8100 traffic.

- The VF may not receive all network traffic based on the Inner VLAN header
  when VF true promiscuous mode (vf-true-promisc-support) and double VLANs are
  enabled in SR-IOV mode.

The following are examples of how to configure 802.1ad (QinQ)::

  # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
  # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371

Where "24" and "371" are example VLAN IDs.


Tunnel/Overlay Stateless Offloads
---------------------------------
Supported tunnels and overlays include VXLAN, GENEVE, and others depending on
hardware and software configuration. Stateless offloads are enabled by default.

To view the current state of all offloads::

  # ethtool -k <ethX>


UDP Segmentation Offload
------------------------
UDP Segmentation Offload allows the adapter to offload transmit segmentation of
UDP packets with payloads up to 64K into valid Ethernet frames. Because the
adapter hardware is able to complete data segmentation much faster than
operating system software, this feature may improve transmission performance.
In addition, the adapter may use fewer CPU resources.

NOTE:

- The application sending UDP packets must support UDP segmentation offload.

To enable/disable UDP Segmentation Offload, issue the following command::

  # ethtool -K <ethX> tx-udp-segmentation [off|on]


GNSS module
-----------
Requires a kernel compiled with CONFIG_GNSS=y or CONFIG_GNSS=m.
Allows the user to read messages from the GNSS hardware module and write
supported commands. If the module is physically present, a GNSS device is
spawned: ``/dev/gnss<id>``.
The protocol of the write commands depends on the GNSS hardware module, because
the driver passes the raw bytes written through the GNSS object to the receiver
over I2C. Refer to the hardware GNSS module documentation for configuration
details.


Firmware (FW) logging
---------------------
The driver supports FW logging via the debugfs interface on PF 0 only. The FW
running on the NIC must support FW logging; if it does not, the 'fwlog' file
is not created in the ice debugfs directory.

Module configuration
~~~~~~~~~~~~~~~~~~~~
Firmware logging is configured on a per-module basis. Each module can be set to
a value independent of the other modules (unless the module 'all' is specified).
The modules will be instantiated under the 'fwlog/modules' directory.

The user can set the log level for a module by writing to the module file like
this::

  # echo <log_level> > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/<module>

where

* log_level is a name as described below. Each level includes the
  messages from the previous/lower level

      * none
      * error
      * warning
      * normal
      * verbose

* module is a name that represents the module to receive events for. The
  module names are

      * general
      * ctrl
      * link
      * link_topo
      * dnl
      * i2c
      * sdp
      * mdio
      * adminq
      * hdma
      * lldp
      * dcbx
      * dcb
      * xlr
      * nvm
      * auth
      * vpd
      * iosf
      * parser
      * sw
      * scheduler
      * txq
      * rsvd
      * post
      * watchdog
      * task_dispatch
      * mng
      * synce
      * health
      * tsdrv
      * pfreg
      * mdlver
      * all

The name 'all' is special and allows the user to set all of the modules to the
specified log_level or to read the log_level of all of the modules.

Example usage to configure the modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To set a single module to 'verbose'::

  # echo verbose > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/link

To set multiple modules, issue the command once per module::

  # echo verbose > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/link
  # echo warning > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/ctrl
  # echo none > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/dcb

To set all the modules to the same value::

  # echo normal > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/all

To read the log_level of a specific module (e.g. module 'general')::

  # cat /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/general

To read the log_level of all the modules::

  # cat /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules/all
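
Because the module files accept only the log level names listed above, a small
wrapper can reject typos before anything is written. The following is a minimal
sketch; ``fwlog_set_level`` is a hypothetical helper, and the directory passed
to it is assumed to be the 'fwlog' directory of the PF::

  # Set the log level of one FW logging module.
  # Usage: fwlog_set_level <fwlog_dir> <module> <log_level>
  fwlog_set_level() {
      case "$3" in
          none|error|warning|normal|verbose) ;;
          *) echo "invalid log level: $3" >&2; return 1 ;;
      esac
      # The driver creates one file per module under fwlog/modules.
      if [ ! -e "$1/modules/$2" ]; then
          echo "unknown module: $2" >&2
          return 1
      fi
      echo "$3" > "$1/modules/$2"
  }

It could be used as follows::

  # fwlog_set_level /sys/kernel/debug/ice/0000\:18\:00.0/fwlog link verbose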

Enabling FW log
~~~~~~~~~~~~~~~
Configuring the modules indicates to the FW that the configured modules should
generate events that the driver is interested in, but it **does not** send the
events to the driver until the enable message is sent to the FW. To do this,
the user can write a 1 (enable) or 0 (disable) to 'fwlog/enable'. An example
is::

  # echo 1 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/enable

Retrieving FW log data
~~~~~~~~~~~~~~~~~~~~~~
The FW log data can be retrieved by reading from 'fwlog/data'. The user can
write any value to 'fwlog/data' to clear the data. The data can only be cleared
when FW logging is disabled. The FW log data is a binary file that is sent to
Intel and used to help debug user issues.

An example to read the data is::

  # cat /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/data > fwlog.bin

An example to clear the data is::

  # echo 0 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/data
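
Taken together, the steps above suggest a capture sequence: enable logging,
reproduce the issue, disable logging, save the data, and then clear it, since
clearing is only possible while logging is disabled. The following is a minimal
sketch of that sequence; ``fwlog_capture`` and its arguments are hypothetical,
and the modules are assumed to have been configured beforehand::

  # Capture a FW log into a file.
  # Usage: fwlog_capture <fwlog_dir> <seconds> <output_file>
  fwlog_capture() {
      echo 1 > "$1/enable" || return 1   # start sending events to the driver
      sleep "$2"                         # reproduce the issue in this window
      echo 0 > "$1/enable" || return 1   # stop logging before read and clear
      cat "$1/data" > "$3" || return 1   # save the binary log
      echo 0 > "$1/data"                 # clear the stored data
  }

It could be used as follows::

  # fwlog_capture /sys/kernel/debug/ice/0000\:18\:00.0/fwlog 30 fwlog.bin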

Changing how often the log events are sent to the driver
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The driver receives FW log data from the Admin Receive Queue (ARQ). How often
the FW sends the ARQ events can be configured by writing to
'fwlog/nr_messages'. The range is 1-128 (1 means push every log message, 128
means push only when the max AQ command buffer is full). The suggested value is
10. The user can read the currently configured value from
'fwlog/nr_messages'. An example to set the value is::

  # echo 50 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/nr_messages

Configuring the amount of memory used to store FW log data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The driver stores FW log data within the driver. The default size of the memory
used to store the data is 1MB. Some use cases may require more or less data, so
the user can change the amount of memory that is allocated for FW log data.
To change the amount of memory, write to 'fwlog/log_size'. The value must be
one of: 128K, 256K, 512K, 1M, or 2M. FW logging must be disabled to change the
value. An example of changing the value is::

  # echo 128K > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/log_size

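Both settings accept only a small set of values, so a script may want to check
them before writing. The following is a minimal sketch using the range and
sizes stated above; the function names are hypothetical::

  # Succeed if $1 is a valid 'fwlog/nr_messages' value (integer 1-128).
  fwlog_valid_nr_messages() {
      case "$1" in
          ''|*[!0-9]*) return 1 ;;
      esac
      [ "$1" -ge 1 ] && [ "$1" -le 128 ]
  }

  # Succeed if $1 is a valid 'fwlog/log_size' value.
  fwlog_valid_log_size() {
      case "$1" in
          128K|256K|512K|1M|2M) return 0 ;;
          *) return 1 ;;
      esac
  }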

Performance Optimization
========================
Driver defaults are meant to fit a wide variety of workloads, but if further
optimization is required, we recommend experimenting with the following
settings.


Rx Descriptor Ring Size
-----------------------
To reduce the number of Rx packet discards, increase the number of Rx
descriptors for each Rx ring using ethtool.

  Check if the interface is dropping Rx packets due to buffers being full
  (rx_dropped.nic can indicate a lack of available PCIe bandwidth)::

    # ethtool -S <ethX> | grep "rx_dropped"

  If the previous command shows drops on queues, it may help to increase
  the number of descriptors using 'ethtool -G'::

    # ethtool -G <ethX> rx <N>

  Where <N> is the desired number of ring entries/descriptors.

  This can provide temporary buffering for issues that create latency while
  the CPUs process descriptors.

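Since ``ethtool -S`` prints many counters, it can be convenient to show only
the drop counters that are non-zero. The following is a minimal sketch,
assuming the usual ``name: value`` layout of the statistics output;
``rx_drops_nonzero`` is a hypothetical helper::

  # Print only the rx_dropped counters with a non-zero value.
  # Reads the output of 'ethtool -S <ethX>' on stdin.
  rx_drops_nonzero() {
      awk -F':' '$1 ~ /rx_dropped/ && $2 + 0 > 0 {
          gsub(/^[ \t]+/, "", $1)
          print $1 ":" $2
      }'
  }

It could be used as follows::

  # ethtool -S <ethX> | rx_drops_nonzero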

Interrupt Rate Limiting
-----------------------
This driver supports an adaptive interrupt throttle rate (ITR) mechanism that
is tuned for general workloads. The user can customize the interrupt rate
control for specific workloads, via ethtool, adjusting the number of
microseconds between interrupts.

To set the interrupt rate manually, you must disable adaptive mode::

  # ethtool -C <ethX> adaptive-rx off adaptive-tx off

For lower CPU utilization:

  Disable adaptive ITR and lower the Rx and Tx interrupt rates. The examples
  below affect every queue of the specified interface.

  Setting rx-usecs and tx-usecs to 80 will limit interrupts to about
  12,500 interrupts per second per queue::

    # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 80 tx-usecs 80

For reduced latency:

  Disable both adaptive ITR and ITR itself by setting rx-usecs and tx-usecs
  to 0 using ethtool::

    # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 0 tx-usecs 0
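
The rates quoted above follow from the interval: with at most one interrupt
every N microseconds, a queue sees at most 1,000,000 / N interrupts per second.
The following is a minimal sketch of that arithmetic; ``itr_usecs_to_rate`` is
a hypothetical helper, not an ethtool feature::

  # Print the approximate maximum interrupts per second per queue
  # for a given rx-usecs/tx-usecs value.
  itr_usecs_to_rate() {
      if [ "$1" -eq 0 ]; then
          echo "unlimited"      # 0 disables interrupt throttling
      else
          echo $(( 1000000 / $1 ))
      fi
  }

For example, ``itr_usecs_to_rate 80`` prints 12500.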

Per-queue interrupt rate settings:

  The following examples are for queues 1 and 3, but you can adjust other
  queues.

  To disable Rx adaptive ITR and set static Rx ITR to 10 microseconds or
  about 100,000 interrupts/second, for queues 1 and 3::

    # ethtool --per-queue <ethX> queue_mask 0xa --coalesce adaptive-rx off \
      rx-usecs 10

  To show the current coalesce settings for queues 1 and 3::

    # ethtool --per-queue <ethX> queue_mask 0xa --show-coalesce
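
The ``queue_mask`` value is a bitmask in which bit N selects queue N, so queues
1 and 3 give 0x2 | 0x8 = 0xa. The following is a minimal sketch for building
the mask from a list of queue numbers; the ``queue_mask`` shell function is a
hypothetical helper, not part of ethtool::

  # Print the hexadecimal queue_mask for the given queue numbers.
  # Usage: queue_mask 1 3
  queue_mask() {
      mask=0
      for q in "$@"; do
          mask=$(( mask | (1 << q) ))
      done
      printf '0x%x\n' "$mask"
  }

It could be used as follows::

  # ethtool --per-queue <ethX> queue_mask "$(queue_mask 1 3)" --show-coalesce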

Bounding interrupt rates using rx-usecs-high:

  :Valid Range: 0-236 (0=no limit)

   The range of 0-236 microseconds provides an effective range of 4,237 to
   250,000 interrupts per second. The value of rx-usecs-high can be set
   independently of rx-usecs and tx-usecs in the same ethtool command, and is
   also independent of the adaptive interrupt moderation algorithm. The
   underlying hardware supports granularity in 4-microsecond intervals, so
   adjacent values may result in the same interrupt rate.

  The following command would disable adaptive interrupt moderation, and allow
  a maximum of 5 microseconds before indicating a receive or transmit was
  complete. However, instead of resulting in as many as 200,000 interrupts per
  second, it limits total interrupts per second to 50,000 via the rx-usecs-high
  parameter.

  ::

    # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs-high 20 \
      rx-usecs 5 tx-usecs 5

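To pick an ``rx-usecs-high`` value for a target interrupt ceiling, the value
follows from usecs = 1,000,000 / rate. The following is a minimal sketch that
also enforces the valid range; ``rate_to_usecs_high`` is a hypothetical helper,
and it does not model the 4-microsecond hardware granularity::

  # Print the rx-usecs-high value for a maximum interrupt rate (per second).
  rate_to_usecs_high() {
      [ "$1" -gt 0 ] || return 1
      usecs=$(( 1000000 / $1 ))
      if [ "$usecs" -gt 236 ]; then
          echo "rate $1 is below what rx-usecs-high can express" >&2
          return 1
      fi
      echo "$usecs"
  }

For example, ``rate_to_usecs_high 50000`` prints 20, the value used in the
command above.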

Virtualized Environments
------------------------
In addition to the other suggestions in this section, the following may be
helpful to optimize performance in VMs.

  Using the appropriate mechanism (vcpupin) in the VM, pin the CPUs to
  individual logical CPUs (LCPUs), making sure to use a set of CPUs included
  in the device's local_cpulist: ``/sys/class/net/<ethX>/device/local_cpulist``.

  Configure as many Rx/Tx queues in the VM as available. (See the iavf driver
  documentation for the number of queues supported.) For example::

    # ethtool -L <virt_interface> rx <max> tx <max>

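``local_cpulist`` uses the kernel's CPU list format (for example ``0-3,8``),
whereas pinning is done one CPU at a time. The following is a minimal sketch
that expands such a list into individual CPU numbers; ``expand_cpulist`` is a
hypothetical helper and handles only plain numbers and ranges::

  # Expand a CPU list such as "0-3,8" into one CPU number per line.
  expand_cpulist() {
      echo "$1" | tr ',' '\n' | while IFS='-' read -r first last; do
          [ -n "$first" ] || continue
          seq "$first" "${last:-$first}"
      done
  }

It could be used as follows::

  # expand_cpulist "$(cat /sys/class/net/<ethX>/device/local_cpulist)"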

Support
=======
For general information, go to the Intel support website at:
https://www.intel.com/support/

If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to intel-wired-lan@lists.osuosl.org.


Trademarks
==========
Intel is a trademark or registered trademark of Intel Corporation or its
subsidiaries in the United States and/or other countries.

* Other names and brands may be claimed as the property of others.