.. SPDX-License-Identifier: GPL-2.0

=================
Checksum Offloads
=================


Introduction
============

This document describes a set of techniques in the Linux networking stack to
take advantage of checksum offload capabilities of various NICs.

The following technologies are described:

* TX Checksum Offload
* LCO: Local Checksum Offload
* RCO: Remote Checksum Offload

Things that should be documented here but aren't yet:

* CHECKSUM_UNNECESSARY conversion


TX Checksum Offload
===================

In brief, TX checksum offload allows the stack to request that the device fill
in a single ones-complement checksum defined by the sk_buff fields
skb->csum_start and skb->csum_offset. The device should compute the 16-bit
ones-complement checksum (i.e. the 'IP-style' checksum) from csum_start to the
end of the packet, and fill in the result at (csum_start + csum_offset).

Because csum_offset cannot be negative, this ensures that the previous value of
the checksum field is included in the checksum computation, thus it can be used
to supply any needed corrections to the checksum (such as the sum of the
pseudo-header for UDP or TCP).

This interface only allows a single checksum to be offloaded. Where
encapsulation is used, the packet may have multiple checksum fields in
different header layers, and the rest will have to be handled by another
mechanism such as LCO or RCO.

SCTP CRC32c can also be offloaded using this interface, by means of filling
skb->csum_start and skb->csum_offset as described above, setting
skb->csum_not_inet, and advertising NETIF_F_SCTP_CRC. Drivers must not treat
ordinary IP checksum offload as SCTP CRC32c support.

No offloading of the IP header checksum is performed; it is always done in
software. This is OK because when we build the IP header, we obviously have it
in cache, so summing it isn't expensive. It's also rather short.
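
The csum_start/csum_offset contract described at the start of this section can
be modelled in user space. The following is a minimal sketch, not kernel or
driver code: csum_fold_sum() and emulate_tx_csum() are hypothetical names, the
packet is a plain byte array rather than an sk_buff, and csum_offset is assumed
to be even so that the checksum field lines up with the 16-bit words being
summed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones-complement sum of len bytes as big-endian 16-bit words, folded to
 * 16 bits. The result is NOT complemented. (Hypothetical helper.)
 */
static uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)data[i] << 8 | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad odd byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* end-around carry */
	return (uint16_t)sum;
}

/* Model of what a device is asked to do: sum from csum_start to the end of
 * the packet, including whatever seed value already sits in the checksum
 * field, and store the complement at (csum_start + csum_offset).
 * (Hypothetical helper; assumes an even csum_offset.)
 */
static void emulate_tx_csum(uint8_t *pkt, size_t len,
			    size_t csum_start, size_t csum_offset)
{
	uint16_t csum;

	assert(csum_offset % 2 == 0);
	assert(csum_start + csum_offset + 2 <= len);

	csum = (uint16_t)~csum_fold_sum(pkt + csum_start, len - csum_start);
	pkt[csum_start + csum_offset] = csum >> 8;
	pkt[csum_start + csum_offset + 1] = csum & 0xff;
}
```

Because the seed value in the checksum field takes part in the sum, a caller
would store the pseudo-header sum there before calling emulate_tx_csum(); after
the call, the bytes from csum_start onwards sum to the complement of that seed.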

The requirements for GSO are more complicated, because when segmenting an
encapsulated packet both the inner and outer checksums may need to be edited or
recomputed for each resulting segment.

A driver declares its offload capabilities in netdev->hw_features; see
Documentation/networking/netdev-features.rst for more. NETIF_F_IP_CSUM and
NETIF_F_IPV6_CSUM are restricted legacy features and are being deprecated in
favor of NETIF_F_HW_CSUM. New devices should use NETIF_F_HW_CSUM to advertise
generic checksum offload. The skb_csum_hwoffload_help() helper can resolve
CHECKSUM_PARTIAL according to the device's advertised checksum capabilities,
falling back to software when needed.

The stack should, for the most part, assume that checksum offload is supported
by the underlying device. The only place that should check is
validate_xmit_skb(), and the functions it calls directly or indirectly. That
function compares the offload features requested by the SKB (which may include
other offloads besides TX Checksum Offload) and, if they are not supported or
enabled on the device (determined by netdev->features), performs the
corresponding offload in software. In the case of TX Checksum Offload, that
means calling skb_csum_hwoffload_help(skb, features).


LCO: Local Checksum Offload
===========================

LCO is a technique for efficiently computing the outer checksum of an
encapsulated datagram when the inner checksum is due to be offloaded.

The ones-complement sum of a correctly checksummed TCP or UDP packet is equal
to the complement of the sum of the pseudo header, because everything else gets
'cancelled out' by the checksum field. This is because the sum was
complemented before being written to the checksum field.

More generally, this holds in any case where the 'IP-style' ones complement
checksum is used, and thus any checksum that TX Checksum Offload supports.
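
The cancellation property above can be checked numerically. The sketch below is
a user-space illustration only, not the kernel's lco_csum(): add16(), sum16()
and lco_style_sum() are hypothetical names, the packet is a plain byte array,
and all lengths and offsets are assumed to be even. It predicts the
ones-complement sum of a whole packet before the inner checksum has been filled
in, by summing only the outer bytes and adding the complement of the value
currently stored in the inner checksum field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two 16-bit values with end-around carry. (Hypothetical helper.) */
static uint16_t add16(uint16_t a, uint16_t b)
{
	uint32_t s = (uint32_t)a + b;

	return (uint16_t)((s & 0xffff) + (s >> 16));
}

/* Ones-complement sum of an even number of bytes, taken as big-endian
 * 16-bit words. (Hypothetical helper.)
 */
static uint16_t sum16(const uint8_t *p, size_t len)
{
	uint16_t sum = 0;
	size_t i;

	assert(len % 2 == 0);
	for (i = 0; i < len; i += 2)
		sum = add16(sum, (uint16_t)(p[i] << 8 | p[i + 1]));
	return sum;
}

/* Sum of everything from outer_start to the end of the packet, computed
 * without reading any byte past csum_start other than the inner checksum
 * field itself. Valid only once the inner checksum is later filled in
 * over the range starting at csum_start. (Hypothetical helper.)
 */
static uint16_t lco_style_sum(const uint8_t *pkt, size_t outer_start,
			      size_t csum_start, size_t csum_offset)
{
	const uint8_t *field = pkt + csum_start + csum_offset;
	uint16_t seed = (uint16_t)(field[0] << 8 | field[1]);

	return add16(sum16(pkt + outer_start, csum_start - outer_start),
		     (uint16_t)~seed);
}
```

Calling lco_style_sum() first, then writing the complemented sum of the bytes
from csum_start onwards into the inner checksum field, and finally running
sum16() over the whole packet should give the same value both times.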

That is, if we have set up TX Checksum Offload with a start/offset pair, we
know that after the device has filled in that checksum, the ones complement sum
from csum_start to the end of the packet will be equal to the complement of
whatever value we put in the checksum field beforehand. This allows us to
compute the outer checksum without looking at the payload: we simply stop
summing when we get to csum_start, then add the complement of the 16-bit word
at (csum_start + csum_offset).

Then, when the true inner checksum is filled in (either by hardware or by
skb_checksum_help()), the outer checksum will become correct by virtue of the
arithmetic.

LCO is performed by the stack when constructing an outer UDP header for an
encapsulation such as VXLAN or GENEVE, in udp_set_csum(). Similarly for the
IPv6 equivalents, in udp6_set_csum().

It is also performed when constructing GRE headers with the shared
gre_build_header() helper in include/net/gre.h, which is used by both IPv4 and
IPv6 GRE.

All of the LCO implementations use a helper function lco_csum(), in
include/linux/skbuff.h.

LCO can safely be used for nested encapsulations; in this case, the outer
encapsulation layer will sum over both its own header and the 'middle' header.
This does mean that the 'middle' header will get summed multiple times, but
there doesn't seem to be a way to avoid that without incurring bigger costs
(e.g. in SKB bloat).


RCO: Remote Checksum Offload
============================

RCO is a technique for eliding the inner checksum of an encapsulated datagram,
allowing the outer checksum to be offloaded. It does, however, involve a
change to the encapsulation protocols, which the receiver must also support.
For this reason, it is disabled by default.

RCO is detailed in the following Internet-Drafts:

* https://tools.ietf.org/html/draft-herbert-remotecsumoffload-00
* https://tools.ietf.org/html/draft-herbert-vxlan-rco-00

In Linux, RCO is implemented individually in each encapsulation protocol, and
most tunnel types have flags controlling its use. For instance, VXLAN has the
configuration flag VXLAN_F_REMCSUM_TX to indicate that RCO should be used when
transmitting.


RX Checksum Offload
===================

RX checksum offload is controlled via NETIF_F_RXCSUM. When it is disabled, the
driver must not set skb->ip_summed on ingress packets. As mentioned, the IPv4
header checksum is not offloaded; the RXCSUM feature controls the offload of
verification of transport layer checksums.

Note that packets with bad TCP/UDP checksums must still be passed
to the stack. skb->ip_summed of such packets can be set to ``CHECKSUM_COMPLETE``
or left at ``CHECKSUM_NONE``. Drivers **must not discard** packets with
bad TCP/UDP checksums and must not configure the device to drop them.
Checksum validation is relatively inexpensive and having bad packets reflected
in SNMP counters is crucial for network monitoring.

skb checksum documentation
==========================

.. kernel-doc:: include/linux/skbuff.h
   :doc: skb checksums