.. SPDX-License-Identifier: GPL-2.0

=====================
Segmentation Offloads
=====================


Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * UDP Segmentation Offload - USO
 * IPIP, SIT, GRE, and UDP Tunnel Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL
 * ESP Segmentation Offload
 * Fraglist Generic Segmentation Offload - GSO_FRAGLIST
 * SCTP acceleration with GSO - GSO_BY_FRAGS
 
 
TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.

TCP segmentation depends on support for partial checksum offload.  For
this reason TSO is normally disabled if the Tx checksum offload for a
given device is disabled.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header.  In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet, or to the inner
transport header for encapsulated TSO.

For IPv4 segmentation two modes of IP ID handling are supported.  The
default behavior is to increment the IP ID with every segment.  If the
GSO type SKB_GSO_TCP_FIXEDID is specified then the IP ID is not
incremented and all segments use the same IP ID.

For encapsulated packets, SKB_GSO_TCP_FIXEDID refers only to the outer header.
SKB_GSO_TCP_FIXEDID_INNER can be used to specify the same for the inner header.
Any combination of these two GSO types is allowed.

If a device has NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when
performing TSO and we will either increment the IP ID for all frames, or leave
it at a static value based on driver preference.  For encapsulated packets,
NETIF_F_TSO_MANGLEID is relevant for both outer and inner headers, unless the
DF bit is not set on the outer header, in which case the device driver must
guarantee that the IP ID field is incremented in the outer header with every
segment.

SKB_GSO_TCP_ACCECN is a modifier used with TCP segmentation offload for
AccECN packets where the CWR bit must not be cleared during segmentation.
Devices advertise support for this using NETIF_F_GSO_ACCECN.
 
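The segment arithmetic and the two IPv4 ID modes described for TSO can be
sketched in plain userspace C.  This is only an illustrative model, not
kernel or driver code: tso_seg_count(), tso_seg_len() and tso_seg_ip_id()
are hypothetical helpers, and the fixed_id flag merely stands in for
SKB_GSO_TCP_FIXEDID being set in gso_type.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Number of frames needed for payload_len bytes of TCP payload when
     * each frame carries at most gso_size bytes (gso_size is non-zero).
     */
    static unsigned int tso_seg_count(unsigned int payload_len,
                                      unsigned int gso_size)
    {
        return (payload_len + gso_size - 1) / gso_size;
    }

    /* Payload bytes carried by segment seg (0-based, seg < seg count). */
    static unsigned int tso_seg_len(unsigned int payload_len,
                                    unsigned int gso_size, unsigned int seg)
    {
        unsigned int left = payload_len - seg * gso_size;

        return left < gso_size ? left : gso_size;
    }

    /* IPv4 ID of segment seg: incremented per segment by default, kept
     * constant in fixed-ID mode.  The 16-bit field wraps naturally.
     */
    static uint16_t tso_seg_ip_id(uint16_t first_id, unsigned int seg,
                                  bool fixed_id)
    {
        return fixed_id ? first_id : (uint16_t)(first_id + seg);
    }

With a 4000 byte payload and a gso_size of 1448 this model produces three
segments carrying 1448, 1448 and 1104 bytes.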
 
UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments.  Many of the requirements for UDP
fragmentation offload are the same as TSO.  However, the IPv4 ID should not
increment from one fragment to the next, because all fragments belong to a
single IPv4 datagram.

UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
still receive them from tuntap and similar devices. Offload of UDP-based
tunnel protocols is still supported.
 
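To make the difference from the segmentation offloads concrete, the
following userspace sketch models how one oversized UDP datagram is cut
into IPv4 fragments that all share a single IP ID.  It is not kernel code:
struct ufo_frag and ufo_fragment() are hypothetical names, and the IPv4
header is assumed to carry no options (20 bytes).

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define MODEL_IPV4_HLEN 20  /* assumption: IPv4 header without options */

    struct ufo_frag {
        uint16_t ip_id;      /* identical in every fragment */
        uint16_t frag_off;   /* fragment offset in 8-byte units */
        uint16_t len;        /* L4 bytes carried by this fragment */
        bool more_frags;     /* MF flag: set on all but the last one */
    };

    /* Split l4_len bytes (UDP header plus payload) for the given MTU.
     * Returns the number of fragments written to out[], or -1 if the
     * input is out of range or out[] (max entries) is too small.
     */
    static int ufo_fragment(uint16_t ip_id, unsigned int l4_len,
                            unsigned int mtu, struct ufo_frag *out, int max)
    {
        unsigned int per_frag, off = 0;
        int n = 0;

        if (mtu < MODEL_IPV4_HLEN + 8 || l4_len > 0xFFFF - MODEL_IPV4_HLEN)
            return -1;

        /* Every fragment except the last carries a multiple of 8 bytes. */
        per_frag = (mtu - MODEL_IPV4_HLEN) & ~7u;

        while (off < l4_len) {
            unsigned int left = l4_len - off;
            unsigned int len = left < per_frag ? left : per_frag;

            if (n == max)
                return -1;
            out[n].ip_id = ip_id;
            out[n].frag_off = (uint16_t)(off / 8);
            out[n].len = (uint16_t)len;
            out[n].more_frags = off + len < l4_len;
            off += len;
            n++;
        }
        return n;
    }

Only the first fragment contains the UDP header, so a receiver has to
reassemble the whole IPv4 datagram before the UDP layer can process it.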
 
UDP Segmentation Offload
========================

UDP segmentation offload allows a device to segment a large UDP packet into
multiple UDP datagrams.  Unlike UFO, these are not IP fragments.  The payload
size of each datagram is specified in skb_shinfo()->gso_size and the GSO type
is SKB_GSO_UDP_L4.  Devices advertise support for this using
NETIF_F_GSO_UDP_L4.
 
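Because every USO segment is a complete UDP datagram, each one carries its
own UDP header with its own length field.  A minimal userspace sketch of
that calculation follows; uso_seg_udp_len() is a hypothetical helper used
only for illustration and does not exist in the kernel.

.. code-block:: c

    #include <stdint.h>

    #define MODEL_UDP_HLEN 8  /* size of the UDP header */

    /* Value of the UDP length field in segment seg (0-based) when
     * payload_len bytes of payload are split into gso_size chunks.
     * Returns 0 when seg lies past the last segment.
     */
    static uint16_t uso_seg_udp_len(unsigned int payload_len,
                                    unsigned int gso_size, unsigned int seg)
    {
        unsigned int off = seg * gso_size;
        unsigned int left;

        if (gso_size == 0 || off >= payload_len)
            return 0;
        left = payload_len - off;
        return (uint16_t)(MODEL_UDP_HLEN +
                          (left < gso_size ? left : gso_size));
    }

For 3000 bytes of payload and a gso_size of 1200, the three datagrams in
this model have UDP lengths of 1208, 1208 and 608.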
 
IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel.  In order to account
for such instances an additional set of segmentation offload types was
introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL.  These extra segmentation types are used to identify
cases where there is more than one set of headers.  For example, in the
case of IPIP and SIT the network and transport headers should be moved
from the standard list of headers to the "inner" header offsets.

Currently only two levels of headers are supported.  The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers.  Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel::

             Outer                  Inner
  MAC        skb_mac_header
  Network    skb_network_header     skb_inner_network_header
  Transport  skb_transport_header

UDP/GRE Tunnel::

             Outer                  Inner
  MAC        skb_mac_header         skb_inner_mac_header
  Network    skb_network_header     skb_inner_network_header
  Transport  skb_transport_header   skb_inner_transport_header

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types indicate that
a non-zero checksum is also requested in the outer header.

Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
header has requested a remote checksum offload.  In this case the inner
headers will be left with a partial checksum and only the outer header
checksum will be computed.
 
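The outer and inner accessors listed for UDP/GRE tunnels can be pictured
with a small userspace model of the offsets they would correspond to.  This
is an illustration only: struct hdr_offsets and vxlan_tcp_offsets() are
hypothetical, and the layout assumes a VXLAN-style frame (Ethernet, IPv4
without options, UDP, an 8-byte tunnel header, then an inner
Ethernet/IPv4/TCP packet), which this document does not prescribe.

.. code-block:: c

    #include <stdint.h>

    /* Header sizes assumed by this model, in bytes. */
    enum {
        MODEL_ETH_HLEN   = 14,
        MODEL_IPV4_LEN   = 20,  /* no IPv4 options */
        MODEL_UDP_LEN    = 8,
        MODEL_TUNNEL_LEN = 8,   /* VXLAN-style tunnel header */
    };

    /* Byte offsets from the start of the frame, one per accessor. */
    struct hdr_offsets {
        uint16_t mac;              /* skb_mac_header */
        uint16_t network;          /* skb_network_header */
        uint16_t transport;        /* skb_transport_header */
        uint16_t inner_mac;        /* skb_inner_mac_header */
        uint16_t inner_network;    /* skb_inner_network_header */
        uint16_t inner_transport;  /* skb_inner_transport_header */
    };

    static struct hdr_offsets vxlan_tcp_offsets(void)
    {
        struct hdr_offsets o;

        o.mac = 0;
        o.network = o.mac + MODEL_ETH_HLEN;
        o.transport = o.network + MODEL_IPV4_LEN;
        o.inner_mac = o.transport + MODEL_UDP_LEN + MODEL_TUNNEL_LEN;
        o.inner_network = o.inner_mac + MODEL_ETH_HLEN;
        o.inner_transport = o.inner_network + MODEL_IPV4_LEN;
        return o;
    }

Under these assumptions the inner TCP header starts 84 bytes into the
frame, which is where csum_start would need to point for encapsulated TSO.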
 
Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above.  What occurs in GSO is that a given skbuff will have its data broken
out over multiple skbuffs that have been resized to match the MSS provided
via skb_shinfo()->gso_size.

Before enabling any hardware segmentation offload a corresponding software
offload is required in GSO.  Otherwise it becomes possible for a frame to
be re-routed between devices and end up impossible to transmit.


Generic Receive Offload
=======================

Generic receive offload is the complement to GSO.  Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO.
 
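The round-trip property that relates GSO and GRO can be demonstrated with a
toy userspace model that looks only at payload bytes and ignores all
headers.  Neither function exists in the kernel; gso_model_segment() and
gro_model_merge() are hypothetical names chosen for this sketch.

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* GSO model: copy segment number seg (0-based) of src into dst and
     * return its length, or 0 when seg lies past the end of the data.
     * dst must have room for gso_size bytes.
     */
    static size_t gso_model_segment(const unsigned char *src, size_t len,
                                    size_t gso_size, size_t seg,
                                    unsigned char *dst)
    {
        size_t off = seg * gso_size;
        size_t n;

        if (gso_size == 0 || off >= len)
            return 0;
        n = len - off < gso_size ? len - off : gso_size;
        memcpy(dst, src + off, n);
        return n;
    }

    /* GRO model: append one in-order segment to the aggregate and return
     * the new aggregate length.  The caller guarantees enough capacity.
     */
    static size_t gro_model_merge(unsigned char *agg, size_t agg_len,
                                  const unsigned char *seg, size_t seg_len)
    {
        memcpy(agg + agg_len, seg, seg_len);
        return agg_len + seg_len;
    }

Feeding every segment produced by the first function into the second, in
order, rebuilds a buffer that is byte-for-byte identical to the original.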
 
Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO.  What
it effectively does is take advantage of certain traits of TCP and tunnels
so that instead of having to rewrite the packet headers for each segment
only the inner-most transport header and possibly the outer-most network
header need to be updated.  This allows devices that do not support tunnel
offloads or tunnel offloads with checksum to still make use of segmentation.

With the partial offload what occurs is that all headers excluding the
inner transport header are updated such that they will contain the correct
values for the case where the header is simply duplicated.  The one
exception to this is the outer IPv4 ID field.  It is up to the device
drivers to guarantee that the IPv4 ID field is incremented in the case that
a given header does not have the DF bit set.
 
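The outer IPv4 ID exception for GSO_PARTIAL can be written down as a
one-line rule.  The userspace sketch below is a model of that rule, not
driver code: partial_outer_ip_id() is a hypothetical helper, and leaving
the ID unchanged when DF is set is a choice made for this sketch rather
than something this document requires.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Outer IPv4 ID to place in wire segment seg (0-based) when the
     * replicated outer header carries base_id.
     */
    static uint16_t partial_outer_ip_id(uint16_t base_id, unsigned int seg,
                                        bool df_set)
    {
        /* Without DF the ID must advance with every segment.  With DF
         * set, this model simply keeps the duplicated value.
         */
        return df_set ? base_id : (uint16_t)(base_id + seg);
    }

All other outer header fields are identical across segments in this model,
which is what allows the header to be duplicated rather than rewritten.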
 
ESP Segmentation Offload
========================

ESP segmentation offload uses SKB_GSO_ESP to mark packets that require
IPsec ESP segmentation.  This type is set by the XFRM output path for GSO
packets handled by ESP hardware offload.


Fraglist Generic Segmentation Offload
=====================================

Fraglist GSO uses SKB_GSO_FRAGLIST to mark packets whose segments are
already arranged as a list of skbs.  The segmentation path splits the skb
based on that list rather than by creating segments of skb_shinfo()->gso_size
bytes from the linear and page-fragment data.
 
 
SCTP acceleration with GSO
===========================

SCTP - despite the lack of hardware support - can still take advantage of
GSO to pass one large packet through the network stack, rather than
multiple small packets.

This requires a different approach from other offloads, as SCTP packets
cannot simply be segmented to the (P)MTU.  Rather, the chunks must be
contained in IP segments, with padding respected.  So unlike regular GSO,
SCTP can't just generate a big skb, set gso_size to the fragmentation point
and deliver it to the IP layer.

Instead, the SCTP protocol layer builds an skb with the segments correctly
padded and stored as chained skbs, and skb_segment() splits based on those.
To signal this, gso_size is set to the special value GSO_BY_FRAGS.

Therefore, any code in the core networking stack must be aware of the
possibility that gso_size will be GSO_BY_FRAGS and handle that case
appropriately.

There are some helpers to make this easier:

- skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
  an skb is an SCTP GSO skb.

- For size checks, the skb_gso_validate_*_len family of helpers correctly
  considers GSO_BY_FRAGS.

- For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
  will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.

This also affects drivers with the NETIF_F_FRAGLIST and NETIF_F_GSO_SCTP
bits set. Note also that NETIF_F_GSO_SCTP is included in
NETIF_F_GSO_SOFTWARE.
 
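The reason generic code must special-case GSO_BY_FRAGS is that the value is
a sentinel, not a length.  The userspace sketch below shows the kind of
branch that is needed; struct model_skb and model_max_seg_len() are
hypothetical, and MODEL_GSO_BY_FRAGS is assumed to mirror the kernel's
0xFFFF sentinel.  Real code should use the helpers listed above instead.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    #define MODEL_GSO_BY_FRAGS 0xFFFF  /* assumed to match GSO_BY_FRAGS */

    /* Heavily simplified stand-in for the GSO fields of an skb. */
    struct model_skb {
        uint16_t gso_size;        /* segment size or the sentinel */
        size_t payload_len;       /* used when gso_size is a real size */
        const size_t *frag_lens;  /* lengths of the chained segments */
        size_t nr_frags;
    };

    /* Largest segment payload this skb will produce once segmented. */
    static size_t model_max_seg_len(const struct model_skb *skb)
    {
        size_t i, max = 0;

        if (skb->gso_size != MODEL_GSO_BY_FRAGS)
            return skb->payload_len < skb->gso_size ?
                   skb->payload_len : skb->gso_size;

        /* Sentinel: sizes come from the chained segments themselves. */
        for (i = 0; i < skb->nr_frags; i++)
            if (skb->frag_lens[i] > max)
                max = skb->frag_lens[i];
        return max;
    }

Treating the sentinel as an ordinary size would report a 65535 byte segment
here, which is why size checks need to take the by-frags path.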
226