.. SPDX-License-Identifier: GPL-2.0

========================================================
TCP Authentication Option Linux implementation (RFC5925)
========================================================

TCP Authentication Option (TCP-AO) provides a TCP extension aimed at verifying
segments between trusted peers. It adds a new TCP header option with
a Message Authentication Code (MAC). MACs are produced from the content
of a TCP segment using a hashing function with a password known to both peers.
The intent of TCP-AO is to deprecate TCP-MD5 by providing better security,
key rotation and support for a variety of hashing algorithms.

1. Introduction
===============

.. table:: Short and Limited Comparison of TCP-AO and TCP-MD5

 +----------------------+------------------------+-----------------------+
 |                      |       TCP-MD5          |         TCP-AO        |
 +======================+========================+=======================+
 |Supported hashing     |MD5                     |Must support HMAC-SHA1 |
 |algorithms            |(cryptographically weak)|(chosen-prefix attacks)|
 |                      |                        |and CMAC-AES-128 (only |
 |                      |                        |side-channel attacks). |
 |                      |                        |May support any hashing|
 |                      |                        |algorithm.             |
 +----------------------+------------------------+-----------------------+
 |Length of MACs (bytes)|16                      |Typically 12-16.       |
 |                      |                        |Other variants that fit|
 |                      |                        |TCP header permitted.  |
 +----------------------+------------------------+-----------------------+
 |Number of keys per    |1                       |Many                   |
 |TCP connection        |                        |                       |
 +----------------------+------------------------+-----------------------+
 |Possibility to change |Non-practical (both     |Supported by protocol  |
 |an active key         |peers have to change    |                       |
 |                      |them during MSL)        |                       |
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Yes: ignoring them     |
 |ICMP 'hard errors'    |                        |by default on          |
 |                      |                        |established connections|
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Yes: pseudo-header     |
 |traffic-crossing      |                        |includes TCP ports.    |
 |attack                |                        |                       |
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Sequence Number        |
 |replayed TCP segments |                        |Extension (SNE) and    |
 |                      |                        |Initial Sequence       |
 |                      |                        |Numbers (ISNs)         |
 +----------------------+------------------------+-----------------------+
 |Supports              |Yes                     |No. ISNs+SNE are needed|
 |Connectionless Resets |                        |to correctly sign RST. |
 +----------------------+------------------------+-----------------------+
 |Standards             |RFC 2385                |RFC 5925, RFC 5926     |
 +----------------------+------------------------+-----------------------+


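As a rough illustration of the MAC row in the table, the sketch below computes
a 12-byte HMAC-SHA1 code and verifies it. It is a simplification under stated
assumptions: ``sign()``, ``verify()`` and ``MAC_LEN`` are hypothetical names,
and the key and message are opaque bytes. In real TCP-AO the signing key is
a traffic key derived from the master key and the ISNs, and the signed input
covers more than the payload (for example the SNE and a pseudo-header).

```python
import hashlib
import hmac

# Assumption for this sketch: keep the first 12 bytes (96 bits) of the digest.
MAC_LEN = 12


def sign(traffic_key, message):
    """Return a truncated HMAC-SHA1 code for `message` (both are bytes)."""
    return hmac.new(traffic_key, message, hashlib.sha1).digest()[:MAC_LEN]


def verify(traffic_key, message, mac):
    """Recompute the code and compare it in constant time."""
    return hmac.compare_digest(sign(traffic_key, message), mac)
```

A receiver that holds the same key recomputes the code over the segment it
received, so a segment modified in flight fails ``verify()``.
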
1.1 Frequently Asked Questions (FAQ) with references to RFC 5925
----------------------------------------------------------------

Q: Can either SendID or RecvID be non-unique for the same 4-tuple
(srcaddr, srcport, dstaddr, dstport)?

A: No [3.1]::

   >> The IDs of MKTs MUST NOT overlap where their TCP connection
   identifiers overlap.

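A userspace key manager could enforce this rule before handing a key to the
kernel. The following is a minimal sketch, assuming MKTs are tracked per peer
prefix; the ``Mkt`` record and ``conflicts()`` helper are hypothetical names
for illustration and are not part of the kernel uAPI.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import Iterable


@dataclass(frozen=True)
class Mkt:
    """Hypothetical userspace record of one MKT (not a kernel structure)."""
    peer: str     # peer prefix, e.g. "192.0.2.0/24"
    send_id: int  # KeyID placed in outgoing segments
    recv_id: int  # KeyID expected in incoming segments


def conflicts(new: Mkt, existing: Iterable[Mkt]) -> bool:
    """True if `new` reuses a SendID or a RecvID of an MKT whose peer
    prefix overlaps with the prefix of `new`."""
    new_net = ip_network(new.peer)
    for mkt in existing:
        if not new_net.overlaps(ip_network(mkt.peer)):
            continue  # different peers: IDs may be reused freely
        if new.send_id == mkt.send_id or new.recv_id == mkt.recv_id:
            return True
    return False
```

Either ID colliding is treated as a conflict, matching the answer above.
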
Q: Can a Master Key Tuple (MKT) for an active connection be removed?

A: No, unless it's copied to the Transport Control Block (TCB) [3.1]::

   It is presumed that an MKT affecting a particular connection cannot
   be destroyed during an active connection -- or, equivalently, that
   its parameters are copied to an area local to the connection (i.e.,
   instantiated) and so changes would affect only new connections.

Q: If an old MKT needs to be deleted, how should it be done in order
to not remove it for an active connection? (It may still be in use
at any later moment.)

A: RFC 5925 does not specify this. It appears to be the job of key management
to ensure that no one uses such an MKT before trying to remove it.

Q: Can an old MKT exist forever and be used by another peer?

A: It can; deciding when to remove an old key is a key management task [6.1]::

   Deciding when to start using a key is a performance issue. Deciding
   when to remove an MKT is a security issue. Invalid MKTs are expected
   to be removed. TCP-AO provides no mechanism to coordinate their removal,
   as we consider this a key management operation.

also [6.1]::

   The only way to avoid reuse of previously used MKTs is to remove the MKT
   when it is no longer considered permitted.

Linux TCP-AO will try its best to prevent you from removing a key that's
being used, considering it a key management failure. But keeping
an outdated key may become a security issue, and a peer may
unintentionally prevent the removal of an old key by always setting
it as RNextKeyID. For this reason a forced key removal mechanism is provided:
userspace has to supply the KeyID to use instead of the one that's being
removed, and the kernel will atomically delete the old key, even if the peer
is still requesting it. There are no guarantees for force-delete, as the peer
may not yet have the new key, in which case the TCP connection may just break.
Alternatively, one may choose to shut down the socket.

Q: What happens when a packet is received on a new connection with no known
MKT's RecvID?

A: RFC 5925 specifies that by default it is accepted with a warning logged, but
the behaviour can be configured by the user [7.5.1.a]::

   If the segment is a SYN, then this is the first segment of a new
   connection. Find the matching MKT for this segment, using the segment's
   socket pair and its TCP-AO KeyID, matched against the MKT's TCP connection
   identifier and the MKT's RecvID.

      i. If there is no matching MKT, remove TCP-AO from the segment.
         Proceed with further TCP handling of the segment.
         NOTE: this presumes that connections that do not match any MKT
         should be silently accepted, as noted in Section 7.3.

[7.3]::

   >> A TCP-AO implementation MUST allow for configuration of the behavior
   of segments with TCP-AO but that do not match an MKT. The initial default
   of this configuration SHOULD be to silently accept such connections.
   If this is not the desired case, an MKT can be included to match such
   connections, or the connection can indicate that TCP-AO is required.
   Alternately, the configuration can be changed to discard segments with
   the AO option not matching an MKT.

[10.2.b]::

   Connections not matching any MKT do not require TCP-AO. Further, incoming
   segments with TCP-AO are not discarded solely because they include
   the option, provided they do not match any MKT.

Note that the Linux TCP-AO implementation differs in this aspect. Currently,
TCP-AO segments with unknown key signatures are discarded with warnings logged.

Q: Does the RFC imply centralized kernel key management in any way?
(i.e. that a key on all connections MUST be rotated at the same time?)

A: Not specified. MKTs can be managed in userspace; the only part relevant to
key changes is [7.3]::

   >> All TCP segments MUST be checked against the set of MKTs for matching
   TCP connection identifiers.

Q: What happens when RNextKeyID requested by a peer is unknown? Should
the connection be reset?

A: It should not; no action needs to be performed [7.5.2.e]::

   ii. If they differ, determine whether the RNextKeyID MKT is ready.

       1. If the MKT corresponding to the segment’s socket pair and RNextKeyID
       is not available, no action is required (RNextKeyID of a received
       segment needs to match the MKT’s SendID).

Q: How is current_key set and when does it change? Is it a user-triggered
change, or is it by a request from the remote peer? Is it set by the user
explicitly, or by a matching rule?

A: current_key is set according to the RNextKeyID received from the peer
[6.1]::

   Rnext_key is changed only by manual user intervention or MKT management
   protocol operation. It is not manipulated by TCP-AO. Current_key is updated
   by TCP-AO when processing received TCP segments as discussed in the segment
   processing description in Section 7.5. Note that the algorithm allows
   the current_key to change to a new MKT, then change back to a previously
   used MKT (known as "backing up"). This can occur during an MKT change when
   segments are received out of order, and is considered a feature of TCP-AO,
   because reordering does not result in drops.

[7.5.2.e.ii]::

   2. If the matching MKT corresponding to the segment’s socket pair and
   RNextKeyID is available:

      a. Set current_key to the RNextKeyID MKT.

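The two answers above (an unknown RNextKeyID is ignored, a known one switches
current_key) can be modelled in a few lines. This is a sketch of the quoted
RFC steps only, not of the kernel code; ``update_current_key()`` and its
arguments are hypothetical names, and MKTs are reduced to their SendIDs.

```python
def update_current_key(available_send_ids, current_send_id, rnext_key_id):
    """Return the SendID to sign outgoing segments with after receiving
    a verified segment carrying `rnext_key_id`.

    available_send_ids: SendIDs of the MKTs matching this connection.
    """
    if rnext_key_id == current_send_id:
        return current_send_id  # the peer already asks for the current key
    if rnext_key_id not in available_send_ids:
        return current_send_id  # [7.5.2.e.ii.1]: no action is required
    return rnext_key_id         # [7.5.2.e.ii.2.a]: switch current_key
```

Because the function depends only on the last received RNextKeyID, a reordered
segment that still requests the older key moves current_key back to it, which
is the "backing up" behaviour mentioned in [6.1].
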
Q: If both peers have multiple MKTs matching the connection's socket pair
(with different KeyIDs), how should the sender/receiver pick KeyID to use?

A: Some mechanism should pick the "desired" MKT [3.3]::

   Multiple MKTs may match a single outgoing segment, e.g., when MKTs
   are being changed. Those MKTs cannot have conflicting IDs (as noted
   elsewhere), and some mechanism must determine which MKT to use for each
   given outgoing segment.

   >> An outgoing TCP segment MUST match at most one desired MKT, indicated
   by the segment’s socket pair. The segment MAY match multiple MKTs, provided
   that exactly one MKT is indicated as desired. Other information in
   the segment MAY be used to determine the desired MKT when multiple MKTs
   match; such information MUST NOT include values in any TCP option fields.

Q: Can a TCP-MD5 connection migrate to TCP-AO (and vice versa)?

A: No [1]::

   TCP MD5-protected connections cannot be migrated to TCP-AO because TCP MD5
   does not support any changes to a connection’s security algorithm
   once established.

Q: If all MKTs are removed on a connection, can it become a non-TCP-AO signed
connection?

A: [7.5.2] doesn't have the same choice as SYN packet handling in [7.5.1.i]
that would allow accepting segments without a signature (which would be
insecure). While switching to a non-TCP-AO connection is not directly
prohibited, prohibiting it seems to be what the RFC intends. Also, there's
a requirement for TCP-AO connections to always have one current_key [3.3]::

   TCP-AO requires that every protected TCP segment match exactly one MKT.

[3.3]::

   >> An incoming TCP segment including TCP-AO MUST match exactly one MKT,
   indicated solely by the segment’s socket pair and its TCP-AO KeyID.

[4.4]::

   One or more MKTs. These are the MKTs that match this connection’s
   socket pair.

Q: Can a non-TCP-AO connection become a TCP-AO-enabled one?

A: No: for an already established non-TCP-AO connection it would be impossible
to switch to using TCP-AO, as the traffic key generation requires the initial
sequence numbers. In other words, starting to use TCP-AO would require
re-establishing the TCP connection.

2. In-kernel MKTs database vs database in userspace
===================================================

Linux TCP-AO support is implemented using ``setsockopt()s``, in a similar way
to TCP-MD5. It means that a userspace application that wants to use TCP-AO
should perform ``setsockopt()`` on a TCP socket when it wants to add,
remove or rotate MKTs. This approach moves the key management responsibility
to userspace, together with decisions on corner cases, e.g. what to do if
the peer doesn't respect RNextKeyID. It thereby moves more code to userspace,
especially the code responsible for policy decisions. Besides, it's flexible
and scales well (with less locking needed than in the case of an in-kernel
database). One should also keep in mind that the main intended users are BGP
processes, not arbitrary applications, which means that, compared to IPsec
tunnels, no transparency is really needed, and modern BGP daemons already have
``setsockopt()s`` for TCP-MD5 support.

.. table:: Considered pros and cons of the approaches

 +----------------------+------------------------+-----------------------+
 |                      |    ``setsockopt()``    |      in-kernel DB     |
 +======================+========================+=======================+
 | Extendability        | ``setsockopt()``       | Netlink messages are  |
 |                      | commands should be     | simple and extendable |
 |                      | extendable syscalls    |                       |
 +----------------------+------------------------+-----------------------+
 | Required userspace   | BGP or any application | could be transparent  |
 | changes              | that wants TCP-AO needs| as tunnels, providing |
 |                      | to perform             | something like        |
 |                      | ``setsockopt()s``      | ``ip tcpao add key``  |
 |                      | and do key management  | (delete/show/rotate)  |
 +----------------------+------------------------+-----------------------+
 |MKTs removal or adding| harder for userspace   | harder for kernel     |
 +----------------------+------------------------+-----------------------+
 | Dump-ability         | ``getsockopt()``       | Netlink .dump()       |
 |                      |                        | callback              |
 +----------------------+------------------------+-----------------------+
 | Limits on kernel     |                      equal                     |
 | resources/memory     |                                                |
 +----------------------+------------------------+-----------------------+
 | Scalability          | contention on          | contention on         |
 |                      | ``TCP_LISTEN`` sockets | the whole database    |
 +----------------------+------------------------+-----------------------+
 | Monitoring & warnings| ``TCP_DIAG``           | same Netlink socket   |
 +----------------------+------------------------+-----------------------+
 | Matching of MKTs     | half-problem: only     | hard                  |
 |                      | listen sockets         |                       |
 +----------------------+------------------------+-----------------------+


3. uAPI
=======

Linux provides a set of ``setsockopt()s`` and ``getsockopt()s`` that let
userspace manage TCP-AO on a per-socket basis. In order to add/delete MKTs,
the ``TCP_AO_ADD_KEY`` and ``TCP_AO_DEL_KEY`` TCP socket options must be used.
It is not allowed to add a key on an established non-TCP-AO connection
or to remove the last key from a TCP-AO connection.

The ``setsockopt(TCP_AO_DEL_KEY)`` command may specify
``tcp_ao_del::current_key`` + ``tcp_ao_del::set_current`` and/or
``tcp_ao_del::rnext`` + ``tcp_ao_del::set_rnext``, which makes such a delete
"forced": it provides userspace a way to delete a key that's being used and
atomically set another one instead. This is not intended for normal use and
should be used only when the peer ignores RNextKeyID and keeps
requesting/using an old key. It provides a way to force-delete a key that's
not trusted, but it may break the TCP-AO connection.

The usual/normal key-rotation can be performed with ``setsockopt(TCP_AO_INFO)``.
It also provides a uAPI to change per-socket TCP-AO settings, such as
ignoring ICMPs, as well as to clear per-socket TCP-AO packet counters.
The corresponding ``getsockopt(TCP_AO_INFO)`` can be used to get those
per-socket TCP-AO settings.

Another useful command is ``getsockopt(TCP_AO_GET_KEYS)``. One can use it
to list all MKTs on a TCP socket or use a filter to get keys for a specific
peer and/or sndid/rcvid, VRF L3 interface or get current_key/rnext_key.

To repair TCP-AO connections ``setsockopt(TCP_AO_REPAIR)`` is available,
provided that the user has previously checkpointed/dumped the socket with
``getsockopt(TCP_AO_REPAIR)``.

A tip for scaled TCP_LISTEN sockets, which may have thousands of TCP-AO
keys: use filters in ``getsockopt(TCP_AO_GET_KEYS)`` and asynchronous
delete with ``setsockopt(TCP_AO_DEL_KEY)``.

Linux TCP-AO also provides a number of segment counters that can be helpful
with troubleshooting/debugging issues. Every MKT has good/bad counters
that reflect how many packets passed/failed verification.
Each TCP-AO socket has the following counters:

- for good segments (properly signed)
- for bad segments (failed TCP-AO verification)
- for segments with unknown keys
- for segments where an AO signature was expected, but wasn't found
- for the number of ignored ICMPs

TCP-AO per-socket counters are also duplicated with per-netns counters,
exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``,
``TCPAORequired`` and ``TCPAODroppedIcmps``.

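The per-netns counters can be collected by a monitoring script. The sketch
below assumes they appear in the ``TcpExt`` section of ``/proc/net/netstat``,
which is laid out as pairs of lines (names, then values); that location is
an assumption of this example, and ``parse_tcp_ao_counters()`` is
a hypothetical helper name.

```python
TCP_AO_COUNTERS = (
    "TCPAOGood",
    "TCPAOBad",
    "TCPAOKeyNotFound",
    "TCPAORequired",
    "TCPAODroppedIcmps",
)


def parse_tcp_ao_counters(netstat_text):
    """Extract the TCP-AO counters from netstat-style text.

    The text is expected to consist of line pairs: a header line with
    counter names and a line with the corresponding values, both
    starting with the same section prefix (for example "TcpExt:").
    """
    lines = netstat_text.splitlines()
    counters = {}
    for header, values in zip(lines[::2], lines[1::2]):
        if not header.startswith("TcpExt:"):
            continue
        names = header.split()[1:]
        numbers = values.split()[1:]
        for name, number in zip(names, numbers):
            if name in TCP_AO_COUNTERS:
                counters[name] = int(number)
    return counters
```

The function takes the file contents as a string, so it can be fed with the
text read from ``/proc/net/netstat`` or with a saved snapshot. A growing
``TCPAOKeyNotFound`` value would, for example, hint at a peer using a key that
was never installed locally.
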
RFC 5925 very permissively specifies how TCP port matching can be done for
MKTs::

   TCP connection identifier. A TCP socket pair, i.e., a local IP
   address, a remote IP address, a TCP local port, and a TCP remote port.
   Values can be partially specified using ranges (e.g., 2-30), masks
   (e.g., 0xF0), wildcards (e.g., "*"), or any other suitable indication.

Currently, the Linux TCP-AO implementation doesn't provide any TCP port
matching. Port ranges are probably the most flexible option for the uAPI,
but they are not implemented so far.

4. ``setsockopt()`` vs ``accept()`` race
========================================

In contrast with an established TCP-MD5 connection, which has just one key,
TCP-AO connections may have many keys, which means that accepted connections
on a listen socket may have any number of keys as well. Copying all those
keys on the first properly signed SYN would make the request socket bigger,
which is undesirable. Currently, the implementation doesn't copy keys
to request sockets, but rather looks them up on the "parent" listener socket.

The result is that when userspace removes TCP-AO keys, that may break
not-yet-established connections on request sockets, while the keys are not
removed from sockets that were already established but not yet
``accept()``'ed and are still hanging in the accept queue.

The reverse is valid as well: if userspace adds a new key for a peer on
a listener socket, the established sockets in the accept queue won't
have the new key.

At this moment, the resolution for the two races,
``setsockopt(TCP_AO_ADD_KEY)`` vs ``accept()``
and ``setsockopt(TCP_AO_DEL_KEY)`` vs ``accept()``, is delegated to userspace.
This means that userspace is expected to check the MKTs on the socket
that was returned by ``accept()`` to verify that any key rotation that
happened on the listen socket is reflected on the newly established connection.

This is a similar "do-nothing" approach to TCP-MD5 from the kernel side and
may be changed later by introducing new flags to ``tcp_ao_add``
and ``tcp_ao_del``.

Note that this race is rare, as it needs a TCP-AO key rotation to happen
during the 3-way handshake for the new TCP connection.

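The check that userspace is expected to do after ``accept()`` boils down to
comparing two key sets. A minimal sketch, assuming the application has already
obtained the keys of the listener (for the accepted peer) and of the accepted
socket, for instance via ``getsockopt(TCP_AO_GET_KEYS)``; ``reconcile()`` and
the ``(sndid, rcvid)`` tuple representation are hypothetical.

```python
def reconcile(listener_keys, accepted_keys):
    """Compare the keys configured on the listener for a peer with the
    keys present on the socket returned by accept().

    Both arguments are sets of (sndid, rcvid) tuples.
    Returns (to_add, to_del): keys missing on the accepted socket and
    keys that were removed from the listener in the meantime.
    """
    to_add = listener_keys - accepted_keys
    to_del = accepted_keys - listener_keys
    return to_add, to_del
```

An application would apply the additions before the deletions, since the last
key cannot be removed from a TCP-AO connection.
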
5. Interaction with TCP-MD5
===========================

A TCP connection cannot migrate between TCP-AO and TCP-MD5 options.
Established sockets that have either AO or MD5 keys are restricted from
adding keys of the other option.

For listening sockets the picture is different: a BGP server may want to
accept both TCP-AO and (deprecated) TCP-MD5 clients. As a result, both types
of keys may be added to TCP_CLOSED or TCP_LISTEN sockets. It's not allowed
to add different types of keys for the same peer.

6. SNE Linux implementation
===========================

RFC 5925 [6.2] describes the algorithm of how to extend TCP sequence numbers
with SNE. In short: TCP has to track the previous sequence numbers and set
sne_flag when the current SEQ number rolls over. The flag is cleared when
both current and previous SEQ numbers cross 0x7fff, which is 32 KB.

While sne_flag is set, the algorithm compares SEQ for each packet with
0x7fff and if it's higher than 32 KB, it assumes that the packet should be
verified with SNE before the increment. As a result, there's
this [0; 32 KB] window, when packets with (SNE - 1) can be accepted.

The Linux implementation simplifies this a bit: the network stack already
tracks the first SEQ byte that ACK is wanted for (snd_una) and the next SEQ
byte that is wanted (rcv_nxt), and that's enough information for a rough
estimation of where in the 4GB SEQ number space both sender and receiver are.
When they roll over to zero, the corresponding SNE gets incremented.

tcp_ao_compute_sne() is called for each TCP-AO segment. It compares SEQ numbers
from the segment with snd_una or rcv_nxt and fits the result into a 2GB window
around them, detecting SEQ numbers rolling over. That simplifies the code a lot
and only requires SNE numbers to be stored on every TCP-AO socket.

The 2GB window at first glance seems much more permissive compared to
RFC 5925. But that is only used to pick the correct SNE before/after
a rollover. It allows more TCP segment replays, but all regular
TCP checks in tcp_sequence() are still applied on the verified segment.
So, it trades a bit more permissive acceptance of replayed/retransmitted
segments for the simplicity of the algorithm and what seems better behaviour
for large TCP windows.

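The 2GB window logic can be modelled outside the kernel. The sketch below is
an illustration written from the description above, not a copy of the kernel
source: ``compute_sne()`` mirrors the role of tcp_ao_compute_sne(), while
``before()``, ``extended_seq()`` and the argument names are assumptions of
this example. ``next_seq`` stands for snd_una or rcv_nxt and ``next_sne`` for
the SNE stored on the socket for that direction.

```python
U32_MASK = 0xFFFFFFFF


def before(seq1, seq2):
    """True if seq1 precedes seq2 in 32-bit modular arithmetic, i.e. the
    signed 32-bit difference (seq1 - seq2) is negative."""
    return ((seq1 - seq2) & U32_MASK) >= 0x80000000


def compute_sne(next_sne, next_seq, seq):
    """Pick the SNE for a segment with sequence number `seq`."""
    sne = next_sne
    if before(seq, next_seq):
        # Older segment that is numerically larger: next_seq has already
        # rolled over, the segment belongs to the previous SNE.
        if seq > next_seq:
            sne = (sne - 1) & U32_MASK
    else:
        # Newer segment that is numerically smaller: seq has rolled over
        # ahead of next_seq, the segment belongs to the next SNE.
        if seq < next_seq:
            sne = (sne + 1) & U32_MASK
    return sne


def extended_seq(sne, seq):
    """64-bit extended sequence number built from SNE and SEQ."""
    return (sne << 32) | seq
```

Only segments on the far side of a rollover get a different SNE; every other
segment within the 2GB window keeps the stored value.
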
7. Links
========

RFC 5925 The TCP Authentication Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5925.txt.pdf

RFC 5926 Cryptographic Algorithms for the TCP Authentication Option (TCP-AO)
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5926.txt.pdf

Draft "SHA-2 Algorithm for the TCP Authentication Option (TCP-AO)"
   https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03

RFC 2385 Protection of BGP Sessions via the TCP MD5 Signature Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc2385.txt.pdf

:Author: Dmitry Safonov <dima@arista.com>
445