.. SPDX-License-Identifier: GPL-2.0

========================================================
TCP Authentication Option Linux implementation (RFC5925)
========================================================

TCP Authentication Option (TCP-AO) provides a TCP extension aimed at verifying
segments between trusted peers. It adds a new TCP header option with
a Message Authentication Code (MAC). MACs are produced from the content
of a TCP segment using a key known to both peers.
TCP-AO is intended to replace TCP-MD5, providing better security,
key rotation and support for a variety of MAC algorithms.

1. Introduction
===============

.. table:: Short and Limited Comparison of TCP-AO and TCP-MD5

 +----------------------+------------------------+-----------------------+
 |                      |       TCP-MD5          |         TCP-AO        |
 +======================+========================+=======================+
 |Supported MAC         |MD5 of data and key     |HMAC-SHA-1-96 and      |
 |algorithms            |(cryptographically weak)|AES-128-CMAC-96.       |
 |                      |                        |Implementations are    |
 |                      |                        |permitted to support   |
 |                      |                        |additional algorithms. |
 +----------------------+------------------------+-----------------------+
 |Length of MACs (bytes)|16                      |12 for HMAC-SHA-1-96   |
 |                      |                        |and AES-128-CMAC-96.   |
 |                      |                        |Implementations are    |
 |                      |                        |permitted to support   |
 |                      |                        |any MAC length that    |
 |                      |                        |fits in the TCP header.|
 +----------------------+------------------------+-----------------------+
 |Number of keys per    |1                       |Many                   |
 |TCP connection        |                        |                       |
 +----------------------+------------------------+-----------------------+
 |Possibility to change |Non-practical (both     |Supported by protocol  |
 |an active key         |peers have to change    |                       |
 |                      |them during MSL)        |                       |
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Yes: ignoring them     |
 |ICMP 'hard errors'    |                        |by default on          |
 |                      |                        |established connections|
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Yes: pseudo-header     |
 |traffic-crossing      |                        |includes TCP ports.    |
 |attack                |                        |                       |
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Sequence Number        |
 |replayed TCP segments |                        |Extension (SNE) and    |
 |                      |                        |Initial Sequence       |
 |                      |                        |Numbers (ISNs)         |
 +----------------------+------------------------+-----------------------+
 |Supports              |Yes                     |No. ISNs+SNE are needed|
 |Connectionless Resets |                        |to correctly sign RST. |
 +----------------------+------------------------+-----------------------+
 |Standards             |RFC 2385                |RFC 5925, RFC 5926     |
 +----------------------+------------------------+-----------------------+


1.1 Frequently Asked Questions (FAQ) with references to RFC 5925
----------------------------------------------------------------

Q: Can either SendID or RecvID be non-unique for the same 4-tuple
(srcaddr, srcport, dstaddr, dstport)?

A: No [3.1]::

   >> The IDs of MKTs MUST NOT overlap where their TCP connection
   identifiers overlap.
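
The uniqueness rule quoted above can be illustrated with a minimal sketch.
It is not the kernel's check: the ``MKT`` class and ``ids_conflict()`` are
hypothetical names, and "overlapping TCP connection identifiers" is simplified
here to "same peer address".

.. code-block:: python

   from dataclasses import dataclass


   @dataclass(frozen=True)
   class MKT:
       # Hypothetical, simplified MKT: the connection identifier is reduced
       # to the peer address only (an assumption of this sketch).
       peer: str
       send_id: int   # KeyID placed into outgoing segments
       recv_id: int   # KeyID expected in incoming segments


   def ids_conflict(existing, new):
       """Return True if `new` reuses a SendID or a RecvID of an MKT
       that is configured for the same peer."""
       return any(
           m.peer == new.peer
           and (m.send_id == new.send_id or m.recv_id == new.recv_id)
           for m in existing
       )

With one MKT ``MKT("192.0.2.1", 1, 1)`` configured, adding another MKT for
the same peer with SendID 1 or RecvID 1 is a conflict, while the same IDs
for a different peer are not.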

Q: Can a Master Key Tuple (MKT) for an active connection be removed?

A: No, unless it's copied to the Transport Control Block (TCB) [3.1]::

   It is presumed that an MKT affecting a particular connection cannot
   be destroyed during an active connection -- or, equivalently, that
   its parameters are copied to an area local to the connection (i.e.,
   instantiated) and so changes would affect only new connections.

Q: If an old MKT needs to be deleted, how should that be done without
removing it from an active connection? (It may still come into use
at any later moment.)

A: RFC 5925 does not specify this. It appears to be a key management problem:
ensure that no one uses such an MKT before trying to remove it.

Q: Can an old MKT exist forever and be used by another peer?

A: It can. Deciding when to remove an old key is a key management task [6.1]::

   Deciding when to start using a key is a performance issue. Deciding
   when to remove an MKT is a security issue. Invalid MKTs are expected
   to be removed. TCP-AO provides no mechanism to coordinate their removal,
   as we consider this a key management operation.

also [6.1]::

   The only way to avoid reuse of previously used MKTs is to remove the MKT
   when it is no longer considered permitted.

Linux TCP-AO will try its best to prevent you from removing a key that's
being used, considering it a key management failure. However, keeping
an outdated key may become a security issue, and a peer may
unintentionally prevent the removal of an old key by always setting
it as RNextKeyID. For this reason a forced key removal mechanism is provided:
userspace supplies the KeyID to use instead of the one being removed,
and the kernel atomically deletes the old key, even if the peer is
still requesting it. There are no guarantees for force-delete: the peer
may not yet have the new key, in which case the TCP connection may just break.
Alternatively, one may choose to shut down the socket.

Q: What happens when a packet is received on a new connection with no known
MKT's RecvID?

A: RFC 5925 specifies that by default it is accepted with a warning logged, but
the behaviour can be configured by the user [7.5.1.a]::

   If the segment is a SYN, then this is the first segment of a new
   connection. Find the matching MKT for this segment, using the segment's
   socket pair and its TCP-AO KeyID, matched against the MKT's TCP connection
   identifier and the MKT's RecvID.

      i. If there is no matching MKT, remove TCP-AO from the segment.
         Proceed with further TCP handling of the segment.
         NOTE: this presumes that connections that do not match any MKT
         should be silently accepted, as noted in Section 7.3.

[7.3]::

   >> A TCP-AO implementation MUST allow for configuration of the behavior
   of segments with TCP-AO but that do not match an MKT. The initial default
   of this configuration SHOULD be to silently accept such connections.
   If this is not the desired case, an MKT can be included to match such
   connections, or the connection can indicate that TCP-AO is required.
   Alternately, the configuration can be changed to discard segments with
   the AO option not matching an MKT.

[10.2.b]::

   Connections not matching any MKT do not require TCP-AO. Further, incoming
   segments with TCP-AO are not discarded solely because they include
   the option, provided they do not match any MKT.

Note that the Linux TCP-AO implementation differs in this aspect. Currently,
TCP-AO segments signed with an unknown key are discarded with warnings logged.

Q: Does the RFC imply centralized kernel key management in any way?
(i.e. that a key on all connections MUST be rotated at the same time?)

A: Not specified. MKTs can be managed in userspace; the only part relevant to
key changes is [7.3]::

   >> All TCP segments MUST be checked against the set of MKTs for matching
   TCP connection identifiers.

Q: What happens when RNextKeyID requested by a peer is unknown? Should
the connection be reset?

A: It should not; no action needs to be performed [7.5.2.e]::

   ii. If they differ, determine whether the RNextKeyID MKT is ready.

       1. If the MKT corresponding to the segment’s socket pair and RNextKeyID
       is not available, no action is required (RNextKeyID of a received
       segment needs to match the MKT’s SendID).

Q: How is current_key set, and when does it change? Is it a user-triggered
change, or is it triggered by a request from the remote peer? Is it set by the
user explicitly, or by a matching rule?

A: current_key is set by RNextKeyID [6.1]::

   Rnext_key is changed only by manual user intervention or MKT management
   protocol operation. It is not manipulated by TCP-AO. Current_key is updated
   by TCP-AO when processing received TCP segments as discussed in the segment
   processing description in Section 7.5. Note that the algorithm allows
   the current_key to change to a new MKT, then change back to a previously
   used MKT (known as "backing up"). This can occur during an MKT change when
   segments are received out of order, and is considered a feature of TCP-AO,
   because reordering does not result in drops.

[7.5.2.e.ii]::

   2. If the matching MKT corresponding to the segment’s socket pair and
   RNextKeyID is available:

      a. Set current_key to the RNextKeyID MKT.
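
The two RNextKeyID rules quoted above ([7.5.2.e.ii] items 1 and 2) can be
condensed into a short sketch. ``next_current_key()`` and its parameters are
hypothetical names used for illustration only; MKTs are represented merely
by their SendIDs.

.. code-block:: python

   def next_current_key(available_send_ids, current_send_id, rnext_key_id):
       """Return the SendID that should be current_key after processing
       a verified segment carrying `rnext_key_id`.

       available_send_ids: SendIDs of the MKTs that match the connection.
       """
       if rnext_key_id == current_send_id:
           # The peer asks for the key that is already in use.
           return current_send_id
       if rnext_key_id not in available_send_ids:
           # Unknown RNextKeyID: no action is required, no reset.
           return current_send_id
       # A matching MKT is available: switch current_key to it.
       return rnext_key_id

Because the function only looks at the latest request, it also allows
"backing up" to a previously used key when segments arrive out of order.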

Q: If both peers have multiple MKTs matching the connection's socket pair
(with different KeyIDs), how should the sender/receiver pick KeyID to use?

A: Some mechanism should pick the "desired" MKT [3.3]::

   Multiple MKTs may match a single outgoing segment, e.g., when MKTs
   are being changed. Those MKTs cannot have conflicting IDs (as noted
   elsewhere), and some mechanism must determine which MKT to use for each
   given outgoing segment.

   >> An outgoing TCP segment MUST match at most one desired MKT, indicated
   by the segment’s socket pair. The segment MAY match multiple MKTs, provided
   that exactly one MKT is indicated as desired. Other information in
   the segment MAY be used to determine the desired MKT when multiple MKTs
   match; such information MUST NOT include values in any TCP option fields.

Q: Can a TCP-MD5 connection migrate to TCP-AO (and vice versa)?

A: No [1]::

   TCP MD5-protected connections cannot be migrated to TCP-AO because TCP MD5
   does not support any changes to a connection’s security algorithm
   once established.

Q: If all MKTs are removed on a connection, can it become a non-TCP-AO signed
connection?

A: [7.5.2] doesn't offer the same choice as SYN packet handling in [7.5.1.i],
which would allow accepting segments without a signature (and would be
insecure). While switching to a non-TCP-AO connection is not directly
prohibited, a prohibition seems to be what the RFC intends. Also, there's
a requirement for TCP-AO connections to always have one current_key [3.3]::

   TCP-AO requires that every protected TCP segment match exactly one MKT.

[3.3]::

   >> An incoming TCP segment including TCP-AO MUST match exactly one MKT,
   indicated solely by the segment’s socket pair and its TCP-AO KeyID.

[4.4]::

   One or more MKTs. These are the MKTs that match this connection’s
   socket pair.

Q: Can a non-TCP-AO connection become a TCP-AO-enabled one?

A: No: for an already established non-TCP-AO connection it would be impossible
to switch to using TCP-AO, as the traffic key generation requires the initial
sequence numbers. In other words, starting to use TCP-AO would require
re-establishing the TCP connection.

2. In-kernel MKTs database vs database in userspace
===================================================

Linux TCP-AO support is implemented using ``setsockopt()s``, in a similar way
to TCP-MD5. This means that a userspace application that wants to use TCP-AO
should perform ``setsockopt()`` on a TCP socket when it wants to add,
remove or rotate MKTs. This approach moves the key management responsibility
to userspace, along with decisions on corner cases, e.g. what to do if
the peer doesn't respect RNextKeyID. More code thus lives in userspace,
especially the code responsible for policy decisions. Besides, the approach
is flexible and scales well (with less locking needed than in the case of
an in-kernel database). One should also keep in mind that the main intended
users are BGP processes, not arbitrary applications, which means that,
compared to IPsec tunnels, no transparency is really needed, and modern BGP
daemons already have ``setsockopt()s`` for TCP-MD5 support.

.. table:: Considered pros and cons of the approaches

 +----------------------+------------------------+-----------------------+
 |                      |    ``setsockopt()``    |      in-kernel DB     |
 +======================+========================+=======================+
 | Extendability        | ``setsockopt()``       | Netlink messages are  |
 |                      | commands should be     | simple and extendable |
 |                      | extendable syscalls    |                       |
 +----------------------+------------------------+-----------------------+
 | Required userspace   | BGP or any application | could be transparent  |
 | changes              | that wants TCP-AO needs| as tunnels, providing |
 |                      | to perform             | something like        |
 |                      | ``setsockopt()s``      | ``ip tcpao add key``  |
 |                      | and do key management  | (delete/show/rotate)  |
 +----------------------+------------------------+-----------------------+
 |MKTs removal or adding| harder for userspace   | harder for kernel     |
 +----------------------+------------------------+-----------------------+
 | Dump-ability         | ``getsockopt()``       | Netlink .dump()       |
 |                      |                        | callback              |
 +----------------------+------------------------+-----------------------+
 | Limits on kernel     |                      equal                     |
 | resources/memory     |                                                |
 +----------------------+------------------------+-----------------------+
 | Scalability          | contention on          | contention on         |
 |                      | ``TCP_LISTEN`` sockets | the whole database    |
 +----------------------+------------------------+-----------------------+
 | Monitoring & warnings| ``TCP_DIAG``           | same Netlink socket   |
 +----------------------+------------------------+-----------------------+
 | Matching of MKTs     | half-problem: only     | hard                  |
 |                      | listen sockets         |                       |
 +----------------------+------------------------+-----------------------+


3. uAPI
=======

Linux provides a set of ``setsockopt()s`` and ``getsockopt()s`` that let
userspace manage TCP-AO on a per-socket basis. To add or delete MKTs, the
``TCP_AO_ADD_KEY`` and ``TCP_AO_DEL_KEY`` TCP socket options must be used.
Adding a key to an established non-TCP-AO connection is not allowed,
nor is removing the last key from a TCP-AO connection.

``TCP_AO_ADD_KEY`` allows the MAC algorithm and MAC length to be selected.
Linux supports the mandatory-to-implement algorithms HMAC-SHA-1-96 and
AES-128-CMAC-96. In addition, as Linux extensions, it supports:

- HMAC-SHA256. Linux uses HMAC-SHA256 in the same way as HMAC-SHA1; this
  includes omitting an explicit entropy extraction step. To work around the
  missing entropy extraction, users should provide keys with full entropy. The
  implementation is interoperable with other implementations of HMAC-SHA256 for
  TCP-AO only when they have implemented the key derivation the same way (and
  also the same MAC length is selected on each side).

- Any MAC length for any of the supported MAC algorithms, provided it fits in
  the TCP header and is at least 4 bytes.
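
The MAC length constraint can be illustrated with a minimal sketch. It
assumes only the general TCP limits (at most 40 bytes of options, and the
4-byte Kind/Length/KeyID/RNextKeyID prefix of the TCP-AO option defined by
RFC 5925); the kernel's actual limit may be stricter, since other TCP options
have to fit as well. ``maclen_is_acceptable()``, ``other_options_len`` and
``truncated_hmac_sha1()`` are hypothetical names. The second function shows
only the truncation that gives HMAC-SHA-1-96 its name; a real TCP-AO MAC is
computed with a derived traffic key over more input than shown here.

.. code-block:: python

   import hashlib
   import hmac

   TCP_MAX_OPTION_SPACE = 40  # 60-byte maximum TCP header minus 20 fixed bytes
   TCP_AO_OPTION_HEADER = 4   # Kind, Length, KeyID, RNextKeyID
   MIN_MACLEN = 4             # lower bound stated in this document


   def maclen_is_acceptable(maclen, other_options_len=0):
       """Check that a MAC of `maclen` bytes fits into the TCP option space
       left after `other_options_len` bytes of other options (assumption)."""
       room = TCP_MAX_OPTION_SPACE - other_options_len - TCP_AO_OPTION_HEADER
       return MIN_MACLEN <= maclen <= room


   def truncated_hmac_sha1(key, message, maclen=12):
       """HMAC-SHA-1 truncated to `maclen` bytes (12 bytes = 96 bits)."""
       return hmac.new(key, message, hashlib.sha1).digest()[:maclen]

With no other options present, this sketch accepts MAC lengths from 4 to
36 bytes.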

The ``setsockopt(TCP_AO_DEL_KEY)`` command may specify
``tcp_ao_del::current_key`` + ``tcp_ao_del::set_current`` and/or
``tcp_ao_del::rnext`` + ``tcp_ao_del::set_rnext``, which makes the delete
"forced": it provides userspace a way to delete a key that's being used and
atomically set another one instead. This is not intended for normal use and
should be used only when the peer ignores RNextKeyID and keeps
requesting/using an old key. It provides a way to force-delete a key that's
not trusted but may break the TCP-AO connection.

The usual/normal key-rotation can be performed with ``setsockopt(TCP_AO_INFO)``.
It also provides a uAPI to change per-socket TCP-AO settings, such as
ignoring ICMPs, as well as to clear per-socket TCP-AO packet counters.
The corresponding ``getsockopt(TCP_AO_INFO)`` can be used to get those
per-socket TCP-AO settings.

Another useful command is ``getsockopt(TCP_AO_GET_KEYS)``. One can use it
to list all MKTs on a TCP socket or use a filter to get keys for a specific
peer and/or sndid/rcvid, VRF L3 interface or get current_key/rnext_key.

To repair TCP-AO connections ``setsockopt(TCP_AO_REPAIR)`` is available,
provided that the user previously has checkpointed/dumped the socket with
``getsockopt(TCP_AO_REPAIR)``.

A tip for scaled TCP_LISTEN sockets that may have thousands of TCP-AO
keys: use filters in ``getsockopt(TCP_AO_GET_KEYS)`` and asynchronous
delete with ``setsockopt(TCP_AO_DEL_KEY)``.

Linux TCP-AO also provides a bunch of segment counters that can be helpful
with troubleshooting/debugging issues. Every MKT has good/bad counters
that reflect how many packets passed/failed verification.
Each TCP-AO socket has the following counters:

- for good segments (properly signed)
- for bad segments (failed TCP-AO verification)
- for segments with unknown keys
- for segments where an AO signature was expected, but wasn't found
- for the number of ignored ICMPs

TCP-AO per-socket counters are also duplicated with per-netns counters,
exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``,
``TCPAORequired`` and ``TCPAODroppedIcmps``.
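
A minimal sketch of reading the per-netns counters follows. It assumes that
they show up in ``/proc/net/netstat`` in the usual format of paired
header/value lines, which is not stated in this document;
``parse_tcp_ao_counters()`` is a hypothetical helper name.

.. code-block:: python

   TCP_AO_COUNTERS = ("TCPAOGood", "TCPAOBad", "TCPAOKeyNotFound",
                      "TCPAORequired", "TCPAODroppedIcmps")


   def parse_tcp_ao_counters(netstat_text):
       """Extract the TCP-AO counters from text in /proc/net/netstat format.

       The format is assumed to be pairs of lines: a header line with
       counter names and a value line, both starting with the same prefix.
       """
       lines = netstat_text.splitlines()
       counters = {}
       for header, values in zip(lines[0::2], lines[1::2]):
           names, nums = header.split(), values.split()
           if len(names) != len(nums) or not names or names[0] != nums[0]:
               continue  # not a well-formed header/value pair
           for name, num in zip(names[1:], nums[1:]):
               if name in TCP_AO_COUNTERS:
                   counters[name] = int(num)
       return counters

Under the assumption above, it could be used as
``parse_tcp_ao_counters(open("/proc/net/netstat").read())``.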

For monitoring purposes, there are the following TCP-AO trace events:
``tcp_hash_bad_header``, ``tcp_hash_ao_required``, ``tcp_ao_handshake_failure``,
``tcp_ao_wrong_maclen``, ``tcp_ao_mismatch``, ``tcp_ao_key_not_found``,
``tcp_ao_rnext_request``, ``tcp_ao_synack_no_key``, ``tcp_ao_snd_sne_update``,
``tcp_ao_rcv_sne_update``. It's possible to separately enable any of them and
one can filter them by net-namespace, 4-tuple, family, L3 index, and TCP header
flags. If a segment has a TCP-AO header, the filters may also include
keyid, rnext, and maclen. SNE updates include the rolled-over numbers.

RFC 5925 very permissively specifies how TCP port matching can be done for
MKTs::

   TCP connection identifier. A TCP socket pair, i.e., a local IP
   address, a remote IP address, a TCP local port, and a TCP remote port.
   Values can be partially specified using ranges (e.g., 2-30), masks
   (e.g., 0xF0), wildcards (e.g., "*"), or any other suitable indication.

Currently, the Linux TCP-AO implementation doesn't provide any TCP port
matching. Port ranges are probably the most flexible option for the uAPI,
but they are not implemented so far.

4. ``setsockopt()`` vs ``accept()`` race
========================================

In contrast with an established TCP-MD5 connection which has just one key,
TCP-AO connections may have many keys, which means that accepted connections
on a listen socket may have any number of keys as well. Copying all those
keys on the first properly signed SYN would make the request socket bigger,
which is undesirable. Currently, the implementation doesn't copy keys
to request sockets, but rather looks them up on the "parent" listener socket.

The result is that when userspace removes TCP-AO keys, that may break
not-yet-established connections on request sockets. It also does not remove
the keys from sockets that were already established but not yet
``accept()``'ed and are still waiting in the accept queue.

The reverse is valid as well: if userspace adds a new key for a peer on
a listener socket, the established sockets in the accept queue won't
have the new keys.

At this moment, the resolution for the two races,
``setsockopt(TCP_AO_ADD_KEY)`` vs ``accept()``
and ``setsockopt(TCP_AO_DEL_KEY)`` vs ``accept()``, is delegated to userspace.
This means that userspace is expected to check the MKTs on the socket
that was returned by ``accept()`` to verify that any key rotation that
happened on the listen socket is reflected on the newly established connection.
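
The userspace check described above boils down to comparing two sets of keys.
A minimal sketch of that comparison follows; ``reconcile_keys()`` is
a hypothetical helper, keys are identified by ``(sndid, rcvid)`` pairs as
an assumption, and fetching the keys with ``getsockopt(TCP_AO_GET_KEYS)`` or
applying the result with ``setsockopt()`` is left out.

.. code-block:: python

   def reconcile_keys(desired, on_socket):
       """Compare the keys userspace wants for a peer with the keys found
       on a socket returned by accept().

       Both arguments are iterables of (sndid, rcvid) pairs.
       Returns (to_add, to_delete) as sorted lists.
       """
       desired, on_socket = set(desired), set(on_socket)
       to_add = sorted(desired - on_socket)      # added during the handshake
       to_delete = sorted(on_socket - desired)   # removed during the handshake
       return to_add, to_delete

If both lists are empty, no key rotation raced with the handshake of that
connection.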

This is a similar "do-nothing" approach to TCP-MD5 from the kernel side and
may be changed later by introducing new flags to ``tcp_ao_add``
and ``tcp_ao_del``.

Note that this race is rare because it requires TCP-AO key rotation to happen
during the 3-way handshake for the new TCP connection.

5. Interaction with TCP-MD5
===========================

A TCP connection cannot migrate between TCP-AO and TCP-MD5 options.
Established sockets that have either AO or MD5 keys are restricted from
adding keys of the other option.

For listening sockets the picture is different: a BGP server may want to
accept both TCP-AO and (deprecated) TCP-MD5 clients. As a result, both types
of keys may be added to TCP_CLOSED or TCP_LISTEN sockets. It's not allowed
to add different types of keys for the same peer.

6. SNE Linux implementation
===========================

RFC 5925 [6.2] describes the algorithm of how to extend TCP sequence numbers
with SNE. In short: TCP has to track the previous sequence numbers and set
sne_flag when the current SEQ number rolls over. The flag is cleared when
both current and previous SEQ numbers cross 0x7fff, which is 32 KB.

While sne_flag is set, the algorithm compares SEQ for each packet with
0x7fff and if it's higher than 32 KB, it assumes that the packet should be
verified with SNE before the increment. As a result, there's
this [0; 32 KB] window, when packets with (SNE - 1) can be accepted.

The Linux implementation simplifies this a bit: the network stack already
tracks the first SEQ byte that an ACK is wanted for (snd_una) and the next SEQ
byte that is wanted (rcv_nxt), and that is enough information for a rough
estimate of where in the 4GB SEQ number space both sender and receiver are.
When they roll over to zero, the corresponding SNE gets incremented.

tcp_ao_compute_sne() is called for each TCP-AO segment. It compares SEQ numbers
from the segment with snd_una or rcv_nxt and fits the result into a 2GB window
around them, detecting SEQ numbers rolling over. That simplifies the code a lot
and only requires SNE numbers to be stored on every TCP-AO socket.
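
The 2GB-window comparison described above can be sketched in a few lines.
This is an illustration of the idea rather than the kernel source: the names
``before()`` and ``compute_sne()`` and the parameter names are chosen for
this example, with ``next_seq`` standing for snd_una or rcv_nxt and
``next_sne`` for the SNE that belongs to it.

.. code-block:: python

   MASK32 = 0xffffffff


   def before(seq1, seq2):
       """True if seq1 precedes seq2 in 32-bit sequence space, i.e. the
       difference interpreted as a signed 32-bit number is negative."""
       return ((seq1 - seq2) & MASK32) >= 0x80000000


   def compute_sne(next_sne, next_seq, seq):
       """Pick the SNE for a segment with sequence number `seq`, given the
       tracked position `next_seq` and the SNE `next_sne` that goes with it."""
       sne = next_sne
       if before(seq, next_seq):
           if seq > next_seq:
               # seq is behind next_seq, but numerically larger: it was sent
               # before the rollover that next_seq has already passed.
               sne -= 1
       else:
           if seq < next_seq:
               # seq is ahead of next_seq, but numerically smaller: it has
               # already rolled over.
               sne += 1
       return sne & MASK32

For example, with ``next_seq = 0x10`` and ``next_sne = 5``, a late segment
with ``seq = 0xfffffff0`` is assigned SNE 4, while with
``next_seq = 0xfffffff0`` a segment with ``seq = 0x10`` is assigned SNE 6.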

The 2GB window at first glance seems much more permissive compared to
RFC 5925 [6.2]. But it is only used to pick the correct SNE before/after
a rollover. It allows more TCP segment replays, yet all regular
TCP checks in tcp_sequence() are still applied to the verified segment.
So, it trades a slightly more permissive acceptance of replayed/retransmitted
segments for the simplicity of the algorithm and what seems to be better
behaviour for large TCP windows.

7. Links
========

RFC 5925 The TCP Authentication Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5925.txt.pdf

RFC 5926 Cryptographic Algorithms for the TCP Authentication Option (TCP-AO)
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5926.txt.pdf

Draft "SHA-2 Algorithm for the TCP Authentication Option (TCP-AO)"
   https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03

RFC 2385 Protection of BGP Sessions via the TCP MD5 Signature Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc2385.txt.pdf

:Author: Dmitry Safonov <dima@arista.com>
470