.. SPDX-License-Identifier: GPL-2.0

========================================================
TCP Authentication Option Linux implementation (RFC5925)
========================================================

TCP Authentication Option (TCP-AO) provides a TCP extension aimed at verifying
segments between trusted peers. It adds a new TCP header option with
a Message Authentication Code (MAC). MACs are produced from the content
of a TCP segment using a key known to both peers.
The intent of TCP-AO is to deprecate TCP-MD5 providing better security,
key rotation and support for a variety of MAC algorithms.

1. Introduction
===============

.. table:: Short and Limited Comparison of TCP-AO and TCP-MD5

 +----------------------+------------------------+-----------------------+
 |                      | TCP-MD5                | TCP-AO                |
 +======================+========================+=======================+
 |Supported MAC         |MD5 of data and key     |HMAC-SHA-1-96 and      |
 |algorithms            |(cryptographically weak)|AES-128-CMAC-96.       |
 |                      |                        |Implementations are    |
 |                      |                        |permitted to support   |
 |                      |                        |additional algorithms. |
 +----------------------+------------------------+-----------------------+
 |Length of MACs (bytes)|16                      |12 for HMAC-SHA-1-96   |
 |                      |                        |and AES-128-CMAC-96.   |
 |                      |                        |Implementations are    |
 |                      |                        |permitted to support   |
 |                      |                        |any MAC length that    |
 |                      |                        |fits in the TCP header.|
 +----------------------+------------------------+-----------------------+
 |Number of keys per    |1                       |Many                   |
 |TCP connection        |                        |                       |
 +----------------------+------------------------+-----------------------+
 |Possibility to change |Non-practical (both     |Supported by protocol  |
 |an active key         |peers have to change    |                       |
 |                      |them during MSL)        |                       |
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Yes: ignoring them     |
 |ICMP 'hard errors'    |                        |by default on          |
 |                      |                        |established connections|
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Yes: pseudo-header     |
 |traffic-crossing      |                        |includes TCP ports.    |
 |attack                |                        |                       |
 +----------------------+------------------------+-----------------------+
 |Protection against    |No                      |Sequence Number        |
 |replayed TCP segments |                        |Extension (SNE) and    |
 |                      |                        |Initial Sequence       |
 |                      |                        |Numbers (ISNs)         |
 +----------------------+------------------------+-----------------------+
 |Supports              |Yes                     |No. ISNs+SNE are needed|
 |Connectionless Resets |                        |to correctly sign RST. |
 +----------------------+------------------------+-----------------------+
 |Standards             |RFC 2385                |RFC 5925, RFC 5926     |
 +----------------------+------------------------+-----------------------+


1.1 Frequently Asked Questions (FAQ) with references to RFC 5925
----------------------------------------------------------------

Q: Can either SendID or RecvID be non-unique for the same 4-tuple
(srcaddr, srcport, dstaddr, dstport)?

A: No [3.1]::

   >> The IDs of MKTs MUST NOT overlap where their TCP connection
   identifiers overlap.

Q: Can Master Key Tuple (MKT) for an active connection be removed?

A: No, unless it's copied to the Transmission Control Block (TCB) [3.1]::

   It is presumed that an MKT affecting a particular connection cannot
   be destroyed during an active connection -- or, equivalently, that
   its parameters are copied to an area local to the connection (i.e.,
   instantiated) and so changes would affect only new connections.

Q: If an old MKT needs to be deleted, how should it be done in order
to not remove it for an active connection? (It may still be in use
at any later moment.)

A: Not specified by RFC 5925. It seems to be a task for key management
to ensure that no one uses such an MKT before trying to remove it.

Q: Can an old MKT exist forever and be used by another peer?

A: It can; it's a key management task to decide when to remove an old key [6.1]::

   Deciding when to start using a key is a performance issue. Deciding
   when to remove an MKT is a security issue. Invalid MKTs are expected
   to be removed. TCP-AO provides no mechanism to coordinate their removal,
   as we consider this a key management operation.

also [6.1]::

   The only way to avoid reuse of previously used MKTs is to remove the MKT
   when it is no longer considered permitted.

Linux TCP-AO will try its best to prevent you from removing a key that's
being used, considering it a key management failure. But keeping
an outdated key may become a security issue, and a peer may
unintentionally prevent the removal of an old key by always setting
it as RNextKeyID. For that reason a forced key removal mechanism is provided:
userspace has to supply a KeyID to use instead of the one that's being removed,
and the kernel will atomically delete the old key, even if the peer is
still requesting it. There are no guarantees for force-delete, as the peer
may not yet have the new key, in which case the TCP connection may just break.
Alternatively, one may choose to shut down the socket.
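
The forced-removal rule described above can be modelled in a few lines of
userspace code. The following is only an illustrative sketch, not kernel code:
the ``KeyTable`` class, its method names and ``KeyInUseError`` are hypothetical
and exist purely to show the rule that a key which is still in use can only be
removed when a replacement KeyID is supplied in the same operation.

```python
class KeyInUseError(Exception):
    """A key that is still in use was deleted without naming a replacement."""


class KeyTable:
    """Hypothetical model of the MKTs of one connection (not a kernel structure)."""

    def __init__(self, key_ids, current, rnext):
        self.key_ids = set(key_ids)
        self.current = current  # key that signs outgoing segments
        self.rnext = rnext      # key requested from the peer via RNextKeyID

    def delete(self, key_id, new_current=None, new_rnext=None):
        """Delete key_id; keys that are in use need a replacement (forced delete)."""
        if key_id not in self.key_ids:
            raise KeyError(key_id)
        current, rnext = self.current, self.rnext
        if key_id == current:
            if new_current is None:
                raise KeyInUseError("key is current_key; supply a replacement")
            current = new_current
        if key_id == rnext:
            if new_rnext is None:
                raise KeyInUseError("key is rnext_key; supply a replacement")
            rnext = new_rnext
        # Validate everything before touching state, so the swap is all-or-nothing.
        for replacement in (current, rnext):
            if replacement == key_id or replacement not in self.key_ids:
                raise KeyError(replacement)
        self.key_ids.remove(key_id)
        self.current, self.rnext = current, rnext


table = KeyTable({100, 101}, current=100, rnext=100)
table.delete(100, new_current=101, new_rnext=101)  # forced delete with replacements
```

Note that the model also refuses to delete the last remaining key, because no
valid replacement can be named for it.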

Q: What happens when a packet is received on a new connection with no known
MKT's RecvID?

A: RFC 5925 specifies that by default it is accepted with a warning logged, but
the behaviour can be configured by the user [7.5.1.a]::

   If the segment is a SYN, then this is the first segment of a new
   connection. Find the matching MKT for this segment, using the segment's
   socket pair and its TCP-AO KeyID, matched against the MKT's TCP connection
   identifier and the MKT's RecvID.

      i. If there is no matching MKT, remove TCP-AO from the segment.
         Proceed with further TCP handling of the segment.
         NOTE: this presumes that connections that do not match any MKT
         should be silently accepted, as noted in Section 7.3.

[7.3]::

   >> A TCP-AO implementation MUST allow for configuration of the behavior
   of segments with TCP-AO but that do not match an MKT. The initial default
   of this configuration SHOULD be to silently accept such connections.
   If this is not the desired case, an MKT can be included to match such
   connections, or the connection can indicate that TCP-AO is required.
   Alternately, the configuration can be changed to discard segments with
   the AO option not matching an MKT.

[10.2.b]::

   Connections not matching any MKT do not require TCP-AO. Further, incoming
   segments with TCP-AO are not discarded solely because they include
   the option, provided they do not match any MKT.

Note that the Linux TCP-AO implementation differs in this aspect. Currently,
TCP-AO segments with unknown key signatures are discarded with warnings logged.

Q: Does the RFC imply centralized kernel key management in any way?
(i.e. that a key on all connections MUST be rotated at the same time?)

A: Not specified.
MKTs can be managed in userspace, the only relevant part to
key changes is [7.3]::

   >> All TCP segments MUST be checked against the set of MKTs for matching
   TCP connection identifiers.

Q: What happens when RNextKeyID requested by a peer is unknown? Should
the connection be reset?

A: It should not, no action needs to be performed [7.5.2.e]::

   ii. If they differ, determine whether the RNextKeyID MKT is ready.

       1. If the MKT corresponding to the segment’s socket pair and RNextKeyID
          is not available, no action is required (RNextKeyID of a received
          segment needs to match the MKT’s SendID).

Q: How is current_key set, and when does it change? Is it a user-triggered
change, or is it triggered by a request from the remote peer? Is it set by the
user explicitly, or by a matching rule?

A: current_key is set by RNextKeyID [6.1]::

   Rnext_key is changed only by manual user intervention or MKT management
   protocol operation. It is not manipulated by TCP-AO. Current_key is updated
   by TCP-AO when processing received TCP segments as discussed in the segment
   processing description in Section 7.5. Note that the algorithm allows
   the current_key to change to a new MKT, then change back to a previously
   used MKT (known as "backing up"). This can occur during an MKT change when
   segments are received out of order, and is considered a feature of TCP-AO,
   because reordering does not result in drops.

[7.5.2.e.ii]::

   2. If the matching MKT corresponding to the segment’s socket pair and
      RNextKeyID is available:

      a. Set current_key to the RNextKeyID MKT.

Q: If both peers have multiple MKTs matching the connection's socket pair
(with different KeyIDs), how should the sender/receiver pick KeyID to use?

A: Some mechanism should pick the "desired" MKT [3.3]::

   Multiple MKTs may match a single outgoing segment, e.g., when MKTs
   are being changed.
   Those MKTs cannot have conflicting IDs (as noted
   elsewhere), and some mechanism must determine which MKT to use for each
   given outgoing segment.

   >> An outgoing TCP segment MUST match at most one desired MKT, indicated
   by the segment’s socket pair. The segment MAY match multiple MKTs, provided
   that exactly one MKT is indicated as desired. Other information in
   the segment MAY be used to determine the desired MKT when multiple MKTs
   match; such information MUST NOT include values in any TCP option fields.

Q: Can a TCP-MD5 connection migrate to TCP-AO (and vice versa)?

A: No [1]::

   TCP MD5-protected connections cannot be migrated to TCP-AO because TCP MD5
   does not support any changes to a connection’s security algorithm
   once established.

Q: If all MKTs are removed on a connection, can it become a non-TCP-AO signed
connection?

A: [7.5.2] doesn't have the same choice as SYN packet handling in [7.5.1.i]
that would allow accepting segments without a signature (which would be
insecure). While switching to a non-TCP-AO connection is not prohibited
directly, that seems to be what the RFC means. Also, there's a requirement for
TCP-AO connections to always have one current_key [3.3]::

   TCP-AO requires that every protected TCP segment match exactly one MKT.

[3.3]::

   >> An incoming TCP segment including TCP-AO MUST match exactly one MKT,
   indicated solely by the segment’s socket pair and its TCP-AO KeyID.

[4.4]::

   One or more MKTs. These are the MKTs that match this connection’s
   socket pair.

Q: Can a non-TCP-AO connection become a TCP-AO-enabled one?

A: No: for an already established non-TCP-AO connection it would be impossible
to switch to using TCP-AO, as the traffic key generation requires the initial
sequence numbers. In other words, starting to use TCP-AO would require
re-establishing the TCP connection.

2. In-kernel MKTs database vs database in userspace
===================================================

Linux TCP-AO support is implemented using ``setsockopt()s``, in a similar way
to TCP-MD5. It means that a userspace application that wants to use TCP-AO
should perform ``setsockopt()`` on a TCP socket when it wants to add,
remove or rotate MKTs. This approach moves the key management responsibility
to userspace, as well as decisions on corner cases, e.g. what to do if
the peer doesn't respect RNextKeyID. It thereby moves more code to userspace,
especially the code responsible for policy decisions. Besides, it's flexible
and scales well (with less locking needed than in the case of an in-kernel
database). One should also keep in mind that the main intended users are BGP
processes, not any random applications, which means that compared to IPsec
tunnels, no transparency is really needed, and modern BGP daemons already have
``setsockopt()s`` for TCP-MD5 support.

.. table:: Considered pros and cons of the approaches

 +----------------------+------------------------+-----------------------+
 |                      | ``setsockopt()``       | in-kernel DB          |
 +======================+========================+=======================+
 | Extendability        | ``setsockopt()``       | Netlink messages are  |
 |                      | commands should be     | simple and extendable |
 |                      | extendable syscalls    |                       |
 +----------------------+------------------------+-----------------------+
 | Required userspace   | BGP or any application | could be transparent  |
 | changes              | that wants TCP-AO needs| as tunnels, providing |
 |                      | to perform             | something like        |
 |                      | ``setsockopt()s``      | ``ip tcpao add key``  |
 |                      | and do key management  | (delete/show/rotate)  |
 +----------------------+------------------------+-----------------------+
 |MKTs removal or adding| harder for userspace   | harder for kernel     |
 +----------------------+------------------------+-----------------------+
 | Dump-ability         | ``getsockopt()``       | Netlink .dump()       |
 |                      |                        | callback              |
 +----------------------+------------------------+-----------------------+
 | Limits on kernel     |                     equal                      |
 | resources/memory     |                                                |
 +----------------------+------------------------+-----------------------+
 | Scalability          | contention on          | contention on         |
 |                      | ``TCP_LISTEN`` sockets | the whole database    |
 +----------------------+------------------------+-----------------------+
 | Monitoring & warnings| ``TCP_DIAG``           | same Netlink socket   |
 +----------------------+------------------------+-----------------------+
 | Matching of MKTs     | half-problem: only     | hard                  |
 |                      | listen sockets         |                       |
 +----------------------+------------------------+-----------------------+


3. uAPI
=======

Linux provides a set of ``setsockopt()s`` and ``getsockopt()s`` that let
userspace manage TCP-AO on a per-socket basis.
In order to add/delete MKTs
``TCP_AO_ADD_KEY`` and ``TCP_AO_DEL_KEY`` TCP socket options must be used.
It is not allowed to add a key to an established non-TCP-AO connection
or to remove the last key from a TCP-AO connection.

``TCP_AO_ADD_KEY`` allows the MAC algorithm and MAC length to be selected.
Linux supports the mandatory-to-implement algorithms HMAC-SHA-1-96 and
AES-128-CMAC-96. In addition, as Linux extensions, it supports:

- HMAC-SHA256. Linux uses HMAC-SHA256 in the same way as HMAC-SHA1; this
  includes omitting an explicit entropy extraction step. To work around the
  missing entropy extraction, users should provide keys with full entropy. The
  implementation is interoperable with other implementations of HMAC-SHA256 for
  TCP-AO only when they have implemented the key derivation the same way (and
  also the same MAC length is selected on each side).

- Any MAC length for any of the supported MAC algorithms, provided it fits in
  the TCP header and is at least 4 bytes.

``setsockopt(TCP_AO_DEL_KEY)`` command may specify ``tcp_ao_del::current_key``
+ ``tcp_ao_del::set_current`` and/or ``tcp_ao_del::rnext``
+ ``tcp_ao_del::set_rnext``, which makes such a delete "forced": it
provides userspace a way to delete a key that's being used and atomically set
another one instead. This is not intended for normal use and should be used
only when the peer ignores RNextKeyID and keeps requesting/using an old key.
It provides a way to force-delete a key that's not trusted but may break
the TCP-AO connection.

The usual/normal key rotation can be performed with ``setsockopt(TCP_AO_INFO)``.
It also provides a uAPI to change per-socket TCP-AO settings, such as
ignoring ICMPs, as well as to clear per-socket TCP-AO packet counters.
The corresponding ``getsockopt(TCP_AO_INFO)`` can be used to get those
per-socket TCP-AO settings.
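
To make the MAC length constraint mentioned above more concrete, here is a
minimal sketch of the TCP-AO option layout from RFC 5925 (Kind, Length, KeyID,
RNextKeyID, MAC). It is an illustration only: the helper name
``build_tcp_ao_option()`` is hypothetical, and the upper bound checked here is
just the total TCP option space. A real stack also has to leave room for other
TCP options, so the usable MAC length is lower in practice.

```python
import struct

TCPOPT_AO = 29             # TCP-AO option kind
TCP_AO_HEADER_LEN = 4      # Kind, Length, KeyID, RNextKeyID: one byte each
MAX_TCP_OPTION_SPACE = 40  # 60-byte maximum TCP header minus the 20-byte fixed part
MIN_MACLEN = 4             # lower bound stated in this document


def build_tcp_ao_option(keyid, rnextkeyid, mac):
    """Serialize a TCP-AO option carrying an already computed MAC (hypothetical helper)."""
    maclen = len(mac)
    if maclen < MIN_MACLEN:
        raise ValueError("MAC is shorter than %d bytes" % MIN_MACLEN)
    total = TCP_AO_HEADER_LEN + maclen
    if total > MAX_TCP_OPTION_SPACE:
        raise ValueError("option does not fit in the TCP header")
    header = struct.pack("!BBBB", TCPOPT_AO, total, keyid, rnextkeyid)
    return header + bytes(mac)


# A 96-bit (12-byte) MAC, as used by HMAC-SHA-1-96, gives a 16-byte option.
option = build_tcp_ao_option(keyid=5, rnextkeyid=7, mac=bytes(12))
```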

Another useful command is ``getsockopt(TCP_AO_GET_KEYS)``. One can use it
to list all MKTs on a TCP socket or use a filter to get keys for a specific
peer and/or sndid/rcvid, VRF L3 interface or get current_key/rnext_key.

To repair TCP-AO connections ``setsockopt(TCP_AO_REPAIR)`` is available,
provided that the user previously has checkpointed/dumped the socket with
``getsockopt(TCP_AO_REPAIR)``.

A tip for scaled TCP_LISTEN sockets that may have thousands of TCP-AO
keys: use filters in ``getsockopt(TCP_AO_GET_KEYS)`` and asynchronous
delete with ``setsockopt(TCP_AO_DEL_KEY)``.

Linux TCP-AO also provides a set of segment counters that can be helpful
with troubleshooting/debugging issues. Every MKT has good/bad counters
that reflect how many packets passed/failed verification.
Each TCP-AO socket has the following counters:

- for good segments (properly signed)
- for bad segments (failed TCP-AO verification)
- for segments with unknown keys
- for segments where an AO signature was expected, but wasn't found
- for the number of ignored ICMPs

TCP-AO per-socket counters are also duplicated with per-netns counters,
exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``,
``TCPAORequired`` and ``TCPAODroppedIcmps``.

For monitoring purposes, there are the following TCP-AO trace events:
``tcp_hash_bad_header``, ``tcp_hash_ao_required``, ``tcp_ao_handshake_failure``,
``tcp_ao_wrong_maclen``, ``tcp_ao_key_not_found``,
``tcp_ao_rnext_request``, ``tcp_ao_synack_no_key``, ``tcp_ao_snd_sne_update``,
``tcp_ao_rcv_sne_update``. It's possible to separately enable any of them and
one can filter them by net-namespace, 4-tuple, family, L3 index, and TCP header
flags. If a segment has a TCP-AO header, the filters may also include
keyid, rnext, and maclen. SNE updates include the rolled-over numbers.
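
The per-netns counters named above can be collected by a small monitoring
script. The sketch below assumes that they appear in the ``TcpExt`` rows of
``/proc/net/netstat``, like other Linux TCP extension counters, which comes as
pairs of lines (a row of names followed by a row of values). The function name
``parse_tcp_ao_counters()`` is hypothetical and the sample text is made up for
illustration.

```python
def parse_tcp_ao_counters(netstat_text):
    """Return a dict of the TCPAO* counters found in netstat-style text."""
    counters = {}
    lines = netstat_text.splitlines()
    # Lines come in pairs: "Prefix: name1 name2 ..." then "Prefix: val1 val2 ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        name_prefix, _, names = header.partition(":")
        value_prefix, _, numbers = values.partition(":")
        if name_prefix != value_prefix:
            continue  # not a matching name/value pair, skip it
        for name, number in zip(names.split(), numbers.split()):
            if name.startswith("TCPAO"):
                counters[name] = int(number)
    return counters


# Made-up sample in the two-line format; the values are not real measurements.
SAMPLE = (
    "TcpExt: SyncookiesSent TCPAORequired TCPAOBad TCPAOKeyNotFound"
    " TCPAOGood TCPAODroppedIcmps\n"
    "TcpExt: 0 1 2 3 40 5\n"
    "IpExt: InNoRoutes InOctets\n"
    "IpExt: 0 12345\n"
)
sample_counters = parse_tcp_ao_counters(SAMPLE)
```

On a Linux host, the same function could be fed the contents of
``/proc/net/netstat`` instead of the sample text.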

RFC 5925 very permissively specifies how TCP port matching can be done for
MKTs::

   TCP connection identifier. A TCP socket pair, i.e., a local IP
   address, a remote IP address, a TCP local port, and a TCP remote port.
   Values can be partially specified using ranges (e.g., 2-30), masks
   (e.g., 0xF0), wildcards (e.g., "*"), or any other suitable indication.

Currently, the Linux TCP-AO implementation doesn't provide any TCP port
matching. Port ranges are probably the most flexible option for the uAPI, but
they are not implemented so far.

4. ``setsockopt()`` vs ``accept()`` race
========================================

In contrast with an established TCP-MD5 connection which has just one key,
TCP-AO connections may have many keys, which means that accepted connections
on a listen socket may have any number of keys as well. Copying all those
keys on the first properly signed SYN would make the request socket bigger,
which is undesirable. Currently, the implementation doesn't copy keys
to request sockets, but rather looks them up on the "parent" listener socket.

The result is that when userspace removes TCP-AO keys, that may break
not-yet-established connections on request sockets. It also does not remove
keys from sockets that were already established, but not yet ``accept()``'ed,
hanging in the accept queue.

The reverse is valid as well: if userspace adds a new key for a peer on
a listener socket, the established sockets in the accept queue won't
have the new keys.

At this moment, the resolution for the two races:
``setsockopt(TCP_AO_ADD_KEY)`` vs ``accept()``
and ``setsockopt(TCP_AO_DEL_KEY)`` vs ``accept()`` is delegated to userspace.
This means that it's expected that userspace would check the MKTs on the socket
that was returned by ``accept()`` to verify that any key rotation that
happened on the listen socket is reflected on the newly established connection.
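
The userspace check described above boils down to comparing two key sets. The
following sketch shows only that comparison: the function name
``keys_to_fix()`` and the ``(peer, sndid, rcvid)`` tuples are hypothetical, and
how the key lists are obtained (for example with
``getsockopt(TCP_AO_GET_KEYS)``) is left out. It also assumes exact peer
addresses, ignoring prefix matching.

```python
def keys_to_fix(listener_keys, accepted_keys, peer):
    """Return (to_add, to_del) for a socket returned by accept().

    Keys are modelled as (peer, sndid, rcvid) tuples, a simplification
    made for this sketch only.
    """
    wanted = {key for key in listener_keys if key[0] == peer}
    have = set(accepted_keys)
    to_add = sorted(wanted - have)  # added on the listener during the handshake
    to_del = sorted(have - wanted)  # deleted from the listener during the handshake
    return to_add, to_del


listener = {("10.0.0.1", 100, 100), ("10.0.0.1", 101, 101), ("10.0.0.2", 100, 100)}
accepted = {("10.0.0.1", 100, 100), ("10.0.0.1", 99, 99)}
to_add, to_del = keys_to_fix(listener, accepted, "10.0.0.1")
```

The application would then add the keys in ``to_add`` and delete the keys in
``to_del`` on the accepted socket before relying on it.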

This is a similar "do-nothing" approach to TCP-MD5 from the kernel side and
may be changed later by introducing new flags to ``tcp_ao_add``
and ``tcp_ao_del``.

Note that this race is rare, as it needs TCP-AO key rotation to happen
during the 3-way handshake for the new TCP connection.

5. Interaction with TCP-MD5
===========================

A TCP connection cannot migrate between TCP-AO and TCP-MD5 options. The
established sockets that have either AO or MD5 keys are restricted from
adding keys of the other option.

For listening sockets the picture is different: a BGP server may want to accept
both TCP-AO and (deprecated) TCP-MD5 clients. As a result, both types of keys
may be added to TCP_CLOSED or TCP_LISTEN sockets. It's not allowed to add
different types of keys for the same peer.

6. SNE Linux implementation
===========================

RFC 5925 [6.2] describes the algorithm of how to extend TCP sequence numbers
with SNE. In short: TCP has to track the previous sequence numbers and set
sne_flag when the current SEQ number rolls over. The flag is cleared when
both current and previous SEQ numbers cross 0x7fff, which is 32KB.

While sne_flag is set, the algorithm compares SEQ for each packet with
0x7fff and if it's higher than 32KB, it assumes that the packet should be
verified with SNE before the increment. As a result, there's
this [0; 32KB] window, when packets with (SNE - 1) can be accepted.

The Linux implementation simplifies this a bit: the network stack already
tracks the first SEQ byte that ACK is wanted for (snd_una) and the next SEQ
byte that is wanted (rcv_nxt), which is enough information for a rough estimate
of where in the 4GB SEQ number space both sender and receiver are.
When they roll over to zero, the corresponding SNE gets incremented.

tcp_ao_compute_sne() is called for each TCP-AO segment.
It compares SEQ numbers
from the segment with snd_una or rcv_nxt and fits the result into a 2GB window
around them, detecting SEQ numbers rolling over. That simplifies the code a lot
and only requires SNE numbers to be stored on every TCP-AO socket.

The 2GB window at first glance seems much more permissive compared to
RFC 5925. But that is only used to pick the correct SNE before/after
a rollover. It allows more TCP segment replays, yet all regular
TCP checks in tcp_sequence() are still applied to the verified segment.
So, it trades a bit more permissive acceptance of replayed/retransmitted
segments for the simplicity of the algorithm and what seems to be better
behaviour for large TCP windows.

7. Links
========

RFC 5925 The TCP Authentication Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5925.txt.pdf

RFC 5926 Cryptographic Algorithms for the TCP Authentication Option (TCP-AO)
   https://www.rfc-editor.org/rfc/pdfrfc/rfc5926.txt.pdf

Draft "SHA-2 Algorithm for the TCP Authentication Option (TCP-AO)"
   https://datatracker.ietf.org/doc/html/draft-nayak-tcp-sha2-03

RFC 2385 Protection of BGP Sessions via the TCP MD5 Signature Option
   https://www.rfc-editor.org/rfc/pdfrfc/rfc2385.txt.pdf

:Author: Dmitry Safonov <dima@arista.com>