Revision tags: release/14.1.0, release/13.3.0, release/14.0.0, release/13.2.0, release/12.4.0, release/13.1.0, release/12.3.0 |
|
#
ef2a572b |
| 22-Aug-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
ipsec_offload: kernel infrastructure
Inline IPSEC offload moves almost whole IPSEC processing from the CPU/MCU and possibly crypto accelerator, to the network card.
The transmitted packet content i
ipsec_offload: kernel infrastructure
Inline IPSEC offload moves almost whole IPSEC processing from the CPU/MCU and possibly crypto accelerator, to the network card.
The transmitted packet content is not touched by CPU during TX operations, kernel only does the required policy and security association lookups to find out that given flow is offloaded, and then packet is transmitted as plain text to the card. For driver convenience, a metadata is attached to the packet identifying SA which must process the packet. Card does encryption of the payload, padding, calculates authentication, and does the reformat according to the policy.
Similarly, on receive, card does the decapsulation, decryption, and authentification. Kernel receives the identifier of SA that was used to process the packet, together with the plain-text packet.
Overall, payload octets are only read or written by card DMA engine, removing a lot of memory subsystem overhead, and saving CPU time because IPSEC algos calculations are avoided.
If driver declares support for inline IPSEC offload (with the IFCAP2_IPSEC_OFFLOAD capability set and registering method table struct if_ipsec_accel_methods), kernel offers the SPD and SAD to driver. Driver decides which policies and SAs can be offloaded based on hardware capacity, and acks/nacks each SA for given interface to kernel. Kernel needs to keep this information to make a decision to skip software processing on TX, and to assume processing already done on RX. This shadow SPD/SAD database of offloads is rooted from policies (struct secpolicy accel_ifps, struct ifp_handle_sp) and SAs (struct secasvar accel_ipfs, struct ifp_handle_sav).
Some extensions to the PF_KEY socket allow to limit interfaces for which given SP/SA could be offloaded (proposed for offload). Also, additional statistics extensions allow to observe allocation/octet/use counters for specific SA.
Since SPs and SAs are typically instantiated in non-sleepable context, while offloading them into card is expected to require costly async manipulations of the card state, calls to the driver for offload and termination are executed in the threaded taskqueue. It also solves the issue of allocating resources needed for the offload database. Neither ipf_handle_sp nor ipf_handle_sav do not add reference to the owning SP/SA, the offload must be terminated before last reference is dropped. ipsec_accel only adds transient references to ensure safe pointer ownership by taskqueue.
Maintaining the SA counters for hardware-accelerated packets is the duty of the driver. The helper ipsec_accel_drv_sa_lifetime_update() is provided to hide accel infrastructure from drivers which would use expected callout to query hardware periodically for updates.
Reviewed by: rscheff (transport, stack integration), np Sponsored by: NVIDIA networking Differential revision: https://reviews.freebsd.org/D44219
show more ...
|
#
80044c78 |
| 16-Jan-2024 |
Xavier Beaudouin <xavier.beaudouin@klarasystems.com> |
Add UDP encapsulation of ESP in IPv6
This patch provides UDP encapsulation of ESP packets over IPv6. Ports the IPv4 code to IPv6 and adds support for IPv6 in udpencap.c As required by the RFC and un
Add UDP encapsulation of ESP in IPv6
This patch provides UDP encapsulation of ESP packets over IPv6. Ports the IPv4 code to IPv6 and adds support for IPv6 in udpencap.c As required by the RFC and unlike in IPv4 encapsulation, UDP checksums are calculated.
Co-authored-by: Aurelien Cazuc <aurelien.cazuc.external@stormshield.eu> Sponsored-by: Stormshield Sponsored-by: Wiktel Sponsored-by: Klara, Inc.
show more ...
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
3d0d5b21 |
| 23-Jan-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
IfAPI: Explicitly include <net/if_private.h> in netstack
Summary: In preparation of making if_t completely opaque outside of the netstack, explicitly include the header. <net/if_var.h> will stop in
IfAPI: Explicitly include <net/if_private.h> in netstack
Summary: In preparation of making if_t completely opaque outside of the netstack, explicitly include the header. <net/if_var.h> will stop including the header in the future.
Sponsored by: Juniper Networks, Inc. Reviewed by: glebius, melifaro Differential Revision: https://reviews.freebsd.org/D38200
show more ...
|
#
e68b3792 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage specify allocation size that would provide space to most of the data a TCP connection needs, embedding into struct tcpcb several struct
tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage specify allocation size that would provide space to most of the data a TCP connection needs, embedding into struct tcpcb several structures, that previously were allocated separately.
The most import one is the inpcb itself. With embedding we can provide strong guarantee that with a valid TCP inpcb the tcpcb is always valid and vice versa. Also we reduce number of allocs/frees per connection. The embedded inpcb is placed in the beginning of the struct tcpcb, since in_pcballoc() requires that. However, later we may want to move it around for cache line efficiency, and this can be done with a little effort. The new intotcpcb() macro is ready for such move.
The congestion algorithm data, the TCP timers and osd(9) data are also embedded into tcpcb, and temprorary struct tcpcb_mem goes away. There was no extra allocation here, but we went through extra pointer every time we accessed this data.
One interesting side effect is that now TCP data is allocated from SMR-protected zone. Potentially this allows the TCP stacks or other TCP related modules to utilize that for their own synchronization.
Large part of the change was done with sed script:
s/tp->ccv->/tp->t_ccv./g s/tp->ccv/\&tp->t_ccv/g s/tp->cc_algo/tp->t_cc/g s/tp->t_timers->tt_/tp->tt_/g s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g
Dependency side effect is that code that needs to know struct tcpcb should also know struct inpcb, that added several <netinet/in_pcb.h>.
Differential revision: https://reviews.freebsd.org/D37127
show more ...
|
#
fcb3f813 |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet*: remove PRC_ constants and streamline ICMP processing
In the original design of the network stack from the protocol control input method pr_ctlinput was used notify the protocols about two
netinet*: remove PRC_ constants and streamline ICMP processing
In the original design of the network stack from the protocol control input method pr_ctlinput was used notify the protocols about two very different kinds of events: internal system events and receival of an ICMP messages from outside. These events were coded with PRC_ codes. Today these methods are removed from the protosw(9) and are isolated to IPv4 and IPv6 stacks and are called only from icmp*_input(). The PRC_ codes now just create a shim layer between ICMP codes and errors or actions taken by protocols.
- Change ipproto_ctlinput_t to pass just pointer to ICMP header. This allows protocols to not deduct it from the internal IP header. - Change ip6proto_ctlinput_t to pass just struct ip6ctlparam pointer. It has all the information needed to the protocols. In the structure, change ip6c_finaldst fields to sockaddr_in6. The reason is that icmp6_input() already has this address wrapped in sockaddr, and the protocols want this address as sockaddr. - For UDP tunneling control input, as well as for IPSEC control input, change the prototypes to accept a transparent union of either ICMP header pointer or struct ip6ctlparam pointer. - In icmp_input() and icmp6_input() do only validation of ICMP header and count bad packets. The translation of ICMP codes to errors/actions is done by protocols. - Provide icmp_errmap() and icmp6_errmap() as substitute to inetctlerrmap, inet6ctlerrmap arrays. - In protocol ctlinput methods either trust what icmp_errmap() recommend, or do our own logic based on the ICMP header.
Differential revision: https://reviews.freebsd.org/D36731
show more ...
|
#
809fef29 |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netipsec: move specific ipsecmethods declarations to ipsec_support.h
where struct ipsec_methods is defined. Not a functional change. Allows further modification of method prototypes without breakin
netipsec: move specific ipsecmethods declarations to ipsec_support.h
where struct ipsec_methods is defined. Not a functional change. Allows further modification of method prototypes without breaking compilation of other ipsec compilation units.
Differential revision: https://reviews.freebsd.org/D36730
show more ...
|
#
46ddeb6b |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet6: retire ip6protosw.h
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with the former removed in f0ffb944d25 in 2001 and the latter survived until today. It has been red
netinet6: retire ip6protosw.h
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with the former removed in f0ffb944d25 in 2001 and the latter survived until today. It has been reduced down to only one useful declaration that moves to ip6_var.h
Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36726
show more ...
|
#
78b1fc05 |
| 17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
protosw: separate pr_input and pr_ctlinput out of protosw
The protosw KPI historically has implemented two quite orthogonal things: protocols that implement a certain kind of socket, and protocols t
protosw: separate pr_input and pr_ctlinput out of protosw
The protosw KPI historically has implemented two quite orthogonal things: protocols that implement a certain kind of socket, and protocols that are IPv4/IPv6 protocol. These two things do not make one-to-one correspondence. The pr_input and pr_ctlinput methods were utilized only in IP protocols. This strange duality required IP protocols that doesn't have a socket to declare protosw, e.g. carp(4). On the other hand developers of socket protocols thought that they need to define pr_input/pr_ctlinput always, which lead to strange dead code, e.g. div_input() or sdp_ctlinput().
With this change pr_input and pr_ctlinput as part of protosw disappear and IPv4/IPv6 get their private single level protocol switch table ip_protox[] and ip6_protox[] respectively, pointing at array of ipproto_input_t functions. The pr_ctlinput that was used for control input coming from the network (ICMP, ICMPv6) is now represented by ip_ctlprotox[] and ip6_ctlprotox[].
ipproto_register() becomes the only official way to register in the table. Those protocols that were always static and unlikely anybody is interested in making them loadable, are now registered by ip_init(), ip6_init(). An IP protocol that considers itself unloadable shall register itself within its own private SYSINIT().
Reviewed by: tuexen, melifaro Differential revision: https://reviews.freebsd.org/D36157
show more ...
|
#
489482e2 |
| 17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
ipsec: isolate knowledge about protocols that are last header
Retire PR_LASTHDR protosw flag.
Reviewed by: ae Differential revision: https://reviews.freebsd.org/D36155
|
#
863871d3 |
| 27-Jul-2022 |
Kornel Dulęba <kd@FreeBSD.org> |
ipsec: Improve validation of PMTU
Currently there is no upper bound on the PMTU value that is accepted. Update hostcache only if the new pmtu is smaller than the current entry and the link MTU.
App
ipsec: Improve validation of PMTU
Currently there is no upper bound on the PMTU value that is accepted. Update hostcache only if the new pmtu is smaller than the current entry and the link MTU.
Approved by: mw(mentor) Sponsored by: Stormshield Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D35872
show more ...
|
#
590d0715 |
| 17-Sep-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
ipsec: enter epoch before calling into ipsec_run_hhooks
pfil_run_hooks which eventually can get called asserts on it.
Reviewed by: ae Sponsored by: Rubicon Communications, LLC ("Netgate") Different
ipsec: enter epoch before calling into ipsec_run_hhooks
pfil_run_hooks which eventually can get called asserts on it.
Reviewed by: ae Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D32007
show more ...
|
#
10eb2a2b |
| 10-Sep-2021 |
Mark Johnston <markj@FreeBSD.org> |
ipsec: Validate the protocol identifier in ipsec4_ctlinput()
key_allocsa() expects to handle only IPSec protocols and has an assertion to this effect. However, ipsec4_ctlinput() has to handle messa
ipsec: Validate the protocol identifier in ipsec4_ctlinput()
key_allocsa() expects to handle only IPSec protocols and has an assertion to this effect. However, ipsec4_ctlinput() has to handle messages from ICMP unreachable packets and was not validating the protocol number. In practice such a packet would simply fail to match any SADB entries and would thus be ignored.
Reported by: syzbot+6a9ef6fcfadb9f3877fe@syzkaller.appspotmail.com Reviewed by: ae MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31890
show more ...
|
#
d9d59bb1 |
| 09-Aug-2021 |
Wojciech Macek <wma@FreeBSD.org> |
ipsec: Handle ICMP NEEDFRAG message.
It will be needed for upcoming PMTU implementation in ipsec. For now simply create/update an entry in tcp hostcache when needed. The code is based on
ipsec: Handle ICMP NEEDFRAG message.
It will be needed for upcoming PMTU implementation in ipsec. For now simply create/update an entry in tcp hostcache when needed. The code is based on https://people.freebsd.org/~ae/ipsec_transport_mode_ctlinput.diff
Authored by: Kornel Duleba <mindal@semihalf.com> Differential revision: https://reviews.freebsd.org/D30992 Reviewed by: tuxen Sponsored by: Stormshield Obtained from: Semihalf
show more ...
|
Revision tags: release/13.0.0, release/12.2.0 |
|
#
662c1305 |
| 01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: clean up empty lines in .c and .h files
|
#
f82eb2a6 |
| 26-Jun-2020 |
John Baldwin <jhb@FreeBSD.org> |
Enter and exit the network epoch for async IPsec callbacks.
When an IPsec packet has been encrypted or decrypted, the next step in the packet's traversal through the network stack is invoked from a
Enter and exit the network epoch for async IPsec callbacks.
When an IPsec packet has been encrypted or decrypted, the next step in the packet's traversal through the network stack is invoked from a crypto worker thread, not from the original calling thread. These threads need to enter the network epoch before passing packets down to IP output routines or up to transport protocols.
Reviewed by: ae Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25444
show more ...
|
Revision tags: release/11.4.0, release/12.1.0, release/11.3.0, release/12.0.0, release/11.2.0, release/10.4.0 |
|
#
0275f9db |
| 11-Aug-2017 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Merge ^/head r321383 through r322397.
|
#
d59ead01 |
| 03-Aug-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r321970
|
#
69ef36e3 |
| 01-Aug-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r321829
|
#
1a01e0e7 |
| 31-Jul-2017 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Add inpcb pointer to struct ipsec_ctx_data and pass it to the pfil hook from enc_hhook().
This should solve the problem when pf is used with if_enc(4) interface, and outbound packet with existing PC
Add inpcb pointer to struct ipsec_ctx_data and pass it to the pfil hook from enc_hhook().
This should solve the problem when pf is used with if_enc(4) interface, and outbound packet with existing PCB checked by pf, and this leads to deadlock due to pf does its own PCB lookup and tries to take rlock when wlock is already held.
Now we pass PCB pointer if it is known to the pfil hook, this helps to avoid extra PCB lookup and thus rlock acquiring is not needed. For inbound packets it is safe to pass NULL, because we do not held any PCB locks yet.
PR: 220217 MFC after: 3 weeks Sponsored by: Yandex LLC
show more ...
|
Revision tags: release/11.1.0 |
|
#
a773cead |
| 30-May-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r318964 through r319164.
|
#
7f1f6591 |
| 29-May-2017 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Disable IPsec debugging code by default when IPSEC_DEBUG kernel option is not specified.
Due to the long call chain IPsec code can produce the kernel stack exhaustion on the i386 architecture. The d
Disable IPsec debugging code by default when IPSEC_DEBUG kernel option is not specified.
Due to the long call chain IPsec code can produce the kernel stack exhaustion on the i386 architecture. The debugging code usually is not used, but it requires a lot of stack space to keep buffers for strings formatting. This patch conditionally defines macros to disable building of IPsec debugging code.
IPsec currently has two sysctl variables to configure debug output: * net.key.debug variable is used to enable debug output for PF_KEY protocol. Such debug messages are produced by KEYDBG() macro and usually they can be interesting for developers. * net.inet.ipsec.debug variable is used to enable debug output for DPRINTF() macro and ipseclog() function. DPRINTF() macro usually is used for development debugging. ipseclog() function is used for debugging by administrator.
The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers declarations when IPSEC_DEBUG is not present in kernel config. This reduces stack requirement for up to several hundreds of bytes. The net.inet.ipsec.debug variable still can be used to enable ipseclog() messages by administrator.
PR: 219476 Reported by: eugen No objection from: #network MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10869
show more ...
|
#
d02c951f |
| 26-May-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r318658 through r318963.
|
#
5f7c516f |
| 23-May-2017 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Fix possible double releasing for SA reference.
There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread.
For inbound packets the direct call
Fix possible double releasing for SA reference.
There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread.
For inbound packets the direct call chain is the following: IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue().
The SA reference is held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did SA reference releasing in xform_input_cb(). But when the crypto callback called directly, in case of error (e.g. data authentification failed) the error handling in ipsec_common_input() also did SA reference releasing.
To fix this, remove error handling from ipsec_common_input() and do it in xform_input() before crypto_dispatch().
PR: 219356 MFC after: 10 days
show more ...
|
#
1a36faad |
| 11-Feb-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r313301 through r313643.
|