#
0ff2d00d |
| 29-Dec-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
ipsec: allow it to work with unmapped mbufs
Only map mbuf when a policy is looked up and indicates that IPSEC needs to transform the packet. If IPSEC is inline offloaded, it is up to the interface
ipsec: allow it to work with unmapped mbufs
Only map mbuf when a policy is looked up and indicates that IPSEC needs to transform the packet. If IPSEC is inline offloaded, it is up to the interface driver to request remap if needed.
Fetch the IP header using m_copydata() instead of using mtod() to select policy/SA.
Reviewed by: markj Sponsored by: NVidia networking Differential revision: https://reviews.freebsd.org/D48265
show more ...
|
#
b0e02076 |
| 28-Dec-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
ipsec + ktls: cannot coexists
but instead of tripping the assert in debug kernel, and silently falling into UB for prod, skip IPSEC processing for KTLS framed packets when mb_unmapped_to_ext() faile
ipsec + ktls: cannot coexists
but instead of tripping the assert in debug kernel, and silently falling into UB for prod, skip IPSEC processing for KTLS framed packets when mb_unmapped_to_ext() failed.
Reviewed by: markj Sponsored by: NVidia networking MFC after: 1 week Differential revision: https://reviews.freebsd.org/D48265
show more ...
|
Revision tags: release/14.2.0 |
|
#
f8707400 |
| 27-Nov-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
ip6_output(): if mtu is not yet computed for ipsec hook, use ifp mtu
Sponsored by: NVidia networking
|
Revision tags: release/13.4.0, release/14.1.0, release/13.3.0, release/14.0.0, release/13.2.0 |
|
#
da0efbdb |
| 25-Jan-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
ip6_output: place IPSEC_OUTPUT hook after the outgoing ifp is calculated
To be able to pass ifp and mtu to the ipsec_output() and ipsec accelerator filter.
Sponsored by: NVIDIA networking Different
ip6_output: place IPSEC_OUTPUT hook after the outgoing ifp is calculated
To be able to pass ifp and mtu to the ipsec_output() and ipsec accelerator filter.
Sponsored by: NVIDIA networking Differential revision: https://reviews.freebsd.org/D44225
show more ...
|
#
00524fd4 |
| 30-Jan-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
ipsec_output(): add mtu argument
Similarly, mtu is needed to decide inline IPSEC offloiad for the driver.
Sponsored by: NVIDIA networking Differential revision: https://reviews.freebsd.org/D44224
|
#
de1da299 |
| 25-Jan-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
ipsec_output(): add outcoming ifp argument
The information about the interface is needed to coordinate inline offloading of IPSEC processing with corresponding driver.
Sponsored by: NVIDIA networki
ipsec_output(): add outcoming ifp argument
The information about the interface is needed to coordinate inline offloading of IPSEC processing with corresponding driver.
Sponsored by: NVIDIA networking Differential revision: https://reviews.freebsd.org/D44223
show more ...
|
#
530c2c30 |
| 20-Mar-2024 |
Andrew Gallatin <gallatin@FreeBSD.org> |
ip6_output: Reduce cache misses on pktopts
When profiling an IP6 heavy workload, I noticed that we were getting a lot of cache misses in ip6_output() around ip6_pktopts. This was happening because t
ip6_output: Reduce cache misses on pktopts
When profiling an IP6 heavy workload, I noticed that we were getting a lot of cache misses in ip6_output() around ip6_pktopts. This was happening because the TCP stack passes inp->in6p_outputopts even if all options are unused. So in the common case of no options present, pkt_opts is not null, and is checked repeatedly for different options. Since ip6_pktopts is large (4 cachelines), and every field is checked, we take 4 cache misses (2 of which tend to be hidden by the adjacent line prefetcher).
To fix this common case, I introduced a new flag in ip6_pktopts (ip6po_valid) which tracks which options have been set. In the common case where nothing is set, this causes just a single cache miss to load. It also eliminates a test for some options (if (opt != NULL && opt->val >= const) vs if ((optvalid & flag) !=0 )
To keep the struct the same size in 64-bit kernels, and to keep the integer values (like ip6po_hlim, ip6po_tclass, etc) on the same cacheline, I moved them to the top.
As suggested by zlei, the null check in MAKE_EXTHDR() becomes redundant, and can be removed.
For our web server workload (with the ip6po_tclass option set), this drops the CPI from 2.9 to 2.4 for ip6_output
Differential Revision: https://reviews.freebsd.org/D44204 Reviewed by: bz, glebius, zlei No Objection from: melifaro Sponsored by: Netflix Inc.
show more ...
|
#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl s
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
show more ...
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
e3ba0d6a |
| 27-Jul-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
inpcb: do not copy so_options into inp_flags2
Since f71cb9f74808 socket stays connnected with inpcb through latter's lifetime and there is no reason to complicate things and copy these flags.
Revie
inpcb: do not copy so_options into inp_flags2
Since f71cb9f74808 socket stays connnected with inpcb through latter's lifetime and there is no reason to complicate things and copy these flags.
Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D41198
show more ...
|
#
bc310a95 |
| 20-Jul-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
ip output: ensure that mbufs are mapped if ipsec is enabled
Ipsec needs access to packet headers to determine if a policy is applicable. It seems that typically IP headers are mapped, but the code i
ip output: ensure that mbufs are mapped if ipsec is enabled
Ipsec needs access to packet headers to determine if a policy is applicable. It seems that typically IP headers are mapped, but the code is arguably needs to check this before blindly accessing them. Then, operations like m_unshare() and m_makespace() are not yet ready for unmapped mbufs.
Ensure that the packet is mapped before calling into IPSEC_OUTPUT().
PR: 272616 Reviewed by: jhb, markj Sponsored by: NVidia networking MFC after: 1 week Differential revision: https://reviews.freebsd.org/D41112
show more ...
|
#
317fa516 |
| 28-Feb-2023 |
Mark Johnston <markj@FreeBSD.org> |
netinet: Remove the IP(V6)_RSS_LISTEN_BUCKET socket option
It has no effect, and an exp-run revealed that it is not in use.
PR: 261398 (exp-run) Reviewed by: mjg, glebius Sponsored by: Klara, Inc.
netinet: Remove the IP(V6)_RSS_LISTEN_BUCKET socket option
It has no effect, and an exp-run revealed that it is not in use.
PR: 261398 (exp-run) Reviewed by: mjg, glebius Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38822
show more ...
|
#
3aff4ccd |
| 27-Feb-2023 |
Mark Johnston <markj@FreeBSD.org> |
netinet: Remove IP(V6)_BINDMULTI
This option was added in commit 0a100a6f1ee5 but was never completed. In particular, there is no logic to map flowids to different listening sockets, so it accomplis
netinet: Remove IP(V6)_BINDMULTI
This option was added in commit 0a100a6f1ee5 but was never completed. In particular, there is no logic to map flowids to different listening sockets, so it accomplishes basically the same thing as SO_REUSEPORT. Meanwhile, we've since added SO_REUSEPORT_LB, which at least tries to balance among listening sockets using a hash of the 4-tuple and some optional NUMA policy.
The option was never documented or completed, and an exp-run revealed nothing using it in the ports tree. Moreover, it complicates the already very complicated in_pcbbind_setup(), and the checking in in_pcbbind_check_bindmulti() is insufficient. So, let's remove it.
PR: 261398 (exp-run) Reviewed by: glebius Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38574
show more ...
|
#
3d0d5b21 |
| 23-Jan-2023 |
Justin Hibbits <jhibbits@FreeBSD.org> |
IfAPI: Explicitly include <net/if_private.h> in netstack
Summary: In preparation of making if_t completely opaque outside of the netstack, explicitly include the header. <net/if_var.h> will stop in
IfAPI: Explicitly include <net/if_private.h> in netstack
Summary: In preparation of making if_t completely opaque outside of the netstack, explicitly include the header. <net/if_var.h> will stop including the header in the future.
Sponsored by: Juniper Networks, Inc. Reviewed by: glebius, melifaro Differential Revision: https://reviews.freebsd.org/D38200
show more ...
|
Revision tags: release/12.4.0, release/13.1.0, release/12.3.0 |
|
#
21cc0918 |
| 17-Aug-2021 |
Elliott Mitchell <ehem+freebsd@m5p.com> |
sys: Nuke double-semicolons
A distinct number of double-semicolons have ended up in FreeBSD. Take a pass at getting rid of many of these harmless typos.
Reviewed by: emaste, rrs Pull Request: http
sys: Nuke double-semicolons
A distinct number of double-semicolons have ended up in FreeBSD. Take a pass at getting rid of many of these harmless typos.
Reviewed by: emaste, rrs Pull Request: https://github.com/freebsd/freebsd-src/pull/609 Differential Revision: https://reviews.freebsd.org/D31716
show more ...
|
#
2e0e2739 |
| 13-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet6: trim overly long lines in GET_PKTOPT_VAR(), fit into 80 chars
|
#
53af6903 |
| 07-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove INP_TIMEWAIT flag
Mechanically cleanup INP_TIMEWAIT from the kernel sources. After 0d7445193ab, this commit shall not cause any functional changes.
Note: this flag was very often check
tcp: remove INP_TIMEWAIT flag
Mechanically cleanup INP_TIMEWAIT from the kernel sources. After 0d7445193ab, this commit shall not cause any functional changes.
Note: this flag was very often checked together with INP_DROPPED. If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we will be able to remove most of this checks and turn them to assertions. Some of them can be turned into assertions right now, but that should be carefully done on a case by case basis.
Differential revision: https://reviews.freebsd.org/D36400
show more ...
|
#
46ddeb6b |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet6: retire ip6protosw.h
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with the former removed in f0ffb944d25 in 2001 and the latter survived until today. It has been red
netinet6: retire ip6protosw.h
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with the former removed in f0ffb944d25 in 2001 and the latter survived until today. It has been reduced down to only one useful declaration that moves to ip6_var.h
Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36726
show more ...
|
#
dda6376b |
| 08-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: employ newly added pfil_mbuf_{in,out} where approriate
Reviewed by: glebius Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D36454
|
#
74ed2e8a |
| 02-Sep-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
raw ip: fix regression with multicast and RSVP
With 61f7427f02a raw sockets protosw has wildcard pr_protocol. Protocol of a specific pcb is stored in inp_ip_p.
Reviewed by: karels Reported by: k
raw ip: fix regression with multicast and RSVP
With 61f7427f02a raw sockets protosw has wildcard pr_protocol. Protocol of a specific pcb is stored in inp_ip_p.
Reviewed by: karels Reported by: karels Differential revision: https://reviews.freebsd.org/D36429 Fixes: 61f7427f02a307d28af674a12c45dd546e3898e4
show more ...
|
#
50fa27e7 |
| 10-Jul-2022 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
netinet6: fix interface handling for loopback traffic
Currently, processing of IPv6 local traffic is partially broken: link-local connection fails and global unicast connect() takes 3 seconds to c
netinet6: fix interface handling for loopback traffic
Currently, processing of IPv6 local traffic is partially broken: link-local connection fails and global unicast connect() takes 3 seconds to complete. This happens due to the combination of multiple factors. IPv6 code passes original interface "origifp" when passing traffic via loopack to retain the scope that is mandatory for the correct hadling of link-local traffic. First problem is that the logic of passing source interface is not working correcly for TCP connections, resulting in passing "origifp" on the first 2 connection attempts and lo0 on the subsequent ones. Second problem is that source address validation logic skips its checks iff the source interface is loopback, which doesn't cover "origifp" case. More detailed description is available at https://reviews.freebsd.org/D35732
Fix the first problem by untangling&simplifying ifp/origifp logic. Fix the second problem by switching source address validation check to using M_LOOP mbuf flag instead of interface type.
PR: 265089 Reviewed by: ae, bz(previous version) Differential Revision: https://reviews.freebsd.org/D35732 MFC after: 2 weeks
show more ...
|
#
7d98cc09 |
| 01-Apr-2022 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Fix ipfw fwd that doesn't work in some cases
For IPv4 use dst pointer as destination address in fib4_lookup(). It keeps destination address from IPv4 header and can be changed when PACKET_TAG_IPFORW
Fix ipfw fwd that doesn't work in some cases
For IPv4 use dst pointer as destination address in fib4_lookup(). It keeps destination address from IPv4 header and can be changed when PACKET_TAG_IPFORWARD tag was set by packet filter.
For IPv6 override destination address with address from dst_sa.sin6_addr, that was set from PACKET_TAG_IPFORWARD tag.
Reviewed by: eugen MFC after: 1 week PR: 256828, 261697, 255705 Differential Revision: https://reviews.freebsd.org/D34732
show more ...
|
#
9ba11796 |
| 27-Jan-2022 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Fix a memory leak when ip_output_send() returns EAGAIN due to send tag issues
When ip_output_send() returns EAGAIN due to issues with send tags (route change, lagg failover, etc), it must free the m
Fix a memory leak when ip_output_send() returns EAGAIN due to send tag issues
When ip_output_send() returns EAGAIN due to issues with send tags (route change, lagg failover, etc), it must free the mbuf. This is because ip_output_send() was written as a wrapper/replacement for a direct call to if_output(), and the contract with if_output() has historically been that it owns the mbufs once called. When ip_output_send() failed to free mbufs, it violated this assumption and lead to leaked mbufs.
This was noticed when using NIC TLS in combination with hardware rate-limited connections. When seeing lots of NIC output drops triggered ratelimit send tag changes, we noticed we were leaking ktls_sessions, send tags and mbufs. This was due ip_output_send() leaking mbufs which held references to ktls_sessions, which in turn held references to send tags.
Many thanks to jbh, rrs, hselasky and markj for their help in debugging this.
Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34054 Reviewed by: hselasky, jhb, rrs MFC after: 2 weeks
show more ...
|
#
9f5432d5 |
| 15-Dec-2021 |
Kristof Provost <kp@FreeBSD.org> |
netinet6: ip6_setpktopt() requires NET_EPOCH
ip6_setpktopt() can call ifnet_byindex() which requires epoch. Mark the function as requiring NET_EPOCH, and ensure we enter it priot to calling it.
Rep
netinet6: ip6_setpktopt() requires NET_EPOCH
ip6_setpktopt() can call ifnet_byindex() which requires epoch. Mark the function as requiring NET_EPOCH, and ensure we enter it priot to calling it.
Reported-by: syzbot+92526116441688fea8a3@syzkaller.appspotmail.com Reviewed by: glebius Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D33462
show more ...
|
#
d74b7bae |
| 04-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
ifnet_byindex() actually requires network epoch
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them in epoch. Most of the code touched remains unsafe, as the returned pointer is be
ifnet_byindex() actually requires network epoch
Sweep over potentially unsafe calls to ifnet_byindex() and wrap them in epoch. Most of the code touched remains unsafe, as the returned pointer is being used after epoch exit. Mark that with a comment.
Validate the index argument inside the function, reducing argument validation requirement from the callers and making V_if_index private to if.c.
Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D33263
show more ...
|