#
507f87a7 |
| 16-Jan-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: retire sorflush()
With removal of dom_dispose method the function boils down to two meaningful function calls: socantrcvmore() and sbrelease(). The latter is only relevant for protocols th
sockets: retire sorflush()
With removal of dom_dispose method the function boils down to two meaningful function calls: socantrcvmore() and sbrelease(). The latter is only relevant for protocols that use generic socket buffers.
The socket I/O sx(9) lock acquisition in sorflush() is not relevant for shutdown(2) operation as it doesn't do any I/O that may interleave with read(2) or write(2). The socket buffer mutex acquisition inside sbrelease() is what guarantees thread safety. This sx(9) acquisition in soshutdown() can be tracked down to 4.4BSD times, where it used to be sblock(), and it was carried over through the years evolving together with sockets with no reconsideration of why do we carry it over. I can't tell if that sblock() made sense back then, but it doesn't make any today.
Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D43415
show more ...
|
#
5bba2728 |
| 16-Jan-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: make pr_shutdown fully protocol specific method
Disassemble a one-for-all soshutdown() into protocol specific methods. This creates a small amount of copy & paste, but makes code a lot more
sockets: make pr_shutdown fully protocol specific method
Disassemble a one-for-all soshutdown() into protocol specific methods. This creates a small amount of copy & paste, but makes code a lot more self documented, as protocol specific method would execute only the code that is relevant to that protocol and nothing else. This also fixes a couple recent regressions and reduces risk of future regressions. The extended KPI for the new pr_shutdown removes need for the extra pr_flush which was added for the sake of SCTP which could not perform its shutdown properly with the old one. Particularly for SCTP this change streamlines a lot of code.
Some notes on why certain parts of code were copied or were not to certain protocols: * The (SS_ISCONNECTED | SS_ISCONNECTING | SS_ISDISCONNECTING) check is needed only for those protocols that may be connected or disconnected. * The above reduces into only SS_ISCONNECTED for those protocols that always connect instantly. * The ENOTCONN and continue processing hack is left only for datagram protocols. * The SOLISTENING(so) block is copied to those protocols that listen(2). * sorflush() on SHUT_RD is copied almost to every protocol, but that will be refactored later. * wakeup(&so->so_timeo) is copied to protocols that can make a non-instant connect(2), can SO_LINGER or can accept(2).
There are three protocols (netgraph(4), Bluetooth, SDP) that did not have pr_shutdown, but old soshutdown() would still perform sorflush() on SHUT_RD for them and also wakeup(9). Those protocols partially supported shutdown(2) returning EOPNOTSUP for SHUT_WR/SHUT_RDWR, now they fully lost shutdown(2) support. I'm pretty sure netgraph(4) and Bluetooth are okay about that and SDP is almost abandoned anyway.
Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D43413
show more ...
|
#
a13039e2 |
| 27-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
inpcb: reoder inpcb destruction
First, merge in_pcbdetach() with in_pcbfree(). The comment for in_pcbdetach() was no longer correct. Then, make sure we remove the inpcb from the hash before we com
inpcb: reoder inpcb destruction
First, merge in_pcbdetach() with in_pcbfree(). The comment for in_pcbdetach() was no longer correct. Then, make sure we remove the inpcb from the hash before we commit any destructive actions on it. There are couple functions that rely on the hash lock skipping SMR + inpcb lock to lookup an inpcb. Although there are no known functions that similarly rely on the global inpcb list lock, also do list removal before destructive actions.
PR: 273890 Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D43122
show more ...
|
#
d2ef52ef |
| 04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp/hpts: make stacks responsible for clearing themselves out HPTS
There already is the tfb_tcp_timer_stop_all method that is supposed to stop all time events associated with a given tcpcb by given
tcp/hpts: make stacks responsible for clearing themselves out HPTS
There already is the tfb_tcp_timer_stop_all method that is supposed to stop all time events associated with a given tcpcb by given stack. Some time ago it was doing actual callout_stop(). Today bbr/rack just mark their internal state as inactive in their tfb_tcp_timer_stop_all methods, but tcpcb stays in HPTS wheel and potentially called in from HPTS. Change the methods to also call tcp_hpts_remove(). Note: I'm not sure if internal flag is still relevant once we are out of HPTS wheel.
Call the method when connection goes into TCP_CLOSED state, instead of calling it later when tcpcb is freed. Also call it when we switch between stacks.
Reviewed by: tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42857
show more ...
|
#
f42518ff |
| 30-Nov-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: for LRD move sysctl from tcp.do_lrd tp tcp.sack.lrd, remove sockopt
Moving lrd sysctl to the tcp.sack branch, since LRD only works with SACK. Remove the sockopt to programmatically control LRD
tcp: for LRD move sysctl from tcp.do_lrd tp tcp.sack.lrd, remove sockopt
Moving lrd sysctl to the tcp.sack branch, since LRD only works with SACK. Remove the sockopt to programmatically control LRD per session.
Reviewed By: #transport, tuexen, rrs Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D42851
show more ...
|
#
cfb1e929 |
| 30-Nov-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: don't malloc/free sockaddr memory on accept(2)
Let the accept functions provide stack memory for protocols to fill it in. Generic code should provide sockaddr_storage, specialized code may
sockets: don't malloc/free sockaddr memory on accept(2)
Let the accept functions provide stack memory for protocols to fill it in. Generic code should provide sockaddr_storage, specialized code may provide smaller structure.
While rewriting accept(2) make 'addrlen' a true in/out parameter, reporting required length in case if provided length was insufficient. Our manual page accept(2) and POSIX don't explicitly require that, but one can read the text as they do. Linux also does that. Update tests accordingly.
Reviewed by: rscheff, tuexen, zlei, dchagin Differential Revision: https://reviews.freebsd.org/D42635
show more ...
|
#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl s
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
show more ...
|
#
70e30add |
| 17-Nov-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove extraneous network epoch entry
accept(2) on IPv6 TCP doesn't need epoch. Some leaf functions may need it, but they will enter accordingly, see sa6_recoverscope().
Reviewed by: rscheff
tcp: remove extraneous network epoch entry
accept(2) on IPv6 TCP doesn't need epoch. Some leaf functions may need it, but they will enter accordingly, see sa6_recoverscope().
Reviewed by: rscheff, tuexen (implicitly, see deleted XXXMT) Differential Revision: https://reviews.freebsd.org/D42634
show more ...
|
Revision tags: release/14.0.0 |
|
#
dc485b96 |
| 22-Aug-2023 |
Marius Strobl <marius@FreeBSD.org> |
tcp_info: Add and export more FreeBSD-specific fields
This change adds struct tcp_info fields corresponding to the following struct tcpcb ones: - snd_una - snd_max - rcv_numsacks - rcv_adv - dupacks
tcp_info: Add and export more FreeBSD-specific fields
This change adds struct tcp_info fields corresponding to the following struct tcpcb ones: - snd_una - snd_max - rcv_numsacks - rcv_adv - dupacks
Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are extended accordingly, no counterpart of rcv_numsacks is available in the cxgbe(4) TOE PCB, though.
Sponsored by: NetApp, Inc. (originally)
show more ...
|
#
8c6104c4 |
| 22-Aug-2023 |
Marius Strobl <marius@FreeBSD.org> |
tcp_fill_info(): Change lock assertion on INPCB to locked only
This function actually only ever reads from the TCP PCB. Consequently, also make the pointer to its TCP PCB parameter const.
Sponsored
tcp_fill_info(): Change lock assertion on INPCB to locked only
This function actually only ever reads from the TCP PCB. Consequently, also make the pointer to its TCP PCB parameter const.
Sponsored by: NetApp, Inc. (originally)
show more ...
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
de0a2eb2 |
| 23-Jun-2023 |
Mark Johnston <markj@FreeBSD.org> |
tcp: Disallow connecting a disconnected socket
Currently nothing prevents tcp_usr_connect() from attempting to connect when the socket has been disconnected. At the moment, doing so triggers an ass
tcp: Disallow connecting a disconnected socket
Currently nothing prevents tcp_usr_connect() from attempting to connect when the socket has been disconnected. At the moment, doing so triggers an assertion in in_pcbconnect() because inp_faddr is not unspecified. I believe this may have been caught in the past by TIMEWAIT checks, but those are now removed.
Check for additional socket states in tcp_connect().
Reported by: syzbot+f0f7871ec5397602b446@syzkaller.appspotmail.com Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D40579
show more ...
|
#
04682968 |
| 20-Jun-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: expose AccECN mode and TCP FastOpen (TFO) in TCPI
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D40621
|
#
c2399dd2 |
| 06-May-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve BBLoging for PRUs
Log all errors for PRUs, except when INP_DROPPED is set. In that case, don't log it.
Reviewed by: glebius, rrs Sponsored by: Netflix, Inc. Differential Revision: ht
tcp: improve BBLoging for PRUs
Log all errors for PRUs, except when INP_DROPPED is set. In that case, don't log it.
Reviewed by: glebius, rrs Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D39591
show more ...
|
#
c2a69e84 |
| 25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: move HPTS related fields from inpcb to tcpcb
This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME
tcp_hpts: move HPTS related fields from inpcb to tcpcb
This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME-WAIT state (see 0d7445193ab), that used to free a tcpcb, while the associated connection is still on the HPTS ring.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39697
show more ...
|
#
66fbc19f |
| 07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: pass tcpcb in the tfb_tcp_ctloutput() method instead of inpcb
Just matches rest of the KPI.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39435
|
#
945f9a7c |
| 07-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: misc cleanup of options for rack as well as socket option logging.
Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Atta
tcp: misc cleanup of options for rack as well as socket option logging.
Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Attack Detection) algorithm that should be made available. Also there is a t_maxpeak_rate that needs to be removed (its un-used).
Reviewed by: tuexen, cc Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39427
show more ...
|
Revision tags: release/13.2.0 |
|
#
73ee5756 |
| 01-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.
So stack switching as always been a bit of a issue. We currently use
Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.
So stack switching as always been a bit of a issue. We currently use a break before make setup which means that if something goes wrong you have to try to get back to a stack. This patch among a lot of other things changes that so that it is a make before break. We also expand some of the function blocks in prep for new features in rack that will allow more controlled pacing. We also add other abilities such as the pathway for a stack to query a previous stack to acquire from it critical state information so things in flight don't get dropped or mis-handled when switching stacks. We also add the concept of a timer granularity. This allows an alternate stack to change from the old ticks granularity to microseconds and of course this even gives us a pathway to go to nanosecond timekeeping if we need to (something for the data center to consider for sure).
Once all this lands I will then update rack to begin using all these new features.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39210
show more ...
|
#
69c7c811 |
| 16-Mar-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and t
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints we need to move to using the new inline functions. This adds them and moves rack to now use the tcp_tracepoints.
Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D38831
show more ...
|
#
399a5655 |
| 28-Feb-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Make TCP PCAP buffer properly configurable.
Reviewed By: tuexen, cc, #transport MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D38824
|
#
453aa7fa |
| 23-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: ensure the tcpcb is not NULL when logging an event
When calling tcp_bblog_pru() on some error paths, tp is NULL, therefore handle it.
Sponsored by: Netflix, Inc.
|
#
4065becf |
| 21-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
bblog: unbreak build
Ensure that tp is always declared and set.
Reported by: Michael Butler Sponsored by: Netflix, Inc.
|
#
00812bbd |
| 21-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
bblog: add logging of protocol user requests
This information was available in trpt and is useful. So provide a way to get this information via TCP BBLog.
Reviewed by: rscheff@ Sponsored by: Netf
bblog: add logging of protocol user requests
This information was available in trpt and is useful. So provide a way to get this information via TCP BBLog.
Reviewed by: rscheff@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D38701
show more ...
|
#
cda6bdba |
| 17-Feb-2023 |
John Baldwin <jhb@FreeBSD.org> |
tcp: Don't try to disconnect a socket multiple times.
When the checks for INP_TIMEWAIT were removed, tcp_usr_close() and tcp_usr_disconnect() were no longer prevented from calling tcp_disconnect() o
tcp: Don't try to disconnect a socket multiple times.
When the checks for INP_TIMEWAIT were removed, tcp_usr_close() and tcp_usr_disconnect() were no longer prevented from calling tcp_disconnect() on a socket that was already disconnected. This triggered a panic in cxgbe(4) for TOE where the tcp_disconnect() on an already-disconnected socket invoked tcp_output() on a socket that was already in time-wait.
Reviewed by: rrs, np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D37112
show more ...
|
#
96871af0 |
| 15-Feb-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
inpcb: use family specific sockaddr argument for bind functions
Do the cast from sockaddr to either IPv4 or IPv6 sockaddr in the protocol's pr_bind method and from there on go down the call stack wi
inpcb: use family specific sockaddr argument for bind functions
Do the cast from sockaddr to either IPv4 or IPv6 sockaddr in the protocol's pr_bind method and from there on go down the call stack with family specific argument.
Reviewed by: zlei, melifaro, markj Differential Revision: https://reviews.freebsd.org/D38601
show more ...
|