#
605a0066 |
| 15-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp bbr: improve code consistency
Improve code consistency with the RACK stack. Reviewed by: gallatin, rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews
tcp bbr: improve code consistency
Improve code consistency with the RACK stack. Reviewed by: gallatin, rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44800
show more ...
|
#
dd7b86e2 |
| 18-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that declaration of the macro in tcp_var.h depended on a kernel option. It is a bad practice to create such definitions in installable headers.
Reviewed by: rscheff, tuexen, kib Differential Revision: https://reviews.freebsd.org/D44362
show more ...
|
#
e18b97bd |
| 12-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal w
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal with the changes that have been in hpts for a while i.e. only one call no matter if mbuf queue or tcp_output.
It basically does little except BBlogs and is a placemark for future work on doing path capacity measurements.
With a bit of a struggle with git I finally got rack_pcm.c into place (apologies for not noticing this error). The LINT kernel is running on my box now .. sigh.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision:https://reviews.freebsd.org/D43986
show more ...
|
#
c112243f |
| 11-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
Revert "Update to bring the rack stack with all its fixes in."
This commit was incomplete and breaks LINT kernels. The tree has been broken for 8+ hours.
This reverts commit f6d489f402c320f1a6eaa4
Revert "Update to bring the rack stack with all its fixes in."
This commit was incomplete and breaks LINT kernels. The tree has been broken for 8+ hours.
This reverts commit f6d489f402c320f1a6eaa473491a0b8c3878113e.
show more ...
|
#
f6d489f4 |
| 11-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal w
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal with the changes that have been in hpts for a while i.e. only one call no matter if mbuf queue or tcp_output.
Note there is a new file that I can't figure out how to get in rack_pcm.c
It basically does little except BBlogs and is a placemark for future work on doing path capacity measurements.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision:https://reviews.freebsd.org/D43986
show more ...
|
Revision tags: release/13.3.0 |
|
#
2f4e46df |
| 16-Feb-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
RACK, BBR: handle EACCES like EPERM for IP output handling
The FreeBSD TCP base stack handles them also the same way. In case of packet filters dropping packets in the output path, this avoids retra
RACK, BBR: handle EACCES like EPERM for IP output handling
The FreeBSD TCP base stack handles them also the same way. In case of packet filters dropping packets in the output path, this avoids retranmitting the dropped packet every 10ms or so.
Reviewed by: rscheff MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D43773
show more ...
|
#
7b0b448b |
| 27-Dec-2023 |
Gordon Bergling <gbe@FreeBSD.org> |
tcp_stacks: Fix two typos in a source code comments
- s/recieved/received/
MFC after: 3 days
|
#
d2ef52ef |
| 04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp/hpts: make stacks responsible for clearing themselves out HPTS
There already is the tfb_tcp_timer_stop_all method that is supposed to stop all time events associated with a given tcpcb by given
tcp/hpts: make stacks responsible for clearing themselves out HPTS
There already is the tfb_tcp_timer_stop_all method that is supposed to stop all time events associated with a given tcpcb by given stack. Some time ago it was doing actual callout_stop(). Today bbr/rack just mark their internal state as inactive in their tfb_tcp_timer_stop_all methods, but tcpcb stays in HPTS wheel and potentially called in from HPTS. Change the methods to also call tcp_hpts_remove(). Note: I'm not sure if internal flag is still relevant once we are out of HPTS wheel.
Call the method when connection goes into TCP_CLOSED state, instead of calling it later when tcpcb is freed. Also call it when we switch between stacks.
Reviewed by: tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42857
show more ...
|
#
2b3a7746 |
| 04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
hpts: make stacks responsible for tcp_hpts_init()
Those stacks that use HPTS should care about init, not generic code.
Reviewed by: imp, tuexen, rrs Differential Revision: https://reviews.freebsd.
hpts: make stacks responsible for tcp_hpts_init()
Those stacks that use HPTS should care about init, not generic code.
Reviewed by: imp, tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42856
show more ...
|
Revision tags: release/14.0.0 |
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
ab65c64b |
| 26-Jul-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: fix handling of <RST,ACK> segments in SYN-RCVD for RACK and BBR
This deals with TCP endpoints in the SYN-RCVD state coming from the SYN-SENT state.
Reviewed by: rscheff MFC after: 3 days Spo
tcp: fix handling of <RST,ACK> segments in SYN-RCVD for RACK and BBR
This deals with TCP endpoints in the SYN-RCVD state coming from the SYN-SENT state.
Reviewed by: rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D41203
show more ...
|
#
02b885b0 |
| 21-Jun-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: fix TCP MD5 computation for the BBR and RACK stack
PR: 253096 Reviewed by: cc, rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D405
tcp: fix TCP MD5 computation for the BBR and RACK stack
PR: 253096 Reviewed by: cc, rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D40597
show more ...
|
#
7ea8d027 |
| 20-Jun-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
Update various sys/netinet source files to conform with the style(9) guide on how to label FALLTHOUGH in switch statements.
No functional chance.
Reviewed By: tuexen, cc, #transport Sponsored by:
Update various sys/netinet source files to conform with the style(9) guide on how to label FALLTHOUGH in switch statements.
No functional chance.
Reviewed By: tuexen, cc, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D40622
show more ...
|
#
43b117f8 |
| 06-Jun-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: make the maximum number of retransmissions tunable per VNET
Both Windows (TcpMaxDataRetransmissions) and Linux (tcp_retries2) allow to restrict the maximum number of consecutive timer based ret
tcp: make the maximum number of retransmissions tunable per VNET
Both Windows (TcpMaxDataRetransmissions) and Linux (tcp_retries2) allow to restrict the maximum number of consecutive timer based retransmissions. Add that same capability on a per-VNet basis to FreeBSD.
Reviewed By: cc, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D40424
show more ...
|
#
c3c20de3 |
| 25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: move HPTS/LRO flags out of inpcb to tcpcb
These flags are TCP specific. While here, make also several LRO internal functions to pass tcpcb pointer instead of inpcb one.
Reviewed by: rrs Diff
tcp: move HPTS/LRO flags out of inpcb to tcpcb
These flags are TCP specific. While here, make also several LRO internal functions to pass tcpcb pointer instead of inpcb one.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39698
show more ...
|
#
c2a69e84 |
| 25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: move HPTS related fields from inpcb to tcpcb
This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME
tcp_hpts: move HPTS related fields from inpcb to tcpcb
This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME-WAIT state (see 0d7445193ab), that used to free a tcpcb, while the associated connection is still on the HPTS ring.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39697
show more ...
|
#
2ad584c5 |
| 17-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: Inconsistent use of hpts_calling flag
Gleb has noticed there were some inconsistency's in the way the inp_hpts_calls flag was being used. One such inconsistency results in a bug when we can't a
tcp: Inconsistent use of hpts_calling flag
Gleb has noticed there were some inconsistency's in the way the inp_hpts_calls flag was being used. One such inconsistency results in a bug when we can't allocate enough sendmap entries to entertain a call to rack_output().. basically a timer won't get started like it should. Also in cleaning this up I find that the "no_output" side of input needs to be adjusted to make sure we don't try to re-pace too quickly outside the hpts assurance of 250useconds.
Another thing here is we end up with duplicate calls to tcp_output() which we should not. If packets go from hpts for processing the input side of tcp will call the output side of tcp on the last packet if it is needed. This means that when that occurs a second call to tcp_output would be made that is not needed and if pacing is going on may be harmful.
Lets fix all this and explicitly state the contract that hpts is making with transports that care about the flag.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc Differential Revision:https://reviews.freebsd.org/D39653
show more ...
|
#
a540cdca |
| 17-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: use queue(9) STAILQ for the input queue
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39574
|
#
3cc7b667 |
| 14-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: stack unloading crash in rack and bbr
Its possible to induce a crash in either rack or bbr. This would be done if the rack stack were say the default and bbr was being used by a connection. If
tcp: stack unloading crash in rack and bbr
Its possible to induce a crash in either rack or bbr. This would be done if the rack stack were say the default and bbr was being used by a connection. If the bbr stack is then unloaded and it was active, we will trigger a MPASS assert in tcp_hpts since the new stack (default rack) would start a timer, and the old stack (bbr) would have the inp already in hpts.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision:https://reviews.freebsd.org/D39576
show more ...
|
#
66fbc19f |
| 07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: pass tcpcb in the tfb_tcp_ctloutput() method instead of inpcb
Just matches rest of the KPI.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39435
|
#
35bc0bcc |
| 07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: reduce argument list to functions that pass a segment
The socket argument is superfluous, as a tcpcb always has one and only one socket.
Reviewed by: rrs Differential Revision: https://review
tcp: reduce argument list to functions that pass a segment
The socket argument is superfluous, as a tcpcb always has one and only one socket.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39434
show more ...
|
#
945f9a7c |
| 07-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: misc cleanup of options for rack as well as socket option logging.
Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Atta
tcp: misc cleanup of options for rack as well as socket option logging.
Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Attack Detection) algorithm that should be made available. Also there is a t_maxpeak_rate that needs to be removed (its un-used).
Reviewed by: tuexen, cc Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39427
show more ...
|
Revision tags: release/13.2.0 |
|
#
73ee5756 |
| 01-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.
So stack switching as always been a bit of a issue. We currently use
Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.
So stack switching as always been a bit of a issue. We currently use a break before make setup which means that if something goes wrong you have to try to get back to a stack. This patch among a lot of other things changes that so that it is a make before break. We also expand some of the function blocks in prep for new features in rack that will allow more controlled pacing. We also add other abilities such as the pathway for a stack to query a previous stack to acquire from it critical state information so things in flight don't get dropped or mis-handled when switching stacks. We also add the concept of a timer granularity. This allows an alternate stack to change from the old ticks granularity to microseconds and of course this even gives us a pathway to go to nanosecond timekeeping if we need to (something for the data center to consider for sure).
Once all this lands I will then update rack to begin using all these new features.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39210
show more ...
|
#
69c7c811 |
| 16-Mar-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and t
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints we need to move to using the new inline functions. This adds them and moves rack to now use the tcp_tracepoints.
Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D38831
show more ...
|
#
c0e4090e |
| 08-Feb-2023 |
Andrew Gallatin <gallatin@FreeBSD.org> |
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLSe, we set a flag in the tx socket buffer (SB_TLS_IFNET)
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLSe, we set a flag in the tx socket buffer (SB_TLS_IFNET) to indicate ifnet kTLS. This flag meant that now, or in the past, ifnet ktls was active on a socket. Later, I added code to switch ifnet ktls sessions to software in the case of lossy TCP connections that have a high retransmit rate. Because TCP was using SB_TLS_IFNET to know if it needed to do math to calculate the retransmit ratio and potentially call into ktls_disable_ifnet(), it was doing unneeded work long after a session was moved to software.
This patch carefully tracks whether or not ifnet ktls is still enabled on a TCP connection. Because the inp is now embedded in the tcpcb, and because TCP is the most frequent accessor of this state, it made sense to move this from the socket buffer flags to the tcpcb. Because we now need reliable access to the tcbcb, we take a ref on the inp when creating a tx ktls session.
While here, I noticed that rack/bbr were incorrectly implementing tfb_hwtls_change(), and applying the change to all pending sends, when it should apply only to future sends.
This change reduces spurious calls to ktls_disable_ifnet() by 95% or so in a Netflix CDN environment.
Reviewed by: markj, rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D38380
show more ...
|