#
43b117f8 |
| 06-Jun-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: make the maximum number of retransmissions tunable per VNET
Both Windows (TcpMaxDataRetransmissions) and Linux (tcp_retries2) allow to restrict the maximum number of consecutive timer based ret
tcp: make the maximum number of retransmissions tunable per VNET
Both Windows (TcpMaxDataRetransmissions) and Linux (tcp_retries2) allow to restrict the maximum number of consecutive timer based retransmissions. Add that same capability on a per-VNet basis to FreeBSD.
Reviewed By: cc, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D40424
show more ...
|
#
57a3a161 |
| 24-May-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: request tracking is not http specific.
This change is a name change only. TCP Request tracking can track sendfile and even non-sendfile requests. The names however in the current code use http,
tcp: request tracking is not http specific.
This change is a name change only. TCP Request tracking can track sendfile and even non-sendfile requests. The names however in the current code use http, and they should not. The feature is not http specific. Lets change the name so they more properly reflect whats going on. This also fixes conflicts with http_req which caused application pain.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision:https://reviews.freebsd.org/D40229
show more ...
|
#
ec6d620b |
| 19-May-2023 |
Randall Stewart <rrs@FreeBSD.org> |
There are congestion control algorithms will that pull in srtt, and this can cause issues with rack.
When using rack, cubic and htcp will grab the srtt, but they think it is in ticks. For rack it is
There are congestion control algorithms will that pull in srtt, and this can cause issues with rack.
When using rack, cubic and htcp will grab the srtt, but they think it is in ticks. For rack it is in micro-seconds (which we should probably move all stacks to actually). This causes issues so instead lets make a new interface so that any CC module can pull the srtt in whatever granularity they want.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision:https://reviews.freebsd.org/D40146
show more ...
|
#
c3c20de3 |
| 25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: move HPTS/LRO flags out of inpcb to tcpcb
These flags are TCP specific. While here, make also several LRO internal functions to pass tcpcb pointer instead of inpcb one.
Reviewed by: rrs Diff
tcp: move HPTS/LRO flags out of inpcb to tcpcb
These flags are TCP specific. While here, make also several LRO internal functions to pass tcpcb pointer instead of inpcb one.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39698
show more ...
|
#
c2a69e84 |
| 25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: move HPTS related fields from inpcb to tcpcb
This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME
tcp_hpts: move HPTS related fields from inpcb to tcpcb
This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME-WAIT state (see 0d7445193ab), that used to free a tcpcb, while the associated connection is still on the HPTS ring.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39697
show more ...
|
#
8aa2be69 |
| 20-Apr-2023 |
Cheng Cui <cc@FreeBSD.org> |
Correct the value of macro TF2_TCP_ACCOUNTING.
Summary: Make sure the values are in order.
Reviewers: rscheff, tuexen, #transport! Approved by: rscheff, tuexen, glebius Subscribers: imp, melifaro,
Correct the value of macro TF2_TCP_ACCOUNTING.
Summary: Make sure the values are in order.
Reviewers: rscheff, tuexen, #transport! Approved by: rscheff, tuexen, glebius Subscribers: imp, melifaro, glebius Differential Revision: https://reviews.freebsd.org/D39716
show more ...
|
#
a540cdca |
| 17-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: use queue(9) STAILQ for the input queue
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39574
|
#
66fbc19f |
| 07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: pass tcpcb in the tfb_tcp_ctloutput() method instead of inpcb
Just matches rest of the KPI.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39435
|
#
35bc0bcc |
| 07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: reduce argument list to functions that pass a segment
The socket argument is superfluous, as a tcpcb always has one and only one socket.
Reviewed by: rrs Differential Revision: https://review
tcp: reduce argument list to functions that pass a segment
The socket argument is superfluous, as a tcpcb always has one and only one socket.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39434
show more ...
|
#
de4368dd |
| 07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: retire tfb_tcp_hpts_do_segment()
Isn't in use anymore. Correct comments that mention it.
Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39433
|
#
945f9a7c |
| 07-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: misc cleanup of options for rack as well as socket option logging.
Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Atta
tcp: misc cleanup of options for rack as well as socket option logging.
Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack has an experimental SaD (Sack Attack Detection) algorithm that should be made available. Also there is a t_maxpeak_rate that needs to be removed (its un-used).
Reviewed by: tuexen, cc Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39427
show more ...
|
Revision tags: release/13.2.0 |
|
#
73ee5756 |
| 01-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.
So stack switching as always been a bit of a issue. We currently use
Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.
So stack switching as always been a bit of a issue. We currently use a break before make setup which means that if something goes wrong you have to try to get back to a stack. This patch among a lot of other things changes that so that it is a make before break. We also expand some of the function blocks in prep for new features in rack that will allow more controlled pacing. We also add other abilities such as the pathway for a stack to query a previous stack to acquire from it critical state information so things in flight don't get dropped or mis-handled when switching stacks. We also add the concept of a timer granularity. This allows an alternate stack to change from the old ticks granularity to microseconds and of course this even gives us a pathway to go to nanosecond timekeeping if we need to (something for the data center to consider for sure).
Once all this lands I will then update rack to begin using all these new features.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39210
show more ...
|
#
69c7c811 |
| 16-Mar-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and t
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints we need to move to using the new inline functions. This adds them and moves rack to now use the tcp_tracepoints.
Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D38831
show more ...
|
Revision tags: release/12.4.0, release/13.1.0, release/12.3.0 |
|
#
2f201df1 |
| 20-Jul-2021 |
Alfonso <gfunni234@gmail.com> |
Change hw_tls to a bool
Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/512
|
#
624de4ec |
| 22-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: remove unused function prototype
tcp_trace was implemented in tcp_debug.c, which was removed recently.
Reviewed by: rscheff@, zlei@ Sponsored by: Netflix, Inc. Differential Revision: https:/
tcp: remove unused function prototype
tcp_trace was implemented in tcp_debug.c, which was removed recently.
Reviewed by: rscheff@, zlei@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D38712
show more ...
|
#
76578d60 |
| 21-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
bblog: improve timeout event handling
Extend the BBLog RTO event to deal with all timers of the base stack. Also provide information about starting, stopping, and running off. The expiration of the
bblog: improve timeout event handling
Extend the BBLog RTO event to deal with all timers of the base stack. Also provide information about starting, stopping, and running off. The expiration of the retransmission timer is reported as it was done before.
Reviewed by: rscheff@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D38710
show more ...
|
#
6b802933 |
| 21-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: rearrange enum and remove unused variable
Rearrange the enum tt_which such that TT_REXMIT is 0. This allows an extension of the BBLog event RTO in a backwards compatible way. Remove tcptimers,
tcp: rearrange enum and remove unused variable
Rearrange the enum tt_which such that TT_REXMIT is 0. This allows an extension of the BBLog event RTO in a backwards compatible way. Remove tcptimers, which was only used in trpt, a utility removed from the source tree recently.
Reviewed by: glebius@, guest-ccui@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D38547
show more ...
|
#
c0e4090e |
| 08-Feb-2023 |
Andrew Gallatin <gallatin@FreeBSD.org> |
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLSe, we set a flag in the tx socket buffer (SB_TLS_IFNET)
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLSe, we set a flag in the tx socket buffer (SB_TLS_IFNET) to indicate ifnet kTLS. This flag meant that now, or in the past, ifnet ktls was active on a socket. Later, I added code to switch ifnet ktls sessions to software in the case of lossy TCP connections that have a high retransmit rate. Because TCP was using SB_TLS_IFNET to know if it needed to do math to calculate the retransmit ratio and potentially call into ktls_disable_ifnet(), it was doing unneeded work long after a session was moved to software.
This patch carefully tracks whether or not ifnet ktls is still enabled on a TCP connection. Because the inp is now embedded in the tcpcb, and because TCP is the most frequent accessor of this state, it made sense to move this from the socket buffer flags to the tcpcb. Because we now need reliable access to the tcbcb, we take a ref on the inp when creating a tx ktls session.
While here, I noticed that rack/bbr were incorrectly implementing tfb_hwtls_change(), and applying the change to all pending sends, when it should apply only to future sends.
This change reduces spurious calls to ktls_disable_ifnet() by 95% or so in a Netflix CDN environment.
Reviewed by: markj, rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D38380
show more ...
|
#
9aff05bb |
| 07-Feb-2023 |
John Baldwin <jhb@FreeBSD.org> |
tcp_var.h: Fix spelling of independent in comment
|
#
18b83b62 |
| 26-Jan-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: reduce the size of t_rttupdated in tcpcb
During tcp session start, various mechanisms need to track a few initial RTTs before becoming active. Prevent overflows of the corresponding tracking co
tcp: reduce the size of t_rttupdated in tcpcb
During tcp session start, various mechanisms need to track a few initial RTTs before becoming active. Prevent overflows of the corresponding tracking counter and reduce the size of tcpcb simultaneously.
Reviewed By: #transport, tuexen, guest-ccui Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D21117
show more ...
|
#
446ccdd0 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: use single locked callout per tcpcb for the TCP timers
Use only one callout structure per tcpcb that is responsible for handling all five TCP timeouts. Use locked version of callout, of course
tcp: use single locked callout per tcpcb for the TCP timers
Use only one callout structure per tcpcb that is responsible for handling all five TCP timeouts. Use locked version of callout, of course. The callout function tcp_timer_enter() chooses soonest timer and executes it with lock held. Unless the timer reports that the tcpcb has been freed, the callout is rescheduled for next soonest timer, if there is any.
With single callout per tcpcb on connection teardown we should be able to fully stop the callout and immediately free it, avoiding use of callout_async_drain(). There is one gotcha here: callout_stop() can actually touch our memory when a rare race condition happens. See comment above tcp_timer_stop(). Synchronous stop of the callout makes tcp_discardcb() the single entry point for tcpcb destructor, merging the tcp_freecb() to the end of the function.
While here, also remove lots of lingering checks in the beginning of TCP timer functions. With a locked callout they are unnecessary.
While here, clean unused parts of timer KPI for the pluggable TCP stacks.
While here, remove TCPDEBUG from tcp_timer.c, as this allows for more simplification of TCP timers. The TCPDEBUG is scheduled for removal.
Move the DTrace probes in timers to the beginning of a function, where a tcpcb is always existing.
Discussed with: rrs, tuexen, rscheff (the TCP part of the diff) Reviewed by: hselasky, kib, mav (the callout part) Differential revision: https://reviews.freebsd.org/D37321
show more ...
|
#
918fa422 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove tcp_timer_suspend()
It was a temporary code added together with RACK to fight against TCP timer races.
|
#
e68b3792 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage specify allocation size that would provide space to most of the data a TCP connection needs, embedding into struct tcpcb several struct
tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage specify allocation size that would provide space to most of the data a TCP connection needs, embedding into struct tcpcb several structures, that previously were allocated separately.
The most import one is the inpcb itself. With embedding we can provide strong guarantee that with a valid TCP inpcb the tcpcb is always valid and vice versa. Also we reduce number of allocs/frees per connection. The embedded inpcb is placed in the beginning of the struct tcpcb, since in_pcballoc() requires that. However, later we may want to move it around for cache line efficiency, and this can be done with a little effort. The new intotcpcb() macro is ready for such move.
The congestion algorithm data, the TCP timers and osd(9) data are also embedded into tcpcb, and temprorary struct tcpcb_mem goes away. There was no extra allocation here, but we went through extra pointer every time we accessed this data.
One interesting side effect is that now TCP data is allocated from SMR-protected zone. Potentially this allows the TCP stacks or other TCP related modules to utilize that for their own synchronization.
Large part of the change was done with sed script:
s/tp->ccv->/tp->t_ccv./g s/tp->ccv/\&tp->t_ccv/g s/tp->cc_algo/tp->t_cc/g s/tp->t_timers->tt_/tp->tt_/g s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g
Dependency side effect is that code that needs to know struct tcpcb should also know struct inpcb, that added several <netinet/in_pcb.h>.
Differential revision: https://reviews.freebsd.org/D37127
show more ...
|
#
bd4f9866 |
| 16-Nov-2022 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: remove unused t_rttbest
No functional change intended.
Reviewed by: rscheff@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D37401
|
#
1a70101a |
| 10-Nov-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: account sent/received IP ECN markings independently
Have tcpstats (netstat -s) differentiate between received and sent ECN-marked packets. Also account for IP ECN bits (on TCP packets) even whe
tcp: account sent/received IP ECN markings independently
Have tcpstats (netstat -s) differentiate between received and sent ECN-marked packets. Also account for IP ECN bits (on TCP packets) even when the tcp session has not negotiated ECN support.
Event: IETF 115 Hackathon Reviewed By: glebius, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D37314
show more ...
|