#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree using automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
|
Revision tags: release/14.0.0 |
|
#
2ff63af9 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
|
Revision tags: release/13.2.0 |
|
#
6b802933 |
| 21-Feb-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: rearrange enum and remove unused variable
Rearrange the enum tt_which such that TT_REXMIT is 0. This allows an extension of the BBLog event RTO in a backwards compatible way. Remove tcptimers, which was only used in trpt, a utility removed from the source tree recently.
Reviewed by: glebius@, guest-ccui@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D38547
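One plausible reading of the backwards-compatibility remark, sketched in C: if a log record gains a "which timer" field, records written by older producers carry zero there, and with TT_REXMIT at 0 they still decode as the retransmit timer. Only "TT_REXMIT is 0" comes from the commit text; the other enum members, their order, struct rto_event and rto_event_timer_name() are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordering; only "TT_REXMIT is 0" is taken from the commit. */
typedef enum {
	TT_REXMIT = 0,
	TT_PERSIST,
	TT_KEEP,
	TT_2MSL,
	TT_DELACK,
	TT_N
} tt_which;

/* Hypothetical log record layout, not the real BBLog format. */
struct rto_event {
	unsigned int ticks;
	unsigned int which;	/* newer field; zero in records from old producers */
};

/* Map the timer field of a record to a printable name. */
static const char *
rto_event_timer_name(const struct rto_event *ev)
{
	static const char *const names[TT_N] = {
		"rexmit", "persist", "keep", "2msl", "delack"
	};

	assert(ev != NULL);
	return (ev->which < TT_N ? names[ev->which] : "unknown");
}
```

A zero-initialised record, as an older producer would emit, decodes as "rexmit", so existing consumers keep their meaning.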
|
#
446ccdd0 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: use single locked callout per tcpcb for the TCP timers
Use only one callout structure per tcpcb that is responsible for handling all five TCP timeouts. Use the locked version of callout, of course. The callout function tcp_timer_enter() chooses the soonest timer and executes it with the lock held. Unless the timer reports that the tcpcb has been freed, the callout is rescheduled for the next soonest timer, if there is any.
With a single callout per tcpcb, on connection teardown we should be able to fully stop the callout and immediately free it, avoiding use of callout_async_drain(). There is one gotcha here: callout_stop() can actually touch our memory when a rare race condition happens. See the comment above tcp_timer_stop(). A synchronous stop of the callout makes tcp_discardcb() the single entry point for the tcpcb destructor, merging tcp_freecb() into the end of the function.
While here, also remove lots of lingering checks in the beginning of TCP timer functions. With a locked callout they are unnecessary.
While here, clean unused parts of timer KPI for the pluggable TCP stacks.
While here, remove TCPDEBUG from tcp_timer.c, as this allows for more simplification of TCP timers. The TCPDEBUG is scheduled for removal.
Move the DTrace probes in timers to the beginning of a function, where a tcpcb always exists.
Discussed with: rrs, tuexen, rscheff (the TCP part of the diff)
Reviewed by: hselasky, kib, mav (the callout part)
Differential revision: https://reviews.freebsd.org/D37321
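The "choose the soonest timer, run it, reschedule for the next soonest" idea can be sketched in user-space C as below. This is an illustration under stated assumptions, not the kernel code: struct timers, TIMER_IDLE, soonest_timer() and timer_pass() are hypothetical names, and locking, the real callout(9) API and the "tcpcb has been freed" case are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TT_N 5			/* five TCP timers, per the commit text */
#define TIMER_IDLE INT64_MAX	/* hypothetical sentinel: timer not armed */

/* Hypothetical stand-in for the timer fields of a tcpcb. */
struct timers {
	int64_t deadline[TT_N];	/* absolute expiry time of each timer */
};

/*
 * Return the index of the armed timer with the earliest deadline,
 * or -1 if no timer is armed.
 */
static int
soonest_timer(const struct timers *t)
{
	int which = -1;
	int64_t best = TIMER_IDLE;

	assert(t != NULL);
	for (int i = 0; i < TT_N; i++) {
		if (t->deadline[i] < best) {
			best = t->deadline[i];
			which = i;
		}
	}
	return (which);
}

/*
 * One pass of the single callout: pick the soonest timer, disarm it
 * (standing in for running its handler), and return the deadline the
 * callout should be rescheduled for. TIMER_IDLE means "do not rearm".
 */
static int64_t
timer_pass(struct timers *t, int *fired)
{
	int which = soonest_timer(t);

	*fired = which;
	if (which < 0)
		return (TIMER_IDLE);
	t->deadline[which] = TIMER_IDLE;
	which = soonest_timer(t);
	return (which < 0 ? TIMER_IDLE : t->deadline[which]);
}
```

With deadlines armed at 300, 100 and 200 for timers 1, 3 and 4, successive passes fire timers 3, 4 and 1 and return 200, 300 and TIMER_IDLE as the next deadline.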
|
#
918fa422 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove tcp_timer_suspend()
It was a temporary code added together with RACK to fight against TCP timer races.
|
#
e68b3792 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: embed inpcb into tcpcb
For the TCP protocol's inpcb storage, specify an allocation size that provides space for most of the data a TCP connection needs, embedding into struct tcpcb several structures that were previously allocated separately.
The most important one is the inpcb itself. With embedding we can provide a strong guarantee that with a valid TCP inpcb the tcpcb is always valid and vice versa. We also reduce the number of allocs/frees per connection. The embedded inpcb is placed at the beginning of struct tcpcb, since in_pcballoc() requires that. However, later we may want to move it around for cache line efficiency, and this can be done with little effort. The new intotcpcb() macro is ready for such a move.
The congestion algorithm data, the TCP timers and osd(9) data are also embedded into tcpcb, and the temporary struct tcpcb_mem goes away. There was no extra allocation here, but we went through an extra pointer every time we accessed this data.
One interesting side effect is that TCP data is now allocated from an SMR-protected zone. Potentially this allows the TCP stacks or other TCP related modules to utilize that for their own synchronization.
A large part of the change was done with a sed script:
s/tp->ccv->/tp->t_ccv./g
s/tp->ccv/\&tp->t_ccv/g
s/tp->cc_algo/tp->t_cc/g
s/tp->t_timers->tt_/tp->tt_/g
s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g
A dependency side effect is that code that needs to know struct tcpcb must also know struct inpcb, which added several <netinet/in_pcb.h> includes.
Differential revision: https://reviews.freebsd.org/D37127
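The embedding described in this entry can be illustrated with a reduced C sketch. The struct contents, the t_inpcb member name and the tptoinpcb() helper are assumptions made for illustration, and the macro body is one possible way to write such a conversion, not necessarily the kernel's. Because it subtracts offsetof() rather than casting directly, it would stay correct if the embedded member were later moved away from the start of the struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct inpcb {
	int inp_flags;
};

struct tcpcb {
	struct inpcb t_inpcb;	/* embedded inpcb, placed first */
	int t_state;
};

/*
 * Recover the enclosing tcpcb from a pointer to its embedded inpcb.
 * Assumes 'inp' really points at the t_inpcb member of a struct tcpcb.
 */
#define intotcpcb(inp) \
	((struct tcpcb *)((char *)(inp) - offsetof(struct tcpcb, t_inpcb)))

/* The reverse direction is just the address of the member. */
static struct inpcb *
tptoinpcb(struct tcpcb *tp)
{
	assert(tp != NULL);
	return (&tp->t_inpcb);
}
```

Both conversions are pointer arithmetic only, with no separate allocation and no extra pointer to follow, which matches the entry's point about one allocation per connection.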
|
Revision tags: release/12.4.0 |
|
#
0d744519 |
| 07-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove tcptw, the compressed timewait state structure
The memory savings the tcptw brought back in 2003 (see 340c35de6a2) no longer justify the complexity required to maintain it. For a longer explanation please check out the email [1].
Surprisingly, through almost 20 years, the TCP stack's ability to handle the TIME_WAIT state with a normal tcpcb did not bitrot. The existing tcp_input() properly handles a tcpcb in TCPS_TIME_WAIT state, which is confirmed by the packetdrill tcp-testsuite [2].
This change just removes tcptw and leaves INP_TIMEWAIT. The flag will be removed in a separate commit. This makes it easier to review and possibly debug the changes.
[1] https://lists.freebsd.org/archives/freebsd-net/2022-January/001206.html
[2] https://github.com/freebsd-net/tcp-testsuite
Differential revision: https://reviews.freebsd.org/D36398
|
#
77198a94 |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_timers: provide tcp_timer_drop() and tcp_timer_close()
Two functions to call tcp_drop() and tcp_close() from a callout context. Garbage collect tcp_inpinfo_lock_del(), it has a single use now.
Differential revision: https://reviews.freebsd.org/D36397
|
#
08af8aac |
| 27-Sep-2022 |
Randall Stewart <rrs@FreeBSD.org> |
Tcp progress timeout
Rack has had the ability to automatically time out connections that just sit idle. This feature is, of course, off by default and requires the user to turn it on (though the socket option has been missing in tcp_usrreq.c). Let's get the progress timeout fully supported in the base stack as well as Rack.
Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D36716
|
Revision tags: release/13.1.0 |
|
#
c2c8e360 |
| 04-Dec-2021 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
tcp: virtualise net.inet.tcp.msl sysctl.
VNET teardown waits 2*MSL (60 seconds by default) before expiring tcp PCBs. These PCBs hold references to nexthops, which, in turn, reference ifnets. This chain results in VNET interfaces being destroyed and moved to the default VNET only after 60 seconds. Allow tcp_msl to be set in a jail by virtualising the net.inet.tcp.msl sysctl, permitting more predictable VNET test outcomes.
MFC after: 1 week
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D33270
|
Revision tags: release/12.3.0 |
|
#
ff945008 |
| 19-Nov-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Add tcp_freecb() - single place to free tcpcb.
Until this change there were two places where we would free a tcpcb: tcp_discardcb() if all timers are drained, and tcp_timer_discard() otherwise. They were pretty much copy-n-paste, except that in the default case we would run tcp_hc_update(). Merge this into a single function tcp_freecb(), move the new short version of tcp_timer_discard() to tcp_timer.c, and make it static.
Reviewed by: rrs, hselasky
Differential revision: https://reviews.freebsd.org/D32965
|
Revision tags: release/13.0.0 |
|
#
4c0bef07 |
| 21-Jan-2021 |
Kyle Evans <kevans@FreeBSD.org> |
kern: net: remove TCP_LINGERTIME
TCP_LINGERTIME can be traced back to BSD 4.4 Lite and perhaps beyond, in exactly the same form that it appears here modulo slightly different context. It used to be the case that there was a single pr_usrreq method with requests dispatched to it; these exact two lines appeared in tcp_usrreq's PRU_ATTACH handling.
The only purpose of this that I can find is to cause surprising behavior on accepted connections. Newly-created sockets will never hit these paths as one cannot set SO_LINGER prior to socket(2). If SO_LINGER is set on a listening socket and inherited, one would expect the timeout to be inherited rather than changed arbitrarily like this -- noting that SO_LINGER is nonsense on a listening socket beyond inheritance, since they cannot be 'connected' by definition.
Neither Illumos nor Linux reset the timer like this based on testing and inspection of Illumos, and testing of Linux.
Reviewed by: rscheff, tuexen
Differential Revision: https://reviews.freebsd.org/D28265
|
Revision tags: release/12.2.0, release/11.4.0 |
|
#
d7ca3f78 |
| 16-Apr-2020 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
Reduce default TCP delayed ACK timeout to 40ms.
Reviewed by: kbowling, tuexen
Approved by: tuexen (mentor)
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D23281
|
#
44e86fbd |
| 13-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357662 through r357854.
|
#
481be5de |
| 12-Feb-2020 |
Randall Stewart <rrs@FreeBSD.org> |
White space cleanup -- remove trailing tab's or spaces from any line.
Sponsored by: Netflix Inc.
|
#
334fc582 |
| 09-Jan-2020 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
vnet: virtualise more network stack sysctls.
Virtualise tcp_always_keepalive, TCP and UDP log_in_vain. All three are set in the netoptions startup script, which we would love to run for VNETs as well [1].
While virtualising the log_in_vain sysctls seems pointless at first for as long as the kernel message buffer is not virtualised, it at least allows an administrator to debug the base system or an individual jail if needed without turning the logging on for all jails running on a system.
PR: 243193 [1]
MFC after: 2 weeks
|
Revision tags: release/12.1.0, release/11.3.0 |
|
#
415e34c4 |
| 29-Mar-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead@r345677
|
#
0999766d |
| 23-Mar-2019 |
Michael Tuexen <tuexen@FreeBSD.org> |
Add sysctl variable net.inet.tcp.rexmit_initial for setting RTO.Initial used by TCP.
Reviewed by: rrs@, 0mp@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D19355
|
#
18b18078 |
| 25-Feb-2019 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r344527
|
#
a8fe8db4 |
| 25-Feb-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r344178 through r344512.
|
#
3b853844 |
| 20-Feb-2019 |
Michael Tuexen <tuexen@FreeBSD.org> |
Reduce the TCP initial retransmission timeout from 3 seconds to 1 second as allowed by RFC 6298.
Reviewed by: kbowling@, Richard Scheffenegger
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D18941
|
#
c6dcb64b |
| 20-Feb-2019 |
Michael Tuexen <tuexen@FreeBSD.org> |
Use exponential backoff for retransmitting SYN segments as specified in the TCP RFCs.
Reviewed by: rrs@, Richard Scheffenegger
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D18974
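A minimal sketch of exponential backoff for SYN retransmissions, assuming the timeout simply doubles per attempt up to a cap. The function name syn_rto_ms(), the RTO_MAX_MS cap of 64 seconds and the doubling loop are assumptions for illustration; the kernel's actual computation may differ (for example, it could use a lookup table of multipliers).

```c
#include <assert.h>
#include <stdint.h>

#define RTO_MAX_MS 64000u	/* hypothetical cap chosen for this sketch */

/*
 * Timeout in milliseconds to wait before retransmission number
 * 'attempt' (0 = first retransmission), starting from 'initial_ms'
 * and doubling each time until the cap is reached.
 */
static uint32_t
syn_rto_ms(uint32_t initial_ms, unsigned int attempt)
{
	uint32_t rto = initial_ms;

	assert(initial_ms > 0 && initial_ms <= RTO_MAX_MS);
	while (attempt-- > 0) {
		if (rto > RTO_MAX_MS / 2)
			return (RTO_MAX_MS);	/* doubling would exceed the cap */
		rto *= 2;
	}
	return (rto);
}
```

With an initial value of 1000 ms (the 1-second initial RTO mentioned in the neighbouring entry), the successive timeouts are 1000, 2000, 4000, 8000 ms and so on, saturating at the assumed cap.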
|
Revision tags: release/12.0.0 |
|
#
6573d758 |
| 04-Jul-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch(9): allow preemptible epochs to compose
- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument; there's no longer any benefit to dropping the pcbinfo lock, and trying to do so just adds an error-prone branchfest to these functions
- Remove cases of same-function recursion on the epoch, as recursing is no longer free
- Remove the TAILQ_ENTRY and epoch_section from struct thread, as the tracker field is now stack or heap allocated as appropriate
Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066
|
Revision tags: release/11.2.0 |
|
#
89e560f4 |
| 07-Jun-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This commit brings in a new refactored TCP stack called Rack. Rack includes the following features:
- A different SACK processing scheme (the old sack structures are not used).
- RACK (Recent acknowledgment), where counting dup-acks is no longer done; instead, time is used to know when to retransmit (see the I-D).
- TLP (Tail Loss Probe), where we probe for tail losses to try to avoid taking a retransmit time-out (see the I-D).
- Burst mitigation using TCPHPTS.
- PRR (Proportional Rate Reduction); see the RFC.
Once built into your kernel, you can select this stack either via a socket option, giving "rack" as the name of the stack, or by setting the global sysctl so that the default is rack.
Note that any connection that does not support SACK will be kicked back to the "default" base FreeBSD stack (currently known as "default").
To build this into your kernel, you will need to enable the following in your kernel configuration:
makeoptions WITH_EXTRA_TCP_STACKS=1
options TCPHPTS
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D15525
|
#
f1798531 |
| 31-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Export tcp_always_keepalive for use by the Chelsio TOM module.
This used to work by accident with ld.bfd even though always_keepalive was marked as static. LLD honors static more correctly, so export this variable properly (including moving it into the tcp_* namespace).
Reviewed by: bz, emaste
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D14129
|