#
8840ae22 |
| 08-Nov-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: don't store VNET in every tcpcb, take it from the inpcbinfo
Reviewed by: rscheff Differential revision: https://reviews.freebsd.org/D37125
|
#
9eb0e832 |
| 08-Nov-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: provide macros to access inpcb and socket from a tcpcb
There should be no functional changes with this commit.
Reviewed by: rscheff Differential revision: https://reviews.freebsd.org/D37123
|
#
dc9daa04 |
| 08-Nov-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: allow packets to be marked as ECT1 instead of ECT0
This adds the capability for a modular congestion control to select which variant of ECN-capable-transport it wants to use when sending out el
tcp: allow packets to be marked as ECT1 instead of ECT0
This adds the capability for a modular congestion control to select which variant of ECN-capable-transport it wants to use when sending out elegible segments. As an initial CC to utilize this, DCTCP was selected.
Event: IETF 115 Hackathon Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D24869
show more ...
|
#
37bf391d |
| 06-Nov-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: make tcp_packets_this_ack() only visible in kernel scope
|
#
b1258b76 |
| 06-Nov-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: add conservative d.cep accounting algorithm
Accurate ECN asks to conservatively estimate, when the ACE counter may have wrapped due to a single ACK covering a larger number of segments. This is
tcp: add conservative d.cep accounting algorithm
Accurate ECN asks to conservatively estimate, when the ACE counter may have wrapped due to a single ACK covering a larger number of segments. This is described in Annex A.2 of the accurate-ecn draft.
Event: IETF 115 Hackathon Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D37281
show more ...
|
#
c348e880 |
| 31-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: make tcp_handle_wakeup() static and robust
It is called only from tcp_input() and always has valid parameter.
Reviewed by: rscheff, tuexen Differential revision: https://reviews.freebsd.org/D
tcp: make tcp_handle_wakeup() static and robust
It is called only from tcp_input() and always has valid parameter.
Reviewed by: rscheff, tuexen Differential revision: https://reviews.freebsd.org/D37115
show more ...
|
#
83c1ec92 |
| 20-Oct-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: ECN preparations for ECN++, AccECN (tcp_respond)
tcp_respond is another function to build a tcp control packet quickly. With ECN++ and AccECN, both the IP ECN header, and the TCP ECN flags are
tcp: ECN preparations for ECN++, AccECN (tcp_respond)
tcp_respond is another function to build a tcp control packet quickly. With ECN++ and AccECN, both the IP ECN header, and the TCP ECN flags are supposed to reflect the correct state.
Also ensure that on receiving multiple ECN SYN-ACKs, the responses triggered will reflect the latest state.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D36973
show more ...
|
#
c3738466 |
| 19-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: style the struct tcpcb definition
- Use C99 types uintXX_t instead of u_intXX_t. - Try to make space/tab usage a little bit more consistent. - Shorten comments to fit into 80 chars.
Not a func
tcp: style the struct tcpcb definition
- Use C99 types uintXX_t instead of u_intXX_t. - Try to make space/tab usage a little bit more consistent. - Shorten comments to fit into 80 chars.
Not a functional change, just making future changes easier to read.
show more ...
|
#
0d744519 |
| 07-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove tcptw, the compressed timewait state structure
The memory savings the tcptw brought back in 2003 (see 340c35de6a2) no longer justify the complexity required to maintain it. For longer e
tcp: remove tcptw, the compressed timewait state structure
The memory savings the tcptw brought back in 2003 (see 340c35de6a2) no longer justify the complexity required to maintain it. For longer explanation please check out the email [1].
Surpisingly through almost 20 years the TCP stack functionality of handling the TIME_WAIT state with a normal tcpcb did not bitrot. The existing tcp_input() properly handles a tcpcb in TCPS_TIME_WAIT state, which is confirmed by the packetdrill tcp-testsuite [2].
This change just removes tcptw and leaves INP_TIMEWAIT. The flag will be removed in a separate commit. This makes it easier to review and possibly debug the changes.
[1] https://lists.freebsd.org/archives/freebsd-net/2022-January/001206.html [2] https://github.com/freebsd-net/tcp-testsuite
Differential revision: https://reviews.freebsd.org/D36398
show more ...
|
#
cd84e78f |
| 04-Oct-2022 |
Randall Stewart <rrs@FreeBSD.org> |
tcp idle reduce does not work for a server.
TCP has an idle-reduce feature that allows a connection to reduce its cwnd after it has been idle more than an RTT. This feature only works for a sending
tcp idle reduce does not work for a server.
TCP has an idle-reduce feature that allows a connection to reduce its cwnd after it has been idle more than an RTT. This feature only works for a sending side connection. It does this by at output checking the idle time (t_rcvtime vs ticks) to see if its more than the RTO timeout.
The problem comes if you are a web server. You get a request and then send out all the data.. then go idle. The next time you would send is in response to a request from the peer asking for more data. But the thing is you updated t_rcvtime when the request came in so you never reduce.
The fix is to do the idle reduce check also on inbound.
Reviewed by: tuexen, rscheff Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D36721
show more ...
|
#
775e20c1 |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: make tcp_drop_syn_sent() static
|
#
43d39ca7 |
| 04-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
netinet*: de-void control input IP protocol methods
After decoupling of protosw(9) and IP wire protocols in 78b1fc05b205 for IPv4 we got vector ip_ctlprotox[] that is executed only and only from icm
netinet*: de-void control input IP protocol methods
After decoupling of protosw(9) and IP wire protocols in 78b1fc05b205 for IPv4 we got vector ip_ctlprotox[] that is executed only and only from icmp_input() and respectively for IPv6 we got ip6_ctlprotox[] executed only and only from icmp6_input(). This allows to use protocol specific argument types in these methods instead of struct sockaddr and void.
Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36727
show more ...
|
#
08af8aac |
| 27-Sep-2022 |
Randall Stewart <rrs@FreeBSD.org> |
Tcp progress timeout
Rack has had the ability to timeout connections that just sit idle automatically. This feature of course is off by default and requires the user set it on (though the socket opt
Tcp progress timeout
Rack has had the ability to timeout connections that just sit idle automatically. This feature of course is off by default and requires the user set it on (though the socket option has been missing in tcp_usrreq.c). Lets get the progress timeout fully supported in the base stack as well as rack.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D36716
show more ...
|
#
e5049a17 |
| 26-Sep-2022 |
Randall Stewart <rrs@FreeBSD.org> |
TCP rack does not work properly with cubic.
Right now if you use rack with cubic (the new default cc) you will have improper results. This is because rack uses different variables than the base stac
TCP rack does not work properly with cubic.
Right now if you use rack with cubic (the new default cc) you will have improper results. This is because rack uses different variables than the base stack (or bbr) and thus tcp_compute_pipe() always returns so that cubic will choose a 30% backoff not the 50% backoff it should when it is newreno compatibility mode. The fix is to allow a stack (rack) to override its own compute_pipe.
Reviewed by: tuexen, rscheff Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D36711
show more ...
|
#
493105c2 |
| 21-Sep-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: fix simultaneous open and refine e80062a2d43
- The soisconnected() call on transition from SYN_RCVD to ESTABLISHED is also necessary for a half-synchronized connection. Fix that just setti
tcp: fix simultaneous open and refine e80062a2d43
- The soisconnected() call on transition from SYN_RCVD to ESTABLISHED is also necessary for a half-synchronized connection. Fix that just setting the flag, when we transfer SYN-SENT -> SYN-RECEIVED. - Provide a comment that explains at what conditions the call to soisconnected() is necessary. - Hence mechanically rename the TF_INCQUEUE flag to TF_SONOTCONN. - Extend the change to the BBR and RACK stacks.
Note: the interaction between the accept_filter(9) and the socket layer is not fully consistent, yet. For most accept filters this call to soisconnected() will not move the connection from the incomplete queue to the complete. The move would happen only when the filter has received the desired data, and soisconnected() would be called once again from sorwakeup(). Ideally, we should mark socket as connected only there, and leave the soisconnected() from SYN_RCVD->ESTABLISHED only for the simultaneous open case. However, this doesn't yet work.
Reviewed by: rscheff, tuexen, rrs Differential revision: https://reviews.freebsd.org/D36641
show more ...
|
#
0c7f3ae8 |
| 19-Sep-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcpcb: fix tabulation count in i4012ef7754c and abbreviate "packets"
This lines up comments to the rest of the file. Abbreviation helps to fit in to 80 char terminal. Not a functional change.
|
#
e80062a2 |
| 08-Sep-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: avoid call to soisconnected() on transition to ESTABLISHED
This call existed since pre-FreeBSD times, and it is hard to understand why it was there in the first place. After 6f3caa6d815 it def
tcp: avoid call to soisconnected() on transition to ESTABLISHED
This call existed since pre-FreeBSD times, and it is hard to understand why it was there in the first place. After 6f3caa6d815 it definitely became necessary always and commit message from f1ee30ccd60 confirms that. Now that 6f3caa6d815 is effectively backed out by 07285bb4c22, the call appears to be useful only for sockets that landed on the incomplete queue, e.g. sockets that have accept_filter(9) enabled on them.
Provide a new TCP flag to mark connections that are known to be on the incomplete queue, and call soisconnected() only for those connections.
Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D36488
show more ...
|
#
4012ef77 |
| 31-Aug-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Functional implementation of Accurate ECN
The AccECN handshake and TCP header flags are supported, no support yet for the AccECN option. This minimalistic implementation is sufficient to suppor
tcp: Functional implementation of Accurate ECN
The AccECN handshake and TCP header flags are supported, no support yet for the AccECN option. This minimalistic implementation is sufficient to support DCTCP while dramatically cutting the number of ACKs, and provide ECN response from the receiver to the CC modules.
Reviewed By: #transport, #manpages, rrs, pauamma Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D21011
show more ...
|
#
e7d02be1 |
| 17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
protosw: refactor protosw and domain static declaration and load
o Assert that every protosw has pr_attach. Now this structure is only for socket protocols declarations and nothing else. o Merge
protosw: refactor protosw and domain static declaration and load
o Assert that every protosw has pr_attach. Now this structure is only for socket protocols declarations and nothing else. o Merge struct pr_usrreqs into struct protosw. This was suggested in 1996 by wollman@ (see 7b187005d18ef), and later reiterated in 2006 by rwatson@ (see 6fbb9cf860dcd). o Make struct domain hold a variable sized array of protosw pointers. For most protocols these pointers are initialized statically. Those domains that may have loadable protocols have spacers. IPv4 and IPv6 have 8 spacers each (andre@ dff3237ee54ea). o For inetsw and inet6sw leave a comment noting that many protosw entries very likely are dead code. o Refactor pf_proto_[un]register() into protosw_[un]register(). o Isolate pr_*_notsupp() methods into uipc_domain.c
Reviewed by: melifaro Differential revision: https://reviews.freebsd.org/D36232
show more ...
|
#
81a34d37 |
| 17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
protosw: retire pr_drain and use EVENTHANDLER(9) directly
The method was called for two different conditions: 1) the VM layer is low on pages or 2) one of UMA zones of mbuf allocator exhausted. This
protosw: retire pr_drain and use EVENTHANDLER(9) directly
The method was called for two different conditions: 1) the VM layer is low on pages or 2) one of UMA zones of mbuf allocator exhausted. This change 2) into a new event handler, but all affected network subsystems modified to subscribe to both, so this change shall not bring functional changes under different low memory situations.
There were three subsystems still using pr_drain: TCP, SCTP and frag6. The latter had its protosw entry for the only reason to register its pr_drain method.
Reviewed by: tuexen, melifaro Differential revision: https://reviews.freebsd.org/D36164
show more ...
|
#
6c452841 |
| 17-Aug-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: use callout(9) directly instead of pr_slowtimo
Modern TCP stacks uses multiple callouts per tcpcb, and a global callout is ancient artifact. However it is still used to garbage collect compres
tcp: use callout(9) directly instead of pr_slowtimo
Modern TCP stacks uses multiple callouts per tcpcb, and a global callout is ancient artifact. However it is still used to garbage collect compressed timewait entries.
Reviewed by: melifaro, tuexen Differential revision: https://reviews.freebsd.org/D36159
show more ...
|
#
74703901 |
| 04-Jul-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: use a TCP flag to check if connection has been close(2)d
The flag SS_NOFDREF is a private flag of the socket layer. It also is supposed to be read with SOCK_LOCK(), which we don't own here.
R
tcp: use a TCP flag to check if connection has been close(2)d
The flag SS_NOFDREF is a private flag of the socket layer. It also is supposed to be read with SOCK_LOCK(), which we don't own here.
Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D35663
show more ...
|
#
700a395c |
| 14-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
tcp_log_vain/addrs: Use a const pointer for the IPv4 header.
The pointer to the IPv6 header was already const.
|
#
ea9017fb |
| 21-Feb-2022 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: Congestion control move to using reference counting.
In the transport call on 12/3 Gleb asked to move the CC modules towards using reference counting to prevent folks from unloading a module in
tcp: Congestion control move to using reference counting.
In the transport call on 12/3 Gleb asked to move the CC modules towards using reference counting to prevent folks from unloading a module in use. It was also agreed that Michael would do a user space utility like tcp_drop that could be used to move all connections that are using a specific CC to some other CC.
This is the half I committed to doing, making it so that we maintain a refcount on a cc module every time a pcb refers to it and decrementing that every time a pcb no longer uses a cc module. This also helps us simplify the whole unloading process by getting rid of tcp_ccunload() which munged through all the tcb's. Instead we mark a module as being removed and prevent further references to it. We also make sure that if a module is marked as being removed it cannot be made as the default and also the opposite of that, if its a default it fails and does not mark it as being removed.
Reviewed by: Michael Tuexen, Gleb Smirnoff Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D33249
show more ...
|
#
0c2832ee |
| 15-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Restore 6 tcps padding entries in HEAD
The padding in CURRENT shall not shrink. It is designed that in CURRENT at always stays the same, and then when a new stable is branched, it inherits 6 po
tcp: Restore 6 tcps padding entries in HEAD
The padding in CURRENT shall not shrink. It is designed that in CURRENT at always stays the same, and then when a new stable is branched, it inherits 6 pointer placeholders that can be used withing this stable/X lifetime to extend the structure.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34269
show more ...
|