#
dcdfe449 |
| 08-May-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: add sysctl to allow/disallow TSO during SACK loss recovery
Introduce net.inet.tcp.sack.tso for future use when TSO is ready to be used during loss recovery.
Reviewed By: tuexen, #transport Sp
tcp: add sysctl to allow/disallow TSO during SACK loss recovery
Introduce net.inet.tcp.sack.tso for future use when TSO is ready to be used during loss recovery.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D45068
show more ...
|
#
fce03f85 |
| 05-May-2024 |
Randall Stewart <rrs@FreeBSD.org> |
TCP can be subject to Sack Attacks lets fix this issue.
There is a type of attack that a TCP peer can launch on a connection. This is for sure in Rack or BBR and probably even the default stack if i
TCP can be subject to Sack Attacks lets fix this issue.
There is a type of attack that a TCP peer can launch on a connection. This is for sure in Rack or BBR and probably even the default stack if it uses lists in sack processing. The idea of the attack is that the attacker is driving you to look at 100's of sack blocks that only update 1 byte. So for example if you have 1 - 10,000 bytes outstanding the attacker sends in something like:
ACK 0 SACK(1-512) SACK(1024 - 1536), SACK(2048-2536), SACK(4096 - 4608), SACK(8192-8704) This first sack looks fine but then the attacker sends
ACK 0 SACK(1-512) SACK(1025 - 1537), SACK(2049-2537), SACK(4097 - 4609), SACK(8193-8705) ACK 0 SACK(1-512) SACK(1027 - 1539), SACK(2051-2539), SACK(4099 - 4611), SACK(8195-8707) ... These blocks are making you hunt across your linked list and split things up so that you have an entry for every other byte. Has your list grows you spend more and more CPU running through the lists. The idea here is the attacker chooses entries as far apart as possible that make you run through the list. This example is small but in theory if the window is open to say 1Meg you could end up with 100's of thousands link list entries.
To combat this we introduce three things.
when the peer requests a very small MSS we stop processing SACK's from them. This prevents a malicious peer from just using a small MSS to do the same thing. Any time we get a sack block, we use the sack-filter to remove sacks that are smaller than the smallest v4 mss (minus 40 for max TCP options) unless it ties up to snd_max (since that is legal). All other sacks in theory should be at least an MSS. If we get such an attacker that means we basically start skipping all but MSS sized Sacked blocks. The sack filter used to throw away data when its bounds were exceeded, instead now we increase its size to 15 and then throw away sack's if the filter gets over-run to prevent the malicious attacker from over-running the sack filter and thus we start to process things anyway. The default stack will need to start using the sack-filter which we have talked about in past conference calls to take full advantage of the protections offered by it (and reduce cpu consumption when processing sacks).
After this set of changes is in rack can drop its SAD detection completely
Reviewed by:tuexen@, rscheff@ Differential Revision: <https://reviews.freebsd.org/D44903>
show more ...
|
#
1d14e88e |
| 08-Apr-2024 |
Mark Johnston <markj@FreeBSD.org> |
tcp: Make tcp_var.h more self-contained
struct tcpcb embeds a struct osd and a struct callout. Rather than forcing all consumers to pull in the same headers, include the headers directly.
No funct
tcp: Make tcp_var.h more self-contained
struct tcpcb embeds a struct osd and a struct callout. Rather than forcing all consumers to pull in the same headers, include the headers directly.
No functional change intended.
Reviewed by: glebius MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D44685
show more ...
|
#
60d8dbbe |
| 18-Jan-2024 |
Kristof Provost <kp@FreeBSD.org> |
netinet: add a probe point for IP, IP6, ICMP, ICMP6, UDP and TCP stats counters
When debugging network issues one common clue is an unexpectedly incrementing error counter. This is helpful, in that
netinet: add a probe point for IP, IP6, ICMP, ICMP6, UDP and TCP stats counters
When debugging network issues one common clue is an unexpectedly incrementing error counter. This is helpful, in that it gives us an idea of what might be going wrong, but often these counters may be incremented in different functions.
Add a static probe point for them so that we can use dtrace to get futher information (e.g. a stack trace).
For example: dtrace -n 'mib:ip:count: { printf("%d", arg0); stack(); }'
This can be disabled by setting the following kernel option: options KDTRACE_NO_MIB_SDT
Reviewed by: gallatin, tuexen (previous version), gnn (previous version) Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D43504
show more ...
|
#
5a268d86 |
| 03-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: fix comment
Make the comment consistent with the code.
Reviewed by: rscheff MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44611
|
#
dd7b86e2 |
| 18-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that declaration of the macro in tcp_var.h depended on a kernel option. It is a bad practice to create such definitions in installable headers.
Reviewed by: rscheff, tuexen, kib Differential Revision: https://reviews.freebsd.org/D44362
show more ...
|
#
220ee18f |
| 13-Mar-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
netinet/tcp_var.h: always define IS_FASTOPEN() for kernel compilation env
and drop the definition for userspace (which matched TCP_RFC7413) since it depends on presence of the kernel option.
Review
netinet/tcp_var.h: always define IS_FASTOPEN() for kernel compilation env
and drop the definition for userspace (which matched TCP_RFC7413) since it depends on presence of the kernel option.
Reviewed by: glebius, rscheff Sponsored by: NVIDIA networking MFC after: 1 week Differential revision: https://reviews.freebsd.org/D44349
show more ...
|
#
e4315bbc |
| 13-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: move struct tcp_ifcap declaration under _KERNEL
Reviewed by: rscheff, tuexen, kib Differential Revision: https://reviews.freebsd.org/D44340
|
#
e18b97bd |
| 12-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal w
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal with the changes that have been in hpts for a while i.e. only one call no matter if mbuf queue or tcp_output.
It basically does little except BBlogs and is a placemark for future work on doing path capacity measurements.
With a bit of a struggle with git I finally got rack_pcm.c into place (apologies for not noticing this error). The LINT kernel is running on my box now .. sigh.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision:https://reviews.freebsd.org/D43986
show more ...
|
#
c112243f |
| 11-Mar-2024 |
Brooks Davis <brooks@FreeBSD.org> |
Revert "Update to bring the rack stack with all its fixes in."
This commit was incomplete and breaks LINT kernels. The tree has been broken for 8+ hours.
This reverts commit f6d489f402c320f1a6eaa4
Revert "Update to bring the rack stack with all its fixes in."
This commit was incomplete and breaks LINT kernels. The tree has been broken for 8+ hours.
This reverts commit f6d489f402c320f1a6eaa473491a0b8c3878113e.
show more ...
|
#
f6d489f4 |
| 11-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal w
Update to bring the rack stack with all its fixes in.
This brings the rack stack up to the current level used at NF. Many fixes and improvements have been added. I also add in a fix to BBR to deal with the changes that have been in hpts for a while i.e. only one call no matter if mbuf queue or tcp_output.
Note there is a new file that I can't figure out how to get in rack_pcm.c
It basically does little except BBlogs and is a placemark for future work on doing path capacity measurements.
Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision:https://reviews.freebsd.org/D43986
show more ...
|
#
c7c325d0 |
| 24-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: pass maxseg around instead of calculating locally
Improve slowpath processing (reordering, retransmissions) slightly by calculating maxseg only once. This typically saves one of two calls to tc
tcp: pass maxseg around instead of calculating locally
Improve slowpath processing (reordering, retransmissions) slightly by calculating maxseg only once. This typically saves one of two calls to tcp_maxseg().
Reviewed By: glebius, tuexen, cc, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43536
show more ...
|
#
7f3184ba |
| 22-Jan-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove outdated comment
This paragraph should have been removed in 446ccdd08e2a.
|
#
30409ecd |
| 06-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: do not purge SACK scoreboard on first RTO
Keeping the SACK scoreboard intact after the first RTO and retransmitting all data anew only on subsequent RTOs allows a more timely and efficient loss
tcp: do not purge SACK scoreboard on first RTO
Keeping the SACK scoreboard intact after the first RTO and retransmitting all data anew only on subsequent RTOs allows a more timely and efficient loss recovery under many adverse cirumstances.
Reviewed By: tuexen, #transport MFC after: 10 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D42906
show more ...
|
#
a8b70cf2 |
| 25-Dec-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
netpfil: Use accessor functions and named constants for all tcphdr flags
Update all remaining references to the struct tcphdr th_x2 field. This completes the compatibilty of various aspects with Acc
netpfil: Use accessor functions and named constants for all tcphdr flags
Update all remaining references to the struct tcphdr th_x2 field. This completes the compatibilty of various aspects with AccECN (TH_AE), after the internal ipfw "re-checksum required" was moved to use the TH_RES1 flag.
No functional change.
Reviewed By: tuexen, #transport, glebius Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43172
show more ...
|
#
8717c306 |
| 22-Dec-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: allow userspace use of tcp header flags accessor functions
Provide accessor functions to all 12 possible TCP header flags for userspace too.
Reviewed By: zlei MFC after:
tcp: allow userspace use of tcp header flags accessor functions
Provide accessor functions to all 12 possible TCP header flags for userspace too.
Reviewed By: zlei MFC after: 2 weeks Sponsored by: Netapp, Inc. Differential Revision: https://reviews.freebsd.org/D43152
show more ...
|
#
29363fb4 |
| 23-Nov-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl s
sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two minor fixup to keep things compiling. All the common forms in the tree were removed with a perl script.
Sponsored by: Netflix
show more ...
|
#
219a6ca9 |
| 21-Nov-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: uninline tcp_account_for_send()
This allows to clear inclusion of "opt_kern_tls.h" from a system header.
Reviewed by: rscheff, tuexen Differential Revision: https://reviews.freebsd.org/D42696
|
#
49a6fbe3 |
| 15-Nov-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
[tcp] add PRR 6937bis heuristic and retire prr_conservative sysctl
Improve Proportional Rate Reduction (RFC6937) by using a heuristic, which automatically chooses between conservative CRB and more a
[tcp] add PRR 6937bis heuristic and retire prr_conservative sysctl
Improve Proportional Rate Reduction (RFC6937) by using a heuristic, which automatically chooses between conservative CRB and more aggressive SSRB modes. Only when snd_una advances (a partial ACK), SSRB may be used. Also, that ACK must not have any indication of ongoing loss - using the addition of new holes into the scoreboard as proxy for such an event.
MFC after: 4 weeks Reviewed By: #transport, kbowling, rrs Sponsored By: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28822
show more ...
|
Revision tags: release/14.0.0 |
|
#
22dc8609 |
| 17-Oct-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: use signed IsLost() related accounting variables
Coverity found that one safety check (kassert) was not functional, as possible incorrect subtractions during the accounting wouldn't show up as
tcp: use signed IsLost() related accounting variables
Coverity found that one safety check (kassert) was not functional, as possible incorrect subtractions during the accounting wouldn't show up as (invalid) negative values.
Reported by: gallatin Reviewed By: cc, #transport Sponsored By: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D42180
show more ...
|
#
e2c6a6d2 |
| 09-Oct-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: include RFC6675 IsLost() in pipe calculation
Add more accounting while processing SACK data, to keep track of when a packet is deemed lost using the RFC6675 guidance.
Together with PRR (RFC697
tcp: include RFC6675 IsLost() in pipe calculation
Add more accounting while processing SACK data, to keep track of when a packet is deemed lost using the RFC6675 guidance.
Together with PRR (RFC6972) this allows a sender to retransmit presumed lost packets faster, and loss recovery to complete earlier.
Reviewed By: cc, rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D39299
show more ...
|
#
2ff63af9 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
|
#
cf32543f |
| 27-Jul-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: document that conditional fields in tcpcb should be at the end
Reviewed by: rscheff, Peter Lei Sponsored by: Netflix, Inc.
|
#
e4a873bf |
| 19-Jul-2023 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve layout of struct tcpcb
Put optional fields at the end to minimize run time problems in case CC modules are build from within its directory.
Reviewed by: cc, gallatin, glebius, imp Spo
tcp: improve layout of struct tcpcb
Put optional fields at the end to minimize run time problems in case CC modules are build from within its directory.
Reviewed by: cc, gallatin, glebius, imp Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D41059
show more ...
|
#
d3152ab2 |
| 17-Jul-2023 |
Warner Losh <imp@FreeBSD.org> |
tcbpcb: Always define t_osd
Always define t_osd. congestion control modules access it unconditionally. This fixes the build.
However, this is, at best, a temporary band-aide until the larger issues
tcbpcb: Always define t_osd
Always define t_osd. congestion control modules access it unconditionally. This fixes the build.
However, this is, at best, a temporary band-aide until the larger issues are sorted.
Sponsored by: Netflix
show more ...
|