#
c28440db |
| 20-Aug-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This change represents a substantial restructure of the way we reassembly inbound tcp segments. The old algorithm just blindly dropped in segments without coalescing. This meant that every segment co
This change represents a substantial restructure of the way we reassembly inbound tcp segments. The old algorithm just blindly dropped in segments without coalescing. This meant that every segment could take up greater and greater room on the linked list of segments. This of course is now subject to a tighter limit (100) of segments which in a high BDP situation will cause us to be a lot more in-efficent as we drop segments beyond 100 entries that we receive. What this restructure does is cause the reassembly buffer to coalesce segments putting an emphasis on the two common cases (which avoid walking the list of segments) i.e. where we add to the back of the queue of segments and where we add to the front. We also have the reassembly buffer supporting a couple of debug options (black box logging as well as counters for code coverage). These are compiled out by default but can be added by uncommenting the defines.
Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D16626
show more ...
|
#
8e02b4e0 |
| 19-Aug-2018 |
Michael Tuexen <tuexen@FreeBSD.org> |
Don't expose the uptime via the TCP timestamps.
The TCP client side or the TCP server side when not using SYN-cookies used the uptime as the TCP timestamp value. This patch uses in all cases an offs
Don't expose the uptime via the TCP timestamps.
The TCP client side or the TCP server side when not using SYN-cookies used the uptime as the TCP timestamp value. This patch uses in all cases an offset, which is the result of a keyed hash function taking the source and destination addresses and port numbers into account. The keyed hash function is the same a used for the initial TSN.
Reviewed by: rrs@ MFC after: 1 month Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16636
show more ...
|
#
f38b68ae |
| 05-Jul-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Make struct xinpcb and friends word-size independent.
Replace size_t members with ksize_t (uint64_t) and pointer members (never used as pointers in userspace, but instead as unique idenitifiers) wit
Make struct xinpcb and friends word-size independent.
Replace size_t members with ksize_t (uint64_t) and pointer members (never used as pointers in userspace, but instead as unique idenitifiers) with kvaddr_t (uint64_t). This makes the structs identical between 32-bit and 64-bit ABIs.
On 64-bit bit systems, the ABI is maintained. On 32-bit systems, this is an ABI breaking change. The ABI of most of these structs was previously broken in r315662. This also imposes a small API change on userspace consumers who must handle kernel pointers becoming virtual addresses.
PR: 228301 (exp-run by antoine) Reviewed by: jtl, kib, rwatson (various versions) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15386
show more ...
|
#
6573d758 |
| 04-Jul-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch(9): allow preemptible epochs to compose
- Add tracker argument to preemptible epochs - Inline epoch read path in kernel and tied modules - Change in_epoch to take an epoch as argument - Simpli
epoch(9): allow preemptible epochs to compose
- Add tracker argument to preemptible epochs - Inline epoch read path in kernel and tied modules - Change in_epoch to take an epoch as argument - Simplify tfb_tcp_do_segment to not take a ti_locked argument, there's no longer any benefit to dropping the pcbinfo lock and trying to do so just adds an error prone branchfest to these functions - Remove cases of same function recursion on the epoch as recursing is no longer free. - Remove the the TAILQ_ENTRY and epoch_section from struct thread as the tracker field is now stack or heap allocated as appropriate.
Tested by: pho and Limelight Networks Reviewed by: kbowling at llnw dot com Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16066
show more ...
|
Revision tags: release/11.2.0 |
|
#
89e560f4 |
| 07-Jun-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This commit brings in a new refactored TCP stack called Rack. Rack includes the following features: - A different SACK processing scheme (the old sack structures are not used). - RACK (Recent ackno
This commit brings in a new refactored TCP stack called Rack. Rack includes the following features: - A different SACK processing scheme (the old sack structures are not used). - RACK (Recent acknowledgment) where counting dup-acks is no longer done instead time is used to knwo when to retransmit. (see the I-D) - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt to try not to take a retransmit time-out. (see the I-D) - Burst mitigation using TCPHTPS - PRR (partial rate reduction) see the RFC.
Once built into your kernel, you can select this stack by either socket option with the name of the stack is "rack" or by setting the global sysctl so the default is rack.
Note that any connection that does not support SACK will be kicked back to the "default" base FreeBSD stack (currently known as "default").
To build this into your kernel you will need to enable in your kernel: makeoptions WITH_EXTRA_TCP_STACKS=1 options TCPHPTS
Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15525
show more ...
|
#
803a2305 |
| 26-Apr-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This change re-arranges the fields within the tcp-pcb so that they are more in order of cache line use as one passes through the tcp_input/output paths (non-errors most likely path). This helps speed
This change re-arranges the fields within the tcp-pcb so that they are more in order of cache line use as one passes through the tcp_input/output paths (non-errors most likely path). This helps speed up cache line optimization so that the tcp stack runs a bit more efficently.
Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15136
show more ...
|
#
3ee9c3c4 |
| 19-Apr-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This commit brings in the TCP high precision timer system (tcp_hpts). It is the forerunner/foundational work of bringing in both Rack and BBR which use hpts for pacing out packets. The feature is opt
This commit brings in the TCP high precision timer system (tcp_hpts). It is the forerunner/foundational work of bringing in both Rack and BBR which use hpts for pacing out packets. The feature is optional and requires the TCPHPTS option to be enabled before the feature will be active. TCP modules that use it must assure that the base component is compile in the kernel in which they are loaded.
MFC after: Never Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15020
show more ...
|
#
f8979519 |
| 10-Apr-2018 |
Jonathan T. Looney <jtl@FreeBSD.org> |
Add missing header change from r332381.
Sponsored by: Netflix, Inc. Pointy hat: jtl
|
#
2529f56e |
| 22-Mar-2018 |
Jonathan T. Looney <jtl@FreeBSD.org> |
Add the "TCP Blackbox Recorder" which we discussed at the developer summits at BSDCan and BSDCam in 2017.
The TCP Blackbox Recorder allows you to capture events on a TCP connection in a ring buffer.
Add the "TCP Blackbox Recorder" which we discussed at the developer summits at BSDCan and BSDCam in 2017.
The TCP Blackbox Recorder allows you to capture events on a TCP connection in a ring buffer. It stores metadata with the event. It optionally stores the TCP header associated with an event (if the event is associated with a packet) and also optionally stores information on the sockets.
It supports setting a log ID on a TCP connection and using this to correlate multiple connections that share a common log ID.
You can log connections in different modes. If you are doing a coordinated test with a particular connection, you may tell the system to put it in mode 4 (continuous dump). Or, if you just want to monitor for errors, you can put it in mode 1 (ring buffer) and dump all the ring buffers associated with the connection ID when we receive an error signal for that connection ID. You can set a default mode that will be applied to a particular ratio of incoming connections. You can also manually set a mode using a socket option.
This commit includes only basic probes. rrs@ has added quite an abundance of probes in his TCP development work. He plans to commit those soon.
There are user-space programs which we plan to commit as ports. These read the data from the log device and output pcapng files, and then let you analyze the data (and metadata) in the pcapng files.
Reviewed by: gnn (previous version) Obtained from: Netflix, Inc. Relnotes: yes Differential Revision: https://reviews.freebsd.org/D11085
show more ...
|
#
18a75309 |
| 26-Feb-2018 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Greatly reduce the number of #ifdefs supporting the TCP_RFC7413 kernel option.
The conditional compilation support is now centralized in tcp_fastopen.h and tcp_var.h. This doesn't provide the minimu
Greatly reduce the number of #ifdefs supporting the TCP_RFC7413 kernel option.
The conditional compilation support is now centralized in tcp_fastopen.h and tcp_var.h. This doesn't provide the minimum theoretical code/data footprint when TCP_RFC7413 is disabled, but nearly all the TFO code should wind up being removed by the optimizer, the additional footprint in the syncache entries is a single pointer, and the additional overhead in the tcpcb is at the end of the structure.
This enables the TCP_RFC7413 kernel option by default in amd64 and arm64 GENERIC.
Reviewed by: hiren MFC after: 1 month Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14048
show more ...
|
#
c560df6f |
| 26-Feb-2018 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
This is an implementation of the client side of TCP Fast Open (TFO) [RFC7413]. It also includes a pre-shared key mode of operation in which the server requires the client to be in possession of a sha
This is an implementation of the client side of TCP Fast Open (TFO) [RFC7413]. It also includes a pre-shared key mode of operation in which the server requires the client to be in possession of a shared secret in order to successfully open TFO connections with that server.
The names of some existing fastopen sysctls have changed (e.g., net.inet.tcp.fastopen.enabled -> net.inet.tcp.fastopen.server_enable).
Reviewed by: tuexen MFC after: 1 month Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14047
show more ...
|
#
66492fea |
| 07-Dec-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Separate out send buffer autoscaling code into function, so that alternative TCP stacks may reuse it instead of pasting.
Obtained from: Netflix
|
#
82725ba9 |
| 23-Nov-2017 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Merge ^/head r325999 through r326131.
|
#
51369649 |
| 20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for
sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
show more ...
|
#
c2c014f2 |
| 07-Nov-2017 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Merge ^/head r323559 through r325504.
|
#
0a8f81bc |
| 22-Oct-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r324837
While here, diff reduce some of the changes in sys/boot by moving MK_COVERAGE=no to sys/boot/Makefile.inc .
|
#
3bdf4c42 |
| 11-Oct-2017 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Declare more TCP globals in tcp_var.h, so that alternative TCP stacks can use them. Gather all TCP tunables in tcp_var.h in one place and alphabetically sort them, to ease maintainance of the list.
Declare more TCP globals in tcp_var.h, so that alternative TCP stacks can use them. Gather all TCP tunables in tcp_var.h in one place and alphabetically sort them, to ease maintainance of the list.
Don't copy and paste declarations in tcp_stacks/fastpath.c.
show more ...
|
Revision tags: release/10.4.0 |
|
#
8fcbcc2d |
| 16-Sep-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r323635
|
#
b754c279 |
| 13-Sep-2017 |
Navdeep Parhar <np@FreeBSD.org> |
MFH @ r323558.
|
#
e5cccc35 |
| 12-Sep-2017 |
Michael Tuexen <tuexen@FreeBSD.org> |
Add support to print the TCP stack being used.
Sponsored by: Netflix, Inc.
|
#
05505b6c |
| 26-Aug-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r322921
|
#
32a04bb8 |
| 25-Aug-2017 |
Sean Bruno <sbruno@FreeBSD.org> |
Use counter(9) for PLPMTUD counters.
Remove unused PLPMTUD sysctl counters.
Bump UPDATING and FreeBSD Version to indicate a rebuild is required.
Submitted by: kevin.bowling@kev009.com Reviewed by:
Use counter(9) for PLPMTUD counters.
Remove unused PLPMTUD sysctl counters.
Bump UPDATING and FreeBSD Version to indicate a rebuild is required.
Submitted by: kevin.bowling@kev009.com Reviewed by: jtl Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D12003
show more ...
|
Revision tags: release/11.1.0 |
|
#
686fb94a |
| 10-Jun-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r319548 through r319778.
|
#
dc6a41b9 |
| 08-Jun-2017 |
Jonathan T. Looney <jtl@FreeBSD.org> |
Add the infrastructure to support loading multiple versions of TCP stack modules.
It adds support for mangling symbols exported by a module by prepending a string to them. (This avoids overlapping s
Add the infrastructure to support loading multiple versions of TCP stack modules.
It adds support for mangling symbols exported by a module by prepending a string to them. (This avoids overlapping symbols in the kernel linker.)
It allows the use of a macro as the module name in the DECLARE_MACRO() and MACRO_VERSION() macros.
It allows the code to register stack aliases (e.g. both a generic name ["default"] and version-specific name ["default_10_3p1"]).
With these changes, it is trivial to compile TCP stack modules with the name defined in the Makefile and to load multiple versions of the same stack simultaneously. This functionality can be used to enable side-by-side testing of an old and new version of the same TCP stack. It also could support upgrading the TCP stack without a reboot.
Reviewed by: gnn, sjg (makefiles only) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D11086
show more ...
|
#
e44c1887 |
| 10-Apr-2017 |
Steven Hartland <smh@FreeBSD.org> |
Use estimated RTT for receive buffer auto resizing instead of timestamps
Switched from using timestamps to RTT estimates when performing TCP receive buffer auto resizing, as not all hosts support /
Use estimated RTT for receive buffer auto resizing instead of timestamps
Switched from using timestamps to RTT estimates when performing TCP receive buffer auto resizing, as not all hosts support / enable TCP timestamps.
Disabled reset of receive buffer auto scaling when not in bulk receive mode, which gives an extra 20% performance increase.
Also extracted auto resizing to a common method shared between standard and fastpath modules.
With this AWS S3 downloads at ~17ms latency on a 1Gbps connection jump from ~3MB/s to ~100MB/s using the default settings.
Reviewed by: lstewart, gnn MFC after: 2 weeks Relnotes: Yes Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D9668
show more ...
|