#
12eeb81f |
| 28-Oct-2015 |
Hiren Panchasara <hiren@FreeBSD.org> |
Calculate the correct amount of bytes that are in-flight for a connection as suggested by RFC 6675.
Currently differnt places in the stack tries to guess this in suboptimal ways. The main problem is
Calculate the correct amount of bytes that are in-flight for a connection as suggested by RFC 6675.
Currently differnt places in the stack tries to guess this in suboptimal ways. The main problem is that current calculations don't take sacked bytes into account. Sacked bytes are the bytes receiver acked via SACK option. This is suboptimal because it assumes that network has more outstanding (unacked) bytes than the actual value and thus sends less data by setting congestion window lower than what's possible which in turn may cause slower recovery from losses.
As an example, one of the current calculations looks something like this: snd_nxt - snd_fack + sackhint.sack_bytes_rexmit New proposal from RFC 6675 is: snd_max - snd_una - sackhint.sacked_bytes + sackhint.sack_bytes_rexmit which takes sacked bytes into account which is a new addition to the sackhint struct. Only thing we are missing from RFC 6675 is isLost() i.e. segment being considered lost and thus adjusting pipe based on that which makes this calculation a bit on conservative side.
The approach is very simple. We already process each ack with sack info in tcp_sack_doack() and extract sack blocks/holes out of it. We'd now also track this new variable sacked_bytes which keeps track of total sacked bytes reported.
One downside to this approach is that we may get incorrect count of sacked_bytes if the other end decides to drop sack info in the ack because of memory pressure or some other reasons. But in this (not very likely) case also the pipe calculation would be conservative which is okay as opposed to being aggressive in sending packets into the network.
Next step is to use this more accurate pipe estimation to drive congestion window adjustments.
In collaboration with: rrs Reviewed by: jason_eggnet dot com, rrs MFC after: 2 weeks Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D3971
show more ...
|
#
11d38a57 |
| 28-Oct-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Merge from head
Sponsored by: Gandi.net
|
#
356c7958 |
| 27-Oct-2015 |
Hiren Panchasara <hiren@FreeBSD.org> |
Add sysctl tunable net.inet.tcp.initcwnd_segments to specify initial congestion window in number of segments on fly. It is set to 10 segments by default.
Remove net.inet.tcp.experimental.initcwnd10
Add sysctl tunable net.inet.tcp.initcwnd_segments to specify initial congestion window in number of segments on fly. It is set to 10 segments by default.
Remove net.inet.tcp.experimental.initcwnd10 which is now redundant. Also remove the parent node net.inet.tcp.experimental as it's not needed anymore and also because it was not well thought out.
Differential Revision: https://reviews.freebsd.org/D3858 In collaboration with: lstewart Reviewed by: gnn (prev version), rwatson, allanjude, wblock (man page) MFC after: 2 weeks Relnotes: yes Sponsored by: Limelight Networks
show more ...
|
#
031c294c |
| 19-Oct-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Merge from head
|
#
324fd1ce |
| 15-Oct-2015 |
Glen Barber <gjb@FreeBSD.org> |
MFH to r289370
Sponsored by: The FreeBSD Foundation
|
#
86a996e6 |
| 14-Oct-2015 |
Hiren Panchasara <hiren@FreeBSD.org> |
There are times when it would be really nice to have a record of the last few packets and/or state transitions from each TCP socket. That would help with narrowing down certain problems we see in the
There are times when it would be really nice to have a record of the last few packets and/or state transitions from each TCP socket. That would help with narrowing down certain problems we see in the field that are hard to reproduce without understanding the history of how we got into a certain state. This change provides just that.
It saves copies of the last N packets in a list in the tcpcb. When the tcpcb is destroyed, the list is freed. I thought this was likely to be more performance-friendly than saving copies of the tcpcb. Plus, with the packets, you should be able to reverse-engineer what happened to the tcpcb.
To enable the feature, you will need to compile a kernel with the TCPPCAP option. Even then, the feature defaults to being deactivated. You can activate it by setting a positive value for the number of captured packets. You can do that on either a global basis or on a per-socket basis (via a setsockopt call).
There is no way to get the packets out of the kernel other than using kmem or getting a coredump. I thought that would help some of the legal/privacy concerns regarding such a feature. However, it should be possible to add a future effort to export them in PCAP format.
I tested this at low scale, and found that there were no mbuf leaks and the peak mbuf usage appeared to be unchanged with and without the feature.
The main performance concern I can envision is the number of mbufs that would be used on systems with a large number of sockets. If you save five packets per direction per socket and have 3,000 sockets, that will consume at least 30,000 mbufs just to keep these packets. I tried to reduce the concerns associated with this by limiting the number of clusters (not mbufs) that could be used for this feature. Again, in my testing, that appears to work correctly.
Differential Revision: D3100 Submitted by: Jonathan Looney <jlooney at juniper dot net> Reviewed by: gnn, hiren
show more ...
|
#
becbad1f |
| 13-Oct-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Merge from head
|
#
65dcb5bc |
| 01-Oct-2015 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r288197 through r288456.
|
#
5a2b666c |
| 01-Oct-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Merge from head
|
#
0f405ee7 |
| 28-Sep-2015 |
Navdeep Parhar <np@FreeBSD.org> |
Sync up with head (up to r288341).
|
#
1558cb24 |
| 27-Sep-2015 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Eliminate nd6_nud_hint() and its TCP bindings.
Initially function was introduced in r53541 (KAME initial commit) to "provide hints from upper layer protocols that indicate a connection is making
Eliminate nd6_nud_hint() and its TCP bindings.
Initially function was introduced in r53541 (KAME initial commit) to "provide hints from upper layer protocols that indicate a connection is making "forward progress"" (quote from RFC 2461 7.3.1 Reachability Confirmation). However, it was converted to do nothing (e.g. just return) in r122922 (tcp_hostcache implementation) back in 2003. Some defines were moved to tcp_var.h in r169541. Then, it was broken (for non-corner cases) by r186119 (L2<>L3 split) in 2008 (NULL ifp in nd6_lookup). So, right now this code is broken and has no "real" base users.
Differential Revision: https://reviews.freebsd.org/D3699
show more ...
|
#
f94594b3 |
| 12-Sep-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Finish merging from head, messed up in previous attempt
|
#
00176600 |
| 09-Sep-2015 |
Navdeep Parhar <np@FreeBSD.org> |
Merge r286744-r287584 from head.
|
#
d9442b10 |
| 05-Sep-2015 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r286858 through r287489.
|
#
24067db8 |
| 04-Sep-2015 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Make tcp_mtudisc() static and void. No functional changes.
Sponsored by: Nginx, Inc.
|
#
ab875b71 |
| 14-Aug-2015 |
Navdeep Parhar <np@FreeBSD.org> |
Catch up with head, primarily for the 1.14.4.0 firmware.
|
Revision tags: release/10.2.0 |
|
#
1347814c |
| 07-Aug-2015 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r285924 through r286421.
|
#
03041aaa |
| 30-Jul-2015 |
Hiren Panchasara <hiren@FreeBSD.org> |
Update snd_una description to make it more readable.
Differential Revision: https://reviews.freebsd.org/D3179 Reviewed by: gnn Sponsored by: Limelight Networks
|
#
4741bfcb |
| 29-Jul-2015 |
Patrick Kelsey <pkelsey@FreeBSD.org> |
Revert r265338, r271089 and r271123 as those changes do not handle non-inline urgent data and introduce an mbuf exhaustion attack vector similar to FreeBSD-SA-15:15.tcp, but not requiring VNETs.
Add
Revert r265338, r271089 and r271123 as those changes do not handle non-inline urgent data and introduce an mbuf exhaustion attack vector similar to FreeBSD-SA-15:15.tcp, but not requiring VNETs.
Address the issue described in FreeBSD-SA-15:15.tcp.
Reviewed by: glebius Approved by: so Approved by: jmallett (mentor) Security: FreeBSD-SA-15:15.tcp Sponsored by: Norse Corp, Inc.
show more ...
|
#
416ba5c7 |
| 22-Jun-2015 |
Navdeep Parhar <np@FreeBSD.org> |
Catch up with HEAD (r280229-r284686).
|
#
98e0ffae |
| 27-May-2015 |
Simon J. Gerraty <sjg@FreeBSD.org> |
Merge sync of head
|
#
7757a1b4 |
| 03-May-2015 |
Baptiste Daroussin <bapt@FreeBSD.org> |
Merge from head
|
#
7263c8c0 |
| 22-Apr-2015 |
Glen Barber <gjb@FreeBSD.org> |
MFH: r280643-r281852
Sponsored by: The FreeBSD Foundation
|
#
5571f9cf |
| 16-Apr-2015 |
Julien Charbon <jch@FreeBSD.org> |
Fix an old and well-documented use-after-free race condition in TCP timers: - Add a reference from tcpcb to its inpcb - Defer tcpcb deletion until TCP timers have finished
Differential Revision: h
Fix an old and well-documented use-after-free race condition in TCP timers: - Add a reference from tcpcb to its inpcb - Defer tcpcb deletion until TCP timers have finished
Differential Revision: https://reviews.freebsd.org/D2079 Submitted by: jch, Marc De La Gueronniere <mdelagueronniere@verisign.com> Reviewed by: imp, rrs, adrian, jhb, bz Approved by: jhb Sponsored by: Verisign, Inc.
show more ...
|
#
d899be7d |
| 19-Jan-2015 |
Glen Barber <gjb@FreeBSD.org> |
Reintegrate head: r274132-r277384
Sponsored by: The FreeBSD Foundation
|