History log of /freebsd/sys/netinet/tcp_stacks/rack.c (Results 151 – 175 of 295)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# b1e806c0 07-Jul-2021 Andrew Gallatin <gallatin@FreeBSD.org>

tcp: fix alternate stack build with LINT-NO{INET,INET6,IP}

When fixing another bug, I noticed that the alternate
TCP stacks do not build when various combinations of
ipv4 and ipv6 are disabled.

Rev

tcp: fix alternate stack build with LINT-NO{INET,INET6,IP}

When fixing another bug, I noticed that the alternate
TCP stacks do not build when various combinations of
ipv4 and ipv6 are disabled.

Reviewed by: rrs, tuexen
Differential Revision: https://reviews.freebsd.org/D31094
Sponsored by: Netflix

show more ...


# d7955cc0 06-Jul-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: HPTS performance enhancements

HPTS drives both rack and bbr, and yet there have been many complaints
about performance. This bit of work restructures hpts to help reduce CPU
overhead. It does t

tcp: HPTS performance enhancements

HPTS drives both rack and bbr, and yet there have been many complaints
about performance. This bit of work restructures hpts to help reduce CPU
overhead. It does this by now instead of relying on the timer/callout to
drive it instead use user return from a system call as well as lro flushes
to drive hpts. The timer becomes a backstop that dynamically adjusts
based on how "late" we are.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31083

show more ...


# e834f9a4 06-Jul-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Address goodput and TLP edge cases.

There are several cases where we make a goodput measurement and we are running
out of data when we decide to make the measurement. In reality we should not m

tcp: Address goodput and TLP edge cases.

There are several cases where we make a goodput measurement and we are running
out of data when we decide to make the measurement. In reality we should not make
such a measurement if there is no chance we can have "enough" data. There is also
some corner case TLP's that end up not registering as a TLP like they should, we
fix this by pushing the doing_tlp setup to the actual timeout that knows it did
a TLP. This makes it so we always have the appropriate flag on the sendmap
indicating a TLP being done as well as count correctly so we make no more
that two TLP's.

In addressing the goodput lets also add a "quality" metric that can be viewed via
blackbox logs so that a casual observer does not have to figure out how good
of a measurement it is. This is needed due to the fact that we may still make
a measurement that is of a poorer quality as we run out of data but still have
a minimal amount of data to make a measurement.

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31076

show more ...


# 9e4d9e4c 25-Jun-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Preparation for allowing hardware TLS to be able to kick a tcp connection that is retransmitting too much out of hardware and back to software.

Hardware TLS is now supported in some interface c

tcp: Preparation for allowing hardware TLS to be able to kick a tcp connection that is retransmitting too much out of hardware and back to software.

Hardware TLS is now supported in some interface cards and it works well. Except that
when we have connections that retransmit a lot we get into trouble with all the retransmits.
This prep step makes way for change that Drew will be making so that we can "kick out" a
session from hardware TLS.

Reviewed by: mtuexen, gallatin
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30895

show more ...


# 66aec14a 24-Jun-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Rack not being very friendly with V6:4 socket and having a connection from V4

There were two bugs that prevented V4 sockets from connecting to
a rack server running a V4/V6 socket. As well as a

tcp: Rack not being very friendly with V6:4 socket and having a connection from V4

There were two bugs that prevented V4 sockets from connecting to
a rack server running a V4/V6 socket. As well as a bug that stops the
mapped v4 in V6 address from working.

Reviewed by: mtuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30885

show more ...


# f1536bb5 11-Jun-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: remove debug output from RACK

Reported by: iron.udjin@gmail.com, Marek Zarychta
Reviewed by: rrs
PR: 256538
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D30723
Spon

tcp: remove debug output from RACK

Reported by: iron.udjin@gmail.com, Marek Zarychta
Reviewed by: rrs
PR: 256538
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D30723
Sponsored by: Netflix, Inc.

show more ...


# ba1b3e48 11-Jun-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Missing mfree in rack and bbr

Recently (Nov) we added logic that protects against a peer negotiating a timestamp, and
then not including a timestamp. This involved in the input path doing a got

tcp: Missing mfree in rack and bbr

Recently (Nov) we added logic that protects against a peer negotiating a timestamp, and
then not including a timestamp. This involved in the input path doing a goto done_with_input
label. Now I suspect the code was cribbed from one in Rack that has to do with the SYN.
This had a bug, i.e. it should have a m_freem(m) before going to the label (bbr had this
missing m_freem() but rack did not). This then caused the missing m_freem to show
up in both BBR and Rack. Also looking at the code referencing m->m_pkthdr.lro_nsegs
later (after processing) is not a good idea, even though its only for logging. Best to
copy that off before any frees can take place.

Reviewed by: mtuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30727

show more ...


# 224cf7b3 11-Jun-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: fix compilation of IPv4-only builds

PR: 256538
Reported by: iron.udjin@gmail.com
MFC after: 3 days
Sponsored by: Netflix, Inc.


# 67e89281 10-Jun-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Mbuf leak while holding a socket buffer lock.

When running at NF the current Rack and BBR changes with the recent
commits from Richard that cause the socket buffer lock to be held over
the ip_o

tcp: Mbuf leak while holding a socket buffer lock.

When running at NF the current Rack and BBR changes with the recent
commits from Richard that cause the socket buffer lock to be held over
the ip_output() call and then finally culminating in a call to tcp_handle_wakeup()
we get a lot of leaked mbufs. I don't think that this leak is actually caused
by holding the lock or what Richard has done, but is exposing some other
bug that has probably been lying dormant for a long time. I will continue to
look (using his changes) at what is going on to try to root cause out the issue.

In the meantime I can't leave the leaks out for everyone else. So this commit
will revert all of Richards changes and move both Rack and BBR back to just
doing the old sorwakeup_locked() calls after messing with the so_rcv buffer.

We may want to look at adding back in Richards changes after I have pinpointed
the root cause of the mbuf leak and fixed it.

Reviewed by: mtuexen,rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30704

show more ...


# 4747500d 04-Jun-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: A better fix for the previously attempted fix of the ack-war issue with tcp.

So it turns out that my fix before was not correct. It ended with us failing
some of the "improved" SYN tests, since

tcp: A better fix for the previously attempted fix of the ack-war issue with tcp.

So it turns out that my fix before was not correct. It ended with us failing
some of the "improved" SYN tests, since we are not in the correct states.
With more digging I have figured out the root of the problem is that when
we receive a SYN|FIN the reassembly code made it so we create a segq entry
to hold the FIN. In the established state where we were not in order this
would be correct i.e. a 0 len with a FIN would need to be accepted. But
if you are in a front state we need to strip the FIN so we correctly handle
the ACK but ignore the FIN. This gets us into the proper states
and avoids the previous ack war.

I back out some of the previous changes but then add a new change
here in tcp_reass() that fixes the root cause of the issue. We still
leave the rack panic fixes in place however.

Reviewed by: mtuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30627

show more ...


# 8c69d988 27-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: When we have an out-of-order FIN we do want to strip off the FIN bit.

The last set of commits fixed both a panic (in rack) and an ACK-war (in freebsd and bbr).
However there was a missing case,

tcp: When we have an out-of-order FIN we do want to strip off the FIN bit.

The last set of commits fixed both a panic (in rack) and an ACK-war (in freebsd and bbr).
However there was a missing case, i.e. where we get an out-of-order FIN by itself.
In such a case we don't want to leave the FIN bit set, otherwise we will do the
wrong thing and ack the FIN incorrectly. Instead we need to go through the
tcp_reasm() code and that way the FIN will be stripped and all will be well.

Reviewed by: mtuexen,rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30497

show more ...


# 4f3addd9 26-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Add a socket option to rack so we can test various changes to the slop value in timers.

Timer_slop, in TCP, has been 200ms for a long time. This value dates back
a long time when delayed ack ti

tcp: Add a socket option to rack so we can test various changes to the slop value in timers.

Timer_slop, in TCP, has been 200ms for a long time. This value dates back
a long time when delayed ack timers were longer and links were slower. A
200ms timer slop allows 1 MSS to be sent over a 60kbps link. Its possible that
lowering this value to something more in line with todays delayed ack values (40ms)
might improve TCP. This bit of code makes it so rack can, via a socket option,
adjust the timer slop.

Reviewed by: mtuexen
Sponsered by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30249

show more ...


# 13c0e198 25-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Fix bugs related to the PUSH bit and rack and an ack war

Michaels testing with UDP tunneling found an issue with the push bit, which was only partly fixed
in the last commit. The problem is the

tcp: Fix bugs related to the PUSH bit and rack and an ack war

Michaels testing with UDP tunneling found an issue with the push bit, which was only partly fixed
in the last commit. The problem is the left edge gets transmitted before the adjustments are done
to the send_map, this means that right edge bits must be considered to be added only if
the entire RSM is being retransmitted.

Now syzkaller also continued to find a crash, which Michael sent me the reproducer for. Turns
out that the reproducer on default (freebsd) stack made the stack get into an ack-war with itself.
After fixing the reference issues in rack the same ack-war was found in rack (and bbr). Basically
what happens is we go into the reassembly code and lose the FIN bit. The trick here is we
should not be going into the reassembly code if tlen == 0 i.e. the peer never sent you anything.
That then gets the proper action on the FIN bit but then you end up in LAST_ACK with no
timers running. This is because the usrclosed function gets called and the FIN's and such have
already been exchanged. So when we should be entering FIN_WAIT2 (or even FIN_WAIT1) we get
stuck in LAST_ACK. Fixing this means tweaking the usrclosed function so that we properly
recognize the condition and drop into FIN_WAIT2 where a timer will allow at least TP_MAXIDLE
before closing (to allow time for the peer to retransmit its FIN if the ack is lost). Setting the fast_finwait2
timer can speed this up in testing.

Reviewed by: mtuexen,rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30451

show more ...


# 9bbd1a8f 24-May-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: fix a RACK socket buffer lock issue

Fix a missing socket buffer unlocking of the socket receive buffer.

Reviewed by: gallatin, rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential

tcp: fix a RACK socket buffer lock issue

Fix a missing socket buffer unlocking of the socket receive buffer.

Reviewed by: gallatin, rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D30402

show more ...


# 631449d5 24-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Fix an issue with the PUSH bit as well as fill in the missing mtu change for fsb's

The push bit itself was also not actually being properly moved to
the right edge. The FIN bit was incorrectly

tcp: Fix an issue with the PUSH bit as well as fill in the missing mtu change for fsb's

The push bit itself was also not actually being properly moved to
the right edge. The FIN bit was incorrectly on the left edge. We
fix these two issues as well as plumb in the mtu_change for
alternate stacks.

Reviewed by: mtuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30413

show more ...


# 8923ce63 22-May-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: Handle stack switch while processing socket options

Handle the case where during socket option processing, the user
switches a stack such that processing the stack specific socket
option does n

tcp: Handle stack switch while processing socket options

Handle the case where during socket option processing, the user
switches a stack such that processing the stack specific socket
option does not make sense anymore. Return an error in this case.

MFC after: 1 week
Reviewed by: markj
Reported by: syzbot+a6e1d91f240ad5d72cd1@syzkaller.appspotmail.com
Sponsored by: Netflix, Inc.
Differential revision: https://reviews.freebsd.org/D30395

show more ...


# 39756885 22-May-2021 Richard Scheffenegger <rscheff@FreeBSD.org>

rack: honor prior socket buffer lock when doing the upcall

While partially reverting D24237 with D29690, due to introducing some
unintended effects for in-kernel TCP consumers, the preexisting lock

rack: honor prior socket buffer lock when doing the upcall

While partially reverting D24237 with D29690, due to introducing some
unintended effects for in-kernel TCP consumers, the preexisting lock
on the socket send buffer was not considered properly.

Found by: markj
MFC after: 2 weeks
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D30390

show more ...


# 032bf749 21-May-2021 Richard Scheffenegger <rscheff@FreeBSD.org>

[tcp] Keep socket buffer locked until upcall

r367492 would unlock the socket buffer before eventually calling the upcall.
This leads to problematic interaction with NFS kernel server/client componen

[tcp] Keep socket buffer locked until upcall

r367492 would unlock the socket buffer before eventually calling the upcall.
This leads to problematic interaction with NFS kernel server/client components
(MP threads) accessing the socket buffer with potentially not correctly updated
state.

Reported by: rmacklem
Reviewed By: tuexen, #transport
Tested by: rmacklem, otis
MFC after: 2 weeks
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D29690

show more ...


# 500eb6dd 21-May-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp: Fix sending of TCP segments with IP level options

When bringing in TCP over UDP support in
https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605,
the length of IP lev

tcp: Fix sending of TCP segments with IP level options

When bringing in TCP over UDP support in
https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605,
the length of IP level options was considered when locating the
transport header. This was incorrect and is fixed by this patch.

X-MFC with: https://cgit.FreeBSD.org/src/commit/?id=9e644c23000c2f5028b235f6263d17ffb24d3605
MFC after: 3 days
Reviewed by: markj, rscheff
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D30358

show more ...


# 02cffbc2 13-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: Incorrect KASSERT causes a panic in rack

Skyzall found an interesting panic in rack. When a SYN and FIN are
both sent together a KASSERT gets tripped where it is validating that
a mbuf pointer

tcp: Incorrect KASSERT causes a panic in rack

Skyzall found an interesting panic in rack. When a SYN and FIN are
both sent together a KASSERT gets tripped where it is validating that
a mbuf pointer is in the sendmap. But a SYN and FIN often will not
have a mbuf pointer. So the fix is two fold a) make sure that the
SYN and FIN split the right way when cloning an RSM SYN on left
edge and FIN on right. And also make sure the KASSERT properly
accounts for the case that we have a SYN or FIN so we don't
panic.

Reviewed by: mtuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D30241

show more ...


# 251842c6 12-May-2021 Michael Tuexen <tuexen@FreeBSD.org>

tcp rack: improve initialisation of retransmit timeout

When the TCP is in the front states, don't take the slop variable
into account. This improves consistency with the base stack.

Reviewed by: r

tcp rack: improve initialisation of retransmit timeout

When the TCP is in the front states, don't take the slop variable
into account. This improves consistency with the base stack.

Reviewed by: rrs@
Differential Revision: https://reviews.freebsd.org/D30230
MFC after: 1 week
Sponsored by: Netflix, Inc.

show more ...


# 4b86a24a 11-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: In rack, we must only convert restored rtt when the hostcache does restore them.

Rack now after the previous commit is very careful to translate any
value in the hostcache for srtt/rttvar into

tcp: In rack, we must only convert restored rtt when the hostcache does restore them.

Rack now after the previous commit is very careful to translate any
value in the hostcache for srtt/rttvar into its proper format. However
there is a snafu here in that if tp->srtt is 0 is the only time that
the HC will actually restore the srtt. We need to then only convert
the srtt restored when it is actually restored. We do this by making
sure it was zero before the call to cc_conn_init and it is non-zero
afterwards.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30213

show more ...


# 9867224b 10-May-2021 Randall Stewart <rrs@FreeBSD.org>

tcp:Host cache and rack ending up with incorrect values.

The hostcache up to now as been updated in the discard callback
but without checking if we are all done (the race where there are
more than o

tcp:Host cache and rack ending up with incorrect values.

The hostcache up to now as been updated in the discard callback
but without checking if we are all done (the race where there are
more than one calls and the counter has not yet reached zero). This
means that when the race occurs, we end up calling the hc_upate
more than once. Also alternate stacks can keep there srtt/rttvar
in different formats (example rack keeps its values in microseconds).
Since we call the hc_update *before* the stack fini() then the
values will be in the wrong format.

Rack on the other hand, needs to convert items pulled from the
hostcache into its internal format else it may end up with
very much incorrect values from the hostcache. In the process
lets commonize the update mechanism for srtt/rttvar since we
now have more than one place that needs to call it.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30172

show more ...


# a16cee02 07-May-2021 Randall Stewart <rrs@FreeBSD.org>

Fix a UDP tunneling issue with rack. Basically there are two
issues.
A) Not enough hdrlen was being calculated when a UDP tunnel is
in place.
and
B) Not enough memory is allocated in racks fsb. We

Fix a UDP tunneling issue with rack. Basically there are two
issues.
A) Not enough hdrlen was being calculated when a UDP tunnel is
in place.
and
B) Not enough memory is allocated in racks fsb. We need to
overbook the fsb to include a udphdr just in case.

Submitted by: Peter Lei
Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30157

show more ...


# 5d8fd932 06-May-2021 Randall Stewart <rrs@FreeBSD.org>

This brings into sync FreeBSD with the netflix versions of rack and bbr.
This fixes several breakages (panics) since the tcp_lro code was
committed that have been reported. Quite a few new features a

This brings into sync FreeBSD with the netflix versions of rack and bbr.
This fixes several breakages (panics) since the tcp_lro code was
committed that have been reported. Quite a few new features are
now in rack (prefecting of DGP -- Dynamic Goodput Pacing among the
largest). There is also support for ack-war prevention. Documents
comming soon on rack..

Sponsored by: Netflix
Reviewed by: rscheff, mtuexen
Differential Revision: https://reviews.freebsd.org/D30036

show more ...


12345678910>>...12