History log of /freebsd/sys/netinet/tcp_stacks/rack.c (Results 51 – 75 of 295)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 4e8a20a7 19-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: rack the request level logging is a bit too noisy when doing point logging.

When doing request level BB logging the hybrid_bw_log() does not have proper screening to minimize logging
when point

tcp: rack the request level logging is a bit too noisy when doing point logging.

When doing request level BB logging the hybrid_bw_log() does not have proper screening to minimize logging
when point level logging is in use. Lets fix it properly so you have to have the proper knobs set to get the
more noisy logging.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39699

show more ...


# 7a842346 19-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: Rack can crash with the new non-TSO fix..

Turns out the location of the check to see if we can do output is in the wrong place. We need
to jump off to the compressed acks before handling that c

tcp: Rack can crash with the new non-TSO fix..

Turns out the location of the check to see if we can do output is in the wrong place. We need
to jump off to the compressed acks before handling that case since th is NULL in the
compressed ack case which is handled differently anyway.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39690

show more ...


# 2ad584c5 17-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: Inconsistent use of hpts_calling flag

Gleb has noticed there were some inconsistency's in the way the inp_hpts_calls flag was being used. One
such inconsistency results in a bug when we can't a

tcp: Inconsistent use of hpts_calling flag

Gleb has noticed there were some inconsistency's in the way the inp_hpts_calls flag was being used. One
such inconsistency results in a bug when we can't allocate enough sendmap entries to entertain a call to
rack_output().. basically a timer won't get started like it should. Also in cleaning this up I find that the
"no_output" side of input needs to be adjusted to make sure we don't try to re-pace too quickly outside
the hpts assurance of 250useconds.

Another thing here is we end up with duplicate calls to tcp_output() which we should not. If packets go
from hpts for processing the input side of tcp will call the output side of tcp on the last packet if it is needed.
This means that when that occurs a second call to tcp_output would be made that is not needed and if pacing
is going on may be harmful.

Lets fix all this and explicitly state the contract that hpts is making with transports that care about the
flag.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39653

show more ...


# a540cdca 17-Apr-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: use queue(9) STAILQ for the input queue

Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D39574


# 3cc7b667 14-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: stack unloading crash in rack and bbr

Its possible to induce a crash in either rack or bbr. This would be done
if the rack stack were say the default and bbr was being used by a connection.
If

tcp: stack unloading crash in rack and bbr

Its possible to induce a crash in either rack or bbr. This would be done
if the rack stack were say the default and bbr was being used by a connection.
If the bbr stack is then unloaded and it was active, we will trigger a MPASS assert
in tcp_hpts since the new stack (default rack) would start a timer, and the old stack
(bbr) would have the inp already in hpts.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39576

show more ...


# 9903bf34 14-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: rack pacing has some caveats that need to be obeyed when LRO is missing

n further non-LRO testing I found a case where rack is supposed to be waking up but
it is not now. In this special case i

tcp: rack pacing has some caveats that need to be obeyed when LRO is missing

n further non-LRO testing I found a case where rack is supposed to be waking up but
it is not now. In this special case it sets the flag rc_ack_can_sendout_data. When that is
set we should not prohibit output.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39565

show more ...


# a2b33c9a 10-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: Rack - in the absence of LRO fixed rate pacing (loopback or interfaces with no LRO) does not work correctly.

Rack is capable of fixed rate or dynamic rate pacing. Both of these can get mixed up

tcp: Rack - in the absence of LRO fixed rate pacing (loopback or interfaces with no LRO) does not work correctly.

Rack is capable of fixed rate or dynamic rate pacing. Both of these can get mixed up when
LRO is not available. This is because LRO will hold off waking up the tcp connection to
processing the inbound packets until the pacing timer is up. Without LRO the pacing only
sort-of works. Sometimes we pace correctly, other times not so much.

This set of changes will make it so pacing works properly in the absence of LRO.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39494

show more ...


# e2224617 10-Apr-2023 John Baldwin <jhb@FreeBSD.org>

rack: mask and tclass are only used for INET6.

This fixes the LINT-NOINET6 build.


# 66fbc19f 07-Apr-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: pass tcpcb in the tfb_tcp_ctloutput() method instead of inpcb

Just matches rest of the KPI.

Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D39435


# 35bc0bcc 07-Apr-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: reduce argument list to functions that pass a segment

The socket argument is superfluous, as a tcpcb always has one and
only one socket.

Reviewed by: rrs
Differential Revision: https://review

tcp: reduce argument list to functions that pass a segment

The socket argument is superfluous, as a tcpcb always has one and
only one socket.

Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D39434

show more ...


# 945f9a7c 07-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

tcp: misc cleanup of options for rack as well as socket option logging.

Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack
has an experimental SaD (Sack Atta

tcp: misc cleanup of options for rack as well as socket option logging.

Both BBR and Rack have the ability to log socket options, which is currently disabled. Rack
has an experimental SaD (Sack Attack Detection) algorithm that should be made available. Also
there is a t_maxpeak_rate that needs to be removed (its un-used).

Reviewed by: tuexen, cc
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39427

show more ...


Revision tags: release/13.2.0
# 84b42df8 05-Apr-2023 Gleb Smirnoff <glebius@FreeBSD.org>

rack: fix build on powerpc


# 030434ac 04-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

Update rack to the latest code used at NF.

There have been many changes to rack over the last couple of years, including:
a) Ability when switching stacks to have one stack query another.

Update rack to the latest code used at NF.

There have been many changes to rack over the last couple of years, including:
a) Ability when switching stacks to have one stack query another.
b) Internal use of micro-second timers instead of ticks.
c) Many changes to pacing in forms of
1) Improvements to Dynamic Goodput Pacing (DGP)
2) Improvements to fixed rate paciing
3) A new feature called hybrid pacing where the requestor can
get a combination of DGP and fixed rate pacing with deadlines
for delivery that can dynamically speed things up.
d) All kinds of bugs found during extensive testing and use of the
rack stack for streaming video and in fact all data transferred
by NF

Reviewed by: glebius, gallatin, tuexen
Sponsored By: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D39402

show more ...


# 73ee5756 01-Apr-2023 Randall Stewart <rrs@FreeBSD.org>

Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.

So stack switching as always been a bit of a issue. We currently use

Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.

So stack switching as always been a bit of a issue. We currently use a break before make setup which means that
if something goes wrong you have to try to get back to a stack. This patch among a lot of other things changes that so
that it is a make before break. We also expand some of the function blocks in prep for new features in rack that will allow
more controlled pacing. We also add other abilities such as the pathway for a stack to query a previous stack to acquire from
it critical state information so things in flight don't get dropped or mis-handled when switching stacks. We also add the
concept of a timer granularity. This allows an alternate stack to change from the old ticks granularity to microseconds and
of course this even gives us a pathway to go to nanosecond timekeeping if we need to (something for the data center to consider
for sure).

Once all this lands I will then update rack to begin using all these new features.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39210

show more ...


# 69c7c811 16-Mar-2023 Randall Stewart <rrs@FreeBSD.org>

Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.

The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and t

Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.

The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints
we need to move to using the new inline functions. This adds them and moves rack to now use
the tcp_tracepoints.

Reviewed by: tuexen, gallatin
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D38831

show more ...


# c0e4090e 08-Feb-2023 Andrew Gallatin <gallatin@FreeBSD.org>

ktls: Accurately track if ifnet ktls is enabled

This allows us to avoid spurious calls to ktls_disable_ifnet()

When we implemented ifnet kTLSe, we set a flag in the tx socket
buffer (SB_TLS_IFNET)

ktls: Accurately track if ifnet ktls is enabled

This allows us to avoid spurious calls to ktls_disable_ifnet()

When we implemented ifnet kTLSe, we set a flag in the tx socket
buffer (SB_TLS_IFNET) to indicate ifnet kTLS. This flag meant that
now, or in the past, ifnet ktls was active on a socket. Later,
I added code to switch ifnet ktls sessions to software in the case
of lossy TCP connections that have a high retransmit rate.
Because TCP was using SB_TLS_IFNET to know if it needed to do math
to calculate the retransmit ratio and potentially call into
ktls_disable_ifnet(), it was doing unneeded work long after
a session was moved to software.

This patch carefully tracks whether or not ifnet ktls is still enabled
on a TCP connection. Because the inp is now embedded in the tcpcb, and
because TCP is the most frequent accessor of this state, it made sense to
move this from the socket buffer flags to the tcpcb. Because we now need
reliable access to the tcbcb, we take a ref on the inp when creating a tx
ktls session.

While here, I noticed that rack/bbr were incorrectly implementing
tfb_hwtls_change(), and applying the change to all pending sends,
when it should apply only to future sends.

This change reduces spurious calls to ktls_disable_ifnet() by 95% or so
in a Netflix CDN environment.

Reviewed by: markj, rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D38380

show more ...


# 18b83b62 26-Jan-2023 Richard Scheffenegger <rscheff@FreeBSD.org>

tcp: reduce the size of t_rttupdated in tcpcb

During tcp session start, various mechanisms need to
track a few initial RTTs before becoming active.
Prevent overflows of the corresponding tracking co

tcp: reduce the size of t_rttupdated in tcpcb

During tcp session start, various mechanisms need to
track a few initial RTTs before becoming active.
Prevent overflows of the corresponding tracking counter
and reduce the size of tcpcb simultaneously.

Reviewed By: #transport, tuexen, guest-ccui
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D21117

show more ...


# 73e994a9 19-Jan-2023 Gordon Bergling <gbe@FreeBSD.org>

extra_tcp_stacks: Fix a common typo in source code comments

- s/orginal/original/

MFC after: 3 days


# 432a398d 11-Jan-2023 Gordon Bergling <gbe@FreeBSD.org>

tcp_rack(4): Fix a typo in a source code comment

- s/postion/position/

MFC after: 3 days


# 8ea41829 10-Jan-2023 Andrew Gallatin <gallatin@FreeBSD.org>

tcp: Build RACK and BBR stacks as a part of LINT

When RACK and BBR were added to the kernel, they were put
behind 'WITH_EXTRA_TCP_STACKS=1'. Unfortunately that was
never added to any NOTES file, s

tcp: Build RACK and BBR stacks as a part of LINT

When RACK and BBR were added to the kernel, they were put
behind 'WITH_EXTRA_TCP_STACKS=1'. Unfortunately that was
never added to any NOTES file, so RACK & BBR were not compiled
with the various LINT-NOINET, LINT-NOINET6, and LINT-NOIP kernels.
This lead to the stacks sometimes being broken.

This change:

- Fixes RACK so that it compiles with the various LINT-NO* kernels
- Adds WITH_EXTRA_TCP_STACKS=1 to all NOTES kernels so that
RACK and BBR are compile tested regularly

Sponsored by: Netflix
Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D37903

show more ...


# 2e2a1c31 14-Dec-2022 Randall Stewart <rrs@FreeBSD.org>

Opps take out a stray left behind printf that was
for debugging.. Sorry.


# e2e088ae 14-Dec-2022 Randall Stewart <rrs@FreeBSD.org>

Rack cannot be loaded without cc_newreno compiled into the kernel.

Right now rack will fail to load due to its hack in accessing symbol names
in cc_newreno. This was fine when newreno was always com

Rack cannot be loaded without cc_newreno compiled into the kernel.

Right now rack will fail to load due to its hack in accessing symbol names
in cc_newreno. This was fine when newreno was always compiled into the
kernel but now ... not so much. Instead lets fix up rack to use the socket
option queries to get the information it wants and set the parameters. We
also fix the CC parameter so they are always settable.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D37622

show more ...


# eaabc937 14-Dec-2022 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: retire TCPDEBUG

This subsystem is superseded by modern debugging facilities,
e.g. DTrace probes and TCP black box logging.

We intentionally leave SO_DEBUG in place, as many utilities may
set i

tcp: retire TCPDEBUG

This subsystem is superseded by modern debugging facilities,
e.g. DTrace probes and TCP black box logging.

We intentionally leave SO_DEBUG in place, as many utilities may
set it on a socket. Also the tcp::debug DTrace probes look at
this flag on a socket.

Reviewed by: gnn, tuexen
Discussed with: rscheff, rrs, jtl
Differential revision: https://reviews.freebsd.org/D37694

show more ...


# e6fc01f6 14-Dec-2022 Mateusz Guzik <mjg@FreeBSD.org>

tcp: whack the stale declaration of rack_timer_stop

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 446ccdd0 07-Dec-2022 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: use single locked callout per tcpcb for the TCP timers

Use only one callout structure per tcpcb that is responsible for handling
all five TCP timeouts. Use locked version of callout, of course

tcp: use single locked callout per tcpcb for the TCP timers

Use only one callout structure per tcpcb that is responsible for handling
all five TCP timeouts. Use locked version of callout, of course. The
callout function tcp_timer_enter() chooses soonest timer and executes it
with lock held. Unless the timer reports that the tcpcb has been freed,
the callout is rescheduled for next soonest timer, if there is any.

With single callout per tcpcb on connection teardown we should be able
to fully stop the callout and immediately free it, avoiding use of
callout_async_drain(). There is one gotcha here: callout_stop() can
actually touch our memory when a rare race condition happens. See
comment above tcp_timer_stop(). Synchronous stop of the callout makes
tcp_discardcb() the single entry point for tcpcb destructor, merging the
tcp_freecb() to the end of the function.

While here, also remove lots of lingering checks in the beginning of
TCP timer functions. With a locked callout they are unnecessary.

While here, clean unused parts of timer KPI for the pluggable TCP stacks.

While here, remove TCPDEBUG from tcp_timer.c, as this allows for more
simplification of TCP timers. The TCPDEBUG is scheduled for removal.

Move the DTrace probes in timers to the beginning of a function, where
a tcpcb is always existing.

Discussed with: rrs, tuexen, rscheff (the TCP part of the diff)
Reviewed by: hselasky, kib, mav (the callout part)
Differential revision: https://reviews.freebsd.org/D37321

show more ...


12345678910>>...12