History log of /freebsd/sys/netinet/in_pcb.h (Results 26 – 50 of 514)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 0aa120d5 02-Dec-2022 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: allow to provide protocol specific pcb size

The protocol specific structure shall start with inpcb.

Differential revision: https://reviews.freebsd.org/D37126


Revision tags: release/12.4.0
# d93ec8cb 02-Nov-2022 Mark Johnston <markj@FreeBSD.org>

inpcb: Allow SO_REUSEPORT_LB to be used in jails

Currently SO_REUSEPORT_LB silently does nothing when set by a jailed
process. It is trivial to support this option in VNET jails, but it's
also usef

inpcb: Allow SO_REUSEPORT_LB to be used in jails

Currently SO_REUSEPORT_LB silently does nothing when set by a jailed
process. It is trivial to support this option in VNET jails, but it's
also useful in traditional jails.

This patch enables LB groups in jails with the following semantics:
- all PCBs in a group must belong to the same jail,
- PCB lookup prefers jailed groups to non-jailed groups

This is a straightforward extension of the semantics used for individual
listening sockets. One pre-existing quirk of the lbgroup implementation
is that non-jailed lbgroups are searched before jailed listening
sockets; that is preserved with this change.

Discussed with: glebius
MFC after: 1 month
Sponsored by: Modirum MDPay
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37029

show more ...


# 19acc506 31-Oct-2022 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: retire suppresion of randomization of ephemeral ports

The suppresion was added in 5f311da2ccb with no explanation in the
commit message of the exact problem that was fixed. In the BSDCan
2006

inpcb: retire suppresion of randomization of ephemeral ports

The suppresion was added in 5f311da2ccb with no explanation in the
commit message of the exact problem that was fixed. In the BSDCan
2006 talk [1], slides 12 to 14, we can find that it seems that there
was some problem with the TIME_WAIT state not properly being handled
on the remote side (also FreeBSD!), and this switching off the
suppression had hidden the problem. The rationale of the change was
that other stacks may also be buggy wrt the TIME_WAIT.

I did not find the actual problem in TIME_WAIT that the suppression
has hidden, neither a commit that would fix it. However, since that
time we started to handle SYNs with RFC5961 instead of RFC793, see
3220a2121cc. We also now have the tcp-testsuite [2], that has full
coverage of all possible scenarios of receiving SYN in TIME_WAIT.

This effectively reverts 5f311da2ccb6c216b79049172be840af4778129a
and 6ee79c59d2c081f837a11cc78908be54a6dbe09d.

[1] https://www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf
[2] https://github.com/freebsd-net/tcp-testsuite

Reviewed by: rscheff
Discussed with: rscheff, rrs, tuexen
Differential revision: https://reviews.freebsd.org/D37042

show more ...


# 24cf7a8d 20-Oct-2022 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: provide pcbinfo pointer argument to inp_apply_all()

Allows to clear inpcb layer of TCP knowledge.


# 53af6903 07-Oct-2022 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: remove INP_TIMEWAIT flag

Mechanically cleanup INP_TIMEWAIT from the kernel sources. After
0d7445193ab, this commit shall not cause any functional changes.

Note: this flag was very often check

tcp: remove INP_TIMEWAIT flag

Mechanically cleanup INP_TIMEWAIT from the kernel sources. After
0d7445193ab, this commit shall not cause any functional changes.

Note: this flag was very often checked together with INP_DROPPED.
If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we
will be able to remove most of this checks and turn them to
assertions. Some of them can be turned into assertions right now,
but that should be carefully done on a case by case basis.

Differential revision: https://reviews.freebsd.org/D36400

show more ...


# 893f36b7 04-Sep-2022 Gordon Bergling <gbe@FreeBSD.org>

netinet: Correct a typo in source code comment

- s/occured/occurred/

MFC after: 3 days


Revision tags: release/13.1.0
# a35bdd44 09-Feb-2022 Michael Tuexen <tuexen@FreeBSD.org>

tcp: add sysctl interface for setting socket options

This interface allows to set a socket option on a TCP endpoint,
which is specified by its inp_gencnt. This interface will be
used in an upcoming

tcp: add sysctl interface for setting socket options

This interface allows to set a socket option on a TCP endpoint,
which is specified by its inp_gencnt. This interface will be
used in an upcoming command line tool tcpsso.

Reviewed by: glebius, rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D34138

show more ...


# afad340a 03-Jan-2022 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: garbage collect INP_LOCK_INIT(), used only once in sctp

Reviewed by: tuexen
Differential revision: https://reviews.freebsd.org/D33543


# fec8a8c7 03-Jan-2022 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: use global UMA zones for protocols

Provide structure inpcbstorage, that holds zones and lock names for
a protocol. Initialize it with global protocol init using macro
INPCBSTORAGE_DEFINE().

inpcb: use global UMA zones for protocols

Provide structure inpcbstorage, that holds zones and lock names for
a protocol. Initialize it with global protocol init using macro
INPCBSTORAGE_DEFINE(). Then, at VNET protocol init supply it as
the main argument to the in_pcbinfo_init(). Each VNET pcbinfo uses
its private hash, but they all use same zone to allocate and SMR
section to synchronize.

Note: there is kern.ipc.maxsockets sysctl, which controls UMA limit
on the socket zone, which was always global. Historically same
maxsockets value is applied also to every PCB zone. Important fact:
you can't create a pcb without a socket! A pcb may outlive its socket,
however. Given that there are multiple protocols, and only one socket
zone, the per pcb zone limits seem to have little value. Under very
special conditions it may trigger a little bit earlier than socket zone
limit, but in most setups the socket zone limit will be triggered
earlier. When VIMAGE was added to the kernel PCB zones became per-VNET.
This magnified existing disbalance further: now we have multiple pcb
zones in multiple vnets limited to maxsockets, but every pcb requires a
socket allocated from the global zone also limited by maxsockets.
IMHO, this per pcb zone limit doesn't bring any value, so this patch
drops it. If anybody explains value of this limit, it can be restored
very easy - just 2 lines change to in_pcbstorage_init().

Differential revision: https://reviews.freebsd.org/D33542

show more ...


# a0577692 26-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

in_pcb: use jenkins hash over the entire IPv6 (or IPv4) address

The intent is to provide more entropy than can be provided
by just the 32-bits of the IPv6 address which overlaps with
6to4 tunnels.

in_pcb: use jenkins hash over the entire IPv6 (or IPv4) address

The intent is to provide more entropy than can be provided
by just the 32-bits of the IPv6 address which overlaps with
6to4 tunnels. This is needed to mitigate potential algorithmic
complexity attacks from attackers who can control large
numbers of IPv6 addresses.

Together with: gallatin
Reviewed by: dwmalone, rscheff
Differential revision: https://reviews.freebsd.org/D33254

show more ...


# a370832b 26-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: remove delayed drop KPI

No longer needed after tcp_output() can ask caller to drop.

Reviewed by: rrs, tuexen
Differential revision: https://reviews.freebsd.org/D33371


# db0ac6de 02-Dec-2021 Cy Schubert <cy@FreeBSD.org>

Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"

This reverts commit 266f97b5e9a7958e365e78288616a459b40d924a, reversing
changes made to a10253cffea84c0c980a36ba6776b00ed96c3e3b.

A mism

Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"

This reverts commit 266f97b5e9a7958e365e78288616a459b40d924a, reversing
changes made to a10253cffea84c0c980a36ba6776b00ed96c3e3b.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.

show more ...


# 266f97b5 02-Dec-2021 Cy Schubert <cy@FreeBSD.org>

wpa: Import wpa_supplicant/hostapd commit 14ab4a816

This is the November update to vendor/wpa committed upstream 2021-11-26.

MFC after: 1 month


# 2e27230f 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: rewrite inpcb synchronization

Just trust the pcb database, that if we did in_pcbref(), no way
an inpcb can go away. And if we never put a dropped inpcb on
our queue, and tcp_discardcb() a

tcp_hpts: rewrite inpcb synchronization

Just trust the pcb database, that if we did in_pcbref(), no way
an inpcb can go away. And if we never put a dropped inpcb on
our queue, and tcp_discardcb() always removes an inpcb to be
dropped from the queue, then any inpcb on the queue is valid.

Now, to solve LOR between inpcb lock and HPTS queue lock do the
following trick. When we are about to process a certain time
slot, take the full queue of the head list into on stack list,
drop the HPTS lock and work on our queue. This of course opens
a race when an inpcb is being removed from the on stack queue,
which was already mentioned in comments. To address this race
introduce generation count into queues. If we want to remove
an inpcb with generation count mismatch, we can't do that, we
can only mark it with desired new time slot or -1 for remove.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33026

show more ...


# f971e791 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: rename input queue to drop queue and trim dead code

The HPTS input queue is in reality used only for "delayed drops".
When a TCP stack decides to drop a connection on the output path
it ca

tcp_hpts: rename input queue to drop queue and trim dead code

The HPTS input queue is in reality used only for "delayed drops".
When a TCP stack decides to drop a connection on the output path
it can't do that due to locking protocol between main tcp_output()
and stacks. So, rack/bbr utilize HPTS to drop the connection in
a different context.

In the past the queue could also process input packets in context
of HPTS thread, but now no stack uses this, so remove this
functionality.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33025

show more ...


# de2d4784 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

SMR protection for inpcbs

With introduction of epoch(9) synchronization to network stack the
inpcb database became protected by the network epoch together with
static network data (interfaces, addre

SMR protection for inpcbs

With introduction of epoch(9) synchronization to network stack the
inpcb database became protected by the network epoch together with
static network data (interfaces, addresses, etc). However, inpcb
aren't static in nature, they are created and destroyed all the
time, which creates some traffic on the epoch(9) garbage collector.

Fairly new feature of uma(9) - Safe Memory Reclamation allows to
safely free memory in page-sized batches, with virtually zero
overhead compared to uma_zfree(). However, unlike epoch(9), it
puts stricter requirement on the access to the protected memory,
needing the critical(9) section to access it. Details:

- The database is already build on CK lists, thanks to epoch(9).
- For write access nothing is changed.
- For a lookup in the database SMR section is now required.
Once the desired inpcb is found we need to transition from SMR
section to r/w lock on the inpcb itself, with a check that inpcb
isn't yet freed. This requires some compexity, since SMR section
itself is a critical(9) section. The complexity is hidden from
KPI users in inp_smr_lock().
- For a inpcb list traversal (a pcblist sysctl, or broadcast
notification) also a new KPI is provided, that hides internals of
the database - inp_next(struct inp_iterator *).

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33022

show more ...


# 565655f4 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

inpcb: reduce some aliased functions after removal of PCBGROUP.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33021


# 93c67567 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

Remove "options PCBGROUP"

With upcoming changes to the inpcb synchronisation it is going to be
broken. Even its current status after the move of PCB synchronization
to the network epoch is very ques

Remove "options PCBGROUP"

With upcoming changes to the inpcb synchronisation it is going to be
broken. Even its current status after the move of PCB synchronization
to the network epoch is very questionable.

This experimental feature was sponsored by Juniper but ended never to
be used in Juniper and doesn't exist in their source tree [sjg@, stevek@,
jtl@]. In the past (AFAIK, pre-epoch times) it was tried out at Netflix
[gallatin@, rrs@] with no positive result and at Yandex [ae@, melifaro@].

I'm up to resurrecting it back if there is any interest from anybody.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33020

show more ...


Revision tags: release/12.3.0
# 0f617ae4 18-Oct-2021 Gleb Smirnoff <glebius@FreeBSD.org>

Add in_pcb_var.h for KPIs that are private to in_pcb.c and in6_pcb.c.


# 744a64bd 28-Apr-2021 Gleb Smirnoff <glebius@FreeBSD.org>

in_pcb: garbage collect in_pcbrele()


# 5a78df20 27-Apr-2021 Gleb Smirnoff <glebius@FreeBSD.org>

in_pcb: garbage collect unused structure in_pcblist


# d7955cc0 06-Jul-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: HPTS performance enhancements

HPTS drives both rack and bbr, and yet there have been many complaints
about performance. This bit of work restructures hpts to help reduce CPU
overhead. It does t

tcp: HPTS performance enhancements

HPTS drives both rack and bbr, and yet there have been many complaints
about performance. This bit of work restructures hpts to help reduce CPU
overhead. It does this by now instead of relying on the timer/callout to
drive it instead use user return from a system call as well as lro flushes
to drive hpts. The timer becomes a backstop that dynamically adjusts
based on how "late" we are.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31083

show more ...


# 1db08fbe 16-Apr-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_input: always request read-locking of PCB for any pure SYN segment.

This is further rework of 08d9c920275. Now we carry the knowledge of
lock type all the way through tcp_input() and also into

tcp_input: always request read-locking of PCB for any pure SYN segment.

This is further rework of 08d9c920275. Now we carry the knowledge of
lock type all the way through tcp_input() and also into tcp_twcheck().
Ideally the rlocking for pure SYNs should propagate all the way into
the alternative TCP stacks, but not yet today.

This should close a race when socket is bind(2)-ed but not yet
listen(2)-ed and a SYN-packet arrives racing with listen(2), discovered
recently by pho@.

show more ...


Revision tags: release/13.0.0
# 08d9c920 19-Mar-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_input/syncache: acquire only read lock on PCB for SYN,!ACK packets

When packet is a SYN packet, we don't need to modify any existing PCB.
Normally SYN arrives on a listening socket, we either cr

tcp_input/syncache: acquire only read lock on PCB for SYN,!ACK packets

When packet is a SYN packet, we don't need to modify any existing PCB.
Normally SYN arrives on a listening socket, we either create a syncache
entry or generate syncookie, but we don't modify anything with the
listening socket or associated PCB. Thus create a new PCB lookup
mode - rlock if listening. This removes the primary contention point
under SYN flood - the listening socket PCB.

Sidenote: when SYN arrives on a synchronized connection, we still
don't need write access to PCB to send a challenge ACK or just to
drop. There is only one exclusion - tcptw recycling. However,
existing entanglement of tcp_input + stacks doesn't allow to make
this change small. Consider this patch as first approach to the problem.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D29576

show more ...


# 69a34e8d 27-Jan-2021 Randall Stewart <rrs@FreeBSD.org>

Update the LRO processing code so that we can support
a further CPU enhancements for compressed acks. These
are acks that are compressed into an mbuf. The transport
has to be aware of how to process

Update the LRO processing code so that we can support
a further CPU enhancements for compressed acks. These
are acks that are compressed into an mbuf. The transport
has to be aware of how to process these, and an upcoming
update to rack will do so. You need the rack changes
to actually test and validate these since if the transport
does not support mbuf compression, then the old code paths
stay in place. We do in this commit take out the concept
of logging if you don't have a lock (which was quite
dangerous and was only for some early debugging but has
been left in the code).

Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D28374

show more ...


12345678910>>...21