#
b93e930c |
| 25-Jul-2025 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sendfile: retire M_BLOCKED
Follow unix(4) commit 51ac5ee0d57f and retire M_BLOCKED for TCP sockets as well. The M_BLOCKED flag was introduced back 2016 together with non- blocking sendfile(2). It
sendfile: retire M_BLOCKED
Follow unix(4) commit 51ac5ee0d57f and retire M_BLOCKED for TCP sockets as well. The M_BLOCKED flag was introduced back 2016 together with non- blocking sendfile(2). It marked mbufs in a sending socket buffer that could be ready to sent, but are sitting behind an M_NOTREADY mbuf(s), that blocks them.
You may consider this flag as an INVARIANT flag that helped to ensure socket buffer consistency. Or maybe the socket code was so convoluted back then, that it was unclear if sbfree() may be called on an mbuf that is in the middle of the buffer, and I decided to introduce the flag to protect against that. With today state of socket buffer code it became clear that the latter cannot happen. And this commit adds an assertion proving that.
Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D50728
show more ...
|
Revision tags: release/14.3.0-p1, release/14.2.0-p4, release/13.5.0-p2, release/14.3.0, release/13.4.0-p5, release/13.5.0-p1, release/14.2.0-p3, release/13.5.0, release/14.2.0-p2, release/14.1.0-p8, release/13.4.0-p4, release/14.1.0-p7, release/14.2.0-p1, release/13.4.0-p3 |
|
#
c2cd12b7 |
| 14-Jan-2025 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe tom: Make t4_push_frames static to t4_cpl_io.c
This function is not used outside of this file.
Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D47760
|
Revision tags: release/14.2.0 |
|
#
314cb279 |
| 31-Oct-2024 |
John Baldwin <jhb@FreeBSD.org> |
mbuf: Don't force all M_EXTPG mbufs to be read-only
Some M_EXTPG mbufs are read-only (e.g. those backing sendfile requests), but others are not. Add a flags argument to mb_alloc_ext_pgs that can be
mbuf: Don't force all M_EXTPG mbufs to be read-only
Some M_EXTPG mbufs are read-only (e.g. those backing sendfile requests), but others are not. Add a flags argument to mb_alloc_ext_pgs that can be used to set M_RDONLY when needed rather than setting it unconditionally. Update mb_unmapped_to_ext to preserve M_RDONLY from the unmapped mbuf.
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D46783
show more ...
|
Revision tags: release/13.4.0 |
|
#
955b3803 |
| 03-Sep-2024 |
Zhenlei Huang <zlei@FreeBSD.org> |
cxgbe(4): Stop checking for failures from malloc/mb_alloc_ext_pgs(M_WAITOK)
MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D45852
|
#
bbc32624 |
| 17-Jul-2024 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe/t4_tom: Detach the toep from the tcpcb when entering TIME_WAIT.
The kernel used to call tod_pcb_detach when entering TIME_WAIT but that seems to have changed, likely with the TIME_WAIT overhau
cxgbe/t4_tom: Detach the toep from the tcpcb when entering TIME_WAIT.
The kernel used to call tod_pcb_detach when entering TIME_WAIT but that seems to have changed, likely with the TIME_WAIT overhaul in the kernel some time ago. Catch up by having the driver perform the detach.
The hardware does not handle TIME_WAIT so it's important to detach and let the kernel arm the 2MSL timer to deal with it.
Reported by: Sony Arpita Das @ Chelsio Reviewed by: jhb MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D45990
show more ...
|
Revision tags: release/14.1.0 |
|
#
eba13bbc |
| 20-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe: Support TCP_USE_DDP on offloaded TOE connections
When this socket option is enabled, relatively large contiguous buffers are allocated and used to receive data from the remote connection. Wh
cxgbe: Support TCP_USE_DDP on offloaded TOE connections
When this socket option is enabled, relatively large contiguous buffers are allocated and used to receive data from the remote connection. When data is received a wrapper M_EXT mbuf is queued to the socket's receive buffer. This reduces the length of the linked list of received mbufs and allows consumers to consume receive data in larger chunks.
To minimize reprogramming the page pods in the adapter, receive buffers for a given connection are recycled. When a buffer has been fully consumed by the receiver and freed, the buffer is placed on a per-connection free buffers list.
The size of the receive buffers defaults to 256k and can be set via the hw.cxgbe.toe.ddp_rcvbuf_len sysctl. The hw.cxgbe.toe.ddp_rcvbuf_cache sysctl (defaults to 4) determines the maximum number of free buffers cached per connection. Note that this limit does not apply to "in-flight" receive buffers that are associated with mbufs in the socket's receive buffer.
Co-authored-by: Navdeep Parhar <np@FreeBSD.org> Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44001
show more ...
|
Revision tags: release/13.3.0 |
|
#
a5a965d7 |
| 31-Jan-2024 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe tom: Enable ULP_MODE_TCPDDP on demand
Most ULP modes in cxgbe's TOE are enabled on the fly when a protocol is needed (e.g. ULP_MODE_ISCSI is enabled by cxgbei when offloading a connection usin
cxgbe tom: Enable ULP_MODE_TCPDDP on demand
Most ULP modes in cxgbe's TOE are enabled on the fly when a protocol is needed (e.g. ULP_MODE_ISCSI is enabled by cxgbei when offloading a connection using iSCSI, and ULP_MODE_TLS is enabled when RX TLS keys are programmed for a TOE connection). The one exception to this is ULP_MODE_TCPDDP.
Currently the cxgbe driver enables ULP_MODE_TCPDDP when a TOE connection is first created. However, since DDP connections cannot be converted to other connection types, this requires some special handling in the driver. For example, iSCSI daemons use the SO_NO_DDP socket option to ensure TOE connections use ULP_MODE_NONE so they can be converted to ULP_MODE_ISCSI. Similarly, using TLS receive offload (ULP_MODE_TLS) requires disabling TCP DDP for new connections by default.
This commit changes cxgbe to instead switch a connection from ULP_MODE_NONE to ULP_MODE_TCPDDP when a connection first attempts to use TCP DDP via aio_read(2). This permits connections to always start as ULP_MODE_NONE and switch to a protocol-specific mode as needed.
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43670
show more ...
|
#
c3d4aea6 |
| 31-Jan-2024 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe: Add counters for POSIX async I/O requests handled by the driver
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43668
|
Revision tags: release/14.0.0 |
|
#
dcfddc8d |
| 09-Sep-2023 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe tom: Call t4_rcvd_locked from do_rx_data to return RX credits
In particular, the kernel RPC layer used by the NFS client never invokes pru_rcvd since it always reads data from the socket upcal
cxgbe tom: Call t4_rcvd_locked from do_rx_data to return RX credits
In particular, the kernel RPC layer used by the NFS client never invokes pru_rcvd since it always reads data from the socket upcall via MSG_SOCALLBCK which avoids calling pru_rcvd. As a result, on an NFS client connection managed by t4_tom, RX credits were never returned to the TOE connection to open the TCP window resulting in connection hangs.
To fix, expand the set of conditions in do_rx_data where RX credits are returned to match those in t4_rcvd_locked by calling the function directly.
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D41688
show more ...
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
4d846d26 |
| 10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
show more ...
|
Revision tags: release/13.2.0, release/12.4.0 |
|
#
2ff447ee |
| 15-Nov-2022 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe: Enable TOE TLS RX when an RX key is provided via setsockopt().
Rather than requiring a socket to be created as a TLS socket from the get go, switch a TOE socket from "plain" TOE to TLS mode w
cxgbe: Enable TOE TLS RX when an RX key is provided via setsockopt().
Rather than requiring a socket to be created as a TLS socket from the get go, switch a TOE socket from "plain" TOE to TLS mode when a receive key is added to the socket.
The firmware is only able to switch a "plain" TOE connection to TLS mode if the head of the pending socket data is the start of a TLS record, so the connection is migrated to TLS mode as a multi-step process.
When TOE TLS RX is enabled, the associated connection's receive side is frozen via a flag in the TCB. The state of the socket buffer is then examined to determine if the pending data in the socket buffer ends on a TLS record boundary. If so, the connection is migrated to TLS mode and unfrozen. Otherwise, the connection is unfrozen temporarily until more data arrives. Once more data arrives, the receive queue is frozen again and rechecked. This continues until the connection is paused at a record boundary. Any records received before TLS mode is enabled are decrypted as software records.
Note that this removes the 'rx_tls_ports' sysctl. TOE TLS offload for receive is now enabled automatically on existing TOE connections when using a KTLS-aware SSL library just as it was previously enabled automatically for TLS transmit. This also enables TLS offload for TOE connections which enable TLS after passing initial data in the clear (e.g. STARTTLS with SMTP).
Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D37351
show more ...
|
#
21186bdb |
| 15-Nov-2022 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe: Various whitespace fixes.
Mostly trailing whitespace and spaces before tabs.
Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D37350
|
#
e4bc19b2 |
| 08-Nov-2022 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe tom: Fix jobtotid() compilation.
The previous commit lost an implicit struct socket * cast. Use an inline function instead as the macro is already rather long.
Fixes: e1401f75790f cxgbe: us
cxgbe tom: Fix jobtotid() compilation.
The previous commit lost an implicit struct socket * cast. Use an inline function instead as the macro is already rather long.
Fixes: e1401f75790f cxgbe: use standard sototcpcb() accessor macro to get socket's tcpcb Sponsored by: Chelsio Communications
show more ...
|
#
9eb0e832 |
| 08-Nov-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: provide macros to access inpcb and socket from a tcpcb
There should be no functional changes with this commit.
Reviewed by: rscheff Differential revision: https://reviews.freebsd.org/D37123
|
#
e1401f75 |
| 20-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
cxgbe: use standard sototcpcb() accessor macro to get socket's tcpcb
Reviewed by: np Differential revision: https://reviews.freebsd.org/D37041
|
#
53af6903 |
| 07-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove INP_TIMEWAIT flag
Mechanically cleanup INP_TIMEWAIT from the kernel sources. After 0d7445193ab, this commit shall not cause any functional changes.
Note: this flag was very often check
tcp: remove INP_TIMEWAIT flag
Mechanically cleanup INP_TIMEWAIT from the kernel sources. After 0d7445193ab, this commit shall not cause any functional changes.
Note: this flag was very often checked together with INP_DROPPED. If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we will be able to remove most of this checks and turn them to assertions. Some of them can be turned into assertions right now, but that should be carefully done on a case by case basis.
Differential revision: https://reviews.freebsd.org/D36400
show more ...
|
#
43283184 |
| 12-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: use socket buffer mutexes in struct socket directly
Since c67f3b8b78e the sockbuf mutexes belong to the containing socket, and socket buffers just point to it. In 74a68313b50 macros that a
sockets: use socket buffer mutexes in struct socket directly
Since c67f3b8b78e the sockbuf mutexes belong to the containing socket, and socket buffers just point to it. In 74a68313b50 macros that access this mutex directly were added. Go over the core socket code and eliminate code that reaches the mutex by dereferencing the sockbuf compatibility pointer.
This change requires a KPI change, as some functions were given the sockbuf pointer only without any hint if it is a receive or send buffer.
This change doesn't cover the whole kernel, many protocols still use compatibility pointers internally. However, it allows operation of a protocol that doesn't use them.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35152
show more ...
|
Revision tags: release/13.1.0 |
|
#
d37dca9e |
| 19-Apr-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
cxgbe: plug a set-but-not-used var
Sponsored by: Rubicon Communications, LLC ("Netgate")
|
#
2beaefe8 |
| 11-Mar-2022 |
John Baldwin <jhb@FreeBSD.org> |
cxgbei: Support unmapped I/O requests.
- Add icl_pdu_append_bio and icl_pdu_get_bio methods.
- Add new page pod routines for allocating and writing page pods for unmapped bio requests. Use these
cxgbei: Support unmapped I/O requests.
- Add icl_pdu_append_bio and icl_pdu_get_bio methods.
- Add new page pod routines for allocating and writing page pods for unmapped bio requests. Use these new routines for setting up DDP for iSCSI tasks with a SCSI I/O CCB which uses CAM_DATA_BIO.
- When ICL_NOCOPY is used to append data from an unmapped I/O request to a PDU, construct unmapped mbufs from the relevant pages backing the struct bio. This also requires changes in the t4_push_pdus path to support unmapped mbufs.
Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D34383
show more ...
|
#
f64dc2ab |
| 26-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: TCP output method can request tcp_drop
The advanced TCP stacks (bbr, rack) may decide to drop a TCP connection when they do output on it. The default stack never does this, thus existing frame
tcp: TCP output method can request tcp_drop
The advanced TCP stacks (bbr, rack) may decide to drop a TCP connection when they do output on it. The default stack never does this, thus existing framework expects tcp_output() always to return locked and valid tcpcb.
Provide KPI extension to satisfy demands of advanced stacks. If the output method returns negative error code, it means that caller must call tcp_drop().
In tcp_var() provide three inline methods to call tcp_output(): - tcp_output() is a drop-in replacement for the default stack, so that default stack can continue using it internally without modifications. For advanced stacks it would perform tcp_drop() and unlock and report that with negative error code. - tcp_output_unlock() handles the negative code and always converts it to positive and always unlocks. - tcp_output_nodrop() just calls the method and leaves the responsibility to drop on the caller.
Sweep over the advanced stacks and use new KPI instead of using HPTS delayed drop queue for that.
Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D33370
show more ...
|
#
40fa3e40 |
| 26-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: mechanically substitute call to tfb_tcp_output to new method.
Made with sed(1) execution:
sed -Ef sed -i "" $(grep --exclude tcp_var.h -lr tcp_output sys/)
sed: s/tp->t_fb->tfb_tcp_output\(tp
tcp: mechanically substitute call to tfb_tcp_output to new method.
Made with sed(1) execution:
sed -Ef sed -i "" $(grep --exclude tcp_var.h -lr tcp_output sys/)
sed: s/tp->t_fb->tfb_tcp_output\(tp\)/tcp_output(tp)/ s/to tfb_tcp_output\(\)/to tcp_output()/
Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D33366
show more ...
|
Revision tags: release/12.3.0 |
|
#
e3ba94d4 |
| 09-Nov-2021 |
John Baldwin <jhb@FreeBSD.org> |
Don't require the socket lock for sorele().
Previously, sorele() always required the socket lock and dropped the lock if the released reference was not the last reference. Many callers locked the s
Don't require the socket lock for sorele().
Previously, sorele() always required the socket lock and dropped the lock if the released reference was not the last reference. Many callers locked the socket lock just before calling sorele() resulting in a wasted lock/unlock when not dropping the last reference.
Move the previous implementation of sorele() into a new sorele_locked() function and use it instead of sorele() for various places in uipc_socket.c that called sorele() while already holding the socket lock.
The sorele() macro now uses refcount_release_if_not_last() try to drop the socket reference without locking the socket. If that shortcut fails, it locks the socket and calls sorele_locked().
Reviewed by: kib, markj Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D32741
show more ...
|
#
9affbb0f |
| 14-Sep-2021 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe tom: Enter network epoch in t4_aiotx_task().
While here, don't restore the old vnet until after sorele().
Sponsored by: Chelsio Communications
|
#
5dbf8c15 |
| 14-Sep-2021 |
John Baldwin <jhb@FreeBSD.org> |
cxgbe tom: Update rcv_nxt for a FIN after handle_ddp_close().
For TCP DDP, handle_ddp_close() needs to see the pre-FIN rcv_nxt to determine how much data was placed in the local buffer before the FI
cxgbe tom: Update rcv_nxt for a FIN after handle_ddp_close().
For TCP DDP, handle_ddp_close() needs to see the pre-FIN rcv_nxt to determine how much data was placed in the local buffer before the FIN was received. The changes in d59f1c49e26b broke this by updating rcv_nxt before calling handle_ddp_close().
Fixes: d59f1c49e26b cxgbe tom: Permit rcv_nxt mismatches on FIN for iSCSI connections on T6. Sponsored by: Chelsio Communications
show more ...
|