#
3f43ada9 |
| 28-Jan-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Catch up with 6edfd179c86: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG.
Originally IFCAP_NOMAP meant that the mbuf has external storage pointer that points to unmapped address. Then, this was e
Catch up with 6edfd179c86: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG.
Originally IFCAP_NOMAP meant that the mbuf has external storage pointer that points to unmapped address. Then, this was extended to array of such pointers. Then, such mbufs were augmented with header/trailer. Basically, extended mbufs are extended, and set of features is subject to change. The new name should be generic enough to avoid further renaming.
show more ...
|
#
4dc1b17d |
| 20-Jan-2021 |
Mark Johnston <markj@FreeBSD.org> |
ktls: Improve handling of the bind_threads tunable a bit
- Only check for empty domains if we actually tried to configure domain affinity in the first place. Otherwise setting bind_threads=1 will
ktls: Improve handling of the bind_threads tunable a bit
- Only check for empty domains if we actually tried to configure domain affinity in the first place. Otherwise setting bind_threads=1 will always cause the sysctl value to be reported as zero. This is harmless since the threads end up being bound, but it's confusing. - Try to improve the sysctl description a bit.
Reviewed by: gallatin, jhb Submitted by: Klara, Inc. Sponsored by: Ampere Computing Differential Revision: https://reviews.freebsd.org/D28161
show more ...
|
#
6685e259 |
| 08-Jan-2021 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: don't use KTLS socket option on listening sockets
KTLS socket options make use of socket buffers, which are not available for listening sockets.
Reported by: syzbot+a8829e888a93a4a04619@syzka
tcp: don't use KTLS socket option on listening sockets
KTLS socket options make use of socket buffers, which are not available for listening sockets.
Reported by: syzbot+a8829e888a93a4a04619@syzkaller.appspotmail.com Reviewed by: jhb@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D27948
show more ...
|
#
02bc3865 |
| 19-Dec-2020 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Optionally bind ktls threads to NUMA domains
When ktls_bind_thread is 2, we pick a ktls worker thread that is bound to the same domain as the TCP connection associated with the socket. We use roughl
Optionally bind ktls threads to NUMA domains
When ktls_bind_thread is 2, we pick a ktls worker thread that is bound to the same domain as the TCP connection associated with the socket. We use roughly the same code as netinet/tcp_hpts.c to do this. This allows crypto to run on the same domain as the TCP connection is associated with. Assuming TCP_REUSPORT_LB_NUMA (D21636) is in place & in use, this ensures that the crypto source and destination buffers are local to the same NUMA domain as we're running crypto on.
This change (when TCP_REUSPORT_LB_NUMA, D21636, is used) reduces cross-domain traffic from over 37% down to about 13% as measured by pcm.x on a dual-socket Xeon using nginx and a Netflix workload.
Reviewed by: jhb Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21648
show more ...
|
#
36e0a362 |
| 30-Oct-2020 |
John Baldwin <jhb@FreeBSD.org> |
Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc().
This gives a more uniform API for send tag life cycle management.
Reviewed by: gallatin, hselasky Sponsored by: Netflix Differential Re
Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc().
This gives a more uniform API for send tag life cycle management.
Reviewed by: gallatin, hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D27000
show more ...
|
#
521eac97 |
| 29-Oct-2020 |
John Baldwin <jhb@FreeBSD.org> |
Support hardware rate limiting (pacing) with TLS offload.
- Add a new send tag type for a send tag that supports both rate limiting (packet pacing) and TLS offload (mostly similar to D22669 but
Support hardware rate limiting (pacing) with TLS offload.
- Add a new send tag type for a send tag that supports both rate limiting (packet pacing) and TLS offload (mostly similar to D22669 but adds a separate structure when allocating the new tag type).
- When allocating a send tag for TLS offload, check to see if the connection already has a pacing rate. If so, allocate a tag that supports both rate limiting and TLS offload rather than a plain TLS offload tag.
- When setting an initial rate on an existing ifnet KTLS connection, set the rate in the TCP control block inp and then reset the TLS send tag (via ktls_output_eagain) to reallocate a TLS + ratelimit send tag. This allocates the TLS send tag asynchronously from a task queue, so the TLS rate limit tag alloc is always sleepable.
- When modifying a rate on a connection using KTLS, look for a TLS send tag. If the send tag is only a plain TLS send tag, assume we failed to allocate a TLS ratelimit tag (either during the TCP_TXTLS_ENABLE socket option, or during the send tag reset triggered by ktls_output_eagain) and ignore the new rate. If the send tag is a ratelimit TLS send tag, change the rate on the TLS tag and leave the inp tag alone.
- Lock the inp lock when setting sb_tls_info for a socket send buffer so that the routines in tcp_ratelimit can safely dereference the pointer without needing to grab the socket buffer lock.
- Add an IFCAP_TXTLS_RTLMT capability flag and associated administrative controls in ifconfig(8). TLS rate limit tags are only allocated if this capability is enabled. Note that TLS offload (whether unlimited or rate limited) always requires IFCAP_TXTLS[46].
Reviewed by: gallatin, hselasky Relnotes: yes Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D26691
show more ...
|
Revision tags: release/12.2.0 |
|
#
6bcf3c46 |
| 19-Oct-2020 |
John Baldwin <jhb@FreeBSD.org> |
Check TF_TOE not the tod pointer to determine if TOE is active.
The TF_TOE flag is the check used in the rest of the network stack to determine if TOE is active on a socket. There is at least one p
Check TF_TOE not the tod pointer to determine if TOE is active.
The TF_TOE flag is the check used in the rest of the network stack to determine if TOE is active on a socket. There is at least one path in the cxgbe(4) TOE driver that can leave the tod pointer non-NULL on a socket not using TOE.
Reported by: Sony Arpita Das <sonyarpitad@chelsio.com> Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D26803
show more ...
|
#
c2a8fd6f |
| 13-Oct-2020 |
John Baldwin <jhb@FreeBSD.org> |
Permit sending empty fragments for TLS 1.0.
Due to a weakness in the TLS 1.0 protocol, OpenSSL will periodically send empty TLS records ("empty fragments"). These TLS records have no payload (and t
Permit sending empty fragments for TLS 1.0.
Due to a weakness in the TLS 1.0 protocol, OpenSSL will periodically send empty TLS records ("empty fragments"). These TLS records have no payload (and thus a page count of zero). m_uiotombuf_nomap() was returning NULL instead of an empty mbuf, and a few places needed to be updated to treat an empty TLS record as having a page count of "1" as 0 means "no work to do" (e.g. nothing to encrypt, or nothing to mark ready via sbready()).
Reviewed by: gallatin Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D26729
show more ...
|
#
d29a3de2 |
| 05-Sep-2020 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
uipc_ktls: remove unused static function
m_segments() was added with r363464 but never used. Remove it to avoid warnings when compiling kernels.
Reported by: rmacklem (also says jhb) Reviewed by: g
uipc_ktls: remove unused static function
m_segments() was added with r363464 but never used. Remove it to avoid warnings when compiling kernels.
Reported by: rmacklem (also says jhb) Reviewed by: gallatin, jhb Differential Revision: https://reviews.freebsd.org/D26330
show more ...
|
#
9675d889 |
| 04-Sep-2020 |
Andrew Gallatin <gallatin@FreeBSD.org> |
ktls: Check for a NULL send tag in ktls_cleanup()
When using ifnet ktls, and when ktls_reset_send_tag() fails to allocate a replacement tag, it leaves the tls session's snd_tag pointer NULL. ktls_cl
ktls: Check for a NULL send tag in ktls_cleanup()
When using ifnet ktls, and when ktls_reset_send_tag() fails to allocate a replacement tag, it leaves the tls session's snd_tag pointer NULL. ktls_cleanup() tries to release the send tag, and will trip over this NULL pointer and panic unless NULL is checked for.
Reviewed by: jhb Sponsored by: Netflix
show more ...
|
#
c7aa572c |
| 31-Jul-2020 |
Glen Barber <gjb@FreeBSD.org> |
MFH
Sponsored by: Rubicon Communications, LLC (netgate.com)
|
#
3c0e5685 |
| 24-Jul-2020 |
John Baldwin <jhb@FreeBSD.org> |
Add support for KTLS RX via software decryption.
Allow TLS records to be decrypted in the kernel after being received by a NIC. At a high level this is somewhat similar to software KTLS for the tra
Add support for KTLS RX via software decryption.
Allow TLS records to be decrypted in the kernel after being received by a NIC. At a high level this is somewhat similar to software KTLS for the transmit path except in reverse. Protocols enqueue mbufs containing encrypted TLS records (or portions of records) into the tail of a socket buffer and the KTLS layer decrypts those records before returning them to userland applications. However, there is an important difference:
- In the transmit case, the socket buffer is always a single "record" holding a chain of mbufs. Not-yet-encrypted mbufs are marked not ready (M_NOTREADY) and released to protocols for transmit by marking mbufs ready once their data is encrypted.
- In the receive case, incoming (encrypted) data appended to the socket buffer is still a single stream of data from the protocol, but decrypted TLS records are stored as separate records in the socket buffer and read individually via recvmsg().
Initially I tried to make this work by marking incoming mbufs as M_NOTREADY, but there didn't seemed to be a non-gross way to deal with picking a portion of the mbuf chain and turning it into a new record in the socket buffer after decrypting the TLS record it contained (along with prepending a control message). Also, such mbufs would also need to be "pinned" in some way while they are being decrypted such that a concurrent sbcut() wouldn't free them out from under the thread performing decryption.
As such, I settled on the following solution:
- Socket buffers now contain an additional chain of mbufs (sb_mtls, sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by the protocol layer. These mbufs are still marked M_NOTREADY, but soreceive*() generally don't know about them (except that they will block waiting for data to be decrypted for a blocking read).
- Each time a new mbuf is appended to this TLS mbuf chain, the socket buffer peeks at the TLS record header at the head of the chain to determine the encrypted record's length. If enough data is queued for the TLS record, the socket is placed on a per-CPU TLS workqueue (reusing the existing KTLS workqueues and worker threads).
- The worker thread loops over the TLS mbuf chain decrypting records until it runs out of data. Each record is detached from the TLS mbuf chain while it is being decrypted to keep the mbufs "pinned". However, a new sb_dtlscc field tracks the character count of the detached record and sbcut()/sbdrop() is updated to account for the detached record. After the record is decrypted, the worker thread first checks to see if sbcut() dropped the record. If so, it is freed (can happen when a socket is closed with pending data). Otherwise, the header and trailer are stripped from the original mbufs, a control message is created holding the decrypted TLS header, and the decrypted TLS record is appended to the "normal" socket buffer chain.
(Side note: the SBCHECK() infrastucture was very useful as I was able to add assertions there about the TLS chain that caught several bugs during development.)
Tested by: rmacklem (various versions) Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24628
show more ...
|
#
4a711b8d |
| 25-Jun-2020 |
John Baldwin <jhb@FreeBSD.org> |
Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full allocation is always zeroed avoiding possible bugs with incorrect lengths p
Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full allocation is always zeroed avoiding possible bugs with incorrect lengths passed to explicit_bzero().
Suggested by: cem Reviewed by: cem, delphij Approved by: csprng (cem) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25435
show more ...
|
Revision tags: release/11.4.0 |
|
#
4f3c0f3d |
| 26-May-2020 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Fix build issue after r360292 when using both RSS and KERN_TLS options.
Sponsored by: Mellanox Technologies
|
#
6edfd179 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Step 4.1: mechanically rename M_NOMAP to M_EXTPG
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
|
#
7b6c99d0 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf within m_epg namespace. All edits except the 'struct mbuf' declaration and mb_dupcl() were done mechanically with sed:
Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf within m_epg namespace. All edits except the 'struct mbuf' declaration and mb_dupcl() were done mechanically with sed:
s/->m_ext_pgs.nrdy/->m_epg_nrdy/g s/->m_ext_pgs.hdr_len/->m_epg_hdrlen/g s/->m_ext_pgs.trail_len/->m_epg_trllen/g s/->m_ext_pgs.first_pg_off/->m_epg_1st_off/g s/->m_ext_pgs.last_pg_len/->m_epg_last_len/g s/->m_ext_pgs.flags/->m_epg_flags/g s/->m_ext_pgs.record_type/->m_epg_record_type/g s/->m_ext_pgs.enc_cnt/->m_epg_enc_cnt/g s/->m_ext_pgs.tls/->m_epg_tls/g s/->m_ext_pgs.so/->m_epg_so/g s/->m_ext_pgs.seqno/->m_epg_seqno/g s/->m_ext_pgs.stailq/->m_epg_stailq/g
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
show more ...
|
#
bccf6e26 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Step 2.5: Stop using 'struct mbuf_ext_pgs' in the kernel itself.
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
|
#
c4ee38f8 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Step 2.3: Rename mbuf_ext_pg_len() to m_epg_pagelen() that uses mbuf argument.
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
|
#
d90fe9d0 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Step 2.1: Build TLS workqueue from mbufs, not struct mbuf_ext_pgs.
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
|
#
eeec8348 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Get rid of the mbuf self-pointing pointer.
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
|
#
7433a5a9 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Start moving into EPG_/epg_ namespace. There is only one flag, but next commit brings in second flag, so let them already be in the future namespace.
Reviewed by: gallatin Differential Revision: ht
Start moving into EPG_/epg_ namespace. There is only one flag, but next commit brings in second flag, so let them already be in the future namespace.
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
show more ...
|
#
0c103266 |
| 03-May-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Continuation of multi page mbuf redesign from r359919.
The following series of patches addresses three things:
Now that array of pages is embedded into mbuf, we no longer need separate structure to
Continuation of multi page mbuf redesign from r359919.
The following series of patches addresses three things:
Now that array of pages is embedded into mbuf, we no longer need separate structure to pass around, so struct mbuf_ext_pgs is an artifact of the first implementation. And struct mbuf_ext_pgs_data is a crutch to accomodate the main idea r359919 with minimal churn.
Also, M_EXT of type EXT_PGS are just a synonym of M_NOMAP.
The namespace for the newfeature is somewhat inconsistent and sometimes has a lengthy prefixes. In these patches we will gradually bring the namespace to "m_epg" prefix for all mbuf fields and most functions.
Step 1 of 4:
o Anonymize mbuf_ext_pgs_data, embed in m_ext o Embed mbuf_ext_pgs o Start documenting all this entanglement
Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598
show more ...
|
#
f1f93475 |
| 28-Apr-2020 |
John Baldwin <jhb@FreeBSD.org> |
Initial support for kernel offload of TLS receive.
- Add a new TCP_RXTLS_ENABLE socket option to set the encryption and authentication algorithms and keys as well as the initial sequence number.
Initial support for kernel offload of TLS receive.
- Add a new TCP_RXTLS_ENABLE socket option to set the encryption and authentication algorithms and keys as well as the initial sequence number.
- When reading from a socket using KTLS receive, applications must use recvmsg(). Each successful call to recvmsg() will return a single TLS record. A new TCP control message, TLS_GET_RECORD, will contain the TLS record header of the decrypted record. The regular message buffer passed to recvmsg() will receive the decrypted payload. This is similar to the interface used by Linux's KTLS RX except that Linux does not return the full TLS header in the control message.
- Add plumbing to the TOE KTLS interface to request either transmit or receive KTLS sessions.
- When a socket is using receive KTLS, redirect reads from soreceive_stream() into soreceive_generic().
- Note that this interface is currently only defined for TLS 1.1 and 1.2, though I believe we will be able to reuse the same interface and structures for 1.3.
show more ...
|
#
ec1db6e1 |
| 28-Apr-2020 |
John Baldwin <jhb@FreeBSD.org> |
Add the initial sequence number to the TLS enable socket option.
This will be needed for KTLS RX.
Reviewed by: gallatin Sponsored by: Chelsio Communications Differential Revision: https://reviews.f
Add the initial sequence number to the TLS enable socket option.
This will be needed for KTLS RX.
Reviewed by: gallatin Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24451
show more ...
|
#
454d3896 |
| 25-Apr-2020 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
Fix LINT build #2 after r360292.
Pointyhat to: melifaro
|