#
75dfc66c |
| 27-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r358269 through r358399.
|
#
7029da5c |
| 26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly mark
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
show more ...
|
#
74dc6beb |
| 14-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357855 through r357920.
|
#
ca3b3c57 |
| 13-Feb-2020 |
John Baldwin <jhb@FreeBSD.org> |
Remove the per-TXQ tls_wrs stat.
It duplicated the kern_tls_records stat and was not conditional on NIC TLS being enabled.
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision
Remove the per-TXQ tls_wrs stat.
It duplicated the kern_tls_records stat and was not conditional on NIC TLS being enabled.
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D23670
show more ...
|
#
bc02c18c |
| 07-Feb-2020 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r357408 through r357661.
|
#
87bbb333 |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Add pfil(9) hooks to the driver's rx.
MFC after: 1 week Sponsored by: Chelsio Communications
|
#
1486d2de |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Treat NIC rx as special and run its handler directly and not via the t4_cpl_handler dispatch table.
MFC after: 1 week Sponsored by: Chelsio Communications
|
#
46e1e307 |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Retire the allow_mbufs_in_cluster optimization.
This simplifies the driver's rx fast path as well as the bookkeeping code that tracks various rx buffer sizes and layouts.
MFC after: 1 wee
cxgbe(4): Retire the allow_mbufs_in_cluster optimization.
This simplifies the driver's rx fast path as well as the bookkeeping code that tracks various rx buffer sizes and layouts.
MFC after: 1 week Sponsored by: Chelsio Communications
show more ...
|
#
d6f79b27 |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Avoid ext_arg2 in rxb_free.
ext_arg2 is the only item in the third cacheline in an mbuf and could be cold by the time rxb_free runs. Put the information needed by rxb_free in the same lin
cxgbe(4): Avoid ext_arg2 in rxb_free.
ext_arg2 is the only item in the third cacheline in an mbuf and could be cold by the time rxb_free runs. Put the information needed by rxb_free in the same line as the refcount, which is very likely to be hot given that rxb_free runs when the refcount is decremented and reaches 0.
MFC after: 1 week Sponsored by: Chelsio Communications
show more ...
|
#
44c6fea8 |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Do not use pack boundary > 512B unless it is explicitly requested.
This is a tradeoff between PCIe efficiency during large packet rx and packing efficiency during small packet rx.
MFC aft
cxgbe(4): Do not use pack boundary > 512B unless it is explicitly requested.
This is a tradeoff between PCIe efficiency during large packet rx and packing efficiency during small packet rx.
MFC after: 1 week Sponsored by: Chelsio Communications
show more ...
|
#
a9c4062a |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Initialize the rx buffer's metadata on first-use and not on allocation.
refill_fl doesn't touch any part of a freshly allocated cluster after this change.
MFC after: 1 week Sponsored by:
cxgbe(4): Initialize the rx buffer's metadata on first-use and not on allocation.
refill_fl doesn't touch any part of a freshly allocated cluster after this change.
MFC after: 1 week Sponsored by: Chelsio Communications
show more ...
|
#
9087a3df |
| 04-Feb-2020 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Only checksummed TCP should be considered for LRO.
This avoids the per-packet nanouptime in tcp_lro_rx for traffic that's not even TCP.
MFC after: 1 week Sponsored by: Chelsio Communicati
cxgbe(4): Only checksummed TCP should be considered for LRO.
This avoids the per-packet nanouptime in tcp_lro_rx for traffic that's not even TCP.
MFC after: 1 week Sponsored by: Chelsio Communications
show more ...
|
#
e9edde41 |
| 07-Jan-2020 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Fix a typo - passing wrong mbuf pointer to needs_udp_csum(). Will trigger panic only on a kernel with RATELIMIT.
Submitted by: rrs
|
#
c0236bd9 |
| 13-Dec-2019 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic.
CPL_TX_PKT_XT disables the internal parser on the chip and instead relies on the driver to provide the exact length of the L2 a
cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic.
CPL_TX_PKT_XT disables the internal parser on the chip and instead relies on the driver to provide the exact length of the L2 and L3 headers. This allows hw checksumming and TSO to be used with L2 and L3 encapsulations that the chip doesn't understand directly.
Note that netmap tx still uses the old CPL as it never uses the hw to generate the checksum on tx.
Reviewed by: jhb@ MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D22788
show more ...
|
#
aa7bdbc0 |
| 10-Dec-2019 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Use TX_PKTS2 work requests in netmap Tx if it's available.
TX_PKTS2 is more efficient within the firmware and this improves netmap Tx by a few Mpps in some common scenarios.
MFC after: 1
cxgbe(4): Use TX_PKTS2 work requests in netmap Tx if it's available.
TX_PKTS2 is more efficient within the firmware and this improves netmap Tx by a few Mpps in some common scenarios.
MFC after: 1 week Sponsored by: Chelsio Communications
show more ...
|
#
bddf7343 |
| 21-Nov-2019 |
John Baldwin <jhb@FreeBSD.org> |
NIC KTLS for Chelsio T6 adapters.
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters. Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE connections.
NIC KTLS on T6 is n
NIC KTLS for Chelsio T6 adapters.
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters. Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE connections.
NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment the encrypted TLS frames output by the crypto engine. Instead, the TOE is placed into a special setup to permit "dummy" connections to be associated with regular sockets using KTLS. This permits using the TOE to segment the encrypted TLS records. However, this approach does have some limitations:
1) Regular TOE sockets cannot be used when the TOE is in this special mode. One can use either TOE and TOE-based KTLS or NIC KTLS, but not both at the same time.
2) In NIC KTLS mode, the TOE is only able to accept a per-connection timestamp offset that varies in the upper 4 bits. Put another way, only connections whose timestamp offset has the 28 lower bits cleared can use NIC KTLS and generate correct timestamps. The driver will refuse to enable NIC KTLS on connections with a timestamp offset with any of the lower 28 bits set. To use NIC KTLS, users can either disable TCP timestamps by setting the net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the tcp_new_ts_offset() function to clear the lower 28 bits of the generated offset.
3) Because the TCP segmentation relies on fields mirrored in a TCB in the TOE, not all fields in a TCP packet can be sent in the TCP segments generated from a TLS record. Specifically, for packets containing TCP options other than timestamps, the driver will inject an "empty" TCP packet holding the requested options (e.g. a SACK scoreboard) along with the segments from the TLS record. These empty TCP packets are counted by the dev.cc.N.txq.M.kern_tls_options sysctls.
Unlike TOE TLS which is able to buffer encrypted TLS records in on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS records for retransmit requests as well as non-retransmit requests that do not include the start of a TLS record but do include the trailer. The T6 NIC KTLS code tries to optimize some of the cases for requests to transmit partial TLS records. In particular it attempts to minimize sending "waste" bytes that have to be given as input to the crypto engine but are not needed on the wire to satisfy mbufs sent from the TCP stack down to the driver.
TCP packets for TLS requests are broken down into the following classes (with associated counters):
- Mbufs that send an entire TLS record in full do not have any waste bytes (dev.cc.N.txq.M.kern_tls_full).
- Mbufs that send a short TLS record that ends before the end of the trailer (dev.cc.N.txq.M.kern_tls_short). For sockets using AES-CBC, the encryption must always start at the beginning, so if the mbuf starts at an offset into the TLS record, the offset bytes will be "waste" bytes. For sockets using AES-GCM, the encryption can start at the 16 byte block before the starting offset capping the waste at 15 bytes.
- Mbufs that send a partial TLS record that has a non-zero starting offset but ends at the end of the trailer (dev.cc.N.txq.M.kern_tls_partial). In order to compute the authentication hash stored in the trailer, the entire TLS record must be sent as input to the crypto engine, so the bytes before the offset are always "waste" bytes.
In addition, other per-txq sysctls are provided:
- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq using AES-CBC.
- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq using AES-GCM.
- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to compensate for the TOE engine not being able to set FIN on the last segment of a TLS record if the TLS record mbuf had FIN set.
- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this txq including full, short, and partial records.
- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header and payload) sent for TLS record requests.
- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS record requests.
To enable NIC KTLS with T6, set the following tunables prior to loading the cxgbe(4) driver:
hw.cxgbe.config_file=kern_tls hw.cxgbe.kern_tls=1
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21962
show more ...
|
Revision tags: release/12.1.0 |
|
#
adb0cd84 |
| 25-Oct-2019 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Use correct FetchBurstMin values for T6.
MFC after: 1 week Sponsored by: Chelsio Communications
|
#
e38a50e8 |
| 22-Oct-2019 |
John Baldwin <jhb@FreeBSD.org> |
Split Chelsio send tags into a generic base tag and a ratelimit tag.
NIC KTLS will add a new TLS send tag type in cxgbe(4) that is a distinct tag from a ratelimit tag. To support this, refactor cxg
Split Chelsio send tags into a generic base tag and a ratelimit tag.
NIC KTLS will add a new TLS send tag type in cxgbe(4) that is a distinct tag from a ratelimit tag. To support this, refactor cxgbe_snd_tag to be a simple send tag with a type and convert the existing ratelimit tag to a new cxgbe_rate_tag structure.
Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D22072
show more ...
|
#
693a9dfc |
| 15-Oct-2019 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): An EQ update can be requested in a TX_PKTS2 work request.
MFC after: 1 week Sponsored by: Chelsio Communications
|
#
c5c3ba6b |
| 03-Sep-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r351317 through r351731.
|
#
8bf30903 |
| 24-Aug-2019 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Use the same buffer size for TOE rx queues as the NIC rx queues.
This is a minor simplification.
MFC after: 1 week Sponsored by: Chelsio Communications
|
#
a63915c2 |
| 28-Jul-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @r350386
Sponsored by: The FreeBSD Foundation
|
Revision tags: release/11.3.0 |
|
#
d76bbe17 |
| 29-Jun-2019 |
John Baldwin <jhb@FreeBSD.org> |
Add support for IFCAP_NOMAP to cxgbe(4).
Since cxgbe(4) uses sglist instead of bus_dma, this required updates to the code that generates scatter/gather lists for packets. Also, unmapped mbufs are a
Add support for IFCAP_NOMAP to cxgbe(4).
Since cxgbe(4) uses sglist instead of bus_dma, this required updates to the code that generates scatter/gather lists for packets. Also, unmapped mbufs are always sent via DMA and never as immediate data in the payload of a work request.
Submitted by: gallatin (earlier version) Reviewed by: gallatin, hselasky, rrs Discussed with: np Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20616
show more ...
|
#
0269ae4c |
| 06-Jun-2019 |
Alan Somers <asomers@FreeBSD.org> |
MFHead @348740
Sponsored by: The FreeBSD Foundation
|
#
fb3bc596 |
| 25-May-2019 |
John Baldwin <jhb@FreeBSD.org> |
Restructure mbuf send tags to provide stronger guarantees.
- Perform ifp mismatch checks (to determine if a send tag is allocated for a different ifp than the one the packet is being output on), i
Restructure mbuf send tags to provide stronger guarantees.
- Perform ifp mismatch checks (to determine if a send tag is allocated for a different ifp than the one the packet is being output on), in ip_output() and ip6_output(). This avoids sending packets with send tags to ifnet drivers that don't support send tags.
Since we are now checking for ifp mismatches before invoking if_output, we can now try to allocate a new tag before invoking if_output sending the original packet on the new tag if allocation succeeds.
To avoid code duplication for the fragment and unfragmented cases, add ip_output_send() and ip6_output_send() as wrappers around if_output and nd6_output_ifp, respectively. All of the logic for setting send tags and dealing with send tag-related errors is done in these wrapper functions.
For pseudo interfaces that wrap other network interfaces (vlan and lagg), wrapper send tags are now allocated so that ip*_output see the wrapper ifp as the ifp in the send tag. The if_transmit routines rewrite the send tags after performing an ifp mismatch check. If an ifp mismatch is detected, the transmit routines fail with EAGAIN.
- To provide clearer life cycle management of send tags, especially in the presence of vlan and lagg wrapper tags, add a reference count to send tags managed via m_snd_tag_ref() and m_snd_tag_rele(). Provide a helper function (m_snd_tag_init()) for use by drivers supporting send tags. m_snd_tag_init() takes care of the if_ref on the ifp meaning that code alloating send tags via if_snd_tag_alloc no longer has to manage that manually. Similarly, m_snd_tag_rele drops the refcount on the ifp after invoking if_snd_tag_free when the last reference to a send tag is dropped.
This also closes use after free races if there are pending packets in driver tx rings after the socket is closed (e.g. from tcpdrop).
In order for m_free to work reliably, add a new CSUM_SND_TAG flag in csum_flags to indicate 'snd_tag' is set (rather than 'rcvif'). Drivers now also check this flag instead of checking snd_tag against NULL. This avoids false positive matches when a forwarded packet has a non-NULL rcvif that was treated as a send tag.
- cxgbe was relying on snd_tag_free being called when the inp was detached so that it could kick the firmware to flush any pending work on the flow. This is because the driver doesn't require ACK messages from the firmware for every request, but instead does a kind of manual interrupt coalescing by only setting a flag to request a completion on a subset of requests. If all of the in-flight requests don't have the flag when the tag is detached from the inp, the flow might never return the credits. The current snd_tag_free command issues a flush command to force the credits to return. However, the credit return is what also frees the mbufs, and since those mbufs now hold references on the tag, this meant that snd_tag_free would never be called.
To fix, explicitly drop the mbuf's reference on the snd tag when the mbuf is queued in the firmware work queue. This means that once the inp's reference on the tag goes away and all in-flight mbufs have been queued to the firmware, tag's refcount will drop to zero and snd_tag_free will kick in and send the flush request. Note that we need to avoid doing this in the middle of ethofld_tx(), so the driver grabs a temporary reference on the tag around that loop to defer the free to the end of the function in case it sends the last mbuf to the queue after the inp has dropped its reference on the tag.
- mlx5 preallocates send tags and was using the ifp pointer even when the send tag wasn't in use. Explicitly use the ifp from other data structures instead.
- Sprinkle some assertions in various places to assert that received packets don't have a send tag, and that other places that overwrite rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.
Reviewed by: gallatin, hselasky, rgrimes, ae Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20117
show more ...
|