#
1979b511 |
| 07-Aug-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Allow user-configured and driver-configured traffic classes to be used simultaneously. Move sysctl_tc and sysctl_tc_params to t4_sched.c while here.
MFC after: 3 weeks Sponsored by: Chelsio Communications
|
Revision tags: release/11.2.0 |
|
#
d7c5a620 |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@
gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace.
When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see the following using nstat -I mce0 1 before the patch:
InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
4.98    0.00   4.42   0.00  4235592  33       83.80  4720653   2149771  1235  247.32
4.73    0.00   4.20   0.00  4025260  33       82.99  4724900   2139833  1204  247.32
4.72    0.00   4.20   0.00  4035252  33       82.14  4719162   2132023  1264  247.32
4.71    0.00   4.21   0.00  4073206  33       83.68  4744973   2123317  1347  247.32
4.72    0.00   4.21   0.00  4061118  33       80.82  4713615   2188091  1490  247.32
4.72    0.00   4.21   0.00  4051675  33       85.29  4727399   2109011  1205  247.32
4.73    0.00   4.21   0.00  4039056  33       84.65  4724735   2102603  1053  247.32
After the patch
InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
5.43    0.00   4.20   0.00  3313143  33       84.96  5434214   1900162  2656  245.51
5.43    0.00   4.20   0.00  3308527  33       85.24  5439695   1809382  2521  245.51
5.42    0.00   4.19   0.00  3316778  33       87.54  5416028   1805835  2256  245.51
5.42    0.00   4.19   0.00  3317673  33       90.44  5426044   1763056  2332  245.51
5.42    0.00   4.19   0.00  3314839  33       88.11  5435732   1792218  2499  245.52
5.44    0.00   4.19   0.00  3293228  33       91.84  5426301   1668597  2121  245.52
Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch
Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366
|
#
b3daa684 |
| 15-May-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Filtering related features and fixes.
- Driver support for hardware NAT.
- Driver support for swapmac action.
- Validate a request to create a hashfilter against the filter mask.
- Add a hashfilter config file for T5.
Sponsored by: Chelsio Communications
|
#
4535e804 |
| 30-Apr-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Use opaque cookies or tid range-checks to determine the intended recipient of a CPL when it can't be determined solely from the opcode. Retire the per-queue handlers for such CPLs in favor of the new scheme.
Sponsored by: Chelsio Communications
|
#
8896672a |
| 27-Apr-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Move release_tid to the base NIC driver for future consumers.
Sponsored by: Chelsio Communications.
|
#
3747c1ff |
| 26-Apr-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Break up alloc_tid_tabs and move the atid routines to the base NIC driver. The atid services will be used by new features (hashfilters and inline TLS) that do not involve TOE.
Sponsored by: Chelsio Communications
|
#
8aa1c1d8 |
| 19-Apr-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Fix bugs in the handling of COP rules that match on VLAN tag.
Retrieve the tag from the correct ifnet and use the provided tag (instead of hardcoded 0xffff, implying no tag) in the routines that process offload policy.
Submitted by: Krishnamraju Eraparaju @ Chelsio Sponsored by: Chelsio Communications
|
#
1131c927 |
| 14-Apr-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection using t4_tom, and what settings to apply to a connection selected for offload. t4_tom must still be loaded and IFCAP_TOE must still be enabled for full TCP offload to take place on an interface. The difference is that IFCAP_TOE used to be the only knob and would enable TOE for all new connections on the interface, but now the driver will also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line. Each rule has a "match" part consisting of a socket-type (L = listen, A = active open, P = passive open, D = don't care) and a pcap-filter(7) expression, and a "settings" part that specifies whether to offload the connection or not and the parameters to use if so. The general format of a rule is:
    [socket-type] expr => settings
Example. See cxgbetool(8) for more information.
    [L] ip && port http => offload
    [L] port 443 => !offload
    [L] port ssh => offload
    [P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
    [P] dst port ssh => offload !nagle ecn cong tahoe
    [P] dst port http => offload
    [A] dst port 443 => offload tls
    [A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or passive open and stops at the first match. There is an implicit rule at the end of every policy that prohibits offload when no rule in the policy matches:
    [D] all => !offload
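The first-match behaviour described above can be sketched in a few lines of Python. This is an illustrative model only, not the driver's code: the names Rule, parse_rule and lookup are hypothetical, and a caller-supplied predicate (expr_matches) stands in for real pcap-filter(7) evaluation, which this sketch does not attempt.

```python
from typing import Callable, List, NamedTuple


class Rule(NamedTuple):
    socket_type: str  # 'L', 'A', 'P', or 'D' (don't care)
    expr: str         # pcap-filter(7) expression, kept as opaque text here
    settings: str     # e.g. "offload !nagle" or "!offload"


# The implicit last rule of every policy, as stated in the commit message.
IMPLICIT_RULE = Rule("D", "all", "!offload")


def parse_rule(line: str) -> Rule:
    """Split '[socket-type] expr => settings' into its three parts."""
    head, sep, settings = line.partition("=>")
    head = head.strip()
    if not sep or not head.startswith("[") or "]" not in head:
        raise ValueError(f"malformed rule: {line!r}")
    socket_type, _, expr = head[1:].partition("]")
    socket_type = socket_type.strip()
    if socket_type not in ("L", "A", "P", "D"):
        raise ValueError(f"unknown socket-type: {socket_type!r}")
    return Rule(socket_type, expr.strip(), settings.strip())


def lookup(rules: List[Rule], socket_type: str,
           expr_matches: Callable[[str], bool]) -> str:
    """Return the settings of the first matching rule.

    expr_matches is an assumption of this sketch: it decides whether a
    filter expression matches the connection being considered.
    """
    for rule in rules:
        if rule.socket_type in (socket_type, "D") and expr_matches(rule.expr):
            return rule.settings
    return IMPLICIT_RULE.settings
```

Because evaluation stops at the first match, rule order matters: a broad rule placed early shadows any narrower rule that follows it.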
This is a reworked and expanded version of a patch submitted by Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
|
#
f8fea0d9 |
| 03-Apr-2018 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe: Implement tcp_info handler for connections handled by t4_tom.
The TCB is read using a memory window right now. A better alternative to get self-consistent, uncached information would be to use a GET_TCB request, but waiting for a reply from hw while holding non-sleepable locks is quite inconvenient.
Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14817
|
#
edf95feb |
| 27-Mar-2018 |
John Baldwin <jhb@FreeBSD.org> |
Use the offload transmit queue to set flags on TLS connections.
Requests to modify the state of TLS connections need to be sent on the same queue as TLS record transmit requests to ensure ordering.
However, in order to use the offload transmit queue in t4_set_tcb_field(), the function needs to be updated to do proper flow control / credit management when queueing a request to an offload queue. This required passing a pointer to the toepcb itself to this function, so while here remove the 'tid' and 'iqid' parameters and obtain those values from the toepcb in t4_set_tcb_field() itself.
Submitted by: Harsh Jain @ Chelsio (original version) Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14871
|
#
1e9538d2 |
| 14-Mar-2018 |
John Baldwin <jhb@FreeBSD.org> |
Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS encryption and TCP segmentation for offloaded connections. Sockets using TLS are required to use a set of custom socket options to upload RX and TX keys to the NIC and to enable RX processing. Currently these socket options are implemented as TCP options in the vendor specific range. A patched OpenSSL library will be made available in a port / package for use with the TLS TOE support.
TOE sockets can either offload both transmit and reception of TLS records or just transmit. TLS offload (both RX and TX) is enabled by setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be enabled on the relevant interface. Transmit offload can be used on any "normal" or TLS TOE socket by using the custom socket option to program a transmit key. This permits most TOE sockets to transparently offload TLS when applications use a patched SSL library (e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL library). Receive offload can only be used with TOE sockets using the TLS mode. The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a list of TCP port numbers. Any connection with either a local or remote port number in that list will be created as a TLS socket rather than a plain TOE socket. Note that although this sysctl accepts an arbitrary list of port numbers, the sysctl(8) tool is only able to set sysctl nodes to a single value. A TLS socket will hang without receiving data if used by an application that is not using a patched SSL library. Thus, the tls_rx_ports node should be used with care. For a server mostly concerned with offloading TLS transmit, this node is not needed as plain TOE sockets will fall back to software crypto when using an unpatched SSL library.
New per-interface statistics nodes are added giving counts of TLS packets and payload bytes (payload bytes do not include TLS headers or authentication tags/MACs) offloaded via the TOE engine, e.g.:
dev.cc.0.stats.rx_tls_octets: 149
dev.cc.0.stats.rx_tls_records: 13
dev.cc.0.stats.tx_tls_octets: 26501823
dev.cc.0.stats.tx_tls_records: 1620
TLS transmit work requests are constructed by a new variant of t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.
TLS transmit work requests require a buffer containing IVs. If the IVs are too large to fit into the work request, a separate buffer is allocated when constructing a work request. This buffer is associated with the transmit descriptor and freed when the descriptor is ACKed by the adapter.
Received TLS frames use two new CPL messages. The first message is a CPL_TLS_DATA containing the decrypted payload of a single TLS record. The handler places the mbuf containing the received payload on an mbufq in the TOE pcb. The second message is a CPL_RX_TLS_CMP message which includes a copy of the TLS header and indicates if there were any errors. The handler for this message places the TLS header into the socket buffer followed by the saved mbuf with the payload data. Both of these handlers are contained in tom/t4_tls.c.
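The two-message receive path can be modelled roughly as follows. This is a simplified sketch, not the code in tom/t4_tls.c: the class and method names are hypothetical, a deque stands in for the mbufq in the TOE pcb, a Python list stands in for the socket buffer, and the error indication carried by CPL_RX_TLS_CMP is left out.

```python
from collections import deque


class TlsReceiver:
    """Toy model of pairing a payload message with its completion message."""

    def __init__(self):
        self.pending_payloads = deque()  # stands in for the mbufq in the TOE pcb
        self.socket_buffer = []          # stands in for the socket receive buffer

    def on_tls_data(self, payload: bytes) -> None:
        # First message: hold the decrypted payload until the header arrives.
        self.pending_payloads.append(payload)

    def on_rx_tls_cmp(self, header: bytes) -> None:
        # Second message: deliver the TLS header, then the saved payload.
        payload = self.pending_payloads.popleft()
        self.socket_buffer.append(header)
        self.socket_buffer.append(payload)
```

Nothing reaches the socket buffer until the completion message arrives, so in this model the reader always sees a header immediately followed by its payload.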
A few routines were exposed from t4_cpl_io.c for use by t4_tls.c including send_rx_credits(), a new send_rx_modulate(), and t4_close_conn().
TLS keys for both transmit and receive are stored in onboard memory in the NIC in the "TLS keys" memory region.
In some cases a TLS socket can hang with pending data available in the NIC that is not delivered to the host. As a workaround, TLS sockets are more aggressive about sending CPL_RX_DATA_ACK messages anytime that any data is read from a TLS socket. In addition, a fallback timer will periodically send CPL_RX_DATA_ACK messages to the NIC for connections that are still in the handshake phase. Once the connection has finished the handshake and programmed RX keys via the socket option, the timer is stopped.
A new function select_ulp_mode() is used to determine what sub-mode a given TOE socket should use (plain TOE, DDP, or TLS). The existing set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and handles initialization of TLS-specific state when necessary in addition to DDP-specific state.
Since TLS sockets do not receive individual TCP segments but always receive full TLS records, they can receive more data than is available in the current window (e.g. if a 16k TLS record is received but the socket buffer is itself 16k). To cope with this, just drop the window to 0 when this happens, but track the overage and "eat" the overage as it is read from the socket buffer, without opening the window (or adding rx_credits) for the overage bytes.
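The overage accounting can be illustrated with a small model. This is a sketch under stated assumptions rather than the driver's logic: RxWindow and its fields are hypothetical names, and the model tracks only byte counts, not mbufs or actual rx_credits messages.

```python
class RxWindow:
    """Toy receive-window accounting with overage tracking."""

    def __init__(self, window: int):
        self.window = window  # bytes the peer may still send
        self.overage = 0      # bytes received beyond the advertised window

    def receive(self, nbytes: int) -> None:
        """A full TLS record of nbytes arrives, possibly exceeding the window."""
        if nbytes > self.window:
            self.overage += nbytes - self.window
            self.window = 0
        else:
            self.window -= nbytes

    def read(self, nbytes: int) -> int:
        """The application reads nbytes; return the credits to hand back.

        Bytes read are first used to "eat" the overage, and only the
        remainder reopens the window.
        """
        eaten = min(nbytes, self.overage)
        self.overage -= eaten
        credits = nbytes - eaten
        self.window += credits
        return credits
```

For example, with a 16384-byte window that already holds 4096 bytes, a 16384-byte record leaves the window at 0 with an overage of 4096. The next 4096 bytes read return no credits, and later reads reopen the window as usual.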
Reviewed by: np (earlier version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14529
|
#
9689995d |
| 13-Mar-2018 |
John Baldwin <jhb@FreeBSD.org> |
Simplify error handling in t4_tom.ko module loading.
- Change t4_ddp_mod_load() to return void instead of always returning success. This avoids having to pretend to have proper support for unloading when only part of t4_tom_mod_load() has run.
- If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly. The module handling code in the kernel invokes MOD_UNLOAD on a module whose MOD_LOAD fails with an error already.
Reviewed by: np (part of a larger patch) MFC after: 1 month Sponsored by: Chelsio Communications
|
#
125d42fe |
| 22-Feb-2018 |
John Baldwin <jhb@FreeBSD.org> |
Move DDP PCB state into a helper structure.
This consolidates all of the DDP state in one place. Also, the code has now been fixed to ensure that DDP state is only accessed for DDP connections. This should not be a functional change but makes it cleaner and easier to add state for other TOE socket modes in the future.
MFC after: 1 month Sponsored by: Chelsio Communications
|
#
f1798531 |
| 31-Jan-2018 |
John Baldwin <jhb@FreeBSD.org> |
Export tcp_always_keepalive for use by the Chelsio TOM module.
This used to work by accident with ld.bfd even though always_keepalive was marked as static. LLD honors static more correctly, so export this variable properly (including moving it into the tcp_* namespace).
Reviewed by: bz, emaste MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14129
|
#
718cf2cc |
| 27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
|
Revision tags: release/10.4.0 |
|
#
b754c279 |
| 13-Sep-2017 |
Navdeep Parhar <np@FreeBSD.org> |
MFH @ r323558.
|
#
5be4ad9e |
| 09-Sep-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r323343
|
#
0a3bf7fb |
| 01-Sep-2017 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe/t4_tom: There may not be a tid to update if the connection isn't established.
MFC after: 2 weeks Sponsored by: Chelsio Communications
|
Revision tags: release/11.1.0 |
|
#
7e1b7636 |
| 08-May-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r317808 through r317970.
|
#
67904997 |
| 05-May-2017 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe/t4_tom: Per-connection rate limiting for TCP sockets handled by the TOE. For now this capability is always enabled in kernels with options RATELIMIT. t4_tom will check if_capenable once the base driver gets code to support rate limiting for any socket (TOE or not).
This was tested with iperf3 and netperf ToT as they already support SO_MAX_PACING_RATE sockopt. There is a bug in firmwares prior to 1.16.45.0 that affects the BSD driver only and results in rate-limiting at an incorrect rate. This will resolve by itself as soon as 1.16.45.0 or later firmware shows up in the driver.
Relnotes: Yes Sponsored by: Chelsio Communications
|
#
1a36faad |
| 11-Feb-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r313301 through r313643.
|
#
15df32b4 |
| 07-Feb-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r313360
|
#
eaf56694 |
| 06-Feb-2017 |
Navdeep Parhar <np@FreeBSD.org> |
cxgbe/t4_tom: Fix CLIP entry refcounting on the passive side. Every IPv6 connection being handled by the TOE should have a reference on its CLIP entry.
Sponsored by: Chelsio Communications
|
#
9b3ece1c |
| 04-Feb-2017 |
Enji Cooper <ngie@FreeBSD.org> |
MFhead@r313243
|
#
65575c14 |
| 29-Jan-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r312894 through r312967.
|