#
1b98f357 |
| 29-May-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Implement the Device Memory TCP transmit path, allo
Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Implement the Device Memory TCP transmit path, allowing zero-copy data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope, under its own lock and RCU. The route control path is now 3x times faster.
- Convert queue related netlink ops to instance lock, reducing again the scope of the RTNL lock. This improves the control plane scalability.
- Refactor the software crc32c implementation, removing unneeded abstraction layers and improving significantly the related micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10% performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a prep work to remove the current per-CPU lock in local_bh_disable() on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds verification with accessing payload carried by netlink messages.
Netfilter:
- Rewrite the procfs conntrack table implementation, improving considerably the dump performance. A lot of user-space tools still use this interface.
- Implement support for wildcard netdevice in netdev basechain and flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better introspection.
BPF:
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops programs and can be controlled in similar way to traditional qdiscs using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues WRT duplicate hits or missed sockets.
Protocols:
- Improve TCP receive buffer auto-tuning and increase the default upper bound for the receive buffer; overall this improves the single flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS, and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major distros disable it by default. Its removal allows for better organisation of TCP fields to reduce the number of cache lines hit in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API:
- Reorganize PTP ioctl flag support to require an explicit opt-in for the drivers, avoiding the problem of drivers not rejecting new unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for dump operations targeting PHYs.
Tests and tooling:
- Add support for classic netlink in user space C codegen, so that ynl-c can now read, create and modify links, routes addresses and qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output known struct instead of raw binary data, clarifying the classic netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers:
- OpenVPN virtual driver: offload OpenVPN data channels processing to the kernel-space, increasing the data transfer throughput WRT the user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers:
- Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - refactor the steering table handling to significantly reduce the amount of memory used - add support for complex matches in H/W flow steering - improve flow streeing error handling - convert to netdev instance locking - Intel (100G, ice, igb, ixgbe, idpf): - ice: add switchdev support for LLDP traffic over VF - ixgbe: add firmware manipulation and regions devlink support - igb: introduce support for frame transmission premption - igb: adds persistent NAPI configuration - idpf: introduce RDMA support - idpf: add initial PTP support - Meta (fbnic): - extend hardware stats coverage - add devlink dev flash support - Broadcom (bnxt): - add support for RX-side device memory TCP - Wangxun (txgbe): - implement support for udp tunnel offload - complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual: - Google (gve): - add device memory TCP TX support - Amazon (ena): - support persistent per-NAPI config - Airoha: - add H/W support for L2 traffic offload - add per flow stats for flow offloading - RealTek (rtl8211): add support for WoL magic packet - Synopsys (stmmac): - dwmac-socfpga 1000BaseX support - add Loongson-2K3000 support - introduce support for hardware-accelerated VLAN stripping - Broadcom (bcmgenet): - expose more H/W stats - Freescale (enetc, dpaa2-eth): - enetc: add MAC filter, VLAN filter RSS and loopback support - dpaa2-eth: convert to H/W timestamping APIs - vxlan: convert FDB table to rhashtable, for better scalabilty - veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches: - Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs: - RealTek (rtl8211): - add support for WoL magic packet - add support for PHY LEDs
- CAN: - Adds RZ/G3E CANFD support to the rcar_canfd driver. - Preparatory work for CAN-XL support. - Add self-tests framework with support for CAN physical interfaces.
- WiFi: - mac80211: - scan improvements with multi-link operation (MLO) - Qualcomm (ath12k): - enable AHB support for IPQ5332 - add monitor interface support to QCN9274 - add multi-link operation support to WCN7850 - add 802.11d scan offload support to WCN7850 - monitor mode for WCN7850, better 6 GHz regulatory - Qualcomm (ath11k): - restore hibernation support - MediaTek (mt76): - WiFi-7 improvements - implement support for mt7990 - Intel (iwlwifi): - enhanced multi-link single-radio (EMLSR) support on 5 GHz links - rework device configuration - RealTek (rtw88): - improve throughput for RTL8814AU - RealTek (rtw89): - add multi-link operation support - STA/P2P concurrency improvements - support different SAR configs by antenna
- Bluetooth: - introduce HCI Driver protocol - btintel_pcie: do not generate coredump for diagnostic events - btusb: add HCI Drv commands for configuring altsetting - btusb: add RTL8851BE device 0x0bda:0xb850 - btusb: add new VID/PID 13d3/3584 for MT7922 - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925 - btnxpuart: implement host-wakeup feature"
* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits) selftests/bpf: Fix bpf selftest build warning selftests: netfilter: Fix skip of wildcard interface test net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames net: openvswitch: Fix the dead loop of MPLS parse calipso: Don't call calipso functions for AF_INET sk. selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem net_sched: hfsc: Address reentrant enqueue adding class to eltree twice octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback octeontx2-pf: QOS: Perform cache sync on send queue teardown net: mana: Add support for Multi Vports on Bare metal net: devmem: ncdevmem: remove unused variable net: devmem: ksft: upgrade rx test to send 1K data net: devmem: ksft: add 5 tuple FS support net: devmem: ksft: add exit_wait to make rx test pass net: devmem: ksft: add ipv4 support net: devmem: preserve sockc_err page_pool: fix ugly page_pool formatting net: devmem: move list_add to net_devmem_bind_dmabuf. selftests: netfilter: nft_queue.sh: include file transfer duration in log message net: phy: mscc: Fix memory leak when using one step timestamping ...
show more ...
|
#
015b5b8e |
| 08-May-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge branch 'tools-ynl-gen-split-presence-metadata'
Jakub Kicinski says:
==================== tools: ynl-gen: split presence metadata
The presence metadata indicates whether given attribute was/s
Merge branch 'tools-ynl-gen-split-presence-metadata'
Jakub Kicinski says:
==================== tools: ynl-gen: split presence metadata
The presence metadata indicates whether given attribute was/should be added to the Netlink message. We have 3 types of such metadata: - bit presence for simple values like integers, - len presence for variable size attrs, like binary and strings, - count for arrays.
Previously this information was spread around with first two types living in a dedicated sub-struct called _present. The counts resided directly in the main struct with an n_ prefix.
Reshuffle these an uniformly store them in dedicated sub-structs. The immediate motivation is that current scheme causes name collisions for TC. ====================
Link: https://patch.msgid.link/20250505165208.248049-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
b8ae9f70 |
| 05-May-2025 |
Jakub Kicinski <kuba@kernel.org> |
tools: ynl-gen: split presence metadata
Each YNL struct contains the data and a sub-struct indicating which fields are valid. Something like:
struct family_op_req { struct { u32
tools: ynl-gen: split presence metadata
Each YNL struct contains the data and a sub-struct indicating which fields are valid. Something like:
struct family_op_req { struct { u32 a:1; u32 b:1; u32 bin_len; } _present;
u32 a; u64 b; const unsigned char *bin; };
Note that the bin object 'bin' has a length stored, and that length has a _len suffix added to the field name. This breaks if there is a explicit field called bin_len, which is the case for some TC actions. Move the length fields out of the _present struct, create a new struct called _len:
struct family_op_req { struct { u32 a:1; u32 b:1; } _present; struct { u32 bin; } _len;
u32 a; u64 b; const unsigned char *bin; };
This should prevent name collisions and help with the packing of the struct.
Unfortunately this is a breaking change, but hopefully the migration isn't too painful.
Link: https://patch.msgid.link/20250505165208.248049-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
dd4f33b4 |
| 11-Apr-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge branch 'tools-ynl-c-basic-netlink-raw-support'
Jakub Kicinski says:
==================== tools: ynl: c: basic netlink-raw support
Basic support for netlink-raw AKA classic netlink in user sp
Merge branch 'tools-ynl-c-basic-netlink-raw-support'
Jakub Kicinski says:
==================== tools: ynl: c: basic netlink-raw support
Basic support for netlink-raw AKA classic netlink in user space C codegen. This series is enough to read routes and addresses from the kernel (see the samples in patches 12 and 13).
Specs need to be slightly adjusted and decorated with the c naming info.
In terms of codegen this series includes just the basic plumbing required to skip genlmsghdr and handle request types which may technically also be legal in genetlink-legacy but are very uncommon there.
Subsequent series will add support for: - handling CRUD-style notifications - code gen for array types classic netlink uses - sub-message support
v1: https://lore.kernel.org/20250409000400.492371-1-kuba@kernel.org ====================
Link: https://patch.msgid.link/20250410014658.782120-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
29d34a4d |
| 10-Apr-2025 |
Jakub Kicinski <kuba@kernel.org> |
tools: ynl: generate code for rt-addr and add a sample
YNL C can now generate code for simple classic netlink families. Include rt-addr in the Makefile for generation and add a sample.
$ ./tools/
tools: ynl: generate code for rt-addr and add a sample
YNL C can now generate code for simple classic netlink families. Include rt-addr in the Makefile for generation and add a sample.
$ ./tools/net/ynl/samples/rt-addr lo: 127.0.0.1 wlp0s20f3: 192.168.1.101 lo: :: wlp0s20f3: fe80::6385:be6:746e:8116 vpn0: fe80::3597:d353:b5a7:66dd
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250410014658.782120-13-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|