#
0ad9617c |
| 22-Jan-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "This is slightly smaller than usual, with the most interesting
Merge tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "This is slightly smaller than usual, with the most interesting work being still around RTNL scope reduction.
Core:
- More core refactoring to reduce the RTNL lock contention, including preparatory work for the per-network namespace RTNL lock, replacing RTNL lock with a per device-one to protect NAPI-related net device data and moving synchronize_net() calls outside such lock.
- Extend drop reasons usage, adding net scheduler, AF_UNIX, bridge and more specific TCP coverage.
- Reduce network namespace tear-down time by removing per-subsystems synchronize_net() in tipc and sched.
- Add flow label selector support for fib rules, allowing traffic redirection based on such header field.
Netfilter:
- Do not remove netdev basechain when last device is gone, allowing netdev basechains without devices.
- Revisit the flowtable teardown strategy, dealing better with fin, reset and re-open events.
- Scale-up IP-vs connection dumping by avoiding linear search on each restart.
Protocols:
- A significant XDP socket refactor, consolidating and optimizing several helpers into the core
- Better scaling of ICMP rate-limiting, by removing false-sharing in inet peers handling.
- Introduces netlink notifications for multicast IPv4 and IPv6 address changes.
- Add ipsec support for IP-TFS/AggFrag encapsulation, allowing aggregation and fragmentation of the inner IP.
- Add sysctl to configure TIME-WAIT reuse delay for TCP sockets, to avoid local port exhaustion issues when the average connection lifetime is very short.
- Support updating keys (re-keying) for connections using kernel TLS (for TLS 1.3 only).
- Support ipv4-mapped ipv6 address clients in smc-r v2.
- Add support for jumbo data packet transmission in RxRPC sockets, gluing multiple data packets in a single UDP packet.
- Support RxRPC RACK-TLP to manage packet loss and retransmission in conjunction with the congestion control algorithm.
Driver API:
- Introduce a unified and structured interface for reporting PHY statistics, exposing consistent data across different H/W via ethtool.
- Make timestamping selectable, allow the user to select the desired hwtstamp provider (PHY or MAC) administratively.
- Add support for configuring a header-data-split threshold (HDS) value via ethtool, to deal with partial or buggy H/W implementation.
- Consolidate DSA drivers Energy Efficiency Ethernet support.
- Add EEE management to phylink, making use of the phylib implementation.
- Add phylib support for in-band capabilities negotiation.
- Simplify how phylib-enabled mac drivers expose the supported interfaces.
Tests and tooling:
- Make the YNL tool package-friendly to make it easier to deploy it separately from the kernel.
- Increase TCP selftest coverage importing several packetdrill test-cases.
- Regenerate the ethtool uapi header from the YNL spec, to ease maintenance and future development.
- Add YNL support for decoding the link types used in net self-tests, allowing a single build to run both net and drivers/net.
Drivers:
- Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - add cross E-Switch QoS support - add SW Steering support for ConnectX-8 - implement support for HW-Managed Flow Steering, improving the rule deletion/insertion rate - support for multi-host LAG - Intel (ixgbe, ice, igb): - ice: add support for devlink health events - ixgbe: add initial support for E610 chipset variant - igb: add support for AF_XDP zero-copy - Meta: - add support for basic RSS config - allow changing the number of channels - add hardware monitoring support - Broadcom (bnxt): - implement TCP data split and HDS threshold ethtool support, enabling Device Memory TCP. - Marvell Octeon: - implement egress ipsec offload support for the cn10k family - Hisilicon (HIBMC): - implement unicast MAC filtering
- Ethernet NICs embedded and virtual: - Convert UDP tunnel drivers to NETDEV_PCPU_STAT_DSTATS, avoiding contented atomic operations for drop counters - Freescale: - quicc: phylink conversion - enetc: support Tx and Rx checksum offload and improve TSO performances - MediaTek: - airoha: introduce support for ETS and HTB Qdisc offload - Microchip: - lan78XX USB: preparation work for phylink conversion - Synopsys (stmmac): - support DWMAC IP on NXP Automotive SoCs S32G2xx/S32G3xx/S32R45 - refactor EEE support to leverage the new driver API - optimize DMA and cache access to increase raw RX performances by 40% - TI: - icssg-prueth: add multicast filtering support for VLAN interface - netkit: - add ability to configure head/tailroom - VXLAN: - accepts packets with user-defined reserved bit
- Ethernet switches: - Microchip: - lan969x: add RGMII support - lan969x: improve TX and RX performance using the FDMA engine - nVidia/Mellanox: - move Tx header handling to PCI driver, to ease XDP support
- Ethernet PHYs: - Texas Instruments DP83822: - add support for GPIO2 clock output - Realtek: - 8169: add support for RTL8125D rev.b - rtl822x: add hwmon support for the temperature sensor - Microchip: - add support for RDS PTP hardware - consolidate periodic output signal generation
- CAN: - several DT-bindings to DT schema conversions - tcan4x5x: - add HW standby support - support nWKRQ voltage selection - kvaser: - allowing Bus Error Reporting runtime configuration
- WiFi: - the on-going Multi-Link Operation (MLO) effort continues, affecting both the stack and in drivers - mac80211/cfg80211: - Emergency Preparedness Communication Services (EPCS) station mode support - support for adding and removing station links for MLO - add support for WiFi 7/EHT mesh over 320 MHz channels - report Tx power info for each link - RealTek (rtw88): - enable USB Rx aggregation and USB 3 to improve performance - LED support - RealTek (rtw89): - refactor power save to support Multi-Link Operations - add support for RTL8922AE-VS variant - MediaTek (mt76): - single wiphy multiband support (preparation for MLO) - p2p device support - add TP-Link TXE50UH USB adapter support - Qualcomm (ath10k): - support for the QCA6698AQ IP core - Qualcomm (ath12k): - enable MLO for QCN9274
- Bluetooth: - Allow sysfs to trigger hdev reset, to allow recovering devices not responsive from user-space - MediaTek: add support for MT7922, MT7925, MT7921e devices - Realtek: add support for RTL8851BE devices - Qualcomm: add support for WCN785x devices - ISO: allow BIG re-sync"
* tag 'net-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1386 commits) net/rose: prevent integer overflows in rose_setsockopt() net: phylink: fix regression when binding a PHY net: ethernet: ti: am65-cpsw: streamline TX queue creation and cleanup net: ethernet: ti: am65-cpsw: streamline RX queue creation and cleanup net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path ipv6: Convert inet6_rtm_deladdr() to per-netns RTNL. ipv6: Convert inet6_rtm_newaddr() to per-netns RTNL. ipv6: Move lifetime validation to inet6_rtm_newaddr(). ipv6: Set cfg.ifa_flags before device lookup in inet6_rtm_newaddr(). ipv6: Pass dev to inet6_addr_add(). ipv6: Convert inet6_ioctl() to per-netns RTNL. ipv6: Hold rtnl_net_lock() in addrconf_init() and addrconf_cleanup(). ipv6: Hold rtnl_net_lock() in addrconf_dad_work(). ipv6: Hold rtnl_net_lock() in addrconf_verify_work(). ipv6: Convert net.ipv6.conf.${DEV}.XXX sysctl to per-netns RTNL. ipv6: Add __in6_dev_get_rtnl_net(). net: stmmac: Drop redundant skb_mark_for_recycle() for SKB frags net: mii: Fix the Speed display when the network cable is not connected sysctl net: Remove macro checks for CONFIG_SYSCTL eth: bnxt: update header sizing defaults ...
show more ...
|
#
7bf1659b |
| 08-Jan-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge branch 'intel-wired-lan-driver-updates-2025-01-06-igb-igc-ixgbe-ixgbevf-i40e-fm10k'
Tony Nguyen says:
==================== Intel Wired LAN Driver Updates 2025-01-06 (igb, igc, ixgbe, ixgbevf,
Merge branch 'intel-wired-lan-driver-updates-2025-01-06-igb-igc-ixgbe-ixgbevf-i40e-fm10k'
Tony Nguyen says:
==================== Intel Wired LAN Driver Updates 2025-01-06 (igb, igc, ixgbe, ixgbevf, i40e, fm10k)
For igb:
Sriram Yagnaraman and Kurt Kanzenbach add support for AF_XDP zero-copy.
Original cover letter: The first couple of patches adds helper functions to prepare for AF_XDP zero-copy support which comes in the last couple of patches, one each for Rx and TX paths.
As mentioned in v1 patchset [0], I don't have access to an actual IGB device to provide correct performance numbers. I have used Intel 82576EB emulator in QEMU [1] to test the changes to IGB driver.
The tests use one isolated vCPU for RX/TX and one isolated vCPU for the xdp-sock application [2]. Hope these measurements provide at the least some indication on the increase in performance when using ZC, especially in the TX path. It would be awesome if someone with a real IGB NIC can test the patch.
AF_XDP performance using 64 byte packets in Kpps. Benchmark: XDP-SKB XDP-DRV XDP-DRV(ZC) rxdrop 220 235 350 txpush 1.000 1.000 410 l2fwd 1.000 1.000 200
AF_XDP performance using 1500 byte packets in Kpps. Benchmark: XDP-SKB XDP-DRV XDP-DRV(ZC) rxdrop 200 210 310 txpush 1.000 1.000 410 l2fwd 0.900 1.000 160
[0]: https://lore.kernel.org/intel-wired-lan/20230704095915.9750-1-sriram.yagnaraman@est.tech/ [1]: https://www.qemu.org/docs/master/system/devices/igb.html [2]: https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-example
Subsequent changes and information can be found here: https://lore.kernel.org/intel-wired-lan/20241018-b4-igb_zero_copy-v9-0-da139d78d796@linutronix.de/
Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning.
For igc:
Song Yoong Siang allows for the XDP program to be hot-swapped.
Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning.
Joe Damato adds sets IRQ and queues to NAPI instances to allow for reporting via netdev-genl API.
For ixgbe:
Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning.
For ixgbevf:
Yue Haibing converts use of ERR_PTR return to traditional error code which resolves a smatch warning.
For i40e:
Alex implements "mdd-auto-reset-vf" private flag to automatically reset VFs when encountering an MDD event.
For fm10k:
Dr. David Alan Gilbert removes an unused function. ====================
Link: https://patch.msgid.link/20250106221929.956999-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
f8e284a0 |
| 06-Jan-2025 |
Sriram Yagnaraman <sriram.yagnaraman@est.tech> |
igb: Add AF_XDP zero-copy Tx support
Add support for AF_XDP zero-copy transmit path.
A new TX buffer type IGB_TYPE_XSK is introduced to indicate that the Tx frame was allocated from the xsk buff po
igb: Add AF_XDP zero-copy Tx support
Add support for AF_XDP zero-copy transmit path.
A new TX buffer type IGB_TYPE_XSK is introduced to indicate that the Tx frame was allocated from the xsk buff pool, so igb_clean_tx_ring() and igb_clean_tx_irq() can clean the buffers correctly based on type.
igb_xmit_zc() performs the actual packet transmit when AF_XDP zero-copy is enabled. We share the TX ring between slow path, XDP and AF_XDP zero-copy, so we use the netdev queue lock to ensure mutual exclusion.
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Set olinfo_status in igb_xmit_zc() so that frames are transmitted, Use READ_ONCE() for xsk_pool and check Tx disabled and carrier in igb_xmit_zc(), Add FIXME for RS bit] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-7-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
2c619601 |
| 06-Jan-2025 |
Sriram Yagnaraman <sriram.yagnaraman@est.tech> |
igb: Add AF_XDP zero-copy Rx support
Add support for AF_XDP zero-copy receive path.
When AF_XDP zero-copy is enabled, the rx buffers are allocated from the xsk buff pool using igb_alloc_rx_buffers_
igb: Add AF_XDP zero-copy Rx support
Add support for AF_XDP zero-copy receive path.
When AF_XDP zero-copy is enabled, the rx buffers are allocated from the xsk buff pool using igb_alloc_rx_buffers_zc().
Use xsk_pool_get_rx_frame_size() to set SRRCTL rx buf size when zero-copy is enabled.
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Port to v6.12 and provide napi_id for xdp_rxq_info_reg(), RCT, remove NETDEV_XDP_ACT_XSK_ZEROCOPY, update NTC handling, READ_ONCE() xsk_pool, likelyfy for XDP_REDIRECT case] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
80f6ccf9 |
| 06-Jan-2025 |
Sriram Yagnaraman <sriram.yagnaraman@est.tech> |
igb: Introduce XSK data structures and helpers
Add the following ring flag: - IGB_RING_FLAG_TX_DISABLED (when xsk pool is being setup)
Add a xdp_buff array for use with XSK receive batch API, and a
igb: Introduce XSK data structures and helpers
Add the following ring flag: - IGB_RING_FLAG_TX_DISABLED (when xsk pool is being setup)
Add a xdp_buff array for use with XSK receive batch API, and a pointer to xsk_pool in igb_adapter.
Add enable/disable functions for TX and RX rings. Add enable/disable functions for XSK pool. Add xsk wakeup function.
None of the above functionality will be active until NETDEV_XDP_ACT_XSK_ZEROCOPY is advertised in netdev->xdp_features.
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Add READ/WRITE_ONCE(), synchronize_net(), remove IGB_RING_FLAG_AF_XDP_ZC] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250106221929.956999-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|