|
Revision tags: v6.18, v6.18-rc7, v6.18-rc6, v6.18-rc5, v6.18-rc4, v6.18-rc3, v6.18-rc2, v6.18-rc1, v6.17, v6.17-rc7 |
|
| #
f088104d |
| 16-Sep-2025 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Backmerge in order to get the commit:
048832a3f400 ("drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter")
To drm-intel-gt-next as there are f
Merge drm/drm-next into drm-intel-gt-next
Backmerge in order to get the commit:
048832a3f400 ("drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter")
To drm-intel-gt-next as there are followup fixes to be applied.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
|
Revision tags: v6.17-rc6, v6.17-rc5, v6.17-rc4, v6.17-rc3, v6.17-rc2, v6.17-rc1 |
|
| #
ab93e0dd |
| 06-Aug-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.17 merge window.
|
| #
a7bee4e7 |
| 04-Aug-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next
Merge an immutable branch between MFD, GPIO, Input and PWM to resolve conflicts for the mer
Merge tag 'ib-mfd-gpio-input-pwm-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next
Merge an immutable branch between MFD, GPIO, Input and PWM to resolve conflicts for the merge window pull request.
show more ...
|
|
Revision tags: v6.16, v6.16-rc7, v6.16-rc6, v6.16-rc5, v6.16-rc4 |
|
| #
74f1af95 |
| 29-Jun-2025 |
Rob Clark <robin.clark@oss.qualcomm.com> |
Merge remote-tracking branch 'drm/drm-next' into msm-next
Back-merge drm-next to (indirectly) get arm-smmu updates for making stall-on-fault more reliable.
Signed-off-by: Rob Clark <robin.clark@oss
Merge remote-tracking branch 'drm/drm-next' into msm-next
Back-merge drm-next to (indirectly) get arm-smmu updates for making stall-on-fault more reliable.
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
show more ...
|
|
Revision tags: v6.16-rc3, v6.16-rc2 |
|
| #
c598d5eb |
| 11-Jun-2025 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to forward to v6.16-rc1
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
| #
86e2d052 |
| 09-Jun-2025 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
Merge drm/drm-next into drm-xe-next
Backmerging to bring in 6.16
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
| #
34c55367 |
| 09-Jun-2025 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync to v6.16-rc1, among other things to get the fixed size GENMASK_U*() and BIT_U*() macros.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Revision tags: v6.16-rc1 |
|
| #
1b98f357 |
| 29-May-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Implement the Device Memory TCP transmit path, allo
Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni: "Core:
- Implement the Device Memory TCP transmit path, allowing zero-copy data transmission on top of TCP from e.g. GPU memory to the wire.
- Move all the IPv6 routing tables management outside the RTNL scope, under its own lock and RCU. The route control path is now 3x times faster.
- Convert queue related netlink ops to instance lock, reducing again the scope of the RTNL lock. This improves the control plane scalability.
- Refactor the software crc32c implementation, removing unneeded abstraction layers and improving significantly the related micro-benchmarks.
- Optimize the GRO engine for UDP-tunneled traffic, for a 10% performance improvement in related stream tests.
- Cover more per-CPU storage with local nested BH locking; this is a prep work to remove the current per-CPU lock in local_bh_disable() on PREMPT_RT.
- Introduce and use nlmsg_payload helper, combining buffer bounds verification with accessing payload carried by netlink messages.
Netfilter:
- Rewrite the procfs conntrack table implementation, improving considerably the dump performance. A lot of user-space tools still use this interface.
- Implement support for wildcard netdevice in netdev basechain and flowtables.
- Integrate conntrack information into nft trace infrastructure.
- Export set count and backend name to userspace, for better introspection.
BPF:
- BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops programs and can be controlled in similar way to traditional qdiscs using the "tc qdisc" command.
- Refactor the UDP socket iterator, addressing long standing issues WRT duplicate hits or missed sockets.
Protocols:
- Improve TCP receive buffer auto-tuning and increase the default upper bound for the receive buffer; overall this improves the single flow maximum thoughput on 200Gbs link by over 60%.
- Add AFS GSSAPI security class to AF_RXRPC; it provides transport security for connections to the AFS fileserver and VL server.
- Improve TCP multipath routing, so that the sources address always matches the nexthop device.
- Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS, and thus preventing DoS caused by passing around problematic FDs.
- Retire DCCP socket. DCCP only receives updates for bugs, and major distros disable it by default. Its removal allows for better organisation of TCP fields to reduce the number of cache lines hit in the fast path.
- Extend TCP drop-reason support to cover PAWS checks.
Driver API:
- Reorganize PTP ioctl flag support to require an explicit opt-in for the drivers, avoiding the problem of drivers not rejecting new unsupported flags.
- Converted several device drivers to timestamping APIs.
- Introduce per-PHY ethtool dump helpers, improving the support for dump operations targeting PHYs.
Tests and tooling:
- Add support for classic netlink in user space C codegen, so that ynl-c can now read, create and modify links, routes addresses and qdisc layer configuration.
- Add ynl sub-types for binary attributes, allowing ynl-c to output known struct instead of raw binary data, clarifying the classic netlink output.
- Extend MPTCP selftests to improve the code-coverage.
- Add tests for XDP tail adjustment in AF_XDP.
New hardware / drivers:
- OpenVPN virtual driver: offload OpenVPN data channels processing to the kernel-space, increasing the data transfer throughput WRT the user-space implementation.
- Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
- Broadcom asp-v3.0 ethernet driver.
- AMD Renoir ethernet device.
- ReakTek MT9888 2.5G ethernet PHY driver.
- Aeonsemi 10G C45 PHYs driver.
Drivers:
- Ethernet high-speed NICs: - nVidia/Mellanox (mlx5): - refactor the steering table handling to significantly reduce the amount of memory used - add support for complex matches in H/W flow steering - improve flow streeing error handling - convert to netdev instance locking - Intel (100G, ice, igb, ixgbe, idpf): - ice: add switchdev support for LLDP traffic over VF - ixgbe: add firmware manipulation and regions devlink support - igb: introduce support for frame transmission premption - igb: adds persistent NAPI configuration - idpf: introduce RDMA support - idpf: add initial PTP support - Meta (fbnic): - extend hardware stats coverage - add devlink dev flash support - Broadcom (bnxt): - add support for RX-side device memory TCP - Wangxun (txgbe): - implement support for udp tunnel offload - complete PTP and SRIOV support for AML 25G/10G devices
- Ethernet NICs embedded and virtual: - Google (gve): - add device memory TCP TX support - Amazon (ena): - support persistent per-NAPI config - Airoha: - add H/W support for L2 traffic offload - add per flow stats for flow offloading - RealTek (rtl8211): add support for WoL magic packet - Synopsys (stmmac): - dwmac-socfpga 1000BaseX support - add Loongson-2K3000 support - introduce support for hardware-accelerated VLAN stripping - Broadcom (bcmgenet): - expose more H/W stats - Freescale (enetc, dpaa2-eth): - enetc: add MAC filter, VLAN filter RSS and loopback support - dpaa2-eth: convert to H/W timestamping APIs - vxlan: convert FDB table to rhashtable, for better scalabilty - veth: apply qdisc backpressure on full ring to reduce TX drops
- Ethernet switches: - Microchip (kzZ88x3): add ETS scheduler support
- Ethernet PHYs: - RealTek (rtl8211): - add support for WoL magic packet - add support for PHY LEDs
- CAN: - Adds RZ/G3E CANFD support to the rcar_canfd driver. - Preparatory work for CAN-XL support. - Add self-tests framework with support for CAN physical interfaces.
- WiFi: - mac80211: - scan improvements with multi-link operation (MLO) - Qualcomm (ath12k): - enable AHB support for IPQ5332 - add monitor interface support to QCN9274 - add multi-link operation support to WCN7850 - add 802.11d scan offload support to WCN7850 - monitor mode for WCN7850, better 6 GHz regulatory - Qualcomm (ath11k): - restore hibernation support - MediaTek (mt76): - WiFi-7 improvements - implement support for mt7990 - Intel (iwlwifi): - enhanced multi-link single-radio (EMLSR) support on 5 GHz links - rework device configuration - RealTek (rtw88): - improve throughput for RTL8814AU - RealTek (rtw89): - add multi-link operation support - STA/P2P concurrency improvements - support different SAR configs by antenna
- Bluetooth: - introduce HCI Driver protocol - btintel_pcie: do not generate coredump for diagnostic events - btusb: add HCI Drv commands for configuring altsetting - btusb: add RTL8851BE device 0x0bda:0xb850 - btusb: add new VID/PID 13d3/3584 for MT7922 - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925 - btnxpuart: implement host-wakeup feature"
* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits) selftests/bpf: Fix bpf selftest build warning selftests: netfilter: Fix skip of wildcard interface test net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames net: openvswitch: Fix the dead loop of MPLS parse calipso: Don't call calipso functions for AF_INET sk. selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem net_sched: hfsc: Address reentrant enqueue adding class to eltree twice octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback octeontx2-pf: QOS: Perform cache sync on send queue teardown net: mana: Add support for Multi Vports on Bare metal net: devmem: ncdevmem: remove unused variable net: devmem: ksft: upgrade rx test to send 1K data net: devmem: ksft: add 5 tuple FS support net: devmem: ksft: add exit_wait to make rx test pass net: devmem: ksft: add ipv4 support net: devmem: preserve sockc_err page_pool: fix ugly page_pool formatting net: devmem: move list_add to net_devmem_bind_dmabuf. selftests: netfilter: nft_queue.sh: include file transfer duration in log message net: phy: mscc: Fix memory leak when using one step timestamping ...
show more ...
|
|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3 |
|
| #
5b38e821 |
| 15-Apr-2025 |
Jakub Kicinski <kuba@kernel.org> |
Merge branch 'rxrpc-afs-add-afs-gssapi-security-class-to-af_rxrpc-and-kafs'
David Howells says:
==================== rxrpc, afs: Add AFS GSSAPI security class to AF_RXRPC and kafs
Here's a set of
Merge branch 'rxrpc-afs-add-afs-gssapi-security-class-to-af_rxrpc-and-kafs'
David Howells says:
==================== rxrpc, afs: Add AFS GSSAPI security class to AF_RXRPC and kafs
Here's a set of patches to add basic support for the AFS GSSAPI security class to AF_RXRPC and kafs. It provides transport security for keys that match the security index 6 (YFS) for connections to the AFS fileserver and VL server.
Note that security index 4 (OpenAFS) can also be supported using this, but it needs more work as it's slightly different.
The patches also provide the ability to secure the callback channel - connections from the fileserver back to the client that are used to pass file change notifications, amongst other things. When challenged by the fileserver, kafs will generate a token specific to that server and include it in the RESPONSE packet as the appdata. The server then extracts this and uses it to send callback RPC calls back to the client.
It can also be used to provide transport security on the callback channel, but a further set of patches is required to provide the token and key to set that up when the client responds to the fileserver's challenge.
This makes use of the previously added crypto-krb5 library that is now upstream (last commit fc0cf10c04f4).
This series of patches consist of the following parts:
(0) Update kdoc comments to remove some kdoc builder warnings.
(1) Push reponding to CHALLENGE packets over to recvmsg() or the kernel equivalent so that the application layer can include user-defined information in the RESPONSE packet. In a follow-up patch set, this will allow the callback channel to be secured by the AFS filesystem.
(2) Add the AF_RXRPC RxGK security class that uses a key obtained from the AFS GSS security service to do Kerberos 5-based encryption instead of pcbc(fcrypt) and pcbc(des).
(3) Add support for callback channel encryption in kafs.
(4) Provide the test rxperf server module with some fixed krb5 keys. ====================
Link: https://patch.msgid.link/20250411095303.2316168-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
|
Revision tags: v6.15-rc2 |
|
| #
5800b1cf |
| 11-Apr-2025 |
David Howells <dhowells@redhat.com> |
rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSE
Allow the app to request that CHALLENGEs be passed to it through an out-of-band queue that allows recvmsg() to pick it up so that the
rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSE
Allow the app to request that CHALLENGEs be passed to it through an out-of-band queue that allows recvmsg() to pick it up so that the app can add data to it with sendmsg().
This will allow the application (AFS or userspace) to interact with the process if it wants to and put values into user-defined fields. This will be used by AFS when talking to a fileserver to supply that fileserver with a crypto key by which callback RPCs can be encrypted (ie. notifications from the fileserver to the client).
Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| #
1260ed77 |
| 08-Apr-2025 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get updates from v6.15-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
Revision tags: v6.15-rc1 |
|
| #
946661e3 |
| 05-Apr-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.15 merge window.
|
|
Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
0b119045 |
| 26-Feb-2025 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.14-rc4' into next
Sync up with the mainline.
|
|
Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2 |
|
| #
9e676a02 |
| 05-Feb-2025 |
Namhyung Kim <namhyung@kernel.org> |
Merge tag 'v6.14-rc1' into perf-tools-next
To get the various fixes in the current master.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
| #
0410c612 |
| 28-Feb-2025 |
Lucas De Marchi <lucas.demarchi@intel.com> |
Merge drm/drm-next into drm-xe-next
Sync to fix conlicts between drm-xe-next and drm-intel-next.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
| #
93c7dd1b |
| 06-Feb-2025 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-next into drm-misc-next
Bring rc1 to start the new release dev.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
| #
ea9f8f2b |
| 05-Feb-2025 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync with v6.14-rc1.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
| #
c771600c |
| 05-Feb-2025 |
Tvrtko Ursulin <tursulin@ursulin.net> |
Merge drm/drm-next into drm-intel-gt-next
We need 4ba4f1afb6a9 ("perf: Generic hotplug support for a PMU with a scope") in order to land a i915 PMU simplification and a fix. That landed in 6.12 and
Merge drm/drm-next into drm-intel-gt-next
We need 4ba4f1afb6a9 ("perf: Generic hotplug support for a PMU with a scope") in order to land a i915 PMU simplification and a fix. That landed in 6.12 and we are stuck at 6.9 so lets bump things forward.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
show more ...
|
|
Revision tags: v6.14-rc1 |
|
| #
ca56a74a |
| 20-Jan-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'vfs-6.14-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs netfs updates from Christian Brauner: "This contains read performance improvements and support for m
Merge tag 'vfs-6.14-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs netfs updates from Christian Brauner: "This contains read performance improvements and support for monolithic single-blob objects that have to be read/written as such (e.g. AFS directory contents). The implementation of the two parts is interwoven as each makes the other possible.
- Read performance improvements
The read performance improvements are intended to speed up some loss of performance detected in cifs and to a lesser extend in afs.
The problem is that we queue too many work items during the collection of read results: each individual subrequest is collected by its own work item, and then they have to interact with each other when a series of subrequests don't exactly align with the pattern of folios that are being read by the overall request.
Whilst the processing of the pages covered by individual subrequests as they complete potentially allows folios to be woken in parallel and with minimum delay, it can shuffle wakeups for sequential reads out of order - and that is the most common I/O pattern.
The final assessment and cleanup of an operation is then held up until the last I/O completes - and for a synchronous sequential operation, this means the bouncing around of work items just adds latency.
Two changes have been made to make this work:
(1) All collection is now done in a single "work item" that works progressively through the subrequests as they complete (and also dispatches retries as necessary).
(2) For readahead and AIO, this work item be done on a workqueue and can run in parallel with the ultimate consumer of the data; for synchronous direct or unbuffered reads, the collection is run in the application thread and not offloaded.
Functions such as smb2_readv_callback() then just tell netfslib that the subrequest has terminated; netfslib does a minimal bit of processing on the spot - stat counting and tracing mostly - and then queues/wakes up the worker. This simplifies the logic as the collector just walks sequentially through the subrequests as they complete and walks through the folios, if buffered, unlocking them as it goes. It also keeps to a minimum the amount of latency injected into the filesystem's low-level I/O handling
The way netfs supports filesystems using the deprecated PG_private_2 flag is changed: folios are flagged and added to a write request as they complete and that takes care of scheduling the writes to the cache. The originating read request can then just unlock the pages whatever happens.
- Single-blob object support
Single-blob objects are files for which the content of the file must be read from or written to the server in a single operation because reading them in parts may yield inconsistent results. AFS directories are an example of this as there exists the possibility that the contents are generated on the fly and would differ between reads or might change due to third party interference.
Such objects will be written to and retrieved from the cache if one is present, though we allow/may need to propose multiple subrequests to do so. The important part is that read from/write to the *server* is monolithic.
Single blob reading is, for the moment, fully synchronous and does result collection in the application thread and, also for the moment, the API is supplied the buffer in the form of a folio_queue chain rather than using the pagecache.
- Related afs changes
This series makes a number of changes to the kafs filesystem, primarily in the area of directory handling:
- AFS's FetchData RPC reply processing is made partially asynchronous which allows the netfs_io_request's outstanding operation counter to be removed as part of reducing the collection to a single work item.
- Directory and symlink reading are plumbed through netfslib using the single-blob object API and are now cacheable with fscache. This also allows the afs_read struct to be eliminated and netfs_io_subrequest to be used directly instead.
- Directory and symlink content are now stored in a folio_queue buffer rather than in the pagecache. This means we don't require the RCU read lock and xarray iteration to access it, and folios won't randomly disappear under us because the VM wants them back.
- The vnode operation lock is changed from a mutex struct to a private lock implementation. The problem is that the lock now needs to be dropped in a separate thread and mutexes don't permit that.
- When a new directory or symlink is created, we now initialise it locally and mark it valid rather than downloading it (we know what it's likely to look like).
- We now use the in-directory hashtable to reduce the number of entries we need to scan when doing a lookup. The edit routines have to maintain the hash chains.
- Cancellation (e.g. by signal) of an async call after the rxrpc_call has been set up is now offloaded to the worker thread as there will be a notification from rxrpc upon completion. This avoids a double cleanup.
- A "rolling buffer" implementation is created to abstract out the two separate folio_queue chaining implementations I had (one for read and one for write).
- Functions are provided to create/extend a buffer in a folio_queue chain and tear it down again.
This is used to handle AFS directories, but could also be used to create bounce buffers for content crypto and transport crypto.
- The was_async argument is dropped from netfs_read_subreq_terminated()
Instead we wake the read collection work item by either queuing it or waking up the app thread.
- We don't need to use BH-excluding locks when communicating between the issuing thread and the collection thread as neither of them now run in BH context.
- Also included are a number of new tracepoints; a split of the netfslib write collection code to put retrying into its own file (it gets more complicated with content encryption).
- There are also some minor fixes AFS included, including fixing the AFS directory format struct layout, reducing some directory over-invalidation and making afs_mkdir() translate EEXIST to ENOTEMPY (which is not available on all systems the servers support).
- Finally, there's a patch to try and detect entry into the folio unlock function with no folio_queue structs in the buffer (which isn't allowed in the cases that can get there).
This is a debugging patch, but should be minimal overhead"
* tag 'vfs-6.14-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits) netfs: Report on NULL folioq in netfs_writeback_unlock_folios() afs: Add a tracepoint for afs_read_receive() afs: Locally initialise the contents of a new symlink on creation afs: Use the contained hashtable to search a directory afs: Make afs_mkdir() locally initialise a new directory's content netfs: Change the read result collector to only use one work item afs: Make {Y,}FS.FetchData an asynchronous operation afs: Fix cleanup of immediately failed async calls afs: Eliminate afs_read afs: Use netfslib for symlinks, allowing them to be cached afs: Use netfslib for directories afs: Make afs_init_request() get a key if not given a file netfs: Add support for caching single monolithic objects such as AFS dirs netfs: Add functions to build/clean a buffer in a folio_queue afs: Add more tracepoints to do with tracking validity cachefiles: Add auxiliary data trace cachefiles: Add some subrequest tracepoints netfs: Remove some extraneous directory invalidations afs: Fix directory format encoding struct afs: Fix EEXIST error returned from afs_rmdir() to be ENOTEMPTY ...
show more ...
|
|
Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4 |
|
| #
7a47db23 |
| 20-Dec-2024 |
Christian Brauner <brauner@kernel.org> |
Merge patch series "netfs: Read performance improvements and "single-blob" support"
David Howells <dhowells@redhat.com> says:
This set of patches is primarily about two things: improving read perfo
Merge patch series "netfs: Read performance improvements and "single-blob" support"
David Howells <dhowells@redhat.com> says:
This set of patches is primarily about two things: improving read performance and supporting monolithic single-blob objects that have to be read/written as such (e.g. AFS directory contents). The implementation of the two parts is interwoven as each makes the other possible.
READ PERFORMANCE ================
The read performance improvements are intended to speed up some loss of performance detected in cifs and to a lesser extend in afs. The problem is that we queue too many work items during the collection of read results: each individual subrequest is collected by its own work item, and then they have to interact with each other when a series of subrequests don't exactly align with the pattern of folios that are being read by the overall request.
Whilst the processing of the pages covered by individual subrequests as they complete potentially allows folios to be woken in parallel and with minimum delay, it can shuffle wakeups for sequential reads out of order - and that is the most common I/O pattern.
The final assessment and cleanup of an operation is then held up until the last I/O completes - and for a synchronous sequential operation, this means the bouncing around of work items just adds latency.
Two changes have been made to make this work:
(1) All collection is now done in a single "work item" that works progressively through the subrequests as they complete (and also dispatches retries as necessary).
(2) For readahead and AIO, this work item be done on a workqueue and can run in parallel with the ultimate consumer of the data; for synchronous direct or unbuffered reads, the collection is run in the application thread and not offloaded.
Functions such as smb2_readv_callback() then just tell netfslib that the subrequest has terminated; netfslib does a minimal bit of processing on the spot - stat counting and tracing mostly - and then queues/wakes up the worker. This simplifies the logic as the collector just walks sequentially through the subrequests as they complete and walks through the folios, if buffered, unlocking them as it goes. It also keeps to a minimum the amount of latency injected into the filesystem's low-level I/O handling
The way netfs supports filesystems using the deprecated PG_private_2 flag is changed: folios are flagged and added to a write request as they complete and that takes care of scheduling the writes to the cache. The originating read request can then just unlock the pages whatever happens.
SINGLE-BLOB OBJECT SUPPORT ==========================
Single-blob objects are files for which the content of the file must be read from or written to the server in a single operation because reading them in parts may yield inconsistent results. AFS directories are an example of this as there exists the possibility that the contents are generated on the fly and would differ between reads or might change due to third party interference.
Such objects will be written to and retrieved from the cache if one is present, though we allow/may need to propose multiple subrequests to do so. The important part is that read from/write to the *server* is monolithic.
Single blob reading is, for the moment, fully synchronous and does result collection in the application thread and, also for the moment, the API is supplied the buffer in the form of a folio_queue chain rather than using the pagecache.
AFS CHANGES ===========
This series makes a number of changes to the kafs filesystem, primarily in the area of directory handling:
(1) AFS's FetchData RPC reply processing is made partially asynchronous which allows the netfs_io_request's outstanding operation counter to be removed as part of reducing the collection to a single work item.
(2) Directory and symlink reading are plumbed through netfslib using the single-blob object API and are now cacheable with fscache. This also allows the afs_read struct to be eliminated and netfs_io_subrequest to be used directly instead.
(3) Directory and symlink content are now stored in a folio_queue buffer rather than in the pagecache. This means we don't require the RCU read lock and xarray iteration to access it, and folios won't randomly disappear under us because the VM wants them back.
There are some downsides to this, though: the storage folios are no longer known to the VM, drop_caches can't flush them, the folios are not migrateable. The inode must also be marked dirty manually to get the data written to the cache in the background.
(4) The vnode operation lock is changed from a mutex struct to a private lock implementation. The problem is that the lock now needs to be dropped in a separate thread and mutexes don't permit that.
(5) When a new directory or symlink is created, we now initialise it locally and mark it valid rather than downloading it (we know what it's likely to look like).
(6) We now use the in-directory hashtable to reduce the number of entries we need to scan when doing a lookup. The edit routines have to maintain the hash chains.
(7) Cancellation (e.g. by signal) of an async call after the rxrpc_call has been set up is now offloaded to the worker thread as there will be a notification from rxrpc upon completion. This avoids a double cleanup.
SUPPORTING CHANGES ==================
To support the above some other changes are also made:
(1) A "rolling buffer" implementation is created to abstract out the two separate folio_queue chaining implementations I had (one for read and one for write).
(2) Functions are provided to create/extend a buffer in a folio_queue chain and tear it down again. This is used to handle AFS directories, but could also be used to create bounce buffers for content crypto and transport crypto.
(3) The was_async argument is dropped from netfs_read_subreq_terminated(). Instead we wake the read collection work item by either queuing it or waking up the app thread.
(4) We don't need to use BH-excluding locks when communicating between the issuing thread and the collection thread as neither of them now run in BH context.
MISCELLANY ==========
Also included are a number of new tracepoints; a split of the netfslib write collection code to put retrying into its own file (it gets more complicated with content encryption).
There are also some minor fixes AFS included, including fixing the AFS directory format struct layout, reducing some directory over-invalidation and making afs_mkdir() translate EEXIST to ENOTEMPY (which is not available on all systems the servers support).
Finally, there's a patch to try and detect entry into the folio unlock function with no folio_queue structs in the buffer (which isn't allowed in the cases that can get there). This is a debugging patch, but should be minimal overhead.
* patches from https://lore.kernel.org/r/20241216204124.3752367-1-dhowells@redhat.com: (31 commits) netfs: Report on NULL folioq in netfs_writeback_unlock_folios() afs: Add a tracepoint for afs_read_receive() afs: Locally initialise the contents of a new symlink on creation afs: Use the contained hashtable to search a directory afs: Make afs_mkdir() locally initialise a new directory's content netfs: Change the read result collector to only use one work item afs: Make {Y,}FS.FetchData an asynchronous operation afs: Fix cleanup of immediately failed async calls afs: Eliminate afs_read afs: Use netfslib for symlinks, allowing them to be cached afs: Use netfslib for directories afs: Make afs_init_request() get a key if not given a file netfs: Add support for caching single monolithic objects such as AFS dirs netfs: Add functions to build/clean a buffer in a folio_queue afs: Add more tracepoints to do with tracking validity cachefiles: Add auxiliary data trace cachefiles: Add some subrequest tracepoints netfs: Remove some extraneous directory invalidations afs: Fix directory format encoding struct afs: Fix EEXIST error returned from afs_rmdir() to be ENOTEMPTY ...
Link: https://lore.kernel.org/r/20241216204124.3752367-1-dhowells@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
| #
a5b5beeb |
| 16-Dec-2024 |
David Howells <dhowells@redhat.com> |
afs: Use the contained hashtable to search a directory
Each directory image contains a hashtable with 128 buckets to speed up searching. Currently, kafs does not use this, but rather iterates over
afs: Use the contained hashtable to search a directory
Each directory image contains a hashtable with 128 buckets to speed up searching. Currently, kafs does not use this, but rather iterates over all the occupied slots in the image as it can share this with readdir.
Switch kafs to use the hashtable for lookups to reduce the latency. Care must be taken that the hash chains are acyclic.
Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241216204124.3752367-30-dhowells@redhat.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
|
Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
| #
a23e1966 |
| 15-Jul-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.11 merge window.
|
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2 |
|
| #
6f47c7ae |
| 28-May-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.9' into next
Sync up with the mainline to bring in the new cleanup API.
|
|
Revision tags: v6.10-rc1 |
|
| #
60a2f25d |
| 16-May-2024 |
Tvrtko Ursulin <tursulin@ursulin.net> |
Merge drm/drm-next into drm-intel-gt-next
Some display refactoring patches are needed in order to allow conflict- less merging.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
|
|
Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7 |
|
| #
06d07429 |
| 29-Feb-2024 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync to get the drm_printer changes to drm-intel-next.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|