#
7dbec0bb |
| 04-Oct-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'for-6.18/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mikulas Patocka:
- a new dm-pcache target for read/write caching on persistent memory
- fix typos in docs
- misc small refactoring
- mark dm-error with DM_TARGET_PASSES_INTEGRITY
- dm-request-based: fix NULL pointer dereference and quiesce_depth out of sync
- dm-linear: optimize REQ_PREFLUSH
- dm-vdo: return error on corrupted metadata
- dm-integrity: support asynchronous hash interface
* tag 'for-6.18/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (27 commits)
  dm raid: use proper md_ro_state enumerators
  dm-integrity: prefer synchronous hash interface
  dm-integrity: enable asynchronous hash interface
  dm-integrity: rename internal_hash
  dm-integrity: add the "offset" argument
  dm-integrity: allocate the recalculate buffer with kmalloc
  dm-integrity: introduce integrity_kmap and integrity_kunmap
  dm-integrity: replace bvec_kmap_local with kmap_local_page
  dm-integrity: use internal variable for digestsize
  dm vdo: return error on corrupted metadata in start_restoring_volume functions
  dm vdo: Update code to use mem_is_zero
  dm: optimize REQ_PREFLUSH with data when using the linear target
  dm-pcache: use int type to store negative error codes
  dm: fix "writen"->"written"
  dm-pcache: cleanup: fix coding style report by checkpatch.pl
  dm-pcache: remove ctrl_lock for pcache_cache_segment
  dm: fix NULL pointer dereference in __dm_suspend()
  dm: fix queue start/stop imbalance under suspend/load/resume races
  dm-pcache: add persistent cache target in device-mapper
  dm error: mark as DM_TARGET_PASSES_INTEGRITY
  ...
|
#
1f9ad14a |
| 01-Sep-2025 |
Dongsheng Yang <dongsheng.yang@linux.dev> |
dm-pcache: remove ctrl_lock for pcache_cache_segment
The smatch checker reports a “scheduler in atomic context” problem in the following call chain:
miss_read_end_req()
  -> cache_seg_put()
    -> cache_seg_invalidate()
      -> cache_seg_gen_increase()
        -> mutex_lock(&cache_seg->ctrl_lock);
In practice, this `mutex_lock` will not actually schedule, because it is only called when `cache_seg_put()` drops the last reference, which is single-threaded. That is also why the issue never shows up during real testing.
However, the code is still buggy. The original purpose of `ctrl_lock` was to prevent read/write conflicts on the cache segment control information. Looking at the current usage, all control information accesses are single-threaded: reads only occur during the init phase, where no conflicts are possible, and writes happen once in the init phase (also single-threaded) and once when `cache_seg_put()` drops the last reference (again single-threaded).
Therefore, this patch removes `ctrl_lock` entirely and adds comments in the appropriate places to document this logic.
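The reasoning above, that the writer on the final put is single-threaded by construction, can be illustrated with a small userspace sketch. It is not the dm-pcache code: `struct seg`, `seg_get()`, `seg_put()` and the `gen` field are hypothetical stand-ins, and C11 atomics take the place of the kernel's reference-counting helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a cache segment; not the dm-pcache structure. */
struct seg {
	atomic_int refs;	/* number of active users of the segment */
	uint64_t gen;		/* control info: written at init and on the last put */
};

/* Init phase: single-threaded, so plain stores are enough. */
static void seg_init(struct seg *s)
{
	atomic_init(&s->refs, 0);
	s->gen = 0;
}

static void seg_get(struct seg *s)
{
	atomic_fetch_add(&s->refs, 1);
}

/* Returns 1 if this call dropped the last reference, 0 otherwise. */
static int seg_put(struct seg *s)
{
	int old = atomic_fetch_sub(&s->refs, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
	if (old == 1) {
		/*
		 * Exactly one caller can observe the 1 -> 0 transition, so
		 * the control information is updated here without a lock.
		 */
		s->gen++;
		return 1;
	}
	return 0;
}
```

Because the atomic decrement hands the "last reference" role to exactly one caller, that caller never has to sleep on a mutex, which is the property the atomic-context call chain needs.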
Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|
#
1d57628f |
| 12-Aug-2025 |
Dongsheng Yang <dongsheng.yang@linux.dev> |
dm-pcache: add persistent cache target in device-mapper
This patch introduces dm-pcache, a new DM target that places a DAX-capable persistent-memory device in front of any slower block device and uses it as a high-throughput, low-latency cache.
Design highlights
-----------------
- DAX data path – data is copied directly between DRAM and the pmem mapping, bypassing the block layer’s overhead.

- Segmented, crash-consistent layout
  - all layout metadata are dual-replicated CRC-protected.
  - atomic kset flushes; key replay on mount guarantees cache integrity even after power loss.

- Striped multi-tree index
  - Multi-tree indexing for high parallelism.
  - overlap-resolution logic ensures non-intersecting cached extents.

- Background services
  - write-back worker flushes dirty keys in order, preserving backing-device crash consistency. This is important for checkpoint in cloud storage.
  - garbage collector reclaims clean segments when utilisation exceeds a tunable threshold.

- Data integrity – optional CRC32 on cached payload; metadata always protected.
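The "dual-replicated CRC-protected" metadata idea can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: the commit message does not show the on-media format, so `struct meta`, its `seq` field, the rule of alternating between two slots, and all function names here are hypothetical, and the bitwise CRC32 is a generic reference implementation rather than the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record; not the dm-pcache on-media layout. */
struct meta {
	uint32_t crc;	/* CRC32 over seq and value */
	uint64_t seq;	/* assumed: incremented on every update */
	uint64_t value;	/* the protected payload */
};

/* Plain bitwise CRC32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_calc(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int b = 0; b < 8; b++)
			crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
	}
	return ~crc;
}

static uint32_t meta_crc(const struct meta *m)
{
	/* Copy the fields so struct padding never enters the checksum. */
	uint64_t body[2] = { m->seq, m->value };

	return crc32_calc(body, sizeof(body));
}

/* Return the valid replica with the highest seq, or NULL if none is valid. */
static const struct meta *meta_latest(const struct meta r[2])
{
	const struct meta *best = NULL;

	for (int i = 0; i < 2; i++) {
		if (r[i].crc != meta_crc(&r[i]))
			continue;	/* torn or corrupted replica */
		if (!best || r[i].seq > best->seq)
			best = &r[i];
	}
	return best;
}

/* Write the new value into the slot that does NOT hold the latest copy. */
static void meta_update(struct meta r[2], uint64_t value)
{
	const struct meta *cur = meta_latest(r);
	struct meta *dst = (cur == &r[0]) ? &r[1] : &r[0];

	dst->seq = cur ? cur->seq + 1 : 1;
	dst->value = value;
	dst->crc = meta_crc(dst);
}
```

Since an update only ever overwrites the older slot, a write interrupted by power loss damages at most one replica, and `meta_latest()` falls back to the previous, still-valid copy.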
Comparison with existing block-level caches
---------------------------------------------------------------------------------------------------------------------------------
| Feature                          | pcache (this patch)             | bcache                       | dm-writecache             |
|----------------------------------|---------------------------------|------------------------------|---------------------------|
| pmem access method               | DAX                             | bio (block I/O)              | DAX                       |
| Write latency (4 K rand-write)   | ~5 µs                           | ~20 µs                       | ~5 µs                     |
| Concurrency                      | multi subtree index             | global index tree            | single tree + wc_lock     |
| IOPS (4K randwrite, 32 numjobs)  | 2.1 M                           | 352 K                        | 283 K                     |
| Read-cache support               | YES                             | YES                          | NO                        |
| Deployment                       | no re-format of backend         | backend devices must be      | no re-format of backend   |
|                                  |                                 | reformatted                  |                           |
| Write-back ordering              | log-structured;                 | no ordering guarantee        | no ordering guarantee     |
|                                  | preserves app-IO-order          |                              |                           |
| Data integrity checks            | metadata + data CRC (optional)  | metadata CRC only            | none                      |
---------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|