| 303d3284 | 03-Apr-2026 |
Dave Jiang <dave.jiang@intel.com> |
Merge branch 'for-7.1/dax-hmem' into cxl-for-next
The series addresses conflicts between HMEM and CXL when handling Soft Reserved memory ranges. CXL will try best effort in claiming the Soft Reserve
Merge branch 'for-7.1/dax-hmem' into cxl-for-next
The series addresses conflicts between HMEM and CXL when handling Soft Reserved memory ranges. CXL will try best effort in claiming the Soft Reserved memory region that are CXL regions. If fails, it will punt back to HMEM.
tools/testing/cxl: Test dax_hmem takeover of CXL regions tools/testing/cxl: Simulate auto-assembly failure dax/hmem: Parent dax_hmem devices dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices dax/hmem: Reduce visibility of dax_cxl coordination symbols cxl/region: Constify cxl_region_resource_contains() cxl/region: Limit visibility of cxl_region_contains_resource() dax/cxl: Fix HMEM dependencies cxl/region: Fix use-after-free from auto assembly failure dax/hmem, cxl: Defer and resolve Soft Reserved ownership cxl/region: Add helper to check Soft Reserved containment by CXL regions dax: Track all dax_region allocations under a global resource tree dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges dax/hmem: Factor HMEM registration into __hmem_register_device() dax/bus: Use dax_region_put() in alloc_dax_region() error path
show more ...
|
| 549b5c12 | 27-Mar-2026 |
Dan Williams <dan.j.williams@intel.com> |
tools/testing/cxl: Test dax_hmem takeover of CXL regions
When platform firmware is committed to publishing EFI_CONVENTIONAL_MEMORY in the memory map, but CXL fails to assemble the region, dax_hmem c
tools/testing/cxl: Test dax_hmem takeover of CXL regions
When platform firmware is committed to publishing EFI_CONVENTIONAL_MEMORY in the memory map, but CXL fails to assemble the region, dax_hmem can attempt to attach a dax device to the memory range.
Take advantage of the new ability to support multiple "hmem_platform" devices, and to enable regression testing of several scenarios:
* CXL correctly assembles a region, check dax_hmem fails to attach dax * CXL fails to assemble a region, check dax_hmem successfully attaches dax * Check that loading the dax_cxl driver loads the dax_hmem driver * Attempt to race cxl_mock_mem async probe vs dax_hmem probe flushing. Check that both positive and negative cases.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260327052821.440749-10-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 7f5ff740 | 31-Jan-2026 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Move dport RAS setup to dport add time
Towards the end goal of making all CXL RAS capability handling uniform across host bridge ports, upstream switch ports, and endpoint ports, move dpor
cxl/port: Move dport RAS setup to dport add time
Towards the end goal of making all CXL RAS capability handling uniform across host bridge ports, upstream switch ports, and endpoint ports, move dport RAS setup. Move it to cxl_switch_port_probe() context for switch / VH dports (via cxl_port_add_dport()) and cxl_endpoint_port_probe() context for an RCH dport. Rename the RAS setup helper to devm_cxl_dport_ras_setup() for symmetry with devm_cxl_switch_port_decoders_setup().
Only the RCH version needs to be exported and the cxl_test mocking can be deleted with a dev_is_pci() check on the dport_dev.
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260131000403.2135324-7-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 29317f8d | 16-Dec-2025 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Unlike the cxl_pci class driver that opportunistically enables memory expansion with no other dependent functionality, CXL accelerato
cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Unlike the cxl_pci class driver that opportunistically enables memory expansion with no other dependent functionality, CXL accelerator drivers have distinct PCIe-only and CXL-enhanced operation states. If CXL is available some additional coherent memory/cache operations can be enabled, otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
This constitutes a new mode of operation where the caller of devm_cxl_add_memdev() wants to make a "go/no-go" decision about running in CXL accelerated mode or falling back to PCIe-only operation. Part of that decision making process likely also includes additional CXL-acceleration-specific resource setup. Encapsulate both of those requirements into 'struct cxl_memdev_attach' that provides a ->probe() callback. The probe callback runs in cxl_mem_probe() context, after the port topology is successfully attached for the given memdev. It supports a contract where, upon successful return from devm_cxl_add_memdev(), everything needed for CXL accelerated operation has been enabled.
Additionally the presence of @cxlmd->attach indicates that the accelerator driver be detached when CXL operation ends. This conceptually makes a CXL link loss event mirror a PCIe link loss event which results in triggering the ->remove() callback of affected devices+drivers. A driver can re-attach to recover back to PCIe-only operation. Live recovery, i.e. without a ->remove()/->probe() cycle, is left as a future consideration.
[ dj: Repalce with updated commit log from Dan ]
Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251216005616.3090129-7-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| f1840efd | 16-Nov-2025 |
Alison Schofield <alison.schofield@intel.com> |
cxl/test: Assign overflow_err_count from log->nr_overflow
mock_get_event() uses an uninitialized local variable, nr_overflow, to populate the overflow_err_count field. That results in incorrect over
cxl/test: Assign overflow_err_count from log->nr_overflow
mock_get_event() uses an uninitialized local variable, nr_overflow, to populate the overflow_err_count field. That results in incorrect overflow_err_count values in mocked cxl_overflow trace events, such as this case where the records are reported as 0 and should be non-zero:
[] cxl_overflow: memdev=mem7 host=cxl_mem.6 serial=7: log=Failure : 0 records from 1763228189130895685 to 1763228193130896180
Fix by using log->nr_overflow and remove the unused local variable.
A follow-up change was considered in cxl_mem_get_records_log() to confirm that the overflow_err_count is non-zero when the overflow flag is set [1]. Since the driver has no functional dependency on this constraint, and a device that violates this specific requirement does not cause incorrect driver behavior, no validation check is added.
[1] CXL 3.2, Table 8-65 Get Event Records Output Payload
Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20251116013036.1713313-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| b6369daf | 16-Nov-2025 |
Alison Schofield <alison.schofield@intel.com> |
cxl/test: Remove ret_limit race condition in mock_get_event()
Commit 364ee9f3265e ("cxl/test: Enhance event testing") changed the loop iterator in mock_get_event() from a static constant, CXL_TEST_E
cxl/test: Remove ret_limit race condition in mock_get_event()
Commit 364ee9f3265e ("cxl/test: Enhance event testing") changed the loop iterator in mock_get_event() from a static constant, CXL_TEST_EVENT_CNT, to a dynamic global variable, ret_limit. The intent was to vary the number of events returned per call to simulate events occurring while logs are being read.
However, ret_limit is modified without synchronization. When multiple threads call mock_get_event() concurrently, one thread may read ret_limit, another thread may increment it, and the first thread's loop condition and size calculation see and use the updated value.
This is visible during cxl_test module load when all memdevs are initializing simultaneously, which includes getting event records. It is not tied to the cxl-events.sh unit test specifically, as that operates on a single memdev.
While no actual harm results (the buffer is always large enough and the record count fields correctly reflect what was written), this is a correctness issue. The race creates an inconsistent state within mock_get_event() and adding variability based on a race appears unintended.
Make ret_limit a local variable populated from an atomic counter. Each call gets a stable value that won't change during execution. That preserves the intended behavior of varying the return counts across calls while eliminating the race condition.
This implementation uses "+ 1" to produce the full range of 1 to CXL_TEST_EVENT_RET_MAX (4) records. Previously only 1, 2, 3 were produced.
Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20251116013819.1713780-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 68f4a852 | 17-Nov-2025 |
Dave Jiang <dave.jiang@intel.com> |
cxl/test: Add support for acpi extended linear cache
Add the mock wrappers for hmat_get_extended_linear_cache_size() in order to emulate the ACPI helper function for the regions that are mock'd by c
cxl/test: Add support for acpi extended linear cache
Add the mock wrappers for hmat_get_extended_linear_cache_size() in order to emulate the ACPI helper function for the regions that are mock'd by cxl_test.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Link: https://patch.msgid.link/20251117144611.903692-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 4b1c0466 | 17-Nov-2025 |
Dave Jiang <dave.jiang@intel.com> |
cxl/test: Add cxl_test CFMWS support for extended linear cache
Add a module parameter to allow activation of extended linear cache on the auto region for cxl_test. The current platform implementatio
cxl/test: Add cxl_test CFMWS support for extended linear cache
Add a module parameter to allow activation of extended linear cache on the auto region for cxl_test. The current platform implementation for extended linear cache is 1:1 of DRAM and CXL memory. A CFMWS is created with the size of both memory together where DRAM takes the first part of the memory range and CXL covers the second part. The current CXL auto region on cxl_test consists of 2 256M devices that creates a 512M region. The new extended linear cache setup will have 512M DRAM and 512M CXL memory for a total of 1G CFMWS. The hardware decoders must have their starting offset moved to after the DRAM region to handle the CXL regions.
[ dj: Fixup commenting style. (Jonathan) ]
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Link: https://patch.msgid.link/20251117144611.903692-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|