| 63fbf275 | 04-Feb-2026 |
Dave Jiang <dave.jiang@intel.com> |
Merge branch 'for-7.0/cxl-prm-translation' into cxl-for-next
Add support for normalized CXL address translation through ACPI PRM method to support AMD Zen5 platforms. Including a conventions doc tha
Merge branch 'for-7.0/cxl-prm-translation' into cxl-for-next
Add support for normalized CXL address translation through ACPI PRM method to support AMD Zen5 platforms. Including a conventions doc that explains how the translation is implemented and for future implementations that need such setup to comply with the current implementation method.
cxl: Disable HPA/SPA translation handlers for Normalized Addressing cxl/region: Factor out code into cxl_region_setup_poison() cxl/atl: Lock decoders that need address translation cxl: Enable AMD Zen5 address translation using ACPI PRMT cxl/acpi: Prepare use of EFI runtime services cxl: Introduce callback for HPA address ranges translation cxl/region: Use region data to get the root decoder cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() cxl/region: Separate region parameter setup and region construction cxl: Simplify cxl_root_ops allocation and handling cxl/region: Store HPA range in struct cxl_region cxl/region: Store root decoder in struct cxl_region cxl/region: Rename misleading variable name @hpa to @hpa_range Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement cxl, doc: Moving conventions in separate files cxl, doc: Remove isonum.txt inclusion
show more ...
|
| 7f5ff740 | 31-Jan-2026 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Move dport RAS setup to dport add time
Towards the end goal of making all CXL RAS capability handling uniform across host bridge ports, upstream switch ports, and endpoint ports, move dpor
cxl/port: Move dport RAS setup to dport add time
Towards the end goal of making all CXL RAS capability handling uniform across host bridge ports, upstream switch ports, and endpoint ports, move dport RAS setup. Move it to cxl_switch_port_probe() context for switch / VH dports (via cxl_port_add_dport()) and cxl_endpoint_port_probe() context for an RCH dport. Rename the RAS setup helper to devm_cxl_dport_ras_setup() for symmetry with devm_cxl_switch_port_decoders_setup().
Only the RCH version needs to be exported and the cxl_test mocking can be deleted with a dev_is_pci() check on the dport_dev.
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260131000403.2135324-7-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 3864cb60 | 31-Jan-2026 |
Dan Williams <dan.j.williams@intel.com> |
cxl/port: Move dport probe operations to a driver event
In preparation for adding more register setup to the cxl_port_add_dport() path (for RAS register mapping), move the dport creation event to a
cxl/port: Move dport probe operations to a driver event
In preparation for adding more register setup to the cxl_port_add_dport() path (for RAS register mapping), move the dport creation event to a driver callback. This achieves two goals, it puts driver operations logically where they belong, in a driver, and it obviates the gymnastics of DECLARE_TESTABLE() which just makes a mess of grepping for CXL symbols.
In other words, a driver callback is less of an ongoing maintenance burden than this DECLARE_TESTABLE arrangement that does not scale and diminishes the grep-ability of the codebase.
cxl_port_add_dport() moves mostly unmodified from drivers/cxl/core/port.c. The only deliberate change is that it now assumes that the device_lock is held on entry and the driver is attached (just like cxl_port_probe()).
Reviewed-by: Terry Bowman <terry.bowman@amd.com> Tested-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260131000403.2135324-6-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 3f7938b1 | 23-Jan-2026 |
Dave Jiang <dave.jiang@intel.com> |
Merge branch 'for-7.0/cxl-init' into cxl-for-next
Merge in patches to support several patch series such as Soft Reserve handling, type2 accelerator enabling, and LSA 2.1 labeling support. Mainly add
Merge branch 'for-7.0/cxl-init' into cxl-for-next
Merge in patches to support several patch series such as Soft Reserve handling, type2 accelerator enabling, and LSA 2.1 labeling support. Mainly addition of cxl_memdev_attach() to allow the memdev probe to make a decision of proceed/fail depending success of CXL topology enumeration.
dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation cxl/mem: Drop @host argument to devm_cxl_add_memdev() cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup cxl/port: Arrange for always synchronous endpoint attach cxl/mem: Arrange for always-synchronous memdev attach cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
show more ...
|
| 29317f8d | 16-Dec-2025 |
Dan Williams <dan.j.williams@intel.com> |
cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Unlike the cxl_pci class driver that opportunistically enables memory expansion with no other dependent functionality, CXL accelerato
cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Unlike the cxl_pci class driver that opportunistically enables memory expansion with no other dependent functionality, CXL accelerator drivers have distinct PCIe-only and CXL-enhanced operation states. If CXL is available some additional coherent memory/cache operations can be enabled, otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
This constitutes a new mode of operation where the caller of devm_cxl_add_memdev() wants to make a "go/no-go" decision about running in CXL accelerated mode or falling back to PCIe-only operation. Part of that decision making process likely also includes additional CXL-acceleration-specific resource setup. Encapsulate both of those requirements into 'struct cxl_memdev_attach' that provides a ->probe() callback. The probe callback runs in cxl_mem_probe() context, after the port topology is successfully attached for the given memdev. It supports a contract where, upon successful return from devm_cxl_add_memdev(), everything needed for CXL accelerated operation has been enabled.
Additionally the presence of @cxlmd->attach indicates that the accelerator driver be detached when CXL operation ends. This conceptually makes a CXL link loss event mirror a PCIe link loss event which results in triggering the ->remove() callback of affected devices+drivers. A driver can re-attach to recover back to PCIe-only operation. Live recovery, i.e. without a ->remove()/->probe() cycle, is left as a future consideration.
[ dj: Repalce with updated commit log from Dan ]
Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251216005616.3090129-7-dan.j.williams@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| f1840efd | 16-Nov-2025 |
Alison Schofield <alison.schofield@intel.com> |
cxl/test: Assign overflow_err_count from log->nr_overflow
mock_get_event() uses an uninitialized local variable, nr_overflow, to populate the overflow_err_count field. That results in incorrect over
cxl/test: Assign overflow_err_count from log->nr_overflow
mock_get_event() uses an uninitialized local variable, nr_overflow, to populate the overflow_err_count field. That results in incorrect overflow_err_count values in mocked cxl_overflow trace events, such as this case where the records are reported as 0 and should be non-zero:
[] cxl_overflow: memdev=mem7 host=cxl_mem.6 serial=7: log=Failure : 0 records from 1763228189130895685 to 1763228193130896180
Fix by using log->nr_overflow and remove the unused local variable.
A follow-up change was considered in cxl_mem_get_records_log() to confirm that the overflow_err_count is non-zero when the overflow flag is set [1]. Since the driver has no functional dependency on this constraint, and a device that violates this specific requirement does not cause incorrect driver behavior, no validation check is added.
[1] CXL 3.2, Table 8-65 Get Event Records Output Payload
Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20251116013036.1713313-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| b6369daf | 16-Nov-2025 |
Alison Schofield <alison.schofield@intel.com> |
cxl/test: Remove ret_limit race condition in mock_get_event()
Commit 364ee9f3265e ("cxl/test: Enhance event testing") changed the loop iterator in mock_get_event() from a static constant, CXL_TEST_E
cxl/test: Remove ret_limit race condition in mock_get_event()
Commit 364ee9f3265e ("cxl/test: Enhance event testing") changed the loop iterator in mock_get_event() from a static constant, CXL_TEST_EVENT_CNT, to a dynamic global variable, ret_limit. The intent was to vary the number of events returned per call to simulate events occurring while logs are being read.
However, ret_limit is modified without synchronization. When multiple threads call mock_get_event() concurrently, one thread may read ret_limit, another thread may increment it, and the first thread's loop condition and size calculation see and use the updated value.
This is visible during cxl_test module load when all memdevs are initializing simultaneously, which includes getting event records. It is not tied to the cxl-events.sh unit test specifically, as that operates on a single memdev.
While no actual harm results (the buffer is always large enough and the record count fields correctly reflect what was written), this is a correctness issue. The race creates an inconsistent state within mock_get_event() and adding variability based on a race appears unintended.
Make ret_limit a local variable populated from an atomic counter. Each call gets a stable value that won't change during execution. That preserves the intended behavior of varying the return counts across calls while eliminating the race condition.
This implementation uses "+ 1" to produce the full range of 1 to CXL_TEST_EVENT_RET_MAX (4) records. Previously only 1, 2, 3 were produced.
Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20251116013819.1713780-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 68f4a852 | 17-Nov-2025 |
Dave Jiang <dave.jiang@intel.com> |
cxl/test: Add support for acpi extended linear cache
Add the mock wrappers for hmat_get_extended_linear_cache_size() in order to emulate the ACPI helper function for the regions that are mock'd by c
cxl/test: Add support for acpi extended linear cache
Add the mock wrappers for hmat_get_extended_linear_cache_size() in order to emulate the ACPI helper function for the regions that are mock'd by cxl_test.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Link: https://patch.msgid.link/20251117144611.903692-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
| 4b1c0466 | 17-Nov-2025 |
Dave Jiang <dave.jiang@intel.com> |
cxl/test: Add cxl_test CFMWS support for extended linear cache
Add a module parameter to allow activation of extended linear cache on the auto region for cxl_test. The current platform implementatio
cxl/test: Add cxl_test CFMWS support for extended linear cache
Add a module parameter to allow activation of extended linear cache on the auto region for cxl_test. The current platform implementation for extended linear cache is 1:1 of DRAM and CXL memory. A CFMWS is created with the size of both memory together where DRAM takes the first part of the memory range and CXL covers the second part. The current CXL auto region on cxl_test consists of 2 256M devices that creates a 512M region. The new extended linear cache setup will have 512M DRAM and 512M CXL memory for a total of 1G CFMWS. The hardware decoders must have their starting offset moved to after the DRAM region to handle the CXL regions.
[ dj: Fixup commenting style. (Jonathan) ]
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Link: https://patch.msgid.link/20251117144611.903692-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|