#
033af36d |
| 27-Sep-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'cxl-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull compute express link (cxl) updates from Dave Jiang: "Major changes address HDM decoder initialization from DVS
Merge tag 'cxl-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull compute express link (cxl) updates from Dave Jiang: "Major changes address HDM decoder initialization from DVSEC ranges, refactoring the code related to cxl mailboxes to be independent of the memory devices, and adding support for shared upstream link access_coordinate calculation, as well as a change to remove locking from memory notifier callback.
In addition, a number of misc cleanups and refactoring of the code are also included.
Address HDM decoder initialization from DVSEC ranges: - Only register non-zero DVSEC ranges - Remove duplicate implementation of waiting for memory_info_valid - Simplify the checking of mem_enabled in cxl_hdm_decode_init()
Refactor the code related to cxl mailboxes to be independent of the memory devices: - Move cxl headers in include/linux/ to include/cxl - Move all mailbox related data to 'struct cxl_mailbox' - Refactor mailbox APIs with 'struct cxl_mailbox' as input instead of memory device state
Add support for shared upstream link access_coordinate calculation for configurations that have multiple targets under a switch or a root port where the aggregated bandwidth can be greater than the upstream link of the switch/RP upstream link: - Preserve the CDAT access_coordinate from an endpoint - Add the support for shared upstream link access_coordinate calculation - Add documentation to explain how the calculations are done
Remove locking from memory notifier callback.
Misc cleanups: - Convert devm_cxl_add_root() to return using ERR_CAST() - cxl_test use dev_is_platform() instead of open coding - Remove duplicate include of header core.h in core/cdat.c - use scoped resource management to drop put_device() for cxl_port - Use scoped_guard to drop device_lock() for cxl_port - Refactor __devm_cxl_add_port() to drop gotos - Rename cxl_setup_parent_dport to cxl_dport_init_aer and cxl_dport_map_regs() to cxl_dport_map_ras() - Refactor cxl_dport_init_aer() to be more concise - Remove duplicate host_bridge->native_aer checking in cxl_dport_init_ras_reporting() - Fix comment for cxl_query_cmd()"
* tag 'cxl-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (21 commits) cxl: Add documentation to explain the shared link bandwidth calculation cxl: Calculate region bandwidth of targets with shared upstream link cxl: Preserve the CDAT access_coordinate for an endpoint cxl: Fix comment regarding cxl_query_cmd() return data cxl: Convert cxl_internal_send_cmd() to use 'struct cxl_mailbox' as input cxl: Move mailbox related bits to the same context cxl: move cxl headers to new include/cxl/ directory cxl/region: Remove lock from memory notifier callback cxl/pci: simplify the check of mem_enabled in cxl_hdm_decode_init() cxl/pci: Check Mem_info_valid bit for each applicable DVSEC cxl/pci: Remove duplicated implementation of waiting for memory_info_valid cxl/pci: Fix to record only non-zero ranges cxl/pci: Remove duplicate host_bridge->native_aer checking cxl/pci: cxl_dport_map_rch_aer() cleanup cxl/pci: Rename cxl_setup_parent_dport() and cxl_dport_map_regs() cxl/port: Refactor __devm_cxl_add_port() to drop goto pattern cxl/port: Use scoped_guard()/guard() to drop device_lock() for cxl_port cxl/port: Use __free() to drop put_device() for cxl_port cxl: Remove duplicate included header file core.h tools/testing/cxl: Use dev_is_platform() ...
show more ...
|
Revision tags: v6.11, v6.11-rc7 |
|
#
a5ab0de0 |
| 04-Sep-2024 |
Dave Jiang <dave.jiang@intel.com> |
cxl: Calculate region bandwidth of targets with shared upstream link
The current bandwidth calculation aggregates all the targets. This simple method does not take into account where multiple target
cxl: Calculate region bandwidth of targets with shared upstream link
The current bandwidth calculation aggregates all the targets. This simple method does not take into account where multiple targets sharing under a switch or a root port where the aggregated bandwidth can be greater than the upstream link of the switch.
To accurately account for the shared upstream uplink cases, a new update function is introduced by walking from the leaves to the root of the hierarchy and clamp the bandwidth in the process as needed. This process is done when all the targets for a region are present but before the final values are send to the HMAT handling code cached access_coordinate targets.
The original perf calculation path was kept to calculate the latency performance data that does not require the shared link consideration. The shared upstream link calculation is done as a second pass when all the endpoints have arrived.
Testing is done via qemu with CXL hierarchy. run_qemu[1] is modified to support several CXL hierarchy layouts. The following layouts are tested:
HB: Host Bridge RP: Root Port SW: Switch EP: End Point
2 HB 2 RP 2 EP: resulting bandwidth: 624 1 HB 2 RP 2 EP: resulting bandwidth: 624 2 HB 2 RP 2 SW 4 EP: resulting bandwidth: 624
Current testing, perf number from SRAT/HMAT is hacked into the kernel code. However with new QEMU support of Generic Target Port that's incoming, the perf data injection is no longer needed.
[1]: https://github.com/pmem/run_qemu
Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/linux-cxl/20240501152503.00002e60@Huawei.com/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20240904001316.1688225-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
#
d9a476c8 |
| 04-Sep-2024 |
Ira Weiny <ira.weiny@intel.com> |
cxl/region: Remove lock from memory notifier callback
In testing Dynamic Capacity Device (DCD) support, a lockdep splat revealed an ABBA issue between the memory notifiers and the DCD extent process
cxl/region: Remove lock from memory notifier callback
In testing Dynamic Capacity Device (DCD) support, a lockdep splat revealed an ABBA issue between the memory notifiers and the DCD extent processing code.[0] Changing the lock ordering within DCD proved difficult because regions must be stable while searching for the proper region and then the device lock must be held to properly notify the DAX region driver of memory changes.
Dan points out in the thread that notifiers should be able to trust that it is safe to access static data. Region data is static once the device is realized and until it's destruction. Thus it is better to manage the notifiers within the region driver.
Remove the need for a lock by ensuring the notifiers are active only during the region's lifetime.
Furthermore, remove cxl_region_nid() because resource can't be NULL while the region is stable.
Link: https://lore.kernel.org/all/66b4cf539a79b_a36e829416@iweiny-mobl.notmuch/ [0] Cc: Ying Huang <ying.huang@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240904-fix-notifiers-v3-1-576b4e950266@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
Revision tags: v6.11-rc6 |
|
#
7f569e91 |
| 30-Aug-2024 |
Li Ming <ming4.li@intel.com> |
cxl/port: Use scoped_guard()/guard() to drop device_lock() for cxl_port
A device_lock() and device_unlock() pair can be replaced by a cleanup helper scoped_guard() or guard(), that can enhance code
cxl/port: Use scoped_guard()/guard() to drop device_lock() for cxl_port
A device_lock() and device_unlock() pair can be replaced by a cleanup helper scoped_guard() or guard(), that can enhance code readability. In CXL subsystem, still use device_lock() and device_unlock() pairs for cxl port resource protection, most of them can be replaced by a scoped_guard() or a guard() simply.
Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240830013138.2256244-2-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
#
36ec807b |
| 20-Sep-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.12 merge window.
|
#
f057b572 |
| 06-Sep-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'ib/6.11-rc6-matrix-keypad-spitz' into next
Bring in changes removing support for platform data from matrix-keypad driver.
|
Revision tags: v6.11-rc5, v6.11-rc4 |
|
#
50470d38 |
| 13-Aug-2024 |
Andrii Nakryiko <andrii@kernel.org> |
Merge remote-tracking branch 'vfs/stable-struct_fd'
Merge Al Viro's struct fd refactorings.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
Revision tags: v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
#
3daee2e4 |
| 16-Jul-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.10' into next
Sync up with mainline to bring in device_for_each_child_node_scoped() and other newer APIs.
|
#
66e72a01 |
| 29-Jul-2024 |
Jerome Brunet <jbrunet@baylibre.com> |
Merge tag 'v6.11-rc1' into clk-meson-next
Linux 6.11-rc1
|
#
ee057c8c |
| 14-Aug-2024 |
Steven Rostedt <rostedt@goodmis.org> |
Merge tag 'v6.11-rc3' into trace/ring-buffer/core
The "reserve_mem" kernel command line parameter has been pulled into v6.11. Merge the latest -rc3 to allow the persistent ring buffer memory to be a
Merge tag 'v6.11-rc3' into trace/ring-buffer/core
The "reserve_mem" kernel command line parameter has been pulled into v6.11. Merge the latest -rc3 to allow the persistent ring buffer memory to be able to be mapped at the address specified by the "reserve_mem" command line parameter.
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
#
50c374c6 |
| 22-Aug-2024 |
Alexei Starovoitov <ast@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Cross-merge bpf fixes after downstream PR including important fixes (from bpf-next point of view): commit 41c24102af7b ("selftests/bpf: Fi
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Cross-merge bpf fixes after downstream PR including important fixes (from bpf-next point of view): commit 41c24102af7b ("selftests/bpf: Filter out _GNU_SOURCE when compiling test_cpp") commit fdad456cbcca ("bpf: Fix updating attached freplace prog in prog_array map")
No conflicts.
Adjacent changes in: include/linux/bpf_verifier.h kernel/bpf/verifier.c tools/testing/selftests/bpf/Makefile
Link: https://lore.kernel.org/bpf/20240813234307.82773-1-alexei.starovoitov@gmail.com/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>
show more ...
|
#
c8faf11c |
| 30-Jul-2024 |
Tejun Heo <tj@kernel.org> |
Merge tag 'v6.11-rc1' into for-6.12
Linux 6.11-rc1
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5 |
|
#
8cce4759 |
| 18-Jun-2024 |
Tejun Heo <tj@kernel.org> |
Merge branch 'bpf/for-next' into sched_ext-base
|
#
ed7171ff |
| 16-Aug-2024 |
Lucas De Marchi <lucas.demarchi@intel.com> |
Merge drm/drm-next into drm-xe-next
Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for the display side. This resolves the current conflict for the enable_display module parameter
Merge drm/drm-next into drm-xe-next
Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for the display side. This resolves the current conflict for the enable_display module parameter and allows further pending refactors.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
show more ...
|
#
5c61f598 |
| 12-Aug-2024 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Get drm-misc-next to the state of v6.11-rc2.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
3663e2c4 |
| 01-Aug-2024 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync with v6.11-rc1 in general, and specifically get the new BACKLIGHT_POWER_ constants for power states.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
0e8655b4 |
| 29-Jul-2024 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get a late RC of v6.10 before moving into v6.11.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
4436e6da |
| 02-Aug-2024 |
Thomas Gleixner <tglx@linutronix.de> |
Merge branch 'linus' into x86/mm
Bring x86 and selftests up to date
|
#
5fa35bd3 |
| 01-Aug-2024 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts or adjacent changes.
Link: https://patch.msgid.link/20240801131917.344
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts or adjacent changes.
Link: https://patch.msgid.link/20240801131917.34494-1-pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
a1ff5a7d |
| 30-Jul-2024 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-fixes into drm-misc-fixes
Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
#
e62f81bb |
| 28-Jul-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang: "Core:
- A CXL maturity map has been added to the documentation to detail
Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang: "Core:
- A CXL maturity map has been added to the documentation to detail the current state of CXL enabling.
It provides the status of the current state of various CXL features to inform current and future contributors of where things are and which areas need contribution.
- A notifier handler has been added in order for a newly created CXL memory region to trigger the abstract distance metrics calculation.
This should bring parity for CXL memory to the same level vs hotplugged DRAM for NUMA abstract distance calculation. The abstract distance reflects relative performance used for memory tiering handling.
- An addition for XOR math has been added to address the CXL DPA to SPA translation.
CXL address translation did not support address interleave math with XOR prior to this change.
Fixes:
- Fix to address race condition in the CXL memory hotplug notifier
- Add missing MODULE_DESCRIPTION() for CXL modules
- Fix incorrect vendor debug UUID define
Misc:
- A warning has been added to inform users of an unsupported configuration when mixing CXL VH and RCH/RCD hierarchies
- The ENXIO error code has been replaced with EBUSY for inject poison limit reached via debugfs and cxl-test support
- Moving the PCI config read in cxl_dvsec_rr_decode() to avoid unnecessary PCI config reads
- A refactor to a common struct for DRAM and general media CXL events"
* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/pci: Move reading of control register to immediately before usage cxl: Remove defunct code calculating host bridge target positions cxl/region: Verify target positions using the ordered target list cxl: Restore XOR'd position bits during address translation cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa() cxl/test: Replace ENXIO with EBUSY for inject poison limit reached cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy cxl/core: Fix incorrect vendor debug UUID define Documentation: CXL Maturity Map cxl/region: Simplify cxl_region_nid() cxl/region: Support to calculate memory tier abstract distance cxl/region: Fix a race condition in memory hotplug notifier cxl: add missing MODULE_DESCRIPTION() macros cxl/events: Use a common struct for DRAM and General Media events
show more ...
|
#
56478475 |
| 12-Jul-2024 |
Dave Jiang <dave.jiang@intel.com> |
Merge branch 'for-6.11/xor_fixes' into cxl-for-next
Series to fix XOR math for DPA to SPA translation - Refactor and fold cxl_trace_hpa() into cxl_dpa_to_hpa() - Complete DPA->HPA->SPA translation a
Merge branch 'for-6.11/xor_fixes' into cxl-for-next
Series to fix XOR math for DPA to SPA translation - Refactor and fold cxl_trace_hpa() into cxl_dpa_to_hpa() - Complete DPA->HPA->SPA translation and correct XOR translation issue - Add new method to verify a CXL target position - Remove old method of CXL target position verifiation
show more ...
|
#
82a3e3a2 |
| 03-Jul-2024 |
Alison Schofield <alison.schofield@intel.com> |
cxl/region: Verify target positions using the ordered target list
When a root decoder is configured the interleave target list is read from the BIOS populated CFMWS structure. Per the CXL spec 3.1 T
cxl/region: Verify target positions using the ordered target list
When a root decoder is configured the interleave target list is read from the BIOS populated CFMWS structure. Per the CXL spec 3.1 Table 9-22 the target list is in interleave order. The CXL driver populates its decoder target list in the same order and stores it in 'struct cxl_switch_decoder' field "@target: active ordered target list in current decoder configuration"
Given the promise of an ordered list, the driver can stop duplicating the work of BIOS and simply check target positions against the ordered list during region configuration.
The simplified check against the ordered list is presented here. A follow-on patch will remove the unused code.
For Modulo arithmetic this is not a fix, only a simplification. For XOR arithmetic this is a fix for HB IW of 3,6,12.
Fixes: f9db85bfec0d ("cxl/acpi: Support CXL XOR Interleave Math (CXIMS)") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/35d08d3aba08fee0f9b86ab1cef0c25116ca8a55.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
#
3b2fedcd |
| 03-Jul-2024 |
Alison Schofield <alison.schofield@intel.com> |
cxl: Restore XOR'd position bits during address translation
When a device reports a DPA in events like poison, general_media, and dram, the driver translates that DPA back to an HPA. Presently, the
cxl: Restore XOR'd position bits during address translation
When a device reports a DPA in events like poison, general_media, and dram, the driver translates that DPA back to an HPA. Presently, the CXL driver translation only considers the Modulo position and will report the wrong HPA for XOR configured root decoders.
Add a helper function that restores the XOR'd bits during DPA->HPA address translation. Plumb a root decoder callback to the new helper when XOR interleave arithmetic is in use. For Modulo arithmetic, just let the callback be NULL - as in no extra work required.
Upon completion of a DPA->HPA translation a couple of checks are performed on the result. One simply confirms that the calculated HPA is within the address range of the region. That test is useful for both Modulo and XOR interleave arithmetic decodes.
A second check confirms that the HPA is within an expected chunk based on the endpoints position in the region and the region granularity. An XOR decode disrupts the Modulo pattern making the chunk check useless.
To align the checks with the proper decode, pull the region range check inline and use the helper to do the chunk check for Modulo decodes only.
A cxl-test unit test is posted for upstream review here: https://lore.kernel.org/20240624210644.495563-1-alison.schofield@intel.com/
Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/1a1ac880d9f889bd6384e657e810431b9a0a72e5.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|
#
9aa5f623 |
| 03-Jul-2024 |
Alison Schofield <alison.schofield@intel.com> |
cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
Although cxl_trace_hpa() is used to populate TRACE EVENTs with HPA addresses the work it performs is a DPA to HPA translation not a trace. Tidy u
cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
Although cxl_trace_hpa() is used to populate TRACE EVENTs with HPA addresses the work it performs is a DPA to HPA translation not a trace. Tidy up this naming by moving the minimal work done in cxl_trace_hpa() into cxl_dpa_to_hpa() and use cxl_dpa_to_hpa() for trace event callbacks.
Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/452a9b0c525b774c72d9d5851515ffa928750132.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
show more ...
|