|
Revision tags: v7.1-rc2 |
|
| #
0fc8f620 |
| 27-Apr-2026 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Getting fixes and updates from v7.1-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
Revision tags: v7.1-rc1 |
|
| #
f4b369c6 |
| 20-Apr-2026 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 7.1 merge window.
|
|
Revision tags: v7.0, v7.0-rc7, v7.0-rc6, v7.0-rc5, v7.0-rc4 |
|
| #
0421ccdf |
| 12-Mar-2026 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v7.0-rc3' into next
Sync up with the mainline to brig up the latest changes, specifically changes to ALPS driver.
|
| #
f1d26d72 |
| 16-Apr-2026 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'iommu-updates-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel: "Core:
- Support for RISC-V IO-page-table format in generic iom
Merge tag 'iommu-updates-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel: "Core:
- Support for RISC-V IO-page-table format in generic iommupt code
ARM-SMMU Updates:
- Introduction of an "invalidation array" for SMMUv3, which enables future scalability work and optimisations for devices with a large number of SMMUv3 instances
- Update the conditions under which the SMMUv3 driver works around hardware errata for invalidation on MMU-700 implementations
- Fix broken command filtering for the host view of NVIDIA's "cmdqv" SMMUv3 extension
- MMU-500 device-tree binding additions for Qualcomm Eliza & Hawi SoCs
Intel VT-d:
- Support for dirty tracking on domains attached to PASID
- Removal of unnecessary read*()/write*() wrappers
- Improvements to the invalidation paths
AMD Vi:
- Race-condition fixed in debugfs code
- Make log buffer allocation NUMA aware
RISC-V:
- IO-TLB flushing improvements
- Minor fixes"
* tag 'iommu-updates-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (48 commits) iommu/vt-d: Restore IOMMU_CAP_CACHE_COHERENCY dt-bindings: arm-smmu: qcom: Add compatible for Hawi SoC iommu/amd: Invalidate IRT cache for DMA aliases iommu/riscv: Remove overflows on the invalidation path iommu/amd: Fix clone_alias() to use the original device's devid iommu/vt-d: Remove the remaining pages along the invalidation path iommu/vt-d: Pass size_order to qi_desc_piotlb() not npages iommu/vt-d: Split piotlb invalidation into range and all iommu/vt-d: Remove dmar_writel() and dmar_writeq() iommu/vt-d: Remove dmar_readl() and dmar_readq() iommufd/selftest: Test dirty tracking on PASID iommu/vt-d: Support dirty tracking on PASID iommu/vt-d: Rename device_set_dirty_tracking() and pass dmar_domain pointer iommu/vt-d: Block PASID attachment to nested domain with dirty tracking iommu/dma: Always allow DMA-FQ when iommupt provides the iommu_domain iommu/riscv: Fix signedness bug iommu/amd: Fix illegal cap/mmio access in IOMMU debugfs iommu/amd: Fix illegal device-id access in IOMMU debugfs iommu/tegra241-cmdqv: Update uAPI to clarify HYP_OWN requirement iommu/tegra241-cmdqv: Set supports_cmd op in tegra241_vcmdq_hw_init() ...
show more ...
|
| #
f8d5e706 |
| 09-Apr-2026 |
Will Deacon <will@kernel.org> |
Merge branches 'fixes', 'arm/smmu/updates', 'arm/smmu/bindings', 'riscv', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
|
|
Revision tags: v7.0-rc3, v7.0-rc2 |
|
| #
e71e0012 |
| 27-Feb-2026 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Add the RISC-V page table format
The RISC-V format is a fairly simple 5 level page table not unlike the x86 one. It has optional support for a single contiguous page size of 64k (16 x 4k).
iommupt: Add the RISC-V page table format
The RISC-V format is a fairly simple 5 level page table not unlike the x86 one. It has optional support for a single contiguous page size of 64k (16 x 4k).
The specification describes a 32-bit format, the general code can support it via a #define but the iommu side implementation has been left off until a user comes.
Tested-by: Vincent Chen <vincent.chen@sifive.com> Acked-by: Paul Walmsley <pjw@kernel.org> # arch/riscv Reviewed-by: Tomasz Jeznach <tjeznach@rivosinc.com> Tested-by: Tomasz Jeznach <tjeznach@rivosinc.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
|
Revision tags: v7.0-rc1 |
|
| #
ec496f77 |
| 09-Feb-2026 |
Jiri Kosina <jkosina@suse.com> |
Merge branch 'for-6.20/sony' into for-linus
- Support for Rock band 4 PS4 and PS5 guitars (Rosalie Wanders)
|
|
Revision tags: v6.19, v6.19-rc8, v6.19-rc7 |
|
| #
cc4adab1 |
| 20-Jan-2026 |
Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> |
Merge tag 'v6.19-rc1' into msm-next
Merge Linux 6.19-rc1 in order to catch up with other changes (e.g. UBWC config database defining UBWC_6).
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.q
Merge tag 'v6.19-rc1' into msm-next
Merge Linux 6.19-rc1 in order to catch up with other changes (e.g. UBWC config database defining UBWC_6).
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
show more ...
|
|
Revision tags: v6.19-rc6, v6.19-rc5, v6.19-rc4, v6.19-rc3, v6.19-rc2 |
|
| #
5add3c3c |
| 19-Dec-2025 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
Merge drm/drm-next into drm-xe-next
Backmerging to bring in 6.19-rc1. An important upstream bugfix and to help unblock PTL CI.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
| #
ec439c38 |
| 17-Dec-2025 |
Alexei Starovoitov <ast@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.19-rc1
Cross-merge BPF and other fixes after downstream PR.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
| #
b8304863 |
| 15-Dec-2025 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync-up some display code needed for Async flips refactor.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
| #
7f790dd2 |
| 15-Dec-2025 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-next into drm-misc-next
Let's kickstart the v6.20 (7.0?) release cycle.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
| #
24f171c7 |
| 21-Dec-2025 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v6.19-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.19
We've been quite busy with fixes since the merge window, though
Merge tag 'asoc-fix-v6.19-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.19
We've been quite busy with fixes since the merge window, though not in any particularly exciting ways - the standout thing is the fix for _SX controls which were broken by a change to how we do clamping, otherwise it's all fairly run of the mill fixes and quirks.
show more ...
|
| #
84318277 |
| 15-Dec-2025 |
Maarten Lankhorst <dev@lankhorst.se> |
Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes
Pull in rc1 to include all changes since the merge window closed, and grab all fixes and changes from drm/drm-next.
Signed-off-by: M
Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes
Pull in rc1 to include all changes since the merge window closed, and grab all fixes and changes from drm/drm-next.
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
show more ...
|
|
Revision tags: v6.19-rc1 |
|
| #
ce5cfb0f |
| 05-Dec-2025 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'iommu-updates-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
- Introduction of the generic IO page-table framework with suppor
Merge tag 'iommu-updates-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
- Introduction of the generic IO page-table framework with support for Intel and AMD IOMMU formats from Jason.
This has good potential for unifying more IO page-table implementations and making future enhancements more easy. But this also needed quite some fixes during development. All known issues have been fixed, but my feeling is that there is a higher potential than usual that more might be needed.
- Intel VT-d updates: - Use right invalidation hint in qi_desc_iotlb() - Reduce the scope of INTEL_IOMMU_FLOPPY_WA
- ARM-SMMU updates: - Qualcomm device-tree binding updates for Kaanapali and Glymur SoCs and a new clock for the TBU. - Fix error handling if level 1 CD table allocation fails. - Permit more than the architectural maximum number of SMRs for funky Qualcomm mis-implementations of SMMUv2.
- Mediatek driver: - MT8189 iommu support
- Move ARM IO-pgtable selftests to kunit
- Device leak fixes for a couple of drivers
- Random smaller fixes and improvements
* tag 'iommu-updates-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (81 commits) iommupt/vtd: Support mgaw's less than a 4 level walk for first stage iommupt/vtd: Allow VT-d to have a larger table top than the vasz requires powerpc/pseries/svm: Make mem_encrypt.h self contained genpt: Make GENERIC_PT invisible iommupt: Avoid a compiler bug with sw_bit iommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metal iommupt: Fix unlikely flows in increase_top() iommu/amd: Propagate the error code returned by __modify_irte_ga() in modify_irte_ga() MAINTAINERS: Update my email address iommu/arm-smmu-v3: Fix error check in arm_smmu_alloc_cd_tables dt-bindings: iommu: qcom_iommu: Allow 'tbu' clock iommu/vt-d: Restore previous domain::aperture_end calculation iommu/vt-d: Fix unused invalidation hint in qi_desc_iotlb iommu/vt-d: Set INTEL_IOMMU_FLOPPY_WA depend on BLK_DEV_FD iommu/tegra: fix device leak on probe_device() iommu/sun50i: fix device leak on of_xlate() iommu/omap: simplify probe_device() error handling iommu/omap: fix device leaks on probe_device() iommu/mediatek-v1: add missing larb count sanity check iommu/mediatek-v1: fix device leaks on probe() ...
show more ...
|
|
Revision tags: v6.18 |
|
| #
0d081b16 |
| 28-Nov-2025 |
Joerg Roedel <joerg.roedel@amd.com> |
Merge branches 'arm/smmu/updates', 'arm/smmu/bindings', 'mediatek', 'nvidia/tegra', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
|
|
Revision tags: v6.18-rc7, v6.18-rc6 |
|
| #
01569c21 |
| 12-Nov-2025 |
Geert Uytterhoeven <geert+renesas@glider.be> |
genpt: Make GENERIC_PT invisible
There is no point in asking the user about the Generic Radix Page Table API: - All IOMMU drivers that use this API already select GENERIC_PT when needed, - M
genpt: Make GENERIC_PT invisible
There is no point in asking the user about the Generic Radix Page Table API: - All IOMMU drivers that use this API already select GENERIC_PT when needed, - Most users probably do not know what to answer anyway.
Fixes: 7c5b184db7145fd4 ("genpt: Generic Page Table base API") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
|
Revision tags: v6.18-rc5 |
|
| #
5cb637d9 |
| 06-Nov-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Documentation fixes
Some adjustments pointed out by Randy:
"decodes an full 64-bit" -> "decodes the full 64 bit"
Correct the function parameter name for iova_to_phys()
Use the recomme
iommupt: Documentation fixes
Some adjustments pointed out by Randy:
"decodes an full 64-bit" -> "decodes the full 64 bit"
Correct the function parameter name for iova_to_phys()
Use the recommended section heading style.
Suggested-by: Randy Dunlap <rdunlap@infradead.org> Fixes: ab0b572847ac ("genpt: Add Documentation/ files") Fixes: 879ced2bab1b ("iommupt: Add the AMD IOMMU v1 page table format") Fixes: 9d4c274cd7d5 ("iommupt: Add iova_to_phys op") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
|
Revision tags: v6.18-rc4, v6.18-rc3 |
|
| #
5448c155 |
| 23-Oct-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Add the Intel VT-d second stage page table format
The VT-d second stage format is almost the same as the x86 PAE format, except the bit encodings in the PTE are different and a few new PTE
iommupt: Add the Intel VT-d second stage page table format
The VT-d second stage format is almost the same as the x86 PAE format, except the bit encodings in the PTE are different and a few new PTE features, like force coherency are present.
Among all the formats it is unique in not having a designated present bit.
Comparing the performance of several operations to the existing version:
iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 53,66 , 50,64 , 21.21 2^21, 59,70 , 56,67 , 16.16 2^30, 54,66 , 52,63 , 17.17 256*2^12, 384,524 , 337,516 , 34.34 256*2^21, 387,632 , 336,626 , 46.46 256*2^30, 376,629 , 323,623 , 48.48
iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 67,86 , 63,84 , 25.25 2^21, 64,84 , 59,80 , 26.26 2^30, 59,78 , 56,74 , 24.24 256*2^12, 216,335 , 198,317 , 37.37 256*2^21, 245,350 , 232,344 , 32.32 256*2^30, 248,345 , 226,339 , 33.33
Cc: Tina Zhang <tina.zhang@intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
| #
aef5de75 |
| 04-Nov-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Add the x86 64 bit page table format
This is used by x86 CPUs and can be used in AMD/VT-d x86 IOMMUs. When a x86 IOMMU is running SVA the MM will be using this format.
This implementation
iommupt: Add the x86 64 bit page table format
This is used by x86 CPUs and can be used in AMD/VT-d x86 IOMMUs. When a x86 IOMMU is running SVA the MM will be using this format.
This implementation follows the AMD v2 io-pgtable version.
There is nothing remarkable here, the format can have 4 or 5 levels and limited support for different page sizes. No contiguous pages support.
x86 uses a sign extension mechanism where the top bits of the VA must match the sign bit. The core code supports this through PT_FEAT_SIGN_EXTEND which creates and upper and lower VA range. All the new operations will work correctly in both spaces, however currently there is no way to report the upper space to other layers. Future patches can improve that.
In principle this can support 3 page tables levels matching the 32 bit PAE table format, but no iommu driver needs this. The focus is on the modern 64 bit 4 and 5 level formats.
Comparing the performance of several operations to the existing version:
iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 71,61 , 66,58 , -13.13 2^21, 66,60 , 61,55 , -10.10 2^30, 59,56 , 56,54 , -3.03 256*2^12, 392,1360 , 345,1289 , 73.73 256*2^21, 383,1159 , 335,1145 , 70.70 256*2^30, 378,965 , 331,892 , 62.62
iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 77,71 , 73,68 , -7.07 2^21, 76,70 , 70,66 , -6.06 2^30, 69,66 , 66,63 , -4.04 256*2^12, 225,899 , 210,870 , 75.75 256*2^21, 262,722 , 248,710 , 65.65 256*2^30, 251,643 , 244,634 , 61.61
The small -ve values in the iommu_unmap() are due to the core code calling iommu_pgsize() before invoking the domain op. This is unncessary with this implementation. Future work optimizes this and gets to 2%, 4%, 3%.
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
| #
1dd4187f |
| 04-Nov-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Add a kunit test for Generic Page Table
This intends to have high coverage of the page table format functions, it uses the IOMMU implementation to create a tree which it then walks through
iommupt: Add a kunit test for Generic Page Table
This intends to have high coverage of the page table format functions, it uses the IOMMU implementation to create a tree which it then walks through and directly calls the generic page table functions to test them.
It is a good starting point to test a new format header as it is often able to find typos and inconsistencies much more directly, rather than with an obscure failure in the iommu implementation.
The tests can be run with commands like:
tools/testing/kunit/kunit.py run --build_dir build_kunit_arm64 --arch arm64 --make_options LLVM=-19 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_uml --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig --kconfig_add CONFIG_WERROR=n tools/testing/kunit/kunit.py run --build_dir build_kunit_x86_64 --arch x86_64 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_i386 --arch i386 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_i386pae --arch i386 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig --kconfig_add CONFIG_X86_PAE=y
There are several interesting corner cases on the 32 bit platforms that need checking.
The format can declare a list of configurations that generate different configurations the initialize the page table, for instance with different top levels or other parameters. The kunit will turn these into "params" which cause each test to run multiple times.
The tests are repeated to run at every table level to check that all the item encoding formats work.
The following are checked: - Basic init works for each configuration - The various log2 functions have the expected behavior at the limits - pt_compute_best_pgsize() works - pt_table_pa() reads back what pt_install_table() writes - range.max_vasz_lg2 works properly - pt_table_oa_lg2sz() and pt_table_item_lg2sz() use a contiguous non-overlapping set of bits from the VA up to the defined max_va - pt_possible_sizes() and pt_can_have_leaf() produces a sensible layout - pt_item_oa(), pt_entry_oa(), and pt_entry_num_contig_lg2() read back what pt_install_leaf_entry() writes - pt_clear_entry() works - pt_attr_from_entry() reads back what pt_iommu_set_prot() & pt_install_leaf_entry() writes - pt_entry_set_write_clean(), pt_entry_make_write_dirty(), and pt_entry_write_is_dirty() work
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
| #
879ced2b |
| 04-Nov-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Add the AMD IOMMU v1 page table format
AMD IOMMU v1 is unique in supporting contiguous pages with a variable size and it can decode the full 64 bit VA space. Unlike other x86 page tables th
iommupt: Add the AMD IOMMU v1 page table format
AMD IOMMU v1 is unique in supporting contiguous pages with a variable size and it can decode the full 64 bit VA space. Unlike other x86 page tables this explicitly does not do sign extension as part of allowing the entire 64 bit VA space to be supported.
The general design is quite similar to the x86 PAE format, except with a 6th level and quite different PTE encoding.
This format is the only one that uses the PT_FEAT_DYNAMIC_TOP feature in the existing code as the existing AMDv1 code starts out with a 3 level table and adds levels on the fly if more IOVA is needed.
Comparing the performance of several operations to the existing version:
iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 65,64 , 62,61 , -1.01 2^13, 70,66 , 67,62 , -8.08 2^14, 73,69 , 71,65 , -9.09 2^15, 78,75 , 75,71 , -5.05 2^16, 89,89 , 86,84 , -2.02 2^17, 128,121 , 124,112 , -10.10 2^18, 175,175 , 170,163 , -4.04 2^19, 264,306 , 261,279 , 6.06 2^20, 444,525 , 438,489 , 10.10 2^21, 60,62 , 58,59 , 1.01 256*2^12, 381,1833 , 367,1795 , 79.79 256*2^21, 375,1623 , 356,1555 , 77.77 256*2^30, 356,1338 , 349,1277 , 72.72
iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 76,89 , 71,86 , 17.17 2^13, 79,89 , 75,86 , 12.12 2^14, 78,90 , 74,86 , 13.13 2^15, 82,89 , 74,86 , 13.13 2^16, 79,89 , 74,86 , 13.13 2^17, 81,89 , 77,87 , 11.11 2^18, 90,92 , 87,89 , 2.02 2^19, 91,93 , 88,90 , 2.02 2^20, 96,95 , 91,92 , 1.01 2^21, 72,88 , 68,85 , 20.20 256*2^12, 372,6583 , 364,6251 , 94.94 256*2^21, 398,6032 , 392,5758 , 93.93 256*2^30, 396,5665 , 389,5258 , 92.92
The ~5-17x speedup when working with mutli-PTE map/unmaps is because the AMD implementation rewalks the entire table on every new PTE while this version retains its position. The same speedup will be seen with dirtys as well.
The old implementation triggers a compiler optimization that ends up generating a "rep stos" memset for contiguous PTEs. Since AMD can have contiguous PTEs that span 2Kbytes of table this is a huge win compared to a normal movq loop. It is why the unmap side has a fairly flat runtime as the contiguous PTE sides increases. This version makes it explicit with a memset64() call.
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
| #
cdb39d91 |
| 04-Nov-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
iommupt: Add the basic structure of the iommu implementation
The existing IOMMU page table implementations duplicate all of the working algorithms for each format. By using the generic page table AP
iommupt: Add the basic structure of the iommu implementation
The existing IOMMU page table implementations duplicate all of the working algorithms for each format. By using the generic page table API a single C version of the IOMMU algorithms can be created and re-used for all of the different formats used in the drivers. The implementation will provide a single C version of the iommu domain operations: iova_to_phys, map, unmap, and read_and_clear_dirty.
Further, adding new algorithms and techniques becomes easy to do across the entire fleet of drivers and formats.
The C functions are drop in compatible with the existing iommu_domain_ops using the IOMMU_PT_DOMAIN_OPS() macro. Each per-format implementation compilation unit will produce exported symbols following the pattern pt_iommu_FMT_map_pages() which the macro directly maps to the iommu_domain_ops members. This avoids the additional function pointer indirection like io-pgtable has.
The top level struct used by the drivers is pt_iommu_table_FMT. It contains the other structs to allow container_of() to move between the driver, iommu page table, generic page table, and generic format layers.
struct pt_iommu_table_amdv1 { struct pt_iommu { struct iommu_domain domain; } iommu; struct pt_amdv1 { struct pt_common common; } amdpt; };
The driver is expected to union the pt_iommu_table_FMT with its own existing domain struct:
struct driver_domain { union { struct iommu_domain domain; struct pt_iommu_table_amdv1 amdv1; }; }; PT_IOMMU_CHECK_DOMAIN(struct driver_domain, amdv1, domain);
To create an alias to avoid renaming 'domain' in a lot of driver code.
This allows all the layers to access all the necessary functions to implement their different roles with no change to any of the existing iommu core code.
Implement the basic starting point: pt_iommu_init(), get_info() and deinit().
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|
| #
7c5b184d |
| 04-Nov-2025 |
Jason Gunthorpe <jgg@nvidia.com> |
genpt: Generic Page Table base API
The generic API is intended to be separated from the implementation of page table algorithms. It contains only accessors for walking and manipulating the table and
genpt: Generic Page Table base API
The generic API is intended to be separated from the implementation of page table algorithms. It contains only accessors for walking and manipulating the table and helpers that are useful for building an implementation. Memory management is not in the generic API, but part of the implementation.
Using a multi-compilation approach the implementation module would include headers in this order:
common.h defs_FMT.h pt_defs.h FMT.h pt_common.h IMPLEMENTATION.h
Where each compilation unit would have a combination of FMT and IMPLEMENTATION to produce a per-format per-implementation module.
The API is designed so that the format headers have minimal logic, and default implementations are provided if the format doesn't include one.
Generally formats provide their code via an inline function using the pattern:
static inline FMTpt_XX(..) {} #define pt_XX FMTpt_XX
The common code then enforces a function signature so that there is no drift in function arguments, or accidental polymorphic functions (as has been slightly troublesome in mm). Use of function-like #defines are avoided in the format even though many of the functions are small enough.
Provide kdocs for the API surface.
This is enough to implement the 8 initial format variations with all of their features: * Entries comprised of contiguous blocks of IO PTEs for larger page sizes (AMDv1, ARMv8) * Multi-level tables, up to 6 levels. Runtime selected top level * The size of the top table level can be selected at runtime (ARM's concatenated tables) * The number of levels in the table can optionally increase dynamically during map (AMDv1) * Optional leaf entries at any level * 32 bit/64 bit virtual and output addresses, using every bit * Sign extended addressing (x86) * Dirty tracking
A basic simple format takes about 200 lines to declare the require inline functions.
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
show more ...
|