| 6f53bcb4 | 13-Nov-2025 |
Ashley Smith <ashley.smith@collabora.com> |
drm/panthor: Reset queue slots if termination fails
Make sure the queue slot is reset even if we failed termination so we don't have garbage in the CS input interface after a reset. In practice that
drm/panthor: Reset queue slots if termination fails
Make sure the queue slot is reset even if we failed termination so we don't have garbage in the CS input interface after a reset. In practice that's not a problem because we zero out all RW sections when a hangs occurs, but it's safer to reset things manually, in case we decide to not conditionally reload RW sections based on the type of hang.
v4: - Split the changes in two separate patches
v5: - No changes
v6: - Adjust the explanation in the commit message - Drop the Fixes tag - Put after the timeout changes and make the two patches independent so one can be backported, and the other not
v7: - Use the local group variable instead of dereferencing csg_slot->group - Add Steve's R-b
v8: - No changes
Signed-off-by: Ashley Smith <ashley.smith@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patch.msgid.link/20251113105734.1520338-3-boris.brezillon@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| 345c5b7c | 13-Nov-2025 |
Ashley Smith <ashley.smith@collabora.com> |
drm/panthor: Make the timeout per-queue instead of per-job
The timeout logic provided by drm_sched leads to races when we try to suspend it while the drm_sched workqueue queues more jobs. Let's over
drm/panthor: Make the timeout per-queue instead of per-job
The timeout logic provided by drm_sched leads to races when we try to suspend it while the drm_sched workqueue queues more jobs. Let's overhaul the timeout handling in panthor to have our own delayed work that's resumed/suspended when a group is resumed/suspended. When an actual timeout occurs, we call drm_sched_fault() to report it through drm_sched, still. But otherwise, the drm_sched timeout is disabled (set to MAX_SCHEDULE_TIMEOUT), which leaves us in control of how we protect modifications on the timer.
One issue seems to be when we call drm_sched_suspend_timeout() from both queue_run_job() and tick_work() which could lead to races due to drm_sched_suspend_timeout() not having a lock. Another issue seems to be in queue_run_job() if the group is not scheduled, we suspend the timeout again which undoes what drm_sched_job_begin() did when calling drm_sched_start_timeout(). So the timeout does not reset when a job is finished.
v2: - Fix syntax error
v3: - Split the changes in two commits
v4: - No changes
v5: - No changes
v6: - Fix a NULL deref in group_can_run(), and narrow the group variable scope to avoid such mistakes in the future - Add an queue_timeout_is_suspended() helper to clarify things
v7: - No changes
v8: - Don't touch drm_gpu_scheduler::timeout in queue_timedout_job()
Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com> Signed-off-by: Ashley Smith <ashley.smith@collabora.com> Co-developed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patch.msgid.link/20251113105734.1520338-2-boris.brezillon@collabora.com
show more ...
|
| e20c6260 | 14-Nov-2025 |
Loïc Molinari <loic.molinari@collabora.com> |
drm/panthor: Improve IOMMU map/unmap debugging logs
Log the number of pages and their sizes actually mapped/unmapped by the IOMMU page table driver. Since a map/unmap op is often split in several op
drm/panthor: Improve IOMMU map/unmap debugging logs
Log the number of pages and their sizes actually mapped/unmapped by the IOMMU page table driver. Since a map/unmap op is often split in several ops depending on the underlying scatter/gather table, add the start address and the total size to the debugging logs in order to help understand which batch an op is part of.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patch.msgid.link/20251114170303.2800-10-loic.molinari@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| ab349049 | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Add support for Mali-G1 GPUs
Add support for Mali-G1 GPUs (CSF architecture v14), introducing a new panthor_hw_arch_v14 entry with reset and L2 power management operations via the PWR_C
drm/panthor: Add support for Mali-G1 GPUs
Add support for Mali-G1 GPUs (CSF architecture v14), introducing a new panthor_hw_arch_v14 entry with reset and L2 power management operations via the PWR_CONTROL block.
Mali-G1 introduces a dedicated PWR_CONTROL block for managing resets and power domains. panthor_gpu_info_init() is updated to use this block for L2, tiler, and shader domain present register reads.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-9-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| 2008f49a | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Support 64-bit endpoint_req register for Mali-G1
Add support for the 64-bit endpoint_req register introduced in CSF v4.0+ GPUs. Unlike a simple register widening, the 64-bit variant occ
drm/panthor: Support 64-bit endpoint_req register for Mali-G1
Add support for the 64-bit endpoint_req register introduced in CSF v4.0+ GPUs. Unlike a simple register widening, the 64-bit variant occupies the next 64 bits after the original 32-bit field, requiring version-dependent access.
This change introduces helper functions to read, write, and update the endpoint_req register, ensuring correct handling on both pre-v4.0 and v4.0+ firmwares.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-8-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| 51407254 | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Support GLB_REQ.STATE field for Mali-G1 GPUs
Add support for the GLB_REQ.STATE field introduced in CSF v4.1+, which replaces the HALT bit to provide finer control over the MCU state. Th
drm/panthor: Support GLB_REQ.STATE field for Mali-G1 GPUs
Add support for the GLB_REQ.STATE field introduced in CSF v4.1+, which replaces the HALT bit to provide finer control over the MCU state. This change implements basic handling for transitioning the MCU between ACTIVE and HALT states on Mali-G1 GPUs.
The update introduces new helpers to issue the state change requests, poll for MCU halt completion, and restore the MCU to an active state after halting.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-7-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| 9ee52f5c | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Implement soft reset via PWR_CONTROL
Add helpers to issue reset commands through the PWR_CONTROL interface and wait for reset completion using IRQ signaling. This enables support for RE
drm/panthor: Implement soft reset via PWR_CONTROL
Add helpers to issue reset commands through the PWR_CONTROL interface and wait for reset completion using IRQ signaling. This enables support for RESET_SOFT operations with timeout handling and status verification.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-6-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| ee4f9af0 | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Implement L2 power on/off via PWR_CONTROL
This patch adds common helpers to issue power commands, poll transitions, and validate domain state, then wires them into the L2 on/off paths.
drm/panthor: Implement L2 power on/off via PWR_CONTROL
This patch adds common helpers to issue power commands, poll transitions, and validate domain state, then wires them into the L2 on/off paths.
The L2 power-on sequence now delegates control of the SHADER and TILER domains to the MCU when allowed, while the L2 itself is never delegated. On power-off, dependent domains beneath the L2 are checked, and if necessary, retracted and powered down to maintain proper domain ordering.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-5-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| c27787f2 | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Introduce panthor_pwr API and power control framework
Add the new panthor_pwr module, which provides basic power control management for Mali-G1 GPUs. The initial implementation includes
drm/panthor: Introduce panthor_pwr API and power control framework
Add the new panthor_pwr module, which provides basic power control management for Mali-G1 GPUs. The initial implementation includes infrastructure for initializing the PWR_CONTROL block, requesting and handling its IRQ, and checking for PWR_CONTROL support based on GPU architecture.
The patch also integrates panthor_pwr with the device lifecycle (init, suspend, resume, and unplug) through the new API functions. It also registers the IRQ handler under the 'gpu' IRQ as the PWR_CONTROL block is located within the GPU_CONTROL block.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-4-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| 7d334f5c | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Add architecture-specific function operations
Introduce architecture-specific function pointers to support architecture-dependent behaviours. This patch adds the following function poin
drm/panthor: Add architecture-specific function operations
Introduce architecture-specific function pointers to support architecture-dependent behaviours. This patch adds the following function pointers and updates their usage accordingly:
- soft_reset - l2_power_on - l2_power_off
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-3-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| b1075ae1 | 25-Nov-2025 |
Karunika Choo <karunika.choo@arm.com> |
drm/panthor: Add arch-specific panthor_hw binding
This patch adds the framework for binding to a specific panthor_hw structure based on the architecture major value parsed from the GPU_ID register.
drm/panthor: Add arch-specific panthor_hw binding
This patch adds the framework for binding to a specific panthor_hw structure based on the architecture major value parsed from the GPU_ID register. This is in preparation of enabling architecture-specific behaviours based on GPU_ID. As such, it also splits the GPU_ID register read operation into its own helper function.
This framework allows a single panthor_hw structure to be shared across multiple architectures should there be minimal changes between them via the arch_min and arch_max field of the panthor_hw_entry structure, instead of duplicating the structure across multiple architectures.
Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Karunika Choo <karunika.choo@arm.com> Link: https://patch.msgid.link/20251125125548.3282320-2-karunika.choo@arm.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
show more ...
|
| 3668133e | 17-Oct-2025 |
Nicolas Frattaroli <nicolas.frattaroli@collabora.com> |
drm/panthor: Use existing OPP table if present
On SoCs where the GPU's power-domain is in charge of setting performance levels, the OPP table of the GPU node will have already been populated during
drm/panthor: Use existing OPP table if present
On SoCs where the GPU's power-domain is in charge of setting performance levels, the OPP table of the GPU node will have already been populated during said power-domain's attach_dev operation.
To avoid initialising an OPP table twice, only set the OPP regulator and the OPPs from DT if there's no OPP table present.
Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com> Link: https://patch.msgid.link/20251017-mt8196-gpufreq-v8-4-98fc1cc566a1@collabora.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
show more ...
|
| 3dd4844b | 17-Oct-2025 |
Nicolas Frattaroli <nicolas.frattaroli@collabora.com> |
drm/panthor: call into devfreq for current frequency
As it stands, panthor keeps a cached current frequency value for when it wants to retrieve it. This doesn't work well for when things might switc
drm/panthor: call into devfreq for current frequency
As it stands, panthor keeps a cached current frequency value for when it wants to retrieve it. This doesn't work well for when things might switch frequency without panthor's knowledge.
Instead, implement the get_cur_freq operation, and expose it through a helper function to the rest of panthor.
Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Karunika Choo <karunika.choo@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com> Link: https://patch.msgid.link/20251017-mt8196-gpufreq-v8-3-98fc1cc566a1@collabora.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
show more ...
|
| 858a7e41 | 22-Oct-2025 |
Rain Yang <jiyu.yang@nxp.com> |
drm/panthor: attach the driver's multiple power domains
Some platforms, such as i.MX95, utilize multiple power domains that need to be attached explicitly. This patch ensures that the driver properl
drm/panthor: attach the driver's multiple power domains
Some platforms, such as i.MX95, utilize multiple power domains that need to be attached explicitly. This patch ensures that the driver properly attaches all available power domains using devm_pm_domain_attach_list().
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com> Suggested-by: Steven Price <steven.price@arm.com> Signed-off-by: Prabhu Sundararaj <prabhu.sundararaj@nxp.com> Signed-off-by: Rain Yang <jiyu.yang@nxp.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patch.msgid.link/20251022092604.181752-1-jiyu.yang@oss.nxp.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
show more ...
|
| 98dd5143 | 31-Oct-2025 |
Boris Brezillon <boris.brezillon@collabora.com> |
drm/panthor: Fix UAF on kernel BO VA nodes
If the MMU is down, panthor_vm_unmap_range() might return an error. We expect the page table to be updated still, and if the MMU is blocked, the rest of th
drm/panthor: Fix UAF on kernel BO VA nodes
If the MMU is down, panthor_vm_unmap_range() might return an error. We expect the page table to be updated still, and if the MMU is blocked, the rest of the GPU should be blocked too, so no risk of accessing physical memory returned to the system (which the current code doesn't cover for anyway).
Proceed with the rest of the cleanup instead of bailing out and leaving the va_node inserted in the drm_mm, which leads to UAF when other adjacent nodes are removed from the drm_mm tree.
Reported-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/issues/57 Fixes: 8a1cc07578bf ("drm/panthor: Add GEM logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patch.msgid.link/20251031154818.821054-2-boris.brezillon@collabora.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
show more ...
|
| 08be57e6 | 22-Oct-2025 |
Ketil Johnsen <ketil.johnsen@arm.com> |
drm/panthor: Fix race with suspend during unplug
There is a race between panthor_device_unplug() and panthor_device_suspend() which can lead to IRQ handlers running on a powered down GPU. This is ho
drm/panthor: Fix race with suspend during unplug
There is a race between panthor_device_unplug() and panthor_device_suspend() which can lead to IRQ handlers running on a powered down GPU. This is how it can happen: - unplug routine calls drm_dev_unplug() - panthor_device_suspend() can now execute, and will skip a lot of important work because the device is currently marked as unplugged. - IRQs will remain active in this case and IRQ handlers can therefore try to access a powered down GPU.
The fix is simply to take the PM ref in panthor_device_unplug() a little bit earlier, before drm_dev_unplug().
Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com> Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patch.msgid.link/20251022103242.1083311-1-ketil.johnsen@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
show more ...
|
| 65c22519 | 29-Oct-2025 |
Ketil Johnsen <ketil.johnsen@arm.com> |
drm/panthor: disable async work during unplug
A previous change, "drm/panthor: Fix UAF race between device unplug and FW event processing", fixes a real issue where new work was unexpectedly queued
drm/panthor: disable async work during unplug
A previous change, "drm/panthor: Fix UAF race between device unplug and FW event processing", fixes a real issue where new work was unexpectedly queued after cancellation. This was fixed by a disable instead.
Apply the same disable logic to other device level async work on device unplug as a precaution.
Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patch.msgid.link/20251029111412.924104-1-ketil.johnsen@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
show more ...
|