6a7e448c | 08-Mar-2024 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Refactor/simplify reset logic
The current logic for handling resets is more complicated than it needs to be. The deferred_reset flag is used to indicate a reset is needed and the deferred_
vfio/pds: Refactor/simplify reset logic
The current logic for handling resets is more complicated than it needs to be. The deferred_reset flag is used to indicate a reset is needed and the deferred_reset_state is the requested, post-reset, state.
Also, the deferred_reset logic was added to vfio migration drivers to prevent a circular locking dependency with respect to mm_lock and state mutex. This is mainly because of the copy_to/from_user() functions(which takes mm_lock) invoked under state mutex.
Remove all of the deferred reset logic and just pass the requested next state to pds_vfio_reset() so it can be used for VMM and DSC initiated resets.
This removes the need for pds_vfio_state_mutex_lock(), so remove that and replace its use with a simple mutex_unlock().
Also, remove the reset_mutex as it's no longer needed since the state_mutex can be the driver's primary protector.
Suggested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Link: https://lore.kernel.org/r/20240308182149.22036-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
457f7308 | 08-Mar-2024 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Make sure migration file isn't accessed after reset
It's possible the migration file is accessed after reset when it has been cleaned up, especially when it's initiated by the device. This
vfio/pds: Make sure migration file isn't accessed after reset
It's possible the migration file is accessed after reset when it has been cleaned up, especially when it's initiated by the device. This is because the driver doesn't rip out the filep when cleaning up it only frees the related page structures and sets its local struct pds_vfio_lm_file pointer to NULL. This can cause a NULL pointer dereference, which is shown in the example below during a restore after a device initiated reset:
BUG: kernel NULL pointer dereference, address: 000000000000000c PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:pds_vfio_get_file_page+0x5d/0xf0 [pds_vfio_pci] [...] Call Trace: <TASK> pds_vfio_restore_write+0xf6/0x160 [pds_vfio_pci] vfs_write+0xc9/0x3f0 ? __fget_light+0xc9/0x110 ksys_write+0xb5/0xf0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...]
Add a disabled flag to the driver's struct pds_vfio_lm_file that gets set during cleanup. Then make sure to check the flag when the migration file is accessed via its file_operations. By default this flag will be false as the memory for struct pds_vfio_lm_file is kzalloc'd, which means the struct pds_vfio_lm_file is enabled and accessible. Also, since the file_operations and driver's migration file cleanup happen under the protection of the same pds_vfio_lm_file.lock, using this flag is thread safe.
Fixes: 8512ed256334 ("vfio/pds: Always clear the save/restore FDs on reset") Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Link: https://lore.kernel.org/r/20240308182149.22036-2-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
2e7c6feb | 17-Nov-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Add multi-region support
Only supporting a single region/range is limiting, wasteful, and in some cases broken (i.e. when there are large gaps in the iova memory ranges). Fix this by addin
vfio/pds: Add multi-region support
Only supporting a single region/range is limiting, wasteful, and in some cases broken (i.e. when there are large gaps in the iova memory ranges). Fix this by adding support for multiple regions based on what the device tells the driver it can support.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231117001207.2793-7-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
0c320f22 | 17-Nov-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Move seq/ack bitmaps into region struct
Since the host seq/ack bitmaps are part of a region move them into struct pds_vfio_region. Also, make use of the bmp_bytes value for validation purp
vfio/pds: Move seq/ack bitmaps into region struct
Since the host seq/ack bitmaps are part of a region move them into struct pds_vfio_region. Also, make use of the bmp_bytes value for validation purposes.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231117001207.2793-6-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
87bdf980 | 17-Nov-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Pass region info to relevant functions
A later patch in the series implements multi-region support. That will require specific regions to be passed to relevant functions. Prepare for that
vfio/pds: Pass region info to relevant functions
A later patch in the series implements multi-region support. That will require specific regions to be passed to relevant functions. Prepare for that change by passing the region structure to relevant functions.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231117001207.2793-5-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
3f589813 | 17-Nov-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Move and rename region specific info
An upcoming change in this series will add support for multiple regions. To prepare for that, move region specific information into struct pds_vfio_reg
vfio/pds: Move and rename region specific info
An upcoming change in this series will add support for multiple regions. To prepare for that, move region specific information into struct pds_vfio_region and rename the members for readability. This will reduce the size of the patch that actually implements multiple region support.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231117001207.2793-4-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
3b8f7a24 | 17-Nov-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Only use a single SGL for both seq and ack
Since the seq/ack operations never happen in parallel there is no need for multiple scatter gather lists per region. The current implementation i
vfio/pds: Only use a single SGL for both seq and ack
Since the seq/ack operations never happen in parallel there is no need for multiple scatter gather lists per region. The current implementation is wasting memory. Fix this by only using a single scatter-gather list for both the seq and ack operations.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231117001207.2793-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
ae2667cd | 22-Nov-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Fix possible sleep while in atomic context
The driver could possibly sleep while in atomic context resulting in the following call trace while CONFIG_DEBUG_ATOMIC_SLEEP=y is set:
BUG: sle
vfio/pds: Fix possible sleep while in atomic context
The driver could possibly sleep while in atomic context resulting in the following call trace while CONFIG_DEBUG_ATOMIC_SLEEP=y is set:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2817, name: bash preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 Call Trace: <TASK> dump_stack_lvl+0x36/0x50 __might_resched+0x123/0x170 mutex_lock+0x1e/0x50 pds_vfio_put_lm_file+0x1e/0xa0 [pds_vfio_pci] pds_vfio_put_save_file+0x19/0x30 [pds_vfio_pci] pds_vfio_state_mutex_unlock+0x2e/0x80 [pds_vfio_pci] pci_reset_function+0x4b/0x70 reset_store+0x5b/0xa0 kernfs_fop_write_iter+0x137/0x1d0 vfs_write+0x2de/0x410 ksys_write+0x5d/0xd0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
This can happen if pds_vfio_put_restore_file() and/or pds_vfio_put_save_file() grab the mutex_lock(&lm_file->lock) while the spin_lock(&pds_vfio->reset_lock) is held, which can happen during while calling pds_vfio_state_mutex_unlock().
Fix this by changing the reset_lock to reset_mutex so there are no such conerns. Also, make sure to destroy the reset_mutex in the driver specific VFIO device release function.
This also fixes a spinlock bad magic BUG that was caused by not calling spinlock_init() on the reset_lock. Since, the lock is being changed to a mutex, make sure to call mutex_init() on it.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/kvm/1f9bc27b-3de9-4891-9687-ba2820c1b390@moroto.mountain/ Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231122192532.25791-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
13578d4e | 24-Oct-2023 |
Joao Martins <joao.m.martins@oracle.com> |
iommufd/iova_bitmap: Move symbols to IOMMUFD namespace
Have the IOVA bitmap exported symbols adhere to the IOMMUFD symbol export convention i.e. using the IOMMUFD namespace. In doing so, import the
iommufd/iova_bitmap: Move symbols to IOMMUFD namespace
Have the IOVA bitmap exported symbols adhere to the IOMMUFD symbol export convention i.e. using the IOMMUFD namespace. In doing so, import the namespace in the current users. This means VFIO and the vfio-pci drivers that use iova_bitmap_set().
Link: https://lore.kernel.org/r/20231024135109.73787-4-joao.m.martins@oracle.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
27004f89 | 14-Sep-2023 |
Shixiong Ou <oushixiong@kylinos.cn> |
vfio/pds: Use proper PF device access helper
The pci_physfn() helper exists to support cases where the physfn field may not be compiled into the pci_dev structure. We've declared this driver depende
vfio/pds: Use proper PF device access helper
The pci_physfn() helper exists to support cases where the physfn field may not be compiled into the pci_dev structure. We've declared this driver dependent on PCI_IOV to avoid this problem, but regardless we should follow the precedent not to access this field directly.
Signed-off-by: Shixiong Ou <oushixiong@kylinos.cn> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230914021332.1929155-1-oushixiong@kylinos.cn Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
642265e2 | 21-Aug-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Send type for SUSPEND_STATUS command
Commit bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") added live migration support for the pds-vfio-pci driver. When sending the SUSPEND co
vfio/pds: Send type for SUSPEND_STATUS command
Commit bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") added live migration support for the pds-vfio-pci driver. When sending the SUSPEND command to the device, the driver sets the type of suspend (i.e. P2P or FULL). However, the driver isn't sending the type of suspend for the SUSPEND_STATUS command, which will result in failures. Fix this by also sending the suspend type in the SUSPEND_STATUS command.
Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230821184215.34564-1-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
2d12d18f | 19-Aug-2023 |
Yang Yingliang <yangyingliang@huawei.com> |
vfio/pds: fix return value in pds_vfio_get_lm_file()
anon_inode_getfile() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR().
Fixes: bb500dbe2ac6
vfio/pds: fix return value in pds_vfio_get_lm_file()
anon_inode_getfile() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR().
Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://lore.kernel.org/r/20230819023716.3469037-1-yangyingliang@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
fc9da661 | 07-Aug-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Add Kconfig and documentation
Add Kconfig entries and pds-vfio-pci.rst. Also, add an entry in the MAINTAINERS file for this new driver.
It's not clear where documentation for vendor speci
vfio/pds: Add Kconfig and documentation
Add Kconfig entries and pds-vfio-pci.rst. Also, add an entry in the MAINTAINERS file for this new driver.
It's not clear where documentation for vendor specific VFIO drivers should live, so just re-use the current amd ethernet location.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-9-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
7dabb1bc | 07-Aug-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Add support for firmware recovery
It's possible that the device firmware crashes and is able to recover due to some configuration and/or other issue. If a live migration is in progress whi
vfio/pds: Add support for firmware recovery
It's possible that the device firmware crashes and is able to recover due to some configuration and/or other issue. If a live migration is in progress while the firmware crashes, the live migration will fail. However, the VF PCI device should still be functional post crash recovery and subsequent migrations should go through as expected.
When the pds_core device notices that firmware crashes it sends an event to all its client drivers. When the pds_vfio driver receives this event while migration is in progress it will request a deferred reset on the next migration state transition. This state transition will report failure as well as any subsequent state transition requests from the VMM/VFIO. Based on uapi/vfio.h the only way out of VFIO_DEVICE_STATE_ERROR is by issuing VFIO_DEVICE_RESET. Once this reset is done, the migration state will be reset to VFIO_DEVICE_STATE_RUNNING and migration can be performed.
If the event is received while no migration is in progress (i.e. the VM is in normal operating mode), then no actions are taken and the migration state remains VFIO_DEVICE_STATE_RUNNING.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-8-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
f232836a | 07-Aug-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Add support for dirty page tracking
In order to support dirty page tracking, the driver has to implement the VFIO subsystem's vfio_log_ops. This includes log_start, log_stop, and log_read_
vfio/pds: Add support for dirty page tracking
In order to support dirty page tracking, the driver has to implement the VFIO subsystem's vfio_log_ops. This includes log_start, log_stop, and log_read_and_clear.
All of the tracker resources are allocated and dirty tracking on the device is started during log_start. The resources are cleaned up and dirty tracking on the device is stopped during log_stop. The dirty pages are determined and reported during log_read_and_clear.
In order to support these callbacks admin queue commands are used. All of the adminq queue command structures and implementations are included as part of this patch.
PDS_LM_CMD_DIRTY_STATUS is added to query the current status of dirty tracking on the device. This includes if it's enabled (i.e. number of regions being tracked from the device's perspective) and the maximum number of regions supported from the device's perspective.
PDS_LM_CMD_DIRTY_ENABLE is added to enable dirty tracking on the specified number of regions and their iova ranges.
PDS_LM_CMD_DIRTY_DISABLE is added to disable dirty tracking for all regions on the device.
PDS_LM_CMD_READ_SEQ and PDS_LM_CMD_DIRTY_WRITE_ACK are added to support reading and acknowledging the currently dirtied pages.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Link: https://lore.kernel.org/r/20230807205755.29579-7-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|
bb500dbe | 07-Aug-2023 |
Brett Creeley <brett.creeley@amd.com> |
vfio/pds: Add VFIO live migration support
Add live migration support via the VFIO subsystem. The migration implementation aligns with the definition from uapi/vfio.h and uses the pds_core PF's admin
vfio/pds: Add VFIO live migration support
Add live migration support via the VFIO subsystem. The migration implementation aligns with the definition from uapi/vfio.h and uses the pds_core PF's adminq for device configuration.
The ability to suspend, resume, and transfer VF device state data is included along with the required admin queue command structures and implementations.
PDS_LM_CMD_SUSPEND and PDS_LM_CMD_SUSPEND_STATUS are added to support the VF device suspend operation.
PDS_LM_CMD_RESUME is added to support the VF device resume operation.
PDS_LM_CMD_STATE_SIZE is added to determine the exact size of the VF device state data.
PDS_LM_CMD_SAVE is added to get the VF device state data.
PDS_LM_CMD_RESTORE is added to restore the VF device with the previously saved data from PDS_LM_CMD_SAVE.
PDS_LM_CMD_HOST_VF_STATUS is added to notify the DSC/firmware when a migration is in/not-in progress from the host's perspective. The DSC/firmware can use this to clear/setup any necessary state related to a migration.
Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-6-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
show more ...
|