| 54c2d973 | 03-Nov-2025 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amd/ras: Compatible with legacy sriov host
If sriov host is legacy, the guest uniras will be disabled.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
drm/amd/ras: Compatible with legacy sriov host
If sriov host is legacy, the guest uniras will be disabled.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
| 73c6c226 | 30-Oct-2025 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amd/ras: Add sriov ras preprocessing before gpu reset
Sriov host may clear all VF commands registered to auto update list during VF reset, set ecc.auto_uUpdate block to false before VF reset, an
drm/amd/ras: Add sriov ras preprocessing before gpu reset
Sriov host may clear all VF commands registered to auto update list during VF reset, set ecc.auto_uUpdate block to false before VF reset, and after VF reset is complete, RAS_CMD__GET_ALL_BLOCK_ECC_STATUS command will be re-registered to auto update list of sriov host.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
| 11dcf72e | 30-Oct-2025 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amd/ras: Support high-frequency querying sriov ras block error count
Support high-frequency querying sriov ras block error count: 1. Create shared memory and fills it with RAS_CMD__GET_LAL_LOC_S
drm/amd/ras: Support high-frequency querying sriov ras block error count
Support high-frequency querying sriov ras block error count: 1. Create shared memory and fills it with RAS_CMD__GET_LAL_LOC_STATUS ras command. 2. The RAS_CMD_GET_ALL_BLOCK_ECC_STATUS command and shared memory are registered to sriov host ras auto-update list via RAS_CMD_SET_CMD_AUTO_UPDATE command. 3. Once sriov host detects ras error, it will automatically execute RAS_CMD__GET_ALL_BLOCK_ECC_STATUS command and write the result to shared memory.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
| fcfa8dbb | 11-Nov-2025 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amd/ras: Add ras command to retrieve cper data from sriov host
In order to reduce the number of interactions with sriov host and the amount of data exchanged, a set of ras commands is first used
drm/amd/ras: Add ras command to retrieve cper data from sriov host
In order to reduce the number of interactions with sriov host and the amount of data exchanged, a set of ras commands is first used to obtain the raw data used to generate cper from the host, then, guest driver generates cper based on the obtained raw data.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
| 3f16007d | 31-Oct-2025 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amd/ras: Add ras support for umc v12_5_0
Add ras support for umc v12_5_0.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher
drm/amd/ras: Add ras support for umc v12_5_0
Add ras support for umc v12_5_0.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
| 7cf422ed | 30-Oct-2025 |
Xiang Liu <xiang.liu@amd.com> |
drm/amd/ras: Fix format truncation
../ras/rascore/ras_cper.c: In function ‘cper_generate_fatal_record.isra’: ../ras/rascore/ras_cper.c:75:36: error: ‘%llX’ directive output may be truncated writing
drm/amd/ras: Fix format truncation
../ras/rascore/ras_cper.c: In function ‘cper_generate_fatal_record.isra’: ../ras/rascore/ras_cper.c:75:36: error: ‘%llX’ directive output may be truncated writing between 1 and 14 bytes into a region of size between 0 and 7 [-Werror=format-truncation=] 75 | snprintf(record_id, 9, "%d:%llX", dev_info.socket_id, | ^~~~ ../ras/rascore/ras_cper.c:75:32: note: directive argument in the range [0, 72057594037927935] 75 | snprintf(record_id, 9, "%d:%llX", dev_info.socket_id, | ^~~~~~~~~ ../ras/rascore/ras_cper.c:75:9: note: ‘snprintf’ output between 4 and 27 bytes into a destination of size 9 75 | snprintf(record_id, 9, "%d:%llX", dev_info.socket_id, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 76 | RAS_LOG_SEQNO_TO_BATCH_IDX(trace->seqno)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../ras/rascore/ras_cper.c: In function ‘cper_generate_runtime_record.isra’: ../ras/rascore/ras_cper.c:75:36: error: ‘%llX’ directive output may be truncated writing between 1 and 14 bytes into a region of size between 0 and 7 [-Werror=format-truncation=] 75 | snprintf(record_id, 9, "%d:%llX", dev_info.socket_id, | ^~~~ ../ras/rascore/ras_cper.c:75:32: note: directive argument in the range [0, 72057594037927935] 75 | snprintf(record_id, 9, "%d:%llX", dev_info.socket_id, | ^~~~~~~~~ ../ras/rascore/ras_cper.c:75:9: note: ‘snprintf’ output between 4 and 27 bytes into a destination of size 9 75 | snprintf(record_id, 9, "%d:%llX", dev_info.socket_id, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 76 | RAS_LOG_SEQNO_TO_BATCH_IDX(trace->seqno)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
| 988fd51e | 23-Oct-2025 |
Xiang Liu <xiang.liu@amd.com> |
drm/amd/ras: Use correct severity for BP threshold exceed event
The severity of CPER for BP threshold exceed event should be set as FATAL to match the OOB implementation.
Signed-off-by: Xiang Liu <
drm/amd/ras: Use correct severity for BP threshold exceed event
The severity of CPER for BP threshold exceed event should be set as FATAL to match the OOB implementation.
Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|