61b74964 | 01-Aug-2024 |
Jithu Joseph <jithu.joseph@intel.com> |
trace: platform/x86/intel/ifs: Add SBAF trace support
Add tracing support for the SBAF IFS tests, which may be useful for debugging systems that fail these tests. Log details like test content batch
trace: platform/x86/intel/ifs: Add SBAF trace support
Add tracing support for the SBAF IFS tests, which may be useful for debugging systems that fail these tests. Log details like test content batch number, SBAF bundle ID, program index and the exact errors or warnings encountered by each HT thread during the test.
Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240801051814.1935149-5-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
show more ...
|
3c4d06bd | 01-Aug-2024 |
Jithu Joseph <jithu.joseph@intel.com> |
platform/x86/intel/ifs: Add SBAF test support
In a core, the SBAF test engine is shared between sibling CPUs.
An SBAF test image contains multiple bundles. Each bundle is further composed of subuni
platform/x86/intel/ifs: Add SBAF test support
In a core, the SBAF test engine is shared between sibling CPUs.
An SBAF test image contains multiple bundles. Each bundle is further composed of subunits called programs. When a SBAF test (for a particular core) is triggered by the user, each SBAF bundle from the loaded test image is executed sequentially on all the threads on the core using the stop_core_cpuslocked mechanism. Each bundle execution is initiated by writing to MSR_ACTIVATE_SBAF.
SBAF test bundle execution may be aborted when an interrupt occurs or if the CPU does not have enough power budget for the test. In these cases the kernel restarts the test from the aborted bundle. SBAF execution is not retried if the test fails or if the test makes no forward progress after 5 retries.
Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240801051814.1935149-4-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
show more ...
|
0a3e4e94 | 01-Aug-2024 |
Jithu Joseph <jithu.joseph@intel.com> |
platform/x86/intel/ifs: Add SBAF test image loading support
Structural Based Functional Test at Field (SBAF) is a new type of testing that provides comprehensive core test coverage complementing exi
platform/x86/intel/ifs: Add SBAF test image loading support
Structural Based Functional Test at Field (SBAF) is a new type of testing that provides comprehensive core test coverage complementing existing IFS tests like Scan at Field (SAF) or ArrayBist.
SBAF device will appear as a new device instance (intel_ifs_2) under /sys/devices/virtual/misc. The user interaction necessary to load the test image and test a particular core is the same as the existing scan test (intel_ifs_0).
During the loading stage, the driver will look for a file named ff-mm-ss-<batch02x>.sbft in the /lib/firmware/intel/ifs_2 directory. The hardware interaction needed for loading the image is similar to SAF, with the only difference being the MSR addresses used. Reuse the SAF image loading code, passing the SBAF-specific MSR addresses via struct ifs_test_msrs in the driver device data.
Unlike SAF, the SBAF test image chunks are further divided into smaller logical entities called bundles. Since the SBAF test is initiated per bundle, cache the maximum number of bundles in the current image, which is used for iterating through bundles during SBAF test execution.
Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Co-developed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240801051814.1935149-3-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
show more ...
|
bd25a3f5 | 12-Apr-2024 |
Jithu Joseph <jithu.joseph@intel.com> |
platform/x86/intel/ifs: Disable irq during one load stage
One of the stages in IFS image loading process involves loading individual chunks (test patterns) from test image file to secure memory.
Dr
platform/x86/intel/ifs: Disable irq during one load stage
One of the stages in IFS image loading process involves loading individual chunks (test patterns) from test image file to secure memory.
Driver issues a WRMSR(MSR_AUTHENTICATE_AND_COPY_CHUNK) operation to do this. This operation can take up to 5 msec, and if an interrupt occurs in between, the AUTH_AND_COPY_CHUNK u-code implementation aborts the operation.
Interrupt sources such as NMI or SMI are handled by retrying. Regular interrupts may occur frequently enough to prevent this operation from ever completing. Disable irq on local cpu around the aforementioned WRMSR to allow the operation to complete.
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240412172349.544064-4-jithu.joseph@intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
show more ...
|
682c259a | 25-Jan-2024 |
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> |
platform/x86/intel/ifs: Remove unnecessary initialization of 'ret'
The ret variable is unconditionally assigned in ifs_load_firmware(). Therefore, remove its unnecessary initialization.
Reviewed-by
platform/x86/intel/ifs: Remove unnecessary initialization of 'ret'
The ret variable is unconditionally assigned in ifs_load_firmware(). Therefore, remove its unnecessary initialization.
Reviewed-by: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20240125130328.11253-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|
ad630f5d | 25-Jan-2024 |
Ashok Raj <ashok.raj@intel.com> |
platform/x86/intel/ifs: Add an entry rendezvous for SAF
The activation for Scan at Field (SAF) includes a parameter to make microcode wait for both threads to join. It's preferable to perform an ent
platform/x86/intel/ifs: Add an entry rendezvous for SAF
The activation for Scan at Field (SAF) includes a parameter to make microcode wait for both threads to join. It's preferable to perform an entry rendezvous before the activation to ensure that they start the `wrmsr` close enough to each other. In some cases it has been observed that one of the threads might be just a bit late to arrive. An entry rendezvous reduces the likelihood of these cases occurring.
Add an entry rendezvous to ensure the activation on both threads happen close enough to each other.
Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240125082254.424859-6-ashok.raj@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|
ea15f34d | 25-Jan-2024 |
Ashok Raj <ashok.raj@intel.com> |
platform/x86/intel/ifs: Replace the exit rendezvous with an entry rendezvous for ARRAY_BIST
ARRAY_BIST requires the test to be invoked only from one of the HT siblings of a core. If the other sibli
platform/x86/intel/ifs: Replace the exit rendezvous with an entry rendezvous for ARRAY_BIST
ARRAY_BIST requires the test to be invoked only from one of the HT siblings of a core. If the other sibling was in mwait(), that didn't permit the test to complete and resulted in several retries before the test could finish.
The exit rendezvous was introduced to keep the HT sibling busy until the primary CPU completed the test to avoid those retries. What is actually needed is to ensure that both the threads rendezvous *before* the wrmsr to trigger the test to give good chance to complete the test.
The `stop_machine()` function returns only after all the CPUs complete running the function, and provides an exit rendezvous implicitly.
In kernel/stop_machine.c::multi_cpu_stop(), every CPU in the mask needs to complete reaching MULTI_STOP_RUN. When all CPUs complete, the state machine moves to next state, i.e MULTI_STOP_EXIT. Thus the underlying API stop_core_cpuslocked() already provides an exit rendezvous.
Add the rendezvous earlier in order to ensure the wrmsr is triggered after all CPUs reach the do_array_test(). Remove the exit rendezvous since stop_core_cpuslocked() already guarantees that.
Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240125082254.424859-5-ashok.raj@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|
e272d1e1 | 25-Jan-2024 |
Ashok Raj <ashok.raj@intel.com> |
platform/x86/intel/ifs: Add current batch number to trace output
Add the current batch number in the trace output. When there are failures, it's important to know which test content resulted in fail
platform/x86/intel/ifs: Add current batch number to trace output
Add the current batch number in the trace output. When there are failures, it's important to know which test content resulted in failure.
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | migration/0-18 [000] d..1. 527287.084668: ifs_status: batch: 02, start: 0000, stop: 007f, status: 0000000000007f80 migration/128-785 [128] d..1. 527287.084669: ifs_status: batch: 02, start: 0000, stop: 007f, status: 0000000000007f80
Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240125082254.424859-4-ashok.raj@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|