What:		/sys/kernel/debug/qat_<device>_<BDF>/fw_counters
Date:		November 2023
KernelVersion:	6.6
Contact:	qat-linux@intel.com
Description:	(RO) Read returns the number of requests sent to the FW and the number of responses
		received from the FW for each Acceleration Engine
		Reported firmware counters::

			<N>: Number of requests sent from Acceleration Engine N to FW and responses
			     Acceleration Engine N received from FW

What:		/sys/kernel/debug/qat_<device>_<BDF>/heartbeat/config
Date:		November 2023
KernelVersion:	6.6
Contact:	qat-linux@intel.com
Description:	(RW) Read returns value of the Heartbeat update period.
		Write to the file changes this period value.

		This period should reflect planned polling interval of device
		health status. High frequency Heartbeat monitoring wastes CPU cycles
		but minimizes the customer’s system downtime. Also, if there are
		large service requests that take some time to complete, high frequency
		Heartbeat monitoring could result in false reports of unresponsiveness
		and in those cases, period needs to be increased.

		This parameter is effective only for c3xxx, c62x, dh895xcc devices.
		4xxx has this value internally fixed to 200ms.

		Default value is set to 500. Minimal allowed value is 200.
		All values are expressed in milliseconds.

What:		/sys/kernel/debug/qat_<device>_<BDF>/heartbeat/queries_failed
Date:		November 2023
KernelVersion:	6.6
Contact:	qat-linux@intel.com
Description:	(RO) Read returns the number of times the device became unresponsive.

		Attribute returns value of the counter which is incremented when
		a status query returns a negative result.

What:		/sys/kernel/debug/qat_<device>_<BDF>/heartbeat/queries_sent
Date:		November 2023
KernelVersion:	6.6
Contact:	qat-linux@intel.com
Description:	(RO) Read returns the number of times the control process checked
		if the device is responsive.

		Attribute returns value of the counter which is incremented on
		every status query.

What:		/sys/kernel/debug/qat_<device>_<BDF>/heartbeat/status
Date:		November 2023
KernelVersion:	6.6
Contact:	qat-linux@intel.com
Description:	(RO) Read returns the device health status.

		Returns 0 when the device is healthy or -1 when it is unresponsive
		or the query failed to send.

		The driver does not monitor for Heartbeat.
		It is left to the user to poll the status periodically.

What:		/sys/kernel/debug/qat_<device>_<BDF>/pm_status
Date:		January 2024
KernelVersion:	6.7
Contact:	qat-linux@intel.com
Description:	(RO) Read returns power management information specific to the
		QAT device.

		This attribute is only available for qat_4xxx devices.

What:		/sys/kernel/debug/qat_<device>_<BDF>/cnv_errors
Date:		January 2024
KernelVersion:	6.7
Contact:	qat-linux@intel.com
Description:	(RO) Read returns, for each Acceleration Engine (AE), the number
		of errors and the type of the last error detected by the device
		when performing verified compression.
		Reported counters::

			<N>: Number of Compress and Verify (CnV) errors and type
			     of the last CnV error detected by Acceleration
			     Engine N.

What:		/sys/kernel/debug/qat_<device>_<BDF>/heartbeat/inject_error
Date:		March 2024
KernelVersion:	6.8
Contact:	qat-linux@intel.com
Description:	(WO) Write to inject an error that simulates a heartbeat
		failure. This is to be used for testing purposes.

		After writing this file, the driver stops arbitration on a
		random engine and disables the fetching of heartbeat counters.
		If a workload is running on the device, a job submitted to the
		accelerator might not get a response and a read of the
		`heartbeat/status` attribute might report -1, i.e. device
		unresponsive.
		The error is unrecoverable, so the device must be restarted to
		restore its functionality.

		This attribute is available only when the kernel is built with
		CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION=y.

		A write of 1 enables error injection.

		The following example shows how to enable error injection::

			# cd /sys/kernel/debug/qat_<device>_<BDF>
			# echo 1 > heartbeat/inject_error
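
		Since polling is left to the user, the effect of an injected
		error could be observed with a small shell helper along the
		following lines. This is only a sketch: the function name
		qat_hb_check and its output format are illustrative
		assumptions, not part of the driver. It takes the device's
		debugfs directory as its argument and treats any status other
		than 0, including an unreadable file, as unresponsive.

```shell
#!/bin/sh
# Sketch only: qat_hb_check is a hypothetical helper, not a driver interface.
# $1 is the device debugfs directory, i.e. /sys/kernel/debug/qat_<device>_<BDF>.

qat_hb_check() {
	dir=$1

	# heartbeat/status reads 0 when healthy, -1 when unresponsive or
	# when the query failed to send. A failed read is treated as -1.
	status=$(cat "$dir/heartbeat/status" 2>/dev/null) || status=-1

	# Counters are reported for context only; empty if unreadable.
	sent=$(cat "$dir/heartbeat/queries_sent" 2>/dev/null)
	failed=$(cat "$dir/heartbeat/queries_failed" 2>/dev/null)

	if [ "$status" = "0" ]; then
		echo "healthy sent=$sent failed=$failed"
		return 0
	fi

	echo "unresponsive sent=$sent failed=$failed"
	return 1
}
```

		The helper returns non-zero when the device is not healthy, so
		it can be called from a loop that sleeps between checks. The
		sleep interval is the caller's choice and should be at least
		the configured Heartbeat period.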