What:		/sys/kernel/debug/cxl/memX/inject_poison
Date:		April, 2023
KernelVersion:	v6.4
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Device Physical Address (DPA) is written to this
		attribute, the memdev driver sends an inject poison command to
		the device for the specified address. The DPA must be 64-byte
		aligned and the length of the injected poison is 64 bytes. If
		successful, the device returns poison when the address is
		accessed through the CXL.mem bus. Injecting poison adds the
		address to the device's Poison List and the error source is set
		to Injected. In addition, the device adds a poison creation
		event to its internal Informational Event log, updates the
		Event Status register, and if configured, interrupts the host.
		It is not an error to inject poison into an address that
		already has poison present and no error is returned. If the
		device returns 'Inject Poison Limit Reached' an -EBUSY error
		is returned to the user. The inject_poison attribute is only
		visible for devices supporting the capability.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
		poison injection can result in permanent data loss. Injected
		poison may render data permanently inaccessible even after
		clearing, as the clear operation writes zeros and does not
		recover original data.

		SYSTEM STABILITY RISK: For volatile memory, poison injection
		can cause kernel crashes, system instability, or unpredictable
		behavior if the poisoned addresses are accessed by running code
		or critical kernel structures.
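		The alignment rule and the write itself can be sketched from a
		test script as follows. This is only an illustration for a
		disposable test system: the helper names format_dpa() and
		inject_poison() are hypothetical, and the sketch assumes the
		attribute accepts a 0x-prefixed hexadecimal string followed
		by a newline.

```python
from pathlib import Path

POISON_LEN = 64  # bytes of poison injected per command, per the text above


def format_dpa(dpa: int) -> str:
    """Return the DPA as a hex string, rejecting unaligned addresses."""
    if dpa < 0:
        raise ValueError("DPA must be non-negative")
    if dpa % POISON_LEN != 0:
        raise ValueError(f"DPA {dpa:#x} is not {POISON_LEN}-byte aligned")
    return f"{dpa:#x}"


def inject_poison(memdev: str, dpa: int,
                  debugfs_cxl: Path = Path("/sys/kernel/debug/cxl")) -> None:
    """Write the DPA to <debugfs_cxl>/<memdev>/inject_poison."""
    attr = debugfs_cxl / memdev / "inject_poison"
    # The attribute is only visible when the device supports the capability.
    if not attr.exists():
        raise FileNotFoundError(f"{attr} is not present")
    # Assumption: a hex string with a trailing newline is accepted.
    attr.write_text(format_dpa(dpa) + "\n")
```

		A call such as inject_poison("mem0", 0x1000) would need root
		privileges and a mounted debugfs. An error returned by the
		driver, such as -EBUSY, would be expected to surface in Python
		as an OSError carrying the matching errno.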

What:		/sys/kernel/debug/cxl/memX/clear_poison
Date:		April, 2023
KernelVersion:	v6.4
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Device Physical Address (DPA) is written to this
		attribute, the memdev driver sends a clear poison command to
		the device for the specified address. Clearing poison removes
		the address from the device's Poison List and writes 0 (zero)
		for 64 bytes starting at that address. It is not an error to
		clear poison from an address that does not have poison set. If
		the device cannot clear poison from the address, -ENXIO is
		returned. The clear_poison attribute is only visible for
		devices supporting the capability.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
		specified address range and removes the address from the poison
		list. It does NOT recover or restore original data that may have
		been present before poison injection. Any original data at the
		cleared address is permanently lost and replaced with zeros.

		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
		purposes only and should not be used as a data repair tool.
		Clearing poison is fundamentally different from data recovery
		or error correction.
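		Because each clear command zeroes a full 64-byte unit, a test
		that cleans up a byte range has to issue one write per unit.
		The sketch below computes those addresses. The helper name
		dpas_covering() is hypothetical, and the sketch assumes that
		clear_poison expects the same 64-byte alignment that the
		inject_poison entry states.

```python
POISON_LEN = 64  # bytes zeroed by one clear poison command, per the text above


def dpas_covering(start: int, length: int) -> list[int]:
    """Return the aligned DPAs whose 64-byte units overlap the range
    [start, start + length)."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    first = start - (start % POISON_LEN)  # round the start down to a unit
    end = start + length                  # exclusive end of the range
    return list(range(first, end, POISON_LEN))
```

		Note that every byte of each listed unit would be zeroed,
		including bytes that fall outside the requested range when the
		range itself is not 64-byte aligned.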

What:		/sys/kernel/debug/cxl/regionX/inject_poison
Date:		August, 2025
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Host Physical Address (HPA) is written to this
		attribute, the region driver translates it to a Device
		Physical Address (DPA) and identifies the corresponding
		memdev. It then sends an inject poison command to that memdev
		at the translated DPA. Refer to the memdev ABI entry at:
		/sys/kernel/debug/cxl/memX/inject_poison for the detailed
		behavior. This attribute is only visible if all memdevs
		participating in the region support both inject and clear
		poison commands.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
		poison injection can result in permanent data loss. Injected
		poison may render data permanently inaccessible even after
		clearing, as the clear operation writes zeros and does not
		recover original data.

		SYSTEM STABILITY RISK: For volatile memory, poison injection
		can cause kernel crashes, system instability, or unpredictable
		behavior if the poisoned addresses are accessed by running code
		or critical kernel structures.

What:		/sys/kernel/debug/cxl/regionX/clear_poison
Date:		August, 2025
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Host Physical Address (HPA) is written to this
		attribute, the region driver translates it to a Device
		Physical Address (DPA) and identifies the corresponding
		memdev. It then sends a clear poison command to that memdev
		at the translated DPA. Refer to the memdev ABI entry at:
		/sys/kernel/debug/cxl/memX/clear_poison for the detailed
		behavior. This attribute is only visible if all memdevs
		participating in the region support both inject and clear
		poison commands.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
		specified address range and removes the address from the poison
		list. It does NOT recover or restore original data that may have
		been present before poison injection. Any original data at the
		cleared address is permanently lost and replaced with zeros.

		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
		purposes only and should not be used as a data repair tool.
		Clearing poison is fundamentally different from data recovery
		or error correction.

What:		/sys/kernel/debug/cxl/einj_types
Date:		January, 2024
KernelVersion:	v6.9
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) Prints the CXL protocol error types made available by
		the platform in the format:

			0x<error number> <error type>

		The possible error types are (as of ACPI v6.5):

			0x1000	CXL.cache Protocol Correctable
			0x2000	CXL.cache Protocol Uncorrectable non-fatal
			0x4000	CXL.cache Protocol Uncorrectable fatal
			0x8000	CXL.mem Protocol Correctable
			0x10000	CXL.mem Protocol Uncorrectable non-fatal
			0x20000	CXL.mem Protocol Uncorrectable fatal

		The <error number> can be written to einj_inject to inject
		<error type> into a chosen dport.
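		The output format above is simple enough to parse line by
		line. A minimal sketch follows. The function name
		parse_einj_types() is hypothetical, and the sketch assumes
		each non-empty line holds a hexadecimal number, whitespace,
		and then the type name.

```python
def parse_einj_types(text: str) -> dict[int, str]:
    """Map each error number to its error type name."""
    types: dict[int, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split once on whitespace: "0x8000  CXL.mem Protocol Correctable"
        number, name = line.split(None, 1)
        types[int(number, 16)] = name
    return types
```

		A test could read /sys/kernel/debug/cxl/einj_types, pass the
		text to this function, and check that the error it intends to
		inject is among the keys before writing to einj_inject.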

What:		/sys/kernel/debug/cxl/$dport_dev/einj_inject
Date:		January, 2024
KernelVersion:	v6.9
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) Writing an integer to this file injects the corresponding
		CXL protocol error into $dport_dev ($dport_dev will be a device
		name from /sys/bus/pci/devices). The integer to type mapping for
		injection can be found by reading from einj_types. If the dport
		was enumerated in RCH mode, a CXL 1.1 error is injected; otherwise
		a CXL 2.0 error is injected.
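		The read-then-write sequence described above might look like
		the following in a test script. This is a sketch only: the
		function name inject_protocol_error() is hypothetical, and it
		assumes einj_inject accepts a 0x-prefixed hexadecimal string
		followed by a newline.

```python
from pathlib import Path


def inject_protocol_error(dport_dev: str, error_number: int,
                          debugfs_cxl: Path = Path("/sys/kernel/debug/cxl")) -> None:
    """Inject error_number into dport_dev if the platform lists it."""
    listing = (debugfs_cxl / "einj_types").read_text()
    # The first token on each non-empty line is the hexadecimal error number.
    supported = {int(line.split()[0], 16)
                 for line in listing.splitlines() if line.strip()}
    if error_number not in supported:
        raise ValueError(f"{error_number:#x} is not listed in einj_types")
    # Assumption: a hex string with a trailing newline is accepted.
    (debugfs_cxl / dport_dev / "einj_inject").write_text(f"{error_number:#x}\n")
```

		Here dport_dev would be a PCI device name such as the ones
		under /sys/bus/pci/devices. The check against einj_types is a
		convenience for the test script, not something the text above
		requires.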