What:		/sys/bus/edac/devices/<dev-name>/mem_repairX
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		The sysfs EDAC bus devices /<dev-name>/mem_repairX subdirectory
		pertains to the control of memory media repair features, such
		as PPR (Post Package Repair), memory sparing, etc., where the
		<dev-name> directory corresponds to a device registered with
		the EDAC device driver for the memory repair features.

		Post Package Repair is a maintenance operation that requests
		the memory device to perform a repair operation on its media.
		It is a memory self-healing feature that fixes a failing memory
		location by replacing it with a spare row in a DRAM device. For
		example, a CXL memory device with DRAM components that support
		PPR features may implement PPR maintenance operations. DRAM
		components may support two types of PPR functions: hard PPR,
		for a permanent row repair, and soft PPR, for a temporary row
		repair. Soft PPR may be much faster than hard PPR, but the
		repair is lost with a power cycle.

		The sysfs attribute nodes for a repair feature are only
		present if the parent driver has implemented the corresponding
		attr callback function and provided the necessary operations
		to the EDAC device driver during registration.

		In some states of system configuration (e.g. before address
		decoders have been configured), memory devices (e.g. CXL)
		may not have an active mapping in the main host physical
		address map. As such, the memory to repair must be
		identified by a device specific physical addressing scheme
		using a device physical address (DPA). The DPA and other
		control attributes to use will be presented in related error
		records.

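Because attribute nodes are only present when the parent driver provides the matching callbacks, userspace cannot assume a fixed set of files. The following is a minimal Python sketch, not part of the ABI, that discovers which mem_repairX instances and attributes a device exposes. The function name `list_repair_instances` is hypothetical, and the directory passed in stands for /sys/bus/edac/devices/<dev-name>.

```python
import os
import re


def list_repair_instances(dev_dir):
    """Map each mem_repairX subdirectory of dev_dir to its sorted attribute names.

    dev_dir is assumed to be /sys/bus/edac/devices/<dev-name>; entries that
    are not named mem_repair<number> are ignored.
    """
    instances = {}
    for entry in sorted(os.listdir(dev_dir)):
        path = os.path.join(dev_dir, entry)
        if re.fullmatch(r"mem_repair\d+", entry) and os.path.isdir(path):
            instances[entry] = sorted(os.listdir(path))
    return instances
```

A caller can then check, for example, whether "dpa" or "row" appears in the returned list before attempting to write it.
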
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair_type
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RO) Memory repair type, for example post package repair
		or memory sparing. Valid values are:

		- ppr - Post package repair.

		- cacheline-sparing

		- row-sparing

		- bank-sparing

		- rank-sparing

		- All other values are reserved.

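As an illustration of how a tool might consume repair_type, the sketch below validates the string read from the attribute against the values listed above and rejects anything else as reserved. The names `KNOWN_REPAIR_TYPES` and `parse_repair_type` are hypothetical.

```python
# Values listed in the repair_type description; everything else is reserved.
KNOWN_REPAIR_TYPES = (
    "ppr",
    "cacheline-sparing",
    "row-sparing",
    "bank-sparing",
    "rank-sparing",
)


def parse_repair_type(text):
    """Return the repair type read from sysfs, or raise ValueError if reserved."""
    value = text.strip()  # sysfs reads normally end with a newline
    if value not in KNOWN_REPAIR_TYPES:
        raise ValueError(f"reserved or unknown repair type: {value!r}")
    return value
```
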
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Get/Set the persist repair mode currently set for a
		repair function. A persist repair mode supported by the
		device for a memory repair function is either temporary,
		in which case the repair is lost with a power cycle, or
		permanent. Valid values are:

		- 0 - Soft memory repair (temporary repair).

		- 1 - Hard memory repair (permanent repair).

		- All other values are reserved.

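The mapping between the numeric persist_mode values and their meaning can be captured in a small lookup table. This is a sketch only; the names `PERSIST_MODES`, `persist_mode_value` and `describe_persist_mode` are hypothetical.

```python
# Numeric values from the persist_mode description.
PERSIST_MODES = {"soft": 0, "hard": 1}


def persist_mode_value(name):
    """Return the string to write to persist_mode for 'soft' or 'hard'."""
    return str(PERSIST_MODES[name])


def describe_persist_mode(text):
    """Translate a value read from persist_mode back to 'soft' or 'hard'."""
    value = int(text.strip())
    for name, number in PERSIST_MODES.items():
        if number == value:
            return name
    raise ValueError(f"reserved persist mode: {value}")
```
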
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair_safe_when_in_use
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RO) True if memory media is accessible and data is retained
		during the memory repair operation.
		Otherwise, the data may not be retained and memory requests
		may not be correctly processed during a repair operation. In
		such a case, the repair operation cannot be executed at
		runtime and the memory must be taken offline.

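A tool can use this attribute to decide whether the memory has to be offlined before a repair is attempted. The sketch below assumes, without confirmation from this document, that the attribute reads back as "1" for true and "0" for false; the function name `must_offline_before_repair` is hypothetical.

```python
def must_offline_before_repair(safe_text):
    """Return True when the memory must be taken offline before repairing.

    Assumption: repair_safe_when_in_use reads as "1" (true) or "0" (false).
    """
    value = safe_text.strip()
    if value not in ("0", "1"):
        raise ValueError(f"unexpected repair_safe_when_in_use value: {value!r}")
    return value == "0"
```
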
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/hpa
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Host Physical Address (HPA) of the memory to repair.
		The HPA to use will be provided in related error records.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/dpa
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Device Physical Address (DPA) of the memory to repair.
		The specific DPA to use will be provided in related error
		records.

		In some states of system configuration (e.g. before address
		decoders have been configured), memory devices (e.g. CXL)
		may not have an active mapping in the main host physical
		address map. As such, the memory to repair must be
		identified by a device specific physical addressing scheme
		using a DPA.

What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/nibble_mask
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) Read/Write the nibble mask of the memory to repair.
		The nibble mask identifies one or more nibbles in error on
		the memory bus that produced the error event. Nibble mask
		bit 0 shall be set if nibble 0 on the memory bus produced
		the event, etc. For example, for CXL PPR and sparing, a
		nibble mask bit set to 1 indicates a request to perform the
		repair operation in the specific device. All nibble mask
		bits set to 1 indicates a request to perform the operation
		in all devices. E.g. for CXL memory repair, the specific
		value of the nibble mask to use will be provided in related
		error records.
		For more details, see the nibble mask field in CXL spec
		ver 3.1, section 8.2.9.7.1.2 Table 8-103 soft PPR, section
		8.2.9.7.1.3 Table 8-104 hard PPR, and section 8.2.9.7.1.4
		Table 8-105 memory sparing.

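The bit numbering described above (bit N set when nibble N produced the event) can be illustrated with a short sketch. In practice the mask comes from the error record; this helper, whose name `nibble_mask` is hypothetical, only shows how the bits correspond to nibble indices and does not enforce any field width.

```python
def nibble_mask(nibbles):
    """Build a mask with bit N set for every nibble index N in nibbles."""
    mask = 0
    for index in nibbles:
        if index < 0:
            raise ValueError("nibble index must be non-negative")
        mask |= 1 << index
    return mask


# Nibbles 0 and 2 in error give binary 101.
print(hex(nibble_mask([0, 2])))  # prints 0x5
```
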
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/min_hpa
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/max_hpa
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/min_dpa
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/max_dpa
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) The supported range of memory addresses to be
		repaired. The memory device may give the supported range of
		attributes to use and it will depend on the memory device
		and the portion of memory to repair.
		Userspace may receive the specific values of the attributes
		to use for a repair operation from the memory device via
		related error records and trace events, e.g. CXL DRAM
		and CXL general media error records in CXL memory devices.

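Before writing an address taken from an error record, a tool may want to confirm that it lies inside the advertised range. The sketch below assumes the min/max attributes read back as integers in either decimal or 0x-prefixed hexadecimal form; the helper names `parse_address` and `in_supported_range` are hypothetical.

```python
def parse_address(text):
    """Parse an address string; base 0 accepts both decimal and 0x-prefixed hex."""
    return int(text.strip(), 0)


def in_supported_range(addr, min_text, max_text):
    """Return True if addr lies within [min, max] as read from the min_*/max_* attributes."""
    return parse_address(min_text) <= addr <= parse_address(max_text)
```

For example, `in_supported_range(0x2000, "0x1000\n", "0x3000\n")` is true, while an address of 0x4000 against the same bounds is not.
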
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/bank_group
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/bank
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/rank
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/row
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/column
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/channel
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/sub_channel
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(RW) The control attributes for the memory to be repaired.
		The specific value of attributes to use depends on the
		portion of memory to repair and will be reported to the host
		in related error records and be available to userspace
		in trace events, such as CXL DRAM and CXL general media
		error records of CXL memory devices.

		When reading back these attributes, the current value of
		the memory requested to be repaired is returned.

		bank_group - The bank group of the memory to repair.

		bank - The bank number of the memory to repair.

		rank - The rank of the memory to repair. Rank is defined as a
		set of memory devices on a channel that together execute a
		transaction.

		row - The row number of the memory to repair.

		column - The column number of the memory to repair.

		channel - The channel of the memory to repair. Channel is
		defined as an interface that can be independently accessed
		for a transaction.

		sub_channel - The subchannel of the memory to repair.

		The requirement to set these attributes varies based on the
		repair function. The attributes in sysfs are not present
		unless required for a repair function.

		For example, for the soft PPR (CXL spec ver 3.1, Section
		8.2.9.7.1.2 Table 8-103) and hard PPR (Section 8.2.9.7.1.3
		Table 8-104) operations, these attributes are not required
		to be set. For memory sparing (CXL spec ver 3.1, Section
		8.2.9.7.1.4 Table 8-105), these attributes are required to
		be set based on the memory sparing granularity.

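Since these attributes exist only when the repair function requires them, a tool can write just the ones that are present and skip the rest. The following sketch shows one way to do that; the function name `set_present_attributes` is hypothetical, and `repair_dir` stands for a mem_repairX directory.

```python
import os


def set_present_attributes(repair_dir, values):
    """Write each value whose attribute file exists; return the names written.

    values maps attribute names (e.g. "bank", "row") to integers taken from
    the related error record. Attributes absent from repair_dir are skipped.
    """
    written = []
    for name, value in values.items():
        path = os.path.join(repair_dir, name)
        if os.path.exists(path):
            with open(path, "w") as attr:
                attr.write(f"{value}\n")
            written.append(name)
    return written
```

Comparing the returned list with the requested names tells the caller which fields the repair function did not need.
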
What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair
Date:		March 2025
KernelVersion:	6.15
Contact:	linux-edac@vger.kernel.org
Description:
		(WO) Issue the memory repair operation for the specified
		memory repair attributes. The operation may fail if resources
		are insufficient based on the requirements of the memory
		device and repair function.

		- 1 - Issue the repair operation.

		- All other values are reserved.

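Putting the attributes together, a DPA-based repair request could be sketched as below. This is an illustration under stated assumptions rather than a reference implementation: the helper names `write_attr` and `issue_repair` are hypothetical, the addresses are written as 0x-prefixed hex strings on the assumption that the attribute parser accepts that form, and the dpa, nibble_mask and persist_mode attributes are assumed to be present for the chosen repair function.

```python
import os


def write_attr(repair_dir, name, value):
    """Write one value to an attribute file inside a mem_repairX directory."""
    with open(os.path.join(repair_dir, name), "w") as attr:
        attr.write(f"{value}\n")


def issue_repair(repair_dir, dpa, nibble_mask, hard=False):
    """Set the repair parameters, then write 1 to 'repair' to issue the operation.

    dpa and nibble_mask are integers taken from the related error record.
    A failed operation is expected to surface as OSError from the write.
    """
    write_attr(repair_dir, "persist_mode", 1 if hard else 0)
    write_attr(repair_dir, "dpa", hex(dpa))
    write_attr(repair_dir, "nibble_mask", hex(nibble_mask))
    write_attr(repair_dir, "repair", 1)
```

The write to "repair" comes last so that all parameters are in place when the operation is issued.
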