.. SPDX-License-Identifier: GPL-2.0

pstore block oops/panic logger
==============================

Introduction
------------

pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
block device or a non-block device before the system crashes. You can get
these log files by mounting the pstore filesystem like::

    mount -t pstore pstore /sys/fs/pstore


pstore block concepts
---------------------

pstore/blk provides an efficient configuration method that divides all
configurations into two parts: configurations for the user and configurations
for the driver.

Configurations for the user determine how pstore/blk works, such as pmsg_size,
kmsg_size and so on. All of them support both Kconfig and module parameters,
but module parameters have priority over Kconfig.

Configurations for the driver are all about the block device or non-block
device, such as the total_size of the block device and its read/write
operations.

Configurations for user
-----------------------

All of these configurations support both Kconfig and module parameters, but
module parameters have priority over Kconfig.

Here is an example for module parameters::

    pstore_blk.blkdev=179:7 pstore_blk.kmsg_size=64

The details of each configuration may be of interest to you.

blkdev
~~~~~~

The block device to use. Most of the time, it is a partition of a block device.
It's required for pstore/blk. It is also used for an MTD device.

It accepts the following variants for a block device:

1. <hex_major><hex_minor> device number in hexadecimal represents itself; no
   leading 0x, for example b302.
#. /dev/<disk_name> represents the device number of the disk.
#. /dev/<disk_name><decimal> represents the device number of a partition -
   device number of the disk plus the partition number.
#. /dev/<disk_name>p<decimal> - same as the above; this form is used when the
   disk name of the partitioned disk ends with a digit.
#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
   a partition if the partition table provides it. The UUID may be either an
   EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
   where SSSSSSSS is a zero-filled hex representation of the 32-bit
   "NT disk signature", and PP is a zero-filled hex representation of the
   1-based partition number.
#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
   partition with a known unique id.
#. <major>:<minor> major and minor number of the device separated by a colon.

It accepts the following variants for an MTD device:

1. <device name> MTD device name. "pstore" is recommended.
#. <device number> MTD device number.

kmsg_size
~~~~~~~~~

The chunk size in KB for the oops/panic front-end. It **MUST** be a multiple
of 4. It's optional if you do not care about the oops/panic log.

There are multiple chunks for the oops/panic front-end; how many depends on the
space that remains after the other pstore front-ends have taken theirs.

pstore/blk will log to the oops/panic chunks one by one, and always overwrites
the oldest chunk if there is no free chunk left.

pmsg_size
~~~~~~~~~

The chunk size in KB for the pmsg front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the pmsg log.

Unlike the oops/panic front-end, there is only one chunk for the pmsg
front-end.

Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
appended to the chunk. On reboot the contents are available in
*/sys/fs/pstore/pmsg-pstore-blk-0*.

console_size
~~~~~~~~~~~~

The chunk size in KB for the console front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the console log.

Similar to the pmsg front-end, there is only one chunk for the console
front-end.

All console logs are appended to the chunk. On reboot the contents are
available in */sys/fs/pstore/console-pstore-blk-0*.
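
The sizing rules above can be illustrated with a small sketch. It is not
taken from the pstore/blk sources: the function name ``plan_kmsg_chunks`` and
the ``total_kb`` argument are hypothetical, and the layout model (the single
pmsg and console chunks are carved out first, the remainder is split into
``kmsg_size`` chunks) is an assumption based on the descriptions above. It only
models the three sizes described so far.

```python
def plan_kmsg_chunks(total_kb, kmsg_kb, pmsg_kb=0, console_kb=0):
    """Estimate how many oops/panic chunks fit on the device.

    Assumption (not confirmed by the document): pmsg and console each take
    exactly one chunk, and whatever space remains is divided into
    kmsg_kb-sized chunks. All sizes are in KB.
    """
    sizes = (
        ("kmsg_size", kmsg_kb),
        ("pmsg_size", pmsg_kb),
        ("console_size", console_kb),
    )
    for name, size in sizes:
        # The document requires every chunk size to be a multiple of 4 KB.
        if size < 0 or size % 4 != 0:
            raise ValueError(f"{name} must be a multiple of 4 KB, got {size}")

    if kmsg_kb == 0:
        # The oops/panic front-end is optional; 0 is used here to mean "off".
        return 0

    remaining_kb = total_kb - pmsg_kb - console_kb
    if remaining_kb < 0:
        raise ValueError("pmsg and console chunks do not fit on the device")
    return remaining_kb // kmsg_kb


if __name__ == "__main__":
    # Hypothetical 1 MB partition with 64 KB chunks for each front-end.
    print(plan_kmsg_chunks(total_kb=1024, kmsg_kb=64, pmsg_kb=64, console_kb=64))
```

Under these assumptions, a larger ``kmsg_size`` keeps more of each log but
leaves fewer chunks before the oldest one is overwritten.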

ftrace_size
~~~~~~~~~~~

The chunk size in KB for the ftrace front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the ftrace log.

Similar to the oops front-end, there are multiple chunks for the ftrace
front-end, depending on the number of CPUs. Each chunk size is equal to
ftrace_size / processors_count.

All ftrace logs are appended to the chunk. On reboot the contents are
combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.

Persistent function tracing might be useful for debugging software or hardware
related hangs. Here is an example of usage::

    # mount -t pstore pstore /sys/fs/pstore
    # mount -t debugfs debugfs /sys/kernel/debug/
    # echo 1 > /sys/kernel/debug/pstore/record_ftrace
    # reboot -f
    [...]
    # mount -t pstore pstore /sys/fs/pstore
    # tail /sys/fs/pstore/ftrace-pstore-blk-0
    CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
    CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48
    CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
    CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
    CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
    CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
    CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358
    CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358

max_reason
~~~~~~~~~~

Limiting which kinds of kmsg dumps are stored can be controlled via
the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
``enum kmsg_dump_reason``.
For example, to store both oopses and panics,
``max_reason`` should be set to 2 (KMSG_DUMP_OOPS); to store only panics,
``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
(KMSG_DUMP_UNDEF) means the reason filtering will be controlled by the
``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
otherwise KMSG_DUMP_MAX.

Configurations for driver
-------------------------

Only a block device driver cares about these configurations. A block device
driver uses ``register_pstore_blk`` to register to pstore/blk.

.. kernel-doc:: fs/pstore/blk.c
   :identifiers: register_pstore_blk

A non-block device driver uses ``register_pstore_device`` with
``struct pstore_device_info`` to register to pstore/blk.

.. kernel-doc:: fs/pstore/blk.c
   :identifiers: register_pstore_device

.. kernel-doc:: include/linux/pstore_blk.h
   :identifiers: pstore_device_info

Compression and header
----------------------

A block device is large enough for uncompressed oops data. In fact, we do not
recommend data compression, because pstore/blk inserts some information into
the first line of the oops/panic data. For example::

    Panic: Total 16 times

It means that this is the 16th time the event (oops or panic) has happened
since the first boot. Sometimes the number of oopses or panics since the first
boot is important for judging whether the system is stable.

The line that follows it is inserted by the pstore filesystem. For example::

    Oops#2 Part1

It means that this is the 2nd oops on the last boot.

Reading the data
----------------

The dump data can be read from the pstore filesystem. The format for these
files is ``dmesg-pstore-blk-[N]`` for the oops/panic front-end,
``pmsg-pstore-blk-0`` for the pmsg front-end and so on. The timestamp of the
dump file records the trigger time.
To delete a stored record from the block
device, simply unlink the respective pstore file.

Attentions in panic read/write APIs
-----------------------------------

On panic, the kernel is not going to run for much longer: tasks will not
be scheduled and most kernel resources will be out of service. It
looks like a single-threaded program running on a single-core computer.

The following points require special attention for panic read/write APIs:

1. Can **NOT** allocate any memory.
   If you need memory, allocate it while the block driver is initializing
   rather than waiting until the panic.
#. Must be polled, **NOT** interrupt driven.
   No tasks are scheduled any more. The block driver should delay to ensure
   the write succeeds, but must NOT sleep.
#. Can **NOT** take any lock.
   There is no other task, nor any shared resource; you are safe to break all
   locks.
#. Just use the CPU to transfer.
   Do not use DMA to transfer unless you are sure that DMA will not hold a lock.
#. Control registers directly.
   Please control registers directly rather than using Linux kernel resources.
   Do the I/O mapping while initializing rather than waiting until a panic
   occurs.
#. Reset your block device and controller if necessary.
   If you are not sure of the state of your block device and controller when
   a panic occurs, you are safe to stop and reset them.

pstore/blk supports psblk_blkdev_info(), which is defined in
*linux/pstore_blk.h*, to get information about the block device in use, such
as the device number, sector count and start sector of the whole disk.

pstore block internals
----------------------

For developer reference, here are all the important structures and APIs:

.. kernel-doc:: fs/pstore/zone.c
   :internal:

.. kernel-doc:: include/linux/pstore_zone.h
   :internal:

.. kernel-doc:: fs/pstore/blk.c
   :export:

.. kernel-doc:: include/linux/pstore_blk.h
   :internal:
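
As a recap of the ``max_reason`` filtering described in the user
configuration section, here is a minimal sketch of the decision it implies.
The names ``Reason``, ``STORE_ALL``, ``effective_max_reason`` and ``is_stored``
are hypothetical and exist only for this illustration. The sketch assumes a
dump is kept when its reason value is less than or equal to the effective
``max_reason``, and it uses only the enum values quoted in this document;
``KMSG_DUMP_MAX`` is modelled as "store everything" because its numeric value
is not given here.

```python
from enum import IntEnum


class Reason(IntEnum):
    """Subset of enum kmsg_dump_reason, using only values quoted above."""
    UNDEF = 0
    PANIC = 1
    OOPS = 2


# Hypothetical stand-in for KMSG_DUMP_MAX: no upper limit on the reason.
STORE_ALL = None


def effective_max_reason(max_reason, always_kmsg_dump=False):
    """Resolve max_reason=0 through the printk.always_kmsg_dump setting."""
    if max_reason != Reason.UNDEF:
        return max_reason
    return STORE_ALL if always_kmsg_dump else Reason.OOPS


def is_stored(reason, max_reason, always_kmsg_dump=False):
    """Assumed rule: keep the dump if reason <= effective max_reason."""
    limit = effective_max_reason(max_reason, always_kmsg_dump)
    return limit is STORE_ALL or reason <= limit


if __name__ == "__main__":
    for max_reason in (0, 1, 2):
        kept = [r.name for r in (Reason.PANIC, Reason.OOPS)
                if is_stored(r, max_reason)]
        print(f"max_reason={max_reason}: stores {kept}")
```

With these assumptions, ``max_reason=1`` keeps only panics, while
``max_reason=2`` and the default ``max_reason=0`` keep both oopses and panics.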