xref: /linux/Documentation/ABI/stable/sysfs-block (revision 2c04718edcd5e1ac8fed9a0f8d0620e8bc94014d)
1What:		/sys/block/<disk>/alignment_offset
2Date:		April 2009
3Contact:	Martin K. Petersen <martin.petersen@oracle.com>
4Description:
5		Storage devices may report a physical block size that is
6		bigger than the logical block size (for instance a drive
7		with 4KB physical sectors exposing 512-byte logical
8		blocks to the operating system).  This parameter
9		indicates how many bytes the beginning of the device is
10		offset from the disk's natural alignment.
11
12
13What:		/sys/block/<disk>/discard_alignment
14Date:		May 2011
15Contact:	Martin K. Petersen <martin.petersen@oracle.com>
16Description:
17		Devices that support discard functionality may
18		internally allocate space in units that are bigger than
19		the exported logical block size. The discard_alignment
20		parameter indicates how many bytes the beginning of the
21		device is offset from the internal allocation unit's
22		natural alignment.
23
24What:		/sys/block/<disk>/atomic_write_max_bytes
25Date:		February 2024
26Contact:	Himanshu Madhani <himanshu.madhani@oracle.com>
27Description:
28		[RO] This parameter specifies the maximum atomic write
29		size reported by the device. This parameter is relevant
30		for merging of writes, where a merged atomic write
31		operation must not exceed this number of bytes.
32		This parameter may be greater than the value in
33		atomic_write_unit_max_bytes as
34		atomic_write_unit_max_bytes will be rounded down to a
35		power-of-two and atomic_write_unit_max_bytes may also be
36		limited by some other queue limits, such as max_segments.
37		This parameter - along with atomic_write_unit_min_bytes
38		and atomic_write_unit_max_bytes - will not be larger than
39		max_hw_sectors_kb, but may be larger than max_sectors_kb.
40
41
42What:		/sys/block/<disk>/atomic_write_unit_min_bytes
43Date:		February 2024
44Contact:	Himanshu Madhani <himanshu.madhani@oracle.com>
45Description:
		[RO] This parameter specifies the smallest block which can
		be written atomically with an atomic write operation. All
		atomic write operations must begin at an
		atomic_write_unit_min boundary and must be multiples of
		atomic_write_unit_min. This value must be a power-of-two.
51
52
53What:		/sys/block/<disk>/atomic_write_unit_max_bytes
54Date:		February 2024
55Contact:	Himanshu Madhani <himanshu.madhani@oracle.com>
56Description:
57		[RO] This parameter defines the largest block which can be
58		written atomically with an atomic write operation. This
59		value must be a multiple of atomic_write_unit_min and must
60		be a power-of-two. This value will not be larger than
61		atomic_write_max_bytes.
62
63
64What:		/sys/block/<disk>/atomic_write_boundary_bytes
65Date:		February 2024
66Contact:	Himanshu Madhani <himanshu.madhani@oracle.com>
67Description:
		[RO] A device may need to internally split an atomic write I/O
		which straddles a given logical block address boundary. This
		parameter specifies the size in bytes of the atomic boundary if
		one is reported by the device. This value must be a
		power-of-two and at least as large as
		atomic_write_unit_max_bytes.
		Any attempt to merge atomic write I/Os must not result in a
		merged I/O which crosses this boundary (if any).
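
		The constraints described for the atomic write attributes can
		be sketched as a small check. This is only an illustration: the
		function name and the treatment of a zero value as "not
		supported" or "no boundary" are assumptions, and the kernel may
		enforce further conditions that are not modelled here.

```python
def atomic_write_ok(offset: int, length: int, unit_min: int,
                    unit_max: int, boundary: int = 0) -> bool:
    """Check a write of `length` bytes at byte `offset` against the
    atomic write limits read from sysfs. All values are in bytes."""
    if unit_min <= 0 or unit_max <= 0:
        # Assumption: a zero limit means atomic writes are unavailable.
        return False
    if length < unit_min or length > unit_max:
        return False
    # Must start on a unit_min boundary and be a multiple of unit_min.
    if offset % unit_min or length % unit_min:
        return False
    if boundary:
        # First and last byte must fall inside the same boundary window.
        if offset // boundary != (offset + length - 1) // boundary:
            return False
    return True
```

		The arguments would typically be the values read from
		atomic_write_unit_min_bytes, atomic_write_unit_max_bytes and
		atomic_write_boundary_bytes.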
76
77
78What:		/sys/block/<disk>/diskseq
79Date:		February 2021
80Contact:	Matteo Croce <teknoraver@meta.com>
81Description:
		The /sys/block/<disk>/diskseq file reports the disk
		sequence number, which is a monotonically increasing
		number assigned to every drive.
		Some devices, like the loop device, refresh this number
		every time the backing file is changed.
		The value type is 64 bit unsigned.
88
89
90What:		/sys/block/<disk>/inflight
91Date:		October 2009
92Contact:	Jens Axboe <axboe@kernel.dk>, Nikanth Karthikesan <knikanth@suse.de>
93Description:
		Reports the number of I/O requests currently in progress
		(pending / in flight) in a device driver. This can be less
		than the number of requests queued in the block device queue.
		The report contains 2 fields: one for read requests
		and one for write requests.
		The value type is unsigned int.
		Cf. Documentation/block/stat.rst, which contains a single value
		for requests in flight.
		This is related to /sys/block/<disk>/queue/nr_requests
		and, for SCSI devices, also to the device's queue_depth.
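
		A minimal sketch of parsing the two fields follows. The function
		name is hypothetical, and the field order (reads first, then
		writes) is assumed from the order in which the fields are
		listed above.

```python
def parse_inflight(text: str) -> tuple[int, int]:
    """Split the contents of an inflight file into (reads, writes)."""
    fields = text.split()
    if len(fields) != 2:
        raise ValueError(f"expected 2 fields, got {len(fields)}")
    reads, writes = (int(field) for field in fields)
    return reads, writes
```

		The input would be the text read from
		/sys/block/<disk>/inflight.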
104
105
106What:		/sys/block/<disk>/integrity/device_is_integrity_capable
107Date:		July 2014
108Contact:	Martin K. Petersen <martin.petersen@oracle.com>
109Description:
110		Indicates whether a storage device is capable of storing
111		integrity metadata. Set if the device is T10 PI-capable.
112		This flag is set to 1 if the storage media is formatted
113		with T10 Protection Information. If the storage media is
114		not formatted with T10 Protection Information, this flag
115		is set to 0.
116
117
118What:		/sys/block/<disk>/integrity/format
119Date:		June 2008
120Contact:	Martin K. Petersen <martin.petersen@oracle.com>
121Description:
122		Metadata format for integrity capable block device.
123		E.g. T10-DIF-TYPE1-CRC.
124		This field describes the type of T10 Protection Information
125		that the block device can send and receive.
126		If the device can store application integrity metadata but
127		no T10 Protection Information profile is used, this field
128		contains "nop".
129		If the device does not support integrity metadata, this
130		field contains "none".
131
132
133What:		/sys/block/<disk>/integrity/protection_interval_bytes
134Date:		July 2015
135Contact:	Martin K. Petersen <martin.petersen@oracle.com>
136Description:
137		Describes the number of data bytes which are protected
138		by one integrity tuple. Typically the device's logical
139		block size.
140
141
142What:		/sys/block/<disk>/integrity/read_verify
143Date:		June 2008
144Contact:	Martin K. Petersen <martin.petersen@oracle.com>
145Description:
146		Indicates whether the block layer should verify the
147		integrity of read requests serviced by devices that
148		support sending integrity metadata.
149
150
151What:		/sys/block/<disk>/integrity/tag_size
152Date:		June 2008
153Contact:	Martin K. Petersen <martin.petersen@oracle.com>
154Description:
155		Number of bytes of integrity tag space available per
156		protection_interval_bytes, which is typically
157		the device's logical block size.
158		This field describes the size of the application tag
159		if the storage device is formatted with T10 Protection
160		Information and permits use of the application tag.
161		The tag_size is reported in bytes and indicates the
162		space available for adding an opaque tag to each block
163		(protection_interval_bytes).
164		If the device does not support T10 Protection Information
165		(even if the device provides application integrity
166		metadata space), this field is set to 0.
167
168
169What:		/sys/block/<disk>/integrity/write_generate
170Date:		June 2008
171Contact:	Martin K. Petersen <martin.petersen@oracle.com>
172Description:
173		Indicates whether the block layer should automatically
174		generate checksums for write requests bound for
175		devices that support receiving integrity metadata.
176
177
178What:		/sys/block/<disk>/partscan
179Date:		May 2024
180Contact:	Christoph Hellwig <hch@lst.de>
181Description:
		The /sys/block/<disk>/partscan file reports whether partition
		scanning is enabled for the disk.  It returns "1" if partition
		scanning is enabled, or "0" if not.  The value type is a 32-bit
		unsigned integer, but only "0" and "1" are valid values.
186
187
188What:		/sys/block/<disk>/<partition>/alignment_offset
189Date:		April 2009
190Contact:	Martin K. Petersen <martin.petersen@oracle.com>
191Description:
192		Storage devices may report a physical block size that is
193		bigger than the logical block size (for instance a drive
194		with 4KB physical sectors exposing 512-byte logical
195		blocks to the operating system).  This parameter
196		indicates how many bytes the beginning of the partition
197		is offset from the disk's natural alignment.
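
		As a rough illustration, the offset of a partition can be
		estimated from its starting byte and the physical block size.
		The function below is a sketch under stated assumptions, not
		the kernel's computation: its name is hypothetical, and it
		ignores other limits (such as minimum_io_size) that may
		influence the reported value.

```python
def partition_alignment_offset(start_bytes: int,
                               physical_block_size: int,
                               device_alignment_offset: int = 0) -> int:
    """Bytes from the partition start to the next naturally aligned
    physical block, given where the partition begins on the disk."""
    if physical_block_size <= 0:
        raise ValueError("physical_block_size must be positive")
    # Python's % always yields a non-negative result for a positive modulus.
    return (device_alignment_offset - start_bytes) % physical_block_size
```

		For a disk with 4KB physical sectors and 512-byte logical
		blocks, a partition starting at logical block 63 would be 512
		bytes short of the next physical block boundary, while one
		starting at logical block 2048 would be aligned.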
198
199
200What:		/sys/block/<disk>/<partition>/discard_alignment
201Date:		May 2011
202Contact:	Martin K. Petersen <martin.petersen@oracle.com>
203Description:
204		Devices that support discard functionality may
205		internally allocate space in units that are bigger than
206		the exported logical block size. The discard_alignment
207		parameter indicates how many bytes the beginning of the
208		partition is offset from the internal allocation unit's
209		natural alignment.
210
211
212What:		/sys/block/<disk>/<partition>/stat
213Date:		February 2008
214Contact:	Jerome Marchand <jmarchan@redhat.com>
215Description:
216		The /sys/block/<disk>/<partition>/stat files display the
217		I/O statistics of partition <partition>. The format is the
218		same as the format of /sys/block/<disk>/stat.
219
220
221What:		/sys/block/<disk>/queue/add_random
222Date:		June 2010
223Contact:	linux-block@vger.kernel.org
224Description:
		[RW] This file allows the disk entropy contribution to be
		turned off. The default value of this file is '1' (on).
227
228
229What:		/sys/block/<disk>/queue/chunk_sectors
230Date:		September 2016
231Contact:	Hannes Reinecke <hare@suse.com>
232Description:
		[RO] chunk_sectors has a different meaning depending on the type
		of the disk. For a RAID device (dm-raid), chunk_sectors
		indicates the size in 512B sectors of the RAID volume stripe
		segment. For a zoned block device, either host-aware or
		host-managed, chunk_sectors indicates the size in 512B sectors
		of the zones of the device, with the possible exception of the
		last zone of the device, which may be smaller.
240
241
242What:		/sys/block/<disk>/queue/crypto/
243Date:		February 2022
244Contact:	linux-block@vger.kernel.org
245Description:
246		The presence of this subdirectory of /sys/block/<disk>/queue/
247		indicates that the device supports inline encryption.  This
248		subdirectory contains files which describe the inline encryption
249		capabilities of the device.  For more information about inline
250		encryption, refer to Documentation/block/inline-encryption.rst.
251
252
253What:		/sys/block/<disk>/queue/crypto/hw_wrapped_keys
254Date:		February 2025
255Contact:	linux-block@vger.kernel.org
256Description:
257		[RO] The presence of this file indicates that the device
258		supports hardware-wrapped inline encryption keys, i.e. key blobs
259		that can only be unwrapped and used by dedicated hardware.  For
260		more information about hardware-wrapped inline encryption keys,
261		see Documentation/block/inline-encryption.rst.
262
263
264What:		/sys/block/<disk>/queue/crypto/max_dun_bits
265Date:		February 2022
266Contact:	linux-block@vger.kernel.org
267Description:
268		[RO] This file shows the maximum length, in bits, of data unit
269		numbers accepted by the device in inline encryption requests.
270
271
272What:		/sys/block/<disk>/queue/crypto/modes/<mode>
273Date:		February 2022
274Contact:	linux-block@vger.kernel.org
275Description:
276		[RO] For each crypto mode (i.e., encryption/decryption
277		algorithm) the device supports with inline encryption, a file
278		will exist at this location.  It will contain a hexadecimal
279		number that is a bitmask of the supported data unit sizes, in
280		bytes, for that crypto mode.
281
282		Currently, the crypto modes that may be supported are:
283
284		   * AES-256-XTS
285		   * AES-128-CBC-ESSIV
286		   * Adiantum
287
288		For example, if a device supports AES-256-XTS inline encryption
289		with data unit sizes of 512 and 4096 bytes, the file
290		/sys/block/<disk>/queue/crypto/modes/AES-256-XTS will exist and
291		will contain "0x1200".
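
		The bitmask can be decoded by listing the powers of two whose
		bits are set. The helper below is a minimal sketch; its name is
		hypothetical.

```python
def supported_data_unit_sizes(mask_text: str) -> list[int]:
    """Decode a crypto mode file (e.g. "0x1200") into data unit sizes."""
    mask = int(mask_text.strip(), 16)
    # Each set bit N means a data unit size of 2**N bytes is supported.
    return [1 << bit for bit in range(mask.bit_length()) if mask & (1 << bit)]
```

		With the example above, "0x1200" has bits 9 and 12 set, which
		correspond to 512 and 4096 bytes.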
292
293
294What:		/sys/block/<disk>/queue/crypto/num_keyslots
295Date:		February 2022
296Contact:	linux-block@vger.kernel.org
297Description:
298		[RO] This file shows the number of keyslots the device has for
299		use with inline encryption.
300
301
302What:		/sys/block/<disk>/queue/crypto/raw_keys
303Date:		February 2025
304Contact:	linux-block@vger.kernel.org
305Description:
306		[RO] The presence of this file indicates that the device
307		supports raw inline encryption keys, i.e. keys that are managed
308		in raw, plaintext form in software.
309
310
311What:		/sys/block/<disk>/queue/dax
312Date:		June 2016
313Contact:	linux-block@vger.kernel.org
314Description:
315		[RO] This file indicates whether the device supports Direct
316		Access (DAX), used by CPU-addressable storage to bypass the
317		pagecache.  It shows '1' if true, '0' if not.
318
319
320What:		/sys/block/<disk>/queue/discard_granularity
321Date:		May 2011
322Contact:	Martin K. Petersen <martin.petersen@oracle.com>
323Description:
324		[RO] Devices that support discard functionality may internally
325		allocate space using units that are bigger than the logical
326		block size. The discard_granularity parameter indicates the size
327		of the internal allocation unit in bytes if reported by the
328		device. Otherwise the discard_granularity will be set to match
329		the device's physical block size. A discard_granularity of 0
330		means that the device does not support discard functionality.
331
332
333What:		/sys/block/<disk>/queue/discard_max_bytes
334Date:		May 2011
335Contact:	Martin K. Petersen <martin.petersen@oracle.com>
336Description:
		[RW] While discard_max_hw_bytes is the hardware limit for the
		device, this setting is the software limit. Some devices exhibit
		large latencies when large discards are issued. Setting this
		value lower will make Linux issue smaller discards and
		potentially help reduce latencies induced by large discard
		operations.
343
344
345What:		/sys/block/<disk>/queue/discard_max_hw_bytes
346Date:		July 2015
347Contact:	linux-block@vger.kernel.org
348Description:
349		[RO] Devices that support discard functionality may have
350		internal limits on the number of bytes that can be trimmed or
351		unmapped in a single operation.  The `discard_max_hw_bytes`
352		parameter is set by the device driver to the maximum number of
353		bytes that can be discarded in a single operation.  Discard
354		requests issued to the device must not exceed this limit.  A
355		`discard_max_hw_bytes` value of 0 means that the device does not
356		support discard functionality.
357
358
359What:		/sys/block/<disk>/queue/discard_zeroes_data
360Date:		May 2011
361Contact:	Martin K. Petersen <martin.petersen@oracle.com>
362Description:
363		[RO] Will always return 0.  Don't rely on any specific behavior
364		for discards, and don't read this file.
365
366
367What:		/sys/block/<disk>/queue/dma_alignment
368Date:		May 2022
369Contact:	linux-block@vger.kernel.org
370Description:
371		Reports the alignment that user space addresses must have to be
372		used for raw block device access with O_DIRECT and other driver
373		specific passthrough mechanisms.
374
375
376What:		/sys/block/<disk>/queue/fua
377Date:		May 2018
378Contact:	linux-block@vger.kernel.org
379Description:
380		[RO] Whether or not the block driver supports the FUA flag for
381		write requests.  FUA stands for Force Unit Access. If the FUA
382		flag is set that means that write requests must bypass the
383		volatile cache of the storage device.
384
385
386What:		/sys/block/<disk>/queue/hw_sector_size
387Date:		January 2008
388Contact:	linux-block@vger.kernel.org
389Description:
390		[RO] This is the hardware sector size of the device, in bytes.
391
392
393What:		/sys/block/<disk>/queue/independent_access_ranges/
394Date:		October 2021
395Contact:	linux-block@vger.kernel.org
396Description:
		[RO] The presence of this sub-directory of the
		/sys/block/<disk>/queue/ directory indicates that the device is
		capable of executing requests targeting different sector ranges
		in parallel. For instance, single LUN multi-actuator hard-disks
		will have an independent_access_ranges directory if the device
		correctly advertises the sector ranges of its actuators.

		The independent_access_ranges directory contains one directory
		per access range, with each range described using the sector
		(RO) attribute file to indicate the first sector of the range
		and the nr_sectors (RO) attribute file to indicate the total
		number of sectors in the range starting from the first sector of
		the range.  For example, a dual-actuator hard-disk will have the
		following independent_access_ranges entries::
411
412			$ tree /sys/block/<disk>/queue/independent_access_ranges/
413			/sys/block/<disk>/queue/independent_access_ranges/
414			|-- 0
415			|   |-- nr_sectors
416			|   `-- sector
417			`-- 1
418			    |-- nr_sectors
419			    `-- sector
420
		The sector and nr_sectors attributes use 512B sector units,
		regardless of the actual block size of the device. Independent
		access ranges do not overlap and include all sectors within the
		device capacity. The access ranges are numbered in increasing
		order of the range start sector, that is, the sector attribute
		of range 0 always has the value 0.
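
		The properties stated above (no overlap, full coverage,
		increasing start sectors beginning at 0) imply that the ranges
		are contiguous. A sketch of a consistency check, with a
		hypothetical function name and with the (sector, nr_sectors)
		pairs assumed to be supplied in range-number order:

```python
def ranges_cover_device(ranges: list[tuple[int, int]],
                        capacity_sectors: int) -> bool:
    """Check that (sector, nr_sectors) pairs tile [0, capacity_sectors).

    All values are in 512B sectors, as in the sysfs attributes.
    """
    expected_start = 0
    for sector, nr_sectors in ranges:
        # Each range must begin exactly where the previous one ended.
        if sector != expected_start or nr_sectors <= 0:
            return False
        expected_start = sector + nr_sectors
    return expected_start == capacity_sectors
```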
427
428
429What:		/sys/block/<disk>/queue/io_poll
430Date:		November 2015
431Contact:	linux-block@vger.kernel.org
432Description:
433		[RW] When read, this file shows whether polling is enabled (1)
434		or disabled (0).  Writing '0' to this file will disable polling
435		for this device.  Writing any non-zero value will enable this
436		feature.
437
438
439What:		/sys/block/<disk>/queue/io_poll_delay
440Date:		November 2016
441Contact:	linux-block@vger.kernel.org
442Description:
443		[RW] This was used to control what kind of polling will be
444		performed.  It is now fixed to -1, which is classic polling.
445		In this mode, the CPU will repeatedly ask for completions
446		without giving up any time.
447		<deprecated>
448
449
450What:		/sys/block/<disk>/queue/io_timeout
451Date:		November 2018
452Contact:	Weiping Zhang <zhangweiping@didiglobal.com>
453Description:
454		[RW] io_timeout is the request timeout in milliseconds. If a
455		request does not complete in this time then the block driver
456		timeout handler is invoked. That timeout handler can decide to
457		retry the request, to fail it or to start a device recovery
458		strategy.
459
460
461What:		/sys/block/<disk>/queue/iostats
462Date:		January 2009
463Contact:	linux-block@vger.kernel.org
464Description:
465		[RW] This file is used to control (on/off) the iostats
466		accounting of the disk.
467
468What:		/sys/block/<disk>/queue/iostats_passthrough
469Date:		October 2024
470Contact:	linux-block@vger.kernel.org
471Description:
472		[RW] This file is used to control (on/off) the iostats
473		accounting of the disk for passthrough commands.
474
475
476What:		/sys/block/<disk>/queue/logical_block_size
477Date:		May 2009
478Contact:	Martin K. Petersen <martin.petersen@oracle.com>
479Description:
480		[RO] This is the smallest unit the storage device can address.
481		It is typically 512 bytes.
482
483
484What:		/sys/block/<disk>/queue/max_active_zones
485Date:		July 2020
486Contact:	Niklas Cassel <niklas.cassel@wdc.com>
487Description:
488		[RO] For zoned block devices (zoned attribute indicating
489		"host-managed" or "host-aware"), the sum of zones belonging to
490		any of the zone states: EXPLICIT OPEN, IMPLICIT OPEN or CLOSED,
491		is limited by this value. If this value is 0, there is no limit.
492
493		If the host attempts to exceed this limit, the driver should
494		report this error with BLK_STS_ZONE_ACTIVE_RESOURCE, which user
495		space may see as the EOVERFLOW errno.
496
497
498What:		/sys/block/<disk>/queue/max_discard_segments
499Date:		February 2017
500Contact:	linux-block@vger.kernel.org
501Description:
502		[RO] The maximum number of DMA scatter/gather entries in a
503		discard request.
504
505
506What:		/sys/block/<disk>/queue/max_hw_sectors_kb
507Date:		September 2004
508Contact:	linux-block@vger.kernel.org
509Description:
510		[RO] This is the maximum number of kilobytes supported in a
511		single data transfer.
512
513
514What:		/sys/block/<disk>/queue/max_integrity_segments
515Date:		September 2010
516Contact:	linux-block@vger.kernel.org
517Description:
518		[RO] Maximum number of elements in a DMA scatter/gather list
519		with integrity data that will be submitted by the block layer
520		core to the associated block driver.
521
522
523What:		/sys/block/<disk>/queue/max_open_zones
524Date:		July 2020
525Contact:	Niklas Cassel <niklas.cassel@wdc.com>
526Description:
527		[RO] For zoned block devices (zoned attribute indicating
528		"host-managed" or "host-aware"), the sum of zones belonging to
529		any of the zone states: EXPLICIT OPEN or IMPLICIT OPEN, is
530		limited by this value. If this value is 0, there is no limit.
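
		The max_open_zones and max_active_zones limits can be expressed
		as a simple check on per-state zone counts. This is only a
		sketch: the function and its parameters are hypothetical, and
		the counts would have to come from a zone report obtained
		elsewhere.

```python
def zone_limits_respected(explicit_open: int, implicit_open: int,
                          closed: int, max_open_zones: int,
                          max_active_zones: int) -> bool:
    """Return True if the zone counts stay within both limits.

    A limit of 0 means "no limit", as described for both attributes.
    """
    open_zones = explicit_open + implicit_open
    # Active zones are the open zones plus the closed ones.
    active_zones = open_zones + closed
    open_ok = max_open_zones == 0 or open_zones <= max_open_zones
    active_ok = max_active_zones == 0 or active_zones <= max_active_zones
    return open_ok and active_ok
```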
531
532
533What:		/sys/block/<disk>/queue/max_sectors_kb
534Date:		September 2004
535Contact:	linux-block@vger.kernel.org
536Description:
537		[RW] This is the maximum number of kilobytes that the block
538		layer will allow for a filesystem request. Must be smaller than
539		or equal to the maximum size allowed by the hardware. Write 0
540		to use default kernel settings.
541
542
543What:		/sys/block/<disk>/queue/max_segment_size
544Date:		March 2010
545Contact:	linux-block@vger.kernel.org
546Description:
547		[RO] Maximum size in bytes of a single element in a DMA
548		scatter/gather list.
549
550What:		/sys/block/<disk>/queue/max_write_streams
551Date:		November 2024
552Contact:	linux-block@vger.kernel.org
553Description:
554		[RO] Maximum number of write streams supported, 0 if not
555		supported. If supported, valid values are 1 through
556		max_write_streams, inclusive.
557
558What:		/sys/block/<disk>/queue/write_stream_granularity
559Date:		November 2024
560Contact:	linux-block@vger.kernel.org
561Description:
562		[RO] Granularity of a write stream in bytes.  The granularity
563		of a write stream is the size that should be discarded or
564		overwritten together to avoid write amplification in the device.
565
566What:		/sys/block/<disk>/queue/max_segments
567Date:		March 2010
568Contact:	linux-block@vger.kernel.org
569Description:
570		[RO] Maximum number of elements in a DMA scatter/gather list
571		that is submitted to the associated block driver.
572
573
574What:		/sys/block/<disk>/queue/minimum_io_size
575Date:		April 2009
576Contact:	Martin K. Petersen <martin.petersen@oracle.com>
577Description:
578		[RO] Storage devices may report a granularity or preferred
579		minimum I/O size which is the smallest request the device can
580		perform without incurring a performance penalty.  For disk
581		drives this is often the physical block size.  For RAID arrays
582		it is often the stripe chunk size.  A properly aligned multiple
583		of minimum_io_size is the preferred request size for workloads
584		where a high number of I/O operations is desired.
585
586
587What:		/sys/block/<disk>/queue/nomerges
588Date:		January 2010
589Contact:	linux-block@vger.kernel.org
590Description:
		[RW] Standard I/O elevator operations include attempts to merge
		contiguous I/Os. For known random I/O loads these attempts will
		always fail and result in extra cycles being spent in the
		kernel. This file allows one to turn off this behavior in one of
		two ways: When set to 1, complex merge checks are disabled, but
		the simple one-shot merges with the previous I/O request are
		enabled. When set to 2, all merge tries are disabled. The
		default value is 0, which enables all types of merge tries.
599
600
601What:		/sys/block/<disk>/queue/nr_requests
602Date:		July 2003
603Contact:	linux-block@vger.kernel.org
604Description:
		[RW] This controls how many requests may be allocated in the
		block layer. Note that this value only represents the quantity
		for a single blk_mq_tags instance. The actual number for the
		entire device depends on the hardware queue count, whether an
		elevator is enabled, and whether tags are shared.
610
611
612What:		/sys/block/<disk>/queue/async_depth
613Date:		August 2025
614Contact:	linux-block@vger.kernel.org
615Description:
616		[RW] Controls how many asynchronous requests may be allocated in the
617		block layer. The value is always capped at nr_requests.
618
619		When no elevator is active (none):
620		- async_depth is always equal to nr_requests.
621
622		For bfq scheduler:
623		- By default, async_depth is set to 75% of nr_requests.
624		  Internal limits are then derived from this value:
625		  * Sync writes: limited to async_depth (≈75% of nr_requests).
626		  * Async I/O: limited to ~2/3 of async_depth (≈50% of nr_requests).
627
628		  If a bfq_queue is weight-raised:
629		  * Sync writes: limited to ~1/2 of async_depth (≈37% of nr_requests).
630		  * Async I/O: limited to ~1/4 of async_depth (≈18% of nr_requests).
631
632		- If the user writes a custom value to async_depth, BFQ will recompute
633		  these limits proportionally based on the new value.
634
635		For Kyber:
636		- By default async_depth is set to 75% of nr_requests.
		- If the user writes a custom value to async_depth, it overrides the
		  default and directly controls the limit for writes and async I/O.
639
640		For mq-deadline:
641		- By default async_depth is set to nr_requests.
		- If the user writes a custom value to async_depth, it overrides the
		  default and directly controls the limit for writes and async I/O.
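
		The approximate bfq limits listed above can be worked through
		numerically. The sketch below only reproduces the stated
		fractions with integer division; the function name, the
		dictionary keys and the rounding are assumptions, and the
		scheduler's own arithmetic may differ.

```python
from typing import Optional


def bfq_depth_limits(nr_requests: int,
                     async_depth: Optional[int] = None) -> dict[str, int]:
    """Approximate the bfq limits derived from async_depth."""
    if async_depth is None:
        # Default described above: 75% of nr_requests.
        async_depth = nr_requests * 3 // 4
    # async_depth is always capped at nr_requests.
    async_depth = min(async_depth, nr_requests)
    return {
        "sync_writes": async_depth,
        "async_io": async_depth * 2 // 3,
        "weight_raised_sync_writes": async_depth // 2,
        "weight_raised_async_io": async_depth // 4,
    }
```

		With nr_requests set to 256, the default async_depth would be
		192, giving roughly 192, 128, 96 and 48 requests for the four
		cases.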
644
645
646What:		/sys/block/<disk>/queue/nr_zones
647Date:		November 2018
648Contact:	Damien Le Moal <damien.lemoal@wdc.com>
649Description:
650		[RO] nr_zones indicates the total number of zones of a zoned
651		block device ("host-aware" or "host-managed" zone model). For
652		regular block devices, the value is always 0.
653
654
655What:		/sys/block/<disk>/queue/optimal_io_size
656Date:		April 2009
657Contact:	Martin K. Petersen <martin.petersen@oracle.com>
658Description:
659		[RO] Storage devices may report an optimal I/O size, which is
660		the device's preferred unit for sustained I/O.  This is rarely
661		reported for disk drives.  For RAID arrays it is usually the
662		stripe width or the internal track size.  A properly aligned
663		multiple of optimal_io_size is the preferred request size for
664		workloads where sustained throughput is desired.  If no optimal
665		I/O size is reported this file contains 0.
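
		An application might use these hints to size its requests. The
		following sketch rounds a desired size down to a multiple of
		optimal_io_size and falls back to minimum_io_size when no
		optimal size is reported. The function name and this fallback
		policy are assumptions for illustration, and offset alignment
		is not handled.

```python
def preferred_request_size(want_bytes: int, minimum_io_size: int,
                           optimal_io_size: int) -> int:
    """Pick a request size that is a multiple of the device's hint."""
    # optimal_io_size is 0 when the device reports no optimal size.
    unit = optimal_io_size or minimum_io_size
    if unit <= 0:
        return want_bytes
    # Round down to a whole number of units, but use at least one unit.
    return max(unit, (want_bytes // unit) * unit)
```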
666
667
668What:		/sys/block/<disk>/queue/physical_block_size
669Date:		May 2009
670Contact:	Martin K. Petersen <martin.petersen@oracle.com>
671Description:
672		[RO] This is the smallest unit a physical storage device can
673		write atomically.  It is usually the same as the logical block
674		size but may be bigger.  One example is SATA drives with 4KB
675		sectors that expose a 512-byte logical block size to the
676		operating system.  For stacked block devices the
677		physical_block_size variable contains the maximum
678		physical_block_size of the component devices.
679
680
681What:		/sys/block/<disk>/queue/read_ahead_kb
682Date:		May 2004
683Contact:	linux-block@vger.kernel.org
684Description:
685		[RW] Maximum number of kilobytes to read-ahead for filesystems
686		on this block device.
687
688		For MADV_HUGEPAGE, the readahead size may exceed this setting
689		since its granularity is based on the hugepage size.
690
691
692What:		/sys/block/<disk>/queue/rotational
693Date:		January 2009
694Contact:	linux-block@vger.kernel.org
695Description:
		[RW] This file is used to state whether the device is of
		rotational type or non-rotational type.
698
699
700What:		/sys/block/<disk>/queue/rq_affinity
701Date:		September 2008
702Contact:	linux-block@vger.kernel.org
703Description:
704		[RW] If this option is '1', the block layer will migrate request
705		completions to the cpu "group" that originally submitted the
706		request. For some workloads this provides a significant
707		reduction in CPU cycles due to caching effects.
708
709		For storage configurations that need to maximize distribution of
710		completion processing setting this option to '2' forces the
711		completion to run on the requesting cpu (bypassing the "group"
712		aggregation logic).
713
714
715What:		/sys/block/<disk>/queue/scheduler
716Date:		October 2004
717Contact:	linux-block@vger.kernel.org
718Description:
719		[RW] When read, this file will display the current and available
720		IO schedulers for this block device. The currently active IO
721		scheduler will be enclosed in [] brackets. Writing an IO
722		scheduler name to this file will switch control of this block
723		device to that new IO scheduler. Note that writing an IO
724		scheduler name to this file will attempt to load that IO
725		scheduler module, if it isn't already present in the system.
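
		A minimal sketch of parsing this file follows. The function
		name is hypothetical, and the sample string in the usage note
		is illustrative rather than taken from a real device.

```python
from typing import Optional


def parse_scheduler(text: str) -> tuple[Optional[str], list[str]]:
    """Return (active scheduler, all available schedulers)."""
    active = None
    available = []
    for name in text.split():
        # The active scheduler is the one enclosed in [] brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available
```

		For the text "[mq-deadline] kyber bfq none", this returns
		"mq-deadline" as the active scheduler along with all four
		names.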
726
727
728What:		/sys/block/<disk>/queue/stable_writes
729Date:		September 2020
730Contact:	linux-block@vger.kernel.org
731Description:
732		[RW] This file will contain '1' if memory must not be modified
733		while it is being used in a write request to this device.  When
734		this is the case and the kernel is performing writeback of a
735		page, the kernel will wait for writeback to complete before
736		allowing the page to be modified again, rather than allowing
737		immediate modification as is normally the case.  This
738		restriction arises when the device accesses the memory multiple
739		times where the same data must be seen every time -- for
740		example, once to calculate a checksum and once to actually write
741		the data.  If no such restriction exists, this file will contain
742		'0'.  This file is writable for testing purposes.
743
744What:		/sys/block/<disk>/queue/virt_boundary_mask
745Date:		April 2021
746Contact:	linux-block@vger.kernel.org
747Description:
748		[RO] This file shows the I/O segment memory alignment mask for
749		the block device.  I/O requests to this device will be split
750		between segments wherever either the memory address of the end
751		of the previous segment or the memory address of the beginning
752		of the current segment is not aligned to virt_boundary_mask + 1
753		bytes.
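
		The split condition can be written as a bit test, assuming that
		virt_boundary_mask + 1 is a power of two and that a mask of 0
		means no such constraint applies (both are assumptions of this
		sketch, as is the function name).

```python
def must_split(prev_end_addr: int, next_start_addr: int,
               virt_boundary_mask: int) -> bool:
    """Return True if a request must be split between two segments."""
    if virt_boundary_mask == 0:
        return False
    # An address is aligned when none of the mask bits are set in it.
    prev_unaligned = prev_end_addr & virt_boundary_mask
    next_unaligned = next_start_addr & virt_boundary_mask
    return bool(prev_unaligned or next_unaligned)
```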
754
755
756What:		/sys/block/<disk>/queue/wbt_lat_usec
757Date:		November 2016
758Contact:	linux-block@vger.kernel.org
759Description:
760		[RW] If the device is registered for writeback throttling, then
761		this file shows the target minimum read latency. If this latency
762		is exceeded in a given window of time (see curr_win_nsec), then
763		the writeback throttling will start scaling back writes. Writing
764		a value of '0' to this file disables the feature. Writing a
765		value of '-1' to this file resets the value to the default
766		setting.
767
768
769What:		/sys/block/<disk>/queue/write_cache
770Date:		April 2016
771Contact:	linux-block@vger.kernel.org
772Description:
		[RW] When read, this file will display whether the device has
		write back caching enabled or not. It will return "write back"
		for the former case, and "write through" for the latter. Writing
		to this file can change the kernel's view of the device, but it
		doesn't alter the device state. This means that it might not be
		safe to toggle the setting from "write back" to "write through",
		since that will also eliminate cache flushes issued by the
		kernel.
781
782
783What:		/sys/block/<disk>/queue/write_same_max_bytes
784Date:		January 2012
785Contact:	Martin K. Petersen <martin.petersen@oracle.com>
786Description:
787		[RO] Some devices support a write same operation in which a
788		single data block can be written to a range of several
789		contiguous blocks on storage. This can be used to wipe areas on
790		disk or to initialize drives in a RAID configuration.
791		write_same_max_bytes indicates how many bytes can be written in
792		a single write same command. If write_same_max_bytes is 0, write
793		same is not supported by the device.
794
795
796What:		/sys/block/<disk>/queue/write_zeroes_max_bytes
797Date:		November 2016
798Contact:	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
799Description:
800		[RO] Some devices support a write zeroes operation in which a
801		single request can be issued to zero out a range of contiguous
802		blocks on storage without having any payload in the request.
803		This can be used to optimize writing zeroes to the devices.
804		write_zeroes_max_bytes indicates how many bytes can be written
805		in a single write zeroes command. If write_zeroes_max_bytes is
806		0, write zeroes is not supported by the device.
807
808
809What:		/sys/block/<disk>/queue/write_zeroes_unmap_max_hw_bytes
810Date:		January 2025
811Contact:	Zhang Yi <yi.zhang@huawei.com>
812Description:
813		[RO] This file indicates whether a device supports zeroing data
814		in a specified block range without incurring the cost of
815		physically writing zeroes to the media for each individual
816		block. If this parameter is set to write_zeroes_max_bytes, the
817		device implements a zeroing operation which opportunistically
818		avoids writing zeroes to media while still guaranteeing that
819		subsequent reads from the specified block range will return
820		zeroed data. This operation is a best-effort optimization; a
821		device may fall back to physically writing zeroes to the media
822		due to other factors such as misalignment or being asked to
823		clear a block range smaller than the device's internal
824		allocation unit. If this parameter is set to 0, the device may
825		have to write each logical block to media during a zeroing
826		operation.
827
828
829What:		/sys/block/<disk>/queue/write_zeroes_unmap_max_bytes
830Date:		January 2025
831Contact:	Zhang Yi <yi.zhang@huawei.com>
832Description:
833		[RW] While write_zeroes_unmap_max_hw_bytes is the hardware limit
834		for the device, this setting is the software limit. Since the
835		unmap write zeroes operation is a best-effort optimization, some
836		devices may still physically write zeroes to media, so the
837		speed of this operation is not guaranteed. Writing a value of
838		'0' to this file disables this operation. Otherwise, this
839		parameter should be equal to write_zeroes_unmap_max_hw_bytes.
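
The relationship between the hardware and software limits can be summarised in a few lines. This is an illustrative sketch only: the function name and the returned strings are hypothetical, and the two arguments are assumed to be the integer values already read from write_zeroes_unmap_max_hw_bytes and write_zeroes_unmap_max_bytes.

```python
def unmap_write_zeroes_state(unmap_max_hw_bytes, unmap_max_bytes):
    """Classify the unmap write zeroes state from the two sysfs limits.

    unmap_max_hw_bytes: value of write_zeroes_unmap_max_hw_bytes
    unmap_max_bytes:    value of write_zeroes_unmap_max_bytes
    """
    if unmap_max_hw_bytes == 0:
        # The device may have to write every logical block to media.
        return "unsupported"
    if unmap_max_bytes == 0:
        # The hardware supports it, but the software limit disables it.
        return "disabled"
    # Otherwise the software limit is expected to equal the hardware limit.
    return "enabled"
```

Even in the "enabled" state the operation remains best effort, so the result says nothing about how fast a given zeroing request will complete.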
840
841
842What:		/sys/block/<disk>/queue/zone_append_max_bytes
843Date:		May 2020
844Contact:	linux-block@vger.kernel.org
845Description:
846		[RO] This is the maximum number of bytes that can be written to
847		a sequential zone of a zoned block device using a zone append
848		write operation (REQ_OP_ZONE_APPEND). This value is always 0 for
849		regular block devices.
850
851
852What:		/sys/block/<disk>/queue/zone_write_granularity
853Date:		January 2021
854Contact:	linux-block@vger.kernel.org
855Description:
856		[RO] This indicates the alignment constraint, in bytes, for
857		write operations in sequential zones of zoned block devices
858		(devices with a zoned attribute that reports "host-managed" or
859		"host-aware"). This value is always 0 for regular block devices.
860
861
862What:		/sys/block/<disk>/queue/zoned
863Date:		September 2016
864Contact:	Damien Le Moal <damien.lemoal@wdc.com>
865Description:
866		[RO] zoned indicates if the device is a zoned block device and
867		the zone model of the device if it is indeed zoned.  The
868		possible values indicated by zoned are "none" for regular block
869		devices and "host-aware" or "host-managed" for zoned block
870		devices. The characteristics of host-aware and host-managed
871		zoned block devices are described in the ZBC (Zoned Block
872		Commands) and ZAC (Zoned Device ATA Command Set) standards.
873		These standards also define the "drive-managed" zone model.
874		However, since drive-managed zoned block devices do not support
875		zone commands, they will be treated as regular block devices and
876		zoned will report "none".
877
878
879What:		/sys/block/<disk>/hidden
880Date:		March 2023
881Contact:	linux-block@vger.kernel.org
882Description:
883		[RO] The block device is hidden. It doesn’t produce events, and
884		can’t be opened from userspace or using blkdev_get*.
885		Used for the underlying components of multipath devices.
886
887
888What:		/sys/block/<disk>/stat
889Date:		February 2008
890Contact:	Jerome Marchand <jmarchan@redhat.com>
891Description:
892		The /sys/block/<disk>/stat file displays the I/O
893		statistics of disk <disk>. It contains 17 fields:
894
895		==  ==============================================
896		 1  reads completed successfully
897		 2  reads merged
898		 3  sectors read
899		 4  time spent reading (ms)
900		 5  writes completed
901		 6  writes merged
902		 7  sectors written
903		 8  time spent writing (ms)
904		 9  I/Os currently in progress
905		10  time spent doing I/Os (ms)
906		11  weighted time spent doing I/Os (ms)
907		12  discards completed
908		13  discards merged
909		14  sectors discarded
910		15  time spent discarding (ms)
911		16  flush requests completed
912		17  time spent flushing (ms)
913		==  ==============================================
914
915		For more details, refer to Documentation/admin-guide/iostats.rst
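
The field table above maps directly onto a small parser. The following is a minimal sketch under stated assumptions: the key names in STAT_FIELDS and the function name parse_block_stat are made up for illustration (the kernel does not define them), and the input is assumed to be the whitespace-separated content of a stat file.

```python
# Hypothetical key names, listed in the same order as the table above.
STAT_FIELDS = (
    "reads_completed", "reads_merged", "sectors_read", "read_time_ms",
    "writes_completed", "writes_merged", "sectors_written", "write_time_ms",
    "ios_in_progress", "io_time_ms", "weighted_io_time_ms",
    "discards_completed", "discards_merged", "sectors_discarded",
    "discard_time_ms",
    "flushes_completed", "flush_time_ms",
)


def parse_block_stat(text):
    """Map the numbers in a stat file's content onto named fields.

    zip() stops at the shorter sequence, so input that carries fewer
    values simply yields fewer keys instead of raising an error.
    """
    values = [int(token) for token in text.split()]
    return dict(zip(STAT_FIELDS, values))
```

A caller could feed it the contents of /sys/block/sda/stat (sda being a placeholder disk name) and, for example, look up the "sectors_read" key in the returned dictionary.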
916