.\"
.\" Copyright (c) 2013 by Turbo Fredriksson <turbo@bayour.com>. All rights reserved.
.\" Copyright (c) 2019, 2021 by Delphix. All rights reserved.
.\" Copyright (c) 2019 Datto Inc.
.\" The contents of this file are subject to the terms of the Common Development
.\" and Distribution License (the "License").  You may not use this file except
.\" in compliance with the License. You can obtain a copy of the license at
.\" usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing.
.\"
.\" See the License for the specific language governing permissions and
.\" limitations under the License. When distributing Covered Code, include this
.\" CDDL HEADER in each file and include the License file at
.\" usr/src/OPENSOLARIS.LICENSE.  If applicable, add the following below this
.\" CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your
.\" own identifying information:
.\" Portions Copyright [yyyy] [name of copyright owner]
.\"
.Dd June 1, 2021
.Dt ZFS 4
.Os
.
.Sh NAME
.Nm zfs
.Nd tuning of the ZFS kernel module
.
.Sh DESCRIPTION
The ZFS module supports these parameters:
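.Pp
On Linux, these parameters appear under
.Pa /sys/module/zfs/parameters
and the writable ones can be changed at runtime;
on
.Fx
most of them are exposed as
.Xr sysctl 8
variables under the
.Sy vfs.zfs
tree.
As a minimal sketch on Linux
.Pq the Sy zfs_arc_min No value below is purely illustrative :
.Bd -literal -compact
# List all module parameters and their current values.
grep . /sys/module/zfs/parameters/*

# Make a setting persistent across module reloads.
echo "options zfs zfs_arc_min=1073741824" >> /etc/modprobe.d/zfs.conf
.Ed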
.Bl -tag -width Ds
.It Sy dbuf_cache_max_bytes Ns = Ns Sy ULONG_MAX Ns B Pq ulong
Maximum size in bytes of the dbuf cache.
The target size is determined by the smaller of this value and
.No 1/2^ Ns Sy dbuf_cache_shift Pq 1/32nd
of the target ARC size.
The behavior of the dbuf cache and its associated settings
can be observed via the
.Pa /proc/spl/kstat/zfs/dbufstats
kstat.
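.Pp
For example, assuming a target ARC size of
.Sy 4GB
and the default
.Sy dbuf_cache_shift Ns = Ns Sy 5 ,
the dbuf cache target works out to
.Bd -literal -compact
4GB / 2^5 = 4294967296 B / 32 = 134217728 B (128MB)
.Ed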
.
.It Sy dbuf_metadata_cache_max_bytes Ns = Ns Sy ULONG_MAX Ns B Pq ulong
Maximum size in bytes of the metadata dbuf cache.
The target size is determined by the smaller of this value and
.No 1/2^ Ns Sy dbuf_metadata_cache_shift Pq 1/64th
of the target ARC size.
The behavior of the metadata dbuf cache and its associated settings
can be observed via the
.Pa /proc/spl/kstat/zfs/dbufstats
kstat.
.
.It Sy dbuf_cache_hiwater_pct Ns = Ns Sy 10 Ns % Pq uint
The percentage over
.Sy dbuf_cache_max_bytes
when dbufs must be evicted directly.
.
.It Sy dbuf_cache_lowater_pct Ns = Ns Sy 10 Ns % Pq uint
The percentage below
.Sy dbuf_cache_max_bytes
when the evict thread stops evicting dbufs.
.
.It Sy dbuf_cache_shift Ns = Ns Sy 5 Pq int
Set the size of the dbuf cache
.Pq Sy dbuf_cache_max_bytes
to a log2 fraction of the target ARC size.
.
.It Sy dbuf_metadata_cache_shift Ns = Ns Sy 6 Pq int
Set the size of the dbuf metadata cache
.Pq Sy dbuf_metadata_cache_max_bytes
to a log2 fraction of the target ARC size.
.
.It Sy dmu_object_alloc_chunk_shift Ns = Ns Sy 7 Po 128 Pc Pq int
dnode slots allocated in a single operation as a power of 2.
The default value minimizes lock contention for the bulk operation performed.
.
.It Sy dmu_prefetch_max Ns = Ns Sy 134217728 Ns B Po 128MB Pc Pq int
Limit the amount we can prefetch with one call to this amount in bytes.
This helps to limit the amount of memory that can be used by prefetching.
.
.It Sy ignore_hole_birth Pq int
Alias for
.Sy send_holes_without_birth_time .
.
.It Sy l2arc_feed_again Ns = Ns Sy 1 Ns | Ns 0 Pq int
Turbo L2ARC warm-up.
When the L2ARC is cold the fill interval will be set as fast as possible.
.
.It Sy l2arc_feed_min_ms Ns = Ns Sy 200 Pq ulong
Min feed interval in milliseconds.
Requires
.Sy l2arc_feed_again Ns = Ns Ar 1
and only applicable while the L2ARC is warming up.
.
.It Sy l2arc_feed_secs Ns = Ns Sy 1 Pq ulong
Seconds between L2ARC writing.
.
.It Sy l2arc_headroom Ns = Ns Sy 2 Pq ulong
How far through the ARC lists to search for L2ARC cacheable content,
expressed as a multiplier of
.Sy l2arc_write_max .
ARC persistence across reboots can be achieved with persistent L2ARC
by setting this parameter to
.Sy 0 ,
allowing the full length of ARC lists to be searched for cacheable content.
.
.It Sy l2arc_headroom_boost Ns = Ns Sy 200 Ns % Pq ulong
Scales
.Sy l2arc_headroom
by this percentage when L2ARC contents are being successfully compressed
before writing.
A value of
.Sy 100
disables this feature.
.
.It Sy l2arc_exclude_special Ns = Ns Sy 0 Ns | Ns 1 Pq int
Controls whether buffers present on special vdevs are eligible for caching
into L2ARC.
If set to 1, exclude dbufs on special vdevs from being cached to L2ARC.
.
.It Sy l2arc_mfuonly Ns = Ns Sy 0 Ns | Ns 1 Pq int
Controls whether only MFU metadata and data are cached from ARC into L2ARC.
This may be desired to avoid wasting space on L2ARC when reading/writing large
amounts of data that are not expected to be accessed more than once.
.Pp
The default is off,
meaning both MRU and MFU data and metadata are cached.
When turning off this feature, some MRU buffers will still be present
in ARC and eventually cached on L2ARC.
.No If Sy l2arc_noprefetch Ns = Ns Sy 0 ,
some prefetched buffers will be cached to L2ARC, and those might later
transition to MRU, in which case the
.Sy l2arc_mru_asize No arcstat will not be Sy 0 .
.Pp
Regardless of
.Sy l2arc_noprefetch ,
some MFU buffers might be evicted from ARC,
accessed later on as prefetches and transition to MRU as prefetches.
If accessed again they are counted as MRU and the
.Sy l2arc_mru_asize No arcstat will not be Sy 0 .
.Pp
The ARC status of L2ARC buffers when they were first cached in
L2ARC can be seen in the
.Sy l2arc_mru_asize , Sy l2arc_mfu_asize , No and Sy l2arc_prefetch_asize
arcstats when importing the pool or onlining a cache
device if persistent L2ARC is enabled.
.Pp
The
.Sy evict_l2_eligible_mru
arcstat does not take this option into account, so the information
provided by the
.Sy evict_l2_eligible_m[rf]u
arcstats can still be used to decide if toggling this option is appropriate
for the current workload.
.
.It Sy l2arc_meta_percent Ns = Ns Sy 33 Ns % Pq int
Percent of ARC size allowed for L2ARC-only headers.
Since L2ARC buffers are not evicted on memory pressure,
too many headers on a system with an irrationally large L2ARC
can render it slow or unusable.
This parameter limits L2ARC writes and rebuilds to achieve the target.
.
.It Sy l2arc_trim_ahead Ns = Ns Sy 0 Ns % Pq ulong
Trims ahead of the current write size
.Pq Sy l2arc_write_max
on L2ARC devices by this percentage of write size if we have filled the device.
If set to
.Sy 100
we TRIM twice the space required to accommodate upcoming writes.
A minimum of
.Sy 64MB
will be trimmed.
It also enables TRIM of the whole L2ARC device upon creation
or addition to an existing pool or if the header of the device is
invalid upon importing a pool or onlining a cache device.
A value of
.Sy 0
disables TRIM on L2ARC altogether and is the default as it can put significant
stress on the underlying storage devices.
This will vary depending on how well the specific device handles these commands.
.
.It Sy l2arc_noprefetch Ns = Ns Sy 1 Ns | Ns 0 Pq int
Do not write buffers to L2ARC if they were prefetched but not used by
applications.
In case there are prefetched buffers in L2ARC and this option
is later set, we do not read the prefetched buffers from L2ARC.
Unsetting this option is useful for caching sequential reads from the
disks to L2ARC and serving those reads from L2ARC later on.
This may be beneficial in case the L2ARC device is significantly faster
in sequential reads than the disks of the pool.
.Pp
Use
.Sy 1
to disable and
.Sy 0
to enable caching/reading prefetches to/from L2ARC.
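.Pp
As a minimal sketch on Linux:
.Bd -literal -compact
# Allow prefetched buffers to be cached to, and later read from, L2ARC.
echo 0 > /sys/module/zfs/parameters/l2arc_noprefetch
.Ed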
.
.It Sy l2arc_norw Ns = Ns Sy 0 Ns | Ns 1 Pq int
No reads during writes.
.
.It Sy l2arc_write_boost Ns = Ns Sy 8388608 Ns B Po 8MB Pc Pq ulong
Cold L2ARC devices will have
.Sy l2arc_write_max
increased by this amount while they remain cold.
.
.It Sy l2arc_write_max Ns = Ns Sy 8388608 Ns B Po 8MB Pc Pq ulong
Max write bytes per interval.
.
.It Sy l2arc_rebuild_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Rebuild the L2ARC when importing a pool (persistent L2ARC).
This can be disabled if there are problems importing a pool
or attaching an L2ARC device (e.g. the L2ARC device is slow
in reading stored log metadata, or the metadata
has become somehow fragmented/unusable).
.
.It Sy l2arc_rebuild_blocks_min_l2size Ns = Ns Sy 1073741824 Ns B Po 1GB Pc Pq ulong
Minimum size of an L2ARC device required in order to write log blocks in it.
The log blocks are used upon importing the pool to rebuild the persistent L2ARC.
.Pp
For L2ARC devices less than 1GB, the amount of data
.Fn l2arc_evict
evicts is significant compared to the amount of restored L2ARC data.
In this case, do not write log blocks in L2ARC in order not to waste space.
.
.It Sy metaslab_aliquot Ns = Ns Sy 524288 Ns B Po 512kB Pc Pq ulong
Metaslab granularity, in bytes.
This is roughly similar to what would be referred to as the "stripe size"
in traditional RAID arrays.
In normal operation, ZFS will try to write this amount of data
to a top-level vdev before moving on to the next one.
.
.It Sy metaslab_bias_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable metaslab group biasing based on their vdevs' over- or under-utilization
relative to the pool.
.
.It Sy metaslab_force_ganging Ns = Ns Sy 16777217 Ns B Po 16MB + 1B Pc Pq ulong
Make some blocks above a certain size be gang blocks.
This option is used by the test suite to facilitate testing.
.
.It Sy zfs_history_output_max Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq int
When attempting to log an output nvlist of an ioctl in the on-disk history,
the output will not be stored if it is larger than this size (in bytes).
This must be less than
.Sy DMU_MAX_ACCESS Pq 64MB .
This applies primarily to
.Fn zfs_ioc_channel_program Pq cf. Xr zfs-program 8 .
.
.It Sy zfs_keep_log_spacemaps_at_export Ns = Ns Sy 0 Ns | Ns 1 Pq int
Prevent log spacemaps from being destroyed during pool exports and destroys.
.
.It Sy zfs_metaslab_segment_weight_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable/disable segment-based metaslab selection.
.
.It Sy zfs_metaslab_switch_threshold Ns = Ns Sy 2 Pq int
When using segment-based metaslab selection, continue allocating
from the active metaslab until this option's
worth of buckets have been exhausted.
.
.It Sy metaslab_debug_load Ns = Ns Sy 0 Ns | Ns 1 Pq int
Load all metaslabs during pool import.
.
.It Sy metaslab_debug_unload Ns = Ns Sy 0 Ns | Ns 1 Pq int
Prevent metaslabs from being unloaded.
.
.It Sy metaslab_fragmentation_factor_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable use of the fragmentation metric in computing metaslab weights.
.
.It Sy metaslab_df_max_search Ns = Ns Sy 16777216 Ns B Po 16MB Pc Pq int
Maximum distance to search forward from the last offset.
Without this limit, fragmented pools can see
.Em >100`000
iterations and
.Fn metaslab_block_picker
becomes the performance limiting factor on high-performance storage.
.Pp
With the default setting of
.Sy 16MB ,
we typically see less than
.Em 500
iterations, even with very fragmented
.Sy ashift Ns = Ns Sy 9
pools.
The maximum number of iterations possible is
.Sy metaslab_df_max_search / 2^(ashift+1) .
With the default setting of
.Sy 16MB
this is
.Em 16*1024 Pq with Sy ashift Ns = Ns Sy 9
or
.Em 2*1024 Pq with Sy ashift Ns = Ns Sy 12 .
.
.It Sy metaslab_df_use_largest_segment Ns = Ns Sy 0 Ns | Ns 1 Pq int
If not searching forward (due to
.Sy metaslab_df_max_search , metaslab_df_free_pct ,
.No or Sy metaslab_df_alloc_threshold ) ,
this tunable controls which segment is used.
If set, we will use the largest free segment.
If unset, we will use a segment of at least the requested size.
.
.It Sy zfs_metaslab_max_size_cache_sec Ns = Ns Sy 3600 Ns s Po 1h Pc Pq ulong
When we unload a metaslab, we cache the size of the largest free chunk.
We use that cached size to determine whether or not to load a metaslab
for a given allocation.
As more frees accumulate in that metaslab while it's unloaded,
the cached max size becomes less and less accurate.
After a number of seconds controlled by this tunable,
we stop considering the cached max size and start
considering only the histogram instead.
.
.It Sy zfs_metaslab_mem_limit Ns = Ns Sy 25 Ns % Pq int
When we are loading a new metaslab, we check the amount of memory being used
to store metaslab range trees.
If it is over a threshold, we attempt to unload the least recently used metaslab
to prevent the system from clogging all of its memory with range trees.
This tunable sets the percentage of total system memory that is the threshold.
.
.It Sy zfs_metaslab_try_hard_before_gang Ns = Ns Sy 0 Ns | Ns 1 Pq int
.Bl -item -compact
.It
If unset, we will first try normal allocation.
.It
If that fails then we will do a gang allocation.
.It
If that fails then we will do a "try hard" gang allocation.
.It
If that fails then we will have a multi-layer gang block.
.El
.Pp
.Bl -item -compact
.It
If set, we will first try normal allocation.
.It
If that fails then we will do a "try hard" allocation.
.It
If that fails we will do a gang allocation.
.It
If that fails we will do a "try hard" gang allocation.
.It
If that fails then we will have a multi-layer gang block.
.El
.
.It Sy zfs_metaslab_find_max_tries Ns = Ns Sy 100 Pq int
When not trying hard, we only consider this number of the best metaslabs.
This improves performance, especially when there are many metaslabs per vdev
and the allocation can't actually be satisfied
(so we would otherwise iterate all metaslabs).
.
.It Sy zfs_vdev_default_ms_count Ns = Ns Sy 200 Pq int
When a vdev is added, target this number of metaslabs per top-level vdev.
.
.It Sy zfs_vdev_default_ms_shift Ns = Ns Sy 29 Po 512MB Pc Pq int
Default limit for metaslab size.
.
.It Sy zfs_vdev_max_auto_ashift Ns = Ns Sy ASHIFT_MAX Po 16 Pc Pq ulong
Maximum ashift used when optimizing for logical \[->] physical sector size on new
top-level vdevs.
.
.It Sy zfs_vdev_min_auto_ashift Ns = Ns Sy ASHIFT_MIN Po 9 Pc Pq ulong
Minimum ashift used when creating new top-level vdevs.
.
.It Sy zfs_vdev_min_ms_count Ns = Ns Sy 16 Pq int
Minimum number of metaslabs to create in a top-level vdev.
.
.It Sy vdev_validate_skip Ns = Ns Sy 0 Ns | Ns 1 Pq int
Skip label validation steps during pool import.
Changing is not recommended unless you know what you're doing
and are recovering a damaged label.
.
.It Sy zfs_vdev_ms_count_limit Ns = Ns Sy 131072 Po 128k Pc Pq int
Practical upper limit of total metaslabs per top-level vdev.
.
.It Sy metaslab_preload_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable metaslab group preloading.
.
.It Sy metaslab_lba_weighting_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Give more weight to metaslabs with lower LBAs,
assuming they have greater bandwidth,
as is typically the case on a modern constant angular velocity disk drive.
.
.It Sy metaslab_unload_delay Ns = Ns Sy 32 Pq int
After a metaslab is used, we keep it loaded for this many TXGs, to attempt to
reduce unnecessary reloading.
Note that both this many TXGs and
.Sy metaslab_unload_delay_ms
milliseconds must pass before unloading will occur.
.
.It Sy metaslab_unload_delay_ms Ns = Ns Sy 600000 Ns ms Po 10min Pc Pq int
After a metaslab is used, we keep it loaded for this many milliseconds,
to attempt to reduce unnecessary reloading.
Note that both this many milliseconds and
.Sy metaslab_unload_delay
TXGs must pass before unloading will occur.
.
.It Sy reference_history Ns = Ns Sy 3 Pq int
Maximum number of reference holders being tracked when
.Sy reference_tracking_enable
is active.
.
.It Sy reference_tracking_enable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Track reference holders to
.Sy refcount_t
objects (debug builds only).
.
.It Sy send_holes_without_birth_time Ns = Ns Sy 1 Ns | Ns 0 Pq int
When set, the
.Sy hole_birth
optimization will not be used, and all holes will always be sent during a
.Nm zfs Cm send .
This is useful if you suspect your datasets are affected by a bug in
.Sy hole_birth .
.
.It Sy spa_config_path Ns = Ns Pa /etc/zfs/zpool.cache Pq charp
SPA config file.
.
.It Sy spa_asize_inflation Ns = Ns Sy 24 Pq int
Multiplication factor used to estimate actual disk consumption from the
size of data being written.
The default value is a worst case estimate,
but lower values may be valid for a given pool depending on its configuration.
Pool administrators who understand the factors involved
may wish to specify a more realistic inflation factor,
particularly if they operate close to quota or capacity limits.
.
.It Sy spa_load_print_vdev_tree Ns = Ns Sy 0 Ns | Ns 1 Pq int
Whether to print the vdev tree in the debugging message buffer during pool import.
.
.It Sy spa_load_verify_data Ns = Ns Sy 1 Ns | Ns 0 Pq int
Whether to traverse data blocks during an "extreme rewind"
.Pq Fl X
import.
.Pp
An extreme rewind import normally performs a full traversal of all
blocks in the pool for verification.
If this parameter is unset, the traversal skips non-metadata blocks.
It can be toggled once the
import has started to stop or start the traversal of non-metadata blocks.
.
.It Sy spa_load_verify_metadata Ns = Ns Sy 1 Ns | Ns 0 Pq int
Whether to traverse blocks during an "extreme rewind"
.Pq Fl X
pool import.
.Pp
An extreme rewind import normally performs a full traversal of all
blocks in the pool for verification.
If this parameter is unset, the traversal is not performed.
It can be toggled once the import has started to stop or start the traversal.
.
.It Sy spa_load_verify_shift Ns = Ns Sy 4 Po 1/16th Pc Pq int
Sets the maximum number of bytes to consume during pool import to the log2
fraction of the target ARC size.
.
.It Sy spa_slop_shift Ns = Ns Sy 5 Po 1/32nd Pc Pq int
Normally, we don't allow the last
.Sy 3.2% Pq Sy 1/2^spa_slop_shift
of space in the pool to be consumed.
This ensures that we don't run the pool completely out of space,
due to unaccounted changes (e.g. to the MOS).
It also limits the worst-case time to allocate space.
If we have less than this amount of free space,
most ZPL operations (e.g. write, create) will return
.Sy ENOSPC .
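.Pp
As a worked example, with the default shift a
.Sy 1TB
pool reserves roughly
.Bd -literal -compact
1TB / 2^5 = 1099511627776 B / 32 = 34359738368 B (32GB, i.e. 3.125%)
.Ed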
.
.It Sy vdev_removal_max_span Ns = Ns Sy 32768 Ns B Po 32kB Pc Pq int
During top-level vdev removal, chunks of data are copied from the vdev
which may include free space in order to trade bandwidth for IOPS.
This parameter determines the maximum span of free space, in bytes,
which will be included as "unnecessary" data in a chunk of copied data.
.Pp
The default value here was chosen to align with
.Sy zfs_vdev_read_gap_limit ,
which is a similar concept when doing
regular reads (but there's no reason it has to be the same).
.
.It Sy vdev_file_logical_ashift Ns = Ns Sy 9 Po 512B Pc Pq ulong
Logical ashift for file-based devices.
.
.It Sy vdev_file_physical_ashift Ns = Ns Sy 9 Po 512B Pc Pq ulong
Physical ashift for file-based devices.
.
.It Sy zap_iterate_prefetch Ns = Ns Sy 1 Ns | Ns 0 Pq int
If set, when we start iterating over a ZAP object,
prefetch the entire object (all leaf blocks).
However, this is limited by
.Sy dmu_prefetch_max .
.
.It Sy zfetch_array_rd_sz Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq ulong
If prefetching is enabled, disable prefetching for reads larger than this size.
.
.It Sy zfetch_max_distance Ns = Ns Sy 8388608 Ns B Po 8MB Pc Pq uint
Max bytes to prefetch per stream.
.
.It Sy zfetch_max_idistance Ns = Ns Sy 67108864 Ns B Po 64MB Pc Pq uint
Max bytes to prefetch indirects for per stream.
.
.It Sy zfetch_max_streams Ns = Ns Sy 8 Pq uint
Max number of streams per zfetch (prefetch streams per file).
.
.It Sy zfetch_min_sec_reap Ns = Ns Sy 2 Pq uint
Min time before an active prefetch stream can be reclaimed.
.
.It Sy zfs_abd_scatter_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Controls whether the ARC may use scatter/gather lists;
if disabled, all allocations are forced to be linear in kernel memory.
Disabling can improve performance in some code paths
at the expense of fragmented kernel memory.
.
.It Sy zfs_abd_scatter_max_order Ns = Ns Sy MAX_ORDER\-1 Pq uint
Maximum number of consecutive memory pages allocated in a single block for
scatter/gather lists.
.Pp
The value of
.Sy MAX_ORDER
depends on kernel configuration.
.
.It Sy zfs_abd_scatter_min_size Ns = Ns Sy 1536 Ns B Po 1.5kB Pc Pq uint
This is the minimum allocation size that will use scatter (page-based) ABDs.
Smaller allocations will use linear ABDs.
.
.It Sy zfs_arc_dnode_limit Ns = Ns Sy 0 Ns B Pq ulong
When the number of bytes consumed by dnodes in the ARC exceeds this number of
bytes, try to unpin some of it in response to demand for non-metadata.
This value acts as a ceiling to the amount of dnode metadata, and defaults to
.Sy 0 ,
which indicates that a percentage of the ARC meta buffers,
based on
.Sy zfs_arc_dnode_limit_percent ,
may be used for dnodes.
.Pp
Also see
.Sy zfs_arc_meta_prune
which serves a similar purpose but is used
when the amount of metadata in the ARC exceeds
.Sy zfs_arc_meta_limit
rather than in response to overall demand for non-metadata.
.
.It Sy zfs_arc_dnode_limit_percent Ns = Ns Sy 10 Ns % Pq ulong
Percentage that can be consumed by dnodes of ARC meta buffers.
.Pp
See also
.Sy zfs_arc_dnode_limit ,
which serves a similar purpose but has a higher priority if nonzero.
.
.It Sy zfs_arc_dnode_reduce_percent Ns = Ns Sy 10 Ns % Pq ulong
Percentage of ARC dnodes to try to scan in response to demand for non-metadata
when the number of bytes consumed by dnodes exceeds
.Sy zfs_arc_dnode_limit .
.
.It Sy zfs_arc_average_blocksize Ns = Ns Sy 8192 Ns B Po 8kB Pc Pq int
The ARC's buffer hash table is sized based on the assumption of an average
block size of this value.
This works out to roughly 1MB of hash table per 1GB of physical memory
with 8-byte pointers.
For configurations with a known larger average block size,
this value can be increased to reduce the memory footprint.
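.Pp
As a worked sketch of that estimate, assuming
.Sy 1GB
of physical memory, the default
.Sy 8kB
average block size, and one 8-byte pointer per bucket:
.Bd -literal -compact
(1GB / 8kB) * 8 B = 131072 * 8 B = 1048576 B (1MB) of hash table
.Ed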
.
.It Sy zfs_arc_eviction_pct Ns = Ns Sy 200 Ns % Pq int
When
.Fn arc_is_overflowing ,
.Fn arc_get_data_impl
waits for this percent of the requested amount of data to be evicted.
For example, by default, for every
.Em 2kB
that's evicted,
.Em 1kB
of it may be "reused" by a new allocation.
Since this is above
.Sy 100 Ns % ,
it ensures that progress is made towards getting
.Sy arc_size No under Sy arc_c .
Since this is finite, it ensures that allocations can still happen,
even during the potentially long time that
.Sy arc_size No is more than Sy arc_c .
.
.It Sy zfs_arc_evict_batch_limit Ns = Ns Sy 10 Pq int
Number of ARC headers to evict per sub-list before proceeding to another sub-list.
This batch-style operation prevents entire sub-lists from being evicted at once
but comes at a cost of additional unlocking and locking.
.
.It Sy zfs_arc_grow_retry Ns = Ns Sy 0 Ns s Pq int
If set to a nonzero value, it will replace the
.Sy arc_grow_retry
value with this value.
The
.Sy arc_grow_retry
.No value Pq default Sy 5 Ns s
is the number of seconds the ARC will wait before
trying to resume growth after a memory pressure event.
.
.It Sy zfs_arc_lotsfree_percent Ns = Ns Sy 10 Ns % Pq int
Throttle I/O when free system memory drops below this percentage of total
system memory.
Setting this value to
.Sy 0
will disable the throttle.
.
.It Sy zfs_arc_max Ns = Ns Sy 0 Ns B Pq ulong
Max size of ARC in bytes.
If
.Sy 0 ,
then the max size of ARC is determined by the amount of system memory installed.
Under Linux, half of system memory will be used as the limit.
Under
.Fx ,
the larger of
.Sy all_system_memory No \- Sy 1GB
and
.Sy 5/8 No \(mu Sy all_system_memory
will be used as the limit.
This value must be at least
.Sy 67108864 Ns B Pq 64MB .
.Pp
This value can be changed dynamically, with some caveats.
It cannot be set back to
.Sy 0
while running, and reducing it below the current ARC size will not cause
the ARC to shrink without memory pressure to induce shrinking.
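.Pp
As a minimal sketch on Linux
.Pq the Sy 8GB No cap is purely illustrative :
.Bd -literal -compact
# Cap the ARC at 8GB at runtime.
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
.Ed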
.
.It Sy zfs_arc_meta_adjust_restarts Ns = Ns Sy 4096 Pq ulong
The number of restart passes to make while scanning the ARC attempting
to free buffers in order to stay below the
.Sy zfs_arc_meta_limit .
This value should not need to be tuned but is available to facilitate
performance analysis.
.
.It Sy zfs_arc_meta_limit Ns = Ns Sy 0 Ns B Pq ulong
The maximum allowed size in bytes that metadata buffers are allowed to
consume in the ARC.
When this limit is reached, metadata buffers will be reclaimed,
even if the overall
.Sy arc_c_max
has not been reached.
It defaults to
.Sy 0 ,
which indicates that a percentage based on
.Sy zfs_arc_meta_limit_percent
of the ARC may be used for metadata.
.Pp
This value may be changed dynamically, except that it must be set to an explicit value
.Pq cannot be set back to Sy 0 .
.
.It Sy zfs_arc_meta_limit_percent Ns = Ns Sy 75 Ns % Pq ulong
Percentage of ARC buffers that can be used for metadata.
.Pp
See also
.Sy zfs_arc_meta_limit ,
which serves a similar purpose but has a higher priority if nonzero.
.
.It Sy zfs_arc_meta_min Ns = Ns Sy 0 Ns B Pq ulong
The minimum allowed size in bytes that metadata buffers may consume in
the ARC.
.
.It Sy zfs_arc_meta_prune Ns = Ns Sy 10000 Pq int
The number of dentries and inodes to be scanned looking for entries
which can be dropped.
This may be required when the ARC reaches the
.Sy zfs_arc_meta_limit
because dentries and inodes can pin buffers in the ARC.
Increasing this value will cause the dentry and inode caches
to be pruned more aggressively.
Setting this value to
.Sy 0
will disable pruning the inode and dentry caches.
.
.It Sy zfs_arc_meta_strategy Ns = Ns Sy 1 Ns | Ns 0 Pq int
Define the strategy for ARC metadata buffer eviction (meta reclaim strategy):
.Bl -tag -compact -offset 4n -width "0 (META_ONLY)"
.It Sy 0 Pq META_ONLY
evict only the ARC metadata buffers
.It Sy 1 Pq BALANCED
additional data buffers may be evicted, if required,
to evict the requested number of metadata buffers.
.El
.
.It Sy zfs_arc_min Ns = Ns Sy 0 Ns B Pq ulong
Min size of ARC in bytes.
.No If set to Sy 0 , arc_c_min
will default to consuming the larger of
.Sy 32MB
and
.Sy all_system_memory No / Sy 32 .
.
.It Sy zfs_arc_min_prefetch_ms Ns = Ns Sy 0 Ns ms Ns Po Ns ≡ Ns 1s Pc Pq int
Minimum time prefetched blocks are locked in the ARC.
.
.It Sy zfs_arc_min_prescient_prefetch_ms Ns = Ns Sy 0 Ns ms Ns Po Ns ≡ Ns 6s Pc Pq int
Minimum time "prescient prefetched" blocks are locked in the ARC.
These blocks are meant to be prefetched fairly aggressively ahead of
the code that may use them.
.
.It Sy zfs_arc_prune_task_threads Ns = Ns Sy 1 Pq int
Number of arc_prune threads.
.Fx
does not need more than one.
Linux may theoretically use one per mount point up to the number of CPUs,
but that was not proven to be useful.
.
.It Sy zfs_max_missing_tvds Ns = Ns Sy 0 Pq int
Number of missing top-level vdevs which will be allowed during
pool import (only in read-only mode).
.
.It Sy zfs_max_nvlist_src_size Ns = Ns Sy 0 Pq ulong
Maximum size in bytes allowed to be passed as
.Sy zc_nvlist_src_size
for ioctls on
.Pa /dev/zfs .
This prevents a user from causing the kernel to allocate
an excessive amount of memory.
When the limit is exceeded, the ioctl fails with
.Sy EINVAL
and a description of the error is sent to the
.Pa zfs-dbgmsg
log.
This parameter should not need to be touched under normal circumstances.
If
.Sy 0 ,
equivalent to a quarter of the user-wired memory limit under
.Fx
and to
.Sy 134217728 Ns B Pq 128MB
under Linux.
.
.It Sy zfs_multilist_num_sublists Ns = Ns Sy 0 Pq int
To allow more fine-grained locking, each ARC state contains a series
of lists for both data and metadata objects.
Locking is performed at the level of these "sub-lists".
This parameter controls the number of sub-lists per ARC state,
and also applies to other uses of the multilist data structure.
.Pp
If
.Sy 0 ,
equivalent to the greater of the number of online CPUs and
.Sy 4 .
.
.It Sy zfs_arc_overflow_shift Ns = Ns Sy 8 Pq int
The ARC size is considered to be overflowing if it exceeds the current
ARC target size
.Pq Sy arc_c
by thresholds determined by this parameter.
Exceeding by
.Sy ( arc_c No >> Sy zfs_arc_overflow_shift ) No / Sy 2
starts the ARC reclamation process.
If that appears insufficient, exceeding by
.Sy ( arc_c No >> Sy zfs_arc_overflow_shift ) No \(mu Sy 1.5
blocks new buffer allocations until the reclaim thread catches up.
Once started, the reclamation process continues until the ARC size returns
below the target size.
.Pp
The default value of
.Sy 8
causes the ARC to start reclamation if it exceeds the target size by
.Em 0.2%
of the target size, and block allocations by
.Em 0.6% .
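.Pp
As a worked example, assuming
.Sy arc_c No = Sy 4GB
and the default shift of
.Sy 8 :
.Bd -literal -compact
arc_c >> 8             = 4294967296 B / 256 = 16MB
reclamation starts at  = 16MB / 2   =  8MB over arc_c (~0.2%)
allocations block at   = 16MB * 1.5 = 24MB over arc_c (~0.6%)
.Ed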
.
.It Sy zfs_arc_p_min_shift Ns = Ns Sy 0 Pq int
If nonzero, this will update
.Sy arc_p_min_shift Pq default Sy 4
with the new value.
.Sy arc_p_min_shift No is used as a shift of Sy arc_c
when calculating the minimum
.Sy arc_p No size.
.
.It Sy zfs_arc_p_dampener_disable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Disable
.Sy arc_p
adapt dampener, which reduces the maximum single adjustment to
.Sy arc_p .
.
.It Sy zfs_arc_shrink_shift Ns = Ns Sy 0 Pq int
If nonzero, this will update
.Sy arc_shrink_shift Pq default Sy 7
with the new value.
.
.It Sy zfs_arc_pc_percent Ns = Ns Sy 0 Ns % Po off Pc Pq uint
Percent of pagecache to reclaim ARC to.
.Pp
This tunable allows the ZFS ARC to play more nicely
with the kernel's LRU pagecache.
It can guarantee that the ARC size won't collapse under scanning
pressure on the pagecache, yet still allows the ARC to be reclaimed down to
.Sy zfs_arc_min
if necessary.
This value is specified as percent of pagecache size (as measured by
.Sy NR_FILE_PAGES ) ,
where that percent may exceed
.Sy 100 .
This
only operates during memory pressure/reclaim.
.
.It Sy zfs_arc_shrinker_limit Ns = Ns Sy 10000 Pq int
This is a limit on how many pages the ARC shrinker makes available for
eviction in response to one page allocation attempt.
Note that in practice, the kernel's shrinker can ask us to evict
up to about four times this for one allocation attempt.
.Pp
The default limit of
.Sy 10000 Pq in practice, Em 160MB No per allocation attempt with 4kB pages
limits the amount of time spent attempting to reclaim ARC memory to
less than 100ms per allocation attempt,
even with a small average compressed block size of ~8kB.
.Pp
The parameter can be set to 0 (zero) to disable the limit,
and only applies on Linux.
.
.It Sy zfs_arc_sys_free Ns = Ns Sy 0 Ns B Pq ulong
The target number of bytes the ARC should leave as free memory on the system.
If zero, equivalent to the bigger of
.Sy 512kB No and Sy all_system_memory/64 .
.
.It Sy zfs_autoimport_disable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Disable pool import at module load by ignoring the cache file
.Pq Sy spa_config_path .
.
.It Sy zfs_checksum_events_per_second Ns = Ns Sy 20 Ns /s Pq uint
Rate limit checksum events to this many per second.
Note that this should not be set below the ZED thresholds
(currently 10 checksums over 10 seconds)
or else the daemon may not trigger any action.
.
.It Sy zfs_commit_timeout_pct Ns = Ns Sy 5 Ns % Pq int
This controls the amount of time that a ZIL block (lwb) will remain "open"
when it isn't "full", and it has a thread waiting for it to be committed to
stable storage.
The timeout is scaled based on a percentage of the last lwb
latency to avoid significantly impacting the latency of each individual
transaction record (itx).
.
.It Sy zfs_condense_indirect_commit_entry_delay_ms Ns = Ns Sy 0 Ns ms Pq int
Vdev indirection layer (used for device removal) sleeps for this many
milliseconds during mapping generation.
Intended for use with the test suite to throttle vdev removal speed.
.
.It Sy zfs_condense_indirect_obsolete_pct Ns = Ns Sy 25 Ns % Pq int
Minimum percent of obsolete bytes in vdev mapping required to attempt to condense
.Pq see Sy zfs_condense_indirect_vdevs_enable .
Intended for use with the test suite
to facilitate triggering condensing as needed.
.
.It Sy zfs_condense_indirect_vdevs_enable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable condensing indirect vdev mappings.
When set, attempt to condense indirect vdev mappings
if the mapping uses more than
.Sy zfs_condense_min_mapping_bytes
bytes of memory and if the obsolete space map object uses more than
.Sy zfs_condense_max_obsolete_bytes
bytes on-disk.
The condensing process is an attempt to save memory by removing obsolete mappings.
.
.It Sy zfs_condense_max_obsolete_bytes Ns = Ns Sy 1073741824 Ns B Po 1GB Pc Pq ulong
Only attempt to condense indirect vdev mappings if the on-disk size
of the obsolete space map object is greater than this number of bytes
.Pq see Sy zfs_condense_indirect_vdevs_enable .
.
.It Sy zfs_condense_min_mapping_bytes Ns = Ns Sy 131072 Ns B Po 128kB Pc Pq ulong
Minimum size vdev mapping to attempt to condense
.Pq see Sy zfs_condense_indirect_vdevs_enable .
.
.It Sy zfs_dbgmsg_enable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Internally ZFS keeps a small log to facilitate debugging.
The log is enabled by default, and can be disabled by unsetting this option.
The contents of the log can be accessed by reading
.Pa /proc/spl/kstat/zfs/dbgmsg .
Writing
.Sy 0
to the file clears the log.
.Pp
This setting does not influence debug prints due to
.Sy zfs_flags .
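.Pp
As a minimal sketch on Linux:
.Bd -literal -compact
# Read the internal debug log.
cat /proc/spl/kstat/zfs/dbgmsg

# Clear it.
echo 0 > /proc/spl/kstat/zfs/dbgmsg
.Ed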
.
.It Sy zfs_dbgmsg_maxsize Ns = Ns Sy 4194304 Ns B Po 4MB Pc Pq int
Maximum size of the internal ZFS debug log.
.
.It Sy zfs_dbuf_state_index Ns = Ns Sy 0 Pq int
Historically used for controlling what reporting was available under
.Pa /proc/spl/kstat/zfs .
No effect.
.
.It Sy zfs_deadman_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
When a pool sync operation takes longer than
.Sy zfs_deadman_synctime_ms ,
or when an individual I/O operation takes longer than
.Sy zfs_deadman_ziotime_ms ,
then the operation is considered to be "hung".
If
.Sy zfs_deadman_enabled
is set, then the deadman behavior is invoked as described by
.Sy zfs_deadman_failmode .
By default, the deadman is enabled and set to
.Sy wait
which results in "hung" I/Os only being logged.
The deadman is automatically disabled when a pool gets suspended.
.
.It Sy zfs_deadman_failmode Ns = Ns Sy wait Pq charp
Controls the failure behavior when the deadman detects a "hung" I/O operation.
Valid values are:
.Bl -tag -compact -offset 4n -width "continue"
.It Sy wait
Wait for a "hung" operation to complete.
For each "hung" operation a "deadman" event will be posted
describing that operation.
.It Sy continue
Attempt to recover from a "hung" operation by re-dispatching it
to the I/O pipeline if possible.
.It Sy panic
Panic the system.
This can be used to facilitate automatic fail-over
to a properly configured fail-over partner.
.El
9023ff01b23SMartin Matuska.
9033ff01b23SMartin Matuska.It Sy zfs_deadman_checktime_ms Ns = Ns Sy 60000 Ns ms Po 1min Pc Pq int
9043ff01b23SMartin MatuskaCheck time in milliseconds.
9053ff01b23SMartin MatuskaThis defines the frequency at which we check for hung I/O requests
9063ff01b23SMartin Matuskaand potentially invoke the
9073ff01b23SMartin Matuska.Sy zfs_deadman_failmode
9083ff01b23SMartin Matuskabehavior.
9093ff01b23SMartin Matuska.
9103ff01b23SMartin Matuska.It Sy zfs_deadman_synctime_ms Ns = Ns Sy 600000 Ns ms Po 10min Pc Pq ulong
9113ff01b23SMartin MatuskaInterval in milliseconds after which the deadman is triggered and also
9123ff01b23SMartin Matuskathe interval after which a pool sync operation is considered to be "hung".
9133ff01b23SMartin MatuskaOnce this limit is exceeded the deadman will be invoked every
9143ff01b23SMartin Matuska.Sy zfs_deadman_checktime_ms
9153ff01b23SMartin Matuskamilliseconds until the pool sync completes.
9163ff01b23SMartin Matuska.
9173ff01b23SMartin Matuska.It Sy zfs_deadman_ziotime_ms Ns = Ns Sy 300000 Ns ms Po 5min Pc Pq ulong
9183ff01b23SMartin MatuskaInterval in milliseconds after which the deadman is triggered and an
9193ff01b23SMartin Matuskaindividual I/O operation is considered to be "hung".
9203ff01b23SMartin MatuskaAs long as the operation remains "hung",
9213ff01b23SMartin Matuskathe deadman will be invoked every
9223ff01b23SMartin Matuska.Sy zfs_deadman_checktime_ms
9233ff01b23SMartin Matuskamilliseconds until the operation completes.
9243ff01b23SMartin Matuska.
9253ff01b23SMartin Matuska.It Sy zfs_dedup_prefetch Ns = Ns Sy 0 Ns | Ns 1 Pq int
9263ff01b23SMartin MatuskaEnable prefetching dedup-ed blocks which are going to be freed.
9273ff01b23SMartin Matuska.
9283ff01b23SMartin Matuska.It Sy zfs_delay_min_dirty_percent Ns = Ns Sy 60 Ns % Pq int
9293ff01b23SMartin MatuskaStart to delay each transaction once there is this amount of dirty data,
9303ff01b23SMartin Matuskaexpressed as a percentage of
9313ff01b23SMartin Matuska.Sy zfs_dirty_data_max .
9323ff01b23SMartin MatuskaThis value should be at least
9333ff01b23SMartin Matuska.Sy zfs_vdev_async_write_active_max_dirty_percent .
9343ff01b23SMartin Matuska.No See Sx ZFS TRANSACTION DELAY .
9353ff01b23SMartin Matuska.
9363ff01b23SMartin Matuska.It Sy zfs_delay_scale Ns = Ns Sy 500000 Pq int
9373ff01b23SMartin MatuskaThis controls how quickly the transaction delay approaches infinity.
9383ff01b23SMartin MatuskaLarger values cause longer delays for a given amount of dirty data.
9393ff01b23SMartin Matuska.Pp
9403ff01b23SMartin MatuskaFor the smoothest delay, this value should be about 1 billion divided
9413ff01b23SMartin Matuskaby the maximum number of operations per second.
The system will then smoothly handle load between one tenth and ten times this rate.
9433ff01b23SMartin Matuska.No See Sx ZFS TRANSACTION DELAY .
9443ff01b23SMartin Matuska.Pp
945*e92ffd9bSMartin Matuska.Sy zfs_delay_scale No \(mu Sy zfs_dirty_data_max Em must No be smaller than Sy 2^64 .
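.Pp
As a rough sizing sketch (not part of the module), the guidance above can
be expressed as:
.Bd -literal -compact
# Hypothetical helper: pick zfs_delay_scale for a target peak operation
# rate, per the "1 billion divided by the maximum number of operations
# per second" guidance above.
def suggest_delay_scale(max_ops_per_second):
    return 10**9 // max_ops_per_second

print(suggest_delay_scale(2000))   # 500000, the default
.Ed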
9463ff01b23SMartin Matuska.
9473ff01b23SMartin Matuska.It Sy zfs_disable_ivset_guid_check Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disables the requirement for IVset GUIDs to be present and match when doing a raw
9493ff01b23SMartin Matuskareceive of encrypted datasets.
9503ff01b23SMartin MatuskaIntended for users whose pools were created with
9513ff01b23SMartin MatuskaOpenZFS pre-release versions and now have compatibility issues.
9523ff01b23SMartin Matuska.
9533ff01b23SMartin Matuska.It Sy zfs_key_max_salt_uses Ns = Ns Sy 400000000 Po 4*10^8 Pc Pq ulong
9543ff01b23SMartin MatuskaMaximum number of uses of a single salt value before generating a new one for
9553ff01b23SMartin Matuskaencrypted datasets.
9563ff01b23SMartin MatuskaThe default value is also the maximum.
9573ff01b23SMartin Matuska.
9583ff01b23SMartin Matuska.It Sy zfs_object_mutex_size Ns = Ns Sy 64 Pq uint
9593ff01b23SMartin MatuskaSize of the znode hashtable used for holds.
9603ff01b23SMartin Matuska.Pp
Due to the need to hold locks on objects that may not exist yet, kernel mutexes
are not created per-object; instead, a hashtable is used, where collisions
may cause objects to wait on one another even when there is no actual
contention on the same object.
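.Pp
A minimal sketch of this lock-striping idea (illustrative only, not the
kernel implementation):
.Bd -literal -compact
# Object numbers hash into a fixed pool of mutexes, so two objects may
# share a lock (a benign collision) without needing a mutex per object.
import threading

SIZE = 64    # zfs_object_mutex_size
locks = [threading.Lock() for _ in range(SIZE)]

def object_lock(object_number):
    return locks[hash(object_number) % SIZE]

with object_lock(1234):
    pass   # hold object 1234; object 1298 would wait on the same slot
.Ed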
9653ff01b23SMartin Matuska.
9663ff01b23SMartin Matuska.It Sy zfs_slow_io_events_per_second Ns = Ns Sy 20 Ns /s Pq int
9673ff01b23SMartin MatuskaRate limit delay and deadman zevents (which report slow I/Os) to this many per
9683ff01b23SMartin Matuskasecond.
9693ff01b23SMartin Matuska.
9703ff01b23SMartin Matuska.It Sy zfs_unflushed_max_mem_amt Ns = Ns Sy 1073741824 Ns B Po 1GB Pc Pq ulong
9713ff01b23SMartin MatuskaUpper-bound limit for unflushed metadata changes to be held by the
9723ff01b23SMartin Matuskalog spacemap in memory, in bytes.
9733ff01b23SMartin Matuska.
9743ff01b23SMartin Matuska.It Sy zfs_unflushed_max_mem_ppm Ns = Ns Sy 1000 Ns ppm Po 0.1% Pc Pq ulong
9753ff01b23SMartin MatuskaPart of overall system memory that ZFS allows to be used
9763ff01b23SMartin Matuskafor unflushed metadata changes by the log spacemap, in millionths.
9773ff01b23SMartin Matuska.
9783ff01b23SMartin Matuska.It Sy zfs_unflushed_log_block_max Ns = Ns Sy 262144 Po 256k Pc Pq ulong
9793ff01b23SMartin MatuskaDescribes the maximum number of log spacemap blocks allowed for each pool.
9803ff01b23SMartin MatuskaThe default value means that the space in all the log spacemaps
9813ff01b23SMartin Matuskacan add up to no more than
9823ff01b23SMartin Matuska.Sy 262144
9833ff01b23SMartin Matuskablocks (which means
9843ff01b23SMartin Matuska.Em 32GB
9853ff01b23SMartin Matuskaof logical space before compression and ditto blocks,
9863ff01b23SMartin Matuskaassuming that blocksize is
9873ff01b23SMartin Matuska.Em 128kB ) .
9883ff01b23SMartin Matuska.Pp
9893ff01b23SMartin MatuskaThis tunable is important because it involves a trade-off between import
9903ff01b23SMartin Matuskatime after an unclean export and the frequency of flushing metaslabs.
9913ff01b23SMartin MatuskaThe higher this number is, the more log blocks we allow when the pool is
9923ff01b23SMartin Matuskaactive which means that we flush metaslabs less often and thus decrease
9933ff01b23SMartin Matuskathe number of I/Os for spacemap updates per TXG.
9943ff01b23SMartin MatuskaAt the same time though, that means that in the event of an unclean export,
9953ff01b23SMartin Matuskathere will be more log spacemap blocks for us to read, inducing overhead
9963ff01b23SMartin Matuskain the import time of the pool.
The lower the number, the more flushing increases, destroying log
blocks more quickly as they become obsolete, which leaves fewer blocks
to be read during import after a crash.
10003ff01b23SMartin Matuska.Pp
10013ff01b23SMartin MatuskaEach log spacemap block existing during pool import leads to approximately
10023ff01b23SMartin Matuskaone extra logical I/O issued.
10033ff01b23SMartin MatuskaThis is the reason why this tunable is exposed in terms of blocks rather
10043ff01b23SMartin Matuskathan space used.
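.Pp
To make the default concrete (a back-of-the-envelope check, assuming the
128kB blocksize mentioned above):
.Bd -literal -compact
# Logical space covered by the default log spacemap block budget,
# before compression and ditto blocks.
blocks = 262144             # zfs_unflushed_log_block_max default
blocksize = 128 * 1024      # 128 kB
print(blocks * blocksize / 2**30)   # 32.0 (GiB)
.Ed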
10053ff01b23SMartin Matuska.
10063ff01b23SMartin Matuska.It Sy zfs_unflushed_log_block_min Ns = Ns Sy 1000 Pq ulong
10073ff01b23SMartin MatuskaIf the number of metaslabs is small and our incoming rate is high,
we could get into a situation where we are flushing all our metaslabs every TXG.
10093ff01b23SMartin MatuskaThus we always allow at least this many log blocks.
10103ff01b23SMartin Matuska.
10113ff01b23SMartin Matuska.It Sy zfs_unflushed_log_block_pct Ns = Ns Sy 400 Ns % Pq ulong
10123ff01b23SMartin MatuskaTunable used to determine the number of blocks that can be used for
10133ff01b23SMartin Matuskathe spacemap log, expressed as a percentage of the total number of
10143ff01b23SMartin Matuskametaslabs in the pool.
10153ff01b23SMartin Matuska.
10163ff01b23SMartin Matuska.It Sy zfs_unlink_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq uint
10173ff01b23SMartin MatuskaWhen enabled, files will not be asynchronously removed from the list of pending
10183ff01b23SMartin Matuskaunlinks and the space they consume will be leaked.
10193ff01b23SMartin MatuskaOnce this option has been disabled and the dataset is remounted,
10203ff01b23SMartin Matuskathe pending unlinks will be processed and the freed space returned to the pool.
10213ff01b23SMartin MatuskaThis option is used by the test suite.
10223ff01b23SMartin Matuska.
10233ff01b23SMartin Matuska.It Sy zfs_delete_blocks Ns = Ns Sy 20480 Pq ulong
This is used to define a large file for the purposes of deletion.
Files containing more than
.Sy zfs_delete_blocks
blocks will be deleted asynchronously, while smaller files are deleted synchronously.
10283ff01b23SMartin MatuskaDecreasing this value will reduce the time spent in an
10293ff01b23SMartin Matuska.Xr unlink 2
10303ff01b23SMartin Matuskasystem call, at the expense of a longer delay before the freed space is available.
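.Pp
As a back-of-the-envelope illustration (assuming a hypothetical
.Em 128kB
recordsize):
.Bd -literal -compact
# With the default 20480-block cutoff and a 128 kB recordsize, files
# over ~2.5 GiB are unlinked asynchronously.
blocks = 20480              # zfs_delete_blocks default
recordsize = 128 * 1024
print(blocks * recordsize / 2**30)   # 2.5 (GiB)
.Ed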
10313ff01b23SMartin Matuska.
10323ff01b23SMartin Matuska.It Sy zfs_dirty_data_max Ns = Pq int
10333ff01b23SMartin MatuskaDetermines the dirty space limit in bytes.
10343ff01b23SMartin MatuskaOnce this limit is exceeded, new writes are halted until space frees up.
10353ff01b23SMartin MatuskaThis parameter takes precedence over
10363ff01b23SMartin Matuska.Sy zfs_dirty_data_max_percent .
10373ff01b23SMartin Matuska.No See Sx ZFS TRANSACTION DELAY .
10383ff01b23SMartin Matuska.Pp
10393ff01b23SMartin MatuskaDefaults to
10403ff01b23SMartin Matuska.Sy physical_ram/10 ,
10413ff01b23SMartin Matuskacapped at
10423ff01b23SMartin Matuska.Sy zfs_dirty_data_max_max .
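.Pp
A worked example of these defaults (the host size is hypothetical):
.Bd -literal -compact
# One tenth of physical RAM, capped at zfs_dirty_data_max_max
# (one quarter of RAM by default), for a 64 GiB host.
physical_ram = 64 * 2**30
dirty_max_max = physical_ram // 4          # zfs_dirty_data_max_max
dirty_max = min(physical_ram // 10, dirty_max_max)
print(dirty_max)                           # 6871947673 bytes, ~6.4 GiB
.Ed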
10433ff01b23SMartin Matuska.
10443ff01b23SMartin Matuska.It Sy zfs_dirty_data_max_max Ns = Pq int
10453ff01b23SMartin MatuskaMaximum allowable value of
10463ff01b23SMartin Matuska.Sy zfs_dirty_data_max ,
10473ff01b23SMartin Matuskaexpressed in bytes.
10483ff01b23SMartin MatuskaThis limit is only enforced at module load time, and will be ignored if
10493ff01b23SMartin Matuska.Sy zfs_dirty_data_max
10503ff01b23SMartin Matuskais later changed.
10513ff01b23SMartin MatuskaThis parameter takes precedence over
10523ff01b23SMartin Matuska.Sy zfs_dirty_data_max_max_percent .
10533ff01b23SMartin Matuska.No See Sx ZFS TRANSACTION DELAY .
10543ff01b23SMartin Matuska.Pp
10553ff01b23SMartin MatuskaDefaults to
.Sy physical_ram/4 .
10573ff01b23SMartin Matuska.
10583ff01b23SMartin Matuska.It Sy zfs_dirty_data_max_max_percent Ns = Ns Sy 25 Ns % Pq int
10593ff01b23SMartin MatuskaMaximum allowable value of
10603ff01b23SMartin Matuska.Sy zfs_dirty_data_max ,
10613ff01b23SMartin Matuskaexpressed as a percentage of physical RAM.
10623ff01b23SMartin MatuskaThis limit is only enforced at module load time, and will be ignored if
10633ff01b23SMartin Matuska.Sy zfs_dirty_data_max
10643ff01b23SMartin Matuskais later changed.
10653ff01b23SMartin MatuskaThe parameter
10663ff01b23SMartin Matuska.Sy zfs_dirty_data_max_max
10673ff01b23SMartin Matuskatakes precedence over this one.
10683ff01b23SMartin Matuska.No See Sx ZFS TRANSACTION DELAY .
10693ff01b23SMartin Matuska.
10703ff01b23SMartin Matuska.It Sy zfs_dirty_data_max_percent Ns = Ns Sy 10 Ns % Pq int
10713ff01b23SMartin MatuskaDetermines the dirty space limit, expressed as a percentage of all memory.
10723ff01b23SMartin MatuskaOnce this limit is exceeded, new writes are halted until space frees up.
10733ff01b23SMartin MatuskaThe parameter
10743ff01b23SMartin Matuska.Sy zfs_dirty_data_max
10753ff01b23SMartin Matuskatakes precedence over this one.
10763ff01b23SMartin Matuska.No See Sx ZFS TRANSACTION DELAY .
10773ff01b23SMartin Matuska.Pp
10783ff01b23SMartin MatuskaSubject to
10793ff01b23SMartin Matuska.Sy zfs_dirty_data_max_max .
10803ff01b23SMartin Matuska.
10813ff01b23SMartin Matuska.It Sy zfs_dirty_data_sync_percent Ns = Ns Sy 20 Ns % Pq int
10823ff01b23SMartin MatuskaStart syncing out a transaction group if there's at least this much dirty data
10833ff01b23SMartin Matuska.Pq as a percentage of Sy zfs_dirty_data_max .
10843ff01b23SMartin MatuskaThis should be less than
10853ff01b23SMartin Matuska.Sy zfs_vdev_async_write_active_min_dirty_percent .
10863ff01b23SMartin Matuska.
10873f9d360cSMartin Matuska.It Sy zfs_wrlog_data_max Ns = Pq int
The upper limit of write-transaction ZIL log data size in bytes.
Once it is reached, write operations are blocked
until log data is cleared out after the transaction group sync.
Because of some overhead, it should be set to at least twice the size of
.Sy zfs_dirty_data_max
to prevent harming normal write throughput.
It should also be smaller than the size of the slog device, if a slog is present.
.Pp
Defaults to
.Sy zfs_dirty_data_max*2 .
10983f9d360cSMartin Matuska.
10993ff01b23SMartin Matuska.It Sy zfs_fallocate_reserve_percent Ns = Ns Sy 110 Ns % Pq uint
11003ff01b23SMartin MatuskaSince ZFS is a copy-on-write filesystem with snapshots, blocks cannot be
11013ff01b23SMartin Matuskapreallocated for a file in order to guarantee that later writes will not
11023ff01b23SMartin Matuskarun out of space.
11033ff01b23SMartin MatuskaInstead,
11043ff01b23SMartin Matuska.Xr fallocate 2
11053ff01b23SMartin Matuskaspace preallocation only checks that sufficient space is currently available
11063ff01b23SMartin Matuskain the pool or the user's project quota allocation,
11073ff01b23SMartin Matuskaand then creates a sparse file of the requested size.
11083ff01b23SMartin MatuskaThe requested space is multiplied by
11093ff01b23SMartin Matuska.Sy zfs_fallocate_reserve_percent
11103ff01b23SMartin Matuskato allow additional space for indirect blocks and other internal metadata.
11113ff01b23SMartin MatuskaSetting this to
11123ff01b23SMartin Matuska.Sy 0
11133ff01b23SMartin Matuskadisables support for
11143ff01b23SMartin Matuska.Xr fallocate 2
11153ff01b23SMartin Matuskaand causes it to return
11163ff01b23SMartin Matuska.Sy EOPNOTSUPP .
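.Pp
A hedged sketch of this check (the names are illustrative, not the
in-kernel ones):
.Bd -literal -compact
# A request is accepted only if size * zfs_fallocate_reserve_percent
# / 100 fits in the currently available space; 0 disables fallocate(2).
def fallocate_ok(requested, available, reserve_percent=110):
    if reserve_percent == 0:
        return False    # fallocate(2) would return EOPNOTSUPP
    return requested * reserve_percent // 100 <= available

print(fallocate_ok(10 * 2**30, 11 * 2**30))   # True: 110% headroom fits
print(fallocate_ok(10 * 2**30, 10 * 2**30))   # False: no metadata headroom
.Ed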
11173ff01b23SMartin Matuska.
11183ff01b23SMartin Matuska.It Sy zfs_fletcher_4_impl Ns = Ns Sy fastest Pq string
11193ff01b23SMartin MatuskaSelect a fletcher 4 implementation.
11203ff01b23SMartin Matuska.Pp
11213ff01b23SMartin MatuskaSupported selectors are:
11223ff01b23SMartin Matuska.Sy fastest , scalar , sse2 , ssse3 , avx2 , avx512f , avx512bw ,
11233ff01b23SMartin Matuska.No and Sy aarch64_neon .
11243ff01b23SMartin MatuskaAll except
11253ff01b23SMartin Matuska.Sy fastest No and Sy scalar
11263ff01b23SMartin Matuskarequire instruction set extensions to be available,
11273ff01b23SMartin Matuskaand will only appear if ZFS detects that they are present at runtime.
11283ff01b23SMartin MatuskaIf multiple implementations of fletcher 4 are available, the
11293ff01b23SMartin Matuska.Sy fastest
11303ff01b23SMartin Matuskawill be chosen using a micro benchmark.
11313ff01b23SMartin MatuskaSelecting
11323ff01b23SMartin Matuska.Sy scalar
11333ff01b23SMartin Matuskaresults in the original CPU-based calculation being used.
11343ff01b23SMartin MatuskaSelecting any option other than
11353ff01b23SMartin Matuska.Sy fastest No or Sy scalar
11363ff01b23SMartin Matuskaresults in vector instructions
11373ff01b23SMartin Matuskafrom the respective CPU instruction set being used.
11383ff01b23SMartin Matuska.
11393ff01b23SMartin Matuska.It Sy zfs_free_bpobj_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
11403ff01b23SMartin MatuskaEnable/disable the processing of the free_bpobj object.
11413ff01b23SMartin Matuska.
11423ff01b23SMartin Matuska.It Sy zfs_async_block_max_blocks Ns = Ns Sy ULONG_MAX Po unlimited Pc Pq ulong
11433ff01b23SMartin MatuskaMaximum number of blocks freed in a single TXG.
11443ff01b23SMartin Matuska.
11453ff01b23SMartin Matuska.It Sy zfs_max_async_dedup_frees Ns = Ns Sy 100000 Po 10^5 Pc Pq ulong
11463ff01b23SMartin MatuskaMaximum number of dedup blocks freed in a single TXG.
11473ff01b23SMartin Matuska.
11483ff01b23SMartin Matuska.It Sy zfs_vdev_async_read_max_active Ns = Ns Sy 3 Pq int
11493ff01b23SMartin MatuskaMaximum asynchronous read I/O operations active to each device.
11503ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11513ff01b23SMartin Matuska.
11523ff01b23SMartin Matuska.It Sy zfs_vdev_async_read_min_active Ns = Ns Sy 1 Pq int
Minimum asynchronous read I/O operations active to each device.
11543ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11553ff01b23SMartin Matuska.
11563ff01b23SMartin Matuska.It Sy zfs_vdev_async_write_active_max_dirty_percent Ns = Ns Sy 60 Ns % Pq int
11573ff01b23SMartin MatuskaWhen the pool has more than this much dirty data, use
11583ff01b23SMartin Matuska.Sy zfs_vdev_async_write_max_active
11593ff01b23SMartin Matuskato limit active async writes.
11603ff01b23SMartin MatuskaIf the dirty data is between the minimum and maximum,
11613ff01b23SMartin Matuskathe active I/O limit is linearly interpolated.
11623ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11633ff01b23SMartin Matuska.
11643ff01b23SMartin Matuska.It Sy zfs_vdev_async_write_active_min_dirty_percent Ns = Ns Sy 30 Ns % Pq int
11653ff01b23SMartin MatuskaWhen the pool has less than this much dirty data, use
11663ff01b23SMartin Matuska.Sy zfs_vdev_async_write_min_active
11673ff01b23SMartin Matuskato limit active async writes.
11683ff01b23SMartin MatuskaIf the dirty data is between the minimum and maximum,
11693ff01b23SMartin Matuskathe active I/O limit is linearly
11703ff01b23SMartin Matuskainterpolated.
11713ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
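.Pp
A small sketch of this interpolation, using the default thresholds and
limits (illustrative only):
.Bd -literal -compact
# Active async-write limit as a function of the pool's dirty-data
# percentage, linearly interpolated between the min and max thresholds.
def async_write_limit(dirty_pct, min_active=2, max_active=30,
                      min_dirty=30, max_dirty=60):
    if dirty_pct <= min_dirty:
        return min_active
    if dirty_pct >= max_dirty:
        return max_active
    span = (dirty_pct - min_dirty) / (max_dirty - min_dirty)
    return round(min_active + span * (max_active - min_active))

print(async_write_limit(45))   # 16: halfway between the thresholds
.Ed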
11723ff01b23SMartin Matuska.
11733ff01b23SMartin Matuska.It Sy zfs_vdev_async_write_max_active Ns = Ns Sy 30 Pq int
11743ff01b23SMartin MatuskaMaximum asynchronous write I/O operations active to each device.
11753ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11763ff01b23SMartin Matuska.
11773ff01b23SMartin Matuska.It Sy zfs_vdev_async_write_min_active Ns = Ns Sy 2 Pq int
11783ff01b23SMartin MatuskaMinimum asynchronous write I/O operations active to each device.
11793ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11803ff01b23SMartin Matuska.Pp
11813ff01b23SMartin MatuskaLower values are associated with better latency on rotational media but poorer
11823ff01b23SMartin Matuskaresilver performance.
11833ff01b23SMartin MatuskaThe default value of
11843ff01b23SMartin Matuska.Sy 2
11853ff01b23SMartin Matuskawas chosen as a compromise.
11863ff01b23SMartin MatuskaA value of
11873ff01b23SMartin Matuska.Sy 3
11883ff01b23SMartin Matuskahas been shown to improve resilver performance further at a cost of
11893ff01b23SMartin Matuskafurther increasing latency.
11903ff01b23SMartin Matuska.
11913ff01b23SMartin Matuska.It Sy zfs_vdev_initializing_max_active Ns = Ns Sy 1 Pq int
11923ff01b23SMartin MatuskaMaximum initializing I/O operations active to each device.
11933ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11943ff01b23SMartin Matuska.
11953ff01b23SMartin Matuska.It Sy zfs_vdev_initializing_min_active Ns = Ns Sy 1 Pq int
11963ff01b23SMartin MatuskaMinimum initializing I/O operations active to each device.
11973ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
11983ff01b23SMartin Matuska.
11993ff01b23SMartin Matuska.It Sy zfs_vdev_max_active Ns = Ns Sy 1000 Pq int
12003ff01b23SMartin MatuskaThe maximum number of I/O operations active to each device.
12013ff01b23SMartin MatuskaIdeally, this will be at least the sum of each queue's
12023ff01b23SMartin Matuska.Sy max_active .
12033ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12043ff01b23SMartin Matuska.
12053ff01b23SMartin Matuska.It Sy zfs_vdev_rebuild_max_active Ns = Ns Sy 3 Pq int
12063ff01b23SMartin MatuskaMaximum sequential resilver I/O operations active to each device.
12073ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12083ff01b23SMartin Matuska.
12093ff01b23SMartin Matuska.It Sy zfs_vdev_rebuild_min_active Ns = Ns Sy 1 Pq int
12103ff01b23SMartin MatuskaMinimum sequential resilver I/O operations active to each device.
12113ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12123ff01b23SMartin Matuska.
12133ff01b23SMartin Matuska.It Sy zfs_vdev_removal_max_active Ns = Ns Sy 2 Pq int
12143ff01b23SMartin MatuskaMaximum removal I/O operations active to each device.
12153ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12163ff01b23SMartin Matuska.
12173ff01b23SMartin Matuska.It Sy zfs_vdev_removal_min_active Ns = Ns Sy 1 Pq int
12183ff01b23SMartin MatuskaMinimum removal I/O operations active to each device.
12193ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12203ff01b23SMartin Matuska.
12213ff01b23SMartin Matuska.It Sy zfs_vdev_scrub_max_active Ns = Ns Sy 2 Pq int
12223ff01b23SMartin MatuskaMaximum scrub I/O operations active to each device.
12233ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12243ff01b23SMartin Matuska.
12253ff01b23SMartin Matuska.It Sy zfs_vdev_scrub_min_active Ns = Ns Sy 1 Pq int
12263ff01b23SMartin MatuskaMinimum scrub I/O operations active to each device.
12273ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12283ff01b23SMartin Matuska.
12293ff01b23SMartin Matuska.It Sy zfs_vdev_sync_read_max_active Ns = Ns Sy 10 Pq int
12303ff01b23SMartin MatuskaMaximum synchronous read I/O operations active to each device.
12313ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12323ff01b23SMartin Matuska.
12333ff01b23SMartin Matuska.It Sy zfs_vdev_sync_read_min_active Ns = Ns Sy 10 Pq int
12343ff01b23SMartin MatuskaMinimum synchronous read I/O operations active to each device.
12353ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12363ff01b23SMartin Matuska.
12373ff01b23SMartin Matuska.It Sy zfs_vdev_sync_write_max_active Ns = Ns Sy 10 Pq int
12383ff01b23SMartin MatuskaMaximum synchronous write I/O operations active to each device.
12393ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12403ff01b23SMartin Matuska.
12413ff01b23SMartin Matuska.It Sy zfs_vdev_sync_write_min_active Ns = Ns Sy 10 Pq int
12423ff01b23SMartin MatuskaMinimum synchronous write I/O operations active to each device.
12433ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12443ff01b23SMartin Matuska.
12453ff01b23SMartin Matuska.It Sy zfs_vdev_trim_max_active Ns = Ns Sy 2 Pq int
12463ff01b23SMartin MatuskaMaximum trim/discard I/O operations active to each device.
12473ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12483ff01b23SMartin Matuska.
12493ff01b23SMartin Matuska.It Sy zfs_vdev_trim_min_active Ns = Ns Sy 1 Pq int
12503ff01b23SMartin MatuskaMinimum trim/discard I/O operations active to each device.
12513ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12523ff01b23SMartin Matuska.
12533ff01b23SMartin Matuska.It Sy zfs_vdev_nia_delay Ns = Ns Sy 5 Pq int
12543ff01b23SMartin MatuskaFor non-interactive I/O (scrub, resilver, removal, initialize and rebuild),
12553ff01b23SMartin Matuskathe number of concurrently-active I/O operations is limited to
12563ff01b23SMartin Matuska.Sy zfs_*_min_active ,
12573ff01b23SMartin Matuskaunless the vdev is "idle".
1258*e92ffd9bSMartin MatuskaWhen there are no interactive I/O operations active (synchronous or otherwise),
12593ff01b23SMartin Matuskaand
12603ff01b23SMartin Matuska.Sy zfs_vdev_nia_delay
12613ff01b23SMartin Matuskaoperations have completed since the last interactive operation,
12623ff01b23SMartin Matuskathen the vdev is considered to be "idle",
12633ff01b23SMartin Matuskaand the number of concurrently-active non-interactive operations is increased to
12643ff01b23SMartin Matuska.Sy zfs_*_max_active .
12653ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12663ff01b23SMartin Matuska.
12673ff01b23SMartin Matuska.It Sy zfs_vdev_nia_credit Ns = Ns Sy 5 Pq int
Some HDDs tend to prioritize sequential I/O so strongly that concurrent
12693ff01b23SMartin Matuskarandom I/O latency reaches several seconds.
12703ff01b23SMartin MatuskaOn some HDDs this happens even if sequential I/O operations
12713ff01b23SMartin Matuskaare submitted one at a time, and so setting
12723ff01b23SMartin Matuska.Sy zfs_*_max_active Ns = Sy 1
12733ff01b23SMartin Matuskadoes not help.
12743ff01b23SMartin MatuskaTo prevent non-interactive I/O, like scrub,
12753ff01b23SMartin Matuskafrom monopolizing the device, no more than
.Sy zfs_vdev_nia_credit No operations can be sent
12773ff01b23SMartin Matuskawhile there are outstanding incomplete interactive operations.
12783ff01b23SMartin MatuskaThis enforced wait ensures the HDD services the interactive I/O
12793ff01b23SMartin Matuskawithin a reasonable amount of time.
12803ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
12813ff01b23SMartin Matuska.
12823ff01b23SMartin Matuska.It Sy zfs_vdev_queue_depth_pct Ns = Ns Sy 1000 Ns % Pq int
Maximum number of queued allocations per top-level vdev, expressed as
12843ff01b23SMartin Matuskaa percentage of
12853ff01b23SMartin Matuska.Sy zfs_vdev_async_write_max_active ,
12863ff01b23SMartin Matuskawhich allows the system to detect devices that are more capable
12873ff01b23SMartin Matuskaof handling allocations and to allocate more blocks to those devices.
12883ff01b23SMartin MatuskaThis allows for dynamic allocation distribution when devices are imbalanced,
12893ff01b23SMartin Matuskaas fuller devices will tend to be slower than empty devices.
12903ff01b23SMartin Matuska.Pp
12913ff01b23SMartin MatuskaAlso see
12923ff01b23SMartin Matuska.Sy zio_dva_throttle_enabled .
12933ff01b23SMartin Matuska.
12943ff01b23SMartin Matuska.It Sy zfs_expire_snapshot Ns = Ns Sy 300 Ns s Pq int
12953ff01b23SMartin MatuskaTime before expiring
12963ff01b23SMartin Matuska.Pa .zfs/snapshot .
12973ff01b23SMartin Matuska.
12983ff01b23SMartin Matuska.It Sy zfs_admin_snapshot Ns = Ns Sy 0 Ns | Ns 1 Pq int
12993ff01b23SMartin MatuskaAllow the creation, removal, or renaming of entries in the
13003ff01b23SMartin Matuska.Sy .zfs/snapshot
13013ff01b23SMartin Matuskadirectory to cause the creation, destruction, or renaming of snapshots.
13023ff01b23SMartin MatuskaWhen enabled, this functionality works both locally and over NFS exports
13033ff01b23SMartin Matuskawhich have the
13043ff01b23SMartin Matuska.Em no_root_squash
13053ff01b23SMartin Matuskaoption set.
13063ff01b23SMartin Matuska.
13073ff01b23SMartin Matuska.It Sy zfs_flags Ns = Ns Sy 0 Pq int
13083ff01b23SMartin MatuskaSet additional debugging flags.
13093ff01b23SMartin MatuskaThe following flags may be bitwise-ored together:
13103ff01b23SMartin Matuska.TS
13113ff01b23SMartin Matuskabox;
13123ff01b23SMartin Matuskalbz r l l .
13133ff01b23SMartin Matuska	Value	Symbolic Name	Description
13143ff01b23SMartin Matuska_
13153ff01b23SMartin Matuska	1	ZFS_DEBUG_DPRINTF	Enable dprintf entries in the debug log.
13163ff01b23SMartin Matuska*	2	ZFS_DEBUG_DBUF_VERIFY	Enable extra dbuf verifications.
13173ff01b23SMartin Matuska*	4	ZFS_DEBUG_DNODE_VERIFY	Enable extra dnode verifications.
13183ff01b23SMartin Matuska	8	ZFS_DEBUG_SNAPNAMES	Enable snapshot name verification.
13193ff01b23SMartin Matuska	16	ZFS_DEBUG_MODIFY	Check for illegally modified ARC buffers.
13203ff01b23SMartin Matuska	64	ZFS_DEBUG_ZIO_FREE	Enable verification of block frees.
13213ff01b23SMartin Matuska	128	ZFS_DEBUG_HISTOGRAM_VERIFY	Enable extra spacemap histogram verifications.
13223ff01b23SMartin Matuska	256	ZFS_DEBUG_METASLAB_VERIFY	Verify space accounting on disk matches in-memory \fBrange_trees\fP.
13233ff01b23SMartin Matuska	512	ZFS_DEBUG_SET_ERROR	Enable \fBSET_ERROR\fP and dprintf entries in the debug log.
13243ff01b23SMartin Matuska	1024	ZFS_DEBUG_INDIRECT_REMAP	Verify split blocks created by device removal.
13253ff01b23SMartin Matuska	2048	ZFS_DEBUG_TRIM	Verify TRIM ranges are always within the allocatable range tree.
13263ff01b23SMartin Matuska	4096	ZFS_DEBUG_LOG_SPACEMAP	Verify that the log summary is consistent with the spacemap log
13273ff01b23SMartin Matuska			       and enable \fBzfs_dbgmsgs\fP for metaslab loading and flushing.
13283ff01b23SMartin Matuska.TE
13293ff01b23SMartin Matuska.Sy \& * No Requires debug build.
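.Pp
For example, flag values from the table are simply OR-ed together
(a hypothetical selection):
.Bd -literal -compact
# Enable snapshot-name verification together with SET_ERROR logging.
ZFS_DEBUG_SNAPNAMES = 8
ZFS_DEBUG_SET_ERROR = 512
print(ZFS_DEBUG_SNAPNAMES | ZFS_DEBUG_SET_ERROR)   # 520
.Ed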
13303ff01b23SMartin Matuska.
13313ff01b23SMartin Matuska.It Sy zfs_free_leak_on_eio Ns = Ns Sy 0 Ns | Ns 1 Pq int
13323ff01b23SMartin MatuskaIf destroy encounters an
13333ff01b23SMartin Matuska.Sy EIO
13343ff01b23SMartin Matuskawhile reading metadata (e.g. indirect blocks),
space referenced by the missing metadata cannot be freed.
13363ff01b23SMartin MatuskaNormally this causes the background destroy to become "stalled",
13373ff01b23SMartin Matuskaas it is unable to make forward progress.
13383ff01b23SMartin MatuskaWhile in this stalled state, all remaining space to free
13393ff01b23SMartin Matuskafrom the error-encountering filesystem is "temporarily leaked".
13403ff01b23SMartin MatuskaSet this flag to cause it to ignore the
13413ff01b23SMartin Matuska.Sy EIO ,
permanently leak the space from indirect blocks that cannot be read,
13433ff01b23SMartin Matuskaand continue to free everything else that it can.
13443ff01b23SMartin Matuska.Pp
13453ff01b23SMartin MatuskaThe default "stalling" behavior is useful if the storage partially
13463ff01b23SMartin Matuskafails (i.e. some but not all I/O operations fail), and then later recovers.
13473ff01b23SMartin MatuskaIn this case, we will be able to continue pool operations while it is
13483ff01b23SMartin Matuskapartially failed, and when it recovers, we can continue to free the
13493ff01b23SMartin Matuskaspace, with no leaks.
13503ff01b23SMartin MatuskaNote, however, that this case is actually fairly rare.
13513ff01b23SMartin Matuska.Pp
13523ff01b23SMartin MatuskaTypically pools either
13533ff01b23SMartin Matuska.Bl -enum -compact -offset 4n -width "1."
13543ff01b23SMartin Matuska.It
13553ff01b23SMartin Matuskafail completely (but perhaps temporarily,
13563ff01b23SMartin Matuskae.g. due to a top-level vdev going offline), or
13573ff01b23SMartin Matuska.It
13583ff01b23SMartin Matuskahave localized, permanent errors (e.g. disk returns the wrong data
13593ff01b23SMartin Matuskadue to bit flip or firmware bug).
13603ff01b23SMartin Matuska.El
13613ff01b23SMartin MatuskaIn the former case, this setting does not matter because the
13623ff01b23SMartin Matuskapool will be suspended and the sync thread will not be able to make
13633ff01b23SMartin Matuskaforward progress regardless.
13643ff01b23SMartin MatuskaIn the latter, because the error is permanent, the best we can do
13653ff01b23SMartin Matuskais leak the minimum amount of space,
13663ff01b23SMartin Matuskawhich is what setting this flag will do.
13673ff01b23SMartin MatuskaIt is therefore reasonable for this flag to normally be set,
13683ff01b23SMartin Matuskabut we chose the more conservative approach of not setting it,
13693ff01b23SMartin Matuskaso that there is no possibility of
13703ff01b23SMartin Matuskaleaking space in the "partial temporary" failure case.
13713ff01b23SMartin Matuska.
13723ff01b23SMartin Matuska.It Sy zfs_free_min_time_ms Ns = Ns Sy 1000 Ns ms Po 1s Pc Pq int
13733ff01b23SMartin MatuskaDuring a
13743ff01b23SMartin Matuska.Nm zfs Cm destroy
13753ff01b23SMartin Matuskaoperation using the
13763ff01b23SMartin Matuska.Sy async_destroy
13773ff01b23SMartin Matuskafeature,
13783ff01b23SMartin Matuskaa minimum of this much time will be spent working on freeing blocks per TXG.
13793ff01b23SMartin Matuska.
13803ff01b23SMartin Matuska.It Sy zfs_obsolete_min_time_ms Ns = Ns Sy 500 Ns ms Pq int
13813ff01b23SMartin MatuskaSimilar to
13823ff01b23SMartin Matuska.Sy zfs_free_min_time_ms ,
13833ff01b23SMartin Matuskabut for cleanup of old indirection records for removed vdevs.
13843ff01b23SMartin Matuska.
13853ff01b23SMartin Matuska.It Sy zfs_immediate_write_sz Ns = Ns Sy 32768 Ns B Po 32kB Pc Pq long
13863ff01b23SMartin MatuskaLargest data block to write to the ZIL.
13873ff01b23SMartin MatuskaLarger blocks will be treated as if the dataset being written to had the
13883ff01b23SMartin Matuska.Sy logbias Ns = Ns Sy throughput
13893ff01b23SMartin Matuskaproperty set.
13903ff01b23SMartin Matuska.
13913ff01b23SMartin Matuska.It Sy zfs_initialize_value Ns = Ns Sy 16045690984833335022 Po 0xDEADBEEFDEADBEEE Pc Pq ulong
13923ff01b23SMartin MatuskaPattern written to vdev free space by
13933ff01b23SMartin Matuska.Xr zpool-initialize 8 .
13943ff01b23SMartin Matuska.
13953ff01b23SMartin Matuska.It Sy zfs_initialize_chunk_size Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq ulong
13963ff01b23SMartin MatuskaSize of writes used by
13973ff01b23SMartin Matuska.Xr zpool-initialize 8 .
13983ff01b23SMartin MatuskaThis option is used by the test suite.
13993ff01b23SMartin Matuska.
14003ff01b23SMartin Matuska.It Sy zfs_livelist_max_entries Ns = Ns Sy 500000 Po 5*10^5 Pc Pq ulong
14013ff01b23SMartin MatuskaThe threshold size (in block pointers) at which we create a new sub-livelist.
14023ff01b23SMartin MatuskaLarger sublists are more costly from a memory perspective but the fewer
14033ff01b23SMartin Matuskasublists there are, the lower the cost of insertion.
14043ff01b23SMartin Matuska.
14053ff01b23SMartin Matuska.It Sy zfs_livelist_min_percent_shared Ns = Ns Sy 75 Ns % Pq int
14063ff01b23SMartin MatuskaIf the amount of shared space between a snapshot and its clone drops below
14073ff01b23SMartin Matuskathis threshold, the clone turns off the livelist and reverts to the old
14083ff01b23SMartin Matuskadeletion method.
This is in place because livelists no longer give us a benefit
14103ff01b23SMartin Matuskaonce a clone has been overwritten enough.
14113ff01b23SMartin Matuska.
14123ff01b23SMartin Matuska.It Sy zfs_livelist_condense_new_alloc Ns = Ns Sy 0 Pq int
14133ff01b23SMartin MatuskaIncremented each time an extra ALLOC blkptr is added to a livelist entry while
14143ff01b23SMartin Matuskait is being condensed.
14153ff01b23SMartin MatuskaThis option is used by the test suite to track race conditions.
14163ff01b23SMartin Matuska.
14173ff01b23SMartin Matuska.It Sy zfs_livelist_condense_sync_cancel Ns = Ns Sy 0 Pq int
14183ff01b23SMartin MatuskaIncremented each time livelist condensing is canceled while in
14193ff01b23SMartin Matuska.Fn spa_livelist_condense_sync .
14203ff01b23SMartin MatuskaThis option is used by the test suite to track race conditions.
14213ff01b23SMartin Matuska.
14223ff01b23SMartin Matuska.It Sy zfs_livelist_condense_sync_pause Ns = Ns Sy 0 Ns | Ns 1 Pq int
14233ff01b23SMartin MatuskaWhen set, the livelist condense process pauses indefinitely before
1424*e92ffd9bSMartin Matuskaexecuting the synctask \(em
14253ff01b23SMartin Matuska.Fn spa_livelist_condense_sync .
14263ff01b23SMartin MatuskaThis option is used by the test suite to trigger race conditions.
14273ff01b23SMartin Matuska.
14283ff01b23SMartin Matuska.It Sy zfs_livelist_condense_zthr_cancel Ns = Ns Sy 0 Pq int
14293ff01b23SMartin MatuskaIncremented each time livelist condensing is canceled while in
14303ff01b23SMartin Matuska.Fn spa_livelist_condense_cb .
14313ff01b23SMartin MatuskaThis option is used by the test suite to track race conditions.
14323ff01b23SMartin Matuska.
14333ff01b23SMartin Matuska.It Sy zfs_livelist_condense_zthr_pause Ns = Ns Sy 0 Ns | Ns 1 Pq int
14343ff01b23SMartin MatuskaWhen set, the livelist condense process pauses indefinitely before
14353ff01b23SMartin Matuskaexecuting the open context condensing work in
14363ff01b23SMartin Matuska.Fn spa_livelist_condense_cb .
14373ff01b23SMartin MatuskaThis option is used by the test suite to trigger race conditions.
14383ff01b23SMartin Matuska.
14393ff01b23SMartin Matuska.It Sy zfs_lua_max_instrlimit Ns = Ns Sy 100000000 Po 10^8 Pc Pq ulong
14403ff01b23SMartin MatuskaThe maximum execution time limit that can be set for a ZFS channel program,
14413ff01b23SMartin Matuskaspecified as a number of Lua instructions.
14423ff01b23SMartin Matuska.
14433ff01b23SMartin Matuska.It Sy zfs_lua_max_memlimit Ns = Ns Sy 104857600 Po 100MB Pc Pq ulong
14443ff01b23SMartin MatuskaThe maximum memory limit that can be set for a ZFS channel program, specified
14453ff01b23SMartin Matuskain bytes.
14463ff01b23SMartin Matuska.
14473ff01b23SMartin Matuska.It Sy zfs_max_dataset_nesting Ns = Ns Sy 50 Pq int
14483ff01b23SMartin MatuskaThe maximum depth of nested datasets.
14493ff01b23SMartin MatuskaThis value can be tuned temporarily to
14503ff01b23SMartin Matuskafix existing datasets that exceed the predefined limit.
14513ff01b23SMartin Matuska.
14523ff01b23SMartin Matuska.It Sy zfs_max_log_walking Ns = Ns Sy 5 Pq ulong
14533ff01b23SMartin MatuskaThe number of past TXGs that the flushing algorithm of the log spacemap
14543ff01b23SMartin Matuskafeature uses to estimate incoming log blocks.
14553ff01b23SMartin Matuska.
14563ff01b23SMartin Matuska.It Sy zfs_max_logsm_summary_length Ns = Ns Sy 10 Pq ulong
14573ff01b23SMartin MatuskaMaximum number of rows allowed in the summary of the spacemap log.
14583ff01b23SMartin Matuska.
14593ff01b23SMartin Matuska.It Sy zfs_max_recordsize Ns = Ns Sy 1048576 Po 1MB Pc Pq int
14603ff01b23SMartin MatuskaWe currently support block sizes from
14613ff01b23SMartin Matuska.Em 512B No to Em 16MB .
14623ff01b23SMartin MatuskaThe benefits of larger blocks, and thus larger I/O,
14633ff01b23SMartin Matuskaneed to be weighed against the cost of COWing a giant block to modify one byte.
14643ff01b23SMartin MatuskaAdditionally, very large blocks can have an impact on I/O latency,
14653ff01b23SMartin Matuskaand also potentially on the memory allocator.
14663ff01b23SMartin MatuskaTherefore, we do not allow the recordsize to be set larger than this tunable.
14673ff01b23SMartin MatuskaLarger blocks can be created by changing it,
14683ff01b23SMartin Matuskaand pools with larger blocks can always be imported and used,
14693ff01b23SMartin Matuskaregardless of this setting.
14703ff01b23SMartin Matuska.
14713ff01b23SMartin Matuska.It Sy zfs_allow_redacted_dataset_mount Ns = Ns Sy 0 Ns | Ns 1 Pq int
14723ff01b23SMartin MatuskaAllow datasets received with redacted send/receive to be mounted.
14733ff01b23SMartin MatuskaNormally disabled because these datasets may be missing key data.
14743ff01b23SMartin Matuska.
14753ff01b23SMartin Matuska.It Sy zfs_min_metaslabs_to_flush Ns = Ns Sy 1 Pq ulong
14763ff01b23SMartin MatuskaMinimum number of metaslabs to flush per dirty TXG.
14773ff01b23SMartin Matuska.
14783ff01b23SMartin Matuska.It Sy zfs_metaslab_fragmentation_threshold Ns = Ns Sy 70 Ns % Pq int
14793ff01b23SMartin MatuskaAllow metaslabs to keep their active state as long as their fragmentation
14803ff01b23SMartin Matuskapercentage is no more than this value.
14813ff01b23SMartin MatuskaAn active metaslab that exceeds this threshold
will no longer keep its active status, allowing better metaslabs to be selected.
14833ff01b23SMartin Matuska.
14843ff01b23SMartin Matuska.It Sy zfs_mg_fragmentation_threshold Ns = Ns Sy 95 Ns % Pq int
14853ff01b23SMartin MatuskaMetaslab groups are considered eligible for allocations if their
14863ff01b23SMartin Matuskafragmentation metric (measured as a percentage) is less than or equal to
14873ff01b23SMartin Matuskathis value.
14883ff01b23SMartin MatuskaIf a metaslab group exceeds this threshold then it will be
14893ff01b23SMartin Matuskaskipped unless all metaslab groups within the metaslab class have also
14903ff01b23SMartin Matuskacrossed this threshold.
14913ff01b23SMartin Matuska.
14923ff01b23SMartin Matuska.It Sy zfs_mg_noalloc_threshold Ns = Ns Sy 0 Ns % Pq int
14933ff01b23SMartin MatuskaDefines a threshold at which metaslab groups should be eligible for allocations.
14943ff01b23SMartin MatuskaThe value is expressed as a percentage of free space
14953ff01b23SMartin Matuskabeyond which a metaslab group is always eligible for allocations.
14963ff01b23SMartin MatuskaIf a metaslab group's free space is less than or equal to the
14973ff01b23SMartin Matuskathreshold, the allocator will avoid allocating to that group
14983ff01b23SMartin Matuskaunless all groups in the pool have reached the threshold.
14993ff01b23SMartin MatuskaOnce all groups have reached the threshold, all groups are allowed to accept
15003ff01b23SMartin Matuskaallocations.
15013ff01b23SMartin MatuskaThe default value of
15023ff01b23SMartin Matuska.Sy 0
15033ff01b23SMartin Matuskadisables the feature and causes all metaslab groups to be eligible for allocations.
15043ff01b23SMartin Matuska.Pp
15053ff01b23SMartin MatuskaThis parameter allows one to deal with pools having heavily imbalanced
15063ff01b23SMartin Matuskavdevs such as would be the case when a new vdev has been added.
15073ff01b23SMartin MatuskaSetting the threshold to a non-zero percentage will stop allocations
15083ff01b23SMartin Matuskafrom being made to vdevs that aren't filled to the specified percentage
15093ff01b23SMartin Matuskaand allow lesser filled vdevs to acquire more allocations than they
15103ff01b23SMartin Matuskaotherwise would under the old
15113ff01b23SMartin Matuska.Sy zfs_mg_alloc_failures
15123ff01b23SMartin Matuskafacility.
15133ff01b23SMartin Matuska.
15143ff01b23SMartin Matuska.It Sy zfs_ddt_data_is_special Ns = Ns Sy 1 Ns | Ns 0 Pq int
15153ff01b23SMartin MatuskaIf enabled, ZFS will place DDT data into the special allocation class.
15163ff01b23SMartin Matuska.
15173ff01b23SMartin Matuska.It Sy zfs_user_indirect_is_special Ns = Ns Sy 1 Ns | Ns 0 Pq int
15183ff01b23SMartin MatuskaIf enabled, ZFS will place user data indirect blocks
15193ff01b23SMartin Matuskainto the special allocation class.
15203ff01b23SMartin Matuska.
15213ff01b23SMartin Matuska.It Sy zfs_multihost_history Ns = Ns Sy 0 Pq int
15223ff01b23SMartin MatuskaHistorical statistics for this many latest multihost updates will be available in
15233ff01b23SMartin Matuska.Pa /proc/spl/kstat/zfs/ Ns Ao Ar pool Ac Ns Pa /multihost .
15243ff01b23SMartin Matuska.
15253ff01b23SMartin Matuska.It Sy zfs_multihost_interval Ns = Ns Sy 1000 Ns ms Po 1s Pc Pq ulong
15263ff01b23SMartin MatuskaUsed to control the frequency of multihost writes which are performed when the
15273ff01b23SMartin Matuska.Sy multihost
15283ff01b23SMartin Matuskapool property is on.
15293ff01b23SMartin MatuskaThis is one of the factors used to determine the
15303ff01b23SMartin Matuskalength of the activity check during import.
15313ff01b23SMartin Matuska.Pp
15323ff01b23SMartin MatuskaThe multihost write period is
1533*e92ffd9bSMartin Matuska.Sy zfs_multihost_interval No / Sy leaf-vdevs .
15343ff01b23SMartin MatuskaOn average a multihost write will be issued for each leaf vdev
15353ff01b23SMartin Matuskaevery
15363ff01b23SMartin Matuska.Sy zfs_multihost_interval
15373ff01b23SMartin Matuskamilliseconds.
15383ff01b23SMartin MatuskaIn practice, the observed period can vary with the I/O load
15393ff01b23SMartin Matuskaand this observed value is the delay which is stored in the uberblock.
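.Pp
A worked example of the write period (the pool layout is hypothetical):
.Bd -literal -compact
# With the default 1000 ms interval and 8 leaf vdevs, some leaf is
# written every 125 ms, and each leaf roughly once per second.
interval_ms = 1000    # zfs_multihost_interval
leaf_vdevs = 8
print(interval_ms / leaf_vdevs)   # 125.0 ms between successive writes
.Ed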
15403ff01b23SMartin Matuska.
15413ff01b23SMartin Matuska.It Sy zfs_multihost_import_intervals Ns = Ns Sy 20 Pq uint
15423ff01b23SMartin MatuskaUsed to control the duration of the activity test on import.
15433ff01b23SMartin MatuskaSmaller values of
15443ff01b23SMartin Matuska.Sy zfs_multihost_import_intervals
15453ff01b23SMartin Matuskawill reduce the import time but increase
15463ff01b23SMartin Matuskathe risk of failing to detect an active pool.
15473ff01b23SMartin MatuskaThe total activity check time is never allowed to drop below one second.
15483ff01b23SMartin Matuska.Pp
15493ff01b23SMartin MatuskaOn import the activity check waits a minimum amount of time determined by
1550*e92ffd9bSMartin Matuska.Sy zfs_multihost_interval No \(mu Sy zfs_multihost_import_intervals ,
15513ff01b23SMartin Matuskaor the same product computed on the host which last had the pool imported,
15523ff01b23SMartin Matuskawhichever is greater.
15533ff01b23SMartin MatuskaThe activity check time may be further extended if the value of MMP
15543ff01b23SMartin Matuskadelay found in the best uberblock indicates actual multihost updates happened
15553ff01b23SMartin Matuskaat longer intervals than
15563ff01b23SMartin Matuska.Sy zfs_multihost_interval .
15573ff01b23SMartin MatuskaA minimum of
15583ff01b23SMartin Matuska.Em 100ms
15593ff01b23SMartin Matuskais enforced.
15603ff01b23SMartin Matuska.Pp
15613ff01b23SMartin Matuska.Sy 0 No is equivalent to Sy 1 .
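.Pp
Using the defaults, the minimum wait works out as follows (a sketch,
ignoring the MMP-delay extension described above):
.Bd -literal -compact
# Minimum activity-check duration on import.
interval_ms = 1000        # zfs_multihost_interval
import_intervals = 20     # zfs_multihost_import_intervals
print(interval_ms * import_intervals / 1000)   # 20.0 seconds
.Ed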
15623ff01b23SMartin Matuska.
15633ff01b23SMartin Matuska.It Sy zfs_multihost_fail_intervals Ns = Ns Sy 10 Pq uint
15643ff01b23SMartin MatuskaControls the behavior of the pool when multihost write failures or delays are
15653ff01b23SMartin Matuskadetected.
15663ff01b23SMartin Matuska.Pp
15673ff01b23SMartin MatuskaWhen
15683ff01b23SMartin Matuska.Sy 0 ,
15693ff01b23SMartin Matuskamultihost write failures or delays are ignored.
The failures will still be reported to the ZED, which, depending on
its configuration, may take action such as suspending the pool or offlining a
device.
15733ff01b23SMartin Matuska.Pp
15743ff01b23SMartin MatuskaOtherwise, the pool will be suspended if
1575*e92ffd9bSMartin Matuska.Sy zfs_multihost_fail_intervals No \(mu Sy zfs_multihost_interval
15763ff01b23SMartin Matuskamilliseconds pass without a successful MMP write.
15773ff01b23SMartin MatuskaThis guarantees the activity test will see MMP writes if the pool is imported.
15783ff01b23SMartin Matuska.Sy 1 No is equivalent to Sy 2 ;
15793ff01b23SMartin Matuskathis is necessary to prevent the pool from being suspended
15803ff01b23SMartin Matuskadue to normal, small I/O latency variations.
15813ff01b23SMartin Matuska.
15823ff01b23SMartin Matuska.It Sy zfs_no_scrub_io Ns = Ns Sy 0 Ns | Ns 1 Pq int
15833ff01b23SMartin MatuskaSet to disable scrub I/O.
15843ff01b23SMartin MatuskaThis results in scrubs not actually scrubbing data and
15853ff01b23SMartin Matuskasimply doing a metadata crawl of the pool instead.
15863ff01b23SMartin Matuska.
15873ff01b23SMartin Matuska.It Sy zfs_no_scrub_prefetch Ns = Ns Sy 0 Ns | Ns 1 Pq int
15883ff01b23SMartin MatuskaSet to disable block prefetching for scrubs.
15893ff01b23SMartin Matuska.
15903ff01b23SMartin Matuska.It Sy zfs_nocacheflush Ns = Ns Sy 0 Ns | Ns 1 Pq int
15913ff01b23SMartin MatuskaDisable cache flush operations on disks when writing.
15923ff01b23SMartin MatuskaSetting this will cause pool corruption on power loss
15933ff01b23SMartin Matuskaif a volatile out-of-order write cache is enabled.
15943ff01b23SMartin Matuska.
15953ff01b23SMartin Matuska.It Sy zfs_nopwrite_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
15963ff01b23SMartin MatuskaAllow no-operation writes.
15973ff01b23SMartin MatuskaThe occurrence of nopwrites will further depend on other pool properties
15983ff01b23SMartin Matuska.Pq i.a. the checksumming and compression algorithms .
15993ff01b23SMartin Matuska.
1600681ce946SMartin Matuska.It Sy zfs_dmu_offset_next_sync Ns = Ns Sy 1 Ns | Ns 0 Pq int
16013ff01b23SMartin MatuskaEnable forcing TXG sync to find holes.
When enabled, forces ZFS to sync data when
.Sy SEEK_HOLE No or Sy SEEK_DATA
flags are used, allowing holes in a file to be accurately reported.
When disabled, holes will not be reported in recently dirtied files.
16063ff01b23SMartin Matuska.
16073ff01b23SMartin Matuska.It Sy zfs_pd_bytes_max Ns = Ns Sy 52428800 Ns B Po 50MB Pc Pq int
16083ff01b23SMartin MatuskaThe number of bytes which should be prefetched during a pool traversal, like
16093ff01b23SMartin Matuska.Nm zfs Cm send
16103ff01b23SMartin Matuskaor other data crawling operations.
16113ff01b23SMartin Matuska.
16123ff01b23SMartin Matuska.It Sy zfs_traverse_indirect_prefetch_limit Ns = Ns Sy 32 Pq int
The number of blocks pointed to by an indirect (non-L0) block which should be
16143ff01b23SMartin Matuskaprefetched during a pool traversal, like
16153ff01b23SMartin Matuska.Nm zfs Cm send
16163ff01b23SMartin Matuskaor other data crawling operations.
16173ff01b23SMartin Matuska.
16183ff01b23SMartin Matuska.It Sy zfs_per_txg_dirty_frees_percent Ns = Ns Sy 5 Ns % Pq ulong
16193ff01b23SMartin MatuskaControl percentage of dirtied indirect blocks from frees allowed into one TXG.
16203ff01b23SMartin MatuskaAfter this threshold is crossed, additional frees will wait until the next TXG.
16213ff01b23SMartin Matuska.Sy 0 No disables this throttle.
16223ff01b23SMartin Matuska.
16233ff01b23SMartin Matuska.It Sy zfs_prefetch_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
16243ff01b23SMartin MatuskaDisable predictive prefetch.
Note that it leaves "prescient" prefetch (e.g.\&
16263ff01b23SMartin Matuska.Nm zfs Cm send )
16273ff01b23SMartin Matuskaintact.
16283ff01b23SMartin MatuskaUnlike predictive prefetch, prescient prefetch never issues I/O
16293ff01b23SMartin Matuskathat ends up not being needed, so it can't hurt performance.
16303ff01b23SMartin Matuska.
16313ff01b23SMartin Matuska.It Sy zfs_qat_checksum_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
16323ff01b23SMartin MatuskaDisable QAT hardware acceleration for SHA256 checksums.
16333ff01b23SMartin MatuskaMay be unset after the ZFS modules have been loaded to initialize the QAT
16343ff01b23SMartin Matuskahardware as long as support is compiled in and the QAT driver is present.
16353ff01b23SMartin Matuska.
16363ff01b23SMartin Matuska.It Sy zfs_qat_compress_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
16373ff01b23SMartin MatuskaDisable QAT hardware acceleration for gzip compression.
16383ff01b23SMartin MatuskaMay be unset after the ZFS modules have been loaded to initialize the QAT
16393ff01b23SMartin Matuskahardware as long as support is compiled in and the QAT driver is present.
16403ff01b23SMartin Matuska.
16413ff01b23SMartin Matuska.It Sy zfs_qat_encrypt_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
16423ff01b23SMartin MatuskaDisable QAT hardware acceleration for AES-GCM encryption.
16433ff01b23SMartin MatuskaMay be unset after the ZFS modules have been loaded to initialize the QAT
16443ff01b23SMartin Matuskahardware as long as support is compiled in and the QAT driver is present.
16453ff01b23SMartin Matuska.
16463ff01b23SMartin Matuska.It Sy zfs_vnops_read_chunk_size Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq long
16473ff01b23SMartin MatuskaBytes to read per chunk.
16483ff01b23SMartin Matuska.
16493ff01b23SMartin Matuska.It Sy zfs_read_history Ns = Ns Sy 0 Pq int
16503ff01b23SMartin MatuskaHistorical statistics for this many latest reads will be available in
16513ff01b23SMartin Matuska.Pa /proc/spl/kstat/zfs/ Ns Ao Ar pool Ac Ns Pa /reads .
16523ff01b23SMartin Matuska.
16533ff01b23SMartin Matuska.It Sy zfs_read_history_hits Ns = Ns Sy 0 Ns | Ns 1 Pq int
Include cache hits in read history.
16553ff01b23SMartin Matuska.
16563ff01b23SMartin Matuska.It Sy zfs_rebuild_max_segment Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq ulong
16573ff01b23SMartin MatuskaMaximum read segment size to issue when sequentially resilvering a
16583ff01b23SMartin Matuskatop-level vdev.
16593ff01b23SMartin Matuska.
16603ff01b23SMartin Matuska.It Sy zfs_rebuild_scrub_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
16613ff01b23SMartin MatuskaAutomatically start a pool scrub when the last active sequential resilver
16623ff01b23SMartin Matuskacompletes in order to verify the checksums of all blocks which have been
16633ff01b23SMartin Matuskaresilvered.
16643ff01b23SMartin MatuskaThis is enabled by default and strongly recommended.
16653ff01b23SMartin Matuska.
16663ff01b23SMartin Matuska.It Sy zfs_rebuild_vdev_limit Ns = Ns Sy 33554432 Ns B Po 32MB Pc Pq ulong
16673ff01b23SMartin MatuskaMaximum amount of I/O that can be concurrently issued for a sequential
16683ff01b23SMartin Matuskaresilver per leaf device, given in bytes.
16693ff01b23SMartin Matuska.
16703ff01b23SMartin Matuska.It Sy zfs_reconstruct_indirect_combinations_max Ns = Ns Sy 4096 Pq int
16713ff01b23SMartin MatuskaIf an indirect split block contains more than this many possible unique
16723ff01b23SMartin Matuskacombinations when being reconstructed, consider it too computationally
16733ff01b23SMartin Matuskaexpensive to check them all.
16743ff01b23SMartin MatuskaInstead, try at most this many randomly selected
16753ff01b23SMartin Matuskacombinations each time the block is accessed.
16763ff01b23SMartin MatuskaThis allows all segment copies to participate fairly
16773ff01b23SMartin Matuskain the reconstruction when all combinations
16783ff01b23SMartin Matuskacannot be checked and prevents repeated use of one bad copy.
16793ff01b23SMartin Matuska.
16803ff01b23SMartin Matuska.It Sy zfs_recover Ns = Ns Sy 0 Ns | Ns 1 Pq int
16813ff01b23SMartin MatuskaSet to attempt to recover from fatal errors.
16823ff01b23SMartin MatuskaThis should only be used as a last resort,
16833ff01b23SMartin Matuskaas it typically results in leaked space, or worse.
16843ff01b23SMartin Matuska.
16853ff01b23SMartin Matuska.It Sy zfs_removal_ignore_errors Ns = Ns Sy 0 Ns | Ns 1 Pq int
Ignore hard I/O errors during device removal.
When set, if a device encounters a hard I/O error during the removal process,
the removal will not be cancelled.
16893ff01b23SMartin MatuskaThis can result in a normally recoverable block becoming permanently damaged
16903ff01b23SMartin Matuskaand is hence not recommended.
16913ff01b23SMartin MatuskaThis should only be used as a last resort when the
16923ff01b23SMartin Matuskapool cannot be returned to a healthy state prior to removing the device.
16933ff01b23SMartin Matuska.
16943ff01b23SMartin Matuska.It Sy zfs_removal_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq int
16953ff01b23SMartin MatuskaThis is used by the test suite so that it can ensure that certain actions
16963ff01b23SMartin Matuskahappen while in the middle of a removal.
16973ff01b23SMartin Matuska.
16983ff01b23SMartin Matuska.It Sy zfs_remove_max_segment Ns = Ns Sy 16777216 Ns B Po 16MB Pc Pq int
16993ff01b23SMartin MatuskaThe largest contiguous segment that we will attempt to allocate when removing
17003ff01b23SMartin Matuskaa device.
17013ff01b23SMartin MatuskaIf there is a performance problem with attempting to allocate large blocks,
17023ff01b23SMartin Matuskaconsider decreasing this.
17033ff01b23SMartin MatuskaThe default value is also the maximum.
17043ff01b23SMartin Matuska.
17053ff01b23SMartin Matuska.It Sy zfs_resilver_disable_defer Ns = Ns Sy 0 Ns | Ns 1 Pq int
17063ff01b23SMartin MatuskaIgnore the
17073ff01b23SMartin Matuska.Sy resilver_defer
17083ff01b23SMartin Matuskafeature, causing an operation that would start a resilver to
17093ff01b23SMartin Matuskaimmediately restart the one in progress.
17103ff01b23SMartin Matuska.
17113ff01b23SMartin Matuska.It Sy zfs_resilver_min_time_ms Ns = Ns Sy 3000 Ns ms Po 3s Pc Pq int
17123ff01b23SMartin MatuskaResilvers are processed by the sync thread.
17133ff01b23SMartin MatuskaWhile resilvering, it will spend at least this much time
17143ff01b23SMartin Matuskaworking on a resilver between TXG flushes.
17153ff01b23SMartin Matuska.
17163ff01b23SMartin Matuska.It Sy zfs_scan_ignore_errors Ns = Ns Sy 0 Ns | Ns 1 Pq int
17173ff01b23SMartin MatuskaIf set, remove the DTL (dirty time list) upon completion of a pool scan (scrub),
17183ff01b23SMartin Matuskaeven if there were unrepairable errors.
17193ff01b23SMartin MatuskaIntended to be used during pool repair or recovery to
17203ff01b23SMartin Matuskastop resilvering when the pool is next imported.
17213ff01b23SMartin Matuska.
17223ff01b23SMartin Matuska.It Sy zfs_scrub_min_time_ms Ns = Ns Sy 1000 Ns ms Po 1s Pc Pq int
17233ff01b23SMartin MatuskaScrubs are processed by the sync thread.
17243ff01b23SMartin MatuskaWhile scrubbing, it will spend at least this much time
17253ff01b23SMartin Matuskaworking on a scrub between TXG flushes.
17263ff01b23SMartin Matuska.
17273ff01b23SMartin Matuska.It Sy zfs_scan_checkpoint_intval Ns = Ns Sy 7200 Ns s Po 2h Pc Pq int
17283ff01b23SMartin MatuskaTo preserve progress across reboots, the sequential scan algorithm periodically
17293ff01b23SMartin Matuskaneeds to stop metadata scanning and issue all the verification I/O to disk.
17303ff01b23SMartin MatuskaThe frequency of this flushing is determined by this tunable.
17313ff01b23SMartin Matuska.
17323ff01b23SMartin Matuska.It Sy zfs_scan_fill_weight Ns = Ns Sy 3 Pq int
17333ff01b23SMartin MatuskaThis tunable affects how scrub and resilver I/O segments are ordered.
A higher number indicates that we care more about how fully a segment is filled,
while a lower number indicates we care more about the size of the extent,
without considering the gaps within a segment.
17373ff01b23SMartin MatuskaThis value is only tunable upon module insertion.
1738681ce946SMartin MatuskaChanging the value afterwards will have no effect on scrub or resilver performance.
17393ff01b23SMartin Matuska.
17403ff01b23SMartin Matuska.It Sy zfs_scan_issue_strategy Ns = Ns Sy 0 Pq int
Determines the order in which data will be verified while scrubbing or resilvering:
17423ff01b23SMartin Matuska.Bl -tag -compact -offset 4n -width "a"
17433ff01b23SMartin Matuska.It Sy 1
17443ff01b23SMartin MatuskaData will be verified as sequentially as possible, given the
17453ff01b23SMartin Matuskaamount of memory reserved for scrubbing
17463ff01b23SMartin Matuska.Pq see Sy zfs_scan_mem_lim_fact .
17473ff01b23SMartin MatuskaThis may improve scrub performance if the pool's data is very fragmented.
17483ff01b23SMartin Matuska.It Sy 2
17493ff01b23SMartin MatuskaThe largest mostly-contiguous chunk of found data will be verified first.
17503ff01b23SMartin MatuskaBy deferring scrubbing of small segments, we may later find adjacent data
17513ff01b23SMartin Matuskato coalesce and increase the segment size.
17523ff01b23SMartin Matuska.It Sy 0
17533ff01b23SMartin Matuska.No Use strategy Sy 1 No during normal verification
17543ff01b23SMartin Matuska.No and strategy Sy 2 No while taking a checkpoint.
17553ff01b23SMartin Matuska.El
17563ff01b23SMartin Matuska.
17573ff01b23SMartin Matuska.It Sy zfs_scan_legacy Ns = Ns Sy 0 Ns | Ns 1 Pq int
17583ff01b23SMartin MatuskaIf unset, indicates that scrubs and resilvers will gather metadata in
17593ff01b23SMartin Matuskamemory before issuing sequential I/O.
17603ff01b23SMartin MatuskaOtherwise indicates that the legacy algorithm will be used,
17613ff01b23SMartin Matuskawhere I/O is initiated as soon as it is discovered.
17623ff01b23SMartin MatuskaUnsetting will not affect scrubs or resilvers that are already in progress.
17633ff01b23SMartin Matuska.
17643ff01b23SMartin Matuska.It Sy zfs_scan_max_ext_gap Ns = Ns Sy 2097152 Ns B Po 2MB Pc Pq int
17653ff01b23SMartin MatuskaSets the largest gap in bytes between scrub/resilver I/O operations
17663ff01b23SMartin Matuskathat will still be considered sequential for sorting purposes.
17673ff01b23SMartin MatuskaChanging this value will not
17683ff01b23SMartin Matuskaaffect scrubs or resilvers that are already in progress.
17693ff01b23SMartin Matuska.
17703ff01b23SMartin Matuska.It Sy zfs_scan_mem_lim_fact Ns = Ns Sy 20 Ns ^-1 Pq int
Maximum fraction of RAM used for I/O sorting by the sequential scan algorithm.
This tunable determines the hard limit for I/O sorting memory usage.
When the hard limit is reached, we stop scanning metadata and start issuing
17743ff01b23SMartin Matuskadata verification I/O.
17753ff01b23SMartin MatuskaThis is done until we get below the soft limit.
17763ff01b23SMartin Matuska.
17773ff01b23SMartin Matuska.It Sy zfs_scan_mem_lim_soft_fact Ns = Ns Sy 20 Ns ^-1 Pq int
The fraction of the hard limit used to determine the soft limit for I/O sorting
17793ff01b23SMartin Matuskaby the sequential scan algorithm.
When we cross this limit from below, no action is taken.
When we cross this limit from above, it is because we are issuing verification I/O.
17823ff01b23SMartin MatuskaIn this case (unless the metadata scan is done) we stop issuing verification I/O
17833ff01b23SMartin Matuskaand start scanning metadata again until we get to the hard limit.
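.Pp
As an illustration, on a hypothetical machine with 64GB of RAM
and both factors at their default of
.Sy 20 :
.Bd -literal -compact
#   hard limit = 64GB / 20  = 3.2GB
#   soft limit = 3.2GB / 20 = 160MB
.Ed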
17843ff01b23SMartin Matuska.
17853ff01b23SMartin Matuska.It Sy zfs_scan_strict_mem_lim Ns = Ns Sy 0 Ns | Ns 1 Pq int
17863ff01b23SMartin MatuskaEnforce tight memory limits on pool scans when a sequential scan is in progress.
17873ff01b23SMartin MatuskaWhen disabled, the memory limit may be exceeded by fast disks.
17883ff01b23SMartin Matuska.
17893ff01b23SMartin Matuska.It Sy zfs_scan_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq int
17903ff01b23SMartin MatuskaFreezes a scrub/resilver in progress without actually pausing it.
17913ff01b23SMartin MatuskaIntended for testing/debugging.
17923ff01b23SMartin Matuska.
17933ff01b23SMartin Matuska.It Sy zfs_scan_vdev_limit Ns = Ns Sy 4194304 Ns B Po 4MB Pc Pq int
Maximum amount of data that can be issued concurrently for scrubs and
resilvers per leaf device, given in bytes.
17963ff01b23SMartin Matuska.
17973ff01b23SMartin Matuska.It Sy zfs_send_corrupt_data Ns = Ns Sy 0 Ns | Ns 1 Pq int
17983ff01b23SMartin MatuskaAllow sending of corrupt data (ignore read/checksum errors when sending).
17993ff01b23SMartin Matuska.
18003ff01b23SMartin Matuska.It Sy zfs_send_unmodified_spill_blocks Ns = Ns Sy 1 Ns | Ns 0 Pq int
18013ff01b23SMartin MatuskaInclude unmodified spill blocks in the send stream.
18023ff01b23SMartin MatuskaUnder certain circumstances, previous versions of ZFS could incorrectly
18033ff01b23SMartin Matuskaremove the spill block from an existing object.
18043ff01b23SMartin MatuskaIncluding unmodified copies of the spill blocks creates a backwards-compatible
18053ff01b23SMartin Matuskastream which will recreate a spill block if it was incorrectly removed.
18063ff01b23SMartin Matuska.
1807*e92ffd9bSMartin Matuska.It Sy zfs_send_no_prefetch_queue_ff Ns = Ns Sy 20 Ns ^\-1 Pq int
18083ff01b23SMartin MatuskaThe fill fraction of the
18093ff01b23SMartin Matuska.Nm zfs Cm send
18103ff01b23SMartin Matuskainternal queues.
18113ff01b23SMartin MatuskaThe fill fraction controls the timing with which internal threads are woken up.
18123ff01b23SMartin Matuska.
18133ff01b23SMartin Matuska.It Sy zfs_send_no_prefetch_queue_length Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq int
18143ff01b23SMartin MatuskaThe maximum number of bytes allowed in
18153ff01b23SMartin Matuska.Nm zfs Cm send Ns 's
18163ff01b23SMartin Matuskainternal queues.
18173ff01b23SMartin Matuska.
1818*e92ffd9bSMartin Matuska.It Sy zfs_send_queue_ff Ns = Ns Sy 20 Ns ^\-1 Pq int
18193ff01b23SMartin MatuskaThe fill fraction of the
18203ff01b23SMartin Matuska.Nm zfs Cm send
18213ff01b23SMartin Matuskaprefetch queue.
18223ff01b23SMartin MatuskaThe fill fraction controls the timing with which internal threads are woken up.
18233ff01b23SMartin Matuska.
18243ff01b23SMartin Matuska.It Sy zfs_send_queue_length Ns = Ns Sy 16777216 Ns B Po 16MB Pc Pq int
The maximum number of bytes that will be prefetched by
18263ff01b23SMartin Matuska.Nm zfs Cm send .
18273ff01b23SMartin MatuskaThis value must be at least twice the maximum block size in use.
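.Pp
To make a larger queue persist across module reloads,
it can be set as a module option
(a sketch; the configuration file path is conventional, not mandatory):
.Bd -literal -compact
# /etc/modprobe.d/zfs.conf
# Allow 32MB of prefetch, i.e. twice a 16MB maximum block size.
options zfs zfs_send_queue_length=33554432
.Ed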
18283ff01b23SMartin Matuska.
1829*e92ffd9bSMartin Matuska.It Sy zfs_recv_queue_ff Ns = Ns Sy 20 Ns ^\-1 Pq int
18303ff01b23SMartin MatuskaThe fill fraction of the
18313ff01b23SMartin Matuska.Nm zfs Cm receive
18323ff01b23SMartin Matuskaqueue.
18333ff01b23SMartin MatuskaThe fill fraction controls the timing with which internal threads are woken up.
18343ff01b23SMartin Matuska.
18353ff01b23SMartin Matuska.It Sy zfs_recv_queue_length Ns = Ns Sy 16777216 Ns B Po 16MB Pc Pq int
18363ff01b23SMartin MatuskaThe maximum number of bytes allowed in the
18373ff01b23SMartin Matuska.Nm zfs Cm receive
18383ff01b23SMartin Matuskaqueue.
18393ff01b23SMartin MatuskaThis value must be at least twice the maximum block size in use.
18403ff01b23SMartin Matuska.
18413ff01b23SMartin Matuska.It Sy zfs_recv_write_batch_size Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq int
18423ff01b23SMartin MatuskaThe maximum amount of data, in bytes, that
18433ff01b23SMartin Matuska.Nm zfs Cm receive
18443ff01b23SMartin Matuskawill write in one DMU transaction.
18453ff01b23SMartin MatuskaThis is the uncompressed size, even when receiving a compressed send stream.
18463ff01b23SMartin MatuskaThis setting will not reduce the write size below a single block.
18473ff01b23SMartin MatuskaCapped at a maximum of
18483ff01b23SMartin Matuska.Sy 32MB .
18493ff01b23SMartin Matuska.
18503ff01b23SMartin Matuska.It Sy zfs_override_estimate_recordsize Ns = Ns Sy 0 Ns | Ns 1 Pq ulong
18513ff01b23SMartin MatuskaSetting this variable overrides the default logic for estimating block
18523ff01b23SMartin Matuskasizes when doing a
18533ff01b23SMartin Matuska.Nm zfs Cm send .
18543ff01b23SMartin MatuskaThe default heuristic is that the average block size
18553ff01b23SMartin Matuskawill be the current recordsize.
Override this value if most data in your dataset is not of that size,
and you require accurate
.Nm zfs Cm send
size estimates.
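.Pp
Estimates can be previewed with a dry run,
which prints the estimated stream size without sending anything
(the dataset name is illustrative):
.Bd -literal -compact
zfs send -nv pool/fs@snap
.Ed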
18583ff01b23SMartin Matuska.
18593ff01b23SMartin Matuska.It Sy zfs_sync_pass_deferred_free Ns = Ns Sy 2 Pq int
18603ff01b23SMartin MatuskaFlushing of data to disk is done in passes.
18613ff01b23SMartin MatuskaDefer frees starting in this pass.
18623ff01b23SMartin Matuska.
18633ff01b23SMartin Matuska.It Sy zfs_spa_discard_memory_limit Ns = Ns Sy 16777216 Ns B Po 16MB Pc Pq int
18643ff01b23SMartin MatuskaMaximum memory used for prefetching a checkpoint's space map on each
18653ff01b23SMartin Matuskavdev while discarding the checkpoint.
18663ff01b23SMartin Matuska.
18673ff01b23SMartin Matuska.It Sy zfs_special_class_metadata_reserve_pct Ns = Ns Sy 25 Ns % Pq int
18683ff01b23SMartin MatuskaOnly allow small data blocks to be allocated on the special and dedup vdev
18693ff01b23SMartin Matuskatypes when the available free space percentage on these vdevs exceeds this value.
18703ff01b23SMartin MatuskaThis ensures reserved space is available for pool metadata as the
18713ff01b23SMartin Matuskaspecial vdevs approach capacity.
18723ff01b23SMartin Matuska.
18733ff01b23SMartin Matuska.It Sy zfs_sync_pass_dont_compress Ns = Ns Sy 8 Pq int
18743ff01b23SMartin MatuskaStarting in this sync pass, disable compression (including of metadata).
18753ff01b23SMartin MatuskaWith the default setting, in practice, we don't have this many sync passes,
18763ff01b23SMartin Matuskaso this has no effect.
18773ff01b23SMartin Matuska.Pp
18783ff01b23SMartin MatuskaThe original intent was that disabling compression would help the sync passes
18793ff01b23SMartin Matuskato converge.
18803ff01b23SMartin MatuskaHowever, in practice, disabling compression increases
the average number of sync passes, because when we turn compression off,
many blocks' sizes will change, and thus we have to re-allocate
(not overwrite) them.
18843ff01b23SMartin MatuskaIt also increases the number of
18853ff01b23SMartin Matuska.Em 128kB
18863ff01b23SMartin Matuskaallocations (e.g. for indirect blocks and spacemaps)
18873ff01b23SMartin Matuskabecause these will not be compressed.
18883ff01b23SMartin MatuskaThe
18893ff01b23SMartin Matuska.Em 128kB
18903ff01b23SMartin Matuskaallocations are especially detrimental to performance
18913ff01b23SMartin Matuskaon highly fragmented systems, which may have very few free segments of this size,
18923ff01b23SMartin Matuskaand may need to load new metaslabs to satisfy these allocations.
18933ff01b23SMartin Matuska.
18943ff01b23SMartin Matuska.It Sy zfs_sync_pass_rewrite Ns = Ns Sy 2 Pq int
18953ff01b23SMartin MatuskaRewrite new block pointers starting in this pass.
18963ff01b23SMartin Matuska.
18973ff01b23SMartin Matuska.It Sy zfs_sync_taskq_batch_pct Ns = Ns Sy 75 Ns % Pq int
18983ff01b23SMartin MatuskaThis controls the number of threads used by
18993ff01b23SMartin Matuska.Sy dp_sync_taskq .
19003ff01b23SMartin MatuskaThe default value of
19013ff01b23SMartin Matuska.Sy 75%
19023ff01b23SMartin Matuskawill create a maximum of one thread per CPU.
19033ff01b23SMartin Matuska.
19043ff01b23SMartin Matuska.It Sy zfs_trim_extent_bytes_max Ns = Ns Sy 134217728 Ns B Po 128MB Pc Pq uint
Maximum size of a TRIM command.
Larger ranges will be split into chunks no larger than this value
before being issued.
19073ff01b23SMartin Matuska.
19083ff01b23SMartin Matuska.It Sy zfs_trim_extent_bytes_min Ns = Ns Sy 32768 Ns B Po 32kB Pc Pq uint
19093ff01b23SMartin MatuskaMinimum size of TRIM commands.
19103ff01b23SMartin MatuskaTRIM ranges smaller than this will be skipped,
19113ff01b23SMartin Matuskaunless they're part of a larger range which was chunked.
19123ff01b23SMartin MatuskaThis is done because it's common for these small TRIMs
19133ff01b23SMartin Matuskato negatively impact overall performance.
19143ff01b23SMartin Matuska.
19153ff01b23SMartin Matuska.It Sy zfs_trim_metaslab_skip Ns = Ns Sy 0 Ns | Ns 1 Pq uint
19163ff01b23SMartin MatuskaSkip uninitialized metaslabs during the TRIM process.
19173ff01b23SMartin MatuskaThis option is useful for pools constructed from large thinly-provisioned devices
19183ff01b23SMartin Matuskawhere TRIM operations are slow.
19193ff01b23SMartin MatuskaAs a pool ages, an increasing fraction of the pool's metaslabs
19203ff01b23SMartin Matuskawill be initialized, progressively degrading the usefulness of this option.
19213ff01b23SMartin MatuskaThis setting is stored when starting a manual TRIM and will
19223ff01b23SMartin Matuskapersist for the duration of the requested TRIM.
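.Pp
For example, to have an immediately following manual TRIM honour the skip
(the pool name is illustrative):
.Bd -literal -compact
echo 1 > /sys/module/zfs/parameters/zfs_trim_metaslab_skip
zpool trim tank
.Ed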
19233ff01b23SMartin Matuska.
19243ff01b23SMartin Matuska.It Sy zfs_trim_queue_limit Ns = Ns Sy 10 Pq uint
19253ff01b23SMartin MatuskaMaximum number of queued TRIMs outstanding per leaf vdev.
19263ff01b23SMartin MatuskaThe number of concurrent TRIM commands issued to the device is controlled by
19273ff01b23SMartin Matuska.Sy zfs_vdev_trim_min_active No and Sy zfs_vdev_trim_max_active .
19283ff01b23SMartin Matuska.
19293ff01b23SMartin Matuska.It Sy zfs_trim_txg_batch Ns = Ns Sy 32 Pq uint
19303ff01b23SMartin MatuskaThe number of transaction groups' worth of frees which should be aggregated
19313ff01b23SMartin Matuskabefore TRIM operations are issued to the device.
19323ff01b23SMartin MatuskaThis setting represents a trade-off between issuing larger,
19333ff01b23SMartin Matuskamore efficient TRIM operations and the delay
19343ff01b23SMartin Matuskabefore the recently trimmed space is available for use by the device.
19353ff01b23SMartin Matuska.Pp
19363ff01b23SMartin MatuskaIncreasing this value will allow frees to be aggregated for a longer time.
This will result in larger TRIM operations and potentially increased memory usage.
19383ff01b23SMartin MatuskaDecreasing this value will have the opposite effect.
19393ff01b23SMartin MatuskaThe default of
19403ff01b23SMartin Matuska.Sy 32
19413ff01b23SMartin Matuskawas determined to be a reasonable compromise.
19423ff01b23SMartin Matuska.
19433ff01b23SMartin Matuska.It Sy zfs_txg_history Ns = Ns Sy 0 Pq int
Historical statistics for this many of the most recent TXGs will be available in
19453ff01b23SMartin Matuska.Pa /proc/spl/kstat/zfs/ Ns Ao Ar pool Ac Ns Pa /TXGs .
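.Pp
For example, to retain statistics for the last 100 TXGs and inspect them
(the pool name is illustrative):
.Bd -literal -compact
echo 100 > /sys/module/zfs/parameters/zfs_txg_history
cat /proc/spl/kstat/zfs/tank/TXGs
.Ed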
19463ff01b23SMartin Matuska.
19473ff01b23SMartin Matuska.It Sy zfs_txg_timeout Ns = Ns Sy 5 Ns s Pq int
19483ff01b23SMartin MatuskaFlush dirty data to disk at least every this many seconds (maximum TXG duration).
19493ff01b23SMartin Matuska.
19503ff01b23SMartin Matuska.It Sy zfs_vdev_aggregate_trim Ns = Ns Sy 0 Ns | Ns 1 Pq int
19513ff01b23SMartin MatuskaAllow TRIM I/Os to be aggregated.
19523ff01b23SMartin MatuskaThis is normally not helpful because the extents to be trimmed
will already have been aggregated by the metaslab.
19543ff01b23SMartin MatuskaThis option is provided for debugging and performance analysis.
19553ff01b23SMartin Matuska.
19563ff01b23SMartin Matuska.It Sy zfs_vdev_aggregation_limit Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq int
19573ff01b23SMartin MatuskaMax vdev I/O aggregation size.
19583ff01b23SMartin Matuska.
19593ff01b23SMartin Matuska.It Sy zfs_vdev_aggregation_limit_non_rotating Ns = Ns Sy 131072 Ns B Po 128kB Pc Pq int
19603ff01b23SMartin MatuskaMax vdev I/O aggregation size for non-rotating media.
19613ff01b23SMartin Matuska.
19623ff01b23SMartin Matuska.It Sy zfs_vdev_cache_bshift Ns = Ns Sy 16 Po 64kB Pc Pq int
19633ff01b23SMartin MatuskaShift size to inflate reads to.
19643ff01b23SMartin Matuska.
19653ff01b23SMartin Matuska.It Sy zfs_vdev_cache_max Ns = Ns Sy 16384 Ns B Po 16kB Pc Pq int
19663ff01b23SMartin MatuskaInflate reads smaller than this value to meet the
19673ff01b23SMartin Matuska.Sy zfs_vdev_cache_bshift
19683ff01b23SMartin Matuskasize
19693ff01b23SMartin Matuska.Pq default Sy 64kB .
19703ff01b23SMartin Matuska.
19713ff01b23SMartin Matuska.It Sy zfs_vdev_cache_size Ns = Ns Sy 0 Pq int
19723ff01b23SMartin MatuskaTotal size of the per-disk cache in bytes.
19733ff01b23SMartin Matuska.Pp
19743ff01b23SMartin MatuskaCurrently this feature is disabled, as it has been found to not be helpful
19753ff01b23SMartin Matuskafor performance and in some cases harmful.
19763ff01b23SMartin Matuska.
19773ff01b23SMartin Matuska.It Sy zfs_vdev_mirror_rotating_inc Ns = Ns Sy 0 Pq int
A number by which the balancing algorithm increments the load calculation for
the purpose of selecting the least busy mirror member when an I/O operation
immediately follows its predecessor on rotational vdevs.
19823ff01b23SMartin Matuska.
19833ff01b23SMartin Matuska.It Sy zfs_vdev_mirror_rotating_seek_inc Ns = Ns Sy 5 Pq int
19843ff01b23SMartin MatuskaA number by which the balancing algorithm increments the load calculation for
19853ff01b23SMartin Matuskathe purpose of selecting the least busy mirror member when an I/O operation
19863ff01b23SMartin Matuskalacks locality as defined by
19873ff01b23SMartin Matuska.Sy zfs_vdev_mirror_rotating_seek_offset .
Operations within this window that do not immediately follow the previous
operation are incremented by half of this value.
19903ff01b23SMartin Matuska.
19913ff01b23SMartin Matuska.It Sy zfs_vdev_mirror_rotating_seek_offset Ns = Ns Sy 1048576 Ns B Po 1MB Pc Pq int
19923ff01b23SMartin MatuskaThe maximum distance for the last queued I/O operation in which
19933ff01b23SMartin Matuskathe balancing algorithm considers an operation to have locality.
19943ff01b23SMartin Matuska.No See Sx ZFS I/O SCHEDULER .
19953ff01b23SMartin Matuska.
19963ff01b23SMartin Matuska.It Sy zfs_vdev_mirror_non_rotating_inc Ns = Ns Sy 0 Pq int
19973ff01b23SMartin MatuskaA number by which the balancing algorithm increments the load calculation for
19983ff01b23SMartin Matuskathe purpose of selecting the least busy mirror member on non-rotational vdevs
19993ff01b23SMartin Matuskawhen I/O operations do not immediately follow one another.
20003ff01b23SMartin Matuska.
20013ff01b23SMartin Matuska.It Sy zfs_vdev_mirror_non_rotating_seek_inc Ns = Ns Sy 1 Pq int
20023ff01b23SMartin MatuskaA number by which the balancing algorithm increments the load calculation for
20033ff01b23SMartin Matuskathe purpose of selecting the least busy mirror member when an I/O operation lacks
locality as defined by
.Sy zfs_vdev_mirror_rotating_seek_offset .
Operations within this window that do not immediately follow the previous
operation are incremented by half of this value.
20083ff01b23SMartin Matuska.
20093ff01b23SMartin Matuska.It Sy zfs_vdev_read_gap_limit Ns = Ns Sy 32768 Ns B Po 32kB Pc Pq int
20103ff01b23SMartin MatuskaAggregate read I/O operations if the on-disk gap between them is within this
20113ff01b23SMartin Matuskathreshold.
20123ff01b23SMartin Matuska.
20133ff01b23SMartin Matuska.It Sy zfs_vdev_write_gap_limit Ns = Ns Sy 4096 Ns B Po 4kB Pc Pq int
20143ff01b23SMartin MatuskaAggregate write I/O operations if the on-disk gap between them is within this
20153ff01b23SMartin Matuskathreshold.
20163ff01b23SMartin Matuska.
20173ff01b23SMartin Matuska.It Sy zfs_vdev_raidz_impl Ns = Ns Sy fastest Pq string
20183ff01b23SMartin MatuskaSelect the raidz parity implementation to use.
20193ff01b23SMartin Matuska.Pp
20203ff01b23SMartin MatuskaVariants that don't depend on CPU-specific features
20213ff01b23SMartin Matuskamay be selected on module load, as they are supported on all systems.
20223ff01b23SMartin MatuskaThe remaining options may only be set after the module is loaded,
20233ff01b23SMartin Matuskaas they are available only if the implementations are compiled in
20243ff01b23SMartin Matuskaand supported on the running system.
20253ff01b23SMartin Matuska.Pp
20263ff01b23SMartin MatuskaOnce the module is loaded,
20273ff01b23SMartin Matuska.Pa /sys/module/zfs/parameters/zfs_vdev_raidz_impl
20283ff01b23SMartin Matuskawill show the available options,
20293ff01b23SMartin Matuskawith the currently selected one enclosed in square brackets.
20303ff01b23SMartin Matuska.Pp
20313ff01b23SMartin Matuska.TS
20323ff01b23SMartin Matuskalb l l .
20333ff01b23SMartin Matuskafastest	selected by built-in benchmark
20343ff01b23SMartin Matuskaoriginal	original implementation
20353ff01b23SMartin Matuskascalar	scalar implementation
20363ff01b23SMartin Matuskasse2	SSE2 instruction set	64-bit x86
20373ff01b23SMartin Matuskassse3	SSSE3 instruction set	64-bit x86
20383ff01b23SMartin Matuskaavx2	AVX2 instruction set	64-bit x86
20393ff01b23SMartin Matuskaavx512f	AVX512F instruction set	64-bit x86
20403ff01b23SMartin Matuskaavx512bw	AVX512F & AVX512BW instruction sets	64-bit x86
20413ff01b23SMartin Matuskaaarch64_neon	NEON	Aarch64/64-bit ARMv8
20423ff01b23SMartin Matuskaaarch64_neonx2	NEON with more unrolling	Aarch64/64-bit ARMv8
20433ff01b23SMartin Matuskapowerpc_altivec	Altivec	PowerPC
20443ff01b23SMartin Matuska.TE
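.Pp
For example (assuming an x86-64 CPU with AVX2 support):
.Bd -literal -compact
# List the available implementations; the active one is in brackets.
cat /sys/module/zfs/parameters/zfs_vdev_raidz_impl
# Select the AVX2 implementation.
echo avx2 > /sys/module/zfs/parameters/zfs_vdev_raidz_impl
.Ed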
20453ff01b23SMartin Matuska.
20463ff01b23SMartin Matuska.It Sy zfs_vdev_scheduler Pq charp
20473ff01b23SMartin Matuska.Sy DEPRECATED .
Prints a warning to the kernel log for compatibility.
20493ff01b23SMartin Matuska.
20503ff01b23SMartin Matuska.It Sy zfs_zevent_len_max Ns = Ns Sy 512 Pq int
20513ff01b23SMartin MatuskaMax event queue length.
20523ff01b23SMartin MatuskaEvents in the queue can be viewed with
20533ff01b23SMartin Matuska.Xr zpool-events 8 .
20543ff01b23SMartin Matuska.
20553ff01b23SMartin Matuska.It Sy zfs_zevent_retain_max Ns = Ns Sy 2000 Pq int
20563ff01b23SMartin MatuskaMaximum recent zevent records to retain for duplicate checking.
20573ff01b23SMartin MatuskaSetting this to
20583ff01b23SMartin Matuska.Sy 0
20593ff01b23SMartin Matuskadisables duplicate detection.
20603ff01b23SMartin Matuska.
20613ff01b23SMartin Matuska.It Sy zfs_zevent_retain_expire_secs Ns = Ns Sy 900 Ns s Po 15min Pc Pq int
20623ff01b23SMartin MatuskaLifespan for a recent ereport that was retained for duplicate checking.
20633ff01b23SMartin Matuska.
20643ff01b23SMartin Matuska.It Sy zfs_zil_clean_taskq_maxalloc Ns = Ns Sy 1048576 Pq int
20653ff01b23SMartin MatuskaThe maximum number of taskq entries that are allowed to be cached.
When this limit is exceeded, transaction records (itxs)
20673ff01b23SMartin Matuskawill be cleaned synchronously.
20683ff01b23SMartin Matuska.
20693ff01b23SMartin Matuska.It Sy zfs_zil_clean_taskq_minalloc Ns = Ns Sy 1024 Pq int
20703ff01b23SMartin MatuskaThe number of taskq entries that are pre-populated when the taskq is first
20713ff01b23SMartin Matuskacreated and are immediately available for use.
20723ff01b23SMartin Matuska.
20733ff01b23SMartin Matuska.It Sy zfs_zil_clean_taskq_nthr_pct Ns = Ns Sy 100 Ns % Pq int
20743ff01b23SMartin MatuskaThis controls the number of threads used by
20753ff01b23SMartin Matuska.Sy dp_zil_clean_taskq .
20763ff01b23SMartin MatuskaThe default value of
20773ff01b23SMartin Matuska.Sy 100%
will create a maximum of one thread per CPU.
20793ff01b23SMartin Matuska.
20803ff01b23SMartin Matuska.It Sy zil_maxblocksize Ns = Ns Sy 131072 Ns B Po 128kB Pc Pq int
20813ff01b23SMartin MatuskaThis sets the maximum block size used by the ZIL.
20823ff01b23SMartin MatuskaOn very fragmented pools, lowering this
20833ff01b23SMartin Matuska.Pq typically to Sy 36kB
20843ff01b23SMartin Matuskacan improve performance.
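.Pp
For example, to apply the suggested
.Sy 36kB
limit at runtime on Linux
(a sketch; it assumes the parameter is writable on the running system):
.Bd -literal -compact
echo 36864 > /sys/module/zfs/parameters/zil_maxblocksize
.Ed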
20853ff01b23SMartin Matuska.
20863ff01b23SMartin Matuska.It Sy zil_nocacheflush Ns = Ns Sy 0 Ns | Ns 1 Pq int
20873ff01b23SMartin MatuskaDisable the cache flush commands that are normally sent to disk by
20883ff01b23SMartin Matuskathe ZIL after an LWB write has completed.
20893ff01b23SMartin MatuskaSetting this will cause ZIL corruption on power loss
20903ff01b23SMartin Matuskaif a volatile out-of-order write cache is enabled.
20913ff01b23SMartin Matuska.
20923ff01b23SMartin Matuska.It Sy zil_replay_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
20933ff01b23SMartin MatuskaDisable intent logging replay.
Replay can be disabled to recover from a corrupted ZIL.
20953ff01b23SMartin Matuska.
20963ff01b23SMartin Matuska.It Sy zil_slog_bulk Ns = Ns Sy 786432 Ns B Po 768kB Pc Pq ulong
20973ff01b23SMartin MatuskaLimit SLOG write size per commit executed with synchronous priority.
20983ff01b23SMartin MatuskaAny writes above that will be executed with lower (asynchronous) priority
to limit potential SLOG device abuse by a single active ZIL writer.
21003ff01b23SMartin Matuska.
21013ff01b23SMartin Matuska.It Sy zfs_embedded_slog_min_ms Ns = Ns Sy 64 Pq int
21023ff01b23SMartin MatuskaUsually, one metaslab from each normal-class vdev is dedicated for use by
21033ff01b23SMartin Matuskathe ZIL to log synchronous writes.
21043ff01b23SMartin MatuskaHowever, if there are fewer than
21053ff01b23SMartin Matuska.Sy zfs_embedded_slog_min_ms
21063ff01b23SMartin Matuskametaslabs in the vdev, this functionality is disabled.
21073ff01b23SMartin MatuskaThis ensures that we don't set aside an unreasonable amount of space for the ZIL.
21083ff01b23SMartin Matuska.
21093ff01b23SMartin Matuska.It Sy zio_deadman_log_all Ns = Ns Sy 0 Ns | Ns 1 Pq int
21103ff01b23SMartin MatuskaIf non-zero, the zio deadman will produce debugging messages
21113ff01b23SMartin Matuska.Pq see Sy zfs_dbgmsg_enable
21123ff01b23SMartin Matuskafor all zios, rather than only for leaf zios possessing a vdev.
21133ff01b23SMartin MatuskaThis is meant to be used by developers to gain
21143ff01b23SMartin Matuskadiagnostic information for hang conditions which don't involve a mutex
21153ff01b23SMartin Matuskaor other locking primitive: typically conditions in which a thread in
21163ff01b23SMartin Matuskathe zio pipeline is looping indefinitely.
21173ff01b23SMartin Matuska.
21183ff01b23SMartin Matuska.It Sy zio_slow_io_ms Ns = Ns Sy 30000 Ns ms Po 30s Pc Pq int
21193ff01b23SMartin MatuskaWhen an I/O operation takes more than this much time to complete,
21203ff01b23SMartin Matuskait's marked as slow.
21213ff01b23SMartin MatuskaEach slow operation causes a delay zevent.
21223ff01b23SMartin MatuskaSlow I/O counters can be seen with
21233ff01b23SMartin Matuska.Nm zpool Cm status Fl s .
21243ff01b23SMartin Matuska.
21253ff01b23SMartin Matuska.It Sy zio_dva_throttle_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
21263ff01b23SMartin MatuskaThrottle block allocations in the I/O pipeline.
21273ff01b23SMartin MatuskaThis allows for dynamic allocation distribution when devices are imbalanced.
21283ff01b23SMartin MatuskaWhen enabled, the maximum number of pending allocations per top-level vdev
21293ff01b23SMartin Matuskais limited by
21303ff01b23SMartin Matuska.Sy zfs_vdev_queue_depth_pct .
21313ff01b23SMartin Matuska.
21323ff01b23SMartin Matuska.It Sy zio_requeue_io_start_cut_in_line Ns = Ns Sy 0 Ns | Ns 1 Pq int
21333ff01b23SMartin MatuskaPrioritize requeued I/O.
21343ff01b23SMartin Matuska.
21353ff01b23SMartin Matuska.It Sy zio_taskq_batch_pct Ns = Ns Sy 80 Ns % Pq uint
21363ff01b23SMartin MatuskaPercentage of online CPUs which will run a worker thread for I/O.
21373ff01b23SMartin MatuskaThese workers are responsible for I/O work such as compression and
21383ff01b23SMartin Matuskachecksum calculations.
A fractional number of CPUs is rounded down.
21403ff01b23SMartin Matuska.Pp
21413ff01b23SMartin MatuskaThe default value of
21423ff01b23SMartin Matuska.Sy 80%
21433ff01b23SMartin Matuskawas chosen to avoid using all CPUs which can result in
21443ff01b23SMartin Matuskalatency issues and inconsistent application performance,
21453ff01b23SMartin Matuskaespecially when slower compression and/or checksumming is enabled.
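.Pp
For example, on a hypothetical system with 16 online CPUs the default of
.Sy 80%
yields 16 \(mu 0.8 = 12.8, i.e. 12 worker threads after rounding down.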
21463ff01b23SMartin Matuska.
21473ff01b23SMartin Matuska.It Sy zio_taskq_batch_tpq Ns = Ns Sy 0 Pq uint
21483ff01b23SMartin MatuskaNumber of worker threads per taskq.
21493ff01b23SMartin MatuskaLower values improve I/O ordering and CPU utilization,
while higher values reduce lock contention.
21513ff01b23SMartin Matuska.Pp
21523ff01b23SMartin MatuskaIf
21533ff01b23SMartin Matuska.Sy 0 ,
21543ff01b23SMartin Matuskagenerate a system-dependent value close to 6 threads per taskq.
21553ff01b23SMartin Matuska.
21563ff01b23SMartin Matuska.It Sy zvol_inhibit_dev Ns = Ns Sy 0 Ns | Ns 1 Pq uint
21573ff01b23SMartin MatuskaDo not create zvol device nodes.
21583ff01b23SMartin MatuskaThis may slightly improve startup time on
21593ff01b23SMartin Matuskasystems with a very large number of zvols.
21603ff01b23SMartin Matuska.
21613ff01b23SMartin Matuska.It Sy zvol_major Ns = Ns Sy 230 Pq uint
21623ff01b23SMartin MatuskaMajor number for zvol block devices.
21633ff01b23SMartin Matuska.
21643ff01b23SMartin Matuska.It Sy zvol_max_discard_blocks Ns = Ns Sy 16384 Pq ulong
21653ff01b23SMartin MatuskaDiscard (TRIM) operations done on zvols will be done in batches of this
21663ff01b23SMartin Matuskamany blocks, where block size is determined by the
21673ff01b23SMartin Matuska.Sy volblocksize
21683ff01b23SMartin Matuskaproperty of a zvol.
21693ff01b23SMartin Matuska.
21703ff01b23SMartin Matuska.It Sy zvol_prefetch_bytes Ns = Ns Sy 131072 Ns B Po 128kB Pc Pq uint
21713ff01b23SMartin MatuskaWhen adding a zvol to the system, prefetch this many bytes
21723ff01b23SMartin Matuskafrom the start and end of the volume.
21733ff01b23SMartin MatuskaPrefetching these regions of the volume is desirable,
21743ff01b23SMartin Matuskabecause they are likely to be accessed immediately by
21753ff01b23SMartin Matuska.Xr blkid 8
21763ff01b23SMartin Matuskaor the kernel partitioner.
21773ff01b23SMartin Matuska.
21783ff01b23SMartin Matuska.It Sy zvol_request_sync Ns = Ns Sy 0 Ns | Ns 1 Pq uint
21793ff01b23SMartin MatuskaWhen processing I/O requests for a zvol, submit them synchronously.
21803ff01b23SMartin MatuskaThis effectively limits the queue depth to
21813ff01b23SMartin Matuska.Em 1
21823ff01b23SMartin Matuskafor each I/O submitter.
21833ff01b23SMartin MatuskaWhen unset, requests are handled asynchronously by a thread pool.
21843ff01b23SMartin MatuskaThe number of requests which can be handled concurrently is controlled by
21853ff01b23SMartin Matuska.Sy zvol_threads .
21863ff01b23SMartin Matuska.
21873ff01b23SMartin Matuska.It Sy zvol_threads Ns = Ns Sy 32 Pq uint
21883ff01b23SMartin MatuskaMax number of threads which can handle zvol I/O requests concurrently.
21893ff01b23SMartin Matuska.
21903ff01b23SMartin Matuska.It Sy zvol_volmode Ns = Ns Sy 1 Pq uint
Defines zvol block device behaviour when
21923ff01b23SMartin Matuska.Sy volmode Ns = Ns Sy default :
21933ff01b23SMartin Matuska.Bl -tag -compact -offset 4n -width "a"
21943ff01b23SMartin Matuska.It Sy 1
21953ff01b23SMartin Matuska.No equivalent to Sy full
21963ff01b23SMartin Matuska.It Sy 2
21973ff01b23SMartin Matuska.No equivalent to Sy dev
21983ff01b23SMartin Matuska.It Sy 3
21993ff01b23SMartin Matuska.No equivalent to Sy none
22003ff01b23SMartin Matuska.El
22013ff01b23SMartin Matuska.El
22023ff01b23SMartin Matuska.
22033ff01b23SMartin Matuska.Sh ZFS I/O SCHEDULER
ZFS issues I/O operations to leaf vdevs to satisfy and complete higher-level I/O requests.
22053ff01b23SMartin MatuskaThe scheduler determines when and in what order those operations are issued.
22063ff01b23SMartin MatuskaThe scheduler divides operations into five I/O classes,
22073ff01b23SMartin Matuskaprioritized in the following order: sync read, sync write, async read,
22083ff01b23SMartin Matuskaasync write, and scrub/resilver.
22093ff01b23SMartin MatuskaEach queue defines the minimum and maximum number of concurrent operations
22103ff01b23SMartin Matuskathat may be issued to the device.
22113ff01b23SMartin MatuskaIn addition, the device has an aggregate maximum,
22123ff01b23SMartin Matuska.Sy zfs_vdev_max_active .
22133ff01b23SMartin MatuskaNote that the sum of the per-queue minima must not exceed the aggregate maximum.
22143ff01b23SMartin MatuskaIf the sum of the per-queue maxima exceeds the aggregate maximum,
22153ff01b23SMartin Matuskathen the number of active operations may reach
22163ff01b23SMartin Matuska.Sy zfs_vdev_max_active ,
22173ff01b23SMartin Matuskain which case no further operations will be issued,
22183ff01b23SMartin Matuskaregardless of whether all per-queue minima have been met.
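.Pp
For example (hypothetical values), with
.Sy zfs_vdev_max_active Ns = Ns Sy 1000
and five classes whose minima are 10 each,
the minima sum to 50 and the constraint is satisfied;
if the per-queue maxima summed to 1200,
the aggregate limit could be reached before every class attains its maximum.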
22193ff01b23SMartin Matuska.Pp
22203ff01b23SMartin MatuskaFor many physical devices, throughput increases with the number of
22213ff01b23SMartin Matuskaconcurrent operations, but latency typically suffers.
22223ff01b23SMartin MatuskaFurthermore, physical devices typically have a limit
22233ff01b23SMartin Matuskaat which more concurrent operations have no
22243ff01b23SMartin Matuskaeffect on throughput or can actually cause it to decrease.
22253ff01b23SMartin Matuska.Pp
22263ff01b23SMartin MatuskaThe scheduler selects the next operation to issue by first looking for an
22273ff01b23SMartin MatuskaI/O class whose minimum has not been satisfied.
22283ff01b23SMartin MatuskaOnce all are satisfied and the aggregate maximum has not been hit,
22293ff01b23SMartin Matuskathe scheduler looks for classes whose maximum has not been satisfied.
22303ff01b23SMartin MatuskaIteration through the I/O classes is done in the order specified above.
22313ff01b23SMartin MatuskaNo further operations are issued
22323ff01b23SMartin Matuskaif the aggregate maximum number of concurrent operations has been hit,
22333ff01b23SMartin Matuskaor if there are no operations queued for an I/O class that has not hit its maximum.
22343ff01b23SMartin MatuskaEvery time an I/O operation is queued or an operation completes,
22353ff01b23SMartin Matuskathe scheduler looks for new operations to issue.
22363ff01b23SMartin Matuska.Pp
22373ff01b23SMartin MatuskaIn general, smaller
22383ff01b23SMartin Matuska.Sy max_active Ns s
22393ff01b23SMartin Matuskawill lead to lower latency of synchronous operations.
22403ff01b23SMartin MatuskaLarger
22413ff01b23SMartin Matuska.Sy max_active Ns s
22423ff01b23SMartin Matuskamay lead to higher overall throughput, depending on underlying storage.
22433ff01b23SMartin Matuska.Pp
22443ff01b23SMartin MatuskaThe ratio of the queues'
22453ff01b23SMartin Matuska.Sy max_active Ns s
22463ff01b23SMartin Matuskadetermines the balance of performance between reads, writes, and scrubs.
22473ff01b23SMartin MatuskaFor example, increasing
22483ff01b23SMartin Matuska.Sy zfs_vdev_scrub_max_active
22493ff01b23SMartin Matuskawill cause the scrub or resilver to complete more quickly,
22503ff01b23SMartin Matuskabut reads and writes to have higher latency and lower throughput.
22513ff01b23SMartin Matuska.Pp
22523ff01b23SMartin MatuskaAll I/O classes have a fixed maximum number of outstanding operations,
22533ff01b23SMartin Matuskaexcept for the async write class.
22543ff01b23SMartin MatuskaAsynchronous writes represent the data that is committed to stable storage
22553ff01b23SMartin Matuskaduring the syncing stage for transaction groups.
22563ff01b23SMartin MatuskaTransaction groups enter the syncing state periodically,
22573ff01b23SMartin Matuskaso the number of queued async writes will quickly burst up
22583ff01b23SMartin Matuskaand then bleed down to zero.
22593ff01b23SMartin MatuskaRather than servicing them as quickly as possible,
22603ff01b23SMartin Matuskathe I/O scheduler changes the maximum number of active async write operations
22613ff01b23SMartin Matuskaaccording to the amount of dirty data in the pool.
22623ff01b23SMartin MatuskaSince both throughput and latency typically increase with the number of
22633ff01b23SMartin Matuskaconcurrent operations issued to physical devices, reducing the
22643ff01b23SMartin Matuskaburstiness in the number of concurrent operations also stabilizes the
22653ff01b23SMartin Matuskaresponse time of operations from other – and in particular synchronous – queues.
22663ff01b23SMartin MatuskaIn broad strokes, the I/O scheduler will issue more concurrent operations
22673ff01b23SMartin Matuskafrom the async write queue as there's more dirty data in the pool.
22683ff01b23SMartin Matuska.
22693ff01b23SMartin Matuska.Ss Async Writes
22703ff01b23SMartin MatuskaThe number of concurrent operations issued for the async write I/O class
22713ff01b23SMartin Matuskafollows a piece-wise linear function defined by a few adjustable points:
22723ff01b23SMartin Matuska.Bd -literal
22733ff01b23SMartin Matuska       |              o---------| <-- \fBzfs_vdev_async_write_max_active\fP
22743ff01b23SMartin Matuska  ^    |             /^         |
22753ff01b23SMartin Matuska  |    |            / |         |
22763ff01b23SMartin Matuskaactive |           /  |         |
22773ff01b23SMartin Matuska I/O   |          /   |         |
22783ff01b23SMartin Matuskacount  |         /    |         |
22793ff01b23SMartin Matuska       |        /     |         |
22803ff01b23SMartin Matuska       |-------o      |         | <-- \fBzfs_vdev_async_write_min_active\fP
22813ff01b23SMartin Matuska      0|_______^______|_________|
22823ff01b23SMartin Matuska       0%      |      |       100% of \fBzfs_dirty_data_max\fP
22833ff01b23SMartin Matuska               |      |
22843ff01b23SMartin Matuska               |      `-- \fBzfs_vdev_async_write_active_max_dirty_percent\fP
22853ff01b23SMartin Matuska               `--------- \fBzfs_vdev_async_write_active_min_dirty_percent\fP
22863ff01b23SMartin Matuska.Ed
22873ff01b23SMartin Matuska.Pp
22883ff01b23SMartin MatuskaUntil the amount of dirty data exceeds a minimum percentage of the dirty
22893ff01b23SMartin Matuskadata allowed in the pool, the I/O scheduler will limit the number of
22903ff01b23SMartin Matuskaconcurrent operations to the minimum.
22913ff01b23SMartin MatuskaAs that threshold is crossed, the number of concurrent operations issued
22923ff01b23SMartin Matuskaincreases linearly to the maximum at the specified maximum percentage
22933ff01b23SMartin Matuskaof the dirty data allowed in the pool.
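.Pp
As a worked illustration (the numbers are hypothetical,
not necessarily the defaults),
take a minimum of 2, a maximum of 10,
and active thresholds of 30% and 60% of
.Sy zfs_dirty_data_max :
.Bd -literal -compact
# With dirty data at 45%, halfway between the two thresholds:
#   active = 2 + (10 - 2) * (45 - 30) / (60 - 30) = 6
.Ed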
22943ff01b23SMartin Matuska.Pp
22953ff01b23SMartin MatuskaIdeally, the amount of dirty data on a busy pool will stay in the sloped
22963ff01b23SMartin Matuskapart of the function between
22973ff01b23SMartin Matuska.Sy zfs_vdev_async_write_active_min_dirty_percent
22983ff01b23SMartin Matuskaand
22993ff01b23SMartin Matuska.Sy zfs_vdev_async_write_active_max_dirty_percent .
23003ff01b23SMartin MatuskaIf it exceeds the maximum percentage,
23013ff01b23SMartin Matuskathis indicates that the rate of incoming data is
23023ff01b23SMartin Matuskagreater than the rate that the backend storage can handle.
23033ff01b23SMartin MatuskaIn this case, we must further throttle incoming writes,
23043ff01b23SMartin Matuskaas described in the next section.
23053ff01b23SMartin Matuska.
23063ff01b23SMartin Matuska.Sh ZFS TRANSACTION DELAY
23073ff01b23SMartin MatuskaWe delay transactions when we've determined that the backend storage
23083ff01b23SMartin Matuskaisn't able to accommodate the rate of incoming writes.
23093ff01b23SMartin Matuska.Pp
23103ff01b23SMartin MatuskaIf there is already a transaction waiting, we delay relative to when
23113ff01b23SMartin Matuskathat transaction will finish waiting.
23123ff01b23SMartin MatuskaThis way the calculated delay time
23133ff01b23SMartin Matuskais independent of the number of threads concurrently executing transactions.
23143ff01b23SMartin Matuska.Pp
23153ff01b23SMartin MatuskaIf we are the only waiter, wait relative to when the transaction started,
23163ff01b23SMartin Matuskarather than the current time.
23173ff01b23SMartin MatuskaThis credits the transaction for "time already served",
23183ff01b23SMartin Matuskae.g. reading indirect blocks.
23193ff01b23SMartin Matuska.Pp
23203ff01b23SMartin MatuskaThe minimum time for a transaction to take is calculated as
2321*e92ffd9bSMartin Matuska.D1 min_time = min( Ns Sy zfs_delay_scale No \(mu Po Sy dirty No \- Sy min Pc / Po Sy max No \- Sy dirty Pc , 100ms)
23223ff01b23SMartin Matuska.Pp
23233ff01b23SMartin MatuskaThe delay has two degrees of freedom that can be adjusted via tunables.
23243ff01b23SMartin MatuskaThe percentage of dirty data at which we start to delay is defined by
23253ff01b23SMartin Matuska.Sy zfs_delay_min_dirty_percent .
23263ff01b23SMartin MatuskaThis should typically be at or above
23273ff01b23SMartin Matuska.Sy zfs_vdev_async_write_active_max_dirty_percent ,
23283ff01b23SMartin Matuskaso that we only start to delay after writing at full speed
23293ff01b23SMartin Matuskahas failed to keep up with the incoming write rate.
23303ff01b23SMartin MatuskaThe scale of the curve is defined by
23313ff01b23SMartin Matuska.Sy zfs_delay_scale .
23323ff01b23SMartin MatuskaRoughly speaking, this variable determines the amount of delay at the midpoint of the curve.
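.Pp
At the midpoint of the curve the two parenthesised differences are equal,
so the expression above reduces to
.D1 min_time = min( Ns Sy zfs_delay_scale , 100ms)
i.e. the midpoint delay is simply
.Sy zfs_delay_scale ,
which in the figures below corresponds to 500us.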
23333ff01b23SMartin Matuska.Bd -literal
23343ff01b23SMartin Matuskadelay
23353ff01b23SMartin Matuska 10ms +-------------------------------------------------------------*+
23363ff01b23SMartin Matuska      |                                                             *|
23373ff01b23SMartin Matuska  9ms +                                                             *+
23383ff01b23SMartin Matuska      |                                                             *|
23393ff01b23SMartin Matuska  8ms +                                                             *+
23403ff01b23SMartin Matuska      |                                                            * |
23413ff01b23SMartin Matuska  7ms +                                                            * +
23423ff01b23SMartin Matuska      |                                                            * |
23433ff01b23SMartin Matuska  6ms +                                                            * +
23443ff01b23SMartin Matuska      |                                                            * |
23453ff01b23SMartin Matuska  5ms +                                                           *  +
23463ff01b23SMartin Matuska      |                                                           *  |
23473ff01b23SMartin Matuska  4ms +                                                           *  +
23483ff01b23SMartin Matuska      |                                                           *  |
23493ff01b23SMartin Matuska  3ms +                                                          *   +
23503ff01b23SMartin Matuska      |                                                          *   |
23513ff01b23SMartin Matuska  2ms +                                              (midpoint) *    +
23523ff01b23SMartin Matuska      |                                                  |    **     |
23533ff01b23SMartin Matuska  1ms +                                                  v ***       +
23543ff01b23SMartin Matuska      |             \fBzfs_delay_scale\fP ---------->     ********         |
23553ff01b23SMartin Matuska    0 +-------------------------------------*********----------------+
23563ff01b23SMartin Matuska      0%                    <- \fBzfs_dirty_data_max\fP ->               100%
23573ff01b23SMartin Matuska.Ed
23583ff01b23SMartin Matuska.Pp
Note that, since the delay is added to the outstanding time remaining on the
most recent transaction, it is effectively the inverse of IOPS.
23613ff01b23SMartin MatuskaHere, the midpoint of
23623ff01b23SMartin Matuska.Em 500us
23633ff01b23SMartin Matuskatranslates to
23643ff01b23SMartin Matuska.Em 2000 IOPS .
23653ff01b23SMartin MatuskaThe shape of the curve
23663ff01b23SMartin Matuskawas chosen such that small changes in the amount of accumulated dirty data
23673ff01b23SMartin Matuskain the first three quarters of the curve yield relatively small differences
23683ff01b23SMartin Matuskain the amount of delay.
23693ff01b23SMartin Matuska.Pp
23703ff01b23SMartin MatuskaThe effects can be easier to understand when the amount of delay is
23713ff01b23SMartin Matuskarepresented on a logarithmic scale:
23723ff01b23SMartin Matuska.Bd -literal
23733ff01b23SMartin Matuskadelay
23743ff01b23SMartin Matuska100ms +-------------------------------------------------------------++
23753ff01b23SMartin Matuska      +                                                              +
23763ff01b23SMartin Matuska      |                                                              |
23773ff01b23SMartin Matuska      +                                                             *+
23783ff01b23SMartin Matuska 10ms +                                                             *+
23793ff01b23SMartin Matuska      +                                                           ** +
23803ff01b23SMartin Matuska      |                                              (midpoint)  **  |
23813ff01b23SMartin Matuska      +                                                  |     **    +
23823ff01b23SMartin Matuska  1ms +                                                  v ****      +
23833ff01b23SMartin Matuska      +             \fBzfs_delay_scale\fP ---------->        *****         +
23843ff01b23SMartin Matuska      |                                             ****             |
23853ff01b23SMartin Matuska      +                                          ****                +
23863ff01b23SMartin Matuska100us +                                        **                    +
23873ff01b23SMartin Matuska      +                                       *                      +
23883ff01b23SMartin Matuska      |                                      *                       |
23893ff01b23SMartin Matuska      +                                     *                        +
23903ff01b23SMartin Matuska 10us +                                     *                        +
23913ff01b23SMartin Matuska      +                                                              +
23923ff01b23SMartin Matuska      |                                                              |
23933ff01b23SMartin Matuska      +                                                              +
23943ff01b23SMartin Matuska      +--------------------------------------------------------------+
23953ff01b23SMartin Matuska      0%                    <- \fBzfs_dirty_data_max\fP ->               100%
23963ff01b23SMartin Matuska.Ed
23973ff01b23SMartin Matuska.Pp
23983ff01b23SMartin MatuskaNote here that only as the amount of dirty data approaches its limit does
23993ff01b23SMartin Matuskathe delay start to increase rapidly.
24003ff01b23SMartin MatuskaThe goal of a properly tuned system should be to keep the amount of dirty data
24013ff01b23SMartin Matuskaout of that range by first ensuring that the appropriate limits are set
24023ff01b23SMartin Matuskafor the I/O scheduler to reach optimal throughput on the back-end storage,
24033ff01b23SMartin Matuskaand then by changing the value of
24043ff01b23SMartin Matuska.Sy zfs_delay_scale
24053ff01b23SMartin Matuskato increase the steepness of the curve.
2406