.\"
.\" Copyright (c) 2013 by Turbo Fredriksson <turbo@bayour.com>. All rights reserved.
.\" Copyright (c) 2019, 2021 by Delphix. All rights reserved.
.\" Copyright (c) 2019 Datto Inc.
.\" Copyright (c) 2023, 2024 Klara, Inc.
.\" The contents of this file are subject to the terms of the Common Development
.\" and Distribution License (the "License"). You may not use this file except
.\" in compliance with the License. You can obtain a copy of the license at
.\" usr/src/OPENSOLARIS.LICENSE or https://opensource.org/licenses/CDDL-1.0.
.\"
.\" See the License for the specific language governing permissions and
.\" limitations under the License. When distributing Covered Code, include this
.\" CDDL HEADER in each file and include the License file at
.\" usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this
.\" CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your
.\" own identifying information:
.\" Portions Copyright [yyyy] [name of copyright owner]
.\"
.Dd June 27, 2024
.Dt ZFS 4
.Os
.
.Sh NAME
.Nm zfs
.Nd tuning of the ZFS kernel module
.
.Sh DESCRIPTION
The ZFS module supports these parameters:
.Bl -tag -width Ds
.It Sy dbuf_cache_max_bytes Ns = Ns Sy UINT64_MAX Ns B Pq u64
Maximum size in bytes of the dbuf cache.
The target size is the lesser of this value and
.No 1/2^ Ns Sy dbuf_cache_shift Pq 1/32nd
of the target ARC size.
The behavior of the dbuf cache and its associated settings
can be observed via the
.Pa /proc/spl/kstat/zfs/dbufstats
kstat.
.
.It Sy dbuf_metadata_cache_max_bytes Ns = Ns Sy UINT64_MAX Ns B Pq u64
Maximum size in bytes of the metadata dbuf cache.
The target size is the lesser of this value and
.No 1/2^ Ns Sy dbuf_metadata_cache_shift Pq 1/64th
of the target ARC size.
The behavior of the metadata dbuf cache and its associated settings
can be observed via the
.Pa /proc/spl/kstat/zfs/dbufstats
kstat.
.
.It Sy dbuf_cache_hiwater_pct Ns = Ns Sy 10 Ns % Pq uint
The percentage over
.Sy dbuf_cache_max_bytes
when dbufs must be evicted directly.
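.Pp
As a worked illustration of the shift-based target (a sketch only, assuming
the default shift and a hypothetical 4 GiB target ARC size):
.Bd -literal -compact
dbuf cache target = ARC target / 2^dbuf_cache_shift
                  = 4 GiB / 32 = 128 MiB
direct eviction   : above target + dbuf_cache_hiwater_pct (10 %)
.Ed
.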
.It Sy dbuf_cache_lowater_pct Ns = Ns Sy 10 Ns % Pq uint
The percentage below
.Sy dbuf_cache_max_bytes
when the evict thread stops evicting dbufs.
.
.It Sy dbuf_cache_shift Ns = Ns Sy 5 Pq uint
Set the size of the dbuf cache
.Pq Sy dbuf_cache_max_bytes
to a log2 fraction of the target ARC size.
.
.It Sy dbuf_metadata_cache_shift Ns = Ns Sy 6 Pq uint
Set the size of the dbuf metadata cache
.Pq Sy dbuf_metadata_cache_max_bytes
to a log2 fraction of the target ARC size.
.
.It Sy dbuf_mutex_cache_shift Ns = Ns Sy 0 Pq uint
Set the size of the mutex array for the dbuf cache.
When set to
.Sy 0
the array is dynamically sized based on total system memory.
.
.It Sy dmu_object_alloc_chunk_shift Ns = Ns Sy 7 Po 128 Pc Pq uint
dnode slots allocated in a single operation as a power of 2.
The default value minimizes lock contention for the bulk operation performed.
.
.It Sy dmu_prefetch_max Ns = Ns Sy 134217728 Ns B Po 128 MiB Pc Pq uint
Limit the amount we can prefetch with one call to this amount in bytes.
This helps to limit the amount of memory that can be used by prefetching.
.
.It Sy ignore_hole_birth Pq int
Alias for
.Sy send_holes_without_birth_time .
.
.It Sy l2arc_feed_again Ns = Ns Sy 1 Ns | Ns 0 Pq int
Turbo L2ARC warm-up.
When the L2ARC is cold the fill interval will be set as fast as possible.
.
.It Sy l2arc_feed_min_ms Ns = Ns Sy 200 Pq u64
Min feed interval in milliseconds.
Requires
.Sy l2arc_feed_again Ns = Ns Ar 1
and only applicable in related situations.
.
.It Sy l2arc_feed_secs Ns = Ns Sy 1 Pq u64
Seconds between L2ARC writing.
.
.It Sy l2arc_headroom Ns = Ns Sy 8 Pq u64
How far through the ARC lists to search for L2ARC cacheable content,
expressed as a multiplier of
.Sy l2arc_write_max .
ARC persistence across reboots can be achieved with persistent L2ARC
by setting this parameter to
.Sy 0 ,
allowing the full length of ARC lists to be searched for cacheable content.
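.Pp
For example, to apply this at runtime on Linux through the module's
standard sysfs parameter path:
.Bd -literal -compact
# echo 0 > /sys/module/zfs/parameters/l2arc_headroom
.Ed
.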
.It Sy l2arc_headroom_boost Ns = Ns Sy 200 Ns % Pq u64
Scales
.Sy l2arc_headroom
by this percentage when L2ARC contents are being successfully compressed
before writing.
A value of
.Sy 100
disables this feature.
.
.It Sy l2arc_exclude_special Ns = Ns Sy 0 Ns | Ns 1 Pq int
Controls whether buffers present on special vdevs are eligible for caching
into L2ARC.
If set to 1, exclude dbufs on special vdevs from being cached to L2ARC.
.
.It Sy l2arc_mfuonly Ns = Ns Sy 0 Ns | Ns 1 Pq int
Controls whether only MFU metadata and data are cached from ARC into L2ARC.
This may be desired to avoid wasting space on L2ARC when reading/writing large
amounts of data that are not expected to be accessed more than once.
.Pp
The default is off,
meaning both MRU and MFU data and metadata are cached.
When turning off this feature, some MRU buffers will still be present
in ARC and eventually cached on L2ARC.
.No If Sy l2arc_noprefetch Ns = Ns Sy 0 ,
some prefetched buffers will be cached to L2ARC, and those might later
transition to MRU, in which case the
.Sy l2arc_mru_asize No arcstat will not be Sy 0 .
.Pp
Regardless of
.Sy l2arc_noprefetch ,
some MFU buffers might be evicted from ARC,
accessed later on as prefetches and transition to MRU as prefetches.
If accessed again they are counted as MRU and the
.Sy l2arc_mru_asize No arcstat will not be Sy 0 .
.Pp
The ARC status of L2ARC buffers when they were first cached in
L2ARC can be seen in the
.Sy l2arc_mru_asize , Sy l2arc_mfu_asize , No and Sy l2arc_prefetch_asize
arcstats when importing the pool or onlining a cache
device if persistent L2ARC is enabled.
.Pp
The
.Sy evict_l2_eligible_mru
arcstat does not take this option into account, so the information
provided by the
.Sy evict_l2_eligible_m[rf]u
arcstats can be used to decide if toggling this option is appropriate
for the current workload.
.
.It Sy l2arc_meta_percent Ns = Ns Sy 33 Ns % Pq uint
Percent of ARC size allowed for L2ARC-only headers.
Since L2ARC buffers are not evicted on memory pressure,
too many headers on a system with an irrationally large L2ARC
can render it slow or unusable.
This parameter limits L2ARC writes and rebuilds to achieve the target.
.
.It Sy l2arc_trim_ahead Ns = Ns Sy 0 Ns % Pq u64
Trims ahead of the current write size
.Pq Sy l2arc_write_max
on L2ARC devices by this percentage of write size if we have filled the device.
If set to
.Sy 100
we TRIM twice the space required to accommodate upcoming writes.
A minimum of
.Sy 64 MiB
will be trimmed.
It also enables TRIM of the whole L2ARC device upon creation
or addition to an existing pool or if the header of the device is
invalid upon importing a pool or onlining a cache device.
A value of
.Sy 0
disables TRIM on L2ARC altogether and is the default as it can put significant
stress on the underlying storage devices.
This will vary depending on how well the specific device handles these commands.
.
.It Sy l2arc_noprefetch Ns = Ns Sy 1 Ns | Ns 0 Pq int
Do not write buffers to L2ARC if they were prefetched but not used by
applications.
In case there are prefetched buffers in L2ARC and this option
is later set, we do not read the prefetched buffers from L2ARC.
Unsetting this option is useful for caching sequential reads from the
disks to L2ARC and serving those reads from L2ARC later on.
This may be beneficial in case the L2ARC device is significantly faster
in sequential reads than the disks of the pool.
.Pp
Use
.Sy 1
to disable and
.Sy 0
to enable caching/reading prefetches to/from L2ARC.
.
.It Sy l2arc_norw Ns = Ns Sy 0 Ns | Ns 1 Pq int
No reads during writes.
.
.It Sy l2arc_write_boost Ns = Ns Sy 33554432 Ns B Po 32 MiB Pc Pq u64
Cold L2ARC devices will have
.Sy l2arc_write_max
increased by this amount while they remain cold.
.
.It Sy l2arc_write_max Ns = Ns Sy 33554432 Ns B Po 32 MiB Pc Pq u64
Max write bytes per interval.
.
.It Sy l2arc_rebuild_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Rebuild the L2ARC when importing a pool (persistent L2ARC).
This can be disabled if there are problems importing a pool
or attaching an L2ARC device (e.g. the L2ARC device is slow
in reading stored log metadata, or the metadata
has become somehow fragmented/unusable).
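.Pp
If rebuild must be disabled in order to import a pool, the setting can be
made persistent across module loads; a sketch assuming the usual
.Xr modprobe.d 5
configuration on Linux:
.Bd -literal -compact
# /etc/modprobe.d/zfs.conf
options zfs l2arc_rebuild_enabled=0
.Ed
.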
.It Sy l2arc_rebuild_blocks_min_l2size Ns = Ns Sy 1073741824 Ns B Po 1 GiB Pc Pq u64
Minimum size of an L2ARC device required in order to write log blocks in it.
The log blocks are used upon importing the pool to rebuild the persistent L2ARC.
.Pp
For L2ARC devices less than 1 GiB, the amount of data
.Fn l2arc_evict
evicts is significant compared to the amount of restored L2ARC data.
In this case, do not write log blocks in L2ARC in order not to waste space.
.
.It Sy metaslab_aliquot Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq u64
Metaslab granularity, in bytes.
This is roughly similar to what would be referred to as the "stripe size"
in traditional RAID arrays.
In normal operation, ZFS will try to write this amount of data to each disk
before moving on to the next top-level vdev.
.
.It Sy metaslab_bias_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable metaslab group biasing based on their vdevs' over- or under-utilization
relative to the pool.
.
.It Sy metaslab_force_ganging Ns = Ns Sy 16777217 Ns B Po 16 MiB + 1 B Pc Pq u64
Make some blocks above a certain size be gang blocks.
This option is used by the test suite to facilitate testing.
.
.It Sy metaslab_force_ganging_pct Ns = Ns Sy 3 Ns % Pq uint
For blocks that could be forced to be a gang block (due to
.Sy metaslab_force_ganging ) ,
force this percentage of them to be gang blocks.
.
.It Sy brt_zap_prefetch Ns = Ns Sy 1 Ns | Ns 0 Pq int
Controls prefetching BRT records for blocks which are going to be cloned.
.
.It Sy brt_zap_default_bs Ns = Ns Sy 12 Po 4 KiB Pc Pq int
Default BRT ZAP data block size as a power of 2.
Note that changing this after creating a BRT on the pool will not affect
existing BRTs, only newly created ones.
.
.It Sy brt_zap_default_ibs Ns = Ns Sy 12 Po 4 KiB Pc Pq int
Default BRT ZAP indirect block size as a power of 2.
Note that changing this after creating a BRT on the pool will not affect
existing BRTs, only newly created ones.
.
.It Sy ddt_zap_default_bs Ns = Ns Sy 15 Po 32 KiB Pc Pq int
Default DDT ZAP data block size as a power of 2.
Note that changing this after creating a DDT on the pool will not affect
existing DDTs, only newly created ones.
.
.It Sy ddt_zap_default_ibs Ns = Ns Sy 15 Po 32 KiB Pc Pq int
Default DDT ZAP indirect block size as a power of 2.
Note that changing this after creating a DDT on the pool will not affect
existing DDTs, only newly created ones.
.
.It Sy zfs_default_bs Ns = Ns Sy 9 Po 512 B Pc Pq int
Default dnode block size as a power of 2.
.
.It Sy zfs_default_ibs Ns = Ns Sy 17 Po 128 KiB Pc Pq int
Default dnode indirect block size as a power of 2.
.
.It Sy zfs_history_output_max Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq u64
When attempting to log an output nvlist of an ioctl in the on-disk history,
the output will not be stored if it is larger than this size (in bytes).
This must be less than
.Sy DMU_MAX_ACCESS Pq 64 MiB .
This applies primarily to
.Fn zfs_ioc_channel_program Pq cf. Xr zfs-program 8 .
.
.It Sy zfs_keep_log_spacemaps_at_export Ns = Ns Sy 0 Ns | Ns 1 Pq int
Prevent log spacemaps from being destroyed during pool exports and destroys.
.
.It Sy zfs_metaslab_segment_weight_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable/disable segment-based metaslab selection.
.
.It Sy zfs_metaslab_switch_threshold Ns = Ns Sy 2 Pq int
When using segment-based metaslab selection, continue allocating
from the active metaslab until this option's
worth of buckets have been exhausted.
.
.It Sy metaslab_debug_load Ns = Ns Sy 0 Ns | Ns 1 Pq int
Load all metaslabs during pool import.
.
.It Sy metaslab_debug_unload Ns = Ns Sy 0 Ns | Ns 1 Pq int
Prevent metaslabs from being unloaded.
.
.It Sy metaslab_fragmentation_factor_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable use of the fragmentation metric in computing metaslab weights.
.
.It Sy metaslab_df_max_search Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq uint
Maximum distance to search forward from the last offset.
Without this limit, fragmented pools can see
.Em >100`000
iterations and
.Fn metaslab_block_picker
becomes the performance limiting factor on high-performance storage.
.Pp
With the default setting of
.Sy 16 MiB ,
we typically see less than
.Em 500
iterations, even with very fragmented
.Sy ashift Ns = Ns Sy 9
pools.
The maximum number of iterations possible is
.Sy metaslab_df_max_search / 2^(ashift+1) .
With the default setting of
.Sy 16 MiB
this is
.Em 16*1024 Pq with Sy ashift Ns = Ns Sy 9
or
.Em 2*1024 Pq with Sy ashift Ns = Ns Sy 12 .
.
.It Sy metaslab_df_use_largest_segment Ns = Ns Sy 0 Ns | Ns 1 Pq int
If not searching forward (due to
.Sy metaslab_df_max_search , metaslab_df_free_pct ,
.No or Sy metaslab_df_alloc_threshold ) ,
this tunable controls which segment is used.
If set, we will use the largest free segment.
If unset, we will use a segment of at least the requested size.
.
.It Sy zfs_metaslab_max_size_cache_sec Ns = Ns Sy 3600 Ns s Po 1 hour Pc Pq u64
When we unload a metaslab, we cache the size of the largest free chunk.
We use that cached size to determine whether or not to load a metaslab
for a given allocation.
As more frees accumulate in that metaslab while it's unloaded,
the cached max size becomes less and less accurate.
After a number of seconds controlled by this tunable,
we stop considering the cached max size and start
considering only the histogram instead.
.
.It Sy zfs_metaslab_mem_limit Ns = Ns Sy 25 Ns % Pq uint
When we are loading a new metaslab, we check the amount of memory being used
to store metaslab range trees.
If it is over a threshold, we attempt to unload the least recently used metaslab
to prevent the system from clogging all of its memory with range trees.
This tunable sets the percentage of total system memory that is the threshold.
.
.It Sy zfs_metaslab_try_hard_before_gang Ns = Ns Sy 0 Ns | Ns 1 Pq int
.Bl -item -compact
.It
If unset, we will first try normal allocation.
.It
If that fails then we will do a gang allocation.
.It
If that fails then we will do a "try hard" gang allocation.
.It
If that fails then we will have a multi-layer gang block.
.El
.Pp
.Bl -item -compact
.It
If set, we will first try normal allocation.
.It
If that fails then we will do a "try hard" allocation.
.It
If that fails we will do a gang allocation.
.It
If that fails we will do a "try hard" gang allocation.
.It
If that fails then we will have a multi-layer gang block.
.El
.
.It Sy zfs_metaslab_find_max_tries Ns = Ns Sy 100 Pq uint
When not trying hard, we only consider this number of the best metaslabs.
This improves performance, especially when there are many metaslabs per vdev
and the allocation can't actually be satisfied
(so we would otherwise iterate all metaslabs).
.
.It Sy zfs_vdev_default_ms_count Ns = Ns Sy 200 Pq uint
When a vdev is added, target this number of metaslabs per top-level vdev.
.
.It Sy zfs_vdev_default_ms_shift Ns = Ns Sy 29 Po 512 MiB Pc Pq uint
Default lower limit for metaslab size.
.
.It Sy zfs_vdev_max_ms_shift Ns = Ns Sy 34 Po 16 GiB Pc Pq uint
Default upper limit for metaslab size.
.
.It Sy zfs_vdev_max_auto_ashift Ns = Ns Sy 14 Pq uint
Maximum ashift used when optimizing for logical \[->] physical sector size on
new top-level vdevs.
May be increased up to
.Sy ASHIFT_MAX Po 16 Pc ,
but this may negatively impact pool space efficiency.
.
.It Sy zfs_vdev_min_auto_ashift Ns = Ns Sy ASHIFT_MIN Po 9 Pc Pq uint
Minimum ashift used when creating new top-level vdevs.
.
.It Sy zfs_vdev_min_ms_count Ns = Ns Sy 16 Pq uint
Minimum number of metaslabs to create in a top-level vdev.
.
.It Sy vdev_validate_skip Ns = Ns Sy 0 Ns | Ns 1 Pq int
Skip label validation steps during pool import.
Changing is not recommended unless you know what you're doing
and are recovering a damaged label.
.
.It Sy zfs_vdev_ms_count_limit Ns = Ns Sy 131072 Po 128k Pc Pq uint
Practical upper limit of total metaslabs per top-level vdev.
.
.It Sy metaslab_preload_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable metaslab group preloading.
.
.It Sy metaslab_preload_limit Ns = Ns Sy 10 Pq uint
Maximum number of metaslabs per group to preload.
.
.It Sy metaslab_preload_pct Ns = Ns Sy 50 Pq uint
Percentage of CPUs to run a metaslab preload taskq.
.
.It Sy metaslab_lba_weighting_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Give more weight to metaslabs with lower LBAs,
assuming they have greater bandwidth,
as is typically the case on a modern constant angular velocity disk drive.
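.Pp
On pools built entirely from SSDs, where LBA has no bandwidth correlation,
this weighting may be unnecessary; for example, it can be disabled at
runtime on Linux through the standard module parameter path:
.Bd -literal -compact
# echo 0 > /sys/module/zfs/parameters/metaslab_lba_weighting_enabled
.Ed
.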
.It Sy metaslab_unload_delay Ns = Ns Sy 32 Pq uint
After a metaslab is used, we keep it loaded for this many TXGs, to attempt to
reduce unnecessary reloading.
Note that both this many TXGs and
.Sy metaslab_unload_delay_ms
milliseconds must pass before unloading will occur.
.
.It Sy metaslab_unload_delay_ms Ns = Ns Sy 600000 Ns ms Po 10 min Pc Pq uint
After a metaslab is used, we keep it loaded for this many milliseconds,
to attempt to reduce unnecessary reloading.
Note that both this many milliseconds and
.Sy metaslab_unload_delay
TXGs must pass before unloading will occur.
.
.It Sy reference_history Ns = Ns Sy 3 Pq uint
Maximum reference holders being tracked when reference_tracking_enable is
active.
.
.It Sy raidz_expand_max_copy_bytes Ns = Ns Sy 160MB Pq ulong
Max amount of memory to use for RAID-Z expansion I/O.
This limits how much I/O can be outstanding at once.
.
.It Sy raidz_expand_max_reflow_bytes Ns = Ns Sy 0 Pq ulong
For testing, pause RAID-Z expansion when reflow amount reaches this value.
.
.It Sy raidz_io_aggregate_rows Ns = Ns Sy 4 Pq ulong
For expanded RAID-Z, aggregate reads that have more rows than this.
.
.It Sy reference_tracking_enable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Track reference holders to
.Sy refcount_t
objects (debug builds only).
.
.It Sy send_holes_without_birth_time Ns = Ns Sy 1 Ns | Ns 0 Pq int
When set, the
.Sy hole_birth
optimization will not be used, and all holes will always be sent during a
.Nm zfs Cm send .
This is useful if you suspect your datasets are affected by a bug in
.Sy hole_birth .
.
.It Sy spa_config_path Ns = Ns Pa /etc/zfs/zpool.cache Pq charp
SPA config file.
.
.It Sy spa_asize_inflation Ns = Ns Sy 24 Pq uint
Multiplication factor used to estimate actual disk consumption from the
size of data being written.
The default value is a worst case estimate,
but lower values may be valid for a given pool depending on its configuration.
Pool administrators who understand the factors involved
may wish to specify a more realistic inflation factor,
particularly if they operate close to quota or capacity limits.
.
.It Sy spa_load_print_vdev_tree Ns = Ns Sy 0 Ns | Ns 1 Pq int
Whether to print the vdev tree in the debugging message buffer during pool
import.
.
.It Sy spa_load_verify_data Ns = Ns Sy 1 Ns | Ns 0 Pq int
Whether to traverse data blocks during an "extreme rewind"
.Pq Fl X
import.
.Pp
An extreme rewind import normally performs a full traversal of all
blocks in the pool for verification.
If this parameter is unset, the traversal skips non-metadata blocks.
It can be toggled once the
import has started to stop or start the traversal of non-metadata blocks.
.
.It Sy spa_load_verify_metadata Ns = Ns Sy 1 Ns | Ns 0 Pq int
Whether to traverse blocks during an "extreme rewind"
.Pq Fl X
pool import.
.Pp
An extreme rewind import normally performs a full traversal of all
blocks in the pool for verification.
If this parameter is unset, the traversal is not performed.
It can be toggled once the import has started to stop or start the traversal.
.
.It Sy spa_load_verify_shift Ns = Ns Sy 4 Po 1/16th Pc Pq uint
Sets the maximum number of bytes to consume during pool import to the log2
fraction of the target ARC size.
.
.It Sy spa_slop_shift Ns = Ns Sy 5 Po 1/32nd Pc Pq int
Normally, we don't allow the last
.Sy 3.2% Pq Sy 1/2^spa_slop_shift
of space in the pool to be consumed.
This ensures that we don't run the pool completely out of space,
due to unaccounted changes (e.g. to the MOS).
It also limits the worst-case time to allocate space.
If we have less than this amount of free space,
most ZPL operations (e.g. write, create) will return
.Sy ENOSPC .
.
.It Sy spa_num_allocators Ns = Ns Sy 4 Pq int
Determines the number of block allocators to use per spa instance.
Capped by the number of actual CPUs in the system via
.Sy spa_cpus_per_allocator .
.Pp
Note that setting this value too high could result in performance
degradation and/or excess fragmentation.
The set value only applies to pools imported or created afterwards.
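.Pp
Because the value only takes effect for pools imported or created after it
is set, it is typically configured at module load time; a sketch assuming
the usual Linux module options file:
.Bd -literal -compact
# /etc/modprobe.d/zfs.conf
options zfs spa_num_allocators=8
.Ed
.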
.It Sy spa_cpus_per_allocator Ns = Ns Sy 4 Pq int
Determines the minimum number of CPUs in a system required per block
allocator of a spa instance.
The set value only applies to pools imported or created afterwards.
.
.It Sy spa_upgrade_errlog_limit Ns = Ns Sy 0 Pq uint
Limits the number of on-disk error log entries that will be converted to the
new format when enabling the
.Sy head_errlog
feature.
The default is to convert all log entries.
.
.It Sy vdev_removal_max_span Ns = Ns Sy 32768 Ns B Po 32 KiB Pc Pq uint
During top-level vdev removal, chunks of data are copied from the vdev
which may include free space in order to trade bandwidth for IOPS.
This parameter determines the maximum span of free space, in bytes,
which will be included as "unnecessary" data in a chunk of copied data.
.Pp
The default value here was chosen to align with
.Sy zfs_vdev_read_gap_limit ,
which is a similar concept when doing
regular reads (but there's no reason it has to be the same).
.
.It Sy vdev_file_logical_ashift Ns = Ns Sy 9 Po 512 B Pc Pq u64
Logical ashift for file-based devices.
.
.It Sy vdev_file_physical_ashift Ns = Ns Sy 9 Po 512 B Pc Pq u64
Physical ashift for file-based devices.
.
.It Sy zap_iterate_prefetch Ns = Ns Sy 1 Ns | Ns 0 Pq int
If set, when we start iterating over a ZAP object,
prefetch the entire object (all leaf blocks).
However, this is limited by
.Sy dmu_prefetch_max .
.
.It Sy zap_micro_max_size Ns = Ns Sy 131072 Ns B Po 128 KiB Pc Pq int
Maximum micro ZAP size.
A micro ZAP is upgraded to a fat ZAP once it grows beyond the specified size.
.
.It Sy zap_shrink_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
If set, adjacent empty ZAP blocks will be collapsed, reducing disk space usage.
.
.It Sy zfetch_min_distance Ns = Ns Sy 4194304 Ns B Po 4 MiB Pc Pq uint
Min bytes to prefetch per stream.
The prefetch distance starts from the demand access size and quickly grows to
this value, doubling on each hit.
After that it may grow further by 1/8 per hit, but only if some prefetch
issued since the last growth has not completed in time to satisfy a demand
request, i.e. the prefetch depth did not cover the read latency or the pool
got saturated.
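.Pp
As an illustration of the growth policy described above (the starting size
is an example only, assuming default settings):
.Bd -literal -compact
demand read hits: 128 KiB -> 256 KiB -> ... -> 4 MiB  (doubling per hit)
afterwards:       +1/8 per hit, up to zfetch_max_distance
.Ed
.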
.It Sy zfetch_max_distance Ns = Ns Sy 67108864 Ns B Po 64 MiB Pc Pq uint
Max bytes to prefetch per stream.
.
.It Sy zfetch_max_idistance Ns = Ns Sy 67108864 Ns B Po 64 MiB Pc Pq uint
Max bytes to prefetch indirects for per stream.
.
.It Sy zfetch_max_reorder Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq uint
Requests within this byte distance from the current prefetch stream position
are considered parts of the stream, reordered due to parallel processing.
Such requests do not advance the stream position immediately unless the
.Sy zfetch_hole_shift
fill threshold is reached, but are saved to fill holes in the stream later.
.
.It Sy zfetch_max_streams Ns = Ns Sy 8 Pq uint
Max number of streams per zfetch (prefetch streams per file).
.
.It Sy zfetch_min_sec_reap Ns = Ns Sy 1 Pq uint
Min time before an inactive prefetch stream can be reclaimed.
.
.It Sy zfetch_max_sec_reap Ns = Ns Sy 2 Pq uint
Max time before an inactive prefetch stream can be deleted.
.
.It Sy zfs_abd_scatter_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Controls whether the ARC may use scatter/gather lists;
if unset, all allocations are forced to be linear in kernel memory.
Disabling can improve performance in some code paths
at the expense of fragmented kernel memory.
.
.It Sy zfs_abd_scatter_max_order Ns = Ns Sy MAX_ORDER\-1 Pq uint
Maximum number of consecutive memory pages allocated in a single block for
scatter/gather lists.
.Pp
The value of
.Sy MAX_ORDER
depends on kernel configuration.
.
.It Sy zfs_abd_scatter_min_size Ns = Ns Sy 1536 Ns B Po 1.5 KiB Pc Pq uint
This is the minimum allocation size that will use scatter (page-based) ABDs.
Smaller allocations will use linear ABDs.
.
.It Sy zfs_arc_dnode_limit Ns = Ns Sy 0 Ns B Pq u64
When the number of bytes consumed by dnodes in the ARC exceeds this number of
bytes, try to unpin some of it in response to demand for non-metadata.
This value acts as a ceiling to the amount of dnode metadata, and defaults to
.Sy 0 ,
which indicates that a percentage based on
.Sy zfs_arc_dnode_limit_percent
of the ARC meta buffers may be used for dnodes.
.
.It Sy zfs_arc_dnode_limit_percent Ns = Ns Sy 10 Ns % Pq u64
Percentage that can be consumed by dnodes of ARC meta buffers.
.Pp
See also
.Sy zfs_arc_dnode_limit ,
which serves a similar purpose but has a higher priority if nonzero.
.
.It Sy zfs_arc_dnode_reduce_percent Ns = Ns Sy 10 Ns % Pq u64
Percentage of ARC dnodes to try to scan in response to demand for non-metadata
when the number of bytes consumed by dnodes exceeds
.Sy zfs_arc_dnode_limit .
.
.It Sy zfs_arc_average_blocksize Ns = Ns Sy 8192 Ns B Po 8 KiB Pc Pq uint
The ARC's buffer hash table is sized based on the assumption of an average
block size of this value.
This works out to roughly 1 MiB of hash table per 1 GiB of physical memory
with 8-byte pointers.
For configurations with a known larger average block size,
this value can be increased to reduce the memory footprint.
.
.It Sy zfs_arc_eviction_pct Ns = Ns Sy 200 Ns % Pq uint
When
.Fn arc_is_overflowing ,
.Fn arc_get_data_impl
waits for this percent of the requested amount of data to be evicted.
For example, by default, for every
.Em 2 KiB
that's evicted,
.Em 1 KiB
of it may be "reused" by a new allocation.
Since this is above
.Sy 100 Ns % ,
it ensures that progress is made towards getting
.Sy arc_size No under Sy arc_c .
Since this is finite, it ensures that allocations can still happen,
even during the potentially long time that
.Sy arc_size No is more than Sy arc_c .
.
.It Sy zfs_arc_evict_batch_limit Ns = Ns Sy 10 Pq uint
Number of ARC headers to evict per sub-list before proceeding to another
sub-list.
This batch-style operation prevents entire sub-lists from being evicted at once
but comes at a cost of additional unlocking and locking.
.
.It Sy zfs_arc_grow_retry Ns = Ns Sy 0 Ns s Pq uint
If set to a nonzero value, it will replace the
.Sy arc_grow_retry
value with this value.
The
.Sy arc_grow_retry
.No value Pq default Sy 5 Ns s
is the number of seconds the ARC will wait before
trying to resume growth after a memory pressure event.
.
.It Sy zfs_arc_lotsfree_percent Ns = Ns Sy 10 Ns % Pq int
Throttle I/O when free system memory drops below this percentage of total
system memory.
Setting this value to
.Sy 0
will disable the throttle.
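.Pp
For example, to disable the throttle at runtime on Linux through the
standard module parameter path:
.Bd -literal -compact
# echo 0 > /sys/module/zfs/parameters/zfs_arc_lotsfree_percent
.Ed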
.
.It Sy zfs_arc_max Ns = Ns Sy 0 Ns B Pq u64
Max size of ARC in bytes.
If
.Sy 0 ,
then the max size of ARC is determined by the amount of system memory installed.
The larger of
.Sy all_system_memory No \- Sy 1 GiB
and
.Sy 5/8 No \(mu Sy all_system_memory
will be used as the limit.
This value must be at least
.Sy 67108864 Ns B Pq 64 MiB .
.Pp
This value can be changed dynamically, with some caveats.
It cannot be set back to
.Sy 0
while running, and reducing it below the current ARC size will not cause
the ARC to shrink without memory pressure to induce shrinking.
.
.It Sy zfs_arc_meta_balance Ns = Ns Sy 500 Pq uint
Balance between metadata and data on ghost hits.
Values above 100 increase metadata caching by proportionally reducing the
effect of ghost data hits on the target data/metadata rate.
.
.It Sy zfs_arc_min Ns = Ns Sy 0 Ns B Pq u64
Min size of ARC in bytes.
.No If set to Sy 0 , arc_c_min
will default to consuming the larger of
.Sy 32 MiB
and
.Sy all_system_memory No / Sy 32 .
.
.It Sy zfs_arc_min_prefetch_ms Ns = Ns Sy 0 Ns ms Ns Po Ns ≡ Ns 1s Pc Pq uint
Minimum time prefetched blocks are locked in the ARC.
.
.It Sy zfs_arc_min_prescient_prefetch_ms Ns = Ns Sy 0 Ns ms Ns Po Ns ≡ Ns 6s Pc Pq uint
Minimum time "prescient prefetched" blocks are locked in the ARC.
These blocks are meant to be prefetched fairly aggressively ahead of
the code that may use them.
.
.It Sy zfs_arc_prune_task_threads Ns = Ns Sy 1 Pq int
Number of arc_prune threads.
.Fx
does not need more than one.
Linux may theoretically use one per mount point up to number of CPUs,
but that was not proven to be useful.
.
.It Sy zfs_max_missing_tvds Ns = Ns Sy 0 Pq int
Number of missing top-level vdevs which will be allowed during
pool import (only in read-only mode).
.
.It Sy zfs_max_nvlist_src_size Ns = Ns Sy 0 Pq u64
Maximum size in bytes allowed to be passed as
.Sy zc_nvlist_src_size
for ioctls on
.Pa /dev/zfs .
This prevents a user from causing the kernel to allocate
an excessive amount of memory.
When the limit is exceeded, the ioctl fails with
.Sy EINVAL
and a description of the error is sent to the
.Pa zfs-dbgmsg
log.
This parameter should not need to be touched under normal circumstances.
If
.Sy 0 ,
equivalent to a quarter of the user-wired memory limit under
.Fx
and to
.Sy 134217728 Ns B Pq 128 MiB
under Linux.
.
.It Sy zfs_multilist_num_sublists Ns = Ns Sy 0 Pq uint
To allow more fine-grained locking, each ARC state contains a series
of lists for both data and metadata objects.
Locking is performed at the level of these "sub-lists".
This parameter controls the number of sub-lists per ARC state,
and also applies to other uses of the multilist data structure.
.Pp
If
.Sy 0 ,
equivalent to the greater of the number of online CPUs and
.Sy 4 .
.
.It Sy zfs_arc_overflow_shift Ns = Ns Sy 8 Pq int
The ARC size is considered to be overflowing if it exceeds the current
ARC target size
.Pq Sy arc_c
by thresholds determined by this parameter.
Exceeding by
.Sy ( arc_c No >> Sy zfs_arc_overflow_shift ) No / Sy 2
starts the ARC reclamation process.
If that appears insufficient, exceeding by
.Sy ( arc_c No >> Sy zfs_arc_overflow_shift ) No \(mu Sy 1.5
blocks new buffer allocation until the reclaim thread catches up.
Once started, the reclamation process continues until the ARC size returns
below the target size.
.Pp
The default value of
.Sy 8
causes the ARC to start reclamation if it exceeds the target size by
.Em 0.2%
of the target size, and block allocations by
.Em 0.6% .
.
.It Sy zfs_arc_shrink_shift Ns = Ns Sy 0 Pq uint
If nonzero, this will update
.Sy arc_shrink_shift Pq default Sy 7
with the new value.
.
.It Sy zfs_arc_pc_percent Ns = Ns Sy 0 Ns % Po off Pc Pq uint
Percent of pagecache to reclaim ARC to.
.Pp
This tunable allows the ZFS ARC to play more nicely
with the kernel's LRU pagecache.
It can guarantee that the ARC size won't collapse under scanning
pressure on the pagecache, yet still allows the ARC to be reclaimed down to
.Sy zfs_arc_min
if necessary.
This value is specified as percent of pagecache size (as measured by
.Sy NR_FILE_PAGES ) ,
where that percent may exceed
.Sy 100 .
This
only operates during memory pressure/reclaim.
.
.It Sy zfs_arc_shrinker_limit Ns = Ns Sy 10000 Pq int
This is a limit on how many pages the ARC shrinker makes available for
eviction in response to one page allocation attempt.
Note that in practice, the kernel's shrinker can ask us to evict
up to about four times this for one allocation attempt.
.Pp
The default limit of
.Sy 10000 Pq in practice, Em 160 MiB No per allocation attempt with 4 KiB pages
limits the amount of time spent attempting to reclaim ARC memory to
less than 100 ms per allocation attempt,
even with a small average compressed block size of ~8 KiB.
.Pp
The parameter can be set to 0 (zero) to disable the limit,
and only applies on Linux.
.
.It Sy zfs_arc_shrinker_seeks Ns = Ns Sy 2 Pq int
Relative cost of ARC eviction on Linux, i.e. the number of seeks needed to
restore an evicted page.
Bigger values make the ARC more precious and evictions smaller, compared to
other kernel subsystems.
A value of 4 means parity with the page cache.
.
.It Sy zfs_arc_sys_free Ns = Ns Sy 0 Ns B Pq u64
The target number of bytes the ARC should leave as free memory on the system.
If zero, equivalent to the bigger of
.Sy 512 KiB No and Sy all_system_memory/64 .
.
.It Sy zfs_autoimport_disable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Disable pool import at module load by ignoring the cache file
.Pq Sy spa_config_path .
.
.It Sy zfs_checksum_events_per_second Ns = Ns Sy 20 Ns /s Pq uint
Rate limit checksum events to this many per second.
Note that this should not be set below the ZED thresholds
(currently 10 checksums over 10 seconds)
or else the daemon may not trigger any action.
.
.It Sy zfs_commit_timeout_pct Ns = Ns Sy 10 Ns % Pq uint
This controls the amount of time that a ZIL block (lwb) will remain "open"
when it isn't "full", and it has a thread waiting for it to be committed to
stable storage.
The timeout is scaled based on a percentage of the last lwb
latency to avoid significantly impacting the latency of each individual
transaction record (itx).
.
.It Sy zfs_condense_indirect_commit_entry_delay_ms Ns = Ns Sy 0 Ns ms Pq int
Vdev indirection layer (used for device removal) sleeps for this many
milliseconds during mapping generation.
Intended for use with the test suite to throttle vdev removal speed.
.
.It Sy zfs_condense_indirect_obsolete_pct Ns = Ns Sy 25 Ns % Pq uint
Minimum percent of obsolete bytes in vdev mapping required to attempt to
condense
.Pq see Sy zfs_condense_indirect_vdevs_enable .
Intended for use with the test suite
to facilitate triggering condensing as needed.
.
.It Sy zfs_condense_indirect_vdevs_enable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable condensing indirect vdev mappings.
When set, attempt to condense indirect vdev mappings
if the mapping uses more than
.Sy zfs_condense_min_mapping_bytes
bytes of memory and if the obsolete space map object uses more than
.Sy zfs_condense_max_obsolete_bytes
bytes on-disk.
The condensing process is an attempt to save memory by removing obsolete
mappings.
.
.It Sy zfs_condense_max_obsolete_bytes Ns = Ns Sy 1073741824 Ns B Po 1 GiB Pc Pq u64
Only attempt to condense indirect vdev mappings if the on-disk size
of the obsolete space map object is greater than this number of bytes
.Pq see Sy zfs_condense_indirect_vdevs_enable .
.
.It Sy zfs_condense_min_mapping_bytes Ns = Ns Sy 131072 Ns B Po 128 KiB Pc Pq u64
Minimum size vdev mapping to attempt to condense
.Pq see Sy zfs_condense_indirect_vdevs_enable .
.
.It Sy zfs_dbgmsg_enable Ns = Ns Sy 1 Ns | Ns 0 Pq int
Internally ZFS keeps a small log to facilitate debugging.
The log is enabled by default, and can be disabled by unsetting this option.
The contents of the log can be accessed by reading
.Pa /proc/spl/kstat/zfs/dbgmsg .
Writing
.Sy 0
to the file clears the log.
.Pp
This setting does not influence debug prints due to
.Sy zfs_flags .
.
.It Sy zfs_dbgmsg_maxsize Ns = Ns Sy 4194304 Ns B Po 4 MiB Pc Pq uint
Maximum size of the internal ZFS debug log.
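.Pp
For example, on Linux the log described under
.Sy zfs_dbgmsg_enable
can be inspected and cleared as follows:
.Bd -literal -compact
# cat /proc/spl/kstat/zfs/dbgmsg
# echo 0 > /proc/spl/kstat/zfs/dbgmsg
.Ed
.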
.It Sy zfs_dbuf_state_index Ns = Ns Sy 0 Pq int
Historically used for controlling what reporting was available under
.Pa /proc/spl/kstat/zfs .
No effect.
.
.It Sy zfs_deadman_checktime_ms Ns = Ns Sy 60000 Ns ms Po 1 min Pc Pq u64
Check time in milliseconds.
This defines the frequency at which we check for hung I/O requests
and potentially invoke the
.Sy zfs_deadman_failmode
behavior.
.
.It Sy zfs_deadman_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
When a pool sync operation takes longer than
.Sy zfs_deadman_synctime_ms ,
or when an individual I/O operation takes longer than
.Sy zfs_deadman_ziotime_ms ,
then the operation is considered to be "hung".
If
.Sy zfs_deadman_enabled
is set, then the deadman behavior is invoked as described by
.Sy zfs_deadman_failmode .
By default, the deadman is enabled and set to
.Sy wait
which results in "hung" I/O operations only being logged.
The deadman is automatically disabled when a pool gets suspended.
.
.It Sy zfs_deadman_events_per_second Ns = Ns Sy 1 Ns /s Pq int
Rate limit deadman zevents (which report hung I/O operations) to this many per
second.
.
.It Sy zfs_deadman_failmode Ns = Ns Sy wait Pq charp
Controls the failure behavior when the deadman detects a "hung" I/O operation.
Valid values are:
.Bl -tag -compact -offset 4n -width "continue"
.It Sy wait
Wait for a "hung" operation to complete.
For each "hung" operation a "deadman" event will be posted
describing that operation.
.It Sy continue
Attempt to recover from a "hung" operation by re-dispatching it
to the I/O pipeline if possible.
.It Sy panic
Panic the system.
This can be used to facilitate automatic fail-over
to a properly configured fail-over partner.
.El
.
.It Sy zfs_deadman_synctime_ms Ns = Ns Sy 600000 Ns ms Po 10 min Pc Pq u64
Interval in milliseconds after which the deadman is triggered and also
the interval after which a pool sync operation is considered to be "hung".
Once this limit is exceeded the deadman will be invoked every
.Sy zfs_deadman_checktime_ms
milliseconds until the pool sync completes.
.
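.Pp
For example, to have the deadman attempt recovery instead of merely logging
"hung" operations, and to tolerate slower pool syncs, one might set
(a sketch using the Linux module parameter interface):
.Bd -literal -compact
# echo continue > /sys/module/zfs/parameters/zfs_deadman_failmode
# echo 1200000 > /sys/module/zfs/parameters/zfs_deadman_synctime_ms  # 20 min
.Ed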
.It Sy zfs_deadman_ziotime_ms Ns = Ns Sy 300000 Ns ms Po 5 min Pc Pq u64
Interval in milliseconds after which the deadman is triggered and an
individual I/O operation is considered to be "hung".
As long as the operation remains "hung",
the deadman will be invoked every
.Sy zfs_deadman_checktime_ms
milliseconds until the operation completes.
.
.It Sy zfs_dedup_prefetch Ns = Ns Sy 0 Ns | Ns 1 Pq int
Enable prefetching dedup-ed blocks which are going to be freed.
.
.It Sy zfs_delay_min_dirty_percent Ns = Ns Sy 60 Ns % Pq uint
Start to delay each transaction once there is this amount of dirty data,
expressed as a percentage of
.Sy zfs_dirty_data_max .
This value should be at least
.Sy zfs_vdev_async_write_active_max_dirty_percent .
.No See Sx ZFS TRANSACTION DELAY .
.
.It Sy zfs_delay_scale Ns = Ns Sy 500000 Pq int
This controls how quickly the transaction delay approaches infinity.
Larger values cause longer delays for a given amount of dirty data.
.Pp
For the smoothest delay, this value should be about 1 billion divided
by the maximum number of operations per second the pool can sustain.
It will then behave smoothly for anywhere from a tenth to ten times
that number of operations per second;
for example, the default of
.Sy 500000
corresponds to a pool sustaining roughly 2000 operations per second.
.No See Sx ZFS TRANSACTION DELAY .
.Pp
.Sy zfs_delay_scale No \(mu Sy zfs_dirty_data_max Em must No be smaller than Sy 2^64 .
.
.It Sy zfs_disable_ivset_guid_check Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disables the requirement for IVset GUIDs to be present and match when doing
a raw receive of encrypted datasets.
Intended for users whose pools were created with
OpenZFS pre-release versions and now have compatibility issues.
.
.It Sy zfs_key_max_salt_uses Ns = Ns Sy 400000000 Po 4*10^8 Pc Pq ulong
Maximum number of uses of a single salt value before generating a new one for
encrypted datasets.
The default value is also the maximum.
.
.It Sy zfs_object_mutex_size Ns = Ns Sy 64 Pq uint
Size of the znode hashtable used for holds.
.Pp
Due to the need to hold locks on objects that may not exist yet, kernel mutexes
are not created per-object and instead a hashtable is used where collisions
will result in objects waiting when there is not actually contention on the
same object.
.It Sy zfs_slow_io_events_per_second Ns = Ns Sy 20 Ns /s Pq int
Rate limit delay zevents (which report slow I/O operations) to this many per
second.
.
.It Sy zfs_unflushed_max_mem_amt Ns = Ns Sy 1073741824 Ns B Po 1 GiB Pc Pq u64
Upper-bound limit for unflushed metadata changes to be held by the
log spacemap in memory, in bytes.
.
.It Sy zfs_unflushed_max_mem_ppm Ns = Ns Sy 1000 Ns ppm Po 0.1% Pc Pq u64
Part of overall system memory that ZFS allows to be used
for unflushed metadata changes by the log spacemap, in millionths.
.
.It Sy zfs_unflushed_log_block_max Ns = Ns Sy 131072 Po 128k Pc Pq u64
Describes the maximum number of log spacemap blocks allowed for each pool.
The default value means that the space in all the log spacemaps
can add up to no more than
.Sy 131072
blocks (which means
.Em 16 GiB
of logical space before compression and ditto blocks,
assuming that blocksize is
.Em 128 KiB ) .
.Pp
This tunable is important because it involves a trade-off between import
time after an unclean export and the frequency of flushing metaslabs.
The higher this number is, the more log blocks we allow when the pool is
active, which means that we flush metaslabs less often and thus decrease
the number of I/O operations for spacemap updates per TXG.
At the same time though, that means that in the event of an unclean export,
there will be more log spacemap blocks for us to read, inducing overhead
in the import time of the pool.
The lower the number, the more frequent the flushing, destroying log
blocks quicker as they become obsolete faster, which leaves fewer blocks
to be read during import time after a crash.
.Pp
Each log spacemap block existing during pool import leads to approximately
one extra logical I/O issued.
This is the reason why this tunable is exposed in terms of blocks rather
than space used.
.
.It Sy zfs_unflushed_log_block_min Ns = Ns Sy 1000 Pq u64
If the number of metaslabs is small and our incoming rate is high,
we could get into a situation that we are flushing all our metaslabs every TXG.
Thus we always allow at least this many log blocks.
.
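.Pp
As a worked example of the trade-off described under
.Sy zfs_unflushed_log_block_max
above, with the default block budget and a
.Em 128 KiB
block size:
.Bd -literal -compact
131072 blocks * 128 KiB/block = 16 GiB of logical log space
131072 blocks * ~1 read/block = ~131072 extra logical reads on
                                import after an unclean export
.Ed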
.It Sy zfs_unflushed_log_block_pct Ns = Ns Sy 400 Ns % Pq u64
Tunable used to determine the number of blocks that can be used for
the spacemap log, expressed as a percentage of the total number of
unflushed metaslabs in the pool.
.
.It Sy zfs_unflushed_log_txg_max Ns = Ns Sy 1000 Pq u64
Tunable limiting the maximum time in TXGs any metaslab may remain unflushed.
It effectively limits the maximum number of unflushed per-TXG spacemap logs
that need to be read after an unclean pool export.
.
.It Sy zfs_unlink_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq uint
When enabled, files will not be asynchronously removed from the list of pending
unlinks and the space they consume will be leaked.
Once this option has been disabled and the dataset is remounted,
the pending unlinks will be processed and the freed space returned to the pool.
This option is used by the test suite.
.
.It Sy zfs_delete_blocks Ns = Ns Sy 20480 Pq ulong
This is used to define a large file for the purposes of deletion.
Files containing more than
.Sy zfs_delete_blocks
blocks will be deleted asynchronously, while smaller files are deleted
synchronously.
Decreasing this value will reduce the time spent in an
.Xr unlink 2
system call, at the expense of a longer delay before the freed space is
available.
This only applies on Linux.
.
.It Sy zfs_dirty_data_max Ns = Pq int
Determines the dirty space limit in bytes.
Once this limit is exceeded, new writes are halted until space frees up.
This parameter takes precedence over
.Sy zfs_dirty_data_max_percent .
.No See Sx ZFS TRANSACTION DELAY .
.Pp
Defaults to
.Sy physical_ram/10 ,
capped at
.Sy zfs_dirty_data_max_max .
.
.It Sy zfs_dirty_data_max_max Ns = Pq int
Maximum allowable value of
.Sy zfs_dirty_data_max ,
expressed in bytes.
This limit is only enforced at module load time, and will be ignored if
.Sy zfs_dirty_data_max
is later changed.
This parameter takes precedence over
.Sy zfs_dirty_data_max_max_percent .
.No See Sx ZFS TRANSACTION DELAY .
.Pp
Defaults to
.Sy min(physical_ram/4, 4GiB) ,
or
.Sy min(physical_ram/4, 1GiB)
for 32-bit systems.
.
.It Sy zfs_dirty_data_max_max_percent Ns = Ns Sy 25 Ns % Pq uint
Maximum allowable value of
.Sy zfs_dirty_data_max ,
expressed as a percentage of physical RAM.
This limit is only enforced at module load time, and will be ignored if
.Sy zfs_dirty_data_max
is later changed.
The parameter
.Sy zfs_dirty_data_max_max
takes precedence over this one.
.No See Sx ZFS TRANSACTION DELAY .
.
.It Sy zfs_dirty_data_max_percent Ns = Ns Sy 10 Ns % Pq uint
Determines the dirty space limit, expressed as a percentage of all memory.
Once this limit is exceeded, new writes are halted until space frees up.
The parameter
.Sy zfs_dirty_data_max
takes precedence over this one.
.No See Sx ZFS TRANSACTION DELAY .
.Pp
Subject to
.Sy zfs_dirty_data_max_max .
.
.It Sy zfs_dirty_data_sync_percent Ns = Ns Sy 20 Ns % Pq uint
Start syncing out a transaction group if there's at least this much dirty data
.Pq as a percentage of Sy zfs_dirty_data_max .
This should be less than
.Sy zfs_vdev_async_write_active_min_dirty_percent .
.
.It Sy zfs_wrlog_data_max Ns = Pq int
The upper limit of write-transaction ZIL log data size in bytes.
Write operations are throttled when approaching the limit until log data is
cleared out after transaction group sync.
Because of some overhead, it should be set to at least twice the size of
.Sy zfs_dirty_data_max
.No to prevent harming normal write throughput .
It also should be smaller than the size of the slog device, if one is present.
.Pp
Defaults to
.Sy zfs_dirty_data_max*2
.
.It Sy zfs_fallocate_reserve_percent Ns = Ns Sy 110 Ns % Pq uint
Since ZFS is a copy-on-write filesystem with snapshots, blocks cannot be
preallocated for a file in order to guarantee that later writes will not
run out of space.
Instead,
.Xr fallocate 2
space preallocation only checks that sufficient space is currently available
in the pool or the user's project quota allocation,
and then creates a sparse file of the requested size.
The requested space is multiplied by
.Sy zfs_fallocate_reserve_percent
to allow additional space for indirect blocks and other internal metadata.
Setting this to
.Sy 0
disables support for
.Xr fallocate 2
and causes it to return
.Sy EOPNOTSUPP .
.
.It Sy zfs_fletcher_4_impl Ns = Ns Sy fastest Pq string
Select a fletcher 4 implementation.
.Pp
Supported selectors are:
.Sy fastest , scalar , sse2 , ssse3 , avx2 , avx512f , avx512bw ,
.No and Sy aarch64_neon .
All except
.Sy fastest No and Sy scalar
require instruction set extensions to be available,
and will only appear if ZFS detects that they are present at runtime.
If multiple implementations of fletcher 4 are available, the
.Sy fastest
will be chosen using a micro benchmark.
Selecting
.Sy scalar
results in the original CPU-based calculation being used.
Selecting any option other than
.Sy fastest No or Sy scalar
results in vector instructions
from the respective CPU instruction set being used.
.
.It Sy zfs_bclone_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable the experimental block cloning feature.
If this setting is 0, then even if
.Sy feature@block_cloning
is enabled,
attempts to clone blocks will act as though the feature is disabled.
.
.It Sy zfs_bclone_wait_dirty Ns = Ns Sy 0 Ns | Ns 1 Pq int
When set to 1, the FICLONE and FICLONERANGE ioctls wait for dirty data to be
written to disk.
This allows the clone operation to reliably succeed when a file is
modified and then immediately cloned.
For small files this may be slower than making a copy of the file.
Therefore, this setting defaults to 0, which causes a clone operation to
immediately fail when encountering a dirty block.
.
.It Sy zfs_blake3_impl Ns = Ns Sy fastest Pq string
Select a BLAKE3 implementation.
.Pp
Supported selectors are:
.Sy cycle , fastest , generic , sse2 , sse41 , avx2 , avx512 .
All except
.Sy cycle , fastest No and Sy generic
require instruction set extensions to be available,
and will only appear if ZFS detects that they are present at runtime.
If multiple implementations of BLAKE3 are available, the
.Sy fastest
will be chosen using a micro benchmark.
You can see the benchmark results by reading this kstat file:
.Pa /proc/spl/kstat/zfs/chksum_bench .
.
.It Sy zfs_free_bpobj_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable/disable the processing of the free_bpobj object.
.
.It Sy zfs_async_block_max_blocks Ns = Ns Sy UINT64_MAX Po unlimited Pc Pq u64
Maximum number of blocks freed in a single TXG.
.
.It Sy zfs_max_async_dedup_frees Ns = Ns Sy 100000 Po 10^5 Pc Pq u64
Maximum number of dedup blocks freed in a single TXG.
.
.It Sy zfs_vdev_async_read_max_active Ns = Ns Sy 3 Pq uint
Maximum asynchronous read I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_async_read_min_active Ns = Ns Sy 1 Pq uint
Minimum asynchronous read I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_async_write_active_max_dirty_percent Ns = Ns Sy 60 Ns % Pq uint
When the pool has more than this much dirty data, use
.Sy zfs_vdev_async_write_max_active
to limit active async writes.
If the dirty data is between the minimum and maximum,
the active I/O limit is linearly interpolated.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_async_write_active_min_dirty_percent Ns = Ns Sy 30 Ns % Pq uint
When the pool has less than this much dirty data, use
.Sy zfs_vdev_async_write_min_active
to limit active async writes.
If the dirty data is between the minimum and maximum,
the active I/O limit is linearly interpolated.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_async_write_max_active Ns = Ns Sy 10 Pq uint
Maximum asynchronous write I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_async_write_min_active Ns = Ns Sy 2 Pq uint
Minimum asynchronous write I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.Pp
Lower values are associated with better latency on rotational media but poorer
resilver performance.
The default value of
.Sy 2
was chosen as a compromise.
A value of
.Sy 3
has been shown to improve resilver performance further at a cost of
further increasing latency.
.
.It Sy zfs_vdev_initializing_max_active Ns = Ns Sy 1 Pq uint
Maximum initializing I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_initializing_min_active Ns = Ns Sy 1 Pq uint
Minimum initializing I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_max_active Ns = Ns Sy 1000 Pq uint
The maximum number of I/O operations active to each device.
Ideally, this will be at least the sum of each queue's
.Sy max_active .
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_open_timeout_ms Ns = Ns Sy 1000 Pq uint
Timeout value to wait before determining a device is missing
during import.
This is helpful for transient missing paths due
to links being briefly removed and recreated in response to
udev events.
.
.It Sy zfs_vdev_rebuild_max_active Ns = Ns Sy 3 Pq uint
Maximum sequential resilver I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_rebuild_min_active Ns = Ns Sy 1 Pq uint
Minimum sequential resilver I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_removal_max_active Ns = Ns Sy 2 Pq uint
Maximum removal I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_removal_min_active Ns = Ns Sy 1 Pq uint
Minimum removal I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_scrub_max_active Ns = Ns Sy 2 Pq uint
Maximum scrub I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_scrub_min_active Ns = Ns Sy 1 Pq uint
Minimum scrub I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
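.Pp
Each of the queue limits above can be changed at runtime.
For example, raising the minimum async write queue depth to
.Sy 3
trades some latency for better resilver performance, as noted under
.Sy zfs_vdev_async_write_min_active
(a sketch using the Linux module parameter interface):
.Bd -literal -compact
# echo 3 > /sys/module/zfs/parameters/zfs_vdev_async_write_min_active
.Ed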
.It Sy zfs_vdev_sync_read_max_active Ns = Ns Sy 10 Pq uint
Maximum synchronous read I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_sync_read_min_active Ns = Ns Sy 10 Pq uint
Minimum synchronous read I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_sync_write_max_active Ns = Ns Sy 10 Pq uint
Maximum synchronous write I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_sync_write_min_active Ns = Ns Sy 10 Pq uint
Minimum synchronous write I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_trim_max_active Ns = Ns Sy 2 Pq uint
Maximum trim/discard I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_trim_min_active Ns = Ns Sy 1 Pq uint
Minimum trim/discard I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_nia_delay Ns = Ns Sy 5 Pq uint
For non-interactive I/O (scrub, resilver, removal, initialize and rebuild),
the number of concurrently-active I/O operations is limited to
.Sy zfs_*_min_active ,
unless the vdev is "idle".
When there are no interactive I/O operations active (synchronous or otherwise),
and
.Sy zfs_vdev_nia_delay
operations have completed since the last interactive operation,
then the vdev is considered to be "idle",
and the number of concurrently-active non-interactive operations is increased to
.Sy zfs_*_max_active .
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_nia_credit Ns = Ns Sy 5 Pq uint
Some HDDs tend to prioritize sequential I/O so strongly, that concurrent
random I/O latency reaches several seconds.
On some HDDs this happens even if sequential I/O operations
are submitted one at a time, and so setting
.Sy zfs_*_max_active Ns = Sy 1
does not help.
To prevent non-interactive I/O, like scrub,
from monopolizing the device, no more than
.Sy zfs_vdev_nia_credit
operations can be sent
while there are outstanding incomplete interactive operations.
This enforced wait ensures the HDD services the interactive I/O
within a reasonable amount of time.
.No See Sx ZFS I/O SCHEDULER .
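.Pp
As an illustration, on HDDs that starve random I/O as described above,
non-interactive traffic can be throttled harder by lowering the credit and
by requiring more completed operations before a vdev counts as idle
(a sketch using the Linux module parameter interface):
.Bd -literal -compact
# echo 1 > /sys/module/zfs/parameters/zfs_vdev_nia_credit
# echo 32 > /sys/module/zfs/parameters/zfs_vdev_nia_delay
.Ed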
.
.It Sy zfs_vdev_queue_depth_pct Ns = Ns Sy 1000 Ns % Pq uint
Maximum number of queued allocations per top-level vdev expressed as
a percentage of
.Sy zfs_vdev_async_write_max_active ,
which allows the system to detect devices that are more capable
of handling allocations and to allocate more blocks to those devices.
This allows for dynamic allocation distribution when devices are imbalanced,
as fuller devices will tend to be slower than empty devices.
.Pp
Also see
.Sy zio_dva_throttle_enabled .
.
.It Sy zfs_vdev_def_queue_depth Ns = Ns Sy 32 Pq uint
Default queue depth for each vdev IO allocator.
Higher values allow for better coalescing of sequential writes before sending
them to the disk, but can increase transaction commit times.
.
.It Sy zfs_vdev_failfast_mask Ns = Ns Sy 1 Pq uint
Defines if the driver should retire on a given error type.
The following options may be bitwise-ored together:
.TS
box;
lbz r l l .
	Value	Name	Description
_
	1	Device	No driver retries on device errors.
	2	Transport	No driver retries on transport errors.
	4	Driver	No driver retries on driver errors.
.TE
.
.It Sy zfs_vdev_disk_max_segs Ns = Ns Sy 0 Pq uint
Maximum number of segments to add to a BIO (min 4).
If this is higher than the maximum allowed by the device queue or the kernel
itself, it will be clamped.
Setting it to zero will cause the kernel's ideal size to be used.
This parameter only applies on Linux.
This parameter is ignored if
.Sy zfs_vdev_disk_classic Ns = Ns Sy 1 .
.
.It Sy zfs_vdev_disk_classic Ns = Ns Sy 0 Ns | Ns 1 Pq uint
If set to 1, OpenZFS will submit IO to Linux using the method it used in 2.2
and earlier.
This "classic" method has known issues with highly fragmented IO requests and
is slower on many workloads, but it has been in use for many years and is known
to be very stable.
If you set this parameter, please also open a bug report explaining why you
did so, including the workload involved and any error messages.
.Pp
This parameter and the classic submission method will be removed once we have
total confidence in the new method.
.Pp
This parameter only applies on Linux, and can only be set at module load time.
.
.It Sy zfs_expire_snapshot Ns = Ns Sy 300 Ns s Pq int
Time before expiring
.Pa .zfs/snapshot .
.
.It Sy zfs_admin_snapshot Ns = Ns Sy 0 Ns | Ns 1 Pq int
Allow the creation, removal, or renaming of entries in the
.Sy .zfs/snapshot
directory to cause the creation, destruction, or renaming of snapshots.
When enabled, this functionality works both locally and over NFS exports
which have the
.Em no_root_squash
option set.
.
.It Sy zfs_flags Ns = Ns Sy 0 Pq int
Set additional debugging flags.
The following flags may be bitwise-ored together:
.TS
box;
lbz r l l .
	Value	Name	Description
_
	1	ZFS_DEBUG_DPRINTF	Enable dprintf entries in the debug log.
*	2	ZFS_DEBUG_DBUF_VERIFY	Enable extra dbuf verifications.
*	4	ZFS_DEBUG_DNODE_VERIFY	Enable extra dnode verifications.
	8	ZFS_DEBUG_SNAPNAMES	Enable snapshot name verification.
*	16	ZFS_DEBUG_MODIFY	Check for illegally modified ARC buffers.
	64	ZFS_DEBUG_ZIO_FREE	Enable verification of block frees.
	128	ZFS_DEBUG_HISTOGRAM_VERIFY	Enable extra spacemap histogram verifications.
	256	ZFS_DEBUG_METASLAB_VERIFY	Verify space accounting on disk matches in-memory \fBrange_trees\fP.
	512	ZFS_DEBUG_SET_ERROR	Enable \fBSET_ERROR\fP and dprintf entries in the debug log.
	1024	ZFS_DEBUG_INDIRECT_REMAP	Verify split blocks created by device removal.
	2048	ZFS_DEBUG_TRIM	Verify TRIM ranges are always within the allocatable range tree.
	4096	ZFS_DEBUG_LOG_SPACEMAP	Verify that the log summary is consistent with the spacemap log
			and enable \fBzfs_dbgmsgs\fP for metaslab loading and flushing.
.TE
.Sy \& * No Requires debug build .
.
.It Sy zfs_btree_verify_intensity Ns = Ns Sy 0 Pq uint
Enables btree verification.
The following settings are cumulative:
.TS
box;
lbz r l l .
	Value	Description
_
	1	Verify height.
	2	Verify pointers from children to parent.
	3	Verify element counts.
	4	Verify element order. (expensive)
*	5	Verify unused memory is poisoned. (expensive)
.TE
.Sy \& * No Requires debug build .
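.Pp
For example, on a debug build, the most thorough btree checking together with
verbose debug logging can be requested as follows, where 513 is
1 (ZFS_DEBUG_DPRINTF) bitwise-ored with 512 (ZFS_DEBUG_SET_ERROR) from the
.Sy zfs_flags
table above (a sketch using the Linux module parameter interface):
.Bd -literal -compact
# echo 5 > /sys/module/zfs/parameters/zfs_btree_verify_intensity
# echo 513 > /sys/module/zfs/parameters/zfs_flags
.Ed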
.
.It Sy zfs_free_leak_on_eio Ns = Ns Sy 0 Ns | Ns 1 Pq int
If destroy encounters an
.Sy EIO
while reading metadata (e.g. indirect blocks),
space referenced by the missing metadata can not be freed.
Normally this causes the background destroy to become "stalled",
as it is unable to make forward progress.
While in this stalled state, all remaining space to free
from the error-encountering filesystem is "temporarily leaked".
Set this flag to cause it to ignore the
.Sy EIO ,
permanently leak the space from indirect blocks that can not be read,
and continue to free everything else that it can.
.Pp
The default "stalling" behavior is useful if the storage partially
fails (i.e. some but not all I/O operations fail), and then later recovers.
In this case, we will be able to continue pool operations while it is
partially failed, and when it recovers, we can continue to free the
space, with no leaks.
Note, however, that this case is actually fairly rare.
.Pp
Typically pools either
.Bl -enum -compact -offset 4n -width "1."
.It
fail completely (but perhaps temporarily,
e.g. due to a top-level vdev going offline), or
.It
have localized, permanent errors (e.g. disk returns the wrong data
due to bit flip or firmware bug).
.El
In the former case, this setting does not matter because the
pool will be suspended and the sync thread will not be able to make
forward progress regardless.
In the latter, because the error is permanent, the best we can do
is leak the minimum amount of space,
which is what setting this flag will do.
It is therefore reasonable for this flag to normally be set,
but we chose the more conservative approach of not setting it,
so that there is no possibility of
leaking space in the "partial temporary" failure case.
.
.It Sy zfs_free_min_time_ms Ns = Ns Sy 1000 Ns ms Po 1s Pc Pq uint
During a
.Nm zfs Cm destroy
operation using the
.Sy async_destroy
feature,
a minimum of this much time will be spent working on freeing blocks per TXG.
.
.It Sy zfs_obsolete_min_time_ms Ns = Ns Sy 500 Ns ms Pq uint
Similar to
.Sy zfs_free_min_time_ms ,
but for cleanup of old indirection records for removed vdevs.
.
.It Sy zfs_immediate_write_sz Ns = Ns Sy 32768 Ns B Po 32 KiB Pc Pq s64
Largest data block to write to the ZIL.
Larger blocks will be treated as if the dataset being written to had the
.Sy logbias Ns = Ns Sy throughput
property set.
.
.It Sy zfs_initialize_value Ns = Ns Sy 16045690984833335022 Po 0xDEADBEEFDEADBEEE Pc Pq u64
Pattern written to vdev free space by
.Xr zpool-initialize 8 .
.
.It Sy zfs_initialize_chunk_size Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq u64
Size of writes used by
.Xr zpool-initialize 8 .
This option is used by the test suite.
.
.It Sy zfs_livelist_max_entries Ns = Ns Sy 500000 Po 5*10^5 Pc Pq u64
The threshold size (in block pointers) at which we create a new sub-livelist.
Larger sublists are more costly from a memory perspective but the fewer
sublists there are, the lower the cost of insertion.
.
.It Sy zfs_livelist_min_percent_shared Ns = Ns Sy 75 Ns % Pq int
If the amount of shared space between a snapshot and its clone drops below
this threshold, the clone turns off the livelist and reverts to the old
deletion method.
This is in place because livelists no longer give us a benefit
once a clone has been overwritten enough.
.
.It Sy zfs_livelist_condense_new_alloc Ns = Ns Sy 0 Pq int
Incremented each time an extra ALLOC blkptr is added to a livelist entry while
it is being condensed.
This option is used by the test suite to track race conditions.
.
.It Sy zfs_livelist_condense_sync_cancel Ns = Ns Sy 0 Pq int
Incremented each time livelist condensing is canceled while in
.Fn spa_livelist_condense_sync .
This option is used by the test suite to track race conditions.
.
.It Sy zfs_livelist_condense_sync_pause Ns = Ns Sy 0 Ns | Ns 1 Pq int
When set, the livelist condense process pauses indefinitely before
executing the synctask \(em
.Fn spa_livelist_condense_sync .
This option is used by the test suite to trigger race conditions.
.
.It Sy zfs_livelist_condense_zthr_cancel Ns = Ns Sy 0 Pq int
Incremented each time livelist condensing is canceled while in
.Fn spa_livelist_condense_cb .
This option is used by the test suite to track race conditions.
.
.It Sy zfs_livelist_condense_zthr_pause Ns = Ns Sy 0 Ns | Ns 1 Pq int
When set, the livelist condense process pauses indefinitely before
executing the open context condensing work in
.Fn spa_livelist_condense_cb .
This option is used by the test suite to trigger race conditions.
.
.It Sy zfs_lua_max_instrlimit Ns = Ns Sy 100000000 Po 10^8 Pc Pq u64
The maximum execution time limit that can be set for a ZFS channel program,
specified as a number of Lua instructions.
.
.It Sy zfs_lua_max_memlimit Ns = Ns Sy 104857600 Po 100 MiB Pc Pq u64
The maximum memory limit that can be set for a ZFS channel program, specified
in bytes.
.
.It Sy zfs_max_dataset_nesting Ns = Ns Sy 50 Pq int
The maximum depth of nested datasets.
This value can be tuned temporarily to
fix existing datasets that exceed the predefined limit.
.
.It Sy zfs_max_log_walking Ns = Ns Sy 5 Pq u64
The number of past TXGs that the flushing algorithm of the log spacemap
feature uses to estimate incoming log blocks.
.
.It Sy zfs_max_logsm_summary_length Ns = Ns Sy 10 Pq u64
Maximum number of rows allowed in the summary of the spacemap log.
.
.It Sy zfs_max_recordsize Ns = Ns Sy 16777216 Po 16 MiB Pc Pq uint
We currently support block sizes from
.Em 512 Po 512 B Pc No to Em 16777216 Po 16 MiB Pc .
The benefits of larger blocks, and thus larger I/O,
need to be weighed against the cost of COWing a giant block to modify one byte.
Additionally, very large blocks can have an impact on I/O latency,
and also potentially on the memory allocator.
Therefore, we formerly forbade creating blocks larger than 1 MiB.
Larger blocks could be created by changing this tunable,
and pools with larger blocks can always be imported and used,
regardless of this setting.
.
.It Sy zfs_allow_redacted_dataset_mount Ns = Ns Sy 0 Ns | Ns 1 Pq int
Allow datasets received with redacted send/receive to be mounted.
Normally disabled because these datasets may be missing key data.
.
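.Pp
Note that
.Sy zfs_max_recordsize
above is only a module-wide ceiling: the block size a dataset actually uses
is still selected through the
.Sy recordsize
property, for example (the dataset name is hypothetical):
.Bd -literal -compact
# zfs set recordsize=1M tank/media
.Ed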
.It Sy zfs_min_metaslabs_to_flush Ns = Ns Sy 1 Pq u64
Minimum number of metaslabs to flush per dirty TXG.
.
.It Sy zfs_metaslab_fragmentation_threshold Ns = Ns Sy 70 Ns % Pq uint
Allow metaslabs to keep their active state as long as their fragmentation
percentage is no more than this value.
An active metaslab that exceeds this threshold
will no longer keep its active status, allowing better metaslabs to be selected.
.
.It Sy zfs_mg_fragmentation_threshold Ns = Ns Sy 95 Ns % Pq uint
Metaslab groups are considered eligible for allocations if their
fragmentation metric (measured as a percentage) is less than or equal to
this value.
If a metaslab group exceeds this threshold then it will be
skipped unless all metaslab groups within the metaslab class have also
crossed this threshold.
.
.It Sy zfs_mg_noalloc_threshold Ns = Ns Sy 0 Ns % Pq uint
Defines a threshold at which metaslab groups should be eligible for allocations.
The value is expressed as a percentage of free space
beyond which a metaslab group is always eligible for allocations.
If a metaslab group's free space is less than or equal to the
threshold, the allocator will avoid allocating to that group
unless all groups in the pool have reached the threshold.
Once all groups have reached the threshold, all groups are allowed to accept
allocations.
The default value of
.Sy 0
disables the feature and causes all metaslab groups to be eligible for
allocations.
.Pp
This parameter allows one to deal with pools having heavily imbalanced
vdevs such as would be the case when a new vdev has been added.
Setting the threshold to a non-zero percentage will stop allocations
from being made to vdevs that aren't filled to the specified percentage
and allow lesser filled vdevs to acquire more allocations than they
otherwise would under the old
.Sy zfs_mg_alloc_failures
facility.
.
.It Sy zfs_ddt_data_is_special Ns = Ns Sy 1 Ns | Ns 0 Pq int
If enabled, ZFS will place DDT data into the special allocation class.
.
.It Sy zfs_user_indirect_is_special Ns = Ns Sy 1 Ns | Ns 0 Pq int
If enabled, ZFS will place user data indirect blocks
into the special allocation class.
.
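.Pp
Both of the above settings only take effect on pools that contain a special
allocation class.
As an illustration, such a class is added with (device names are
hypothetical):
.Bd -literal -compact
# zpool add tank special mirror sdb sdc
.Ed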
.It Sy zfs_multihost_history Ns = Ns Sy 0 Pq uint
Historical statistics for this many latest multihost updates will be available
in
.Pa /proc/spl/kstat/zfs/ Ns Ao Ar pool Ac Ns Pa /multihost .
.
.It Sy zfs_multihost_interval Ns = Ns Sy 1000 Ns ms Po 1 s Pc Pq u64
Used to control the frequency of multihost writes which are performed when the
.Sy multihost
pool property is on.
This is one of the factors used to determine the
length of the activity check during import.
.Pp
The multihost write period is
.Sy zfs_multihost_interval No / Sy leaf-vdevs .
With four leaf vdevs, for example, the pool issues one multihost write
every 250 ms.
On average a multihost write will be issued for each leaf vdev
every
.Sy zfs_multihost_interval
milliseconds.
In practice, the observed period can vary with the I/O load
and this observed value is the delay which is stored in the uberblock.
.
.It Sy zfs_multihost_import_intervals Ns = Ns Sy 20 Pq uint
Used to control the duration of the activity test on import.
Smaller values of
.Sy zfs_multihost_import_intervals
will reduce the import time but increase
the risk of failing to detect an active pool.
The total activity check time is never allowed to drop below one second.
.Pp
On import the activity check waits a minimum amount of time determined by
.Sy zfs_multihost_interval No \(mu Sy zfs_multihost_import_intervals ,
or the same product computed on the host which last had the pool imported,
whichever is greater.
The activity check time may be further extended if the value of MMP
delay found in the best uberblock indicates actual multihost updates happened
at longer intervals than
.Sy zfs_multihost_interval .
A minimum of
.Em 100 ms
is enforced.
.Pp
.Sy 0 No is equivalent to Sy 1 .
.
.It Sy zfs_multihost_fail_intervals Ns = Ns Sy 10 Pq uint
Controls the behavior of the pool when multihost write failures or delays are
detected.
.Pp
When
.Sy 0 ,
multihost write failures or delays are ignored.
The failures will still be reported to the ZED, which, depending on
its configuration, may take action such as suspending the pool or offlining a
device.
.Pp
Otherwise, the pool will be suspended if
.Sy zfs_multihost_fail_intervals No \(mu Sy zfs_multihost_interval
milliseconds pass without a successful MMP write
(10 seconds with the default values).
This guarantees the activity test will see MMP writes if the pool is imported.
.Sy 1 No is equivalent to Sy 2 ;
this is necessary to prevent the pool from being suspended
due to normal, small I/O latency variations.
.
.It Sy zfs_no_scrub_io Ns = Ns Sy 0 Ns | Ns 1 Pq int
Set to disable scrub I/O.
This results in scrubs not actually scrubbing data and
simply doing a metadata crawl of the pool instead.
.
.It Sy zfs_no_scrub_prefetch Ns = Ns Sy 0 Ns | Ns 1 Pq int
Set to disable block prefetching for scrubs.
.
.It Sy zfs_nocacheflush Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable cache flush operations on disks when writing.
Setting this will cause pool corruption on power loss
if a volatile out-of-order write cache is enabled.
.
.It Sy zfs_nopwrite_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Allow no-operation writes.
The occurrence of nopwrites will further depend on other pool properties
.Pq i.a. the checksumming and compression algorithms .
.
.It Sy zfs_dmu_offset_next_sync Ns = Ns Sy 1 Ns | Ns 0 Pq int
Enable forcing TXG sync to find holes.
When enabled, this forces ZFS to sync data when
.Sy SEEK_HOLE No or Sy SEEK_DATA
flags are used, allowing holes in a file to be accurately reported.
When disabled, holes will not be reported in recently dirtied files.
.
.It Sy zfs_pd_bytes_max Ns = Ns Sy 52428800 Ns B Po 50 MiB Pc Pq int
The number of bytes which should be prefetched during a pool traversal, like
.Nm zfs Cm send
or other data crawling operations.
.
.It Sy zfs_traverse_indirect_prefetch_limit Ns = Ns Sy 32 Pq uint
The number of blocks pointed to by an indirect (non-L0) block which should be
prefetched during a pool traversal, like
.Nm zfs Cm send
or other data crawling operations.
.
.It Sy zfs_per_txg_dirty_frees_percent Ns = Ns Sy 30 Ns % Pq u64
Control percentage of dirtied indirect blocks from frees allowed into one TXG.
After this threshold is crossed, additional frees will wait until the next TXG.
.Sy 0 No disables this throttle .
.
.It Sy zfs_prefetch_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable predictive prefetch.
Note that it leaves "prescient" prefetch
.Pq for, e.g., Nm zfs Cm send
intact.
Unlike predictive prefetch, prescient prefetch never issues I/O
that ends up not being needed, so it can't hurt performance.
.
.It Sy zfs_qat_checksum_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable QAT hardware acceleration for SHA256 checksums.
May be unset after the ZFS modules have been loaded to initialize the QAT
hardware as long as support is compiled in and the QAT driver is present.
.
.It Sy zfs_qat_compress_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable QAT hardware acceleration for gzip compression.
May be unset after the ZFS modules have been loaded to initialize the QAT
hardware as long as support is compiled in and the QAT driver is present.
.
.It Sy zfs_qat_encrypt_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable QAT hardware acceleration for AES-GCM encryption.
May be unset after the ZFS modules have been loaded to initialize the QAT
hardware as long as support is compiled in and the QAT driver is present.
.
.It Sy zfs_vnops_read_chunk_size Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq u64
Bytes to read per chunk.
.
.It Sy zfs_read_history Ns = Ns Sy 0 Pq uint
Historical statistics for this many latest reads will be available in
.Pa /proc/spl/kstat/zfs/ Ns Ao Ar pool Ac Ns Pa /reads .
.
.It Sy zfs_read_history_hits Ns = Ns Sy 0 Ns | Ns 1 Pq int
Include cache hits in read history.
.
.It Sy zfs_rebuild_max_segment Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq u64
Maximum read segment size to issue when sequentially resilvering a
top-level vdev.
.
.It Sy zfs_rebuild_scrub_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Automatically start a pool scrub when the last active sequential resilver
completes in order to verify the checksums of all blocks which have been
resilvered.
This is enabled by default and strongly recommended.
.
.It Sy zfs_rebuild_vdev_limit Ns = Ns Sy 67108864 Ns B Po 64 MiB Pc Pq u64
Maximum amount of I/O that can be concurrently issued for a sequential
resilver per leaf device, given in bytes.
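.Pp
These three
.Sy zfs_rebuild_*
tunables apply only to sequential resilvers, which must be requested
explicitly; a minimal sketch, where the pool and device names are
hypothetical:
.Bd -literal -compact
# zpool replace -s tank sda sdb    # -s requests a sequential resilver
.Ed
.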
.It Sy zfs_reconstruct_indirect_combinations_max Ns = Ns Sy 4096 Pq int
If an indirect split block contains more than this many possible unique
combinations when being reconstructed, consider it too computationally
expensive to check them all.
Instead, try at most this many randomly selected
combinations each time the block is accessed.
This allows all segment copies to participate fairly
in the reconstruction when all combinations
cannot be checked and prevents repeated use of one bad copy.
.
.It Sy zfs_recover Ns = Ns Sy 0 Ns | Ns 1 Pq int
Set to attempt to recover from fatal errors.
This should only be used as a last resort,
as it typically results in leaked space, or worse.
.
.It Sy zfs_removal_ignore_errors Ns = Ns Sy 0 Ns | Ns 1 Pq int
Ignore hard I/O errors during device removal.
When set, if a device encounters a hard I/O error during the removal process,
the removal will not be cancelled.
This can result in a normally recoverable block becoming permanently damaged
and is hence not recommended.
This should only be used as a last resort when the
pool cannot be returned to a healthy state prior to removing the device.
.
.It Sy zfs_removal_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq uint
This is used by the test suite so that it can ensure that certain actions
happen while in the middle of a removal.
.
.It Sy zfs_remove_max_segment Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq uint
The largest contiguous segment that we will attempt to allocate when removing
a device.
If there is a performance problem with attempting to allocate large blocks,
consider decreasing this.
The default value is also the maximum.
.
.It Sy zfs_resilver_disable_defer Ns = Ns Sy 0 Ns | Ns 1 Pq int
Ignore the
.Sy resilver_defer
feature, causing an operation that would start a resilver to
immediately restart the one in progress.
.
.It Sy zfs_resilver_min_time_ms Ns = Ns Sy 3000 Ns ms Po 3 s Pc Pq uint
Resilvers are processed by the sync thread.
While resilvering, it will spend at least this much time
working on a resilver between TXG flushes.
.
.It Sy zfs_scan_ignore_errors Ns = Ns Sy 0 Ns | Ns 1 Pq int
If set, remove the DTL (dirty time list) upon completion of a pool scan (scrub),
even if there were unrepairable errors.
Intended to be used during pool repair or recovery to
stop resilvering when the pool is next imported.
.
.It Sy zfs_scrub_after_expand Ns = Ns Sy 1 Ns | Ns 0 Pq int
Automatically start a pool scrub after a RAIDZ expansion completes
in order to verify the checksums of all blocks which have been
copied during the expansion.
This is enabled by default and strongly recommended.
.
.It Sy zfs_scrub_min_time_ms Ns = Ns Sy 1000 Ns ms Po 1 s Pc Pq uint
Scrubs are processed by the sync thread.
While scrubbing, it will spend at least this much time
working on a scrub between TXG flushes.
.
.It Sy zfs_scrub_error_blocks_per_txg Ns = Ns Sy 4096 Pq uint
The number of error blocks to be scrubbed in one TXG.
.
.It Sy zfs_scan_checkpoint_intval Ns = Ns Sy 7200 Ns s Po 2 hours Pc Pq uint
To preserve progress across reboots, the sequential scan algorithm periodically
needs to stop metadata scanning and issue all the verification I/O to disk.
The frequency of this flushing is determined by this tunable.
.
.It Sy zfs_scan_fill_weight Ns = Ns Sy 3 Pq uint
This tunable affects how scrub and resilver I/O segments are ordered.
A higher number indicates that we care more about how filled in a segment is,
while a lower number indicates we care more about the size of the extent
without considering the gaps within a segment.
This value is only tunable upon module insertion.
Changing the value afterwards will have no effect on scrub or resilver
performance.
.
.It Sy zfs_scan_issue_strategy Ns = Ns Sy 0 Pq uint
Determines the order in which data will be verified while scrubbing or
resilvering; see the example after this list:
.Bl -tag -compact -offset 4n -width "a"
.It Sy 1
Data will be verified as sequentially as possible, given the
amount of memory reserved for scrubbing
.Pq see Sy zfs_scan_mem_lim_fact .
This may improve scrub performance if the pool's data is very fragmented.
.It Sy 2
The largest mostly-contiguous chunk of found data will be verified first.
By deferring scrubbing of small segments, we may later find adjacent data
to coalesce and increase the segment size.
.It Sy 0
.No Use strategy Sy 1 No during normal verification
.No and strategy Sy 2 No while taking a checkpoint .
.El
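.Pp
For example, to force the largest-chunk-first strategy for a scrub of a badly
fragmented pool (a sketch; the Linux module parameter path is an assumption,
and the pool name
.Ar tank
is hypothetical):
.Bd -literal -compact
# echo 2 > /sys/module/zfs/parameters/zfs_scan_issue_strategy
# zpool scrub tank
.Ed
.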
.It Sy zfs_scan_legacy Ns = Ns Sy 0 Ns | Ns 1 Pq int
If unset, indicates that scrubs and resilvers will gather metadata in
memory before issuing sequential I/O.
Otherwise indicates that the legacy algorithm will be used,
where I/O is initiated as soon as it is discovered.
Unsetting will not affect scrubs or resilvers that are already in progress.
.
.It Sy zfs_scan_max_ext_gap Ns = Ns Sy 2097152 Ns B Po 2 MiB Pc Pq int
Sets the largest gap in bytes between scrub/resilver I/O operations
that will still be considered sequential for sorting purposes.
Changing this value will not
affect scrubs or resilvers that are already in progress.
.
.It Sy zfs_scan_mem_lim_fact Ns = Ns Sy 20 Ns ^-1 Pq uint
Maximum fraction of RAM used for I/O sorting by the sequential scan algorithm.
This tunable determines the hard limit for I/O sorting memory usage.
When the hard limit is reached, we stop scanning metadata and start issuing
data verification I/O.
This is done until we get below the soft limit.
.
.It Sy zfs_scan_mem_lim_soft_fact Ns = Ns Sy 20 Ns ^-1 Pq uint
The fraction of the hard limit used to determine the soft limit for I/O
sorting by the sequential scan algorithm.
When we cross this limit from below, no action is taken.
When we cross this limit from above, it is because we are issuing verification
I/O.
In this case (unless the metadata scan is done) we stop issuing verification
I/O and start scanning metadata again until we get to the hard limit.
.
.It Sy zfs_scan_report_txgs Ns = Ns Sy 0 Ns | Ns 1 Pq uint
When reporting resilver throughput and estimated completion time, use the
performance observed over roughly the last
.Sy zfs_scan_report_txgs
TXGs.
When set to zero, performance is calculated over the time between checkpoints.
.
.It Sy zfs_scan_strict_mem_lim Ns = Ns Sy 0 Ns | Ns 1 Pq int
Enforce tight memory limits on pool scans when a sequential scan is in
progress.
When disabled, the memory limit may be exceeded by fast disks.
.
.It Sy zfs_scan_suspend_progress Ns = Ns Sy 0 Ns | Ns 1 Pq int
Freezes a scrub/resilver in progress without actually pausing it.
Intended for testing/debugging.
.It Sy zfs_scan_vdev_limit Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq int
Maximum amount of data that can be concurrently issued for scrubs and
resilvers per leaf device, given in bytes.
.
.It Sy zfs_send_corrupt_data Ns = Ns Sy 0 Ns | Ns 1 Pq int
Allow sending of corrupt data (ignore read/checksum errors when sending).
.
.It Sy zfs_send_unmodified_spill_blocks Ns = Ns Sy 1 Ns | Ns 0 Pq int
Include unmodified spill blocks in the send stream.
Under certain circumstances, previous versions of ZFS could incorrectly
remove the spill block from an existing object.
Including unmodified copies of the spill blocks creates a backwards-compatible
stream which will recreate a spill block if it was incorrectly removed.
.
.It Sy zfs_send_no_prefetch_queue_ff Ns = Ns Sy 20 Ns ^\-1 Pq uint
The fill fraction of the
.Nm zfs Cm send
internal queues.
The fill fraction controls the timing with which internal threads are woken up.
.
.It Sy zfs_send_no_prefetch_queue_length Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq uint
The maximum number of bytes allowed in
.Nm zfs Cm send Ns 's
internal queues.
.
.It Sy zfs_send_queue_ff Ns = Ns Sy 20 Ns ^\-1 Pq uint
The fill fraction of the
.Nm zfs Cm send
prefetch queue.
The fill fraction controls the timing with which internal threads are woken up.
.
.It Sy zfs_send_queue_length Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq uint
The maximum number of bytes that will be prefetched by
.Nm zfs Cm send .
This value must be at least twice the maximum block size in use.
.
.It Sy zfs_recv_queue_ff Ns = Ns Sy 20 Ns ^\-1 Pq uint
The fill fraction of the
.Nm zfs Cm receive
queue.
The fill fraction controls the timing with which internal threads are woken up.
.
.It Sy zfs_recv_queue_length Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq uint
The maximum number of bytes allowed in the
.Nm zfs Cm receive
queue.
This value must be at least twice the maximum block size in use.
.
.It Sy zfs_recv_write_batch_size Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq uint
The maximum amount of data, in bytes, that
.Nm zfs Cm receive
will write in one DMU transaction.
This is the uncompressed size, even when receiving a compressed send stream.
This setting will not reduce the write size below a single block.
Capped at a maximum of
.Sy 32 MiB .
.
.It Sy zfs_recv_best_effort_corrective Ns = Ns Sy 0 Pq int
When this variable is set to non-zero, a corrective receive:
.Bl -enum -compact -offset 4n -width "1."
.It
Does not enforce the restriction of source & destination snapshot GUIDs
matching.
.It
If there is an error during healing, the healing receive is not
terminated; instead it moves on to the next record.
.El
.
.It Sy zfs_override_estimate_recordsize Ns = Ns Sy 0 Ns | Ns 1 Pq uint
Setting this variable overrides the default logic for estimating block
sizes when doing a
.Nm zfs Cm send .
The default heuristic is that the average block size
will be the current recordsize.
Override this value if most data in your dataset is not of that size
and you require accurate zfs send size estimates.
.
.It Sy zfs_sync_pass_deferred_free Ns = Ns Sy 2 Pq uint
Flushing of data to disk is done in passes.
Defer frees starting in this pass.
.
.It Sy zfs_spa_discard_memory_limit Ns = Ns Sy 16777216 Ns B Po 16 MiB Pc Pq int
Maximum memory used for prefetching a checkpoint's space map on each
vdev while discarding the checkpoint.
.
.It Sy zfs_special_class_metadata_reserve_pct Ns = Ns Sy 25 Ns % Pq uint
Only allow small data blocks to be allocated on the special and dedup vdev
types when the available free space percentage on these vdevs exceeds this
value.
This ensures reserved space is available for pool metadata as the
special vdevs approach capacity.
.
.It Sy zfs_sync_pass_dont_compress Ns = Ns Sy 8 Pq uint
Starting in this sync pass, disable compression (including of metadata).
With the default setting, in practice, we don't have this many sync passes,
so this has no effect.
.Pp
The original intent was that disabling compression would help the sync passes
to converge.
However, in practice, disabling compression increases
the average number of sync passes, because when we turn compression off,
many blocks' size will change, and thus we have to re-allocate
(not overwrite) them.
It also increases the number of
.Em 128 KiB
allocations (e.g. for indirect blocks and spacemaps)
because these will not be compressed.
The
.Em 128 KiB
allocations are especially detrimental to performance
on highly fragmented systems, which may have very few free segments of this
size,
and may need to load new metaslabs to satisfy these allocations.
.
.It Sy zfs_sync_pass_rewrite Ns = Ns Sy 2 Pq uint
Rewrite new block pointers starting in this pass.
.
.It Sy zfs_trim_extent_bytes_max Ns = Ns Sy 134217728 Ns B Po 128 MiB Pc Pq uint
Maximum size of a TRIM command.
Larger ranges will be split into chunks no larger than this value before
issuing.
.
.It Sy zfs_trim_extent_bytes_min Ns = Ns Sy 32768 Ns B Po 32 KiB Pc Pq uint
Minimum size of TRIM commands.
TRIM ranges smaller than this will be skipped,
unless they're part of a larger range which was chunked.
This is done because it's common for these small TRIMs
to negatively impact overall performance.
.
.It Sy zfs_trim_metaslab_skip Ns = Ns Sy 0 Ns | Ns 1 Pq uint
Skip uninitialized metaslabs during the TRIM process.
This option is useful for pools constructed from large thinly-provisioned
devices where TRIM operations are slow.
As a pool ages, an increasing fraction of the pool's metaslabs
will be initialized, progressively degrading the usefulness of this option.
This setting is stored when starting a manual TRIM and will
persist for the duration of the requested TRIM.
.
.It Sy zfs_trim_queue_limit Ns = Ns Sy 10 Pq uint
Maximum number of queued TRIMs outstanding per leaf vdev.
The number of concurrent TRIM commands issued to the device is controlled by
.Sy zfs_vdev_trim_min_active No and Sy zfs_vdev_trim_max_active .
.
.It Sy zfs_trim_txg_batch Ns = Ns Sy 32 Pq uint
The number of transaction groups' worth of frees which should be aggregated
before TRIM operations are issued to the device.
This setting represents a trade-off between issuing larger,
more efficient TRIM operations and the delay
before the recently trimmed space is available for use by the device.
.Pp
Increasing this value will allow frees to be aggregated for a longer time.
This will result in larger TRIM operations and potentially increased memory
usage.
Decreasing this value will have the opposite effect.
The default of
.Sy 32
was determined to be a reasonable compromise.
.
.It Sy zfs_txg_history Ns = Ns Sy 100 Pq uint
Historical statistics for this many latest TXGs will be available in
.Pa /proc/spl/kstat/zfs/ Ns Ao Ar pool Ac Ns Pa /TXGs .
.
.It Sy zfs_txg_timeout Ns = Ns Sy 5 Ns s Pq uint
Flush dirty data to disk at least every this many seconds (maximum TXG
duration).
.
.It Sy zfs_vdev_aggregation_limit Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq uint
Max vdev I/O aggregation size.
.
.It Sy zfs_vdev_aggregation_limit_non_rotating Ns = Ns Sy 131072 Ns B Po 128 KiB Pc Pq uint
Max vdev I/O aggregation size for non-rotating media.
.
.It Sy zfs_vdev_mirror_rotating_inc Ns = Ns Sy 0 Pq int
A number by which the balancing algorithm increments the load calculation,
for the purpose of selecting the least busy mirror member,
when an I/O operation immediately follows its predecessor on rotational vdevs.
.
.It Sy zfs_vdev_mirror_rotating_seek_inc Ns = Ns Sy 5 Pq int
A number by which the balancing algorithm increments the load calculation,
for the purpose of selecting the least busy mirror member,
when an I/O operation lacks locality as defined by
.Sy zfs_vdev_mirror_rotating_seek_offset .
Operations within this that are not immediately following the previous
operation are incremented by half.
.
.It Sy zfs_vdev_mirror_rotating_seek_offset Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq int
The maximum distance for the last queued I/O operation in which
the balancing algorithm considers an operation to have locality.
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_mirror_non_rotating_inc Ns = Ns Sy 0 Pq int
A number by which the balancing algorithm increments the load calculation,
for the purpose of selecting the least busy mirror member,
on non-rotational vdevs
when I/O operations do not immediately follow one another.
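.Pp
A worked sketch of how these increments combine, assuming the defaults shown
here: on a rotational mirror member, an I/O operation that immediately follows
its predecessor adds
.Sy 0
to that member's load, an operation more than
.Em 1 MiB
away from the previous one adds
.Sy 5 ,
and an operation within
.Em 1 MiB
that does not immediately follow the previous one adds half of that increment.
The member with the lowest resulting load is selected for the next operation.
.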
.It Sy zfs_vdev_mirror_non_rotating_seek_inc Ns = Ns Sy 1 Pq int
A number by which the balancing algorithm increments the load calculation,
for the purpose of selecting the least busy mirror member,
when an I/O operation lacks locality as defined by
.Sy zfs_vdev_mirror_rotating_seek_offset .
Operations within this that are not immediately following the previous
operation are incremented by half.
.
.It Sy zfs_vdev_read_gap_limit Ns = Ns Sy 32768 Ns B Po 32 KiB Pc Pq uint
Aggregate read I/O operations if the on-disk gap between them is within this
threshold.
.
.It Sy zfs_vdev_write_gap_limit Ns = Ns Sy 4096 Ns B Po 4 KiB Pc Pq uint
Aggregate write I/O operations if the on-disk gap between them is within this
threshold.
.
.It Sy zfs_vdev_raidz_impl Ns = Ns Sy fastest Pq string
Select the raidz parity implementation to use.
.Pp
Variants that don't depend on CPU-specific features
may be selected on module load, as they are supported on all systems.
The remaining options may only be set after the module is loaded,
as they are available only if the implementations are compiled in
and supported on the running system.
.Pp
Once the module is loaded,
.Pa /sys/module/zfs/parameters/zfs_vdev_raidz_impl
will show the available options,
with the currently selected one enclosed in square brackets.
.Pp
.TS
lb l l .
fastest	selected by built-in benchmark
original	original implementation
scalar	scalar implementation
sse2	SSE2 instruction set	64-bit x86
ssse3	SSSE3 instruction set	64-bit x86
avx2	AVX2 instruction set	64-bit x86
avx512f	AVX512F instruction set	64-bit x86
avx512bw	AVX512F & AVX512BW instruction sets	64-bit x86
aarch64_neon	NEON	Aarch64/64-bit ARMv8
aarch64_neonx2	NEON with more unrolling	Aarch64/64-bit ARMv8
powerpc_altivec	Altivec	PowerPC
.TE
.
.It Sy zfs_vdev_scheduler Pq charp
.Sy DEPRECATED .
Prints a warning to the kernel log for compatibility.
.
.It Sy zfs_zevent_len_max Ns = Ns Sy 512 Pq uint
Max event queue length.
Events in the queue can be viewed with
.Xr zpool-events 8 .
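.Pp
For example, to follow new events as they are posted and then clear the queue:
.Bd -literal -compact
# zpool events -f    # print events as they arrive (Ctrl-C to stop)
# zpool events -c    # clear the current event queue
.Ed
.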
.It Sy zfs_zevent_retain_max Ns = Ns Sy 2000 Pq int
Maximum recent zevent records to retain for duplicate checking.
Setting this to
.Sy 0
disables duplicate detection.
.
.It Sy zfs_zevent_retain_expire_secs Ns = Ns Sy 900 Ns s Po 15 min Pc Pq int
Lifespan for a recent ereport that was retained for duplicate checking.
.
.It Sy zfs_zil_clean_taskq_maxalloc Ns = Ns Sy 1048576 Pq int
The maximum number of taskq entries that are allowed to be cached.
When this limit is exceeded, transaction records (itxs)
will be cleaned synchronously.
.
.It Sy zfs_zil_clean_taskq_minalloc Ns = Ns Sy 1024 Pq int
The number of taskq entries that are pre-populated when the taskq is first
created and are immediately available for use.
.
.It Sy zfs_zil_clean_taskq_nthr_pct Ns = Ns Sy 100 Ns % Pq int
This controls the number of threads used by
.Sy dp_zil_clean_taskq .
The default value of
.Sy 100%
will create a maximum of one thread per CPU.
.
.It Sy zil_maxblocksize Ns = Ns Sy 131072 Ns B Po 128 KiB Pc Pq uint
This sets the maximum block size used by the ZIL.
On very fragmented pools, lowering this
.Pq typically to Sy 36 KiB
can improve performance.
.
.It Sy zil_maxcopied Ns = Ns Sy 7680 Ns B Po 7.5 KiB Pc Pq uint
This sets the maximum number of write bytes logged via WR_COPIED.
It tunes a trade-off between an additional memory copy, and possibly worse
log space efficiency, versus additional range lock/unlock operations.
.
.It Sy zil_nocacheflush Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable the cache flush commands that are normally sent to disk by
the ZIL after an LWB write has completed.
Setting this will cause ZIL corruption on power loss
if a volatile out-of-order write cache is enabled.
.
.It Sy zil_replay_disable Ns = Ns Sy 0 Ns | Ns 1 Pq int
Disable intent logging replay.
Replay can be disabled to aid recovery from a corrupted ZIL.
.
.It Sy zil_slog_bulk Ns = Ns Sy 67108864 Ns B Po 64 MiB Pc Pq u64
Limit SLOG write size per commit executed with synchronous priority.
Any writes above that will be executed with lower (asynchronous) priority
to limit potential SLOG device abuse by a single active ZIL writer.
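.Pp
As an example of applying the ZIL tunables above, lowering
.Sy zil_maxblocksize
to
.Em 36 KiB
on a badly fragmented pool (a sketch; the Linux module parameter path is an
assumption):
.Bd -literal -compact
# echo 36864 > /sys/module/zfs/parameters/zil_maxblocksize
.Ed
.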
.It Sy zfs_zil_saxattr Ns = Ns Sy 1 Ns | Ns 0 Pq int
Setting this tunable to zero disables ZIL logging of new
.Sy xattr Ns = Ns Sy sa
records if the
.Sy org.openzfs:zilsaxattr
feature is enabled on the pool.
This would only be necessary to work around bugs in the ZIL logging or replay
code for this record type.
The tunable has no effect if the feature is disabled.
.
.It Sy zfs_embedded_slog_min_ms Ns = Ns Sy 64 Pq uint
Usually, one metaslab from each normal-class vdev is dedicated for use by
the ZIL to log synchronous writes.
However, if there are fewer than
.Sy zfs_embedded_slog_min_ms
metaslabs in the vdev, this functionality is disabled.
This ensures that we don't set aside an unreasonable amount of space for the
ZIL.
.
.It Sy zstd_earlyabort_pass Ns = Ns Sy 1 Pq uint
Whether the heuristic that detects incompressible data at zstd levels >= 3,
using LZ4 and zstd-1 passes, is enabled.
.
.It Sy zstd_abort_size Ns = Ns Sy 131072 Pq uint
Minimum uncompressed size (inclusive) of a record before the early abort
heuristic will be attempted.
.
.It Sy zio_deadman_log_all Ns = Ns Sy 0 Ns | Ns 1 Pq int
If non-zero, the zio deadman will produce debugging messages
.Pq see Sy zfs_dbgmsg_enable
for all zios, rather than only for leaf zios possessing a vdev.
This is meant to be used by developers to gain
diagnostic information for hang conditions which don't involve a mutex
or other locking primitive: typically conditions in which a thread in
the zio pipeline is looping indefinitely.
.
.It Sy zio_slow_io_ms Ns = Ns Sy 30000 Ns ms Po 30 s Pc Pq int
When an I/O operation takes more than this much time to complete,
it's marked as slow.
Each slow operation causes a delay zevent.
Slow I/O counters can be seen with
.Nm zpool Cm status Fl s .
.
.It Sy zio_dva_throttle_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
Throttle block allocations in the I/O pipeline.
This allows for dynamic allocation distribution when devices are imbalanced.
When enabled, the maximum number of pending allocations per top-level vdev
is limited by
.Sy zfs_vdev_queue_depth_pct .
.It Sy zfs_xattr_compat Ns = Ns 0 Ns | Ns 1 Pq int
Control the naming scheme used when setting new xattrs in the user namespace.
If
.Sy 0
.Pq the default on Linux ,
user namespace xattr names are prefixed with the namespace, to be backwards
compatible with previous versions of ZFS on Linux.
If
.Sy 1
.Pq the default on Fx ,
user namespace xattr names are not prefixed, to be backwards compatible with
previous versions of ZFS on illumos and
.Fx .
.Pp
Either naming scheme can be read on this and future versions of ZFS, regardless
of this tunable, but legacy ZFS on illumos or
.Fx
is unable to read user namespace xattrs written in the Linux format, and
legacy versions of ZFS on Linux are unable to read user namespace xattrs
written in the legacy ZFS format.
.Pp
An existing xattr with the alternate naming scheme is removed when overwriting
the xattr, so as to not accumulate duplicates.
.
.It Sy zio_requeue_io_start_cut_in_line Ns = Ns Sy 0 Ns | Ns 1 Pq int
Prioritize requeued I/O.
.
.It Sy zio_taskq_batch_pct Ns = Ns Sy 80 Ns % Pq uint
Percentage of online CPUs which will run a worker thread for I/O.
These workers are responsible for I/O work such as compression, encryption,
checksum and parity calculations.
A fractional number of CPUs will be rounded down.
.Pp
The default value of
.Sy 80%
was chosen to avoid using all CPUs, which can result in
latency issues and inconsistent application performance,
especially when slower compression and/or checksumming is enabled.
The set value only applies to pools imported or created afterwards.
.
.It Sy zio_taskq_batch_tpq Ns = Ns Sy 0 Pq uint
Number of worker threads per taskq.
Higher values improve I/O ordering and CPU utilization,
while lower values reduce lock contention.
If
.Sy 0 ,
generate a system-dependent value close to 6 threads per taskq.
The set value only applies to pools imported or created afterwards.
.
.It Sy zio_taskq_write_tpq Ns = Ns Sy 16 Pq uint
Determines the minimum number of threads per write issue taskq.
Higher values improve CPU utilization at high throughput,
while lower values reduce taskq lock contention at high IOPS.
The set value only applies to pools imported or created afterwards.
.
.It Sy zio_taskq_read Ns = Ns Sy fixed,1,8 null scale null Pq charp
Set the queue and thread configuration for the I/O read queues.
This is an advanced debugging parameter.
Don't change this unless you understand what it does.
Set values only apply to pools imported or created afterwards.
.
.It Sy zio_taskq_write Ns = Ns Sy sync null scale null Pq charp
Set the queue and thread configuration for the I/O write queues.
This is an advanced debugging parameter.
Don't change this unless you understand what it does.
Set values only apply to pools imported or created afterwards.
.
.It Sy zvol_inhibit_dev Ns = Ns Sy 0 Ns | Ns 1 Pq uint
Do not create zvol device nodes.
This may slightly improve startup time on
systems with a very large number of zvols.
.
.It Sy zvol_major Ns = Ns Sy 230 Pq uint
Major number for zvol block devices.
.
.It Sy zvol_max_discard_blocks Ns = Ns Sy 16384 Pq long
Discard (TRIM) operations done on zvols will be done in batches of this
many blocks, where block size is determined by the
.Sy volblocksize
property of a zvol.
.
.It Sy zvol_prefetch_bytes Ns = Ns Sy 131072 Ns B Po 128 KiB Pc Pq uint
When adding a zvol to the system, prefetch this many bytes
from the start and end of the volume.
Prefetching these regions of the volume is desirable,
because they are likely to be accessed immediately by
.Xr blkid 8
or the kernel partitioner.
.
.It Sy zvol_request_sync Ns = Ns Sy 0 Ns | Ns 1 Pq uint
When processing I/O requests for a zvol, submit them synchronously.
This effectively limits the queue depth to
.Em 1
for each I/O submitter.
When unset, requests are handled asynchronously by a thread pool.
The number of requests which can be handled concurrently is controlled by
.Sy zvol_threads .
.Sy zvol_request_sync
is ignored when running on a kernel that supports block multiqueue
.Pq Li blk-mq .
.
.It Sy zvol_num_taskqs Ns = Ns Sy 0 Pq uint
Number of zvol taskqs.
If
.Sy 0
(the default), then scaling is done internally to prefer 6 threads per taskq.
This only applies on Linux.
.
.It Sy zvol_threads Ns = Ns Sy 0 Pq uint
The number of system-wide threads to use for processing zvol block I/O.
If
.Sy 0
(the default), then internally set
.Sy zvol_threads
to the number of CPUs present or 32 (whichever is greater).
.
.It Sy zvol_blk_mq_threads Ns = Ns Sy 0 Pq uint
The number of threads per zvol to use for queuing I/O requests.
This parameter will only appear if your kernel supports
.Li blk-mq
and is only read and assigned to a zvol at zvol load time.
If
.Sy 0
(the default), then internally set
.Sy zvol_blk_mq_threads
to the number of CPUs present.
.
.It Sy zvol_use_blk_mq Ns = Ns Sy 0 Ns | Ns 1 Pq uint
Set to
.Sy 1
to use the
.Li blk-mq
API for zvols.
Set to
.Sy 0
(the default) to use the legacy zvol APIs.
This setting can give better or worse zvol performance depending on
the workload.
This parameter will only appear if your kernel supports
.Li blk-mq
and is only read and assigned to a zvol at zvol load time.
.
.It Sy zvol_blk_mq_blocks_per_thread Ns = Ns Sy 8 Pq uint
If
.Sy zvol_use_blk_mq
is enabled, then process this number of
.Sy volblocksize Ns -sized blocks per zvol thread.
This tunable can be used to favor better performance for zvol reads (lower
values) or writes (higher values).
If set to
.Sy 0 ,
then the zvol layer will process the maximum number of blocks
per thread that it can.
This parameter will only appear if your kernel supports
.Li blk-mq
and is only applied at each zvol's load time.
.
.It Sy zvol_blk_mq_queue_depth Ns = Ns Sy 0 Pq uint
The queue_depth value for the zvol
.Li blk-mq
interface.
This parameter will only appear if your kernel supports
.Li blk-mq
and is only applied at each zvol's load time.
If
.Sy 0
(the default), then use the kernel's default queue depth.
Values are clamped to the kernel's
.Dv BLKDEV_MIN_RQ
and
.Dv BLKDEV_MAX_RQ Ns / Ns Dv BLKDEV_DEFAULT_RQ
limits.
.
.It Sy zvol_volmode Ns = Ns Sy 1 Pq uint
Defines zvol block device behaviour when
.Sy volmode Ns = Ns Sy default :
.Bl -tag -compact -offset 4n -width "a"
.It Sy 1
.No equivalent to Sy full
.It Sy 2
.No equivalent to Sy dev
.It Sy 3
.No equivalent to Sy none
.El
.
.It Sy zvol_enforce_quotas Ns = Ns Sy 0 Ns | Ns 1 Pq uint
Enable strict ZVOL quota enforcement.
The strict quota enforcement may have a performance impact.
.El
.
.Sh ZFS I/O SCHEDULER
ZFS issues I/O operations to leaf vdevs to satisfy and complete I/O operations.
The scheduler determines when and in what order those operations are issued.
The scheduler divides operations into five I/O classes,
prioritized in the following order: sync read, sync write, async read,
async write, and scrub/resilver.
Each queue defines the minimum and maximum number of concurrent operations
that may be issued to the device.
In addition, the device has an aggregate maximum,
.Sy zfs_vdev_max_active .
Note that the sum of the per-queue minima must not exceed the aggregate
maximum.
If the sum of the per-queue maxima exceeds the aggregate maximum,
then the number of active operations may reach
.Sy zfs_vdev_max_active ,
in which case no further operations will be issued,
regardless of whether all per-queue minima have been met.
.Pp
For many physical devices, throughput increases with the number of
concurrent operations, but latency typically suffers.
Furthermore, physical devices typically have a limit
at which more concurrent operations have no
effect on throughput or can actually cause it to decrease.
.Pp
The scheduler selects the next operation to issue by first looking for an
I/O class whose minimum has not been satisfied.
Once all are satisfied and the aggregate maximum has not been hit,
the scheduler looks for classes whose maximum has not been satisfied.
Iteration through the I/O classes is done in the order specified above.
No further operations are issued
if the aggregate maximum number of concurrent operations has been hit,
or if there are no operations queued for an I/O class that has not hit its
maximum.
Every time an I/O operation is queued or an operation completes,
the scheduler looks for new operations to issue.
.Pp
In general, smaller
.Sy max_active Ns s
will lead to lower latency of synchronous operations.
Larger
.Sy max_active Ns s
may lead to higher overall throughput, depending on underlying storage.
.Pp
The ratio of the queues'
.Sy max_active Ns s
determines the balance of performance between reads, writes, and scrubs.
For example, increasing
.Sy zfs_vdev_scrub_max_active
will cause the scrub or resilver to complete more quickly,
but cause reads and writes to have higher latency and lower throughput.
.Pp
All I/O classes have a fixed maximum number of outstanding operations,
except for the async write class.
Asynchronous writes represent the data that is committed to stable storage
during the syncing stage for transaction groups.
Transaction groups enter the syncing state periodically,
so the number of queued async writes will quickly burst up
and then bleed down to zero.
Rather than servicing them as quickly as possible,
the I/O scheduler changes the maximum number of active async write operations
according to the amount of dirty data in the pool.
Since both throughput and latency typically increase with the number of
concurrent operations issued to physical devices, reducing the
burstiness in the number of simultaneous operations also stabilizes the
response time of operations from other queues, in particular synchronous ones.
In broad strokes, the I/O scheduler will issue more concurrent operations
from the async write queue as there is more dirty data in the pool.
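.Pp
The per-class limits described above can be inspected together; a minimal
sketch, assuming the Linux module parameter directory:
.Bd -literal -compact
# cd /sys/module/zfs/parameters
# for c in sync_read sync_write async_read async_write scrub; do
>   echo "$c: $(cat zfs_vdev_${c}_min_active)/$(cat zfs_vdev_${c}_max_active)"
> done
.Ed
.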
.Ss Async Writes
The number of concurrent operations issued for the async write I/O class
follows a piece-wise linear function defined by a few adjustable points:
.Bd -literal
       |              o---------| <-- \fBzfs_vdev_async_write_max_active\fP
  ^    |             /^         |
  |    |            / |         |
active |           /  |         |
 I/O   |          /   |         |
count  |         /    |         |
       |        /     |         |
       |-------o      |         | <-- \fBzfs_vdev_async_write_min_active\fP
      0|_______^______|_________|
       0%      |      |       100% of \fBzfs_dirty_data_max\fP
               |      |
               |      `-- \fBzfs_vdev_async_write_active_max_dirty_percent\fP
               `--------- \fBzfs_vdev_async_write_active_min_dirty_percent\fP
.Ed
.Pp
Until the amount of dirty data exceeds a minimum percentage of the dirty
data allowed in the pool, the I/O scheduler will limit the number of
concurrent operations to the minimum.
As that threshold is crossed, the number of concurrent operations issued
increases linearly to the maximum at the specified maximum percentage
of the dirty data allowed in the pool.
.Pp
Ideally, the amount of dirty data on a busy pool will stay in the sloped
part of the function between
.Sy zfs_vdev_async_write_active_min_dirty_percent
and
.Sy zfs_vdev_async_write_active_max_dirty_percent .
If it exceeds the maximum percentage,
this indicates that the rate of incoming data is
greater than the rate that the backend storage can handle.
In this case, we must further throttle incoming writes,
as described in the next section.
.
.Sh ZFS TRANSACTION DELAY
We delay transactions when we've determined that the backend storage
isn't able to accommodate the rate of incoming writes.
.Pp
If there is already a transaction waiting, we delay relative to when
that transaction will finish waiting.
This way the calculated delay time
is independent of the number of threads concurrently executing transactions.
.Pp
If we are the only waiter, wait relative to when the transaction started,
rather than the current time.
This credits the transaction for "time already served",
e.g. reading indirect blocks.
.Sh ZFS TRANSACTION DELAY
We delay transactions when we've determined that the backend storage
isn't able to accommodate the rate of incoming writes.
.Pp
If there is already a transaction waiting, we delay relative to when
that transaction will finish waiting.
This way the calculated delay time
is independent of the number of threads concurrently executing transactions.
.Pp
If we are the only waiter, wait relative to when the transaction started,
rather than the current time.
This credits the transaction for "time already served",
e.g. reading indirect blocks.
.Pp
The minimum time for a transaction to take is calculated as
.D1 min_time = min( Ns Sy zfs_delay_scale No \(mu Po Sy dirty No \- Sy min Pc / Po Sy max No \- Sy dirty Pc , 100ms)
.Pp
The delay has two degrees of freedom that can be adjusted via tunables.
The percentage of dirty data at which we start to delay is defined by
.Sy zfs_delay_min_dirty_percent .
This should typically be at or above
.Sy zfs_vdev_async_write_active_max_dirty_percent ,
so that we only start to delay after writing at full speed
has failed to keep up with the incoming write rate.
The scale of the curve is defined by
.Sy zfs_delay_scale .
Roughly speaking, this variable determines the amount of delay at the midpoint
of the curve.
.Bd -literal
delay
 10ms +-------------------------------------------------------------*+
      |                                                             *|
  9ms +                                                             *+
      |                                                             *|
  8ms +                                                             *+
      |                                                            * |
  7ms +                                                            * +
      |                                                            * |
  6ms +                                                            * +
      |                                                            * |
  5ms +                                                            * +
      |                                                            * |
  4ms +                                                            * +
      |                                                            * |
  3ms +                                                            * +
      |                                                            * |
  2ms +                                              (midpoint)    * +
      |                                                    |    **   |
  1ms +                                                    v ***     +
      |             \fBzfs_delay_scale\fP ---------->     ********         |
    0 +-------------------------------------*********----------------+
      0%                    <- \fBzfs_dirty_data_max\fP ->               100%
.Ed
.Pp
Note that, since the delay is added to the outstanding time remaining on the
most recent transaction, it is effectively the inverse of IOPS.
Here, the midpoint of
.Em 500 us
translates to
.Em 2000 IOPS .
The shape of the curve
was chosen such that small changes in the amount of accumulated dirty data
in the first three quarters of the curve yield relatively small differences
in the amount of delay.
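.Pp
The computation can be sketched as follows.
This is an illustration of the formula and wait accounting above, not the
kernel code; names are hypothetical and overflow handling is omitted:
.Bd -literal
#include <stdint.h>

#define NANOSEC         1000000000LL
#define MAX_DELAY_NS    (NANOSEC / 10)  /* the 100 ms cap above */

/* min_time = zfs_delay_scale * (dirty - min) / (max - dirty) */
static int64_t
tx_delay_sketch(uint64_t dirty, uint64_t dirty_min, uint64_t dirty_max,
    int64_t delay_scale)
{
        if (dirty <= dirty_min)
                return (0);             /* below the delay threshold */
        if (dirty >= dirty_max)
                return (MAX_DELAY_NS);

        int64_t delay = delay_scale * (int64_t)(dirty - dirty_min) /
            (int64_t)(dirty_max - dirty);
        return (delay < MAX_DELAY_NS ? delay : MAX_DELAY_NS);
}

/*
 * The delay is applied relative to the later of the transaction's
 * start time and the previous waiter's wakeup time, implementing
 * the wait accounting described at the start of this section.
 */
static int64_t
tx_wakeup_sketch(int64_t tx_started, int64_t last_wakeup, int64_t delay)
{
        int64_t base = tx_started > last_wakeup ? tx_started : last_wakeup;

        return (base + delay);
}
.Ed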
.Pp
The effects can be easier to understand when the amount of delay is
represented on a logarithmic scale:
.Bd -literal
delay
100ms +-------------------------------------------------------------++
      +                                                              +
      |                                                              |
      +                                                             *+
 10ms +                                                             *+
      +                                                            ** +
      |                                           (midpoint)      **  |
      +                                                 |        **   +
  1ms +                                                 v ****        +
      +             \fBzfs_delay_scale\fP ---------->       *****         +
      |                                            ****               |
      +                                         ****                  +
100us +                                       **                      +
      +                                      *                        +
      |                                     *                         |
      +                                    *                          +
 10us +                                    *                          +
      +                                                               +
      |                                                               |
      +                                                               +
      +--------------------------------------------------------------+
      0%                    <- \fBzfs_dirty_data_max\fP ->               100%
.Ed
.Pp
Note here that only as the amount of dirty data approaches its limit does
the delay start to increase rapidly.
The goal of a properly tuned system should be to keep the amount of dirty data
out of that range by first ensuring that the appropriate limits are set
for the I/O scheduler to reach optimal throughput on the back-end storage,
and then by changing the value of
.Sy zfs_delay_scale
to increase the steepness of the curve.
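.Pp
To make the steepness adjustment concrete, here is a worked example using
the hypothetical tx_delay_sketch() function above, with illustrative
values: delaying begins at 60% of
.Sy zfs_dirty_data_max
and the scale is 500000 ns:
.Bd -literal
/*
 * dirty_min = 60, dirty_max = 100 (percent of zfs_dirty_data_max),
 * delay_scale = 500000 ns:
 *
 *   dirty = 80  ->  500000 * 20 / 20 =   500000 ns  (0.5 ms, midpoint)
 *   dirty = 90  ->  500000 * 30 / 10 =  1500000 ns  (1.5 ms)
 *   dirty = 99  ->  500000 * 39 /  1 = 19500000 ns  (19.5 ms)
 *
 * Doubling delay_scale doubles the delay at every point, steepening
 * the approach to the limit.
 */
.Ed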