.. SPDX-License-Identifier: GPL-2.0

===========
IPvs-sysctl
===========

/proc/sys/net/ipv4/vs/* Variables:
==================================

am_droprate - INTEGER
        default 10

        It sets the always mode drop rate, which is used in mode 3
        of the drop_packet defense.

amemthresh - INTEGER
        default 1024

        It sets the available memory threshold (in pages), which is
        used in the automatic modes of defense. When there is not
        enough available memory, the respective strategy will be
        enabled and the variable is automatically set to 2, otherwise
        the strategy is disabled and the variable is set to 1.

backup_only - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, disable the director function while the server is
        in backup mode to avoid packet loops for DR/TUN methods.

conn_lfactor - INTEGER
        Possible values: -8 (larger table) .. 8 (smaller table)

        Default: -4

        Controls the sizing of the connection hash table based on the
        load factor (number of connections per table bucket):

        2^conn_lfactor = nodes / buckets

        As a result, the table grows if load increases and shrinks when
        load decreases in the range of 2^8 to 2^conn_tab_bits (module
        parameter).
        The value is a shift count where negative values select
        buckets = (connection hash nodes << -value) while positive
        values select buckets = (connection hash nodes >> value). The
        negative values reduce the collisions and reduce the time for
        lookups but increase the table size. Positive values will
        tolerate load above 100% when a smaller table is preferred
        at the cost of more collisions. If using NAT connections,
        consider decreasing the value by one because they add two
        nodes in the hash table.

        Example:

        - -4: grow if load goes above 6% (buckets = nodes * 16)
        - 2: grow if load goes above 400% (buckets = nodes / 4)

conn_reuse_mode - INTEGER
        1 - default

        Controls how ipvs will deal with connections for which port
        reuse is detected. It is a bitmap, with the values being:

        0: disable any special handling on port reuse. The new
        connection will be delivered to the same real server that was
        servicing the previous connection.

        bit 1: enable rescheduling of new connections when it is safe.
        That is, whenever expire_nodest_conn is enabled and, for TCP
        sockets, when the connection is in TIME_WAIT state (which is
        only possible if you use NAT mode).

        bit 2: it is bit 1 plus, for TCP connections, when connections
        are in FIN_WAIT state, as this is the last state seen by the
        load balancer in Direct Routing mode. This bit helps when
        adding new real servers to a very busy cluster.

conntrack - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, maintain connection tracking entries for
        connections handled by IPVS.

        This should be enabled if connections handled by IPVS are to be
        also handled by stateful firewall rules. That is, iptables rules
        that make use of connection tracking. It is a performance
        optimisation to disable this setting otherwise.

        Connections handled by the IPVS FTP application module
        will have connection tracking entries regardless of this setting.

        Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.

cache_bypass - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If it is enabled, forward packets to the original destination
        directly when no cache server is available and the destination
        address is not local (iph->daddr is RTN_UNICAST). It is mostly
        used in transparent web cache clusters.
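As a rough illustration of the conn_lfactor shift-count rule described earlier in this file, the following Python sketch computes the bucket count and the load above which the table would grow. The function names ``buckets_for`` and ``grow_threshold`` are hypothetical and are not part of IPVS. The sketch also ignores the clamping to the 2^8 .. 2^conn_tab_bits range, so it is only an arithmetic aid and not the kernel's resizing logic.

```python
def buckets_for(nodes, lfactor):
    """Apply the documented shift-count rule to a hash node count.

    Hypothetical helper: negative values select nodes << -value,
    positive values select nodes >> value.
    """
    if not -8 <= lfactor <= 8:
        raise ValueError("conn_lfactor must be in the range -8..8")
    if lfactor < 0:
        return nodes << -lfactor
    return nodes >> lfactor


def grow_threshold(lfactor):
    """Load factor (nodes per bucket) given by 2^conn_lfactor."""
    return 2.0 ** lfactor


# conn_lfactor = -4: buckets = nodes * 16, load threshold 1/16
print(buckets_for(1024, -4), grow_threshold(-4))
# conn_lfactor = 2: buckets = nodes / 4, load threshold 4 (400%)
print(buckets_for(1024, 2), grow_threshold(2))
```

With 1024 nodes, a value of -4 gives 16384 buckets and a value of 2 gives 256 buckets, matching the two cases in the conn_lfactor example.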

debug_level - INTEGER
        - 0 - transmission error messages (default)
        - 1 - non-fatal error messages
        - 2 - configuration
        - 3 - destination trash
        - 4 - drop entry
        - 5 - service lookup
        - 6 - scheduling
        - 7 - connection new/expire, lookup and synchronization
        - 8 - state transition
        - 9 - binding destination, template checks and applications
        - 10 - IPVS packet transmission
        - 11 - IPVS packet handling (ip_vs_in/ip_vs_out)
        - 12 or more - packet traversal

        Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.

        Higher debugging levels include the messages for lower debugging
        levels, so setting debug level 2 includes level 0, 1 and 2
        messages. Thus, logging becomes more and more verbose the higher
        the level.

drop_entry - INTEGER
        - 0 - disabled (default)

        The drop_entry defense randomly drops entries in the
        connection hash table in order to reclaim some memory for
        new connections. In the current code, the drop_entry
        procedure can be activated every second; it then randomly
        scans 1/32 of the whole table and drops entries that are in
        the SYN-RECV/SYNACK state, which should be effective against
        SYN flooding attacks.

        The valid values of drop_entry are from 0 to 3, where 0 means
        that this strategy is always disabled, 1 and 2 mean automatic
        modes (when there is not enough available memory, the strategy
        is enabled and the variable is automatically set to 2,
        otherwise the strategy is disabled and the variable is set to
        1), and 3 means that the strategy is always enabled.

drop_packet - INTEGER
        - 0 - disabled (default)

        The drop_packet defense is designed to drop 1 out of every
        rate packets before forwarding them to real servers. If the
        rate is 1, all incoming packets are dropped.

        The value definition is the same as that of drop_entry.
        In the automatic mode, the rate is determined by the following
        formula when available memory is less than the available
        memory threshold:

        rate = amemthresh / (amemthresh - available_memory)

        When mode 3 is set, the always mode drop rate is controlled
        by /proc/sys/net/ipv4/vs/am_droprate.

est_cpulist - CPULIST
        Allowed CPUs for estimation kthreads

        Syntax: standard cpulist format

        - empty list - stop kthread tasks and estimation
        - default - the system's housekeeping CPUs for kthreads

        Example:

        - "all": all possible CPUs
        - "0-N": all possible CPUs, N denotes last CPU number
        - "0,1-N:1/2": the first CPU and all odd-numbered CPUs
        - "": empty list

est_nice - INTEGER
        default 0

        Valid range: -20 (more favorable) .. 19 (less favorable)

        Niceness value to use for the estimation kthreads (scheduling
        priority)

expire_nodest_conn - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        With the default value of 0, the load balancer silently drops
        packets when their destination server is not available. This
        may be useful when a user-space monitoring program deletes the
        destination server (because of server overload or wrong
        detection) and adds the server back later, so that the
        connections to the server can continue.

        If this feature is enabled, the load balancer will expire the
        connection immediately when a packet arrives and its
        destination server is not available, then the client program
        will be notified that the connection is closed. This is
        equivalent to the feature some people require to flush
        connections when their destination is not available.

expire_quiescent_template - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        When set to a non-zero value, the load balancer will expire
        persistent templates when the destination server is quiescent.
        This may be useful when a user makes a destination server
        quiescent by setting its weight to 0 and it is desired that
        subsequent, otherwise persistent, connections are sent to a
        different destination server. By default new persistent
        connections are allowed to quiescent destination servers.

        If this feature is enabled, the load balancer will expire the
        persistence template if it is to be used to schedule a new
        connection and the destination server is quiescent.

ignore_tunneled - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, ipvs will set the ipvs_property on all packets which are of
        unrecognized protocols. This prevents us from routing tunneled
        protocols like ipip, which is useful to prevent rescheduling
        packets that have been tunneled to the ipvs host (i.e. to prevent
        ipvs routing loops when ipvs is also acting as a real server).

nat_icmp_send - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        It controls sending ICMP error messages (ICMP_DEST_UNREACH)
        for VS/NAT when the load balancer receives packets from real
        servers but the connection entries don't exist.

pmtu_disc - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        By default, reject with FRAG_NEEDED all DF packets that exceed
        the PMTU, irrespective of the forwarding method. For the TUN
        method the flag can be disabled to fragment such packets.

secure_tcp - INTEGER
        - 0 - disabled (default)

        The secure_tcp defense uses a more complicated TCP state
        transition table. For VS/NAT, it also delays entering the
        TCP ESTABLISHED state until the three-way handshake is completed.

        The value definition is the same as that of drop_entry and
        drop_packet.

svc_lfactor - INTEGER
        Possible values: -8 (larger table) ..
        8 (smaller table)

        Default: -3

        Controls the sizing of the service hash table based on the
        load factor (number of services per table bucket). The table
        will grow and shrink in the range of 2^4 to 2^20.
        See conn_lfactor for explanation.

sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
        default 3 50

        It sets the synchronization threshold, which is the minimum
        number of incoming packets that a connection needs to receive
        before the connection will be synchronized. A connection will
        be synchronized every time the number of its incoming packets
        modulo sync_period equals the threshold. The range of the
        threshold is from 0 to sync_period.

        When sync_period and sync_refresh_period are 0, send sync only
        for state changes or only once when pkts matches sync_threshold.

sync_refresh_period - UNSIGNED INTEGER
        default 0

        The difference, in seconds, in the reported connection timer
        that triggers a new sync message. It can be used to avoid sync
        messages for the specified period (or half of the connection
        timeout if it is lower) if the connection state has not changed
        since the last sync.

        This is useful for normal connections with high traffic to reduce
        sync rate. Additionally, retry sync_retries times with period of
        sync_refresh_period/8.

sync_retries - INTEGER
        default 0

        Defines sync retries with period of sync_refresh_period/8. Useful
        to protect against loss of sync messages. The range of
        sync_retries is from 0 to 3.

sync_qlen_max - UNSIGNED LONG

        Hard limit for queued sync messages that are not sent yet. It
        defaults to 1/32 of the memory pages but actually represents
        number of messages. It will protect us from allocating large
        parts of memory when the sending rate is lower than the queuing
        rate.
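The packet-count rule given for sync_threshold above can be sketched in Python as follows. The function name ``should_sync`` is hypothetical, and the sketch is a simplification: it covers only the packet counter test and deliberately leaves out state changes, sync_refresh_period and sync_retries, so it should not be read as the kernel's actual decision logic.

```python
def should_sync(pkts, threshold, period):
    """Return True if the packet-count rule alone would trigger a sync.

    Hypothetical helper. With period > 0 the documented test is
    pkts modulo sync_period == sync_threshold. With period == 0 the
    sketch assumes the "only once when pkts matches sync_threshold"
    case and ignores state changes.
    """
    if period > 0:
        return pkts % period == threshold
    return pkts == threshold


# Defaults "3 50": list the packet counts that match the rule.
matches = [p for p in range(1, 120) if should_sync(p, 3, 50)]
print(matches)
```

With the default values 3 and 50, the rule matches at packet counts 3, 53 and 103 in the range tested here.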

sync_sock_size - INTEGER
        default 0

        Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
        Default value is 0 (preserve system defaults).

sync_ports - INTEGER
        default 1

        The number of threads that master and backup servers can use for
        sync traffic. Every thread will use a single UDP port: thread 0
        will use the default port 8848 while the last thread will use
        port 8848+sync_ports-1.

snat_reroute - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        If enabled, recalculate the route of SNATed packets from
        realservers so that they are routed as if they originate from the
        director. Otherwise they are routed as if they are forwarded by the
        director.

        If policy routing is in effect then it is possible that the route
        of a packet originating from a director is routed differently to a
        packet being forwarded by the director.

        If policy routing is not in effect then the recalculated route will
        always be the same as the original route so it is an optimisation
        to disable snat_reroute and avoid the recalculation.

sync_persist_mode - INTEGER
        default 0

        Controls the synchronisation of connections when using persistence.

        0: All types of connections are synchronised.

        1: Attempt to reduce the synchronisation traffic depending on
        the connection type. For persistent services avoid synchronisation
        for normal connections, do it only for persistence templates.
        In such a case, for TCP and SCTP it may need enabling sloppy_tcp
        and sloppy_sctp flags on backup servers. For non-persistent
        services such optimization is not applied, mode 0 is assumed.

sync_version - INTEGER
        default 1

        The version of the synchronisation protocol used when sending
        synchronisation messages.

        0 selects the original synchronisation protocol (version 0).
        This should be used when sending synchronisation messages to a
        legacy system that only understands the original synchronisation
        protocol.

        1 selects the current synchronisation protocol (version 1). This
        should be used where possible.

        Kernels with this sync_version entry are able to receive messages
        of both version 0 and version 1 of the synchronisation protocol.

run_estimation - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        If disabled, the estimation will be suspended and kthread tasks
        stopped.

        You can always re-enable estimation by setting this value to 1.
        But be careful, the first estimation after re-enabling is not
        accurate.
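For the sync_ports entry earlier in this file, the per-thread UDP port assignment can be written out as a short Python sketch. The name ``sync_thread_ports`` is hypothetical. The only facts taken from the text are the default port 8848 and the rule that the last thread uses port 8848+sync_ports-1. That the threads in between use consecutive ports is an assumption of this sketch.

```python
BASE_SYNC_PORT = 8848  # default port used by thread 0, per the text


def sync_thread_ports(sync_ports):
    """List the UDP port for each sync thread, thread 0 first.

    Hypothetical helper; assumes consecutive ports between the first
    and the last thread.
    """
    if sync_ports < 1:
        raise ValueError("sync_ports must be at least 1")
    return [BASE_SYNC_PORT + i for i in range(sync_ports)]


print(sync_thread_ports(1))
print(sync_thread_ports(4))
```

Under that assumption, a setting of 4 would span ports 8848 through 8851, which may be useful to know when writing firewall rules for sync traffic.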