.. SPDX-License-Identifier: GPL-2.0

===========
IPvs-sysctl
===========

/proc/sys/net/ipv4/vs/* Variables:
==================================

am_droprate - INTEGER
        default 10

        It sets the always mode drop rate, which is used in mode 3
        of the drop_packet defense.

amemthresh - INTEGER
        default 1024

        It sets the available memory threshold (in pages), which is
        used in the automatic modes of defense. When there is not
        enough available memory, the respective strategy will be
        enabled and its variable is automatically set to 2; otherwise
        the strategy is disabled and its variable is set to 1.

backup_only - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, disable the director function while the server is
        in backup mode to avoid packet loops for DR/TUN methods.

conn_lfactor - INTEGER
        Possible values: -8 (larger table) .. 8 (smaller table)

        Default: -4

        Controls the sizing of the connection hash table based on the
        load factor (number of connections per table bucket):

                2^conn_lfactor = nodes / buckets

        As a result, the table grows if load increases and shrinks when
        load decreases, in the range of 2^8 .. 2^conn_tab_bits (module
        parameter).
        The value is a shift count where negative values select
        buckets = (connection hash nodes << -value) while positive
        values select buckets = (connection hash nodes >> value).
        Negative values reduce collisions and the time for lookups
        but increase the table size. Positive values tolerate load
        above 100%, for cases where a smaller table is preferred at
        the cost of more collisions. If using NAT connections,
        consider decreasing the value by one because they add two
        nodes to the hash table.

        Examples:

        - -4: grow if load goes above 6% (buckets = nodes * 16)
        - 2: grow if load goes above 400% (buckets = nodes / 4)

conn_max - INTEGER
        Limit for number of connections, per netns.

        Controls the soft and hard limits for the number of connections.
        Initially, the platform specific limit is assigned for init_net.
        The value can be changed, and the soft limit is later propagated
        to other network namespaces.

        A privileged admin can change both limits up to the value of the
        platform limit, while an unprivileged admin can change only the
        soft limit, up to the value of the hard limit.

        For setups using conntrack=1 (CONFIG_IP_VS_NFCT for
        Netfilter connection tracking) the connections can
        also be limited by nf_conntrack_max.

        Limits for init_net:

        ======================= =============== =============
        \                       soft limit      hard limit
        ======================= =============== =============
        create netns            platform        platform
        priv admin              0 .. platform   0 .. platform
        ======================= =============== =============

        Limits for new netns:

        ======================= =============== =============
        \                       soft limit      hard limit
        ======================= =============== =============
        create netns            init_net:soft   init_net:soft
        priv admin              0 .. platform   0 .. platform
        unpriv admin            0 .. hard       N/A
        ======================= =============== =============

        Limits per platform:

        - 1,073,741,824 (2^30 for 64-bit)
        - 16,777,216 (2^24 for 32-bit)

        Possible values: 0 .. platform limit

        Default: platform limit

conn_reuse_mode - INTEGER
        1 - default

        Controls how ipvs will deal with connections for which port
        reuse is detected. It is a bitmap, with the values being:

        0: disable any special handling on port reuse. The new
        connection will be delivered to the same real server that was
        servicing the previous connection.

        bit 1: enable rescheduling of new connections when it is safe.
        That is, whenever expire_nodest_conn applies and, for TCP
        sockets, when the connection is in TIME_WAIT state (which is
        only possible if you use NAT mode).

        bit 2: it is bit 1 plus, for TCP connections, when connections
        are in FIN_WAIT state, as this is the last state seen by the
        load balancer in Direct Routing mode. This bit helps when adding
        new real servers to a very busy cluster.

conntrack - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, maintain connection tracking entries for
        connections handled by IPVS.

        This should be enabled if connections handled by IPVS are to be
        also handled by stateful firewall rules. That is, iptables rules
        that make use of connection tracking. It is a performance
        optimisation to disable this setting otherwise.

        Connections handled by the IPVS FTP application module
        will have connection tracking entries regardless of this setting.

        Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.

cache_bypass - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If it is enabled, forward packets to the original destination
        directly when no cache server is available and the destination
        address is not local (iph->daddr is RTN_UNICAST). It is mostly
        used in transparent web cache clusters.

debug_level - INTEGER
        - 0          - transmission error messages (default)
        - 1          - non-fatal error messages
        - 2          - configuration
        - 3          - destination trash
        - 4          - drop entry
        - 5          - service lookup
        - 6          - scheduling
        - 7          - connection new/expire, lookup and synchronization
        - 8          - state transition
        - 9          - binding destination, template checks and applications
        - 10         - IPVS packet transmission
        - 11         - IPVS packet handling (ip_vs_in/ip_vs_out)
        - 12 or more - packet traversal

        Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.

        Higher debugging levels include the messages for lower debugging
        levels, so setting debug level 2 includes level 0, 1 and 2
        messages.
        Thus, logging becomes more and more verbose the higher
        the level.

drop_entry - INTEGER
        - 0 - disabled (default)

        The drop_entry defense randomly drops entries in the
        connection hash table in order to reclaim some memory for
        new connections. In the current code, the drop_entry
        procedure can be activated every second; it then randomly
        scans 1/32 of the whole table and drops entries that are in
        the SYN-RECV/SYNACK state, which should be effective against
        SYN flooding attacks.

        The valid values of drop_entry are from 0 to 3, where 0 means
        that this strategy is always disabled, 1 and 2 mean automatic
        modes (when there is not enough available memory, the strategy
        is enabled and the variable is automatically set to 2,
        otherwise the strategy is disabled and the variable is set to
        1), and 3 means that the strategy is always enabled.

drop_packet - INTEGER
        - 0 - disabled (default)

        The drop_packet defense is designed to drop 1/rate packets
        before forwarding them to real servers. If the rate is 1,
        all incoming packets are dropped.

        The value definition is the same as that of drop_entry. In
        the automatic mode, the rate is determined by the following
        formula when available memory is less than the available
        memory threshold:

                rate = amemthresh / (amemthresh - available_memory)

        When mode 3 is set, the always mode drop rate is controlled
        by /proc/sys/net/ipv4/vs/am_droprate.

est_cpulist - CPULIST
        Allowed CPUs for estimation kthreads

        Syntax: standard cpulist format

        - empty list - stop kthread tasks and estimation
        - default - the system's housekeeping CPUs for kthreads

        Example:

        - "all": all possible CPUs
        - "0-N": all possible CPUs, N denotes last CPU number
        - "0,1-N:1/2": first and all CPUs with odd number
        - "": empty list

est_nice - INTEGER
        default 0

        Valid range: -20 (more favorable) ..
        19 (less favorable)

        Niceness value to use for the estimation kthreads (scheduling
        priority)

expire_nodest_conn - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        With the default value of 0, the load balancer silently drops
        packets when their destination server is not available. This
        may be useful when a user-space monitoring program deletes the
        destination server (because of server overload or wrong
        detection) and adds the server back later, so that the
        connections to the server can continue.

        If this feature is enabled, the load balancer will expire the
        connection immediately when a packet arrives and its
        destination server is not available; the client program
        will then be notified that the connection is closed. This is
        equivalent to the feature some people require to flush
        connections when their destination is not available.

expire_quiescent_template - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        When set to a non-zero value, the load balancer will expire
        persistent templates when the destination server is quiescent.
        This may be useful when a user makes a destination server
        quiescent by setting its weight to 0 and it is desired that
        subsequent otherwise persistent connections are sent to a
        different destination server. By default new persistent
        connections are allowed to quiescent destination servers.

        If this feature is enabled, the load balancer will expire the
        persistence template if it is to be used to schedule a new
        connection and the destination server is quiescent.

ignore_tunneled - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        If set, ipvs will set the ipvs_property on all packets which are of
        unrecognized protocols. This prevents us from routing tunneled
        protocols like ipip, which is useful to prevent rescheduling
        packets that have been tunneled to the ipvs host (i.e.
        to prevent
        ipvs routing loops when ipvs is also acting as a real server).

nat_icmp_send - BOOLEAN
        - 0 - disabled (default)
        - not 0 - enabled

        It controls sending ICMP error messages (ICMP_DEST_UNREACH)
        for VS/NAT when the load balancer receives packets from real
        servers but the connection entries don't exist.

pmtu_disc - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        By default, reject with FRAG_NEEDED all DF packets that exceed
        the PMTU, irrespective of the forwarding method. For the TUN
        method the flag can be disabled to fragment such packets.

secure_tcp - INTEGER
        - 0 - disabled (default)

        The secure_tcp defense is to use a more complicated TCP state
        transition table. For VS/NAT, it also delays entering the
        TCP ESTABLISHED state until the three-way handshake is completed.

        The value definition is the same as that of drop_entry and
        drop_packet.

svc_lfactor - INTEGER
        Possible values: -8 (larger table) .. 8 (smaller table)

        Default: -3

        Controls the sizing of the service hash table based on the
        load factor (number of services per table bucket). The table
        will grow and shrink in the range of 2^4 .. 2^20.
        See conn_lfactor for explanation.

sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
        default 3 50

        It sets the synchronization threshold, which is the minimum
        number of incoming packets that a connection needs to receive
        before the connection will be synchronized. A connection will
        be synchronized every time the number of its incoming packets
        modulo sync_period equals the threshold. The range of the
        threshold is from 0 to sync_period.

        When sync_period and sync_refresh_period are 0, send sync only
        for state changes or only once when the packet count (pkts)
        matches sync_threshold.

sync_refresh_period - UNSIGNED INTEGER
        default 0

        In seconds, the difference in reported connection timer that
        triggers a new sync message. It can be used to avoid sync
        messages for the specified period (or half of the connection
        timeout if it is lower) if the connection state has not changed
        since the last sync.

        This is useful for normal connections with high traffic to reduce
        the sync rate. Additionally, retry sync_retries times with a
        period of sync_refresh_period/8.

sync_retries - INTEGER
        default 0

        Defines sync retries with a period of sync_refresh_period/8.
        Useful to protect against loss of sync messages. The range of
        sync_retries is from 0 to 3.

sync_qlen_max - UNSIGNED LONG
        Hard limit for queued sync messages that are not sent yet. It
        defaults to 1/32 of the memory pages but actually represents
        a number of messages. It will protect us from allocating large
        parts of memory when the sending rate is lower than the queuing
        rate.

sync_sock_size - INTEGER
        default 0

        Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
        Default value is 0 (preserve system defaults).

sync_ports - INTEGER
        default 1

        The number of threads that master and backup servers can use for
        sync traffic. Every thread will use a single UDP port; thread 0
        will use the default port 8848 while the last thread will use
        port 8848+sync_ports-1.

snat_reroute - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        If enabled, recalculate the route of SNATed packets from
        real servers so that they are routed as if they originate from
        the director. Otherwise they are routed as if they are forwarded
        by the director.

        If policy routing is in effect then it is possible that a packet
        originating from the director is routed differently from a
        packet being forwarded by the director.

        If policy routing is not in effect then the recalculated route will
        always be the same as the original route, so it is an optimisation
        to disable snat_reroute and avoid the recalculation.

sync_persist_mode - INTEGER
        default 0

        Controls the synchronisation of connections when using persistence.

        0: All types of connections are synchronised.

        1: Attempt to reduce the synchronisation traffic depending on
        the connection type. For persistent services avoid synchronisation
        for normal connections, do it only for persistence templates.
        In such a case, for TCP and SCTP it may be necessary to enable the
        sloppy_tcp and sloppy_sctp flags on backup servers. For
        non-persistent services this optimisation is not applied; mode 0
        is assumed.

sync_version - INTEGER
        default 1

        The version of the synchronisation protocol used when sending
        synchronisation messages.

        0 selects the original synchronisation protocol (version 0). This
        should be used when sending synchronisation messages to a legacy
        system that only understands the original synchronisation protocol.

        1 selects the current synchronisation protocol (version 1). This
        should be used where possible.

        Kernels with this sync_version entry are able to receive messages
        of both version 0 and version 1 of the synchronisation protocol.

run_estimation - BOOLEAN
        - 0 - disabled
        - not 0 - enabled (default)

        If disabled, the estimation will be suspended and kthread tasks
        stopped.

        You can always re-enable estimation by setting this value to 1.
        But be careful, the first estimation after re-enabling is not
        accurate.
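
The arithmetic described under conn_lfactor, drop_packet and sync_threshold
above can be sketched in a few lines of Python. This is only an illustration
of the formulas as documented here, not the kernel implementation. All
function names are hypothetical. As simplifying assumptions, the bucket count
ignores the 2^8 .. 2^conn_tab_bits clamp, the drop rate is computed as a plain
quotient, and the sync check ignores state changes and sync_refresh_period.

```python
from typing import Optional


def target_buckets(nodes: int, lfactor: int) -> int:
    """Bucket count selected by a load factor shift (hypothetical helper)."""
    if not -8 <= lfactor <= 8:
        raise ValueError("lfactor must be in the range -8 .. 8")
    # Negative values shift left (larger table), positive values shift right.
    return nodes << -lfactor if lfactor < 0 else nodes >> lfactor


def grow_load(lfactor: int) -> float:
    """Load (nodes / buckets) above which the table grows: 2^lfactor."""
    return 2.0 ** lfactor


def drop_rate(amemthresh: int, available_memory: int) -> Optional[float]:
    """Automatic-mode drop_packet rate; 1/rate of the packets are dropped."""
    if available_memory >= amemthresh:
        return None  # enough memory: the automatic defense is not active
    return amemthresh / (amemthresh - available_memory)


def should_sync(pkts: int, sync_threshold: int = 3, sync_period: int = 50) -> bool:
    """Packet-count rule for synchronizing a connection (assumed simplification)."""
    if sync_period > 0:
        return pkts % sync_period == sync_threshold
    # With sync_period 0, only the single exact match of the threshold counts.
    return pkts == sync_threshold


print(target_buckets(1000, -4), grow_load(-4))  # buckets = nodes * 16
print(target_buckets(1000, 2), grow_load(2))    # buckets = nodes / 4
print(drop_rate(1024, 512))                     # half of the threshold is free
print([n for n in range(1, 121) if should_sync(n)])
```

With the default sync_threshold of "3 50", the last line lists packets 3, 53
and 103 as the points at which a connection would be synchronized.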