.. SPDX-License-Identifier: GPL-2.0

=========
IP Sysctl
=========

/proc/sys/net/ipv4/* Variables
==============================

ip_forward - BOOLEAN
	Forward packets between interfaces.

	This variable is special: changing it resets all configuration
	parameters to their default state (RFC1122 for hosts, RFC1812
	for routers).

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)
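
The variables in this section are plain files below /proc/sys/net/ipv4/. A minimal Python sketch for reading one of them follows. The helper names are hypothetical, and the /proc/sys mount point is an assumption about the running system rather than something this document specifies.

```python
from pathlib import Path

# Assumption: the sysctl tree is mounted at /proc/sys, as on a typical Linux system.
PROC_SYS = Path("/proc/sys")


def sysctl_path(name, root=PROC_SYS):
    """Map a dotted name such as 'net.ipv4.ip_forward' to its file under root.

    Splitting on '.' is a simplification: it does not handle names whose
    components contain dots themselves (for example some interface names).
    """
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root=PROC_SYS):
    """Return the current value of a sysctl as a stripped string."""
    return sysctl_path(name, root).read_text().strip()


def forwarding_enabled(root=PROC_SYS):
    """True if ip_forward currently reads 1 (enabled)."""
    return read_sysctl("net.ipv4.ip_forward", root) == "1"
```

Calling ``forwarding_enabled()`` with no argument reads the live system value. The ``root`` parameter exists only so the helpers can be exercised against a scratch directory.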

ip_default_ttl - INTEGER
	Default value of TTL field (Time To Live) for outgoing (but not
	forwarded) IP packets. Should be between 1 and 255 inclusive.
	Default: 64 (as recommended by RFC1700)

ip_no_pmtu_disc - INTEGER
	Disable Path MTU Discovery. If enabled in mode 1 and a
	fragmentation-required ICMP is received, the PMTU to this
	destination will be set to the smallest of the old MTU to
	this destination and min_pmtu (see below). You will need
	to raise min_pmtu to the smallest interface MTU on your system
	manually if you want to avoid locally generated fragments.

	In mode 2 incoming Path MTU Discovery messages will be
	discarded. Outgoing frames are handled the same as in mode 1,
	implicitly setting IP_PMTUDISC_DONT on every created socket.

	Mode 3 is a hardened PMTU discovery mode. The kernel will only
	accept fragmentation-needed errors if the underlying protocol
	can verify them besides a plain socket lookup. Current
	protocols for which PMTU events will be honored are TCP and
	SCTP as they verify e.g. the sequence number or the
	association. This mode should not be enabled globally but is
	only intended to secure e.g. name servers in namespaces where
	TCP path MTU must still work but path MTU information of other
	protocols should be discarded. If enabled globally this mode
	could break other protocols.

	Possible values: 0-3

	Default: 0 (disabled)

min_pmtu - INTEGER
	Minimum Path MTU. Unless this is changed manually, each cached
	PMTU will never be lower than this setting.

	Default: 552

ip_forward_use_pmtu - BOOLEAN
	By default we don't trust protocol path MTUs while forwarding
	because they could be easily forged and can lead to unwanted
	fragmentation by the router.
	You only need to enable this if you have user-space software
	which tries to discover path MTUs by itself and depends on the
	kernel honoring this information. This is normally not the
	case.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

fwmark_reflect - BOOLEAN
	Controls the fwmark of kernel-generated IPv4 reply packets that are not
	associated with a socket (for example, TCP RSTs or ICMP echo replies).
	If disabled, these packets have a fwmark of zero. If enabled, they have the
	fwmark of the packet they are replying to.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

fib_multipath_use_neigh - BOOLEAN
	Use status of existing neighbor entry when determining nexthop for
	multipath routes. If disabled, neighbor information is not used and
	packets could be directed to a failed nexthop. Only valid for kernels
	built with CONFIG_IP_ROUTE_MULTIPATH enabled.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

fib_multipath_hash_policy - INTEGER
	Controls which hash policy to use for multipath routes. Only valid
	for kernels built with CONFIG_IP_ROUTE_MULTIPATH enabled.

	Default: 0 (Layer 3)

	Possible values:

	- 0 - Layer 3
	- 1 - Layer 4
	- 2 - Layer 3 or inner Layer 3 if present
	- 3 - Custom multipath hash. Fields used for multipath hash calculation
	  are determined by fib_multipath_hash_fields sysctl

fib_multipath_hash_fields - UNSIGNED INTEGER
	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
	fields used for multipath hash calculation are determined by this
	sysctl.

	This value is a bitmask which enables various fields for multipath hash
	calculation.

	Possible fields are:

	====== ============================
	0x0001 Source IP address
	0x0002 Destination IP address
	0x0004 IP protocol
	0x0008 Unused (Flow Label)
	0x0010 Source port
	0x0020 Destination port
	0x0040 Inner source IP address
	0x0080 Inner destination IP address
	0x0100 Inner IP protocol
	0x0200 Inner Flow Label
	0x0400 Inner source port
	0x0800 Inner destination port
	====== ============================

	Default: 0x0007 (source IP, destination IP and IP protocol)
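
As an illustration of how the bitmask is composed, the sketch below ORs fields from the table above into a value suitable for fib_multipath_hash_fields. The ``HashField`` class and the ``to_sysctl_value`` helper are hypothetical names used only for this example; the bit values are copied from the table.

```python
from enum import IntFlag


class HashField(IntFlag):
    """Bit values taken from the fib_multipath_hash_fields table."""
    SRC_IP = 0x0001
    DST_IP = 0x0002
    IP_PROTO = 0x0004
    FLOW_LABEL = 0x0008          # listed as unused for IPv4
    SRC_PORT = 0x0010
    DST_PORT = 0x0020
    INNER_SRC_IP = 0x0040
    INNER_DST_IP = 0x0080
    INNER_IP_PROTO = 0x0100
    INNER_FLOW_LABEL = 0x0200
    INNER_SRC_PORT = 0x0400
    INNER_DST_PORT = 0x0800


# The documented default: source IP, destination IP and IP protocol.
DEFAULT_FIELDS = HashField.SRC_IP | HashField.DST_IP | HashField.IP_PROTO


def to_sysctl_value(fields):
    """Format a field combination as the hexadecimal string to write."""
    return f"0x{int(fields):04x}"


# Example: extend the default with both outer ports (a 5-tuple hash).
five_tuple = DEFAULT_FIELDS | HashField.SRC_PORT | HashField.DST_PORT
```

The resulting value only takes effect when fib_multipath_hash_policy is set to 3.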

fib_multipath_hash_seed - UNSIGNED INTEGER
	The seed value used when calculating hash for multipath routes. Applies
	to both IPv4 and IPv6 datapath. Only present for kernels built with
	CONFIG_IP_ROUTE_MULTIPATH enabled.

	When set to 0, the seed value used for multipath routing defaults to an
	internally generated random one.

	The actual hashing algorithm is not specified -- there is no guarantee
	that the next hop distribution produced by a given seed will remain
	stable across kernel versions.

	Default: 0 (random)

fib_sync_mem - UNSIGNED INTEGER
	Amount of dirty memory from fib entries that can be backlogged before
	synchronize_rcu is forced.

	Default: 512kB   Minimum: 64kB   Maximum: 64MB

ip_forward_update_priority - INTEGER
	Whether to update SKB priority from "TOS" field in IPv4 header after it
	is forwarded. The new SKB priority is mapped from TOS field value
	according to an rt_tos2priority table (see e.g. man tc-prio).

	Default: 1 (Update priority.)

	Possible values:

	- 0 - Do not update priority.
	- 1 - Update priority.

route/max_size - INTEGER
	Maximum number of routes allowed in the kernel.  Increase
	this when using large numbers of interfaces and/or routes.

	From Linux kernel 3.6 onwards, this is deprecated for IPv4
	as route cache is no longer used.

	From Linux kernel 6.3 onwards, this is deprecated for IPv6
	as garbage collection manages cached route entries.

neigh/default/gc_thresh1 - INTEGER
	Minimum number of entries to keep.  Garbage collector will not
	purge entries if there are fewer than this number.

	Default: 128

neigh/default/gc_thresh2 - INTEGER
	Threshold when garbage collector becomes more aggressive about
	purging entries. Entries older than 5 seconds will be cleared
	when over this number.

	Default: 512

neigh/default/gc_thresh3 - INTEGER
	Maximum number of non-PERMANENT neighbor entries allowed.  Increase
	this when using large numbers of interfaces and when communicating
	with large numbers of directly-connected peers.

	Default: 1024

neigh/default/unres_qlen_bytes - INTEGER
	The maximum number of bytes which may be used by packets
	queued for each unresolved address by other network layers.
	(added in Linux 3.3)

	Setting a negative value is meaningless and will return an error.

	Default: SK_WMEM_DEFAULT, (same as net.core.wmem_default).

		Exact value depends on architecture and kernel options,
		but should be enough to allow queuing 256 packets
		of medium size.

neigh/default/unres_qlen - INTEGER
	The maximum number of packets which may be queued for each
	unresolved address by other network layers.

	(deprecated in Linux 3.3) : use unres_qlen_bytes instead.

	Prior to Linux 3.3, the default value is 3 which may cause
	unexpected packet loss. The current default value is calculated
	according to default value of unres_qlen_bytes and true size of
	packet.

	Default: 101

neigh/default/interval_probe_time_ms - INTEGER
	The probe interval for neighbor entries with the NTF_MANAGED flag.
	The minimum value is 1.

	Default: 5000

mtu_expires - INTEGER
	Time, in seconds, that cached PMTU information is kept.

min_adv_mss - INTEGER
	The advertised MSS depends on the first hop route MTU, but will
	never be lower than this setting.

fib_notify_on_flag_change - INTEGER
        Whether to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/
        RTM_F_TRAP/RTM_F_OFFLOAD_FAILED flags are changed.

        After installing a route to the kernel, user space receives an
        acknowledgment, which means the route was installed in the kernel,
        but not necessarily in hardware.
        It is also possible for a route already installed in hardware to change
        its action and therefore its flags. For example, a host route that is
        trapping packets can be "promoted" to perform decapsulation following
        the installation of an IPinIP/VXLAN tunnel.
        The notifications will indicate to user-space the state of the route.

        Default: 0 (Do not emit notifications.)

        Possible values:

        - 0 - Do not emit notifications.
        - 1 - Emit notifications.
        - 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.

IP Fragmentation:

ipfrag_high_thresh - LONG INTEGER
	Maximum memory used to reassemble IP fragments.

ipfrag_low_thresh - LONG INTEGER
	(Obsolete since linux-4.17)
	Maximum memory used to reassemble IP fragments before the kernel
	begins to remove incomplete fragment queues to free up resources.
	The kernel still accepts new fragments for defragmentation.

ipfrag_time - INTEGER
	Time in seconds to keep an IP fragment in memory.

ipfrag_max_dist - INTEGER
	ipfrag_max_dist is a non-negative integer value which defines the
	maximum "disorder" which is allowed among fragments which share a
	common IP source address. Note that reordering of packets is
	not unusual, but if a large number of fragments arrive from a source
	IP address while a particular fragment queue remains incomplete, it
	probably indicates that one or more fragments belonging to that queue
	have been lost. When ipfrag_max_dist is positive, an additional check
	is done on fragments before they are added to a reassembly queue - if
	ipfrag_max_dist (or more) fragments have arrived from a particular IP
	address between additions to any IP fragment queue using that source
	address, it's presumed that one or more fragments in the queue are
	lost. The existing fragment queue will be dropped, and a new one
	started. An ipfrag_max_dist value of zero disables this check.

	Using a very small value, e.g. 1 or 2, for ipfrag_max_dist can
	result in unnecessarily dropping fragment queues when normal
	reordering of packets occurs, which could lead to poor application
	performance. Using a very large value, e.g. 50000, increases the
	likelihood of incorrectly reassembling IP fragments that originate
	from different IP datagrams, which could result in data corruption.
	Default: 64

bc_forwarding - INTEGER
	bc_forwarding enables the feature described in rfc1812#section-5.3.5.2
	and rfc2644. It allows the router to forward directed broadcast.
	To enable this feature, the 'all' entry and the input interface entry
	should be set to 1.
	Default: 0

INET peer storage
=================

inet_peer_threshold - INTEGER
	The approximate size of the storage.  Starting from this threshold
	entries will be discarded aggressively.  This threshold also determines
	entries' time-to-live and time intervals between garbage collection
	passes.  The more entries, the shorter the time-to-live and the
	shorter the GC interval.

inet_peer_minttl - INTEGER
	Minimum time-to-live of entries.  Should be enough to cover fragment
	time-to-live on the reassembling side.  This minimum time-to-live is
	guaranteed if the pool size is less than inet_peer_threshold.
	Measured in seconds.

inet_peer_maxttl - INTEGER
	Maximum time-to-live of entries.  Unused entries will expire after
	this period of time if there is no memory pressure on the pool (i.e.
	when the number of entries in the pool is very small).
	Measured in seconds.

TCP variables
=============

somaxconn - INTEGER
	Limit of socket listen() backlog, known in userspace as SOMAXCONN.
	Defaults to 4096. (Was 128 before linux-5.4)
	See also tcp_max_syn_backlog for additional tuning for TCP sockets.

tcp_abort_on_overflow - BOOLEAN
	If a listening service is too slow to accept new connections,
	reset them. The default state is FALSE, which means that if the
	overflow occurred due to a burst, the connection will recover.
	Enable this option _only_ if you are really sure that the listening
	daemon cannot be tuned to accept connections faster. Enabling this
	option can harm clients of your server.

tcp_adv_win_scale - INTEGER
	Obsolete since linux-6.6.
	Count buffering overhead as bytes/2^tcp_adv_win_scale
	(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
	if it is <= 0.

	Possible values are [-31, 31], inclusive.

	Default: 1
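
The two branches of the tcp_adv_win_scale formula can be written out as a small Python sketch. The function name is hypothetical, and rounding the division down with a bit shift is an assumption of this example, not something stated above.

```python
def buffering_overhead(nbytes, tcp_adv_win_scale=1):
    """Bytes of a buffer counted as overhead, per the formula above.

    Assumption: the division by a power of two rounds down (bit shift).
    """
    if not -31 <= tcp_adv_win_scale <= 31:
        raise ValueError("tcp_adv_win_scale must be in [-31, 31]")
    if tcp_adv_win_scale > 0:
        # bytes / 2^tcp_adv_win_scale
        return nbytes >> tcp_adv_win_scale
    # bytes - bytes / 2^(-tcp_adv_win_scale)
    return nbytes - (nbytes >> -tcp_adv_win_scale)
```

With the default of 1, half of the buffer is counted as overhead. A value of -2 counts three quarters of it.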

tcp_allowed_congestion_control - STRING
	Show/set the congestion control choices available to non-privileged
	processes. The list is a subset of those listed in
	tcp_available_congestion_control.

	Default is "reno" and the default setting (tcp_congestion_control).

tcp_app_win - INTEGER
	Reserve max(window/2^tcp_app_win, mss) of window for application
	buffer. Value 0 is special: it means that nothing is reserved.

	Possible values are [0, 31], inclusive.

	Default: 31
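
A worked form of the tcp_app_win reservation rule, as a Python sketch. The function name and the sample window and MSS values are illustrative assumptions; only the max() expression and the special case for 0 come from the description above.

```python
def reserved_for_application(window, mss, tcp_app_win=31):
    """Bytes of the window reserved for the application buffer."""
    if not 0 <= tcp_app_win <= 31:
        raise ValueError("tcp_app_win must be in [0, 31]")
    if tcp_app_win == 0:
        # Special case: nothing is reserved.
        return 0
    # max(window / 2^tcp_app_win, mss), using integer division
    return max(window >> tcp_app_win, mss)
```

With the default of 31 the shifted term is zero for any realistic window, so the reservation is one MSS.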

tcp_autocorking - BOOLEAN
	Enable TCP auto corking:
	When applications do consecutive small write()/sendmsg() system calls,
	we try to coalesce these small writes as much as possible, to lower
	total amount of sent packets. This is done if at least one prior
	packet for the flow is waiting in Qdisc queues or device transmit
	queue. Applications can still use TCP_CORK for optimal behavior
	when they know how/when to uncork their sockets.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 1 (enabled)

tcp_available_congestion_control - STRING
	Shows the available congestion control choices that are registered.
	More congestion control algorithms may be available as modules,
	but not loaded.

tcp_base_mss - INTEGER
	The initial value of search_low to be used by the packetization layer
	Path MTU discovery (MTU probing).  If MTU probing is enabled,
	this is the initial MSS used by the connection.

tcp_mtu_probe_floor - INTEGER
	If MTU probing is enabled this caps the minimum MSS used for search_low
	for the connection.

	Default : 48

tcp_min_snd_mss - INTEGER
	TCP SYN and SYNACK messages usually advertise an ADVMSS option,
	as described in RFC 1122 and RFC 6691.

	If this ADVMSS option is smaller than tcp_min_snd_mss,
	it is silently capped to tcp_min_snd_mss.

	Default : 48 (at least 8 bytes of payload per segment)

tcp_congestion_control - STRING
	Set the congestion control algorithm to be used for new
	connections. The algorithm "reno" is always available, but
	additional choices may be available based on kernel configuration.
	Default is set as part of kernel configuration.
	For passive connections, the listener congestion control choice
	is inherited.

	[see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]

tcp_dsack - BOOLEAN
	Allows TCP to send "duplicate" SACKs.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 1 (enabled)

tcp_early_retrans - INTEGER
	Tail loss probe (TLP) converts RTOs occurring due to tail
	losses into fast recovery (RFC8985). Note that
	TLP requires RACK to function properly (see tcp_recovery below).

	Possible values:

		- 0 disables TLP
		- 3 or 4 enables TLP

	Default: 3

tcp_ecn - INTEGER
	Control use of Explicit Congestion Notification (ECN) by TCP.
	ECN is used only when both ends of the TCP connection indicate support
	for it. This feature is useful in avoiding losses due to congestion by
	allowing supporting routers to signal congestion before having to drop
	packets. A host that supports ECN both sends ECN at the IP layer and
	feeds back ECN at the TCP layer. The highest variant of ECN feedback
	that both peers support is chosen by the ECN negotiation (Accurate ECN,
	ECN, or no ECN).

	The highest negotiated variant for incoming connection requests
	and the highest variant requested by outgoing connection
	attempts:

	===== ==================== ====================
	Value Incoming connections Outgoing connections
	===== ==================== ====================
	0     No ECN               No ECN
	1     ECN                  ECN
	2     ECN                  No ECN
	3     AccECN               AccECN
	4     AccECN               ECN
	5     AccECN               No ECN
	===== ==================== ====================

	Default: 2

tcp_ecn_option - INTEGER
	Control Accurate ECN (AccECN) option sending when AccECN has been
	successfully negotiated during handshake. Send logic inhibits
	sending AccECN options regardless of this setting when no AccECN
	option has been seen for the reverse direction.

	Possible values are:

	= ============================================================
	0 Never send AccECN option. This also disables sending AccECN
	  option in SYN/ACK during handshake.
	1 Send AccECN option sparingly according to the minimum option
	  rules outlined in draft-ietf-tcpm-accurate-ecn.
	2 Send AccECN option on every packet whenever it fits into TCP
	  option space.
	= ============================================================

	Default: 2

tcp_ecn_option_beacon - INTEGER
	Control the Accurate ECN (AccECN) option sending frequency per RTT.
	It takes effect only when tcp_ecn_option is set to 2.

	Default: 3 (AccECN will be sent at least 3 times per RTT)

tcp_ecn_fallback - BOOLEAN
	If the kernel detects that an ECN connection misbehaves, enable
	fallback to non-ECN. Currently, this knob implements the fallback
	from RFC3168, section 6.1.1.1., but additional detection mechanisms
	may be implemented under this knob in the future. The value is not
	used if tcp_ecn or per-route (or congestion control) ECN settings
	are disabled.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 1 (enabled)

tcp_fack - BOOLEAN
	This is a legacy option; it has no effect anymore.

tcp_fin_timeout - INTEGER
	The length of time an orphaned (no longer referenced by any
	application) connection will remain in the FIN_WAIT_2 state
	before it is aborted at the local end.  While a perfectly
	valid "receive only" state for an un-orphaned connection, an
	orphaned connection in FIN_WAIT_2 state could otherwise wait
	forever for the remote to close its end of the connection.

	Cf. tcp_max_orphans

	Default: 60 seconds

tcp_frto - INTEGER
	Enables Forward RTO-Recovery (F-RTO) defined in RFC5682.
	F-RTO is an enhanced recovery algorithm for TCP retransmission
	timeouts.  It is particularly beneficial in networks where the
	RTT fluctuates (e.g., wireless). F-RTO is a sender-side only
	modification. It does not require any support from the peer.

	By default it's enabled with a non-zero value. 0 disables F-RTO.

tcp_fwmark_accept - BOOLEAN
	If enabled, incoming connections to listening sockets that do not have a
	socket mark will set the mark of the accepting socket to the fwmark of
	the incoming SYN packet. This will cause all packets on that connection
	(starting from the first SYNACK) to be sent with that fwmark. The
	listening socket's mark is unchanged. Listening sockets that already
	have a fwmark set via setsockopt(SOL_SOCKET, SO_MARK, ...) are
	unaffected.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

tcp_invalid_ratelimit - INTEGER
	Limit the maximal rate for sending duplicate acknowledgments
	in response to incoming TCP packets that are for an existing
	connection but that are invalid due to any of these reasons:

	  (a) out-of-window sequence number,
	  (b) out-of-window acknowledgment number, or
	  (c) PAWS (Protection Against Wrapped Sequence numbers) check failure

	This can help mitigate simple "ack loop" DoS attacks, wherein
	a buggy or malicious middlebox or man-in-the-middle can
	rewrite TCP header fields in a manner that causes each endpoint
	to think that the other is sending invalid TCP segments, thus
	causing each side to send an unending stream of duplicate
	acknowledgments for invalid segments.

	Using 0 disables rate-limiting of dupacks in response to
	invalid segments; otherwise this value specifies the minimal
	space between sending such dupacks, in milliseconds.

	Default: 500 (milliseconds).

tcp_keepalive_time - INTEGER
	How often TCP sends out keepalive messages when keepalive is enabled.
	Default: 2 hours.

tcp_keepalive_probes - INTEGER
	How many keepalive probes TCP sends out, until it decides that the
	connection is broken. Default value: 9.

tcp_keepalive_intvl - INTEGER
	How frequently the probes are sent out. Multiplied by
	tcp_keepalive_probes, it gives the time after which an unresponsive
	connection is killed once probing has started. Default value:
	75 seconds, i.e. the connection will be aborted after ~11 minutes
	of retries.
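
The "~11 minutes" figure follows from the three keepalive defaults. The Python sketch below works through the arithmetic. The function name and dictionary keys are hypothetical, and it assumes the peer never answers any probe.

```python
def keepalive_timeline(time_s=7200, probes=9, intvl_s=75):
    """Return when probing starts and when the connection is given up.

    Defaults mirror tcp_keepalive_time (2 hours), tcp_keepalive_probes (9)
    and tcp_keepalive_intvl (75 seconds). Assumes every probe goes
    unanswered.
    """
    probing_s = probes * intvl_s
    return {
        "first_probe_after_s": time_s,
        "probing_s": probing_s,
        "aborted_after_idle_s": time_s + probing_s,
    }
```

With the defaults, probing lasts 9 * 75 = 675 seconds (11.25 minutes), so an idle connection to a dead peer is aborted roughly 2 hours and 11 minutes after the last traffic.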

tcp_l3mdev_accept - BOOLEAN
	Enables child sockets to inherit the L3 master device index.
	Enabling this option allows a "global" listen socket to work
	across L3 master domains (e.g., VRFs) with connected sockets
	derived from the listen socket to be bound to the L3 domain in
	which the packets originated. Only valid when the kernel was
	compiled with CONFIG_NET_L3_MASTER_DEV.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

tcp_low_latency - BOOLEAN
	This is a legacy option; it has no effect anymore.

tcp_max_orphans - INTEGER
	Maximal number of TCP sockets not attached to any user file handle,
	held by system. If this number is exceeded, orphaned connections are
	reset immediately and a warning is printed. This limit exists
	only to prevent simple DoS attacks. You _must_ not rely on this
	or lower the limit artificially, but rather increase it
	(probably, after increasing installed memory)
	if network conditions require more than the default value,
	and tune network services to linger and kill such states
	more aggressively. As a reminder: each orphan eats
	up to ~64K of unswappable memory.

tcp_max_syn_backlog - INTEGER
	Maximal number of remembered connection requests (SYN_RECV),
	which have not received an acknowledgment from the connecting client.

	This is a per-listener limit.

	The minimal value is 128 for low memory machines, and it will
	increase in proportion to the memory of the machine.

	If the server suffers from overload, try increasing this number.

	Remember to also check /proc/sys/net/core/somaxconn.
	A SYN_RECV request socket consumes about 304 bytes of memory.

tcp_max_tw_buckets - INTEGER
	Maximal number of timewait sockets held by system simultaneously.
	If this number is exceeded, the time-wait socket is immediately
	destroyed and a warning is printed. This limit exists only to prevent
	simple DoS attacks. You _must_ not lower the limit artificially,
	but rather increase it (probably, after increasing installed memory)
	if network conditions require more than the default value.

tcp_mem - vector of 3 INTEGERs: min, pressure, max
	min: below this number of pages TCP is not bothered about its
	memory appetite.

	pressure: when amount of memory allocated by TCP exceeds this number
	of pages, TCP moderates its memory consumption and enters memory
	pressure mode, which is exited when memory consumption falls
	under "min".

	max: number of pages allowed for queueing by all TCP sockets.

	Defaults are calculated at boot time from amount of available
	memory.

tcp_min_rtt_wlen - INTEGER
	The window length of the windowed min filter to track the minimum RTT.
	A shorter window lets a flow more quickly pick up new (higher)
	minimum RTT when it is moved to a longer path (e.g., due to traffic
	engineering). A longer window makes the filter more resistant to RTT
	inflations such as transient congestion. The unit is seconds.

	Possible values: 0 - 86400 (1 day)

	Default: 300

tcp_moderate_rcvbuf - BOOLEAN
	If enabled, TCP performs receive buffer auto-tuning, attempting to
	automatically size the buffer (no greater than tcp_rmem[2]) to
	match the size required by the path for full throughput.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 1 (enabled)

tcp_mtu_probing - INTEGER
	Controls TCP Packetization-Layer Path MTU Discovery.  Takes three
	values:

	- 0 - Disabled
	- 1 - Disabled by default, enabled when an ICMP black hole is detected
	- 2 - Always enabled, use initial MSS of tcp_base_mss.

tcp_probe_interval - UNSIGNED INTEGER
	Controls how often to start TCP Packetization-Layer Path MTU
	Discovery reprobe. The default is reprobing every 10 minutes as
	per RFC4821.

tcp_probe_threshold - INTEGER
	Controls when TCP Packetization-Layer Path MTU Discovery probing
	will stop with respect to the width of search range in bytes. Default
	is 8 bytes.

tcp_no_metrics_save - BOOLEAN
	By default, TCP saves various connection metrics in the route cache
	when the connection closes, so that connections established in the
	near future can use these to set initial conditions.  Usually, this
	increases overall performance, but may sometimes cause performance
	degradation.  If enabled, TCP will not cache metrics on closing
	connections.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

tcp_no_ssthresh_metrics_save - BOOLEAN
	Controls whether TCP saves ssthresh metrics in the route cache.
	If enabled, ssthresh metrics are not saved.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 1 (enabled)

tcp_orphan_retries - INTEGER
	This value influences the timeout of a locally closed TCP connection,
	when RTO retransmissions remain unacknowledged.
	See tcp_retries2 for more details.

	The default value is 8.

	If your machine is a loaded WEB server,
	you should think about lowering this value, as such sockets
	may consume significant resources. Cf. tcp_max_orphans.

tcp_recovery - INTEGER
	This value is a bitmap to enable various experimental loss recovery
	features.

	=========   =============================================================
	RACK: 0x1   enables RACK loss detection, for fast detection of lost
		    retransmissions and tail drops, and resilience to
		    reordering. Currently, setting this bit to 0 has no
		    effect, since RACK is the only supported loss detection
		    algorithm.

	RACK: 0x2   makes RACK's reordering window static (min_rtt/4).

	RACK: 0x4   disables RACK's DUPACK threshold heuristic
	=========   =============================================================

	Default: 0x1
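
To show how the tcp_recovery bitmap is read, the Python sketch below decodes a value into the features listed in the table. The constant and function names are hypothetical labels chosen for this example; only the bit values and their meanings come from the table.

```python
# Hypothetical labels for the bits documented in the table above.
RACK_LOSS_DETECTION = 0x1
RACK_STATIC_REORDER_WINDOW = 0x2
RACK_NO_DUPACK_THRESHOLD = 0x4

_DESCRIPTIONS = [
    (RACK_LOSS_DETECTION, "RACK loss detection"),
    (RACK_STATIC_REORDER_WINDOW, "static reordering window (min_rtt/4)"),
    (RACK_NO_DUPACK_THRESHOLD, "DUPACK threshold heuristic disabled"),
]


def describe_tcp_recovery(value):
    """Return the descriptions of all bits set in a tcp_recovery value."""
    return [text for bit, text in _DESCRIPTIONS if value & bit]
```

For example, a value of 0x5 combines RACK loss detection with the disabled DUPACK threshold heuristic.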

tcp_reflect_tos - BOOLEAN
	For listening sockets, reuse the DSCP value of the initial SYN message
	for outgoing packets. This allows both directions of a TCP
	stream to use the same DSCP value, assuming DSCP remains unchanged for
	the lifetime of the connection.

	This option affects both IPv4 and IPv6.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 0 (disabled)

tcp_reordering - INTEGER
	Initial reordering level of packets in a TCP stream.
	TCP stack can then dynamically adjust flow reordering level
	between this initial value and tcp_max_reordering.

	Default: 3

tcp_max_reordering - INTEGER
	Maximal reordering level of packets in a TCP stream.
	300 is a fairly conservative value, but you might increase it
	if paths are using per packet load balancing (like bonding rr mode).

	Default: 300

tcp_retrans_collapse - BOOLEAN
	Bug-to-bug compatibility with some broken printers.
	On retransmit try to send bigger packets to work around bugs in
	certain TCP stacks.

	Possible values:

	- 0 (disabled)
	- 1 (enabled)

	Default: 1 (enabled)

tcp_retries1 - INTEGER
	This value influences the time after which TCP decides that
	something is wrong due to unacknowledged RTO retransmissions,
	and reports this suspicion to the network layer.
	See tcp_retries2 for more details.

	RFC 1122 recommends at least 3 retransmissions, which is the
	default.

tcp_retries2 - INTEGER
	This value influences the timeout of an alive TCP connection,
	when RTO retransmissions remain unacknowledged.
	Given a value of N, a hypothetical TCP connection following
	exponential backoff with an initial RTO of TCP_RTO_MIN would
	retransmit N times before killing the connection at the (N+1)th RTO.

	The default value of 15 yields a hypothetical timeout of 924.6
	seconds and is a lower bound for the effective timeout.
	TCP will effectively time out at the first RTO which exceeds the
	hypothetical timeout.
	If tcp_rto_max_ms is decreased, it is recommended to also
	change tcp_retries2.

	RFC 1122 recommends at least 100 seconds for the timeout,
	which corresponds to a value of at least 8.
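
The 924.6 second figure can be reproduced with a short Python sketch of the hypothetical backoff described above. Two values are assumptions not stated in this section: TCP_RTO_MIN is taken as 200 ms, and the RTO is capped at 120 seconds (the cap that tcp_rto_max_ms adjusts on kernels providing it). The function name is hypothetical.

```python
TCP_RTO_MIN_MS = 200       # assumption: common value of TCP_RTO_MIN
TCP_RTO_MAX_MS = 120_000   # assumption: default upper bound on the RTO


def hypothetical_timeout_ms(retries, rto_min_ms=TCP_RTO_MIN_MS,
                            rto_max_ms=TCP_RTO_MAX_MS):
    """Sum the first (retries + 1) RTOs of a capped exponential backoff.

    The connection retransmits `retries` times and is killed when the
    (retries + 1)th RTO expires, so that many intervals are summed.
    """
    total = 0
    rto = rto_min_ms
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max_ms)
    return total
```

Under these assumptions a value of 15 gives 924600 ms, and 8 is the smallest value whose hypothetical timeout (102.2 seconds) reaches the 100 seconds recommended by RFC 1122.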
815
816tcp_rfc1337 - BOOLEAN
817	If enabled, the TCP stack behaves conforming to RFC1337. If unset,
818	we are not conforming to RFC, but prevent TCP TIME_WAIT
819	assassination.
820
821	Possible values:
822
823	- 0 (disabled)
824	- 1 (enabled)
825
826	Default: 0 (disabled)
827
828tcp_rmem - vector of 3 INTEGERs: min, default, max
829	min: Minimal size of receive buffer used by TCP sockets.
830	It is guaranteed to each TCP socket, even under moderate memory
831	pressure.
832
833	Default: 4K
834
835	default: initial size of receive buffer used by TCP sockets.
836	This value overrides net.core.rmem_default used by other protocols.
837	Default: 131072 bytes.
838	This value results in initial window of 65535.
839
840	max: maximal size of receive buffer allowed for automatically
841	selected receiver buffers for TCP socket.
842	Calling setsockopt() with SO_RCVBUF disables
843	automatic tuning of that socket's receive buffer size, in which
844	case this value is ignored.
845	Default: between 131072 and 32MB, depending on RAM size.
846
847tcp_sack - BOOLEAN
848	Enable select acknowledgments (SACKS).
849
850	Possible values:
851
852	- 0 (disabled)
853	- 1 (enabled)
854
855	Default: 1 (enabled)
856
857tcp_comp_sack_rtt_percent - INTEGER
858	Percentage of SRTT used for the compressed SACK feature.
859	See tcp_comp_sack_nr, tcp_comp_sack_delay_ns, tcp_comp_sack_slack_ns.
860
861	Possible values : 1 - 1000
862
863	Default : 33 %
864
865tcp_comp_sack_delay_ns - LONG INTEGER
866	TCP tries to reduce number of SACK sent, using a timer based
867	on tcp_comp_sack_rtt_percent of SRTT, capped by this sysctl
868	in nano seconds.
869	The default is 1ms, based on TSO autosizing period.
870
871	Default : 1,000,000 ns (1 ms)
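
The interaction between tcp_comp_sack_rtt_percent and
tcp_comp_sack_delay_ns can be sketched as follows. This is a minimal
model of "a percentage of SRTT, capped by the sysctl", not kernel code;
the function name and its arguments are hypothetical.

```python
def comp_sack_delay_ns(srtt_ns, rtt_percent=33, delay_cap_ns=1_000_000):
    """Hypothetical model of the SACK compression timer length.

    srtt_ns      -- smoothed RTT of the flow, in nanoseconds
    rtt_percent  -- value of tcp_comp_sack_rtt_percent (default 33)
    delay_cap_ns -- value of tcp_comp_sack_delay_ns (default 1 ms)
    """
    # Take the configured percentage of SRTT ...
    from_rtt = srtt_ns * rtt_percent // 100
    # ... and never wait longer than the configured cap.
    return min(from_rtt, delay_cap_ns)
```

Under this model a flow with a 100 us SRTT would wait about 33 us, while
a flow with a 10 ms SRTT would be capped at 1 ms.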
872
873tcp_comp_sack_slack_ns - LONG INTEGER
	This sysctl controls the slack used when arming the
875	timer used by SACK compression. This gives extra time
876	for small RTT flows, and reduces system overhead by allowing
877	opportunistic reduction of timer interrupts.
878
879	Default : 100,000 ns (100 us)
880
881tcp_comp_sack_nr - INTEGER
	Maximum number of SACKs that can be compressed.
883	Using 0 disables SACK compression.
884
885	Default : 44
886
887tcp_backlog_ack_defer - BOOLEAN
888	If enabled, user thread processing socket backlog tries sending
889	one ACK for the whole queue. This helps to avoid potential
	long latencies at the end of a TCP socket syscall.
891
892	Possible values:
893
894	- 0 (disabled)
895	- 1 (enabled)
896
897	Default: 1 (enabled)
898
899tcp_slow_start_after_idle - BOOLEAN
900	If enabled, provide RFC2861 behavior and time out the congestion
	window after an idle period.  An idle period is defined as
902	the current RTO.  If unset, the congestion window will not
903	be timed out after an idle period.
904
905	Possible values:
906
907	- 0 (disabled)
908	- 1 (enabled)
909
910	Default: 1 (enabled)
911
912tcp_stdurg - BOOLEAN
913	Use the Host requirements interpretation of the TCP urgent pointer field.
914	Most hosts use the older BSD interpretation, so if enabled,
915	Linux might not communicate correctly with them.
916
917	Possible values:
918
919	- 0 (disabled)
920	- 1 (enabled)
921
922	Default: 0 (disabled)
923
924tcp_synack_retries - INTEGER
	Number of times SYNACKs for a passive TCP connection attempt will
	be retransmitted. Should not be higher than 255. Default value
	is 5, which corresponds to 31 seconds until the last retransmission
	with the current initial RTO of 1 second. With this the final timeout
	for a passive TCP connection will happen after 63 seconds.
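
The 31 and 63 second figures follow from plain exponential backoff. The
sketch below reproduces that arithmetic; it assumes the RTO simply
doubles after each retransmission and ignores any upper RTO limit. The
function name is hypothetical.

```python
def synack_schedule(retries=5, initial_rto=1):
    """Return (retransmission times, final timeout) in seconds.

    Models plain exponential backoff: the RTO doubles after every
    retransmission, starting from initial_rto.
    """
    times = []
    now = 0
    rto = initial_rto
    for _ in range(retries):
        now += rto          # timer fires, SYNACK is retransmitted
        times.append(now)
        rto *= 2            # exponential backoff
    final_timeout = now + rto  # last timer expiry, attempt is dropped
    return times, final_timeout
```

With the default of 5 retries the retransmissions land at 1, 3, 7, 15
and 31 seconds, and the attempt is given up at 63 seconds.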
930
931tcp_syncookies - INTEGER
	Only valid when the kernel was compiled with CONFIG_SYN_COOKIES.
	Send out syncookies when the SYN backlog queue of a socket
	overflows. This protects against the common 'SYN flood attack'.

	Default: 1

	Note that syncookies are a fallback facility.
	They MUST NOT be used to help highly loaded servers withstand
	a legitimate connection rate. If you see SYN flood warnings
	in your logs, but investigation shows that they occur
	because of overload with legitimate connections, you should tune
	other parameters until the warnings disappear.
	See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

	Syncookies seriously violate the TCP protocol, do not allow
	the use of TCP extensions, and can result in serious degradation
	of some services (e.g. SMTP relaying), visible not to you
	but to the clients and relays contacting you. If you see
	SYN flood warnings in your logs while not really being flooded,
	your server is seriously misconfigured.

	If you want to test what effect syncookies have on your
	network connections, you can set this knob to 2 to enable
	unconditional generation of syncookies.
955
956tcp_migrate_req - BOOLEAN
957	The incoming connection is tied to a specific listening socket when
958	the initial SYN packet is received during the three-way handshake.
959	When a listener is closed, in-flight request sockets during the
960	handshake and established sockets in the accept queue are aborted.
961
962	If the listener has SO_REUSEPORT enabled, other listeners on the
963	same port should have been able to accept such connections. This
964	option makes it possible to migrate such child sockets to another
965	listener after close() or shutdown().
966
967	The BPF_SK_REUSEPORT_SELECT_OR_MIGRATE type of eBPF program should
968	usually be used to define the policy to pick an alive listener.
969	Otherwise, the kernel will randomly pick an alive listener only if
970	this option is enabled.
971
972	Note that migration between listeners with different settings may
973	crash applications. Let's say migration happens from listener A to
974	B, and only B has TCP_SAVE_SYN enabled. B cannot read SYN data from
975	the requests migrated from A. To avoid such a situation, cancel
	migration by returning SK_DROP from that type of eBPF program, or
977	disable this option.
978
979	Possible values:
980
981	- 0 (disabled)
982	- 1 (enabled)
983
984	Default: 0 (disabled)
985
986tcp_fastopen - INTEGER
987	Enable TCP Fast Open (RFC7413) to send and accept data in the opening
988	SYN packet.
989
990	The client support is enabled by flag 0x1 (on by default). The client
991	then must use sendmsg() or sendto() with the MSG_FASTOPEN flag,
992	rather than connect() to send data in SYN.
993
994	The server support is enabled by flag 0x2 (off by default). Then
995	either enable for all listeners with another flag (0x400) or
996	enable individual listeners via TCP_FASTOPEN socket option with
997	the option value being the length of the syn-data backlog.
998
999	The values (bitmap) are
1000
1001	=====  ======== ======================================================
1002	  0x1  (client) enables sending data in the opening SYN on the client.
1003	  0x2  (server) enables the server support, i.e., allowing data in
1004			a SYN packet to be accepted and passed to the
1005			application before 3-way handshake finishes.
1006	  0x4  (client) send data in the opening SYN regardless of cookie
1007			availability and without a cookie option.
1008	0x200  (server) accept data-in-SYN w/o any cookie option present.
1009	0x400  (server) enable all listeners to support Fast Open by
1010			default without explicit TCP_FASTOPEN socket option.
1011	=====  ======== ======================================================
1012
1013	Default: 0x1
1014
	Note that additional client or server features are only
	effective if the basic support (0x1 and 0x2, respectively) is enabled.
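
A tcp_fastopen value is a bitwise OR of the flags in the table. The
following sketch decodes such a value; the flag values come from the
table, while the dictionary, the description strings and the function
names are illustrative only.

```python
# Flag values taken from the table above; the names are illustrative.
TFO_FLAGS = {
    0x1: "client: send data in the opening SYN",
    0x2: "server: accept data in SYN before the handshake finishes",
    0x4: "client: send data in SYN without a cookie",
    0x200: "server: accept data in SYN without a cookie option",
    0x400: "server: enable Fast Open on all listeners by default",
}


def decode_tcp_fastopen(value):
    """Return the descriptions of all flags set in a tcp_fastopen value."""
    return [text for bit, text in TFO_FLAGS.items() if value & bit]


def server_enabled_everywhere(value):
    """True if every listener gets Fast Open: needs both 0x2 and 0x400."""
    return bool(value & 0x2) and bool(value & 0x400)
```

For example, 0x403 combines client support, server support and the
"all listeners" flag, whereas 0x400 on its own has no effect because the
basic server support (0x2) is missing.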
1017
1018tcp_fastopen_blackhole_timeout_sec - INTEGER
	Initial time period in seconds to disable Fastopen on active TCP sockets
	when a TFO firewall blackhole issue happens.
	This time period will grow exponentially when more blackhole issues
	get detected right after Fastopen is re-enabled and will reset to
	the initial value when the blackhole issue goes away.
	Set to 0 to disable the blackhole detection.
1025
1026	By default, it is set to 0 (feature is disabled).
1027
1028tcp_fastopen_key - list of comma separated 32-digit hexadecimal INTEGERs
1029	The list consists of a primary key and an optional backup key. The
1030	primary key is used for both creating and validating cookies, while the
1031	optional backup key is only used for validating cookies. The purpose of
1032	the backup key is to maximize TFO validation when keys are rotated.
1033
1034	A randomly chosen primary key may be configured by the kernel if
1035	the tcp_fastopen sysctl is set to 0x400 (see above), or if the
1036	TCP_FASTOPEN setsockopt() optname is set and a key has not been
1037	previously configured via sysctl. If keys are configured via
1038	setsockopt() by using the TCP_FASTOPEN_KEY optname, then those
1039	per-socket keys will be used instead of any keys that are specified via
1040	sysctl.
1041
	A key is specified as four 8-digit hexadecimal integers which are separated
1043	by a '-' as: xxxxxxxx-xxxxxxxx-xxxxxxxx-xxxxxxxx. Leading zeros may be
1044	omitted. A primary and a backup key may be specified by separating them
1045	by a comma. If only one key is specified, it becomes the primary key and
1046	any previously configured backup keys are removed.
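
The key format can be checked before it is written to the sysctl. The
sketch below is a hypothetical validator, not the kernel parser: it
accepts one or two keys, allows omitted leading zeros, and returns the
keys with every group padded to 8 digits. Tolerating surrounding
whitespace is an assumption of this example.

```python
import re

# One key: four groups of 1 to 8 hex digits joined by '-'
# (leading zeros may be omitted, so a group can be shorter than 8 digits).
_KEY_RE = re.compile(r"^[0-9a-fA-F]{1,8}(-[0-9a-fA-F]{1,8}){3}$")


def parse_fastopen_keys(text):
    """Split a tcp_fastopen_key string into (primary, backup_or_None).

    Raises ValueError if the string does not match the documented format.
    """
    keys = [k.strip() for k in text.strip().split(",")]
    if not 1 <= len(keys) <= 2 or not all(_KEY_RE.match(k) for k in keys):
        raise ValueError("expected one or two keys of four hex groups")
    # Normalise each group to 8 digits so omitted leading zeros are explicit.
    norm = ["-".join(g.zfill(8).lower() for g in k.split("-")) for k in keys]
    return norm[0], (norm[1] if len(norm) == 2 else None)
```

A single key yields a backup of None, which mirrors the rule that
writing only one key removes any previously configured backup key.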
1047
1048tcp_syn_retries - INTEGER
	Number of times initial SYNs for an active TCP connection attempt
	will be retransmitted. Should not be higher than 127. Default value
	is 6, which corresponds to 67 seconds (with tcp_syn_linear_timeouts = 4)
	until the last retransmission with the current initial RTO of 1 second.
	With this the final timeout for an active TCP connection attempt
	will happen after 131 seconds.
1055
1056tcp_timestamps - INTEGER
1057	Enable timestamps as defined in RFC1323.
1058
1059	- 0: Disabled.
1060	- 1: Enable timestamps as defined in RFC1323 and use random offset for
1061	  each connection rather than only using the current time.
1062	- 2: Like 1, but without random offsets.
1063
1064	Default: 1
1065
1066tcp_min_tso_segs - INTEGER
1067	Minimal number of segments per TSO frame.
1068
1069	Since linux-3.12, TCP does an automatic sizing of TSO frames,
1070	depending on flow rate, instead of filling 64Kbytes packets.
1071	For specific usages, it's possible to force TCP to build big
1072	TSO frames. Note that TCP stack might split too big TSO packets
1073	if available window is too small.
1074
1075	Default: 2
1076
1077tcp_tso_rtt_log - INTEGER
1078	Adjustment of TSO packet sizes based on min_rtt
1079
1080	Starting from linux-5.18, TCP autosizing can be tweaked
1081	for flows having small RTT.
1082
	Old autosizing was splitting the pacing budget to send 1024 TSO
	per second::

	    tso_packet_size = sk->sk_pacing_rate / 1024;

	With the new mechanism, we increase this TSO sizing using::

	    distance = min_rtt_usec / (2^tcp_tso_rtt_log)
	    tso_packet_size += gso_max_size >> distance;
1092
1093	This means that flows between very close hosts can use bigger
1094	TSO packets, reducing their cpu costs.
1095
1096	If you want to use the old autosizing, set this sysctl to 0.
1097
1098	Default: 9  (2^9 = 512 usec)
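
The effect of tcp_tso_rtt_log on the formulas above can be explored with
a small model. This sketch is not the kernel implementation: the
function name is hypothetical, the gso_max_size default of 65536 bytes
is an assumed example value, and no upper clamp is applied to the
result.

```python
def tso_packet_size(pacing_rate, min_rtt_usec, gso_max_size=65536,
                    tcp_tso_rtt_log=9):
    """Hypothetical model of the two formulas above (sizes in bytes)."""
    # Old autosizing: split the pacing budget into 1024 chunks per second.
    size = pacing_rate // 1024
    # New term: the closer the peer (small min_rtt), the smaller the
    # distance and the larger the extra budget.
    distance = min_rtt_usec >> tcp_tso_rtt_log   # min_rtt_usec / 2^log
    size += gso_max_size >> distance
    return size
```

With the default of 9, every additional 512 usec of min_rtt halves the
extra budget, so it matters mostly for flows between very close hosts
and vanishes for long paths.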
1099
1100tcp_pacing_ss_ratio - INTEGER
1101	sk->sk_pacing_rate is set by TCP stack using a ratio applied
1102	to current rate. (current_rate = cwnd * mss / srtt)
1103	If TCP is in slow start, tcp_pacing_ss_ratio is applied
1104	to let TCP probe for bigger speeds, assuming cwnd can be
1105	doubled every other RTT.
1106
1107	Default: 200
1108
1109tcp_pacing_ca_ratio - INTEGER
1110	sk->sk_pacing_rate is set by TCP stack using a ratio applied
1111	to current rate. (current_rate = cwnd * mss / srtt)
1112	If TCP is in congestion avoidance phase, tcp_pacing_ca_ratio
1113	is applied to conservatively probe for bigger throughput.
1114
1115	Default: 120
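
The two pacing ratios can be illustrated with the current_rate formula
quoted above. The sketch assumes the ratios are percentages (so 200
means twice the current rate and 120 means 1.2 times); the function name
and the choice of microseconds for srtt are illustrative.

```python
def pacing_rate(cwnd, mss, srtt_usec, in_slow_start,
                ss_ratio=200, ca_ratio=120):
    """Hypothetical model: scale the current rate by a percentage ratio.

    Returns bytes per second, using integer arithmetic only.
    """
    # current_rate = cwnd * mss / srtt, with srtt given in microseconds.
    current_rate = cwnd * mss * 1_000_000 // srtt_usec
    ratio = ss_ratio if in_slow_start else ca_ratio
    return current_rate * ratio // 100
```

For a flow with cwnd = 10, mss = 1000 bytes and srtt = 100 ms, the
current rate is 100000 bytes per second; the model paces at 200000 in
slow start and at 120000 in congestion avoidance.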
1116
1117tcp_syn_linear_timeouts - INTEGER
1118	The number of times for an active TCP connection to retransmit SYNs with
1119	a linear backoff timeout before defaulting to an exponential backoff
1120	timeout. This has no effect on SYNACK at the passive TCP side.
1121
	With an initial RTO of 1 second and tcp_syn_linear_timeouts = 4 we would
	expect SYN RTOs to be: 1, 1, 1, 1, 1, 2, 4, ... (4 linear timeouts,
	and the first exponential backoff using 2^0 * initial_RTO).

	Default: 4
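
The RTO sequence above, together with the 67 and 131 second figures
given for tcp_syn_retries, can be reproduced with a short model. It
assumes that the linear phase adds tcp_syn_linear_timeouts extra
retransmissions on top of tcp_syn_retries, which is the reading under
which the documented numbers come out. The function name is
hypothetical.

```python
def syn_schedule(syn_retries=6, linear_timeouts=4, initial_rto=1):
    """Return (retransmission times, final timeout) in seconds.

    Assumption of this sketch: the linear phase adds linear_timeouts
    extra retransmissions on top of syn_retries.
    """
    times = []
    now = 0
    rto = initial_rto
    total = syn_retries + linear_timeouts
    for n in range(1, total + 1):
        now += rto              # timer fires, SYN is retransmitted
        times.append(now)
        if n > linear_timeouts:
            rto *= 2            # exponential phase
    final_timeout = now + rto
    return times, final_timeout
```

With the defaults the SYN is retransmitted at 1, 2, 3, 4, 5, 7, 11, 19,
35 and 67 seconds and the attempt fails at 131 seconds. Setting
linear_timeouts to 0 gives the purely exponential schedule ending at 63
and 127 seconds.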
1126
1127tcp_tso_win_divisor - INTEGER
1128	This allows control over what percentage of the congestion window
1129	can be consumed by a single TSO frame.
1130	The setting of this parameter is a choice between burstiness and
1131	building larger TSO frames.
1132
1133	Default: 3
1134
1135tcp_tw_reuse - INTEGER
1136	Enable reuse of TIME-WAIT sockets for new connections when it is
	safe from a protocol viewpoint.
1138
1139	- 0 - disable
1140	- 1 - global enable
1141	- 2 - enable for loopback traffic only
1142
1143	It should not be changed without advice/request of technical
1144	experts.
1145
1146	Default: 2
1147
1148tcp_tw_reuse_delay - UNSIGNED INTEGER
1149        The delay in milliseconds before a TIME-WAIT socket can be reused by a
1150        new connection, if TIME-WAIT socket reuse is enabled. The actual reuse
1151        threshold is within [N, N+1] range, where N is the requested delay in
1152        milliseconds, to ensure the delay interval is never shorter than the
1153        configured value.
1154
        This setting contains an assumption about the peer's TCP timestamp
        clock tick interval. It should not be set to a value lower than the
        peer's clock tick for the PAWS (Protection Against Wrapped Sequence
        numbers) mechanism to work correctly for the reused connection.
1159
1160        Default: 1000 (milliseconds)
1161
1162tcp_window_scaling - BOOLEAN
1163	Enable window scaling as defined in RFC1323.
1164
1165	Possible values:
1166
1167	- 0 (disabled)
1168	- 1 (enabled)
1169
1170	Default: 1 (enabled)
1171
1172tcp_shrink_window - BOOLEAN
1173	This changes how the TCP receive window is calculated.
1174
1175	RFC 7323, section 2.4, says there are instances when a retracted
1176	window can be offered, and that TCP implementations MUST ensure
1177	that they handle a shrinking window, as specified in RFC 1122.
1178
1179	Possible values:
1180
1181	- 0 (disabled) - The window is never shrunk.
1182	- 1 (enabled)  - The window is shrunk when necessary to remain within
1183	  the memory limit set by autotuning (sk_rcvbuf).
1184	  This only occurs if a non-zero receive window
1185	  scaling factor is also in effect.
1186
1187	Default: 0 (disabled)
1188
1189tcp_wmem - vector of 3 INTEGERs: min, default, max
	min: Amount of memory reserved for send buffers for TCP sockets.
	Each TCP socket is entitled to use it from the moment it is created.
1192
1193	Default: 4K
1194
1195	default: initial size of send buffer used by TCP sockets.  This
1196	value overrides net.core.wmem_default used by other protocols.
1197
1198	It is usually lower than net.core.wmem_default.
1199
1200	Default: 16K
1201
1202	max: Maximal amount of memory allowed for automatically tuned
1203	send buffers for TCP sockets. This value does not override
1204	net.core.wmem_max.  Calling setsockopt() with SO_SNDBUF disables
1205	automatic tuning of that socket's send buffer size, in which case
1206	this value is ignored.
1207
1208	Default: between 64K and 4MB, depending on RAM size.
1209
1210tcp_notsent_lowat - UNSIGNED INTEGER
1211	A TCP socket can control the amount of unsent bytes in its write queue,
1212	thanks to TCP_NOTSENT_LOWAT socket option. poll()/select()/epoll()
1213	reports POLLOUT events if the amount of unsent bytes is below a per
1214	socket value, and if the write queue is not full. sendmsg() will
1215	also not add new buffers if the limit is hit.
1216
1217	This global variable controls the amount of unsent data for
1218	sockets not using TCP_NOTSENT_LOWAT. For these sockets, a change
1219	to the global variable has immediate effect.
1220
1221	Default: UINT_MAX (0xFFFFFFFF)
1222
1223tcp_workaround_signed_windows - BOOLEAN
1224	If enabled, assume no receipt of a window scaling option means the
1225	remote TCP is broken and treats the window as a signed quantity.
1226	If disabled, assume the remote TCP is not broken even if we do
1227	not receive a window scaling option from them.
1228
1229	Possible values:
1230
1231	- 0 (disabled)
1232	- 1 (enabled)
1233
1234	Default: 0 (disabled)
1235
1236tcp_thin_linear_timeouts - BOOLEAN
1237	Enable dynamic triggering of linear timeouts for thin streams.
1238	If enabled, a check is performed upon retransmission by timeout to
1239	determine if the stream is thin (less than 4 packets in flight).
1240	As long as the stream is found to be thin, up to 6 linear
1241	timeouts may be performed before exponential backoff mode is
1242	initiated. This improves retransmission latency for
1243	non-aggressive thin streams, often found to be time-dependent.
1244	For more information on thin streams, see
1245	Documentation/networking/tcp-thin.rst
1246
1247	Possible values:
1248
1249	- 0 (disabled)
1250	- 1 (enabled)
1251
1252	Default: 0 (disabled)
1253
1254tcp_limit_output_bytes - INTEGER
1255	Controls TCP Small Queue limit per tcp socket.
	A TCP bulk sender tends to increase packets in flight until it
	gets loss notifications. With SNDBUF autotuning, this can
1258	result in a large amount of packets queued on the local machine
1259	(e.g.: qdiscs, CPU backlog, or device) hurting latency of other
1260	flows, for typical pfifo_fast qdiscs.  tcp_limit_output_bytes
1261	limits the number of bytes on qdisc or device to reduce artificial
1262	RTT/cwnd and reduce bufferbloat.
1263
1264	Default: 4194304 (4 MB)
1265
1266tcp_challenge_ack_limit - INTEGER
	Limits the number of Challenge ACKs sent per second, as recommended
	in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks).
	Note that this per netns rate limit can allow some side channel
	attacks and probably should not be enabled.
	TCP stack implements per TCP socket limits anyway.

	Default: INT_MAX (unlimited)
1273
1274tcp_ehash_entries - INTEGER
1275	Show the number of hash buckets for TCP sockets in the current
1276	networking namespace.
1277
	A negative value means the networking namespace does not own its
	hash buckets and shares those of the initial networking namespace.
1280
1281tcp_child_ehash_entries - INTEGER
1282	Control the number of hash buckets for TCP sockets in the child
1283	networking namespace, which must be set before clone() or unshare().
1284
1285	If the value is not 0, the kernel uses a value rounded up to 2^n
1286	as the actual hash bucket size.  0 is a special value, meaning
1287	the child networking namespace will share the initial networking
1288	namespace's hash buckets.
1289
1290	Note that the child will use the global one in case the kernel
1291	fails to allocate enough memory.  In addition, the global hash
1292	buckets are spread over available NUMA nodes, but the allocation
1293	of the child hash table depends on the current process's NUMA
1294	policy, which could result in performance differences.
1295
	Note also that the default values of tcp_max_tw_buckets and
	tcp_max_syn_backlog depend on the hash bucket size.
1298
1299	Possible values: 0, 2^n (n: 0 - 24 (16Mi))
1300
1301	Default: 0
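
The "rounded up to 2^n" rule for tcp_child_ehash_entries can be sketched
as follows. The function name is hypothetical, and the sketch does not
enforce the documented upper bound of 2^24 or model the fallback to the
global table on allocation failure.

```python
def effective_ehash_entries(requested):
    """Hypothetical model of how tcp_child_ehash_entries is interpreted.

    Returns 0 when the child shares the initial namespace's table,
    otherwise the requested value rounded up to a power of two.
    """
    if requested == 0:
        return 0                      # share the global hash buckets
    # Smallest power of two that is >= requested.
    return 1 << (requested - 1).bit_length()
```

A request for 1000 buckets would therefore be treated as 1024, while a
request for exactly 1024 stays unchanged.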
1302
1303tcp_plb_enabled - BOOLEAN
1304	If enabled and the underlying congestion control (e.g. DCTCP) supports
1305	and enables PLB feature, TCP PLB (Protective Load Balancing) is
1306	enabled. PLB is described in the following paper:
1307	https://doi.org/10.1145/3544216.3544226. Based on PLB parameters,
1308	upon sensing sustained congestion, TCP triggers a change in
1309	flow label field for outgoing IPv6 packets. A change in flow label
1310	field potentially changes the path of outgoing packets for switches
1311	that use ECMP/WCMP for routing.
1312
	PLB changes socket txhash which results in a change in IPv6 Flow Label
	field, and is currently a no-op for IPv4 headers. It is possible
1315	to apply PLB for IPv4 with other network header fields (e.g. TCP
1316	or IPv4 options) or using encapsulation where outer header is used
1317	by switches to determine next hop. In either case, further host
1318	and switch side changes will be needed.
1319
1320	If enabled, PLB assumes that congestion signal (e.g. ECN) is made
1321	available and used by congestion control module to estimate a
1322	congestion measure (e.g. ce_ratio). PLB needs a congestion measure to
1323	make repathing decisions.
1324
1325	Possible values:
1326
1327	- 0 (disabled)
1328	- 1 (enabled)
1329
1330	Default: 0 (disabled)
1331
1332tcp_plb_idle_rehash_rounds - INTEGER
1333	Number of consecutive congested rounds (RTT) seen after which
1334	a rehash can be performed, given there are no packets in flight.
	This is referred to as M in the PLB paper:
1336	https://doi.org/10.1145/3544216.3544226.
1337
1338	Possible Values: 0 - 31
1339
1340	Default: 3
1341
1342tcp_plb_rehash_rounds - INTEGER
1343	Number of consecutive congested rounds (RTT) seen after which
1344	a forced rehash can be performed. Be careful when setting this
1345	parameter, as a small value increases the risk of retransmissions.
	This is referred to as N in the PLB paper:
1347	https://doi.org/10.1145/3544216.3544226.
1348
1349	Possible Values: 0 - 31
1350
1351	Default: 12
1352
1353tcp_plb_suspend_rto_sec - INTEGER
1354	Time, in seconds, to suspend PLB in event of an RTO. In order to avoid
1355	having PLB repath onto a connectivity "black hole", after an RTO a TCP
1356	connection suspends PLB repathing for a random duration between 1x and
1357	2x of this parameter. Randomness is added to avoid concurrent rehashing
1358	of multiple TCP connections. This should be set corresponding to the
1359	amount of time it takes to repair a failed link.
1360
1361	Possible Values: 0 - 255
1362
1363	Default: 60
1364
1365tcp_plb_cong_thresh - INTEGER
1366	Fraction of packets marked with congestion over a round (RTT) to
1367	tag that round as congested. This is referred to as K in the PLB paper:
1368	https://doi.org/10.1145/3544216.3544226.
1369
1370	The 0-1 fraction range is mapped to 0-256 range to avoid floating
1371	point operations. For example, 128 means that if at least 50% of
1372	the packets in a round were marked as congested then the round
1373	will be tagged as congested.
1374
1375	Setting threshold to 0 means that PLB repaths every RTT regardless
1376	of congestion. This is not intended behavior for PLB and should be
	used only for experimentation purposes.
1378
1379	Possible Values: 0 - 256
1380
1381	Default: 128
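
The mapping of the 0-1 fraction onto the 0-256 range can be illustrated
with integer arithmetic only. This is a sketch under stated assumptions:
the function name is hypothetical, and treating a round with no packets
as "not congested" is a choice made for this example.

```python
def round_is_congested(marked_packets, total_packets, cong_thresh=128):
    """Hypothetical check of one RTT round against tcp_plb_cong_thresh."""
    if total_packets == 0:
        return False                  # nothing delivered, nothing to judge
    # Map the 0-1 fraction to the 0-256 range using integers only.
    scaled_ratio = marked_packets * 256 // total_packets
    return scaled_ratio >= cong_thresh
```

With the default threshold of 128, a round in which 50 of 100 packets
were marked counts as congested, while 49 of 100 does not. A threshold
of 0 makes every round with traffic count as congested, matching the
warning above.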
1382
1383tcp_pingpong_thresh - INTEGER
1384	The number of estimated data replies sent for estimated incoming data
1385	requests that must happen before TCP considers that a connection is a
1386	"ping-pong" (request-response) connection for which delayed
1387	acknowledgments can provide benefits.
1388
1389	This threshold is 1 by default, but some applications may need a higher
1390	threshold for optimal performance.
1391
1392	Possible Values: 1 - 255
1393
1394	Default: 1
1395
1396tcp_rto_min_us - INTEGER
1397	Minimal TCP retransmission timeout (in microseconds). Note that the
1398	rto_min route option has the highest precedence for configuring this
1399	setting, followed by the TCP_BPF_RTO_MIN and TCP_RTO_MIN_US socket
1400	options, followed by this tcp_rto_min_us sysctl.
1401
	The recommended practice is to use a value less than or equal to 200000
	microseconds.
1404
1405	Possible Values: 1 - INT_MAX
1406
1407	Default: 200000
1408
1409tcp_rto_max_ms - INTEGER
1410	Maximal TCP retransmission timeout (in ms).
1411	Note that TCP_RTO_MAX_MS socket option has higher precedence.
1412
1413	When changing tcp_rto_max_ms, it is important to understand
1414	that tcp_retries2 might need a change.
1415
1416	Possible Values: 1000 - 120,000
1417
1418	Default: 120,000
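
Why tcp_retries2 may need to change along with tcp_rto_max_ms can be
seen by modelling the hypothetical timeout that tcp_retries2 refers to.
The sketch assumes the first RTO is 200 ms and that each following RTO
doubles until it is capped at tcp_rto_max_ms. The function name is
hypothetical.

```python
def hypothetical_timeout_ms(retries2=15, rto_max_ms=120_000, rto_min_ms=200):
    """Sum the RTOs of a flow whose every retransmission is lost.

    Assumptions of this sketch: the first RTO equals rto_min_ms and
    each following RTO doubles until it is capped at rto_max_ms.
    """
    total = 0
    rto = rto_min_ms
    for _ in range(retries2 + 1):     # initial RTO plus retries2 retransmissions
        total += min(rto, rto_max_ms)
        rto *= 2
    return total
```

Under these assumptions the defaults give 924600 ms, and a tcp_retries2
of 8 gives 102200 ms, just above the 100 seconds recommended by RFC
1122. Lowering tcp_rto_max_ms to 30000 while keeping tcp_retries2 at 15
reduces the result to 291000 ms.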
1419
1420UDP variables
1421=============
1422
1423udp_l3mdev_accept - BOOLEAN
1424	Enabling this option allows a "global" bound socket to work
1425	across L3 master domains (e.g., VRFs) with packets capable of
1426	being received regardless of the L3 domain in which they
1427	originated. Only valid when the kernel was compiled with
1428	CONFIG_NET_L3_MASTER_DEV.
1429
1430	Possible values:
1431
1432	- 0 (disabled)
1433	- 1 (enabled)
1434
1435	Default: 0 (disabled)
1436
1437udp_mem - vector of 3 INTEGERs: min, pressure, max
1438	Number of pages allowed for queueing by all UDP sockets.
1439
1440	min: Number of pages allowed for queueing by all UDP sockets.
1441
1442	pressure: This value was introduced to follow format of tcp_mem.
1443
1444	max: This value was introduced to follow format of tcp_mem.
1445
1446	Default is calculated at boot time from amount of available memory.
1447
1448udp_rmem_min - INTEGER
	Minimal size of the receive buffer used by UDP sockets under memory
	moderation. Each UDP socket is able to use this size for receiving data,
	even if the total pages of UDP sockets exceed udp_mem pressure. The unit
	is bytes.
1452
1453	Default: 4K
1454
1455udp_wmem_min - INTEGER
1456	UDP does not have tx memory accounting and this tunable has no effect.
1457
1458udp_hash_entries - INTEGER
1459	Show the number of hash buckets for UDP sockets in the current
1460	networking namespace.
1461
	A negative value means the networking namespace does not own its
	hash buckets and shares those of the initial networking namespace.
1464
1465udp_child_hash_entries - INTEGER
1466	Control the number of hash buckets for UDP sockets in the child
1467	networking namespace, which must be set before clone() or unshare().
1468
1469	If the value is not 0, the kernel uses a value rounded up to 2^n
1470	as the actual hash bucket size.  0 is a special value, meaning
1471	the child networking namespace will share the initial networking
1472	namespace's hash buckets.
1473
1474	Note that the child will use the global one in case the kernel
1475	fails to allocate enough memory.  In addition, the global hash
1476	buckets are spread over available NUMA nodes, but the allocation
1477	of the child hash table depends on the current process's NUMA
1478	policy, which could result in performance differences.
1479
1480	Possible values: 0, 2^n (n: 7 (128) - 16 (64K))
1481
1482	Default: 0
1483
1484
1485RAW variables
1486=============
1487
1488raw_l3mdev_accept - BOOLEAN
1489	Enabling this option allows a "global" bound socket to work
1490	across L3 master domains (e.g., VRFs) with packets capable of
1491	being received regardless of the L3 domain in which they
1492	originated. Only valid when the kernel was compiled with
1493	CONFIG_NET_L3_MASTER_DEV.
1494
1495	Possible values:
1496
1497	- 0 (disabled)
1498	- 1 (enabled)
1499
1500	Default: 1 (enabled)
1501
1502CIPSOv4 Variables
1503=================
1504
1505cipso_cache_enable - BOOLEAN
1506	If enabled, enable additions to and lookups from the CIPSO label mapping
1507	cache.  If disabled, additions are ignored and lookups always result in a
	miss.  However, regardless of the setting the cache is still
	invalidated when required, which means you can safely toggle this on
	and off and the cache will always be "safe".
1511
1512	Possible values:
1513
1514	- 0 (disabled)
1515	- 1 (enabled)
1516
1517	Default: 1 (enabled)
1518
1519cipso_cache_bucket_size - INTEGER
1520	The CIPSO label cache consists of a fixed size hash table with each
1521	hash bucket containing a number of cache entries.  This variable limits
1522	the number of entries in each hash bucket; the larger the value is, the
1523	more CIPSO label mappings that can be cached.  When the number of
1524	entries in a given hash bucket reaches this limit adding new entries
1525	causes the oldest entry in the bucket to be removed to make room.
1526
1527	Default: 10
1528
1529cipso_rbm_optfmt - BOOLEAN
1530	Enable the "Optimized Tag 1 Format" as defined in section 3.4.2.6 of
1531	the CIPSO draft specification (see Documentation/netlabel for details).
1532	This means that when set the CIPSO tag will be padded with empty
1533	categories in order to make the packet data 32-bit aligned.
1534
1535	Possible values:
1536
1537	- 0 (disabled)
1538	- 1 (enabled)
1539
1540	Default: 0 (disabled)
1541
1542cipso_rbm_strictvalid - BOOLEAN
1543	If enabled, do a very strict check of the CIPSO option when
1544	ip_options_compile() is called.  If disabled, relax the checks done during
	ip_options_compile().  Either way is "safe" as errors are caught
	elsewhere in the CIPSO processing code.  Setting this to 0 (False)
	should result in less work (i.e. it should be faster) but could cause
	problems with other implementations that require strict checking.
1549
1550	Possible values:
1551
1552	- 0 (disabled)
1553	- 1 (enabled)
1554
1555	Default: 0 (disabled)
1556
1557IP Variables
1558============
1559
1560ip_local_port_range - 2 INTEGERS
1561	Defines the local port range that is used by TCP and UDP to
1562	choose the local port. The first number is the first, the
1563	second the last local port number.
	If possible, it is better if these numbers have different parity
1565	(one even and one odd value).
1566	Must be greater than or equal to ip_unprivileged_port_start.
1567	The default values are 32768 and 60999 respectively.
1568
1569ip_local_reserved_ports - list of comma separated ranges
1570	Specify the ports which are reserved for known third-party
1571	applications. These ports will not be used by automatic port
1572	assignments (e.g. when calling connect() or bind() with port
1573	number 0). Explicit port allocation behavior is unchanged.
1574
1575	The format used for both input and output is a comma separated
1576	list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and
1577	10). Writing to the file will clear all previously reserved
1578	ports and update the current list with the one given in the
1579	input.
1580
1581	Note that ip_local_port_range and ip_local_reserved_ports
1582	settings are independent and both are considered by the kernel
1583	when determining which ports are available for automatic port
1584	assignments.
1585
1586	You can reserve ports which are not in the current
1587	ip_local_port_range, e.g.::
1588
1589	    $ cat /proc/sys/net/ipv4/ip_local_port_range
1590	    32000	60999
1591	    $ cat /proc/sys/net/ipv4/ip_local_reserved_ports
1592	    8080,9148
1593
	although this is redundant. However, such a setting is useful
	if the port range is later changed to a value that
	includes the reserved ports. Also keep in mind that overlapping
	of these ranges may affect the probability of selecting ephemeral
	ports which are right after a block of reserved ports.
1599
1600	Default: Empty
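
The list format and its interaction with ip_local_port_range can be
sketched as follows. Both helpers are hypothetical illustrations of the
rules described above, not kernel code.

```python
def parse_reserved_ports(text):
    """Expand a string such as "1,2-4,10-10" into a sorted list of ports."""
    ports = set()
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue                  # empty setting reserves nothing
        first, _, last = item.partition("-")
        low = int(first)
        high = int(last) if last else low   # a single port is a 1-wide range
        if not 0 <= low <= high <= 65535:
            raise ValueError("invalid port range: " + item)
        ports.update(range(low, high + 1))
    return sorted(ports)


def auto_assignable(port, port_range, reserved):
    """True if automatic assignment may pick this port."""
    low, high = port_range
    return low <= port <= high and port not in reserved
```

With the example settings above, ports 8080 and 9148 are outside
32000-60999 and would not be picked anyway. If the range were later
widened to start at 9000, port 9148 would still be skipped because it
is reserved.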
1601
1602ip_unprivileged_port_start - INTEGER
1603	This is a per-namespace sysctl.  It defines the first
1604	unprivileged port in the network namespace.  Privileged ports
1605	require root or CAP_NET_BIND_SERVICE in order to bind to them.
	To disable all privileged ports, set this to 0.  The privileged
	port range must not overlap with ip_local_port_range.
1608
1609	Default: 1024
1610
1611ip_nonlocal_bind - BOOLEAN
1612	If enabled, allows processes to bind() to non-local IP addresses,
1613	which can be quite useful - but may break some applications.
1614
1615	Possible values:
1616
1617	- 0 (disabled)
1618	- 1 (enabled)
1619
1620	Default: 0 (disabled)
1621
1622ip_autobind_reuse - BOOLEAN
1623	By default, bind() does not select the ports automatically even if
1624	the new socket and all sockets bound to the port have SO_REUSEADDR.
1625	ip_autobind_reuse allows bind() to reuse the port and this is useful
1626	when you use bind()+connect(), but may break some applications.
1627	The preferred solution is to use IP_BIND_ADDRESS_NO_PORT and this
1628	option should only be set by experts.
1629
1630	Possible values:
1631
1632	- 0 (disabled)
1633	- 1 (enabled)
1634
1635	Default: 0 (disabled)
1636
1637ip_dynaddr - INTEGER
1638	If set non-zero, enables support for dynamic addresses.
1639	If set to a non-zero value larger than 1, a kernel log
1640	message will be printed when dynamic address rewriting
1641	occurs.
1642
1643	Default: 0
1644
1645ip_early_demux - BOOLEAN
1646	Optimize input packet processing down to one demux for
1647	certain kinds of local sockets.  Currently we only do this
1648	for established TCP and connected UDP sockets.
1649
	It may add an additional cost for pure routing workloads that
	reduces overall throughput; in such a case you should disable it.
1652
1653	Possible values:
1654
1655	- 0 (disabled)
1656	- 1 (enabled)
1657
1658	Default: 1 (enabled)
1659
1660ping_group_range - 2 INTEGERS
1661	Restrict ICMP_PROTO datagram sockets to users in the group range.
	The default is "1 0", meaning that nobody (not even root) may
1663	create ping sockets.  Setting it to "100 100" would grant permissions
1664	to the single group. "0 4294967294" would enable it for the world, "100
1665	4294967294" would enable it for the users, but not daemons.
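
The range semantics, including why "1 0" excludes everyone, can be
sketched with a simple membership test. The function is a hypothetical
illustration; it assumes the check is an inclusive comparison against
the caller's group IDs.

```python
def may_create_ping_socket(group_ids, ping_group_range):
    """Hypothetical check: is any of the caller's GIDs inside the range?"""
    low, high = ping_group_range
    # With the default (1, 0) the range is empty, so nobody matches.
    return any(low <= gid <= high for gid in group_ids)
```

Because the lower bound of the default exceeds the upper bound, no
group ID can satisfy both comparisons, so even a process in group 0 is
rejected.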
1666
1667tcp_early_demux - BOOLEAN
1668	Enable early demux for established TCP sockets.
1669
1670	Possible values:
1671
1672	- 0 (disabled)
1673	- 1 (enabled)
1674
1675	Default: 1 (enabled)
1676
1677udp_early_demux - BOOLEAN
1678	Enable early demux for connected UDP sockets. Disable this if
1679	your system could experience more unconnected load.
1680
1681	Possible values:
1682
1683	- 0 (disabled)
1684	- 1 (enabled)
1685
1686	Default: 1 (enabled)
1687
1688icmp_echo_ignore_all - BOOLEAN
1689	If enabled, then the kernel will ignore all ICMP ECHO
1690	requests sent to it.
1691
1692	Possible values:
1693
1694	- 0 (disabled)
1695	- 1 (enabled)
1696
1697	Default: 0 (disabled)
1698
1699icmp_echo_enable_probe - BOOLEAN
1700	If enabled, then the kernel will respond to RFC 8335 PROBE
1701	requests sent to it.
1702
1703	Possible values:
1704
1705	- 0 (disabled)
1706	- 1 (enabled)
1707
1708	Default: 0 (disabled)
1709
1710icmp_echo_ignore_broadcasts - BOOLEAN
1711	If enabled, then the kernel will ignore all ICMP ECHO and
1712	TIMESTAMP requests sent to it via broadcast/multicast.
1713
1714	Possible values:
1715
1716	- 0 (disabled)
1717	- 1 (enabled)
1718
1719	Default: 1 (enabled)
1720
1721icmp_ratelimit - INTEGER
1722	Limit the maximum rate at which ICMP packets whose type matches
1723	icmp_ratemask (see below) are sent to a specific target.
1724	Set to 0 to disable any limiting;
1725	otherwise, the minimum interval between responses in milliseconds.
1726	Note that another sysctl, icmp_msgs_per_sec, limits the number
1727	of ICMP packets sent to all targets.
1728
1729	Default: 1000
1730
1731icmp_msgs_per_sec - INTEGER
1732	Limit maximal number of ICMP packets sent per second from this host.
1733	Only messages whose type matches icmp_ratemask (see below) are
1734	controlled by this limit. For security reasons, the precise count
1735	of messages per second is randomized.
1736
1737	Default: 1000
1738
1739icmp_msgs_burst - INTEGER
1740	icmp_msgs_per_sec controls number of ICMP packets sent per second,
1741	while icmp_msgs_burst controls the burst size of these packets.
1742	For security reasons, the precise burst size is randomized.
1743
1744	Default: 50
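The interplay of icmp_msgs_per_sec and icmp_msgs_burst can be pictured as a token bucket. The sketch below is a simplified model under stated assumptions: the class name ``IcmpGlobalLimiter`` is hypothetical, time is passed in explicitly as seconds, and the randomization mentioned above is deliberately left out.

```python
class IcmpGlobalLimiter:
    """Simplified token bucket: refill at msgs_per_sec, capped at msgs_burst.

    Hypothetical model; the kernel additionally randomizes both values.
    """

    def __init__(self, msgs_per_sec=1000, msgs_burst=50):
        self.rate = msgs_per_sec
        self.burst = msgs_burst
        self.tokens = float(msgs_burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now):
        """Return True if one ICMP message may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill according to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limiter = IcmpGlobalLimiter()
sent = sum(limiter.allow(0.0) for _ in range(100))
print(sent)
```

In this model, a burst of back-to-back messages is capped at the burst size, while the long-term average is bounded by the per-second rate.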
1745
1746icmp_ratemask - INTEGER
1747	Mask made of ICMP types for which rates are being limited.
1748
1749	Significant bits: IHGFEDCBA9876543210
1750
1751	Default mask:     0000001100000011000 (6168)
1752
1753	Bit definitions (see include/linux/icmp.h):
1754
1755		= =========================
1756		0 Echo Reply
1757		3 Destination Unreachable [1]_
1758		4 Source Quench [1]_
1759		5 Redirect
1760		8 Echo Request
1761		B Time Exceeded [1]_
1762		C Parameter Problem [1]_
1763		D Timestamp Request
1764		E Timestamp Reply
1765		F Info Request
1766		G Info Reply
1767		H Address Mask Request
1768		I Address Mask Reply
1769		= =========================
1770
1771	.. [1] These are rate limited by default (see default mask above)
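As an illustration of how the mask is composed, the following sketch sets one bit per ICMP type number from the table above. The dictionary keys and the function name ``ratemask`` are hypothetical labels chosen for this example.

```python
# ICMP type numbers from the bit-definition table (B..I are 11..18).
ICMP_TYPES = {
    "echo_reply": 0,
    "dest_unreachable": 3,
    "source_quench": 4,
    "redirect": 5,
    "echo_request": 8,
    "time_exceeded": 11,
    "parameter_problem": 12,
    "timestamp_request": 13,
    "timestamp_reply": 14,
    "info_request": 15,
    "info_reply": 16,
    "address_mask_request": 17,
    "address_mask_reply": 18,
}


def ratemask(*names):
    """Build an icmp_ratemask value with one bit set per named ICMP type."""
    mask = 0
    for name in names:
        mask |= 1 << ICMP_TYPES[name]
    return mask


default_mask = ratemask(
    "dest_unreachable", "source_quench", "time_exceeded", "parameter_problem"
)
# Print the decimal value and the 19-bit pattern (bit I on the left, bit 0 on the right).
print(default_mask, format(default_mask, "019b"))
```

The four types marked as rate limited by default reproduce the documented default mask of 6168.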
1772
1773icmp_ignore_bogus_error_responses - BOOLEAN
1774	Some routers violate RFC1122 by sending bogus responses to broadcast
1775	frames.  Such violations are normally logged via a kernel warning.
1776	If enabled, the kernel will not give such warnings, which
1777	will avoid log file clutter.
1778
1779	Possible values:
1780
1781	- 0 (disabled)
1782	- 1 (enabled)
1783
1784	Default: 1 (enabled)
1785
1786icmp_errors_use_inbound_ifaddr - BOOLEAN
1787
1788	If disabled, icmp error messages are sent with the primary address of
1789	the exiting interface.
1790
1791	If enabled, the message will be sent with the primary address of
1792	the interface that received the packet that caused the icmp error.
1793	This is the behaviour many network administrators will expect from
1794	a router. And it can make debugging complicated network layouts
1795	much easier.
1796
1797	Note that if no primary address exists for the interface selected,
1798	then the primary address of the first non-loopback interface that
1799	has one will be used regardless of this setting.
1800
1801	Possible values:
1802
1803	- 0 (disabled)
1804	- 1 (enabled)
1805
1806	Default: 0 (disabled)
1807
1808icmp_errors_extension_mask - UNSIGNED INTEGER
1809	Bitmask of ICMP extensions to append to ICMPv4 error messages
1810	("Destination Unreachable", "Time Exceeded" and "Parameter Problem").
1811	The original datagram is trimmed / padded to 128 bytes in order to be
1812	compatible with applications that do not comply with RFC 4884.
1813
1814	Possible extensions are:
1815
1816	==== ==============================================================
1817	0x01 Incoming IP interface information according to RFC 5837.
1818	     Extension will include the index, IPv4 address (if present),
1819	     name and MTU of the IP interface that received the datagram
1820	     which elicited the ICMP error.
1821	==== ==============================================================
1822
1823	Default: 0x00 (no extensions)
1824
1825igmp_max_memberships - INTEGER
1826	Change the maximum number of multicast groups we can subscribe to.
1827	Default: 20
1828
1829	Theoretical maximum value is bounded by having to send a membership
1830	report in a single datagram (i.e. the report can't span multiple
1831	datagrams, or risk confusing the switch and leaving groups you don't
1832	intend to).
1833
1834	The number of supported groups 'M' is bounded by the number of group
1835	report entries you can fit into a single datagram of 65535 bytes.
1836
1837	M = (65536 - sizeof(ip header)) / sizeof(Group record)
1838
1839	Group records are variable length, with a minimum of 12 bytes.
1840	So net.ipv4.igmp_max_memberships should not be set higher than:
1841
1842	(65536-24) / 12 = 5459
1843
1844	The value 5459 assumes no IP header options, so in practice
1845	this number may be lower.
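The bound above can be reproduced with integer arithmetic. The function name ``max_group_records`` and its parameter names are hypothetical; the constants are the ones used in the text.

```python
def max_group_records(header_bytes=24, record_bytes=12, datagram_bytes=65536):
    """Number of minimum-size group records that fit in a single report."""
    return (datagram_bytes - header_bytes) // record_bytes


# With the header overhead assumed in the text.
print(max_group_records())
# A larger header (for example, with IP options) lowers the bound.
print(max_group_records(header_bytes=60))
```

Integer division rounds down, since a partial group record cannot be sent.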
1846
1847igmp_max_msf - INTEGER
1848	Maximum number of addresses allowed in the source filter list for a
1849	multicast group.
1850
1851	Default: 10
1852
1853igmp_qrv - INTEGER
1854	Controls the IGMP query robustness variable (see RFC2236 8.1).
1855
1856	Default: 2 (as specified by RFC2236 8.1)
1857
1858	Minimum: 1 (as specified by RFC6636 4.5)
1859
1860force_igmp_version - INTEGER
1861	- 0 - (default) No enforcement of an IGMP version; IGMPv1/v2 fallback
1862	  is allowed. Switches back to IGMPv3 mode once all IGMPv1/v2 Querier
1863	  Present timers have expired.
1864	- 1 - Enforce the use of IGMP version 1. Also replies with an IGMPv1
1865	  report when an IGMPv2/v3 query is received.
1866	- 2 - Enforce the use of IGMP version 2. Falls back to IGMPv1 on an
1867	  IGMPv1 query. Replies with a report when an IGMPv3 query is received.
1868	- 3 - Enforce the use of IGMP version 3. Behaves the same as the default (0).
1869
1870	.. note::
1871
1872	   This differs from force_mld_version because the Security
1873	   Considerations of IGMPv3 (RFC3376), unlike those of MLDv2 (RFC3810),
1874	   do not clearly state that messages of other versions may be ignored
1875	   completely. Keeping this value at the default of 0 is recommended.
1876
1877``conf/interface/*``
1878	changes special settings per interface (where
1879	"interface" is the name of your network interface)
1880
1881``conf/all/*``
1882	  is special, changes the settings for all interfaces
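A small sketch of how these directories map to procfs paths, assuming the conventional /proc/sys mount point named in the section heading. The helper name ``conf_path`` and the interface name ``eth0`` are hypothetical.

```python
from pathlib import PurePosixPath


def conf_path(interface, name, family="ipv4"):
    """Return the procfs path of a per-interface setting.

    `interface` is a device name, or "all" / "default" for the special
    directories; nothing is read or written here.
    """
    return str(PurePosixPath("/proc/sys/net") / family / "conf" / interface / name)


print(conf_path("all", "log_martians"))
print(conf_path("eth0", "log_martians"))
```

The function only builds the path string, so it can be tried on any system without touching kernel state.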
1883
1884log_martians - BOOLEAN
1885	Log packets with impossible addresses to kernel log.
1886	log_martians for the interface will be enabled if at least one of
1887	conf/{all,interface}/log_martians is set to TRUE,
1888	it will be disabled otherwise
1889
1890accept_redirects - BOOLEAN
1891	Accept ICMP redirect messages.
1892	accept_redirects for the interface will be enabled if:
1893
1894	- both conf/{all,interface}/accept_redirects are TRUE in the case
1895	  forwarding for the interface is enabled
1896
1897	or
1898
1899	- at least one of conf/{all,interface}/accept_redirects is TRUE in the
1900	  case forwarding for the interface is disabled
1901
1902	accept_redirects for the interface will be disabled otherwise
1903
1904	default:
1905
1906		- TRUE (host)
1907		- FALSE (router)
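The combination rule for accept_redirects can be written as a small predicate. This is an illustrative sketch only; the function and parameter names are hypothetical.

```python
def effective_accept_redirects(conf_all, conf_iface, forwarding):
    """Combine conf/all and conf/<interface> as described above.

    With forwarding enabled both values must be set (logical AND);
    with forwarding disabled either one suffices (logical OR).
    """
    if forwarding:
        return bool(conf_all and conf_iface)
    return bool(conf_all or conf_iface)


# A host (forwarding disabled) accepts redirects if either value is set.
print(effective_accept_redirects(True, False, forwarding=False))
# A router (forwarding enabled) needs both values set.
print(effective_accept_redirects(True, False, forwarding=True))
```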
1908
1909forwarding - BOOLEAN
1910	Enable IP forwarding on this interface.  This controls whether packets
1911	received _on_ this interface can be forwarded.
1912
1913mc_forwarding - BOOLEAN
1914	Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE
1915	and a multicast routing daemon is required.
1916	conf/all/mc_forwarding must also be set to TRUE to enable multicast
1917	routing for the interface
1918
1919medium_id - INTEGER
1920	Integer value used to differentiate the devices by the medium they
1921	are attached to. Two devices can have different id values when
1922	the broadcast packets are received only on one of them.
1923	The default value 0 means that the device is the only interface
1924	to its medium, value of -1 means that medium is not known.
1925
1926	Currently, it is used to change the proxy_arp behavior:
1927	the proxy_arp feature is enabled for packets forwarded between
1928	two devices attached to different media.
1929
1930proxy_arp - BOOLEAN
1931	Do proxy arp.
1932
1933	proxy_arp for the interface will be enabled if at least one of
1934	conf/{all,interface}/proxy_arp is set to TRUE,
1935	it will be disabled otherwise
1936
1937proxy_arp_pvlan - BOOLEAN
1938	Private VLAN proxy arp.
1939
1940	Basically allow proxy arp replies back to the same interface
1941	(from which the ARP request/solicitation was received).
1942
1943	This is done to support (ethernet) switch features, like RFC
1944	3069, where the individual ports are NOT allowed to
1945	communicate with each other, but they are allowed to talk to
1946	the upstream router.  As described in RFC 3069, it is possible
1947	to allow these hosts to communicate through the upstream
1948	router by proxy_arp'ing. Does not need to be used together with
1949	proxy_arp.
1950
1951	This technology is known by different names:
1952
1953	- In RFC 3069 it is called VLAN Aggregation.
1954	- Cisco and Allied Telesyn call it Private VLAN.
1955	- Hewlett-Packard call it Source-Port filtering or port-isolation.
1956	- Ericsson call it MAC-Forced Forwarding (RFC Draft).
1957
1958proxy_delay - INTEGER
1959	Delay proxy response.
1960
1961	Delay response to a neighbor solicitation when proxy_arp
1962	or proxy_ndp is enabled. A random value between [0, proxy_delay)
1963	will be chosen, setting to zero means reply with no delay.
1964	Value in jiffies. Defaults to 80.
1965
1966shared_media - BOOLEAN
1967	Send (router) or accept (host) RFC1620 shared media redirects.
1968	Overrides secure_redirects.
1969
1970	shared_media for the interface will be enabled if at least one of
1971	conf/{all,interface}/shared_media is set to TRUE,
1972	it will be disabled otherwise
1973
1974	default TRUE
1975
1976secure_redirects - BOOLEAN
1977	Accept ICMP redirect messages only to gateways listed in the
1978	interface's current gateway list. Even if disabled, RFC1122 redirect
1979	rules still apply.
1980
1981	Overridden by shared_media.
1982
1983	secure_redirects for the interface will be enabled if at least one of
1984	conf/{all,interface}/secure_redirects is set to TRUE,
1985	it will be disabled otherwise
1986
1987	default TRUE
1988
1989send_redirects - BOOLEAN
1990	Send redirects, if router.
1991
1992	send_redirects for the interface will be enabled if at least one of
1993	conf/{all,interface}/send_redirects is set to TRUE,
1994	it will be disabled otherwise
1995
1996	Default: TRUE
1997
1998bootp_relay - BOOLEAN
1999	Accept packets with source address 0.b.c.d that are not destined
2000	for this host as local ones. It is assumed that a
2001	BOOTP relay daemon will catch and forward such packets.
2002	conf/all/bootp_relay must also be set to TRUE to enable BOOTP relay
2003	for the interface
2004
2005	default FALSE
2006
2007	Not Implemented Yet.
2008
2009accept_source_route - BOOLEAN
2010	Accept packets with SRR option.
2011	conf/all/accept_source_route must also be set to TRUE to accept packets
2012	with SRR option on the interface
2013
2014	default
2015
2016		- TRUE (router)
2017		- FALSE (host)
2018
2019accept_local - BOOLEAN
2020	Accept packets with local source addresses. In combination with
2021	suitable routing, this can be used to direct packets between two
2022	local interfaces over the wire and have them accepted properly.
2023	default FALSE
2024
2025route_localnet - BOOLEAN
2026	Do not consider loopback addresses as martian source or destination
2027	while routing. This enables the use of 127/8 for local routing purposes.
2028
2029	default FALSE
2030
2031rp_filter - INTEGER
2032	- 0 - No source validation.
2033	- 1 - Strict mode as defined in RFC3704 Strict Reverse Path
2034	  Each incoming packet is tested against the FIB and if the interface
2035	  is not the best reverse path the packet check will fail.
2036	  By default failed packets are discarded.
2037	- 2 - Loose mode as defined in RFC3704 Loose Reverse Path
2038	  Each incoming packet's source address is also tested against the FIB
2039	  and if the source address is not reachable via any interface
2040	  the packet check will fail.
2041
2042	Current recommended practice in RFC3704 is to enable strict mode
2043	to prevent IP spoofing from DDoS attacks. If using asymmetric routing
2044	or other complicated routing, then loose mode is recommended.
2045
2046	The max value from conf/{all,interface}/rp_filter is used
2047	when doing source validation on the {interface}.
2048
2049	Default value is 0. Note that some distributions enable it
2050	in startup scripts.
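The "max value" rule for rp_filter can be illustrated as follows. The names ``effective_rp_filter`` and ``RP_FILTER_MODES`` are hypothetical and exist only for this sketch.

```python
RP_FILTER_MODES = {0: "no source validation", 1: "strict", 2: "loose"}


def effective_rp_filter(conf_all, conf_iface):
    """Return the mode applied on an interface: the larger of the two values."""
    return max(conf_all, conf_iface)


# Enabling strict mode on one interface while conf/all is 0.
print(RP_FILTER_MODES[effective_rp_filter(0, 1)])
# With conf/all set to 2, a per-interface value of 1 still yields loose mode.
print(RP_FILTER_MODES[effective_rp_filter(2, 1)])
```

One consequence of taking the maximum is that a per-interface value can only raise the mode selected by conf/all, never lower it.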
2051
2052src_valid_mark - BOOLEAN
2053	- 0 - The fwmark of the packet is not included in reverse path
2054	  route lookup.  This allows for asymmetric routing configurations
2055	  utilizing the fwmark in only one direction, e.g., transparent
2056	  proxying.
2057
2058	- 1 - The fwmark of the packet is included in reverse path route
2059	  lookup.  This permits rp_filter to function when the fwmark is
2060	  used for routing traffic in both directions.
2061
2062	This setting also affects the utilization of fwmark when
2063	performing source address selection for ICMP replies, or
2064	determining addresses stored for the IPOPT_TS_TSANDADDR and
2065	IPOPT_RR IP options.
2066
2067	The max value from conf/{all,interface}/src_valid_mark is used.
2068
2069	Default value is 0.
2070
2071arp_filter - BOOLEAN
2072	- 1 - Allows you to have multiple network interfaces on the same
2073	  subnet, and have the ARPs for each interface be answered
2074	  based on whether or not the kernel would route a packet from
2075	  the ARP'd IP out that interface (therefore you must use source
2076	  based routing for this to work). In other words it allows control
2077	  of which cards (usually 1) will respond to an arp request.
2078
2079	- 0 - (default) The kernel can respond to arp requests with addresses
2080	  from other interfaces. This may seem wrong but it usually makes
2081	  sense, because it increases the chance of successful communication.
2082	  IP addresses are owned by the complete host on Linux, not by
2083	  particular interfaces. Only for more complex setups like load-
2084	  balancing, does this behaviour cause problems.
2085
2086	arp_filter for the interface will be enabled if at least one of
2087	conf/{all,interface}/arp_filter is set to TRUE,
2088	it will be disabled otherwise
2089
2090arp_announce - INTEGER
2091	Define different restriction levels for announcing the local
2092	source IP address from IP packets in ARP requests sent on
2093	interface:
2094
2095	- 0 - (default) Use any local address, configured on any interface
2096	- 1 - Try to avoid local addresses that are not in the target's
2097	  subnet for this interface. This mode is useful when target
2098	  hosts reachable via this interface require the source IP
2099	  address in ARP requests to be part of their logical network
2100	  configured on the receiving interface. When we generate the
2101	  request we will check all our subnets that include the
2102	  target IP and will preserve the source address if it is from
2103	  such subnet. If there is no such subnet we select source
2104	  address according to the rules for level 2.
2105	- 2 - Always use the best local address for this target.
2106	  In this mode we ignore the source address in the IP packet
2107	  and try to select local address that we prefer for talks with
2108	  the target host. Such local address is selected by looking
2109	  for primary IP addresses on all our subnets on the outgoing
2110	  interface that include the target IP address. If no suitable
2111	  local address is found we select the first local address
2112	  we have on the outgoing interface or on all other interfaces,
2113	  in the hope that we will receive a reply to our request,
2114	  which sometimes happens regardless of the source IP address we announce.
2115
2116	The max value from conf/{all,interface}/arp_announce is used.
2117
2118	Increasing the restriction level gives a better chance of
2119	receiving an answer from the resolved target, while decreasing
2120	the level announces more valid sender information.
2121
2122arp_ignore - INTEGER
2123	Define different modes for sending replies in response to
2124	received ARP requests that resolve local target IP addresses:
2125
2126	- 0 - (default): reply for any local target IP address, configured
2127	  on any interface
2128	- 1 - reply only if the target IP address is a local address
2129	  configured on the incoming interface
2130	- 2 - reply only if the target IP address is a local address
2131	  configured on the incoming interface and both it and the
2132	  sender's IP address are part of the same subnet on this interface
2133	- 3 - do not reply for local addresses configured with scope host;
2134	  only resolutions for global and link addresses are replied to
2135	- 4-7 - reserved
2136	- 8 - do not reply for any local addresses
2137
2138	The max value from conf/{all,interface}/arp_ignore is used
2139	when ARP request is received on the {interface}
2140
2141arp_notify - BOOLEAN
2142	Define mode for notification of address and device changes.
2143
2144	 ==  ==========================================================
2145	  0  (default): do nothing
2146	  1  Generate gratuitous arp requests when device is brought up
2147	     or hardware address changes.
2148	 ==  ==========================================================
2149
2150arp_accept - INTEGER
2151	Define behavior for accepting gratuitous ARP (garp) frames from devices
2152	that are not already present in the ARP table:
2153
2154	- 0 - don't create new entries in the ARP table
2155	- 1 - create new entries in the ARP table
2156	- 2 - create new entries only if the source IP address is in the same
2157	  subnet as an address configured on the interface that received the
2158	  garp message.
2159
2160	Both reply-type and request-type gratuitous ARP frames will trigger
2161	an update of the ARP table when this setting is on.
2162
2163	If the ARP table already contains the IP address of the
2164	gratuitous ARP frame, the ARP table will be updated regardless
2165	of whether this setting is on or off.
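The three arp_accept modes can be sketched as a decision function for senders that are not yet in the ARP table. This is a model of the documented behavior only: the function name ``garp_creates_entry`` is hypothetical, and the addresses come from documentation-reserved ranges.

```python
import ipaddress


def garp_creates_entry(arp_accept, sender_ip, iface_networks):
    """Decide whether a gratuitous ARP from an unknown sender adds an entry.

    iface_networks lists the addresses configured on the receiving
    interface in CIDR form, e.g. ["192.0.2.1/24"].
    """
    if arp_accept == 1:
        return True
    if arp_accept == 2:
        ip = ipaddress.ip_address(sender_ip)
        # strict=False accepts host addresses such as 192.0.2.1/24.
        return any(
            ip in ipaddress.ip_network(net, strict=False) for net in iface_networks
        )
    return False


print(garp_creates_entry(2, "192.0.2.7", ["192.0.2.1/24"]))
print(garp_creates_entry(2, "198.51.100.7", ["192.0.2.1/24"]))
```

Entries that already exist are outside the scope of this check, since they are updated whatever the setting.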
2166
2167arp_evict_nocarrier - BOOLEAN
2168	Clears the ARP cache on NOCARRIER events. This option is important for
2169	wireless devices where the ARP cache should not be cleared when roaming
2170	between access points on the same network. In most cases this should
2171	remain as the default (1).
2172
2173	Possible values:
2174
2175	- 0 (disabled) - Do not clear ARP cache on NOCARRIER events
2176	- 1 (enabled)  - Clear the ARP cache on NOCARRIER events
2177
2178	Default: 1 (enabled)
2179
2180mcast_solicit - INTEGER
2181	The maximum number of multicast probes in INCOMPLETE state,
2182	when the associated hardware address is unknown.  Defaults
2183	to 3.
2184
2185ucast_solicit - INTEGER
2186	The maximum number of unicast probes in PROBE state, when
2187	the hardware address is being reconfirmed.  Defaults to 3.
2188
2189app_solicit - INTEGER
2190	The maximum number of probes to send to the user space ARP daemon
2191	via netlink before dropping back to multicast probes (see
2192	mcast_resolicit).  Defaults to 0.
2193
2194mcast_resolicit - INTEGER
2195	The maximum number of multicast probes after unicast and
2196	app probes in PROBE state.  Defaults to 0.
2197
2198disable_policy - BOOLEAN
2199	Disable IPSEC policy (SPD) for this interface
2200
2201	Possible values:
2202
2203	- 0 (disabled)
2204	- 1 (enabled)
2205
2206	Default: 0 (disabled)
2207
2208disable_xfrm - BOOLEAN
2209	Disable IPSEC encryption on this interface, whatever the policy
2210
2211	Possible values:
2212
2213	- 0 (disabled)
2214	- 1 (enabled)
2215
2216	Default: 0 (disabled)
2217
2218igmpv2_unsolicited_report_interval - INTEGER
2219	The interval in milliseconds in which the next unsolicited
2220	IGMPv1 or IGMPv2 report retransmit will take place.
2221
2222	Default: 10000 (10 seconds)
2223
2224igmpv3_unsolicited_report_interval - INTEGER
2225	The interval in milliseconds in which the next unsolicited
2226	IGMPv3 report retransmit will take place.
2227
2228	Default: 1000 (1 second)
2229
2230ignore_routes_with_linkdown - BOOLEAN
2231	Ignore routes whose link is down when performing a FIB lookup.
2232
2233	Possible values:
2234
2235	- 0 (disabled)
2236	- 1 (enabled)
2237
2238	Default: 0 (disabled)
2239
2240promote_secondaries - BOOLEAN
2241	When a primary IP address is removed from this interface
2242	promote a corresponding secondary IP address instead of
2243	removing all the corresponding secondary IP addresses.
2244
2245	Possible values:
2246
2247	- 0 (disabled)
2248	- 1 (enabled)
2249
2250	Default: 0 (disabled)
2251
2252drop_unicast_in_l2_multicast - BOOLEAN
2253	Drop any unicast IP packets that are received in link-layer
2254	multicast (or broadcast) frames.
2255
2256	This behavior (for multicast) is actually a SHOULD in RFC
2257	1122, but is disabled by default for compatibility reasons.
2258
2259	Possible values:
2260
2261	- 0 (disabled)
2262	- 1 (enabled)
2263
2264	Default: 0 (disabled)
2265
2266drop_gratuitous_arp - BOOLEAN
2267	Drop all gratuitous ARP frames, for example if there's a known
2268	good ARP proxy on the network and such frames need not be used
2269	(or in the case of 802.11, must not be used to prevent attacks.)
2270
2271	Possible values:
2272
2273	- 0 (disabled)
2274	- 1 (enabled)
2275
2276	Default: 0 (disabled)
2277
2278
2279tag - INTEGER
2280	Allows you to write a number, which can be used as required.
2281
2282	Default value is 0.
2283
2284xfrm4_gc_thresh - INTEGER
2285	(Obsolete since linux-4.14)
2286	The threshold at which we will start garbage collecting for IPv4
2287	destination cache entries.  At twice this value the system will
2288	refuse new allocations.
2289
2290igmp_link_local_mcast_reports - BOOLEAN
2291	Enable IGMP reports for link local multicast groups in the
2292	224.0.0.X range.
2293
2294	Default TRUE
2295
2296Alexey Kuznetsov.
2297kuznet@ms2.inr.ac.ru
2298
2299Updated by:
2300
2301- Andi Kleen
2302  ak@muc.de
2303- Nicolas Delon
2304  delon.nicolas@wanadoo.fr
2305
2306
2307
2308
2309/proc/sys/net/ipv6/* Variables
2310==============================
2311
2312IPv6 has no global variables such as tcp_*.  tcp_* settings under ipv4/ also
2313apply to IPv6 [XXX?].
2314
2315bindv6only - BOOLEAN
2316	Default value for IPV6_V6ONLY socket option,
2317	which restricts use of the IPv6 socket to IPv6 communication
2318	only.
2319
2320	Possible values:
2321
2322	- 0 (disabled) - enable IPv4-mapped address feature
2323	- 1 (enabled)  - disable IPv4-mapped address feature
2324
2325	Default: 0 (disabled)
2326
2327flowlabel_consistency - BOOLEAN
2328	Protect the consistency (and uniqueness) of flow label.
2329	You have to disable it to use IPV6_FL_F_REFLECT flag on the
2330	flow label manager.
2331
2332	Possible values:
2333
2334	- 0 (disabled)
2335	- 1 (enabled)
2336
2337	Default: 1 (enabled)
2338
2339auto_flowlabels - INTEGER
2340	Automatically generate flow labels based on a flow hash of the
2341	packet. This allows intermediate devices, such as routers, to
2342	identify packet flows for mechanisms like Equal Cost Multipath
2343	Routing (see RFC 6438).
2344
2345	=  ===========================================================
2346	0  automatic flow labels are completely disabled
2347	1  automatic flow labels are enabled by default, they can be
2348	   disabled on a per socket basis using the IPV6_AUTOFLOWLABEL
2349	   socket option
2350	2  automatic flow labels are allowed, they may be enabled on a
2351	   per socket basis using the IPV6_AUTOFLOWLABEL socket option
2352	3  automatic flow labels are enabled and enforced, they cannot
2353	   be disabled by the socket option
2354	=  ===========================================================
2355
2356	Default: 1
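The table above can be read as the following decision function. It is a sketch under stated assumptions: ``uses_auto_flowlabel`` is a hypothetical name, and ``socket_option`` stands for the IPV6_AUTOFLOWLABEL value an application set, with ``None`` meaning it was never set.

```python
def uses_auto_flowlabel(sysctl_mode, socket_option=None):
    """Return True if an automatic flow label would be generated."""
    if sysctl_mode not in (0, 1, 2, 3):
        raise ValueError("auto_flowlabels must be 0, 1, 2 or 3")
    if sysctl_mode == 0:
        return False          # completely disabled
    if sysctl_mode == 3:
        return True           # enforced, socket option is ignored
    if socket_option is not None:
        return bool(socket_option)
    return sysctl_mode == 1   # mode 1 defaults to on, mode 2 to off


print(uses_auto_flowlabel(1))
print(uses_auto_flowlabel(2, socket_option=True))
```

Modes 1 and 2 differ only in the default that applies when the application does not set the socket option.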
2357
2358flowlabel_state_ranges - BOOLEAN
2359	Split the flow label number space into two ranges. 0-0x7FFFF is
2360	reserved for the IPv6 flow manager facility, 0x80000-0xFFFFF
2361	is reserved for stateless flow labels as described in RFC6437.
2362
2363	Possible values:
2364
2365	- 0 (disabled)
2366	- 1 (enabled)
2367
2368	Default: 1 (enabled)
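The split of the 20-bit flow label space can be illustrated with a small classifier. The function name ``flowlabel_range`` and the returned strings are hypothetical.

```python
def flowlabel_range(label):
    """Classify a flow label according to the two ranges described above."""
    if not 0 <= label <= 0xFFFFF:
        raise ValueError("flow labels are 20 bits wide")
    return "flow manager" if label <= 0x7FFFF else "stateless"


print(flowlabel_range(0x12345))
print(flowlabel_range(0x80000))
```

The boundary is the most significant bit of the label: labels with that bit clear belong to the flow manager range, labels with it set to the stateless range.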
2369
2370
2371flowlabel_reflect - INTEGER
2372	Control flow label reflection. Needed for Path MTU
2373	Discovery to work with Equal Cost Multipath Routing in anycast
2374	environments. See RFC 7690 and:
2375	https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01
2376
2377	This is a bitmask.
2378
2379	- 1: enabled for established flows
2380
2381	  Note that this prevents automatic flowlabel changes, as done
2382	  in "tcp: change IPv6 flow-label upon receiving spurious retransmission"
2383	  and "tcp: Change txhash on every SYN and RTO retransmit"
2384
2385	- 2: enabled for TCP RESET packets (no active listener)
2386	  If set, a RST packet sent in response to a SYN packet on a closed
2387	  port will reflect the incoming flow label.
2388
2389	- 4: enabled for ICMPv6 echo reply messages.
2390
2391	Default: 0
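Because flowlabel_reflect is a bitmask, several cases can be enabled at once by adding their values. The sketch below decodes a value into the cases listed above; the names are hypothetical.

```python
FLOWLABEL_REFLECT_BITS = {
    1: "established flows",
    2: "TCP RESET packets (no active listener)",
    4: "ICMPv6 echo reply messages",
}


def decode_flowlabel_reflect(value):
    """List the reflection cases enabled by a flowlabel_reflect value."""
    return [desc for bit, desc in FLOWLABEL_REFLECT_BITS.items() if value & bit]


# 5 = 1 + 4: established flows and ICMPv6 echo replies.
print(decode_flowlabel_reflect(5))
```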
2392
2393fib_multipath_hash_policy - INTEGER
2394	Controls which hash policy to use for multipath routes.
2395
2396	Default: 0 (Layer 3)
2397
2398	Possible values:
2399
2400	- 0 - Layer 3 (source and destination addresses plus flow label)
2401	- 1 - Layer 4 (standard 5-tuple)
2402	- 2 - Layer 3 or inner Layer 3 if present
2403	- 3 - Custom multipath hash. Fields used for multipath hash calculation
2404	  are determined by fib_multipath_hash_fields sysctl
2405
2406fib_multipath_hash_fields - UNSIGNED INTEGER
2407	When fib_multipath_hash_policy is set to 3 (custom multipath hash), the
2408	fields used for multipath hash calculation are determined by this
2409	sysctl.
2410
2411	This value is a bitmask which enables various fields for multipath hash
2412	calculation.
2413
2414	Possible fields are:
2415
2416	====== ============================
2417	0x0001 Source IP address
2418	0x0002 Destination IP address
2419	0x0004 IP protocol
2420	0x0008 Flow Label
2421	0x0010 Source port
2422	0x0020 Destination port
2423	0x0040 Inner source IP address
2424	0x0080 Inner destination IP address
2425	0x0100 Inner IP protocol
2426	0x0200 Inner Flow Label
2427	0x0400 Inner source port
2428	0x0800 Inner destination port
2429	====== ============================
2430
2431	Default: 0x0007 (source IP, destination IP and IP protocol)
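A custom hash field selection is the bitwise OR of the values in the table above. The following sketch composes such a mask; the short field names and the function ``hash_fields_mask`` are hypothetical labels.

```python
HASH_FIELDS = {
    "src_ip": 0x0001,
    "dst_ip": 0x0002,
    "ip_proto": 0x0004,
    "flowlabel": 0x0008,
    "src_port": 0x0010,
    "dst_port": 0x0020,
    "inner_src_ip": 0x0040,
    "inner_dst_ip": 0x0080,
    "inner_ip_proto": 0x0100,
    "inner_flowlabel": 0x0200,
    "inner_src_port": 0x0400,
    "inner_dst_port": 0x0800,
}


def hash_fields_mask(*names):
    """Combine the named fields into a fib_multipath_hash_fields value."""
    mask = 0
    for name in names:
        mask |= HASH_FIELDS[name]
    return mask


# The documented default: source IP, destination IP and IP protocol.
print(hex(hash_fields_mask("src_ip", "dst_ip", "ip_proto")))
# Adding both ports gives a 5-tuple style selection.
print(hex(hash_fields_mask("src_ip", "dst_ip", "ip_proto", "src_port", "dst_port")))
```

The resulting value only takes effect when fib_multipath_hash_policy is set to 3.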
2432
2433anycast_src_echo_reply - BOOLEAN
2434	Controls the use of anycast addresses as source addresses for ICMPv6
2435	echo reply
2436
2437	Possible values:
2438
2439	- 0 (disabled)
2440	- 1 (enabled)
2441
2442	Default: 0 (disabled)
2443
2444
2445idgen_delay - INTEGER
2446	Controls the delay in seconds before retrying stable
2447	privacy address generation if a DAD conflict is
2448	detected.
2449
2450	Default: 1 (as specified in RFC7217)
2451
2452idgen_retries - INTEGER
2453	Controls the number of retries to generate a stable privacy
2454	address if a DAD conflict is detected.
2455
2456	Default: 3 (as specified in RFC7217)
2457
2458mld_qrv - INTEGER
2459	Controls the MLD query robustness variable (see RFC3810 9.1).
2460
2461	Default: 2 (as specified by RFC3810 9.1)
2462
2463	Minimum: 1 (as specified by RFC6636 4.5)
2464
2465max_dst_opts_number - INTEGER
2466	Maximum number of non-padding TLVs allowed in a Destination
2467	options extension header. If this value is less than zero
2468	then unknown options are disallowed and the number of known
2469	TLVs allowed is the absolute value of this number.
2470
2471	Default: 8
2472
2473max_hbh_opts_number - INTEGER
2474	Maximum number of non-padding TLVs allowed in a Hop-by-Hop
2475	options extension header. If this value is less than zero
2476	then unknown options are disallowed and the number of known
2477	TLVs allowed is the absolute value of this number.
2478
2479	Default: 8
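The sign convention shared by max_dst_opts_number and max_hbh_opts_number can be sketched as follows. The function name ``tlv_limits`` is hypothetical, and it only interprets the value; it does not parse extension headers.

```python
def tlv_limits(max_opts_number):
    """Return (tlv_limit, unknown_options_allowed) for a sysctl value.

    A negative value disallows unknown options and uses its absolute
    value as the limit on known TLVs.
    """
    if max_opts_number < 0:
        return -max_opts_number, False
    return max_opts_number, True


print(tlv_limits(8))
print(tlv_limits(-4))
```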
2480
2481max_dst_opts_length - INTEGER
2482	Maximum length allowed for a Destination options extension
2483	header.
2484
2485	Default: INT_MAX (unlimited)
2486
2487max_hbh_length - INTEGER
2488	Maximum length allowed for a Hop-by-Hop options extension
2489	header.
2490
2491	Default: INT_MAX (unlimited)
2492
2493skip_notify_on_dev_down - BOOLEAN
2494	Controls whether an RTM_DELROUTE message is generated for routes
2495	removed when a device is taken down or deleted. IPv4 does not
2496	generate this message; IPv6 does by default. Setting this sysctl
2497	to true skips the message, making IPv4 and IPv6 on par in relying
2498	on userspace caches to track link events and evict routes.
2499
2500	Possible values:
2501
2502	- 0 (disabled) - generate the message
2503	- 1 (enabled)  - skip generating the message
2504
2505	Default: 0 (disabled)
2506
2507nexthop_compat_mode - BOOLEAN
2508	New nexthop API provides a means for managing nexthops independent of
2509	prefixes. Backwards compatibility with old route format is enabled by
2510	default which means route dumps and notifications contain the new
2511	nexthop attribute but also the full, expanded nexthop definition.
2512	Further, updates or deletes of a nexthop configuration generate route
2513	notifications for each fib entry using the nexthop. Once a system
2514	understands the new API, this sysctl can be disabled to achieve full
2515	performance benefits of the new API by disabling the nexthop expansion
2516	and extraneous notifications.
2517
2518	Note that as a backward-compatible mode, dumping of modern features
2519	might be incomplete or wrong. For example, resilient groups will not be
2520	shown as such, but rather as just a list of next hops. Also weights that
2521	do not fit into 8 bits will show incorrectly.
2522
2523	Default: true (backward compat mode)
2524
2525fib_notify_on_flag_change - INTEGER
2526        Whether to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/
2527        RTM_F_TRAP/RTM_F_OFFLOAD_FAILED flags are changed.
2528
2529        After installing a route to the kernel, user space receives an
2530        acknowledgment, which means the route was installed in the kernel,
2531        but not necessarily in hardware.
2532        It is also possible for a route already installed in hardware to change
2533        its action and therefore its flags. For example, a host route that is
2534        trapping packets can be "promoted" to perform decapsulation following
2535        the installation of an IPinIP/VXLAN tunnel.
2536        The notifications will indicate to user-space the state of the route.
2537
2538        Default: 0 (Do not emit notifications.)
2539
2540        Possible values:
2541
2542        - 0 - Do not emit notifications.
2543        - 1 - Emit notifications.
2544        - 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.
2545
2546ioam6_id - INTEGER
2547        Define the IOAM id of this node. Uses only 24 bits out of 32 in total.
2548
2549        Possible value range:
2550
2551        - Min: 0
2552        - Max: 0xFFFFFF
2553
2554        Default: 0xFFFFFF
2555
2556ioam6_id_wide - LONG INTEGER
2557        Define the wide IOAM id of this node. Uses only 56 bits out of 64 in
2558        total. Can be different from ioam6_id.
2559
2560        Possible value range:
2561
2562        - Min: 0
2563        - Max: 0xFFFFFFFFFFFFFF
2564
2565        Default: 0xFFFFFFFFFFFFFF
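The value ranges of ioam6_id and ioam6_id_wide follow directly from their bit widths. A minimal range check, with hypothetical names, might look like this:

```python
IOAM6_ID_MAX = (1 << 24) - 1        # 24 usable bits
IOAM6_ID_WIDE_MAX = (1 << 56) - 1   # 56 usable bits


def valid_ioam6_id(value, wide=False):
    """Check a candidate IOAM id against the documented range."""
    limit = IOAM6_ID_WIDE_MAX if wide else IOAM6_ID_MAX
    return 0 <= value <= limit


print(hex(IOAM6_ID_MAX), hex(IOAM6_ID_WIDE_MAX))
print(valid_ioam6_id(0x1000000), valid_ioam6_id(0x1000000, wide=True))
```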
2566
2567IPv6 Fragmentation:
2568
2569ip6frag_high_thresh - INTEGER
2570	Maximum memory used to reassemble IPv6 fragments. When
2571	ip6frag_high_thresh bytes of memory is allocated for this purpose,
2572	the fragment handler will toss packets until ip6frag_low_thresh
2573	is reached.
2574
2575ip6frag_low_thresh - INTEGER
2576	See ip6frag_high_thresh
2577
2578ip6frag_time - INTEGER
2579	Time in seconds to keep an IPv6 fragment in memory.
2580
2581``conf/default/*``:
2582	Change the interface-specific default settings.
2583
2584	These settings are used when new interfaces are created.
2585
2586
2587``conf/all/*``:
2588	Change all the interface-specific settings.
2589
2590	[XXX:  Other special features than forwarding?]
2591
2592conf/all/disable_ipv6 - BOOLEAN
2593	Changing this value is the same as changing ``conf/default/disable_ipv6``
2594	setting and also all per-interface ``disable_ipv6`` settings to the same
2595	value.
2596
	Reading this value does not have any particular meaning. It does not say
	whether IPv6 support is enabled or disabled. The returned value can be 1
	even when some interface has ``disable_ipv6`` set to 0 and has
	configured IPv6 addresses.
2601
2602conf/all/forwarding - BOOLEAN
2603	Enable global IPv6 forwarding between all interfaces.
2604
2605	IPv4 and IPv6 work differently here; the ``force_forwarding`` flag must
2606	be used to control which interfaces may forward packets.
2607
2608	This also sets all interfaces' Host/Router setting
2609	'forwarding' to the specified value.  See below for details.
2610
	This is referred to as global forwarding.
2612
2613proxy_ndp - BOOLEAN
2614	Do proxy ndp.
2615
2616	Possible values:
2617
2618	- 0 (disabled)
2619	- 1 (enabled)
2620
2621	Default: 0 (disabled)
2622
2623force_forwarding - BOOLEAN
	Enable forwarding on this interface only, regardless of the setting of
	``conf/all/forwarding``. When ``conf/all/forwarding`` is set to 0,
	the ``force_forwarding`` flag is reset on all interfaces.
2627
2628fwmark_reflect - BOOLEAN
	Controls the fwmark of kernel-generated IPv6 reply packets that are not
	associated with a socket (for example, TCP RSTs or ICMPv6 echo replies).
2631	If disabled, these packets have a fwmark of zero. If enabled, they have the
2632	fwmark of the packet they are replying to.
2633
2634	Possible values:
2635
2636	- 0 (disabled)
2637	- 1 (enabled)
2638
2639	Default: 0 (disabled)
2640
2641``conf/interface/*``:
2642	Change special settings per interface.
2643
2644	The functional behaviour for certain settings is different
2645	depending on whether local forwarding is enabled or not.
2646
2647accept_ra - INTEGER
2648	Accept Router Advertisements; autoconfigure using them.
2649
2650	It also determines whether or not to transmit Router
2651	Solicitations. If and only if the functional setting is to
2652	accept Router Advertisements, Router Solicitations will be
2653	transmitted.
2654
2655	Possible values are:
2656
2657		==  ===========================================================
2658		 0  Do not accept Router Advertisements.
2659		 1  Accept Router Advertisements if forwarding is disabled.
2660		 2  Overrule forwarding behaviour. Accept Router Advertisements
2661		    even if forwarding is enabled.
2662		==  ===========================================================
2663
2664	Functional default:
2665
2666		- enabled if local forwarding is disabled.
2667		- disabled if local forwarding is enabled.
2668
2669accept_ra_defrtr - BOOLEAN
2670	Learn default router in Router Advertisement.
2671
2672	Functional default:
2673
2674		- enabled if accept_ra is enabled.
2675		- disabled if accept_ra is disabled.
2676
2677ra_defrtr_metric - UNSIGNED INTEGER
2678	Route metric for default route learned in Router Advertisement. This value
2679	will be assigned as metric for the default route learned via IPv6 Router
	Advertisement. Takes effect only if accept_ra_defrtr is enabled.
2681
	Possible values: 1 to 0xFFFFFFFF

	Default: IP6_RT_PRIO_USER, i.e. 1024.
2686
2687accept_ra_from_local - BOOLEAN
2688	Accept RA with source-address that is found on local machine
2689	if the RA is otherwise proper and able to be accepted.
2690
	Default is to NOT accept these as it may be an unintended
	network loop.
2693
2694	Functional default:
2695
2696	   - enabled if accept_ra_from_local is enabled
2697	     on a specific interface.
2698	   - disabled if accept_ra_from_local is disabled
2699	     on a specific interface.
2700
2701accept_ra_min_hop_limit - INTEGER
2702	Minimum hop limit Information in Router Advertisement.
2703
2704	Hop limit Information in Router Advertisement less than this
2705	variable shall be ignored.
2706
2707	Default: 1
2708
2709accept_ra_min_lft - INTEGER
2710	Minimum acceptable lifetime value in Router Advertisement.
2711
2712	RA sections with a lifetime less than this value shall be
2713	ignored. Zero lifetimes stay unaffected.
2714
2715	Default: 0
2716
2717accept_ra_pinfo - BOOLEAN
2718	Learn Prefix Information in Router Advertisement.
2719
2720	Functional default:
2721
2722		- enabled if accept_ra is enabled.
2723		- disabled if accept_ra is disabled.
2724
2725ra_honor_pio_life - BOOLEAN
2726	Whether to use RFC4862 Section 5.5.3e to determine the valid
2727	lifetime of an address matching a prefix sent in a Router
2728	Advertisement Prefix Information Option.
2729
2730	Possible values:
2731
2732	- 0 (disabled) - RFC4862 section 5.5.3e is used to determine
2733	  the valid lifetime of the address.
2734	- 1 (enabled)  - the PIO valid lifetime will always be honored.
2735
2736	Default: 0 (disabled)
2737
2738ra_honor_pio_pflag - BOOLEAN
2739	The Prefix Information Option P-flag indicates the network can
2740	allocate a unique IPv6 prefix per client using DHCPv6-PD.
2741	This sysctl can be enabled when a userspace DHCPv6-PD client
2742	is running to cause the P-flag to take effect: i.e. the
2743	P-flag suppresses any effects of the A-flag within the same
2744	PIO. For a given PIO, P=1 and A=1 is treated as A=0.
2745
2746	Possible values:
2747
2748	- 0 (disabled) - the P-flag is ignored.
2749	- 1 (enabled)  - the P-flag will disable SLAAC autoconfiguration
2750	  for the given Prefix Information Option.
2751
2752	Default: 0 (disabled)
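
The interaction between the A-flag, the P-flag and this sysctl can be written as a small truth function. This is a sketch of the rule stated above, not kernel code; the function name is hypothetical.

```python
def pio_triggers_slaac(a_flag: bool, p_flag: bool, ra_honor_pio_pflag: bool) -> bool:
    """Return True if a Prefix Information Option should lead to SLAAC."""
    if ra_honor_pio_pflag and p_flag:
        # With the sysctl enabled, P=1 suppresses the A-flag:
        # P=1 and A=1 is treated as A=0.
        return False
    # Otherwise the P-flag is ignored and only the A-flag counts.
    return a_flag
```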
2753
2754accept_ra_rt_info_min_plen - INTEGER
2755	Minimum prefix length of Route Information in RA.
2756
	Route Information with a prefix length smaller than this variable
	shall be ignored.
2759
2760	Functional default:
2761
2762		* 0 if accept_ra_rtr_pref is enabled.
2763		* -1 if accept_ra_rtr_pref is disabled.
2764
2765accept_ra_rt_info_max_plen - INTEGER
2766	Maximum prefix length of Route Information in RA.
2767
	Route Information with a prefix length larger than this variable
	shall be ignored.
2770
2771	Functional default:
2772
2773		* 0 if accept_ra_rtr_pref is enabled.
2774		* -1 if accept_ra_rtr_pref is disabled.
2775
2776accept_ra_rtr_pref - BOOLEAN
2777	Accept Router Preference in RA.
2778
2779	Functional default:
2780
2781		- enabled if accept_ra is enabled.
2782		- disabled if accept_ra is disabled.
2783
2784accept_ra_mtu - BOOLEAN
2785	Apply the MTU value specified in RA option 5 (RFC4861). If
2786	disabled, the MTU specified in the RA will be ignored.
2787
2788	Functional default:
2789
2790		- enabled if accept_ra is enabled.
2791		- disabled if accept_ra is disabled.
2792
2793accept_redirects - BOOLEAN
2794	Accept Redirects.
2795
2796	Functional default:
2797
2798		- enabled if local forwarding is disabled.
2799		- disabled if local forwarding is enabled.
2800
2801accept_source_route - INTEGER
2802	Accept source routing (routing extension header).
2803
2804	- >= 0: Accept only routing header type 2.
2805	- < 0: Do not accept routing header.
2806
2807	Default: 0
2808
2809autoconf - BOOLEAN
2810	Autoconfigure addresses using Prefix Information in Router
2811	Advertisements.
2812
2813	Functional default:
2814
2815		- enabled if accept_ra_pinfo is enabled.
2816		- disabled if accept_ra_pinfo is disabled.
2817
2818dad_transmits - INTEGER
	The number of Duplicate Address Detection probes to send.
2820
2821	Default: 1
2822
2823forwarding - INTEGER
2824	Configure interface-specific Host/Router behaviour.
2825
2826	.. note::
2827
2828	   It is recommended to have the same setting on all
2829	   interfaces; mixed router/host scenarios are rather uncommon.
2830
2831	Possible values are:
2832
2833		- 0 Forwarding disabled
2834		- 1 Forwarding enabled
2835
2836	**FALSE (0)**:
2837
2838	By default, Host behaviour is assumed.  This means:
2839
2840	1. IsRouter flag is not set in Neighbour Advertisements.
2841	2. If accept_ra is TRUE (default), transmit Router
2842	   Solicitations.
2843	3. If accept_ra is TRUE (default), accept Router
2844	   Advertisements (and do autoconfiguration).
2845	4. If accept_redirects is TRUE (default), accept Redirects.
2846
2847	**TRUE (1)**:
2848
2849	If local forwarding is enabled, Router behaviour is assumed.
2850	This means exactly the reverse from the above:
2851
2852	1. IsRouter flag is set in Neighbour Advertisements.
2853	2. Router Solicitations are not sent unless accept_ra is 2.
2854	3. Router Advertisements are ignored unless accept_ra is 2.
2855	4. Redirects are ignored.
2856
2857	Default: 0 (disabled) if global forwarding is disabled (default),
2858	otherwise 1 (enabled).
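
The Host and Router behaviour lists above can be summarised in one function. The sketch below only restates the four numbered points; the ``NdBehaviour`` type and ``nd_behaviour`` function are illustrative names, not kernel structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NdBehaviour:
    is_router_flag: bool       # IsRouter flag in Neighbour Advertisements
    send_rs: bool              # transmit Router Solicitations
    accept_ra: bool            # accept Router Advertisements
    accept_redirects: bool     # accept Redirects


def nd_behaviour(forwarding: int, accept_ra: int = 1,
                 accept_redirects: bool = True) -> NdBehaviour:
    """Map the forwarding, accept_ra and accept_redirects settings to behaviour."""
    if forwarding:
        # Router behaviour: RS/RA only when accept_ra is 2, Redirects ignored.
        ra = accept_ra == 2
        return NdBehaviour(True, ra, ra, False)
    # Host behaviour: RS/RA whenever accept_ra is non-zero (1 or 2).
    ra = accept_ra in (1, 2)
    return NdBehaviour(False, ra, ra, accept_redirects)
```

With the defaults, ``nd_behaviour(0)`` describes a host that solicits and accepts RAs, while ``nd_behaviour(1)`` describes a router that does neither.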
2859
2860hop_limit - INTEGER
2861	Default Hop Limit to set.
2862
2863	Default: 64
2864
2865mtu - INTEGER
	Default Maximum Transmission Unit
2867
2868	Default: 1280 (IPv6 required minimum)
2869
2870ip_nonlocal_bind - BOOLEAN
2871	If enabled, allows processes to bind() to non-local IPv6 addresses,
2872	which can be quite useful - but may break some applications.
2873
2874	Possible values:
2875
2876	- 0 (disabled)
2877	- 1 (enabled)
2878
2879	Default: 0 (disabled)
2880
2881router_probe_interval - INTEGER
2882	Minimum interval (in seconds) between Router Probing described
2883	in RFC4191.
2884
2885	Default: 60
2886
2887router_solicitation_delay - INTEGER
2888	Number of seconds to wait after interface is brought up
2889	before sending Router Solicitations.
2890
2891	Default: 1
2892
2893router_solicitation_interval - INTEGER
2894	Number of seconds to wait between Router Solicitations.
2895
2896	Default: 4
2897
2898router_solicitations - INTEGER
2899	Number of Router Solicitations to send until assuming no
2900	routers are present.
2901
2902	Default: 3
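
Taken together, the three router_solicitation* settings describe a simple send schedule. The sketch below assumes a fixed interval and ignores any jitter or backoff the kernel may apply; ``solicitation_times`` is a hypothetical helper.

```python
def solicitation_times(delay: int = 1, interval: int = 4, count: int = 3) -> list:
    """Seconds after interface bring-up at which each Router Solicitation is sent.

    delay    models router_solicitation_delay
    interval models router_solicitation_interval
    count    models router_solicitations
    """
    return [delay + i * interval for i in range(count)]
```

Under these assumptions the defaults give solicitations at 1, 5 and 9 seconds, after which the host assumes no routers are present.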
2903
2904use_oif_addrs_only - BOOLEAN
2905	When enabled, the candidate source addresses for destinations
2906	routed via this interface are restricted to the set of addresses
	configured on this interface (see RFC 6724, section 4).
2908
2909	Possible values:
2910
2911	- 0 (disabled)
2912	- 1 (enabled)
2913
2914	Default: 0 (disabled)
2915
2916use_tempaddr - INTEGER
2917	Preference for Privacy Extensions (RFC3041).
2918
2919	  * <= 0 : disable Privacy Extensions
2920	  * == 1 : enable Privacy Extensions, but prefer public
2921	    addresses over temporary addresses.
2922	  * >  1 : enable Privacy Extensions and prefer temporary
2923	    addresses over public addresses.
2924
2925	Default:
2926
2927		* 0 (for most devices)
2928		* -1 (for point-to-point devices and loopback devices)
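
The three value ranges of use_tempaddr can be decoded as follows. This is an illustrative sketch; the function name and the returned labels are assumptions made for the example.

```python
from typing import Tuple


def tempaddr_policy(use_tempaddr: int) -> Tuple[bool, str]:
    """Return (privacy_extensions_enabled, preferred_address_kind)."""
    if use_tempaddr <= 0:
        # Privacy Extensions disabled: only public addresses are used.
        return (False, "public")
    if use_tempaddr == 1:
        # Temporary addresses exist, but public ones are preferred.
        return (True, "public")
    # Any value greater than 1 prefers temporary addresses.
    return (True, "temporary")
```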
2929
2930temp_valid_lft - INTEGER
	Valid lifetime (in seconds) for temporary addresses. If less than the
2932	minimum required lifetime (typically 5-7 seconds), temporary addresses
2933	will not be created.
2934
2935	Default: 172800 (2 days)
2936
2937temp_prefered_lft - INTEGER
2938	Preferred lifetime (in seconds) for temporary addresses. If
2939	temp_prefered_lft is less than the minimum required lifetime (typically
2940	5-7 seconds), the preferred lifetime is the minimum required. If
2941	temp_prefered_lft is greater than temp_valid_lft, the preferred lifetime
2942	is temp_valid_lft.
2943
2944	Default: 86400 (1 day)
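
The clamping rules for temp_valid_lft and temp_prefered_lft can be expressed as a short function. This is a sketch only: the minimum required lifetime is passed in as ``min_required_lft`` and defaults to 5 seconds here purely as an assumption, since the text above says only that it is typically 5-7 seconds.

```python
from typing import Optional


def effective_preferred_lft(temp_prefered_lft: int, temp_valid_lft: int,
                            min_required_lft: int = 5) -> Optional[int]:
    """Return the preferred lifetime actually used, or None if no
    temporary address would be created at all."""
    if temp_valid_lft < min_required_lft:
        # Valid lifetime too short: temporary addresses are not created.
        return None
    # Too small a preferred lifetime is raised to the minimum required.
    preferred = max(temp_prefered_lft, min_required_lft)
    # The preferred lifetime can never exceed the valid lifetime.
    return min(preferred, temp_valid_lft)
```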
2945
2946keep_addr_on_down - INTEGER
	Keep all IPv6 addresses on an interface down event. If set, static
	global addresses with no expiration time are not flushed.
2949
2950	*   >0 : enabled
2951	*    0 : system default
2952	*   <0 : disabled
2953
2954	Default: 0 (addresses are removed)
2955
2956max_desync_factor - INTEGER
2957	Maximum value for DESYNC_FACTOR, which is a random value
2958	that ensures that clients don't synchronize with each
2959	other and generate new addresses at exactly the same time.
	The value is in seconds.
2961
2962	Default: 600
2963
2964regen_min_advance - INTEGER
2965	How far in advance (in seconds), at minimum, to create a new temporary
2966	address before the current one is deprecated. This value is added to
2967	the amount of time that may be required for duplicate address detection
2968	to determine when to create a new address. Linux permits setting this
2969	value to less than the default of 2 seconds, but a value less than 2
2970	does not conform to RFC 8981.
2971
2972	Default: 2
2973
2974regen_max_retry - INTEGER
	Number of attempts to generate a valid temporary address
	before giving up.
2977
2978	Default: 5
2979
2980max_addresses - INTEGER
2981	Maximum number of autoconfigured addresses per interface.  Setting
2982	to zero disables the limitation.  It is not recommended to set this
2983	value too large (or to zero) because it would be an easy way to
2984	crash the kernel by allowing too many addresses to be created.
2985
2986	Default: 16
2987
2988disable_ipv6 - BOOLEAN
2989	Disable IPv6 operation.  If accept_dad is set to 2, this value
2990	will be dynamically set to TRUE if DAD fails for the link-local
2991	address.
2992
2993	Default: FALSE (enable IPv6 operation)
2994
2995	When this value is changed from 1 to 0 (IPv6 is being enabled),
2996	it will dynamically create a link-local address on the given
2997	interface and start Duplicate Address Detection, if necessary.
2998
2999	When this value is changed from 0 to 1 (IPv6 is being disabled),
3000	it will dynamically delete all addresses and routes on the given
	interface. From now on it will not be possible to add addresses/routes
3002	to the selected interface.
3003
3004accept_dad - INTEGER
3005	Whether to accept DAD (Duplicate Address Detection).
3006
3007	 == ==============================================================
3008	  0  Disable DAD
3009	  1  Enable DAD (default)
3010	  2  Enable DAD, and disable IPv6 operation if MAC-based duplicate
3011	     link-local address has been found.
3012	 == ==============================================================
3013
3014	DAD operation and mode on a given interface will be selected according
3015	to the maximum value of conf/{all,interface}/accept_dad.
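
Since the effective mode is the maximum of the two settings, it can be computed directly. A minimal sketch, with illustrative names:

```python
ACCEPT_DAD_MODES = {
    0: "DAD disabled",
    1: "DAD enabled",
    2: "DAD enabled; IPv6 disabled on MAC-based link-local duplicate",
}


def effective_accept_dad(conf_all: int, conf_interface: int) -> int:
    """Effective accept_dad mode for an interface: the larger of
    conf/all/accept_dad and conf/<interface>/accept_dad."""
    return max(conf_all, conf_interface)
```

One consequence is that setting conf/all/accept_dad to 2 applies mode 2 to every interface, regardless of the per-interface value.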
3016
3017force_tllao - BOOLEAN
3018	Enable sending the target link-layer address option even when
3019	responding to a unicast neighbor solicitation.
3020
3021	Default: FALSE
3022
3023	Quoting from RFC 2461, section 4.4, Target link-layer address:
3024
3025	"The option MUST be included for multicast solicitations in order to
3026	avoid infinite Neighbor Solicitation "recursion" when the peer node
3027	does not have a cache entry to return a Neighbor Advertisements
3028	message.  When responding to unicast solicitations, the option can be
3029	omitted since the sender of the solicitation has the correct link-
3030	layer address; otherwise it would not have be able to send the unicast
3031	solicitation in the first place. However, including the link-layer
3032	address in this case adds little overhead and eliminates a potential
3033	race condition where the sender deletes the cached link-layer address
3034	prior to receiving a response to a previous solicitation."
3035
3036ndisc_notify - BOOLEAN
3037	Define mode for notification of address and device changes.
3038
3039	Possible values:
3040
3041	- 0 (disabled) - do nothing
3042	- 1 (enabled)  - Generate unsolicited neighbour advertisements when device is brought
3043	  up or hardware address changes.
3044
3045	Default: 0 (disabled)
3046
3047ndisc_tclass - INTEGER
3048	The IPv6 Traffic Class to use by default when sending IPv6 Neighbor
3049	Discovery (Router Solicitation, Router Advertisement, Neighbor
3050	Solicitation, Neighbor Advertisement, Redirect) messages.
3051	These 8 bits can be interpreted as 6 high order bits holding the DSCP
3052	value and 2 low order bits representing ECN (which you probably want
3053	to leave cleared).
3054
3055	* 0 - (default)
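
The split of the 8-bit Traffic Class into DSCP and ECN can be illustrated with two helpers. These are a sketch for computing a value to write to ndisc_tclass; the function names are hypothetical.

```python
from typing import Tuple


def split_tclass(tclass: int) -> Tuple[int, int]:
    """Split an 8-bit Traffic Class into (dscp, ecn)."""
    if not 0 <= tclass <= 0xFF:
        raise ValueError("traffic class must fit in 8 bits")
    return tclass >> 2, tclass & 0x3   # 6 high-order bits, 2 low-order bits


def make_tclass(dscp: int, ecn: int = 0) -> int:
    """Build a Traffic Class value from a 6-bit DSCP and a 2-bit ECN field."""
    if not 0 <= dscp <= 0x3F or not 0 <= ecn <= 0x3:
        raise ValueError("dscp must fit in 6 bits and ecn in 2 bits")
    return (dscp << 2) | ecn
```

For example, DSCP 48 with ECN left cleared corresponds to a Traffic Class of 192 (0xC0).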
3056
3057ndisc_evict_nocarrier - BOOLEAN
3058	Clears the neighbor discovery table on NOCARRIER events. This option is
3059	important for wireless devices where the neighbor discovery cache should
3060	not be cleared when roaming between access points on the same network.
3061	In most cases this should remain as the default (1).
3062
3063	Possible values:
3064
3065	- 0 (disabled) - Do not clear neighbor discovery cache on NOCARRIER events.
	- 1 (enabled)  - Clear neighbor discovery cache on NOCARRIER events.
3067
3068	Default: 1 (enabled)
3069
3070mldv1_unsolicited_report_interval - INTEGER
3071	The interval in milliseconds in which the next unsolicited
3072	MLDv1 report retransmit will take place.
3073
3074	Default: 10000 (10 seconds)
3075
3076mldv2_unsolicited_report_interval - INTEGER
3077	The interval in milliseconds in which the next unsolicited
3078	MLDv2 report retransmit will take place.
3079
3080	Default: 1000 (1 second)
3081
3082force_mld_version - INTEGER
	* 0 - (default) No enforcement of an MLD version, MLDv1 fallback allowed
	* 1 - Enforce the use of MLD version 1
	* 2 - Enforce the use of MLD version 2
3086
3087suppress_frag_ndisc - INTEGER
3088	Control RFC 6980 (Security Implications of IPv6 Fragmentation
3089	with IPv6 Neighbor Discovery) behavior:
3090
3091	* 1 - (default) discard fragmented neighbor discovery packets
3092	* 0 - allow fragmented neighbor discovery packets
3093
3094optimistic_dad - BOOLEAN
3095	Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
3096
3097	Optimistic Duplicate Address Detection for the interface will be enabled
	if at least one of conf/{all,interface}/optimistic_dad is set to 1;
	it will be disabled otherwise.
3100
3101	Possible values:
3102
3103	- 0 (disabled)
3104	- 1 (enabled)
3105
3106	Default: 0 (disabled)
3107
3108
3109use_optimistic - BOOLEAN
3110	If enabled, do not classify optimistic addresses as deprecated during
3111	source address selection.  Preferred addresses will still be chosen
3112	before optimistic addresses, subject to other ranking in the source
3113	address selection algorithm.
3114
3115	This will be enabled if at least one of
3116	conf/{all,interface}/use_optimistic is set to 1, disabled otherwise.
3117
3118	Possible values:
3119
3120	- 0 (disabled)
3121	- 1 (enabled)
3122
3123	Default: 0 (disabled)
3124
3125stable_secret - IPv6 address
3126	This IPv6 address will be used as a secret to generate IPv6
3127	addresses for link-local addresses and autoconfigured
3128	ones. All addresses generated after setting this secret will
	be stable privacy ones by default. This can be changed via the
	addrgenmode option of ip-link. conf/default/stable_secret is used as
	the secret for the namespace; the interface-specific ones can
	override that. Writes to conf/all/stable_secret are refused.
3133
3134	It is recommended to generate this secret during installation
3135	of a system and keep it stable after that.
3136
3137	By default the stable secret is unset.
3138
3139addr_gen_mode - INTEGER
3140	Defines how link-local and autoconf addresses are generated.
3141
3142	=  =================================================================
3143	0  generate address based on EUI64 (default)
	1  do not generate a link-local address, use EUI64 for addresses
3145	   generated from autoconf
3146	2  generate stable privacy addresses, using the secret from
3147	   stable_secret (RFC7217)
3148	3  generate stable privacy addresses, using a random secret if unset
3149	=  =================================================================
3150
3151drop_unicast_in_l2_multicast - BOOLEAN
3152	Drop any unicast IPv6 packets that are received in link-layer
3153	multicast (or broadcast) frames.
3154
3155	Possible values:
3156
3157	- 0 (disabled)
3158	- 1 (enabled)
3159
3160	Default: 0 (disabled)
3161
3162drop_unsolicited_na - BOOLEAN
3163	Drop all unsolicited neighbor advertisements, for example if there's
3164	a known good NA proxy on the network and such frames need not be used
3165	(or in the case of 802.11, must not be used to prevent attacks.)
3166
3167	Possible values:
3168
3169	- 0 (disabled)
3170	- 1 (enabled)
3171
3172	Default: 0 (disabled).
3173
3174accept_untracked_na - INTEGER
3175	Define behavior for accepting neighbor advertisements from devices that
3176	are absent in the neighbor cache:
3177
3178	- 0 - (default) Do not accept unsolicited and untracked neighbor
3179	  advertisements.
3180
3181	- 1 - Add a new neighbor cache entry in STALE state for routers on
3182	  receiving a neighbor advertisement (either solicited or unsolicited)
3183	  with target link-layer address option specified if no neighbor entry
3184	  is already present for the advertised IPv6 address. Without this knob,
3185	  NAs received for untracked addresses (absent in neighbor cache) are
3186	  silently ignored.
3187
3188	  This is as per router-side behavior documented in RFC9131.
3189
3190	  This has lower precedence than drop_unsolicited_na.
3191
3192	  This will optimize the return path for the initial off-link
3193	  communication that is initiated by a directly connected host, by
3194	  ensuring that the first-hop router which turns on this setting doesn't
3195	  have to buffer the initial return packets to do neighbor-solicitation.
3196	  The prerequisite is that the host is configured to send unsolicited
3197	  neighbor advertisements on interface bringup. This setting should be
3198	  used in conjunction with the ndisc_notify setting on the host to
3199	  satisfy this prerequisite.
3200
3201	- 2 - Extend option (1) to add a new neighbor cache entry only if the
3202	  source IP address is in the same subnet as an address configured on
3203	  the interface that received the neighbor advertisement.
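
The difference between modes 1 and 2 is the same-subnet test on the source address. The sketch below models only that decision, using Python's ``ipaddress`` module; it leaves out the other conditions described above (router role, presence of the target link-layer address option, precedence of drop_unsolicited_na). The function names and the ``entry_exists`` parameter are assumptions made for the example.

```python
import ipaddress
from typing import Iterable


def source_in_local_subnet(source: str, interface_addrs: Iterable[str]) -> bool:
    """True if source falls inside the subnet of any address/prefix
    configured on the receiving interface (e.g. "2001:db8::1/64")."""
    src = ipaddress.ip_address(source)
    return any(src in ipaddress.ip_interface(addr).network
               for addr in interface_addrs)


def creates_neighbor_entry(mode: int, source: str,
                           interface_addrs: Iterable[str],
                           entry_exists: bool = False) -> bool:
    """Decide whether an untracked NA adds a new STALE neighbor entry."""
    if mode == 0 or entry_exists:
        return False          # knob off, or nothing new to add
    if mode == 1:
        return True           # accept regardless of source subnet
    if mode == 2:
        return source_in_local_subnet(source, interface_addrs)
    raise ValueError("accept_untracked_na must be 0, 1 or 2")
```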
3204
3205enhanced_dad - BOOLEAN
3206	Include a nonce option in the IPv6 neighbor solicitation messages used for
3207	duplicate address detection per RFC7527. A received DAD NS will only signal
3208	a duplicate address if the nonce is different. This avoids any false
3209	detection of duplicates due to loopback of the NS messages that we send.
3210	The nonce option will be sent on an interface unless both of
3211	conf/{all,interface}/enhanced_dad are set to FALSE.
3212
3213	Possible values:
3214
3215	- 0 (disabled)
3216	- 1 (enabled)
3217
3218	Default: 1 (enabled)
3219
3220``icmp/*``:
3221===========
3222
3223ratelimit - INTEGER
3224	Limit the maximal rates for sending ICMPv6 messages.
3225
3226	0 to disable any limiting,
3227	otherwise the minimal space between responses in milliseconds.
3228
3229	Default: 1000
3230
3231ratemask - list of comma separated ranges
3232	For ICMPv6 message types matching the ranges in the ratemask, limit
3233	the sending of the message according to ratelimit parameter.
3234
3235	The format used for both input and output is a comma separated
3236	list of ranges (e.g. "0-127,129" for ICMPv6 message type 0 to 127 and
3237	129). Writing to the file will clear all previous ranges of ICMPv6
3238	message types and update the current list with the input.
3239
3240	Refer to: https://www.iana.org/assignments/icmpv6-parameters/icmpv6-parameters.xhtml
3241	for numerical values of ICMPv6 message types, e.g. echo request is 128
3242	and echo reply is 129.
3243
3244	Default: 0-1,3-127 (rate limit ICMPv6 errors except Packet Too Big)
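
The ratemask format can be parsed with a few lines of code. The following is a sketch of a user-space parser for the comma-separated range syntax described above, not the kernel's own parser; the function names are hypothetical.

```python
def parse_ratemask(text: str) -> set:
    """Expand a ratemask string such as "0-1,3-127" into a set of
    ICMPv6 message types."""
    types = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            types.update(range(int(low), int(high) + 1))  # inclusive range
        else:
            types.add(int(part))
    return types


def is_rate_limited(icmp_type: int, ratemask: str = "0-1,3-127") -> bool:
    """True if messages of icmp_type are subject to the ratelimit setting."""
    return icmp_type in parse_ratemask(ratemask)
```

With the default mask, type 1 (Destination Unreachable) is rate limited, while type 2 (Packet Too Big) and type 128 (echo request) are not.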
3245
3246echo_ignore_all - BOOLEAN
3247	If enabled, then the kernel will ignore all ICMP ECHO
3248	requests sent to it over the IPv6 protocol.
3249
3250	Possible values:
3251
3252	- 0 (disabled)
3253	- 1 (enabled)
3254
3255	Default: 0 (disabled)
3256
3257echo_ignore_multicast - BOOLEAN
3258	If enabled, then the kernel will ignore all ICMP ECHO
3259	requests sent to it over the IPv6 protocol via multicast.
3260
3261	Possible values:
3262
3263	- 0 (disabled)
3264	- 1 (enabled)
3265
3266	Default: 0 (disabled)
3267
3268echo_ignore_anycast - BOOLEAN
3269	If enabled, then the kernel will ignore all ICMP ECHO
	requests sent to it over the IPv6 protocol destined to an anycast address.
3271
3272	Possible values:
3273
3274	- 0 (disabled)
3275	- 1 (enabled)
3276
3277	Default: 0 (disabled)
3278
3279error_anycast_as_unicast - BOOLEAN
3280	If enabled, then the kernel will respond with ICMP Errors
	resulting from requests sent to it over the IPv6 protocol destined
	to an anycast address, essentially treating anycast as unicast.
3283
3284	Possible values:
3285
3286	- 0 (disabled)
3287	- 1 (enabled)
3288
3289	Default: 0 (disabled)
3290
3291errors_extension_mask - UNSIGNED INTEGER
3292	Bitmask of ICMP extensions to append to ICMPv6 error messages
3293	("Destination Unreachable" and "Time Exceeded"). The original datagram
3294	is trimmed / padded to 128 bytes in order to be compatible with
3295	applications that do not comply with RFC 4884.
3296
3297	Possible extensions are:
3298
3299	==== ==============================================================
3300	0x01 Incoming IP interface information according to RFC 5837.
3301	     Extension will include the index, IPv6 address (if present),
3302	     name and MTU of the IP interface that received the datagram
3303	     which elicited the ICMP error.
3304	==== ==============================================================
3305
3306	Default: 0x00 (no extensions)
3307
3308xfrm6_gc_thresh - INTEGER
3309	(Obsolete since linux-4.14)
3310	The threshold at which we will start garbage collecting for IPv6
3311	destination cache entries.  At twice this value the system will
3312	refuse new allocations.
3313
3314
3315IPv6 Update by:
3316Pekka Savola <pekkas@netcore.fi>
3317YOSHIFUJI Hideaki / USAGI Project <yoshfuji@linux-ipv6.org>
3318
3319
3320/proc/sys/net/bridge/* Variables:
3321=================================
3322
3323bridge-nf-call-arptables - BOOLEAN
3324
3325	Possible values:
3326
3327	- 0 (disabled) - disable this.
3328	- 1 (enabled)  - pass bridged ARP traffic to arptables' FORWARD chain.
3329
3330	Default: 1 (enabled)
3331
3332bridge-nf-call-iptables - BOOLEAN
3333
3334	Possible values:
3335
3336	- 0 (disabled) - disable this.
3337	- 1 (enabled)  - pass bridged IPv4 traffic to iptables' chains.
3338
3339	Default: 1 (enabled)
3340
3341bridge-nf-call-ip6tables - BOOLEAN
3342
3343	Possible values:
3344
3345	- 0 (disabled) - disable this.
3346	- 1 (enabled)  - pass bridged IPv6 traffic to ip6tables' chains.
3347
3348	Default: 1 (enabled)
3349
3350bridge-nf-filter-vlan-tagged - BOOLEAN
3351
3352	Possible values:
3353
3354	- 0 (disabled) - disable this.
3355	- 1 (enabled)  - pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables
3356
3357	Default: 0 (disabled)
3358
3359bridge-nf-filter-pppoe-tagged - BOOLEAN
3360
3361	Possible values:
3362
3363	- 0 (disabled) - disable this.
3364	- 1 (enabled)  - pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
3365
3366	Default: 0 (disabled)
3367
3368bridge-nf-pass-vlan-input-dev - BOOLEAN
3369	- 1: if bridge-nf-filter-vlan-tagged is enabled, try to find a vlan
3370	  interface on the bridge and set the netfilter input device to the
3371	  vlan. This allows use of e.g. "iptables -i br0.1" and makes the
3372	  REDIRECT target work with vlan-on-top-of-bridge interfaces.  When no
3373	  matching vlan interface is found, or this switch is off, the input
3374	  device is set to the bridge interface.
3375
3376	- 0: disable bridge netfilter vlan interface lookup.
3377
3378	Default: 0
3379
``/proc/sys/net/sctp/*`` Variables:
===================================
3382
3383addip_enable - BOOLEAN
	Enable or disable the Dynamic Address Reconfiguration (ADD-IP)
	extension specified in RFC5061.  This extension provides the ability
	to dynamically add and remove addresses for SCTP associations.
3388
3389	Possible values:
3390
3391	- 0 (disabled) - disable extension.
3392	- 1 (enabled)  - enable extension
3393
3394	Default: 0 (disabled)
3395
3396pf_enable - INTEGER
	Enable or disable pf (pf is short for potentially failed) state. A value
	of pf_retrans > path_max_retrans also disables pf state. That is, either
	pf_enable = 0 or pf_retrans > path_max_retrans disables pf state.
	Since pf_retrans and path_max_retrans can be changed by userspace
	applications, a user who relies on pf_retrans > path_max_retrans to
	disable pf state may find it enabled again after an application changes
	either value. pf_enable therefore provides an explicit way to enable
	and disable pf state. See:
	https://datatracker.ietf.org/doc/draft-ietf-tsvwg-sctp-failover for
	details.
3408
3409	Possible values:
3410
3411	- 1: Enable pf.
3412	- 0: Disable pf.
3413
3414	Default: 1
3415
3416pf_expose - INTEGER
3417	Unset or enable/disable pf (pf is short for potentially failed) state
3418	exposure.  Applications can control the exposure of the PF path state
3419	in the SCTP_PEER_ADDR_CHANGE event and access of SCTP_PF-state
3420	transport info via SCTP_GET_PEER_ADDR_INFO sockopt.
3421
3422	Possible values:
3423
3424	- 0: Unset pf state exposure (compatible with old applications). No
3425	  event will be sent but the transport info can be queried.
3426	- 1: Disable pf state exposure. No event will be sent and trying to
	  obtain transport info will return -EACCES.
3428	- 2: Enable pf state exposure. The event will be sent for a transport
3429	  becoming SCTP_PF state and transport info can be obtained.
3430
3431	Default: 0
3432
3433addip_noauth_enable - BOOLEAN
3434	Dynamic Address Reconfiguration (ADD-IP) requires the use of
3435	authentication to protect the operations of adding or removing new
3436	addresses.  This requirement is mandated so that unauthorized hosts
3437	would not be able to hijack associations.  However, older
3438	implementations may not have implemented this requirement while
3439	allowing the ADD-IP extension.  For reasons of interoperability,
3440	we provide this variable to control the enforcement of the
3441	authentication requirement.
3442
3443	== ===============================================================
3444	1  Allow ADD-IP extension to be used without authentication.  This
3445	   should only be set in a closed environment for interoperability
3446	   with older implementations.
3447
3448	0  Enforce the authentication requirement
3449	== ===============================================================
3450
3451	Default: 0
3452
3453auth_enable - BOOLEAN
3454	Enable or disable Authenticated Chunks extension.  This extension
3455	provides the ability to send and receive authenticated chunks and is
3456	required for secure operation of Dynamic Address Reconfiguration
3457	(ADD-IP) extension.
3458
3459	Possible values:
3460
3461	- 0 (disabled) - disable extension.
3462	- 1 (enabled)  - enable extension
3463
3464	Default: 0 (disabled)
3465
3466prsctp_enable - BOOLEAN
3467	Enable or disable the Partial Reliability extension (RFC3758) which
3468	is used to notify peers that a given DATA should no longer be expected.
3469
3470	Possible values:
3471
3472	- 0 (disabled) - disable extension.
3473	- 1 (enabled)  - enable extension
3474
3475	Default: 1 (enabled)
3476
3477max_burst - INTEGER
3478	The limit of the number of new packets that can be initially sent.  It
3479	controls how bursty the generated traffic can be.
3480
3481	Default: 4
3482
3483association_max_retrans - INTEGER
	Set the maximum number of retransmissions that an association can
	attempt before deciding that the remote end is unreachable.  If this
	value is exceeded, the association is terminated.
3487
3488	Default: 10
3489
3490max_init_retransmits - INTEGER
3491	The maximum number of retransmissions of INIT and COOKIE-ECHO chunks
3492	that an association will attempt before declaring the destination
3493	unreachable and terminating.
3494
3495	Default: 8
3496
3497path_max_retrans - INTEGER
3498	The maximum number of retransmissions that will be attempted on a given
3499	path.  Once this threshold is exceeded, the path is considered
3500	unreachable, and new traffic will use a different path when the
3501	association is multihomed.
3502
3503	Default: 5
3504
3505pf_retrans - INTEGER
3506	The number of retransmissions that will be attempted on a given path
3507	before traffic is redirected to an alternate transport (should one
3508	exist).  Note this is distinct from path_max_retrans, as a path that
	passes the pf_retrans threshold can still be used.  It is only
3510	deprioritized when a transmission path is selected by the stack.  This
3511	setting is primarily used to enable fast failover mechanisms without
3512	having to reduce path_max_retrans to a very low value.  See:
3513	http://www.ietf.org/id/draft-nishida-tsvwg-sctp-failover-05.txt
3514	for details.  Note also that a value of pf_retrans > path_max_retrans
3515	disables this feature. Since both pf_retrans and path_max_retrans can
3516	be changed by a userspace application, a separate variable, pf_enable,
3517	is used to disable the pf state.
3518
3519	Default: 0
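
The interplay between pf_retrans and path_max_retrans described above can be
sketched as a small classifier. This is an illustrative model only, not the
kernel's implementation: the function name ``classify_path``, the state labels,
and the exact comparison operators are assumptions chosen to mirror the text.

```python
def classify_path(error_count, pf_retrans, path_max_retrans, pf_enable=True):
    """Return a hypothetical state label for a path.

    error_count is the number of retransmissions attempted on the path.
    The labels and the strict '>' comparisons are assumptions.
    """
    # Once path_max_retrans is exceeded the path is considered unreachable.
    if error_count > path_max_retrans:
        return "unreachable"
    # pf_retrans > path_max_retrans disables the feature, as does pf_enable=0.
    pf_active = pf_enable and pf_retrans <= path_max_retrans
    # Past pf_retrans the path stays usable but is deprioritized.
    if pf_active and error_count > pf_retrans:
        return "potentially-failed"
    return "active"
```

With the defaults (pf_retrans 0, path_max_retrans 5), a single retransmission
is enough for this model to deprioritize the path, while the path is only
given up after more than five.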
3520
3521ps_retrans - INTEGER
3522	Primary.Switchover.Max.Retrans (PSMR), a tunable parameter defined in
3523	Section 5 "Primary Path Switchover" of RFC7829.  The primary path
3524	will be changed to another active path when the path error counter on
3525	the old primary path exceeds PSMR, so that "the SCTP sender is allowed
3526	to continue data transmission on a new working path even when the old
3527	primary destination address becomes active again".  Note that this
3528	feature is disabled by default, as 'ps_retrans' is initialized to 0xffff
3529	in each netns; its value cannot be set below 'pf_retrans' via sysctl.
3530
3531	Default: 0xffff
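
The two rules stated above (0xffff means disabled, and the value may not be
lower than pf_retrans) can be expressed as a short check. The helper name
``check_ps_retrans`` and its return convention are hypothetical; the kernel
performs its own validation when the sysctl is written.

```python
PS_RETRANS_DISABLED = 0xFFFF  # default value, which turns the feature off


def check_ps_retrans(ps_retrans, pf_retrans):
    """Return True if primary path switchover would be enabled.

    Raises ValueError if ps_retrans is lower than pf_retrans, mirroring
    the restriction described in the text (hypothetical helper).
    """
    if ps_retrans < pf_retrans:
        raise ValueError("ps_retrans must not be less than pf_retrans")
    return ps_retrans != PS_RETRANS_DISABLED
```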
3532
3533rto_initial - INTEGER
3534	The initial round trip timeout value in milliseconds that will be used
3535	in calculating round trip times.  This is the initial time interval
3536	for retransmissions.
3537
3538	Default: 3000
3539
3540rto_max - INTEGER
3541	The maximum value (in milliseconds) of the round trip timeout.  This
3542	is the largest time interval that can elapse between retransmissions.
3543
3544	Default: 60000
3545
3546rto_min - INTEGER
3547	The minimum value (in milliseconds) of the round trip timeout.  This
3548	is the smallest time interval that can elapse between retransmissions.
3549
3550	Default: 1000
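
To see how rto_initial, rto_min and rto_max bound the retransmission timer,
the sketch below lists successive timeout intervals. It assumes the
exponential backoff rule from RFC 4960 (the timeout doubles after each
expiry, capped at the maximum), which is not stated in this document, and it
ignores round trip time measurements. The function name is hypothetical.

```python
def retransmission_intervals(rto_initial=3000, rto_min=1000, rto_max=60000,
                             attempts=6):
    """Return the timeout (ms) used for each of `attempts` transmissions.

    Assumes doubling on every timeout, clamped to [rto_min, rto_max].
    """
    # Clamp the starting value into the configured range.
    rto = min(max(rto_initial, rto_min), rto_max)
    intervals = []
    for _ in range(attempts):
        intervals.append(rto)
        # Back off: double the timeout, never exceeding rto_max.
        rto = min(rto * 2, rto_max)
    return intervals


print(retransmission_intervals())
# [3000, 6000, 12000, 24000, 48000, 60000]
```

Under these assumptions and the default values, six consecutive timeouts on
one path add up to 153 seconds.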
3551
3552hb_interval - INTEGER
3553	The interval (in milliseconds) between HEARTBEAT chunks.  These chunks
3554	are sent at the specified interval on idle paths to probe the state of
3555	a given path between the 2 endpoints of an association.
3556
3557	Default: 30000
3558
3559sack_timeout - INTEGER
3560	The amount of time (in milliseconds) that the implementation will wait
3561	to send a SACK.
3562
3563	Default: 200
3564
3565valid_cookie_life - INTEGER
3566	The default lifetime of the SCTP cookie (in milliseconds).  The cookie
3567	is used during association establishment.
3568
3569	Default: 60000
3570
3571cookie_preserve_enable - BOOLEAN
3572	Enable or disable the ability to extend the lifetime of the SCTP cookie
3573	that is used during the establishment phase of an SCTP association.
3574
3575	Possible values:
3576
3577	- 0 (disabled) - disable.
3578	- 1 (enabled)  - enable cookie lifetime extension.
3579
3580	Default: 1 (enabled)
3581
3582cookie_hmac_alg - STRING
3583	Select the hmac algorithm used when generating the cookie value sent by
3584	a listening sctp socket to a connecting client in the INIT-ACK chunk.
3585	Valid values are:
3586
3587	* sha256
3588	* none
3589
3590	Default: sha256
3591
3592rcvbuf_policy - INTEGER
3593	Determines if the receive buffer is attributed to the socket or to the
3594	association.  SCTP supports the capability to create multiple
3595	associations on a single socket.  When using this capability, it is
3596	possible that a single stalled association that's buffering a lot
3597	of data may block other associations from delivering their data by
3598	consuming all of the receive buffer space.  To work around this,
3599	the rcvbuf_policy could be set to attribute the receive buffer space
3600	to each association instead of the socket.  This prevents the described
3601	blocking.
3602
3603	- 1: rcvbuf space is per association
3604	- 0: rcvbuf space is per socket
3605
3606	Default: 0
3607
3608sndbuf_policy - INTEGER
3609	Similar to rcvbuf_policy above, this applies to send buffer space.
3610
3611	- 1: Send buffer is tracked per association
3612	- 0: Send buffer is tracked per socket.
3613
3614	Default: 0
3615
3616sctp_mem - vector of 3 INTEGERs: min, pressure, max
3617	Number of pages allowed for queueing by all SCTP sockets.
3618
3619	* min: Below this number of pages SCTP is not bothered about its
3620	  memory usage. When amount of memory allocated by SCTP exceeds
3621	  this number, SCTP starts to moderate memory usage.
3622	* pressure: This value was introduced to follow format of tcp_mem.
3623	* max: Maximum number of allowed pages.
3624
3625	Default is calculated at boot time from amount of available memory.
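
Because sctp_mem is expressed in pages, converting it to bytes requires the
system page size. The sketch below parses the three-field value; the helper
name ``sctp_mem_to_bytes`` is hypothetical, and the 4096-byte default page
size is an assumption that does not hold on every architecture.

```python
def sctp_mem_to_bytes(value, page_size=4096):
    """Convert a 'min pressure max' string (in pages) to bytes."""
    fields = value.split()
    if len(fields) != 3:
        raise ValueError("expected three integers: min, pressure, max")
    names = ("min", "pressure", "max")
    # Each field is a page count; multiply by the page size to get bytes.
    return {name: int(field) * page_size for name, field in zip(names, fields)}


print(sctp_mem_to_bytes("100 200 300"))
# {'min': 409600, 'pressure': 819200, 'max': 1228800}
```

On a live system the input would typically be read from
``/proc/sys/net/sctp/sctp_mem`` and the page size taken from
``os.sysconf("SC_PAGE_SIZE")``.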
3626
3627sctp_rmem - vector of 3 INTEGERs: min, default, max
3628	Only the first value ("min") is used; "default" and "max" are
3629	ignored.
3630
3631	* min: Minimal size of receive buffer used by SCTP socket.
3632	  It is guaranteed to each SCTP socket (but not association) even
3633	  under moderate memory pressure.
3634
3635	Default: 4K
3636
3637sctp_wmem - vector of 3 INTEGERs: min, default, max
3638	Only the first value ("min") is used; "default" and "max" are
3639	ignored.
3640
3641	* min: Minimum size of send buffer that can be used by SCTP sockets.
3642	  It is guaranteed to each SCTP socket (but not association) even
3643	  under moderate memory pressure.
3644
3645	Default: 4K
3646
3647addr_scope_policy - INTEGER
3648	Control IPv4 address scoping (see
3649	https://datatracker.ietf.org/doc/draft-stewart-tsvwg-sctp-ipv4/00/
3650	for details).
3651
3652	- 0   - Disable IPv4 address scoping
3653	- 1   - Enable IPv4 address scoping
3654	- 2   - Follow draft but allow IPv4 private addresses
3655	- 3   - Follow draft but allow IPv4 link local addresses
3656
3657	Default: 1
3658
3659udp_port - INTEGER
3660	The listening port for the local UDP tunneling sock. Normally this is
3661	the IANA-assigned UDP port number 9899 (sctp-tunneling).
3662
3663	This UDP sock is used for processing the incoming UDP-encapsulated
3664	SCTP packets (from RFC6951), and shared by all applications in the
3665	same net namespace. This UDP sock will be closed when the value is
3666	set to 0.
3667
3668	The value will also be used to set the src port of the UDP header
3669	for the outgoing UDP-encapsulated SCTP packets. For the dest port,
3670	please refer to 'encap_port' below.
3671
3672	Default: 0
3673
3674encap_port - INTEGER
3675	The default remote UDP encapsulation port.
3676
3677	This value is used to set the dest port of the UDP header for the
3678	outgoing UDP-encapsulated SCTP packets by default. Users can also
3679	change the value for each sock/asoc/transport by using setsockopt.
3680	For further information, please refer to RFC6951.
3681
3682	Note that when connecting to a remote server, the client should set
3683	this to the port that the UDP tunneling sock on the peer server is
3684	listening on, and the local UDP tunneling sock on the client must
3685	also be started. The server obtains the encap_port from the source
3686	port of the incoming packet.
3687
3688	Default: 0
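
The client and server roles described for udp_port and encap_port can be
summarized as the sysctl values each side would set. This is a sketch under
assumptions: the helper name ``udp_encap_sysctls`` and its parameters are
hypothetical, and it only builds a dictionary rather than writing any
sysctl.

```python
IANA_SCTP_TUNNELING_PORT = 9899  # sctp-tunneling, as noted for udp_port


def udp_encap_sysctls(role, peer_udp_port=IANA_SCTP_TUNNELING_PORT,
                      local_udp_port=IANA_SCTP_TUNNELING_PORT):
    """Return the sysctl settings a 'client' or 'server' would apply."""
    if role not in ("client", "server"):
        raise ValueError("role must be 'client' or 'server'")
    # Both sides must start the local UDP tunneling sock.
    settings = {"net.sctp.udp_port": local_udp_port}
    if role == "client":
        # The client targets the port the peer's tunneling sock listens on.
        settings["net.sctp.encap_port"] = peer_udp_port
    # The server leaves encap_port alone: it is learned from the source
    # port of the incoming packet.
    return settings
```

Each resulting key and value could then be applied with ``sysctl -w``, for
example ``sysctl -w net.sctp.udp_port=9899``.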
3689
3690plpmtud_probe_interval - INTEGER
3691        The time interval (in milliseconds) for the PLPMTUD probe timer,
3692        which is configured to expire after this period to receive an
3693        acknowledgment to a probe packet. This is also the time interval
3694        between the probes for the current pmtu when the probe search
3695        is done.
3696
3697        PLPMTUD is disabled when this is set to 0; any other value
3698        must be >= 5000.
3699
3700	Default: 0
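
The accepted values for plpmtud_probe_interval can be captured in a small
check. The helper name ``plpmtud_state`` is hypothetical, and the sketch only
restates the rule given above (0 disables; anything else must be at least
5000).

```python
def plpmtud_state(probe_interval_ms):
    """Classify a candidate plpmtud_probe_interval value."""
    if probe_interval_ms == 0:
        return "disabled"
    if probe_interval_ms >= 5000:
        return "enabled"
    # Non-zero values below 5000 ms are not allowed.
    raise ValueError("plpmtud_probe_interval must be 0 or >= 5000")
```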
3701
3702reconf_enable - BOOLEAN
3703        Enable or disable the Stream Reconfiguration extension specified in
3704        RFC6525. This extension provides the ability to "reset" a stream,
3705        and it includes the parameters "Outgoing/Incoming SSN Reset",
3706        "SSN/TSN Reset" and "Add Outgoing/Incoming Streams".
3707
3708	Possible values:
3709
3710	- 0 (disabled) - Disable extension.
3711	- 1 (enabled) - Enable extension.
3712
3713	Default: 0 (disabled)
3714
3715intl_enable - BOOLEAN
3716        Enable or disable the User Message Interleaving extension specified
3717        in RFC8260. This extension allows the interleaving of user messages
3718        sent on different streams. With this feature enabled, I-DATA chunks
3719        replace DATA chunks to carry user messages if the peer also supports
3720        it. Note that to use this feature, one needs to set this option to 1
3721        and also set the socket options SCTP_FRAGMENT_INTERLEAVE to 2 and
3722        SCTP_INTERLEAVING_SUPPORTED to 1.
3723
3724	Possible values:
3725
3726	- 0 (disabled) - Disable extension.
3727	- 1 (enabled) - Enable extension.
3728
3729	Default: 0 (disabled)
3730
3731ecn_enable - BOOLEAN
3732        Control use of Explicit Congestion Notification (ECN) by SCTP.
3733        As in TCP, ECN is used only when both ends of the SCTP connection
3734        indicate support for it. This feature is useful in avoiding losses
3735        due to congestion by allowing supporting routers to signal congestion
3736        before having to drop packets.
3737
3738        Possible values:
3739
3740	- 0 (disabled) - Disable ECN.
3741	- 1 (enabled) - Enable ECN.
3742
3743	Default: 1 (enabled)
3744
3745l3mdev_accept - BOOLEAN
3746	Enabling this option allows a "global" bound socket to work
3747	across L3 master domains (e.g., VRFs) with packets capable of
3748	being received regardless of the L3 domain in which they
3749	originated. Only valid when the kernel was compiled with
3750	CONFIG_NET_L3_MASTER_DEV.
3751
3752	Possible values:
3753
3754	- 0 (disabled)
3755	- 1 (enabled)
3756
3757	Default: 1 (enabled)
3758
3759
3760``/proc/sys/net/core/*``
3761========================
3762
3763	Please see Documentation/admin-guide/sysctl/net.rst for descriptions of these entries.
3764
3765
3766``/proc/sys/net/unix/*``
3767========================
3768
3769max_dgram_qlen - INTEGER
3770	The maximum length of a datagram socket's receive queue.
3771
3772	Default: 10
3773
3774