.\"
.\" Copyright (c) 2008-2009 Lawrence Stewart <lstewart@FreeBSD.org>
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this documentation were written at the Centre for Advanced
.\" Internet Architectures, Swinburne University of Technology, Melbourne,
.\" Australia by David Hayes and Lawrence Stewart under sponsorship from the
.\" FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd January 21, 2016
.Dt MOD_CC 9
.Os
.Sh NAME
.Nm mod_cc ,
.Nm DECLARE_CC_MODULE ,
.Nm CCV
.Nd Modular Congestion Control
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/cc/cc.h
.In netinet/cc/cc_module.h
.Fn DECLARE_CC_MODULE "ccname" "ccalgo"
.Fn CCV "ccv" "what"
.Sh DESCRIPTION
The
.Nm
framework allows congestion control algorithms to be implemented as dynamically
loadable kernel modules via the
.Xr kld 4
facility.
Transport protocols can select from the list of available algorithms on a
connection-by-connection basis, or use the system default (see
.Xr mod_cc 4
for more details).
.Pp
.Nm
modules are identified by an
.Xr ascii 7
name and set of hook functions encapsulated in a
.Vt "struct cc_algo" ,
which has the following members:
.Bd -literal -offset indent
struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init) (void);
	int	(*mod_destroy) (void);
	int	(*cb_init) (struct cc_var *ccv);
	void	(*cb_destroy) (struct cc_var *ccv);
	void	(*conn_init) (struct cc_var *ccv);
	void	(*ack_received) (struct cc_var *ccv, uint16_t type);
	void	(*cong_signal) (struct cc_var *ccv, uint32_t type);
	void	(*post_recovery) (struct cc_var *ccv);
	void	(*after_idle) (struct cc_var *ccv);
	int	(*ctl_output)(struct cc_var *, struct sockopt *, void *);
};
.Ed
.Pp
The
.Va name
field identifies the unique name of the algorithm, and should be no longer than
TCP_CA_NAME_MAX-1 characters in length (the TCP_CA_NAME_MAX define lives in
.In netinet/tcp.h
for compatibility reasons).
.Pp
The
.Va mod_init
function is called when a new module is loaded into the system but before the
registration process is complete.
It should be implemented if a module needs to set up some global state prior to
being available for use by new connections.
Returning a non-zero value from
.Va mod_init
will cause the loading of the module to fail.
.Pp
The
.Va mod_destroy
function is called prior to unloading an existing module from the kernel.
It should be implemented if a module needs to clean up any global state before
being removed from the kernel.
The return value is currently ignored.
.Pp
The
.Va cb_init
function is called when a TCP control block
.Vt struct tcpcb
is created.
It should be implemented if a module needs to allocate memory for storing
private per-connection state.
Returning a non-zero value from
.Va cb_init
will cause connection setup to be aborted, terminating the connection as a
result.
.Pp
The
.Va cb_destroy
function is called when a TCP control block
.Vt struct tcpcb
is destroyed.
It should be implemented if a module needs to free memory allocated in
.Va cb_init .
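.Pp
As a minimal sketch of how the
.Va cb_init
and
.Va cb_destroy
hooks pair up, the following fragment attaches private state to a connection and
releases it again.
It is written as user-space C so that it can be compiled on its own: the
one-member
.Vt struct cc_var
is a reduced stand-in for the kernel structure, and
.Vt struct example_state
and the
.Fn example_*
functions are hypothetical names used only for illustration.
A kernel module would typically allocate with
.Xr malloc 9
instead of the C library.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-in for the kernel's struct cc_var (assumption). */
struct cc_var {
	void	*cc_data;	/* per-connection private state */
};

/* Hypothetical private state kept by an example algorithm. */
struct example_state {
	uint32_t	acked_bytes;
	uint32_t	loss_events;
};

/* cb_init: allocate private state; a non-zero return aborts setup. */
static int
example_cb_init(struct cc_var *ccv)
{
	struct example_state *st;

	/* User-space allocation; a kernel module would use malloc(9). */
	st = calloc(1, sizeof(*st));
	if (st == NULL)
		return (ENOMEM);
	ccv->cc_data = st;
	return (0);
}

/* cb_destroy: release whatever cb_init attached. */
static void
example_cb_destroy(struct cc_var *ccv)
{
	free(ccv->cc_data);
	ccv->cc_data = NULL;
}
```

Clearing
.Va cc_data
after freeing it is a defensive choice in this sketch and is not something the
framework is stated to require.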
.Pp
The
.Va conn_init
function is called when a new connection has been established and variables are
being initialised.
It should be implemented to initialise congestion control algorithm variables
for the newly established connection.
.Pp
The
.Va ack_received
function is called when a TCP acknowledgement (ACK) packet is received.
Modules use the
.Fa type
argument as an input to their congestion management algorithms.
The ACK types currently reported by the stack are CC_ACK and CC_DUPACK.
CC_ACK indicates the received ACK acknowledges previously unacknowledged data.
CC_DUPACK indicates the received ACK acknowledges data we have already received
an ACK for.
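.Pp
The following sketch shows one way an
.Va ack_received
hook might branch on the
.Fa type
argument.
It is a user-space approximation: the numeric values given to CC_ACK and
CC_DUPACK are placeholders rather than the kernel's definitions, the
.Vt struct cc_var
is reduced to the two fields used, and
.Vt struct example_state
is a hypothetical private structure.

```c
#include <stdint.h>

/* Placeholder values; the real definitions live in the kernel headers. */
#define	CC_ACK		0x0001
#define	CC_DUPACK	0x0002

/* Reduced stand-in for struct cc_var (assumption). */
struct cc_var {
	int	bytes_this_ack;	/* new bytes covered by this ACK */
	void	*cc_data;	/* per-connection private state */
};

/* Hypothetical counters kept by the example algorithm. */
struct example_state {
	uint32_t	new_bytes;	/* running total of acked bytes */
	uint32_t	dupacks;	/* consecutive duplicate ACKs */
};

static void
example_ack_received(struct cc_var *ccv, uint16_t type)
{
	struct example_state *st = ccv->cc_data;

	switch (type) {
	case CC_ACK:
		/* New data acknowledged: account for it, reset dupacks. */
		st->new_bytes += (uint32_t)ccv->bytes_this_ack;
		st->dupacks = 0;
		break;
	case CC_DUPACK:
		/* Nothing new acknowledged: only note the duplicate. */
		st->dupacks++;
		break;
	default:
		/* Ignore ACK types this sketch does not know about. */
		break;
	}
}
```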
.Pp
The
.Va cong_signal
function is called when a congestion event is detected by the TCP stack.
Modules use the
.Fa type
argument as an input to their congestion management algorithms.
The congestion event types currently reported by the stack are CC_ECN, CC_RTO,
CC_RTO_ERR and CC_NDUPACK.
CC_ECN is reported when the TCP stack receives an explicit congestion notification
(RFC 3168).
CC_RTO is reported when the retransmission timeout timer fires.
CC_RTO_ERR is reported if the retransmission timeout timer fired in error.
CC_NDUPACK is reported if N duplicate ACKs have been received back-to-back,
where N is the fast retransmit duplicate ACK threshold (N=3 currently as per
RFC 5681).
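.Pp
To illustrate how the four event types might be told apart, the sketch below
applies a simple halving policy in a
.Va cong_signal
hook.
It is only one possible reaction and is not taken from any shipped module.
The event values are placeholders,
.Vt struct cc_var
is reduced to a single field, and the window variables live in a hypothetical
.Vt struct example_win
rather than in the transport control block so that the fragment compiles in
user space.

```c
#include <stdint.h>

/* Placeholder values; the real definitions live in the kernel headers. */
#define	CC_ECN		0x01
#define	CC_RTO		0x02
#define	CC_RTO_ERR	0x04
#define	CC_NDUPACK	0x08

/* Reduced stand-in for struct cc_var (assumption). */
struct cc_var {
	void	*cc_data;
};

/* Hypothetical window state, all values in bytes. */
struct example_win {
	uint32_t	cwnd;
	uint32_t	ssthresh;
	uint32_t	mss;
	uint32_t	saved_cwnd;	/* snapshot taken on CC_RTO */
	uint32_t	saved_ssthresh;
};

/* Half the window, but never less than two segments. */
static uint32_t
example_half(uint32_t cwnd, uint32_t mss)
{
	uint32_t half = cwnd / 2;

	return (half < 2 * mss ? 2 * mss : half);
}

static void
example_cong_signal(struct cc_var *ccv, uint32_t type)
{
	struct example_win *w = ccv->cc_data;

	switch (type) {
	case CC_ECN:
	case CC_NDUPACK:
		/* Mild signal: halve and continue from the new threshold. */
		w->ssthresh = example_half(w->cwnd, w->mss);
		w->cwnd = w->ssthresh;
		break;
	case CC_RTO:
		/* Timeout: remember the old values, restart from one MSS. */
		w->saved_cwnd = w->cwnd;
		w->saved_ssthresh = w->ssthresh;
		w->ssthresh = example_half(w->cwnd, w->mss);
		w->cwnd = w->mss;
		break;
	case CC_RTO_ERR:
		/* The timeout was spurious: undo the CC_RTO reaction. */
		w->cwnd = w->saved_cwnd;
		w->ssthresh = w->saved_ssthresh;
		break;
	default:
		break;
	}
}
```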
.Pp
The
.Va post_recovery
function is called after the TCP connection has recovered from a congestion event.
It should be implemented to adjust state as required.
.Pp
The
.Va after_idle
function is called when data transfer resumes after an idle period.
It should be implemented to adjust state as required.
.Pp
The
.Va ctl_output
function is called when
.Xr getsockopt 2
or
.Xr setsockopt 2
is called on a
.Xr tcp 4
socket, with the
.Vt struct sockopt
pointer forwarded unmodified from the TCP socket option code, and a
.Vt void *
pointer to an algorithm-specific argument.
.Pp
The
.Fn DECLARE_CC_MODULE
macro provides a convenient wrapper around the
.Xr DECLARE_MODULE 9
macro, and is used to register a
.Nm
module with the
.Nm
framework.
The
.Fa ccname
argument specifies the module's name.
The
.Fa ccalgo
argument points to the module's
.Vt struct cc_algo .
.Pp
.Nm
modules must instantiate a
.Vt struct cc_algo ,
but are only required to set the name field, and optionally any of the function
pointers.
The stack will skip calling any function pointer which is NULL, so there is no
requirement to implement any of the function pointers.
Using the C99 designated initialiser feature to set fields is encouraged.
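.Pp
The sketch below shows what a designated initialiser for a
.Vt struct cc_algo
might look like when only one hook is implemented, together with the kind of
NULL check described above.
It is a user-space approximation: the structures are cut down to a few members,
the value given to TCP_CA_NAME_MAX is assumed for the sketch, and the
.Fn stack_*
functions are hypothetical stand-ins for the caller in the TCP stack.

```c
#include <stddef.h>
#include <stdint.h>

#define	TCP_CA_NAME_MAX	16	/* value assumed for this sketch */

/* Reduced stand-in for struct cc_var (assumption). */
struct cc_var {
	void	*cc_data;
};

/* Subset of the struct cc_algo members listed earlier. */
struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init)(void);
	int	(*cb_init)(struct cc_var *ccv);
	void	(*ack_received)(struct cc_var *ccv, uint16_t type);
	void	(*after_idle)(struct cc_var *ccv);
};

static int example_acks;	/* counts calls, for demonstration only */

static void
example_ack_received(struct cc_var *ccv, uint16_t type)
{
	(void)ccv;
	(void)type;
	example_acks++;
}

/* Only the hooks the algorithm needs are named; the rest stay NULL. */
static struct cc_algo example_cc_algo = {
	.name = "example",
	.ack_received = example_ack_received,
};

/* Hypothetical callers showing the NULL check before each hook. */
static void
stack_ack_received(struct cc_algo *algo, struct cc_var *ccv, uint16_t type)
{
	if (algo->ack_received != NULL)
		algo->ack_received(ccv, type);
}

static void
stack_after_idle(struct cc_algo *algo, struct cc_var *ccv)
{
	if (algo->after_idle != NULL)
		algo->after_idle(ccv);
}
```

In a real module the structure would then be handed to the framework with
something like
.Fn DECLARE_CC_MODULE "example" "&example_cc_algo" ,
which is omitted here because it requires the kernel headers.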
.Pp
Each function pointer which deals with congestion control state is passed a
pointer to a
.Vt struct cc_var ,
which has the following members:
.Bd -literal -offset indent
struct cc_var {
	void		*cc_data;
	int		bytes_this_ack;
	tcp_seq		curack;
	uint32_t	flags;
	int		type;
	union ccv_container {
		struct tcpcb		*tcp;
		struct sctp_nets	*sctp;
	} ccvc;
};
.Ed
.Pp
.Vt struct cc_var
groups congestion control related variables into a single, embeddable structure
and adds a layer of indirection to accessing transport protocol control blocks.
The eventual goal is to allow a single set of
.Nm
modules to be shared between all congestion aware transport protocols, though
currently only
.Xr tcp 4
is supported.
.Pp
To aid the eventual transition towards this goal, direct use of variables from
the transport protocol's data structures is strongly discouraged.
However, access to some of these variables is currently unavoidable, and so the
.Fn CCV
macro exists as a convenience accessor.
The
.Fa ccv
argument points to the
.Vt struct cc_var
passed into the function by the
.Nm
framework.
The
.Fa what
argument specifies the name of the variable to access.
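.Pp
The idea behind the accessor can be sketched in user space as follows.
The macro body shown here is a simplified stand-in and the kernel's definition
may differ; the
.Vt struct tcpcb
is reduced to two fields, whose names and types are assumptions of this sketch,
and
.Fn example_grow
is a hypothetical helper.

```c
#include <stdint.h>

/* Reduced stand-in for the TCP control block (assumption). */
struct tcpcb {
	uint32_t	snd_cwnd;	/* congestion window, bytes */
	uint32_t	t_maxseg;	/* maximum segment size, bytes */
};

/* Reduced stand-in for struct cc_var (assumption). */
struct cc_var {
	union ccv_container {
		struct tcpcb	*tcp;
	} ccvc;
};

/* Simplified accessor; the kernel's definition may differ. */
#define	CCV(ccv, what)	((ccv)->ccvc.tcp->what)

/* Grow the congestion window by one segment through the accessor. */
static void
example_grow(struct cc_var *ccv)
{
	CCV(ccv, snd_cwnd) += CCV(ccv, t_maxseg);
}
```

Because the module only names the variable it wants, the indirection through
the container stays in one place.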
.Pp
Apart from the
.Va type
and
.Va ccv_container
fields, the remaining fields in
.Vt struct cc_var
are for use by
.Nm
modules.
.Pp
The
.Va cc_data
field is available for algorithms that require additional per-connection state;
they can attach a pointer to dynamically allocated memory to it.
The memory should be allocated and attached in the module's
.Va cb_init
hook function.
.Pp
The
.Va bytes_this_ack
field specifies the number of new bytes acknowledged by the most recently
received ACK packet.
It is only valid in the
.Va ack_received
hook function.
.Pp
The
.Va curack
field specifies the sequence number of the most recently received ACK packet.
It is only valid in the
.Va ack_received ,
.Va cong_signal
and
.Va post_recovery
hook functions.
.Pp
The
.Va flags
field is used to pass useful information from the stack to a
.Nm
module.
The CCF_ABC_SENTAWND flag is relevant in
.Va ack_received
and is set when appropriate byte counting (RFC 3465) has counted that a window's
worth of bytes has been sent.
It is the module's responsibility to clear the flag after it has processed the
signal.
The CCF_CWND_LIMITED flag is relevant in
.Va ack_received
and is set when the connection's ability to send data is currently constrained
by the value of the congestion window.
Algorithms should treat the absence of this flag as a cue to avoid accumulating
a large difference between the congestion window and send window.
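.Pp
A minimal sketch of how an
.Va ack_received
hook might honour both flags is given below.
The growth rule (one segment per window's worth of data) is chosen only to keep
the example short.
All numeric flag values are placeholders,
.Vt struct cc_var
is reduced to two fields, and
.Vt struct example_win
is a hypothetical structure holding the window so that the fragment compiles in
user space.

```c
#include <stdint.h>

/* Placeholder values; the real definitions live in the kernel headers. */
#define	CC_ACK			0x0001
#define	CCF_ABC_SENTAWND	0x0001u
#define	CCF_CWND_LIMITED	0x0002u

/* Reduced stand-in for struct cc_var (assumption). */
struct cc_var {
	uint32_t	flags;
	void		*cc_data;
};

/* Hypothetical window state, in bytes. */
struct example_win {
	uint32_t	cwnd;
	uint32_t	mss;
};

static void
example_ack_received(struct cc_var *ccv, uint16_t type)
{
	struct example_win *w = ccv->cc_data;

	if (type != CC_ACK)
		return;

	/* Sending is not limited by cwnd, so growing it gains nothing. */
	if ((ccv->flags & CCF_CWND_LIMITED) == 0)
		return;

	if (ccv->flags & CCF_ABC_SENTAWND) {
		/* Consume the signal; the module must clear the flag. */
		ccv->flags &= ~CCF_ABC_SENTAWND;
		w->cwnd += w->mss;
	}
}
```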
.Sh SEE ALSO
.Xr cc_cdg 4 ,
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr mod_cc 4 ,
.Xr tcp 4
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh FUTURE WORK
Integrate with
.Xr sctp 4 .
.Sh HISTORY
The modular Congestion Control (CC) framework first appeared in
.Fx 9.0 .
.Pp
The framework was first released in 2007 by James Healy and Lawrence Stewart
whilst working on the NewTCP research project at Swinburne University of
Technology's Centre for Advanced Internet Architectures, Melbourne, Australia,
which was made possible in part by a grant from the Cisco University Research
Program Fund at Community Foundation Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org ,
.An James Healy Aq Mt jimmy@deefa.com
and
.An David Hayes Aq Mt david.hayes@ieee.org .
.Pp
This manual page was written by
.An David Hayes Aq Mt david.hayes@ieee.org
and
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org .