.\"
.\" Copyright (c) 2008-2009 Lawrence Stewart <lstewart@FreeBSD.org>
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this documentation were written at the Centre for Advanced
.\" Internet Architectures, Swinburne University of Technology, Melbourne,
.\" Australia by David Hayes and Lawrence Stewart under sponsorship from the
.\" FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd December 25, 2014
.Dt MOD_CC 9
.Os
.Sh NAME
.Nm mod_cc ,
.Nm DECLARE_CC_MODULE ,
.Nm CCV
.Nd Modular Congestion Control
.Sh SYNOPSIS
.In netinet/cc.h
.In netinet/cc/cc_module.h
.Fn DECLARE_CC_MODULE "ccname" "ccalgo"
.Fn CCV "ccv" "what"
.Sh DESCRIPTION
The
.Nm
framework allows congestion control algorithms to be implemented as dynamically
loadable kernel modules via the
.Xr kld 4
facility.
Transport protocols can select from the list of available algorithms on a
connection-by-connection basis, or use the system default (see
.Xr mod_cc 4
for more details).
.Pp
.Nm
modules are identified by an
.Xr ascii 7
name and set of hook functions encapsulated in a
.Vt "struct cc_algo" ,
which has the following members:
.Bd -literal -offset indent
struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init) (void);
	int	(*mod_destroy) (void);
	int	(*cb_init) (struct cc_var *ccv);
	void	(*cb_destroy) (struct cc_var *ccv);
	void	(*conn_init) (struct cc_var *ccv);
	void	(*ack_received) (struct cc_var *ccv, uint16_t type);
	void	(*cong_signal) (struct cc_var *ccv, uint32_t type);
	void	(*post_recovery) (struct cc_var *ccv);
	void	(*after_idle) (struct cc_var *ccv);
};
.Ed
.Pp
The
.Va name
field identifies the unique name of the algorithm, and should be no longer than
TCP_CA_NAME_MAX-1 characters in length (the TCP_CA_NAME_MAX define lives in
.In netinet/tcp.h
for compatibility reasons).
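.Pp
As a rough illustration of the length constraint on
.Va name ,
the following userland sketch checks a candidate name before copying it into a
buffer of the same size.
The
.Fn cc_name_fits
and
.Fn cc_set_name
helpers are hypothetical and are not part of the framework.
The fallback value of 16 for TCP_CA_NAME_MAX is an assumption made only so that
the sketch compiles outside the kernel.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed fallback so the sketch builds in userland. */
#ifndef TCP_CA_NAME_MAX
#define TCP_CA_NAME_MAX 16
#endif

/* Hypothetical helper: non-zero if name leaves room for the NUL. */
static int
cc_name_fits(const char *name)
{
	return (strlen(name) < (size_t)TCP_CA_NAME_MAX);
}

/* Hypothetical helper: copy name into dst, or return -1 if too long. */
static int
cc_set_name(char dst[TCP_CA_NAME_MAX], const char *name)
{
	assert(dst != NULL && name != NULL);
	if (!cc_name_fits(name))
		return (-1);
	/* Copy the characters plus the terminating NUL. */
	memcpy(dst, name, strlen(name) + 1);
	return (0);
}
```
.Ed
.Pp
A module would normally set
.Va name
statically, so a check of this kind is mainly of interest when the name comes
from configuration or user input.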
.Pp
The
.Va mod_init
function is called when a new module is loaded into the system but before the
registration process is complete.
It should be implemented if a module needs to set up some global state prior to
being available for use by new connections.
Returning a non-zero value from
.Va mod_init
will cause the loading of the module to fail.
.Pp
The
.Va mod_destroy
function is called prior to unloading an existing module from the kernel.
It should be implemented if a module needs to clean up any global state before
being removed from the kernel.
The return value is currently ignored.
.Pp
The
.Va cb_init
function is called when a TCP control block
.Vt struct tcpcb
is created.
It should be implemented if a module needs to allocate memory for storing
private per-connection state.
Returning a non-zero value from
.Va cb_init
will cause the connection set up to be aborted, terminating the connection as a
result.
.Pp
The
.Va cb_destroy
function is called when a TCP control block
.Vt struct tcpcb
is destroyed.
It should be implemented if a module needs to free memory allocated in
.Va cb_init .
.Pp
The
.Va conn_init
function is called when a new connection has been established and variables are
being initialised.
It should be implemented to initialise congestion control algorithm variables
for the newly established connection.
.Pp
The
.Va ack_received
function is called when a TCP acknowledgement (ACK) packet is received.
Modules use the
.Fa type
argument as an input to their congestion management algorithms.
The ACK types currently reported by the stack are CC_ACK and CC_DUPACK.
CC_ACK indicates the received ACK acknowledges previously unacknowledged data.
CC_DUPACK indicates the received ACK acknowledges data we have already received
an ACK for.
.Pp
The
.Va cong_signal
function is called when a congestion event is detected by the TCP stack.
Modules use the
.Fa type
argument as an input to their congestion management algorithms.
The congestion event types currently reported by the stack are CC_ECN, CC_RTO,
CC_RTO_ERR and CC_NDUPACK.
CC_ECN is reported when the TCP stack receives an explicit congestion notification
(RFC3168).
CC_RTO is reported when the retransmission timeout timer fires.
CC_RTO_ERR is reported if the retransmission timeout timer fired in error.
CC_NDUPACK is reported if N duplicate ACKs have been received back-to-back,
where N is the fast retransmit duplicate ACK threshold (N=3 currently as per
RFC5681).
.Pp
The
.Va post_recovery
function is called after the TCP connection has recovered from a congestion event.
It should be implemented to adjust state as required.
.Pp
The
.Va after_idle
function is called when data transfer resumes after an idle period.
It should be implemented to adjust state as required.
.Pp
The
.Fn DECLARE_CC_MODULE
macro provides a convenient wrapper around the
.Xr DECLARE_MODULE 9
macro, and is used to register a
.Nm
module with the
.Nm
framework.
The
.Fa ccname
argument specifies the module's name.
The
.Fa ccalgo
argument points to the module's
.Vt struct cc_algo .
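.Pp
As a sketch of how a
.Va cong_signal
hook might dispatch on the event types described above, the following userland
example reacts to each type using window arithmetic loosely modelled on RFC5681.
It is not the code of any shipped module.
The
.Vt struct ex_state
type, the
.Fn ex_cong_signal
function and the EX_CC_* constants are hypothetical stand-ins with illustrative
values; a real module would receive a
.Vt struct cc_var
and use the CC_* definitions from
.In netinet/cc.h .
Halving the congestion window rather than the amount of data in flight is a
simplification.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for CC_ECN, CC_RTO, CC_RTO_ERR, CC_NDUPACK. */
enum {
	EX_CC_ECN = 1,
	EX_CC_RTO = 2,
	EX_CC_RTO_ERR = 3,
	EX_CC_NDUPACK = 4
};

/* Hypothetical per-connection state; all sizes are in bytes. */
struct ex_state {
	uint32_t cwnd;		/* congestion window */
	uint32_t ssthresh;	/* slow start threshold */
	uint32_t mss;		/* maximum segment size */
	uint32_t prev_cwnd;	/* saved when an RTO fires */
	uint32_t prev_ssthresh;	/* saved when an RTO fires */
};

static void
ex_cong_signal(struct ex_state *s, uint32_t type)
{
	uint32_t half;

	assert(s != NULL);

	/* Half the window, but never less than two segments. */
	half = s->cwnd / 2;
	if (half < 2 * s->mss)
		half = 2 * s->mss;

	switch (type) {
	case EX_CC_ECN:
	case EX_CC_NDUPACK:
		/* Congestion without a timeout: halve the window. */
		s->ssthresh = half;
		s->cwnd = half;
		break;
	case EX_CC_RTO:
		/* Timeout: remember old values, restart from one segment. */
		s->prev_cwnd = s->cwnd;
		s->prev_ssthresh = s->ssthresh;
		s->ssthresh = half;
		s->cwnd = s->mss;
		break;
	case EX_CC_RTO_ERR:
		/* The timeout was spurious: undo its effect. */
		s->cwnd = s->prev_cwnd;
		s->ssthresh = s->prev_ssthresh;
		break;
	default:
		/* Unknown event types are ignored. */
		break;
	}
}
```
.Ed
.Pp
Ignoring unrecognised types in the default case keeps the sketch tolerant of
event types it does not know about.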
.Pp
.Nm
modules must instantiate a
.Vt struct cc_algo ,
but are only required to set the name field, and optionally any of the function
pointers.
The stack will skip calling any function pointer which is NULL, so there is no
requirement to implement any of the function pointers.
Using the C99 designated initialiser feature to set fields is encouraged.
.Pp
Each function pointer which deals with congestion control state is passed a
pointer to a
.Vt struct cc_var ,
which has the following members:
.Bd -literal -offset indent
struct cc_var {
	void		*cc_data;
	int		bytes_this_ack;
	tcp_seq		curack;
	uint32_t	flags;
	int		type;
	union ccv_container {
		struct tcpcb		*tcp;
		struct sctp_nets	*sctp;
	} ccvc;
};
.Ed
.Pp
.Vt struct cc_var
groups congestion control related variables into a single, embeddable structure
and adds a layer of indirection to accessing transport protocol control blocks.
The eventual goal is to allow a single set of
.Nm
modules to be shared between all congestion aware transport protocols, though
currently only
.Xr tcp 4
is supported.
.Pp
To aid the eventual transition towards this goal, direct use of variables from
the transport protocol's data structures is strongly discouraged.
However, access to some of these variables is currently unavoidable, and so the
.Fn CCV
macro exists as a convenience accessor.
The
.Fa ccv
argument points to the
.Vt struct cc_var
passed into the function by the
.Nm
framework.
The
.Fa what
argument specifies the name of the variable to access.
.Pp
Apart from the
.Va type
and
.Va ccv_container
fields, the remaining fields in
.Vt struct cc_var
are for use by
.Nm
modules.
.Pp
The
.Va cc_data
field is available for algorithms requiring additional per-connection state to
attach a dynamic memory pointer to.
The memory should be allocated and attached in the module's
.Va cb_init
hook function.
.Pp
The
.Va bytes_this_ack
field specifies the number of new bytes acknowledged by the most recently
received ACK packet.
It is only valid in the
.Va ack_received
hook function.
.Pp
The
.Va curack
field specifies the sequence number of the most recently received ACK packet.
It is only valid in the
.Va ack_received ,
.Va cong_signal
and
.Va post_recovery
hook functions.
.Pp
The
.Va flags
field is used to pass useful information from the stack to a
.Nm
module.
The CCF_ABC_SENTAWND flag is relevant in
.Va ack_received
and is set when appropriate byte counting (RFC3465) has counted that a window's
worth of bytes has been sent.
It is the module's responsibility to clear the flag after it has processed the
signal.
The CCF_CWND_LIMITED flag is relevant in
.Va ack_received
and is set when the connection's ability to send data is currently constrained
by the value of the congestion window.
Algorithms should check for the absence of this flag to avoid accumulating
a large difference between the congestion window and send window.
.Sh SEE ALSO
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr mod_cc 4 ,
.Xr tcp 4
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh FUTURE WORK
Integrate with
.Xr sctp 4 .
.Sh HISTORY
The modular Congestion Control (CC) framework first appeared in
.Fx 9.0 .
.Pp
The framework was first released in 2007 by James Healy and Lawrence Stewart
whilst working on the NewTCP research project at Swinburne University of
Technology's Centre for Advanced Internet Architectures, Melbourne, Australia,
which was made possible in part by a grant from the Cisco University Research
Program Fund at Community Foundation Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org ,
.An James Healy Aq Mt jimmy@deefa.com
and
.An David Hayes Aq Mt david.hayes@ieee.org .
.Pp
This manual page was written by
.An David Hayes Aq Mt david.hayes@ieee.org
and
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org .