.\"
.\" Copyright (c) 2008-2009 Lawrence Stewart <lstewart@FreeBSD.org>
.\" Copyright (c) 2010-2011 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this documentation were written at the Centre for Advanced
.\" Internet Architectures, Swinburne University of Technology, Melbourne,
.\" Australia by David Hayes and Lawrence Stewart under sponsorship from the
.\" FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd January 21, 2016
.Dt MOD_CC 9
.Os
.Sh NAME
.Nm mod_cc ,
.Nm DECLARE_CC_MODULE ,
.Nm CCV
.Nd Modular Congestion Control
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/cc/cc.h
.In netinet/cc/cc_module.h
.Fn DECLARE_CC_MODULE "ccname" "ccalgo"
.Fn CCV "ccv" "what"
.Sh DESCRIPTION
The
.Nm
framework allows congestion control algorithms to be implemented as dynamically
loadable kernel modules via the
.Xr kld 4
facility.
Transport protocols can select from the list of available algorithms on a
connection-by-connection basis, or use the system default (see
.Xr mod_cc 4
for more details).
.Pp
.Nm
modules are identified by an
.Xr ascii 7
name and set of hook functions encapsulated in a
.Vt "struct cc_algo" ,
which has the following members:
.Bd -literal -offset indent
struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init) (void);
	int	(*mod_destroy) (void);
	int	(*cb_init) (struct cc_var *ccv);
	void	(*cb_destroy) (struct cc_var *ccv);
	void	(*conn_init) (struct cc_var *ccv);
	void	(*ack_received) (struct cc_var *ccv, uint16_t type);
	void	(*cong_signal) (struct cc_var *ccv, uint32_t type);
	void	(*post_recovery) (struct cc_var *ccv);
	void	(*after_idle) (struct cc_var *ccv);
	int	(*ctl_output)(struct cc_var *, struct sockopt *, void *);
};
.Ed
.Pp
The
.Va name
field identifies the unique name of the algorithm, and should be no longer than
TCP_CA_NAME_MAX-1 characters in length (the TCP_CA_NAME_MAX define lives in
.In netinet/tcp.h
for compatibility reasons).
.Pp
The
.Va mod_init
function is called when a new module is loaded into the system but before the
registration process is complete.
It should be implemented if a module needs to set up some global state prior to
being available for use by new connections.
Returning a non-zero value from
.Va mod_init
will cause the loading of the module to fail.
.Pp
The
.Va mod_destroy
function is called prior to unloading an existing module from the kernel.
It should be implemented if a module needs to clean up any global state before
being removed from the kernel.
The return value is currently ignored.
.Pp
The
.Va cb_init
function is called when a TCP control block
.Vt struct tcpcb
is created.
It should be implemented if a module needs to allocate memory for storing
private per-connection state.
Returning a non-zero value from
.Va cb_init
will cause the connection setup to be aborted, terminating the connection as a
result.
.Pp
The
.Va cb_destroy
function is called when a TCP control block
.Vt struct tcpcb
is destroyed.
It should be implemented if a module needs to free memory allocated in
.Va cb_init .
.Pp
The
.Va conn_init
function is called when a new connection has been established and variables are
being initialised.
It should be implemented to initialise congestion control algorithm variables
for the newly established connection.
.Pp
The
.Va ack_received
function is called when a TCP acknowledgement (ACK) packet is received.
Modules use the
.Fa type
argument as an input to their congestion management algorithms.
The ACK types currently reported by the stack are CC_ACK and CC_DUPACK.
CC_ACK indicates the received ACK acknowledges previously unacknowledged data.
CC_DUPACK indicates the received ACK acknowledges data we have already received
an ACK for.
.Pp
The
.Va cong_signal
function is called when a congestion event is detected by the TCP stack.
Modules use the
.Fa type
argument as an input to their congestion management algorithms.
The congestion event types currently reported by the stack are CC_ECN, CC_RTO,
CC_RTO_ERR and CC_NDUPACK.
CC_ECN is reported when the TCP stack receives an explicit congestion notification
(RFC3168).
CC_RTO is reported when the retransmission timeout timer fires.
CC_RTO_ERR is reported if the retransmission timeout timer fired in error.
CC_NDUPACK is reported if N duplicate ACKs have been received back-to-back,
where N is the fast retransmit duplicate ACK threshold (N=3 currently as per
RFC5681).
.Pp
The
.Va post_recovery
function is called after the TCP connection has recovered from a congestion event.
It should be implemented to adjust state as required.
.Pp
The
.Va after_idle
function is called when data transfer resumes after an idle period.
It should be implemented to adjust state as required.
.Pp
The
.Va ctl_output
function is called when
.Xr getsockopt 2
or
.Xr setsockopt 2
is called on a
.Xr tcp 4
socket.
It is passed the
.Vt struct sockopt
pointer forwarded unmodified from the TCP socket option code, and a
.Vt void *
pointer to an algorithm-specific argument.
.Pp
The
.Fn DECLARE_CC_MODULE
macro provides a convenient wrapper around the
.Xr DECLARE_MODULE 9
macro, and is used to register a
.Nm
module with the
.Nm
framework.
The
.Fa ccname
argument specifies the module's name.
The
.Fa ccalgo
argument points to the module's
.Vt struct cc_algo .
.Pp
.Nm
modules must instantiate a
.Vt struct cc_algo ,
but are only required to set the name field, and optionally any of the function
pointers.
The stack will skip calling any function pointer which is NULL, so there is no
requirement to implement any of the function pointers.
Using the C99 designated initialiser feature to set fields is encouraged.
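.Pp
The following is a minimal userland sketch of the designated initialiser style
and of the NULL-hook convention described above; it is not kernel code.
The
.Vt struct cc_algo
definition is copied from this page so the fragment compiles on its own, the
value given for TCP_CA_NAME_MAX is a stand-in, and the names
example_cc_algo, example_mod_init, example_init_calls and example_run_mod_init
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in value; the real define lives in <netinet/tcp.h>. */
#define TCP_CA_NAME_MAX 16

/* Opaque here; the real definitions come from the kernel headers. */
struct cc_var;
struct sockopt;

/* Copy of the structure shown in this page, for a standalone build. */
struct cc_algo {
	char	name[TCP_CA_NAME_MAX];
	int	(*mod_init) (void);
	int	(*mod_destroy) (void);
	int	(*cb_init) (struct cc_var *ccv);
	void	(*cb_destroy) (struct cc_var *ccv);
	void	(*conn_init) (struct cc_var *ccv);
	void	(*ack_received) (struct cc_var *ccv, uint16_t type);
	void	(*cong_signal) (struct cc_var *ccv, uint32_t type);
	void	(*post_recovery) (struct cc_var *ccv);
	void	(*after_idle) (struct cc_var *ccv);
	int	(*ctl_output)(struct cc_var *, struct sockopt *, void *);
};

/* Hypothetical global state set up by the module's mod_init hook. */
static int example_init_calls;

static int
example_mod_init(void)
{
	example_init_calls++;
	return (0);	/* a non-zero return would make the load fail */
}

/*
 * Only the name and one hook are set; every other function pointer is
 * implicitly NULL because of the designated initialiser.
 */
static struct cc_algo example_cc_algo = {
	.name = "example",
	.mod_init = example_mod_init,
};

/*
 * Hypothetical stand-in for the framework side: a NULL hook is skipped
 * and treated as success.
 */
static int
example_run_mod_init(struct cc_algo *algo)
{
	if (algo->mod_init != NULL)
		return (algo->mod_init());
	return (0);
}

/*
 * In a real module the structure would be registered with:
 *	DECLARE_CC_MODULE(example, &example_cc_algo);
 */
```

Leaving unused hooks out of the initialiser keeps the definition short and
makes it obvious which events the algorithm reacts to.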
.Pp
Each function pointer which deals with congestion control state is passed a
pointer to a
.Vt struct cc_var ,
which has the following members:
.Bd -literal -offset indent
struct cc_var {
	void		*cc_data;
	int		bytes_this_ack;
	tcp_seq		curack;
	uint32_t	flags;
	int		type;
	union ccv_container {
		struct tcpcb		*tcp;
		struct sctp_nets	*sctp;
	} ccvc;
};
.Ed
.Pp
.Vt struct cc_var
groups congestion control related variables into a single, embeddable structure
and adds a layer of indirection to accessing transport protocol control blocks.
The eventual goal is to allow a single set of
.Nm
modules to be shared between all congestion aware transport protocols, though
currently only
.Xr tcp 4
is supported.
.Pp
To aid the eventual transition towards this goal, direct use of variables from
the transport protocol's data structures is strongly discouraged.
However, access to some of these variables is currently unavoidable, and so the
.Fn CCV
macro exists as a convenience accessor.
The
.Fa ccv
argument points to the
.Vt struct cc_var
passed into the function by the
.Nm
framework.
The
.Fa what
argument specifies the name of the variable to access.
.Pp
Apart from the
.Va type
and
.Va ccvc
fields, the remaining fields in
.Vt struct cc_var
are for use by
.Nm
modules.
.Pp
The
.Va cc_data
field is available for algorithms requiring additional per-connection state to
attach a dynamic memory pointer to.
The memory should be allocated and attached in the module's
.Va cb_init
hook function.
.Pp
The
.Va bytes_this_ack
field specifies the number of new bytes acknowledged by the most recently
received ACK packet.
It is only valid in the
.Va ack_received
hook function.
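.Pp
The use of
.Va cc_data
and
.Va bytes_this_ack
described above can be sketched in userland as follows.
This is an illustration under stated assumptions rather than kernel code:
.Vt struct cc_var
is reduced to the two fields used, the value given for CC_ACK is a stand-in,
the standard C library allocator takes the place of the kernel's
.Xr malloc 9 ,
and struct example_state and the example_* functions are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in value; the real constant comes from the kernel headers. */
#define CC_ACK 0x0001

/* Reduced stand-in for struct cc_var: only the fields used below. */
struct cc_var {
	void	*cc_data;
	int	bytes_this_ack;
};

/* Hypothetical private per-connection state. */
struct example_state {
	uint64_t	acked_bytes;
};

/* cb_init: allocate the private state and attach it to cc_data. */
static int
example_cb_init(struct cc_var *ccv)
{
	struct example_state *st;

	st = calloc(1, sizeof(*st));
	if (st == NULL)
		return (ENOMEM);	/* non-zero aborts connection setup */
	ccv->cc_data = st;
	return (0);
}

/* cb_destroy: release what cb_init allocated. */
static void
example_cb_destroy(struct cc_var *ccv)
{
	free(ccv->cc_data);
	ccv->cc_data = NULL;
}

/* ack_received: the only hook in which bytes_this_ack is valid. */
static void
example_ack_received(struct cc_var *ccv, uint16_t type)
{
	struct example_state *st = ccv->cc_data;

	if (type == CC_ACK)
		st->acked_bytes += (uint64_t)ccv->bytes_this_ack;
}
```

Pairing the allocation in
.Va cb_init
with the release in
.Va cb_destroy
keeps the lifetime of the private state tied to that of the control block.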
.Pp
The
.Va curack
field specifies the sequence number of the most recently received ACK packet.
It is only valid in the
.Va ack_received ,
.Va cong_signal
and
.Va post_recovery
hook functions.
.Pp
The
.Va flags
field is used to pass useful information from the stack to a
.Nm
module.
The CCF_ABC_SENTAWND flag is relevant in
.Va ack_received
and is set when appropriate byte counting (RFC3465) has counted a window's worth
of bytes sent.
It is the module's responsibility to clear the flag after it has processed the
signal.
The CCF_CWND_LIMITED flag is relevant in
.Va ack_received
and is set when the connection's ability to send data is currently constrained
by the value of the congestion window.
Algorithms should treat the absence of this flag as a cue to avoid accumulating
a large difference between the congestion window and the send window.
.Sh SEE ALSO
.Xr cc_cdg 4 ,
.Xr cc_chd 4 ,
.Xr cc_cubic 4 ,
.Xr cc_hd 4 ,
.Xr cc_htcp 4 ,
.Xr cc_newreno 4 ,
.Xr cc_vegas 4 ,
.Xr mod_cc 4 ,
.Xr tcp 4
.Sh ACKNOWLEDGEMENTS
Development and testing of this software were made possible in part by grants
from the FreeBSD Foundation and Cisco University Research Program Fund at
Community Foundation Silicon Valley.
.Sh FUTURE WORK
Integrate with
.Xr sctp 4 .
.Sh HISTORY
The modular Congestion Control (CC) framework first appeared in
.Fx 9.0 .
.Pp
The framework was first released in 2007 by James Healy and Lawrence Stewart
whilst working on the NewTCP research project at Swinburne University of
Technology's Centre for Advanced Internet Architectures, Melbourne, Australia,
which was made possible in part by a grant from the Cisco University Research
Program Fund at Community Foundation Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org ,
.An James Healy Aq Mt jimmy@deefa.com
and
.An David Hayes Aq Mt david.hayes@ieee.org .
.Pp
This manual page was written by
.An David Hayes Aq Mt david.hayes@ieee.org
and
.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org .