.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 10, 2017
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules must never modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
deregisters the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
Note that this is the only one of the three registration functions that
automatically registers the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field.
If a module uses one of the other registration functions, it may request that
the system register the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field by explicitly providing that name.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional functions */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure that this function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack before calling
.Xr connect 2
or
.Xr listen 2 .
Optionally, a TCP stack may also allow a user to begin using the TCP stack for
a connection that is in a later state by setting a non-NULL function pointer in
the
.Va tfb_tcp_handoff_ok
field.
If this field is non-NULL and a user attempts to select that TCP stack after
calling
.Xr connect 2
or
.Xr listen 2
for that socket, the kernel will call the function pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
Otherwise, the function should return an error code, which will be returned to
the user.
If the
.Va tfb_tcp_handoff_ok
field is
.Dv NULL
and a user attempts to select the TCP stack after calling
.Xr connect 2
or
.Xr listen 2
for that socket, the operation will fail and the kernel will return
.Er EINVAL .
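.Pp
The late-selection rule for
.Va tfb_tcp_handoff_ok
can be modeled in a few lines.
What follows is a minimal userland sketch, not kernel code.
The two structures are reduced stand-ins for the kernel definitions, and the
names
.Fn model_late_switch ,
.Fn example_handoff_ok ,
and
.Dv MODEL_ESTABLISHED
are hypothetical and exist only for this illustration.
The error code returned when the example stack refuses the handoff is also an
assumption; a real stack may return any error it considers appropriate.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures (assumption, not the real layout). */
struct tcpcb {
	int	t_state;
};

struct tcp_function_block {
	const char *tfb_tcp_block_name;
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
};

/* Hypothetical state value used only by this model. */
#define	MODEL_ESTABLISHED	4

/*
 * Hypothetical handoff check: accept the socket only when the connection
 * is in the one state this example stack claims to understand.
 */
static int
example_handoff_ok(struct tcpcb *tp)
{
	return (tp->t_state == MODEL_ESTABLISHED ? 0 : EOPNOTSUPP);
}

/*
 * Model of the documented rule for selecting a stack after connect(2) or
 * listen(2): a NULL handoff pointer yields EINVAL; otherwise the result of
 * the stack's own check is passed back unchanged.
 */
static int
model_late_switch(const struct tcp_function_block *blk, struct tcpcb *tp)
{
	if (blk->tfb_tcp_handoff_ok == NULL)
		return (EINVAL);
	return (blk->tfb_tcp_handoff_ok(tp));
}
```

In this model, a function block whose
.Va tfb_tcp_handoff_ok
field is
.Dv NULL
always yields
.Er EINVAL ,
while a block that supplies the check decides the outcome itself.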
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .