.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 6, 2024
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled into the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules must never modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
deregisters the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
Note that this is the only one of the three registration functions that
automatically registers the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field.
If a module uses one of the other registration functions, it may request that
the system register the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field by explicitly providing that name.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional function */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	/* Mandatory function */
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform actions equivalent to those of
the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure that this function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack at any time.
Therefore, the
.Va tfb_tcp_handoff_ok
function pointer must be non-NULL.
If a user attempts to select that TCP stack, the kernel will call the function
pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
In this case, the kernel will call the function pointed to by
.Va tfb_tcp_fb_init
if this function pointer is non-NULL and finally perform the stack switch.
If the user is not allowed to switch the socket, the function should undo any
changes it made to the connection state configuration and return an error code,
which will be returned to the user.
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
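.Pp
The fields described above can be tied together in a small loadable module.
The following is a minimal, untested sketch that follows the prototypes
shown in this manual page; the declarations in
.In netinet/tcp_var.h
may differ between releases and take precedence, and the list of headers
may need adjusting.
Every name containing
.Dq example ,
the
.Dv M_EXAMPLE
malloc type, and
.Vt "struct example_state"
are hypothetical.
The three mandatory handlers are only declared, because their bodies are
specific to each stack.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/module.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <netinet/in.h>
#include <netinet/in_pcb.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

/* Hypothetical per-connection state kept in t_fb_ptr. */
struct example_state {
	uint32_t	es_segments;
};

static MALLOC_DEFINE(M_EXAMPLE, "tcpexample", "example TCP stack state");

/* Mandatory handlers; hypothetical, defined elsewhere by the stack. */
int	example_output(struct tcpcb *);
void	example_do_segment(struct mbuf *, struct tcphdr *,
	    struct socket *, struct tcpcb *, int, int, uint8_t, int);
int	example_ctloutput(struct socket *, struct sockopt *,
	    struct inpcb *, struct tcpcb *);

static void
example_fb_init(struct tcpcb *tp)
{
	/*
	 * t_fb_ptr may hold a stale value, so always overwrite it.
	 * The prototype shown in this page returns void, so a failed
	 * allocation leaves NULL behind and the other handlers must
	 * tolerate that.
	 */
	tp->t_fb_ptr = malloc(sizeof(struct example_state), M_EXAMPLE,
	    M_NOWAIT | M_ZERO);
}

static void
example_fb_fini(struct tcpcb *tp, int tcb_is_purged __unused)
{
	/* Flag is 0 on a stack switch and 1 on tcpcb teardown. */
	free(tp->t_fb_ptr, M_EXAMPLE);
	tp->t_fb_ptr = NULL;
}

static int
example_handoff_ok(struct tcpcb *tp __unused)
{
	/* Accept every connection; return an errno value to refuse. */
	return (0);
}

static struct tcp_function_block example_blk = {
	.tfb_tcp_block_name = "example",
	.tfb_tcp_output = example_output,
	.tfb_tcp_do_segment = example_do_segment,
	.tfb_tcp_ctloutput = example_ctloutput,
	.tfb_tcp_fb_init = example_fb_init,
	.tfb_tcp_fb_fini = example_fb_fini,
	.tfb_tcp_handoff_ok = example_handoff_ok,
	/* The optional timer and retransmit hooks stay NULL. */
};

static int
example_modevent(module_t mod __unused, int type, void *data __unused)
{
	switch (type) {
	case MOD_LOAD:
		return (register_tcp_functions(&example_blk, M_WAITOK));
	case MOD_UNLOAD:
		/* A non-zero return refuses the unload. */
		return (deregister_tcp_functions(&example_blk));
	default:
		return (EOPNOTSUPP);
	}
}

static moduledata_t example_mod = {
	.name = "tcp_example",
	.evhand = example_modevent,
	.priv = NULL,
};
DECLARE_MODULE(tcp_example, example_mod, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY);
MODULE_VERSION(tcp_example, 1);
.Ed
.Pp
In this sketch an unload attempt fails with
.Er EBUSY
while sockets still use the stack.
Because the failed call marks the function block as being removed, no new
sockets can select it, so a later unload attempt can succeed once the
remaining connections have closed.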
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should not assume that the
.Va t_fb_ptr
pointer has been initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .