.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd July 13, 2024
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behavior.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules may not ever modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
de-registers the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
Note that this is the only one of the three registration functions that
automatically registers the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field.
If a module uses one of the other registration functions, it may request that
the system register the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field by explicitly providing that name.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This number must not exceed TCP_FUNCTION_NAME_NUM_MAX.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
        char    tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
        int     (*tfb_tcp_output)(struct tcpcb *);
        void    (*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
                            struct socket *, struct tcpcb *,
                            int, int, uint8_t,
                            int);
        int     (*tfb_tcp_ctloutput)(struct socket *so,
                            struct sockopt *sopt,
                            struct inpcb *inp, struct tcpcb *tp);
        /* Optional memory allocation/free routine */
        void    (*tfb_tcp_fb_init)(struct tcpcb *);
        void    (*tfb_tcp_fb_fini)(struct tcpcb *, int);
        /* Optional timers, must define all if you define one */
        int     (*tfb_tcp_timer_stop_all)(struct tcpcb *);
        void    (*tfb_tcp_timer_activate)(struct tcpcb *,
                            uint32_t, u_int);
        int     (*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
        void    (*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
        /* Optional function */
        void    (*tfb_tcp_rexmit_tmr)(struct tcpcb *);
        /* Mandatory function */
        int     (*tfb_tcp_handoff_ok)(struct tcpcb *);
        /* System use */
        volatile uint32_t tfb_refcnt;
        uint32_t tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters in length.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure that the function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack at any time.
Therefore, the
.Va tfb_tcp_handoff_ok
function pointer must be non-NULL.
If a user attempts to select that TCP stack, the kernel will call the function
pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
In this case, the kernel will call the function pointed to by
.Va tfb_tcp_fb_init
if this function pointer is non-NULL and finally perform the stack switch.
If the user is not allowed to switch the socket, the function should undo any
changes it made to the connection state configuration and return an error code,
which will be returned to the user.
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
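.Pp
The rules above about which function pointers must be set can be summarized
in a small userland model.
The following sketch is illustrative only and is not kernel code:
.Vt "struct fb_model" ,
.Fn fb_model_minimal ,
.Fn fb_model_check ,
and the stub functions are hypothetical names invented for this example, and
the use of
.Er EINVAL
as the result is an assumption made for the model rather than a statement
about which checks the kernel performs at registration time.
.Bd -literal -offset indent
```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-ins for kernel types; only pointers to them are used. */
struct mbuf;
struct tcphdr;
struct socket;
struct sockopt;
struct inpcb;
struct tcpcb;

/*
 * Hypothetical userland mirror of the function pointers described in
 * this section.  It is not the kernel's struct tcp_function_block.
 */
struct fb_model {
	int	(*output)(struct tcpcb *);
	void	(*do_segment)(struct mbuf *, struct tcphdr *,
		    struct socket *, struct tcpcb *, int, int, uint8_t, int);
	int	(*ctloutput)(struct socket *, struct sockopt *,
		    struct inpcb *, struct tcpcb *);
	int	(*timer_stop_all)(struct tcpcb *);
	void	(*timer_activate)(struct tcpcb *, uint32_t, unsigned int);
	int	(*timer_active)(struct tcpcb *, uint32_t);
	void	(*timer_stop)(struct tcpcb *, uint32_t);
	int	(*handoff_ok)(struct tcpcb *);
};

/* Do-nothing stubs so that a model can be filled in. */
static int
stub_tcpcb(struct tcpcb *tp)
{
	(void)tp;
	return (0);
}

static void
stub_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so,
    struct tcpcb *tp, int a, int b, uint8_t c, int d)
{
	(void)m; (void)th; (void)so; (void)tp;
	(void)a; (void)b; (void)c; (void)d;
}

static int
stub_ctloutput(struct socket *so, struct sockopt *sopt, struct inpcb *inp,
    struct tcpcb *tp)
{
	(void)so; (void)sopt; (void)inp; (void)tp;
	return (0);
}

/* Return a model with only the mandatory pointers set. */
struct fb_model
fb_model_minimal(void)
{
	struct fb_model fb = { NULL };	/* remaining members are NULL too */

	fb.output = stub_tcpcb;
	fb.do_segment = stub_do_segment;
	fb.ctloutput = stub_ctloutput;
	fb.handoff_ok = stub_tcpcb;
	return (fb);
}

/* Return 0 if the rules in the text hold, or EINVAL (assumed) if not. */
int
fb_model_check(const struct fb_model *fb)
{
	int ntimers;

	if (fb == NULL)
		return (EINVAL);
	/* The three core functions and handoff_ok must be non-NULL. */
	if (fb->output == NULL || fb->do_segment == NULL ||
	    fb->ctloutput == NULL || fb->handoff_ok == NULL)
		return (EINVAL);
	/* The four timer functions are all set or all NULL. */
	ntimers = (fb->timer_stop_all != NULL) +
	    (fb->timer_activate != NULL) +
	    (fb->timer_active != NULL) + (fb->timer_stop != NULL);
	if (ntimers != 0 && ntimers != 4)
		return (EINVAL);
	return (0);
}
```
.Ed
.Pp
In this model, a block with only the mandatory pointers set passes the check,
while a block that sets just one of the four timer pointers is rejected.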
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
The
.Fa blk
is already registered or a function block is already registered with the same
name.
.El
Additionally,
.Fn register_tcp_functions_as_names
will fail if:
.Bl -tag -width Er
.It Bq Er E2BIG
The number of names pointed to by the
.Fa num_names
argument is larger than TCP_FUNCTION_NAME_NUM_MAX.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .