.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 10, 2017
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules must never modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
de-registers the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
Note that this is the only one of the three registration functions that
automatically registers the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field.
If a module uses one of the other registration functions, it may request that
the system register the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field by explicitly providing that name.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional functions */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure that this function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack before calling
.Xr connect 2
or
.Xr listen 2 .
Optionally, a TCP stack may also allow a user to begin using the TCP stack for
a connection that is in a later state by setting a non-NULL function pointer in
the
.Va tfb_tcp_handoff_ok
field.
If this field is non-NULL and a user attempts to select that TCP stack after
calling
.Xr connect 2
or
.Xr listen 2
for that socket, the kernel will call the function pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
Otherwise, the function should return an error code, which will be returned to
the user.
If the
.Va tfb_tcp_handoff_ok
field is
.Dv NULL
and a user attempts to select the TCP stack after calling
.Xr connect 2
or
.Xr listen 2
for that socket, the operation will fail and the kernel will return
.Er EINVAL .
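.Pp
The following is a minimal userland sketch of the kind of decision a
.Va tfb_tcp_handoff_ok
function might make.
It is an illustration only, not kernel code: the
.Vt "struct example_tcpcb"
type, the
.Dv EX_TCPS_*
constants, and the
.Fn example_handoff_ok
function are hypothetical stand-ins, and the use of
.Er EOPNOTSUPP
as the error code is an assumption rather than a requirement of the framework.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical connection states; stand-ins for the kernel's TCP states. */
enum example_tcp_state {
	EX_TCPS_CLOSED,
	EX_TCPS_LISTEN,
	EX_TCPS_SYN_SENT,
	EX_TCPS_ESTABLISHED,
	EX_TCPS_CLOSING
};

/* Hypothetical, heavily reduced stand-in for struct tcpcb. */
struct example_tcpcb {
	enum example_tcp_state t_state;
};

/*
 * Sketch of a handoff check: return 0 to accept the connection, or an
 * errno value that would be passed back to the user.  This example
 * stack accepts connections that have progressed no further than the
 * established state.
 */
int
example_handoff_ok(struct example_tcpcb *tp)
{
	assert(tp != NULL);
	if (tp->t_state <= EX_TCPS_ESTABLISHED)
		return (0);
	return (EOPNOTSUPP);
}
```
.Ed
.Pp
A real stack would perform the equivalent check on the kernel's TCP control
block and store a pointer to its function in the
.Va tfb_tcp_handoff_ok
field before registering the function block.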
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .