.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 28, 2016
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
.Nm
modules must call the
.Fn register_tcp_functions
function during initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fn register_tcp_functions
function requests that the system add a specified function block to the system.
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
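.Pp
As an illustration of the calls described above, a loadable module might
register a function block when it is loaded and deregister it before it is
unloaded.
The following is a minimal sketch under stated assumptions, not a definitive
implementation.
The stack name
.Dq example ,
the identifiers
.Va example_stack ,
.Fn example_modevent ,
and
.Va example_mod ,
the list of headers, and the choice of
.Dv SI_SUB_PROTO_DOMAIN
are assumptions made for this example.
The function block simply reuses the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions.
.Bd -literal -offset indent
/* Header list is approximate; more may be needed on a given release. */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/module.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <netinet/in.h>
#include <netinet/in_pcb.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

/* Hypothetical stack that reuses the default handlers. */
static struct tcp_function_block example_stack = {
	.tfb_tcp_block_name = "example",
	.tfb_tcp_output = tcp_output,
	.tfb_tcp_do_segment = tcp_do_segment,
	.tfb_tcp_ctloutput = tcp_default_ctloutput,
};

static int
example_modevent(module_t mod, int type, void *data)
{

	switch (type) {
	case MOD_LOAD:
		/* The wait argument is passed through to malloc(9). */
		return (register_tcp_functions(&example_stack, M_WAITOK));
	case MOD_UNLOAD:
		/*
		 * Returns EBUSY while sockets still use the stack.
		 * Returning the error is expected to make the unload
		 * fail, so the module stays loaded until it succeeds.
		 */
		return (deregister_tcp_functions(&example_stack));
	default:
		return (EOPNOTSUPP);
	}
}

static moduledata_t example_mod = {
	.name = "tcp_example",
	.evhand = example_modevent,
	.priv = NULL
};

MODULE_VERSION(tcp_example, 1);
DECLARE_MODULE(tcp_example, example_mod, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY);
.Ed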
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional functions */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure that this function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
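.Pp
The initialization and cleanup hooks described above might be paired as in
the following sketch, which is an illustration under stated assumptions rather
than a definitive implementation.
The names
.Vt "struct example_state" ,
.Dv M_EXAMPLE ,
.Fn example_fb_init ,
and
.Fn example_fb_fini
are hypothetical, the header list is approximate, and the sketch assumes that
the hooks may be called in a context that cannot sleep.
It keeps its state in the
.Va t_fb_ptr
field of the TCP control block, which is reserved for per-connection storage
(see
.Sx Requirements for Alternate TCP Stacks ) .
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

static MALLOC_DEFINE(M_EXAMPLE, "tcpexample", "example TCP stack state");

/* Hypothetical per-connection state. */
struct example_state {
	uint32_t	es_segments;
};

static void
example_fb_init(struct tcpcb *tp)
{

	/* t_fb_ptr may hold a stale value, so always overwrite it. */
	tp->t_fb_ptr = malloc(sizeof(struct example_state), M_EXAMPLE,
	    M_NOWAIT | M_ZERO);
}

static void
example_fb_fini(struct tcpcb *tp, int tcb_is_purged)
{

	/*
	 * tcb_is_purged is 0 when the socket is moving to another
	 * stack and 1 when the control block is being destroyed.
	 * The state is released in both cases.
	 */
	free(tp->t_fb_ptr, M_EXAMPLE);
	tp->t_fb_ptr = NULL;
}
.Ed
.Pp
Because the initialization hook returns
.Vt void
in this structure, an allocation failure in the sketch leaves
.Va t_fb_ptr
set to
.Dv NULL ,
so the stack's other functions would need to check the pointer before use.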
.Pp
A user may select a new TCP stack before calling
.Xr connect 2
or
.Xr listen 2 .
Optionally, a TCP stack may also allow a user to begin using the TCP stack for
a connection that is in a later state by setting a non-NULL function pointer in
the
.Va tfb_tcp_handoff_ok
field.
If this field is non-NULL and a user attempts to select that TCP stack after
calling
.Xr connect 2
or
.Xr listen 2
for that socket, the kernel will call the function pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
Otherwise, the function should return an error code, which will
be returned to the user.
If the
.Va tfb_tcp_handoff_ok
field is
.Dv NULL
and a user attempts to select the TCP stack after calling
.Xr connect 2
or
.Xr listen 2
for that socket, the operation will fail and the kernel will return
.Er EINVAL .
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .