.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 6, 2024
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules must never modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
deregisters the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
Note that this is the only one of the three registration functions that
automatically registers the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field.
If a module uses one of the other registration functions, it may request that
the system register the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field by explicitly providing that name.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
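.Pp
The all-or-nothing contract described above can be modeled in a few lines of
userland C.
This is a sketch under stated assumptions, not the kernel implementation:
.Fn mock_register_as_names ,
.Fn mock_name_taken ,
and the fixed-size table are hypothetical names invented for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_NAMES 8	/* hypothetical table size */

/* Hypothetical stand-in for the kernel's list of registered names. */
static const char *mock_registry[MOCK_MAX_NAMES];
static int mock_count;

/* Return 1 if the name is already in the mock table, 0 otherwise. */
int
mock_name_taken(const char *name)
{
	int i;

	for (i = 0; i < mock_count; i++)
		if (strcmp(mock_registry[i], name) == 0)
			return (1);
	return (0);
}

/*
 * Model of the documented contract: register every name or none.
 * On failure, *num_names is set to the index of the offending entry.
 */
int
mock_register_as_names(const char *names[], int *num_names)
{
	int i, start = mock_count;

	for (i = 0; i < *num_names; i++) {
		int error = 0;

		if (mock_name_taken(names[i]))
			error = EALREADY;
		else if (mock_count == MOCK_MAX_NAMES)
			error = ENOMEM;
		if (error != 0) {
			mock_count = start;	/* undo partial work */
			*num_names = i;
			return (error);
		}
		mock_registry[mock_count++] = names[i];
	}
	return (0);
}
```

A caller that receives a non-zero return value can use the updated
.Fa num_names
value to report which entry was rejected.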
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional function */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	/* Mandatory function */
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters in length.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
Care must therefore be taken to ensure that the function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack at any time.
Therefore, the
.Va tfb_tcp_handoff_ok
function pointer must be non-NULL.
If a user attempts to select that TCP stack, the kernel will call the function
pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
In this case, the kernel will call the function pointed to by
.Va tfb_tcp_fb_init
if this function pointer is non-NULL and finally perform the stack switch.
If the user is not allowed to switch the socket, the function should undo any
changes it made to the connection state configuration and return an error code,
which will be returned to the user.
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
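.Pp
The pointer rules in this subsection (four mandatory functions, and timer
hooks that are either all set or all
.Dv NULL )
can be written down as a small self-check.
The following userland sketch only encodes the rules as stated on this page;
whether the kernel performs exactly these checks is an assumption, and
.Vt "struct mock_function_block" ,
.Fn mock_check_block ,
and the stub functions are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced model of struct tcp_function_block. */
struct mock_function_block {
	/* Mandatory functions. */
	int	(*output)(void *);
	void	(*do_segment)(void *);
	int	(*ctloutput)(void *);
	int	(*handoff_ok)(void *);
	/* Optional timer hooks: set all four or none. */
	int	(*timer_stop_all)(void *);
	void	(*timer_activate)(void *);
	int	(*timer_active)(void *);
	void	(*timer_stop)(void *);
};

/* Do-nothing placeholders standing in for real stack functions. */
int
mock_int_stub(void *arg)
{
	(void)arg;
	return (0);
}

void
mock_void_stub(void *arg)
{
	(void)arg;
}

/* Return 0 if the block follows the documented rules, else EINVAL. */
int
mock_check_block(const struct mock_function_block *blk)
{
	int timers;

	if (blk->output == NULL || blk->do_segment == NULL ||
	    blk->ctloutput == NULL || blk->handoff_ok == NULL)
		return (EINVAL);

	/* Count how many of the four timer hooks are provided. */
	timers = (blk->timer_stop_all != NULL) +
	    (blk->timer_activate != NULL) +
	    (blk->timer_active != NULL) +
	    (blk->timer_stop != NULL);
	if (timers != 0 && timers != 4)
		return (EINVAL);
	return (0);
}
```

A module author could run an equivalent check in a debug build before calling
one of the registration functions.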
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
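.Pp
The pairing of an init function that sets
.Va t_fb_ptr
and a fini function that releases it might look like the following userland
sketch.
It is illustrative only:
.Vt "struct mock_tcpcb"
and
.Vt "struct mock_stack_state"
are hypothetical stand-ins, and the standard C allocator is used where a real
stack would use
.Xr malloc 9
and would have to cope with allocation failure.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the one struct tcpcb field used here. */
struct mock_tcpcb {
	void	*t_fb_ptr;
};

/* Hypothetical per-connection state kept by an alternate stack. */
struct mock_stack_state {
	unsigned int	segments_seen;
};

/* Plays the role of a tfb_tcp_fb_init function. */
void
mock_fb_init(struct mock_tcpcb *tp)
{
	/* Do not trust the old value; overwrite it unconditionally. */
	tp->t_fb_ptr = calloc(1, sizeof(struct mock_stack_state));
}

/* Plays the role of a tfb_tcp_fb_fini function. */
void
mock_fb_fini(struct mock_tcpcb *tp, int tcb_is_purged)
{
	(void)tcb_is_purged;	/* same cleanup in both cases here */
	free(tp->t_fb_ptr);	/* free(NULL) is a no-op */
	tp->t_fb_ptr = NULL;
}
```

Resetting the pointer in the fini function keeps a later stack from
mistaking stale storage for its own.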
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
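.Pp
One way to be prepared for
.Er EBUSY
is a bounded retry loop.
The userland sketch below models only the control flow;
.Fn mock_deregister
is a hypothetical stand-in for
.Fn deregister_tcp_functions ,
and the connection counter and retry limit are assumptions made for the
example.

```c
#include <errno.h>

/* Hypothetical count of connections still using the mock stack. */
static int mock_users = 3;

/* Stand-in for deregister_tcp_functions(): busy while users remain. */
int
mock_deregister(void)
{
	if (mock_users > 0) {
		mock_users--;	/* pretend one connection closed */
		return (EBUSY);
	}
	return (0);
}

/*
 * Retry until deregistration succeeds or max_tries is exhausted.
 * Returns 0 on success, or the last error otherwise.
 */
int
mock_unload(int max_tries)
{
	int error = EBUSY;

	while (max_tries-- > 0) {
		error = mock_deregister();
		if (error != EBUSY)
			break;
		/* A kernel module would sleep here before retrying. */
	}
	return (error);
}
```

A module might instead refuse the unload request and pass the error back,
leaving the retry to the administrator.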
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .