.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd July 13, 2024
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules must never modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
deregisters the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
Note that this is the only one of the three registration functions that
automatically registers the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field.
If a module uses one of the other registration functions, it may request that
the system register the function block using the name defined in the
function block's
.Va tfb_tcp_block_name
field by explicitly providing that name.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This number must not exceed TCP_FUNCTION_NAME_NUM_MAX.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
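.Pp
The all-or-nothing behavior and the failing-index report can be illustrated
with a small userland model.
This is only a sketch under stated assumptions, not the kernel implementation:
every
.Li demo_
name, the table size, and the limit of 8 names are hypothetical stand-ins
chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_NAME_NUM_MAX 8	/* stand-in for TCP_FUNCTION_NAME_NUM_MAX */
#define DEMO_TABLE_SIZE 32	/* capacity of the toy registry */

/* Toy registry: the first demo_count slots hold the registered names. */
static const char *demo_table[DEMO_TABLE_SIZE];
static int demo_count;

/* Return 1 if name is already registered, 0 otherwise. */
int
demo_name_taken(const char *name)
{
	for (int i = 0; i < demo_count; i++)
		if (strcmp(demo_table[i], name) == 0)
			return (1);
	return (0);
}

/*
 * Register every name in the array, or none of them.
 * On failure, *num_names is set to the index of the entry that was being
 * processed when the error was detected.
 */
int
demo_register_as_names(const char *names[], int *num_names)
{
	int first = demo_count;

	assert(names != NULL && num_names != NULL);
	if (*num_names > DEMO_NAME_NUM_MAX)
		return (E2BIG);
	for (int i = 0; i < *num_names; i++) {
		int error = 0;

		if (demo_name_taken(names[i]))
			error = EALREADY;
		else if (demo_count == DEMO_TABLE_SIZE)
			error = ENOMEM;
		if (error != 0) {
			demo_count = first;	/* undo names added by this call */
			*num_names = i;		/* report the failing index */
			return (error);
		}
		demo_table[demo_count++] = names[i];
	}
	return (0);
}
```

In this model a caller can inspect the updated count after a failure to learn
which name was rejected, and can rely on none of the names from the failed
call remaining registered.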
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
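.Pp
The interaction between deregistration and sockets that still use a function
block can be modeled in userland as follows.
This is a simplified sketch, not the kernel code:
.Li struct demo_refblock
and the
.Li demo_
functions are hypothetical, and the error returned to a new socket once
removal has been requested is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bookkeeping kept for a function block. */
struct demo_refblock {
	uint32_t refcnt;	/* sockets currently using the block */
	bool registered;	/* still present in the registry */
	bool removing;		/* removal requested; refuse new sockets */
};

/* A socket asks to start using the block. */
int
demo_attach(struct demo_refblock *blk)
{
	if (!blk->registered || blk->removing)
		return (ENOENT);	/* assumed error code for this model */
	blk->refcnt++;
	return (0);
}

/* A socket stops using the block. */
void
demo_detach(struct demo_refblock *blk)
{
	assert(blk->refcnt > 0);
	blk->refcnt--;
}

/* Remove the block, or mark it as going away while sockets still use it. */
int
demo_deregister(struct demo_refblock *blk)
{
	if (!blk->registered)
		return (ENOENT);
	if (blk->refcnt > 0) {
		blk->removing = true;	/* existing sockets are unaffected */
		return (EBUSY);
	}
	blk->registered = false;
	return (0);
}
```

A module following this pattern would retry the deregistration after the
remaining sockets have detached.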
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional function */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	/* Mandatory function */
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters in length.
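.Pp
One way to respect the length limit is to reject a name that would not fit,
rather than silently truncating it.
The following userland sketch assumes a hypothetical helper,
.Li demo_set_block_name ,
a stand-in buffer size of 32, and
.Er ENAMETOOLONG
as the error code; none of these come from the framework itself.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_NAME_LEN_MAX 32	/* stand-in for TCP_FUNCTION_NAME_LEN_MAX */

/*
 * Copy name into a fixed-size name field.
 * Names longer than DEMO_NAME_LEN_MAX - 1 characters are rejected so that
 * the stored name is always NUL-terminated and never truncated.
 */
int
demo_set_block_name(char dst[DEMO_NAME_LEN_MAX], const char *name)
{
	size_t len;

	assert(dst != NULL && name != NULL);
	len = strlen(name);
	if (len > DEMO_NAME_LEN_MAX - 1)
		return (ENAMETOOLONG);	/* error code chosen for this sketch */
	memcpy(dst, name, len + 1);	/* include the terminating NUL */
	return (0);
}
```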
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
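.Pp
A stack author may want to check these pointer rules before attempting
registration.
The following userland sketch shows one way to do so.
It is not the kernel's validation code:
.Li struct demo_ptrblock
reduces every member to a generic function pointer, and the helper names and
the choice of
.Er EINVAL
are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified pointer type; real members take TCP arguments. */
typedef void (*demo_fn)(void);

/* Hypothetical stand-in modeling only the pointer rules described above. */
struct demo_ptrblock {
	/* Required: must all be non-NULL. */
	demo_fn output;
	demo_fn do_segment;
	demo_fn ctloutput;
	/* Optional timers: all NULL or all non-NULL. */
	demo_fn timer_stop_all;
	demo_fn timer_activate;
	demo_fn timer_active;
	demo_fn timer_stop;
};

/* Placeholder used to fill in pointers in examples. */
void
demo_noop(void)
{
}

/* Return 0 if the pointer rules hold, EINVAL otherwise. */
int
demo_validate(const struct demo_ptrblock *blk)
{
	int timers;

	assert(blk != NULL);
	if (blk->output == NULL || blk->do_segment == NULL ||
	    blk->ctloutput == NULL)
		return (EINVAL);
	/* Count how many of the four timer hooks are provided. */
	timers = (blk->timer_stop_all != NULL) +
	    (blk->timer_activate != NULL) +
	    (blk->timer_active != NULL) +
	    (blk->timer_stop != NULL);
	if (timers != 0 && timers != 4)
		return (EINVAL);
	return (0);
}
```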
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
It must therefore leave the TCP control block in a valid state for the
remainder of the retransmit timer logic.
.Pp
A user may select a new TCP stack at any time.
Therefore, the
.Va tfb_tcp_handoff_ok
function pointer must be non-NULL.
If a user attempts to select that TCP stack, the kernel will call the function
pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
In this case, the kernel will call the function pointed to by
.Va tfb_tcp_fb_init
if this function pointer is non-NULL and finally perform the stack switch.
If the user is not allowed to switch the socket, the function should undo any
changes it made to the connection state configuration and return an error code,
which will be returned to the user.
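.Pp
The switch sequence described above can be sketched in userland as follows.
This is an illustration under stated assumptions, not the kernel's code path:
.Li struct demo_stack ,
.Li struct demo_conn ,
and the policy of refusing established connections with
.Er EINVAL
are all hypothetical, and the old stack's cleanup step is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_conn;

/* Hypothetical stand-in for the hooks involved in a stack switch. */
struct demo_stack {
	int (*handoff_ok)(struct demo_conn *);	/* mandatory */
	void (*fb_init)(struct demo_conn *);	/* optional */
};

/* Hypothetical stand-in for a connection. */
struct demo_conn {
	const struct demo_stack *stack;	/* stack currently in use */
	int established;		/* nonzero once connected */
	int inits;			/* number of fb_init calls seen */
};

/* Example policy (an assumption): only accept idle connections. */
int
demo_handoff_if_idle(struct demo_conn *c)
{
	return (c->established ? EINVAL : 0);
}

/* Example init hook that only records that it ran. */
void
demo_count_init(struct demo_conn *c)
{
	c->inits++;
}

/* Ask the new stack for permission, initialize it, then switch. */
int
demo_switch_stack(struct demo_conn *c, const struct demo_stack *new_stack)
{
	int error;

	assert(c != NULL && new_stack != NULL);
	assert(new_stack->handoff_ok != NULL);
	error = new_stack->handoff_ok(c);
	if (error != 0)
		return (error);		/* reported to the user; no switch */
	if (new_stack->fb_init != NULL)
		new_stack->fb_init(c);
	c->stack = new_stack;		/* finally perform the switch */
	return (0);
}
```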
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
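.Pp
The pairing of an init hook and a fini hook around per-connection storage can
be sketched in userland as follows.
This is a minimal model, not kernel code:
.Li struct demo_tcpcb
stands in for
.Vt "struct tcpcb" ,
.Li struct demo_state
is a hypothetical per-connection structure, and
.Xr calloc 3
and
.Xr free 3
stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tcpcb; only the reserved pointer exists. */
struct demo_tcpcb {
	void *t_fb_ptr;
};

/* Hypothetical per-connection state kept by the alternate stack. */
struct demo_state {
	unsigned long segs_seen;
};

/*
 * Init hook: the incoming t_fb_ptr value may be stale, so it is overwritten
 * unconditionally.  It is left NULL if the allocation fails.
 */
void
demo_fb_init(struct demo_tcpcb *tp)
{
	assert(tp != NULL);
	tp->t_fb_ptr = calloc(1, sizeof(struct demo_state));
}

/*
 * Fini hook: tcb_is_purged is 0 when the socket moves to another stack and
 * 1 when the control block is being destroyed.
 */
void
demo_fb_fini(struct demo_tcpcb *tp, int tcb_is_purged)
{
	assert(tp != NULL);
	(void)tcb_is_purged;	/* this sketch frees storage in both cases */
	free(tp->t_fb_ptr);
	tp->t_fb_ptr = NULL;
}
```

Clearing the pointer in the fini hook is a design choice of the sketch; it
keeps a later stack from seeing a dangling value.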
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
The
.Fa blk
is already registered or a function block is already registered with the same
name.
.El
Additionally,
.Fn register_tcp_functions_as_names
will fail if:
.Bl -tag -width Er
.It Bq Er E2BIG
The number of names pointed to by the
.Fa num_names
argument is larger than TCP_FUNCTION_NAME_NUM_MAX.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .