.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 10, 2017
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn register_tcp_functions_as_name "struct tcp_function_block *blk" \
"const char *name" "int wait"
.Ft int
.Fn register_tcp_functions_as_names "struct tcp_function_block *blk" \
"int wait" "const char *names[]" "int *num_names"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled into the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
and
.Fn register_tcp_functions_as_names
functions request that the system add a specified function block
and register it for use with a given name.
Modules may register the same function block multiple times with different
names.
However, names must be globally unique among all registered function blocks.
Also, modules must never modify the contents of the function block (including
the name) after it has been registered, unless the module first successfully
deregisters the function block.
.Pp
The
.Fn register_tcp_functions
function requests that the system register the function block with the name
defined in the function block's
.Va tfb_tcp_block_name
field.
It is the only one of the three registration functions that uses the
.Va tfb_tcp_block_name
field automatically.
A module that calls one of the other registration functions can still register
the function block under that name by passing the name explicitly.
.Pp
The
.Fn register_tcp_functions_as_name
function requests that the system register the function block with the name
provided in the
.Fa name
argument.
.Pp
The
.Fn register_tcp_functions_as_names
function requests that the system register the function block with all the
names provided in the
.Fa names
argument.
The
.Fa num_names
argument provides a pointer to the number of names.
This function will either succeed in registering all of the names in the array,
or none of the names in the array.
On failure, the
.Fa num_names
argument is updated with the index number of the entry in the
.Fa names
array which the system was processing when it encountered the error.
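.Pp
As a minimal sketch of the calling convention described above, a module might
register one function block under several names as follows.
The block
.Va example_blk ,
the names, and the helper
.Fn example_register
are hypothetical and are not part of the framework.
The headers shown are those from the SYNOPSIS plus the ones the sketch itself
uses; a real module will typically need further kernel headers.
.Bd -literal -offset indent
#include <sys/param.h>		/* nitems() */
#include <sys/malloc.h>		/* M_WAITOK */
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

/* Hypothetical function block, filled in elsewhere by the module. */
extern struct tcp_function_block example_blk;

/* Hypothetical names; each must be unique among all registered names. */
static const char *example_names[] = { "example", "example_v2" };

static int
example_register(void)
{
	int error, num_names;

	num_names = nitems(example_names);
	error = register_tcp_functions_as_names(&example_blk, M_WAITOK,
	    example_names, &num_names);
	/*
	 * On failure none of the names were registered, and num_names
	 * now holds the index of the entry in example_names[] that
	 * was being processed when the error occurred.
	 */
	return (error);
}
.Ed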
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If this call succeeds, it will completely deregister the function block,
regardless of the number of names used to register the function block.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
.Nm
modules must call one or more of the registration functions during
initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
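.Pp
The load and unload requirements above might be met by a module event handler
along the following lines.
This is a sketch under stated assumptions, not a reference implementation:
the module name
.Dq tcp_example ,
the block
.Va example_blk ,
and the choice of
.Dv SI_SUB_PROTO_DOMAIN
as the initialization point are illustrative only.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/errno.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/module.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

/* Hypothetical function block, filled in elsewhere by the module. */
extern struct tcp_function_block example_blk;

static int
tcp_example_modevent(module_t mod, int type, void *data)
{

	switch (type) {
	case MOD_LOAD:
		return (register_tcp_functions(&example_blk, M_WAITOK));
	case MOD_UNLOAD:
		/*
		 * A non-zero return, such as EBUSY while sockets still
		 * use the stack, refuses the unload.  The block is then
		 * marked as being removed, so a later attempt can
		 * succeed once those sockets are gone.
		 */
		return (deregister_tcp_functions(&example_blk));
	default:
		return (EOPNOTSUPP);
	}
}

static moduledata_t tcp_example_mod = {
	"tcp_example",
	tcp_example_modevent,
	NULL
};
DECLARE_MODULE(tcp_example, tcp_example_mod, SI_SUB_PROTO_DOMAIN,
    SI_ORDER_ANY);
.Ed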
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional functions */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform actions equivalent to those of
the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
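.Pp
For illustration, a function block that supplies only the three mandatory
functions could be declared as shown below.
The stack name
.Dq example
and the three
.Fn example_*
functions are hypothetical; their prototypes simply mirror the member types
listed in the structure above, and all optional members are left
.Dv NULL .
.Bd -literal -offset indent
#include <sys/param.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

/* Hypothetical implementations provided elsewhere by the module. */
static int	example_output(struct tcpcb *);
static void	example_do_segment(struct mbuf *, struct tcphdr *,
		    struct socket *, struct tcpcb *, int, int, uint8_t,
		    int);
static int	example_ctloutput(struct socket *, struct sockopt *,
		    struct inpcb *, struct tcpcb *);

struct tcp_function_block example_blk = {
	.tfb_tcp_block_name = "example",
	.tfb_tcp_output = example_output,
	.tfb_tcp_do_segment = example_do_segment,
	.tfb_tcp_ctloutput = example_ctloutput,
	/* Optional and system-use members are left zeroed. */
};
.Ed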
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure that this function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack before calling
.Xr connect 2
or
.Xr listen 2 .
Optionally, a TCP stack may also allow a user to begin using the TCP stack for
a connection that is in a later state by setting a non-NULL function pointer in
the
.Va tfb_tcp_handoff_ok
field.
If this field is non-NULL and a user attempts to select that TCP stack after
calling
.Xr connect 2
or
.Xr listen 2
for that socket, the kernel will call the function pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
Otherwise, the function should return an error code, which will be returned to
the user.
If the
.Va tfb_tcp_handoff_ok
field is
.Dv NULL
and a user attempts to select the TCP stack after calling
.Xr connect 2
or
.Xr listen 2
for that socket, the operation will fail and the kernel will return
.Er EINVAL .
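.Pp
A hand-off check following the return convention described above could look
roughly like this.
The function name and the policy it applies (accepting only connections that
are already established) are hypothetical; a real stack would test whatever
state it is actually able to take over.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/errno.h>
#include <netinet/tcp.h>
#include <netinet/tcp_fsm.h>	/* TCPS_ESTABLISHED */
#include <netinet/tcp_var.h>

/*
 * Hypothetical policy: allow the switch for established connections
 * and refuse it in every other state.  The error code is handed back
 * to the user who requested the switch.
 */
static int
example_handoff_ok(struct tcpcb *tp)
{

	if (tp->t_state == TCPS_ESTABLISHED)
		return (0);
	return (EINVAL);
}
.Ed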
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
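.Pp
One possible shape for such a pair of functions is sketched below.
The state structure, the malloc type
.Dv M_TCPEXAMPLE ,
and the use of
.Dv M_NOWAIT
are assumptions made for the example; because
.Va tfb_tcp_fb_init
returns
.Vt void ,
the sketch leaves
.Va t_fb_ptr
set to
.Dv NULL
on allocation failure and expects the rest of the stack to check for that.
.Bd -literal -offset indent
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <netinet/tcp.h>
#include <netinet/tcp_var.h>

/* Hypothetical per-connection state kept by the example stack. */
struct example_state {
	uint32_t	es_segments_seen;
};

static MALLOC_DEFINE(M_TCPEXAMPLE, "tcp_example",
    "example TCP stack per-connection state");

static void
example_fb_init(struct tcpcb *tp)
{

	/*
	 * t_fb_ptr may hold a stale value, so always overwrite it.
	 * M_NOWAIT is an assumption (the caller may hold locks).
	 */
	tp->t_fb_ptr = malloc(sizeof(struct example_state),
	    M_TCPEXAMPLE, M_NOWAIT | M_ZERO);
}

static void
example_fb_fini(struct tcpcb *tp, int destroying)
{

	/*
	 * Called both when switching stacks (destroying == 0) and when
	 * the control block is torn down (destroying == 1); the
	 * storage is released in either case.
	 */
	free(tp->t_fb_ptr, M_TCPEXAMPLE);
	tp->t_fb_ptr = NULL;
}
.Ed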
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions ,
.Fn register_tcp_functions_as_name ,
.Fn register_tcp_functions_as_names ,
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .