.\"
.\" Copyright (c) 2016 Jonathan Looney <jtl@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 28, 2016
.Dt TCP_FUNCTIONS 9
.Os
.Sh NAME
.Nm tcp_functions
.Nd Alternate TCP Stack Framework
.Sh SYNOPSIS
.In netinet/tcp.h
.In netinet/tcp_var.h
.Ft int
.Fn register_tcp_functions "struct tcp_function_block *blk" "int wait"
.Ft int
.Fn deregister_tcp_functions "struct tcp_function_block *blk"
.Sh DESCRIPTION
The
.Nm
framework allows a kernel developer to implement alternate TCP stacks.
The alternate stacks can be compiled in the kernel or can be implemented in
loadable kernel modules.
This functionality is intended to encourage experimentation with the TCP stack
and to allow alternate behaviors to be deployed for different TCP connections
on a single system.
.Pp
A system administrator can set a system default stack.
By default, all TCP connections will use the system default stack.
Additionally, users can specify a particular stack to use on a per-connection
basis.
(See
.Xr tcp 4
for details on setting the system default stack, or selecting a specific stack
for a given connection.)
.Pp
This man page treats "TCP stacks" as synonymous with "function blocks".
This is intentional.
A "TCP stack" is a collection of functions that implement a set of behaviors.
Therefore, an alternate "function block" defines an alternate "TCP stack".
.Pp
.Nm
modules must call the
.Fn register_tcp_functions
function during initialization and successfully call the
.Fn deregister_tcp_functions
function prior to allowing the module to be unloaded.
.Pp
The
.Fn register_tcp_functions
function requests that the system add a specified function block to the system.
.Pp
The
.Fn deregister_tcp_functions
function requests that the system remove a specified function block from the
system.
If the call fails because sockets are still using the specified function block,
the system will mark the function block as being in the process of being
removed.
This will prevent additional sockets from using the specified function block.
However, it will not impact sockets that are already using the function block.
.Pp
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which is explained below (see
.Sx Function Block Structure ) .
The
.Fa wait
argument is used as the
.Fa flags
argument to
.Xr malloc 9 ,
and must be set to one of the valid values defined in that man page.
.Ss Function Block Structure
The
.Fa blk
argument is a pointer to a
.Vt "struct tcp_function_block" ,
which has the following members:
.Bd -literal -offset indent
struct tcp_function_block {
	char	tfb_tcp_block_name[TCP_FUNCTION_NAME_LEN_MAX];
	int	(*tfb_tcp_output)(struct tcpcb *);
	void	(*tfb_tcp_do_segment)(struct mbuf *, struct tcphdr *,
			    struct socket *, struct tcpcb *,
			    int, int, uint8_t,
			    int);
	int	(*tfb_tcp_ctloutput)(struct socket *so,
			    struct sockopt *sopt,
			    struct inpcb *inp, struct tcpcb *tp);
	/* Optional memory allocation/free routine */
	void	(*tfb_tcp_fb_init)(struct tcpcb *);
	void	(*tfb_tcp_fb_fini)(struct tcpcb *, int);
	/* Optional timers, must define all if you define one */
	int	(*tfb_tcp_timer_stop_all)(struct tcpcb *);
	void	(*tfb_tcp_timer_activate)(struct tcpcb *,
			    uint32_t, u_int);
	int	(*tfb_tcp_timer_active)(struct tcpcb *, uint32_t);
	void	(*tfb_tcp_timer_stop)(struct tcpcb *, uint32_t);
	/* Optional functions */
	void	(*tfb_tcp_rexmit_tmr)(struct tcpcb *);
	int	(*tfb_tcp_handoff_ok)(struct tcpcb *);
	/* System use */
	volatile uint32_t tfb_refcnt;
	uint32_t  tfb_flags;
};
.Ed
.Pp
The
.Va tfb_tcp_block_name
field identifies the unique name of the TCP stack, and should be no longer than
TCP_FUNCTION_NAME_LEN_MAX-1 characters.
.Pp
The
.Va tfb_tcp_output ,
.Va tfb_tcp_do_segment ,
and
.Va tfb_tcp_ctloutput
fields are pointers to functions that perform the equivalent actions
as the default
.Fn tcp_output ,
.Fn tcp_do_segment ,
and
.Fn tcp_default_ctloutput
functions, respectively.
Each of these function pointers must be non-NULL.
.Pp
If a TCP stack needs to initialize data when a socket first selects the TCP
stack (or, when the socket is first opened), it should set a non-NULL
pointer in the
.Va tfb_tcp_fb_init
field.
Likewise, if a TCP stack needs to clean up data when a socket stops using the
TCP stack (or, when the socket is closed), it should set a non-NULL pointer
in the
.Va tfb_tcp_fb_fini
field.
.Pp
If the
.Va tfb_tcp_fb_fini
field is non-NULL, the function to which it points is called when the
kernel is destroying the TCP control block or when the socket is transitioning
to use a different TCP stack.
The function is called with arguments of the TCP control block and an integer
flag.
The flag will be zero if the socket is transitioning to use another TCP stack
or one if the TCP control block is being destroyed.
.Pp
If the TCP stack implements additional timers, the TCP stack should set a
non-NULL pointer in the
.Va tfb_tcp_timer_stop_all ,
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
fields.
These fields should all be
.Dv NULL
or should all contain pointers to functions.
The
.Va tfb_tcp_timer_activate ,
.Va tfb_tcp_timer_active ,
and
.Va tfb_tcp_timer_stop
functions will be called when the
.Fn tcp_timer_activate ,
.Fn tcp_timer_active ,
and
.Fn tcp_timer_stop
functions, respectively, are called with a timer type other than the standard
types.
The functions defined by the TCP stack have the same semantics (both for
arguments and return values) as the normal timer functions they supplement.
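.Pp
The hook rules above can be modeled in a few lines of user-space C.
The sketch below is not the kernel's validation code: the
.Vt "struct model_block"
type, the
.Fn model_check_block
function, and the stub hooks are hypothetical, and the hook signatures are
trimmed.
It assumes that a partially filled set of timer hooks is treated like any
other incorrectly set member and rejected with
.Er EINVAL .

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct tcpcb;			/* opaque in this model */

/* Hypothetical subset of struct tcp_function_block, hooks only. */
struct model_block {
	int	(*output)(struct tcpcb *);
	void	(*do_segment)(struct tcpcb *);	/* arguments trimmed */
	int	(*ctloutput)(struct tcpcb *);	/* arguments trimmed */
	int	(*timer_stop_all)(struct tcpcb *);
	void	(*timer_activate)(struct tcpcb *, uint32_t, unsigned int);
	int	(*timer_active)(struct tcpcb *, uint32_t);
	void	(*timer_stop)(struct tcpcb *, uint32_t);
};

/* Return 0 if the hooks are consistent, EINVAL otherwise. */
int
model_check_block(const struct model_block *blk)
{
	int ntimers;

	/* The three core hooks are mandatory. */
	if (blk->output == NULL || blk->do_segment == NULL ||
	    blk->ctloutput == NULL)
		return (EINVAL);

	/* Timer hooks: all four or none (assumed to be enforced). */
	ntimers = (blk->timer_stop_all != NULL) +
	    (blk->timer_activate != NULL) +
	    (blk->timer_active != NULL) +
	    (blk->timer_stop != NULL);
	if (ntimers != 0 && ntimers != 4)
		return (EINVAL);
	return (0);
}

/* Do-nothing stubs so the model can be exercised. */
static int
stub_output(struct tcpcb *tp)
{
	(void)tp;
	return (0);
}

static void
stub_do_segment(struct tcpcb *tp)
{
	(void)tp;
}

static int
stub_ctloutput(struct tcpcb *tp)
{
	(void)tp;
	return (0);
}

static int
stub_timer_active(struct tcpcb *tp, uint32_t type)
{
	(void)tp;
	(void)type;
	return (0);
}

/* Valid: core hooks set, no timer hooks. */
const struct model_block model_ok = {
	.output = stub_output,
	.do_segment = stub_do_segment,
	.ctloutput = stub_ctloutput,
};

/* Invalid: only one of the four timer hooks is set. */
const struct model_block model_partial_timers = {
	.output = stub_output,
	.do_segment = stub_do_segment,
	.ctloutput = stub_ctloutput,
	.timer_active = stub_timer_active,
};
```

In this model, the block with only the core hooks passes the check, while
the block that sets a single timer hook is rejected.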
.Pp
Additionally, a stack may define its own actions to take when the retransmit
timer fires by setting a non-NULL function pointer in the
.Va tfb_tcp_rexmit_tmr
field.
This function is called very early in the process of handling a retransmit
timer.
However, care must be taken to ensure the function leaves the
TCP control block in a valid state for the remainder of the retransmit
timer logic.
.Pp
A user may select a new TCP stack before calling
.Xr connect 2
or
.Xr listen 2 .
Optionally, a TCP stack may also allow a user to begin using the TCP stack for
a connection that is in a later state by setting a non-NULL function pointer in
the
.Va tfb_tcp_handoff_ok
field.
If this field is non-NULL and a user attempts to select that TCP stack after
calling
.Xr connect 2
or
.Xr listen 2
for that socket, the kernel will call the function pointed to by the
.Va tfb_tcp_handoff_ok
field.
The function should return 0 if the user is allowed to switch the socket to use
the TCP stack.
Otherwise, the function should return an error code, which will
be returned to the user.
If the
.Va tfb_tcp_handoff_ok
field is
.Dv NULL
and a user attempts to select the TCP stack after calling
.Xr connect 2
or
.Xr listen 2
for that socket, the operation will fail and the kernel will return
.Er EINVAL .
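.Pp
The selection rules described above can be summarized as a small decision
function.
The following user-space sketch is a model only, not kernel code: the
.Vt "struct model_tcpcb"
and
.Vt "struct handoff_block"
types, the
.Fn model_may_select
function, and the idle-only policy (including its choice of
.Er EBUSY )
are hypothetical and exist purely for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of struct tcpcb used here. */
struct model_tcpcb {
	int	started;	/* nonzero after connect(2) or listen(2) */
	int	unacked;	/* hypothetical: bytes sent but not yet acked */
};

/* Hypothetical stand-in for struct tcp_function_block. */
struct handoff_block {
	int	(*handoff_ok)(struct model_tcpcb *);
};

/* Model of the decision made when a socket selects a stack. */
int
model_may_select(const struct handoff_block *blk, struct model_tcpcb *tp)
{
	/* Before connect(2) or listen(2), the stack may be selected. */
	if (!tp->started)
		return (0);
	/* Later states need the stack's explicit consent. */
	if (blk->handoff_ok == NULL)
		return (EINVAL);
	/* The hook's return value is handed back to the user. */
	return (blk->handoff_ok(tp));
}

/* Example policy (an assumption): only take over idle connections. */
static int
idle_only_handoff_ok(struct model_tcpcb *tp)
{
	return (tp->unacked == 0 ? 0 : EBUSY);
}

/* A stack that never accepts a late handoff. */
const struct handoff_block no_handoff = { .handoff_ok = NULL };

/* A stack that accepts a late handoff for idle connections only. */
const struct handoff_block idle_handoff = {
	.handoff_ok = idle_only_handoff_ok,
};
```

A real hook would inspect the actual TCP control block and decide whether
its own state can be reconstructed from it.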
.Pp
The
.Va tfb_refcnt
and
.Va tfb_flags
fields are used by the kernel's TCP code and will be initialized when the
TCP stack is registered.
.Ss Requirements for Alternate TCP Stacks
If the TCP stack needs to store data beyond what is stored in the default
TCP control block, the TCP stack can initialize its own per-connection storage.
The
.Va t_fb_ptr
field in the
.Vt "struct tcpcb"
control block structure has been reserved to hold a pointer to this
per-connection storage.
If the TCP stack uses this alternate storage, it should understand that the
value of the
.Va t_fb_ptr
pointer may not be initialized to
.Dv NULL .
Therefore, it should use a
.Va tfb_tcp_fb_init
function to initialize this field.
Additionally, it should use a
.Va tfb_tcp_fb_fini
function to deallocate storage when the socket is closed.
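.Pp
The pairing of an init hook and a fini hook around
.Va t_fb_ptr
can be sketched in user space as follows.
This is a model under stated assumptions, not kernel code: the
.Vt "struct model_tcpcb"
and
.Vt "struct my_stack_state"
types and both hook names are hypothetical, and
.Xr calloc 3
and
.Xr free 3
stand in for the kernel allocator.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct tcpcb; only the reserved field. */
struct model_tcpcb {
	void	*t_fb_ptr;	/* may hold garbage before the init hook runs */
};

/* Hypothetical private per-connection state of the stack. */
struct my_stack_state {
	unsigned long	segs_seen;
};

/*
 * Init hook: always overwrite t_fb_ptr, never trust its old value.
 * The hook returns void, so an allocation failure cannot be reported;
 * the rest of the stack must then tolerate a NULL t_fb_ptr.
 */
void
my_stack_fb_init(struct model_tcpcb *tp)
{
	tp->t_fb_ptr = calloc(1, sizeof(struct my_stack_state));
}

/*
 * Fini hook: the flag is 0 when the socket moves to another stack and
 * 1 when the control block is being destroyed.  This sketch releases
 * the storage in both cases.
 */
void
my_stack_fb_fini(struct model_tcpcb *tp, int tcb_is_destroyed)
{
	(void)tcb_is_destroyed;
	free(tp->t_fb_ptr);
	tp->t_fb_ptr = NULL;
}
```

Clearing the pointer in the fini hook is a defensive choice of this sketch,
so that a later stack does not find a dangling value in the field.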
.Pp
It is understood that alternate TCP stacks may keep different sets of data.
However, in order to ensure that data is available to both the user and the
rest of the system in a standardized format, alternate TCP stacks must
update all fields in the TCP control block to the greatest extent practical.
.Sh RETURN VALUES
The
.Fn register_tcp_functions
and
.Fn deregister_tcp_functions
functions return zero on success and non-zero on failure.
In particular, the
.Fn deregister_tcp_functions
function will return
.Er EBUSY
until no more connections are using the specified TCP stack.
A module calling
.Fn deregister_tcp_functions
must be prepared to wait until all connections have stopped using the
specified TCP stack.
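.Pp
One way to honor this requirement is for the module's unload path to
refuse unloading for as long as deregistration fails.
The user-space sketch below models that idea only: the event names, the
.Va model_users
counter, and the mock register and deregister functions are hypothetical
stand-ins, not the kernel interfaces.

```c
#include <errno.h>

/* Hypothetical stand-ins for module load and unload events. */
enum model_event { MODEL_LOAD, MODEL_UNLOAD };

int model_users;		/* sockets still using the stack */
static int model_registered;	/* nonzero while the block is registered */

/* Mock of register_tcp_functions(): always succeeds in this model. */
static int
mock_register(void)
{
	model_registered = 1;
	return (0);
}

/* Mock of deregister_tcp_functions(): EBUSY while sockets remain. */
static int
mock_deregister(void)
{
	if (!model_registered)
		return (ENOENT);
	if (model_users > 0)
		return (EBUSY);
	model_registered = 0;
	return (0);
}

/* Model of a module event handler for an alternate TCP stack. */
int
model_modevent(enum model_event what)
{
	switch (what) {
	case MODEL_LOAD:
		return (mock_register());
	case MODEL_UNLOAD:
		/*
		 * A nonzero return refuses the unload, so the stack's
		 * code stays resident; the unload can be retried later.
		 */
		return (mock_deregister());
	}
	return (EOPNOTSUPP);
}
```

The design choice here is to surface the error to whoever requested the
unload instead of blocking until the last connection goes away.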
.Sh ERRORS
The
.Fn register_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EINVAL
Any of the members of the
.Fa blk
argument are set incorrectly.
.It Bq Er ENOMEM
The function could not allocate memory for its internal data.
.It Bq Er EALREADY
A function block is already registered with the same name.
.El
.Pp
The
.Fn deregister_tcp_functions
function will fail if:
.Bl -tag -width Er
.It Bq Er EPERM
The
.Fa blk
argument references the kernel's compiled-in default function block.
.It Bq Er EBUSY
The function block is still in use by one or more sockets, or is defined as
the current default function block.
.It Bq Er ENOENT
The
.Fa blk
argument references a function block that is not currently registered.
.El
.Sh SEE ALSO
.Xr connect 2 ,
.Xr listen 2 ,
.Xr tcp 4 ,
.Xr malloc 9
.Sh HISTORY
This framework first appeared in
.Fx 11.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was written by
.An Randall Stewart Aq Mt rrs@FreeBSD.org .
.Pp
This manual page was written by
.An Jonathan Looney Aq Mt jtl@FreeBSD.org .