.\"
.\" Copyright (c) 2010 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" Portions of this software were developed at the Centre for Advanced
.\" Internet Architectures, Swinburne University of Technology, Melbourne,
.\" Australia by Lawrence Stewart under sponsorship from the FreeBSD
.\" Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions, and the following disclaimer,
.\"    without modification, immediately at the beginning of the file.
.\" 2. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd June 23, 2010
.Dt SIFTR 4
.Os
.Sh NAME
.Nm SIFTR
.Nd Statistical Information For TCP Research
.Sh SYNOPSIS
To load
.Nm
as a module at run-time, run the following command as root:
.Bd -literal -offset indent
kldload siftr
.Ed
.Pp
Alternatively, to load
.Nm
as a module at boot time, add the following line to the
.Xr loader.conf 5
file:
.Bd -literal -offset indent
siftr_load="YES"
.Ed
.Sh DESCRIPTION
.Nm
.Po
.Em S Ns tatistical
.Em I Ns nformation
.Em F Ns or
.Em T Ns CP
.Em R Ns esearch
.Pc
is a kernel module that logs a range of statistics on active TCP connections to
a log file.
It provides the ability to make highly granular measurements of TCP connection
state, aimed at system administrators, developers and researchers.
.Ss Compile-time Configuration
The default operation of
.Nm
is to capture IPv4 TCP/IP packets.
.Nm
can be configured to support both IPv4 and IPv6 by uncommenting:
.Bd -literal -offset indent
CFLAGS+=-DSIFTR_IPV6
.Ed
.Pp
in
.Aq Pa sys/modules/siftr/Makefile
and recompiling.
.Pp
In the IPv4-only (default) mode, standard dotted decimal notation (e.g.
"136.186.229.95") is used to format IPv4 addresses for logging.
In IPv6 mode, standard dotted decimal notation is used to format IPv4 addresses,
and standard colon-separated hex notation (see RFC 4291) is used to format IPv6
addresses for logging.
Note that
.Nm
uses uncompressed notation to format IPv6 addresses.
For example, the address "fe80::20f:feff:fea2:531b" would be logged as
"fe80:0:0:0:20f:feff:fea2:531b".
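.Pp
Because logged IPv6 addresses are uncompressed, a plain string comparison
against a compressed address from another tool may fail.
The following is a minimal Python sketch, not part of
.Nm ,
that compares addresses by value using the standard
ipaddress
module; the helper names
.Fn same_address
and
.Fn compress
are made up for this illustration.
.Bd -literal -offset indent
```python
import ipaddress


def same_address(logged, other):
    """Return True if two textual IP addresses denote the same address."""
    return ipaddress.ip_address(logged) == ipaddress.ip_address(other)


def compress(logged):
    """Return the shortest standard textual form of a logged address."""
    return ipaddress.ip_address(logged).compressed
```
.Ed
.Pp
Comparing parsed addresses rather than strings works for both the dotted
decimal and the colon-separated hex notation described above.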
.Ss Run-time Configuration
.Nm
utilises the
.Xr sysctl 8
interface to export its configuration variables to user-space.
The following variables are available:
.Bl -tag -offset indent
.It Va net.inet.siftr.enabled
controls whether the module performs its measurements.
By default, the value is set to 0, which means the module takes no
measurements.
Loading the module with
.Va net.inet.siftr.enabled
set to 0 has no impact on the performance of the network stack, as the
packet filtering hooks are only inserted when
.Va net.inet.siftr.enabled
is set to 1.
.It Va net.inet.siftr.ppl
controls how many inbound/outbound packets for a given TCP connection will cause
a log message to be generated for the connection.
By default, the value is set to 1, which means the module will log a message for
every packet of every TCP connection.
The value can be set to any integer in the range [1,2^32], and can be changed at
any time, even while the module is enabled.
.It Va net.inet.siftr.logfile
controls the path to the file that the module writes its log messages to.
By default, the file
.Pa /var/log/siftr.log
is used.
The path can be changed at any time, even while the module is enabled.
.It Va net.inet.siftr.genhashes
controls whether a hash is generated for each TCP packet seen by
.Nm .
By default, the value is set to 0, which means no hashes are generated.
The hashes are useful for identifying which TCP packet triggered the generation
of a particular log message, but calculating them adds computational
overhead to the fast path.
.El
.Ss Log Format
A typical
.Nm
log file will contain 3 different types of log message.
All messages are written in plain ASCII text.
.Pp
Note: The
.Qq \e
present in the example log messages in this section indicates a
line continuation and is not part of the actual log message.
.Pp
The first type of log message is written to the file when the module is
enabled and starts collecting data from the running kernel.
The text below shows an example module enable log.
The fields are tab-delimited key-value pairs which describe some basic
information about the system.
.Bd -literal -offset indent
enable_time_secs=1238556193    enable_time_usecs=462104 \\
siftrver=1.2.2    hz=1000    tcp_rtt_scale=32 \\
sysname=FreeBSD    sysver=604000    ipmode=4
.Ed
.Pp
Field descriptions are as follows:
.Bl -tag -offset indent
.It Va enable_time_secs
Time at which the module was enabled, in seconds since the UNIX epoch.
.It Va enable_time_usecs
Time at which the module was enabled, in microseconds since enable_time_secs.
.It Va siftrver
Version of
.Nm .
.It Va hz
Tick rate of the kernel in ticks per second.
.It Va tcp_rtt_scale
Smoothed RTT estimate scaling factor.
.It Va sysname
Operating system name.
.It Va sysver
Operating system version.
.It Va ipmode
IP mode as defined at compile time.
An ipmode of "4" means IPv6 is not supported and IP addresses are logged in
regular dotted quad format.
An ipmode of "6" means IPv6 is supported, and IP addresses are logged in dotted
quad or hex format, as described in the
.Qq Compile-time Configuration
subsection.
.El
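.Pp
The key-value layout described above is straightforward to parse.
The following Python sketch is an illustration only; the function name
.Fn parse_kv_line
is hypothetical, and it assumes that no value contains whitespace, so
splitting on any whitespace also handles the tab delimiters.
The example string uses single spaces and omits the line continuations
for readability.
.Bd -literal -offset indent
```python
def parse_kv_line(line):
    """Split a key=value log line into a dict of strings."""
    fields = {}
    for pair in line.split():
        # partition() keeps any further "=" characters in the value.
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields


example = (
    "enable_time_secs=1238556193 enable_time_usecs=462104 "
    "siftrver=1.2.2 hz=1000 tcp_rtt_scale=32 "
    "sysname=FreeBSD sysver=604000 ipmode=4"
)
info = parse_kv_line(example)

# Combine the two time fields into one floating point timestamp.
enabled_at = (int(info["enable_time_secs"])
              + int(info["enable_time_usecs"]) / 1e6)
```
.Ed
.Pp
The same function should also work for the module disable log described
later in this section, since it uses the same key-value layout.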
.Pp
The second type of log message is written to the file when a data log message
is generated.
The text below shows an example data log triggered by an IPv4
TCP/IP packet.
The data is CSV formatted.
.Bd -literal -offset indent
o,0xbec491a5,1238556193.463551,172.16.7.28,22,172.16.2.5,55931, \\
1073725440,172312,6144,66560,66608,8,1,4,1448,936,1,996,255, \\
33304,208,66608,0,208
.Ed
.Pp
Field descriptions are as follows:
.Bl -tag -offset indent
.It Va 1
Direction of the packet that triggered the log message.
Either
.Qq i
for in, or
.Qq o
for out.
.It Va 2
Hash of the packet that triggered the log message.
.It Va 3
Time at which the packet that triggered the log message was processed by
the
.Xr pfil 9
hook function, in seconds and microseconds since the UNIX epoch.
.It Va 4
The IPv4 or IPv6 address of the local host, in dotted quad (IPv4 packet)
or colon-separated hex (IPv6 packet) notation.
.It Va 5
The TCP port that the local host is communicating via.
.It Va 6
The IPv4 or IPv6 address of the foreign host, in dotted quad (IPv4 packet)
or colon-separated hex (IPv6 packet) notation.
.It Va 7
The TCP port that the foreign host is communicating via.
.It Va 8
The slow start threshold for the flow, in bytes.
.It Va 9
The current congestion window for the flow, in bytes.
.It Va 10
The current bandwidth-controlled window for the flow, in bytes.
.It Va 11
The current sending window for the flow, in bytes.
The post-scaled value is reported, except during the initial handshake (first
few packets), during which time the unscaled value is reported.
.It Va 12
The current receive window for the flow, in bytes.
The post-scaled value is always reported.
.It Va 13
The current window scaling factor for the sending window.
.It Va 14
The current window scaling factor for the receiving window.
.It Va 15
The current state of the TCP finite state machine, as defined
in
.Aq Pa netinet/tcp_fsm.h .
.It Va 16
The maximum segment size for the flow, in bytes.
.It Va 17
The current smoothed RTT estimate for the flow, in units of
1/(TCP_RTT_SCALE * HZ) seconds, where TCP_RTT_SCALE is a constant defined in
.Aq Pa netinet/tcp_var.h
and HZ is the kernel's tick rate.
Divide by TCP_RTT_SCALE * HZ to get the RTT in seconds.
TCP_RTT_SCALE and HZ are reported in the enable log message.
.It Va 18
SACK enabled indicator.
1 if SACK is enabled, 0 otherwise.
.It Va 19
The current state of the TCP flags for the flow.
See
.Aq Pa netinet/tcp_var.h
for information about the various flags.
.It Va 20
The current retransmission timeout length for the flow, in kernel ticks
(units of 1/HZ seconds), where HZ is the kernel's tick rate.
Divide by HZ to get the timeout length in seconds.
HZ is reported in the enable log message.
.It Va 21
The current size of the socket send buffer in bytes.
.It Va 22
The current number of bytes in the socket send buffer.
.It Va 23
The current size of the socket receive buffer in bytes.
.It Va 24
The current number of bytes in the socket receive buffer.
.It Va 25
The current number of unacknowledged bytes in-flight.
Bytes acknowledged via SACK are not excluded from this count.
.El
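.Pp
As an illustration of the field list above, the following Python sketch maps
the 25 values of a data log message to names and applies the two unit
conversions described for fields 17 and 20.
The field names are labels chosen for this example only;
.Nm
itself identifies the fields by position.
The values of hz and tcp_rtt_scale are assumed to have been read from the
matching enable log message.
.Bd -literal -offset indent
```python
FIELD_NAMES = [
    "direction", "hash", "timestamp", "local_ip", "local_port",
    "foreign_ip", "foreign_port", "ssthresh", "cwnd", "bw_window",
    "snd_window", "rcv_window", "snd_scale", "rcv_scale", "tcp_state",
    "mss", "srtt", "sack_enabled", "tcp_flags", "rto",
    "sndbuf_size", "sndbuf_bytes", "rcvbuf_size", "rcvbuf_bytes",
    "bytes_in_flight",
]


def parse_data_line(line):
    """Map the comma-separated values of a data log message to names."""
    values = line.strip().split(",")
    if len(values) != len(FIELD_NAMES):
        raise ValueError("expected 25 fields, got %d" % len(values))
    return dict(zip(FIELD_NAMES, values))


def srtt_seconds(record, hz, tcp_rtt_scale):
    """Convert field 17 to seconds: divide by TCP_RTT_SCALE * HZ."""
    return int(record["srtt"]) / (tcp_rtt_scale * hz)


def rto_seconds(record, hz):
    """Convert field 20 to seconds: divide by HZ."""
    return int(record["rto"]) / hz


line = (
    "o,0xbec491a5,1238556193.463551,172.16.7.28,22,172.16.2.5,55931,"
    "1073725440,172312,6144,66560,66608,8,1,4,1448,936,1,996,255,"
    "33304,208,66608,0,208"
)
rec = parse_data_line(line)
srtt = srtt_seconds(rec, hz=1000, tcp_rtt_scale=32)
rto = rto_seconds(rec, hz=1000)
```
.Ed
.Pp
For the example message, with hz=1000 and tcp_rtt_scale=32, field 17 (936)
corresponds to 936 / 32000 = 0.02925 seconds and field 20 (255) corresponds
to 255 / 1000 = 0.255 seconds.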
.Pp
The third type of log message is written to the file when the module is disabled
and ceases collecting data from the running kernel.
The text below shows an example module disable log.
The fields are tab-delimited key-value pairs which provide statistics about
operations since the module was most recently enabled.
.Bd -literal -offset indent
disable_time_secs=1238556197    disable_time_usecs=933607 \\
num_inbound_tcp_pkts=356    num_outbound_tcp_pkts=627 \\
total_tcp_pkts=983    num_inbound_skipped_pkts_malloc=0 \\
num_outbound_skipped_pkts_malloc=0    num_inbound_skipped_pkts_mtx=0 \\
num_outbound_skipped_pkts_mtx=0    num_inbound_skipped_pkts_tcb=0 \\
num_outbound_skipped_pkts_tcb=0    num_inbound_skipped_pkts_icb=0 \\
num_outbound_skipped_pkts_icb=0    total_skipped_tcp_pkts=0 \\
flow_list=172.16.7.28;22-172.16.2.5;55931,
.Ed
.Pp
Field descriptions are as follows:
.Bl -tag -offset indent
.It Va disable_time_secs
Time at which the module was disabled, in seconds since the UNIX epoch.
.It Va disable_time_usecs
Time at which the module was disabled, in microseconds since disable_time_secs.
.It Va num_inbound_tcp_pkts
Number of TCP packets that traversed up the network stack.
This only includes inbound TCP packets during the periods when
.Nm
was enabled.
.It Va num_outbound_tcp_pkts
Number of TCP packets that traversed down the network stack.
This only includes outbound TCP packets during the periods when
.Nm
was enabled.
.It Va total_tcp_pkts
The sum of num_inbound_tcp_pkts and num_outbound_tcp_pkts.
.It Va num_inbound_skipped_pkts_malloc
Number of inbound packets that were not processed because of failed
.Fn malloc
calls.
.It Va num_outbound_skipped_pkts_malloc
Number of outbound packets that were not processed because of failed
.Fn malloc
calls.
.It Va num_inbound_skipped_pkts_mtx
Number of inbound packets that were not processed because of failure to add the
packet to the packet processing queue.
.It Va num_outbound_skipped_pkts_mtx
Number of outbound packets that were not processed because of failure to add the
packet to the packet processing queue.
.It Va num_inbound_skipped_pkts_tcb
Number of inbound packets that were not processed because of failure to find the
TCP control block associated with the packet.
.It Va num_outbound_skipped_pkts_tcb
Number of outbound packets that were not processed because of failure to find
the TCP control block associated with the packet.
.It Va num_inbound_skipped_pkts_icb
Number of inbound packets that were not processed because of failure to find the
IP control block associated with the packet.
.It Va num_outbound_skipped_pkts_icb
Number of outbound packets that were not processed because of failure to find
the IP control block associated with the packet.
.It Va total_skipped_tcp_pkts
The sum of all skipped packet counters.
.It Va flow_list
A CSV list of TCP flows that triggered data log messages to be generated since
the module was loaded.
Each flow entry in the CSV list is
formatted as
.Qq local_ip;local_port-foreign_ip;foreign_port .
If there are no entries in the list (i.e. no data log messages were generated),
the value will be blank.
If there is at least one entry in the list, a trailing comma will always be
present.
.El
.Pp
The total number of data log messages found in the log file for a module
enable/disable cycle should equal total_tcp_pkts - total_skipped_tcp_pkts.
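.Pp
The flow_list format and the consistency check above can be sketched in Python
as follows.
This is an illustration rather than a supplied tool: the function names are
hypothetical, and the
.Va stats
dictionary stands in for a disable log message that has already been split
into key-value pairs (only the keys used here are shown).
.Bd -literal -offset indent
```python
def parse_flow_list(value):
    """Return (local_ip, local_port, foreign_ip, foreign_port) tuples."""
    flows = []
    for entry in value.split(","):
        if not entry:  # blank value, or the piece after the trailing comma
            continue
        local, foreign = entry.split("-")
        local_ip, local_port = local.split(";")
        foreign_ip, foreign_port = foreign.split(";")
        flows.append(
            (local_ip, int(local_port), foreign_ip, int(foreign_port))
        )
    return flows


def expected_data_messages(stats):
    """Compute total_tcp_pkts - total_skipped_tcp_pkts."""
    total = int(stats["total_tcp_pkts"])
    skipped = int(stats["total_skipped_tcp_pkts"])
    return total - skipped


stats = {
    "num_inbound_tcp_pkts": "356",
    "num_outbound_tcp_pkts": "627",
    "total_tcp_pkts": "983",
    "total_skipped_tcp_pkts": "0",
    "flow_list": "172.16.7.28;22-172.16.2.5;55931,",
}
flows = parse_flow_list(stats["flow_list"])
expected = expected_data_messages(stats)
```
.Ed
.Pp
For the example disable log, 983 data log messages would be expected for the
cycle, and the flow list contains a single flow.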
.Sh IMPLEMENTATION NOTES
.Nm
hooks into the network stack using the
.Xr pfil 9
interface.
In its current incarnation, it hooks into the AF_INET/AF_INET6 (IPv4/IPv6)
.Xr pfil 9
filtering points, which means it sees packets at the IP layer of the network
stack.
This means that TCP packets inbound to the stack are intercepted before
they have been processed by the TCP layer.
Packets outbound from the stack are intercepted after they have been processed
by the TCP layer.
.Pp
The diagram below illustrates how
.Nm
inserts itself into the stack.
.Bd -literal -offset indent
----------------------------------
           Upper Layers
----------------------------------
    ^                       |
    |                       |
    |                       |
    |                       v
 TCP in                  TCP out
----------------------------------
    ^                      |
    |________     _________|
            |     |
            |     v
           ---------
           | SIFTR |
           ---------
            ^     |
    ________|     |__________
    |                       |
    |                       v
IPv{4/6} in            IPv{4/6} out
----------------------------------
    ^                       |
    |                       |
    |                       v
Layer 2 in             Layer 2 out
----------------------------------
          Physical Layer
----------------------------------
.Ed
.Pp
.Nm
uses the
.Xr alq 9
interface to manage writing data to disk.
.Pp
At first glance, you might mistakenly think that
.Nm
extracts information from
individual TCP packets.
This is not the case.
.Nm
uses TCP packet events (inbound and outbound) for each TCP flow originating from
the system to trigger a dump of the state of the TCP control block for that
flow.
With the PPL set to 1, we are in effect sampling each TCP flow's control block
state as frequently as flow packets enter/leave the system.
For example, setting PPL to 2 halves the sampling rate, i.e. every second flow
packet (inbound OR outbound) causes a dump of the control block state.
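.Pp
The effect of PPL on sampling can be modelled with a short Python sketch.
This is a simplified model, not the kernel code: it assumes one counter per
flow that counts inbound and outbound packets together, and a control block
dump on every PPL-th packet of that flow.
The function name
.Fn sampled_packets
and the flow labels are made up for the example.
.Bd -literal -offset indent
```python
from collections import defaultdict


def sampled_packets(flow_ids, ppl):
    """Return the indexes of packets that would trigger a log message.

    flow_ids lists the flow each packet belongs to, in arrival order.
    """
    counts = defaultdict(int)
    triggers = []
    for index, flow in enumerate(flow_ids):
        counts[flow] += 1
        if counts[flow] % ppl == 0:
            triggers.append(index)
    return triggers


packets = ["A", "A", "B", "A", "B", "A"]
every_packet = sampled_packets(packets, ppl=1)
every_second = sampled_packets(packets, ppl=2)
```
.Ed
.Pp
In this model, a PPL of 1 selects all six packets, while a PPL of 2 selects
the second and fourth packets of flow A and the second packet of flow B.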
499*a5548bf6SLawrence Stewart.Pp
500*a5548bf6SLawrence StewartThe distinction between interrogating individual packets vs interrogating the
501*a5548bf6SLawrence Stewartcontrol block is important, because
502*a5548bf6SLawrence Stewart.Nm
503*a5548bf6SLawrence Stewartdoes not remove the need for packet capturing tools like
504*a5548bf6SLawrence Stewart.Xr tcpdump 1 .
505*a5548bf6SLawrence Stewart.Nm
506*a5548bf6SLawrence Stewartallows you to correlate and observe the cause-and-affect relationship between
507*a5548bf6SLawrence Stewartwhat you see on the wire (captured using a tool like
508*a5548bf6SLawrence Stewart.Xr tcpdump 1 Ns )
509*a5548bf6SLawrence Stewartand changes in the TCP control block corresponding to the flow of interest.
510*a5548bf6SLawrence StewartIt is therefore useful to use
511*a5548bf6SLawrence Stewart.Nm
512*a5548bf6SLawrence Stewartand a tool like
513*a5548bf6SLawrence Stewart.Xr tcpdump 1
514*a5548bf6SLawrence Stewartto gather the necessary data to piece together the complete picture.
515*a5548bf6SLawrence StewartUse of either tool on its own will not be able to provide all of the necessary
516*a5548bf6SLawrence Stewartdata.
.Pp
As a result of needing to interrogate the TCP control block, certain packets
during the lifecycle of a connection are unable to trigger a
.Nm
log message.
The initial handshake takes place without the existence of a control block and
the final ACK is exchanged when the connection is in the TIMEWAIT state.
.Pp
.Nm
was designed to minimise the delay introduced to packets traversing the network
stack.
This design called for a highly optimised and minimal hook function that
extracts only the details necessary whilst holding the packet up, and
passes these details to another thread for actual processing and logging.
.Pp
This multithreaded design does introduce some contention issues when accessing
the data structure shared between the threads of operation.
When the hook function tries to place details in the structure, it must first
acquire an exclusive lock.
Likewise, when the processing thread tries to read details from the structure,
it must also acquire an exclusive lock to do so.
If one thread holds the lock, the other must wait before it can obtain it.
This does introduce some additional bounded delay into the kernel's packet
processing code path.
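.Pp
The hand-off between the hook function and the processing thread can be
modelled in user space roughly as shown below.
This is a sketch under stated assumptions, not the kernel implementation: the
names PacketQueue and run_pipeline are hypothetical, and Python's threading
primitives stand in for the kernel lock.
.Bd -literal -offset indent
```python
import threading
from collections import deque


class PacketQueue:
    """Shared structure guarded by one exclusive lock (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ready = threading.Condition(self._lock)
        self._items = deque()

    def put(self, details):
        # Hook side: hold the lock only long enough to append the details.
        with self._ready:
            self._items.append(details)
            self._ready.notify()

    def take_all(self):
        # Processing side: take the same lock, drain the queue, release.
        with self._ready:
            while not self._items:
                self._ready.wait()
            batch = list(self._items)
            self._items.clear()
        return batch


def run_pipeline(packets):
    """Push packet details through the queue and return what was logged."""
    queue = PacketQueue()
    logged = []

    def worker():
        done = False
        while not done:
            for details in queue.take_all():
                if details is None:  # sentinel: no more packets
                    done = True
                    break
                logged.append(details)  # stand-in for the slow logging work

    thread = threading.Thread(target=worker)
    thread.start()
    for details in packets:
        queue.put(details)  # the "hook" returns as soon as the append is done
    queue.put(None)
    thread.join()
    return logged
```
.Ed
.Pp
The producer only ever waits for the short critical section in take_all, which
is the kind of bounded delay the text above refers to.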
.Pp
In some cases (e.g. low memory, connection termination), TCP packets that enter
the
.Nm
.Xr pfil 9
hook function will not trigger a log message to be generated.
.Nm
refers to this outcome as a
.Qq skipped packet .
Note that
.Nm
always ensures that packets are allowed to continue through the stack, even if
they could not successfully trigger a data log message.
.Nm
will therefore not introduce any packet loss for TCP/IP packets traversing the
network stack.
.Ss Important Behaviours
The behaviour of a log file path change whilst the module is enabled is as
follows:
.Bl -enum
.It
Attempt to open the new file path for writing.
If this fails, the path change will fail and the existing path will continue to
be used.
.It
Assuming the new path is valid and opened successfully:
.Bl -dash
.It
Flush all pending log messages to the old file path.
.It
Close the old file path.
.It
Switch the active log file pointer to point at the new file path.
.It
Commence logging to the new file.
.El
.El
.Pp
During the time between the flush of pending log messages to the old file and
commencing logging to the new file, new log messages will still be generated and
buffered.
As soon as the new file path is ready for writing, the accumulated log messages
will be written out to the file.
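.Pp
The path change steps listed above can be sketched as a small user-space model.
This is an illustration only, written in Python rather than kernel C; the
LogWriter class, its methods and the demo function are hypothetical names and
do not correspond to anything in the module's source.
.Bd -literal -offset indent
```python
import os
import tempfile


class LogWriter:
    """Toy model of a log writer whose output path can change while in use."""

    def __init__(self, path):
        self.fh = open(path, "a")
        self.pending = []  # messages generated but not yet written

    def log(self, message):
        self.pending.append(message)

    def flush(self):
        for message in self.pending:
            print(message, file=self.fh)
        self.pending.clear()
        self.fh.flush()

    def change_path(self, new_path):
        # Step 1: try to open the new path; on failure keep the old one.
        try:
            new_fh = open(new_path, "a")
        except OSError:
            return False
        # Step 2: flush pending messages to the old file, close it,
        # then switch the active file to the new path.
        self.flush()
        self.fh.close()
        self.fh = new_fh
        return True


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "old.log")
        new = os.path.join(tmp, "new.log")
        writer = LogWriter(old)
        writer.log("m1")
        # The directory "missing" does not exist, so this change must fail.
        failed = writer.change_path(os.path.join(tmp, "missing", "x.log"))
        switched = writer.change_path(new)
        writer.log("m2")  # buffered, then written to the new file
        writer.flush()
        writer.fh.close()
        with open(old) as f:
            old_text = f.read()
        with open(new) as f:
            new_text = f.read()
        return failed, switched, old_text, new_text
```
.Ed
.Pp
In the demo, the message logged before the switch ends up in the old file and
the message logged afterwards ends up in the new one.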
.Sh EXAMPLES
To enable the module's operations, run the following command as root:
.Dl sysctl net.inet.siftr.enabled=1
.Pp
To change the granularity of log messages such that 1 log message is
generated for every 10 TCP packets per connection, run the following
command as root:
.Dl sysctl net.inet.siftr.ppl=10
.Pp
To change the log file location to
.Pa /tmp/siftr.log ,
run the following command as root:
.Dl sysctl net.inet.siftr.logfile=/tmp/siftr.log
.Sh SEE ALSO
.Xr tcpdump 1 ,
.Xr tcp 4 ,
.Xr sysctl 8 ,
.Xr alq 9 ,
.Xr pfil 9
.Sh ACKNOWLEDGEMENTS
Development of this software was made possible in part by grants from the
Cisco University Research Program Fund at Community Foundation Silicon Valley,
and the FreeBSD Foundation.
.Sh HISTORY
.Nm
first appeared in
.Fx 9.0 .
.Pp
.Nm
was first released in 2007 by Lawrence Stewart and James Healy whilst working on
the NewTCP research project at Swinburne University's Centre for Advanced
Internet Architectures, Melbourne, Australia, which was made possible in part by
a grant from the Cisco University Research Program Fund at Community Foundation
Silicon Valley.
More details are available at:
.Pp
http://caia.swin.edu.au/urp/newtcp/
.Pp
Work on
.Nm
v1.2.x was sponsored by the FreeBSD Foundation as part of
the
.Qq Enhancing the FreeBSD TCP Implementation
project 2008-2009.
More details are available at:
.Pp
http://www.freebsdfoundation.org/
.Pp
http://caia.swin.edu.au/freebsd/etcp09/
.Sh AUTHORS
.An -nosplit
.Nm
was written by
.An Lawrence Stewart Aq lstewart@FreeBSD.org
and
.An James Healy Aq jimmy@deefa.com .
.Pp
This manual page was written by
.An Lawrence Stewart Aq lstewart@FreeBSD.org .
.Sh BUGS
Current known limitations and any relevant workarounds are outlined below:
.Bl -dash
.It
The internal queue used to pass information between the threads of operation is
currently unbounded.
This allows
.Nm
to cope with bursty network traffic, but sustained high packet-per-second
traffic can cause exhaustion of kernel memory if the processing thread cannot
keep up with the packet rate.
.It
If using
.Nm
on a machine that is also running other modules utilising the
.Xr pfil 9
framework, e.g.
.Xr dummynet 4 ,
.Xr ipfw 8 ,
.Xr pf 4 ,
the order in which you load the modules is important.
You should load the other modules with
.Xr kldload 8
first, as this will ensure TCP packets undergo any necessary manipulations
before
.Nm
.Qq sees
and processes them.
.It
There is a known, harmless lock order reversal warning between the
.Xr pfil 9
mutex and tcbinfo TCP lock reported by
.Xr witness 4
when
.Nm
is enabled in a kernel compiled with
.Xr witness 4
support.
.It
There is no way to filter which TCP flows you wish to capture data for.
Post processing is required to separate out data belonging to particular flows
of interest.
.It
The module does not detect deletion of the log file path.
New log messages will simply be lost if the log file being used by
.Nm
is deleted whilst the module is set to use the file.
Switching to a new log file using the
.Em net.inet.siftr.logfile
variable will create the new file and allow log messages to begin being written
to disk again.
The new log file path must differ from the path to the deleted file.
.It
The hash table used within the code is sized to hold 65536 flows.
This is not a hard limit, because chaining is used to handle collisions within
the hash table structure.
However, we suspect (based on analogies with other hash table performance data)
that the hash table lookup performance (and therefore the module's packet
processing performance) will degrade in an exponential manner as the number of
unique flows handled in a module enable/disable cycle approaches and surpasses
65536.
.It
There is no garbage collection performed on the flow hash table.
The only way currently to flush it is to disable
.Nm .
.It
The PPL variable applies to packets that make it into the processing thread,
not total packets received in the hook function.
Packets are skipped before the PPL variable is applied, which means there may be
a slight discrepancy in the triggering of log messages.
For example, if PPL was set to 10, and the 8th packet since the last log message
is skipped, the 11th packet will actually trigger the log message to be
generated.
This is discussed in greater depth in CAIA technical report 070824A.
.It
At the time of writing, there was no simple way to hook into the TCP layer
to intercept packets.
.Nm Ap s
use of IP layer hook points means all IP
traffic will be processed by the
.Nm
.Xr pfil 9
hook function, which introduces minor, but nonetheless unnecessary packet delay
and processing overhead on the system for non-TCP packets as well.
Hooking in at the IP layer is also not ideal from the data gathering point of
view.
Packets traversing up the stack will be intercepted and cause a log message
to be generated BEFORE they have been processed by the TCP layer, which means we
cannot observe the cause-and-effect relationship between inbound events and the
corresponding TCP control block as precisely as could be.
Ideally,
.Nm
should intercept packets after they have been processed by the TCP layer, i.e.
intercept packets coming up the stack after they have been processed by
.Fn tcp_input ,
and intercept packets coming down the stack after they have been processed by
.Fn tcp_output .
The current code still gives satisfactory granularity though, as inbound events
tend to trigger outbound events, allowing the cause-and-effect to be observed
indirectly by capturing the state on outbound events as well.
.It
The
.Qq inflight bytes
value logged by
.Nm
does not take into account bytes that have been
.No SACK Ap ed
by the receiving host.
.It
Packet hash generation does not currently work for IPv6 based TCP packets.
.It
Compressed notation is not used for IPv6 address representation.
This consumes more bytes than is necessary in log output.
.El
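.Pp
The PPL discrepancy caused by skipped packets, described in the list above, can
be reproduced with a small user-space model.
This is a hedged Python sketch for a single flow, not the module's code; the
function name logged_packets and its parameters are assumptions made for the
example.
.Bd -literal -offset indent
```python
def logged_packets(total, ppl, skipped=()):
    """Return the 1-based packet numbers that trigger a log message.

    Skipped packets never reach the PPL counter, so they delay the
    next log message by one packet each.
    """
    skipped = set(skipped)
    counted = 0
    logged = []
    for number in range(1, total + 1):
        if number in skipped:
            continue  # dropped before the PPL check is applied
        counted += 1
        if counted % ppl == 0:
            logged.append(number)
    return logged


print(logged_packets(25, 10))               # no packets skipped
print(logged_packets(25, 10, skipped={8}))  # 8th packet skipped
```
.Ed
.Pp
With the 8th packet skipped, the first log message in this model is triggered
by the 11th packet rather than the 10th, matching the example given above.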