# Some Information for Contributors
Thank you for considering making a contribution to tcpdump! Please use the
guidelines below to achieve the best results and experience for everyone.

## How to report bugs and other problems
**To report a security issue (segfault, buffer overflow, infinite loop, arbitrary
code execution etc.) please send an e-mail to security@tcpdump.org; do not use
the bug tracker!**

To report a non-security problem (failure to compile, incorrect output in the
protocol printout, missing support for a particular protocol etc.) please check
first that it reproduces with the latest stable release of tcpdump and the latest
stable release of libpcap. If it does, please check that the problem reproduces
with the current git master branch of tcpdump and the current git master branch of
libpcap. If it does (and it is not a security-related problem, otherwise see
above), please navigate to the
[bug tracker](https://github.com/the-tcpdump-group/tcpdump/issues)
and check if the problem has already been reported. If it has not, please open
a new issue and provide the following details:

* tcpdump and libpcap version (`tcpdump --version`)
* operating system name and version and any other details that may be relevant
  (`uname -a`, compiler name and version, CPU type etc.)
* custom `configure`/`cmake` flags, if any
* statement of the problem
* steps to reproduce

Please note that if you know exactly how to solve the problem and the solution
would not be too intrusive, it would be best to contribute some development time
and to open a pull request instead, as discussed below.

Still not sure what to do? Feel free to
[subscribe to the mailing list](https://www.tcpdump.org/#mailing-lists)
and ask!


## How to add new code and to update existing code

1) Check that there isn't a pull request already opened for the changes you
   intend to make.

2) [Fork](https://help.github.com/articles/fork-a-repo/) the tcpdump
   [repository](https://github.com/the-tcpdump-group/tcpdump).

3) The easiest way to test your changes on multiple operating systems and
   architectures is to let the upstream CI test your pull request (more on
   this below).

4) Set up your git working copy:
   ```
   git clone https://github.com/<username>/tcpdump.git
   cd tcpdump
   git remote add upstream https://github.com/the-tcpdump-group/tcpdump
   git fetch upstream
   ```

5) Do a `touch .devel` in your working directory.
   Currently, this has the following effects:
   * it adds (via `configure`, in `Makefile`) some warning options (`-Wall`,
     `-Wmissing-prototypes`, `-Wstrict-prototypes`, ...) to the compiler if it
     supports these options,
   * it makes the `Makefile` support `make depend` and has the `configure`
     script run it.

6) Configure and build:
   ```
   ./configure && make -s && make check
   ```

7) Add/update tests.
   The `tests` directory contains regression tests of the dissection of captured
   packets.  Those captured packets were saved by running tcpdump with the
   `-w sample.pcap` option.  Additional options, such as `-n`, are used to create
   relevant and reproducible output; `-#` is used to indicate which particular
   packets have output that differs.  The tests are run with the `TZ` environment
   variable set to `GMT0`, so that UTC, rather than the local time where the
   tests are being run, is used when "local time" values are printed.  The
   actual test compares the current text output with the expected result
   (`sample.out`) saved from a previous version.

   Any new/updated fields in a dissector must be present in a `sample.pcap` file
   and the corresponding output file.

   Configuration is set in `tests/TESTLIST`.
   Each line in this file has the following format:
   ```
   test-name   sample.pcap   sample.out   tcpdump-options
   ```

   The `sample.out` file can be produced as follows:
   ```
   (cd tests && TZ=GMT0 ../tcpdump -# -n -r sample.pcap tcpdump-options > sample.out)
   ```

   Or, for convenience, use `./update-test.sh test-name`.

   It is often useful to have test outputs with different verbosity levels
   (none, `-v`, `-vv`, `-vvv`, etc.) depending on the code.

8) Test using `make check` (current build options) and `./build_matrix.sh`
   (a multitude of build options, build systems and compilers). If you can,
   test on more than one operating system. Don't send a pull request until
   all tests pass.

9) Try to rebase your commits to keep the history simple.
   ```
   git fetch upstream
   git rebase upstream/master
   ```
   (If the rebase fails and you cannot resolve the conflicts, issue
   `git rebase --abort` and ask for help in a pull request comment.)

10) Once 100% happy, put your work into your forked repository using `git push`.

11) [Initiate and send](https://help.github.com/articles/using-pull-requests/)
    a pull request.
    This will trigger the upstream repository CI tests.


## Code style and generic remarks
1) A thorough reading of other printers' code is useful.

2) To help learn how tcpdump works or to help with debugging,
   you can configure and build tcpdump with function instrumentation:
   ```
   $ ./configure --enable-instrument-functions
   $ make -s clean all
   ```

   This generates instrumentation calls for entry and exit to functions.
   Just after function entry and just before function exit, these
   profiling functions are called and print the function names with
   indentation and call level.

   On function entry, the name of the calling function is also printed, with
   file name and line number. There may be a small shift in the line number.

   In some cases, with Clang 11, the file name is unknown (printed '??')
   or the line number is unknown (printed '?'). In this case, use GCC.

   The environment variable `INSTRUMENT` controls the output. If it is
   - unset or set to an empty string, nothing is printed, as with no
     instrumentation;
   - set to "all" or "a", all function names are printed;
   - set to "global" or "g", only global function names are printed.

   This allows you to run:
   ```
   $ INSTRUMENT=a ./tcpdump ...
   $ INSTRUMENT=g ./tcpdump ...
   $ INSTRUMENT= ./tcpdump ...
   ```
   or
   ```
   $ export INSTRUMENT=global
   $ ./tcpdump ...
   ```

   The libbfd library is used, so the binutils-dev package is required.

3) Cite the normative reference, if any, in comments (RFC, etc.).

4) Describe the format of packets/headers/options in comments if there is no
   published normative reference.

5) The printer may receive an incomplete packet in the buffer, truncated at any
   random position, for example by capturing with the `-s size` option.
   This means that an attempt to fetch packet data based on the expected
   format of the packet may run the risk of overrunning the buffer.

   Furthermore, if the packet is complete, but is not correctly formed,
   that can also cause a printer to overrun the buffer, as it will be
   fetching packet data based on the expected format of the packet.

   Therefore, integral, IPv4 address, and octet sequence values should
   be fetched using the `GET_*()` macros, which are defined in
   `extract.h`.

   If your code reads and decodes every byte of the protocol packet, then to
   ensure proper and complete bounds checks it would be sufficient to read all
   packet data using the `GET_*()` macros.

   If your code uses the macros above only on some packet data, then the gaps
   would have to be bounds-checked using the `ND_TCHECK_*()` macros:
   ```
   ND_TCHECK_n(p), n in { 1, 2, 3, 4, 5, 6, 7, 8, 16 }
   ND_TCHECK_SIZE(p)
   ND_TCHECK_LEN(p, l)
   ```

   where *p* points to the data not being decoded.  For `ND_TCHECK_n()`,
   *n* is the length of the gap, in bytes.  For `ND_TCHECK_SIZE()`, the
   length of the gap, in bytes, is the size of an item of the data type
   to which *p* points.  For `ND_TCHECK_LEN()`, *l* is the length of the
   gap, in bytes.

   To use the `GET_*()` and `ND_TCHECK_*()` macros (if not already done):
   * Assign: `ndo->ndo_protocol = "protocol";`
   * Define: `ND_LONGJMP_FROM_TCHECK` before including `netdissect.h`
   * Make sure that the intersection of `GET_*()` and `ND_TCHECK_*()` is minimal,
     but at the same time their union covers all packet data in all cases.

   You can test the code via:
   ```
   sudo ./tcpdump -s snaplen [-v][v][...] -i lo # in a terminal
   sudo tcpreplay -i lo sample.pcap             # in another terminal
   ```
   You should try several values for snaplen to exercise truncation at
   different positions.
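
   The bounds-checking idea behind the `GET_*()` macros can be illustrated
   with a small self-contained sketch. It is not tcpdump's implementation:
   `struct ex_buf`, `ex_get_be_u_2()` and `ex_print_len()` are hypothetical
   names invented for this example, and the `setjmp()`/`longjmp()` pair is
   only an assumed way of abandoning a truncated packet.
   ```c
   #include <setjmp.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical stand-in for the captured part of a packet. */
   struct ex_buf {
       const uint8_t *start;   /* first captured byte */
       size_t caplen;          /* number of captured bytes (snaplen-limited) */
       jmp_buf truncated;      /* where to go when data is missing */
   };

   /* Fetch a 2-byte big-endian value at offset off, or bail out if any
    * of its bytes lies beyond the captured data. */
   static uint16_t
   ex_get_be_u_2(struct ex_buf *b, size_t off)
   {
       if (off > b->caplen || b->caplen - off < 2)
           longjmp(b->truncated, 1);
       return (uint16_t)((b->start[off] << 8) | b->start[off + 1]);
   }

   /* Decode a 2-byte length field located at offset 2 of the packet.
    * Returns the length, or -1 if the capture ends before the field does. */
   static long
   ex_print_len(const uint8_t *pkt, size_t caplen)
   {
       struct ex_buf b;

       b.start = pkt;
       b.caplen = caplen;
       if (setjmp(b.truncated) != 0)
           return -1;          /* a real printer would report truncation here */
       return (long)ex_get_be_u_2(&b, 2);
   }
   ```
   With a 4-byte capture, `ex_print_len()` returns the decoded length; with a
   3-byte capture of the same packet it returns -1 instead of reading past
   the captured data.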

*  The `GET_*()` macros that fetch integral values are:
   ```
   GET_U_1(p)
   GET_S_1(p)
   GET_BE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
   GET_BE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
   GET_LE_U_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
   GET_LE_S_n(p), n in { 2, 3, 4, 5, 6, 7, 8 }
   ```

   where *p* points to the integral value in the packet buffer. The
   macro returns the integral value at that location.

   `U` indicates that an unsigned value is fetched; `S` indicates that a
   signed value is fetched.  For multi-byte values, `BE` indicates that
   a big-endian value ("network byte order") is fetched, and `LE`
   indicates that a little-endian value is fetched. *n* is the length,
   in bytes, of the multi-byte integral value to be fetched.

   In addition to the bounds checking the `GET_*()` macros perform,
   using those macros has other advantages:

   * tcpdump runs on both big-endian and little-endian systems, so
     fetches of multi-byte integral values must be done in a fashion
     that works regardless of the byte order of the machine running
     tcpdump.  The `GET_BE_*()` macros will fetch a big-endian value and
     return a host-byte-order value on both big-endian and little-endian
     machines, and the `GET_LE_*()` macros will fetch a little-endian
     value and return a host-byte-order value on both big-endian and
     little-endian machines.

   * tcpdump runs on machines that do not support unaligned access to
     multi-byte values, and packet values are not guaranteed to be
     aligned on the proper boundary.  The `GET_BE_*()` and `GET_LE_*()`
     macros will fetch values even if they are not aligned on the proper
     boundary.
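
   One common way to obtain both properties is to assemble the value one
   byte at a time. The sketch below shows that technique only; the function
   names are hypothetical, it performs no bounds checking, and it is not
   claimed to be how the `GET_*()` macros are written.
   ```c
   #include <stdint.h>

   /* Assemble a 32-bit big-endian ("network byte order") value byte by byte.
    * No multi-byte load is performed, so p need not be aligned, and the
    * result is the same on big-endian and little-endian hosts. */
   static uint32_t
   ex_extract_be_u_4(const uint8_t *p)
   {
       return ((uint32_t)p[0] << 24) |
              ((uint32_t)p[1] << 16) |
              ((uint32_t)p[2] << 8) |
              (uint32_t)p[3];
   }

   /* Same idea for a little-endian value: the first byte is the least
    * significant one. */
   static uint32_t
   ex_extract_le_u_4(const uint8_t *p)
   {
       return ((uint32_t)p[3] << 24) |
              ((uint32_t)p[2] << 16) |
              ((uint32_t)p[1] << 8) |
              (uint32_t)p[0];
   }
   ```
   Both functions can be called with a pointer at an odd offset into a
   buffer, which is the usual situation for fields inside a packet.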

*  The `GET_*()` macros that fetch IPv4 address values are:
   ```
   GET_IPV4_TO_HOST_ORDER(p)
   GET_IPV4_TO_NETWORK_ORDER(p)
   ```

   where *p* points to the address in the packet buffer.
   `GET_IPV4_TO_HOST_ORDER()` returns the address in the byte order of
   the host that is running tcpdump; `GET_IPV4_TO_NETWORK_ORDER()`
   returns it in network byte order.

   Like the integral `GET_*()` macros, these macros work correctly on
   both big-endian and little-endian machines and will fetch values even
   if they are not aligned on the proper boundary.

*  The `GET_*()` macro that fetches an arbitrary sequence of bytes is:
   ```
   GET_CPY_BYTES(dst, p, len)
   ```

   where *dst* is the destination to which the sequence of bytes should
   be copied, *p* points to the first byte of the sequence of bytes, and
   *len* is the number of bytes to be copied.  The bytes are copied in
   the order in which they appear in the packet.

*  To fetch a network address and convert it to a printable string, use
   the following `GET_*()` macros, defined in `addrtoname.h`, to
   perform bounds checks to make sure the entire address is within the
   buffer and to translate the address to a string to print:
   ```
   GET_IPADDR_STRING(p)
   GET_IP6ADDR_STRING(p)
   GET_MAC48_STRING(p)
   GET_EUI64_STRING(p)
   GET_EUI64LE_STRING(p)
   GET_LINKADDR_STRING(p, type, len)
   GET_ISONSAP_STRING(nsap, nsap_length)
   ```

   `GET_IPADDR_STRING()` fetches an IPv4 address pointed to by *p* and
   returns a string that is either a host name, if the `-n` flag wasn't
   specified and a host name could be found for the address, or the
   standard XXX.XXX.XXX.XXX-style representation of the address.

   `GET_IP6ADDR_STRING()` fetches an IPv6 address pointed to by *p* and
   returns a string that is either a host name, if the `-n` flag wasn't
   specified and a host name could be found for the address, or the
   standard XXXX::XXXX-style representation of the address.

   `GET_MAC48_STRING()` fetches a 48-bit MAC address (Ethernet, 802.11,
   etc.) pointed to by *p* and returns a string that is either a host
   name, if the `-n` flag wasn't specified and a host name could be
   found in the ethers file for the address, or the standard
   XX:XX:XX:XX:XX:XX-style representation of the address.

   `GET_EUI64_STRING()` fetches a 64-bit EUI pointed to by *p* and
   returns a string that is the standard XX:XX:XX:XX:XX:XX:XX:XX-style
   representation of the address.

   `GET_EUI64LE_STRING()` fetches a 64-bit EUI, in reverse byte order,
   pointed to by *p* and returns a string that is the standard
   XX:XX:XX:XX:XX:XX:XX:XX-style representation of the address.

   `GET_LINKADDR_STRING()` fetches an octet string, of length *len*
   and type *type*, pointed to by *p* and returns a string whose format
   depends on the value of *type*:

   * `LINKADDR_MAC48` - if the length is 6, the string has the same
     value as `GET_MAC48_STRING()` would return for that address;
     otherwise, the string is a sequence of XX:XX:... values for the bytes
     of the address;

   * `LINKADDR_FRELAY` - the string is "DLCI XXX", where XXX is the
     DLCI, if the address is a valid Q.922 header, and an error indication
     otherwise;

   * `LINKADDR_EUI64`, `LINKADDR_ATM`, `LINKADDR_OTHER` -
     the string is a sequence of XX:XX:... values for the bytes
     of the address.

6) When defining a structure corresponding to a packet or part of a
   packet, so that a pointer to packet data can be cast to a pointer to
   that structure and that structure pointer used to refer to fields in
   the packet, use the `nd_*` types for the structure members.

   Those types all are aligned only on a 1-byte boundary, so a
   compiler will not assume that the structure is aligned on a boundary
   stricter than one byte; there is no guarantee that fields in packets
   are aligned on any particular boundary.

   This means that all padding in the structure must be explicitly
   declared as fields in the structure.

   The `nd_*` types for integral values are:

   * `nd_uintN_t`, for unsigned integral values, where *N* is the number
      of bits in the value (for example, `nd_uint16_t`).
   * `nd_intN_t`, for signed integral values, where *N* is the number
      of bits in the value.

   The `nd_*` types for IP addresses are:

   * `nd_ipv4`, for IPv4 addresses;
   * `nd_ipv6`, for IPv6 addresses.

   The `nd_*` types for link-layer addresses are:

   * `nd_mac48`, for MAC-48 (Ethernet, 802.11, etc.) addresses;
   * `nd_eui64`, for EUI-64 values.

   The `nd_*` type for a byte in a sequence of bytes is `nd_byte`; an
   *N*-byte sequence should be declared as `nd_byte[N]`.
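
   The effect of 1-byte-aligned member types can be seen in a sketch that
   assumes the members are plain arrays of `unsigned char`. The `ex_*`
   typedefs and `struct ex_hdr` below are hypothetical stand-ins made up for
   this example, not definitions taken from `netdissect.h`.
   ```c
   /* Hypothetical member types: arrays of unsigned char, so each has an
    * alignment requirement of one byte. */
   typedef unsigned char ex_uint8_t[1];
   typedef unsigned char ex_uint16_t[2];
   typedef unsigned char ex_ipv4[4];

   /* Hypothetical 8-byte header.  The padding byte is declared explicitly,
    * so the structure layout matches the bytes on the wire. */
   struct ex_hdr {
       ex_uint8_t  type;
       ex_uint8_t  pad;        /* explicit padding */
       ex_uint16_t length;     /* big-endian on the wire */
       ex_ipv4     src;
   };

   /* Read the length field through a structure pointer.  pkt may point at
    * any offset, because no member needs more than byte alignment.
    * No bounds checking is done in this sketch. */
   static unsigned
   ex_hdr_length(const unsigned char *pkt)
   {
       const struct ex_hdr *h = (const struct ex_hdr *)pkt;

       return ((unsigned)h->length[0] << 8) | h->length[1];
   }
   ```
   On common compilers such a structure has no hidden padding, so
   `sizeof(struct ex_hdr)` is 8 and `length` sits at offset 2.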

7) Check for invalid packets in code: assume that your code can receive as
   input not only a valid packet but any arbitrary sequence of octets, for
   example a packet that was
   * built malformed originally by the sender or by a fuzz tester,
   * corrupted in transit or for some other reason.

   Print with: `nd_print_invalid(ndo); /* to print " (invalid)" */`
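
   As an illustration of such a check, the sketch below validates a length
   field before anything relies on it. The option layout (1-byte type,
   1-byte total length, then the value) and all `ex_*`/`EX_*` names are
   assumptions made for this example; a real printer would call
   `nd_print_invalid(ndo)` where the sketch returns `EX_INVALID`.
   ```c
   #include <stddef.h>
   #include <stdint.h>

   enum ex_status { EX_OK, EX_INVALID };

   /* Validate one option: opt points at its first byte, remaining is the
    * number of packet bytes left starting at opt.  On success, *value_len
    * receives the number of value bytes that follow the 2-byte header. */
   static enum ex_status
   ex_check_option(const uint8_t *opt, size_t remaining, size_t *value_len)
   {
       size_t len;

       if (remaining < 2)
           return EX_INVALID;      /* no room for the option header */
       len = opt[1];
       if (len < 2)
           return EX_INVALID;      /* length does not cover its own header */
       if (len > remaining)
           return EX_INVALID;      /* option claims to run past the packet */
       *value_len = len - 2;
       return EX_OK;
   }
   ```
   Rejecting a length smaller than the header also keeps a loop over
   consecutive options from getting stuck on a zero-length option.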

8) Use `struct tok` for indexed strings and print them with
   `tok2str()` or `bittok2str()` (for flags).
   Every `struct tok` array must end with `{ 0, NULL }`.
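
   The role of the terminating entry can be shown with a self-contained
   sketch of a value-to-name table. `struct ex_tok`, `ex_msg_types` and
   `ex_tok2str()` are hypothetical names for this example; in particular,
   the signature of `ex_tok2str()` is an assumption and is not that of
   `tok2str()`.
   ```c
   #include <stddef.h>
   #include <stdio.h>

   /* Hypothetical value/name pair. */
   struct ex_tok {
       unsigned int v;     /* value */
       const char *s;      /* name; NULL marks the end of the table */
   };

   static const struct ex_tok ex_msg_types[] = {
       { 1, "Request" },
       { 2, "Reply" },
       { 0, NULL }         /* terminator: the lookup loop stops here */
   };

   /* Return the name for v.  If v is not in the table, format it with fmt
    * (which must contain one %u conversion) into buf and return buf. */
   static const char *
   ex_tok2str(const struct ex_tok *t, const char *fmt, unsigned int v,
              char *buf, size_t buflen)
   {
       for (; t->s != NULL; t++) {
           if (t->v == v)
               return t->s;
       }
       snprintf(buf, buflen, fmt, v);
       return buf;
   }
   ```
   Without the terminating entry the loop would have no way to know where
   the table ends and would read past it for any unknown value.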

9) Avoid empty lines in output of printers.

10) A commit message must have:
   ```
   First line: Capitalized short summary in the imperative (50 chars or less)

   If the commit concerns a protocol, the summary line must start with
   "protocol: ".

   Body: Detailed explanatory text, if necessary. Fold it to approximately
   72 characters. There must be an empty line separating the summary from
   the body.
   ```

11) Avoid non-ASCII characters in code and commit messages.

12) Use the style of the modified sources.

13) Don't mix declarations and code.

14) tcpdump requires a compiler that supports C99 or later, so C99
   features may be used in code, but C11 or later features should not be
   used.

15) Avoid trailing tabs/spaces.