#
e4b3229a |
| 06-Apr-2012 |
Alexander V. Chernikov <melifaro@FreeBSD.org> |
- Improve BPF locking model.
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greately improves performance: in most common case we need to acquire 1 reader lock i
- Improve BPF locking model.
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greately improves performance: in most common case we need to acquire 1 reader lock instead of 2 mutexes.
- Remove filter(descriptor) (reader) lock in bpf_mtap[2] This was suggested by glebius@. We protect filter by requesting interface writer lock on filter change.
- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h without including rwlock stuff. However, this is is temporary solution, struct bpf_if should be made opaque for any external caller.
Found by: Dmitrij Tejblum <tejblum@yandex-team.ru> Sponsored by: Yandex LLC
Reviewed by: glebius (previous version) Reviewed by: silence on -net@ Approved by: (mentor)
MFC after: 3 weeks
show more ...
|
Revision tags: release/9.0.0, release/7.4.0_cvs, release/8.2.0_cvs, release/7.4.0, release/8.2.0, release/8.1.0_cvs, release/8.1.0 |
|
#
d6c18050 |
| 07-Jul-2010 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Merge svn+ssh://svn.freebsd.org/base/head@209749
|
#
547d94bd |
| 15-Jun-2010 |
Jung-uk Kim <jkim@FreeBSD.org> |
Implement flexible BPF timestamping framework.
- Allow setting format, resolution and accuracy of BPF time stamps per listener. Previously, we were only able to use microtime(9). Now we can set va
Implement flexible BPF timestamping framework.
- Allow setting format, resolution and accuracy of BPF time stamps per listener. Previously, we were only able to use microtime(9). Now we can set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command. Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP command. Document all supported options in bpf(4) and their uses.
- Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'. The new time stamp has both 64-bit second and fractional parts. bpf_xhdr has this time stamp instead of 'struct timeval' for bh_tstamp. The new structures let us use bh_tstamp of same size on both 32-bit and 64-bit platforms without adding additional shims for 32-bit binaries. On 64-bit platforms, size of BPF header does not change compared to bpf_hdr as its members are already all 64-bit long. On 32-bit platforms, the size may increase by 8 bytes. For backward compatibility, struct bpf_hdr with struct timeval is still the default header unless new time stamp format is explicitly requested. However, the behaviour may change in the future and all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now.
- Add experimental support for tagging mbufs with time stamps from a lower layer, e.g., device driver. Currently, mbuf_tags(9) is used to tag mbufs. The time stamps must be uptime in 'struct bintime' format as binuptime(9) and getbinuptime(9) do.
Reviewed by: net@
show more ...
|
#
23dfb351 |
| 09-May-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
MFC r207195: Provide compat32 shims for bpf(4), except zero-copy facilities.
|
#
9307d8bd |
| 08-May-2010 |
Marcel Moolenaar <marcel@FreeBSD.org> |
Merge svn+ssh://svn.freebsd.org/base/head@207793
|
#
a4bf5fb9 |
| 28-Apr-2010 |
Kirk McKusick <mckusick@FreeBSD.org> |
Update to current version of head.
|
#
fc0a61a4 |
| 25-Apr-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
Provide compat32 shims for bpf(4), except zero-copy facilities.
bd_compat32 field of struct bpf_d is kept unconditionally to not impose the requirement of including "opt_compat.h" on all numerous us
Provide compat32 shims for bpf(4), except zero-copy facilities.
bd_compat32 field of struct bpf_d is kept unconditionally to not impose the requirement of including "opt_compat.h" on all numerous users of bpfdesc.h.
Submitted by: jhb (version for 6.x) Reviewed and tested by: emaste MFC after: 2 weeks
show more ...
|
Revision tags: release/7.3.0_cvs, release/7.3.0, release/8.0.0_cvs, release/8.0.0 |
|
#
10b3b545 |
| 17-Sep-2009 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Merge from head
|
#
cbd59a4f |
| 08-Sep-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
- MFC from head@196987
|
#
3eced0dd |
| 12-Aug-2009 |
Jung-uk Kim <jkim@FreeBSD.org> |
MFC: r196150
Always embed pointer to BPF JIT function in BPF descriptor to avoid inconsistency when opt_bpf.h is not included.
Reviewed by: rwatson Approved by: re (rwatson)
|
#
a36599cc |
| 12-Aug-2009 |
Jung-uk Kim <jkim@FreeBSD.org> |
Always embed pointer to BPF JIT function in BPF descriptor to avoid inconsistency when opt_bpf.h is not included.
Reviewed by: rwatson Approved by: re (rwatson)
|
Revision tags: release/7.2.0_cvs, release/7.2.0, release/7.1.0_cvs, release/7.1.0, release/6.4.0_cvs, release/6.4.0 |
|
#
7b4f6e7b |
| 02-Aug-2008 |
Antoine Brodin <antoine@FreeBSD.org> |
Remove trailing ';' in BPFD_LOCK_ASSERT macro.
MFC after: 1 month X-MFC-to: stable/7, stable/6 has it right
|
#
4d621040 |
| 24-Mar-2008 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Introduce support for zero-copy BPF buffering, which reduces the overhead of packet capture by allowing a user process to directly "loan" buffer memory to the kernel rather than using read(2) to expl
Introduce support for zero-copy BPF buffering, which reduces the overhead of packet capture by allowing a user process to directly "loan" buffer memory to the kernel rather than using read(2) to explicitly copy data from kernel address space.
The user process will issue new BPF ioctls to set the shared memory buffer mode and provide pointers to buffers and their size. The kernel then wires and maps the pages into kernel address space using sf_buf(9), which on supporting architectures will use the direct map region. The current "buffered" access mode remains the default, and support for zero-copy buffers must, for the time being, be explicitly enabled using a sysctl for the kernel to accept requests to use it.
The kernel and user process synchronize use of the buffers with atomic operations, avoiding the need for system calls under load; the user process may use select()/poll()/kqueue() to manage blocking while waiting for network data if the user process is able to consume data faster than the kernel generates it. Patchs to libpcap are available to allow libpcap applications to transparently take advantage of this support. Detailed information on the new API may be found in bpf(4), including specific atomic operations and memory barriers required to synchronize buffer use safely.
These changes modify the base BPF implementation to (roughly) abstrac the current buffer model, allowing the new shared memory model to be added, and add new monitoring statistics for netstat to print. The implementation, with the exception of some monitoring hanges that break the netstat monitoring ABI for BPF, will be MFC'd.
Zerocopy bpf buffers are still considered experimental are disabled by default. To experiment with this new facility, adjust the net.bpf.zerocopy_enable sysctl variable to 1.
Changes to libpcap will be made available as a patch for the time being, and further refinements to the implementation are expected.
Sponsored by: Seccuris Inc. In collaboration with: rwatson Tested by: pwood, gallatin MFC after: 4 months [1]
[1] Certain portions will probably not be MFCed, specifically things that can break the monitoring ABI.
show more ...
|
Revision tags: release/7.0.0_cvs, release/7.0.0, release/6.3.0_cvs, release/6.3.0 |
|
#
0bf686c1 |
| 06-Aug-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Rem
Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases.
While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency.
Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)
show more ...
|
#
560a54e1 |
| 26-Feb-2007 |
Jung-uk Kim <jkim@FreeBSD.org> |
Add three new ioctl(2) commands for bpf(4).
- BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining whether incoming, outgoing, or all packets on the interface should be returned by B
Add three new ioctl(2) commands for bpf(4).
- BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining whether incoming, outgoing, or all packets on the interface should be returned by BPF. Set to BPF_D_IN to see only incoming packets on the interface. Set to BPF_D_INOUT to see packets originating locally and remotely on the interface. Set to BPF_D_OUT to see only outgoing packets on the interface. This setting is initialized to BPF_D_INOUT by default. BIOCGSEESENT and BIOCSSEESENT are obsoleted by these but kept for backward compatibility.
- BIOCFEEDBACK sets packet feedback mode. This allows injected packets to be fed back as input to the interface when output via the interface is successful. When BPF_D_INOUT direction is set, injected outgoing packet is not returned by BPF to avoid duplication. This flag is initialized to zero by default.
Note that libpcap has been modified to support BPF_D_OUT direction for pcap_setdirection(3) and PCAP_D_OUT direction is functional now.
Reviewed by: rwatson
show more ...
|
#
6d38c5ad |
| 29-Jan-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Update comment for struct bpf_d: we now store buffered packets for BPF in malloc'd storage, not in mbuf clusters.
|
#
a85614b4 |
| 27-Jan-2007 |
Robert Watson <rwatson@FreeBSD.org> |
Remove BSD < 199103 compatibility entries in the bpf_d structure: they are not used in any of our code. Also remove explicit padding variable that kept the bpf_d structure the same size before and a
Remove BSD < 199103 compatibility entries in the bpf_d structure: they are not used in any of our code. Also remove explicit padding variable that kept the bpf_d structure the same size before and after the change in select implementation, since binary compatibility is not required for this data structure on 7-CURRENT.
show more ...
|
Revision tags: release/6.2.0_cvs, release/6.2.0 |
|
#
16d878cc |
| 02-Jun-2006 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Fix the following bpf(4) race condition which can result in a panic:
(1) bpf peer attaches to interface netif0 (2) Packet is received by netif0 (3) ifp->if_bpf pointer is checked and handed off t
Fix the following bpf(4) race condition which can result in a panic:
(1) bpf peer attaches to interface netif0 (2) Packet is received by netif0 (3) ifp->if_bpf pointer is checked and handed off to bpf (4) bpf peer detaches from netif0 resulting in ifp->if_bpf being initialized to NULL. (5) ifp->if_bpf is dereferenced by bpf machinery (6) Kaboom
This race condition likely explains the various different kernel panics reported around sending SIGINT to tcpdump or dhclient processes. But really this race can result in kernel panics anywhere you have frequent bpf attach and detach operations with high packet per second load.
Summary of changes:
- Remove the bpf interface's "driverp" member - When we attach bpf interfaces, we now set the ifp->if_bpf member to the bpf interface structure. Once this is done, ifp->if_bpf should never be NULL. [1] - Introduce bpf_peers_present function, an inline operation which will do a lockless read bpf peer list associated with the interface. It should be noted that the bpf code will pickup the bpf_interface lock before adding or removing bpf peers. This should serialize the access to the bpf descriptor list, removing the race. - Expose the bpf_if structure in bpf.h so that the bpf_peers_present function can use it. This also removes the struct bpf_if; hack that was there. - Adjust all consumers of the raw if_bpf structure to use bpf_peers_present
Now what happens is:
(1) Packet is received by netif0 (2) Check to see if bpf descriptor list is empty (3) Pickup the bpf interface lock (4) Hand packet off to process
From the attach/detach side:
(1) Pickup the bpf interface lock (2) Add/remove from bpf descriptor list
Now that we are storing the bpf interface structure with the ifnet, there is is no need to walk the bpf interface list to locate the correct bpf interface. We now simply look up the interface, and initialize the pointer. This has a nice side effect of changing a bpf interface attach operation from O(N) (where N is the number of bpf interfaces), to O(1).
[1] From now on, we can no longer check ifp->if_bpf to tell us whether or not we have any bpf peers that might be interested in receiving packets.
In collaboration with: sam@ MFC after: 1 month
show more ...
|
Revision tags: release/5.5.0_cvs, release/5.5.0, release/6.1.0_cvs, release/6.1.0 |
|
#
ae275efc |
| 06-Dec-2005 |
Jung-uk Kim <jkim@FreeBSD.org> |
Add experimental BPF Just-In-Time compiler for amd64 and i386.
Use the following kernel configuration option to enable:
options BPF_JITTER
If you want to use bpf_filter() instead (e. g., debuggin
Add experimental BPF Just-In-Time compiler for amd64 and i386.
Use the following kernel configuration option to enable:
options BPF_JITTER
If you want to use bpf_filter() instead (e. g., debugging), do:
sysctl net.bpf.jitter.enable=0
to turn it off.
Currently BIOCSETWF and bpf_mtap2() are unsupported, and bpf_mtap() is partially supported because 1) no need, 2) avoid expensive m_copydata(9).
Obtained from: WinPcap 3.1 (for i386)
show more ...
|
Revision tags: release/6.0.0_cvs, release/6.0.0 |
|
#
b75a24a0 |
| 06-Sep-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Instead of caching the PID which opened the bpf descriptor, continuously refresh the PID which has the descriptor open. The PID is refreshed in various operations like ioctl(2), kevent(2) or poll(2).
Instead of caching the PID which opened the bpf descriptor, continuously refresh the PID which has the descriptor open. The PID is refreshed in various operations like ioctl(2), kevent(2) or poll(2). This produces more accurate information about current bpf consumers. While we are here remove the bd_pcomm member of the bpf stats structure because now that we have an accurate PID we can lookup the via the kern.proc.pid sysctl variable. This is the trick that NetBSD decided to use to deal with this issue.
Special care needs to be taken when MFC'ing this change, as we have made a change to the bpf stats structure. What will end up happening is we will leave the pcomm structure but just mark it as being un-used. This way we keep the ABI in tact.
MFC after: 1 month Discussed with: Rui Paulo < rpaulo at NetBSD dot org >
show more ...
|
#
93e39f0b |
| 22-Aug-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands enhance the security of bpf(4) by further relinquishing the privilege of the bpf(4) consumer (assuming the ioctl commands a
Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands enhance the security of bpf(4) by further relinquishing the privilege of the bpf(4) consumer (assuming the ioctl commands are being implemented).
Once BIOCLOCK is executed, the device becomes locked which prevents the execution of ioctl(2) commands which can change the underly parameters of the bpf(4) device. An example might be the setting of bpf(4) filter programs or attaching to different network interfaces.
BIOCSETWF can be used to set write filters for outgoing packets. Currently if a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used as a raw socket, regardless of consumer's UID. Write filters give users the ability to constrain which packets can be sent through the bpf(4) descriptor.
These features are currently implemented by a couple programs which came from OpenBSD, such as the new dhclient and pflogd.
-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify whether a read or write filter is to be set. -Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the filter program on the mbuf data once we move the packet in from user-space. -Rather than execute two uiomove operations, (one for the link header and the other for the packet data), execute one and manually copy the linker header into the sockaddr structure via bcopy. -Restructure bpf_setf to compensate for write filters, as well as read. -Adjust bpf(4) stats structures to include a bd_locked member.
It should be noted that the FreeBSD and OpenBSD implementations differ a bit in the sense that we unconditionally enforce the lock, where OpenBSD enforces it only if the calling credential is not root.
Idea from: OpenBSD Reviewed by: mlaier
show more ...
|
#
69f7644b |
| 24-Jul-2005 |
Christian S.J. Peron <csjp@FreeBSD.org> |
Introduce new sysctl variable: net.bpf.stats. This sysctl variable can be used to pass statistics regarding dropped, matched and received packet counts from the kernel to user-space. While we are her
Introduce new sysctl variable: net.bpf.stats. This sysctl variable can be used to pass statistics regarding dropped, matched and received packet counts from the kernel to user-space. While we are here introduce a new counter for filtered or matched packets. We currently keep track of packets received or dropped by the bpf device, but not how many packets actually matched the bpf filter.
-Introduce net.bpf.stats sysctl OID -Move sysctl variables after the function prototypes so we can reference bpf_stats_sysctl(9) without build errors. -Introduce bpf descriptor counter which is used mainly for sizing of the xbpf_d array. -Introduce a xbpf_d structure which will act as an external representation of the bpf_d structure. -Add a the following members to the bpfd structure:
bd_fcount - Number of packets which matched bpf filter bd_pid - PID which opened the bpf device bd_pcomm - Process name which opened the device.
It should be noted that it's possible that the process which opened the device could be long gone at the time of stats collection. An example might be a process that opens the bpf device forks then exits leaving the child process with the bpf fd.
Reviewed by: mdodd
show more ...
|
Revision tags: release/5.4.0_cvs, release/5.4.0, release/4.11.0_cvs, release/4.11.0 |
|
#
c398230b |
| 07-Jan-2005 |
Warner Losh <imp@FreeBSD.org> |
/* -> /*- for license, minor formatting changes
|
Revision tags: release/5.3.0_cvs, release/5.3.0 |
|
#
4a3feeaa |
| 09-Sep-2004 |
Robert Watson <rwatson@FreeBSD.org> |
Reformulate use of linked lists in 'struct bpf_d' and 'struct bpf_if' to use queue(3) list macros rather than hand-crafted lists. While here, move to doubly linked lists to eliminate iterating lists
Reformulate use of linked lists in 'struct bpf_d' and 'struct bpf_if' to use queue(3) list macros rather than hand-crafted lists. While here, move to doubly linked lists to eliminate iterating lists in order to remove entries. This change simplifies and clarifies the list logic in the BPF descriptor code as a first step towards revising the locking strategy.
RELENG_5 candidate.
Reviewed by: fenner
show more ...
|
Revision tags: release/4.10.0_cvs, release/4.10.0 |
|
#
f36cfd49 |
| 07-Apr-2004 |
Warner Losh <imp@FreeBSD.org> |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
Approved by: core, peter, alc, rwatson
|