History log of /freebsd-9.3-release/sys/net/bpf_buffer.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 247629 02-Mar-2013 melifaro

Merge
* r233937 - Improve BPF locking model
* r233938 - Improve performace for writer-only BPF users
* r233946 - Fix build
* r235744 - Fix (new) panic on attaching to non-existent interface
* r235745 - Fix old panic when BPF consumer attaches to destroying interface
* r235746 - Call bpf_jitter() before acquiring BPF global lock
* r235747 - Make most BPF ioctls() SMP-safe.
* r236231 - Fix BPF_JITTER code broken by r235746.
* r236251 - Fix shim for BIOCSETF to drop all packets buffered on the descriptor.
* r236261 - Save the previous filter right before we set new one.
* r236262 - Fix style(9) nits, reduce unnecessary type castings.
* r236559 - Fix panic introduced by r235745
* r236806 - Fix typo introduced in r236559.

r233937
- Improve BPF locking model.

Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greately improves performance: in most common case we need to acquire 1
reader lock instead of 2 mutexes.

- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.

- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is is temporary solution,
struct bpf_if should be made opaque for any external caller.

r233938
- Improve performace for writer-only BPF users.

Linux and Solaris (at least OpenSolaris) has PF_PACKET socket families to send
raw ethernet frames. The only FreeBSD interface that can be used to send raw
frames is BPF. As a result, many programs like cdpd, lldpd, various dhcp stuff
uses BPF only to send data. This leads us to the situation when software like
cdpd, being run on high-traffic-volume interface significantly reduces overall
performance since we have to acquire additional locks for every packet.

Here we add sysctl that changes BPF behavior in the following way:
If program came and opens BPF socket without explicitly specifyin read filter
we assume it to be write-only and add it to special writer-only per-interface
list. This makes bpf_peers_present() return 0, so no additional overhead is
introduced. After filter is supplied, descriptor is added to original
per-interface list permitting packets to be captured.

Unfortunately, pcap_open_live() sets catch-all filter itself for the purpose
of setting snap length.

Fortunately, most programs explicitly sets (event catch-all) filter after
that. tcpdump(1) is a good example.

So a bit hackis approach is taken: we upgrade description only after second
BIOCSETF is received.

Sysctl is named net.bpf.optimize_writers and is turned off by default.

- While here, document all sysctl variables in bpf.4

r233946
Fix build broken by r233938.

r235744
Fix panic on attaching to non-existent interface
(introduced by r233937, pointed by hrs@)
Fix panic on tcpdump being attached to interface being removed
(introduced by r233937, pointed by hrs@ and adrian@)
Protect most of bpf_setf() by BPF global lock

Add several forgotten assertions (thanks to adrian@)

Document current locking model inside bpf.c
Document EVENTHANDLER(9) usage inside BPF.

r235745
Fix old panic when BPF consumer attaches to destroying interface.
'flags' field is added to the end of bpf_if structure. Currently the only
flag is BPFIF_FLAG_DYING which is set on bpf detach and checked by bpf_attachd()
Problem can be easily triggered on SMP stable/[89] by the following command
(sort of):
'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; \
tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done'

Fix possible use-after-free when BPF detaches itself from interface, freeing
bpf_bif memory, while interface is still UP and there can be routes via this
interface. Freeing is now delayed till ifnet_departure_event is received via
eventhandler(9) api.

Convert bpfd rwlock back to mutex due lack of performance gain
(currently checking if packet matches filter is done without holding bpfd
lock and we have to acquire write lock if packet matches)

r235746
Call bpf_jitter() before acquiring BPF global lock due to malloc() being
used inside bpf_jitter.

Eliminate bpf_buffer_alloc() and allocate BPF buffers on descriptor creation
and BIOCSBLEN ioctl. This permits us not to allocate buffers inside
bpf_attachd() which is protected by global lock.

r235747
Make most BPF ioctls() SMP-safe.

r236559
Fix panic introduced by r235745. Panic occurs after first packet traverse
renamed interface.
Add several comments on locking

r236231
Fix BPF_JITTER code broken by r235746.

r236251
Fix 32-bit shim for BIOCSETF to drop all packets buffered on the descriptor
and reset statistics as it should.

r236261
- Save the previous filter right before we set new one.
- Reduce duplicate code and make it little easier to read.

r236262
Fix style(9) nits, reduce unnecessary type castings, etc., for bpf_setf().

r236806
Fix typo introduced in r236559.


# 234969 03-May-2012 eadler

MFC r230108:
Fix trivial typo

Approved by: cperciva (implicit)


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 177548 24-Mar-2008 csjp

Introduce support for zero-copy BPF buffering, which reduces the
overhead of packet capture by allowing a user process to directly "loan"
buffer memory to the kernel rather than using read(2) to explicitly copy
data from kernel address space.

The user process will issue new BPF ioctls to set the shared memory
buffer mode and provide pointers to buffers and their size. The kernel
then wires and maps the pages into kernel address space using sf_buf(9),
which on supporting architectures will use the direct map region. The
current "buffered" access mode remains the default, and support for
zero-copy buffers must, for the time being, be explicitly enabled using
a sysctl for the kernel to accept requests to use it.

The kernel and user process synchronize use of the buffers with atomic
operations, avoiding the need for system calls under load; the user
process may use select()/poll()/kqueue() to manage blocking while
waiting for network data if the user process is able to consume data
faster than the kernel generates it. Patchs to libpcap are available
to allow libpcap applications to transparently take advantage of this
support. Detailed information on the new API may be found in bpf(4),
including specific atomic operations and memory barriers required to
synchronize buffer use safely.

These changes modify the base BPF implementation to (roughly) abstrac
the current buffer model, allowing the new shared memory model to be
added, and add new monitoring statistics for netstat to print. The
implementation, with the exception of some monitoring hanges that break
the netstat monitoring ABI for BPF, will be MFC'd.

Zerocopy bpf buffers are still considered experimental are disabled
by default. To experiment with this new facility, adjust the
net.bpf.zerocopy_enable sysctl variable to 1.

Changes to libpcap will be made available as a patch for the time being,
and further refinements to the implementation are expected.

Sponsored by: Seccuris Inc.
In collaboration with: rwatson
Tested by: pwood, gallatin
MFC after: 4 months [1]

[1] Certain portions will probably not be MFCed, specifically things
that can break the monitoring ABI.