History log of /freebsd-current/sys/net/bpf.h
Revision Date Author Comments
# 29363fb4 23-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove ancient SCCS tags.

Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by: Netflix


# 8f31b879 04-Oct-2023 Justin Hibbits <jhibbits@FreeBSD.org>

bpf: Add IfAPI analogue for bpf_peers_present()

An interface's bpf could feasibly not exist, in which case
bpf_peers_present() would panic from a NULL pointer dereference. Solve
this by adding a new IfAPI that could deal with a NULL bpf, if such
could occur in the network stack.

Reviewed by: zlei
Sponsored by: Juniper Networks, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42082


# 5e444dee 13-Oct-2023 Justin Hibbits <jhibbits@FreeBSD.org>

Revert "bpf: Add IfAPI analogue for bpf_peers_present()"

This reverts commit c81dd8e5fe72d0c7ec055c8621bb2da3a3627abf.

Commit message needs revised.


# c81dd8e5 04-Oct-2023 Justin Hibbits <jhibbits@FreeBSD.org>

bpf: Add IfAPI analogue for bpf_peers_present()

An interface's bpf could feasibly not exist, in which case
bpf_peers_present() would panic from a NULL pointer dereference. Solve
this by adding a new IfAPI that includes a NULL check. Since this API
is used in only a handful of locations, it reduces the the NULL check
scope over inserting the check into bpf_peers_present().

Sponsored by: Juniper Networks, Inc.
MFC after: 1 week


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 950cc1f4 12-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

bpf: Add "_if" tap APIs

Summary:
Hide more netstack by making the BPF_TAP macros real functions in the
netstack. "struct ifnet" is used in the header instead of "if_t" to
keep header pollution down.

Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38103


# c88f6908 19-Jun-2022 Mark Johnston <markj@FreeBSD.org>

bpf: Correct a comment

MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# 1e7fe2fb 21-Jul-2021 Luiz Otavio O Souza <loos@FreeBSD.org>

bpf: Add an ioctl to set the VLAN Priority on packets sent by bpf

This allows the use of VLAN PCP in dhclient, which is required for
certain ISPs (such as Orange.fr).

Reviewed by: bcr (man page)
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31263


# e2e050c8 19-May-2019 Conrad Meyer <cem@FreeBSD.org>

Extract eventfilter declarations to sys/_eventfilter.h

This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.


# 699281b5 13-May-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Rework locking in BPF code to remove rwlock from fast path.

On high packets rate the contention on rwlock in bpf_*tap*() functions
can lead to packets dropping. To avoid this, migrate this code to use
epoch(9) KPI and ConcurrencyKit's lists.

* all lists changed to use CK_LIST;
* reference counting added to bpf_if and bpf_d;
* now bpf_if references ifnet and releases this reference on destroy;
* each bpf_d descriptor references bpf_if when it is attached;
* new struct bpf_program_buffer introduced to keep BPF filter programs;
* bpf_program_buffer, bpf_d and bpf_if structures are freed by
epoch_call();
* bpf_freelist and ifnet_departure event are no longer needed, thus
both are removed;

Reviewed by: melifaro
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D20224


# 51369649 20-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.


# fbbd9655 28-Feb-2017 Warner Losh <imp@FreeBSD.org>

Renumber copyright clause 4

Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96


# a4641f4e 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/net*: minor spelling fixes.

No functional change.


# 05fc4164 11-Apr-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

During if_vmove() we call if_detach_internal() which in turn calls the event
handler notifying about interface departure and one of the consumers will
detach if_bpf.
There is no way for us to re-attach this easily as the DLT and hdrlen are
only given on interface creation.
Add a function to allow us to query the DLT and hdrlen from a current
BPF attachment and after if_attach_internal() manually re-add the if_bpf
attachment using these values.

Found by panics triggered by nd6 packets running past BPF_MTAP() with no
proper if_bpf pointer on the interface.

Also add a basic DDB show function to investigate the if_bpf attachment
of an interface.

Reviewed by: gnn
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5896


# f87e372e 31-Jul-2015 Luiz Otavio O Souza <loos@FreeBSD.org>

Remove two unnecessary sleeps from the hot path in bpf(4).

The first one never triggers because bpf_canfreebuf() can only be true for
zero-copy buffers and zero-copy buffers are not read with read(2).

The second also never triggers, because we check the free buffer before
calling ROTATE_BUFFERS(). If the hold buffer is in use the free buffer
will be NULL and there is nothing else to do besides drop the packet. If
the free buffer isn't NULL the hold buffer _is_ free and it is safe to
rotate the buffers.

Update the comment in ROTATE_BUFFERS macro to match the logic described
here.

While here fix a few typos in comments.

MFC after: 2 weeks
Sponsored by: Rubicon Communications (Netgate)


# b23cbbe6 20-Apr-2015 Mark Johnston <markj@FreeBSD.org>

Move the definition of struct bpf_if to bpf.c.

A couple of fields are still exposed via struct bpf_if_ext so that
bpf_peers_present() can be inlined into its callers. However, this change
eliminates some type duplication in the resulting CTF container, since
otherwise ctfmerge(1) propagates the duplication through all types that
contain a struct bpf_if.

Differential Revision: https://reviews.freebsd.org/D2319
Reviewed by: melifaro, rpaulo


# eaeb0c13 28-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Style: s/SYS_EVENTHANDLER_H/_SYS_EVENTHANDLER_H_/g

Submitted by: bde


# 7ced9c2f 28-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Instead of putting ifnet declaration into eventhandler.h, move
bpf(4) and vlan(4) related event declarations to bpf.h and
if_vlan_var.h. To avoid dependency on eventhandler.h, protect
these declarations with ifdef SYS_EVENTHANDLER_H.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 3b3b91e7 10-Dec-2012 Guy Helmer <ghelmer@FreeBSD.org>

Changes to resolve races in bpfread() and catchpacket() that, at worst,
cause kernel panics.

Add a flag to the bpf descriptor to indicate whether the hold buffer
is in use. In bpfread(), set the "hold buffer in use" flag before
dropping the descriptor lock during the call to bpf_uiomove().
Everywhere else the hold buffer is used or changed, wait while
the hold buffer is in use by bpfread(). Add a KASSERT in bpfread()
after re-acquiring the descriptor lock to assist uncovering any
additional hold buffer races.


# afa85850 21-May-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix old panic when BPF consumer attaches to destroying interface.
'flags' field is added to the end of bpf_if structure. Currently the only
flag is BPFIF_FLAG_DYING which is set on bpf detach and checked by bpf_attachd()
Problem can be easily triggered on SMP stable/[89] by the following command (sort of):
'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done'

Fix possible use-after-free when BPF detaches itself from interface, freeing bpf_bif memory,
while interface is still UP and there can be routes via this interface.
Freeing is now delayed till ifnet_departure_event is received via eventhandler(9) api.

Convert bpfd rwlock back to mutex due lack of performance gain (currently checking if packet
matches filter is done without holding bpfd lock and we have to acquire write lock if packet matches)

Approved by: kib(mentor)
MFC in: 4 weeks


# 47db53c3 13-May-2012 Xin LI <delphij@FreeBSD.org>

Sync DLTs with the latest pcap version.

MFC after: 2 weeks


# 51ec1eb7 06-Apr-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

- Improve performace for writer-only BPF users.

Linux and Solaris (at least OpenSolaris) has PF_PACKET socket families to send
raw ethernet frames. The only FreeBSD interface that can be used to send raw frames
is BPF. As a result, many programs like cdpd, lldpd, various dhcp stuff uses
BPF only to send data. This leads us to the situation when software like cdpd,
being run on high-traffic-volume interface significantly reduces overall performance
since we have to acquire additional locks for every packet.

Here we add sysctl that changes BPF behavior in the following way:
If program came and opens BPF socket without explicitly specifyin read filter we
assume it to be write-only and add it to special writer-only per-interface list.
This makes bpf_peers_present() return 0, so no additional overhead is introduced.
After filter is supplied, descriptor is added to original per-interface list permitting
packets to be captured.

Unfortunately, pcap_open_live() sets catch-all filter itself for the purpose of
setting snap length.

Fortunately, most programs explicitly sets (event catch-all) filter after that.
tcpdump(1) is a good example.

So a bit hackis approach is taken: we upgrade description only after second
BIOCSETF is received.

Sysctl is named net.bpf.optimize_writers and is turned off by default.

- While here, document all sysctl variables in bpf.4

Sponsored by Yandex LLC

Reviewed by: glebius (previous version)
Reviewed by: silence on -net@
Approved by: (mentor)

MFC after: 4 weeks


# e4b3229a 06-Apr-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

- Improve BPF locking model.

Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greately improves performance: in most common case we need to acquire 1
reader lock instead of 2 mutexes.

- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.

- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is is temporary solution,
struct bpf_if should be made opaque for any external caller.

Found by: Dmitrij Tejblum <tejblum@yandex-team.ru>
Sponsored by: Yandex LLC

Reviewed by: glebius (previous version)
Reviewed by: silence on -net@
Approved by: (mentor)

MFC after: 3 weeks


# 253a3814 31-Dec-2011 Lawrence Stewart <lstewart@FreeBSD.org>

Revert r228986 until it can be reworked to avoid panicing the kernel when the
same interface is attached multiple times with different DLTs, as is done in
net80211 for example.

Reported by: adrian


# 0f89fc22 30-Dec-2011 Lawrence Stewart <lstewart@FreeBSD.org>

- Introduce the net.bpf.tscfg sysctl tree and associated code so as to make one
aspect of time stamp configuration per interface rather than per BPF
descriptor. Prior to this, the order in which BPF devices were opened and the
per descriptor time stamp configuration settings could cause non-deterministic
and unintended behaviour with respect to time stamping. With the new scheme, a
BPF attached interface's tscfg sysctl entry can be set to "default", "none",
"fast", "normal" or "external". Setting "default" means use the system default
option (set with the net.bpf.tscfg.default sysctl), "none" means do not
generate time stamps for tapped packets, "fast" means generate time stamps for
tapped packets using a hz granularity system clock read, "normal" means
generate time stamps for tapped packets using a full timecounter granularity
system clock read and "external" (currently unimplemented) means use the time
stamp provided with the packet from an underlying source.

- Utilise the recently introduced sysclock_getsnapshot() and
sysclock_snap2bintime() KPIs to ensure the system clock is only read once per
packet, regardless of the number of BPF descriptors and time stamp formats
requested. Use the per BPF attached interface time stamp configuration to
control if sysclock_getsnapshot() is called and whether the system clock read
is fast or normal. The per BPF descriptor time stamp configuration is then
used to control how the system clock snapshot is converted to a bintime by
sysclock_snap2bintime().

- Remove all FAST related BPF descriptor flag variants. Performing a "fast"
read of the system clock is now controlled per BPF attached interface using
the net.bpf.tscfg sysctl tree.

- Update the bpf.4 man page.

Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.

For more information, see http://www.synclab.org/radclock/

In collaboration with: Julien Ridoux (jridoux at unimelb edu au)


# 3e47c787 28-Nov-2011 Lawrence Stewart <lstewart@FreeBSD.org>

Revert r227778 in preparation for committing reworked patches in its place.


# b6f1c7db 20-Nov-2011 Lawrence Stewart <lstewart@FreeBSD.org>

- When feed-forward clock support is compiled in, change the BPF header to
contain both a regular timestamp obtained from the system clock and the
current feed-forward ffcounter value. This enables new possibilities including
comparison of timekeeping performance and timestamp correction during post
processing.

- Add the net.bpf.ffclock_tstamp sysctl to provide a choice between timestamping
packets using the feedback or feed-forward system clock.

Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.

For more information, see http://www.synclab.org/radclock/

Submitted by: Julien Ridoux (jridoux at unimelb edu au)


# 09b6dcf9 29-Oct-2010 Rui Paulo <rpaulo@FreeBSD.org>

Sync DLTs with the latest pcap version.


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 547d94bd 15-Jun-2010 Jung-uk Kim <jkim@FreeBSD.org>

Implement flexible BPF timestamping framework.

- Allow setting format, resolution and accuracy of BPF time stamps per
listener. Previously, we were only able to use microtime(9). Now we can
set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command.
Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP
command. Document all supported options in bpf(4) and their uses.

- Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'.
The new time stamp has both 64-bit second and fractional parts. bpf_xhdr
has this time stamp instead of 'struct timeval' for bh_tstamp. The new
structures let us use bh_tstamp of same size on both 32-bit and 64-bit
platforms without adding additional shims for 32-bit binaries. On 64-bit
platforms, size of BPF header does not change compared to bpf_hdr as its
members are already all 64-bit long. On 32-bit platforms, the size may
increase by 8 bytes. For backward compatibility, struct bpf_hdr with
struct timeval is still the default header unless new time stamp format is
explicitly requested. However, the behaviour may change in the future and
all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now.

- Add experimental support for tagging mbufs with time stamps from a lower
layer, e.g., device driver. Currently, mbuf_tags(9) is used to tag mbufs.
The time stamps must be uptime in 'struct bintime' format as binuptime(9)
and getbinuptime(9) do.

Reviewed by: net@


# dcf377ed 02-Apr-2009 Rui Paulo <rpaulo@FreeBSD.org>

Sync DLTs with latest libpcap version.


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 18b6f055 26-Aug-2008 Jung-uk Kim <jkim@FreeBSD.org>

Revert the previous commit to fix buildworld for now.

We have constified 'struct bpf_insn *' for bpf_filter(9) and bpf_validate(9)
since r1.19 but they conflict with pcap.h from libpcap.


# 32688992 25-Aug-2008 Jung-uk Kim <jkim@FreeBSD.org>

Make sys/net/bpf_filter.c build cleanly on user land.


# f11c3508 07-Jul-2008 David Malone <dwmalone@FreeBSD.org>

Add a new ioctl for changing the read filter (BIOCSETFNR). This is
just like BIOCSETF but it doesn't drop all the packets buffered on
the discriptor and reset the statistics.

Also, when setting the write filter, don't drop packets waiting to
be read or reset the statistics.

PR: 118486
Submitted by: Matthew Luckie <mluckie@cs.waikato.ac.nz>
MFC after: 1 month


# 4d621040 24-Mar-2008 Christian S.J. Peron <csjp@FreeBSD.org>

Introduce support for zero-copy BPF buffering, which reduces the
overhead of packet capture by allowing a user process to directly "loan"
buffer memory to the kernel rather than using read(2) to explicitly copy
data from kernel address space.

The user process will issue new BPF ioctls to set the shared memory
buffer mode and provide pointers to buffers and their size. The kernel
then wires and maps the pages into kernel address space using sf_buf(9),
which on supporting architectures will use the direct map region. The
current "buffered" access mode remains the default, and support for
zero-copy buffers must, for the time being, be explicitly enabled using
a sysctl for the kernel to accept requests to use it.

The kernel and user process synchronize use of the buffers with atomic
operations, avoiding the need for system calls under load; the user
process may use select()/poll()/kqueue() to manage blocking while
waiting for network data if the user process is able to consume data
faster than the kernel generates it. Patchs to libpcap are available
to allow libpcap applications to transparently take advantage of this
support. Detailed information on the new API may be found in bpf(4),
including specific atomic operations and memory barriers required to
synchronize buffer use safely.

These changes modify the base BPF implementation to (roughly) abstrac
the current buffer model, allowing the new shared memory model to be
added, and add new monitoring statistics for netstat to print. The
implementation, with the exception of some monitoring hanges that break
the netstat monitoring ABI for BPF, will be MFC'd.

Zerocopy bpf buffers are still considered experimental are disabled
by default. To experiment with this new facility, adjust the
net.bpf.zerocopy_enable sysctl variable to 1.

Changes to libpcap will be made available as a patch for the time being,
and further refinements to the implementation are expected.

Sponsored by: Seccuris Inc.
In collaboration with: rwatson
Tested by: pwood, gallatin
MFC after: 4 months [1]

[1] Certain portions will probably not be MFCed, specifically things
that can break the monitoring ABI.


# 2a0a392e 23-Dec-2007 Robert Watson <rwatson@FreeBSD.org>

Remove trailing whitespace from lines in BPF.

MFC after: 3 days


# 19ed78ce 21-Oct-2007 Max Laier <mlaier@FreeBSD.org>

Additions from libpcap 0.9.8 unbreak the build.

Pointy hat to: mlaier
X-MFC after: RELENG_7 buildworld


# 560a54e1 26-Feb-2007 Jung-uk Kim <jkim@FreeBSD.org>

Add three new ioctl(2) commands for bpf(4).

- BIOCGDIRECTION and BIOCSDIRECTION get or set the setting determining
whether incoming, outgoing, or all packets on the interface should be
returned by BPF. Set to BPF_D_IN to see only incoming packets on the
interface. Set to BPF_D_INOUT to see packets originating locally and
remotely on the interface. Set to BPF_D_OUT to see only outgoing
packets on the interface. This setting is initialized to BPF_D_INOUT
by default. BIOCGSEESENT and BIOCSSEESENT are obsoleted by these but
kept for backward compatibility.

- BIOCFEEDBACK sets packet feedback mode. This allows injected packets
to be fed back as input to the interface when output via the interface is
successful. When BPF_D_INOUT direction is set, injected outgoing packet
is not returned by BPF to avoid duplication. This flag is initialized to
zero by default.

Note that libpcap has been modified to support BPF_D_OUT direction for
pcap_setdirection(3) and PCAP_D_OUT direction is functional now.

Reviewed by: rwatson


# f09c8c4a 04-Sep-2006 Sam Leffler <sam@FreeBSD.org>

more juniper dlt's

MFC after: 1 month


# 7eae78a4 13-Jun-2006 Christian S.J. Peron <csjp@FreeBSD.org>

If bpf(4) has not been compiled into the kernel, initialize the bpf interface
pointer to a zeroed, statically allocated bpf_if structure. This way the
LIST_EMPTY() macro will always return true. This allows us to remove the
additional unconditional memory reference for each packet in the fast path.

Discussed with: sam


# ffdc0471 03-Jun-2006 Christian S.J. Peron <csjp@FreeBSD.org>

Back out previous two commits, this caused some problems in the namespace
resulting in some build failures. Instead, to fix the problem of bpf not
being present, check the pointer before dereferencing it.

This is a temporary bandaid until we can decide on how we want to handle
the bpf code not being present. This will be fixed shortly.


# 727b7381 03-Jun-2006 Christian S.J. Peron <csjp@FreeBSD.org>

Temporarily include files so that our macro checks do something useful.


# 5255290c 03-Jun-2006 Christian S.J. Peron <csjp@FreeBSD.org>

Make sure we don't try to dereference the the if_bpf pointer when bpf has
not been compiled into the the kernel.

Submitted by: benno


# 16d878cc 02-Jun-2006 Christian S.J. Peron <csjp@FreeBSD.org>

Fix the following bpf(4) race condition which can result in a panic:

(1) bpf peer attaches to interface netif0
(2) Packet is received by netif0
(3) ifp->if_bpf pointer is checked and handed off to bpf
(4) bpf peer detaches from netif0 resulting in ifp->if_bpf being
initialized to NULL.
(5) ifp->if_bpf is dereferenced by bpf machinery
(6) Kaboom

This race condition likely explains the various different kernel panics
reported around sending SIGINT to tcpdump or dhclient processes. But really
this race can result in kernel panics anywhere you have frequent bpf attach
and detach operations with high packet per second load.

Summary of changes:

- Remove the bpf interface's "driverp" member
- When we attach bpf interfaces, we now set the ifp->if_bpf member to the
bpf interface structure. Once this is done, ifp->if_bpf should never be
NULL. [1]
- Introduce bpf_peers_present function, an inline operation which will do
a lockless read bpf peer list associated with the interface. It should
be noted that the bpf code will pickup the bpf_interface lock before adding
or removing bpf peers. This should serialize the access to the bpf descriptor
list, removing the race.
- Expose the bpf_if structure in bpf.h so that the bpf_peers_present function
can use it. This also removes the struct bpf_if; hack that was there.
- Adjust all consumers of the raw if_bpf structure to use bpf_peers_present

Now what happens is:

(1) Packet is received by netif0
(2) Check to see if bpf descriptor list is empty
(3) Pickup the bpf interface lock
(4) Hand packet off to process

From the attach/detach side:

(1) Pickup the bpf interface lock
(2) Add/remove from bpf descriptor list

Now that we are storing the bpf interface structure with the ifnet, there is
is no need to walk the bpf interface list to locate the correct bpf interface.
We now simply look up the interface, and initialize the pointer. This has a
nice side effect of changing a bpf interface attach operation from O(N) (where
N is the number of bpf interfaces), to O(1).

[1] From now on, we can no longer check ifp->if_bpf to tell us whether or
not we have any bpf peers that might be interested in receiving packets.

In collaboration with: sam@
MFC after: 1 month


# 93e39f0b 22-Aug-2005 Christian S.J. Peron <csjp@FreeBSD.org>

Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands
enhance the security of bpf(4) by further relinquishing the privilege of
the bpf(4) consumer (assuming the ioctl commands are being implemented).

Once BIOCLOCK is executed, the device becomes locked which prevents the
execution of ioctl(2) commands which can change the underly parameters of the
bpf(4) device. An example might be the setting of bpf(4) filter programs or
attaching to different network interfaces.

BIOCSETWF can be used to set write filters for outgoing packets. Currently if
a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used
as a raw socket, regardless of consumer's UID. Write filters give users the
ability to constrain which packets can be sent through the bpf(4) descriptor.

These features are currently implemented by a couple programs which came from
OpenBSD, such as the new dhclient and pflogd.

-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify
whether a read or write filter is to be set.
-Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the
filter program on the mbuf data once we move the packet in from user-space.
-Rather than execute two uiomove operations, (one for the link header and the
other for the packet data), execute one and manually copy the linker header
into the sockaddr structure via bcopy.
-Restructure bpf_setf to compensate for write filters, as well as read.
-Adjust bpf(4) stats structures to include a bd_locked member.

It should be noted that the FreeBSD and OpenBSD implementations differ a bit in
the sense that we unconditionally enforce the lock, where OpenBSD enforces it
only if the calling credential is not root.

Idea from: OpenBSD
Reviewed by: mlaier


# e0d80bff 10-Jul-2005 Sam Leffler <sam@FreeBSD.org>

additions from libpcap 0.9.1 release

Approved by: re (scottl)


# f6f1669c 28-May-2005 Sam Leffler <sam@FreeBSD.org>

integrate changes from libpcap-0.9.1-096

Reviewed by: bms


# c398230b 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for license, minor formatting changes


# bde800e6 30-May-2004 David Malone <dwmalone@FreeBSD.org>

Make the comment for DLT_NULL slightly more accurate.

PR: 62272
Submitted by: Radim Kolar <hsn@netmag.cz>
MFC after: 1 week


# f36cfd49 07-Apr-2004 Warner Losh <imp@FreeBSD.org>

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


# 1acc2f81 31-Mar-2004 Bruce M Simpson <bms@FreeBSD.org>

Add more DLT types required by libpcap 0.8.3.
Maintain numeric sort order.


# a7135a62 31-Mar-2004 Bruce M Simpson <bms@FreeBSD.org>

Update system bpf headers for libpcap 0.8.3.
Maintain listing of DLT link types in numeric order.


# cc5934f5 25-Feb-2004 Max Laier <mlaier@FreeBSD.org>

Tweak existing header and other build infrastructure to be able to build
pf/pflog/pfsync as modules. Do not list them in NOTES or modules/Makefile
(i.e. do not connect it to any (automatic) builds - yet).

Approved by: bms(mentor)


# 437ffe18 27-Dec-2003 Sam Leffler <sam@FreeBSD.org>

o eliminate widespread on-stack mbuf use for bpf by introducing
a new bpf_mtap2 routine that does the right thing for an mbuf
and a variable-length chunk of data that should be prepended.
o while we're sweeping the drivers, use u_int32_t uniformly when
when prepending the address family (several places were assuming
sizeof(int) was 4)
o return M_ASSERTVALID to BPF_MTAP* now that all stack-allocated
mbufs have been eliminated; this may better be moved to the bpf
routines

Reviewed by: arch@ and several others


# d7be4a89 28-Nov-2003 Mike Silbersack <silby@FreeBSD.org>

Remove the call to M_ASSERTVALID from BPF_MTAP; some mbufs passed to
mpf are allocated on the stack, which causes this check to falsely trigger.

A new check which takes on-stack mbufs into account will be reintroduced
after 5.2 is out the door.

Approved by: re (watson)
Requested by: many


# 5d5b5d0f 19-Oct-2003 Mike Silbersack <silby@FreeBSD.org>

Add a new macro M_ASSERTVALID which ensures that the mbuf in question
is non-free. (More checks can/should be added in the future.)

Use M_ASSERTVALID in BPF_MTAP so that we catch when freed mbufs are
passed in, even if no bpf listeners are active.

Inspired by a bug in if_dc caught by Kenjiro Cho.


# 8eab61f3 20-Jan-2003 Sam Leffler <sam@FreeBSD.org>

o add BIOCGDLTLIST and BIOCSDLT ioctls to get the data link type list
and set the link type for use by libpcap and tcpdump
o move mtx unlock in bpfdetach up; it doesn't need to be held so long
o change printf in bpf_detach to distinguish it from the same one in bpfsetdlt

Note there are locking issues here related to ioctl processing; they
have not been addressed here.

Submitted by: Guy Harris <guy@alum.mit.edu>
Obtained from: NetBSD (w/ locking modifications)


# 24a229f4 14-Nov-2002 Sam Leffler <sam@FreeBSD.org>

o add support for multiple link types per interface (e.g. 802.11 and Ethernet)
o introduce BPF_TAP and BPF_MTAP macros to hide implementation details and
ease code portability
o use m_getcl where appropriate

Reviewed by: many
Approved by: re
Obtained from: NetBSD (multiple link type support)


# 94413c0d 20-Jun-2002 Bill Fenner <fenner@FreeBSD.org>

Update for libpcap 0.7.1

Originally-committed-to-wrong-repository by: fenner


# 929ddbbb 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P.


# 46da4bc6 31-Jul-2001 Bill Fenner <fenner@FreeBSD.org>

Update our bpf.h with tcpdump.org's new DLT_ types.
Use our bpf.h instead of tcpdump.org's to build libpcap.


# de5d9935 18-Mar-2000 Robert Watson <rwatson@FreeBSD.org>

The advent of if_detach, allowing interface removal at runtime, makes it
possible for a panic to occur if BPF is in use on the interface at the
time of the call to if_detach. This happens because BPF maintains pointers
to the struct ifnet describing the interface, which is freed by if_detach.

To correct this problem, a new call, bpfdetach, is introduced. bpfdetach
locates BPF descriptor references to the interface, and NULLs them. Other
BPF code is modified so that discovery of a NULL interface results in
ENXIO (already implemented for some calls). Processes blocked on a BPF
call will also be woken up so that they can receive ENXIO.

Interface drivers that invoke bpfattach and if_detach must be modified to
also call bpfattach(ifp) before calling if_detach(ifp). This is relevant
for buses that support hot removal, such as pccard and usb. Patches to
all effected devices will not be committed, only to if_wi.c, due to
testing limitations. To reproduce the crash, load up tcpdump on you
favorite pccard ethernet card, and then eject the card. As some pccard
drivers do not invoke if_detach(ifp), this bug will not manifest itself
for those drivers.

Reviewed by: wes


# 8ed3828c 17-Mar-2000 Robert Watson <rwatson@FreeBSD.org>

Introduce a new bd_seesent flag to the BPF descriptor, indicating whether or
not the current BPF device should report locally generated packets or not.
This allows sniffing applications to see only packets that are not generated
locally, which can be useful for debugging bridging problems, or other
situations where MAC addresses are not sufficient to identify locally
sourced packets. Default to true for this flag, so as to provide existing
behavior by default.

Introduce two new ioctls, BIOCGSEESENT and BIOCSSEESENT, which may be used
to manipulate this flag from userland, given appropriate privilege.

Modify bpf.4 to document these two new ioctl arguments.

Reviewed by: asmodai


# eba2a1ae 15-Jan-2000 Poul-Henning Kamp <phk@FreeBSD.org>

|The hard limit for the BPF buffer size is 32KB, which appears too low
|for high speed networks (even at 100Mbit/s this corresponds to 1/300th
|of a second). The default buffer size is 4KB, but libpcap and ipfilter
|both override this (using the BIOCSBLEN ioctl) and allocate 32KB.
|
|The following patch adds an sysctl for bpf_maxbufsize, similar to the
|one for bpf_bufsize that you added back in December 1995. I choose to
|make the default for this limit 512KB (the value suggested by NFR).

Submitted by: se
Reviewed by: phk


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


# dcb129d5 02-Dec-1999 Archie Cobbs <archie@FreeBSD.org>

Add 'const' to the bpf_filter() and bpf_validate() prototypes.
Remove a stale comment from bpf_validate().


# 114ae644 14-Oct-1999 Mike Smith <msmith@FreeBSD.org>

Implement pseudo_AF_HDRCMPLT, which controls the state of the 'header
completion' flag. If set, the interface output routine will assume that
the packet already has a valid link-level source address. This defaults
to off (the address is overwritten)

PR: kern/10680
Submitted by: "Christopher N . Harrell" <cnh@mindspring.net>
Obtained from: NetBSD


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# ba136d4f 04-Oct-1998 Alexander Langer <alex@FreeBSD.org>

Change BPF_ALIGNMENT to long, necessary for correct alignment on Alpha.


# c2b0c424 15-Sep-1998 Bill Fenner <fenner@FreeBSD.org>

Add DLT_{SLIP,PPP}_BSDOS from libpcap 0.4


# 22f05c43 18-Aug-1998 Andrey A. Chernov <ache@FreeBSD.org>

Implement DLT_RAW from libpcap


# c086febe 13-Jul-1998 Bruce Evans <bde@FreeBSD.org>

Don't attempt to optimize the space allocated for bpf headers if
sizeof(struct bpf_hdr) > 20. 20 is normal on 32-bit systems with
32-bit alignment, but we still assume that the last 2 bytes of the
struct are unnecessary padding on such systems. On systems with
64-bit longs, struct timeval is bloated to 16 bytes, so bpf headers
certainly don't fit in 18 bytes.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 11a53ef3 19-Aug-1996 Paul Traina <pst@FreeBSD.org>

Update to match definitions in LBL June 96 release


# 9b44ff22 06-Feb-1996 Garrett Wollman <wollman@FreeBSD.org>

Clean up Ethernet drivers:
- fill in and use ifp->if_softc
- use if_bpf rather than private cookie variables
- change bpf interface to take advantage of this
- call ether_ifattach() directly from Ethernet drivers
- delete kludge in if_attach() that did this indirectly


# 6c5e9bbd 30-Jan-1996 Mike Pritchard <mpp@FreeBSD.org>

Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.


# 4fda91c7 04-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Moved prototypes for devswitch functions from conf.c and driver sources
to <machine/conf.h>. conf.h was mechanically generated by
`grep ^d_ conf.c >conf.h'. This accounts for part of its ugliness. The
prototypes should be moved back to the driver sources when the functions
are staticalized.


# 9e52b982 03-Oct-1995 Garrett Wollman <wollman@FreeBSD.org>

Import of 4.4-Lite-2 sys/net to make merge and examination easier. Since we
are not on the vendor branch for any of these files, the conflicts shown make
no matter.

Obtained from: 4.4BSD-Lite-2


# 60039670 08-Sep-1995 Bruce Evans <bde@FreeBSD.org>

Fix benign type mismatches in devsw functions. 82 out of 299 devsw
functions were wrong.


# 00a83887 15-Jun-1995 Paul Traina <pst@FreeBSD.org>

Give the BPF the ability to generate signals when a packet is available.

Reviewed by: pst & wollman
Submitted by: grossman@cygnus.com


# 9b2e5354 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


# cea1da3b 20-Aug-1994 Paul Richards <paul@FreeBSD.org>

Make idempotent.

Submitted by: Paul


# 3c4dd356 02-Aug-1994 David Greenman <dg@FreeBSD.org>

Added $Id$


# df8bae1d 24-May-1994 Rodney W. Grimes <rgrimes@FreeBSD.org>

BSD 4.4 Lite Kernel Sources