History log of /freebsd-10.1-release/sys/net/if_bridge.c
Revision Date Author Comments
# 272461 02-Oct-2014 gjb

Copy stable/10@r272459 to releng/10.1 as part of
the 10.1-RELEASE process.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 253751 28-Jul-2013 hrs

- Relax the restriction on the member interfaces with LLAs. Two or more
LLAs on the member interfaces are actually harmless when the parent
interface does not have an LLA.

- Add net.link.bridge.allow_llz_overlap. This is a knob to allow LLAs on
a bridge and the member interfaces at the same time. The default is 0.

Pointed out by: ume
MFC after: 3 days


# 252548 03-Jul-2013 hrs

Fix a compiler warning.

MFC after: 1 week


# 252511 02-Jul-2013 hrs

- Allow ND6_IFF_AUTO_LINKLOCAL for IFT_BRIDGE. An interface with IFT_BRIDGE
is initialized with !ND6_IFF_AUTO_LINKLOCAL && !ND6_IFF_ACCEPT_RTADV
regardless of net.inet6.ip6.accept_rtadv and net.inet6.ip6.auto_linklocal.
To configure an autoconfigured link-local address (RFC 4862), the
following rc.conf(5) configuration can be used:

ifconfig_bridge0_ipv6="inet6 auto_linklocal"

- if_bridge(4) now removes IPv6 addresses from a member interface that is
being added when the parent interface or one of the existing member
interfaces has an IPv6 address. if_bridge(4) merges the link-local
scope zones formed by the individual member interfaces, which causes an
address scope violation. Removing the IPv6 addresses prevents it.

- if_lagg(4) now removes IPv6 addresses on member interfaces
unconditionally.

- Set reasonable flags to non-IPv6-capable interfaces. [*]

Submitted by: rpaulo [*]
MFC after: 1 week


# 249294 09-Apr-2013 ae

Use IP6STAT_INC/IP6STAT_DEC macros to update ip6 stats.

MFC after: 1 week


# 248851 28-Mar-2013 markj

Ignore interface renames instead of removing the interface from the bridge
group.

Reviewed by: rstone
Approved by: rstone (co-mentor)
Sponsored by: Sandvine Incorporated
MFC after: 1 week


# 248155 11-Mar-2013 glebius

Reinitialize eh after pfil(9) processing.

PR: 176764
Submitted by: adri


# 244378 18-Dec-2012 kevlo

Fix typo in comment.

Reviewed by: thompsa


# 243882 05-Dec-2012 glebius

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


# 243669 29-Nov-2012 pjd

- Use more appropriate loop (do { } while()) when generating ethernet address
for bridge interface.
- If we found a collision we can break the loop - only one collision is
possible and one is exactly enough to need to regenerate.

Obtained from: WHEEL Systems
MFC after: 1 week
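
A minimal sketch of the retry loop described above, not the committed code. The addr_list type and gen_unique_lladdr() name are hypothetical, and rand() stands in for whatever random source the kernel uses:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/* Hypothetical stand-in for the set of addresses already in use. */
struct addr_list {
	const uint8_t (*addrs)[ETHER_ADDR_LEN];
	size_t count;
};

/* Fill 'out' with a locally administered unicast address not in 'used'. */
static void
gen_unique_lladdr(uint8_t out[ETHER_ADDR_LEN], const struct addr_list *used)
{
	bool retry;
	size_t i;

	do {
		retry = false;
		for (i = 0; i < ETHER_ADDR_LEN; i++)
			out[i] = (uint8_t)(rand() & 0xff);
		out[0] &= 0xfe;		/* clear the multicast bit */
		out[0] |= 0x02;		/* set the locally administered bit */
		for (i = 0; i < used->count; i++) {
			if (memcmp(out, used->addrs[i], ETHER_ADDR_LEN) == 0) {
				retry = true;
				break;	/* one collision is enough to regenerate */
			}
		}
	} while (retry);
}
```

The body runs at least once, which is why do { } while () fits better than a pre-tested loop here.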


# 242161 26-Oct-2012 glebius

o Remove last argument to ip_fragment(), and obtain all needed information
on checksums directly from mbuf flags. This simplifies code.
o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in
hardware. Some drivers may not announce CSUM_IP in their if_hwassist,
yet still try to do checksums if CSUM_IP is set on the mbuf. An example is em(4).
o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP.
After this change CSUM_DELAY_IP vanishes from the stack.

Submitted by: Sebastian Kuzminsky <seb lineratesystems.com>


# 242013 24-Oct-2012 glebius

Fix fallout from r240071. If destination interface lookup fails,
we should broadcast a packet, not try to deliver it to NULL.

Reported by: rpaulo


# 241610 16-Oct-2012 glebius

Make the "struct if_clone" opaque to users of the cloning API. Users
now use function calls:

if_clone_simple()
if_clone_advanced()

to initialize a cloner, instead of macros that initialize if_clone
structure.

Discussed with: brooks, bz, 1 year ago


# 241394 10-Oct-2012 kevlo

Revert previous commit...

Pointyhat to: kevlo (myself)


# 241370 09-Oct-2012 kevlo

Prefer NULL over 0 for pointers


# 241245 06-Oct-2012 glebius

A step in resolving mess with byte ordering for AF_INET. After this change:

- All packets in NETISR_IP queue are in net byte order.
- ip_input() is entered in net byte order and converts packet
to host byte order right _after_ processing pfil(9) hooks.
- ip_output() is entered in host byte order and converts packet
to net byte order right _before_ processing pfil(9) hooks.
- ip_fragment() accepts and emits packet in net byte order.
- ip_forward(), ip_mloopback() use host byte order (untouched actually).
- ip_fastforward() no longer modifies packet at all (except ip_ttl).
- Swapping of byte order there and back removed from the following modules:
pf(4), ipfw(4), enc(4), if_bridge(4).
- Swapping of byte order added to ipfilter(4), based on __FreeBSD_version
- __FreeBSD_version bumped.
- pfil(9) manual page updated.

Reviewed by: ray, luigi, eri, melifaro
Tested by: glebius (LE), ray (BE)


# 241183 04-Oct-2012 thompsa

Remove the M_NOWAIT from bridge_rtable_init as it isn't needed. The function
return value is not even checked and could lead to a panic on a null sc_rthash.

MFC after: 2 weeks


# 240971 26-Sep-2012 glebius

- In the bridge_enqueue() do success/error accounting for
each fragment, not only once.
- In the GRAB_OUR_PACKETS() macro do increase if_ibytes.


# 240099 04-Sep-2012 melifaro

Introduce new link-layer PFIL hook V_link_pfil_hook.
Merge ether_ipfw_chk() and part of bridge_pfil() into
unified ipfw_check_frame() function called by PFIL.
This change was suggested by rwatson? @ DevSummit.

Remove ipfw headers from ether/bridge code since they are unneeded now.

Note this change introduces some (temporary) performance penalty since
PFIL read lock has to be acquired for every link-level packet.

MFC after: 3 weeks


# 240071 03-Sep-2012 glebius

Change bridge(4) to use if_transmit for forwarding packets to underlying
interfaces instead of queueing.

Tested by: ray


# 238355 10-Jul-2012 emaste

Simplify error case

Submitted by: thompsa@


# 238346 10-Jul-2012 emaste

Plug potential mbuf leak when bridging fragments

If an error occurs when transmitting one mbuf in a chain of fragments,
free the subsequent fragments instead of leaking them.

Sponsored by: ADARA Networks
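
A userland sketch of the leak fix described above, under stated assumptions: struct pkt is a hypothetical stand-in for an mbuf linked through m_nextpkt, and demo_xmit() is an invented transmit routine that consumes the packet whether or not it succeeds. It is not the kernel code:

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet type standing in for struct mbuf. */
struct pkt {
	struct pkt *next;
	int len;
};

/* Transmit callback: returns 0 on success; consumes p either way. */
typedef int (*xmit_fn)(struct pkt *p);

/* Send each fragment; on the first error free the rest of the chain. */
static int
enqueue_fragments(struct pkt *head, xmit_fn xmit, int *sent)
{
	struct pkt *p, *next;
	int error;

	for (p = head; p != NULL; p = next) {
		next = p->next;
		p->next = NULL;
		error = xmit(p);
		if (error != 0) {
			/* Free the remaining fragments instead of leaking them. */
			for (p = next; p != NULL; p = next) {
				next = p->next;
				free(p);
			}
			return (error);
		}
		(*sent)++;	/* account for each fragment, not once per chain */
	}
	return (0);
}

/* Demo transmit routine (assumption): succeeds while xmit_budget lasts. */
static int xmit_budget;

static int
demo_xmit(struct pkt *p)
{
	int ok = (xmit_budget > 0);

	if (ok)
		xmit_budget--;
	free(p);
	return (ok ? 0 : ENOBUFS);
}
```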


# 238298 09-Jul-2012 emaste

Restore error handling lost in r191603

This was missed in the change from IFQ_ENQUEUE to if_transmit.

Sponsored by: ADARA Networks


# 236916 11-Jun-2012 thompsa

Fix a panic I introduced in r234487, the bridge softc pointer is set to null
early in the detach so rearrange things not to explode.

Reported by: David Roffiaen, Gustau Perez Querol
Tested by: David Roffiaen
MFC after: 3 days


# 234946 03-May-2012 melifaro

Revert r234834 per luigi@ request.

Cleaner solution (e.g. adding another header) should be done here.

Original log:
Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Requested by: luigi
Approved by: kib(mentor)


# 234834 30-Apr-2012 melifaro

Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Approved by: ae(mentor)
MFC after: 2 weeks


# 234487 20-Apr-2012 thompsa

Add linkstate to bridge(4), set the link to up when at least one underlying
interface is up, otherwise the link is down.

This, among other things, allows carp to work on a bridge.

Prodded by: glebius
Tested by: Alexander Lunev
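
The rule above can be sketched as follows. This is an illustration only: the enum and the array of per-member booleans are simplifying assumptions, not the kernel's ifnet link-state API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical link states. */
enum link_state { LINK_DOWN, LINK_UP };

/* The bridge is up if at least one member is up, otherwise down. */
static enum link_state
bridge_link_state(const bool *member_up, size_t nmembers)
{
	size_t i;

	for (i = 0; i < nmembers; i++) {
		if (member_up[i])
			return (LINK_UP);
	}
	return (LINK_DOWN);
}
```

Note that a bridge with no members reports down under this rule.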


# 232315 29-Feb-2012 thompsa

Use a more appropriate default for the maximum number of addresses in the
bridge forwarding table.

PR: docs/164564
Discussed with: brueffer


# 232014 22-Feb-2012 thompsa

bstp_input() always consumes the packet so remove the mbuf handling dance
around it.

Obtained from: OpenBSD (r1.37)


# 231130 07-Feb-2012 pjd

Allow to set if_bridge(4) sysctls from /boot/loader.conf.

MFC after: 3 days


# 227459 11-Nov-2011 brooks

In r191367 the need for if_free_type() was removed and a new member
if_alloctype was used to store the original interface type. Take
advantage of this change by removing all existing uses of if_free_type()
in favor of if_free().

MFC after: 1 Month


# 227309 07-Nov-2011 ed

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# 225380 04-Sep-2011 thompsa

On the first loop for generating a bridge MAC address use the local
hostid, this gives a good chance of keeping the same address over
reboots. This is intended to help IPV6 and similar which generate
their addresses from the mac.

PR: kern/160300
Submitted by: mdodd
Approved by: re (kib)


# 225209 27-Aug-2011 bz

When adding IPv6 fwd support to ipfw in r225044 these two files were
not committed. Initialize next_hop6 to align with the IPv4 code.

PR: bin/117214
MFC after: 3 weeks
X-MFC with: r225044
Approved by: re (kib)


# 211193 11-Aug-2010 will

Unbreak LINT by moving all carp hooks to net/if.c / netinet/ip_carp.h, with
the appropriate ifdefs.

Reviewed by: bz
Approved by: ken (mentor)


# 211157 10-Aug-2010 will

Allow carp(4) to be loaded as a kernel module. Follow precedent set by
bridge(4), lagg(4) etc. and make use of function pointers and
pf_proto_register() to hook carp into the network stack.

Currently, because of the uncertainty about whether the unload path is free
of race condition panics, unloads are disallowed by default. Compiling with
CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure.

This commit requires IP6PROTOSPACER, introduced in r211115.

Reviewed by: bz, simon
Approved by: ken (mentor)
MFC after: 2 weeks


# 204591 02-Mar-2010 luigi

Bring in the most recent version of ipfw and dummynet, developed
and tested over the past two months in the ipfw3-head branch. This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.

The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.

In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.

Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.

Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.

Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.

CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.


# 203272 31-Jan-2010 hrs

- Check if_type of "addm <interface>" before setting the
interface's MTU to the if_bridge(4) interface. This fixes a
bug that MTU value of "addm <interface>" is used even when it
is invalid for the if_bridge(4) member:

# ifconfig bridge0 create
# ifconfig bridge0
bridge0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
# ifconfig bridge0 addm lo0
ifconfig: BRDGADD lo0: Invalid argument
# ifconfig bridge0
bridge0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 16384
...

- Do not ignore MTU value of an interface even when if_type == IFT_GIF.
This fixes MTU mismatch when an if_bridge(4) interface has a
gif(4) interface and no other interface as the member, and it
is directly used for L2 communication with EtherIP tunneling
enabled.

- Implement SIOCSIFMTU ioctl. Changing the MTU is allowed only
when all members have the same MTU value.
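
A sketch of the SIOCSIFMTU check in the last item, assuming that "the same MTU value" means every member already has the requested MTU. The function name and the flat array of member MTUs are hypothetical:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Accept new_mtu only if every member already uses it.
 * Returns 0 on success, EINVAL on a mismatch (bridge MTU unchanged).
 */
static int
bridge_set_mtu(unsigned *bridge_mtu, const unsigned *member_mtu,
    size_t nmembers, unsigned new_mtu)
{
	size_t i;

	for (i = 0; i < nmembers; i++) {
		if (member_mtu[i] != new_mtu)
			return (EINVAL);
	}
	*bridge_mtu = new_mtu;
	return (0);
}
```

With no members the loop body never runs, so any value is accepted in this sketch.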


# 202588 18-Jan-2010 thompsa

Declare a new EVENTHANDLER called iflladdr_event which signals that the L2
address on an interface has changed. This lets stacked interfaces such as
vlan(4) detect that their lower interface has changed and adjust things in
order to keep working. Previously this situation broke at least vlan(4) and
lagg(4) configurations.

The EVENTHANDLER_INVOKE call was not placed within if_setlladdr() due to the
risk of a loop.

PR: kern/142927
Submitted by: Nikolay Denev


# 201527 04-Jan-2010 luigi

Various cleanup done in ipfw3-head branch including:
- use a uniform mtag format for all packets that exit and re-enter
the firewall in the middle of a rulechain. On reentry, all tags
containing reinject info are renamed to MTAG_IPFW_RULE so the
processing is simpler.

- make ipfw and dummynet use ip_len and ip_off in network format
everywhere. Conversion is done only once instead of tracking
the format in every place.

- use a macro FREE_PKT to dispose of mbufs. This eases portability.

In passing I also fixed a few typos, made variables static or local,
removed useless declarations and made other minor changes.

Overall the code shrinks a bit and is hopefully more readable.

I have tested functionality for all but ng_ipfw and if_bridge/if_ethersubr.
For ng_ipfw I am actually waiting for feedback from glebius@ because
we might have some small changes to make.
For if_bridge and if_ethersubr feedback would be welcome
(there are still some redundant parts in these two modules that
I would like to remove, but first I need to check functionality).


# 201122 28-Dec-2009 luigi

bring in several cleanups tested in ipfw3-head branch, namely:

r201011
- move most of ng_ipfw.h into ip_fw_private.h, as this code is
ipfw-specific. This removes a dependency on ng_ipfw.h from some files.

- move many equivalent definitions of direction (IN, OUT) for
reinjected packets into ip_fw_private.h

- document the structure of the packet tags used for dummynet
and netgraph;

r201049
- merge some common code to attach/detach hooks into
a single function.

r201055
- remove some duplicated code in ip_fw_pfil. The input
and output processing uses almost exactly the same code so
there is no need to use two separate hooks.
ip_fw_pfil.o goes from 2096 to 1382 bytes of .text

r201057 (see the svn log for full details)
- macros to make the conversion of ip_len and ip_off
between host and network format more explicit

r201113 (the remaining parts)
- readability fixes -- put braces around some large for() blocks,
localize variables so the compiler does not think they are uninitialized,
do not insist on precise allocation size if we have more than we need.

r201119
- when doing a lookup, keys must be in big endian format because
this is what the radix code expects (this fixes a bug in the
recently-introduced 'lookup' option)

No ABI changes in this commit.

MFC after: 1 week


# 200855 22-Dec-2009 luigi

merge code from ipfw3-head to reduce contention on the ipfw lock
and remove all O(N) sequences from kernel critical sections in ipfw.

In detail:

1. introduce a IPFW_UH_LOCK to arbitrate requests from
the upper half of the kernel. Some things, such as 'ipfw show',
can be done holding this lock in read mode, whereas insert and
delete require IPFW_UH_WLOCK.

2. introduce a mapping structure to keep rules together. This replaces
the 'next' chain currently used in ipfw rules. At the moment
the map is a simple array (sorted by rule number and then rule_id),
so we can find a rule quickly instead of having to scan the list.
This reduces many expensive lookups from O(N) to O(log N).

3. when an expensive operation (such as insert or delete) is done
by userland, we grab IPFW_UH_WLOCK, create a new copy of the map
without blocking the bottom half of the kernel, then acquire
IPFW_WLOCK and quickly update pointers to the map and related info.
After dropping IPFW_LOCK we can then continue the cleanup protected
by IPFW_UH_LOCK. So userland still costs O(N) but the kernel side
is only blocked for O(1).

4. do not pass pointers to rules through dummynet, netgraph, divert etc,
but rather pass a <slot, chain_id, rulenum, rule_id> tuple.
We validate the slot index (in the array of #2) with chain_id,
and if successful do a O(1) dereference; otherwise, we can find
the rule in O(log N) through <rulenum, rule_id>

All the above does not change the userland/kernel ABI, though there
are some disgusting casts between pointers and uint32_t

Operation costs now are as follows:

Function                                Old     Now       Planned
-------------------------------------------------------------------
+ skipto X, non cached                  O(N)    O(log N)
+ skipto X, cached                      O(1)    O(1)
XXX dynamic rule lookup                 O(1)    O(log N)  O(1)
+ skipto tablearg                       O(N)    O(1)
+ reinject, non cached                  O(N)    O(log N)
+ reinject, cached                      O(1)    O(1)
+ kernel blocked during setsockopt()    O(N)    O(1)
-------------------------------------------------------------------

The only (very small) regression is on dynamic rule lookup and this will
be fixed in a day or two, without changing the userland/kernel ABI

Supported by: Valeria Paoli
MFC after: 1 month
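
The O(log N) lookup in item 2 can be illustrated with a binary search over an array sorted by rule number and then rule_id. This is a sketch under assumptions: struct rule and find_rule() are hypothetical and much simpler than the ipfw structures:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal rule record. */
struct rule {
	uint16_t rulenum;
	uint32_t id;
};

/*
 * Return the index of the first rule whose (rulenum, id) is not less
 * than the key, or n if there is none. 'map' must be sorted by
 * rulenum and then id.
 */
static size_t
find_rule(const struct rule *map, size_t n, uint16_t rulenum, uint32_t id)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (map[mid].rulenum < rulenum ||
		    (map[mid].rulenum == rulenum && map[mid].id < id))
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo);
}
```

Searching for (X, 0) gives the first rule numbered X or higher, which is the kind of lookup a non-cached skipto needs.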


# 200580 15-Dec-2009 luigi

Start splitting ip_fw2.c and ip_fw.h into smaller components.
At this time we pull out from ip_fw2.c the logging functions, and
support for dynamic rules, and move kernel-only stuff into
netinet/ipfw/ip_fw_private.h

No ABI change involved in this commit, unless I made some mistake.
ip_fw.h has changed, though not in the userland-visible part.

Files touched by this commit:

conf/files
now references the two new source files

netinet/ip_fw.h
remove kernel-only definitions gone into netinet/ipfw/ip_fw_private.h.

netinet/ipfw/ip_fw_private.h
new file with kernel-specific ipfw definitions

netinet/ipfw/ip_fw_log.c
ipfw_log and related functions

netinet/ipfw/ip_fw_dynamic.c
code related to dynamic rules

netinet/ipfw/ip_fw2.c
removed the pieces that go in the new files

netinet/ipfw/ip_fw_nat.c
minor rearrangement to remove LOOKUP_NAT from the
main headers. This requires a new function pointer.

A bunch of other kernel files that included netinet/ip_fw.h now
require netinet/ipfw/ip_fw_private.h as well.
Not 100% sure i caught all of them.

MFC after: 1 month


# 197952 11-Oct-2009 julian

Virtualize the pfil hooks so that different jails may choose different
packet filters. Also allows ipfw to be enabled on one jail and disabled
on another. In 8.0 it's a global setting.

Sitting around in tree waiting to commit for: 2 months
MFC after: 2 months


# 196519 24-Aug-2009 jfv

When bridging, LRO is causing a problem. The belief
that it would work as long as all interfaces have TSO
seems to be false; until the matter gets sorted out,
just disable LRO completely.


# 196039 02-Aug-2009 rwatson

Many network stack subsystems use a single global data structure to hold
all pertinent statistics for the subsystem. These structures are
sometimes "borrowed" by kernel modules that require a place to store
statistics for similar events.

Add KPI accessor functions for statistics structures referenced by kernel
modules so that they no longer encode certain specifics of how the data
structures are named and stored. This change is intended to make it
easier to move to per-CPU network stats following 8.0-RELEASE.

The following modules are affected by this change:

if_bridge
if_cxgb
if_gif
ip_mroute
ipdivert
pf

In practice, most of these statistics consumers should, in fact, maintain
their own statistics data structures rather than borrowing structures
from the base network stack. However, that change is too aggressive for
this point in the release cycle.

Reviewed by: bz
Approved by: re (kib)


# 196019 01-Aug-2009 rwatson

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# 195699 14-Jul-2009 rwatson

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing them in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 193983 11-Jun-2009 bz

carp(4) allows people to share a set of IP addresses and can only
use IPv4/v6 for inter-node communication (according to my reading).

Properly wrap the carp callouts in INET || INET6 and reflect this
in sys/conf/files as well. While in theory this should be ok,
it might be a bit optimistic to think that carp could build with
inet6 only[1].

Discussed with: mlaier [1]


# 193859 09-Jun-2009 oleg

Close a long-standing race with net.inet.ip.fw.one_pass = 0:
If a packet leaves ipfw for another kernel subsystem (dummynet, netgraph, etc.)
it carries a pointer to the matching ipfw rule. If this packet is then reinjected
into ipfw, ruleset processing starts from that rule. If the rule was deleted
in the meantime, the race could cause a panic (as well as
other odd effects like parsing rules in the 'reap list').

P.S. this commit changes ABI so userland ipfw related binaries should be
recompiled.

MFC after: 1 month
Tested by: Mikolaj Golub


# 193502 05-Jun-2009 luigi

More cleanup in preparation of ipfw relocation (no actual code change):

+ move ipfw and dummynet hooks declarations to raw_ip.c (definitions
in ip_var.h) same as for most other global variables.
This removes some dependencies from ip_input.c;

+ remove the IPFW_LOADED macro, just test ip_fw_chk_ptr directly;

+ remove the DUMMYNET_LOADED macro, just test ip_dn_io_ptr directly;

+ move ip_dn_ruledel_ptr to ip_fw2.c which is the only file using it;

To be merged together with rev 193497

MFC after: 5 days


# 191729 01-May-2009 thompsa

Reorder the bridge add and delete routines to avoid calling ifpromisc() with
the bridge lock held.


# 191603 27-Apr-2009 sam

use if_transmit instead of direct frobbing of the if_snd q; this is no
longer allowed

Identified by: rwatson
Reviewed by: kmacy


# 190951 11-Apr-2009 rwatson

Update stats in struct ipstat using four new macros, IPSTAT_ADD(),
IPSTAT_INC(), IPSTAT_SUB(), and IPSTAT_DEC(), rather than directly
manipulating the fields across the kernel. This will make it easier
to change the implementation of these statistics, such as using
per-CPU versions of the data structures.

MFC after: 3 days


# 189851 15-Mar-2009 rwatson

Remove IFF_NEEDSGIANT, a compatibility infrastructure introduced
in FreeBSD 5.x to allow network device drivers to run with Giant
despite the network stack being Giant-free. This significantly
simplifies calls into ioctl() on network interfaces, especially
in the multicast code, as well as eliminates deferred invocation
of interface if_start routines.

Disable the build on device drivers still depending on
IFF_NEEDSGIANT as they no longer compile. They will be removed
in a few weeks if they haven't been made MPSAFE in that time.
Disabled drivers:

if_ar
if_axe
if_aue
if_cdce
if_cue
if_kue
if_ray
if_rue
if_rum
if_sr
if_udav
if_ural
if_zyd

Drivers that were already disabled because of tty changes:

if_ppp
if_sl

Discussed on: arch@


# 188594 13-Feb-2009 thompsa

bridge_delete_member is called via the event handler from if_detach
after the LLADDR is reclaimed which causes a null pointer deref with
inherit_mac enabled. Record the ifnet pointer of the interface and then compare
that to find when to re-assign the bridge address.

Submitted by: sam


# 185895 10-Dec-2008 zec

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instantiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 185571 02-Dec-2008 bz

Rather than using hidden includes (with circular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation


# 183550 02-Oct-2008 zec

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparison between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 182862 08-Sep-2008 thompsa

Put the bridge mac inheritance behind a sysctl with the default off as this
still needs all the edge cases fixed.

Submitted by: Eygene Ryabinkin


# 181803 17-Aug-2008 bz

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


# 181795 16-Aug-2008 thompsa

LRO combined packets can actually be bridged as long as all the interfaces also
support TSO, this can always be disabled manually if undesirable.

Pointed out by: gallatin


# 180220 03-Jul-2008 thompsa

Be smarter about disabling interface capabilities. TOE/TSO/TXCSUM will only be
disabled if one (or more) of the member interfaces does not support it. Always
turn off LRO since we can not bridge a combined frame.

Tested by: Stefan Lambrev
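
One way to picture the rule above is as an intersection of per-member capability masks. The CAP_* bits below are invented for this sketch and are not the kernel's IFCAP_* values:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define CAP_TXCSUM	0x01u
#define CAP_TSO		0x02u
#define CAP_TOE		0x04u
#define CAP_LRO		0x08u

/*
 * Return the capabilities that may stay enabled on bridge members:
 * TXCSUM/TSO/TOE survive only if every member supports them, and
 * LRO is never part of the result.
 */
static uint32_t
bridge_common_caps(const uint32_t *member_caps, size_t nmembers)
{
	uint32_t mask = CAP_TXCSUM | CAP_TSO | CAP_TOE;
	size_t i;

	for (i = 0; i < nmembers; i++)
		mask &= member_caps[i];
	return (mask);
}
```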


# 180140 01-Jul-2008 philip

Set bridge MAC addresses to the MAC address of their first interface unless
locally configured. This is more in line with the behaviour of other popular
bridging implementations and makes bridges more predictable after reboots for
example.

Reviewed by: thompsa
MFC after: 1 week


# 175432 18-Jan-2008 thompsa

Remove a chunk of duplicated code, test the destination address against the
bridge the same way we check member interfaces.


# 175419 17-Jan-2008 thompsa

IEEE 802.1D-2004 states that frames containing any of the group MAC addresses
specified in Table 7-10 in their destination address field shall not be relayed
by the Bridge. Add a check in bridge_forward() to adhere to this.

PR: kern/119744
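
A sketch of such a check, assuming the reserved range in Table 7-10 is 01:80:C2:00:00:00 through 01:80:C2:00:00:0F. The function name is hypothetical and this is not the code added to bridge_forward():

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/* True if dst is one of the reserved group addresses (assumed range). */
static bool
is_8021d_reserved(const uint8_t dst[ETHER_ADDR_LEN])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	return (memcmp(dst, prefix, sizeof(prefix)) == 0 && dst[5] <= 0x0f);
}
```

A forwarding path would drop the frame when this returns true rather than relay it to other members.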


# 175396 17-Jan-2008 thompsa

Sync from OpenBSD r1.118, nuke clause 3 & 4.


# 174749 18-Dec-2007 thompsa

Simplify the error handling and use the dereferenced sc->sc_ifp pointer.


# 174746 18-Dec-2007 thompsa

When the bridge has an address and a packet comes in for it then drop it if the
link has been marked discarding by Spanning Tree. This would cause the bridge
to see duplicate packets to itself even if STP has correctly calculated the
topology and blocked redundant links.

Reported by: trasz
Tested by: trasz
MFC after: 3 days


# 173399 06-Nov-2007 oleg

1) dummynet_io() declaration has changed.
2) Alter packet flow inside dummynet: allow certain packets to bypass
dummynet scheduler. Benefits are:

- lower latency: if packet flow does not exceed pipe bandwidth, packets
will not be (up to tick) delayed (due to dummynet's scheduler granularity).
- lower overhead: if packet avoids dummynet scheduler it shouldn't reenter ip
stack later. Such packets can be fastforwarded.
- recursion (which can lead to kernel stack exhaustion) eliminated. This fixes a
long-standing panic, which can be triggered this way:
kldload dummynet
sysctl net.inet.ip.fw.one_pass=0
ipfw pipe 1 config bw 0
for i in `jot 30`; do ipfw add 1 pipe 1 icmp from any to any; done
ping -c 1 localhost

3) Three new sysctl nodes are added:
net.inet.ip.dummynet.io_pkt - packets passed to dummynet
net.inet.ip.dummynet.io_pkt_fast - packets avoided dummynet scheduler
net.inet.ip.dummynet.io_pkt_drop - packets dropped by dummynet

P.S. Above comments are true only for layer 3 packets. Layer 2 packet flow
is not changed yet.

MFC after: 3 months


# 173320 04-Nov-2007 thompsa

Add an option to limit the number of source MACs that can be behind a bridge
interface. Once the limit is reached packets with unknown source addresses are
dropped until an existing host cache entry expires or is removed. Useful to
use with the STICKY cache option.

Sponsored by: miniSuperHappyDevHouse NZ


# 172824 20-Oct-2007 thompsa

Use ETHER_BPF_MTAP so that the vlan tags are visible to bpf(4) when bridging a
vlan trunk.

Discussed with: csjp
MFC after: 3 days


# 172770 18-Oct-2007 thompsa

The bridging output function puts the mbuf directly on the interfaces send
queue so the output network card must support the same tagging mechanism as
how the frame was input (prepended Ethernet header tag or stripped HW mflag).

Now the vlan Ethernet header is _always_ stripped in ether_input and the mbuf
flagged, only network cards with VLAN_HWTAGGING enabled would properly
re-tag any outgoing vlan frames.

If the outgoing interface does not support hardware tagging then re-add the vlan
header to the front of the frame. Move the common vlan encapsulation in to
ether_vlanencap().

Reported by: Erik Osterholm, Jon Otterholm
MFC after: 1 week


# 172201 16-Sep-2007 thompsa

Allow additional packet filtering on the physical interface for locally
destined packets, disabled by default.

PR: kern/116051
Submitted by: Eygene Ryabinkin
Approved by: re (bmah)
MFC after: 2 weeks


# 171678 31-Jul-2007 thompsa

Add a bridge interface flag called PRIVATE where any private port can not
communicate with another private port.

All unicast/broadcast/multicast layer2 traffic is blocked so it works much the
same way as using firewall rules but scales better and is generally easier as
firewall packages usually do not allow ARP blocking.

An example usage would be having a number of customers on separate vlans
bridged with a server network. All the vlans are marked private, they can all
communicate with the server network unhindered, but can not exchange any
traffic whatsoever with each other.

Approved by: re (rwatson)
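
The forwarding rule described above reduces to a single predicate. This is a minimal sketch under stated assumptions: the flag name PORT_PRIVATE, its value, and the helper may_forward() are hypothetical and are not the identifiers used in if_bridge.

```c
#include <assert.h>
#include <stdbool.h>

#define PORT_PRIVATE	0x01	/* hypothetical per-port flag */

/*
 * Traffic is blocked only when both the source and the destination port
 * are marked private; every other combination may be forwarded.
 */
static bool
may_forward(unsigned int src_flags, unsigned int dst_flags)
{
	assert((src_flags & ~PORT_PRIVATE) == 0);
	assert((dst_flags & ~PORT_PRIVATE) == 0);
	return (!((src_flags & PORT_PRIVATE) && (dst_flags & PORT_PRIVATE)));
}
```

In the customer vlan example, each customer port would carry the flag and the server network port would not, so customer-to-server traffic passes while customer-to-customer traffic does not.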


# 171603 26-Jul-2007 thompsa

Avoid holding the softc lock when using copyout().

Reported by: dfr
Approved by: re (rwatson)


# 170681 13-Jun-2007 thompsa

Add the vlan tag to the bridge route table. This allows a vlan trunk to be
bridged, previously legitimate traffic was not passed as the bridge could not
tell that it was on a different Ethernet segment.

All non-tagged traffic is treated as vlan1 as per IEEE 802.1Q-2003


# 170139 30-May-2007 thompsa

Remove a KASSERT intended to help the developer, the condition is no longer
valid since the span code was added.

PR: kern/113170
MFC after: 1 week


# 167725 19-Mar-2007 thompsa

etherbroadcastaddr is now unused.


# 167722 19-Mar-2007 thompsa

M_BCAST & M_MCAST are now set by ether_input before passing to the bridge.


# 167683 18-Mar-2007 rik

Give a packet a chance to appear with the correct input interface
in case of multiple interfaces with the same MAC in the same bridge.
This commit does not solve the entire problem, only the case where the
packet arrived from such an interface.

PR: kern/109815
MFC after: 7 days
Submitted by: Eygene Ryabinkin and rik@
Discussed with: bms@, thompsa@, yar@


# 167575 14-Mar-2007 thompsa

Properly move the setting of bstp_linkstate_p to the bridgestp module.


# 167379 09-Mar-2007 thompsa

Change the passing of callbacks to a struct in case this needs to be extended in the future.


# 166916 23-Feb-2007 thompsa

Move the lock init until after if_alloc in case the allocation fails and we
free the softc and return.

MFC after: 3 days


# 165105 11-Dec-2006 thompsa

These days P2P means peer-2-peer (also well known from several filesharing
protocols) while PointToPoint has been PtP links. Change the variables
accordingly while the code is still fresh and undocumented.

Requested by: bz


# 164880 04-Dec-2006 syrinx

Add two new flags to if_bridge(4) indicating whether the edge flag
of the bridge port and path cost have been administratively set or
calculated automatically by RSTP.

Make sure to transition from non-edge to edge when the port goes down
and the edge flag was manually set before.
This is needed to comply with the condition
((!portEnabled && AdminEdge) || ....)
in the Bridge Detection State Machine (IEEE 802.1D-2004, p. 171).

Reviewed by: thompsa
Approved by: bz (mentor)


# 164861 03-Dec-2006 syrinx

Fix SIOCGDRVSPEC/BRDGGIFSSTP ioctl: make it copyin() the user
provided buffer length before trying to use it.

Reviewed by: thompsa
Approved by: bz (mentor)
MFC after: 3 days


# 164653 26-Nov-2006 thompsa

Sync with the OpenBSD port of RSTP
- use flags rather than separate ioctls for edge, p2p
- implement p2p and autop2p flags
- define large pathcost constant as ULL
- show bridgeid and rootid in ifconfig

Obtained from: Reyk Floeter <reyk@openbsd.org>


# 164626 26-Nov-2006 thompsa

use two stage creation of stp ports, this means that the stp variables can be
set before the port is marked STP and they will no longer be overwritten


# 164112 09-Nov-2006 thompsa

Add a new address cache type called sticky. On an interface marked sticky any
address learned by the bridge is made permanent, the address will not age out
and most importantly will not migrate to another interface.

This can be used to stop mac address poisoning or clients roaming in much the
same way as static entries without the hassle of preloading the table.


# 164033 06-Nov-2006 rwatson

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# 164002 05-Nov-2006 csjp

Fix possible leak when bridge is in monitor mode. Use m_freem() which will
free the entire chain, instead of using m_free() which will free just the
mbuf that was passed.

Discussed with: thompsa
MFC after: 3 days
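
The difference between freeing one buffer and freeing the whole chain can be shown with a userland analogue. This is only a sketch: struct buf, buf_free_one(), buf_free_chain(), buf_chain_alloc() and the bufs_freed counter are hypothetical stand-ins for mbufs, m_free() and m_freem(), not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf: a singly linked chain of buffers. */
struct buf {
	struct buf *next;
};

static int bufs_freed;	/* counts frees so the leak is observable */

/* Analogue of m_free(): frees one buffer and returns its successor. */
static struct buf *
buf_free_one(struct buf *b)
{
	struct buf *n;

	assert(b != NULL);
	n = b->next;
	free(b);
	bufs_freed++;
	return (n);
}

/* Analogue of m_freem(): walks the chain and frees every buffer. */
static void
buf_free_chain(struct buf *b)
{
	while (b != NULL)
		b = buf_free_one(b);
}

/* Builds a chain of n buffers for demonstration purposes. */
static struct buf *
buf_chain_alloc(int n)
{
	struct buf *head = NULL;

	while (n-- > 0) {
		struct buf *b = malloc(sizeof(*b));

		assert(b != NULL);
		b->next = head;
		head = b;
	}
	return (head);
}
```

Calling buf_free_one() on a three-buffer chain and discarding the return value leaves two buffers unreachable, which is the shape of the leak this commit describes.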


# 163984 04-Nov-2006 thompsa

When the packet is for the bridge then note which interface to send the reply
to, previously it was always broadcast to all interfaces (a bug). This is
useful when the bridge is the default gateway and vlans are used to isolate
each client, the reply is now kept private to the vlan which the client
resides.

Reported by: Jon Otterholm
Tested by: Jon Otterholm
MFC after: 3 days


# 163863 01-Nov-2006 thompsa

Bring in support for the Rapid Spanning Tree Protocol (802.1w).

RSTP provides faster spanning tree convergence, the protocol will exchange
information with neighboring switches to quickly transition to forwarding
without creating loops. The code will default to RSTP mode but will downgrade
any port connected to a legacy STP network so is fully backward compatible.

Reviewed by: syrinx
Tested by: syrinx


# 163142 08-Oct-2006 thompsa

Use LIST_FOREACH_SAFE instead of a hand rolled version.


# 162561 22-Sep-2006 thompsa

Revert r1.80 as the ethernet header was inadvertently stripped from ARP
packets. Reimplement this correctly and use a sysctl that defaults to off so
the user doesn't get any surprises if ipfw blocks the ARP packet.

MFC after: 3 days


# 162368 17-Sep-2006 thompsa

Rearrange things so that ARP packets can be filtered or rate limited with IPFW.

Requested by: Jon Otterholm
Tested by: Jon Otterholm


# 161625 25-Aug-2006 thompsa

The bridge can't hear its own transmissions so set IFF_SIMPLEX.

PR: kern/102361
Tested by: Radim Kolar <hsn@netmag.cz>
MFC after: 3 days


# 161407 17-Aug-2006 thompsa

Remove unneeded asserts from bridge_ioctl_* since these are just
extensions of bridge_ioctl() which has the correct locking.


# 161403 17-Aug-2006 thompsa

Remove two lock asserts that are unneeded due to subsequent unlocks.


# 161401 17-Aug-2006 thompsa

Call bridge_span before dropping the lock.

MFC after: 5 days


# 160902 02-Aug-2006 thompsa

- Use the new bridgestp callback to once again flush our bridge routes when an
interface is disabled.
- Log port changes to syslog, defaulting to off


# 160901 02-Aug-2006 thompsa

Tell bridgestp that we are about to free the memory so it can cleanup.


# 160867 31-Jul-2006 thompsa

Add some statistics that are needed to support RFC4188 as part of the SoC2006
work on a bridge monitoring module for BSNMP.

Submitted by: shteryana (SoC 2006)


# 160769 27-Jul-2006 thompsa

Remove the dependency of bridgestp.h on if_bridgevar.h by moving a couple of
private structures to if_bridge.c.


# 160730 26-Jul-2006 thompsa

bridgestp is now a separate module.


# 160726 26-Jul-2006 thompsa

Remove stp variables that are already initialised in bstp_attach().


# 160704 26-Jul-2006 thompsa

Forced commit due to missing log on the last revision.

Split the spanning tree state into its own structures and provide a simple API
to perform functions such as adding and deleting ports. This is just a
mechanical change and the STP operation remains the same. The bridgestp code
now has no knowledge of if_bridge.

This makes the code easier to read and can now also support other bridges such
as ng_bridge.


# 160703 26-Jul-2006 thompsa

/tmp/cvsuusTrc


# 160702 26-Jul-2006 thompsa

Remove variables that are overridden by ether_ifattach(). This clears up any
confusion especially as *if_output was pointed to a different function.


# 160195 09-Jul-2006 sam

Revise network interface cloning to take an optional opaque
parameter that can specify configuration parameters:
o rev cloner api's to add optional parameter block
o add SIOCCREATE2 that accepts parameter data
o rev vlan support to use new api (maintain old code)

Reviewed by: arch@


# 159807 20-Jun-2006 thompsa

Allow gif interfaces to be added as span ports, the user may want to send a
copy of all packets to the other side of the world.


# 159759 19-Jun-2006 thompsa

Fix spelling mistake in comment.


# 159555 12-Jun-2006 thompsa

Use bit operations to get a locally administered address rather than using a
hardcoded OUI code.


# 159446 08-Jun-2006 thompsa

Allow bridge and carp to play nicely together by returning the packet if it's
destined for a carp interface.

Obtained from: OpenBSD
MFC after: 2 weeks


# 158667 16-May-2006 thompsa

Fix style(9) nits, whitespace and parentheses.


# 158592 15-May-2006 dhartmei

Recalculate IP checksum after running pfil hooks.

Reviewed by: thompsa
Tested by: Adam McDougall <mcdouga9@egr.msu.edu>


# 158140 29-Apr-2006 thompsa

Add support for fragmenting ipv4 packets.

The packet filter may reassemble the ip fragments and return a packet that is
larger than the MTU of the sending interface. There is no check for DF or icmp
replies as we can only get a large packet to fragment by reassembling a
previous fragment, and this only happens after a call to pfil(9).

Obtained from: OpenBSD (mostly)
Glanced at by: mlaier
MFC after: 1 month


# 157155 26-Mar-2006 thompsa

Assert that the mbuf is not shared to ensure problems like the last commit are
not reintroduced.


# 157057 23-Mar-2006 rik

m_dup () packet not m_copypacket () since we will modify it. For more
details see PR kern/94448.

PR: kern/94448

Original patch: Eygene A. Ryabinkin <rea-fbsd at rea dot mbslab dot kiae dot ru>
Final patch: thompsa@
Tested by: thompsa@, Eygene A. Ryabinkin

MFC after: 7 days


# 156238 03-Mar-2006 thompsa

Since we are using random ethernet addresses for the bridge, it is possible
that we might have address collisions, so make sure that this hardware address
isn't already in use on another bridge.

Submitted by: csjp
MFC after: 1 month


# 156235 03-Mar-2006 csjp

Slightly re-worked bpf(4) code associated with bridging: if we have a
destination interface as a member of our bridge or this is a unicast packet,
push it through the bpf(4) machinery.

For broadcast or multicast packets, don't bother with the bpf(4) because it will
be re-injected into ether_input. We do this before we pass the packets through
the pfil(9) framework, as it is possible that pfil(9) will drop the packet or
possibly modify it, making it very difficult to debug firewall issues on the
bridge.

Further, implemented IFF_MONITOR for bridge interfaces. This does much the same
thing that it does for regular network interfaces: it pushes the packet to any
bpf(4) peers and then returns. This bypasses all of the bridge machinery,
saving mutex acquisitions, list traversals, and other operations performed by
the bridging code.

This change to the bridging code is useful in situations where individuals use a
bridge to multiplex RX/TX signals from two interfaces, as is required by some
network taps for de-multiplexing links and transmitting the RX/TX signals
out through two separate interfaces. This behaviour is quite common for network
taps monitoring links, especially for certain manufacturers.

Reviewed by: thompsa
MFC after: 1 month
Sponsored by: Seccuris Labs


# 155268 03-Feb-2006 oleg

Properly initialize args structure before passing it to ipfw_chk(): having
uninitialized args.inp is unhealthy for uid/gid/jail ipfw rules.

PR: kern/92589
Approved by: glebius (mentor)
MFC after: 1 week


# 155221 02-Feb-2006 csjp

Use PFIL_HOOKED macros in if_bridge and pass the right argument to
rw_assert. This un-breaks the build.

Submitted by: Kostik Belousov
Pointy hat to: csjp


# 155143 31-Jan-2006 thompsa

Fix two bugs with the bridge

- code expects memcmp() to return a signed value, our memcmp() returns 0 if
args are equal and > 0 if not.

- It's possible to hijack interface for static entry. If bridge receives
packet from interface marked as learning it will replace the bridge_rtnode
entry for the source address even if such entry marked as static.

Submitted by: Gleb Kurtsov <k-gleb yandex.ru>
MFC after: 3 days
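
Both fixes can be sketched together in a small userland example. The structure rt_entry, the flag RT_STATIC and the function rt_update() are hypothetical names chosen for illustration; the sketch only assumes that memcmp() returns 0 for equal buffers and that static entries must not follow a learned source address to another port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define RT_STATIC	0x01	/* hypothetical "static entry" flag */

/* Hypothetical forwarding-table entry. */
struct rt_entry {
	uint8_t addr[ETHER_ADDR_LEN];
	int port;
	unsigned int flags;
};

/*
 * Returns true if, after the call, the entry maps src to port.
 * Addresses are compared for equality only (memcmp() == 0), and an
 * entry marked static is never moved to a different port.
 */
static bool
rt_update(struct rt_entry *e, const uint8_t src[ETHER_ADDR_LEN], int port)
{
	assert(e != NULL && src != NULL);
	if (memcmp(e->addr, src, ETHER_ADDR_LEN) != 0)
		return (false);		/* entry is for another address */
	if ((e->flags & RT_STATIC) != 0)
		return (e->port == port);
	e->port = port;
	return (true);
}
```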


# 154806 25-Jan-2006 cperciva

Make sure buffers in if_bridge are fully initialized before copying
them to userland.

Security: FreeBSD-SA-06:06.kmem


# 154336 14-Jan-2006 thompsa

Add code that clears certain capabilities from the member interface, these are
restored when it's removed from the bridge.

At the moment we only clear IFCAP_TXCSUM. Since a locally generated packet on
the bridge may be sent out any one or more interfaces it can't be assumed that
every card does hardware csums. Most bridges don't generate a lot of traffic
themselves so turning off offloading won't hurt, bridged packets are
unaffected.

Tested by: Bruce Walker (bmw borderware.com)
MFC after: 5 days


# 153979 02-Jan-2006 thompsa

Fix a brain-o in the last commit, the conditional was always false.


# 153978 02-Jan-2006 thompsa

Reorganise bridge_rtupdate slightly to reduce duplication.


# 153977 02-Jan-2006 thompsa

Reset the route expiry time on each update rather than always letting them get
GC'd and recreated.


# 153976 02-Jan-2006 thompsa

It is better to use time_uptime here since it is monotonic.

Pointed out by: glebius


# 153967 02-Jan-2006 thompsa

Minor whitespace cleanup.


# 153965 02-Jan-2006 thompsa

Read time_second directly rather than calling getmicrotime().

Obtained from: DragonflyBSD


# 153831 29-Dec-2005 thompsa

When pfil(9) is enabled the bridge only considers ETHERTYPE_ARP, ETHERTYPE_IP and
ETHERTYPE_IPV6 frames. Change this to be a sysctl knob so that it is able to still
bridge non-IP packets if desired.

Also return early if all pfil_* sysctls are turned off, the user obviously does
not want to filter on the bridge.


# 153621 21-Dec-2005 thompsa

Add RFC 3378 EtherIP support. This change makes it possible to add gif
interfaces to bridges, which will then send and receive IP protocol 97 packets.
Packets are Ethernet frames with an EtherIP header prepended.

Obtained from: NetBSD
MFC after: 2 weeks


# 153606 21-Dec-2005 thompsa

As of r1.21 all broadcast packets are reprocessed by ether_input as arriving on
the bridge, this caused these packets to show up twice via bpf. Do not process
them twice with BPF_TAP.

MFC after: 3 days


# 153498 17-Dec-2005 thompsa

Use M_ZERO for the bridge_iflist to ensure there are no unexpected surprises.


# 153497 17-Dec-2005 thompsa

Minor whitespace cleanup.


# 153494 17-Dec-2005 thompsa

Change from a callback in if_ethersubr to using EVENTHANDLER in order to detach
span ports when they disappear. The span port does not have a pointer to the
softc so revert r1.31 and bring back the softc linked-list.

MFC after: 2 weeks


# 153458 15-Dec-2005 thompsa

It is not safe to use m_copypacket() here as the returned mbuf is readonly,
change to m_dup and keep the alignment on the layer3 header.

MFC after: 1 week


# 153408 14-Dec-2005 thompsa

Add support for creating span ports so that one can snoop bridged traffic
from another interface/machine/network.

Obtained from: OpenBSD
MFC after: 2 weeks


# 152932 29-Nov-2005 thompsa

The bridge is capable of sending broadcast packets so enable IFF_BROADCAST

Requested by: des


# 152393 13-Nov-2005 thompsa

Fix a second missed case where the refcount is not decremented.

MFC after: 3 days


# 152392 13-Nov-2005 thompsa

Fix a mbuf and refcnt leak in the broadcast code.

If the packet is rejected from pfil(9) then continue the loop rather than
returning, this means that we can still try to send it out the remaining
interfaces but more importantly the mbuf is freed and refcount decremented on
exit.


# 152315 11-Nov-2005 ru

- Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
through ifp anyway. IF_LLADDR() works faster, and all (except
one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
and drop the IFP2ENADDR() macro; all users have been converted
to use IF_LLADDR() instead.


# 152209 08-Nov-2005 thompsa

Move the cloned interface list management in to if_clone. For some drivers the
softc lists and associated mutex are now unused so these have been removed.

Calling if_clone_detach() will now destroy all the cloned interfaces for the
driver and in most cases is all that's needed to unload.

Idea by: brooks
Reviewed by: brooks


# 151594 23-Oct-2005 thompsa

If we have been called from ether_ifdetach() then do not try and clear the
promisc flag from the member interface, this is a no-op anyway since the
interface is disappearing. The driver may have already released
its resources such as miibus and this is likely to panic the kernel.

Submitted and tested by: Wojciech A. Koszek
MFC after: 2 weeks


# 151345 14-Oct-2005 thompsa

Make four more functions static that were missed in the last commit.


# 151313 14-Oct-2005 thompsa

Change most of the bridge and stp functions to static. This has highlighted
that the following functions are not used, wrap in '#ifdef noused' for the
moment.

bstp_enable_change_detection
bstp_disable_change_detection
bstp_set_bridge_priority
bstp_set_port_priority
bstp_set_path_cost


# 151305 14-Oct-2005 thompsa

Further clean up the bridge hooks in if_ethersubr.c and ng_ether.c

- move the function pointer definitions to if_bridgevar.h
- move most of the logic to the new BRIDGE_INPUT and BRIDGE_OUTPUT macros
- remove unneeded functions from if_bridgevar.h and sort a little.


# 151301 13-Oct-2005 thompsa

From 101 ways to panic your kernel.

Use bridge_ifdetach() to notify the bridge that a member has been detached. The
bridge can then remove it from its interface list and not try to send out via a
dead pointer.


# 151282 13-Oct-2005 thompsa

Clean up the if_bridge hooks a bit in if_ethersubr.c and ng_ether.c, move
the broadcast/multicast test to bridge_input().

Requested by: glebius


# 151266 12-Oct-2005 thompsa

Change the reference counting to count the number of cloned interfaces for each
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.

Assert that all cloners have been destroyed when freeing the memory.

Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.

Discussed with: brooks, pjd, -current
Reviewed by: brooks


# 150837 02-Oct-2005 thompsa

Do not packet filter in the bridge_start() routine, locally generated packets
are already filtered by the higher layers.

Approved by: mlaier (mentor)
MFC after: 3 days


# 150444 21-Sep-2005 thompsa

Fix an alignment panic by preserving the 2byte padding (ETHER_ALIGN) on our
copied mbuf, which keeps the IP header 32-bit aligned. This copied mbuf is
reinjected back into ether_input and off to the IP routines.

Reported and tested by: Peter van Dijk
Approved by: mlaier (mentor)
MFC after: 3 days
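
The arithmetic behind the padding can be checked with a short sketch. It assumes a 14-byte Ethernet header and a buffer that itself starts on a 4-byte boundary; the helper ip_hdr_offset() is hypothetical, and the constants are defined locally rather than taken from kernel headers.

```c
#include <assert.h>
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* destination + source + ethertype */
#define ETHER_ALIGN	2	/* pad bytes placed before the header */

/*
 * Offset of the IP header from the start of a 4-byte aligned buffer
 * when pad bytes precede the Ethernet header.
 */
static size_t
ip_hdr_offset(size_t pad)
{
	assert(pad < 4);
	return (pad + ETHER_HDR_LEN);
}
```

With the 2-byte pad the IP header starts at offset 16, a multiple of 4; without it the header starts at offset 14 and is misaligned by 2 bytes.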


# 149829 06-Sep-2005 thompsa

Add support for multicast to the bridge and allow inet6 addresses to be
assigned to the interface.

IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link, the spec is unclear for the bridge case and
it may cause scope violation.

An address can be assigned in the usual way;
ifconfig bridge0 inet6 xxxx:...

Tested by: bmah
Reviewed by: ume (netinet6)
Approved by: mlaier (mentor)
MFC after: 1 week


# 149522 26-Aug-2005 thompsa

Fix a panic in softclock() if the interface is destroyed with a bpf consumer
attached.

This is caused by bpf_detachd clearing IFF_PROMISC on the interface which does
a SIOCSIFFLAGS ioctl. The problem here is that while the interface has been
stopped, IFF_UP has not been cleared so IFF_UP != IFF_DRV_RUNNING, this causes
the ioctl function to init() the interface which resets the callouts.

The destroy then completes and frees the softc but softclock will panic on a
dead callout pointer.

Ensure ifp->if_flags matches reality by clearing IFF_UP when we destroy.

Silence from: rwatson
Approved by: mlaier (mentor)
MFC after: 3 days


# 149396 23-Aug-2005 thompsa

The mtu check in bridge_enqueue is bogus as the maximum Ethernet frame is
actually 1514, so comparing the mbuf length which includes the Ethernet header
to the interface MTU is wrong.

The check was a little over the top so just remove it.

Approved by: mlaier (mentor)
MFC after: 3 days
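
The commit removes the check rather than correcting it. Purely to illustrate why the original comparison was off by the header size, a corrected comparison might look like the sketch below; frame_fits_mtu() is a hypothetical helper and the constants are defined locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* Ethernet header, not counted in the MTU */
#define ETHERMTU	1500	/* standard Ethernet payload limit */

/*
 * The MTU bounds the payload only, so the header length has to be
 * subtracted from the frame length before comparing.
 */
static bool
frame_fits_mtu(size_t frame_len, size_t mtu)
{
	assert(frame_len >= ETHER_HDR_LEN);
	return (frame_len - ETHER_HDR_LEN <= mtu);
}
```

A full-size frame of 1514 bytes passes this test against an MTU of 1500, whereas comparing 1514 directly with 1500 would reject it.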


# 149253 18-Aug-2005 thompsa

Mark the callouts as MPSAFE as if_bridge has been giant-free since day 1.

Use the SMP friendly callout_init_mtx() while we are here.

Approved by: mlaier (mentor)
MFC after: 3 days


# 149064 15-Aug-2005 thompsa

Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by: Michal Mertl
Approved by: mlaier (mentor)
MFC after: 3 days


# 148887 09-Aug-2005 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days


# 148874 08-Aug-2005 thompsa

Use m_copypacket() which is an optimization of the common case
m_copym(m, 0, M_COPYALL, how).

This is required for strict alignment architectures where we align the IP
header in the input path but m_copym() will create an unaligned copy in
bridge_broadcast(). m_copypacket() preserves alignment of the first mbuf.

Noticed by: Petri Simolin
Approved by: mlaier (mentor)
MFC after: 3 days


# 148372 25-Jul-2005 thompsa

We check that all the member interfaces have the same MTU on attach to the
bridge but the interface can still be changed afterwards.

This falls under the 'don't do that' category but log a warning when INVARIANTS
is defined.

Approved by: mlaier (mentor)
MFC after: 3 days


# 148202 20-Jul-2005 thompsa

Clear the PROMISC flag from the vlan interface when we remove a member. We
checked for IFT_L2VLAN in bridge_ioctl_add() but not bridge_delete_member().

Approved by: mlaier (mentor)


# 147976 13-Jul-2005 thompsa

Previously the bridge MTU was set to ETHERMTU and could not be changed. Since
we can only bridge interfaces with the same value it meant that all members had
to be set at ETHERMTU as well.

Allow the first member to be added to define the MTU for the bridge, the check
still applies to all additional members.

Print an informative message if the MTU is incorrect [1]

Requested by: Niki Denev [1]
Approved by: mlaier (mentor)
MFC after: 3 days


# 147786 05-Jul-2005 thompsa

- Previously when broadcasting to N number of interfaces we would run pfil
hooks for each outgoing interface but also run pfil hooks _N times_ on the
bridge interface. This is changed so pfil hooks are run once for the bridge
interface (bridge0) and then only on the outgoing interfaces in the broadcast
loop.

- Simplify bridge_enqueue() by moving bridge_pfil() to the callers.

- Check (inet6_pfil_hook.ph_busy_count >= 0), it may be possible to have a
packet filter hooked for only ipv6 but we were only checking if ipv4 hooks
were busy.

- Minor optimisation for null mbuf check after bridge_pfil(), move it into the
if-block as it couldn't possibly be null outside.

Prodded by: mlaier
Approved by: re (scottl), mlaier (mentor)


# 147744 02-Jul-2005 thompsa

Check the alignment of the IP header before passing the packet up to the
packet filter. This would cause a panic on architectures that require strict
alignment such as sparc64 (tier1) and ia64/ppc (tier2).

This adds two new macros that check the alignment, these are compile time
dependent on __NO_STRICT_ALIGNMENT which is set for i386 and amd64 where
alignment isn't needed so the cost is avoided.

IP_HDR_ALIGNED_P()
IP6_HDR_ALIGNED_P()

Move bridge_ip_checkbasic()/bridge_ip6_checkbasic() up so that the alignment
is checked for ipfw and dummynet too.

PR: ia64/81284
Obtained from: NetBSD
Approved by: re (dwhite), mlaier (mentor)


# 147665 29-Jun-2005 thompsa

Sync if_bridge to NetBSD r1.31

Rename conflicting variables when handling SNAP Ethernet frames.

Obtained from: NetBSD
Approved by: mlaier (mentor)
Approved by: re (blanket)


# 147634 27-Jun-2005 thompsa

Fix a panic when bringing up the bridge interface. We were casting an ifnet
pointer to a softc which is no longer valid since the ifnet struct was split
out from the softc.

Approved by: mlaier (mentor)
Approved by: re (blanket)


# 147281 10-Jun-2005 thompsa

Catch up with the struct ifnet changes and use if_alloc().

Reviewed by: brooks
Approved by: mlaier (mentor)


# 147256 10-Jun-2005 brooks

Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.

Reviewed by: sobomax, sam


# 147251 10-Jun-2005 mlaier

Add missing {} in last commit.


# 147205 09-Jun-2005 thompsa

Add dummynet(4) support to if_bridge, this code is largely based on bridge.c.

This is the final piece to match bridge.c in functionality, we can now be a
drop-in replacement.

Approved by: mlaier (mentor)


# 147111 07-Jun-2005 thompsa

Bring in IPFW layer2 filtering from bridge.c, this allows Ethernet filtering
using the layer2, mac and mac-type keywords.

This is one of the last features that bridge.c has over if_bridge and gets us
very close to a full functional replacement.

Approved by: mlaier (mentor)


# 147040 06-Jun-2005 thompsa

Change ipv6 packet filtering to match ipv4. It now checks pfil_member and
pfil_bridge to determine which interfaces to filter on.

Approved by: mlaier (mentor)


# 146985 05-Jun-2005 thompsa

Add if_bridge, which provides more advanced Ethernet bridging and 802.1d
spanning tree support.

Based on Jason Wright's bridge driver from OpenBSD, and modified by Jason R.
Thorpe in NetBSD.

Reviewed by: mlaier, bms, green
Silence from: -net
Approved by: mlaier (mentor)
Obtained from: NetBSD