History log of /freebsd-current/sys/net/if_vlan.c
Revision Date Author Comments
# bdd12889 21-May-2024 Kristof Provost <kp@FreeBSD.org>

if_vlan: handle VID conflicts

If we fail to change the vlan id we have to undo the removal (and vlan id
change) in the error path. Otherwise we'll have removed the vlan object from the
hash table, and have the wrong vlan id as well. Subsequent modification attempts
will then try to remove an entry which doesn't exist, and panic.

Undo the vlan id modification if the insertion in the hash table fails, and
re-insert it under the original vlan id.

PR: 279195
Reviewed by: zlei
MFC atfer: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D45285


# ab393e95 12-Oct-2023 Kristof Provost <kp@FreeBSD.org>

netlink: move NETLINK define to opt_global.h

Move the NETLINK define into opt_global.h so we can rely on it being
set correctly, without having to remember to include opt_netlink.h.
This ensures that the NETLINK define is correctly set. If not we
may end up with unloadable modules, due to missing symbols (such as
nlmsg_get_group_writer).

PR: 274306
Reviewed by: imp, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42179


# b451dcc8 04-Sep-2023 Dag-Erling Smørgrav <des@FreeBSD.org>

if_vlan: Always default to 802.1q.

There is no reason for this fallback to be conditional on COMPAT_FREEBSD12.

PR: 273539
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: NetApp, Inc.
Reviewed by: melifaro, allanjude
Differential Revision: https://reviews.freebsd.org/D41717


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# b1a39c31 12-Aug-2023 Kevin Bowling <kbowling@FreeBSD.org>

vlan: Respect IFCAP_LRO mask

vlan_capabilities(), used by the IFCAP ioctl, was not respecting the
IFCAP_LRO bit if it was masked by the requestor.

This prevented if_bridge(4) from automasking LRO with a message like:
bridge0: can't disable some capabilities on em3.11: 0x400

This also prevented manually disabling LRO from any vlan interface.

PR: 254596
Reported by: Paul Vixie <paul@redbarn.org>
MFC after: 1 week


# fb69ed39 12-Aug-2023 Kristof Provost <kp@FreeBSD.org>

Revert "if_vlan: do not enable LRO for bridge interaces"

This reverts commit 5f11a33ceeb385477cb22d9ad5941061c5a26be9.

As requested by Kevin Bowling. He explains:

> The subtle bug was that vlan_capabilities() in if_vlan was not obeying
> the requested mask from its IFCAP ioctl.


# 5f11a33c 11-Aug-2023 Paul Vixie <paul@redbarn.org>

if_vlan: do not enable LRO for bridge interaces

If the parent interface is not a bridge and can do LRO and
checksum offloading on VLANs, then guess it may do LRO on VLANs.
False positive here cost nothing, while false negative may lead
to some confusions. According to Wikipedia:

"LRO should not operate on machines acting as routers, as it breaks
the end-to-end principle and can significantly impact performance."

The same reasoning applies to machines acting as bridges.

PR: 254596
MFC after: 3 weeks


# 92c23f6d 12-May-2023 Kristof Provost <kp@FreeBSD.org>

vlan: fix setting flags on a QinQ interface

Setting vlan flags needlessly takes the exclusive VLAN_XLOCK().

If we have stacked vlan devices (i.e. QinQ) and we set vlan flags (e.g.
IFF_PROMISC) we call rtnl_handle_ifevent() to send a notification about
the interface.
This ends up calling SIOCGIFMEDIA, which requires the VLAN_SLOCK().
Trying to take that one with the VLAN_XLOCK() held deadlocks us.

There's no need for the exclusive lock though, as we're only accessing
parent/trunk information, not modifying it, so a shared lock is
sufficient.

While here also add a test case for this issue.

Backtrace:
shared lock of (sx) vlan_sx @ /usr/src/sys/net/if_vlan.c:2192
while exclusively locked from /usr/src/sys/net/if_vlan.c:2307
panic: excl->share
cpuid = 29
time = 1683873033
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe015d4ad4b0
vpanic() at vpanic+0x152/frame 0xfffffe015d4ad500
panic() at panic+0x43/frame 0xfffffe015d4ad560
witness_checkorder() at witness_checkorder+0xcb5/frame 0xfffffe015d4ad720
_sx_slock_int() at _sx_slock_int+0x67/frame 0xfffffe015d4ad760
vlan_ioctl() at vlan_ioctl+0xf8/frame 0xfffffe015d4ad7c0
dump_iface() at dump_iface+0x12f/frame 0xfffffe015d4ad840
rtnl_handle_ifevent() at rtnl_handle_ifevent+0xab/frame 0xfffffe015d4ad8c0
if_setflag() at if_setflag+0xf6/frame 0xfffffe015d4ad930
ifpromisc() at ifpromisc+0x2a/frame 0xfffffe015d4ad960
vlan_setflags() at vlan_setflags+0x60/frame 0xfffffe015d4ad990
vlan_ioctl() at vlan_ioctl+0x216/frame 0xfffffe015d4ad9f0
if_setflag() at if_setflag+0xe4/frame 0xfffffe015d4ada60
ifpromisc() at ifpromisc+0x2a/frame 0xfffffe015d4ada90
bridge_ioctl_add() at bridge_ioctl_add+0x499/frame 0xfffffe015d4adb10
bridge_ioctl() at bridge_ioctl+0x328/frame 0xfffffe015d4adbc0
ifioctl() at ifioctl+0x972/frame 0xfffffe015d4adcc0
kern_ioctl() at kern_ioctl+0x1fe/frame 0xfffffe015d4add30
sys_ioctl() at sys_ioctl+0x154/frame 0xfffffe015d4ade00
amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe015d4adf30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe015d4adf30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x22b0f0ef8d8a, rsp = 0x22b0ec63f2c8, rbp = 0x22b0ec63f380 ---
KDB: enter: panic
[ thread pid 5715 tid 101132 ]

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 089104e0 18-Apr-2023 Alexander V. Chernikov <melifaro@FreeBSD.org>

netlink: add netlink interfaces to if_clone

This change adds netlink create/modify/dump interfaces to the `if_clone.c`.
The previous attempt with storing the logic inside `netlink/route/iface_drivers.c`
did not quite work, as, for example, dumping interface-specific state
(like vlan id or vlan parent) required some peeking into the private interfaces.

The new interfaces are added in a compatible way - callers don't have to do anything
unless they are extended with Netlink.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D39032
MFC after: 1 month


# 66bdbcd5 03-Mar-2023 Alexander V. Chernikov <melifaro@FreeBSD.org>

net: unify mtu update code

Subscribers: imp, ae, glebius

Differential Revision: https://reviews.freebsd.org/D38893


# c255d1a4 23-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

IfAPI: Add if_llsoftc member accessors for TOEDEV

Summary:
Keep TOEDEV() macro for backwards compatibility, and add a SETTOEDEV()
macro to complement with the new accessors.

Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38199


# 2c2b37ad 13-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

ifnet/API: Move struct ifnet definition to a <net/if_private.h>

Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail. Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.

Reviewed by: glebius
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046


# 91ebcbe0 21-Sep-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

if_clone: migrate some consumers to the new KPI.

Convert most of the cloner customers who require custom params
to the new if_clone KPI.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D36636
MFC after: 2 weeks


# 151abc80 22-Jul-2022 Kristof Provost <kp@FreeBSD.org>

if_vlan: avoid hash table thrashing when adding and removing entries

vlan_remhash() uses incorrect value for b.

When using the default value for VLAN_DEF_HWIDTH (4), the VLAN hash-list table
expands from 16 chains to 32 chains as the 129th entry is added. trunk->hwidth
becomes 5. Say a few more entries are added and there are now 135 entries.
trunk-hwidth will still be 5. If an entry is removed, vlan_remhash() will
calculate a value of 32 for b. refcnt will be decremented to 134. The if
comparison at line 473 will return true and vlan_growhash() will be called. The
VLAN hash-list table will be compressed from 32 chains wide to 16 chains wide.
hwidth will become 4. This is an error, and it can be seen when a new VLAN is
added. The table will again be expanded. If an entry is then removed, again
the table is contracted.

If the number of VLANS stays in the range of 128-512, each time an insert
follows a remove, the table will expand. Each time a remove follows an
insert, the table will be contracted.

The fix is simple. The line 473 should test that the number of entries has
decreased such that the table should be contracted using what would be the new
value of hwidth. line 467 should be:

b = 1 << (trunk->hwidth - 1);

PR: 265382
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: NetApp, Inc.


# 663f556b 18-Jul-2022 Kristof Provost <kp@FreeBSD.org>

if_vlan: allow vlan and vlanproto to be changed

It's currently not possible to change the vlan ID or vlan protocol (i.e.
802.1q vs. 802.1ad) without de-configuring the interface (i.e. ifconfig
vlanX -vlandev).
Add a specific flow for this, allowing both the protocol and id (but not
parent interface) to be changed without going through the '-vlandev'
step.

Reviewed by: glebius
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D35846


# 892eded5 24-May-2022 Hans Petter Selasky <hselasky@FreeBSD.org>

vlan(4): Add support for allocating TLS receive tags.

The TLS receive tags are allocated directly from the receiving interface,
because mbufs are flowing in the opposite direction and then route change
checks are not useful, because they only work for outgoing traffic.

Differential revision: https://reviews.freebsd.org/D32356
Sponsored by: NVIDIA Networking


# f2ab9160 19-May-2022 Andrey V. Elsukov <ae@FreeBSD.org>

[vlan + lagg] add IFNET_EVENT_UPDATE_BAUDRATE event

use it to update if_baudrate for vlan interfaces created on the LACP lagg.

Differential revision: https://reviews.freebsd.org/D33405


# 2884a936 13-Apr-2022 John Baldwin <jhb@FreeBSD.org>

vlan: ifa is only used under #ifdef INET.


# 78bc3d5e 14-Feb-2022 Kristof Provost <kp@FreeBSD.org>

vlan: allow net.link.vlan.mtag_pcp to be set per vnet

The primary reason for this change is to facilitate testing.

MFC after: 1 week


# c782ea8b 14-Sep-2021 John Baldwin <jhb@FreeBSD.org>

Add a switch structure for send tags.

Move the type and function pointers for operations on existing send
tags (modify, query, next, free) out of 'struct ifnet' and into a new
'struct if_snd_tag_sw'. A pointer to this structure is added to the
generic part of send tags and is initialized by m_snd_tag_init()
(which now accepts a switch structure as a new argument in place of
the type).

Previously, device driver ifnet methods switched on the type to call
type-specific functions. Now, those type-specific functions are saved
in the switch structure and invoked directly. In addition, this more
gracefully permits multiple implementations of the same tag within a
driver. In particular, NIC TLS for future Chelsio adapters will use a
different implementation than the existing NIC TLS support for T6
adapters.

Reviewed by: gallatin, hselasky, kib (older version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31572


# 2e5ff01d 21-Aug-2021 Luiz Otavio O Souza <loos@FreeBSD.org>

if_vlan: add the ALTQ support to if_vlan.

Inspired by the iflib implementation, allow ALTQ to be used with if_vlan
interfaces.

Reviewed by: donner
Obtained from: pfsense
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31647


# 9ef8cd0b 22-Jul-2021 Kristof Provost <kp@FreeBSD.org>

vlan: deduplicate bpf_setpcp() and pf_ieee8021q_setpcp()

These two fuctions were identical, so move them into the common
vlan_set_pcp() function, exposed in the if_vlan_var.h header.

Reviewed by: donner
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31275


# c6b2d024 21-Jun-2021 George V. Neville-Neil <gnn@FreeBSD.org>

Retore the vnet before returning an error.

Obtained from: Kanndula, Dheeraj <Dheeraj.Kandula@netapp.com>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30741


# afbb64f1 11-Apr-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix vlan creation for the older ifconfig(8) binaries.

Reported by: allanjude
MFC after: immediately


# 53729367 26-Jan-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix subinterface vlan creation.

D26436 introduced support for stacked vlans that changed the way vlans
are configured. In particular, this change broke setups that have
same-number vlans as subinterfaces.

Vlan support was initially created assuming "vlanX" semantics. In this paradigm,
automatic number assignment supported by cloning (ifconfig vlan create) was a
natural fit.
When "ifaceX.Y" support was added, allowing to have the same vlan number on
multiple devices, cloning code became more complex, as the is no
unified "vlan" namespace anymore. Such interfaces got the first spare
index from "vlan" cloner. This, in turn, led to the following problem:
ifconfig ix0.333 create -> index 1
ifconfig ix0.444 create -> index 2
ifconfig vlan2 create -> allocation failure

This change fixes such allocations by using cloning indexes only for
"vlanX" interfaces.

Reviewed by: hselasky
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D27505


# 3f43ada9 28-Jan-2021 Gleb Smirnoff <glebius@FreeBSD.org>

Catch up with 6edfd179c86: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG.

Originally IFCAP_NOMAP meant that the mbuf has external storage pointer
that points to unmapped address. Then, this was extended to array of
such pointers. Then, such mbufs were augmented with header/trailer.
Basically, extended mbufs are extended, and set of features is subject
to change. The new name should be generic enough to avoid further
renaming.


# 1a714ff2 26-Jan-2021 Randall Stewart <rrs@FreeBSD.org>

This pulls over all the changes that are in the netflix
tree that fix the ratelimit code. There were several bugs
in tcp_ratelimit itself and we needed further work to support
the multiple tag format coming for the joint TLS and Ratelimit dances.

Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D28357


# 36e0a362 29-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc().

This gives a more uniform API for send tag life cycle management.

Reviewed by: gallatin, hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27000


# 521eac97 28-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Support hardware rate limiting (pacing) with TLS offload.

- Add a new send tag type for a send tag that supports both rate
limiting (packet pacing) and TLS offload (mostly similar to D22669
but adds a separate structure when allocating the new tag type).

- When allocating a send tag for TLS offload, check to see if the
connection already has a pacing rate. If so, allocate a tag that
supports both rate limiting and TLS offload rather than a plain TLS
offload tag.

- When setting an initial rate on an existing ifnet KTLS connection,
set the rate in the TCP control block inp and then reset the TLS
send tag (via ktls_output_eagain) to reallocate a TLS + ratelimit
send tag. This allocates the TLS send tag asynchronously from a
task queue, so the TLS rate limit tag alloc is always sleepable.

- When modifying a rate on a connection using KTLS, look for a TLS
send tag. If the send tag is only a plain TLS send tag, assume we
failed to allocate a TLS ratelimit tag (either during the
TCP_TXTLS_ENABLE socket option, or during the send tag reset
triggered by ktls_output_eagain) and ignore the new rate. If the
send tag is a ratelimit TLS send tag, change the rate on the TLS tag
and leave the inp tag alone.

- Lock the inp lock when setting sb_tls_info for a socket send buffer
so that the routines in tcp_ratelimit can safely dereference the
pointer without needing to grab the socket buffer lock.

- Add an IFCAP_TXTLS_RTLMT capability flag and associated
administrative controls in ifconfig(8). TLS rate limit tags are
only allocated if this capability is enabled. Note that TLS offload
(whether unlimited or rate limited) always requires IFCAP_TXTLS[46].

Reviewed by: gallatin, hselasky
Relnotes: yes
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26691


# c7cffd65 21-Oct-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q).

802.1ad interfaces are created with ifconfig using the "vlanproto" parameter.
Eg., the following creates a 802.1Q VLAN (id #42) over a 802.1ad S-VLAN
(id #5) over a physical Ethernet interface (em0).

ifconfig vlan5 create vlandev em0 vlan 5 vlanproto 802.1ad up
ifconfig vlan42 create vlandev vlan5 vlan 42 inet 10.5.42.1/24

VLAN_MTU, VLAN_HWCSUM and VLAN_TSO capabilities should be properly
supported. VLAN_HWTAGGING is only partially supported, as there is
currently no IFCAP_VLAN_* denoting the possibility to set the VLAN
EtherType to anything else than 0x8100 (802.1ad uses 0x88A8).

Submitted by: Olivier Piras
Sponsored by: RG Nets
Differential Revision: https://reviews.freebsd.org/D26436


# 56fb710f 06-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Store the send tag type in the common send tag header.

Both cxgbe(4) and mlx5(4) wrapped the existing send tag header with
their own identical headers that stored the type that the
type-specific tag structures inherited from, so in practice it seems
drivers need this in the tag anyway. This permits removing these
extra header indirections (struct cxgbe_snd_tag and struct
mlx5e_snd_tag).

In addition, this permits driver-independent code to query the type of
a tag, e.g. to know what type of tag is being queried via
if_snd_query.

Reviewed by: gallatin, hselasky, np, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26689


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# eb03a443 31-Jan-2020 Kristof Provost <kp@FreeBSD.org>

vlan: Fix panic when vnet jail with a vlan interface is destroyed

During vnet cleanup vnet_if_uninit() checks that no more interfaces remain in
the vnet. Any interface borrowed from another vnet is returned by
vnet_if_return(). Other interfaces (i.e. cloned interfaces) should have been
destroyed by their cloner at this point.

The if_vlan VNET_SYSUNINIT had priority SI_ORDER_FIRST, which means it had
equal priority as vnet_if_uninit(). In other words: it was possible for it to
be called *after* vnet_if_uninit(), which would lead to assertion failures.

Set the priority to SI_ORDER_ANY, like other cloners to ensure that vlan
interfaces are destroyed before we enter vnet_if_uninit().

The sys/net/if_vlan test provoked this.


# 4be465ab 29-Jan-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Plug parent iface refcount leak on <ifname>.X vlan creation.

PR: kern/242270
Submitted by: Andrew Boyer <aboyer at pensando.io>
MFC after: 2 weeks


# 84becee1 22-Jan-2020 Alexander Motin <mav@FreeBSD.org>

Update route MTUs for bridge, lagg and vlan interfaces.

Those interfaces may implicitly change their MTU on addition of parent
interface in addition to normal SIOCSIFMTU ioctl path, where the route
MTUs are updated normally.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 2a4bd982 14-Jan-2020 Gleb Smirnoff <glebius@FreeBSD.org>

Introduce NET_EPOCH_CALL() macro and use it everywhere where we free
data based on the network epoch. The macro reverses the argument
order of epoch_call(9) - first function, then its argument. NFC


# a961401e 07-Nov-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Enqueue lladdr_task to update link level address of vlan, when its parent
interface has changed.

During vlan reconfiguration without destroying interface, it is possible,
that parent interface will be changed. This usually means, that link
layer address of vlan will be different. Therefore we need to update all
associated with vlan's addresses permanent llentries - NDP for IPv6
addresses, and ARP for IPv4 addresses. This is done via lladdr_task
execution. To avoid extra work, before execution do the check, that L2
address is different.

No objection from: #network
Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D22243


# b2807792 17-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Revert two parts of r353292 that enter epoch when processing vlan capabilities.
It could be that entering epoch isn't necessary here, but better take a
conservative approach.

Submitted by: kp


# 6dcec895 13-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

vlan_config() isn't always called in epoch context.

Reported by: kp


# b8a6e03f 07-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Widen NET_EPOCH coverage.

When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by: gallatin, hselasky, cy, adrian, kristof
Differential Revision: https://reviews.freebsd.org/D19111


# bf7700e4 25-Sep-2019 Gleb Smirnoff <glebius@FreeBSD.org>

style(9): remove extraneous empty lines


# 16cf6bdb 30-Aug-2019 Matt Joras <mjoras@FreeBSD.org>

Wrap a vlan's parent's if_output in a separate function.

When a vlan interface is created, its if_output is set directly to the
parent interface's if_output. This is fine in the normal case but has an
unfortunate consequence if you end up with a certain combination of vlan
and lagg interfaces.

Consider you have a lagg interface with a single laggport member. When
an interface is added to a lagg its if_output is set to
lagg_port_output, which blackholes traffic from the normal networking
stack but not certain frames from BPF (pseudo_AF_HDRCMPLT). If you now
create a vlan with the laggport member (not the lagg interface) as its
parent, its if_output is set to lagg_port_output as well. While this is
confusing conceptually and likely represents a misconfigured system, it
is not itself a problem. The problem arises when you then remove the
lagg interface. Doing this resets the if_output of the laggport member
back to its original state, but the vlan's if_output is left pointing to
lagg_port_output. This gives rise to the possibility that the system
will panic when e.g. bpf is used to send any frames on the vlan
interface.

Fix this by creating a new function, vlan_output, which simply wraps the
parent's current if_output. That way when the parent's if_output is
restored there is no stale usage of lagg_port_output.

Reviewed by: rstone
Differential Revision: D21209


# b2e60773 26-Aug-2019 John Baldwin <jhb@FreeBSD.org>

Add kernel-side support for in-kernel TLS.

KTLS adds support for in-kernel framing and encryption of Transport
Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports
offload of TLS for transmitted data. Key negotation must still be
performed in userland. Once completed, transmit session keys for a
connection are provided to the kernel via a new TCP_TXTLS_ENABLE
socket option. All subsequent data transmitted on the socket is
placed into TLS frames and encrypted using the supplied keys.

Any data written to a KTLS-enabled socket via write(2), aio_write(2),
or sendfile(2) is assumed to be application data and is encoded in TLS
frames with an application data type. Individual records can be sent
with a custom type (e.g. handshake messages) via sendmsg(2) with a new
control message (TLS_SET_RECORD_TYPE) specifying the record type.

At present, rekeying is not supported though the in-kernel framework
should support rekeying.

KTLS makes use of the recently added unmapped mbufs to store TLS
frames in the socket buffer. Each TLS frame is described by a single
ext_pgs mbuf. The ext_pgs structure contains the header of the TLS
record (and trailer for encrypted records) as well as references to
the associated TLS session.

KTLS supports two primary methods of encrypting TLS frames: software
TLS and ifnet TLS.

Software TLS marks mbufs holding socket data as not ready via
M_NOTREADY similar to sendfile(2) when TLS framing information is
added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then
called to schedule TLS frames for encryption. In the case of
sendfile_iodone() calls ktls_enqueue() instead of pru_ready() leaving
the mbufs marked M_NOTREADY until encryption is completed. For other
writes (vn_sendfile when pages are available, write(2), etc.), the
PRUS_NOTREADY is set when invoking pru_send() along with invoking
ktls_enqueue().

A pool of worker threads (the "KTLS" kernel process) encrypts TLS
frames queued via ktls_enqueue(). Each TLS frame is temporarily
mapped using the direct map and passed to a software encryption
backend to perform the actual encryption.

(Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if
someone wished to make this work on architectures without a direct
map.)

KTLS supports pluggable software encryption backends. Internally,
Netflix uses proprietary pure-software backends. This commit includes
a simple backend in a new ktls_ocf.ko module that uses the kernel's
OpenCrypto framework to provide AES-GCM encryption of TLS frames. As
a result, software TLS is now a bit of a misnomer as it can make use
of hardware crypto accelerators.

Once software encryption has finished, the TLS frame mbufs are marked
ready via pru_ready(). At this point, the encrypted data appears as
regular payload to the TCP stack stored in unmapped mbufs.

ifnet TLS permits a NIC to offload the TLS encryption and TCP
segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS)
is allocated on the interface a socket is routed over and associated
with a TLS session. TLS records for a TLS session using ifnet TLS are
not marked M_NOTREADY but are passed down the stack unencrypted. The
ip_output_send() and ip6_output_send() helper functions that apply
send tags to outbound IP packets verify that the send tag of the TLS
record matches the outbound interface. If so, the packet is tagged
with the TLS send tag and sent to the interface. The NIC device
driver must recognize packets with the TLS send tag and schedule them
for TLS encryption and TCP segmentation. If the the outbound
interface does not match the interface in the TLS send tag, the packet
is dropped. In addition, a task is scheduled to refresh the TLS send
tag for the TLS session. If a new TLS send tag cannot be allocated,
the connection is dropped. If a new TLS send tag is allocated,
however, subsequent packets will be tagged with the correct TLS send
tag. (This latter case has been tested by configuring both ports of a
Chelsio T6 in a lagg and failing over from one port to another. As
the connections migrated to the new port, new TLS send tags were
allocated for the new port and connections resumed without being
dropped.)

ifnet TLS can be enabled and disabled on supported network interfaces
via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported
across both vlan devices and lagg interfaces using failover, lacp with
flowid enabled, or lacp with flowid enabled.

Applications may request the current KTLS mode of a connection via a
new TCP_TXTLS_MODE socket option. They can also use this socket
option to toggle between software and ifnet TLS modes.

In addition, a testing tool is available in tools/tools/switch_tls.
This is modeled on tcpdrop and uses similar syntax. However, instead
of dropping connections, -s is used to force KTLS connections to
switch to software TLS and -i is used to switch to ifnet TLS.

Various sysctls and counters are available under the kern.ipc.tls
sysctl node. The kern.ipc.tls.enable node must be set to true to
enable KTLS (it is off by default). The use of unmapped mbufs must
also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS.

KTLS is enabled via the KERN_TLS kernel option.

This patch is the culmination of years of work by several folks
including Scott Long and Randall Stewart for the original design and
implementation; Drew Gallatin for several optimizations including the
use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records
awaiting software encryption, and pluggable software crypto backends;
and John Baldwin for modifications to support hardware TLS offload.

Reviewed by: gallatin, hselasky, rrs
Obtained from: Netflix
Sponsored by: Netflix, Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21277


# 66d0c056 28-Jun-2019 John Baldwin <jhb@FreeBSD.org>

Support IFCAP_NOMAP in vlan(4).

Enable IFCAP_NOMAP for a vlan interface if it is supported by the
underlying trunk device.

Reviewed by: gallatin, hselasky, rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20616


# fb3bc596 24-May-2019 John Baldwin <jhb@FreeBSD.org>

Restructure mbuf send tags to provide stronger guarantees.

- Perform ifp mismatch checks (to determine if a send tag is allocated
for a different ifp than the one the packet is being output on), in
ip_output() and ip6_output(). This avoids sending packets with send
tags to ifnet drivers that don't support send tags.

Since we are now checking for ifp mismatches before invoking
if_output, we can now try to allocate a new tag before invoking
if_output sending the original packet on the new tag if allocation
succeeds.

To avoid code duplication for the fragment and unfragmented cases,
add ip_output_send() and ip6_output_send() as wrappers around
if_output and nd6_output_ifp, respectively. All of the logic for
setting send tags and dealing with send tag-related errors is done
in these wrapper functions.

For pseudo interfaces that wrap other network interfaces (vlan and
lagg), wrapper send tags are now allocated so that ip*_output see
the wrapper ifp as the ifp in the send tag. The if_transmit
routines rewrite the send tags after performing an ifp mismatch
check. If an ifp mismatch is detected, the transmit routines fail
with EAGAIN.

- To provide clearer life cycle management of send tags, especially
in the presence of vlan and lagg wrapper tags, add a reference count
to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
Provide a helper function (m_snd_tag_init()) for use by drivers
supporting send tags. m_snd_tag_init() takes care of the if_ref
on the ifp meaning that code alloating send tags via if_snd_tag_alloc
no longer has to manage that manually. Similarly, m_snd_tag_rele
drops the refcount on the ifp after invoking if_snd_tag_free when
the last reference to a send tag is dropped.

This also closes use after free races if there are pending packets in
driver tx rings after the socket is closed (e.g. from tcpdrop).

In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
Drivers now also check this flag instead of checking snd_tag against
NULL. This avoids false positive matches when a forwarded packet
has a non-NULL rcvif that was treated as a send tag.

- cxgbe was relying on snd_tag_free being called when the inp was
detached so that it could kick the firmware to flush any pending
work on the flow. This is because the driver doesn't require ACK
messages from the firmware for every request, but instead does a
kind of manual interrupt coalescing by only setting a flag to
request a completion on a subset of requests. If all of the
in-flight requests don't have the flag when the tag is detached from
the inp, the flow might never return the credits. The current
snd_tag_free command issues a flush command to force the credits to
return. However, the credit return is what also frees the mbufs,
and since those mbufs now hold references on the tag, this meant
that snd_tag_free would never be called.

To fix, explicitly drop the mbuf's reference on the snd tag when the
mbuf is queued in the firmware work queue. This means that once the
inp's reference on the tag goes away and all in-flight mbufs have
been queued to the firmware, tag's refcount will drop to zero and
snd_tag_free will kick in and send the flush request. Note that we
need to avoid doing this in the middle of ethofld_tx(), so the
driver grabs a temporary reference on the tag around that loop to
defer the free to the end of the function in case it sends the last
mbuf to the queue after the inp has dropped its reference on the
tag.

- mlx5 preallocates send tags and was using the ifp pointer even when
the send tag wasn't in use. Explicitly use the ifp from other data
structures instead.

- Sprinkle some assertions in various places to assert that received
packets don't have a send tag, and that other places that overwrite
rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.

Reviewed by: gallatin, hselasky, rgrimes, ae
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20117


# fa91f845 13-Feb-2019 Randall Stewart <rrs@FreeBSD.org>

This commit adds the missing release mechanism for the
ratelimiting code. The two modules (lagg and vlan) did have
allocation routines, and even though they are indirect (and
vector down to the underlying interfaces) they both need to
have a free routine (that also vectors down to the actual interface).

Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D19032


# 7b7f772f 09-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Bring the comment up to date.


# 72755d28 09-Jan-2019 Mark Johnston <markj@FreeBSD.org>

Stop setting if_linkmib in vlan(4) ifnets.

There are several reasons:
- The structure being exported via IFDATA_LINKSPECIFIC doesn't appear
to be a standard MIB.
- The structure being exported is private to the kernel and always
has been.
- No other drivers in common use set the if_linkmib field.
- Because IFDATA_LINKSPECIFIC can be used to overwrite the linkmib
structure, a privileged user could use it to corrupt internal
vlan(4) state. [1]

PR: 219472
Reported by: CTurt <ecturt@gmail.com> [1]
Reviewed by: kp (previous version)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18779


# a68cc388 08-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanical cleanup of epoch(9) usage in network stack.

- Remove macros that covertly create epoch_tracker on thread stack. Such
macros a quite unsafe, e.g. will produce a buggy code if same macro is
used in embedded scopes. Explicitly declare epoch_tracker always.

- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
locking macros to what they actually are - the net_epoch.
Keeping them as is is very misleading. They all are named FOO_RLOCK(),
while they no longer have lock semantics. Now they allow recursion and
what's more important they now no longer guarantee protection against
their companion WLOCK macros.
Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.

This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.

Discussed with: jtl, gallatin


# cac30248 21-Nov-2018 Oleg Bulyzhin <oleg@FreeBSD.org>

Unbreak kernel build with VLAN_ARRAY defined.

MFC after: 1 week


# 5191a3ae 21-Oct-2018 Kristof Provost <kp@FreeBSD.org>

vlan: Fix panic with lagg and vlan

vlan_lladdr_fn() is called from taskqueue, which means there's no vnet context
set. We can end up trying to send ARP messages (through the iflladdr_event
event), which requires a vnet context.

PR: 227654
MFC after: 3 days


# c32a9d66 15-Oct-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix deadlock when destroying VLANs.

Synchronizing the epoch before freeing the multicast addresses while holding
the VLAN_XLOCK() might lead to a deadlock. Use deferred freeing of the VLAN
multicast addresses to resolve deadlock. Backtrace:

Thread1:
epoch_block_handler_preempt()
ck_epoch_synchronize_wait()
epoch_wait_preempt()
vlan_setmulti()
vlan_ioctl()
in6m_release_task()
gtaskqueue_run_locked()
gtaskqueue_thread_loop()
fork_exit()
fork_trampoline()

Thread2:
sleepq_switch()
sleepq_wait()
_sx_xlock_hard()
_sx_xlock()
in6_leavegroup()
in6_purgeaddr()
if_purgeaddrs()
if_detach_internal()
if_detach()
vlan_clone_destroy()
if_clone_destroyif()
if_clone_destroy()
ifioctl()
kern_ioctl()
sys_ioctl()
amd64_syscall()
fast_syscall_common()
syscall()

Differential revision: https://reviews.freebsd.org/D17496
Reviewed by: slavash, mmacy
Approved by: re (kib)
Sponsored by: Mellanox Technologies


# b08d611d 20-Sep-2018 Matt Macy <mmacy@FreeBSD.org>

fix vlan locking to permit sx acquisition in ioctl calls

- update vlan(9) to handle changes earlier this year in multicast locking

Tested by: np@, darkfiberu at gmail.com

PR: 230510
Reviewed by: mjoras@, shurd@, sbruno@
Approved by: re (gjb@)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16808


# 32a52e9e 16-Aug-2018 Navdeep Parhar <np@FreeBSD.org>

if_vlan(4): A VLAN always has a PCP and its ifnet's if_pcp should be set
to the PCP value in use instead of IFNET_PCP_NONE.

MFC after: 1 week
Sponsored by: Chelsio Communications


# 32d2623a 16-Aug-2018 Navdeep Parhar <np@FreeBSD.org>

Add the ability to look up the 3b PCP of a VLAN interface. Use it in
toe_l2_resolve to fill up the complete vtag and not just the vid.

Reviewed by: kib@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D16752


# 5f901c92 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# d7c5a620 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

ifnet: Replace if_addr_lock rwlock with epoch + mutex

Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32
4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32
4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32
4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32
4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32
4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32
4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32

After the patch

InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51
5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51
5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51
5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51
5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52
5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366


# 4a381a9e 26-Apr-2018 Hans Petter Selasky <hselasky@FreeBSD.org>

Add network device event for priority code point, PCP, changes.

When the PCP is changed for either a VLAN network interface or when
prio tagging is enabled for a regular ethernet network interface,
broadcast the IFNET_EVENT_PCP event so applications like ibcore can
update its GID tables accordingly.

MFC after: 3 days
Reviewed by: ae, kib
Differential Revision: https://reviews.freebsd.org/D15040
Sponsored by: Mellanox Technologies


# 541d96aa 30-Mar-2018 Brooks Davis <brooks@FreeBSD.org>

Use an accessor function to access ifr_data.

This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size). This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900


# 38d958a6 27-Mar-2018 Brooks Davis <brooks@FreeBSD.org>

Improve copy-and-pasted versions of SIOCGIFADDR.

The original implementation used a reference to ifr_data and a cast to
do the equivalent of accessing ifr_addr. This was copied multiple
times since 1996.

Approved by: kib
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14873


# f1379734 27-Mar-2018 Konstantin Belousov <kib@FreeBSD.org>

Allow to specify PCP on packets not belonging to any VLAN.

According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be
considered as untagged, and only PCP and DEI values from the VLAN tag
are meaningful. See for instance
https://www.cisco.com/c/en/us/td/docs/switches/connectedgrid/cg-switch-sw-master/software/configuration/guide/vlan0/b_vlan_0.html.

Make it possible to specify PCP value for outgoing packets on an
ethernet interface. When PCP is supplied, the tag is appended, VLAN
id set to 0, and PCP is filled by the supplied value. The code to do
VLAN tag encapsulation is refactored from the if_vlan.c and moved into
if_ethersubr.c.

Drivers might have issues with filtering VID 0 packets on
receive. This bug should be fixed for each driver.

Reviewed by: ae (previous version), hselasky, melifaro
Sponsored by: Mellanox Technologies
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D14702


# ac2fffa4 21-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by: wosch
PR: 225197


# 44313341 15-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

net*: make some use of mallocarray(9).

Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837


# fdbf1174 12-Sep-2017 Matt Joras <mjoras@FreeBSD.org>

Allow vlan interfaces to rx through netmap(4).

Normally after receiving a packet, a vlan(4) interface sends the packet
back through its parent interface's rx routine so that it can be
processed as an untagged frame. It does this by using the parent's
ifp->if_input. This is incompatible with netmap(4), which replaces the
vlan(4) interface's if_input with a netmap(4) hook. Fix this by using
the vlan(4) interface's ifp instead of the parent's directly.

Reported by: Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by: rstone
Approved by: rstone (mentor)
MFC after: 3 days
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12191


# d148c2a2 15-Aug-2017 Matt Joras <mjoras@FreeBSD.org>

Rework vlan(4) locking.

Previously the locking of vlan(4) interfaces was not very comprehensive.
Particularly there was very little protection against the destruction of
active vlan(4) interfaces or concurrent modification of a vlan(4)
interface. The former readily produced several different panics.

The changes can be summarized as using two global vlan locks (an
rmlock(9) and an sx(9)) to protect accesses to the if_vlantrunk field of
struct ifnet, in addition to other places where global exclusive access
is required. vlan(4) should now be much more resilient to the destruction
of active interfaces and concurrent calls into the configuration path.

PR: 220980
Reviewed by: ae, markj, mav, rstone
Approved by: rstone (mentor)
MFC after: 4 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11370


# 9bcf3ae4 22-May-2017 Alexander Motin <mav@FreeBSD.org>

Add parent interface reference counting to if_vlan.

Using plain ifunit() looks like a request for troubles.

MFC after: 1 week


# 59150e91 29-Apr-2017 Alexander Motin <mav@FreeBSD.org>

Propagate IFCAP_LRO from trunk to vlan interface.

False positive here cost nothing, while false negative may lead to some
confusions.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# d89baa5a 28-Apr-2017 Alexander Motin <mav@FreeBSD.org>

Allow some control over enabled capabilities for if_vlan.

It improves interoperability with if_bridge, which may need to disable
some capabilities not supported by other members. IMHO there is still
open question about LRO capability, which may need to be disabled on
physical interface.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 070f87e8 11-Apr-2017 Andrey V. Elsukov <ae@FreeBSD.org>

Inherit IPv6 checksum offloading flags to vlan interfaces.

if_vlan(4) interfaces inherit IPv4 checksum offloading flags from the
parent when VLAN_HWCSUM and VLAN_HWTAGGING flags are present on the
parent interface. Do the same for IPv6 checksum offloading flags.

Reported by: Harry Schmalzbauer
Reviewed by: np, gnn
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10356


# f3e7afe2 18-Jan-2017 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement kernel support for hardware rate limited sockets.

- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.

- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.

- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().

- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.

- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.

- How rate limiting works:

1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.

2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.

3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.

4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.

Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months


# 89856f7e 21-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747


# 2ccbbd06 06-Jun-2016 Marcelo Araujo <araujo@FreeBSD.org>

Add support to priority code point (PCP) that is an 3-bit field
which refers to IEEE 802.1p class of service and maps to the frame
priority level.

Values in order of priority are: 1 (Background (lowest)),
0 (Best effort (default)), 2 (Excellent effort),
3 (Critical applications), 4 (Video, < 100ms latency),
5 (Video, < 10ms latency), 6 (Internetwork control) and
7 (Network control (highest)).

Example of usage:
root# ifconfig em0.1 create
root# ifconfig em0.1 vlanpcp 3

Note:
The review D801 includes the pf(4) part, but as discussed with kristof,
we won't commit the pf(4) bits for now.
The credits of the original code is from rwatson.

Differential Revision: https://reviews.freebsd.org/D801
Reviewed by: gnn, adrian, loos
Discussed with: rwatson, glebius, kristof
Tested by: many including Matthew Grooms <mgrooms__shrew.net>
Obtained from: pfSense
Relnotes: Yes


# a4641f4e 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/net*: minor spelling fixes.

No functional change.


# 8ad43f2d 14-Nov-2015 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move iflladdr_event eventhandler invocation to if_setlladdr.

Suggested by: glebius


# b13c5b5d 09-Nov-2015 Alexander V. Chernikov <melifaro@FreeBSD.org>

Use lladdr_event to propagate gratiotus arp.

Differential Revision: https://reviews.freebsd.org/D4019


# 9f7d0f48 23-Apr-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Don't propagate SIOCSIFCAPS from a vlan(4) to its parent. This leads to
quite unexpected result of toggling capabilities on the neighbour vlan(4)
interfaces.

Reviewed by: melifaro, np
Differential Revision: https://reviews.freebsd.org/D2310
Sponsored by: Nginx, Inc.


# c0304424 25-Mar-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Fix couple of fallouts from r280280. The first one is a simple typo,
where counter was incremented on parent, instead of vlan(4) interface.

The second is more complicated. Historically, in our stack the incoming
packets are accounted in drivers, while incoming bytes for Ethernet
drivers are accounted in ether_input_internal(). Thus, it should be
removed from vlan(4) driver.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# b1828acf 20-Mar-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Make vlan_config() the signle point of validity checks.

Sponsored by: Nginx, Inc.


# f941c31a 20-Mar-2015 Gleb Smirnoff <glebius@FreeBSD.org>

In vlan_clone_match_ethervid():
- Use ifunit() instead of going through the interface list ourselves.
- Remove unused parameter.
- Move the most important comment above the function.

Sponsored by: Nginx, Inc.


# 67975c79 20-Mar-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Tiny comment fix.


# a58ea6b1 20-Mar-2015 Gleb Smirnoff <glebius@FreeBSD.org>

Now, when r272244 introduced counter(9) based counters for all interfaces,
revert the r271538, which did that for vlan(4) only.

No objections: melifaro
Sponsored by: Nginx, Inc.


# 08e57366 20-Feb-2015 Xin LI <delphij@FreeBSD.org>

Handle SIOCSIFCAP by propogating the request to the parent interface. This
allows adding an vlan interface into a bridge.

Thanks for William Katsak <wkatsak cs rutgers edu> for testing and fixing
an issue in my previous patch draft.

MFC after: 2 weeks


# 478e0520 01-Oct-2014 Hiroki Sato <hrs@FreeBSD.org>

Virtualize net.link.vlan.soft_pad.


# 9fd573c3 22-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve transmit sending offload, TSO, algorithm in general.

The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

Reviewed by: adrian, rmacklem
Sponsored by: Mellanox Technologies
MFC after: 1 week


# 3751dddb 19-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically convert to if_inc_counter().


# 1b7fb1d9 18-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

While not too late rename 'ifnet_counter' to 'ift_counter'. One of the
imporant moments that we discussed with Marcel and Anuranjan was that
a converted driver should return false for 'grep ifnet if_driver.c' :)

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 277e067a 18-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

While not too late rename if_get_counter_compat() to if_get_counter_default().
The compat counters will go away, but the function will remain in its place,
and in all places where it is going to be called.

Discussed with: melifaro


# 5d99eb59 17-Sep-2014 Marcelo Araujo <araujo@FreeBSD.org>

Revert r271735. The comment is absolutely correct, we do not support 802.1p priority tagging. I got confused with the packet tagged and packet to be tagged.

Spotted by: glebius


# 397bdf7c 17-Sep-2014 Marcelo Araujo <araujo@FreeBSD.org>

Remove old comment, we already do 802.1q tagging.

Phabric: D797
Reviewed by: kevlo
Approved by: kevlo
Sponsored by: QNAP Systems Inc.


# 6667db31 16-Sep-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Fix if_omcast handling
* Convert if_oerrors to pcpu.

Suggested by: glebius
MFC after: 2 weeks


# 72f31000 13-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Revert r271504. A new patch to solve this issue will be made.

Suggested by: adrian @


# 772b000f 13-Sep-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Switch if_vlan(4) to rmlock.

MFC after: 2 weeks


# 299153b5 13-Sep-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Switch if_vlan(4) to use counter(9) using new
if_get_counter api.


# eb93b77a 13-Sep-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Improve transmit sending offload, TSO, algorithm in general.

The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# bf7dcda3 03-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Clean up unused CSUM_FRAGMENT.

Sponsored by: Nginx, Inc.


# 2d222cb7 03-Aug-2014 Alexander Motin <mav@FreeBSD.org>

Improve locking of multicast addresses in VLAN and LAGG interfaces.

This fixes several scenarios of reproducible panics, cause by races
between multicast address changes and interface destruction.

MFC after: 2 weeks


# bd9fd667 15-Apr-2014 Rick Macklem <rmacklem@FreeBSD.org>

Vlan did not set the value of if_hw_tsomax, so when vlan
was stacked on top of a network interface that set if_hw_tsomax,
tcp_output() would see the default value instead of the value
set by the network interface. This patch modifies vlan so that
it sets if_hw_tsomax to the value of the parent interface.

Reviewed by: glebius
MFC after: 2 weeks


# c3322cb9 28-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 76039bc8 26-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 8b20f6cf 15-Jun-2013 Hiroki Sato <hrs@FreeBSD.org>

Return ENETDOWN when the parent interface is down.

MFC after: 1 week


# 2c5b403e 18-Apr-2013 Oleg Bulyzhin <oleg@FreeBSD.org>

Recover missing arp_ifinit() call.

MFC after: 2 weeks


# da2299c5 27-Nov-2012 Andre Oppermann <andre@FreeBSD.org>

Remove unused and unnecessary CSUM_IP_FRAGS checksumming capability.
Checksumming the IP header of fragments is no different from doing
normal IP headers.

Discussed with: yongari
MFC after: 1 week


# 42a58907 16-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Make the "struct if_clone" opaque to users of the cloning API. Users
now use function calls:

if_clone_simple()
if_clone_advanced()

to initialize a cloner, instead of macros that initialize if_clone
structure.

Discussed with: brooks, bz, 1 year ago


# 9823d527 10-Oct-2012 Kevin Lo <kevlo@FreeBSD.org>

Revert previous commit...

Pointyhat to: kevlo (myself)


# a10cee30 09-Oct-2012 Kevin Lo <kevlo@FreeBSD.org>

Prefer NULL over 0 for pointers


# b90dde2f 21-Aug-2012 John Baldwin <jhb@FreeBSD.org>

Fix a silly grammar bogon.

Submitted by: Stephen McKay


# 28cc4d37 20-Aug-2012 John Baldwin <jhb@FreeBSD.org>

Refine the changes made in r208212 to avoid bogus failures from
if_delmulti() when clearing the configuration for a subinterface when
the parent interface is being detached. The current code was still
triggering an assertion in if_delmulti() due to the parent interface being
partially detached. Fix this by not calling if_delmulti() at all if the
parent interface is being detached. Warn if if_delmulti() fails when the
parent is not being detached (but similar to 208212, still proceed with
tearing down the vlan state).

Tested by: ae@
MFC after: 1 month


# 09fe6320 19-Jun-2012 Navdeep Parhar <np@FreeBSD.org>

- Updated TOE support in the kernel.

- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)


# 7983103a 12-Jan-2012 Robert Watson <rwatson@FreeBSD.org>

Clarify throughout the vlan(4) code the difference between a "tag" (the
802.1q-defined 16-bit VID, CFI, and PCP field in host by order) and a
VLAN ID (VID). Tags go in packets. VIDs identify VLANs.

No functional change is intended, so this should be safe to MFC. Further
cleanup with functional changes will be committed separately (for example,
renaming vlan_tag/vlan_tag_p, which modify the KPI and KBI).

Reviewed by: bz
Sponsored by: ADARA Networks, Inc.
MFC after: 3 days


# 5a39f779 05-Jan-2012 Robert Watson <rwatson@FreeBSD.org>

Refine last comment.

Submitted by: joeld
Sponsored by: ADARA Networks, Inc.
MFC after: 3 days


# 15f6780e 05-Jan-2012 Robert Watson <rwatson@FreeBSD.org>

Add comment to the VLAN code about its integration with VIMAGE: we see what
the code is doing, we recognise the legitimacy of its goal, but we're not
quite sure it's going about it the right way. More pondering is clearly
required.

Sponsored by: ADARA Networks, Inc.
Discussed with: bz
MFC after: 3 days


# 1ad7a257 29-Dec-2011 Pyun YongHyeon <yongari@FreeBSD.org>

Update if_obytes and if_omcast after successful transmit.
While I'm here update if_oerrors if parent interface of vlan is not
up and running. Previously it updated collision counter and it was
confusing to interprete it.

PR: kern/163478
Reviewed by: glebius, jhb
Tested by: Joe Holden < lists <> rewt dot org dot uk >


# d9b1d615 28-Nov-2011 John Baldwin <jhb@FreeBSD.org>

Change the if_vlan driver to use if_transmit for forwarding packets to the
parent interface. This avoids the overhead of queueing a packet to an IFQ
only to immediately dequeue it again.

Suggested by: np
Reviewed by: brooks
MFC after: 1 month


# 4b22573a 11-Nov-2011 Brooks Davis <brooks@FreeBSD.org>

In r191367 the need for if_free_type() was removed and a new member
if_alloctype was used to store the origional interface type. Take
advantage of this change by removing all existing uses of if_free_type()
in favor of if_free().

MFC after: 1 Month


# 6472ac3d 07-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# e4cd31dd 21-Mar-2011 Jeff Roberson <jeff@FreeBSD.org>

- Merge changes to the base system to support OFED. These include
a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND,
and other miscellaneous small features.


# ccf7ba97 22-Nov-2010 Marko Zec <zec@FreeBSD.org>

Allow for vlan(4) ifnets to have overlapping unit numbers if they are
created in separated vnets. As a side-effect of having a separated
if_cloner instance for each vnet, all vlan ifnets created in a vnet
will be automatically destroyed when vnet teardown is initiated.

Disallow SIOCSETVLAN and SIOCGETVLAN ioctls on vlan ifnets which are
associated with physical ifnets residing in parent vnets.

This is an interim vlan-specific solution which will be superseded by a
more generic if_cloner V_irtualization change from p4. For nooptions
VIMAGE builds, this should be a no-op change.

Discussed with: bz
MFC after: 3 days


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 3ba24fde 06-Aug-2010 John Baldwin <jhb@FreeBSD.org>

Adjust the interface type in the link layer socket address for vlan(4)
interfaces to be a vlan (IFT_L2VLAN) rather than an Ethernet interface
(IFT_ETHER). The code already fixed if_type in the ifnet causing some
places to report the interface as a vlan (e.g. arp -a output) and other
places to report the interface as Ethernet (getifaddrs(3)). Now they
should all report IFT_L2VLAN.

Reviewed by: brooks
MFC after: 1 month


# 7c61d493 24-May-2010 Andrew Thompson <thompsa@FreeBSD.org>

MFC r202588

Declare a new EVENTHANDLER called iflladdr_event which signals that the L2
address on an interface has changed. This lets stacked interfaces such as
vlan(4) detect that their lower interface has changed and adjust things in
order to keep working. Previously this situation broke at least vlan(4) and
lagg(4) configurations.

The EVENTHANDLER_INVOKE call was not placed within if_setlladdr() due to the
risk of a loop.

PR: kern/142927
Submitted by: Nikolay Denev

MFC r202611

Do not hold the lock over if_setlladdr() as it calls into the interface driver
init routine.


# af73bcd7 21-May-2010 John Baldwin <jhb@FreeBSD.org>

MFC 208212:
Ignore failures from removing multicast addresses from the parent (trunk)
interface when tearing down a vlan interface.


# 6f359e28 17-May-2010 John Baldwin <jhb@FreeBSD.org>

Ignore failures from removing multicast addresses from the parent (trunk)
interface when tearing down a vlan interface. If a trunk interface is
detached, all of its multicast addresses are removed before the ifnet
departure eventhandlers are invoked. This means that all of the multicast
addresses are removed before the vlan interfaces are removed which causes
the if_delmulti() calls in the vlan teardown to fail.

In the VLAN_ARRAY case, this left vlan interfaces referencing a no longer
valid parent interface. In the !VLAN_ARRAY case, the eventhandler gets
stuck in an infinite loop retrying vlan_unconfig_locked() forever. In
general the callers of vlan_unconfig_locked() do not expect nor handle
failure, so I believe it is safer to ignore the errors and tear down as
much of the vlan state as possible.

Silence from: net@
MFC after: 4 days


# 83867e0a 28-Mar-2010 Ed Maste <emaste@FreeBSD.org>

MFC r205411:

Avoid holding the VLAN_LOCK() over the parent interface SIOCGIFMEDIA
ioctl call, as it may sleep.

Reviewed by: rwatson


# 98323201 22-Mar-2010 Pyun YongHyeon <yongari@FreeBSD.org>

MFC r204156:
Add __FBSDID.


# d8564efd 21-Mar-2010 Ed Maste <emaste@FreeBSD.org>

Avoid holding the VLAN_LOCK() over the parent interface SIOCGIFMEDIA
ioctl call, as it may sleep.

Reviewed by: rwatson


# d5eda01f 18-Mar-2010 Pyun YongHyeon <yongari@FreeBSD.org>

MFC r204149:
Add TSO support on VLANs. Intentionally separated IFCAP_VLAN_HWTSO
from IFCAP_VLAN_HWTAGGING. I think some hardwares may be able to
TSO over VLAN without VLAN hardware tagging.
Driver changes and userland support will follow.


# 8b2d9181 20-Feb-2010 Pyun YongHyeon <yongari@FreeBSD.org>

Add __FBSDID.

Reviewed by: sam


# 9b76d9cb 20-Feb-2010 Pyun YongHyeon <yongari@FreeBSD.org>

Add TSO support on VLANs. Intentionally separated IFCAP_VLAN_HWTSO
from IFCAP_VLAN_HWTAGGING. I think some hardwares may be able to
TSO over VLAN without VLAN hardware tagging.
Driver changes and userland support will follow.

Reviewed by: thompsa


# 6117727b 18-Jan-2010 Andrew Thompson <thompsa@FreeBSD.org>

Do not hold the lock over if_setlladdr() as it calls into the interface driver
init routine.


# ea4ca115 18-Jan-2010 Andrew Thompson <thompsa@FreeBSD.org>

Declare a new EVENTHANDLER called iflladdr_event which signals that the L2
address on an interface has changed. This lets stacked interfaces such as
vlan(4) detect that their lower interface has changed and adjust things in
order to keep working. Previously this situation broke at least vlan(4) and
lagg(4) configurations.

The EVENTHANDLER_INVOKE call was not placed within if_setlladdr() due to the
risk of a loop.

PR: kern/142927
Submitted by: Nikolay Denev


# 02bcb7ec 05-Jan-2010 John Baldwin <jhb@FreeBSD.org>

MFC 201196:
Change vlan interfaces to cope more usefully with the parent interface being
renamed. Previously the vlan interfaces would lose their configuration as if
the parent interface had been physically removed. Now vlan interfaces ignore
rename events.
- Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being
renamed. This flag can be checked in ifnet departure/arrival event
handlers to treat rename events differently.
- Change the ifnet departure event handler in the if_vlan(4) driver to
ignore departure events due to a trunk interface being renamed.


# eee4cfb9 04-Jan-2010 John Baldwin <jhb@FreeBSD.org>

MFC 201351:
Use stricter checking to match possible vlan clones by not allowing extra
garbage characters around or within the tag.


# fb92ad4a 31-Dec-2009 John Baldwin <jhb@FreeBSD.org>

Use stricter checking to match possible vlan clones by not allowing extra
garbage characters around or within the tag.

Reviewed by: brooks
MFC after: 3 days


# a6fffd6c 31-Dec-2009 Brooks Davis <brooks@FreeBSD.org>

The devices that supported EVFILT_NETDEV kqueue filters were removed in
r195175. Remove all definitions, documentation, and usage.

fifo_misc.c:
Remove all kqueue tests as fifo_io.c performs all those that
would have remained.

Reviewed by: rwatson
MFC after: 3 weeks
X-MFC note: don't change vlan_link_state() function signature


# 5428776e 29-Dec-2009 John Baldwin <jhb@FreeBSD.org>

Change vlan interfaces to cope more usefully with the parent interface being
renamed. Previously the vlan interfaces would lose their configuration as if
the parent interface had been physically removed. Now vlan interfaces ignore
rename events.
- Add a new ifnet flag (IFF_RENAMING) that is set while an ifnet is being
renamed. This flag can be checked in ifnet departure/arrival event
handlers to treat rename events differently.
- Change the ifnet departure event handler in the if_vlan(4) driver to
ignore departure events due to a trunk interface being renamed.

Reviewed by: brooks, rwatson
MFC after: 1 week


# 1bdc73d3 08-Sep-2009 Ed Maste <emaste@FreeBSD.org>

Compare pointer with NULL, not 0.


# 3ef94f2b 28-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Merge r196481 from head to stable/8:

Rework global locks for interface list and index management, correcting
several critical bugs, including race conditions and lock order issues:

Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
sxlock. Either can be held to stablize the lists and indexes, but both
are required to write. This allows the list to be held stable in both
network interrupt contexts and sleepable user threads across sleeping
memory allocations or device driver interactions. As before, writes to
the interface list must occur from sleepable contexts.

Reviewed by: bz, julian

Approved by: re (kib)


# 77dfcdc4 23-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Rework global locks for interface list and index management, correcting
several critical bugs, including race conditions and lock order issues:

Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
sxlock. Either can be held to stablize the lists and indexes, but both
are required to write. This allows the list to be held stable in both
network interrupt contexts and sleepable user threads across sleeping
memory allocations or device driver interactions. As before, writes to
the interface list must occur from sleepable contexts.

Reviewed by: bz, julian
MFC after: 3 days


# 530c0060 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# eddfbb76 14-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 5736e6fb 23-Jun-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

After cleaning up rt_tables from vnet.h and cleaning up opt_route.h
a lot of files no longer need route.h either. Garbage collect them.
While here remove now unneeded vnet.h #includes as well.


# 8d8bc018 08-Jun-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.


# c8728236 17-Apr-2009 John Baldwin <jhb@FreeBSD.org>

The vlan code has not required the miibus code since 6.0 when
if_link_state_change() was added and the vlan link-state hook was moved
out of miibus and into net/if.c.

MFC after: 1 month


# 33553d6e 27-Feb-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.


# 96c8ef3a 12-Feb-2009 Maxim Konovalov <maxim@FreeBSD.org>

o In case of the error do not forget to deallocate a cloned device unit.

PR: kern/131642
Submitted by: Dmitrij Tejblum
MFC after: 1 week


# 86c2bc49 12-Feb-2009 Robert Watson <rwatson@FreeBSD.org>

Remove unused ifaddr local variable in ioctl routine.

MFC after: 3 days


# 4b79449e 02-Dec-2008 Bjoern A. Zeeb <bz@FreeBSD.org>

Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation


# aea78d20 22-Nov-2008 Kip Macy <kmacy@FreeBSD.org>

convert calls to IFQ_HANDOFF to if_transmit


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 8b615593 02-Oct-2008 Marko Zec <zec@FreeBSD.org>

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# 22893351 28-Aug-2008 Jack F Vogel <jfv@FreeBSD.org>

Fix to bug kern/126850. Only dispatch event hander if the
interface had a parent (was attached).

Reviewed by: EvilSam
MFC after: 1 week


# 603724d3 17-Aug-2008 Bjoern A. Zeeb <bz@FreeBSD.org>

Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch


# c725524c 14-Jul-2008 Jack F Vogel <jfv@FreeBSD.org>

Add event notification at attach/detach so the NIC
is able to detect it and do hardware filtering.


# 60e87ca8 18-Oct-2007 Andrew Thompson <thompsa@FreeBSD.org>

The bridging output function puts the mbuf directly on the interfaces send
queue so the output network card must support the same tagging mechanism as
how the frame was input (prepended Ethernet header tag or stripped HW mflag).

Now the vlan Ethernet header is _always_ stripped in ether_input and the mbuf
flagged, only only network cards with VLAN_HWTAGGING enabled would properly
re-tag any outgoing vlan frames.

If the outgoing interface does not support hardware tagging then readd the vlan
header to the front of the frame. Move the common vlan encapsulation in to
ether_vlanencap().

Reported by: Erik Osterholm, Jon Otterholm
MFC after: 1 week


# 0b4e4d87 19-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Now <net/if_arp.h> is unused here.


# 65239942 19-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Fix a nameless constant: 6 -> ETHER_ADDR_LEN

Tested with: md5(1)


# 13cf779d 19-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Now that this driver uses ether_ioctl(), it no longer needs
the INET related include files.


# f84b2d69 15-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Remove a spurious blank line at the start of vlan_growhash().
Add a diagnostic message to the function about resizing vlan
hash table.


# 9c6dee24 14-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Let vlan_ioctl() pass some work on to ether_ioctl()
and so reduce code duplication a bit.


# 25c0f7b3 11-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Emit load and unload messages under bootverbose.
This can help to spot bugs (which it did for me,)
and let people know which mode the vlan module is
actually using if they suspect it isn't picking its
options from the main kernel config file.


# c0cb022b 11-Mar-2007 Yaroslav Tykhiy <ytykhiy@gmail.com>

Fix some minor issues in the internal vlan lists:

- ifv_list member of struct ifvlan is unneeded in array mode,
it's used only in hash mode to resolve hash collisions.

- We don't need the list of trunks at all. (The initial reason for
having it was to be able to destroy all trunks in the MOD_UNLOAD
handler, but a trunk is not to be destroyed forcibly -- it will
go away when all vlan interfaces on it have been deleted.
Note that if_clone_detach() called first of all under MOD_UNLOAD
will delete all vlan interfaces and thus make all trunks go away
quietly.)

- It's enough to use a single [S]LIST_FIRST() in a typical list
destruction loop.


# 2dc879b3 30-Dec-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

- Don't defer the removal of an 802.1q header for no real reason.
- Micro-optimize the addition of an 802.1q header to match the removal code.
- Consistently check for interfaces being up and running.
- Consistently use NULL instead of 0 with pointers.


# aad0be7a 11-Oct-2006 Gleb Smirnoff <glebius@FreeBSD.org>

- Update the baudrate every time the parent changes its link state.
- Rearrange the curly braces so that this piece of code is more
readable.


# 78ba57b9 17-Sep-2006 Andre Oppermann <andre@FreeBSD.org>

Move ethernet VLAN tags from mtags to its own mbuf packet header field
m_pkthdr.ether_vlan. The presence of the M_VLANTAG flag on the mbuf
signifies the presence and validity of its content.

Drivers that support hardware VLAN tag stripping fill in the received
VLAN tag (containing both vlan and priority information) into the
ether_vtag mbuf packet header field:

m->m_pkthdr.ether_vtag = vlan_id; /* ntohs()? */
m->m_flags |= M_VLANTAG;

to mark the packet m with the specified VLAN tag.

On output the driver should check the mbuf for the M_VLANTAG flag to
see if a VLAN tag is present and valid:

if (m->m_flags & M_VLANTAG) {
... = m->m_pkthdr.ether_vtag; /* htons()? */
... pass tag to hardware ...
}

VLAN tags are stored in host byte order. Byte swapping may be necessary.

(Note: This driver conversion was mechanic and did not add or remove any
byte swapping in the drivers.)

Remove zone_mtag_vlan UMA zone and MTAG_VLAN definition. No more tag
memory allocation have to be done.

Reviewed by: thompsa, yar
Sponsored by: TCP/IP Optimization Fundraise 2005


# ad387028 25-Aug-2006 Andrew Thompson <thompsa@FreeBSD.org>

Fix spelling.


# aabf9940 15-Aug-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

This XXX remark was rendered false by rev. 103, which made the
VLAN_ARRAY case subject to rw locking, too.


# 73f2233d 15-Aug-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Make it a tad easier to base other encapsulation schemes on this driver
by restoring the ifv_proto field in the vlan softc and putting it to use
this time. It's a good companion for ifv_encaplen, which has already been
used throughout this driver.


# 2ada9747 15-Aug-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Set IFF_DRV_RUNNING on vlan(4) once in vlan_config(),
not at many places after each call to vlan_config().
This is consistent with IFF_DRV_RUNNING being unset
in vlan_unconfig().


# f6e5e0ad 11-Aug-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Optionally pad outgoing frames to the minimum of 60 bytes (excl. FCS)
before tagging them. This can help to work around brain-damage in some
switches that fail to pad a frame after untagging it if its length drops
below the minimum. This option is blessed by IEEE Std 802.1Q (2003 Ed.),
paragraph C.4.4.3.b. It's controlled by sysctl net.link.vlan.soft_pad.

Idea by: az
MFC after: 1 week


# 60c60618 03-Aug-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Should vlan_input() ever be called with ifp pointing to a non-Ethernet
interface, do not just assign -1 to tag because it breaks the logic of
the code to follow. The better way is to handle this case as an unsupported
protocol and return unless INVARIANTS is in effect and we can panic.
Panic is good there because the scenario can happen only because of a
coding error elsewhere.

We also should show the interface name in the panic message for easier
debugging of the problem, should it ever emerge.

Submitted by: qingli (initially)


# db8b5973 03-Aug-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Back out rev. 1.107 because it introduced as many problems
as it tried to solve:

- it smuggled hidden 802.1q details into otherwise protocol-neutral code;
- it put an important code consistency check under DEBUG, which was never
defined by anyone but a developer hacking this file for the moment;
- lastly, the former bcopy() call had been correct as long as the "dead"
code was there.

(A new version of the fix for tag of -1 to come in the next commit.)

Agreed by: qingli


# 0d024885 01-Aug-2006 Qing Li <qingli@FreeBSD.org>

In vlan_input(), if the network interface does not perform h/w based
vlan tag processing, the code will use bcopy() to remove the vlan
tag field but the code copies 2 bytes too many, which essentially
overwrites the protocol type field.

Also, a tag value of -1 is generated for unrecognized interface type,
which would cause an invalid memory access in the vlans[] array.

In addition, removed a line of dead code and its associated comments.

Reviewed by: sam


# 6b7330e2 09-Jul-2006 Sam Leffler <sam@FreeBSD.org>

Revise network interface cloning to take an optional opaque
parameter that can specify configuration parameters:
o rev cloner api's to add optional parameter block
o add SIOCCREATE2 that accepts parameter data
o rev vlan support to use new api (maintain old code)

Reviewed by: arch@


# 249f4297 29-Jun-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Detach the interface first, do vlan_unconfig() then.
Previously, another thread could get a pointer to the
interface by scanning the system-wide list and sleep
on the global vlan mutex held by vlan_unconfig().
The interface was gone by the time the other thread
woke up.

In order to be able to call vlan_unconfig() on a detached
interface, remove the purely cosmetic bzero'ing of IF_LLADDR
from the function because a detached interface has no addresses.

Noticed by: a stress-testing script by maxim
Reviewed by: glebius


# 114c608c 29-Jun-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Remove a few unused things.
Fix some style and consistency points.


# 15ed2fa1 21-Jun-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Fix the VLAN_ARRAY case, mostly regarding improper use of atomic(9)
in place of conventional rw locking. Alas, atomic(9) can't buy us
lockless operation so easily.


# 5cb8c31a 21-Jun-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Track interface department events and detach vlans from
departing trunk so that we don't get into trouble later
by dereferencing a stale pointer to dead trunk's things.

Prodded by: oleg
Sponsored by: RiNet (Cronyx Plus LLC)
MFC after: 1 week


# ceec92fe 09-Mar-2006 Ruslan Ermilov <ru@FreeBSD.org>

Don't acquire a lock before calling vlan_unconfig().
This fixes a panic when doing "ifconfig ... -vlandev".

OK'ed by: glebius


# 33499e2a 24-Feb-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Don't to forget to unlock the rwlock on trunk before destroying it.
This should fix panic on "kldunload if_vlan" while vlanX are still there.

Reviewed by: glebius


# 11edc477 10-Feb-2006 Ed Maste <emaste@FreeBSD.org>

Bump the MODULE_VERSION for HEAD, as the vlan(4) API is different in
RELENG_6, and would require a lower version number.

Requested by: glebius
Approved by: rwatson (mentor)


# 802dadcf 10-Feb-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Avoid frobbing IFF_UP at any cost (which is close to
zero in this case.) A kernel driver has IFF_DRV_RUNNING
at its full disposal while IFF_UP may be toggled only by
humans or their daemonic deputies from the userland.

MFC after: 3 days


# 7f8b9934 09-Feb-2006 Ed Maste <emaste@FreeBSD.org>

Add a MODULE_VERSION so that other modules (perhaps third-party) can
depend on this one.

Approved by: rwatson (mentor)


# 05a2398f 02-Feb-2006 Gleb Smirnoff <glebius@FreeBSD.org>

In vlan_config() first call vlan_inithash(), then lock mutex, because
vlan_inithash() calls malloc(M_WAITOK).


# 64a17d2e 31-Jan-2006 Yaroslav Tykhiy <ytykhiy@gmail.com>

Set IFF_BROADCAST and IFF_MULTICAST on vlan interfaces from the
beginning and simply refuse to attach to a parent without either
flag.

Our network stack cannot handle well IFF_BROADCAST or IFF_MULTICAST
on an interface changing on the fly. E.g., IP will or won't assign
a broadcast address to an interface and join the all-hosts multicast
group on it depending on its IFF_BROADCAST and IFF_MULTICAST settings.
Should the flags alter later, IP will miss the change and keep using
bogus settings. This can lead to evil things like supplying an
invalid broadcast address or trying to leave a multicast group that
hasn't been joined. So just avoid touching the flags since an
interface was created. This has no practical purpose.

Discussed with: -net, glebius, oleg
MFC after: 1 week


# 75ee267c 30-Jan-2006 Gleb Smirnoff <glebius@FreeBSD.org>

Merge the //depot/user/yar/vlan branch into CVS. It contains some collective
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.

The most important changes:

o Instead of global linked list of all vlan softc use a per-trunk
hash. The size of hash is dynamically adjusted, depending on
number of entries. This changes struct ifnet, replacing counter
of vlans with a pointer to trunk structure. This change is an
improvement for setups with big number of VLANs, several interfaces
and several CPUs. It is a small regression for a setup with a single
VLAN interface.
An alternative to dynamic hash is a per-trunk static array with
4096 entries, which is a compile time option - VLAN_ARRAY. In my
experiments the array is not an improvement, probably because such
a big trunk structure doesn't fit into CPU cache.
o Introduce an UMA zone for VLAN tags. Since drivers depend on it,
the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
This change is a big improvement for any setup utilizing vlan(4).
o Use rwlock(9) instead of mutex(9) for locking. We are the first
ones to do this! :)
o Some drivers can do hardware VLAN tagging + hardware checksum
offloading. Add an infrastructure for this. Whenever vlan(4) is
attached to a parent or parent configuration is changed, the flags
on vlan(4) interface are updated.

In collaboration with: yar, thompsa
In collaboration with: Mihail Balikov <mihail.balikov interbgc.com>


# 62f0bf32 27-Nov-2005 Gleb Smirnoff <glebius@FreeBSD.org>

Take if_baudrate from the parent. This fixes problem with SNMP
daemons reporting zero speed for vlan(4) interfaces.


# 4a0d6638 11-Nov-2005 Ruslan Ermilov <ru@FreeBSD.org>

- Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
through ifp anyway. IF_LLADDR() works faster, and all (except
one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
and drop the IFP2ENADDR() macro; all users have been converted
to use IF_LLADDR() instead.


# d09ed26f 11-Nov-2005 Ruslan Ermilov <ru@FreeBSD.org>

- Make IFP2ENADDR() a pointer to IF_LLADDR() rather than another
copy of Ethernet address.

- Change iso88025_ifattach() and fddi_ifattach() to accept MAC
address as an argument, similar to ether_ifattach(), to make
this work.


# 4e7e0183 08-Nov-2005 Andrew Thompson <thompsa@FreeBSD.org>

Move the cloned interface list management in to if_clone. For some drivers the
softc lists and associated mutex are now unused so these have been removed.

Calling if_clone_detach() will now destroy all the cloned interfaces for the
driver and in most cases is all thats needed to unload.

Idea by: brooks
Reviewed by: brooks


# 6d3a3ab7 06-Nov-2005 Gleb Smirnoff <glebius@FreeBSD.org>

- Do not raise IFF_DRV_OACTIVE flag in vlan_start, because this
can lead to stalled interface
- Explain this fact in a comment.

Reviewed by: rwatson, thompsa, yar


# 1cf236fb 02-Oct-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

Improve handling flags that must be propagated
to the parent interface, such as IFF_PROMISC and
IFF_ALLMULTI. In addition, vlan(4) gains ability
to migrate from one parent to another w/o losing
its own flags.

PR: kern/81978
MFC after: 2 weeks


# 83908c65 16-Sep-2005 Ruslan Ermilov <ru@FreeBSD.org>

The arguments to printf() were swapped.


# ffdd61c3 15-Sep-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

Do assorted nitpicking in diagnostics while I'm here:
- Use __func__ consistently instead of copying function name
to message strings. Code tends to migrate around source files.
- DIAGNOSTIC is for information, INVARIANTS is for panics.


# 14e98256 16-Sep-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

It's nice to have relevant comments both in if {} and else {},
not in just one of them.


# f4ec4126 16-Sep-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

Test the new M_VLANTAG packet flag before calling
m_tag_locate(). This adds little overhead of a simple
bitwise operation in case hardware VLAN acceleration
is on, yet saves the more expensive function call if
the acceleration is off.

Reviewed by: ru, glebius
X-MFC-after: 6.0


# eefbcf0e 31-Aug-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

Use VLAN_TAG_VALUE() not only to read a dot1q tag
value from an m_tag, but also to set it. This reduces
complex code duplication and improves its readability.

Alas, we shouldn't rename the macro to VLAN_TAG_LVALUE()
globally because that would cause pain for kernel module
port maintainers and vendors using FreeBSD as their codebase.
Added a clarifying comment instead.

Discussed with: ru, glebius
X-MFC-After: 6.0-RELEASE (MFC is good just to reduce the diff)


# ba26134b 30-Aug-2005 Gleb Smirnoff <glebius@FreeBSD.org>

Fix fallout from revision 1.77, mark outgoing packets with M_VLANTAG flag.

PR: kern/80646
Reviewed by: yar
MFC after: 3 days


# f3447eb4 15-Aug-2005 Brooks Davis <brooks@FreeBSD.org>

Vlan interfaces change their type after ether_ifattach() so we needs to
use if_free_type(ifp, IFT_ETHER) to delete them and stop leaking struct
arpcoms.

Reported by: thompsa
MFC After: 3 days


# 13f4c340 09-Aug-2005 Robert Watson <rwatson@FreeBSD.org>

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days


# fc74a9f9 10-Jun-2005 Brooks Davis <brooks@FreeBSD.org>

Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.

Reviewed by: sobomax, sam


# 984be3ef 19-Apr-2005 Gleb Smirnoff <glebius@FreeBSD.org>

- Call if_link_state_change() for each vlan, when link changes
on parent.
- Remove route.h include.
- Fix comment about MII.

Sponsored by: Rambler
Reviewed by: yar


# 6ee20ab5 18-Feb-2005 Ruslan Ermilov <ru@FreeBSD.org>

Allocate the M_VLANTAG m_pkthdr flag, and use it to indicate that
a packet has VLAN mbuf tag attached. This is faster to check than
m_tag_locate(), and allows us to use the tags in non-vlan(4) VLAN
producers.

The first argument to VLAN_OUTPUT_TAG() is now unused but retained
for backward compatibility.

While here, embellish a fix in rev. 1.174 of if_ethersubr.c -- it
now checks for packets with VLAN (mbuf) tags, and it should now
be possible to bridge(4) on vlan(4)'s whose parent interfaces
support VLAN decapsulation in hardware.

Reviewed by: sam


# cab574d8 24-Jan-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

Fix spelling in a comment.


# c6e6ca3e 23-Jan-2005 Yaroslav Tykhiy <ytykhiy@gmail.com>

Reduce the global name space pollution.
The cloner structure isn't referenced by name outside this file.


# c398230b 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for license, minor formatting changes


# ad3b9257 15-Aug-2004 John-Mark Gurney <jmg@FreeBSD.org>

Add locking to the kqueue subsystem. This also makes the kqueue subsystem
a more complete subsystem, and removes the knowlege of how things are
implemented from the drivers. Include locking around filter ops, so a
module like aio will know when not to be unloaded if there are outstanding
knotes using it's filter ops.

Currently, it uses the MTX_DUPOK even though it is not always safe to
aquire duplicate locks. Witness currently doesn't support the ability
to discover if a dup lock is ok (in some cases).

Reviewed by: green, rwatson (both earlier versions)


# d6fcfb7a 26-Jul-2004 Yaroslav Tykhiy <ytykhiy@gmail.com>

Stop tinkering with the parent's VLAN_MTU capability.
Now it is user-controlled through ifconfig(8).

The former ``automagic'' way of operation created more
trouble than good. First, VLAN_MTU consumers other than
vlan(4) had appeared, e.g., ng_vlan(4). Second, there was
no way to disable VLAN_MTU manually if it were causing
trouble, e.g., data corruption.

Dropping the ``automagic'' should be completely invisible
to the user since
a) all the drivers supporting VLAN_MTU
have it enabled by default, and in the first place
b) there is only one driver that can really toggle VLAN_MTU
in the hardware under its control (it's fxp(4), to which
I added VLAN_MTU controls to illustrate the principle.)


# b4e9f837 22-Jul-2004 Brooks Davis <brooks@FreeBSD.org>

Actually free the unit when destroying the interface.

Reported by: la at delfi.lt
Tested by: la at delfi.lt
PR: 68618


# 3e019dea 15-Jul-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".


# 29c2dfbe 04-Jul-2004 Bruce M Simpson <bms@FreeBSD.org>

Workaround a locking problem in vlan(4). vlan_setmulti() may be called
with sleepable locks held from further up in the network stack, and
attempts to allocate memory to hold multicast group membership information
with M_WAITOK.

This panic was triggered specifically when an exiting routing daemon
process closes its raw sockets after joining multicast groups on them.

While we're here, comment some possible locking badness.

PR: kern/48560


# 15a66c21 04-Jul-2004 Bruce M Simpson <bms@FreeBSD.org>

style(9)/whitespace cleanup while I'm in this file.


# b46f884b 23-Jun-2004 Joerg Wunsch <joerg@FreeBSD.org>

Add a couple of #ifdef DEBUG printf()s in vlan_input() I found to be
useful when debugging the ether_demux() problem (when bridging over
VLANs).


# f889d2ef 22-Jun-2004 Brooks Davis <brooks@FreeBSD.org>

Major overhaul of pseudo-interface cloning. Highlights include:

- Split the code out into if_clone.[ch].
- Locked struct if_clone. [1]
- Add a per-cloner match function rather then simply matching names of
the form <name><unit> and <name>.
- Use the match function to allow creation of <interface>.<tag>
vlan interfaces. The old way is preserved unchanged!
- Also the match function to allow creation of stf(4) interfaces named
stf0, stf, or 6to4. This is the only major user visible change in
that "ifconfig stf" creates the interface stf rather then stf0 and
does not print "stf0" to stdout.
- Allow destroy functions to fail so they can refuse to delete
interfaces. Currently, we forbid the deletion of interfaces which
were created in the init function, particularly lo0, pflog0, and
pfsync0. In the case of lo0 this was a panic implementation so it
does not count as a user visiable change. :-)
- Since most interfaces do not need the new functionality, an family of
wrapper functions, ifc_simple_*(), were created to wrap old style
cloner functions.
- The IF_CLONE_INITIALIZER macro is replaced with a new incompatible
IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE
instead.

Submitted by: Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by: andre, mlaier
Discussed on: net


# affc907d 15-Jun-2004 Max Laier <mlaier@FreeBSD.org>

Replace IF_HANDOFF with new IFQ_HANDOFF to enqueue with ALTQ once enabled on
the respective drivers.


# 6cbd3e99 26-May-2004 Yaroslav Tykhiy <ytykhiy@gmail.com>

if_printf() won't emit a newline unless told to.


# 656acce4 25-May-2004 Yaroslav Tykhiy <ytykhiy@gmail.com>

After all the relevant drivers have been fixed, fix vlan(4) itself
WRT manipulating capabilities of the parent interface:

- use ioctl(SIOCSIFCAP) to toggle VLAN_MTU (the way that was done
before was just wrong);

- use the right order of conditional clauses to set the MTU fudge
(that is logically independent from toggling VLAN_MTU.)


# b08347a0 23-May-2004 Yaroslav Tykhiy <ytykhiy@gmail.com>

Consult parent's if_capenable for active VLAN-related capabilities.
This change is possible since all the relevant drivers have been
fixed to set if_capenable properly. The field if_capabilities tracks
supported capabilities, which may be disabled administratively.

Inheriting checksum offload support from the parent interface isn't
that easy because the checksumming capabilities of the parent may be
toggled on the fly. Disable the code for now.


# d35bcd3b 21-May-2004 Ruslan Ermilov <ru@FreeBSD.org>

Added dependency on the miibus module.


# e6d95d51 03-May-2004 Scott Long <scottl@FreeBSD.org>

Add route.h to pick up the rt_ifmsg() declaration.


# 127d7b2d 03-May-2004 Andre Oppermann <andre@FreeBSD.org>

Link state change notification of ethernet media to the routing socket.

o Extend the if_data structure with an ifi_link_state field and
provide the corresponding defines for the valid states.

o The mii_linkchg() callback updates the ifi_link_state field
and calls rt_ifmsg() to notify listeners on the routing socket
in addition to the kqueue KNOTE.

o If vlans are configured on a physical interface notify and update
all vlan pseudo devices as well with the vlan_link_state() callback.

No objections by: sam, wpaul, ru, bms
Brucification by: bde


# 3fefbff0 24-Apr-2004 Luigi Rizzo <luigi@FreeBSD.org>

arpcom untangling:

consistently with the rest of the code, use IFP2AC(ifp) to access
the arpcom structure given the ifp.

In this case also fix a difference in assumptions WRT the rest of
the net/ sources: it is not the 'struct *softc' that starts with a
'struct arpcom', but a 'struct arpcom' that starts with a
'struct ifnet'


# 5016de40 02-Jan-2004 Sam Leffler <sam@FreeBSD.org>

backout the switch to use a zone for vlan tags; this requires
vlans be present if any driver with h/w vlan tagging is configured


# dfffd5d4 02-Jan-2004 Sam Leffler <sam@FreeBSD.org>

switch vlan packet tag allocation to use a private zone


# 26f9b263 11-Nov-2003 Ruslan Ermilov <ru@FreeBSD.org>

- vlan_start(): Increment the correct interface statistics member.

Reviewed by: mdodd

- vlan_input(): Macroize the VLAN tag extraction from mbuf.


# 9bf40ede 31-Oct-2003 Brooks Davis <brooks@FreeBSD.org>

Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)


# 5e17543a 28-Oct-2003 Brooks Davis <brooks@FreeBSD.org>

Use VLANNAME instead of "vlan".


# 4faedfe8 05-Sep-2003 Sam Leffler <sam@FreeBSD.org>

Add locking. We use a single lock to guard the global vlan list and also
to protect the vlan state in each ifnet (e.g. vlan count). The latter is
probably better handled through an ifnet-centric means but since changes
are infrequent shouldn't matter for now.

Sponsored by: FreeBSD Foundation


# fb88a3e0 08-Jul-2003 Bill Paul <wpaul@FreeBSD.org>

- In vlan_input(), always mask off all but the VLID bits from tags
extracted from received frames, both in the IFCAP_VLAN_HWTAGGING case
and not. (Some drivers may already do this masking internally, but
doing it here doesn't hurt and insures consistency.)

- In vlan_ioctl(), don't let the user set a VLAN ID value with anything
besides the VLID bits set, otherwise we will have trouble matching
an interface in vlan_input() later.

PR: kern/46405


# 79a58745 05-Jul-2003 Bill Paul <wpaul@FreeBSD.org>

Testing VLANs with the new 8139C+ chip (which does hardware tag
insertion and extraction) has revealed two bugs:

- In vlan_start(), we're supposed to check the underlying interface to
see if it has the IFCAP_VLAN_HWTAGGING cabability set and, if so, set
things up for the VLAN_OUTPUT_TAG() routine. However the code checks
ifp->if_capabilities, which is the vlan pseudo-interface's capabilities
when it should be checking p->if_capabilities, which relates to the
underlying physical interface. Change ifp->if_capabilities to
p->if_capabilities so this works.

- In vlan_input(), we have to extract the 16-bit tag value from the
received frame and use it to figure out which vlan interface gets
the frame. The code that we use to track down the desired vlan
pseudo-interface is:

for (ifv = LIST_FIRST(&ifv_list); ifv != NULL;
ifv = LIST_NEXT(ifv, ifv_list))
if (ifp == ifv->ifv_p && tag == ifv->ifv_tag)
break;

The problem is that 'tag' is not computed consistently. In the case
where the interface supports hardware VLAN tag extraction and calls
VLAN_INPUT_TAG(), we do this:

tag = *(u_int*)(mtag+1);

But in the software emulation case, we do this

tag = EVL_VLANOFTAG(ntohs(evl->evl_tag));

The problem here is the EVL_VLANOFTAG() macro is only ever applied
in this one case. It's never applied to ifv->ifv_tag or anwhere else.
We must be consistent: either it's applied everywhere or nowhere.
To see how this can be a problem, do something like
ifconfig vlan0 vlan 12345 vlandev foo0 and observe the results.

I'm not quite sure what the right thing is to do here. Neither the
vlan(4) nor ifconfig(8) man pages suggest which way to go. For now,
I've removed this use of EVL_VLANOFTAG() so that the tag will match
correctly in all cases. I will not get upset if somebody makes a
compelling argument for using EVL_VLANOFTAG() everywhere instead,
as long as the use is consistent.


# 4a692a1f 12-Mar-2003 Sam Leffler <sam@FreeBSD.org>

correct two more flag misuses; m_tag* use malloc flags


# 797f247b 02-Mar-2003 Matthew N. Dodd <mdodd@FreeBSD.org>

sizeof(struct llc) -> LLC_SNAPFRAMELEN
sizeof(struct ether_header) -> ETHER_HDR_LEN
sizeof(struct fddi_header) -> FDDI_HDR_LEN


# a163d034 18-Feb-2003 Warner Losh <imp@FreeBSD.org>

Back out M_* changes, per decision of the TRB.

Approved by: trb


# b3cca108 22-Jan-2003 Bill Fenner <fenner@FreeBSD.org>

Implement SIOCGIFMEDIA for vlan devices by passing the request to the
parent device, if there is a parent configured. Modify the result
returned by the parent to indicate that the only supported media
is the currently configured one.

Reviewed by: brooks


# 44956c98 21-Jan-2003 Alfred Perlstein <alfred@FreeBSD.org>

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# a3814acf 14-Nov-2002 Sam Leffler <sam@FreeBSD.org>

o eliminate separate callback interface for h/w tagged input packets; instead
drivers "tag packets" with an m_tag and the input packet handling recognizes
such packets and does the right thing
o track the number of active vlans on an interface; this lets lots of places
only do vlan-specific processing when needed
o track changes to ether_ifdetach/ether_ifattach
o track bpf changes
o eliminate the use of M_PROTO1 for communicating to drivers about tagged
packets
o eliminate the use of IFF_LINK0 for drivers communicating to the vlan code
that they support h/w tagging; replaced by explicit interface capabilities
o add ifnet capabilities for h/w tagging and support of "large mtu's"
o use new interface capabilities to auto-configure use of large mtu's and h/w
tagging
o add support for proper handling of promiscuous mode
o document driver/vlan communication conventions

Reviewed by: many
Approved by: re


# a9283398 07-Nov-2002 John Baldwin <jhb@FreeBSD.org>

Add a cast to quiet a warning.


# 28a1a7c6 20-Oct-2002 Brooks Davis <brooks@FreeBSD.org>

Use if_printf(ifp, "blah") instead of printf("vlan%d: blah", ifp->if_unit).


# ae5a19be 25-May-2002 Brooks Davis <brooks@FreeBSD.org>

Move all unit number management cloned interfaces into the cloning
code. The reverts the API change which made the <if>_clone_destory()
functions return an int instead of void bringing us into closer
alignment with NetBSD.

Reviewed by: net (a long time ago)


# 7d3e4c6e 03-Apr-2002 Luigi Rizzo <luigi@FreeBSD.org>

Fix a couple of incorrect m_free() vs. m_freem() usages and related issues.

Reviewed-by: brooks


# 3b16e7b2 11-Mar-2002 Maxime Henrion <mux@FreeBSD.org>

Simplify the interface cloning framework by handling unit
unit allocation with a bitmap in the generic layer. This
allows us to get rid of the duplicated rman code in every
clonable interface.

Reviewed by: brooks
Approved by: phk


# b75496fe 04-Mar-2002 Brooks Davis <brooks@FreeBSD.org>

Change the network interface cloning API so the destroy function returns
an int errorcode instead of void in preperation for merging cloning of
the loopback device.

Submitted by: mux
MFC after: 2 weeks


# 7a46ec8f 25-Feb-2002 Brooks Davis <brooks@FreeBSD.org>

When using hardware decoding, reconstruct the wire form of the ethernet
header and push it up any attached bpf devices on the parent interface.
This makes hardware vlan decoding more like the normal software path.

Tested by: cjtt@employees.org
MFC after: 2 weeks


# 31689d25 21-Nov-2001 Andrew R. Reiter <arr@FreeBSD.org>

- Utilize the great M_ZERO flag rather than allocating memory then do
a call to memset.


# 211f625a 15-Oct-2001 Bill Fenner <fenner@FreeBSD.org>

Set the interface speed back to zero, after ether_ifattach() set it
to 10Mbps. RFC 2863 says: "For a sub-layer which has no concept
of bandwidth, [ifSpeed] should be zero."


# 322dcb8d 14-Oct-2001 Max Khon <fjoe@FreeBSD.org>

bring in ARP support for variable length link level addresses

Reviewed by: jdp
Approved by: jdp
Obtained from: NetBSD
MFC after: 6 weeks


# 242c766b 05-Oct-2001 Bill Fenner <fenner@FreeBSD.org>

- Fix typo in "didn't find tag in list" code -- != should have been ==.
This fixes the panic when receiving a packet with an unknown tag, and
also allows reception of packets with known tags.
- Allow overlapping tag number spaces when using multiple hardware-assisted
VLAN parent devices (by comparing the parent interface in
vlan_input_tag() just as in vlan_input() ).
- fix typo in comment

MFC after: 1 week


# f9132ceb 05-Sep-2001 Jonathan Lemon <jlemon@FreeBSD.org>

Wrap array accesses in macros, which also happen to be lvalues:

ifnet_addrs[i - 1] -> ifaddr_byindex(i)
ifindex2ifnet[i] -> ifnet_byindex(i)

This is intended to ease the conversion to SMPng.


# 9d4fe4b2 05-Sep-2001 Brooks Davis <brooks@FreeBSD.org>

Make vlan(4) loadable, unloadable, and clonable. As a side effect,
interfaces must now always enable VLAN support.

Reviewed by: jlemon
MFC after: 3 weeks


# 1b2a4f7a 24-Jul-2001 Bill Fenner <fenner@FreeBSD.org>

Eliminate the panic, reported by Daniel Sobral, which occurs when
vlan_unconfig()-ing an interface on which multicast groups have been
joined. Instead, keep the list of groups around (and, in fact, allow
changing of the membership list) and re-join them when the vlan interface
is reassociated with a lower level interface.


# d2a75853 23-Jul-2001 Bill Fenner <fenner@FreeBSD.org>

Use the IANA assignment IFT_L2VLAN directly instead of indirecting through
a privately #defined IFT_8021_VLAN.

MFC after: 3 days


# 65f1bb65 15-Jun-2001 Peter Wemm <peter@FreeBSD.org>

Fix warning. s/char/unsigned char/ in "(char *)eth"
294: warning: ethernet address is not type unsigned char *


# 26e30963 02-May-2001 Bill Fenner <fenner@FreeBSD.org>

Get IP multicast working on VLAN devices:

- Allocate zeroed memory in ether_resolvemulti() to prevent equal() from
comparing garbage and determining that two otherwise-equal sockaddr_dls
are different.
- Fill in all required fields of the sockaddr_dl
- Actually copy the multicast address into the sockaddr_dl when calling
if_addmulti()
- Don't claim that we don't have a way to resolve layer 3 addresses into
layer 2 addresses; use the ethernet way.


# 24993214 28-Mar-2001 Yaroslav Tykhiy <ytykhiy@gmail.com>

Fix a number of minor bugs in the VLAN code:

* Initialize the "struct sockaddr_dl sdl" correctly in vlan_setmulti().

PR: kern/22181

* The driver used to call malloc(..., M_NOWAIT), but to not check the
return value. Change malloc(..., M_NOWAIT) to malloc(..., M_WAITOK)
because the corresponding part of code is called from the upper
half of the kernel only.

PR: kern/22181

* Make sure a parent interface is up and running before invoking
its if_start() routine in order to avoid system panic.

PR: kern/22179 kern/24741 i386/25478

* Do not copy all the flags from a parent mindlessly.

PR: kern/22179

* Do not call if_down() on a parent interface if it's already down.
Call if_down() at splimp because if_down() needs that.

PR: kern/22179

Reviewed by: wollman


# befdaf4e 14-Feb-2001 Jeroen Ruigrok van der Werven <asmodai@FreeBSD.org>

Fix another typo I missed on first reading:
insersion -> insertion


# 2d89d40a 14-Feb-2001 Jeroen Ruigrok van der Werven <asmodai@FreeBSD.org>

Fix typo and comma placement.


# 6817526d 06-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Convert if_multiaddrs from LIST to TAILQ so that it can be traversed
backwards in the three drivers which want to do that.

Reviewed by: mikeh


# fc2ffbe6 04-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)


# 22f29826 03-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Use <sys/queue.h> macro api rather than fondle its implementation detals.

Created with: /usr/bin/sed
Reviewed by: /sbin/md5


# 2b120974 31-Jan-2001 Peter Wemm <peter@FreeBSD.org>

Exterminate the use of PSEUDO_SET() with extreme prejudice.


# df5e1987 25-Nov-2000 Jonathan Lemon <jlemon@FreeBSD.org>

Lock down the network interface queues. The queue mutex must be obtained
before adding/removing packets from the queue. Also, the if_obytes and
if_omcasts fields should only be manipulated under protection of the mutex.

IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on
the queue. An IF_LOCK macro is provided, as well as the old (mutex-less)
versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which
needs them, but their use is discouraged.

Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF,
which takes care of locking/enqueue, and also statistics updating/start
if necessary.


# 21b8ebd9 13-Jul-2000 Archie Cobbs <archie@FreeBSD.org>

Make all Ethernet drivers attach using ether_ifattach() and detach using
ether_ifdetach().

The former consolidates the operations of if_attach(), ng_ether_attach(),
and bpfattach(). The latter consolidates the corresponding detach operations.

Reviewed by: julian, freebsd-net


# 2e2de7f2 13-May-2000 Archie Cobbs <archie@FreeBSD.org>

Move code to handle BPF and bridging for incoming Ethernet packets out
of the individual drivers and into the common routine ether_input().
Also, remove the (incomplete) hack for matching ethernet headers
in the ip_fw code.

The good news: net result of 1016 lines removed, and this should make
bridging now work with *all* Ethernet drivers.

The bad news: it's nearly impossible to test every driver, especially
for bridging, and I was unable to get much testing help on the mailing
lists.

Reviewed by: freebsd-net


# 46a32e61 26-Mar-2000 Philippe Charnier <charnier@FreeBSD.org>

Remove duplicate word


# 5326b23c 06-Feb-2000 Matthew N. Dodd <mdodd@FreeBSD.org>

m_pullup() frees the supplied mbuf on failure; we don't need to try
and do this ourselves.

Approved by: jkh
Noticed by: Mike Spengler <mks@networkcs.com>


# 4af90a4d 03-Feb-2000 Matthew N. Dodd <mdodd@FreeBSD.org>

Make sure that the entire header is in the first mbuf before we
attempt to copy the ethernet header forward and otherwise encapsulate
a packet for output.

This fixes the panic when using VLAN devices on hardware that doesn't
do 802.1Q tagging onboard. (That is to say, all drivers except the Tigon.)

My tests consisted of telnet, ttcp, and a pingflood of packets
between 1 and 1600 (plus headers) bytes.

MFC to follow in 1 week.

Approved by: jkh


# cef8bf09 29-Jan-2000 Peter Wemm <peter@FreeBSD.org>

Remove some #if NFOO > 0 that are always true because of config rules.


# c0230c1b 12-Dec-1999 Jordan K. Hubbard <jkh@FreeBSD.org>

The current code incorrectly assumes that all vlans
are configured, and/or associated with a parent device. If you
receive a frame for a VLAN that's not in the list, you walk off
the end of the list. Boom.

Submitted by: C. Stephen Gunn <csg@waterspout.com>
PR: 15291


# ae290324 12-Dec-1999 Jordan K. Hubbard <jkh@FreeBSD.org>

sys/net/if_vlan.c fails to maintain the IFF_RUNNING flag on the
vlan interfaces it manages. This prevents the interface from
actually sending or receiving data.

Submitted by: C. Stephen Gunn <csg@waterspout.com>
PR: 15290


# 46783fb8 24-Sep-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Remove NBPF conditionality of bpf calls in most of our network drivers.

This means that we will not have to have a bpf and a non-bpf version
of our driver modules.

This does not open any security hole, because the bpf core isn't loadable

The drivers left unchanged are the "cross platform" drivers where the respective
maintainers are urged to DTRT, whatever that may be.

Add a couple of missing FreeBSD tags.


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 6b5ca0d8 06-Jul-1999 Dag-Erling Smørgrav <des@FreeBSD.org>

Rename bpfilter to bpf.


# 4a408dcb 07-Apr-1999 Bill Paul <wpaul@FreeBSD.org>

Add missing SYSCTL_DECL(_net_link); required by newer sysctl implementation.

Noticed by: Matthew Dodd <winter@jurai.net>


# 97ed1257 14-Mar-1999 Bill Paul <wpaul@FreeBSD.org>

Grrr... botched remote commit. Let's try this again: vlan updates,
take two.


# f731f104 14-Mar-1999 Bill Paul <wpaul@FreeBSD.org>

Updates for vlan stuff:

- add support for devices that do vlan tag insertion/deletion in firmware
- add multicast support
- add vlan_unconfig() to complement vlan_config()
- update ifconfig(8) to configure vlan interfaces (vlan tag and
parent device)

Also fix a small bug in ifconfig; sometimes sa_family is overwritten
by ioctls.

Reviewed by: wollman


# 2127f260 04-Dec-1998 Archie Cobbs <archie@FreeBSD.org>

Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Mike Spengler <mks@networkcs.com>


# cfe8b629 22-Aug-1998 Garrett Wollman <wollman@FreeBSD.org>

Yow! Completely change the way socket options are handled, eliminating
another specialized mbuf type in the process. Also clean up some
of the cruft surrounding IPFW, multicast routing, RSVP, and other
ill-explored corners.


# eb92a347 15-May-1998 Garrett Wollman <wollman@FreeBSD.org>

Fix an obvious parameter-order bogon. (Don't know what happened to
the warning message before.)


# 2cc2df49 17-Mar-1998 Garrett Wollman <wollman@FreeBSD.org>

Add preliminary support for IEEE 802.1Q VLAN tagging. It doesn't actually
work reliably yet (I've had panics), but it does seem to occasionally
be able to transmit and receive syntactically-correct packets.
Also fixes one of if_ethersubr.c's legion style bugs, and removes
the hostcache code from standard kernels---the code that depends on it
is not going to happen any time soon, I'm afraid.