History log of /freebsd-current/sys/netpfil/pf/if_pfsync.c
Revision Date Author Comments
# caccf6d3 24-Mar-2024 Kristof Provost <kp@FreeBSD.org>

pfsync: cope with multiple pending plus messages

It's possible for pfsync to add a plus message when one is already queued.
Append both, rather than overwriting the already pending one.

MFC after: 1 week


# 81debbd6 24-Mar-2024 Kristof Provost <kp@FreeBSD.org>

pfsync: fix use of invalidated stack variable

Calls to pfsync_send_plus() pass pointers to stack variables.
If pfsync_sendout() then fails it retains the pointer to these stack
variables, accesing them later.

Allocate a buffer and copy the data instead, so that we can retain the
pointer safely.

Reported by: CI KASAN, markj
MFC after: 1 week


# 50edc630 13-Feb-2024 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pfsync: Fix offset calculation

Even though message version is automatically recognized and the top of
the struct is identical for different versions, when iterating over
multiple messages proper message length must be used. That's the length
of an union member for given version, not of the union itself.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D43862


# 04932601 07-Dec-2023 Kristof Provost <kp@FreeBSD.org>

pf: store state creation/expiration timestamps with milisecond precision

The primary beneficiary is pflow(4), which expects milisecond precision
in timestamps.

Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43112


# bd802636 04-Nov-2023 Mark Johnston <markj@FreeBSD.org>

pfsync: Avoid transmitting uninitialized bytes in pfsync_sendout()

When IPv6 support was added to pfsync, PFSYNC_MINPKT increased such that
we always allocate enough space for either IPv4 or IPv6 headers. IPv6
headers are 20 bytes larger than IPv4 headers. When pfsync_sendout()
does its thing, it ends up allocating enough space for either; thus when
transmitting an IPv4 packet, the last 20 bytes of the buffer are left
uninitialized.

Fix the problem by stashing the length in a local variable and adjusting
it depending on the address family in use.

While here, just zero the entire buffer in one go rather than being
careful to initialize each subheader. This seems simpler and less error
prone.

Reported by: KMSAN
Reviewed by: kp
Fixes: 6fc7fc2dbb2b ("pfsync: transport over IPv6")
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42461


# f415a5c1 08-Sep-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: fix state leak

If we receive a state with a route-to interface name set and we can't
find the interface we do not insert the state. However, in that case we
must still clean up the state (and state keys).
Do so, so we do not leak states.

Reviewed by: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41779


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 77c9e608 13-Jul-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: fix NOINET6 build

While here also fix a few minor style(9) issues.


# 6fc7fc2d 13-Jul-2023 Luiz Amaral <email@luiz.eng.br>

pfsync: transport over IPv6

Implement pfsync over IPv6.

Submitted by: Luiz Amaral <email@luiz.eng.br>
Submitted by: Naman Sood <naman@freebsdfoundation.org>
Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D40102


# 6b4ed16d 12-Jul-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pf: Simplify rule actions logic

Actions applied to a processed packet come in case of stateless
firewalling from a rule or in case of statefull firewalling from a
state. The state obtains the actions from a rule when it is created by a
rule or by pfsync. The logic for deciding if actions come from a rule or
a state is spread across many places in pf.

There already is struct pf_rule_actions in struct pf_pdesc and thus it
can be used as a central place for storing actions and their parameters.
OpenBSD does something similar: they also store the actions in struct
pf_pdesc and have no variables in pf_test() but they use separate
variables instead of a structure. By using struct pf_rule_actions we can
simplify the code even further. Applying of actions is done *only* in
pf_rule_to_actions() no matter if for the legacy scrub rules or for the
normal match / pass rules. The logic of choosing if rule or state
actions are used is applied only once in pf_test() by copying the whole
struct.

Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D41009


# 476f6121 20-Jun-2023 Kristof Provost <kp@FreeBSD.org>

pf: fix build without VIMAGE

Remove the name conflict between the pfsync_defer_tmo variable and
function.

This worked fine in kernels with VIMAGE (the default), but not in those
without.

Reported by: des@
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 6983b986 19-Jun-2023 Kristof Provost <kp@FreeBSD.org>

pf: allow defer timeout to be configured

Add the net.pfsync.defer_delay sysctl to allow the defer timeout (i.e.
how long pf holds onto packets waiting for the peer to ack the new
state) to be changed.

This is intended to make testing of the defer code more robust, by
allowing longer timeouts to mitigate scheduling/measurement jitter.

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 4bf98559 29-May-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pf: make contents of struct pfsync_state configurable

Make struct pfsync_state contents configurable by sending out new
versions of the structure in separate subheader actions. Both old and
new version of struct pfsync_state can be understood, so replication of
states from a system running an older kernel is possible. The version
being sent out is configured using ifconfig pfsync0 … version XXXX. The
version is an user-friendly string - 1301 stands for FreeBSD 13.1 (I
have checked synchronization against a host running 13.1), 1400 stands
for 14.0.

A host running an older kernel will just ignore the messages and count
them as "packets discarded for bad action".

Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D39392


# cdc231bd 15-May-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pfsync: Remove deletion of states using the full pfsync_state struct

State deletions are sent over pfsync using struct pfsync_del_c.

Remove the code for receiving state deletions using struct pfsync_state
as such deletions are never sent. Rename functions and constants so that
only the "compressed" versions remain.

Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D40004


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# bf206a1d 04-May-2023 Kristof Provost <kp@FreeBSD.org>

pf: remove NULL check before uma_zfree()

uma_zfree() can be called on a NULL pointer. Simplify the pf code a
little by removing the redundant checks.

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 39282ef3 13-Apr-2023 Kajetan Staszkiewicz <vegeta@tuxpowered.net>

pf: backport OpenBSD syntax of "scrub" option for "match" and "pass" rules

Introduce the OpenBSD syntax of "scrub" option for "match" and "pass"
rules and the "set reassemble" flag. The patch is backward-compatible,
pf.conf can be still written in FreeBSD-style.

Obtained from: OpenBSD
MFC after: never
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D38025


# 27b23cde 24-Mar-2023 Kristof Provost <kp@FreeBSD.org>

pf: remove pd_refs from pfsync

It only served to complicate cleanup, and added no value.

While here drop packets in pfsync_defer_tmo() if we don't have a syncif,
rather than just leaving them on the queue.

Reviewed by: markj
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D39248


# 01194da2 22-Mar-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: hold b_mtx for callout_stop(pd_tmo)

The pd_tmo callout has an associated mutex, which we must hold while
calling callout_stop().

Reported by: markj
Reviewed by: markj
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D39223


# 53247cdf 20-Mar-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: fix pfsync_undefer_state() locking

pfsync_undefer_state() takes the bucket lock, but could get called from
places (e.g. from pfsync_update_state() or pfsync_delete_state()) where
we already held the lock.

As it can also be called from places where we don't yet hold the lock
create new locked variant for use when the lock is already held. Keep
using pfsync_undefer_state() where the lock must still be taken.

PR: 268246
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (Netgate)


# 844ad282 20-Mar-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: add missing unlock in pfsync_defer_tmo()

The callout for pfsync_defer_tmo() is created with
CALLOUT_RETURNUNLOCKED, because while the callout framework takes care
of taking the lock we want to run a few operations outside of the lock,
so we unlock ourselves.

However, if `sc->sc_sync_if == NULL` we return without releasing the
lock, and leak the lock, causing later deadlocks.
Ensure we always release the bucket lock when we exit pfsync_defer_tmo()

PR: 268246
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (Netgate)


# f52ca3df 16-Feb-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: ensure 'error' is always initialised

Reported by: Herbert J. Skuhra <herbert@gojira.at>
MFC after: 2 weeks


# 9a1cab6d 13-Feb-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: support deferring IPv6 packets

When we send out a deferred packet we must make sure to call
ip6_output() for IPv6 packets. If not we might end up attempting to
ip_fragment() an IPv6 packet, which could lead to us reading outside of
the mbuf.

PR: 268246
Reviewed by: melifaro, zlei
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38586


# 0ed5f66c 02-Feb-2023 Kristof Provost <kp@FreeBSD.org>

pfsync: add missing bucket lock

pfsync_q_ins() expects us to hold the bucket lock, but when we enter it
from pfsync_state_import() we don't.

MFC after: 2 weeks


# 3d0d5b21 23-Jan-2023 Justin Hibbits <jhibbits@FreeBSD.org>

IfAPI: Explicitly include <net/if_private.h> in netstack

Summary:
In preparation of making if_t completely opaque outside of the netstack,
explicitly include the header. <net/if_var.h> will stop including the
header in the future.

Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius, melifaro
Differential Revision: https://reviews.freebsd.org/D38200


# fd02192c 12-Jan-2023 Kristof Provost <kp@FreeBSD.org>

pf: fix panic on deferred packets

The pfsync_defer_tmo() callout needs to set the correct vnet before it
can transmit packets. It used the rcvif in the mbuf to get this vnet,
but that doesn't work for locally originated traffic. In that case the
rcvif pointer is NULL, and the dereference leads to a panic.

Instead use the sc_sync_if, which is always set (if pfsync is enabled,
at least).

PR: 268246
MFC after: 2 weeks


# 48767d87 14-Nov-2022 Kristof Provost <kp@FreeBSD.org>

pfsync: fix memory leak

The recent refactoring to prepare for pfsync over IPv6 introduced a
memory leak.
If we don't have a sync peer configured we return early (without sending
out a packet), but failed to free the newly allocated packet.

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 813c5b75 08-Nov-2022 Luiz Amaral <email@luiz.eng.br>

pfsync: prepare code to accommodate AF_INET6 family

Work is ongoing to add support for pfsync over IPv6. This required some
changes to allow for differentiating between the two families in a more
generic way.

This patch converts the relevant ioctls to using nvlists, making future
extensions (such as supporting IPv6 addresses) easier.

Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D36277


# 69ce6ae2 06-Sep-2022 Mateusz Guzik <mjg@FreeBSD.org>

pf: make pfsync_state_import appease an assert in pf_free_state

The newly created state failed to be inserted anywhere. This follows
other places.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 485be979 22-Aug-2022 Luiz Amaral <email@luiz.eng.br>

pfsync: replace struct pfsync_pkt with int flags

Get rid of struct pfsync_pkt. It was used to store data on the stack to
pass to all the submessage handlers, but only the flags part of it was
ever used. Just pass the flags directly instead.

Reviewed by: kp
Obtained from: OpenBSD
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D36294


# 78b1fc05 17-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: separate pr_input and pr_ctlinput out of protosw

The protosw KPI historically has implemented two quite orthogonal
things: protocols that implement a certain kind of socket, and
protocols that are IPv4/IPv6 protocol. These two things do not
make one-to-one correspondence. The pr_input and pr_ctlinput methods
were utilized only in IP protocols. This strange duality required
IP protocols that doesn't have a socket to declare protosw, e.g.
carp(4). On the other hand developers of socket protocols thought
that they need to define pr_input/pr_ctlinput always, which lead to
strange dead code, e.g. div_input() or sdp_ctlinput().

With this change pr_input and pr_ctlinput as part of protosw disappear
and IPv4/IPv6 get their private single level protocol switch table
ip_protox[] and ip6_protox[] respectively, pointing at array of
ipproto_input_t functions. The pr_ctlinput that was used for
control input coming from the network (ICMP, ICMPv6) is now represented
by ip_ctlprotox[] and ip6_ctlprotox[].

ipproto_register() becomes the only official way to register in the
table. Those protocols that were always static and unlikely anybody
is interested in making them loadable, are now registered by ip_init(),
ip6_init(). An IP protocol that considers itself unloadable shall
register itself within its own private SYSINIT().

Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36157


# 8c77967e 11-Aug-2022 Gleb Smirnoff <glebius@FreeBSD.org>

protosw: retire pr_output method

The only place to execute this method was raw_usend(). Only those
protocols that used raw socket were able to actually enter that method.
All pr_output assignments being deleted by this commit were a dead code
for many years.

Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36126


# 766f3c80 25-Jul-2022 Dimitry Andric <dim@FreeBSD.org>

Adjust function definitions in if_pfsync.c to avoid clang 15 warnings

With clang 15, the following -Werror warnings are produced:

sys/netpfil/pf/if_pfsync.c:2439:21: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
pfsync_pointers_init()
^
void
sys/netpfil/pf/if_pfsync.c:2453:23: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
pfsync_pointers_uninit()
^
void
sys/netpfil/pf/if_pfsync.c:2503:12: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
pfsync_init()
^
void
sys/netpfil/pf/if_pfsync.c:2524:14: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
pfsync_uninit()
^
void

This is because pfsync_pointers_init(), pfsync_pointers_uninit(),
pfsync_init(), and pfsync_uninit() are declared with (void) argument
lists, but defined with empty argument lists. Make the definitions match
the declarations.

MFC after: 3 days


# fb48e998 25-Jul-2022 Dimitry Andric <dim@FreeBSD.org>

Fix unused variable warning in if_pfsync.c

With clang 15, the following -Werror warning is produced:

sys/netpfil/pf/if_pfsync.c:2153:9: error: variable 'sent' set but not used [-Werror,-Wunused-but-set-variable]
int i, sent = 0;
^

The 'sent' variable was used in the for loop later in the
pfsync_bulk_update() function, but refactoring in 4fc65bcbe3fb7 got rid
of it. Remove the variable since it no longer serves any purpose.

MFC after: 3 days


# 43020350 21-Apr-2022 Kristof Provost <kp@FreeBSD.org>

pfsync: NULL check before dereference

Move the use of 'sc' to after the NULL check.
It's very unlikely that we'd actually hit this, but Coverity is correct
that it's not a good idea to dereference the pointer and only then NULL
check it.

Reported by: Coverity (CID 1398362)
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 654c1b8e 01-Apr-2022 Luiz Amaral <email@luiz.eng.br>

pfsync: Add CTLFLAG_VNET to carp_demotion_factor sysctl

When trying to avoid a CARP demotion during a pfsync service restart, I
noticed that a non-default value for the net.pfsync.carp_demotion_factor
sysctl was not being applied during the demotion. The CARP was always
demoted by 240.

After investigating, I realized that the sysctl was using VNET_NAME()
without the CTLFLAG_VNET.

PR: 262983
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34737


# 73fd0eaf 02-Dec-2021 Kristof Provost <kp@FreeBSD.org>

pfsync: fix incorrect enabling of defer mode

When we exposed the PFSYNCF_OK flag to userspace in 5f5bf88949d we
unintentionally caused defer mode to always be enabled.
The ioctl check only looked for nonzero, not for the PFSYNCF_DEFER flag.

Fix this check and ensure ifconfig sets the flag.

Reviewed by: glebius
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33244


# 41c4f198 02-Dec-2021 Kristof Provost <kp@FreeBSD.org>

pfsync: locking fixes

* Ensure we unlock the pfsync lock in pfsync_defer()
* We must hold the bucket lock when calling pfsync_push()
* The pfsync_defer_tmo() callout locks the bucket lock, not the pfsync
lock

Reviewed by: glebius
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33243


# 93a3fa41 02-Dec-2021 Kristof Provost <kp@FreeBSD.org>

pfsync: fix defer timeout

Don't use a fixed number of ticks, but take hz into account so we have a
consistent timeout, regardless of what hz is set up.
Use a 20ms timeout, becaues that's what OpenBSD uses.

Reviewed by: glebius
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33242


# 7b02a551f 02-Dec-2021 Kristof Provost <kp@FreeBSD.org>

pfsync: check IFF_DRV_RUNNING in the correct field

This flag is stored in if_drv_flags, not if_flags.

Reviewed by: glebius
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33241


# 27bd812c 02-Dec-2021 Kristof Provost <kp@FreeBSD.org>

pfsync: NULL check sc before using it

In pfsync_defer() we must wait to lock sc until we've ensured it's not
NULL.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D33240


# 8f3d786c 01-Nov-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: remove the flags argument from pf_unlink_state

All consumers call it with PF_ENTER_LOCKED.

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# bcd4c17c 19-Oct-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: fix some cc --analyze warnings

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 211cddf9 06-Jul-2021 Kristof Provost <kp@FreeBSD.org>

pf: rename pf_state to pf_kstate

Indicate that this is a kernel-only structure, and make it easier to
distinguish from others used to communicate with userspace.

Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31096


# 803dfe3d 28-Jun-2021 Mateusz Guzik <mjg@FreeBSD.org>

pf: deduplicate V_pf_state_z handling with pfsync

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")


# d0fdf2b2 12-May-2021 Kristof Provost <kp@FreeBSD.org>

pf: Track the original kif for floating states

Track (and display) the interface that created a state, even if it's a
floating state (and thus uses virtual interface 'all').

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D30245


# 5f5bf889 23-Apr-2021 Kristof Provost <kp@FreeBSD.org>

pfsync: Expose PFSYNCF_OK flag to userspace

Add 'syncok' field to ifconfig's pfsync interface output. This allows
userspace to figure out when pfsync has completed the initial bulk
import.

Reviewed by: donner
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29948


# 9f2e5184 15-Mar-2021 Thomas Kurschel <topical@gmx.net>

pfsync: Unconditionally push packets when requesting state updates

When we request a bulk sync we need to ensure we actually send out that
request, not just buffer it until we have enough data to send a full
packet.

PR: 254236
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29271


# cecfaf9b 10-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Fully remove interrupt events on vnet cleanup

swi_remove() removes the software interrupt handler but does not remove
the associated interrupt event.
This is visible when creating and remove a vnet jail in `procstat -t
12`.

We can remove it manually with intr_event_destroy().

PR: 254171
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29211


# 28dc2c95 10-Mar-2021 Kristof Provost <kp@FreeBSD.org>

pf: Simplify cleanup

We can now counter_u64_free(NULL), so remove the checks.

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29190


# 320c1116 12-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Split pfi_kif into a user and kernel space structure

No functional change.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27761


# e86bddea 05-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Split pf_rule into kernel and user space versions

No functional change intended.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27758


# 1c00efe9 23-Dec-2020 Kristof Provost <kp@FreeBSD.org>

pf: Use counter(9) for pf_state byte/packet tracking

This improves cache behaviour by not writing to the same variable from
multiple cores simultaneously.

pf_state is only used in the kernel, so can be safely modified.

Reviewed by: Lutz Donnerhacke, philip
MFC after: 1 week
Sponsed by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27661


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# 10b49b23 21-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (6 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

Mark all nodes in pf, pfsync and carp as MPSAFE.

Reviewed by: kp
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23634


# ef1bd1e5 22-Jan-2020 Kristof Provost <kp@FreeBSD.org>

pfsync: Ensure we enter network epoch before calling ip_output

As of r356974 calls to ip_output() require us to be in the network epoch.
That wasn't the case for the calls done from pfsyncintr() and
pfsync_defer_tmo().


# 59854ecf 25-Jun-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Convert all IPv4 and IPv6 multicast memberships into using a STAILQ
instead of a linear array.

The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.

To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi

All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.

This patch has been tested by pho@ .

Differential Revision: https://reviews.freebsd.org/D20080
Reviewed by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 812483c4 16-Mar-2019 Kristof Provost <kp@FreeBSD.org>

pf: Rename pfsync bucket lock

Previously the main pfsync lock and the bucket locks shared the same name.
This lead to spurious warnings from WITNESS like this:

acquiring duplicate lock of same type: "pfsync"
1st pfsync @ /usr/src/sys/netpfil/pf/if_pfsync.c:1402
2nd pfsync @ /usr/src/sys/netpfil/pf/if_pfsync.c:1429

It's perfectly okay to grab both the main pfsync lock and a bucket lock at the
same time.

We don't need different names for each bucket lock, because we should always
only acquire a single one of those at a time.

MFC after: 1 week


# 6a8ee0f7 18-Jan-2019 Kristof Provost <kp@FreeBSD.org>

pf: fix pfsync breaking carp

Fix missing initialisation of sc_flags into a valid sync state on clone which
breaks carp in pfsync.

This regression was introduce by r342051.

PR: 235005
Submitted by: smh@FreeBSD.org
Pointy hat to: kp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D18882


# 4fc65bcb 06-Dec-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Performance improvement

pfsync code is called for every new state, state update and state
deletion in pf. While pf itself can operate on multiple states at the
same time (on different cores, assuming the states hash to a different
hashrow), pfsync only had a single lock.
This greatly reduced throughput on multicore systems.

Address this by splitting the pfsync queues into buckets, based on the
state id. This ensures that updates for a given connection always end up
in the same bucket, which allows pfsync to still collapse multiple
updates into one, while allowing multiple cores to proceed at the same
time.

The number of buckets is tunable, but defaults to 2 x number of cpus.
Benchmarking has shown improvement, depending on hardware and setup, from ~30%
to ~100%.

MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D18373


# dde6e1fe 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Add missing unlock

If we fail to set up the multicast entry for pfsync and return an error
we must release the pfsync lock first.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17506


# 04fe85f0 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Allow module to be unloaded

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17505


# fbbf436d 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Handle syncdev going away

If the syncdev is removed we no longer need to clean up the multicast
entry we've got set up for that device.

Pass the ifnet detach event through pf to pfsync, and remove our
multicast handle, and mark us as no longer having a syncdev.

Note that this callback is always installed, even if the pfsync
interface is disabled (and thus it's not a per-vnet callback pointer).

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17502


# 26549dfc 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Ensure uninit is done before pf

pfsync touches pf memory (for pf_state and the pfsync callback
pointers), not the other way around. We need to ensure that pfsync is
torn down before pf.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17501


# 5f6cf24e 02-Nov-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Make pfsync callbacks per-vnet

The callbacks are installed and removed depending on the state of the
pfsync device, which is per-vnet. The callbacks must also be per-vnet.

MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17499


# 5f901c92 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# de210dec 29-Jun-2018 Kristof Provost <kp@FreeBSD.org>

pfsync: Fix state sync during initial bulk update

States learned via pfsync from a peer with the same ruleset checksum were not
getting assigned to rules like they should because pfsync_in_upd() wasn't
passing the PFSYNC_SI_CKSUM flag along to pfsync_state_import.

PR: 229092
Submitted by: Kajetan Staszkiewicz <vegeta tuxpowered.net>
Obtained from: OpenBSD
MFC after: 1 week
Sponsored by: InnoGames GmbH


# 455969d3 30-May-2018 Kristof Provost <kp@FreeBSD.org>

pf: Replace rwlock on PF_RULES_LOCK with rmlock

Given that PF_RULES_LOCK is a mostly read lock, replace the rwlock with rmlock.
This change improves packet processing rate in high pps environments.
Benchmarking by olivier@ shows a 65% improvement in pps.

While here, also eliminate all appearances of "sys/rwlock.h" includes since it
is not used anymore.

Submitted by: farrokhi@
Differential Revision: https://reviews.freebsd.org/D15502


# 541d96aa 30-Mar-2018 Brooks Davis <brooks@FreeBSD.org>

Use an accessor function to access ifr_data.

This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size). This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900


# 8820ecc0 30-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

SPDX: Fix some cases wrongly attributed to MIT.

In the cases of BSD-style license variants without clauses, use 0BSD for
the time being in lack of a better description.


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# aa8c6a6d 09-Dec-2016 Marcel Moolenaar <marcel@FreeBSD.org>

Improve upon r309394

Instead of taking an extra reference to deal with pfsync_q_ins()
and pfsync_q_del() taken and dropping a reference (resp,) make
it optional of those functions to take or drop a reference by
passing an extra argument.

Submitted by: glebius@


# 296d65b7 09-Dec-2016 Gleb Smirnoff <glebius@FreeBSD.org>

Backout accidentially leaked in r309746 not yet reviewed patch :(


# 3cbee8ca 09-Dec-2016 Gleb Smirnoff <glebius@FreeBSD.org>

Use counter_ratecheck() in the ICMP rate limiting.

Together with: rrs, jtl


# d6d35f15 01-Dec-2016 Marcel Moolenaar <marcel@FreeBSD.org>

Fix use-after-free bugs in pfsync(4)

Use after free happens for state that is deleted. The reference
count is what prevents the state from being freed. When the
state is dequeued, the reference count is dropped and the memory
freed. We can't dereference the next pointer or re-queue the
state.

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D8671


# 66c00e9e 23-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Proerply virtualize pfsync for bringup after pf is initialized and
teardown of VNETs once pf(4) has been shut down.
Properly split resources into VNET_SYS(UN)INITs and one time module
loading.
While here cover the INET parts in the uninit callpath with proper
#ifdefs.

Approved by: re (gjb)
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation


# 1f12da0e 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Just checkpoint the WIP in order to be able to make the tree update
easier. Note: this is currently not in a usable state as certain
teardown parts are not called and the DOMAIN rework is missing.
More to come soon and find its way to head.

Obtained from: P4 //depot/user/bz/vimage/...
Sponsored by: The FreeBSD Foundation


# 7c4676dd 13-Nov-2015 Randall Stewart <rrs@FreeBSD.org>

This fixes several places where callout_stops return is examined. The
new return codes of -1 were mistakenly being considered "true". Callout_stop
now returns -1 to indicate the callout had either already completed or
was not running and 0 to indicate it could not be stopped. Also update
the manual page to make it more consistent no non-zero in the callout_stop
or callout_reset descriptions.

MFC after: 1 Month with associated callout change.


# fd90e2ed 22-May-2015 Jung-uk Kim <jkim@FreeBSD.org>

CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks


# 6d947416 01-Apr-2015 Gleb Smirnoff <glebius@FreeBSD.org>

o Use new function ip_fillid() in all places throughout the kernel,
where we want to create a new IP datagram.
o Add support for RFC6864, which allows to set IP ID for atomic IP
datagrams to any value, to improve performance. The behaviour is
controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by
default.
o In case if we generate IP ID, use counter(9) to improve performance.
o Gather all code related to IP ID into ip_id.c.

Differential Revision: https://reviews.freebsd.org/D2177
Reviewed by: adrian, cy, rpaulo
Tested by: Emeric POUPON <emeric.poupon stormshield.eu>
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Relnotes: yes


# 033074c4 09-Nov-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Replace 'struct route *' if_output() argument with 'struct nhop_info *'.
Leave 'struct route' as is for legacy routing api users.
Remove most of rtalloc_ign*-derived functions.


# 6df8a710 07-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.

Sponsored by: Nginx, Inc.


# 2a6009bf 19-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically convert to if_inc_counter().


# 56b61ca2 19-Sep-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Remove ifq_drops from struct ifqueue. Now queue drops are accounted in
struct ifnet if_oqdrops.

Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops
is simply removed from them. There were no API to read this statistic.

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 73d76e77 14-Aug-2014 Kevin Lo <kevlo@FreeBSD.org>

Change pr_output's prototype to avoid the need for explicit casts.
This is a follow up to r269699.

Phabric: D564
Reviewed by: jhb


# 8f5a8818 07-Aug-2014 Kevin Lo <kevlo@FreeBSD.org>

Merge 'struct ip6protosw' and 'struct protosw' into one. Now we have
only one protocol switch structure that is shared between ipv4 and ipv6.

Phabric: D476
Reviewed by: jhb


# 8ff2bd98 09-Jul-2014 Gleb Smirnoff <glebius@FreeBSD.org>

On machines with strict alignment copy pfsync_state_key from packet
on stack to avoid unaligned access.

PR: 187381
Submitted by: Lytochkin Boris <lytboris gmail.com>


# 48278b88 14-Feb-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Once pf became not covered by a single mutex, many counters in it became
race prone. Some just gather statistics, but some are later used in
different calculations.

A real problem was the race provoked underflow of the states_cur counter
on a rule. Once it goes below zero, it wraps to UINT32_MAX. Later this
value is used in pf_state_expires() and any state created by this rule
is immediately expired.

Thus, make fields states_cur, states_tot and src_nodes of struct
pf_rule be counter(9)s.

Thanks to Dennis for providing me shell access to problematic box and
his help with reproducing, debugging and investigating the problem.

Thanks to: Dennis Yusupoff <dyr smartspb.net>
Also reported by: dumbbell, pgj, Rambler
Sponsored by: Nginx, Inc.


# 76039bc8 26-Oct-2013 Gleb Smirnoff <glebius@FreeBSD.org>

The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by: Netflix
Sponsored by: Nginx, Inc.


# 415077ba 29-Jul-2013 Andrey V. Elsukov <ae@FreeBSD.org>

Fix a possible NULL-pointer dereference on the pfsync(4) reconfiguration.

Reported by: Eugene M. Zheganin


# 6828cc99 19-Jun-2013 Gleb Smirnoff <glebius@FreeBSD.org>

De-vnet hash sizes and hash masks.

Submitted by: Nikos Vassiliadis <nvass gmx.com>
Reviewed by: trociny


# b69d74e8 11-May-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Invalid creatorid is always EINVAL, not only when we are in verbose mode.


# f8aa4447 06-May-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Improve KASSERT() message.


# 47e8d432 25-Apr-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Add const qualifier to the dst parameter of the ifnet if_output method.


# 41a7572b 12-Mar-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Functions m_getm2() and m_get2() have different order of arguments,
and that can drive someone crazy. While m_get2() is young and not
documented yet, change its order of arguments to match m_getm2().

Sorry for churn, but better now than later.


# e2a55a00 15-Feb-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Finish the r244185. This fixes ever growing counter of pfsync bad
length packets, which was actually harmless.

Note that peers with different version of head/ may grow this
counter, but it is harmless - all pfsync data is processed.

Reported & tested by: Anton Yuzhaninov <citrin citrin.ru>
Sponsored by: Nginx, Inc


# d8aa10cc 28-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

In netpfil/pf:
- Add my copyright to files I've touched a lot this year.
- Add dash in front of all copyright notices according to style(9).
- Move $OpenBSD$ down below copyright notices.
- Remove extra line between cdefs.h and __FBSDID.


# 4c794f5c 14-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Fix VIMAGE build broken in r244185.

Submitted by: Nikolai Lifanov <lifanov mail.lifanov.com>


# 9ff7e6e9 12-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Merge rev. 1.119 from OpenBSD:

date: 2009/03/31 01:21:29; author: dlg; state: Exp; lines: +9 -16
...

this also firms up some of the input parsing so it handles short frames a
bit better.

This actually fixes reading beyond mbuf data area in pfsync_input(), that
may happen at certain pfsync datagrams.


# fed76350 11-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Merge 1.127 from OpenBSD, that closes a regression from 1.125 (merged
as r242694):
do better detection of when we have a better version of the tcp sequence
windows than our peer.

this resolves the last of the pfsync traffic storm issues ive been able to
produce, and therefore makes it possible to do usable active-active
statuful firewalls with pf.


# 8db7e13f 06-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Remove extra PFSYNC_LOCK() in pfsync_bulk_update() which lead to lock
recursion.

Reported by: Ian FREISLICH <ianf cloudseed.co.za>


# 5da39c56 06-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Revert erroneous r242693. A state may have PFTM_UNLINKED being on the
PFSYNC_S_DEL queue of pfsync.


# f18ab0ff 07-Nov-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Merge rev. 1.125 from OpenBSD:
date: 2009/06/12 02:03:51; author: dlg; state: Exp; lines: +59 -69
rewrite the way states from pfsync are merged into the local state tree
and the conditions on which pfsync will notify its peers on a stale update.

each side (ie, the sending and receiving side) of the state update is
compared separately. any side that is further along than the local state
tree is merged. if any side is further along in the local state table, an
update is sent out telling the peers about it.


# d75efebe 07-Nov-2012 Gleb Smirnoff <glebius@FreeBSD.org>

It may happen that pfsync holds the last reference on a state. In this
case keys had already been freed. If encountering such state, then
just release last reference.

Not sure this can happen as a runtime race, but can be reproduced by
the following scenario:

- enable pfsync
- disable pfsync
- wait some time
- enable pfsync


# 8f134647 22-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Switch the entire IPv4 stack to keep the IP packet header
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.

After this change a packet processed by the stack isn't
modified at all[2] except for TTL.

After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.

[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.

[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.

Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>


# 42a58907 16-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Make the "struct if_clone" opaque to users of the cloning API. Users
now use function calls:

if_clone_simple()
if_clone_advanced()

to initialize a cloner, instead of macros that initialize if_clone
structure.

Discussed with: brooks, bz, 1 year ago


# 9823d527 10-Oct-2012 Kevin Lo <kevlo@FreeBSD.org>

Revert previous commit...

Pointyhat to: kevlo (myself)


# a10cee30 09-Oct-2012 Kevin Lo <kevlo@FreeBSD.org>

Prefer NULL over 0 for pointers


# aa955cb5 01-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

To reduce volume of pfsync traffic:
- Scan request update queue to prevent doubles.
- Do not push undersized daragram in pfsync_update_request().


# 7b6fbb73 29-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Clear and re-setup all function pointers that glue pf(4) and pfsync(4)
together whenever the pfsync0 is brought down or up respectively.


# 0fa4aaa7 29-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Simplify send out queue code:
- Write method of a queue now is void,length of item is taken
as queue property.
- Write methods don't need to know about mbud, supply just buf
to them.
- No need for safe queue iterator in pfsync_sendout().

Obtained from: OpenBSD


# 51e02a31 22-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

EBUSY is a better reply for refusing to unload pf(4) or pfsync(4).

Submitted by: pluknet


# 3b3a8eb9 14-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

o Create directory sys/netpfil, where all packet filters should
reside, and move there ipfw(4) and pf(4).

o Move most modified parts of pf out of contrib.

Actual movements:

sys/contrib/pf/net/*.c -> sys/netpfil/pf/
sys/contrib/pf/net/*.h -> sys/net/
contrib/pf/pfctl/*.c -> sbin/pfctl
contrib/pf/pfctl/*.h -> sbin/pfctl
contrib/pf/pfctl/pfctl.8 -> sbin/pfctl
contrib/pf/pfctl/*.4 -> share/man/man4
contrib/pf/pfctl/*.5 -> share/man/man5

sys/netinet/ipfw -> sys/netpfil/ipfw

The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.

Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.

The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.

Discussed with: bz, luigi