History log of /freebsd-current/sys/netpfil/ipfw/ip_fw_pfil.c
Revision Date Author Comments
# 49197c39 25-Sep-2023 Zhenlei Huang <zlei@FreeBSD.org>

ipfw: Add sysctl flag CTLFLAG_TUN to loader tunables

The following sysctl variables are actually loader tunables. Add sysctl
flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly.

1. net.inet.ip.fw.enable
2. net.inet6.ip6.fw.enable
3. net.link.ether.ipfw

No functional change intended.

Reviewed by: glebius
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D41928


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# fc727ad6 24-Apr-2023 Boris Lytochkin <lytboris@gmail.com>

ipfw: add [fw]mark implementation for ipfw

Packet Mark is an analogue to ipfw tags with O(1) lookup from mbuf while
regular tags require a single-linked list traversal.
Mark is a 32-bit number that can be looked up in a table
[with 'number' table-type], matched or compared with a number with optional
mask applied before comparison.
Having generic nature, Mark can be used in a variety of needs.
For example, it could be used as a security group: mark will hold a security
group id and represent a group of packet flows that shares same access
control policy.

Reviewed By: pauamma_gundo.com
Differential Revision: https://reviews.freebsd.org/D39555
MFC after: 1 month


# caf32b26 14-Feb-2023 Gleb Smirnoff <glebius@FreeBSD.org>

pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks()

The 0b70e3e78b0 changed the original design of a single entry point
into pfil(9) chains providing separate functions for the filtering
points that always provide mbufs and know the direction of a flow.
The motivation was to reduce branching. The logical continuation
would be to do the same for the filtering points that always provide
a memory pointer and retire the single entry point.

o Hooks now provide two functions: one for mbufs and optional for
memory pointers.
o pfil_hook_args() has a new member and pfil_add_hook() has a
requirement to zero out uninitialized data. Bump PFIL_VERSION.
o As it was before, a hook function for a memory pointer may realloc
into an mbuf. Such mbuf would be returned via a pointer that must
be provided in argument.
o The only hook that supports memory pointers is ipfw:default-link.
It is rewritten to provide two functions.
o All remaining uses of pfil_run_hooks() are converted to
pfil_mem_in().
o Transparent union of pfil_packet_t and tricks to fix pointer
alignment are retired. Internal pfil_realloc() reduces down to
m_devget() and thus is retired, too.

Reviewed by: mjg, ocochard
Differential revision: https://reviews.freebsd.org/D37977


# 1854fb8f 12-Nov-2022 Dag-Erling Smørgrav <des@FreeBSD.org>

ipfw: Spell PFIL_PASS correctly.

Sponsored by: Netapp
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37375


# bc6a2267 16-Jun-2021 Mark Johnston <markj@FreeBSD.org>

ipfw: Update the pfil mbuf pointer in ipfw_check_frame()

ipfw_chk() might call m_pullup() and thus can change the mbuf chain
head. In this case, the new chain head has to be returned to the pfil
hook caller, otherwise the pfil hook caller is left with a dangling
pointer.

Note that this affects only the link-layer hooks installed when the
net.link.ether.ipfw sysctl is set to 1.

PR: 256439, 254015, 255069, 255104
Fixes: f355cb3e6
Reviewed by: ae
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30764


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 97245d40 21-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Always create ipfw(4) hooks as long as module is loaded.

Now enabling ipfw(4) with sysctls controls only linkage of hooks to default
heads. When module is loaded fetch sysctls as tunables, to make it possible
to boot with ipfw(4) in kernel, but not linked to any pfil(9) hooks.


# b11efc1e 18-Mar-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Modify struct nat64_config.

Add second IPv6 prefix to generic config structure and rename another
fields to conform to RFC6877. Now it contains two prefixes and length:
PLAT is provider-side translator that translates N:1 global IPv6 addresses
to global IPv4 addresses. CLAT is customer-side translator (XLAT) that
algorithmically translates 1:1 IPv4 addresses to global IPv6 addresses.
Use PLAT prefix in stateless (nat64stl) and stateful (nat64lsn)
translators.

Modify nat64_extract_ip4() and nat64_embed_ip4() functions to accept
prefix length and use plat_plen to specify prefix length.

Retire net.inet.ip.fw.nat64_allow_private sysctl variable.
Add NAT64_ALLOW_PRIVATE flag and use "allow_private" config option to
configure this ability separately for each NAT64 instance.

Obtained from: Yandex LLC
MFC after: 1 month
Sponsored by: Yandex LLC


# f355cb3e 14-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

PFIL_MEMPTR for ipfw link level hook

With new pfil(9) KPI it is possible to pass a void pointer with length
instead of mbuf pointer to a packet filter. Until this commit no filters
supported that, so pfil run through a shim function pfil_fake_mbuf().

Now the ipfw(4) hook named "default-link", that is instantiated when
net.link.ether.ipfw sysctl is on, supports processing pointer/length
packets natively.

- ip_fw_args now has union for either mbuf or void *, and if flags have
non-zero length, then we use the void *.
- through ipfw_chk() we handle mem/mbuf cases differently.
- ether_header goes away from args. It is ipfw_chk() responsibility
to do parsing of Ethernet header.
- ipfw_log() now uses different bpf APIs to log packets.

Although ipfw_chk() is now capable to process pointer/length packets,
this commit adds support for the link level hook only, see
ipfw_check_frame(). Potentially the IP processing hook ipfw_check_packet()
can be improved too, but that requires more changes since the hook
supports more complex actions: NAT, divert, etc.

Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D19357


# dc0fa4f7 14-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Remove 'dir' argument from dummynet_io(). This makes it possible to make
dn_dir flags private to dummynet. There is still some room for improvement.


# b00b7e03 14-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Reduce argument list to ipfw_divert(), as args holds the rule ref and
the direction. While here make 'tee' a bool.


# cef9f220 14-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Remove 'dir' argument in ng_ipfw_input, since ip_fw_args now has this info.
While here make 'tee' boolean.


# b7795b67 14-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

- Add more flags to ip_fw_args. At this changeset only IPFW_ARGS_IN and
IPFW_ARGS_OUT are utilized. They are intented to substitute the "dir"
parameter that is often passes together with args.
- Rename ip_fw_args.oif to ifp and now it is set to either input or
output interface, depending on IPFW_ARGS_IN/OUT bit set.


# 1830dae3 14-Mar-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Make second argument of ip_divert(), that specifies packet direction a bool.
This allows pf(4) to avoid including ipfw(4) private files.


# ca0f03e8 11-Mar-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Add IP_FW_NAT64 to codes that ipfw_chk() can return.

It will be used by upcoming NAT64 changes. We use separate code
to avoid propogating EACCES error code to user level applications
when NAT64 consumes a packet.

Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC


# 2790ca97 31-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Fix build without INET6.


# b252313f 31-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

New pfil(9) KPI together with newborn pfil API and control utility.

The KPI have been reviewed and cleansed of features that were planned
back 20 years ago and never implemented. The pfil(9) internals have
been made opaque to protocols with only returned types and function
declarations exposed. The KPI is made more strict, but at the same time
more extensible, as kernel uses same command structures that userland
ioctl uses.

In nutshell [KA]PI is about declaring filtering points, declaring
filters and linking and unlinking them together.

New [KA]PI makes it possible to reconfigure pfil(9) configuration:
change order of hooks, rehook filter from one filtering point to a
different one, disconnect a hook on output leaving it on input only,
prepend/append a filter to existing list of filters.

Now it possible for a single packet filter to provide multiple rulesets
that may be linked to different points. Think of per-interface ACLs in
Cisco or Juniper. None of existing packet filters yet support that,
however limited usage is already possible, e.g. default ruleset can
be moved to single interface, as soon as interface would pride their
filtering points.

Another future feature is possiblity to create pfil heads, that provide
not an mbuf pointer but just a memory pointer with length. That would
allow filtering at very early stages of a packet lifecycle, e.g. when
packet has just been received by a NIC and no mbuf was yet allocated.

Differential Revision: https://reviews.freebsd.org/D18951


# 3b1522c2 09-Jan-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Fix the build with INVARIANTS.

MFC after: 1 month


# 1cdf23bc 09-Jan-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Reduce the size of struct ip_fw_args from 240 to 128 bytes on amd64.
And refactor the code to avoid unneeded initialization to reduce overhead
of per-packet processing.

ipfw(4) can be invoked by pfil(9) framework for each packet several times.
Each call uses on-stack variable of type struct ip_fw_args to keep the
state of ipfw(4) processing. Currently this variable has 240 bytes size
on amd64. Each time ipfw(4) does bzero() on it, and then it initializes
some fields.

glebius@ has reported that they at Netflix discovered, that initialization
of this variable produces significant overhead on packet processing.
After patching I managed to increase performance of packet processing on
simple routing with ipfw(4) firewalling to about 11% from 9.8Mpps up to
11Mpps (Xeon E5-2660 v4@ + Mellanox 100G card).

Introduced new field flags, it is used to keep track of what fields was
initialized. Some fields were moved into the anonymous union, to reduce
the size. They all are mutually exclusive. dummypar field was unused, and
therefore it is removed. The hopstore6 field type was changed from
sockaddr_in6 to a bit smaller struct ip_fw_nh6. And now the size of struct
ip_fw_args is 128 bytes.

ipfw_chk() was modified to properly handle ip_fw_args.flags instead of
rely on checking for NULL pointers.

Reviewed by: gallatin
Obtained from: Yandex LLC
MFC after: 1 month
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D18690


# 5310c191 27-Oct-2018 Eugene Grosbein <eugen@FreeBSD.org>

ipfw: implement ngtee/netgraph actions for layer-2 frames.

Kernel part of ipfw does not support and ignores rules other than
"pass", "deny" and dummynet-related for layer-2 (ethernet frames).
Others are processed as "pass".

Make it support ngtee/netgraph rules just like they are supported
for IP packets. For example, this allows us to mirror some frames
selectively to another interface for delivery to remote network analyzer
over RSPAN vlan. Assuming ng_ipfw(4) netgraph node has a hook named "900"
attached to "lower" hook of vlan900's ng_ether(4) node, that would be
as simple as:

ipfw add ngtee 900 ip from any to 8.8.8.8 layer2 out xmit igb0

PR: 213452
MFC after: 1 month
Tested-by: Fyodor Ustinov <ufm@ufm.su>


# 5f901c92 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# e4014585 02-Nov-2016 Luiz Otavio O Souza <loos@FreeBSD.org>

Remove the mbuf tag after use (for reinjected packets).

Fixes the packet processing in dummynet l2 rules.

Obtained from: pfSense
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (Netgate)


# 3e80a649 01-Nov-2016 Luiz Otavio O Souza <loos@FreeBSD.org>

Stop abusing from struct ifnet presence to determine the packet direction
for dummynet, use the correct argument for that, remove the false coment
about the presence of struct ifnet.

Fixes the input match of dummynet l2 rules.

Obtained from: pfSense
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (Netgate)


# 62f42cf8 19-May-2015 Luigi Rizzo <luigi@FreeBSD.org>

use proper types to represent function pointers


# 2530ed9e 13-Mar-2015 Andrey V. Elsukov <ae@FreeBSD.org>

Fix `ipfw fwd tablearg'. Use dedicated field nh4 in struct table_value
to obtain IPv4 next hop address in tablearg case.

Add `fwd tablearg' support for IPv6. ipfw(8) uses INADDR_ANY as next hop
address in O_FORWARD_IP opcode for specifying tablearg case. For IPv6 we
still use this opcode, but when packet identified as IPv6 packet, we
obtain next hop address from dedicated field nh6 in struct table_value.

Replace hopstore field in struct ip_fw_args with anonymous union and add
hopstore6 field. Use this field to copy tablearg value for IPv6.

Replace spare1 field in struct table_value with zoneid. Use it to keep
scope zone id for link-local IPv6 addresses. Since spare1 was used
internally, replace spare0 array with two variables spare0 and spare1.

Use getaddrinfo(3)/getnameinfo(3) functions for parsing and formatting
IPv6 addresses in table_value. Use zoneid field in struct table_value
to store sin6_scope_id value.

Since the kernel still uses embedded scope zone id to represent
link-local addresses, convert next_hop6 address into this form before
return from pfil processing. This also fixes in6_localip() check
for link-local addresses.

Differential Revision: https://reviews.freebsd.org/D2015
Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# 6df8a710 07-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.

Sponsored by: Nginx, Inc.


# 69155b13 16-Apr-2014 Andrey V. Elsukov <ae@FreeBSD.org>

Set oif only for outgoing packets.

PR: 188543
MFC after: 1 week
Sponsored by: Yandex LLC


# 620ee5d3 21-Mar-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Fix breakage in ipfw+VIMAGE after r261590.

PR: kern/187665
Sponsored by: Nginx, Inc.


# 77024cbc 21-Nov-2013 Luigi Rizzo <luigi@FreeBSD.org>

make ipfw_check_packet() and ipfw_check_frame() public,
so they can be used in the userspace version of ipfw/dummynet
(normally using netmap for the I/O path).

This is the first of a few commits to ease compiling the
ipfw kernel code in userspace.


# eb1b1807 05-Dec-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually


# ffdbf9da 01-Nov-2012 Andrey V. Elsukov <ae@FreeBSD.org>

Remove the recently added sysctl variable net.pfil.forward.
Instead, add protocol specific mbuf flags M_IP_NEXTHOP and
M_IP6_NEXTHOP. Use them to indicate that the mbuf's chain
contains the PACKET_TAG_IPFORWARD tag. And do a tag lookup
only when this flag is set.

Suggested by: andre


# c1de64a4 25-Oct-2012 Andrey V. Elsukov <ae@FreeBSD.org>

Remove the IPFIREWALL_FORWARD kernel option and make possible to turn
on the related functionality in the runtime via the sysctl variable
net.pfil.forward. It is turned off by default.

Sponsored by: Yandex LLC
Discussed with: net@
MFC after: 2 weeks


# 8f134647 22-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Switch the entire IPv4 stack to keep the IP packet header
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.

After this change a packet processed by the stack isn't
modified at all[2] except for TTL.

After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.

[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.

[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.

Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>


# 8f35d5f3 08-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

Catch up with r241245 and do not return packet back in host byte order.


# 21d172a3 06-Oct-2012 Gleb Smirnoff <glebius@FreeBSD.org>

A step in resolving mess with byte ordering for AF_INET. After this change:

- All packets in NETISR_IP queue are in net byte order.
- ip_input() is entered in net byte order and converts packet
to host byte order right _after_ processing pfil(9) hooks.
- ip_output() is entered in host byte order and converts packet
to net byte order right _before_ processing pfil(9) hooks.
- ip_fragment() accepts and emits packet in net byte order.
- ip_forward(), ip_mloopback() use host byte order (untouched actually).
- ip_fastforward() no longer modifies packet at all (except ip_ttl).
- Swapping of byte order there and back removed from the following modules:
pf(4), ipfw(4), enc(4), if_bridge(4).
- Swapping of byte order added to ipfilter(4), based on __FreeBSD_version
- __FreeBSD_version bumped.
- pfil(9) manual page updated.

Reviewed by: ray, luigi, eri, melifaro
Tested by: glebius (LE), ray (BE)


# 3b3a8eb9 14-Sep-2012 Gleb Smirnoff <glebius@FreeBSD.org>

o Create directory sys/netpfil, where all packet filters should
reside, and move there ipfw(4) and pf(4).

o Move most modified parts of pf out of contrib.

Actual movements:

sys/contrib/pf/net/*.c -> sys/netpfil/pf/
sys/contrib/pf/net/*.h -> sys/net/
contrib/pf/pfctl/*.c -> sbin/pfctl
contrib/pf/pfctl/*.h -> sbin/pfctl
contrib/pf/pfctl/pfctl.8 -> sbin/pfctl
contrib/pf/pfctl/*.4 -> share/man/man4
contrib/pf/pfctl/*.5 -> share/man/man5

sys/netinet/ipfw -> sys/netpfil/ipfw

The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.

Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.

The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.

Discussed with: bz, luigi