History log of /freebsd-current/sys/netinet/ip_fw.h
Revision Date Author Comments
# 78e4dbc3 12-May-2024 Gordon Bergling <gbe@FreeBSD.org>

ipfw: Fix a typo in a source code comment

- s/defaul/default/

MFC after: 3 days


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# fc727ad6 24-Apr-2023 Boris Lytochkin <lytboris@gmail.com>

ipfw: add [fw]mark implementation for ipfw

Packet Mark is an analogue to ipfw tags with O(1) lookup from mbuf while
regular tags require a single-linked list traversal.
Mark is a 32-bit number that can be looked up in a table
[with 'number' table-type], matched or compared with a number with optional
mask applied before comparison.
Having generic nature, Mark can be used in a variety of needs.
For example, it could be used as a security group: mark will hold a security
group id and represent a group of packet flows that shares same access
control policy.

Reviewed By: pauamma_gundo.com
Differential Revision: https://reviews.freebsd.org/D39555
MFC after: 1 month


# 81cac390 04-Jun-2022 Arseny Smalyuk <smalukav@gmail.com>

ipfw: add support radix tables and table lookup for MAC addresses

By analogy with IP address matching, add a way to use ipfw radix
tables for MAC matching. This is implemented using new ipfw table
with mac:radix type. Also there are src-mac and dst-mac lookup
commands added.

Usage example:
ipfw table 1 create type mac
ipfw table 1 add 11:22:33:44:55:66/48
ipfw add skipto tablearg src-mac 'table(1)'
ipfw add deny src-mac 'table(1, 100)'
ipfw add deny lookup dst-mac 1

Note: sysctl net.link.ether.ipfw=1 should be set to enable ipfw
filtering on L2.

Reviewed by: melifaro
Obtained from: Yandex LLC
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D35103


# a08cdb6c 02-Feb-2021 Neel Chauhan <nc@FreeBSD.org>

Allow setting alias port ranges in libalias and ipfw. This will allow a system
to be a true RFC 6598 NAT444 setup, where each network segment (e.g. user,
subnet) can have their own dedicated port aliasing ranges.

Reviewed by: donner, kp
Approved by: 0mp (mentor), donner, kp
Differential Revision: https://reviews.freebsd.org/D23450


# a81c165b 19-Jan-2021 Alex Richardson <arichardson@FreeBSD.org>

Require uint32_t alignment for ipfw_insn

There are many casts of this struct to uint32_t, so we also need to ensure
that it is sufficiently aligned to safely perform this cast on architectures
that don't allow unaligned accesses. This fixes lots of -Wcast-align warnings.

Reviewed By: ae
Differential Revision: https://reviews.freebsd.org/D27879


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# 481be5de 12-Feb-2020 Randall Stewart <rrs@FreeBSD.org>

White space cleanup -- remove trailing tab's or spaces
from any line.

Sponsored by: Netflix Inc.


# 978f2d17 21-Jun-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Add "tcpmss" opcode to match the TCP MSS value.

With this opcode it is possible to match TCP packets with specified
MSS option, whose value corresponds to configured in opcode value.
It is allowed to specify single value, range of values, or array of
specific values or ranges. E.g.

# ipfw add deny log tcp from any to any tcpmss 0-500

Reviewed by: melifaro,bcr
Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC


# 5c04f73e 18-Mar-2019 Andrey V. Elsukov <ae@FreeBSD.org>

Add NAT64 CLAT implementation as defined in RFC6877.

CLAT is customer-side translator that algorithmically translates 1:1
private IPv4 addresses to global IPv6 addresses, and vice versa.
It is implemented as part of ipfw_nat64 kernel module. When module
is loaded or compiled into the kernel, it registers "nat64clat" external
action. External action named instance can be created using `create`
command and then used in ipfw rules. The create command accepts two
IPv6 prefixes `plat_prefix` and `clat_prefix`. If plat_prefix is ommitted,
IPv6 NAT64 Well-Known prefix 64:ff9b::/96 will be used.

# ipfw nat64clat CLAT create clat_prefix SRC_PFX plat_prefix DST_PFX
# ipfw add nat64clat CLAT ip4 from IPv4_PFX to any out
# ipfw add nat64clat CLAT ip6 from DST_PFX to SRC_PFX in

Obtained from: Yandex LLC
Submitted by: Boris N. Lytochkin
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC


# d66f9c86 04-Dec-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Add ability to request listing and deleting only for dynamic states.

This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but
after rules reloading some state must be deleted. Added new flag '-D'
for such purpose.

Retire '-e' flag, since there can not be expired states in the meaning
that this flag historically had.

Also add "verbose" mode for listing of dynamic states, it can be enabled
with '-v' flag and adds additional information to states list. This can
be useful for debugging.

Obtained from: Yandex LLC
MFC after: 2 months
Sponsored by: Yandex LLC


# 5786c6b9 20-Nov-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Make multiline APPLY_MASK() macro to be function-like.

Reported by: cem
MFC after: 1 week


# 094d6f8d 21-Oct-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Add IPFW_RULE_JUSTOPTS flag, that is used by ipfw(8) to mark rule,
that was added using "new rule format". And then, when the kernel
returns rule with this flag, ipfw(8) can correctly show it.

Reported by: lev
MFC after: 3 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D17373


# f7c4fdee 09-Jul-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Add "record-state", "set-limit" and "defer-action" rule options to ipfw.

"record-state" is similar to "keep-state", but it doesn't produce implicit
O_PROBE_STATE opcode in a rule. "set-limit" is like "limit", but it has the
same feature as "record-state", it is single opcode without implicit
O_PROBE_STATE opcode. "defer-action" is targeted to be used with dynamic
states. When rule with this opcode is matched, the rule's action will
not be executed, instead dynamic state will be created. And when this
state will be matched by "check-state", then rule action will be executed.
This allows create a more complicated rulesets.

Submitted by: lev
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D1776


# b99a6823 07-Feb-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Rework ipfw dynamic states implementation to be lockless on fast path.

o added struct ipfw_dyn_info that keeps all needed for ipfw_chk and
for dynamic states implementation information;
o added DYN_LOOKUP_NEEDED() macro that can be used to determine the
need of new lookup of dynamic states;
o ipfw_dyn_rule now becomes obsolete. Currently it used to pass
information from kernel to userland only.
o IPv4 and IPv6 states now described by different structures
dyn_ipv4_state and dyn_ipv6_state;
o IPv6 scope zones support is added;
o ipfw(4) now depends from Concurrency Kit;
o states are linked with "entry" field using CK_SLIST. This allows
lockless lookup and protected by mutex modifications.
o the "expired" SLIST field is used for states expiring.
o struct dyn_data is used to keep generic information for both IPv4
and IPv6;
o struct dyn_parent is used to keep O_LIMIT_PARENT information;
o IPv4 and IPv6 states are stored in different hash tables;
o O_LIMIT_PARENT states now are kept separately from O_LIMIT and
O_KEEP_STATE states;
o per-cpu dyn_hp pointers are used to implement hazard pointers and they
prevent freeing states that are locklessly used by lookup threads;
o mutexes to protect modification of lists in hash tables now kept in
separate arrays. 65535 limit to maximum number of hash buckets now
removed.
o Separate lookup and install functions added for IPv4 and IPv6 states
and for parent states.
o By default now is used Jenkinks hash function.

Obtained from: Yandex LLC
MFC after: 42 days
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D12685


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# 665c8a2e 26-Nov-2017 Michael Tuexen <tuexen@FreeBSD.org>

Add to ipfw support for sending an SCTP packet containing an ABORT chunk.
This is similar to the TCP case. where a TCP RST segment can be sent.

There is one limitation: When sending an ABORT in response to an incoming
packet, it should be tested if there is no ABORT chunk in the received
packet. Currently, it is only checked if the first chunk is an ABORT
chunk to avoid parsing the whole packet, which could result in a DOS attack.

Thanks to Timo Voelker for helping me to test this patch.
Reviewed by: bcr@ (man page part), ae@ (generic, non-SCTP part)
Differential Revision: https://reviews.freebsd.org/D13239


# 0d5af38c 18-Oct-2017 Michael Tuexen <tuexen@FreeBSD.org>

Revert change which got in accidently.


# 3ed8d364 18-Oct-2017 Michael Tuexen <tuexen@FreeBSD.org>

Fix a bug introduced in r324638.
Thanks to Felix Weinrank for making me aware of this.

MFC after: 3 days


# 11c56650 02-Apr-2017 Andrey V. Elsukov <ae@FreeBSD.org>

Add O_EXTERNAL_DATA opcode support.

This opcode can be used to attach some data to external action opcode.
And unlike to O_EXTERNAL_INSTANCE opcode, this opcode does not require
creating of named instance to pass configuration arguments to external
action handler. The data is coming just next to O_EXTERNAL_ACTION opcode.

The userlevel part currenly supports formatting for opcode with ipfw_insn
size, by default it expects u16 numeric value in the arg1.

Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC


# 57fb3b7a 13-Aug-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Add `stats reset` command implementation to NPTv6 module
to be able reset statistics counters.

Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# d8caf56e 13-Aug-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Add ipfw_nat64 module that implements stateless and stateful NAT64.

The module works together with ipfw(4) and implemented as its external
action module.

Stateless NAT64 registers external action with name nat64stl. This
keyword should be used to create NAT64 instance and to address this
instance in rules. Stateless NAT64 uses two lookup tables with mapped
IPv4->IPv6 and IPv6->IPv4 addresses to perform translation.

A configuration of instance should looks like this:
1. Create lookup tables:
# ipfw table T46 create type addr valtype ipv6
# ipfw table T64 create type addr valtype ipv4
2. Fill T46 and T64 tables.
3. Add rule to allow neighbor solicitation and advertisement:
# ipfw add allow icmp6 from any to any icmp6types 135,136
4. Create NAT64 instance:
# ipfw nat64stl NAT create table4 T46 table6 T64
5. Add rules that matches the traffic:
# ipfw add nat64stl NAT ip from any to table(T46)
# ipfw add nat64stl NAT ip from table(T64) to 64:ff9b::/96
6. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
via NAT64 host.

Stateful NAT64 registers external action with name nat64lsn. The only
one option required to create nat64lsn instance - prefix4. It defines
the pool of IPv4 addresses used for translation.

A configuration of instance should looks like this:
1. Add rule to allow neighbor solicitation and advertisement:
# ipfw add allow icmp6 from any to any icmp6types 135,136
2. Create NAT64 instance:
# ipfw nat64lsn NAT create prefix4 A.B.C.D/28
3. Add rules that matches the traffic:
# ipfw add nat64lsn NAT ip from any to A.B.C.D/28
# ipfw add nat64lsn NAT ip6 from any to 64:ff9b::/96
4. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
via NAT64 host.

Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D6434


# d6eb9b02 11-Aug-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Restore "nat global" support.

Now zero value of arg1 used to specify "tablearg", use the old "tablearg"
value for "nat global". Introduce new macro IP_FW_NAT44_GLOBAL to replace
hardcoded magic number to specify "nat global". Also replace 65535 magic
number with corresponding macro. Fix typo in comments.

PR: 211256
Tested by: Victor Chernov
MFC after: 3 days


# ed22e564 18-Jul-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Add named dynamic states support to ipfw(4).

The keep-state, limit and check-state now will have additional argument
flowname. This flowname will be assigned to dynamic rule by keep-state
or limit opcode. And then can be matched by check-state opcode or
O_PROBE_STATE internal opcode. To reduce possible breakage and to maximize
compatibility with old rulesets default flowname introduced.
It will be assigned to the rules when user has omitted state name in
keep-state and check-state opcodes. Also if name is ambiguous (can be
evaluated as rule opcode) it will be replaced to default.

Reviewed by: julian
Obtained from: Yandex LLC
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D6674


# b867e84e 18-Jul-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Add ipfw_nptv6 module that implements Network Prefix Translation for IPv6
as defined in RFC 6296. The module works together with ipfw(4) and
implemented as its external action module. When it is loaded, it registers
as eaction and can be used in rules. The usage pattern is similar to
ipfw_nat(4). All matched by rule traffic goes to the NPT module.

Reviewed by: hrs
Obtained from: Yandex LLC
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D6420


# 2685841b 17-May-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Make named objects set-aware. Now it is possible to create named
objects with the same name in different sets.

Add optional manage_sets() callback to objects rewriting framework.
It is intended to implement handler for moving and swapping named
object's sets. Add ipfw_obj_manage_sets() function that implements
generic sets handler. Use new callback to implement sets support for
lookup tables.
External actions objects are global and they don't support sets.
Modify eaction_findbyname() to reflect this.
ipfw(8) now may fail to move rules or sets, because some named objects
in target set may have conflicting names.
Note that ipfw_obj_ntlv type was changed, but since lookup tables
actually didn't support sets, this change is harmless.

Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# a4641f4e 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/net*: minor spelling fixes.

No functional change.


# 2acdf79f 14-Apr-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Add External Actions KPI to ipfw(9).

It allows implementing loadable kernel modules with new actions and
without needing to modify kernel headers and ipfw(8). The module
registers its action handler and keyword string, that will be used
as action name. Using generic syntax user can add rules with this
action. Also ipfw(8) can be easily modified to extend basic syntax
for external actions, that become a part base system.
Sample modules will coming soon.

Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# 5dc5a0e0 03-Nov-2015 Andrey V. Elsukov <ae@FreeBSD.org>

Implement `ipfw internal olist` command to list named objects.

Reviewed by: melifaro
Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# 74b22066 27-Apr-2015 Alexander V. Chernikov <melifaro@FreeBSD.org>

Make rule table kernel-index rewriting support any kind of objects.

Currently we have tables identified by their names in userland
with internal kernel-assigned indices. This works the following way:

When userland wishes to communicate with kernel to add or change rule(s),
it makes indexed sorted array of table names
(internally ipfw_obj_ntlv entries), and refer to indices in that
array in rule manipulation.
Prior to committing new rule to the ruleset kernel
a) finds all referenced tables, bump their refcounts and change
values inside the opcodes to be real kernel indices
b) auto-creates all referenced but not existing tables and then
do a) for them.

Kernel does almost the same when exporting rules to userland:
prepares array of used tables in all rules in range, and
prepends it before the actual ruleset retaining actual in-kernel
indexes for that.

There is also special translation layer for legacy clients which is
able to provide 'real' indices for table names (basically doing atoi()).

While it is arguable that every subsystem really needs names instead of
numbers, there are several things that should be noted:

1) every non-singleton subsystem needs to store its runtime state
somewhere inside ipfw chain (and be able to get it fast)
2) we can't assume object numbers provided by humans will be dense.

Existing nat implementation (O(n) access and LIST inside chain) is a
good example.

Hence the following:
* Convert table-centric rewrite code to be more generic, callback-based
* Move most of the code from ip_fw_table.c to ip_fw_sockopt.c
* Provide abstract API to permit subsystems convert their objects
between userland string identifier and in-kernel index.
(See struct opcode_obj_rewrite) for more details
* Create another per-chain index (in next commit) shared among all subsystems
* Convert current NAT44 implementation to use new API, O(1) lookups,
shared index and names instead of numbers (in next commit).

Sponsored by: Yandex LLC


# 2530ed9e 13-Mar-2015 Andrey V. Elsukov <ae@FreeBSD.org>

Fix `ipfw fwd tablearg'. Use dedicated field nh4 in struct table_value
to obtain IPv4 next hop address in tablearg case.

Add `fwd tablearg' support for IPv6. ipfw(8) uses INADDR_ANY as next hop
address in O_FORWARD_IP opcode for specifying tablearg case. For IPv6 we
still use this opcode, but when packet identified as IPv6 packet, we
obtain next hop address from dedicated field nh6 in struct table_value.

Replace hopstore field in struct ip_fw_args with anonymous union and add
hopstore6 field. Use this field to copy tablearg value for IPv6.

Replace spare1 field in struct table_value with zoneid. Use it to keep
scope zone id for link-local IPv6 addresses. Since spare1 was used
internally, replace spare0 array with two variables spare0 and spare1.

Use getaddrinfo(3)/getnameinfo(3) functions for parsing and formatting
IPv6 addresses in table_value. Use zoneid field in struct table_value
to store sin6_scope_id value.

Since the kernel still uses embedded scope zone id to represent
link-local addresses, convert next_hop6 address into this form before
return from pfil processing. This also fixes in6_localip() check
for link-local addresses.

Differential Revision: https://reviews.freebsd.org/D2015
Obtained from: Yandex LLC
Sponsored by: Yandex LLC


# 2930362f 13-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix matching default rule on clear/show commands.

Found by: Oleg Ginzburg


# be8bc457 08-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add IP_FW_DUMP_SOPTCODES sopt to be able to determine
which opcodes are currently available in kernel.


# d6164b77 07-Sep-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Make ipfw_nat module use IP_FW3 codes.

Kernel changes:
* Split kernel/userland nat structures eliminating IPFW_INTERNAL hack.
* Add IP_FW_NAT44_* codes resemblin old ones.
* Assume that instances can be named (no kernel support currently).
* Use both UH+WLOCK locks for all configuration changes.
* Provide full ABI support for old sockopts.

Userland changes:
* Use IP_FW_NAT44_* codes for nat operations.
* Remove undocumented ability to show ranges of nat "log" entries.


# 0cba2b28 31-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add support for multi-field values inside ipfw tables.
This is the last major change in given branch.

Kernel changes:
* Use 64-bytes structures to hold multi-value variables.
* Use shared array to hold values from all tables (assume
each table algo is capable of holding 32-byte variables).
* Add some placeholders to support per-table value arrays in future.
* Use simple eventhandler-style API to ease the process of adding new
table items. Currently table addition may required multiple UH drops/
acquires which is quite tricky due to atomic table modificatio/swap
support, shared array resize, etc. Deal with it by calling special
notifier capable of rolling back state before actually performing
swap/resize operations. Original operation then restarts itself after
acquiring UH lock.
* Bump all objhash users default values to at least 64
* Fix custom hashing inside objhash.

Userland changes:
* Add support for dumping shared value array via "vlist" internal cmd.
* Some small print/fill_flags dixes to support u32 values.
* valtype is now bitmask of
<skipto|pipe|fib|nat|dscp|tag|divert|netgraph|limit|ipv4|ipv6>.
New values can hold distinct values for each of this types.
* Provide special "legacy" type which assumes all values are the same.
* More helpers/docs following..

Some examples:

3:41 [1] zfscurr0# ipfw table mimimi create valtype skipto,limit,ipv4,ipv6
3:41 [1] zfscurr0# ipfw table mimimi info
+++ table(mimimi), set(0) +++
kindex: 2, type: addr
references: 0, valtype: skipto,limit,ipv4,ipv6
algorithm: addr:radix
items: 0, size: 296
3:42 [1] zfscurr0# ipfw table mimimi add 10.0.0.5 3000,10,10.0.0.1,2a02:978:2::1
added: 10.0.0.5/32 3000,10,10.0.0.1,2a02:978:2::1
3:42 [1] zfscurr0# ipfw table mimimi list
+++ table(mimimi), set(0) +++
10.0.0.5/32 3000,0,10.0.0.1,2a02:978:2::1


# 4bbd1577 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Make room for multi-type values in struct tentry.


# c21034b7 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Replace "cidr" table type with "addr" type.

Suggested by: luigi


# 18ad4197 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Fix displaying dynamic rules for large rulesets.
* Clean up some comments.


# 1940fa77 12-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Change tablearg value to be 0 (try #2).
Most of the tablearg-supported opcodes does not accept 0 as valid value:
O_TAG, O_TAGGED, O_PIPE, O_QUEUE, O_DIVERT, O_TEE, O_SKIPTO, O_CALLRET,
O_NETGRAPH, O_NGTEE, O_NAT treats 0 as invalid input.

The rest are O_SETDSCP and O_SETFIB.
'Fix' them by adding high-order bit (0x8000) set for non-tablearg values.
Do translation in kernel for old clients (import_rule0 / export_rule0),
teach current ipfw(8) binary to add/remove given bit.

This change does not affect handling SETDSCP values, but limit
O_SETFIB values to 32767 instead of 65k. Since currently we have either
old (16) or new (2^32) max fibs, this should not be a big deal:
we're definitely OK for former and have to add another opcode to deal
with latter, regardless of tablearg value.


# 4f43138a 11-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add the abilify to lock/unlock given table from changes.

Example:

# ipfw table si lock
# ipfw table si info
+++ table(si), set(0) +++
kindex: 0, type: cidr, locked
valtype: number, references: 0
algorithm: cidr:radix
items: 0, size: 288
# ipfw table si add 4.5.6.7
ignored: 4.5.6.7/32 0
ipfw: Adding record failed: table is locked
# ipfw table si unlock
# ipfw table si add 4.5.6.7
added: 4.5.6.7/32 0
# ipfw table si lock
# ipfw table si delete 4.5.6.7
ignored: 4.5.6.7/32 0
ipfw: Deleting record failed: table is locked
# ipfw table si unlock
# ipfw table si delete 4.5.6.7
deleted: 4.5.6.7/32 0


# 3a845e10 11-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add support for batched add/delete for ipfw tables
* Add support for atomic batches add (all or none).
* Fix panic on deleting non-existing entry in radix algo.

Examples:

# si is empty
# ipfw table si add 1.1.1.1/32 1111 2.2.2.2/32 2222
added: 1.1.1.1/32 1111
added: 2.2.2.2/32 2222
# ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 4444
exists: 2.2.2.2/32 2200
added: 4.4.4.4/32 4444
ipfw: Adding record failed: record already exists
^^^^^ Returns error but keeps inserted items
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444
# ipfw table si atomic add 3.3.3.3/32 3333 4.4.4.4/32 4400 5.5.5.5/32 5555
added(reverted): 3.3.3.3/32 3333
exists: 4.4.4.4/32 4400
ignored: 5.5.5.5/32 5555
ipfw: Adding record failed: record already exists
^^^^^ Returns error and reverts added records
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444


# 8bd19212 08-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Partially revert previous commit:
"0" value is perfectly valid for O_SETFIB and O_SETDSCP,
so tablearg remains to be 655535 for now.


# 2c452b20 08-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Switch tablearg value from 65535 to 0.
* Use u16 table kidx instead of integer on for iface opcode.
* Provide compability layer for old clients.


# adf3b2b9 08-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add IP_FW_TABLE_XMODIFY opcode
* Since there seems to be lack of consensus on strict value typing,
remove non-default value types. Use userland-only "value format type"
to print values.

Kernel changes:
* Add IP_FW_XMODIFY to permit table run-time modifications.
Currently we support changing limit and value format type.

Userland changes:
* Support IP_FW_XMODIFY opcode.
* Support specifying value format type (ftype) in tablble create/modify req
* Fine-print value type/value format type.


# 28ea4fa3 08-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Remove IP_FW_TABLES_XGETSIZE opcode.
It is superseded by IP_FW_TABLES_XLIST.


# a73d728d 07-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Kernel changes:
* Implement proper checks for switching between global and set-aware tables
* Split IP_FW_DEL mess into the following opcodes:
* IP_FW_XDEL (del rules matching pattern)
* IP_FW_XMOVE (move rules matching pattern to another set)
* IP_FW_SET_SWAP (swap between 2 sets)
* IP_FW_SET_MOVE (move one set to another one)
* IP_FW_SET_ENABLE (enable/disable sets)
* Add IP_FW_XZERO / IP_FW_XRESETLOG to finish IP_FW3 migration.
* Use unified ipfw_range_tlv as range description for all of the above.
* Check dynamic states IFF there was non-zero number of deleted dyn rules,
* Del relevant dynamic states with singe traversal instead of per-rule one.

Userland changes:
* Switch ipfw(8) to use new opcodes.


# 46d52008 03-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Implement atomic ipfw table swap.

Kernel changes:
* Add opcode IP_FW_TABLE_XSWAP
* Add support for swapping 2 tables with the same type/ftype/vtype.
* Make skipto cache init after ipfw locks init.

Userland changes:
* Add "table X swap Y" command.


# 5f379342 02-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Show algorithm-specific data in "table info" output.


# 4c0c07a5 01-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Permit limiting number of items in table.

Kernel changes:
* Add TEI_FLAGS_DONTADD entry flag to indicate that insert is not possible
* Support given flag in all algorithms
* Add "limit" field to ipfw_xtable_info
* Add actual limiting code into add_table_entry()

Userland changes:
* Add "limit" option as "create" table sub-option. Limit modification
is currently impossible.
* Print human-readable errors in table enry addition/deletion code.


# 914bffb6 31-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add new "flow" table type to support N=1..5-tuple lookups
* Add "flow:hash" algorithm

Kernel changes:
* Add O_IP_FLOW_LOOKUP opcode to support "flow" lookups
* Add IPFW_TABLE_FLOW table type
* Add "struct tflow_entry" as strage for 6-tuple flows
* Add "flow:hash" algorithm. Basically it is auto-growing chained hash table.
Additionally, we store mask of fields we need to compare in each instance/

* Increase ipfw_obj_tentry size by adding struct tflow_entry
* Add per-algorithm stat (ifpw_ta_tinfo) to ipfw_xtable_info
* Increase algoname length: 32 -> 64 (algo options passed there as string)
* Assume every table type can be customized by flags, use u8 to store "tflags" field.
* Simplify ipfw_find_table_entry() by providing @tentry directly to algo callback.
* Fix bug in cidr:chash resize procedure.

Userland changes:
* add "flow table(NAME)" syntax to support n-tuple checking tables.
* make fill_flags() separate function to ease working with _s_x arrays
* change "table info" output to reflect longer "type" fields

Syntax:
ipfw table fl2 create type flow:[src-ip][,proto][,src-port][,dst-ip][dst-port] [algo flow:hash]

Examples:

0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash
0:02 [2] zfscurr0# ipfw table fl2 info
+++ table(fl2), set(0) +++
kindex: 0, type: flow:src-ip,proto,dst-port
valtype: number, references: 0
algorithm: flow:hash
items: 0, size: 280
0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
0:02 [2] zfscurr0# ipfw table fl2 list
+++ table(fl2), set(0) +++
2a02:6b8::333,6,443 45000
10.0.0.92,6,80 22000
0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'
00200 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
0:03 [2] zfscurr0# ipfw show
00200 0 0 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 617 59416 allow ip from any to any
0:03 [2] zfscurr0# telnet -s 10.0.0.92 78.46.89.105 80
Trying 78.46.89.105...
..
0:04 [2] zfscurr0# ipfw show
00200 5 272 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 682 66733 allow ip from any to any


# b23d5de9 30-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add number:array algorithm lookup method.

Kernel changes:
* s/IPFW_TABLE_U32/IPFW_TABLE_NUMBER/
* Force "lookup <port|uid|gid|jid>" to be IPFW_TABLE_NUMBER
* Support "lookup" method for number tables
* Add number:array algorihm (i32 as key, auto-growing).

Userland changes:
* Support named tables in "lookup <tag> Table"
* Fix handling of "table(NAME,val)" case
* Support printing "number" table data.


# 9d099b4f 29-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Dump available table algorithms via "ipfw talist" cmd.

Kernel changes:
* Add type/refcount fields to table algo instances.
* Add IP_FW_TABLES_ALIST opcode to export available algorihms to userland.

Userland changes:
* Fix cores on empty input inside "ipfw table" handler.
* Add "ipfw talist" cmd to print availabled kernel algorithms.
* Change "table info" output to reflect long algorithm config lines.


# 68394ec8 28-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add generic ipfw interface tracking API
* Rewrite interface tables to use interface indexes

Kernel changes:
* Add generic interface tracking API:
- ipfw_iface_ref (must call unlocked, performs lazy init if needed, allocates
state & bumps ref)
- ipfw_iface_add_ntfy(UH_WLOCK+WLOCK, links comsumer & runs its callback to
update ifindex)
- ipfw_iface_del_ntfy(UH_WLOCK+WLOCK, unlinks consumer)
- ipfw_iface_unref(unlocked, drops reference)
Additionally, consumer callbacks are called in interface withdrawal/departure.

* Rewrite interface tables to use iface tracking API. Currently tables are
implemented the following way:
runtime data is stored as sorted array of {ifidx, val} for existing interfaces
full data is stored inside namedobj instance (chained hashed table).

* Add IP_FW_XIFLIST opcode to dump status of tracked interfaces

* Pass @chain ptr to most non-locked algorithm callbacks:
(prepare_add, prepare_del, flush_entry ..). This may be needed for better
interaction of given algorithm an other ipfw subsystems

* Add optional "change_ti" algorithm handler to permit updating of
cached table_info pointer (happens in case of table_max resize)

* Fix small bug in ipfw_list_tables()
* Add badd (insert into sorted array) and bdel (remove from sorted array) funcs

Userland changes:
* Add "iflist" cmd to print status of currently tracked interface
* Add stringnum_cmp for better interface/table names sorting


# 7e767c79 08-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Use different rule structures in kernel/userland.
* Switch kernel to use per-cpu counters for rules.
* Keep ABI/API.

Kernel changes:
* Each rules is now exported as TLV with optional extenable
counter block (ip_fW_bcounter for base one) and
ip_fw_rule for rule&cmd data.
* Counters needs to be explicitly requested by IPFW_CFG_GET_COUNTERS flag.
* Separate counters from rules in kernel and clean up ip_fw a bit.
* Pack each rule in IPFW_TLV_RULE_ENT tlv to ease parsing.
* Introduce versioning in container TLV (may be needed in future).
* Fix ipfw_cfg_lheader broken u64 alignment.

Userland changes:
* Use set_mask from cfg header when requesting config
* Fix incorrect read accouting in ipfw_show_config()
* Use IPFW_RULE_NOOPT flag instead of playing with _pad
* Fix "ipfw -d list": do not print counters for dynamic states
* Some small fixes


# 6447bae6 06-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Prepare to pass other dynamic states via ipfw_dump_config()

Kernel changes:
* Change dump format for dynamic states:
each state is now stored inside ipfw_obj_dyntlv
last dynamic state is indicated by IPFW_DF_LAST flag
* Do not perform sooptcopyout() for !SOPT_GET requests.

Userland changes:
* Introduce foreach_state() function handler to ease work
with different states passed by ipfw_dump_config().


# 81d3153d 06-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add "lookup" table functionality to permit userland entry lookups.
* Bump table dump format preserving old ABI.

Kernel size:
* Add IP_FW_TABLE_XFIND to handle "lookup" request from userland.
* Add ta_find_tentry() algorithm callbacks/handlers to support lookups.
* Fully switch to ipfw_obj_tentry for various table dumps:
algorithms are now required to support the latest (ipfw_obj_tentry) entry
dump format, the rest is handled by generic dump code.
IP_FW_TABLE_XLIST opcode version bumped (0 -> 1).
* Eliminate legacy ta_dump_entry algo handler:
dump_table_entry() converts data from current to legacy format.

Userland side:
* Add "lookup" table parameter.
* Change the way table type is guessed: call table_get_info() first,
and check value for IPv4/IPv6 type IFF table does not exist.
* Fix table_get_list(): do more tries if supplied buffer is not enough.
* Sparate table_show_entry() from table_show_list().


# ac35ff17 03-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fully switch to named tables:

Kernel changes:
* Introduce ipfw_obj_tentry table entry structure to force u64 alignment.
* Support "update-on-existing-key" "add" bahavior (TEI_FLAGS_UPDATED).
* Use "subtype" field to distingush between IPv4 and IPv6 table records
instead of previous hack.
* Add value type (vtype) field for kernel tables. Current types are
number,ip and dscp
* Fix sets mask retrieval for old binaries
* Fix crash while using interface tables

Userland changes:
* Switch ipfw_table_handler() to use named-only tables.
* Add "table NAME create [type {cidr|iface|u32} [valtype {number|ip|dscp}] ..."
* Switch ipfw_table_handler to match_token()-based parser.
* Switch ipfw_sets_handler to use new ipfw_get_config() for mask retrieval.
* Allow ipfw set X table ... syntax to permit using per-set table namespaces.


# 6c2997ff 29-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add new IP_FW_XADD opcode which permits to
a) specify table ids as names
b) add multiple rules at once.
Partially convert current code for atomic addition of multiple rules.


# 563b5ab1 28-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Suppord showing named tables in ipfw(8) rule listing.

Kernel changes:
* change base TLV header to be u64 (so size can be u32).
* Introduce ipfw_obj_ctlv generc container TLV.
* Add IP_FW_XGET opcode which is now used for atomic configuration
retrieval. One can specify needed configuration pieces to retrieve
via flags field. Currently supported are
IPFW_CFG_GET_STATIC (static rules) and
IPFW_CFG_GET_STATES (dynamic states).
Other configuration pieces (tables, pipes, etc..) support is planned.

Userland changes:
* Switch ipfw(8) to use new IP_FW_XGET for rule listing.
* Split rule listing code get and show pieces.
* Make several steps forward towards libipfw:
permit printing states and rules(paritally) to supplied buffer.
do not die on malloc/kernel failure inside given printing functions.
stop assuming cmdline_opts is global symbol.


# 9490a627 16-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add IP_FW_TABLE_XCREATE / IP_FW_TABLE_XMODIFY opcodes.
* Add 'algoname' string to ipfw_xtable_info permitting to specify lookup
algoritm with parameters.
* Rework part of ipfw_rewrite_table_uidx()

Sponsored by: Yandex LLC


# d3a4f924 15-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Simplify opcode handling.

* Use one u16 from op3 header to implement opcode versioning.
* IP_FW_TABLE_XLIST has now 2 handlers, for ver.0 (old) and ver.1 (current).
* Every getsockopt request is now handled in ip_fw_table.c
* Rename new opcodes:
IP_FW_OBJ_DEL -> IP_FW_TABLE_XDESTROY
IP_FW_OBJ_LISTSIZE -> IP_FW_TABLES_XGETSIZE
IP_FW_OBJ_LIST -> IP_FW_TABLES_XLIST
IP_FW_OBJ_INFO -> IP_FW_TABLE_XINFO
IP_FW_OBJ_INFO -> IP_FW_TABLE_XFLUSH

* Add some docs about using given opcodes.
* Group some legacy opcode/handlers.


# f1220db8 14-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move further to eliminate next pieces of number-assuming code inside tables.

Kernel changes:
* Add IP_FW_OBJ_FLUSH opcode (flush table based on its name/set)
* Add IP_FW_OBJ_DUMP opcode (dumps table data based on its names/set)
* Add IP_FW_OBJ_LISTSIZE / IP_FW_OBJ_LIST opcodes (get list of kernel tables)

Userland changes:
* move tables code to separate tables.c file
* get rid of tables_max
* switch "all"/list handling to new opcodes


# 9f7d47b0 14-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add API to ease adding new algorithms/new tabletypes to ipfw.

Kernel-side changelog:
* Split general tables code and algorithm-specific table data.
Current algorithms (IPv4/IPv6 radix and interface tables radix) moved to
new ip_fw_table_algo.c file.
Tables code now supports any algorithm implementing the following callbacks:
+struct table_algo {
+ char name[64];
+ int idx;
+ ta_init *init;
+ ta_destroy *destroy;
+ table_lookup_t *lookup;
+ ta_prepare_add *prepare_add;
+ ta_prepare_del *prepare_del;
+ ta_add *add;
+ ta_del *del;
+ ta_flush_entry *flush_entry;
+ ta_foreach *foreach;
+ ta_dump_entry *dump_entry;
+ ta_dump_xentry *dump_xentry;
+};

* Change ->state, ->xstate, ->tabletype fields of ip_fw_chain to
->tablestate pointer (array of 32 bytes structures necessary for
runtime lookups (can be probably shrinked to 16 bytes later):

+struct table_info {
+ table_lookup_t *lookup; /* Lookup function */
+ void *state; /* Lookup radix/other structure */
+ void *xstate; /* eXtended state */
+ u_long data; /* Hints for given func */
+};

* Add count method for namedobj instance to ease size calculations
* Bump ip_fw3 buffer in ipfw_clt 128->256 bytes.
* Improve bitmask resizing on tables_max change.
* Remove table numbers checking from most places.
* Fix wrong nesting in ipfw_rewrite_table_uidx().

* Add IP_FW_OBJ_LIST opcode (list all objects of given type, currently
implemented for IPFW_OBJTYPE_TABLE).
* Add IP_FW_OBJ_LISTSIZE (get buffer size to hold IP_FW_OBJ_LIST data,
currenly implemented for IPFW_OBJTYPE_TABLE).
* Add IP_FW_OBJ_INFO (requests info for one object of given type).

Some name changes:
s/ipfw_xtable_tlv/ipfw_obj_tlv/ (no table specifics)
s/ipfw_xtable_ntlv/ipfw_obj_ntlv/ (no table specifics)

Userland changes:
* Add do_set3() cmd to ipfw2 to ease dealing with op3-embeded opcodes.
* Add/improve support for destroy/info cmds.


# b074b7bb 12-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Make ipfw tables use names as used-level identifier internally:

* Add namedobject set-aware api capable of searching/allocation objects by their name/idx.
* Switch tables code to use string ids for configuration tasks.
* Change locking model: most configuration changes are protected with UH lock, runtime-visible are protected with both locks.
* Reduce number of arguments passed to ipfw_table_add/del by using separate structure.
* Add internal V_fw_tables_sets tunable (set to 0) to prepare for set-aware tables (requires opcodes/client support)
* Implement typed table referencing (and tables are implicitly allocated with all state like radix ptrs on reference)
* Add "destroy" ipfw(8) using new IP_FW_DELOBJ opcode

Namedobj more detailed:
* Blackbox api providing methods to add/del/search/enumerate objects
* Statically-sized hashes for names/indexes
* Per-set bitmask to indicate free indexes
* Separate methods for index alloc/delete/resize

Basically, there should not be any user-visible changes except the following:
* reducing table_max is not supported
* flush & add change table type won't work if table is referenced

Sponsored by: Yandex LLC


# c3015737 17-May-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix wrong formatting of 0.0.0.0/X table records in ipfw(8).

Add `flags` u16 field to the hole in ipfw_table_xentry structure.
Kernel has been guessing address family for supplied record based
on xent length size.
Userland, however, has been getting fixed-size ipfw_table_xentry structures
guessing address family by checking address by IN6_IS_ADDR_V4COMPAT().

Fix this behavior by providing specific IPFW_TCF_INET flag for IPv4 records.

PR: bin/189471
Submitted by: Dennis Yusupoff <dyr@smartspb.net>
MFC after: 2 weeks


# ae01d73c 20-Mar-2013 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add ipfw support for setting/matching DiffServ codepoints (DSCP).

Setting DSCP support is done via O_SETDSCP which works for both
IPv4 and IPv6 packets. Fast checksum recalculation (RFC 1624) is done for IPv4.
Dscp can be specified by name (AFXY, CSX, BE, EF), by value
(0..63) or via tablearg.

Matching DSCP is done via another opcode (O_DSCP) which accepts several
classes at once (af11,af22,be). Classes are stored in bitmask (2 u32 words).

Many people made their variants of this patch, the ones I'm aware of are
(in alphabetic order):

Dmitrii Tejblum
Marcelo Araujo
Roman Bogorodskiy (novel)
Sergey Matveichuk (sem)
Sergey Ryabin

PR: kern/102471, kern/121122
MFC after: 2 weeks


# bdf942c3 03-May-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

Revert r234834 per luigi@ request.

Cleaner solution (e.g. adding another header) should be done here.

Original log:
Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Requested by: luigi
Approved by: kib(mentor)


# 7bd5e9b1 30-Apr-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Approved by: ae(mentor)
MFC after: 2 weeks


# 732d27b3 25-Mar-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

- Permit number of ipfw tables to be changed in runtime.

net.inet.ip.fw.tables_max is now read-write.

- Bump IPFW_TABLES_MAX to 65535
Default number of tables is still 128

- Remove IPFW_TABLES_MAX from ipfw(8) code.

Sponsored by Yandex LLC

Approved by: kib(mentor)

MFC after: 2 weeks


# f8bee51a 12-Mar-2012 Alexander V. Chernikov <melifaro@FreeBSD.org>

- Add ipfw eXtended tables permitting radix to be used for any kind of keys.
- Add support for IPv6 and interface extended tables
- Make number of tables to be loader tunable in range 0..65534.
- Use IP_FW3 opcode for all new extended table cmds

No ABI changes are introduced. Old userland will see valid tables for
IPv4 tables and no entries otherwise. Flush works for any table.

IP_FW3 socket option is used to encapsulate all new opcodes:
/* IP_FW3 header/opcodes */
typedef struct _ip_fw3_opheader {
uint16_t opcode; /* Operation opcode */
uint16_t reserved[3]; /* Align to 64-bit boundary */
} ip_fw3_opheader;

New opcodes added:
IP_FW_TABLE_XADD, IP_FW_TABLE_XDEL, IP_FW_TABLE_XGETSIZE, IP_FW_TABLE_XLIST

ipfw(8) table argument parsing behavior is changed:
'ipfw table 999 add host' now assumes 'host' to be interface name instead of
hostname.

New tunable:
net.inet.ip.fw.tables_max controls number of table supported by ipfw in given
VNET instance. 128 is still the default value.

New syntax:
ipfw add skipto tablearg ip from any to any via table(42) in
ipfw add skipto tablearg ip from any to any via table(4242) out

This is a bit hackish, special interface name '\1' is used to signal interface
table number is passed in p.glob field.

Sponsored by Yandex LLC

Reviewed by: ae
Approved by: ae (mentor)

MFC after: 4 weeks


# 8a006adb 20-Aug-2011 Bjoern A. Zeeb <bz@FreeBSD.org>

Add support for IPv6 to ipfw fwd:
Distinguish IPv4 and IPv6 addresses and optional port numbers in
user space to set the option for the correct protocol family.
Add support in the kernel for carrying the new IPv6 destination
address and port.
Add support to TCP and UDP for IPv6 and fix UDP IPv4 to not change
the address in the IP header.
Add support for IPv6 forwarding to a non-local destination.
Add a regession test uitilizing VIMAGE to check all 20 possible
combinations I could think of.

Obtained from: David Dolson at Sandvine Incorporated
(original version for ipfw fwd IPv6 support)
Sponsored by: Sandvine Incorporated
PR: bin/117214
MFC after: 4 weeks
Approved by: re (kib)


# 9527ec6e 29-Jun-2011 Andrey V. Elsukov <ae@FreeBSD.org>

Add new rule actions "call" and "return" to ipfw. They make
possible to organize subroutines with rules.

The "call" action saves the current rule number in the internal
stack and rules processing continues from the first rule with
specified number (similar to skipto action). If later a rule with
"return" action is encountered, the processing returns to the first
rule with number of "call" rule saved in the stack plus one or higher.

Submitted by: Vadim Goncharov
Discussed by: ipfw@, luigi@


# 9d0a2ddf 19-Apr-2011 Gleb Smirnoff <glebius@FreeBSD.org>

- Rewrite functions that copyin/out NAT configuration, so that they
calculate required memory size dynamically.
- Fix races on chain re-lock.
- Introduce new field to ip_fw_chain - generation count. Now utilized
only in the NAT configuration, but can be utilized wider in ipfw.
- Get rid of NAT_BUF_LEN in ip_fw.h

PR: kern/143653


# ae99fd0e 12-Nov-2010 Luigi Rizzo <luigi@FreeBSD.org>

The first customer of the SO_USER_COOKIE option:
the "sockarg" ipfw option matches packets associated to
a local socket and with a non-zero so_user_cookie value.
The value is made available as tablearg, so it can be used
as a skipto target or pipe number in ipfw/dummynet rules.

Code by Paul Joe, manpage by me.

Submitted by: Paul Joe
MFC after: 1 week


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 8018e843 23-Mar-2010 Luigi Rizzo <luigi@FreeBSD.org>

MFC of a large number of ipfw and dummynet fixes and enhancements
done in CURRENT over the last 4 months.
HEAD and RELENG_8 are almost in sync now for ipfw, dummynet
the pfil hooks and related components.

Among the most noticeable changes:
- r200855 more efficient lookup of skipto rules, and remove O(N)
blocks from critical sections in the kernel;
- r204591 large restructuring of the dummynet module, with support
for multiple scheduling algorithms (4 available so far)
See the original commit logs for details.

Changes in the kernel/userland ABI should be harmless because the
kernel is able to understand previous requests from RELENG_8 and
RELENG_7. For this reason, this changeset would be applicable
to RELENG_7 as well, but i am not sure if it is worthwhile.


# f9f7bde3 15-Mar-2010 Luigi Rizzo <luigi@FreeBSD.org>

+ implement (two lines) the kernel side of 'lookup dscp N' to use the
dscp as a search key in table lookups;

+ (re)implement a sysctl variable to control the expire frequency of
pipes and queues when they become empty;

+ add 'queue number' as optional part of the flow_id. This can be
enabled with the command

queue X config mask queue ...

and makes it possible to support priority-based schedulers, where
packets should be grouped according to the priority and not some
fields in the 5-tuple.
This is implemented as follows:
- redefine a field in the ipfw_flow_id (in sys/netinet/ip_fw.h) but
without changing the size or shape of the structure, so there are
no ABI changes. On passing, also document how other fields are
used, and remove some useless assignments in ip_fw2.c

- implement small changes in the userland code to set/read the field;

- revise the functions in ip_dummynet.c to manipulate masks so they
also handle the additional field;

There are no ABI changes in this commit.


# cc4d3c30 02-Mar-2010 Luigi Rizzo <luigi@FreeBSD.org>

Bring in the most recent version of ipfw and dummynet, developed
and tested over the past two months in the ipfw3-head branch. This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.

The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.

In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.

Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.

Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.

Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.

CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.


# de240d10 22-Dec-2009 Luigi Rizzo <luigi@FreeBSD.org>

merge code from ipfw3-head to reduce contention on the ipfw lock
and remove all O(N) sequences from kernel critical sections in ipfw.

In detail:

1. introduce a IPFW_UH_LOCK to arbitrate requests from
the upper half of the kernel. Some things, such as 'ipfw show',
can be done holding this lock in read mode, whereas insert and
delete require IPFW_UH_WLOCK.

2. introduce a mapping structure to keep rules together. This replaces
the 'next' chain currently used in ipfw rules. At the moment
the map is a simple array (sorted by rule number and then rule_id),
so we can find a rule quickly instead of having to scan the list.
This reduces many expensive lookups from O(N) to O(log N).

3. when an expensive operation (such as insert or delete) is done
by userland, we grab IPFW_UH_WLOCK, create a new copy of the map
without blocking the bottom half of the kernel, then acquire
IPFW_WLOCK and quickly update pointers to the map and related info.
After dropping IPFW_LOCK we can then continue the cleanup protected
by IPFW_UH_LOCK. So userland still costs O(N) but the kernel side
is only blocked for O(1).

4. do not pass pointers to rules through dummynet, netgraph, divert etc,
but rather pass a <slot, chain_id, rulenum, rule_id> tuple.
We validate the slot index (in the array of #2) with chain_id,
and if successful do a O(1) dereference; otherwise, we can find
the rule in O(log N) through <rulenum, rule_id>

All the above does not change the userland/kernel ABI, though there
are some disgusting casts between pointers and uint32_t

Operation costs now are as follows:

Function Old Now Planned
-------------------------------------------------------------------
+ skipto X, non cached O(N) O(log N)
+ skipto X, cached O(1) O(1)
XXX dynamic rule lookup O(1) O(log N) O(1)
+ skipto tablearg O(N) O(1)
+ reinject, non cached O(N) O(log N)
+ reinject, cached O(1) O(1)
+ kernel blocked during setsockopt() O(N) O(1)
-------------------------------------------------------------------

The only (very small) regression is on dynamic rule lookup and this will
be fixed in a day or two, without changing the userland/kernel ABI

Supported by: Valeria Paoli
MFC after: 1 month


# 70228fb3 15-Dec-2009 Luigi Rizzo <luigi@FreeBSD.org>

Start splitting ip_fw2.c and ip_fw.h into smaller components.
At this time we pull out from ip_fw2.c the logging functions, and
support for dynamic rules, and move kernel-only stuff into
netinet/ipfw/ip_fw_private.h

No ABI change involved in this commit, unless I made some mistake.
ip_fw.h has changed, though not in the userland-visible part.

Files touched by this commit:

conf/files
now references the two new source files

netinet/ip_fw.h
remove kernel-only definitions gone into netinet/ipfw/ip_fw_private.h.

netinet/ipfw/ip_fw_private.h
new file with kernel-specific ipfw definitions

netinet/ipfw/ip_fw_log.c
ipfw_log and related functions

netinet/ipfw/ip_fw_dynamic.c
code related to dynamic rules

netinet/ipfw/ip_fw2.c
removed the pieces that goes in the new files

netinet/ipfw/ip_fw_nat.c
minor rearrangement to remove LOOKUP_NAT from the
main headers. This require a new function pointer.

A bunch of other kernel files that included netinet/ip_fw.h now
require netinet/ipfw/ip_fw_private.h as well.
Not 100% sure i caught all of them.

MFC after: 1 month


# 3cdcbc48 04-Dec-2009 Luigi Rizzo <luigi@FreeBSD.org>

some simple MFC:

r200020:
change the type of the opcode from enum *:8 to u_int8_t
so the size and alignment of the ipfw_insn is not compiler dependent.
No changes in the code generated by gcc.

r200023:
Add new sockopt names for ipfw and dummynet.

This commit is just grabbing entries for the new names
that will be used in the future, so you don't need to
rebuild anything now.

r200034
Dispatch sockopt calls to ipfw and dummynet
using the new option numbers, IP_FW3 and IP_DUMMYNET3.
Right now the modules return an error if called with those arguments
so there is no danger of unwanted behaviour.

r200040
- initialize src_ip in the main loop to prevent a compiler warning
(gcc 4.x under linux, not sure how real is the complaint).
- rename a macro argument to prevent name clashes.
- add the macro name on a couple of #endif
- add a blank line for readability.


# 9565806f 02-Dec-2009 Luigi Rizzo <luigi@FreeBSD.org>

change the type of the opcode from enum *:8 to u_int8_t
so the size and alignment of the ipfw_insn is not compiler dependent.
No changes in the code generated by gcc.

There was only one instance of this kind in our entire source tree,
so i suspect the old definition was a poor choice (which i made).

MFC after: 3 days


# f8f0b704 21-Aug-2009 Julian Elischer <julian@FreeBSD.org>

MFC r196423
Fix ipfw's initialization functions to get the correct order of evaluation
to allow vnet and non vnet operation. Move some functions from ip_fw_pfil.c
to ip_fw2.c and mode to mostly using the SYSINIT and VNET_SYSINIT handlers
instead of the modevent handler. Correct some spelling errors in comments
in the affected code. Note this bug fixes a crash in NON VIMAGE kernels when
ipfw is unloaded.

This patch is a minimal patch for 8.0
I have a much larger patch that actually fixes the underlying problems
that will be applied after 8.0

Reviewed by: zec@, rwatson@, bz@(earlier version)
Approved by: re (rwatson)


# c4b21cbe 21-Aug-2009 Julian Elischer <julian@FreeBSD.org>

Fix ipfw's initialization functions to get the correct order of evaluation
to allow vnet and non vnet operation. Move some functions from ip_fw_pfil.c
to ip_fw2.c and mode to mostly using the SYSINIT and VNET_SYSINIT handlers
instead of the modevent handler. Correct some spelling errors in comments
in the affected code. Note this bug fixes a crash in NON VIMAGE kernels when
ipfw is unloaded.

This patch is a minimal patch for 8.0
I have a much larger patch that actually fixes the underlying problems
that will be applied after 8.0

Reviewed by: zec@, rwatson@, bz@(earlier version)
Approved by: re (rwatson)
MFC after: Immediatly


# 1e77c105 16-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Remove unused VNET_SET() and related macros; only VNET_GET() is
ever actually used. Rename VNET_GET() to VNET() to shorten
variable references.

Discussed with: bz, julian
Reviewed by: bz
Approved by: re (kensmith, kib)


# eddfbb76 14-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# dda10d62 09-Jun-2009 Oleg Bulyzhin <oleg@FreeBSD.org>

Close long existed race with net.inet.ip.fw.one_pass = 0:
If packet leaves ipfw to other kernel subsystem (dummynet, netgraph, etc)
it carries pointer to matching ipfw rule. If this packet then reinjected back
to ipfw, ruleset processing starts from that rule. If rule was deleted
meanwhile, due to existed race condition panic was possible (as well as
other odd effects like parsing rules in 'reap list').

P.S. this commit changes ABI so userland ipfw related binaries should be
recompiled.

MFC after: 1 month
Tested by: Mikolaj Golub


# b87ce554 05-Jun-2009 Luigi Rizzo <luigi@FreeBSD.org>

Several ipfw options and actions use a 16-bit argument to indicate
pipes, queues, tags, rule numbers and so on.
These are all different namespaces, and the only thing they have in
common is the fact they use a 16-bit slot to represent the argument.

There is some confusion in the code, mostly for historical reasons,
on how the values 0 and 65535 should be used. At the moment, 0 is
forbidden almost everywhere, while 65535 is used to represent a
'tablearg' argument, i.e. the result of the most recent table() lookup.

For now, try to use explicit constants for the min and max allowed
values, and do not overload the default rule number for that.

Also, make the MTAG_IPFW declaration only visible to the kernel.

NOTE: I think the issue needs to be revisited before 8.0 is out:
the 2^16 namespace limit for rule numbers and pipe/queue is
annoying, and we can easily bump the limit to 2^32 which gives
a lot more flexibility in partitioning the namespace.

MFC after: 5 days


# 115a40c7 05-Jun-2009 Luigi Rizzo <luigi@FreeBSD.org>

More cleanup in preparation of ipfw relocation (no actual code change):

+ move ipfw and dummynet hooks declarations to raw_ip.c (definitions
in ip_var.h) same as for most other global variables.
This removes some dependencies from ip_input.c;

+ remove the IPFW_LOADED macro, just test ip_fw_chk_ptr directly;

+ remove the DUMMYNET_LOADED macro, just test ip_dn_io_ptr directly;

+ move ip_dn_ruledel_ptr to ip_fw2.c which is the only file using it;

To be merged together with rev 193497

MFC after: 5 days


# 5f416f8e 02-May-2009 Marko Zec <zec@FreeBSD.org>

Make indentation more uniform accross vnet container structs.

This is a purely cosmetic / NOP change.

Reviewed by: bz
Approved by: julian (mentor)
Verified by: svn diff -x -w producing no output


# f6dfe47a 30-Apr-2009 Marko Zec <zec@FreeBSD.org>

Permit buiding kernels with options VIMAGE, restricted to only a single
active network stack instance. Turning on options VIMAGE at compile
time yields the following changes relative to default kernel build:

1) V_ accessor macros for virtualized variables resolve to structure
fields via base pointers, instead of being resolved as fields in global
structs or plain global variables. As an example, V_ifnet becomes:

options VIMAGE: ((struct vnet_net *) vnet_net)->_ifnet
default build: vnet_net_0._ifnet
options VIMAGE_GLOBALS: ifnet

2) INIT_VNET_* macros will declare and set up base pointers to be used
by V_ accessor macros, instead of resolving to whitespace:

INIT_VNET_NET(ifp->if_vnet); becomes

struct vnet_net *vnet_net = (ifp->if_vnet)->mod_data[VNET_MOD_NET];

3) Memory for vnet modules registered via vnet_mod_register() is now
allocated at run time in sys/kern/kern_vimage.c, instead of per vnet
module structs being declared as globals. If required, vnet modules
can now request the framework to provide them with allocated bzeroed
memory by filling in the vmi_size field in their vmi_modinfo structures.

4) structs socket, ifnet, inpcbinfo, tcpcb and syncache_head are
extended to hold a pointer to the parent vnet. options VIMAGE builds
will fill in those fields as required.

5) curvnet is introduced as a new global variable in options VIMAGE
builds, always pointing to the default and only struct vnet.

6) struct sysctl_oid has been extended with additional two fields to
store major and minor virtualization module identifiers, oid_v_subs and
oid_v_mod. SYSCTL_V_* family of macros will fill in those fields
accordingly, and store the offset in the appropriate vnet container
struct in oid_arg1.
In sysctl handlers dealing with virtualized sysctls, the
SYSCTL_RESOLVE_V_ARG1() macro will compute the address of the target
variable and make it available in arg1 variable for further processing.

Unused fields in structs vnet_inet, vnet_inet6 and vnet_ipfw have
been deleted.

Reviewed by: bz, rwatson
Approved by: julian (mentor)


# 1ed81b73 06-Apr-2009 Marko Zec <zec@FreeBSD.org>

First pass at separating per-vnet initializer functions
from existing functions for initializing global state.

At this stage, the new per-vnet initializer functions are
directly called from the existing global initialization code,
which should in most cases result in compiler inlining those
new functions, hence yielding a near-zero functional change.

Modify the existing initializer functions which are invoked via
protosw, like ip_init() et. al., to allow them to be invoked
multiple times, i.e. per each vnet. Global state, if any,
is initialized only if such functions are called within the
context of vnet0, which will be determined via the
IS_DEFAULT_VNET(curvnet) check (currently always true).

While here, V_irtualize a few remaining global UMA zones
used by net/netinet/netipsec networking code. While it is
not yet clear to me or anybody else whether this is the right
thing to do, at this stage this makes the code more readable,
and makes it easier to track uncollected UMA-zone-backed
objects on vnet removal. In the long run, it's quite possible
that some form of shared use of UMA zone pools among multiple
vnets should be considered.

Bump __FreeBSD_version due to changes in layout of structs
vnet_ipfw, vnet_inet and vnet_net.

Approved by: julian (mentor)


# eb2e4119 01-Apr-2009 Paolo Pisati <piso@FreeBSD.org>

Implement an ipfw action to reassemble ip packets: reass.


# 0906f40f 02-Mar-2009 Luigi Rizzo <luigi@FreeBSD.org>

fw_debug has been unused for ages, so remove it from the list
of sysctl_variables.
I would also remove it from the VNET record but I am unsure if
there is any ABI issue -- so for the time being just mark it as
unused in ip_fw.h, and then we will collect the garbage at some
appropriate time in the future.

MFC after: 3 days


# 35b78b75 16-Feb-2009 Luigi Rizzo <luigi@FreeBSD.org>

remove dependency on eventhandler.h, we only need a forward declaration


# 1b193af6 13-Dec-2008 Bjoern A. Zeeb <bz@FreeBSD.org>

Second round of putting global variables, which were virtualized
but formerly missed under VIMAGE_GLOBAL.

Put the extern declarations of the virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.

Sponsored by: The FreeBSD Foundation


# 385195c0 10-Dec-2008 Marko Zec <zec@FreeBSD.org>

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 1f6ef666 10-Oct-2008 Robert Watson <rwatson@FreeBSD.org>

Fix content and spelling of comment on _ipfw_insn.len -- a count of
32-bit words, not 32-byte words.

MFC after: 3 days


# 8b615593 02-Oct-2008 Marko Zec <zec@FreeBSD.org>

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# f7b5554e 21-Sep-2008 Roman Kurakin <rik@FreeBSD.org>

Export IPFW_TABLES_MAX value for compiled in defaults.


# eb29d14c 14-Sep-2008 Roman Kurakin <rik@FreeBSD.org>

Make the commet for the default rule number more clear.

Submitted by: yar@


# 8191aa7c 06-Sep-2008 Roman Kurakin <rik@FreeBSD.org>

Export the IPFW_DEFAULT_RULE outside ip_fw2.c. This number in not only
the default rule number but also the maximum rule number. User space
software such as ipfw and natd should be aware of its value. The
software that already includes ip_fw.h should use the defined value. All
other a expected to use sysctl (as discussed on net@).

MFC after: 5 days.
Discussed on: net@


# 8b07e49a 09-May-2008 Julian Elischer <julian@FreeBSD.org>

Add code to allow the system to handle multiple routing tables.
This particular implementation is designed to be fully backwards compatible
and to be MFC-able to 7.x (and 6.x)

Currently the only protocol that can make use of the multiple tables is IPv4
Similar functionality exists in OpenBSD and Linux.

From my notes:

-----

One thing where FreeBSD has been falling behind, and which by chance I
have some time to work on is "policy based routing", which allows
different
packet streams to be routed by more than just the destination address.

Constraints:
------------

I want to make some form of this available in the 6.x tree
(and by extension 7.x) , but FreeBSD in general needs it so I might as
well do it in -current and back port the portions I need.

One of the ways that this can be done is to have the ability to
instantiate multiple kernel routing tables (which I will now
refer to as "Forwarding Information Bases" or "FIBs" for political
correctness reasons). Which FIB a particular packet uses to make
the next hop decision can be decided by a number of mechanisms.
The policies these mechanisms implement are the "Policies" referred
to in "Policy based routing".

One of the constraints I have if I try to back port this work to
6.x is that it must be implemented as a EXTENSION to the existing
ABIs in 6.x so that third party applications do not need to be
recompiled in timespan of the branch.

This first version will not have some of the bells and whistles that
will come with later versions. It will, for example, be limited to 16
tables in the first commit.
Implementation method, Compatible version. (part 1)
-------------------------------
For this reason I have implemented a "sufficient subset" of a
multiple routing table solution in Perforce, and back-ported it
to 6.x. (also in Perforce though not always caught up with what I
have done in -current/P4). The subset allows a number of FIBs
to be defined at compile time (8 is sufficient for my purposes in 6.x)
and implements the changes needed to allow IPV4 to use them. I have not
done the changes for ipv6 simply because I do not need it, and I do not
have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.

Other protocol families are left untouched and should there be
users with proprietary protocol families, they should continue to work
and be oblivious to the existence of the extra FIBs.

To understand how this is done, one must know that the current FIB
code starts everything off with a single dimensional array of
pointers to FIB head structures (One per protocol family), each of
which in turn points to the trie of routes available to that family.

The basic change in the ABI compatible version of the change is to
extent that array to be a 2 dimensional array, so that
instead of protocol family X looking at rt_tables[X] for the
table it needs, it looks at rt_tables[Y][X] when for all
protocol families except ipv4 Y is always 0.
Code that is unaware of the change always just sees the first row
of the table, which of course looks just like the one dimensional
array that existed before.

The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
are all maintained, but refer only to the first row of the array,
so that existing callers in proprietary protocols can continue to
do the "right thing".
Some new entry points are added, for the exclusive use of ipv4 code
called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
which have an extra argument which refers the code to the correct row.

In addition, there are some new entry points (currently called
rtalloc_fib() and friends) that check the Address family being
looked up and call either rtalloc() (and friends) if the protocol
is not IPv4 forcing the action to row 0 or to the appropriate row
if it IS IPv4 (and that info is available). These are for calling
from code that is not specific to any particular protocol. The way
these are implemented would change in the non ABI preserving code
to be added later.

One feature of the first version of the code is that for ipv4,
the interface routes show up automatically on all the FIBs, so
that no matter what FIB you select you always have the basic
direct attached hosts available to you. (rtinit() does this
automatically).

You CAN delete an interface route from one FIB should you want
to but by default it's there. ARP information is also available
in each FIB. It's assumed that the same machine would have the
same MAC address, regardless of which FIB you are using to get
to it.

This brings us as to how the correct FIB is selected for an outgoing
IPV4 packet.

Firstly, all packets have a FIB associated with them. if nothing
has been done to change it, it will be FIB 0. The FIB is changed
in the following ways.

Packets fall into one of a number of classes.

1/ locally generated packets, coming from a socket/PCB.
Such packets select a FIB from a number associated with the
socket/PCB. This in turn is inherited from the process,
but can be changed by a socket option. The process in turn
inherits it on fork. I have written a utility call setfib
that acts a bit like nice..

setfib -3 ping target.example.com # will use fib 3 for ping.

It is an obvious extension to make it a property of a jail
but I have not done so. It can be achieved by combining the setfib and
jail commands.

2/ packets received on an interface for forwarding.
By default these packets would use table 0,
(or possibly a number settable in a sysctl(not yet)).
but prior to routing the firewall can inspect them (see below).
(possibly in the future you may be able to associate a FIB
with packets received on an interface.. An ifconfig arg, but not yet.)

3/ packets inspected by a packet classifier, which can arbitrarily
associate a fib with it on a packet by packet basis.
A fib assigned to a packet by a packet classifier
(such as ipfw) would over-ride a fib associated by
a more default source. (such as cases 1 or 2).

4/ a tcp listen socket associated with a fib will generate
accept sockets that are associated with that same fib.

5/ Packets generated in response to some other packet (e.g. reset
or icmp packets). These should use the FIB associated with the
packet being reponded to.

6/ Packets generated during encapsulation.
gif, tun and other tunnel interfaces will encapsulate using the FIB
that was in effect withthe proces that set up the tunnel.
thus setfib 1 ifconfig gif0 [tunnel instructions]
will set the fib for the tunnel to use to be fib 1.

Routing messages would be associated with their
process, and thus select one FIB or another.
messages from the kernel would be associated with the fib they
refer to and would only be received by a routing socket associated
with that fib. (not yet implemented)

In addition Netstat has been edited to be able to cope with the
fact that the array is now 2 dimensional. (It looks in system
memory using libkvm (!)). Old versions of netstat see only the first FIB.

In addition two sysctls are added to give:
a) the number of FIBs compiled in (active)
b) the default FIB of the calling process.

Early testing experience:
-------------------------

Basically our (IronPort's) appliance does this functionality already
using ipfw fwd but that method has some drawbacks.

For example,
It can't fully simulate a routing table because it can't influence the
socket's choice of local address when a connect() is done.

Testing during the generating of these changes has been
remarkably smooth so far. Multiple tables have co-existed
with no notable side effects, and packets have been routes
accordingly.

ipfw has grown 2 new keywords:

setfib N ip from anay to any
count ip from any to any fib N

In pf there seems to be a requirement to be able to give symbolic names to the
fibs but I do not have that capacity. I am not sure if it is required.

SCTP has interestingly enough built in support for this, called VRFs
in Cisco parlance. it will be interesting to see how that handles it
when it suddenly actually does something.

Where to next:
--------------------

After committing the ABI compatible version and MFCing it, I'd
like to proceed in a forward direction in -current. this will
result in some roto-tilling in the routing code.

Firstly: the current code's idea of having a separate tree per
protocol family, all of the same format, and pointed to by the
1 dimensional array is a bit silly. Especially when one considers that
there is code that makes assumptions about every protocol having the
same internal structures there. Some protocols don't WANT that
sort of structure. (for example the whole idea of a netmask is foreign
to appletalk). This needs to be made opaque to the external code.

My suggested first change is to add routing method pointers to the
'domain' structure, along with information pointing the data.
instead of having an array of pointers to uniform structures,
there would be an array pointing to the 'domain' structures
for each protocol address domain (protocol family),
and the methods this reached would be called. The methods would have
an argument that gives FIB number, but the protocol would be free
to ignore it.

When the ABI can be changed it raises the possibilty of the
addition of a fib entry into the "struct route". Currently,
the structure contains the sockaddr of the desination, and the resulting
fib entry. To make this work fully, one could add a fib number
so that given an address and a fib, one can find the third element, the
fib entry.

Interaction with the ARP layer/ LL layer would need to be
revisited as well. Qing Li has been working on this already.

This work was sponsored by Ironport Systems/Cisco

Reviewed by: several including rwatson, bz and mlair (parts each)
Obtained from: Ironport systems/Cisco


# bcf5b9fa 29-Apr-2008 Robert Watson <rwatson@FreeBSD.org>

Fix a comment typo.

MFC after: 3 days


# 531c890b 29-Feb-2008 Paolo Pisati <piso@FreeBSD.org>

Move ipfw's nat code into its own kld: ipfw_nat.


# bb5081a7 25-Jan-2008 Robert Watson <rwatson@FreeBSD.org>

Hide ipfw internal data structures behind IPFW_INTERNAL rather than
exposing them to all consumers of ip_fw.h. These structures are
used in both ipfw(8) and ipfw(4), but not part of the user<->kernel
interface for other applications to use, rather, shared
implementation.

MFC after: 3 days
Reported by: Paul Vixie <paul at vix dot com>


# 7a92401a 04-May-2007 Bjoern A. Zeeb <bz@FreeBSD.org>

Add support for filtering on Routing Header Type 0 and
Mobile IPv6 Routing Header Type 2 in addition to filter
on the non-differentiated presence of any Routing Header.

MFC after: 3 weeks


# ff2f6fe8 29-Dec-2006 Paolo Pisati <piso@FreeBSD.org>

Summer of Code 2005: improve libalias - part 2 of 2

With the second (and last) part of my previous Summer of Code work, we get:

-ipfw's in kernel nat

-redirect_* and LSNAT support

General information about nat syntax and some examples are available
in the ipfw (8) man page. The redirect and LSNAT syntax are identical
to natd, so please refer to natd (8) man page.

To enable in kernel nat in rc.conf, two options were added:

o firewall_nat_enable: equivalent to natd_enable

o firewall_nat_interface: equivalent to natd_interface

Remember to set net.inet.ip.fw.one_pass to 0, if you want the packet
to continue being checked by the firewall ruleset after being
(de)aliased.

NOTA BENE: due to some problems with libalias architecture, in kernel
nat won't work with TSO enabled nic, thus you have to disable TSO via
ifconfig (ifconfig foo0 -tso).

Approved by: glebius (mentor)


# afad78e2 18-Aug-2006 Julian Elischer <julian@FreeBSD.org>

comply with style police

Submitted by: ru
MFC after: 1 month


# c487be96 17-Aug-2006 Julian Elischer <julian@FreeBSD.org>

Allow ipfw to forward to a destination that is specified by a table.
for example:
fwd tablearg ip from any to table(1)
where table 1 has entries of the form:
1.1.1.0/24 10.2.3.4
208.23.2.0/24 router2

This allows trivial implementation of a secondary routing table implemented
in the firewall layer.

I expect more work (under discussion with Glebius) to follow this to clean
up some of the messy parts of ipfw related to tables.

Reviewed by: Glebius
MFC after: 1 month


# 6a7d5cb6 24-May-2006 Oleg Bulyzhin <oleg@FreeBSD.org>

Implement internal (i.e. inside kernel) packet tagging using mbuf_tags(9).
Since tags are kept while packet resides in kernelspace, it's possible to
use other kernel facilities (like netgraph nodes) for altering those tags.

Submitted by: Andrey Elsukov <bu7cher at yandex dot ru>
Submitted by: Vadim Goncharov <vadimnuclight at tpu dot ru>
Approved by: glebius (mentor)
Idea from: OpenBSD PF
MFC after: 1 month


# e9318748 11-May-2006 Max Laier <mlaier@FreeBSD.org>

Reintroduce net.inet6.ip6.fw.enable sysctl to dis/enable the ipv6 processing
seperately. Also use pfil hook/unhook instead of keeping the check
functions in pfil just to return there based on the sysctl. While here fix
some whitespace on a nearby SYSCTL_ macro.


# ea9dce14 13-Feb-2006 Ruslan Ermilov <ru@FreeBSD.org>

When sending a packet from dummynet, indicate that we're forwarding
it so that ip_id etc. don't get overwritten. This fixes forwarding
of fragmented IP packets through a dummynet pipe -- fragments came
out with modified and different(!) ip_id's, making it impossible to
reassemble a datagram at the receiver side.

Submitted by: Alexander Karptsov (reworked by me)
MFC after: 3 days


# 40b1ae9e 12-Dec-2005 Gleb Smirnoff <glebius@FreeBSD.org>

Add a new feature for optimizining ipfw rulesets - substitution of the
action argument with the value obtained from table lookup. The feature
is now applicable only to "pipe", "queue", "divert", "tee", "netgraph"
and "ngtee" rules.

An example usage:

ipfw pipe 1000 config bw 1000Kbyte/s
ipfw pipe 4000 config bw 4000Kbyte/s
ipfw table 1 add x.x.x.x 1000
ipfw table 1 add x.x.x.y 4000
ipfw pipe tablearg ip from table(1) to any

In the example above the rule will throw different packets to different pipes.

TODO:
- Support "skipto" action, but without searching all rules.
- Improve parser, so that it warns about bad rules. These are:
- "tablearg" argument to action, but no "table" in the rule. All
traffic will be blocked.
- "tablearg" argument to action, but "table" searches for entry with
a specific value. All traffic will be blocked.
- "tablearg" argument to action, and two "table" looks - for src and
for dst. The last lookup will match.


# b090e4ce 29-Nov-2005 Gleb Smirnoff <glebius@FreeBSD.org>

Garbage-collect now unused struct _ipfw_insn_pipe and flush_pipe_ptrs(),
thus removing a few XXXes.
Document the ABI breakage in UPDATING.


# 9066356b 13-Aug-2005 Bjoern A. Zeeb <bz@FreeBSD.org>

* Add dynamic sysctl for net.inet6.ip6.fw.
* Correct handling of IPv6 Extension Headers.
* Add unreach6 code.
* Add logging for IPv6.

Submitted by: sysctl handling derived from patch from ume needed for ip6fw
Obtained from: is_icmp6_query and send_reject6 derived from similar
functions of netinet6,ip6fw
Reviewed by: ume, gnn; silence on ipfw@
Test setup provided by: CK Software GmbH
MFC after: 6 days


# 57cd6d26 02-Jun-2005 Max Laier <mlaier@FreeBSD.org>

Add support for IPv4 only rules to IPFW2 now that it supports IPv6 as well.
This is the last requirement before we can retire ip6fw.

Reviewed by: dwhite, brooks(earlier version)
Submitted by: dwhite (manpage)
Silence from: -ipfw


# a1429ad9 04-May-2005 Gleb Smirnoff <glebius@FreeBSD.org>

IPFW version 2 is the only option in HEAD and RELENG_5.
Thus, cleanup unnecessary now ifdefs.


# 8195404b 18-Apr-2005 Brooks Davis <brooks@FreeBSD.org>

Add IPv6 support to IPFW and Dummynet.

Submitted by: Mariano Tortoriello and Raffaele De Lorenzo (via luigi)


# 670742a1 04-Feb-2005 Gleb Smirnoff <glebius@FreeBSD.org>

Add a ng_ipfw node, implementing a quick and simple interface between
ipfw(4) and netgraph(4) facilities.

Reviewed by: andre, brooks, julian


# 6c69a7c3 14-Jan-2005 Gleb Smirnoff <glebius@FreeBSD.org>

o Clean up interface between ip_fw_chk() and its callers:

- ip_fw_chk() returns action as function return value. Field retval is
removed from args structure. Action is not flag any more. It is one
of integer constants.
- Any action-specific cookies are returned either in new "cookie" field
in args structure (dummynet, future netgraph glue), or in mbuf tag
attached to packet (divert, tee, some future action).

o Convert parsing of return value from ip_fw_chk() in ipfw_check_{in,out}()
to a switch structure, so that the functions are more readable, and a future
actions can be added with less modifications.

Approved by: andre
MFC after: 2 months


# c398230b 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for license, minor formatting changes


# c99ee9e0 02-Oct-2004 Brian Feldman <green@FreeBSD.org>

Add support to IPFW for matching by TCP data length.


# 6daf7ebd 02-Oct-2004 Brian Feldman <green@FreeBSD.org>

Add support to IPFW for classification based on "diverted" status
(that is, input via a divert socket).


# 974dfe30 02-Oct-2004 Brian Feldman <green@FreeBSD.org>

Add to IPFW the ability to do ALTQ classification/tagging.


# d6a8d588 28-Sep-2004 Max Laier <mlaier@FreeBSD.org>

Add an additional struct inpcb * argument to pfil(9) in order to enable
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.

This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.

Suggested by: rwatson
A lot of work by: csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by: rwatson, csjp
Tested by: -pf, -ipfw, LINT, csjp and myself
MFC after: 3 days

LOR IDs: 14 - 17 (not fixed yet)


# e4c97eff 19-Aug-2004 Andre Oppermann <andre@FreeBSD.org>

Bring back the sysctl 'net.inet.ip.fw.enable' to unbreak the startup scripts
and to be able to disable ipfw if it was compiled directly into the kernel.


# 9b932e9e 17-Aug-2004 Andre Oppermann <andre@FreeBSD.org>

Convert ipfw to use PFIL_HOOKS. This is change is transparent to userland
and preserves the ipfw ABI. The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.

However there are many changes how ipfw is and its add-on's are handled:

In general ipfw is now called through the PFIL_HOOKS and most associated
magic, that was in ip_input() or ip_output() previously, is now done in
ipfw_check_[in|out]() in the ipfw PFIL handler.

IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to
be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
reassembly. If not, or all fragments arrived and the packet is complete,
divert_packet is called directly. For 'tee' no reassembly attempt is made
and a copy of the packet is sent to the divert socket unmodified. The
original packet continues its way through ip_input/output().

ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet
with the new destination sockaddr_in. A check if the new destination is a
local IP address is made and the m_flags are set appropriately. ip_input()
and ip_output() have some more work to do here. For ip_input() the m_flags
are checked and a packet for us is directly sent to the 'ours' section for
further processing. Destination changes on the input path are only tagged
and the 'srcrt' flag to ip_forward() is set to disable destination checks
and ICMP replies at this stage. The tag is going to be handled on output.
ip_output() again checks for m_flags and the 'ours' tag. If found, the
packet will be dropped back to the IP netisr where it is going to be picked
up by ip_input() again and the directly sent to the 'ours' section. When
only the destination changes, the route's 'dst' is overwritten with the
new destination from the forward m_tag. Then it jumps back at the route
lookup again and skips the firewall check because it has been marked with
M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with
'option IPFIREWALL_FORWARD' to enable it.

DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for
a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will
then inject it back into ip_input/ip_output() after it has served its time.
Dummynet packets are tagged and will continue from the next rule when they
hit the ipfw PFIL handlers again after re-injection.

BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.

More detailed changes to the code:

conf/files
Add netinet/ip_fw_pfil.c.

conf/options
Add IPFIREWALL_FORWARD option.

modules/ipfw/Makefile
Add ip_fw_pfil.c.

net/bridge.c
Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw
is still directly invoked to handle layer2 headers and packets would
get a double ipfw when run through PFIL_HOOKS as well.

netinet/ip_divert.c
Removed divert_clone() function. It is no longer used.

netinet/ip_dummynet.[ch]
Neither the route 'ro' nor the destination 'dst' need to be stored
while in dummynet transit. Structure members and associated macros
are removed.

netinet/ip_fastfwd.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code.

netinet/ip_fw.h
Removed 'ro' and 'dst' from struct ip_fw_args.

netinet/ip_fw2.c
(Re)moved some global variables and the module handling.

netinet/ip_fw_pfil.c
New file containing the ipfw PFIL handlers and module initialization.

netinet/ip_input.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code. ip_forward() does not longer require
the 'next_hop' struct sockaddr_in argument. Disable early checks
if 'srcrt' is set.

netinet/ip_output.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code.

netinet/ip_var.h
Add ip_reass() as general function. (Used from ipfw PFIL handlers
for IPDIVERT.)

netinet/raw_ip.c
Directly check if ipfw and dummynet control pointers are active.

netinet/tcp_input.c
Rework the 'ipfw forward' to local code to work with the new way of
forward tags.

netinet/tcp_sack.c
Remove include 'opt_ipfw.h' which is not needed here.

sys/mbuf.h
Remove m_claim_next() macro which was exclusively for ipfw 'forward'
and is no longer needed.

Approved by: re (scottl)


# 5af87d0e 15-Aug-2004 David E. O'Brien <obrien@FreeBSD.org>

Put the 'antispoof' opcode in the proper place in the opcode list such
that it doesn't break the ipfw2 ABI.


# 31c88a30 12-Aug-2004 Christian S.J. Peron <csjp@FreeBSD.org>

Add the ability to associate ipfw rules with a specific prison ID.
Since the only thing truly unique about a prison is it's ID, I figured
this would be the most granular way of handling this.

This commit makes the following changes:

- Adds tokenizing and parsing for the ``jail'' command line option
to the ipfw(8) userspace utility.
- Append the ipfw opcode list with O_JAIL.
- While Iam here, add a comment informing others that if they
want to add additional opcodes, they should append them to the end
of the list to avoid ABI breakage.
- Add ``fw_prid'' to the ipfw ucred cache structure.
- When initializing ucred cache, if the process is jailed,
set fw_prid to the prison ID, otherwise set it to -1.
- Update man page to reflect these changes.

This change was a strong motivator behind the ucred caching
mechanism in ipfw.

A sample usage of this new functionality could be:

ipfw add count ip from any to any jail 2

It should be noted that because ucred based constraints
are only implemented for TCP and UDP packets, the same
applies for jail associations.

Conceptual head nod by: pjd
Reviewed by: rwatson
Approved by: bmilekic (mentor)


# 5f9541ec 09-Aug-2004 Andre Oppermann <andre@FreeBSD.org>

New ipfw option "antispoof":

For incoming packets, the packet's source address is checked if it
belongs to a directly connected network. If the network is directly
connected, then the interface the packet came on in is compared to
the interface the network is connected to. When incoming interface
and directly connected interface are not the same, the packet does
not match.

Usage example:

ipfw add deny ip from any to any not antispoof in

Manpage education by: ru


# cd8b5ae0 09-Jun-2004 Ruslan Ermilov <ru@FreeBSD.org>

Introduce a new feature to IPFW2: lookup tables. These are useful
for handling large sparse address sets. Initial implementation by
Vsevolod Lobko <seva@ip.net.ua>, refined by me.

MFC after: 1 week


# 22b5770b 23-Apr-2004 Andre Oppermann <andre@FreeBSD.org>

Add the option versrcreach to verify that a valid route to the
source address of a packet exists in the routing table. The
default route is ignored because it would match everything and
render the check pointless.

This option is very useful for routers with a complete view of
the Internet (BGP) in the routing table to reject packets with
spoofed or unrouteable source addresses.

Example:

ipfw add 1000 deny ip from any to any not versrcreach

also known in Cisco-speak as:

ip verify unicast source reachable-via any

Reviewed by: luigi


# ac9d7e26 25-Feb-2004 Max Laier <mlaier@FreeBSD.org>

Re-remove MT_TAGs. The problems with dummynet have been fixed now.

Tested by: -current, bms(mentor), me
Approved by: bms(mentor), sam


# 36e8826f 17-Feb-2004 Max Laier <mlaier@FreeBSD.org>

Backout MT_TAG removal (i.e. bring back MT_TAGs) for now, as dummynet is
not working properly with the patch in place.

Approved by: bms(mentor)


# 1094bdca 13-Feb-2004 Max Laier <mlaier@FreeBSD.org>

This set of changes eliminates the use of MT_TAG "pseudo mbufs", replacing
them mostly with packet tags (one case is handled by using an mbuf flag
since the linkage between "caller" and "callee" is direct and there's no
need to incur the overhead of a packet tag).

This is (mostly) work from: sam

Silence from: -arch
Approved by: bms(mentor), sam, rwatson


# 9bf40ede 31-Oct-2003 Brooks Davis <brooks@FreeBSD.org>

Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)


# 4805529c 15-Jul-2003 Luigi Rizzo <luigi@FreeBSD.org>

Allow set 31 to be used for rules other than 65535.
Set 31 is still special because rules belonging to it are not deleted
by the "ipfw flush" command, but must be deleted explicitly with
"ipfw delete set 31" or by individual rule numbers.

This implement a flexible form of "persistent rules" which you might
want to have available even after an "ipfw flush".
Note that this change does not violate POLA, because you could not
use set 31 in a ruleset before this change.

sbin/ipfw changes to allow manipulation of set 31 will follow shortly.

Suggested by: Paul Richards


# f030c151 04-Jul-2003 Luigi Rizzo <luigi@FreeBSD.org>

Correct some comments, add opcode O_IPSEC to match packets
coming out of an ipsec tunnel.


# 330462a3 03-Jun-2003 Bernd Walter <ticso@FreeBSD.org>

Change handling to support strong alignment architectures such as alpha and
sparc64.

PR: alpha/50658
Submitted by: rizzo
Tested on: alpha


# 010dabb0 14-Mar-2003 Crist J. Clark <cjc@FreeBSD.org>

Add a 'verrevpath' option that verifies the interface that a packet
comes in on is the same interface that we would route out of to get to
the packet's source address. Essentially automates an anti-spoofing
check using the information in the routing table.

Experimental. The usage and rule format for the feature may still be
subject to change.


# d28e8b3a 24-Oct-2002 Maxime Henrion <mux@FreeBSD.org>

Oops, forgot to commit this file. This is part of the fix
for ipfw2 panics on sparc64.


# 43405724 09-Aug-2002 Luigi Rizzo <luigi@FreeBSD.org>

One bugfix and one new feature.

The bugfix (ipfw2.c) makes the handling of port numbers with
a dash in the name, e.g. ftp-data, consistent with old ipfw:
use \\ before the - to consider it as part of the name and not
a range separator.

The new feature (all this description will go in the manpage):

each rule now belongs to one of 32 different sets, which can
be optionally specified in the following form:

ipfw add 100 set 23 allow ip from any to any

If "set N" is not specified, the rule belongs to set 0.

Individual sets can be disabled, enabled, and deleted with the commands:

ipfw disable set N
ipfw enable set N
ipfw delete set N

Enabling/disabling of a set is atomic. Rules belonging to a disabled
set are skipped during packet matching, and they are not listed
unless you use the '-S' flag in the show/list commands.
Note that dynamic rules, once created, are always active until
they expire or their parent rule is deleted.
Set 31 is reserved for the default rule and cannot be disabled.

All sets are enabled by default. The enable/disable status of the sets
can be shown with the command

ipfw show sets

Hopefully, this feature will make life easier to those who want to
have atomic ruleset addition/deletion/tests. Examples:

To add a set of rules atomically:

ipfw disable set 18
ipfw add ... set 18 ... # repeat as needed
ipfw enable set 18

To delete a set of rules atomically

ipfw disable set 18
ipfw delete set 18
ipfw enable set 18

To test a ruleset and disable it and regain control if something
goes wrong:

ipfw disable set 18
ipfw add ... set 18 ... # repeat as needed
ipfw enable set 18 ; echo "done "; sleep 30 && ipfw disable set 18

here if everything goes well, you press control-C before
the "sleep" terminates, and your ruleset will be left
active. Otherwise, e.g. if you cannot access your box,
the ruleset will be disabled after the sleep terminates.

I think there is only one more thing that one might want, namely
a command to assign all rules in set X to set Y, so one can
test a ruleset using the above mechanisms, and once it is
considered acceptable, make it part of an existing ruleset.


# 318aa87b 17-Jul-2002 Luigi Rizzo <luigi@FreeBSD.org>

Fix a panic when doing "ipfw add pipe 1 log ..."

Also synchronize ip_dummynet.c with the version in RELENG_4 to
ease MFC's.


# a8c102a2 14-Jul-2002 Luigi Rizzo <luigi@FreeBSD.org>

Implement keepalives for dynamic rules, so they will not expire
just because you leave your session idle.

Also, put in a fix for 64-bit architectures (to be revised).

In detail:

ip_fw.h

* Reorder fields in struct ip_fw to avoid alignment problems on
64-bit machines. This only masks the problem, I am still not
sure whether I am doing something wrong in the code or there
is a problem elsewhere (e.g. different aligmnent of structures
between userland and kernel because of pragmas etc.)

* added fields in dyn_rule to store ack numbers, so we can
generate keepalives when the dynamic rule is about to expire

ip_fw2.c

* use a local function, send_pkt(), to generate TCP RST for Reset rules;

* save about 250 bytes by cleaning up the various snprintf()
in ipfw_log() ...

* ... and use twice as many bytes to implement keepalives
(this seems to be working, but i have not tested it extensively).

Keepalives are generated once every 5 seconds for the last 20 seconds
of the lifetime of a dynamic rule for an established TCP flow. The
packets are sent to both sides, so if at least one of the endpoints
is responding, the timeout is refreshed and the rule will not expire.

You can disable this feature with

sysctl net.inet.ip.fw.dyn_keepalive=0

(the default is 1, to have them enabled).

MFC after: 1 day

(just kidding... I will supply an updated version of ipfw2 for
RELENG_4 tomorrow).


# 7d4d3e90 08-Jul-2002 Luigi Rizzo <luigi@FreeBSD.org>

Remove one unused command name.


# 5e43aef8 05-Jul-2002 Luigi Rizzo <luigi@FreeBSD.org>

Implement the last 2-3 missing instructions for ipfw,
now it should support all the instructions of the old ipfw.

Fix some bugs in the user interface, /sbin/ipfw.

Please check this code against your rulesets, so i can fix the
remaining bugs (if any, i think they will be mostly in /sbin/ipfw).

Once we have done a bit of testing, this code is ready to be MFC'ed,
together with a bunch of other changes (glue to ipfw, and also the
removal of some global variables) which have been in -current for
a couple of weeks now.

MFC after: 7 days


# 9758b77f 27-Jun-2002 Luigi Rizzo <luigi@FreeBSD.org>

The new ipfw code.

This code makes use of variable-size kernel representation of rules
(exactly the same concept of BPF instructions, as used in the BSDI's
firewall), which makes firewall operation a lot faster, and the
code more readable and easier to extend and debug.

The interface with the rest of the system is unchanged, as witnessed
by this commit. The only extra kernel files that I am touching
are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In
userland I only had to touch those programs which manipulate the
internal representation of firewall rules).

The code is almost entirely new (and I believe I have written the
vast majority of those sections which were taken from the former
ip_fw.c), so rather than modifying the old ip_fw.c I decided to
create a new file, sys/netinet/ip_fw2.c . Same for the user
interface, which is in sbin/ipfw/ipfw2.c (it still compiles to
/sbin/ipfw). The old files are still there, and will be removed
in due time.

I have not renamed the header file because it would have required
touching a one-line change to a number of kernel files.

In terms of user interface, the new "ipfw" is supposed to accepts
the old syntax for ipfw rules (and produce the same output with
"ipfw show". Only a couple of the old options (out of some 30 of
them) has not been implemented, but they will be soon.

On the other hand, the new code has some very powerful extensions.
First, you can put "or" connectives between match fields (and soon
also between options), and write things like

ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any

This should make rulesets slightly more compact (and lines longer!),
by condensing 2 or more of the old rules into single ones.

Also, as an example of how easy the rules can be extended, I have
implemented an 'address set' match pattern, where you can specify
an IP address in a format like this:

10.20.30.0/26{18,44,33,22,9}

which will match the set of hosts listed in braces belonging to the
subnet 10.20.30.0/26 . The match is done using a bitmap, so it is
essentially a constant time operation requiring a handful of CPU
instructions (and a very small amount of memmory -- for a full /24
subnet, the instruction only consumes 40 bytes).

Again, in this commit I have focused on functionality and tried
to minimize changes to the other parts of the system. Some performance
improvement can be achieved with minor changes to the interface of
ip_fw_chk_t. This will be done later when this code is settled.

The code is meant to compile unmodified on RELENG_4 (once the
PACKET_TAG_* changes have been merged), for this reason
you will see #ifdef __FreeBSD_version in a couple of places.
This should minimize errors when (hopefully soon) it will be time
to do the MFC.


# 2b25acc1 22-Jun-2002 Luigi Rizzo <luigi@FreeBSD.org>

Remove (almost all) global variables that were used to hold
packet forwarding state ("annotations") during ip processing.
The code is considerably cleaner now.

The variables removed by this change are:

ip_divert_cookie used by divert sockets
ip_fw_fwd_addr used for transparent ip redirection
last_pkt used by dynamic pipes in dummynet

Removal of the first two has been done by carrying the annotations
into volatile structs prepended to the mbuf chains, and adding
appropriate code to add/remove annotations in the routines which
make use of them, i.e. ip_input(), ip_output(), tcp_input(),
bdg_forward(), ether_demux(), ether_output_frame(), div_output().

On passing, remove a bug in divert handling of fragmented packet.
Now it is the fragment at offset 0 which sets the divert status of
the whole packet, whereas formerly it was the last incoming fragment
to decide.

Removal of last_pkt required a change in the interface of ip_fw_chk()
and dummynet_io(). On passing, use the same mechanism for dummynet
annotations and for divert/forward annotations.

option IPFIREWALL_FORWARD is effectively useless, the code to
implement it is very small and is now in by default to avoid the
obfuscation of conditionally compiled code.

NOTES:
* there is at least one global variable left, sro_fwd, in ip_output().
I am not sure if/how this can be removed.

* I have deliberately avoided gratuitous style changes in this commit
to avoid cluttering the diffs. Minor stule cleanup will likely be
necessary

* this commit only focused on the IP layer. I am sure there is a
number of global variables used in the TCP and maybe UDP stack.

* despite the number of files touched, there are absolutely no API's
or data structures changed by this commit (except the interfaces of
ip_fw_chk() and dummynet_io(), which are internal anyways), so
an MFC is quite safe and unintrusive (and desirable, given the
improved readability of the code).

MFC after: 10 days


# 2f8707ca 13-May-2002 Luigi Rizzo <luigi@FreeBSD.org>

Remove custom definitions (IP_FW_TCPF_SYN etc.) of TCP header flags
which are the same as the original ones (TH_SYN etc.)


# d60315be 09-May-2002 Luigi Rizzo <luigi@FreeBSD.org>

Cleanup the interface to ip_fw_chk, two of the input arguments
were totally useless and have been removed.

ip_input.c, ip_output.c:
Properly initialize the "ip" pointer in case the firewall does an
m_pullup() on the packet.

Remove some debugging code forgotten long ago.

ip_fw.[ch], bridge.c:
Prepare the grounds for matching MAC header fields in bridged packets,
so we can have 'etherfw' functionality without a lot of kernel and
userland bloat.


# 4d77a549 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P.


# 37b5d6e3 21-Dec-2001 Yaroslav Tykhiy <ytykhiy@gmail.com>

Implement matching IP precedence in ipfw(4).

Submitted by: Igor Timkin <ivt@gamma.ru>


# 7b109fa4 04-Nov-2001 Luigi Rizzo <luigi@FreeBSD.org>

MFS: sync the ipfw/dummynet/bridge code with the one recently merged
into stable (mostly , but not only, formatting and comments changes).


# 06dae58b 29-Oct-2001 Josef Karthauser <joe@FreeBSD.org>

A few more style changes picked up whilst working on an MFC to -stable.


# f227535c 29-Oct-2001 Josef Karthauser <joe@FreeBSD.org>

Fix some whitespace, and a comment that I missed in the last commit.


# 25549c00 28-Oct-2001 Josef Karthauser <joe@FreeBSD.org>

Clean up the style of this header file.


# 830cc178 27-Sep-2001 Luigi Rizzo <luigi@FreeBSD.org>

Two main changes here:
+ implement "limit" rules, which permit to limit the number of sessions
between certain host pairs (according to masks). These are a special
type of stateful rules, which might be of interest in some cases.
See the ipfw manpage for details.

+ merge the list pointers and ipfw rule descriptors in the kernel, so
the code is smaller, faster and more readable. This patch basically
consists in replacing "foo->rule->bar" with "rule->bar" all over
the place.
I have been willing to do this for ages!

MFC after: 1 week


# 32f967a3 20-Sep-2001 Luigi Rizzo <luigi@FreeBSD.org>

A bunch of minor changes to the code (see below) for readability, code size
and speed. No new functionality added (yet) apart from a bugfix.
MFC will occur in due time and probably in stages.

BUGFIX: fix a problem in old code which prevented reallocation of
the hash table for dynamic rules (there is a PR on this).

OTHER CHANGES: minor changes to the internal struct for static and dynamic rules.
Requires rebuild of ipfw binary.

Add comments to show how data structures are linked together.
(It probably makes no sense to keep the chain pointers separate
from actual rule descriptors. They will be hopefully merged soon.

keep a (sysctl-readable) counter for the number of static rules,
to speed up IP_FW_GET operations

initial support for a "grace time" for expired connections, so we
can set timeouts for closing connections to much shorter times.

merge zero_entry() and resetlog_entry(), they use basically the
same code.

clean up and reduce replication of code for removing rules,
both for readability and code size.

introduce a separate lifetime for dynamic UDP rules.

fix a problem in old code which prevented reallocation of
the hash table for dynamic rules (PR ...)

restructure dynamic rule descriptors

introduce some local variables to avoid multiple dereferencing of
pointer chains (reduces code size and hopefully increases speed).


# bb07ec8c 13-Feb-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Introduce a new feature in IPFW: Check of the source or destination
address is configured on a interface. This is useful for routers with
dynamic interfaces. It is now possible to say:

0100 allow tcp from any to any established
0200 skipto 1000 tcp from any to any
0300 allow ip from any to any
1000 allow tcp from 1.2.3.4 to me 22
1010 deny tcp from any to me 22
1020 allow tcp from any to any

and not have to worry about the behaviour if dynamic interfaces configure
new IP numbers later on.

The check is semi expensive (traverses the interface address list)
so it should be protected as in the above example if high performance
is a requirement.


# 7e1cd0d2 09-Feb-2001 Luigi Rizzo <luigi@FreeBSD.org>

Sync with the bridge/dummynet/ipfw code already tested in stable.

In ip_fw.[ch] change a couple of variable and field names to
avoid having types, variables and fields with the same name.


# 507b4b54 01-Feb-2001 Luigi Rizzo <luigi@FreeBSD.org>

MFS: bridge/ipfw/dummynet fixes (bridge.c will be committed separately)


# 65450f2f 08-Jan-2001 Robert Watson <rwatson@FreeBSD.org>

o IPFW incorrectly handled filtering in the presence of previously
reserved and now allocated TCP flags in incoming packets. This patch
stops overloading those bits in the IP firewall rules, and moves
colliding flags to a seperate field, ipflg. The IPFW userland
management tool, ipfw(8), is updated to reflect this change. New TCP
flags related to ECN are now included in tcp.h for reference, although
we don't currently implement TCP+ECN.

o To use this fix without completely rebuilding, it is sufficient to copy
ip_fw.h and tcp.h into your appropriate include directory, then rebuild
the ipfw kernel module, and ipfw tool, and install both. Note that a
mismatch between module and userland tool will result in incorrect
installation of firewall rules that may have unexpected effects. This
is an MFC candidate, following shakedown. This bug does not appear
to affect ipfilter.

Reviewed by: security-officer, billf
Reported by: Aragon Gouveia <aragon@phat.za.net>


# 98b82992 01-Oct-2000 Bill Fumerola <billf@FreeBSD.org>

Add new fields for more granularity:
IP: version, tos, ttl, len, id
TCP: seq#, ack#, window size

Reviewed by: silence on freebsd-{net,ipfw}


# 11840b06 21-Aug-2000 Archie Cobbs <archie@FreeBSD.org>

Remove obsolete comment.


# 9714563d 08-Jun-2000 Dan Moschuk <dan@FreeBSD.org>

Add tcpoptions to ipfw. This works much in the same way as ipoptions do.
It also squashes 99% of packet kiddie synflood orgies. For example, to
rate syn packets without MSS,

ipfw pipe 10 config 56Kbit/s queue 10Packets
ipfw add pipe 10 tcp from any to any in setup tcpoptions !mss

Submitted by: Richard A. Steenbergen <ras@e-gerbil.net>


# 5d3fe434 08-Jun-2000 Luigi Rizzo <luigi@FreeBSD.org>

Implement WF2Q+ in dummynet.


# e3975643 25-May-2000 Jake Burkholder <jake@FreeBSD.org>

Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by: msmith and others


# 740a1973 23-May-2000 Jake Burkholder <jake@FreeBSD.org>

Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by: phk
Reviewed by: phk
Approved by: mdodd


# 03c61266 10-Feb-2000 Luigi Rizzo <luigi@FreeBSD.org>

Support for stateful (dynamic) ipfw rules. They are very
similar to ipfilter's keep-state.

Look at the updated ipfw(8) manpage for details.

Approved-by: jordan


# ec8fac2a 08-Jan-2000 Luigi Rizzo <luigi@FreeBSD.org>

Add ipfw hooks for the new dummynet features.

Support masks on TCP/UDP ports.

Minor cleanup of ip_fw_chk() to avoid repeated calls to PULLUP_TO
at each rule.


# 664a31e4 28-Dec-1999 Peter Wemm <peter@FreeBSD.org>

Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.


# 8948e4ba 05-Dec-1999 Archie Cobbs <archie@FreeBSD.org>

Miscellaneous fixes/cleanups relating to ipfw and divert(4):

- Implement 'ipfw tee' (finally)
- Divert packets by calling new function divert_packet() directly instead
of going through protosw[].
- Replace kludgey global variable 'ip_divert_port' with a function parameter
to divert_packet()
- Replace kludgey global variable 'frag_divert_port' with a function parameter
to ip_reass()
- style(9) fixes

Reviewed by: julian, green


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 4d1bb12d 27-Aug-1999 Brian Feldman <green@FreeBSD.org>

Correction: uid -> gid (comment)


# 77275942 11-Aug-1999 Luigi Rizzo <luigi@FreeBSD.org>

Implement probabilistic rule match in ipfw. Each rule can be associated
with a match probability to achieve non-deterministic behaviour of
the firewall. This can be extremely useful for testing purposes
such as simulating random packet drop without having to use dummynet
(which already does the same thing), and simulating multipath effects
and the associated out-of-order delivery (this time in conjunction
with dummynet).

The overhead on normal rules is just one comparison with 0.

Since it would have been trivial to implement this by just adding
a field to the ip_fw structure, I decided to do it in a
backward-compatible way (i.e. struct ip_fw is unchanged, and as a
consequence you don't need to recompile ipfw if you don't want to
use this feature), since this was also useful for -STABLE.

When, at some point, someone decides to change struct ip_fw, please
add a length field and a version number at the beginning, so userland
apps can keep working even if they are out of sync with the kernel.


# 0b6c1a83 01-Aug-1999 Brian Feldman <green@FreeBSD.org>

Make ipfw's logging more dynamic. Now, log will use the default limit
_or_ you may specify "log logamount number" to set logging specifically
the rule.
In addition, "ipfw resetlog" has been added, which will reset the
logging counters on any/all rule(s). ipfw resetlog does not affect
the packet/byte counters (as ipfw reset does), and is the only "set"
command that can be run at securelevel >= 3.
This should address complaints about not being able to set logging
amounts, not being able to restart logging at a high securelevel,
and not being able to just reset logging without resetting all of the
counters in a rule.


# f8075bf9 28-Jul-1999 Brian Feldman <green@FreeBSD.org>

Correct a really gross comment format.


# 7a2aab80 19-Jun-1999 Brian Feldman <green@FreeBSD.org>

This is the much-awaited cleaned up version of IPFW [ug]id support.
All relevant changes have been made (including ipfw.8).


# 66e55756 20-Apr-1999 Peter Wemm <peter@FreeBSD.org>

Tidy up some stray / unused stuff in the IPFW package and friends.
- unifdef -DCOMPAT_IPFW (this was on by default already)
- remove traces of in-kernel ip_nat package, it was never committed.
- Make IPFW and DUMMYNET initialize themselves rather than depend on
compiled-in hooks in ip_init(). This means they initialize the same
way both in-kernel and as kld modules. (IPFW initializes now :-)


# b715f178 14-Dec-1998 Luigi Rizzo <luigi@FreeBSD.org>

Last bits (i think) of dummynet for -current.


# 67a895f6 02-Sep-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Widen and change the layout of the IPFW structures flag element.

This will allow us to add dummynet to 3.0

Recompile /sbin/ipfw AND your kernel.


# cfe8b629 22-Aug-1998 Garrett Wollman <wollman@FreeBSD.org>

Yow! Completely change the way socket options are handled, eliminating
another specialized mbuf type in the process. Also clean up some
of the cruft surrounding IPFW, multicast routing, RSVP, and other
ill-explored corners.


# f9e354df 05-Jul-1998 Julian Elischer <julian@FreeBSD.org>

Support for IPFW based transparent forwarding.
Any packet that can be matched by a ipfw rule can be redirected
transparently to another port or machine. Redirection to another port
mostly makes sense with tcp, where a session can be set up
between a proxy and an unsuspecting client. Redirection to another machine
requires that the other machine also be expecting to receive the forwarded
packets, as their headers will not have been modified.

/sbin/ipfw must be recompiled!!!

Reviewed by: Peter Wemm <peter@freebsd.org>
Submitted by: Chrisy Luke <chrisy@flix.net>


# e7a58978 03-Feb-1998 Bruce Evans <bde@FreeBSD.org>

Added #include of <sys/queue.h> so that this file is more "self"-sufficent.


# 1c910ddb 07-Jan-1998 Alexander Langer <alex@FreeBSD.org>

Bump up packet and byte counters to 64-bit unsigned ints. As a
consequence, ipfw's list command now adjusts its output at runtime
based on the largest packet/byte counter values.

NOTE:
o The ipfw struct has changed requiring a recompile of both kernel
and userland ipfw utility.

o This probably should not be brought into 2.2.

PR: 3738


# 55b211e3 28-Oct-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes.


# 514ede09 16-Sep-1997 Bruce Evans <bde@FreeBSD.org>

Fixed gratuitous ANSIisms.


# 750f6aad 08-Aug-1997 Alexander Langer <alex@FreeBSD.org>

Support interface names up to 15 characters in length. In order to
accommodate the expanded name, the ICMP types bitmap has been
reduced from 256 bits to 32.

A recompile of kernel and user level ipfw is required.

To be merged into 2.2 after a brief period in -current.

PR: bin/4209
Reviewed by: Archie Cobbs <archie@whistle.com>


# e4676ba6 01-Jun-1997 Julian Elischer <julian@FreeBSD.org>

Submitted by: Whistle Communications (archie Cobbs)

these are quite extensive additions to the ipfw code.
they include a change to the API because the old method was
broken, but the user view is kept the same.

The new code allows a particular match to skip forward to a particular
line number, so that blocks of rules can be
used without checking all the intervening rules.
There are also many more ways of rejecting
connections especially TCP related, and
many many more ...

see the man page for a complete description.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 839cc09e 16-Jan-1997 Adam David <adam@FreeBSD.org>

implement "not" keyword for inverting the address logic


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# fed1c7e9 21-Aug-1996 Søren Schmidt <sos@FreeBSD.org>

Add hooks for an IP NAT module, much like the firewall stuff...
Move the sockopt definitions for the firewall code from
ip_fw.h to in.h where it belongs.


# cc98643e 13-Aug-1996 Paul Traina <pst@FreeBSD.org>

Completely rewrite handling of protocol field for firewalls, things are
now completely consistent across all IP protocols and should be quite a
bit faster.

Discussed with: fenner & alex


# 93e0e116 10-Jul-1996 Julian Elischer <julian@FreeBSD.org>

Adding changes to ipfw and the kernel to support ip packet diversion..
This stuff should not be too destructive if the IPDIVERT is not compiled in..
be aware that this changes the size of the ip_fw struct
so ipfw needs to be recompiled to use it.. more changes coming to clean this up.


# 7a991086 09-Jun-1996 Alexander Langer <alex@FreeBSD.org>

Big sweep over ipfw, picking up where Poul left off:

- Log ICMP type during verbose output.
- Added IPFIREWALL_VERBOSE_LIMIT option to prevent denial of service
attacks via syslog flooding.
- Filter based on ICMP type.
- Timestamp chain entries when they are matched.
- Interfaces can now be matched with a wildcard specification (i.e.
will match any interface unit for a given name).
- Prevent the firewall chain from being manipulated when securelevel
is greater than 2.
- Fixed bug that allowed the default policy to be deleted.
- Ability to zero individual accounting entries.
- Remove definitions of old_chk_ptr and old_ctl_ptr when compiling
ipfw as a lkm.
- Remove some redundant code shared between ip_fw_init and ipfw_load.

Closes PRs: 1192, 1219, and 1267.


# b7509747 01-Jun-1996 Gary Palmer <gpalmer@FreeBSD.org>

Correct spelling error in comment


# 23bf9953 03-Apr-1996 Poul-Henning Kamp <phk@FreeBSD.org>

Add feature for tcp "established".
Change interface between netinet and ip_fw to be more general, and thus
hopefully also support other ip filtering implementations.


# 09bb5f75 24-Feb-1996 Poul-Henning Kamp <phk@FreeBSD.org>

Make getsockopt() capable of handling more than one mbuf worth of data.
Use this to read rules out of ipfw.
Add the lkm code to ipfw.c


# b83e4314 23-Feb-1996 Poul-Henning Kamp <phk@FreeBSD.org>

The new firewall functionality:
Filter on the direction (in/out).
Filter on fragment/not fragment.


# e7319bab 23-Feb-1996 Poul-Henning Kamp <phk@FreeBSD.org>

Big sweep over the IPFIREWALL and IPACCT code.

Close the ip-fragment hole.
Waste less memory.
Rewrite to contemporary more readable style.
Kill separate IPACCT facility, use "accept" rules in IPFIREWALL.
Filter incoming >and< outgoing packets.
Replace "policy" by sticky "deny all" rule.
Rules have numbers used for ordering and deletion.
Remove "rerorder" code entirely.
Count packet & bytecount matches for rules.

Code in -current & -stable is now the same.


# 37afa1e8 01-Oct-1995 Ugen J.S. Antsilevich <ugen@FreeBSD.org>

Well..finally..this is the first part..it should take care of
matching IP options..Check and test this - i made only a couple
of rough tests and this could be buggy.. Ipaccounting can't use
IP Options (and i don't see any need to cound packets with specific
options either..)
More to come...


# f70b1050 22-Jul-1995 David Greenman <dg@FreeBSD.org>

Added $Id$.


# c6e8c357 09-Jul-1995 David Greenman <dg@FreeBSD.org>

Fixed panic that occurs on certain firewall rejected packets that was
caused by dtom() being used on an mbuf cluster. The fix involves passing
around the mbuf pointer.

Submitted by: Bill Fenner


# 9b2e5354 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


# 29fe22b9 24-Feb-1995 Ugen J.S. Antsilevich <ugen@FreeBSD.org>

Allow "via" to be specified ever as IP adress or
as interface name/unit...


# 4dd1662b 12-Jan-1995 Ugen J.S. Antsilevich <ugen@FreeBSD.org>

Actual firewall change.
1) Firewall is not subdivided on forwarding / blocking chains
anymore.Actually only one chain left-it was the blocking one.
2) LKM support.ip_fwdef.c is function pointers definition and
goes into kernel along with all INET stuff.


# 3107b31b 13-Dec-1994 Ugen J.S. Antsilevich <ugen@FreeBSD.org>

Add clear one accounting entry control.
Structure fields changed to seem more standart.


# 10a642bb 12-Dec-1994 Ugen J.S. Antsilevich <ugen@FreeBSD.org>

Add match by interface from which packet arrived (via)
Handle right fragmented packets. Remove checking option
from kernel..


# c334f866 27-Nov-1994 Ugen J.S. Antsilevich <ugen@FreeBSD.org>

Added: ICMP reply,TCP SYN check,logging..


# 63f8d699 16-Nov-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

Ugen J.S.Antsilevich's latest, happiest, IP firewall code.
Poul: Please take this into BETA. It's non-intrusive, and a rather
substantial improvement over what was there before.


# dbdc2966 08-Nov-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

Ugen makes it in with 10 seconds to spare with a one-char diff. Some
people are born lucky..
Submitted by: ugen


# 72e8fea5 07-Nov-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

Almost 12th hour (the 11th hour was almost an hour ago :-) patches
from Ugen.


# 0a87b233 31-Oct-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

Latest changes from Uben.
Submitted by: uben


# 100ba1a6 28-Oct-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

IP Firewall code from Daniel Boulet and J.S.Antsilevich
Submitted by: danny ugen