History log of /freebsd-current/sys/netpfil/ipfw/ip_fw_table_algo.c
Revision Date Author Comments
# 194df014 13-Nov-2023 Andrey V. Elsukov <ae@FreeBSD.org>

ipfw: fix copy&paste bug for number:array tables

Use compare_numarray() method for binary search. This fixes
table lookups for keys greater than UINT16_MAX.

Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC
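
One way a bug of this class can arise is sketched below in userland C. The names (num_entry, cmp_num, cmp_truncated, lookup_num) are hypothetical and are not the kernel's; the sketch only assumes that a comparator written for 16-bit keys was reused for 32-bit keys, which makes the binary search compare truncated values.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry of a sorted number table: 32-bit key, 32-bit value. */
struct num_entry {
	uint32_t key;
	uint32_t value;
};

/* Correct comparator: compares the full 32-bit keys. */
static int
cmp_num(const void *k, const void *e)
{
	uint32_t key = *(const uint32_t *)k;
	uint32_t ekey = ((const struct num_entry *)e)->key;

	return ((key > ekey) - (key < ekey));
}

/* Buggy comparator: narrows both keys to 16 bits before comparing. */
static int
cmp_truncated(const void *k, const void *e)
{
	uint16_t key = (uint16_t)*(const uint32_t *)k;
	uint16_t ekey = (uint16_t)((const struct num_entry *)e)->key;

	return ((key > ekey) - (key < ekey));
}

/* Binary search over a table sorted by full 32-bit key. */
static const struct num_entry *
lookup_num(const struct num_entry *tbl, size_t n, uint32_t key,
    int (*cmp)(const void *, const void *))
{
	return (bsearch(&key, tbl, n, sizeof(*tbl), cmp));
}
```

With cmp_truncated, key 5 and key 65541 compare as equal, so lookups above UINT16_MAX can miss or match the wrong entry; cmp_num avoids both.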


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 81cac390 04-Jun-2022 Arseny Smalyuk <smalukav@gmail.com>

ipfw: add support radix tables and table lookup for MAC addresses

By analogy with IP address matching, add a way to use ipfw radix
tables for MAC matching. This is implemented using new ipfw table
with mac:radix type. Also there are src-mac and dst-mac lookup
commands added.

Usage example:
ipfw table 1 create type mac
ipfw table 1 add 11:22:33:44:55:66/48
ipfw add skipto tablearg src-mac 'table(1)'
ipfw add deny src-mac 'table(1, 100)'
ipfw add deny lookup dst-mac 1

Note: sysctl net.link.ether.ipfw=1 should be set to enable ipfw
filtering on L2.

Reviewed by: melifaro
Obtained from: Yandex LLC
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D35103


# 60a28b09 18-Dec-2021 Mateusz Guzik <mjg@FreeBSD.org>

ipfw: plug set-but-not-used vars

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 3ad80c65 14-Dec-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix LINT-NOINET6 build after r368571.

Reported by: mjg


# 2616eaa3 11-Dec-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix NOINET6 build broken by r368571.


# 4451d893 11-Dec-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

ipfw kfib algo: Use rt accessors instead of accessing rib/rtentry directly.

This removes assumptions on prefix storage and rtentry layout
from an external code.

Differential Revision: https://reviews.freebsd.org/D27450


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# 6ad7446c 02-Jul-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Complete conversions from fib<4|6>_lookup_nh_<basic|ext> to fib<4|6>_lookup().

fib[46]_lookup_nh_ represents the pre-epoch generation of the fib API, providing fewer guarantees
about pointer validity and requiring on-stack data copying.

With no callers remaining, remove fib[46]_lookup_nh_ functions.

Submitted by: Neel Chauhan <neel AT neelc DOT org>
Differential Revision: https://reviews.freebsd.org/D25445


# 47cb0632 04-Jun-2020 Eugene Grosbein <eugen@FreeBSD.org>

ipfw: unbreak matching with big table type flow.

Test case:

# n=32769
# ipfw -q table 1 create type flow:proto,dst-ip,dst-port
# jot -w 'table 1 add tcp,127.0.0.1,' $n 1 | ipfw -q /dev/stdin
# ipfw -q add 5 unreach filter-prohib flow 'table(1)'

The rule 5 matches nothing without the fix if n>=32769.

With the fix, it works:
# telnet localhost 10001
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Permission denied
telnet: Unable to connect to remote host

MFC after: 2 weeks
Discussed with: ae, melifaro


# e7d8af4f 28-Apr-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move route_temporal.c and route_var.h to net/route.

Nexthop objects implementation, defined in r359823,
introduced sys/net/route directory intended to hold all
routing-related code. Move recently-introduced route_temporal.c and
private route_var.h header there.

Differential Revision: https://reviews.freebsd.org/D24597


# 20efcfc6 16-Jun-2018 Andrey V. Elsukov <ae@FreeBSD.org>

Switch RIB and RADIX_NODE_HEAD lock from rwlock(9) to rmlock(9).

Using rwlock with multiqueue NICs for IP forwarding at high pps
produces high lock contention and is inefficient. Rmlock fits better for
such workloads.

Reviewed by: melifaro, olivier
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D15789


# d821d364 21-Jan-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

Unsign some values related to allocation.

When allocating memory through malloc(9), we always expect the amount of
memory requested to be unsigned as a negative value would either stand for
an error or an overflow.
Unsign some values, found when considering the use of mallocarray(9), to
avoid unnecessary casting. Also consider that indexes should be of
at least the same size/type as the upper limit they are intended to index.

MFC after: 3 weeks


# ba3e1361 14-Apr-2017 Andrey V. Elsukov <ae@FreeBSD.org>

Use address of specific union member instead of whole union address to
fix PVS-Studio warnings.

MFC after: 1 week


# f91eb6ad 13-Apr-2017 Maxim Konovalov <maxim@FreeBSD.org>

o Redundant assignments removed.

Found by: PVS-Studio, V519
Reviewed by: ae


# 37aefa2a 05-Jun-2016 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix 4-byte overflow in ipv6_writemask.

This bug could cause some IPv6 table prefix delete requests to fail.

Obtained from: Yandex LLC
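
A minimal userland sketch of writing an IPv6 netmask from a prefix length without storing past the end of the address. The function name ipv6_mask_from_plen is hypothetical and the byte-wise approach is an assumption made for clarity; it is not the kernel helper's actual code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Fill a 16-byte IPv6 netmask from a prefix length (0..128).
 * A /128 prefix writes exactly 16 bytes: the partial-byte store
 * is skipped when no bits remain, so nothing lands past the buffer.
 */
static void
ipv6_mask_from_plen(uint8_t mask[16], unsigned int plen)
{
	unsigned int full, rem;

	if (plen > 128)
		plen = 128;
	full = plen / 8;	/* number of all-ones bytes */
	rem = plen % 8;		/* leftover high bits in the next byte */

	memset(mask, 0, 16);
	memset(mask, 0xff, full);
	if (rem != 0)
		mask[full] = (uint8_t)(0xff << (8 - rem));
}
```

The guard on rem is the important part: an unconditional store after the full bytes would write one element past the mask for a /128 prefix.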


# b309f085 05-May-2016 Andrey V. Elsukov <ae@FreeBSD.org>

Change the type of objhash_cb_t callback function to be able to return an
error code. Use it to interrupt the loop in ipfw_objhash_foreach().

Obtained from: Yandex LLC
Sponsored by: Yandex LLC
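
The pattern of a foreach loop that stops on the first callback error can be sketched as follows. The types and names here (obj_cb_t, obj_foreach, count_nonneg) are hypothetical stand-ins, not the objhash API itself.

```c
#include <stddef.h>

/* Hypothetical callback type: return 0 to continue, non-zero to stop. */
typedef int (obj_cb_t)(int obj, void *arg);

/*
 * Walk all objects; stop at the first callback error and
 * propagate that error to the caller.
 */
static int
obj_foreach(const int *objs, size_t n, obj_cb_t *cb, void *arg)
{
	size_t i;
	int error;

	for (i = 0; i < n; i++) {
		error = cb(objs[i], arg);
		if (error != 0)
			return (error);
	}
	return (0);
}

/* Example callback: count objects, fail on the first negative one. */
static int
count_nonneg(int obj, void *arg)
{
	if (obj < 0)
		return (-1);
	(*(int *)arg)++;
	return (0);
}
```

With a void callback the loop has to visit every object even after a failure; returning an int lets the caller abort early and see why.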


# a4641f4e 03-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/net*: minor spelling fixes.

No functional change.


# 61eee0e2 24-Jan-2016 Alexander V. Chernikov <melifaro@FreeBSD.org>

MFP r287070,r287073: split radix implementation and route table structure.

There are a number of radix consumers in kernel land (pf, ipfw, nfs, route)
with different requirements. In fact, the first 3 don't have _any_ requirements
and the first 2 do not use radix locking. On the other hand, the routing
structure does have these requirements (rnh_gen, multipath, custom
to-be-added control plane functions, different locking).
Additionally, radix should not know anything about its consumers' internals.

So, radix code now uses tiny 'struct radix_head' structure along with
internal 'struct radix_mask_head' instead of 'struct radix_node_head'.
Existing consumers still use the same 'struct radix_node_head' with
slight modifications: they need to pass pointer to (embedded)
'struct radix_head' to all radix callbacks.

Routing code now uses new 'struct rib_head' with different locking macro:
RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing
information base).

New net/route_var.h header was added to hold routing subsystem internal
data. 'struct rib_head' was placed there. 'struct rtentry' will also
be moved there soon.


# 89fc126a 10-Jan-2016 Alexander V. Chernikov <melifaro@FreeBSD.org>

Initialize error value in ta_lookup_kfib() by default to please the compiler.


# 60c274aa 10-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Initialize error after r293626 in case neither INET nor INET6 is
compiled into the kernel. Ideally lots more code would just not
be called (or compiled in) in that case but that requires a lot
more surgery. For now try to make IP-less kernels compile again.


# 004d3e30 09-Jan-2016 Alexander V. Chernikov <melifaro@FreeBSD.org>

Make ipfw addr:kfib lookup algo use new routing KPI.


# 0caab009 05-Feb-2015 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Make sure table algorithm destroy hook is always called without locks
* Explicitly lock freeing interface references in ta_destroy_ifidx
* Change ipfw_iface_unref() to require UH lock
* Add forgotten ipfw_iface_unref() to destroy_ifidx_locked()

PR: kern/197276
Submitted by: lev
Sponsored by: Yandex LLC


# f7bab8d0 09-Nov-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Switch route radix to dual-lock model:
use rmlock for data path access, and config rwlock
for control plane processing. Route table changes require
both locks held.


# 55e5eda6 08-Nov-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Separate radix and routing: use different structures for route and
for other consumers.

Introduce new 'struct rib_head' for routing purposes and make
all routing api use it.


# 8c3cfe0b 04-Nov-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Hide 'struct rtentry' and all its macro inside new header:
net/route_internal.h
The goal is to make it opaque for all code except route/rtsock and
proto domain _rmx.


# 9e3a53fd 22-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Rename log2 to tal_log2.

Submitted by: luigi


# 3fd16a3a 10-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Remove redundant if_notifier declaration.


# d699ee2d 10-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix NOINET6 build for ipfw.


# 9fe15d06 10-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Partially fix build on !amd64

Pointed by: bz


# 8ebca97f 07-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Fix crash in interface tracker due to using old "linked" field.
* Ensure we're flushing entries without any locks held.
* Free memory in (rare) case when interface tracker fails to register ifp.
* Add KASSERT on table values refcounts.


# d4e1b515 04-Oct-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix build with gcc.


# b1d105bc 21-Sep-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add pre-alpha version of DXR lookup module.
It does build but (currently) does not work.

This change is not intended to be merged along with other ipfw changes.


# 1a33e799 05-Sep-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Change copyrights to the proper one.


# 0cba2b28 31-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add support for multi-field values inside ipfw tables.
This is the last major change in given branch.

Kernel changes:
* Use 64-bytes structures to hold multi-value variables.
* Use shared array to hold values from all tables (assume
each table algo is capable of holding 32-byte variables).
* Add some placeholders to support per-table value arrays in future.
* Use simple eventhandler-style API to ease the process of adding new
table items. Currently table addition may require multiple UH drops/
acquires which is quite tricky due to atomic table modification/swap
support, shared array resize, etc. Deal with it by calling special
notifier capable of rolling back state before actually performing
swap/resize operations. Original operation then restarts itself after
acquiring UH lock.
* Bump all objhash users default values to at least 64
* Fix custom hashing inside objhash.

Userland changes:
* Add support for dumping shared value array via "vlist" internal cmd.
* Some small print/fill_flags fixes to support u32 values.
* valtype is now bitmask of
<skipto|pipe|fib|nat|dscp|tag|divert|netgraph|limit|ipv4|ipv6>.
New values can hold distinct values for each of these types.
* Provide special "legacy" type which assumes all values are the same.
* More helpers/docs following..

Some examples:

3:41 [1] zfscurr0# ipfw table mimimi create valtype skipto,limit,ipv4,ipv6
3:41 [1] zfscurr0# ipfw table mimimi info
+++ table(mimimi), set(0) +++
kindex: 2, type: addr
references: 0, valtype: skipto,limit,ipv4,ipv6
algorithm: addr:radix
items: 0, size: 296
3:42 [1] zfscurr0# ipfw table mimimi add 10.0.0.5 3000,10,10.0.0.1,2a02:978:2::1
added: 10.0.0.5/32 3000,10,10.0.0.1,2a02:978:2::1
3:42 [1] zfscurr0# ipfw table mimimi list
+++ table(mimimi), set(0) +++
10.0.0.5/32 3000,0,10.0.0.1,2a02:978:2::1


# 13263632 30-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Make objhash api a bit more abstract by providing ability to specify
own hash/compare functions.
* Add requirement for table algorithms to copy "value" field in @add
callback instead of "prepare_add".
* Document existing requirement for table algorithms to store value
of deleted record to @tei.


# 4bbd1577 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Make room for multi-type values in struct tentry.


# c21034b7 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Replace "cidr" table type with "addr" type.

Suggested by: luigi


# d3b00c08 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add cidr:kfib algo type just for fun. It binds kernel fib
of given number to a table.

Example:
# ipfw table fib2 create algo "cidr:kfib fib=2"
# ipfw table fib2 info
+++ table(fib2), set(0) +++
kindex: 2, type: cidr, locked
valtype: number, references: 0
algorithm: cidr:kfib fib=2
items: 11, size: 288
# ipfw table fib2 list
+++ table(fib2), set(0) +++
10.0.0.0/24 0
127.0.0.1/32 0
::/96 0
::1/128 0
::ffff:0.0.0.0/96 0
2a02:978:2::/112 0
fe80::/10 0
fe80:1::/64 0
fe80:2::/64 0
fe80:3::/64 0
ff02::/16 0
# ipfw table fib2 lookup 10.0.0.5
10.0.0.0/24 0
# ipfw table fib2 lookup 2a02:978:2::11
2a02:978:2::/112 0
# ipfw table fib2 detail
+++ table(fib2), set(0) +++
kindex: 2, type: cidr, locked
valtype: number, references: 0
algorithm: cidr:kfib fib=2
items: 11, size: 288
IPv4 algorithm radix info
items: 0 itemsize: 200
IPv6 algorithm radix info
items: 0 itemsize: 200


# fd0869d5 14-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Document internal commands.
* Do not require/set default table type if algo name is specified.
* Add TA_FLAG_READONLY option for algorithms.


# 301290bc 12-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Rename has_space to need_modify to be consistent with 0 as return values.
* document all callbacks supported by algorithms code.


# 3a845e10 11-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add support for batched add/delete for ipfw tables
* Add support for atomic batches add (all or none).
* Fix panic on deleting non-existing entry in radix algo.

Examples:

# si is empty
# ipfw table si add 1.1.1.1/32 1111 2.2.2.2/32 2222
added: 1.1.1.1/32 1111
added: 2.2.2.2/32 2222
# ipfw table si add 2.2.2.2/32 2200 4.4.4.4/32 4444
exists: 2.2.2.2/32 2200
added: 4.4.4.4/32 4444
ipfw: Adding record failed: record already exists
^^^^^ Returns error but keeps inserted items
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444
# ipfw table si atomic add 3.3.3.3/32 3333 4.4.4.4/32 4400 5.5.5.5/32 5555
added(reverted): 3.3.3.3/32 3333
exists: 4.4.4.4/32 4400
ignored: 5.5.5.5/32 5555
ipfw: Adding record failed: record already exists
^^^^^ Returns error and reverts added records
# ipfw table si list
+++ table(si), set(0) +++
1.1.1.1/32 1111
2.2.2.2/32 2222
4.4.4.4/32 4444
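
The difference between the plain and the atomic batch in the transcript above can be sketched as follows. This is a simplified userland model with hypothetical names (tbl, tbl_add, tbl_add_batch) and an unsorted fixed-size array; it is not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define TBL_MAX 16	/* assumed capacity for the sketch */

struct tbl {
	uint32_t keys[TBL_MAX];
	size_t count;
};

static int
tbl_has(const struct tbl *t, uint32_t key)
{
	size_t i;

	for (i = 0; i < t->count; i++)
		if (t->keys[i] == key)
			return (1);
	return (0);
}

/* Append one key; fails if the key exists or the table is full. */
static int
tbl_add(struct tbl *t, uint32_t key)
{
	if (t->count == TBL_MAX || tbl_has(t, key))
		return (-1);
	t->keys[t->count++] = key;
	return (0);
}

/*
 * Add a batch of keys. In non-atomic mode a failure is reported but
 * the other keys stay inserted. In atomic mode the first failure
 * reverts everything added by this call and ignores the rest.
 */
static int
tbl_add_batch(struct tbl *t, const uint32_t *keys, size_t n, int atomic)
{
	size_t i, saved = t->count;
	int error = 0;

	for (i = 0; i < n; i++) {
		if (tbl_add(t, keys[i]) == 0)
			continue;
		error = -1;
		if (atomic) {
			/* New keys were appended, so truncating reverts them. */
			t->count = saved;
			break;
		}
	}
	return (error);
}
```

Rollback is trivial here only because new keys are appended at the end; a sorted or hashed table would need to remember each inserted key and delete it explicitly.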


# 720ee730 08-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Kernel changes:
* Fix buffer calculation for table dumps
* Fix IPv6 radix entries addition broken in r269371.

Userland changes:
* Fix bug in retrieving static ruleset
* Fix several bugs in retrieving table list


# 5f379342 02-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Show algorithm-specific data in "table info" output.


# a399f8be 03-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Be consistent on cidr:radix function naming: use algo name instead
of "cidr".


# d20facb2 03-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Remove unneeded headers.


# 3fe2ef91 03-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Whitespace changes.


# 0bce0c23 03-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Move all algo-specific structures to the top of algo definition.
* Be consistent on naming variables in different algos.
* Use exponential array growth in iface:array and number:array.
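
Exponential growth means the array capacity is doubled whenever it fills up, so N appends cost O(N) copying in total. A minimal userland sketch follows; the struct, the function name and the initial capacity of 16 are assumptions for illustration, not values taken from the kernel code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical auto-growing array of 32-bit items. */
struct num_array {
	uint32_t *items;
	size_t used;	/* items currently stored */
	size_t size;	/* allocated capacity, in items */
};

/* Append a value, doubling the capacity when the array is full. */
static int
num_array_append(struct num_array *a, uint32_t v)
{
	uint32_t *n;
	size_t nsize;

	if (a->used == a->size) {
		nsize = (a->size == 0) ? 16 : a->size * 2;
		n = realloc(a->items, nsize * sizeof(*n));
		if (n == NULL)
			return (-1);	/* old array is still valid */
		a->items = n;
		a->size = nsize;
	}
	a->items[a->used++] = v;
	return (0);
}
```

Growing by a fixed increment instead would reallocate and copy on a regular schedule, which becomes quadratic for large tables.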


# 648e8380 03-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Store entry value back in @tei on entry update/deletion as another step
to batched atomic updates.


# b6ee846e 02-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Fix case when returning more than 4096 bytes of data
* Use different approach to ensure algo has enough space to store N elements:
- explicitly ask algo (under UH_WLOCK) before/after insertion. This (along
with existing reallocation callbacks) really guarantees us that it is safe
to insert N elements at once while holding UH_WLOCK+WLOCK.
- remove old aflags/flags approach


# 4c0c07a5 01-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Permit limiting number of items in table.

Kernel changes:
* Add TEI_FLAGS_DONTADD entry flag to indicate that insert is not possible
* Support given flag in all algorithms
* Add "limit" field to ipfw_xtable_info
* Add actual limiting code into add_table_entry()

Userland changes:
* Add "limit" option as "create" table sub-option. Limit modification
is currently impossible.
* Print human-readable errors in table entry addition/deletion code.


# 95c3c1e2 01-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Do not perform memset() on ta_buf in algo callbacks:
it is already zeroed by base code.


# 2e324d29 01-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Simplify radix operations: use unified tei_to_sockaddr_ent() to generate
keys for add/delete calls.


# 57a1cf95 01-Aug-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Use TA_FLAG_DEFAULT for default algorithm selection instead of
exporting algorithm structures directly.

* Pass needed state buffer size in algo structures as preparation
for tables add/del requests batching.


# 914bffb6 31-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add new "flow" table type to support N=1..5-tuple lookups
* Add "flow:hash" algorithm

Kernel changes:
* Add O_IP_FLOW_LOOKUP opcode to support "flow" lookups
* Add IPFW_TABLE_FLOW table type
* Add "struct tflow_entry" as storage for 6-tuple flows
* Add "flow:hash" algorithm. Basically it is an auto-growing chained hash table.
Additionally, we store a mask of the fields we need to compare in each instance.

* Increase ipfw_obj_tentry size by adding struct tflow_entry
* Add per-algorithm stat (ipfw_ta_tinfo) to ipfw_xtable_info
* Increase algoname length: 32 -> 64 (algo options passed there as string)
* Assume every table type can be customized by flags, use u8 to store "tflags" field.
* Simplify ipfw_find_table_entry() by providing @tentry directly to algo callback.
* Fix bug in cidr:chash resize procedure.

Userland changes:
* add "flow table(NAME)" syntax to support n-tuple checking tables.
* make fill_flags() separate function to ease working with _s_x arrays
* change "table info" output to reflect longer "type" fields

Syntax:
ipfw table fl2 create type flow:[src-ip][,proto][,src-port][,dst-ip][,dst-port] [algo flow:hash]

Examples:

0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash
0:02 [2] zfscurr0# ipfw table fl2 info
+++ table(fl2), set(0) +++
kindex: 0, type: flow:src-ip,proto,dst-port
valtype: number, references: 0
algorithm: flow:hash
items: 0, size: 280
0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
0:02 [2] zfscurr0# ipfw table fl2 list
+++ table(fl2), set(0) +++
2a02:6b8::333,6,443 45000
10.0.0.92,6,80 22000
0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'
00200 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
0:03 [2] zfscurr0# ipfw show
00200 0 0 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 617 59416 allow ip from any to any
0:03 [2] zfscurr0# telnet -s 10.0.0.92 78.46.89.105 80
Trying 78.46.89.105...
..
0:04 [2] zfscurr0# ipfw show
00200 5 272 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 682 66733 allow ip from any to any


# b23d5de9 30-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add number:array algorithm lookup method.

Kernel changes:
* s/IPFW_TABLE_U32/IPFW_TABLE_NUMBER/
* Force "lookup <port|uid|gid|jid>" to be IPFW_TABLE_NUMBER
* Support "lookup" method for number tables
* Add number:array algorithm (i32 as key, auto-growing).

Userland changes:
* Support named tables in "lookup <tag> Table"
* Fix handling of "table(NAME,val)" case
* Support printing "number" table data.


# ce2817b5 29-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add "lookup" method for cidr:hash algorithm type.
* Add auto-grow ability to cidr:hash type.
* Fix some bugs / simplify implementation for cidr:hash.


# 9d099b4f 29-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Dump available table algorithms via "ipfw talist" cmd.

Kernel changes:
* Add type/refcount fields to table algo instances.
* Add IP_FW_TABLES_ALIST opcode to export available algorithms to userland.

Userland changes:
* Fix cores on empty input inside "ipfw table" handler.
* Add "ipfw talist" cmd to print available kernel algorithms.
* Change "table info" output to reflect long algorithm config lines.


# 0b565ac0 29-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Copy ta structures to stable storage to ease future extension.
* Remove algo .lookup field since table lookup function is set by algo code.


# 74b941f0 29-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add new ipfw cidr algorithm: hash table.
Algorithm works with both IPv4 and IPv6 prefixes, /32 and /128
ranges are assumed by default.
It works the following way: input IP address is masked to specified
mask, hashed and searched inside hash bucket.

Current implementation does not support "lookup" method and hash auto-resize.
This will be changed soon.
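
The mask, hash, search sequence described above can be sketched for IPv4 as follows. All names, the fixed bucket count and the hash function are assumptions made for the sketch; addresses are kept in host byte order for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64	/* assumed fixed power-of-two bucket count */

struct hent {
	struct hent *next;
	uint32_t addr;		/* masked address, host byte order */
	uint32_t value;
};

struct htable {
	struct hent *buckets[NBUCKETS];
	unsigned int plen;	/* prefix length applied to every key, 0..32 */
};

/* Keep only the top plen bits of the address. */
static uint32_t
mask_addr(uint32_t addr, unsigned int plen)
{
	return (plen == 0 ? 0 : addr & (UINT32_C(0xffffffff) << (32 - plen)));
}

/* Multiplicative hash reduced to a bucket index. */
static unsigned int
hash_addr(uint32_t addr)
{
	return (((uint32_t)(addr * UINT32_C(2654435761)) >> 26) % NBUCKETS);
}

/* Store the masked address in the bucket chosen by its hash. */
static void
ht_insert(struct htable *t, struct hent *e, uint32_t addr, uint32_t value)
{
	unsigned int h;

	e->addr = mask_addr(addr, t->plen);
	e->value = value;
	h = hash_addr(e->addr);
	e->next = t->buckets[h];
	t->buckets[h] = e;
}

/* Mask the input address, hash it, then walk the bucket chain. */
static const struct hent *
ht_lookup(const struct htable *t, uint32_t addr)
{
	const struct hent *e;
	uint32_t key = mask_addr(addr, t->plen);

	for (e = t->buckets[hash_addr(key)]; e != NULL; e = e->next)
		if (e->addr == key)
			return (e);
	return (NULL);
}
```

Because every key is masked with the same prefix length, one exact-match hash lookup replaces a longest-prefix search; that is also why the mask has to be fixed per table.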

some examples:

ipfw table mi_test2 create type cidr algo cidr:hash
ipfw table mi_test create type cidr algo "cidr:hash masks=/30,/64"

ipfw table mi_test2 info
+++ table(mi_test2), set(0) +++
type: cidr, kindex: 7
valtype: number, references: 0
algorithm: cidr:hash
items: 0, size: 220

ipfw table mi_test info
+++ table(mi_test), set(0) +++
type: cidr, kindex: 6
valtype: number, references: 0
algorithm: cidr:hash masks=/30,/64
items: 0, size: 220

ipfw table mi_test add 10.0.0.5/30
ipfw table mi_test add 10.0.0.8/30
ipfw table mi_test add 2a02:6b8:b010::1/64 25

ipfw table mi_test list
+++ table(mi_test), set(0) +++
10.0.0.4/30 0
10.0.0.8/30 0
2a02:6b8:b010::/64 25


# adea6201 29-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Change algorithm names to "type:algo" (e.g. "iface:array", "cidr:radix") format.
* Pass number of items changed in add/del hooks to permit adding/deleting
multiple values at once.


# 68394ec8 28-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add generic ipfw interface tracking API
* Rewrite interface tables to use interface indexes

Kernel changes:
* Add generic interface tracking API:
- ipfw_iface_ref (must call unlocked, performs lazy init if needed, allocates
state & bumps ref)
- ipfw_iface_add_ntfy(UH_WLOCK+WLOCK, links consumer & runs its callback to
update ifindex)
- ipfw_iface_del_ntfy(UH_WLOCK+WLOCK, unlinks consumer)
- ipfw_iface_unref(unlocked, drops reference)
Additionally, consumer callbacks are called in interface withdrawal/departure.

* Rewrite interface tables to use iface tracking API. Currently tables are
implemented the following way:
runtime data is stored as sorted array of {ifidx, val} for existing interfaces
full data is stored inside namedobj instance (chained hashed table).

* Add IP_FW_XIFLIST opcode to dump status of tracked interfaces

* Pass @chain ptr to most non-locked algorithm callbacks:
(prepare_add, prepare_del, flush_entry ..). This may be needed for better
interaction of given algorithm and other ipfw subsystems

* Add optional "change_ti" algorithm handler to permit updating of
cached table_info pointer (happens in case of table_max resize)

* Fix small bug in ipfw_list_tables()
* Add badd (insert into sorted array) and bdel (remove from sorted array) funcs

Userland changes:
* Add "iflist" cmd to print status of currently tracked interface
* Add stringnum_cmp for better interface/table names sorting
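
Inserting into and removing from a sorted array, as the badd/bdel helpers mentioned above do, can be sketched like this. The function names and signatures below are hypothetical and operate on plain 32-bit keys; the kernel helpers are generic and differ in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the first index whose element is >= key (n if none). */
static size_t
lower_bound(const uint32_t *a, size_t n, uint32_t key)
{
	size_t lo = 0, hi = n, mid;

	while (lo < hi) {
		mid = lo + (hi - lo) / 2;
		if (a[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo);
}

/*
 * Insert key into sorted array a of n items, keeping it sorted.
 * The caller guarantees room for n + 1 items.
 * Returns 0 on success, -1 if the key is already present.
 */
static int
sorted_insert(uint32_t *a, size_t n, uint32_t key)
{
	size_t i = lower_bound(a, n, key);

	if (i < n && a[i] == key)
		return (-1);
	memmove(&a[i + 1], &a[i], (n - i) * sizeof(*a));
	a[i] = key;
	return (0);
}

/*
 * Remove key from sorted array a of n items.
 * Returns 0 on success, -1 if the key is not present.
 */
static int
sorted_remove(uint32_t *a, size_t n, uint32_t key)
{
	size_t i = lower_bound(a, n, key);

	if (i == n || a[i] != key)
		return (-1);
	memmove(&a[i], &a[i + 1], (n - i - 1) * sizeof(*a));
	return (0);
}
```

Keeping the runtime array sorted makes lookups a plain binary search, at the cost of O(n) element moves on each insert or delete.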


# db785d31 26-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Require explicit table creation before use on kernel side.
* Add resize callbacks for upcoming table-based algorithms.

Kernel changes:
* s/ipfw_modify_table/ipfw_manage_table_ent/
* Simplify add_table_entry(): make table creation a separate piece of code.
Do not perform creation if not in "compat" mode.
* Add ability to perform modification of algorithm state (like table resize).
The following callbacks were added:
- prepare_mod (allocate new state, without locks)
- fill_mod (UH_WLOCK, copy old state to new one)
- modify (UH_WLOCK + WLOCK, switch state)
- flush_mod (no locks, flushes allocated data)
Given callbacks are called if table modification has been requested by add or
delete callbacks. Additional u64 tc->'flags' field was added to pass these
requests.
* Change add/del table ent format: permit adding/removing multiple entries
at once (only 1 supported at the moment).

Userland changes:
* Auto-create tables with warning
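
The four-step modification sequence listed under the kernel changes can be sketched for an array resize as follows. The lock names in the comments come from the list above; the structures and function signatures are hypothetical, and no real locking is performed in this userland model.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct algo_state {
	uint32_t *items;
	size_t used;
	size_t size;
};

/* Scratch buffer carried between the four steps. */
struct mod_buf {
	uint32_t *items;
	size_t size;
};

/* Step 1 (no locks): allocate the new storage; allocation may sleep. */
static int
prepare_mod(struct mod_buf *mb, size_t new_size)
{
	mb->items = calloc(new_size, sizeof(*mb->items));
	mb->size = new_size;
	return (mb->items == NULL ? -1 : 0);
}

/* Step 2 (UH_WLOCK): copy old data; readers still use the old array. */
static int
fill_mod(const struct algo_state *st, struct mod_buf *mb)
{
	if (st->used > mb->size)
		return (-1);
	if (st->used != 0)
		memcpy(mb->items, st->items, st->used * sizeof(*mb->items));
	return (0);
}

/* Step 3 (UH_WLOCK + WLOCK): swap storage; old array moves into mb. */
static void
modify(struct algo_state *st, struct mod_buf *mb)
{
	uint32_t *old = st->items;
	size_t old_size = st->size;

	st->items = mb->items;
	st->size = mb->size;
	mb->items = old;
	mb->size = old_size;
}

/* Step 4 (no locks): free whatever is left in the scratch buffer. */
static void
flush_mod(struct mod_buf *mb)
{
	free(mb->items);
	mb->items = NULL;
	mb->size = 0;
}
```

Splitting the work this way keeps allocation and freeing outside the locks, so the window in which packet-path readers are blocked covers only the pointer swap.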


# e0a8b9ee 09-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Reduce size of ipfw table entries for cidr/iface:

Since old structures had _value as the last field,
every table match required 3 cache lines instead of 2.
Fix this by
- using the fact that supplied masks are duplicated inside radix
- using lightweight sa_in6 structure as key for IPv6

Before (amd64):
sizeof(table_entry): 136
sizeof(table_xentry): 160
After (amd64):
sizeof(radix_cidr_entry): 120
sizeof(radix_cidr_xentry): 128
sizeof(radix_iface): 128

* Fix memory leak for table entry update
* Do some more sanity checks while deleting entry
* Do not store masks for host routes

Sponsored by: Yandex LLC


# 6447bae6 06-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Prepare to pass other dynamic states via ipfw_dump_config()

Kernel changes:
* Change dump format for dynamic states:
each state is now stored inside ipfw_obj_dyntlv
last dynamic state is indicated by IPFW_DF_LAST flag
* Do not perform sooptcopyout() for !SOPT_GET requests.

Userland changes:
* Introduce foreach_state() function handler to ease work
with different states passed by ipfw_dump_config().


# 81d3153d 06-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add "lookup" table functionality to permit userland entry lookups.
* Bump table dump format preserving old ABI.

Kernel side:
* Add IP_FW_TABLE_XFIND to handle "lookup" request from userland.
* Add ta_find_tentry() algorithm callbacks/handlers to support lookups.
* Fully switch to ipfw_obj_tentry for various table dumps:
algorithms are now required to support the latest (ipfw_obj_tentry) entry
dump format, the rest is handled by generic dump code.
IP_FW_TABLE_XLIST opcode version bumped (0 -> 1).
* Eliminate legacy ta_dump_entry algo handler:
dump_table_entry() converts data from current to legacy format.

Userland side:
* Add "lookup" table parameter.
* Change the way table type is guessed: call table_get_info() first,
and check value for IPv4/IPv6 type IFF table does not exist.
* Fix table_get_list(): do more tries if supplied buffer is not enough.
* Separate table_show_entry() from table_show_list().


# ac35ff17 03-Jul-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fully switch to named tables:

Kernel changes:
* Introduce ipfw_obj_tentry table entry structure to force u64 alignment.
* Support "update-on-existing-key" "add" behavior (TEI_FLAGS_UPDATED).
* Use "subtype" field to distinguish between IPv4 and IPv6 table records
instead of previous hack.
* Add value type (vtype) field for kernel tables. Current types are
number,ip and dscp
* Fix sets mask retrieval for old binaries
* Fix crash while using interface tables

Userland changes:
* Switch ipfw_table_handler() to use named-only tables.
* Add "table NAME create [type {cidr|iface|u32} [valtype {number|ip|dscp}] ..."
* Switch ipfw_table_handler to match_token()-based parser.
* Switch ipfw_sets_handler to use new ipfw_get_config() for mask retrieval.
* Allow ipfw set X table ... syntax to permit using per-set table namespaces.


# 9490a627 16-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add IP_FW_TABLE_XCREATE / IP_FW_TABLE_XMODIFY opcodes.
* Add 'algoname' string to ipfw_xtable_info permitting to specify lookup
algorithm with parameters.
* Rework part of ipfw_rewrite_table_uidx()

Sponsored by: Yandex LLC


# ea761a5d 14-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move most of external table structures/functions to separate ip_fw_table.h


# 9f7d47b0 14-Jun-2014 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add API to ease adding new algorithms/new tabletypes to ipfw.

Kernel-side changelog:
* Split general tables code and algorithm-specific table data.
Current algorithms (IPv4/IPv6 radix and interface tables radix) moved to
new ip_fw_table_algo.c file.
Tables code now supports any algorithm implementing the following callbacks:
+struct table_algo {
+ char name[64];
+ int idx;
+ ta_init *init;
+ ta_destroy *destroy;
+ table_lookup_t *lookup;
+ ta_prepare_add *prepare_add;
+ ta_prepare_del *prepare_del;
+ ta_add *add;
+ ta_del *del;
+ ta_flush_entry *flush_entry;
+ ta_foreach *foreach;
+ ta_dump_entry *dump_entry;
+ ta_dump_xentry *dump_xentry;
+};

* Change ->state, ->xstate, ->tabletype fields of ip_fw_chain to
->tablestate pointer (array of 32-byte structures necessary for
runtime lookups; can probably be shrunk to 16 bytes later):

+struct table_info {
+ table_lookup_t *lookup; /* Lookup function */
+ void *state; /* Lookup radix/other structure */
+ void *xstate; /* eXtended state */
+ u_long data; /* Hints for given func */
+};

* Add count method for namedobj instance to ease size calculations
* Bump ip_fw3 buffer in ipfw_ctl 128->256 bytes.
* Improve bitmask resizing on tables_max change.
* Remove table numbers checking from most places.
* Fix wrong nesting in ipfw_rewrite_table_uidx().

* Add IP_FW_OBJ_LIST opcode (list all objects of given type, currently
implemented for IPFW_OBJTYPE_TABLE).
* Add IP_FW_OBJ_LISTSIZE (get buffer size to hold IP_FW_OBJ_LIST data,
currently implemented for IPFW_OBJTYPE_TABLE).
* Add IP_FW_OBJ_INFO (requests info for one object of given type).

Some name changes:
s/ipfw_xtable_tlv/ipfw_obj_tlv/ (no table specifics)
s/ipfw_xtable_ntlv/ipfw_obj_ntlv/ (no table specifics)

Userland changes:
* Add do_set3() cmd to ipfw2 to ease dealing with op3-embedded opcodes.
* Add/improve support for destroy/info cmds.