History log of /openbsd-current/sys/netinet/ipsec_output.c
Revision Date Author Comments
# 1.98 11-Feb-2024 bluhm

Remove include netinet6/ip6_var.h from netinet/in_pcb.h.

OK mvs@


Revision tags: OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE OPENBSD_7_4_BASE
# 1.97 02-Jan-2022 jsg

spelling
ok jmc@ reads ok tb@


# 1.96 23-Dec-2021 bluhm

IPsec is not MP safe yet. To allow forwarding in parallel without
dirty hacks, it is better to protect IPsec input and output with
kernel lock. Not much is lost as crypto needs the kernel lock
anyway. From here we can refine the lock later.
Note that there is no kernel lock in the SPD lookup path. Goal is
to keep that lock free to allow fast forwarding with non IPsec
traffic.
tested by Hrvoje Popovski; OK tobhe@


# 1.95 20-Dec-2021 mvs

Use per-CPU counters for tunnel descriptor block (TDB) statistics.
'tdb_data' struct became unused and was removed.

Tested by Hrvoje Popovski.
ok bluhm@


# 1.94 11-Dec-2021 bluhm

Protect the write access to the TDB flags field with a mutex per
TDB. Clearing the timeout flags just before pool put in tdb_free()
does not make sense. Move this to tdb_delete(). While there make
the parentheses in the flag check consistent.
tested by Hrvoje Popovski; OK tobhe@


# 1.93 02-Dec-2021 bluhm

Allow to build kernel without IPSEC or INET6 defines.
OK mpi@ mvs@


# 1.92 25-Nov-2021 bluhm

Implement reference counting for IPsec tdbs. Not all cases are
covered yet, more ref counts to come. The timeouts are protected,
so the racy tdb_reaper() gets retired. The tdb_policy_head, onext
and inext lists are protected. All gettdb...() functions return a
tdb that is ref counted and has to be unrefed later. A flag ensures
that tdb_delete() is called only once.
Tested by Hrvoje Popovski; OK sthen@ mvs@ tobhe@


# 1.91 23-Oct-2021 tobhe

Retire asynchronous crypto API as it is no longer required by any driver and
adds unnecessary complexity. Dedicated crypto offloading devices are not common
anymore. Modern CPU crypto acceleration works synchronously, eliminating the need
for callbacks.

Replace all occurrences of crypto_dispatch() with crypto_invoke(), which is
blocking and only returns after the operation has completed or an error occurred.
Invoke callback functions directly from the consumer (e.g. IPsec, softraid)
instead of relying on the crypto driver to call crypto_done().

ok bluhm@ mvs@ patrick@


# 1.90 22-Oct-2021 bluhm

Make error handling in IPsec consistent. Pass errors to the callers.
OK tobhe@


# 1.89 13-Oct-2021 bluhm

The function crypto_dispatch() never returns an error. Make it
void and remove error handling in the callers.
OK patrick@ mvs@


# 1.88 13-Oct-2021 bluhm

The function ipip_output() was registered as .xf_output() xform
function. But it was never called via this pointer. It would have
immediately crashed as mp is always NULL when called via .xf_output().
Do not set .xf_output to ipip_output. This allows to pass only the
parameters which are actually needed and the control flow is clearer.
OK mpi@


# 1.87 05-Oct-2021 bluhm

Cleanup the error handling in ipsec ipip_output() and consistently
goto drop instead of return. An ENOBUFS should be EINVAL in IPv6
case. Also use combined packet and byte counter.
OK sthen@ dlg@


Revision tags: OPENBSD_7_0_BASE
# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during heavy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@
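
The split-read problem and the lockless read loop mentioned above can be illustrated with a small userland sketch in C11. This is not the kernel's gettime(9) implementation: `struct timeslot`, `slot_write()` and `slot_read()` are hypothetical names, and the generation-counter scheme is an assumption chosen for illustration. The 64-bit value is stored as two 32-bit halves, and the reader retries until it sees the same even generation before and after reading both halves.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a time slot; not the kernel's timehands. */
struct timeslot {
	_Atomic uint32_t gen;	/* odd while an update is in progress */
	_Atomic uint32_t lo;	/* low 32 bits of the 64-bit value */
	_Atomic uint32_t hi;	/* high 32 bits of the 64-bit value */
};

/* Single writer: mark the slot busy, store both halves, mark it stable. */
void
slot_write(struct timeslot *s, int64_t t)
{
	uint32_t g = atomic_load(&s->gen);

	atomic_store(&s->gen, g + 1);
	atomic_store(&s->lo, (uint32_t)((uint64_t)t & 0xffffffffu));
	atomic_store(&s->hi, (uint32_t)((uint64_t)t >> 32));
	atomic_store(&s->gen, g + 2);
}

/* Reader: retry until both halves were read under one even generation. */
int64_t
slot_read(struct timeslot *s)
{
	uint32_t g, lo, hi;

	do {
		g = atomic_load(&s->gen);
		lo = atomic_load(&s->lo);
		hi = atomic_load(&s->hi);
	} while ((g & 1) != 0 || g != atomic_load(&s->gen));

	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

A plain 64-bit load needs no such loop on hardware that can read 64 bits atomically, which is consistent with the note that 64-bit platforms pay no cost.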


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@
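
The kind of check described above can be sketched over a plain byte buffer. This is a userland illustration, not the code in ipsec_output.c: the function name `ip6_skip_exthdrs()` is hypothetical, and only the hop-by-hop (0), routing (43) and destination options (60) headers are handled, all of which encode their length in 8-byte units excluding the first 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define IP6_HDRLEN	40	/* size of the fixed IPv6 header */

/*
 * Walk the extension header chain and return the offset of the first
 * header that is not hop-by-hop, routing or destination options.
 * Returns -1 if any header would extend past the end of the packet.
 */
long
ip6_skip_exthdrs(const uint8_t *pkt, size_t len, uint8_t *nxtp)
{
	size_t off = IP6_HDRLEN, hlen;
	uint8_t nxt;

	if (len < IP6_HDRLEN)
		return -1;
	nxt = pkt[6];		/* Next Header field of the fixed header */

	while (nxt == 0 || nxt == 43 || nxt == 60) {
		/* The next-header and length bytes must be present. */
		if (len - off < 2)
			return -1;
		hlen = ((size_t)pkt[off + 1] + 1) * 8;
		/* The whole extension header must fit in the packet. */
		if (len - off < hlen)
			return -1;
		nxt = pkt[off];
		off += hlen;
	}
	*nxtp = nxt;
	return (long)off;
}
```

Comparing against `len - off` rather than computing `off + hlen` keeps the arithmetic from wrapping, since `off` never exceeds `len` inside the loop.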


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt
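
The two corrected parameters can be written down as a small lookup, assuming the values named above: a 128-byte block for HMAC-SHA-384/512 (64 bytes for HMAC-SHA-256) and truncation of the authenticator to half the digest length. The struct and function names are hypothetical and do not come from the kernel source.

```c
#include <stddef.h>

/* Hypothetical parameter record for an HMAC-SHA2 authenticator. */
struct hmac_sha2_info {
	size_t block_bytes;	/* message block size used for key padding */
	size_t icv_bytes;	/* truncated authenticator length on the wire */
};

/*
 * Fill in the parameters for a digest size of 256, 384 or 512 bits.
 * Returns 0 on success, -1 for any other digest size.
 */
int
hmac_sha2_lookup(unsigned int digest_bits, struct hmac_sha2_info *info)
{
	switch (digest_bits) {
	case 256:
		info->block_bytes = 64;
		break;
	case 384:
	case 512:
		info->block_bytes = 128;
		break;
	default:
		return -1;
	}
	/* Truncate to nnn/2 bits, expressed in bytes. */
	info->icv_bytes = digest_bits / 2 / 8;
	return 0;
}
```

For SHA-256 this gives a 16-byte (128-bit) authenticator in place of the earlier 12 bytes (96 bits), which is why peers on the two sides of this change cannot interoperate.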


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraad, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only need a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now get
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.
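
The overflow in question is the product of the tick rate and a timeout in seconds exceeding what an int can hold. The sketch below shows one way to guard such a conversion by saturating. It does not reproduce what hzto() does internally, and `secs_to_ticks()` is a hypothetical helper.

```c
#include <limits.h>

/*
 * Convert a timeout in seconds to clock ticks, saturating at INT_MAX
 * instead of letting secs * hz overflow.
 */
int
secs_to_ticks(long long secs, int hz)
{
	if (secs <= 0 || hz <= 0)
		return 0;
	/* secs * hz would exceed INT_MAX: clamp instead of wrapping. */
	if (secs > INT_MAX / hz)
		return INT_MAX;
	return (int)(secs * hz);
}
```

The comparison against `INT_MAX / hz` is done before multiplying, so the overflowing product is never computed.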


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.97 02-Jan-2022 jsg

spelling
ok jmc@ reads ok tb@


# 1.96 23-Dec-2021 bluhm

IPsec is not MP safe yet. To allow forwarding in parallel without
dirty hacks, it is better to protect IPsec input and output with
kernel lock. Not much is lost as crypto needs the kernel lock
anyway. From here we can refine the lock later.
Note that there is no kernel lock in the SPD lockup path. Goal is
to keep that lock free to allow fast forwarding with non IPsec
traffic.
tested by Hrvoje Popovski; OK tobhe@


# 1.95 20-Dec-2021 mvs

Use per-CPU counters for tunnel descriptor block (TDB) statistics.
'tdb_data' struct became unused and was removed.

Tested by Hrvoje Popovski.
ok bluhm@


# 1.94 11-Dec-2021 bluhm

Protect the write access to the TDB flags field with a mutex per
TDB. Clearing the timeout flags just before pool put in tdb_free()
does not make sense. Move this to tdb_delete(). While there make
the parentheses in the flag check consistent.
tested by Hrvoje Popovski; OK tobhe@


# 1.93 02-Dec-2021 bluhm

Allow to build kernel without IPSEC or INET6 defines.
OK mpi@ mvs@


# 1.92 25-Nov-2021 bluhm

Implement reference counting for IPsec tdbs. Not all cases are
covered yet, more ref counts to come. The timeouts are protected,
so the racy tdb_reaper() gets retired. The tdb_policy_head, onext
and inext lists are protected. All gettdb...() functions return a
tdb that is ref counted and has to be unrefed later. A flag ensures
that tdb_delete() is called only once.
Tested by Hrvoje Popovski; OK sthen@ mvs@ tobhe@


# 1.91 23-Oct-2021 tobhe

Retire asynchronous crypto API as it is no longer required by any driver and
adds unnecessary complexity. Dedicated crypto offloading devices are not common
anymore. Modern CPU crypto acceleration works synchronously, eliminating the need
for callbacks.

Replace all occurrences of crypto_dispatch() with crypto_invoke(), which is
blocking and only returns after the operation has completed or an error occured.
Invoke callback functions directly from the consumer (e.g. IPsec, softraid)
instead of relying on the crypto driver to call crypto_done().

ok bluhm@ mvs@ patrick@


# 1.90 22-Oct-2021 bluhm

Make error handling in IPsec consistent. Pass errors to the callers.
OK tobhe@


# 1.89 13-Oct-2021 bluhm

The function crypto_dispatch() never returns an error. Make it
void and remove error handling in the callers.
OK patrick@ mvs@


# 1.88 13-Oct-2021 bluhm

The function ipip_output() was registered as .xf_output() xform
function. But was is never called via this pointer. It would have
immediatley crashed as mp is always NULL when called via .xf_output().
Do not set .xf_output to ipip_output. This allows to pass only the
parameters which are actually needed and the control flow is clearer.
OK mpi@


# 1.87 05-Oct-2021 bluhm

Cleanup the error handling in ipsec ipip_output() and consistently
goto drop instead of return. An ENOBUFS should be EINVAL in IPv6
case. Also use combined packet and byte counter.
OK sthen@ dlg@


Revision tags: OPENBSD_7_0_BASE
# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during havy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows running isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pick up the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to look up the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
interface in addition to enc0 is required per rdomain, e.g. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 bytes (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt
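
The two sizes involved in 1.42 can be written out as a small C sketch. The
enum and function names are hypothetical and chosen for illustration only;
the numbers follow the block sizes and the nnn/2-bit truncation rule cited
in the log message above.

```c
#include <stddef.h>

/* Hypothetical identifiers for illustration; not the kernel's names. */
enum sha2_variant { SHA2_256 = 256, SHA2_384 = 384, SHA2_512 = 512 };

/* HMAC message block size in bytes: 64 for SHA-256, 128 for SHA-384/512. */
static size_t
hmac_sha2_block_bytes(enum sha2_variant v)
{
	return (v == SHA2_256) ? 64 : 128;
}

/* Authenticator length in bytes after truncation to nnn/2 bits. */
static size_t
hmac_sha2_icv_bytes(enum sha2_variant v)
{
	return ((size_t)v / 2) / 8;
}
```

With these rules HMAC-SHA-256 carries a 16-byte authenticator instead of
the 12 bytes (96 bits) used before the change, which is why old and new
peers cannot interoperate.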


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraad, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only need a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now get
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
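
The varargs pitfall described in 1.25 can be illustrated with a generic C
sketch. The helper below is hypothetical and unrelated to the actual
ip_output() prototype; it only shows why a terminating null pointer passed
through "..." should carry an explicit pointer type.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts pointer arguments up to a null pointer. */
static int
count_ptrs(const char *first, ...)
{
	va_list ap;
	const char *p;
	int n = 0;

	va_start(ap, first);
	/*
	 * Each va_arg() reads a full pointer. If the caller passed a bare
	 * NULL that expands to the int constant 0, only an int may have
	 * been passed, so the terminator must be cast to a pointer type.
	 */
	for (p = first; p != NULL; p = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}
```

A call such as count_ptrs("a", "b", (void *)NULL) passes a pointer-sized
terminator regardless of how the platform defines NULL.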


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
does weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.94 11-Dec-2021 bluhm

Protect the write access to the TDB flags field with a mutex per
TDB. Clearing the timeout flags just before pool put in tdb_free()
does not make sense. Move this to tdb_delete(). While there make
the parentheses in the flag check consistent.
tested by Hrvoje Popovski; OK tobhe@


# 1.93 02-Dec-2021 bluhm

Allow to build kernel without IPSEC or INET6 defines.
OK mpi@ mvs@


# 1.92 25-Nov-2021 bluhm

Implement reference counting for IPsec tdbs. Not all cases are
covered yet, more ref counts to come. The timeouts are protected,
so the racy tdb_reaper() gets retired. The tdb_policy_head, onext
and inext lists are protected. All gettdb...() functions return a
tdb that is ref counted and has to be unrefed later. A flag ensures
that tdb_delete() is called only once.
Tested by Hrvoje Popovski; OK sthen@ mvs@ tobhe@


# 1.91 23-Oct-2021 tobhe

Retire asynchronous crypto API as it is no longer required by any driver and
adds unnecessary complexity. Dedicated crypto offloading devices are not common
anymore. Modern CPU crypto acceleration works synchronously, eliminating the need
for callbacks.

Replace all occurrences of crypto_dispatch() with crypto_invoke(), which is
blocking and only returns after the operation has completed or an error occured.
Invoke callback functions directly from the consumer (e.g. IPsec, softraid)
instead of relying on the crypto driver to call crypto_done().

ok bluhm@ mvs@ patrick@


# 1.90 22-Oct-2021 bluhm

Make error handling in IPsec consistent. Pass errors to the callers.
OK tobhe@


# 1.89 13-Oct-2021 bluhm

The function crypto_dispatch() never returns an error. Make it
void and remove error handling in the callers.
OK patrick@ mvs@


# 1.88 13-Oct-2021 bluhm

The function ipip_output() was registered as .xf_output() xform
function. But was is never called via this pointer. It would have
immediatley crashed as mp is always NULL when called via .xf_output().
Do not set .xf_output to ipip_output. This allows to pass only the
parameters which are actually needed and the control flow is clearer.
OK mpi@


# 1.87 05-Oct-2021 bluhm

Cleanup the error handling in ipsec ipip_output() and consistently
goto drop instead of return. An ENOBUFS should be EINVAL in IPv6
case. Also use combined packet and byte counter.
OK sthen@ dlg@


Revision tags: OPENBSD_7_0_BASE
# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during havy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows running isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pick up the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to look up the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, e.g. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 bytes (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Test by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is knda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnellinng when source-routing header is present.

ok by deraad, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only need a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now get
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Soultion is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
does weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistant throughout the code to be a relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.91 23-Oct-2021 tobhe

Retire asynchronous crypto API as it is no longer required by any driver and
adds unnecessary complexity. Dedicated crypto offloading devices are not common
anymore. Modern CPU crypto acceleration works synchronously, eliminating the need
for callbacks.

Replace all occurrences of crypto_dispatch() with crypto_invoke(), which is
blocking and only returns after the operation has completed or an error occured.
Invoke callback functions directly from the consumer (e.g. IPsec, softraid)
instead of relying on the crypto driver to call crypto_done().

ok bluhm@ mvs@ patrick@


# 1.90 22-Oct-2021 bluhm

Make error handling in IPsec consistent. Pass errors to the callers.
OK tobhe@


# 1.89 13-Oct-2021 bluhm

The function crypto_dispatch() never returns an error. Make it
void and remove error handling in the callers.
OK patrick@ mvs@


# 1.88 13-Oct-2021 bluhm

The function ipip_output() was registered as .xf_output() xform
function. But was is never called via this pointer. It would have
immediatley crashed as mp is always NULL when called via .xf_output().
Do not set .xf_output to ipip_output. This allows to pass only the
parameters which are actually needed and the control flow is clearer.
OK mpi@


# 1.87 05-Oct-2021 bluhm

Cleanup the error handling in ipsec ipip_output() and consistently
goto drop instead of return. An ENOBUFS should be EINVAL in IPv6
case. Also use combined packet and byte counter.
OK sthen@ dlg@


Revision tags: OPENBSD_7_0_BASE
# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during havy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Test by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.90 22-Oct-2021 bluhm

Make error handling in IPsec consistent. Pass errors to the callers.
OK tobhe@


# 1.89 13-Oct-2021 bluhm

The function crypto_dispatch() never returns an error. Make it
void and remove error handling in the callers.
OK patrick@ mvs@


# 1.88 13-Oct-2021 bluhm

The function ipip_output() was registered as .xf_output() xform
function. But it was never called via this pointer. It would have
immediately crashed as mp is always NULL when called via .xf_output().
Do not set .xf_output to ipip_output. This allows passing only the
parameters which are actually needed and the control flow is clearer.
OK mpi@


# 1.87 05-Oct-2021 bluhm

Cleanup the error handling in ipsec ipip_output() and consistently
goto drop instead of return. An ENOBUFS should be EINVAL in IPv6
case. Also use combined packet and byte counter.
OK sthen@ dlg@


Revision tags: OPENBSD_7_0_BASE
# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during heavy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positives with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@



# 1.89 13-Oct-2021 bluhm

The function crypto_dispatch() never returns an error. Make it
void and remove error handling in the callers.
OK patrick@ mvs@


# 1.88 13-Oct-2021 bluhm

The function ipip_output() was registered as .xf_output() xform
function. But was is never called via this pointer. It would have
immediatley crashed as mp is always NULL when called via .xf_output().
Do not set .xf_output to ipip_output. This allows to pass only the
parameters which are actually needed and the control flow is clearer.
OK mpi@


# 1.87 05-Oct-2021 bluhm

Cleanup the error handling in ipsec ipip_output() and consistently
goto drop instead of return. An ENOBUFS should be EINVAL in IPv6
case. Also use combined packet and byte counter.
OK sthen@ dlg@


Revision tags: OPENBSD_7_0_BASE
# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during havy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@
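
The kind of length check described here can be sketched on a plain byte
buffer. This is an illustration only, assuming a flat packet instead of an
mbuf chain; the demo_ names are hypothetical and the function is not the
code from ipsec_output.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_IP6_HDRLEN		40	/* fixed IPv6 header size */
#define DEMO_PROTO_HOPOPTS	0	/* hop-by-hop options */
#define DEMO_PROTO_ROUTING	43	/* routing header */
#define DEMO_PROTO_DSTOPTS	60	/* destination options */

/*
 * Walk the extension header chain of an IPv6 packet held in a flat buffer.
 * Returns the offset of the first header that is not one of the three
 * extension headers above, or -1 if a header would extend past 'len'.
 */
static int
demo_walk_ip6_chain(const uint8_t *pkt, size_t len)
{
	size_t off = DEMO_IP6_HDRLEN;
	uint8_t nxt;

	assert(pkt != NULL);
	if (len < DEMO_IP6_HDRLEN)
		return -1;
	nxt = pkt[6];			/* next header field */

	while (nxt == DEMO_PROTO_HOPOPTS || nxt == DEMO_PROTO_ROUTING ||
	    nxt == DEMO_PROTO_DSTOPTS) {
		size_t extlen;

		/* Need the two leading bytes of the extension header. */
		if (len - off < 2)
			return -1;
		/* Length is in 8-byte units, not counting the first 8. */
		extlen = ((size_t)pkt[off + 1] + 1) * 8;
		/* The whole extension header must lie inside the packet. */
		if (extlen > len - off)
			return -1;
		nxt = pkt[off];
		off += extlen;
	}
	return (int)off;
}
```

The comparisons are written as "extlen > len - off" so that the check
itself cannot overflow; off never exceeds len inside the loop.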


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt
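
The two parameters this fix touches can be written down as small lookup
helpers. This is a sketch for illustration, with hypothetical demo_ names;
the values follow the commit message (128-byte block for HMAC-SHA512/384,
truncation to nnn/2 bits) plus the 64-byte block size of SHA-256.

```c
/*
 * Message block size in bytes for HMAC-SHA2, keyed by digest size in bits.
 * Returns -1 for an unknown digest size.
 */
static int
demo_hmac_sha2_block_bytes(int digest_bits)
{
	switch (digest_bits) {
	case 256:
		return 64;
	case 384:
	case 512:
		return 128;
	default:
		return -1;
	}
}

/*
 * Truncated authenticator length in bits: half of the digest size,
 * i.e. 128, 192 and 256 bits instead of the old fixed 96 bits.
 */
static int
demo_hmac_sha2_icv_bits(int digest_bits)
{
	switch (digest_bits) {
	case 256:
	case 384:
	case 512:
		return digest_bits / 2;
	default:
		return -1;
	}
}
```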


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraad, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only need a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now get
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
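
The varargs pitfall can be shown with a small NULL-terminated argument
list. This is a generic sketch with a hypothetical function name, not the
ip_output() call this commit changed. If NULL is defined as a plain 0, an
unadorned NULL in the variable part is passed as an int, which need not be
as wide as a pointer; the (void *) cast makes the sentinel a real pointer.

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Count the string arguments up to, but not including, the terminating
 * null pointer. The sentinel in the variable part must be passed as a
 * pointer, e.g. (void *)NULL, so that va_arg() reads a full-width value.
 */
static int
demo_count_ptrs(const char *first, ...)
{
	va_list ap;
	const char *p;
	int n = 0;

	va_start(ap, first);
	for (p = first; p != NULL; p = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}
```

A call then looks like demo_count_ptrs("a", "b", (void *)NULL).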


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.
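
The overflow in (hz * timeout) can be illustrated with a clamped
conversion. This is a hypothetical helper written for this note, not
hzto() itself and not the code used in the tree; it only shows why a long
expiration in seconds must be checked before it is multiplied by hz.

```c
#include <assert.h>
#include <limits.h>

/*
 * Convert a timeout in seconds to ticks for a tick rate of 'hz'.
 * If the product would not fit in an int, clamp to INT_MAX instead of
 * letting the multiplication overflow.
 */
static int
demo_secs_to_ticks(unsigned long long secs, int hz)
{
	assert(hz > 0);
	if (secs > (unsigned long long)INT_MAX / (unsigned long long)hz)
		return INT_MAX;
	return (int)(secs * (unsigned long long)hz);
}
```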


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
does weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.86 27-Jul-2021 mvs

Revert "Use per-CPU counters for tunnel descriptor block" diff.

Panic reported by Hrvoje Popovski.


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during heavy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@
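
The split-read problem and a lockless read loop can be sketched with C11
atomics. This is an illustration under stated assumptions, not the kernel's
timehands code: the demo_ names are hypothetical, a single writer is
assumed, and the 64-bit value is deliberately stored as two 32-bit halves
to mimic a platform without atomic 64-bit reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a timehands-like structure. */
struct demo_timehands {
	atomic_uint gen;	/* odd while an update is in progress */
	_Atomic uint32_t lo;	/* low 32 bits of the seconds value */
	_Atomic uint32_t hi;	/* high 32 bits of the seconds value */
};

/* Single writer: bump the generation around the two-word update. */
static void
demo_settime(struct demo_timehands *th, int64_t sec)
{
	unsigned int g;

	assert(th != NULL);
	g = atomic_load(&th->gen);
	atomic_store(&th->gen, g + 1);		/* update in progress */
	atomic_store(&th->lo, (uint32_t)sec);
	atomic_store(&th->hi, (uint32_t)((uint64_t)sec >> 32));
	atomic_store(&th->gen, g + 2);		/* publish */
}

/*
 * Reader: retry until both halves were read under the same, even
 * generation, so a half-updated value is never returned.
 */
static int64_t
demo_gettime(struct demo_timehands *th)
{
	unsigned int g0, g1;
	uint32_t lo, hi;

	assert(th != NULL);
	do {
		g0 = atomic_load(&th->gen);
		lo = atomic_load(&th->lo);
		hi = atomic_load(&th->hi);
		g1 = atomic_load(&th->gen);
	} while (g0 != g1 || (g0 & 1) != 0);
	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

The retry loop is the extra cost on 32-bit platforms that the message
mentions; where a 64-bit load is atomic, a single read is enough.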


# 1.85 26-Jul-2021 mvs

Use per-CPU counters for tunnel descriptor block (tdb) statistics.
'tdb_data' struct became unused and was removed.

ok bluhm@


# 1.84 26-Jul-2021 bluhm

Do not queue crypto operations for IPsec. The packet entries in
task queues were unlimited and could overflow during heavy traffic.
Even if we still use hardware drivers that sleep, softnet task
instead of soft interrupt can handle this now. Without queues net
lock is inherited and kernel lock is only needed once per packet.
This results in less lock contention and faster IPsec.
Also protect tdb drop counters with net lock and avoid a leak in
crypto dispatch error handling.
intense testing Hrvoje Popovski; OK mpi@


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows running isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pick up the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to look up the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that, in addition to
enc0, a primary enc(4) interface is required per rdomain, e.g. enc1
for rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 bytes (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.83 21-Jul-2021 bluhm

Propagate errors from crypto_invoke() and count them in IPsec. They
should not happen, but always check error conditions. tq is never
NULL, remove the check. tdb->tdb_odrops++ is not MP safe, but will
be addressed separately in ipsec_output_cb().
OK mvs@


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Test by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is knda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnellinng when source-routing header is present.

ok by deraad, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs, now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
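
A minimal sketch of the varargs issue described in 1.25, not the kernel code
itself: count_ptrs() is a hypothetical helper that walks its pointer
arguments until it reads a null pointer. An uncast NULL may expand to a plain
int 0, which is not widened to pointer size in a variadic call, so the
terminator is passed with an explicit (void *) cast.

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical helper (name is an assumption, not from the source tree):
 * counts the pointer arguments that precede a terminating null pointer.
 */
int
count_ptrs(const char *first, ...)
{
	va_list ap;
	const char *p = first;
	int n = 0;

	va_start(ap, first);
	while (p != NULL) {
		n++;
		/* Reads a full-width pointer; the caller must pass one. */
		p = va_arg(ap, const char *);
	}
	va_end(ap);
	return n;
}
```

A caller would write count_ptrs("a", "b", (void *)NULL) so that the
terminator occupies a full pointer-sized argument slot.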


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.82 08-Jul-2021 bluhm

Debug printfs in encdebug were inconsistent, some missing newlines
produced ugly output. Move the function name and the newline into
the DPRINTF macro. This simplifies the debug statements.
OK tobhe@


# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@
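
The kind of bounds check described in 1.71 can be sketched roughly as
follows. This is an illustration under stated assumptions, not the code from
ipsec_output.c: ip6_upper_offset() and IP6_HDR_LEN are hypothetical names,
the packet is assumed to be one contiguous buffer rather than an mbuf chain,
and only the hop-by-hop (0), routing (43) and destination options (60)
headers are walked.

```c
#include <stddef.h>
#include <stdint.h>

#define IP6_HDR_LEN	40	/* fixed IPv6 header size in bytes */

/*
 * Hypothetical helper: returns the offset of the first header that is
 * not one of the walked extension headers, or -1 if a header would
 * extend past the end of the packet.
 */
long
ip6_upper_offset(const uint8_t *pkt, size_t len)
{
	size_t off = IP6_HDR_LEN;
	size_t extlen;
	uint8_t nxt;

	if (len < IP6_HDR_LEN)
		return -1;
	nxt = pkt[6];		/* next-header field of the fixed header */

	while (nxt == 0 || nxt == 43 || nxt == 60) {
		/* Need the next-header and length bytes (off <= len here). */
		if (len - off < 2)
			return -1;
		/* Length field counts 8-byte units, excluding the first. */
		extlen = ((size_t)pkt[off + 1] + 1) * 8;
		if (extlen > len - off)
			return -1;
		nxt = pkt[off];
		off += extlen;
	}
	return (long)off;
}
```

Each step checks the remaining length before reading or skipping a header,
so a bogus length field cannot move the offset beyond the packet.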


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt
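
The two parameters corrected in 1.42 can be written down as a small sketch.
The helper names below are hypothetical and are not the identifiers used in
the kernel; the values follow the commit text: a 128 byte block for
HMAC-SHA384/512 (64 bytes for SHA-256), and truncation of the authenticator
to half the digest size per RFC 4868.

```c
#include <stddef.h>

/*
 * Hypothetical helpers; digest_bits is expected to be 256, 384 or 512.
 */

/* HMAC message block size in bytes for the given SHA-2 digest size. */
size_t
hmac_sha2_block_bytes(unsigned int digest_bits)
{
	return (digest_bits > 256) ? 128 : 64;
}

/* Truncated authenticator length in bits: nnn/2 instead of a fixed 96. */
size_t
hmac_sha2_icv_bits(unsigned int digest_bits)
{
	return digest_bits / 2;
}
```

For SHA-256 this gives a 128 bit authenticator where the older code sent 96
bits, which is why the commit warns about incompatibility with peers that
still truncate to 96.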


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan




# 1.81 07-Jul-2021 bluhm

Fix whitespaces in IPsec code.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Test by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs, now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
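
A minimal sketch of the pattern this message describes, assuming a
hypothetical variadic function count_ptrs() that is not part of the
OpenBSD source. The point is only that the terminating NULL is passed
with an explicit (void *) cast so that a full-width pointer is pushed:

```c
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical example: count pointer arguments up to a null terminator.
 * The caller must terminate the list with (void *)NULL, not a bare NULL,
 * because a bare NULL may be passed as a plain int through varargs.
 */
static int
count_ptrs(const char *first, ...)
{
	va_list ap;
	const char *p;
	int n = 0;

	va_start(ap, first);
	for (p = first; p != NULL; p = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}
```

A call would look like count_ptrs("a", "b", (void *)NULL); without the
cast the callee could read a truncated or garbage terminator on a
platform where int is narrower than a pointer.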


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.80 18-Jun-2021 bluhm

The crypto(9) framework used by IPsec runs on a kernel task that
is protected by kernel lock. There were crashes in swcr_authenc()
when it was accessing swcr_sessions. As a quick fix, protect all
calls from network stack to crypto with kernel lock. This also
covers the rekeying case that is called from pfkey via tdb_init().
OK mvs@


Revision tags: OPENBSD_6_9_BASE
# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positives with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.79 10-Mar-2021 jsg

spelling

ok gnezdo@ semarie@ mpi@


Revision tags: OPENBSD_6_8_BASE
# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows running isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pick up the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to look up the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, e.g. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 bytes (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt
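
The two parameters named in 1.42 can be summarized in a small table. The
following is a minimal sketch, not the kernel's code: the struct, table and
function names are hypothetical. The block sizes are those of the underlying
hash functions, and the truncation rule (half the digest size) is the one
the commit message attributes to RFC4868.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter record; not a structure from the OpenBSD tree. */
struct hmac_sha2_params {
	const char *name;
	size_t digest_bits;	/* full hash output size in bits */
	size_t block_bytes;	/* HMAC (message) block size in bytes */
};

const struct hmac_sha2_params hmac_sha2_table[] = {
	{ "HMAC-SHA-256", 256,  64 },
	{ "HMAC-SHA-384", 384, 128 },	/* 128, not 64: bug (1) */
	{ "HMAC-SHA-512", 512, 128 },	/* 128, not 64: bug (1) */
};

/*
 * Authenticator length in bytes after truncation to nnn/2 bits,
 * replacing the fixed 96 bits (12 bytes) described as bug (2).
 */
size_t
hmac_sha2_icv_bytes(const struct hmac_sha2_params *p)
{
	assert(p != NULL);
	return (p->digest_bits / 2) / 8;
}
```

With this rule HMAC-SHA-256 carries a 16 byte authenticator instead of 12,
which is why peers that still truncate to 96 bits stop interoperating.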


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
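
The varargs problem described in 1.25 can be illustrated outside the kernel.
This is a hedged sketch with hypothetical function names, not the ip_output()
call that the commit touched: when NULL is defined as a plain integer
constant, an unprototyped (variadic) argument may be passed as an int, so the
callee reads a pointer-sized value that was never fully written. Casting the
sentinel to a pointer type avoids that.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical helper: counts string arguments up to a null
 * pointer sentinel.
 */
int
count_ptr_args(const char *first, ...)
{
	va_list ap;
	const char *p;
	int n = 0;

	va_start(ap, first);
	for (p = first; p != NULL; p = va_arg(ap, const char *))
		n++;
	va_end(ap);
	return n;
}

/* The sentinel is cast so that a full-width pointer is passed. */
void
varargs_null_demo(void)
{
	assert(count_ptr_args("a", "b", (void *)NULL) == 2);
}
```

Revision 1.51 later dropped such casts and 1.52 removed the varargs
interface from ip_output() altogether, so the cast matters only for the
historical variadic form.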


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.78 22-Sep-2020 tobhe

whitespace


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@
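
The "lockless read loop" mentioned in 1.77 can be sketched with a generation
counter. The code below is an illustration under stated assumptions, not the
kernel's timehands implementation: the struct and function names are
hypothetical, C11 atomics stand in for kernel memory barriers, and a single
writer is assumed. The 64-bit value is stored as two 32-bit halves, which is
exactly the situation where a plain read could be split.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; not the kernel's timehands structure. */
struct clock_sketch {
	atomic_uint gen;	/* 0 means "update in progress" */
	_Atomic uint32_t lo;	/* low half of the 64-bit time */
	_Atomic uint32_t hi;	/* high half of the 64-bit time */
};

/* Single writer: invalidate, update both halves, publish a new generation. */
void
clock_sketch_set(struct clock_sketch *c, int64_t now)
{
	unsigned int gen;

	assert(c != NULL);
	gen = atomic_load(&c->gen);
	atomic_store(&c->gen, 0);
	atomic_store(&c->lo, (uint32_t)now);
	atomic_store(&c->hi, (uint32_t)((uint64_t)now >> 32));
	if (++gen == 0)		/* 0 is reserved, skip it on wrap */
		gen = 1;
	atomic_store(&c->gen, gen);
}

/*
 * Reader: retry until both halves were read under the same nonzero
 * generation.  Must only be called after the first clock_sketch_set().
 */
int64_t
clock_sketch_get(struct clock_sketch *c)
{
	unsigned int gen;
	uint32_t lo, hi;

	do {
		gen = atomic_load(&c->gen);
		lo = atomic_load(&c->lo);
		hi = atomic_load(&c->hi);
	} while (gen == 0 || gen != atomic_load(&c->gen));

	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

The retry loop is the extra cost on 32-bit platforms that the commit message
describes; where a 64-bit load is atomic, a single read suffices.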


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positives with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.77 24-Jun-2020 cheloha

kernel: use gettime(9)/getuptime(9) in lieu of time_second(9)/time_uptime(9)

time_second(9) and time_uptime(9) are widely used in the kernel to
quickly get the system UTC or system uptime as a time_t. However,
time_t is 64-bit everywhere, so it is not generally safe to use them
on 32-bit platforms: you have a split-read problem if your hardware
cannot perform atomic 64-bit reads.

This patch replaces time_second(9) with gettime(9), a safer successor
interface, throughout the kernel. Similarly, time_uptime(9) is replaced
with getuptime(9).

There is a performance cost on 32-bit platforms in exchange for
eliminating the split-read problem: instead of two register reads you
now have a lockless read loop to pull the values from the timehands.
This is really not *too* bad in the grand scheme of things, but
compared to what we were doing before it is several times slower.

There is no performance cost on 64-bit (__LP64__) platforms.

With input from visa@, dlg@, and tedu@.

Several bugs squashed by visa@.

ok kettenis@


Revision tags: OPENBSD_6_7_BASE
# 1.76 23-Apr-2020 tobhe

Add support for autmatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic seperated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept seperate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' afer IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows running isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pick up the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to look up the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
interface in addition to enc0 is required per rdomain, e.g. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt
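
The two parameters touched by this fix can be summarised in a small table.
The following is a minimal sketch, not the kernel's code: the struct, its
field names and the helper icv_bits() are hypothetical, and the values are
the block sizes and half-length truncation described in the entry above.

```c
/* Hypothetical descriptor; names are illustrative, not the kernel's. */
struct hmac_sha2_params {
	const char *name;
	unsigned int digest_bits;	/* full hash output size in bits */
	unsigned int block_bytes;	/* message block size used by HMAC */
};

/* SHA-256 uses 64-byte blocks; SHA-384 and SHA-512 use 128-byte blocks. */
static const struct hmac_sha2_params sha2_table[] = {
	{ "HMAC-SHA-256", 256, 64 },
	{ "HMAC-SHA-384", 384, 128 },
	{ "HMAC-SHA-512", 512, 128 },
};

/* The authenticator is the digest truncated to half its length (nnn/2). */
static unsigned int
icv_bits(const struct hmac_sha2_params *p)
{
	return p->digest_bits / 2;
}
```

Under these assumptions icv_bits() yields 128, 192 and 256 bits for the
three entries, rather than the 96 bits used before the change.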


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs, now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@
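
The varargs pitfall described here can be illustrated outside the kernel.
This is a minimal sketch with a hypothetical helper, count_ptrs(), that
walks pointer arguments until it sees a null terminator; it is not the
ip_output() code. Because NULL may expand to a plain integer constant, the
terminator is cast to (void *) at the call site so that a full pointer-sized
value is passed through the variable argument list.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts pointer arguments up to a null terminator. */
static int
count_ptrs(void *first, ...)
{
	va_list ap;
	void *p;
	int n = 0;

	va_start(ap, first);
	/* Each variadic argument is read back as a full-width pointer. */
	for (p = first; p != NULL; p = va_arg(ap, void *))
		n++;
	va_end(ap);
	return n;
}
```

A call would look like count_ptrs(&a, &b, (void *)NULL). Revisions 1.51 and
1.52 in this log later drop the casts and the varargs interface.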


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.
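
The overflow being avoided is the product (hz * timeout) exceeding the range
of an int tick count. The sketch below is only an illustration of that
concern, with a hypothetical helper secs_to_ticks_clamped(); it does not
reproduce how hzto() itself is implemented.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a lifetime in seconds to clock ticks,
 * clamping to INT_MAX instead of letting (hz * secs) overflow.
 */
static int
secs_to_ticks_clamped(uint64_t secs, int hz)
{
	if (hz <= 0)
		return 0;
	/* Compare before multiplying so the product cannot wrap. */
	if (secs > (uint64_t)INT_MAX / (uint64_t)hz)
		return INT_MAX;
	return (int)(secs * (uint64_t)hz);
}
```

With hz = 100, a lifetime of ten seconds maps to 1000 ticks, while an
extremely long lifetime is capped at INT_MAX rather than wrapping around.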


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.76 23-Apr-2020 tobhe

Add support for automatically moving traffic between rdomains on ipsec(4)
encryption or decryption. This allows us to keep plaintext and encrypted
network traffic separated and reduces the attack surface for network
sidechannel attacks.

The only way to reach the inner rdomain from outside is by successful
decryption and integrity verification through the responsible Security
Association (SA).
The only way for internal traffic to get out is getting encrypted and
moved through the outgoing SA.
Multiple plaintext rdomains can share the same encrypted rdomain while
the unencrypted packets are still kept separate.
The encrypted and unencrypted rdomains can have different default routes.

The rdomains can be configured with the new SADB_X_EXT_RDOMAIN pfkey
extension. Each SA (tdb) gets a new attribute 'tdb_rdomain_post'.
If this differs from 'tdb_rdomain' then the packet is moved to
'tdb_rdomain_post' after IPsec processing.

Flows and outgoing IPsec SAs are installed in the plaintext rdomain,
incoming IPsec SAs are installed in the encrypted rdomain.
IPCOMP SAs are always installed in the plaintext rdomain.
They can be viewed with 'route -T X exec ipsecctl -sa' where X is the
rdomain ID.

As the kernel does not create encX devices automatically when creating
rdomains they have to be added by hand with ifconfig for IPsec to work
in non-default rdomains.

discussed with chris@ and kn@
ok markus@, patrick@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE OPENBSD_6_6_BASE
# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@
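
The kind of check described in this entry can be sketched on a flat byte
buffer. This is an illustration under simplifying assumptions, not the
kernel code: the real output path works on mbuf chains, and the function
name ip6_ext_chain_end() and the constants below are hypothetical. The walk
refuses to step past the packet length at every extension header.

```c
#include <stddef.h>
#include <stdint.h>

#define IP6_HDR_LEN	40	/* fixed IPv6 header size */
#define NH_HOPOPTS	0	/* hop-by-hop options */
#define NH_ROUTING	43	/* routing header */
#define NH_DSTOPTS	60	/* destination options */

/*
 * Hypothetical helper: return the offset just past the chain of
 * hop-by-hop, routing and destination option headers, or -1 if any
 * header would extend beyond the packet length.
 */
static long
ip6_ext_chain_end(const uint8_t *pkt, size_t len)
{
	size_t off = IP6_HDR_LEN;
	size_t hlen;
	uint8_t nxt;

	if (len < IP6_HDR_LEN)
		return -1;
	nxt = pkt[6];			/* next-header field of the IPv6 header */
	while (nxt == NH_HOPOPTS || nxt == NH_ROUTING || nxt == NH_DSTOPTS) {
		if (len - off < 2)	/* need next-header and length bytes */
			return -1;
		/* Length is in 8-octet units, not counting the first 8 octets. */
		hlen = ((size_t)pkt[off + 1] + 1) * 8;
		if (hlen > len - off)	/* header must lie within the packet */
			return -1;
		nxt = pkt[off];
		off += hlen;
	}
	return (long)off;
}
```

For a 48-byte packet carrying one 8-byte hop-by-hop header the function
returns 48; if the same bytes are presented with a length of 44, the header
no longer fits and the walk fails with -1.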


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@



# 1.75 14-Sep-2018 mestre

Initialize the TDB to NULL in ipsec_common_input() and
ipsec_{input,output}_cb() so that in the case of sending or receiving a bogus
mbuf (NULL) we don't end up trying to dereference the TDB, while being an
uninitialized pointer, to increase the drops.

Coverity IDs 1473312, 1473313 and 1473317.

OK mpi@ visa@


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Test by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Solution is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.74 28-Aug-2018 mpi

Add per-TDB counters and a new SADB extension to export them to
userland.

Inputs from markus@, ok sthen@


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


# 1.73 12-Jul-2018 mpi

Introduce ipsec_output_cb() to merge duplicate code and account for
dropped packets in the output path.

While here fix a memory leak when compression is not needed w/ IPcomp.

ok markus@


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsofnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehands.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove uneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Remove the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declaration from sources files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows to run isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pickup the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to lookup the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
in addition to enc0 interface is required per rdomain, eg. enc1 rdomain 1.

Test by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is knda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnellinng when source-routing header is present.

ok by deraad, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only need a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now get
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@


# 1.29 21-Jun-2004 markus

don't send UDP encapsulated packets w/o UDP header if encap is disabled; ok ho@


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.28 02-Dec-2003 markus

branches: 1.28.2;
UDP encapsulation for ESP in transport mode (draft-ietf-ipsec-udp-encaps-XX.txt)
ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.27 09-Jul-2003 itojun

do not flip ip_len/ip_off in netinet stack. deraadt ok.
(please test, especially PF portion)


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.26 19-Feb-2003 jason

add a counter for times ipcomp is skipped because the packet is below the
minimum compression threshold.


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.25 28-Aug-2002 pefo

Fix a problem where passing NULL as a pointer with varargs does not promote
NULL to full 64 bits on a 64 bit address system. Soultion is to add a
(void *) cast before NULL. This makes a 64 bit MIPS kernel work and will
probably help future 64 bit ports as well.

OK from art@


# 1.24 01-Jul-2002 angelos

Move mtod() after the m_pullup() --- noted by sam@errno.com (who seems
to be going over the IPsec code with a magnifying glass)


# 1.23 19-Jun-2002 angelos

Remove redundant address family check -- sam@errno.com


# 1.22 09-Jun-2002 itojun

whitespace


Revision tags: OPENBSD_3_1_BASE
# 1.21 19-Feb-2002 miod

IPsec is written ``IPsec'', not ``IPSec''.


Revision tags: UBC_BASE
# 1.20 06-Dec-2001 angelos

branches: 1.20.2;
Use hzto() to handle overflow of (hz * timeout) cases --- when using
extremely long SA expirations.


Revision tags: OPENBSD_3_0_BASE
# 1.19 08-Aug-2001 jjbg

Remove IPCOMP option, it's now part of IPSEC option. You still need to
enable ipcomp via sysctl to use it. deraadt@ ok.


# 1.18 05-Jul-2001 jjbg

IPComp support. angelos@ ok.


# 1.17 26-Jun-2001 angelos

KNF


# 1.16 25-Jun-2001 angelos

Copyright.


# 1.15 24-Jun-2001 provos

path mtu discovery for ipsec. on receiving a need fragment icmp match
against active tdb and store the ipsec header size corrected mtu


# 1.14 08-Jun-2001 angelos

Trim include files.


# 1.13 30-May-2001 angelos

Update to match prototypes.


# 1.12 29-May-2001 angelos

Record last use time for SAs.


# 1.11 28-May-2001 angelos

Don't use IPV6_ENCAPSULATED, tags are used instead.


# 1.10 27-May-2001 angelos

New tags.


# 1.9 22-May-2001 angelos

Add an IPSEC_NEEDED tag if SKIPCRYPTO is set in the TDB


# 1.8 20-May-2001 angelos

Record outgoing SA processing, do loop detection.


# 1.7 11-May-2001 aaron

branches: 1.7.2;
Check m_pullup() and m_pullup2() return for NULL, not 0; itojun@ ok


Revision tags: OPENBSD_2_9_BASE
# 1.6 14-Apr-2001 angelos

Minor changes, preparing for real socket-attached TDBs; also, more
information will be stored in the TDB. ok ho@ provos@


# 1.5 06-Apr-2001 csapuntz

Move offsetof define into sys/param.h


# 1.4 28-Mar-2001 angelos

Allow tdbi's to appear in mbufs throughout the stack; this allows
security properties of the packets to be pushed up to the application
(not done yet). Eventually, this will be turned into a packet
attributes framework.

Make sure tdbi's are free'd/cleared properly whenever drivers (or NFS)
do weird things with mbufs.


# 1.3 15-Mar-2001 mickey

convert SA expirations to the new timeouts.
simplifies expirations handling a lot.
tdb_exp_timeout and tdb_soft_timeout are made
consistent throughout the code to be relative time offsets,
just like first_use timeouts.
tested on singlehost isakmpd setup.
lots of dangling spaces and tabs removed.
angelos@ ok


Revision tags: OPENBSD_2_8_BASE
# 1.2 19-Sep-2000 angelos

SA bundles.


# 1.1 19-Sep-2000 angelos

Lots and lots of changes.


# 1.72 04-Jun-2018 bluhm

Cleanup IPsec output error handling with consistent goto drop.
from markus@; OK mpi@


# 1.71 14-May-2018 bluhm

When walking the IPv6 header chain in IPsec output, check that the
next extension header is within the packet length. Also check at
the end that the IPv4 headers are not longer than the packet.
reported by Maxime Villard; from markus@ via NetBSD; OK mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.70 08-Nov-2017 visa

branches: 1.70.2;
Make {ah,esp,ipcomp}stat use percpu counters.

OK bluhm@, mpi@


# 1.69 06-Nov-2017 mpi

Use %s and __func__ in DPRINTF() to reduce false positive with grep(1).

ok kettenis@, dhill@, visa@, jca@


Revision tags: OPENBSD_6_2_BASE
# 1.68 18-May-2017 bluhm

branches: 1.68.4;
The function name ip4_input() is confusing as it also handles IPv6
packets. This is the IP in IP protocol input function, so call it
ipip_input(). Rename the existing ipip_input() to ipip_input_gif()
as it is the input function used by the gif interface. Pass the
address family to make it consistent with pr_input. Use __func__
in debug print and panic messages. Move all ipip prototypes to the
ip_ipip.h header file.
OK dhill@ mpi@


# 1.67 16-May-2017 mpi

Replace remaining splsoftassert(IPL_SOFTNET) by NET_ASSERT_LOCKED().

ok visa@


# 1.66 06-Apr-2017 dhill

Replace bcopy with a simple assignment where both variables are
properly aligned and sockaddr_union fields, or with memcpy when
the memory doesn't overlap.

OK bluhm@


Revision tags: OPENBSD_6_1_BASE
# 1.65 20-Jan-2017 mpi

Kill recursive splsoftnet()/splx() dances.

Tested by Hrvoje Popovski, ok visa@


# 1.64 11-Oct-2016 mikeb

Rename 'i' to 'hlen' for greater readability; ok millert, naddy


# 1.63 13-Sep-2016 markus

avoid extensive mbuf allocation for IPsec by replacing m_inject(4)
with m_makespace(4) from freebsd; ok mpi@, bluhm@, mikeb@, dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.62 28-Feb-2016 mikeb

When IPsec UDP encapsulation is used for IPv6, the stack should
construct an IPv6 packet instead of an IPv4.

Diff from Patrick Wildt <patrick at blueri ! se> with input from
bluhm@; ok mpi, bluhm


Revision tags: OPENBSD_5_9_BASE
# 1.61 11-Sep-2015 claudio

Kill yet another argument to functions in IPv6. This time ip6_output's
ifpp - XXX: just for statistics
ifpp is always NULL in all callers so that statistic confirms ifpp is
dying
OK mpi@


Revision tags: OPENBSD_5_8_BASE
# 1.60 15-Jul-2015 deraadt

m_freem() can handle NULL, do not check for this condition beforehand.
ok stsp mpi


# 1.59 11-Jun-2015 mikeb

Move away from using hzto(9); OK dlg


# 1.58 17-Apr-2015 mikeb

Stubs and support code for NIC-enabled IPsec bite the dust.
No objection from reyk@, OK markus, hshoexer


# 1.57 14-Apr-2015 mikeb

make ipsp_address thread safe; ok mpi


Revision tags: OPENBSD_5_7_BASE
# 1.56 24-Jan-2015 deraadt

Userland (base & ports) was adapted to always include <netinet/in.h>
before <net/pfvar.h> or <net/if_pflog.h>. The kernel files can be
cleaned up next. Some sockaddr_union steps make it into here as well.
ok naddy


# 1.55 19-Dec-2014 tedu

unifdef INET in net code as a precursor to removing the pretend option.
long live the one true internet.
ok henning mikeb


# 1.54 08-Sep-2014 jsg

remove unneeded route.h includes
ok miod@ mpi@


Revision tags: OPENBSD_5_6_BASE
# 1.53 22-Jul-2014 mpi

Fewer <netinet/in_systm.h> !


# 1.52 21-Apr-2014 henning

ip_output() using varargs always struck me as bizarre, esp since it's only
ever used to pass on uint32 (for ipsec). stop that madness and just pass
the uint32, 0 in all cases but the two that pass the ipsec flowinfo.
ok deraadt reyk guenther


# 1.51 21-Apr-2014 henning

we'll do fine without casting NULL to struct foo * / void *
ok gcc & md5 (alas, no binary change)


Revision tags: OPENBSD_5_5_BASE
# 1.50 24-Oct-2013 mpi

Reduce the number of in6_var.h inclusions by moving some functions and
global variables to in6.h.

ok deraadt@


# 1.49 03-Aug-2013 markus

unbreak PMTU-discovery for AES-GCM; ok mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.48 10-Apr-2013 mpi

Remove various external variable declarations from source files and
move them to the corresponding header with an appropriate comment if
necessary.

ok guenther@


# 1.47 28-Mar-2013 tedu

code that calls timeout functions should include timeout.h
slipped by on i386, but the zaurus doesn't automagically pick it up.
spotted by patrick


Revision tags: OPENBSD_5_3_BASE
# 1.46 20-Sep-2012 blambert

spltdb() was really just #define'd to be splsoftnet(); replace the former
with the latter

no change in md5 checksum of generated files

ok claudio@ henning@


# 1.45 18-Sep-2012 markus

remove the SADB_X_SAFLAGS_{HALFIV,RANDOMPADDING,NOREPLAY} pfkey-API (not set
anywhere) as well as the matching TDBF_{HALFIV,RANDOMPADDING,NOREPLAY} code.
ok mikeb@


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.44 05-Mar-2011 bluhm

The function pf_tag_packet() never fails. Remove a redundant check
and make it void.
ok henning@, markus@, mcbride@


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.43 09-Jul-2010 reyk

Add support for using IPsec in multiple rdomains.

This allows running isakmpd/iked/ipsecctl in multiple rdomains
independently (with "route exec"); the kernel will pick up the rdomain
from the process context of the pfkey socket and load the flows and
SAs into the matching rdomain encap routing table. The network stack
also needs to pass the rdomain to the ipsec stack to look up the
correct rdomain that belongs to an interface/mbuf/... You can now run
individual IPsec configs per rdomain or create IPsec VPNs between
multiple rdomains on the same machine ;). Note that a primary enc(4)
interface in addition to enc0 is required per rdomain, e.g. enc1 rdomain 1.

Tested by some people, mostly on existing "rdomain 0" setups. Was in
snaps for some days and people didn't complain.

ok claudio@ naddy@


Revision tags: OPENBSD_4_7_BASE
# 1.42 10-Jan-2010 markus

Fix two bugs in IPsec/HMAC-SHA2:
(1) use correct (message) block size of 128 byte (instead of 64
bytes) for HMAC-SHA512/384 (RFC4634).
(2) RFC4868 specifies that HMAC-SHA-{256,384,512} is truncated to
nnn/2 bits, while we still use 96 bits. 96 bits have been
specified in draft-ietf-ipsec-ciph-sha-256-00 while
draft-ietf-ipsec-ciph-sha-256-01 changed it to 128 bits.

WARNING: this change makes IPsec with SHA-256 (the default)
incompatible with older OpenBSD versions and other IPsec-implementations
that share this bug.

ok+tests naddy, fries; requested by reyk/deraadt


Revision tags: OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.41 26-Aug-2008 henning

we need to call pf_pkt_addr_changed here too. found by david


# 1.40 21-Aug-2008 bluhm

Assign the ip and ip6 pointers in ipsp_process_packet() only if a
header of the matching address family is available. Especially do
not read ip->ip_off from an IPv6 packet header.
ok markus


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE OPENBSD_4_4_BASE
# 1.39 01-Jun-2007 henning

apply the "skip ipsec if there are no flows" speedup diff to IPv6 too.
we need a pointer to the inpcb to decide, which was not previously
passed to ip6_output, so this diff is a little bigger.
from itojun, ok ryan


# 1.38 28-May-2007 henning

double pf performance.
boring details:
pf used to use an mbuf tag to keep track of route-to etc, altq, tags,
routing table IDs, packets redirected to localhost etc. so each and every
packet going through pf got an mbuf tag. mbuf tags use malloc'd memory,
and that is kinda slow.
instead, stuff the information into the mbuf header directly.
bridging soekris with just "pass" as ruleset went from 29 MBit/s to
58 MBit/s with that (before ryan's randomness fix, now it is even betterer)
thanks to chris for the test setup!
ok ryan ryan ckuethe reyk


Revision tags: OPENBSD_4_1_BASE
# 1.37 08-Feb-2007 itojun

- AH: when computing crypto checksum for output, massage source-routing
header.
- ipsec_input: fix mistake in IPv6 next-header chasing.
- ipsec_output: look for the position to insert AH more carefully.
- ip6_output: enable use of AH with extension headers.
avoid tunnelling when source-routing header is present.

ok by deraadt, naddy, hshoexer


# 1.36 19-Dec-2006 itojun

TDBF_USEDTUNNEL flag manipulation was inside #ifdef INET. it applies
to INET6 too, so move it outside. markus ok


# 1.35 05-Dec-2006 markus

do not install pmtu routes for transport mode SAs, as they do not
the dest IP; PMTU debugging support; ok hshoexer


# 1.34 24-Nov-2006 reyk

add support to tag ipsec traffic belonging to specific IKE-initiated
phase 2 traffic. this allows policy-based filtering of encrypted and
unencrypted ipsec traffic with pf(4). see ipsec.conf(5) and
isakmpd.conf(5) for details and examples.

this is work in progress and still needs some testing and feedback,
but it is safe to put it in now.

ok hshoexer@


Revision tags: OPENBSD_3_8_BASE OPENBSD_3_9_BASE OPENBSD_4_0_BASE
# 1.33 12-Apr-2005 markus

handle PMTU for ipip SAs, too; ok hshoexer, cloder


Revision tags: OPENBSD_3_7_BASE
# 1.32 24-Sep-2004 markus

pmtu support for udpencap; ok hshoexer, ho


Revision tags: OPENBSD_3_6_BASE
# 1.31 26-Jun-2004 ho

branches: 1.31.2;
Default enable udpencap. Add 'disable' sysctl to sysctl.conf. markus@ ok.


# 1.30 21-Jun-2004 tholo

First step towards more sane time handling in the kernel -- this changes
things such that code that only needs a second-resolution uptime or wall
time, and used to get that from time.tv_secs or mono_time.tv_secs now gets
this from separate time_t globals time_second and time_uptime.

ok art@ niklas@ nordin@



