Cross Reference: /freebsd-current/sys/netipsec/xform

History log of /freebsd-current/sys/netipsec/xform_ipcomp.c
Revision	Date	Author	Comments
# 71625ec9	16-Aug-2023	Warner Losh <imp@FreeBSD.org>	sys: Remove $FreeBSD$: one-line .c comment pattern Remove /^/[/]\s\$FreeBSD\$.*\n/
# 0361f165	23-Jun-2022	Kristof Provost <kp@FreeBSD.org>	ipsec: replace SECASVAR mtx by rmlock This mutex is a significant point of contention in the ipsec code, and can be relatively trivially replaced by a read-mostly lock. It does require a separate lock for the replay protection, which we do here by adding a separate mutex. This improves throughput (without replay protection) by 10-15%. MFC after: 3 weeks Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D35763
# 35d9e00d	24-Jan-2022	John Baldwin <jhb@FreeBSD.org>	IPsec: Use protocol-specific malloc types instead of M_XDATA. Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D33992
# dae61c9d	25-Jun-2020	John Baldwin <jhb@FreeBSD.org>	Simplify IPsec transform-specific teardown. - Rename from the teardown callback from 'zeroize' to 'cleanup' since this no longer zeroes keys. - Change the callback return type to void. Nothing checked the return value and it was always zero. - Don't have esp call into ah since it no longer needs to depend on this to clear the auth key. Instead, both are now private and self-contained. Reviewed by: delphij Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25443
# 28d2a72b	29-May-2020	John Baldwin <jhb@FreeBSD.org>	Consistently include opt_ipsec.h for consumers of <netipsec/ipsec.h>. This fixes ipsec.ko to include all of IPSEC_DEBUG. Reviewed by: imp MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25046
# 9c0e3d3a	25-May-2020	John Baldwin <jhb@FreeBSD.org>	Add support for optional separate output buffers to in-kernel crypto. Some crypto consumers such as GELI and KTLS for file-backed sendfile need to store their output in a separate buffer from the input. Currently these consumers copy the contents of the input buffer into the output buffer and queue an in-place crypto operation on the output buffer. Using a separate output buffer avoids this copy. - Create a new 'struct crypto_buffer' describing a crypto buffer containing a type and type-specific fields. crp_ilen is gone, instead buffers that use a flat kernel buffer have a cb_buf_len field for their length. The length of other buffer types is inferred from the backing store (e.g. uio_resid for a uio). Requests now have two such structures: crp_buf for the input buffer, and crp_obuf for the output buffer. - Consumers now use helper functions (crypto_use_, e.g. crypto_use_mbuf()) to configure the input buffer. If an output buffer is not configured, the request still modifies the input buffer in-place. A consumer uses a second set of helper functions (crypto_use_output_) to configure an output buffer. - Consumers must request support for separate output buffers when creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are only permitted to queue a request with a separate output buffer on sessions with this flag set. Existing drivers already reject sessions with unknown flags, so this permits drivers to be modified to support this extension without requiring all drivers to change. - Several data-related functions now have matching versions that operate on an explicit buffer (e.g. crypto_apply_buf, crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf). - Most of the existing data-related functions operate on the input buffer. However crypto_copyback always writes to the output buffer if a request uses a separate output buffer. - For the regions in input/output buffers, the following conventions are followed: - AAD and IV are always present in input only and their fields are offsets into the input buffer. - payload is always present in both buffers. If a request uses a separate output buffer, it must set a new crp_payload_start_output field to the offset of the payload in the output buffer. - digest is in the input buffer for verify operations, and in the output buffer for compute operations. crp_digest_start is relative to the appropriate buffer. - Add a crypto buffer cursor abstraction. This is a more general form of some bits in the cryptosoft driver that tried to always use uio's. However, compared to the original code, this avoids rewalking the uio iovec array for requests with multiple vectors. It also avoids allocate an iovec array for mbufs and populating it by instead walking the mbuf chain directly. - Update the cryptosoft(4) driver to support separate output buffers making use of the cursor abstraction. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24545
# c0341432	27-Mar-2020	John Baldwin <jhb@FreeBSD.org>	Refactor driver and consumer interfaces for OCF (in-kernel crypto). - The linked list of cryptoini structures used in session initialization is replaced with a new flat structure: struct crypto_session_params. This session includes a new mode to define how the other fields should be interpreted. Available modes include: - COMPRESS (for compression/decompression) - CIPHER (for simply encryption/decryption) - DIGEST (computing and verifying digests) - AEAD (combined auth and encryption such as AES-GCM and AES-CCM) - ETA (combined auth and encryption using encrypt-then-authenticate) Additional modes could be added in the future (e.g. if we wanted to support TLS MtE for AES-CBC in the kernel we could add a new mode for that. TLS modes might also affect how AAD is interpreted, etc.) The flat structure also includes the key lengths and algorithms as before. However, code doesn't have to walk the linked list and switch on the algorithm to determine which key is the auth key vs encryption key. The 'csp_auth_' fields are always used for auth keys and settings and 'csp_cipher_' for cipher. (Compression algorithms are stored in csp_cipher_alg.) - Drivers no longer register a list of supported algorithms. This doesn't quite work when you factor in modes (e.g. a driver might support both AES-CBC and SHA2-256-HMAC separately but not combined for ETA). Instead, a new 'crypto_probesession' method has been added to the kobj interface for symmteric crypto drivers. This method returns a negative value on success (similar to how device_probe works) and the crypto framework uses this value to pick the "best" driver. There are three constants for hardware (e.g. ccr), accelerated software (e.g. aesni), and plain software (cryptosoft) that give preference in that order. One effect of this is that if you request only hardware when creating a new session, you will no longer get a session using accelerated software. Another effect is that the default setting to disallow software crypto via /dev/crypto now disables accelerated software. Once a driver is chosen, 'crypto_newsession' is invoked as before. - Crypto operations are now solely described by the flat 'cryptop' structure. The linked list of descriptors has been removed. A separate enum has been added to describe the type of data buffer in use instead of using CRYPTO_F_* flags to make it easier to add more types in the future if needed (e.g. wired userspace buffers for zero-copy). It will also make it easier to re-introduce separate input and output buffers (in-kernel TLS would benefit from this). Try to make the flags related to IV handling less insane: - CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv' member of the operation structure. If this flag is not set, the IV is stored in the data buffer at the 'crp_iv_start' offset. - CRYPTO_F_IV_GENERATE means that a random IV should be generated and stored into the data buffer. This cannot be used with CRYPTO_F_IV_SEPARATE. If a consumer wants to deal with explicit vs implicit IVs, etc. it can always generate the IV however it needs and store partial IVs in the buffer and the full IV/nonce in crp_iv and set CRYPTO_F_IV_SEPARATE. The layout of the buffer is now described via fields in cryptop. crp_aad_start and crp_aad_length define the boundaries of any AAD. Previously with GCM and CCM you defined an auth crd with this range, but for ETA your auth crd had to span both the AAD and plaintext (and they had to be adjacent). crp_payload_start and crp_payload_length define the boundaries of the plaintext/ciphertext. Modes that only do a single operation (COMPRESS, CIPHER, DIGEST) should only use this region and leave the AAD region empty. If a digest is present (or should be generated), it's starting location is marked by crp_digest_start. Instead of using the CRD_F_ENCRYPT flag to determine the direction of the operation, cryptop now includes an 'op' field defining the operation to perform. For digests I've added a new VERIFY digest mode which assumes a digest is present in the input and fails the request with EBADMSG if it doesn't match the internally-computed digest. GCM and CCM already assumed this, and the new AEAD mode requires this for decryption. The new ETA mode now also requires this for decryption, so IPsec and GELI no longer do their own authentication verification. Simple DIGEST operations can also do this, though there are no in-tree consumers. To eventually support some refcounting to close races, the session cookie is now passed to crypto_getop() and clients should no longer set crp_sesssion directly. - Assymteric crypto operation structures should be allocated via crypto_getkreq() and freed via crypto_freekreq(). This permits the crypto layer to track open asym requests and close races with a driver trying to unregister while asym requests are in flight. - crypto_copyback, crypto_copydata, crypto_apply, and crypto_contiguous_subsegment now accept the 'crp' object as the first parameter instead of individual members. This makes it easier to deal with different buffer types in the future as well as separate input and output buffers. It's also simpler for driver writers to use. - bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer. This understands the various types of buffers so that drivers that use DMA do not have to be aware of different buffer types. - Helper routines now exist to build an auth context for HMAC IPAD and OPAD. This reduces some duplicated work among drivers. - Key buffers are now treated as const throughout the framework and in device drivers. However, session key buffers provided when a session is created are expected to remain alive for the duration of the session. - GCM and CCM sessions now only specify a cipher algorithm and a cipher key. The redundant auth information is not needed or used. - For cryptosoft, split up the code a bit such that the 'process' callback now invokes a function pointer in the session. This function pointer is set based on the mode (in effect) though it simplifies a few edge cases that would otherwise be in the switch in 'process'. It does split up GCM vs CCM which I think is more readable even if there is some duplication. - I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC as an auth algorithm and updated cryptocheck to work with it. - Combined cipher and auth sessions via /dev/crypto now always use ETA mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored. This was actually documented as being true in crypto(4) before, but the code had not implemented this before I added the CIPHER_FIRST flag. - I have not yet updated /dev/crypto to be aware of explicit modes for sessions. I will probably do that at some point in the future as well as teach it about IV/nonce and tag lengths for AEAD so we can support all of the NIST KAT tests for GCM and CCM. - I've split up the exising crypto.9 manpage into several pages of which many are written from scratch. - I have converted all drivers and consumers in the tree and verified that they compile, but I have not tested all of them. I have tested the following drivers: - cryptosoft - aesni (AES only) - blake2 - ccr and the following consumers: - cryptodev - IPsec - ktls_ocf - GELI (lightly) I have not tested the following: - ccp - aesni with sha - hifn - kgssapi_krb5 - ubsec - padlock - safe - armv8_crypto (aarch64) - glxsb (i386) - sec (ppc) - cesa (armv7) - cryptocteon (mips64) - nlmsec (mips64) Discussed with: cem Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D23677
# b8a6e03f	07-Oct-2019	Gleb Smirnoff <glebius@FreeBSD.org>	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111
# 1b0909d5	17-Jul-2018	Conrad Meyer <cem@FreeBSD.org>	OpenCrypto: Convert sessions to opaque handles instead of integers Track session objects in the framework, and pass handles between the framework (OCF), consumers, and drivers. Avoid redundancy and complexity in individual drivers by allocating session memory in the framework and providing it to drivers in ::newsession(). Session handles are no longer integers with information encoded in various high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the appropriate crypto_ses2foo() function on the opaque session handle. Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to the opaque handle interface. Discard existing session tracking as much as possible (quick pass). There may be additional code ripe for deletion. Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style interface. The conversion is largely mechnical. The change is documented in crypto.9. Inspired by https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html . No objection from: ae (ipsec portion) Reported by: jhb
# 2e08e39f	13-Jul-2018	Conrad Meyer <cem@FreeBSD.org>	OCF: Add a typedef for session identifiers No functional change. This should ease the transition from an integer session identifier model to an opaque pointer model.
# 6d8fdfa9	05-Jun-2018	Andrey V. Elsukov <ae@FreeBSD.org>	Rework IP encapsulation handling code. Currently it has several disadvantages: - it uses single mutex to protect internal structures. It is used by data- and control- path, thus there are no parallelism at all. - it uses single list to keep encap handlers for both INET and INET6 families. - struct encaptab keeps unneeded information (src, dst, masks, protosw), that isn't used by code in the source tree. - matches are prioritized and when many tunneling interfaces are registered, encapcheck handler of each interface is invoked for each packet. The search takes O(n) for n interfaces. All this work is done with exclusive lock held. What this patch includes: - the datapath is converted to be lockless using epoch(9) KPI. - struct encaptab now linked using CK_LIST. - all unused fields removed from struct encaptab. Several new fields addedr: min_length is the minimum packet length, that encapsulation handler expects to see; exact_match is maximum number of bits, that can return an encapsulation handler, when it wants to consume a packet. - IPv6 and IPv4 handlers are stored in separate lists; - added new "encap_lookup_t" method, that will be used later. It is targeted to speedup lookup of needed interface, when gif(4)/gre(4) have many interfaces. - the need to use protosw structure is eliminated. The only pr_input method was used from this structure, so I don't see the need to keep using it. - encap_input_t method changed to avoid using mbuf tags to store softc pointer. Now it is passed directly trough encap_input_t method. encap_getarg() funtions is removed. - all sockaddr structures and code that uses them removed. We don't have any code in the tree that uses them. All consumers use encap_attach_func() method, that relies on invoking of encapcheck() to determine the needed handler. - introduced struct encap_config, it contains parameters of encap handler that is going to be registered by encap_attach() function. - encap handlers are stored in lists ordered by exact_match value, thus handlers that need more bits to match will be checked first, and if encapcheck method returns exact_match value, the search will be stopped. - all current consumers changed to use new KPI. Reviewed by: mmacy Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D15617
# fd40ecf3	20-Mar-2018	John Baldwin <jhb@FreeBSD.org>	Set the proper vnet in IPsec callback functions. When using hardware crypto engines, the callback functions used to handle an IPsec packet after it has been encrypted or decrypted can be invoked asynchronously from a worker thread that is not associated with a vnet. Extend 'struct xform_data' to include a vnet pointer and save the current vnet in this new member when queueing crypto requests in IPsec. In the IPsec callback routines, use the new member to set the current vnet while processing the modified packet. This fixes a panic when using hardware offload such as ccr(4) with IPsec after VIMAGE was enabled in GENERIC. Reported by: Sony Arpita Das and Harsh Jain @ Chelsio Reviewed by: bz MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14763
# 151ba793	24-Dec-2017	Alexander Kabaev <kan@FreeBSD.org>	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385
# fe267a55	27-Nov-2017	Pedro F. Giffuni <pfg@FreeBSD.org>	sys: general adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. No functional change intended.
# 7f1f6591	29-May-2017	Andrey V. Elsukov <ae@FreeBSD.org>	Disable IPsec debugging code by default when IPSEC_DEBUG kernel option is not specified. Due to the long call chain IPsec code can produce the kernel stack exhaustion on the i386 architecture. The debugging code usually is not used, but it requires a lot of stack space to keep buffers for strings formatting. This patch conditionally defines macros to disable building of IPsec debugging code. IPsec currently has two sysctl variables to configure debug output: * net.key.debug variable is used to enable debug output for PF_KEY protocol. Such debug messages are produced by KEYDBG() macro and usually they can be interesting for developers. * net.inet.ipsec.debug variable is used to enable debug output for DPRINTF() macro and ipseclog() function. DPRINTF() macro usually is used for development debugging. ipseclog() function is used for debugging by administrator. The patch disables KEYDBG() and DPRINTF() macros, and formatting buffers declarations when IPSEC_DEBUG is not present in kernel config. This reduces stack requirement for up to several hundreds of bytes. The net.inet.ipsec.debug variable still can be used to enable ipseclog() messages by administrator. PR: 219476 Reported by: eugen No objection from: #network MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D10869
# 3aee7099	23-May-2017	Andrey V. Elsukov <ae@FreeBSD.org>	Fix possible double releasing for SA and SP references. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For outbound packets the direct call chain is the following: IPSEC_OUTPUT() method -> ipsec[46]_common_output() -> -> ipsec[46]_perform_request() -> xform_output() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_output_cb() -> ipsec_process_done() -> ip[6]_output(). The SA and SP references are held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did references releasing in xform_output_cb(). But when the crypto callback called directly, in case of error the error handling code in ipsec[46]_perform_request() also did references releasing. To fix this, remove error handling from ipsec[46]_perform_request() and do it in xform_output() before crypto_dispatch(). MFC after: 10 days
# 5f7c516f	23-May-2017	Andrey V. Elsukov <ae@FreeBSD.org>	Fix possible double releasing for SA reference. There are two possible ways how crypto callback are called: directly from caller and deffered from crypto thread. For inbound packets the direct call chain is the following: IPSEC_INPUT() method -> ipsec_common_input() -> xform_input() -> -> crypto_dispatch() -> crypto_invoke() -> crypto_done() -> -> xform_input_cb() -> ipsec[46]_common_input_cb() -> netisr_queue(). The SA reference is held while crypto processing is not finished. The error handling code wrongly expected that crypto callback always called from the crypto thread context, and it did SA reference releasing in xform_input_cb(). But when the crypto callback called directly, in case of error (e.g. data authentification failed) the error handling in ipsec_common_input() also did SA reference releasing. To fix this, remove error handling from ipsec_common_input() and do it in xform_input() before crypto_dispatch(). PR: 219356 MFC after: 10 days
# fcf59617	06-Feb-2017	Andrey V. Elsukov <ae@FreeBSD.org>	Merge projects/ipsec into head/. Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352
# 0d056664	24-Apr-2016	Andrey V. Elsukov <ae@FreeBSD.org>	Fix build for NOINET and NOINET6 kernels. Use own protosw structures for both address families. Check proto in encapcheck function and use -1 as proto argument in encap_attach_func(), both address families can have IPPROTO_IPV4 and IPPROTO_IPV6 protocols. Reported by: bz
# 3cbd4ec3	24-Apr-2016	Andrey V. Elsukov <ae@FreeBSD.org>	Handle non-compressed packets for IPComp in tunnel mode. RFC3173 says that the IP datagram MUST be sent in the original non-compressed form, when the total size of a compressed payload and the IPComp header is not smaller than the size of the original payload. In tunnel mode for small packets IPComp will send encapsulated IP datagrams without IPComp header. Add ip_encap handler for IPPROTO_IPV4 and IPPROTO_IPV6 to handle these datagrams. The handler does lookup for SA related to IPComp protocol and given from mbuf source and destination addresses as tunnel endpoints. It decapsulates packets only when corresponding SA is found. Reported by: gnn Reviewed by: gnn Differential Revision: https://reviews.freebsd.org/D6062
# 155d72c4	15-Apr-2016	Pedro F. Giffuni <pfg@FreeBSD.org>	sys/net* : for pointers replace 0 with NULL. Mostly cosmetical, no functional change. Found with devel/coccinelle.
# f3677984	30-Sep-2015	Andrey V. Elsukov <ae@FreeBSD.org>	Take extra reference to security policy before calling crypto_dispatch(). Currently we perform crypto requests for IPSEC synchronous for most of crypto providers (software, aesni) and only VIA padlock calls crypto callback asynchronous. In synchronous mode it is possible, that security policy will be removed during the processing crypto request. And crypto callback will release the last reference to SP. Then upon return into ipsec[46]_process_packet() IPSECREQUEST_UNLOCK() will be called to already freed request. To prevent this we will take extra reference to SP. PR: 201876 Sponsored by: Yandex LLC
# 3d80e82d	26-Apr-2015	Andrey V. Elsukov <ae@FreeBSD.org>	Fix possible use after free due to security policy deletion. When we are passing mbuf to IPSec processing via ipsec[46]_process_packet(), we hold one reference to security policy and release it just after return from this function. But IPSec processing can be deffered and when we release reference to security policy after ipsec[46]_process_packet(), user can delete this security policy from SPDB. And when IPSec processing will be done, xform's callback function will do access to already freed memory. To fix this move KEY_FREESP() into callback function. Now IPSec code will release reference to SP after processing will be finished. Differential Revision: https://reviews.freebsd.org/D2324 No objections from: #network Sponsored by: Yandex LLC
# 962ac6c7	18-Apr-2015	Andrey V. Elsukov <ae@FreeBSD.org>	Change ipsec_address() and ipsec_logsastr() functions to take two additional arguments - buffer and size of this buffer. ipsec_address() is used to convert sockaddr structure to presentation format. The IPv6 part of this function returns pointer to the on-stack buffer and at the moment when it will be used by caller, it becames invalid. IPv4 version uses 4 static buffers and returns pointer to new buffer each time when it called. But anyway it is still possible to get corrupted data when several threads will use this function. ipsec_logsastr() is used to format string about SA entry. It also uses static buffer and has the same problem with concurrent threads. To fix these problems add the buffer pointer and size of this buffer to arguments. Now each caller will pass buffer and its size to these functions. Also convert all places where these functions are used (except disabled code). And now ipsec_address() uses inet_ntop() function from libkern. PR: 185996 Differential Revision: https://reviews.freebsd.org/D2321 Reviewed by: gnn Sponsored by: Yandex LLC
# f0514a8b	11-Dec-2014	Andrey V. Elsukov <ae@FreeBSD.org>	Remove now unused mtag argument from ipsec*_common_input_cb. Obtained from: Yandex LLC Sponsored by: Yandex LLC
# 566cbcc8	11-Dec-2014	Andrey V. Elsukov <ae@FreeBSD.org>	Remove unused mtag variable. Obtained from: Yandex LLC Sponsored by: Yandex LLC
# 2d957916	01-Dec-2014	Andrey V. Elsukov <ae@FreeBSD.org>	Remove route chaching support from ipsec code. It isn't used for some time. * remove sa_route_union declaration and route_cache member from struct secashead; * remove key_sa_routechange() call from ICMP and ICMPv6 code; * simplify ip_ipsec_mtu(); * remove #include <net/route.h>; Sponsored by: Yandex LLC
# 6df8a710	07-Nov-2014	Gleb Smirnoff <glebius@FreeBSD.org>	Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. Sponsored by: Nginx, Inc.
# db8c0879	09-Jul-2013	Andrey V. Elsukov <ae@FreeBSD.org>	Migrate structs ahstat, espstat, ipcompstat, ipipstat, pfkeystat, ipsec4stat, ipsec6stat to PCPU counters.
# c80211e3	09-Jul-2013	Andrey V. Elsukov <ae@FreeBSD.org>	Prepare network statistics structures for migration to PCPU counters. Use uint64_t as type for all fields of structures. Changed structures: ahstat, arpstat, espstat, icmp6_ifstat, icmp6stat, in6_ifstat, ip6stat, ipcompstat, ipipstat, ipsecstat, mrt6stat, mrtstat, pfkeystat, pim6stat, pimstat, rip6stat, udpstat. Discussed with: arch@
# a04d64d8	20-Jun-2013	Andrey V. Elsukov <ae@FreeBSD.org>	Use corresponding macros to update statistics for AH, ESP, IPIP, IPCOMP, PFKEY. MFC after: 2 weeks
# db178eb8	27-Apr-2011	Bjoern A. Zeeb <bz@FreeBSD.org>	Make IPsec compile without INET adding appropriate #ifdef checks. Unfold the IPSEC_COMMON_INPUT_CB() macro in xform_{ah,esp,ipcomp}.c to not need three different versions depending on INET, INET6 or both. Mark two places preparing for not yet supported functionality with IPv6. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 4 days
# dc49da97	01-Apr-2011	Bjoern A. Zeeb <bz@FreeBSD.org>	Do not allow recursive RFC3173 IPComp payload. Reviewed by: Tavis Ormandy (taviso cmpxchg8b.com) MFC after: 5 days Security: CVE-2011-1547
# 73171433	31-Mar-2011	Fabien Thomas <fabient@FreeBSD.org>	Optimisation in IPSEC(4): - Remove contention on ISR during the crypto operation by using rwlock(9). - Remove a second lookup of the SA in the callback. Gain on 6 cores CPU with SHA1/AES128 can be up to 30%. Reviewed by: vanhu MFC after: 1 month
# a7d5f7eb	19-Oct-2010	Jamie Gritton <jamie@FreeBSD.org>	A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
# 3b558c96	05-Dec-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	MFC r199947, 199950: Enable IPcomp by default. PR: kern/123587
# 87d7d0ab	05-Dec-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	MFC r199946: Add more statistics variables for IPcomp. Try to version the struct in a backward compatible way. People asked for the versioning of the stats structs in general before. Note: old netstat binaries, as only consumer, continue to work as they are still using kvm but will not display the new stats. [1] Discussed with: rwatson [1]
# 99808bdf	05-Dec-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	MFC r199905: Assimilate very similar input and output code paths (no real functional change).
# 0b845b93	05-Dec-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	MFC r199899: Only add the IPcomp header if crypto reported success and we have a lower payload size. Before we had always added the header, no matter if we actually send out compressed data or not. With this, after the opencrypto/deflate changes, IPcomp starts to work apart from edge cases. Leave it disabled by default until those are fixed as well. PR: kern/123587
# bc05a8e0	05-Dec-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	MFC r199897: Remove whitespace.
# eee2ee2a	05-Dec-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	MFC r199896: Directly send data uncompressed if the packet payload size is lower than the compression algorithm threshold.
# a77cb332	29-Nov-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	Enable IPcomp by default. PR: kern/123587 MFC after: 5 days
# 90b4c081	29-Nov-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	Add more statistics variables for IPcomp. Try to version the struct in a backward compatible way. People asked for the versioning of the stats structs in general before. MFC after: 5 days
# 10229cd1	29-Nov-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	Assimilate very similar input and output code paths (no real functional change). MFC after: 5 days
# afa47e51	29-Nov-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	Only add the IPcomp header if crypto reported success and we have a lower payload size. Before we had always added the header, no matter if we actually send out compressed data or not. With this, after the opencrypto/deflate changes, IPcomp starts to work apart from edge cases. Leave it disabled by default until those are fixed as well. PR: kern/123587 MFC after: 5 days
# 3d34d241	28-Nov-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	Remove whitespace. MFC after: 6 days
# 4ff98521	28-Nov-2009	Bjoern A. Zeeb <bz@FreeBSD.org>	Directly send data uncompressed if the packet payload size is lower than the compression algorithm threshold. MFC after: 6 days
# 530c0060	01-Aug-2009	Robert Watson <rwatson@FreeBSD.org>	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)
# 0a4747d4	20-Jul-2009	Robert Watson <rwatson@FreeBSD.org>	Garbage collect vnet module registrations that have neither constructors nor destructors, as there's no actual work to do. In most cases, the constructors weren't needed because of the existing protocol initialization functions run by net_init_domain() as part of VNET_MOD_NET, or they were eliminated when support for static initialization of virtualized globals was added. Garbage collect dependency references to modules without constructors or destructors, notably VNET_MOD_INET and VNET_MOD_INET6. Reviewed by: bz Approved by: re (vimage blanket)
# eddfbb76	14-Jul-2009	Robert Watson <rwatson@FreeBSD.org>	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
# bfe1aba4	10-Apr-2009	Marko Zec <zec@FreeBSD.org>	Introduce vnet module registration / initialization framework with dependency tracking and ordering enforcement. With this change, per-vnet initialization functions introduced with r190787 are no longer directly called from traditional initialization functions (which cc in most cases inlined to pre-r190787 code), but are instead registered via the vnet framework first, and are invoked only after all prerequisite modules have been initialized. In the long run, this framework should allow us to both initialize and dismantle multiple vnet instances in a correct order. The problem this change aims to solve is how to replay the initialization sequence of various network stack components, which have been traditionally triggered via different mechanisms (SYSINIT, protosw). Note that this initialization sequence was and still can be subtly different depending on whether certain pieces of code have been statically compiled into the kernel, loaded as modules by boot loader, or kldloaded at run time. The approach is simple - we record the initialization sequence established by the traditional mechanisms whenever vnet_mod_register() is called for a particular vnet module. The vnet_mod_register_multi() variant allows a single initializer function to be registered multiple times but with different arguments - currently this is only used in kern/uipc_domain.c by net_add_domain() with different struct domain * as arguments, which allows for protosw-registered initialization routines to be invoked in a correct order by the new vnet initialization framework. For the purpose of identifying vnet modules, each vnet module has to have a unique ID, which is statically assigned in sys/vimage.h. Dynamic assignment of vnet module IDs is not supported yet. A vnet module may specify a single prerequisite module at registration time by filling in the vmi_dependson field of its vnet_modinfo struct with the ID of the module it depends on. Unless specified otherwise, all vnet modules depend on VNET_MOD_NET (container for ifnet list head, rt_tables etc.), which thus has to and will always be initialized first. The framework will panic if it detects any unresolved dependencies before completing system initialization. Detection of unresolved dependencies for vnet modules registered after boot (kldloaded modules) is not provided. Note that the fact that each module can specify only a single prerequisite may become problematic in the long run. In particular, INET6 depends on INET being already instantiated, due to TCP / UDP structures residing in INET container. IPSEC also depends on INET, which will in turn additionally complicate making INET6-only kernel configs a reality. The entire registration framework can be compiled out by turning on the VIMAGE_GLOBALS kernel config option. Reviewed by: bz Approved by: julian (mentor)
# 1ed81b73	06-Apr-2009	Marko Zec <zec@FreeBSD.org>	First pass at separating per-vnet initializer functions from existing functions for initializing global state. At this stage, the new per-vnet initializer functions are directly called from the existing global initialization code, which should in most cases result in compiler inlining those new functions, hence yielding a near-zero functional change. Modify the existing initializer functions which are invoked via protosw, like ip_init() et. al., to allow them to be invoked multiple times, i.e. per each vnet. Global state, if any, is initialized only if such functions are called within the context of vnet0, which will be determined via the IS_DEFAULT_VNET(curvnet) check (currently always true). While here, V_irtualize a few remaining global UMA zones used by net/netinet/netipsec networking code. While it is not yet clear to me or anybody else whether this is the right thing to do, at this stage this makes the code more readable, and makes it easier to track uncollected UMA-zone-backed objects on vnet removal. In the long run, it's quite possible that some form of shared use of UMA zone pools among multiple vnets should be considered. Bump __FreeBSD_version due to changes in layout of structs vnet_ipfw, vnet_inet and vnet_net. Approved by: julian (mentor)
# 44e33a07	19-Nov-2008	Marko Zec <zec@FreeBSD.org>	Change the initialization methodology for global variables scheduled for virtualization. Instead of initializing the affected global variables at instatiation, assign initial values to them in initializer functions. As a rule, initialization at instatiation for such variables should never be introduced again from now on. Furthermore, enclose all instantiations of such global variables in #ifdef VIMAGE_GLOBALS blocks. Essentialy, this change should have zero functional impact. In the next phase of merging network stack virtualization infrastructure from p4/vimage branch, the new initialization methology will allow us to switch between using global variables and their counterparts residing in virtualization containers with minimum code churn, and in the long run allow us to intialize multiple instances of such container structures. Discussed at: devsummit Strassburg Reviewed by: bz, julian Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
# d7f03759	19-Oct-2008	Ulf Lilleengen <lulf@FreeBSD.org>	- Import the HEAD csup code which is the basis for the cvsmode work.
# 8b615593	02-Oct-2008	Marko Zec <zec@FreeBSD.org>	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparision between pre- and post-change object files(). () netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation
# 603724d3	17-Aug-2008	Bjoern A. Zeeb <bz@FreeBSD.org>	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
# eaa9325f	24-May-2008	Bjoern A. Zeeb <bz@FreeBSD.org>	In addition to the ipsec_osdep.h removal a week ago, now also eliminate IPSEC_SPLASSERT_SOFTNET which has been 'unused' since FreeBSD 5.0.
# 0bf686c1	06-Aug-2007	Robert Watson <rwatson@FreeBSD.org>	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)
# ec31427d	23-Mar-2006	Pawel Jakub Dawidek <pjd@FreeBSD.org>	Allow to use fast_ipsec(4) on debug.mpsafenet=0 and INVARIANTS-enabled systems. Without the change it will panic on assertions. MFC after: 2 weeks
# 47e2996e	15-Mar-2006	Sam Leffler <sam@FreeBSD.org>	promote fast ipsec's m_clone routine for public use; it is renamed m_unshare and the caller can now control how mbufs are allocated Reviewed by: andre, luigi, mlaier MFC after: 1 week
# c398230b	06-Jan-2005	Warner Losh <imp@FreeBSD.org>	/* -> /*- for license, minor formatting changes
# 8976be94	27-Jan-2004	Sam Leffler <sam@FreeBSD.org>	change SYSINIT starting point to be consistent with other modules
# 9ffa9677	29-Sep-2003	Sam Leffler <sam@FreeBSD.org>	MFp4: portability work, general cleanup, locking fixes change 38496 o add ipsec_osdep.h that holds os-specific definitions for portability o s/KASSERT/IPSEC_ASSERT/ for portability o s/SPLASSERT/IPSEC_SPLASSERT/ for portability o remove function names from ASSERT strings since line#+file pinpints the location o use __func__ uniformly to reduce string storage o convert some random #ifdef DIAGNOSTIC code to assertions o remove some debuggging assertions no longer needed change 38498 o replace numerous bogus panic's with equally bogus assertions that at least go away on a production system change 38502 + 38530 o change explicit mtx operations to #defines to simplify future changes to a different lock type change 38531 o hookup ipv4 ctlinput paths to a noop routine; we should be handling path mtu changes at least o correct potential null pointer deref in ipsec4_common_input_cb chnage 38685 o fix locking for bundled SA's and for when key exchange is required change 38770 o eliminate recursion on the SAHTREE lock change 38804 o cleanup some types: long -> time_t o remove refrence to dead #define change 38805 o correct some types: long -> time_t o add scan generation # to secpolicy to deal with locking issues change 38806 o use LIST_FOREACH_SAFE instead of handrolled code o change key_flush_spd to drop the sptree lock before purging an entry to avoid lock recursion and to avoid holding the lock over a long-running operation o misc cleanups of tangled and twisty code There is still much to do here but for now things look to be working again. Supported by: FreeBSD Foundation
# 6464079f	31-Aug-2003	Sam Leffler <sam@FreeBSD.org>	Locking and misc cleanups; most of which I've been running for >4 months: o add locking o strip irrelevant spl's o split malloc types to better account for memory use o remove unused IPSEC_NONBLOCK_ACQUIRE code o remove dead code Sponsored by: FreeBSD Foundation
# d8409aaf	29-Jun-2003	Sam Leffler <sam@FreeBSD.org>	consolidate callback optimization check in one location by adding a flag for crypto operations that indicates the crypto code should do the check in crypto_done MFC after: 1 day
# 1bb98f3b	27-Jun-2003	Sam Leffler <sam@FreeBSD.org>	Check crypto driver capabilities and if the driver operates synchronously mark crypto requests with ``callback immediately'' to avoid doing a context switch to return crypto results. This completes the work to eliminate context switches for using software crypto via the crypto subsystem (with symmetric crypto ops).
# eb73a605	23-Feb-2003	Sam Leffler <sam@FreeBSD.org>	o add a CRYPTO_F_CBIMM flag to symmetric ops to indicate the callback should be done in crypto_done rather than in the callback thread o use this flag to mark operations from /dev/crypto since the callback routine just does a wakeup; this eliminates the last unneeded ctx switch o change CRYPTO_F_NODELAY to CRYPTO_F_BATCH with an inverted meaning so "0" becomes the default/desired setting (needed for user-mode compatibility with openbsd) o change crypto_dispatch to honor CRYPTO_F_BATCH instead of always dispatching immediately o remove uses of CRYPTO_F_NODELAY o define COP_F_BATCH for ops submitted through /dev/crypto and pass this on to the op that is submitted Similar changes and more eventually coming for asymmetric ops. MFC if re gives approval.
# 88768458	15-Oct-2002	Sam Leffler <sam@FreeBSD.org>	"Fast IPsec": this is an experimental IPsec implementation that is derived from the KAME IPsec implementation, but with heavy borrowing and influence of openbsd. A key feature of this implementation is that it uses the kernel crypto framework to do all crypto work so when h/w crypto support is present IPsec operation is automatically accelerated. Otherwise the protocol implementations are rather differet while the SADB and policy management code is very similar to KAME (for the moment). Note that this implementation is enabled with a FAST_IPSEC option. With this you get all protocols; i.e. there is no FAST_IPSEC_ESP option. FAST_IPSEC and IPSEC are mutually exclusive; you cannot build both into a single system. This software is well tested with IPv4 but should be considered very experimental (i.e. do not deploy in production environments). This software does NOT currently support IPv6. In fact do not configure FAST_IPSEC and INET6 in the same system. Obtained from: KAME + openbsd Supported by: Vernier Networks