History log of /freebsd-current/sys/opencrypto/ktls_ocf.c
Revision Date Author Comments
# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4e474146 15-Nov-2022 John Baldwin <jhb@FreeBSD.org>

ktls_ocf: Reject encrypted TLS records using AEAD that are too small.

If a TLS record is too small to contain the required explicit IV,
record_type (TLS 1.3), and MAC, reject attempts to decrypt it with
EMSGSIZE without submitting it to OCF. OCF drivers may not properly
detect that regions in the crypto request are outside the bounds of
the mbuf chain. The caller isn't supposed to submit such requests.

Reviewed by: markj
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D37372
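
The size check described in this commit can be illustrated with a minimal sketch. It is not the kernel code: the function names are hypothetical, and the sketch assumes the length being checked includes the 5-byte TLS header, an 8-byte explicit IV for TLS 1.2 AES-GCM, and a 16-byte AEAD tag.

```python
import errno

TLS_HEADER_LEN = 5   # type (1) + version (2) + length (2)
AEAD_TAG_LEN = 16    # AES-GCM and Poly1305 tags are both 16 bytes


def min_aead_record_len(tls13: bool, explicit_iv_len: int = 8) -> int:
    """Smallest total record length (header included) worth decrypting."""
    if tls13:
        # TLS 1.3 has no explicit IV, but the encrypted payload always
        # ends with a 1-byte record type.
        return TLS_HEADER_LEN + 1 + AEAD_TAG_LEN
    # TLS 1.2 AES-GCM carries an explicit IV right after the header.
    return TLS_HEADER_LEN + explicit_iv_len + AEAD_TAG_LEN


def check_record(total_len: int, tls13: bool) -> int:
    """Return 0 if the record may be submitted, else EMSGSIZE."""
    if total_len < min_aead_record_len(tls13):
        return errno.EMSGSIZE
    return 0
```

Rejecting the record before building a request keeps every region offset inside the buffer, which is the property the commit says drivers may fail to verify on their own.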


# 9a673b71 15-Nov-2022 John Baldwin <jhb@FreeBSD.org>

ktls: Add software support for AES-CBC decryption for TLS 1.1+.

This is mainly intended to provide a fallback for TOE TLS which may
need to use software decryption for an initial record at the start
of a connection.

Reviewed by: markj
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D37370


# a8280123 22-Apr-2022 John Baldwin <jhb@FreeBSD.org>

KTLS: Add a new recrypt operation to the software backend.

When using NIC TLS RX, packets that are dropped and retransmitted are
not decrypted by the NIC but are passed along as-is. As a result, a
received TLS record might contain a mix of encrypted and decrypted
data. If this occurs, the already-decrypted data needs to be
re-encrypted so that the resulting record can then be decrypted
normally.

Add support for this for sessions using AES-GCM with TLS 1.2 or TLS
1.3. For the recrypt operation, allocate a temporary buffer and
encrypt the payload portion of the TLS record with AES-CTR with an
initial IV constructed from the AES-GCM nonce. Then fix up the
original mbuf chain by copying the results from the temporary buffer
back into the original mbufs for any mbufs containing decrypted data.

Once it has been recrypted, the mbuf chain can then be decrypted via
the normal software decryption path.

Co-authored by: Hans Petter Selasky <hselasky@FreeBSD.org>
Reviewed by: hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D35012
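
Two pieces of the recrypt operation above can be sketched without a cipher: deriving the AES-CTR starting counter block from a 96-bit GCM nonce, and copying results back only into the segments that had been decrypted. The sketch relies on the standard GCM layout, where counter value 1 is reserved for the tag and payload encryption starts at counter value 2. The function names and the (data, was_decrypted) pairs standing in for mbufs are hypothetical.

```python
def gcm_payload_ctr_iv(nonce: bytes) -> bytes:
    """Initial AES-CTR counter block that matches GCM payload encryption."""
    if len(nonce) != 12:
        raise ValueError("expected a 96-bit GCM nonce")
    # nonce || 32-bit big-endian block counter, starting at 2
    return nonce + (2).to_bytes(4, "big")


def fixup_segments(segments, recrypted: bytes):
    """Copy re-encrypted bytes back into previously decrypted segments only.

    `segments` is a list of (data, was_decrypted) pairs modelling the
    mbuf chain; `recrypted` is the temporary buffer covering the payload.
    """
    out = []
    offset = 0
    for data, was_decrypted in segments:
        chunk = recrypted[offset:offset + len(data)]
        # Segments that arrived still encrypted are left untouched.
        out.append(chunk if was_decrypted else data)
        offset += len(data)
    return out
```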


# 663ae8f7 22-Apr-2022 John Baldwin <jhb@FreeBSD.org>

KTLS: Construct IV directly in crp.crp_iv for TLS 1.3 AEAD encryption.

Previously this used a temporary nonce[] buffer. The decrypt hook for
TLS 1.3 as well as the hooks for TLS 1.2 already constructed the IV
directly in crp.crp_iv.

Reviewed by: hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D35027
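
For reference, the per-record IV that these hooks build follows the TLS specifications rather than anything specific to this file. A minimal sketch, with hypothetical function names: TLS 1.3 (RFC 8446, section 5.3) XORs the left-padded 64-bit sequence number into the static IV, while TLS 1.2 AES-GCM (RFC 5288) concatenates a 4-byte implicit salt with the 8-byte explicit nonce from the record.

```python
def tls13_aead_nonce(static_iv: bytes, seqno: int) -> bytes:
    """XOR the big-endian sequence number, left-padded, into the static IV."""
    padded = seqno.to_bytes(len(static_iv), "big")
    return bytes(a ^ b for a, b in zip(static_iv, padded))


def tls12_gcm_nonce(salt: bytes, explicit: bytes) -> bytes:
    """Implicit 4-byte salt followed by the record's 8-byte explicit nonce."""
    if len(salt) != 4 or len(explicit) != 8:
        raise ValueError("expected a 4-byte salt and an 8-byte explicit nonce")
    return salt + explicit
```

Writing the result straight into the request's IV field, as the commit does, removes one copy per record compared with staging it in a temporary buffer.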


# a4c5d490 22-Apr-2022 John Baldwin <jhb@FreeBSD.org>

KTLS: Move OCF function pointers out of ktls_session.

Instead, create a switch structure private to ktls_ocf.c and store a
pointer to the switch in the ocf_session. This will permit adding an
additional function pointer needed for NIC TLS RX without further
bloating ktls_session.

Reviewed by: hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D35011


# 4b337ada 21-Apr-2022 John Baldwin <jhb@FreeBSD.org>

KTLS: Free the MAC session when destroying AES-CBC software sessions.

Reviewed by: hselasky
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D35013


# 05a1d0f5 14-Dec-2021 John Baldwin <jhb@FreeBSD.org>

ktls: Support for TLS 1.3 receive offload.

Note that support for TLS 1.3 receive offload in OpenSSL is still an
open pull request in active development. However, potential changes
to that pull request should not affect the kernel interface.

Reviewed by: hselasky
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33007


# 16bea05a 16-Nov-2021 John Baldwin <jhb@FreeBSD.org>

ktls: Split encrypt vs decrypt OCF counters.

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33006


# 522a2aa7 11-Nov-2021 John Baldwin <jhb@FreeBSD.org>

Drop "All rights reserved" from a Netflix copyright.

Reviewed by: imp
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32778


# 63378c84 04-Nov-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Remove unused ocf_operation structure from stack.

Reviewed by: jhb@
MFC after: 1 week
Sponsored by: NVIDIA Networking


# 4827bf76 27-Oct-2021 John Baldwin <jhb@FreeBSD.org>

ktls: Fix assertion for TLS 1.0 CBC when using non-zero starting seqno.

The starting sequence number used to verify that TLS 1.0 CBC records
are encrypted in-order in the OCF layer was always set to 0 and not to
the initial sequence number from the struct tls_enable.

In practice, OpenSSL always starts TLS transmit offload with a
sequence number of zero, so this only matters for tests that use a
random starting sequence number.

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32676


# b33ff941 21-Oct-2021 John Baldwin <jhb@FreeBSD.org>

ktls: Change struct ktls_session.cipher to an OCF-specific type.

As a followup to SW KTLS assuming an OCF backend, rename
struct ocf_session to struct ktls_ocf_session and forward
declare it in <sys/ktls.h> to use as the type of
struct ktls_session.cipher.

Reviewed by: gallatin, hselasky
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32565


# 470e851c 30-Aug-2021 John Baldwin <jhb@FreeBSD.org>

ktls: Support asynchronous dispatch of AEAD ciphers.

KTLS OCF support was originally targeted at software backends that
used host CPU cycles to encrypt TLS records. As a result, each KTLS
worker thread queued a single TLS record at a time and waited for it
to be encrypted before processing another TLS record. This works well
for software backends but limits throughput on OCF drivers for
coprocessors that support asynchronous operation such as qat(4) or
ccr(4). This change uses an alternate function (ktls_encrypt_async)
when encrypting TLS records via a coprocessor. This function queues TLS
records for encryption and returns. It defers the work done after a
TLS record has been encrypted (such as marking the mbufs ready) to a
callback invoked asynchronously by the coprocessor driver when a
record has been encrypted.

- Add a struct ktls_ocf_state that holds the per-request state stored
on the stack for synchronous requests. Asynchronous requests malloc
this structure while synchronous requests continue to allocate this
structure on the stack.

- Add a ktls_encrypt_async() variant of ktls_encrypt() which does not
perform request completion after dispatching a request to OCF.
Instead, the ktls_ocf backends invoke ktls_encrypt_cb() when a TLS
record request completes for an asynchronous request.

- Flag AEAD software TLS sessions as async if the backend driver
selected by OCF is an async driver.

- Pull code to create and dispatch an OCF request out of
ktls_encrypt() into a new ktls_encrypt_one() function used by both
ktls_encrypt() and ktls_encrypt_async().

- Pull code to "finish" the VM page shuffling for a file-backed TLS
record into a helper function ktls_finish_noanon() used by both
ktls_encrypt() and ktls_encrypt_cb().

Reviewed by: markj
Tested on: ccr(4) (jhb), qat(4) (markj)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31665
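
The difference between the two dispatch styles in this commit can be modelled in a few lines. This is a toy model only: `Record`, `AsyncDevice`, the XOR stand-in cipher, and the function names are all hypothetical and are not the kernel interfaces.

```python
from collections import deque


class Record:
    def __init__(self, data: bytes):
        self.data = data
        self.ready = False   # models "mbufs marked ready"


def encrypt_one(rec: Record) -> None:
    """Stand-in for the real cipher: XOR every byte with a constant."""
    rec.data = bytes(b ^ 0x5A for b in rec.data)


def encrypt_sync(records) -> None:
    """Synchronous path: encrypt and complete one record at a time."""
    for rec in records:
        encrypt_one(rec)
        rec.ready = True


class AsyncDevice:
    """Models a coprocessor that completes requests some time later."""

    def __init__(self):
        self.pending = deque()

    def submit(self, rec: Record, callback) -> None:
        self.pending.append((rec, callback))

    def complete_all(self) -> None:
        while self.pending:
            rec, callback = self.pending.popleft()
            encrypt_one(rec)
            callback(rec)    # completion work runs in the callback


def encrypt_cb(rec: Record) -> None:
    rec.ready = True


def encrypt_async(dev: AsyncDevice, records) -> None:
    """Asynchronous path: queue every record and return immediately."""
    for rec in records:
        dev.submit(rec, encrypt_cb)
```

In the asynchronous path the worker never waits on an individual record, so a device with several engines can work on many records at once.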


# 6b313a3a 25-May-2021 John Baldwin <jhb@FreeBSD.org>

Include the trailer in the original dst_iov.

This avoids creating a duplicate copy on the stack just to
append the trailer.

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30139


# 21e3c1fb 25-May-2021 John Baldwin <jhb@FreeBSD.org>

Assume OCF is the only KTLS software backend.

This removes support for loadable software backends. The KTLS OCF
support is now always included in kernels with KERN_TLS and the
ktls_ocf.ko module has been removed. The software encryption routines
now take an mbuf directly and use the TLS mbuf as the crypto buffer
when possible.

Bump __FreeBSD_version for software backends in ports.

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30138


# 4a92afae 25-May-2021 John Baldwin <jhb@FreeBSD.org>

ktls_ocf: Fix a few places to not hardcode the GMAC hash length.

This is not a functional change as the Poly1305 hash is the same
length as the GMAC hash length.

Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30137


# 3fa03421 10-Mar-2021 John Baldwin <jhb@FreeBSD.org>

ktls: Fix non-inplace TLS 1.3 encryption.

Copy the iovec for the trailer from the proper place. This is the same
fix as the one for CBC encryption in ff6a7e4ba6bf.

Reported by: gallatin
Reviewed by: gallatin, markj
Fixes: 49f6925ca
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29177


# ff6a7e4b 04-Mar-2021 Mark Johnston <markj@FreeBSD.org>

ktls: Fix CBC encryption when input and output iov sizes are different

Reported by: gallatin
Tested by: gallatin
Fixes: 49f6925ca
Differential Revision: https://reviews.freebsd.org/D29073


# 49f6925c 03-Mar-2021 Mark Johnston <markj@FreeBSD.org>

ktls: Cache output buffers for software encryption

Maintain a cache of physically contiguous runs of pages for use as
output buffers when software encryption is configured and in-place
encryption is not possible. This makes allocation and free cheaper
since in the common case we avoid touching the vm_page structures for
the buffer, and fewer calls into UMA are needed. gallatin@ reports a
~10% absolute decrease in CPU usage with sendfile/KTLS on a Xeon after
this change.

It is possible that we will not be able to allocate these buffers if
physical memory is fragmented. To avoid frequently calling into the
physical memory allocator in this scenario, rate-limit allocation
attempts after a failure. In the failure case we fall back to the old
behaviour of allocating a page at a time.

N.B.: this scheme could be simplified, either by simply using malloc()
and looking up the PAs of the pages backing the buffer, or by falling
back to page by page allocation and creating a mapping in the cache
zone. This requires some way to save a mapping of an M_EXTPG page array
in the mbuf, though. m_data is not really appropriate. The second
approach may be possible by saving the mapping in the plinks union of
the first vm_page structure of the array, but this would force a vm_page
access when freeing an mbuf.

Reviewed by: gallatin, jhb
Tested by: gallatin
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D28556
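
The caching and rate-limiting policy described above can be sketched as a small model. The class name, the retry interval, and the convention that `None` means "fall back to page-at-a-time allocation" are assumptions made for illustration; the kernel uses a UMA cache zone rather than a Python list.

```python
import time


class OutputBufferCache:
    """Toy model of a buffer cache with rate-limited refill attempts."""

    def __init__(self, alloc_contig, retry_interval=1.0, clock=time.monotonic):
        self.alloc_contig = alloc_contig      # returns a buffer or None
        self.retry_interval = retry_interval  # hypothetical back-off, seconds
        self.clock = clock
        self.free = []                        # cached, reusable buffers
        self.next_attempt = 0.0               # earliest time to retry

    def get(self):
        """Return a contiguous buffer, or None to request the fallback."""
        if self.free:
            return self.free.pop()            # common case: no allocator call
        now = self.clock()
        if now < self.next_attempt:
            return None                       # still backing off
        buf = self.alloc_contig()
        if buf is None:
            # Allocation failed (e.g. fragmented memory): stop retrying
            # until the back-off interval has elapsed.
            self.next_attempt = now + self.retry_interval
        return buf

    def put(self, buf) -> None:
        """Return a buffer to the cache instead of freeing it."""
        self.free.append(buf)
```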


# 4dd6800e 18-Feb-2021 John Baldwin <jhb@FreeBSD.org>

Add Chacha20-Poly1305 support in the OCF backend for KTLS.

This supports Chacha20-Poly1305 for both send and receive for TLS 1.2
and for send in TLS 1.3.

Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27841


# db6b5644 08-Feb-2021 Mark Johnston <markj@FreeBSD.org>

ktls: Avoid wakeups and locking for synchronous callbacks

When performing encryption in software, the KTLS crypto callback always
locks the session to deliver a wakeup. But, if we're handling the
operation synchronously this is wasted effort and can result in
sleepqueue lock contention on large systems.

Use CRYPTO_SESS_SYNC() to determine whether the operation will be
completed asynchronously or not, and select a callback appropriately.
Avoid locking the session to check for completion if the session handles
requests synchronously.

Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28195


# 1755b2b9 08-Feb-2021 Mark Johnston <markj@FreeBSD.org>

ktls: Use COUNTER_U64_DEFINE_EARLY

This makes it a bit more straightforward to add new counters when
debugging. No functional change intended.

Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28498


# 47e2650e 13-Oct-2020 John Baldwin <jhb@FreeBSD.org>

Add support to the KTLS OCF module for AES-CBC MTE ciphersuites.

This is a simplistic approach which encrypts each TLS record in two
separate passes: one to generate the MAC and a second to encrypt.
This supports TLS 1.0 connections with implicit IVs as well as TLS
1.1+ with explicit IVs.

Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26730
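
The first of the two passes, generating the MAC, can be sketched with the Python standard library. The layout follows RFC 5246 (MAC over sequence number, type, version, length, and payload, then TLS block padding); the function name is hypothetical, HMAC-SHA1 is just one possible MAC, and the AES-CBC pass itself is omitted because the standard library has no AES.

```python
import hashlib
import hmac


def mte_plaintext(mac_key: bytes, seqno: int, rec_type: int, version: int,
                  payload: bytes, block: int = 16,
                  digest=hashlib.sha1) -> bytes:
    """Pass one of MAC-then-encrypt: return payload || MAC || padding."""
    pseudo_header = (seqno.to_bytes(8, "big") + bytes([rec_type]) +
                     version.to_bytes(2, "big") +
                     len(payload).to_bytes(2, "big"))
    mac = hmac.new(mac_key, pseudo_header + payload, digest).digest()
    body = payload + mac
    # TLS padding: pad_len + 1 bytes, each holding the value pad_len,
    # so that the total length is a multiple of the cipher block size.
    pad_len = (block - (len(body) + 1) % block) % block
    return body + bytes([pad_len]) * (pad_len + 1)
```

The second pass would encrypt this buffer with AES-CBC, chaining from the previous record for TLS 1.0 (implicit IV) or starting from an explicit per-record IV for TLS 1.1+.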


# 3c0e5685 23-Jul-2020 John Baldwin <jhb@FreeBSD.org>

Add support for KTLS RX via software decryption.

Allow TLS records to be decrypted in the kernel after being received
by a NIC. At a high level this is somewhat similar to software KTLS
for the transmit path except in reverse. Protocols enqueue mbufs
containing encrypted TLS records (or portions of records) into the
tail of a socket buffer and the KTLS layer decrypts those records
before returning them to userland applications. However, there is an
important difference:

- In the transmit case, the socket buffer is always a single "record"
holding a chain of mbufs. Not-yet-encrypted mbufs are marked not
ready (M_NOTREADY) and released to protocols for transmit by marking
mbufs ready once their data is encrypted.

- In the receive case, incoming (encrypted) data appended to the
socket buffer is still a single stream of data from the protocol,
but decrypted TLS records are stored as separate records in the
socket buffer and read individually via recvmsg().

Initially I tried to make this work by marking incoming mbufs as
M_NOTREADY, but there didn't seem to be a non-gross way to deal with
picking a portion of the mbuf chain and turning it into a new record
in the socket buffer after decrypting the TLS record it contained
(along with prepending a control message). Such mbufs would
also need to be "pinned" in some way while they are being decrypted
such that a concurrent sbcut() wouldn't free them out from under the
thread performing decryption.

As such, I settled on the following solution:

- Socket buffers now contain an additional chain of mbufs (sb_mtls,
sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by
the protocol layer. These mbufs are still marked M_NOTREADY, but
soreceive*() generally don't know about them (except that they will
block waiting for data to be decrypted for a blocking read).

- Each time a new mbuf is appended to this TLS mbuf chain, the socket
buffer peeks at the TLS record header at the head of the chain to
determine the encrypted record's length. If enough data is queued
for the TLS record, the socket is placed on a per-CPU TLS workqueue
(reusing the existing KTLS workqueues and worker threads).

- The worker thread loops over the TLS mbuf chain decrypting records
until it runs out of data. Each record is detached from the TLS
mbuf chain while it is being decrypted to keep the mbufs "pinned".
However, a new sb_dtlscc field tracks the character count of the
detached record and sbcut()/sbdrop() is updated to account for the
detached record. After the record is decrypted, the worker thread
first checks to see if sbcut() dropped the record. If so, it is
freed (can happen when a socket is closed with pending data).
Otherwise, the header and trailer are stripped from the original
mbufs, a control message is created holding the decrypted TLS
header, and the decrypted TLS record is appended to the "normal"
socket buffer chain.

(Side note: the SBCHECK() infrastructure was very useful as I was
able to add assertions there about the TLS chain that caught several
bugs during development.)

Tested by: rmacklem (various versions)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24628
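
The "peek at the TLS record header" step in the receive path above amounts to reading the 16-bit length field of the 5-byte header and comparing it with the amount of queued data. A minimal sketch, with a hypothetical function name and a flat byte string standing in for the TLS mbuf chain:

```python
TLS_HEADER_LEN = 5   # type (1) + version (2) + length (2)


def record_ready(queued: bytes):
    """Return the total record length if a full record is queued, else None."""
    if len(queued) < TLS_HEADER_LEN:
        return None                      # header itself is incomplete
    payload_len = int.from_bytes(queued[3:5], "big")
    total = TLS_HEADER_LEN + payload_len
    return total if len(queued) >= total else None
```

Only when this returns a length would the socket be handed to a worker for decryption; otherwise it simply waits for more data from the protocol.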


# 70d1a435 23-Jul-2020 John Baldwin <jhb@FreeBSD.org>

Consolidate duplicated code into a ktls_ocf_dispatch function.

This function manages the loop around crypto_dispatch and coordination
with ktls_ocf_callback.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25757


# 33a1a488 20-Jul-2020 John Baldwin <jhb@FreeBSD.org>

Don't dynamically allocate data structures for KTLS crypto requests.

Allocate iovec arrays and struct cryptop and struct ocf_operation
objects on the stack to avoid the overhead of malloc().

These structures are all small enough to fit on the stack of the KTLS
worker threads.

Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25692


# 4a711b8d 25-Jun-2020 John Baldwin <jhb@FreeBSD.org>

Use zfree() instead of explicit_bzero() and free().

In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by: cem
Reviewed by: cem, delphij
Approved by: csprng (cem)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25435


# 5b750b9a 22-Jun-2020 John Baldwin <jhb@FreeBSD.org>

Store the AAD in a separate buffer for KTLS.

For TLS 1.2 this permits reusing one of the existing iovecs without
always having to duplicate both.

While here, only duplicate the output iovec for TLS 1.3 if it will be
used.

Reviewed by: gallatin
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25291


# 4e6a3813 12-Jun-2020 John Baldwin <jhb@FreeBSD.org>

Fix a regression in r361804 for TLS 1.3.

I was not including the record type stored in the first byte of the
trailer as part of the payload to be encrypted and hashed.

Sponsored by: Netflix


# 080933c0 04-Jun-2020 John Baldwin <jhb@FreeBSD.org>

Use separate output buffers for OCF requests in KTLS.

KTLS encryption requests for file-backed data such as from sendfile(2)
require the encrypted data to be stored in a separate buffer from the
unencrypted file input data. Previously the OCF backend for KTLS
manually copied the data from the input buffer to the output buffer
before queueing the crypto request. Now the OCF backend will use a
separate output buffer for such requests and avoid the copy. This
mostly helps when an async co-processor is used by saving CPU cycles
used on the copy.

Reviewed by: gallatin (earlier version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24545


# 9c0e3d3a 25-May-2020 John Baldwin <jhb@FreeBSD.org>

Add support for optional separate output buffers to in-kernel crypto.

Some crypto consumers such as GELI and KTLS for file-backed sendfile
need to store their output in a separate buffer from the input.
Currently these consumers copy the contents of the input buffer into
the output buffer and queue an in-place crypto operation on the output
buffer. Using a separate output buffer avoids this copy.

- Create a new 'struct crypto_buffer' describing a crypto buffer
containing a type and type-specific fields. crp_ilen is gone,
instead buffers that use a flat kernel buffer have a cb_buf_len
field for their length. The length of other buffer types is
inferred from the backing store (e.g. uio_resid for a uio).
Requests now have two such structures: crp_buf for the input buffer,
and crp_obuf for the output buffer.

- Consumers now use helper functions (crypto_use_*,
e.g. crypto_use_mbuf()) to configure the input buffer. If an output
buffer is not configured, the request still modifies the input
buffer in-place. A consumer uses a second set of helper functions
(crypto_use_output_*) to configure an output buffer.

- Consumers must request support for separate output buffers when
creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are
only permitted to queue a request with a separate output buffer on
sessions with this flag set. Existing drivers already reject
sessions with unknown flags, so this permits drivers to be modified
to support this extension without requiring all drivers to change.

- Several data-related functions now have matching versions that
operate on an explicit buffer (e.g. crypto_apply_buf,
crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf).

- Most of the existing data-related functions operate on the input
buffer. However crypto_copyback always writes to the output buffer
if a request uses a separate output buffer.

- For the regions in input/output buffers, the following conventions
are followed:
- AAD and IV are always present in input only and their
fields are offsets into the input buffer.
- payload is always present in both buffers. If a request uses a
separate output buffer, it must set a new crp_payload_start_output
field to the offset of the payload in the output buffer.
- digest is in the input buffer for verify operations, and in the
output buffer for compute operations. crp_digest_start is relative
to the appropriate buffer.

- Add a crypto buffer cursor abstraction. This is a more general form
of some bits in the cryptosoft driver that tried to always use uio's.
However, compared to the original code, this avoids rewalking the uio
iovec array for requests with multiple vectors. It also avoids
allocating an iovec array for mbufs and populating it by instead walking
the mbuf chain directly.

- Update the cryptosoft(4) driver to support separate output buffers
making use of the cursor abstraction.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24545


# c0341432 27-Mar-2020 John Baldwin <jhb@FreeBSD.org>

Refactor driver and consumer interfaces for OCF (in-kernel crypto).

- The linked list of cryptoini structures used in session
initialization is replaced with a new flat structure: struct
crypto_session_params. This structure includes a new mode to define
how the other fields should be interpreted. Available modes
include:

- COMPRESS (for compression/decompression)
- CIPHER (for simple encryption/decryption)
- DIGEST (computing and verifying digests)
- AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
- ETA (combined auth and encryption using encrypt-then-authenticate)

Additional modes could be added in the future (e.g. if we wanted to
support TLS MtE for AES-CBC in the kernel we could add a new mode
for that. TLS modes might also affect how AAD is interpreted, etc.)

The flat structure also includes the key lengths and algorithms as
before. However, code doesn't have to walk the linked list and
switch on the algorithm to determine which key is the auth key vs
encryption key. The 'csp_auth_*' fields are always used for auth
keys and settings and 'csp_cipher_*' for cipher. (Compression
algorithms are stored in csp_cipher_alg.)

- Drivers no longer register a list of supported algorithms. This
doesn't quite work when you factor in modes (e.g. a driver might
support both AES-CBC and SHA2-256-HMAC separately but not combined
for ETA). Instead, a new 'crypto_probesession' method has been
added to the kobj interface for symmetric crypto drivers. This
method returns a negative value on success (similar to how
device_probe works) and the crypto framework uses this value to pick
the "best" driver. There are three constants for hardware
(e.g. ccr), accelerated software (e.g. aesni), and plain software
(cryptosoft) that give preference in that order. One effect of this
is that if you request only hardware when creating a new session,
you will no longer get a session using accelerated software.
Another effect is that the default setting to disallow software
crypto via /dev/crypto now disables accelerated software.

Once a driver is chosen, 'crypto_newsession' is invoked as before.

- Crypto operations are now solely described by the flat 'cryptop'
structure. The linked list of descriptors has been removed.

A separate enum has been added to describe the type of data buffer
in use instead of using CRYPTO_F_* flags to make it easier to add
more types in the future if needed (e.g. wired userspace buffers for
zero-copy). It will also make it easier to re-introduce separate
input and output buffers (in-kernel TLS would benefit from this).

Try to make the flags related to IV handling less insane:

- CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
member of the operation structure. If this flag is not set, the
IV is stored in the data buffer at the 'crp_iv_start' offset.

- CRYPTO_F_IV_GENERATE means that a random IV should be generated
and stored into the data buffer. This cannot be used with
CRYPTO_F_IV_SEPARATE.

If a consumer wants to deal with explicit vs implicit IVs, etc. it
can always generate the IV however it needs and store partial IVs in
the buffer and the full IV/nonce in crp_iv and set
CRYPTO_F_IV_SEPARATE.

The layout of the buffer is now described via fields in cryptop.
crp_aad_start and crp_aad_length define the boundaries of any AAD.
Previously with GCM and CCM you defined an auth crd with this range,
but for ETA your auth crd had to span both the AAD and plaintext
(and they had to be adjacent).

crp_payload_start and crp_payload_length define the boundaries of
the plaintext/ciphertext. Modes that only do a single operation
(COMPRESS, CIPHER, DIGEST) should only use this region and leave the
AAD region empty.

If a digest is present (or should be generated), its starting
location is marked by crp_digest_start.

Instead of using the CRD_F_ENCRYPT flag to determine the direction
of the operation, cryptop now includes an 'op' field defining the
operation to perform. For digests I've added a new VERIFY digest
mode which assumes a digest is present in the input and fails the
request with EBADMSG if it doesn't match the internally-computed
digest. GCM and CCM already assumed this, and the new AEAD mode
requires this for decryption. The new ETA mode now also requires
this for decryption, so IPsec and GELI no longer do their own
authentication verification. Simple DIGEST operations can also do
this, though there are no in-tree consumers.

To eventually support some refcounting to close races, the session
cookie is now passed to crypto_getop() and clients should no longer
set crp_session directly.

- Asymmetric crypto operation structures should be allocated via
crypto_getkreq() and freed via crypto_freekreq(). This permits the
crypto layer to track open asym requests and close races with a
driver trying to unregister while asym requests are in flight.

- crypto_copyback, crypto_copydata, crypto_apply, and
crypto_contiguous_subsegment now accept the 'crp' object as the
first parameter instead of individual members. This makes it easier
to deal with different buffer types in the future as well as
separate input and output buffers. It's also simpler for driver
writers to use.

- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
This understands the various types of buffers so that drivers that
use DMA do not have to be aware of different buffer types.

- Helper routines now exist to build an auth context for HMAC IPAD
and OPAD. This reduces some duplicated work among drivers.

- Key buffers are now treated as const throughout the framework and in
device drivers. However, session key buffers provided when a session
is created are expected to remain alive for the duration of the
session.

- GCM and CCM sessions now only specify a cipher algorithm and a cipher
key. The redundant auth information is not needed or used.

- For cryptosoft, split up the code a bit such that the 'process'
callback now invokes a function pointer in the session. This
function pointer is set based on the mode (in effect) though it
simplifies a few edge cases that would otherwise be in the switch in
'process'.

It does split up GCM vs CCM which I think is more readable even if there
is some duplication.

- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
as an auth algorithm and updated cryptocheck to work with it.

- Combined cipher and auth sessions via /dev/crypto now always use ETA
mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
This was actually documented as being true in crypto(4) before, but
the code had not implemented this before I added the CIPHER_FIRST
flag.

- I have not yet updated /dev/crypto to be aware of explicit modes for
sessions. I will probably do that at some point in the future as well
as teach it about IV/nonce and tag lengths for AEAD so we can support
all of the NIST KAT tests for GCM and CCM.

- I've split up the existing crypto.9 manpage into several pages
of which many are written from scratch.

- I have converted all drivers and consumers in the tree and verified
that they compile, but I have not tested all of them. I have tested
the following drivers:

- cryptosoft
- aesni (AES only)
- blake2
- ccr

and the following consumers:

- cryptodev
- IPsec
- ktls_ocf
- GELI (lightly)

I have not tested the following:

- ccp
- aesni with sha
- hifn
- kgssapi_krb5
- ubsec
- padlock
- safe
- armv8_crypto (aarch64)
- glxsb (i386)
- sec (ppc)
- cesa (armv7)
- cryptocteon (mips64)
- nlmsec (mips64)

Discussed with: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23677
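
The driver-selection rule introduced by the 'crypto_probesession' method in this commit (negative values mean success, and the framework prefers hardware, then accelerated software, then plain software) can be sketched as follows. The numeric constants and the function name are illustrative assumptions, as is the choice to treat any non-negative return as a rejection.

```python
import errno

# Illustrative priorities: closer to zero means more preferred.
PROBE_HARDWARE = -100
PROBE_ACCEL_SOFTWARE = -200
PROBE_SOFTWARE = -500


def pick_driver(drivers, params):
    """Return the name of the best driver that accepts `params`, or None.

    `drivers` is a list of (name, probe) pairs; each probe returns a
    negative priority on success or a positive errno on rejection.
    """
    best_name, best_score = None, None
    for name, probe in drivers:
        score = probe(params)
        if score >= 0:
            continue                      # driver cannot handle this session
        if best_score is None or score > best_score:
            best_name, best_score = name, score
    return best_name
```

Because each driver sees the full session parameters, a driver that supports two algorithms separately but not in combination can simply reject the combined session, which a static list of supported algorithms could not express.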


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 55b7a0e1 17-Dec-2019 John Baldwin <jhb@FreeBSD.org>

Add support for TLS 1.3 using AES-GCM to the OCF backend for KTLS.

Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D22802


# 6554362c 27-Sep-2019 Andrew Gallatin <gallatin@FreeBSD.org>

kTLS support for TLS 1.3

TLS 1.3 requires a few changes because 1.3 pretends to be 1.2
with a record type of application data. The "real" record type is
then included at the end of the user-supplied plaintext
data. This required adding a field to the mbuf_ext_pgs struct to
save the record type, and passing the real record type to the
sw_encrypt() ktls backend functions.

Reviewed by: jhb, hselasky
Sponsored by: Netflix
Differential Revision: D21801
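
The TLS 1.3 framing quirk described in this commit can be shown concretely. Per RFC 8446, the outer header always claims record type 23 (application data) and version 0x0303, while the real record type is appended to the plaintext before encryption. The function name is hypothetical and the 16-byte tag length is an assumption matching the AEAD ciphers discussed elsewhere in this log.

```python
TLS_RLTYPE_APP = 23       # application_data
TLS_LEGACY_VERSION = 0x0303


def tls13_frame(real_type: int, payload: bytes, tag_len: int = 16):
    """Return (outer header, inner plaintext) for one TLS 1.3 record."""
    # The real record type travels as the last byte of the plaintext.
    inner = payload + bytes([real_type])
    header = (bytes([TLS_RLTYPE_APP]) +
              TLS_LEGACY_VERSION.to_bytes(2, "big") +
              (len(inner) + tag_len).to_bytes(2, "big"))
    return header, inner
```

For example, a 3-byte handshake fragment (type 22) yields an inner plaintext of 4 bytes and an outer length field of 20.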


# b2e60773 26-Aug-2019 John Baldwin <jhb@FreeBSD.org>

Add kernel-side support for in-kernel TLS.

KTLS adds support for in-kernel framing and encryption of Transport
Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports
offload of TLS for transmitted data. Key negotiation must still be
performed in userland. Once completed, transmit session keys for a
connection are provided to the kernel via a new TCP_TXTLS_ENABLE
socket option. All subsequent data transmitted on the socket is
placed into TLS frames and encrypted using the supplied keys.

Any data written to a KTLS-enabled socket via write(2), aio_write(2),
or sendfile(2) is assumed to be application data and is encoded in TLS
frames with an application data type. Individual records can be sent
with a custom type (e.g. handshake messages) via sendmsg(2) with a new
control message (TLS_SET_RECORD_TYPE) specifying the record type.
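
A userland sketch of sending a record with a custom type might look
like the following. The helper names are hypothetical, and the
fallback definition of TLS_SET_RECORD_TYPE is an assumption made only
so that the fragment builds where the system headers do not provide
it; on a KTLS-capable system the value from the headers should be used.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdint.h>
#include <string.h>

#ifndef TLS_SET_RECORD_TYPE
#define TLS_SET_RECORD_TYPE 1   /* assumption: fallback value only */
#endif

/* Control buffer sized for a single one-byte record type. */
union rec_type_ctl {
	struct cmsghdr hdr;     /* forces correct alignment */
	unsigned char  buf[CMSG_SPACE(sizeof(uint8_t))];
};

/* Attach a TLS_SET_RECORD_TYPE control message to 'msg'. */
static void
fill_record_type_cmsg(struct msghdr *msg, union rec_type_ctl *ctl,
    uint8_t record_type)
{
	struct cmsghdr *cmsg;

	memset(ctl, 0, sizeof(*ctl));
	msg->msg_control = ctl;
	msg->msg_controllen = sizeof(*ctl);
	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = IPPROTO_TCP;
	cmsg->cmsg_type = TLS_SET_RECORD_TYPE;
	cmsg->cmsg_len = CMSG_LEN(sizeof(record_type));
	memcpy(CMSG_DATA(cmsg), &record_type, sizeof(record_type));
}

/* Send 'len' bytes as one record of the given type (e.g. handshake). */
static ssize_t
send_tls_record(int fd, uint8_t record_type, const void *buf, size_t len)
{
	union rec_type_ctl ctl;
	struct iovec iov;
	struct msghdr msg;

	iov.iov_base = (void *)(uintptr_t)buf;
	iov.iov_len = len;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	fill_record_type_cmsg(&msg, &ctl, record_type);
	return (sendmsg(fd, &msg, 0));
}
```

Data written with write(2) needs none of this, since it is framed as
application data by default.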

At present, rekeying is not supported though the in-kernel framework
should support rekeying.

KTLS makes use of the recently added unmapped mbufs to store TLS
frames in the socket buffer. Each TLS frame is described by a single
ext_pgs mbuf. The ext_pgs structure contains the header of the TLS
record (and trailer for encrypted records) as well as references to
the associated TLS session.

KTLS supports two primary methods of encrypting TLS frames: software
TLS and ifnet TLS.

Software TLS marks mbufs holding socket data as not ready via
M_NOTREADY similar to sendfile(2) when TLS framing information is
added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then
called to schedule TLS frames for encryption. In the case of
sendfile(2), sendfile_iodone() calls ktls_enqueue() instead of
pru_ready(), leaving the mbufs marked M_NOTREADY until encryption is
completed. For other writes (vn_sendfile when pages are available,
write(2), etc.), the PRUS_NOTREADY flag is set when invoking
pru_send() along with invoking ktls_enqueue().

A pool of worker threads (the "KTLS" kernel process) encrypts TLS
frames queued via ktls_enqueue(). Each TLS frame is temporarily
mapped using the direct map and passed to a software encryption
backend to perform the actual encryption.
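
The not-ready/enqueue/encrypt/ready sequence can be modelled with a toy
queue. Everything below is a hypothetical stand-in (struct frame, the
model_* functions); it mirrors only the ordering of steps described
above and performs no real encryption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf holding one TLS frame. */
struct frame {
	bool          not_ready;   /* models M_NOTREADY */
	bool          encrypted;
	struct frame *next;
};

struct work_queue {
	struct frame *head;
	struct frame *tail;
};

/* Models ktls_frame(): framing is added and the frame is held back. */
static void
model_frame(struct frame *f)
{
	f->not_ready = true;
	f->encrypted = false;
	f->next = NULL;
}

/* Models ktls_enqueue(): hand the frame to the worker pool. */
static void
model_enqueue(struct work_queue *q, struct frame *f)
{
	if (q->tail != NULL)
		q->tail->next = f;
	else
		q->head = f;
	q->tail = f;
}

/*
 * Models one pass of a worker thread: "encrypt" each queued frame,
 * then mark it ready (the pru_ready() step).  Returns frames handled.
 */
static size_t
model_worker_drain(struct work_queue *q)
{
	struct frame *f;
	size_t n = 0;

	while ((f = q->head) != NULL) {
		q->head = f->next;
		if (q->head == NULL)
			q->tail = NULL;
		f->next = NULL;
		f->encrypted = true;    /* backend would encrypt here */
		f->not_ready = false;   /* now visible to the TCP stack */
		n++;
	}
	return (n);
}
```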

(Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if
someone wished to make this work on architectures without a direct
map.)

KTLS supports pluggable software encryption backends. Internally,
Netflix uses proprietary pure-software backends. This commit includes
a simple backend in a new ktls_ocf.ko module that uses the kernel's
OpenCrypto framework to provide AES-GCM encryption of TLS frames. As
a result, software TLS is now a bit of a misnomer as it can make use
of hardware crypto accelerators.

Once software encryption has finished, the TLS frame mbufs are marked
ready via pru_ready(). At this point, the encrypted data appears as
regular payload to the TCP stack stored in unmapped mbufs.

ifnet TLS permits a NIC to offload the TLS encryption and TCP
segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS)
is allocated on the interface a socket is routed over and associated
with a TLS session. TLS records for a TLS session using ifnet TLS are
not marked M_NOTREADY but are passed down the stack unencrypted. The
ip_output_send() and ip6_output_send() helper functions that apply
send tags to outbound IP packets verify that the send tag of the TLS
record matches the outbound interface. If so, the packet is tagged
with the TLS send tag and sent to the interface. The NIC device
driver must recognize packets with the TLS send tag and schedule them
for TLS encryption and TCP segmentation. If the outbound
interface does not match the interface in the TLS send tag, the packet
is dropped. In addition, a task is scheduled to refresh the TLS send
tag for the TLS session. If a new TLS send tag cannot be allocated,
the connection is dropped. If a new TLS send tag is allocated,
however, subsequent packets will be tagged with the correct TLS send
tag. (This latter case has been tested by configuring both ports of a
Chelsio T6 in a lagg and failing over from one port to another. As
the connections migrated to the new port, new TLS send tags were
allocated for the new port and connections resumed without being
dropped.)
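
The send tag check and refresh behaviour described above reduces to a
small decision procedure. The types and function names below are
hypothetical models, not the ifnet or ip_output_send() interfaces; the
sketch only captures the match / drop-and-refresh / connection-drop
outcomes.

```c
#include <stdbool.h>
#include <stddef.h>

enum send_action {
	SEND_TAGGED,       /* tag matches: pass to the NIC for offload */
	DROP_AND_REFRESH   /* mismatch: drop packet, refresh the tag */
};

struct model_ifnet { int index; };

struct model_snd_tag { const struct model_ifnet *ifp; };

struct model_session {
	struct model_snd_tag *tag;
	bool                  refresh_scheduled;
};

/* Models the check made when a TLS record is about to be sent. */
static enum send_action
model_output_send(struct model_session *s, const struct model_ifnet *out)
{
	if (s->tag != NULL && s->tag->ifp == out)
		return (SEND_TAGGED);
	s->refresh_scheduled = true;   /* task allocates a new tag later */
	return (DROP_AND_REFRESH);
}

/*
 * Models the refresh task.  'new_tag' is NULL when allocation failed.
 * Returns false when the connection has to be dropped.
 */
static bool
model_refresh_tag(struct model_session *s, struct model_snd_tag *new_tag)
{
	s->refresh_scheduled = false;
	if (new_tag == NULL)
		return (false);
	s->tag = new_tag;              /* later packets use the new tag */
	return (true);
}
```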

ifnet TLS can be enabled and disabled on supported network interfaces
via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported
across both vlan devices and lagg interfaces using failover or lacp
with flowid enabled.

Applications may request the current KTLS mode of a connection via a
new TCP_TXTLS_MODE socket option. They can also use this socket
option to toggle between software and ifnet TLS modes.

In addition, a testing tool is available in tools/tools/switch_tls.
This is modeled on tcpdrop and uses similar syntax. However, instead
of dropping connections, -s is used to force KTLS connections to
switch to software TLS and -i is used to switch to ifnet TLS.

Various sysctls and counters are available under the kern.ipc.tls
sysctl node. The kern.ipc.tls.enable node must be set to true to
enable KTLS (it is off by default). The use of unmapped mbufs must
also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS.

KTLS is enabled via the KERN_TLS kernel option.

This patch is the culmination of years of work by several folks
including Scott Long and Randall Stewart for the original design and
implementation; Drew Gallatin for several optimizations including the
use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records
awaiting software encryption, and pluggable software crypto backends;
and John Baldwin for modifications to support hardware TLS offload.

Reviewed by: gallatin, hselasky, rrs
Obtained from: Netflix
Sponsored by: Netflix, Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21277