History log of /linux-master/io_uring/rw.h
Revision Date Author Comments
# 414d0f45 20-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring/alloc_cache: switch to array based caching

Currently lists are being used to manage this, but best practice is
usually to have these in an array instead, as that is cheaper to manage.

Outside of that detail, games also have to be played with KASAN, as the
list pointer lives inside the cached entry itself.

Finally, all users of this need a struct io_cache_entry embedded in
their struct, which is union'ized with something else in there that
isn't used across the free -> realloc cycle.

Get rid of all of that, and simply have it be an array. This will not
change the memory used, as we're just trading an 8-byte member entry
for the per-elem array size.

This reduces the overhead of the recycled allocations, and it reduces
the amount of code needed to support recycling to about half of what it
currently is.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
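
As a rough illustration of the array-based cache described above, here is a
minimal user-space sketch; the names and the 32-entry limit are illustrative,
not the kernel's exact io_alloc_cache implementation:

#include <stdlib.h>

#define CACHE_MAX_ENTRIES 32

struct alloc_cache {
        void *entries[CACHE_MAX_ENTRIES];  /* bare pointers to recycled objects */
        unsigned int nr;                   /* how many entries are cached */
        size_t elem_size;                  /* size of each cached object */
};

/* Pop a recycled object if one is available, else allocate fresh. */
static void *cache_get(struct alloc_cache *c)
{
        if (c->nr)
                return c->entries[--c->nr];
        return malloc(c->elem_size);
}

/* Return an object for reuse; free it once the cache is full. */
static void cache_put(struct alloc_cache *c, void *obj)
{
        if (c->nr < CACHE_MAX_ENTRIES)
                c->entries[c->nr++] = obj;
        else
                free(obj);
}

Note how the cached objects no longer need an embedded list entry: the array
holds bare pointers, which is what removes both the KASAN games and the
union'ized io_cache_entry member.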


# d6f911a6 18-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring/rw: add iovec recycling

Let the io_async_rw hold on to the iovec and reuse it, rather than
always allocating and freeing it.

Also enables KASAN for the iovec entries, so that reuse can be detected
even while they are in the cache.

While doing so, shrink io_async_rw by dropping the bigger embedded fast
iovec: since iovecs are recycled now, the inline fast iovec can shrink
from 8 entries to 1. This reduces the io_async_rw size from 264 to 160
bytes, a 40% reduction.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
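
A sketch of the recycling idea with hypothetical names (the kernel's
io_async_rw differs in detail): keep the heap iovec in the per-request async
state, assumed zero-initialized, and only grow it when a request needs more
segments:

#include <stdlib.h>
#include <sys/uio.h>

struct async_rw_state {
        struct iovec fast_iov;   /* inline fast path, now a single entry */
        struct iovec *heap_iov;  /* recycled across free -> realloc cycles */
        int nr_heap;             /* capacity of heap_iov */
};

/* Return an iovec big enough for nr_segs, reusing the cached one. */
static struct iovec *get_iovec(struct async_rw_state *s, int nr_segs)
{
        if (nr_segs == 1)
                return &s->fast_iov;
        if (nr_segs > s->nr_heap) {
                struct iovec *iov = realloc(s->heap_iov, nr_segs * sizeof(*iov));
                if (!iov)
                        return NULL;
                s->heap_iov = iov;
                s->nr_heap = nr_segs;
        }
        return s->heap_iov;      /* reused, no free/alloc in between */
}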


# 0d10bd77 18-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring: get rid of struct io_rw_state

A separate state struct is not needed anymore, just fold it in with
io_async_rw.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# a9165b83 18-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring/rw: always setup io_async_rw for read/write requests

read/write requests try to put everything on the stack, and then alloc
and copy if a retry is needed. This necessitates a bunch of nasty code
that deals with intermediate state.

Get rid of this, and have the prep side setup everything that is needed
upfront, which greatly simplifies the opcode handlers.

This includes adding an alloc cache for io_async_rw, to make it cheap
to handle.

In terms of cost, this should be basically free and transparent. For
the worst case of {READ,WRITE}_FIXED, which didn't need it before,
performance is unaffected in the normal peak workload used to test it:
it still runs at 122M IOPS.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
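
The prep-side pattern, sketched by reusing the cache_get() and
async_rw_state sketches from the entries above (hypothetical names, not
the kernel's actual prep path):

#include <errno.h>

struct io_request {
        struct async_rw_state *async;  /* always valid once prep succeeds */
};

/* Allocate the async state up front, at prep time, out of the cache;
 * the issue and retry paths can then rely on it unconditionally instead
 * of copying intermediate state off the stack on retry. */
static int rw_prep(struct io_request *req, struct alloc_cache *cache)
{
        req->async = cache_get(cache);
        if (!req->async)
                return -ENOMEM;
        /* ...import the iovec / iterator state into req->async here... */
        return 0;
}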


# f688944c 06-Nov-2023 Jens Axboe <axboe@kernel.dk>

io_uring/rw: add separate prep handler for fixed read/write

Rather than sprinkle opcode checks in the generic read/write prep handler,
have a separate prep handler for the fixed read/write operation.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 0e984ec8 06-Nov-2023 Jens Axboe <axboe@kernel.dk>

io_uring/rw: add separate prep handler for readv/writev

Rather than sprinkle opcode checks in the generic read/write prep handler,
have a separate prep handler for the vectored readv/writev operation.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# fc68fcda 11-Sep-2023 Jens Axboe <axboe@kernel.dk>

io_uring/rw: add support for IORING_OP_READ_MULTISHOT

This behaves like IORING_OP_READ, except:

1) It only supports pollable files (e.g. pipes, sockets, etc). Note that
for sockets, you probably want to use recv/recvmsg with multishot
instead.

2) It supports multishot mode, meaning it will repeatedly trigger a
read and fill a buffer when data is available. This allows similar
use to recv/recvmsg but on non-sockets, where a single request will
repeatedly post a CQE whenever data is read from it.

3) Because of #2, it must be used with provided buffers. This is
uniformly true across any request type that supports multishot and
transfers data, since it's not possible to pass in a single buffer
for the data: multiple reads may well trigger before the application
has had a chance to process previous CQEs and the data they
delivered.

Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
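
For context, arming such a read might look like the sketch below. It assumes
liburing >= 2.5 for io_uring_prep_read_multishot(), and a provided-buffer
group registered beforehand under the (illustrative) BUF_GROUP_ID:

#include <liburing.h>

#define BUF_GROUP_ID 1  /* assumed: buffer group registered ahead of time */

static int arm_read_multishot(struct io_uring *ring, int pipe_fd)
{
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
                return -1;
        /* Length 0 defers the read size to the selected provided buffer;
         * while armed, each read posts a CQE with IORING_CQE_F_MORE set. */
        io_uring_prep_read_multishot(sqe, pipe_fd, 0, 0, BUF_GROUP_ID);
        return io_uring_submit(ring);
}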


# c92fcfc2 02-Jun-2023 Jens Axboe <axboe@kernel.dk>

io_uring: avoid indirect function calls for the hottest task_work

We use task_work for a variety of reasons, but doing completions or
triggering a retry after poll are by far the hottest two. Use the
indirect function call wrappers to avoid the indirect function call if
CONFIG_RETPOLINE is set.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
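
The pattern, as a standalone sketch modeled on the kernel's
INDIRECT_CALL_*() wrappers; the two task-work callbacks here are
stand-ins, not the actual io_uring functions:

typedef void (*tw_func_t)(void *req);

/* Stand-ins for the two hottest task_work callbacks. */
static void complete_tw(void *req) { (void)req; /* completion work */ }
static void poll_retry_tw(void *req) { (void)req; /* retry after poll */ }

/* With retpolines enabled, an indirect call is costly; comparing the
 * pointer against the known-hot targets lets the compiler emit direct
 * calls for them, falling back to the indirect call otherwise. */
static inline void run_tw(tw_func_t f, void *req)
{
        if (f == complete_tw)
                complete_tw(req);       /* direct call */
        else if (f == poll_retry_tw)
                poll_retry_tw(req);     /* direct call */
        else
                f(req);                 /* indirect call fallback */
}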


# 47b4c686 20-Sep-2022 Pavel Begunkov <asml.silence@gmail.com>

io_uring/rw: don't lose partial IO result on fail

A partially done read/write may end up in io_req_complete_failed() and
lose the result; make sure we return the number of bytes processed.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/05e0879c226bcd53b441bf92868eadd4bf04e2fc.1663668091.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
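
The logic of the fix, reduced to a sketch (a hypothetical helper, not
the kernel's io_req_complete_failed() itself):

/* On failure after a short transfer, the partial byte count must win;
 * the error code is only surfaced when nothing was transferred. */
static long rw_fail_result(long bytes_done, long err)
{
        return bytes_done > 0 ? bytes_done : err;
}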


# f3b44f92 13-Jun-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move read/write related opcodes to their own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>