History log of /linux-master/io_uring/Makefile
Revision Date Author Comments
# f15ed8b4 27-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring: move mapping/allocation helpers to a separate file

Move the related code from io_uring.c into memmap.c. No functional
changes in this patch, just cleaning it up a bit now that the full
transition is done.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 77a1cd5e 26-Mar-2024 Jens Axboe <axboe@kernel.dk>

io_uring: re-arrange Makefile order

The object list is a bit of a mess, with core and opcode files mixed in.
Re-arrange it so that we have the core bits first, and then opcode
specific files after that.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 8d0c12a8 08-Jun-2023 Stefan Roesch <shr@devkernel.io>

io-uring: add napi busy poll support

This adds the napi busy polling support in io_uring.c. It adds a new
napi_list to the io_ring_ctx structure. This list contains the list of
napi_id's that are currently enabled for busy polling. The list is
synchronized by the new napi_lock spin lock. The current default napi
busy polling time is stored in napi_busy_poll_to. If napi busy polling
is not enabled, the value is 0.

In addition there is also a hash table. The hash table store the napi
id and the pointer to the above list nodes. The hash table is used to
speed up the lookup to the list elements. The hash table is synchronized
with rcu.

The NAPI_TIMEOUT is stored as a timeout to make sure that the time a
napi entry is stored in the napi list is limited.

The busy poll timeout is also stored as part of the io_wait_queue. This
is necessary as for sq polling the poll interval needs to be adjusted
and the napi callback allows only to pass in one value.

This has been tested with two simple programs from the liburing library
repository: the napi client and the napi server program. The client
sends a request, which has a timestamp in its payload and the server
replies with the same payload. The client calculates the roundtrip time
and stores it to calculate the results.

The client is running on host1 and the server is running on host 2 (in
the same rack). The measured times below are roundtrip times. They are
average times over 5 runs each. Each run measures 1 million roundtrips.

no rx coal rx coal: frames=88,usecs=33
Default 57us 56us

client_poll=100us 47us 46us

server_poll=100us 51us 46us

client_poll=100us+ 40us 40us
server_poll=100us

client_poll=100us+ 41us 39us
server_poll=100us+
prefer napi busy poll on client

client_poll=100us+ 41us 39us
server_poll=100us+
prefer napi busy poll on server

client_poll=100us+ 41us 39us
server_poll=100us+
prefer napi busy poll on client + server

Signed-off-by: Stefan Roesch <shr@devkernel.io>
Suggested-by: Olivier Langlois <olivier@trillion01.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230608163839.2891748-5-shr@devkernel.io
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# b4bb1900 02-Feb-2024 Tony Solomonik <tony.solomonik@gmail.com>

io_uring: add support for ftruncate

Adds support for doing truncate through io_uring, eliminating
the need for applications to roll their own thread pool or offload
mechanism to be able to do non-blocking truncates.

Signed-off-by: Tony Solomonik <tony.solomonik@gmail.com>
Link: https://lore.kernel.org/r/20240202121724.17461-3-tony.solomonik@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# c4320315 19-Dec-2023 Jens Axboe <axboe@kernel.dk>

io_uring/register: move io_uring_register(2) related code to register.c

Most of this code is basically self contained, move it out of the core
io_uring file to bring a bit more separation to the registration related
bits. This moves another ~10% of the code into register.c.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 194bb58c 08-Jun-2023 Jens Axboe <axboe@kernel.dk>

io_uring: add support for futex wake and wait

Add support for FUTEX_WAKE/WAIT primitives.

IORING_OP_FUTEX_WAKE is mix of FUTEX_WAKE and FUTEX_WAKE_BITSET, as
it does support passing in a bitset.

Similary, IORING_OP_FUTEX_WAIT is a mix of FUTEX_WAIT and
FUTEX_WAIT_BITSET.

For both of them, they are using the futex2 interface.

FUTEX_WAKE is straight forward, as those can always be done directly from
the io_uring submission without needing async handling. For FUTEX_WAIT,
things are a bit more complicated. If the futex isn't ready, then we
rely on a callback via futex_queue->wake() when someone wakes up the
futex. From that calback, we queue up task_work with the original task,
which will post a CQE and wake it, if necessary.

Cancelations are supported, both from the application point-of-view,
but also to be able to cancel pending waits if the ring exits before
all events have occurred. The return value of futex_unqueue() is used
to gate who wins the potential race between cancelation and futex
wakeups. Whomever gets a 'ret == 1' return from that claims ownership
of the io_uring futex request.

This is just the barebones wait/wake support. PI or REQUEUE support is
not added at this point, unclear if we might look into that later.

Likewise, explicit timeouts are not supported either. It is expected
that users that need timeouts would do so via the usual io_uring
mechanism to do that using linked timeouts.

The SQE format is as follows:

`addr` Address of futex
`fd` futex2(2) FUTEX2_* flags
`futex_flags` io_uring specific command flags. None valid now.
`addr2` Value of futex
`addr3` Mask to wake/wait

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# f31ecf67 10-Jul-2023 Jens Axboe <axboe@kernel.dk>

io_uring: add IORING_OP_WAITID support

This adds support for an async version of waitid(2), in a fully async
version. If an event isn't immediately available, wait for a callback
to trigger a retry.

The format of the sqe is as follows:

sqe->len The 'which', the idtype being queried/waited for.
sqe->fd The 'pid' (or id) being waited for.
sqe->file_index The 'options' being set.
sqe->addr2 A pointer to siginfo_t, if any, being filled in.

buf_index, add3, and waitid_flags are reserved/unused for now.
waitid_flags will be used for options for this request type. One
interesting use case may be to add multi-shot support, so that the
request stays armed and posts a notification every time a monitored
process state change occurs.

Note that this does not support rusage, on Arnd's recommendation.

See the waitid(2) man page for details on the arguments.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# eb42cebb 12-Jul-2022 Pavel Begunkov <asml.silence@gmail.com>

io_uring: add zc notification infrastructure

Add internal part of send zerocopy notifications. There are two main
structures, the first one is struct io_notif, which carries inside
struct ubuf_info and maps 1:1 to it. io_uring will be binding a number
of zerocopy send requests to it and ask to complete (aka flush) it. When
flushed and all attached requests and skbs complete, it'll generate one
and only one CQE. There are intended to be passed into the network layer
as struct msghdr::msg_ubuf.

The second concept is notification slots. The userspace will be able to
register an array of slots and subsequently addressing them by the index
in the array. Slots are independent of each other. Each slot can have
only one notifier at a time (called active notifier) but many notifiers
during the lifetime. When active, a notifier not going to post any
completion but the userspace can attach requests to it by specifying
the corresponding slot while issueing send zc requests. Eventually, the
userspace will want to "flush" the notifier losing any way to attach
new requests to it, however it can use the next atomatically added
notifier of this slot or of any other slot.

When the network layer is done with all enqueued skbs attached to a
notifier and doesn't need the specified in them user data, the flushed
notifier will post a CQE.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3ecf54c31a85762bf679b0a432c9f43ecf7e61cc.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# d9b57aa3 15-Jun-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move opcode table to opdef.c

We already have the declarations in opdef.h, move the rest into its own
file rather than in the main io_uring.c file.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# f3b44f92 13-Jun-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move read/write related opcodes to its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 73572984 13-Jun-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move rsrc related data, core, and commands

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 3b77495a 13-Jun-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split provided buffers handling into its own file

Move both the opcodes related to it, and the internals code dealing with
it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 7aaff708 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move cancelation into its own file

This also helps cleanup the io_uring.h cancel parts, as we can make
things static in the cancel.c file, mostly.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 329061d3 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move poll handling into its own file

Add a io_poll_issue() rather than export the general task_work locking
and io_issue_sqe(), and put the io_op_defs definition and structure into
a separate header file so that poll can use it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# c9f06aa7 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move io_uring_task (tctx) helpers into its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# a4ad4f74 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move fdinfo helpers to its own file

This also means moving a bit more of the fixed file handling to the
filetable side, which makes sense separately too.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 17437f31 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move SQPOLL related handling into its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 59915143 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move timeout opcodes and handling into its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 36404b09 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move msg_ring into its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# f9ead18c 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split network related opcodes into its own file

While at it, convert the handlers to just use io_eopnotsupp_prep()
if CONFIG_NET isn't set.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# e0da14de 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move statx handling to its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# a9c210ce 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move epoll handler to its own file

Would be nice to sort out Kconfig for this and don't even compile
epoll.c if we don't have epoll configured.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 99f15d8d 25-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move uring_cmd handling to its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# cd40cae2 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split out open/close operations

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 453b329b 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: separate out file table handling code

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# f4c163dd 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split out fadvise/madvise operations

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 0d584727 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split out fs related sync/fallocate functions

This splits out sync_file_range, fsync, and fallocate.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 531113bb 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split out splice related operations

This splits out splice and tee support.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 11aeb714 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: split out filesystem related operations

This splits out renameat, unlinkat, mkdirat, symlinkat, and linkat.

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# e28683bd 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move nop into its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 5e2a18d9 24-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move xattr related opcodes to its own file

Signed-off-by: Jens Axboe <axboe@kernel.dk>


# ed29b0b4 23-May-2022 Jens Axboe <axboe@kernel.dk>

io_uring: move to separate directory

In preparation for splitting io_uring up a bit, move it into its own
top level directory. It didn't really belong in fs/ anyway, as it's
not a file system only API.

This adds io_uring/ and moves the core files in there, and updates the
MAINTAINERS file for the new location.

Signed-off-by: Jens Axboe <axboe@kernel.dk>