History log of /freebsd-current/sys/netinet/tcp_hpts.h
Revision Date Author Comments
# 638b5ae1 01-Mar-2024 Randall Stewart <rrs@FreeBSD.org>

HTPS has actually three states not two so the macro needs to account for that.

Ok lets fix up the tcp_in_hpts() so that it also says yes if you
are in the race state moving and you are scheduled to be put in.
This also requires changing the MPASS to be the old version non
inline function of tcp_in_hpts().

This change also adds a new inline macro so that a uint64_t timestamp can be
obtained by a transport (aka Rack will use this).

Reviewed by: glebius, tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D44157


# 48b55a7c 19-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: make the module unloadable

Although the HPTS subsytem wasn't initially designed as a loadable
module, now it is so. Make it possible to also unload it, but for
safety reasons hide that under 'kldunload -f'.

Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D43092


# e3cbc572 04-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

kern/subr_trap.c: repair the HPTS performance hack in userret()

It wasn't functional as subr_trap.c doesn't include opt_inet.h. Put a
better comment provided by gallatin@ in place of the old one. The idea
is to use userret() as a cheap place to call a soft clock. This approach
saves CPU on busy machines and saves power on idle machines.
An alternative would be to constantly schedule callouts. Running with
neither callouts nor the soft clock ruins HPTS precision.

Reviewed by: tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D42860


# 2c6fc36a 04-Dec-2023 Gleb Smirnoff <glebius@FreeBSD.org>

hpts/lro: make tcp_lro_flush_tcphpts() and tcp_run_hpts() pointers

Rename tcp_run_hpts() to tcp_hpts_softlock() to better describe its
function. This makes loadable hpts.ko working correctly with LRO.

Reviewed by: tuexen, rrs
Differential Revision: https://reviews.freebsd.org/D42858


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 6eb2dbfa 14-Jun-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: add missing static keywords

Without them compilation with -O0 would produce kernel modules
that depend on symbol that doesn't exist.


# f2e75b96 25-Apr-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: add missing "inline"

Fixes: c2a69e846fffb95271c0299e0a81e2033382e9c2


# c2a69e84 25-Apr-2023 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: move HPTS related fields from inpcb to tcpcb

This makes inpcb lighter and allows future cache line optimizations
of tcpcb. The reason why HPTS originally used inpcb is the compressed
TIME-WAIT state (see 0d7445193ab), that used to free a tcpcb, while the
associated connection is still on the HPTS ring.

Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D39697


# c67eb393 05-Apr-2023 Mateusz Guzik <mjg@FreeBSD.org>

tcp_hpts: plug a compiler warn

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 73ee5756 31-Mar-2023 Randall Stewart <rrs@FreeBSD.org>

Fixes in the tcp infrastructure with respect to stack changes as well as other infrastructure updates for incoming rack features.

So stack switching as always been a bit of a issue. We currently use a break before make setup which means that
if something goes wrong you have to try to get back to a stack. This patch among a lot of other things changes that so
that it is a make before break. We also expand some of the function blocks in prep for new features in rack that will allow
more controlled pacing. We also add other abilities such as the pathway for a stack to query a previous stack to acquire from
it critical state information so things in flight don't get dropped or mis-handled when switching stacks. We also add the
concept of a timer granularity. This allows an alternate stack to change from the old ticks granularity to microseconds and
of course this even gives us a pathway to go to nanosecond timekeeping if we need to (something for the data center to consider
for sure).

Once all this lands I will then update rack to begin using all these new features.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39210


# d07a5018 03-Sep-2022 Gordon Bergling <gbe@FreeBSD.org>

tcp_hpts: Correct some typos in source code comments

- s/occured/occurred/
- s/the the/the/

MFC after: 3 days


# 6e6439b2 14-Apr-2022 Randall Stewart <rrs@FreeBSD.org>

tcp - hpts timing is off when we are above 1200 connections.

HPTS timing begins to go off when we reach the threshold of connections (1200 by default)
where we have any returning syscall or LRO stop finding the oldest hpts thread that
has not run but instead using the CPU it is on. This ends up causing quite a lot of times
where hpts threads may not run for extended periods of time. On top of all that which
causes heartburn if you are pacing in tcp, you also have the fact that where AMD's
podded L3 cache may have sets of 8 CPU's that share a L3, hpts is unaware of this
and thus on amd you can generate a lot of cache misses.

So to fix this we will get rid of the CPU mode, and always use oldest. But also make
HPTS aware of the CPU topology and keep the "oldest" to be within the same L3 cache.
This also works nicely for NUMA as well couple with Drew's earlier NUMA changes.

Reviewed by: glebius, gallatin, tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D34916


# a370832b 26-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp: remove delayed drop KPI

No longer needed after tcp_output() can ask caller to drop.

Reviewed by: rrs, tuexen
Differential revision: https://reviews.freebsd.org/D33371


# db0ac6de 02-Dec-2021 Cy Schubert <cy@FreeBSD.org>

Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"

This reverts commit 266f97b5e9a7958e365e78288616a459b40d924a, reversing
changes made to a10253cffea84c0c980a36ba6776b00ed96c3e3b.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.


# 2e27230f 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: rewrite inpcb synchronization

Just trust the pcb database, that if we did in_pcbref(), no way
an inpcb can go away. And if we never put a dropped inpcb on
our queue, and tcp_discardcb() always removes an inpcb to be
dropped from the queue, then any inpcb on the queue is valid.

Now, to solve LOR between inpcb lock and HPTS queue lock do the
following trick. When we are about to process a certain time
slot, take the full queue of the head list into on stack list,
drop the HPTS lock and work on our queue. This of course opens
a race when an inpcb is being removed from the on stack queue,
which was already mentioned in comments. To address this race
introduce generation count into queues. If we want to remove
an inpcb with generation count mismatch, we can't do that, we
can only mark it with desired new time slot or -1 for remove.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33026


# f971e791 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: rename input queue to drop queue and trim dead code

The HPTS input queue is in reality used only for "delayed drops".
When a TCP stack decides to drop a connection on the output path
it can't do that due to locking protocol between main tcp_output()
and stacks. So, rack/bbr utilize HPTS to drop the connection in
a different context.

In the past the queue could also process input packets in context
of HPTS thread, but now no stack uses this, so remove this
functionality.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33025


# b0a7c008 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: make struct tcp_hpts_entry private to the module.

Also, make some of the functions also private to the module. Remove
unused functions discovered after that.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33024


# 50f081ec 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

tcp_hpts: provide tcp_in_hpts().

It will hide some internal HPTS knowledge from the consumers.

Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D33023


# d7955cc0 06-Jul-2021 Randall Stewart <rrs@FreeBSD.org>

tcp: HPTS performance enhancements

HPTS drives both rack and bbr, and yet there have been many complaints
about performance. This bit of work restructures hpts to help reduce CPU
overhead. It does this by now instead of relying on the timer/callout to
drive it instead use user return from a system call as well as lro flushes
to drive hpts. The timer becomes a backstop that dynamically adjusts
based on how "late" we are.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31083


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# df341f59 12-Feb-2020 Randall Stewart <rrs@FreeBSD.org>

Whitespace, remove from three files trailing white
space (leftover presents from emacs).

Sponsored by: Netflix Inc.


# 3b0b41e6 10-Jul-2019 Randall Stewart <rrs@FreeBSD.org>

This commit updates rack to what is basically being used at NF as
well as sets in some of the groundwork for committing BBR. The
hpts system is updated as well as some other needed utilities
for the entrance of BBR. This is actually part 1 of 3 more
needed commits which will finally complete with BBRv1 being
added as a new tcp stack.

Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D20834


# 52467047 04-Feb-2019 Warner Losh <imp@FreeBSD.org>

Regularize the Netflix copyright

Use recent best practices for Copyright form at the top of
the license:
1. Remove all the All Rights Reserved clauses on our stuff. Where we
piggybacked others, use a separate line to make things clear.
2. Use "Netflix, Inc." everywhere.
3. Use a single line for the copyright for grep friendliness.
4. Use date ranges in all places for our stuff.

Approved by: Netflix Legal (who gave me the form), adrian@ (pmc files)


# 6573d758 03-Jul-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): allow preemptible epochs to compose

- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument,
there's no longer any benefit to dropping the pcbinfo lock
and trying to do so just adds an error prone branchfest to
these functions
- Remove cases of same function recursion on the epoch as
recursing is no longer free.
- Remove the the TAILQ_ENTRY and epoch_section from struct
thread as the tracker field is now stack or heap allocated
as appropriate.

Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066


# 33123867 09-May-2018 Warner Losh <imp@FreeBSD.org>

Use the full year, for real this time.


# 603bbd06 09-May-2018 Warner Losh <imp@FreeBSD.org>

Minor style nits

Use full copyright year.
Remove 'All Rights Reserved' from new file (rights holder OK'd)
Minor #ifdef motion and #endif tagging
Remove __FBSDID macro from comments

Sponsored by: Netflix
OK'd by: rrs@


# 3ee9c3c4 19-Apr-2018 Randall Stewart <rrs@FreeBSD.org>

This commit brings in the TCP high precision timer system (tcp_hpts).
It is the forerunner/foundational work of bringing in both Rack and BBR
which use hpts for pacing out packets. The feature is optional and requires
the TCPHPTS option to be enabled before the feature will be active. TCP
modules that use it must assure that the base component is compile in
the kernel in which they are loaded.

MFC after: Never
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D15020