#
aaaa01c0 |
|
05-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp hpts: initialize variable Ensure that tv.tv_sec is zero in all code paths. Reported by: Coverity Scan CID: 1527724 Reviewed by: rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44584
|
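The fix in aaaa01c0 amounts to giving `tv.tv_sec` a defined value before any branch is taken. A minimal user-space sketch of that pattern follows; `next_sleep()`, its parameters and the `MIN_SLEEP_USEC` value are hypothetical and are not taken from tcp_hpts.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical minimum sleep, for illustration only. */
#define MIN_SLEEP_USEC 250

/*
 * Build a sub-second sleep interval.  tv_sec is assigned before the
 * branch, so both code paths return a fully initialized timeval.
 */
struct timeval
next_sleep(bool busy, uint32_t requested_usec)
{
    struct timeval tv;

    assert(requested_usec < 1000000);
    tv.tv_sec = 0;
    if (busy)
        tv.tv_usec = MIN_SLEEP_USEC;
    else
        tv.tv_usec = (suseconds_t)requested_usec;
    return (tv);
}
```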
#
b600644f |
|
01-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp hpts: improve consistency The target_slot argument of max_slots_available() can be NULL. Therefore, check for this in all places. Right now, all callers provide non-NULL pointer. Reported by: Coverity Scan CID: 1527732 Reviewed by: rrs MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44527
|
#
b7b78c1c |
|
28-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
Optimize HPTS so that little work is done until we have a hpts thread that is over the connection threshold HPTS inserts a softclock for system call return that optimizes performance. However, when no HPTS threads need the help (i.e. when they have fewer than 100 or so connections), there should be little work done, i.e. check the counter and return instead of running through all the threads getting locks etc. Reported by: eduardo Reviewed by: gallatin, glebius, tuexen Tested by: gallatin Differential Revision: https://reviews.freebsd.org/D44420
|
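The fast path described in b7b78c1c can be pictured roughly as below. This is a single-threaded user-space sketch, not the kernel code: `struct pacer`, `CONN_THRESHOLD`, `pacers_over_threshold`, `pacer_set_conn_count()` and `softclock_on_syscall_return()` are all hypothetical names, and a real implementation would need atomics or locking that the sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold; the commit message only says "100 or so". */
#define CONN_THRESHOLD 100

struct pacer {
    uint32_t conn_count;    /* connections handled by this pacer */
    uint32_t runs;          /* times the slow path serviced this pacer */
};

/* Number of pacers currently at or above CONN_THRESHOLD. */
static uint32_t pacers_over_threshold;

/* Update a pacer's connection count and keep the global counter in sync. */
void
pacer_set_conn_count(struct pacer *p, uint32_t n)
{
    int was_over, is_over;

    assert(p != NULL);
    was_over = (p->conn_count >= CONN_THRESHOLD);
    is_over = (n >= CONN_THRESHOLD);
    if (is_over && !was_over)
        pacers_over_threshold++;
    else if (!is_over && was_over)
        pacers_over_threshold--;
    p->conn_count = n;
}

/* Called on system call return; returns the number of pacers serviced. */
size_t
softclock_on_syscall_return(struct pacer *pacers, size_t npacers)
{
    size_t i, serviced = 0;

    /* Fast path: one counter check, no walk over the pacers. */
    if (pacers_over_threshold == 0)
        return (0);

    /* Slow path: only now look at each pacer. */
    for (i = 0; i < npacers; i++) {
        if (pacers[i].conn_count >= CONN_THRESHOLD) {
            pacers[i].runs++;
            serviced++;
        }
    }
    return (serviced);
}
```

The point of the design is that the common case, where no pacer is busy, costs a single comparison on every system call return.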
#
638b5ae1 |
|
01-Mar-2024 |
Randall Stewart <rrs@FreeBSD.org> |
HPTS has actually three states, not two, so the macro needs to account for that. OK, let's fix up tcp_in_hpts() so that it also says yes if you are in the race state moving and you are scheduled to be put in. This also requires changing the MPASS to be the old version non inline function of tcp_in_hpts(). This change also adds a new inline macro so that a uint64_t timestamp can be obtained by a transport (aka Rack will use this). Reviewed by: glebius, tuexen Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D44157
|
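The three-state check from 638b5ae1 can be sketched as follows. This is an illustration under assumptions, not the committed code: the `hpts_state` enum, `struct conn`, `conn_in_hpts()` and `tv_to_usec64()` are hypothetical names, and the use of -1 as "scheduled for removal" follows the convention mentioned in the 2e27230f entry further down this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical names for the three states. */
enum hpts_state {
    HPTS_NONE,      /* not on the wheel */
    HPTS_ONQUEUE,   /* sitting in a wheel slot */
    HPTS_MOVING     /* being processed; may be re-inserted or removed */
};

struct conn {
    enum hpts_state state;
    int32_t slot;   /* requested slot while moving; -1 means remove */
};

/* True if the connection is on the wheel or is about to be put back. */
bool
conn_in_hpts(const struct conn *c)
{
    assert(c != NULL);
    return (c->state == HPTS_ONQUEUE ||
        (c->state == HPTS_MOVING && c->slot != -1));
}

/* Collapse a timeval into a single 64-bit microsecond timestamp. */
uint64_t
tv_to_usec64(const struct timeval *tv)
{
    assert(tv != NULL);
    return ((uint64_t)tv->tv_sec * 1000000ULL + (uint64_t)tv->tv_usec);
}
```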
#
ef0ac0a1 |
|
20-Jan-2024 |
Gordon Bergling <gbe@FreeBSD.org> |
tcp_hpts: Fix a typo of a function name in a comment - s/tcp_ouput/tcp_output/ MFC after: 3 days
|
#
08c33cd9 |
|
26-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
hpts: avoid duplicate call to tcp_output() Obtained from: rrs
|
#
48b55a7c |
|
19-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: make the module unloadable Although the HPTS subsystem wasn't initially designed as a loadable module, now it is so. Make it possible to also unload it, but for safety reasons hide that under 'kldunload -f'. Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D43092
|
#
175d4d69 |
|
19-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: use tcp_pace.cts_last_ran for last ran table Remove the global cts_last_ran and use already existing unused field of struct tcp_hptsi, which seems originally planned to hold this table. This makes it consistent with other malloc-ed tables, like main array of HPTS entities and CPU groups. Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D43091
|
#
3f46be6a |
|
07-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: let tcp_hpts_init() set a random CPU only once After d2ef52ef3dee the tcp_hpts_init() function can be called multiple times on a tcpcb if it is switched there and back between two TCP stacks. First, this makes existing assertion in tcp_hpts_init() incorrect. Second, it creates possibility to change a randomly set t_hpts_cpu to a different random value, while a tcpcb is already in the HPTS wheel, triggering other assertions later in tcp_hptsi(). The best approach here would be to work on the stacks to really clear a tcpcb out of HPTS wheel in tfb_tcp_fb_fini, draining the IHPTS_MOVING state. But that's pretty intrusive change, so let's just get back to the old logic (pre d2ef52ef3dee) where t_hpts_cpu was set to a random value only once in a CPU lifetime and a newly switched stack inherits t_hpts_cpu from the previous stack. Reviewed by: rrs, tuexen Differential Revision: https://reviews.freebsd.org/D42946 Reported-by: syzbot+fab29fe1ab089c52998d@syzkaller.appspotmail.com Reported-by: syzbot+ca5f2aa0fda15dcfe6d7@syzkaller.appspotmail.com Fixes: 2b3a77467dd3d74a7170f279fb25f9736b46ef8a
|
#
2c6fc36a |
|
04-Dec-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
hpts/lro: make tcp_lro_flush_tcphpts() and tcp_run_hpts() pointers Rename tcp_run_hpts() to tcp_hpts_softclock() to better describe its function. This makes loadable hpts.ko working correctly with LRO. Reviewed by: tuexen, rrs Differential Revision: https://reviews.freebsd.org/D42858
|
#
6a79e480 |
|
27-Nov-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Fix two latent bugs in hpts. One where a static is put on a local variable, the other an initialization bug where we should be setting tv.tv_sec to 0. PR: 275482
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
c3c20de3 |
|
25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: move HPTS/LRO flags out of inpcb to tcpcb These flags are TCP specific. While here, make also several LRO internal functions to pass tcpcb pointer instead of inpcb one. Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39698
|
#
c2a69e84 |
|
25-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: move HPTS related fields from inpcb to tcpcb This makes inpcb lighter and allows future cache line optimizations of tcpcb. The reason why HPTS originally used inpcb is the compressed TIME-WAIT state (see 0d7445193ab), that used to free a tcpcb, while the associated connection is still on the HPTS ring. Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39697
|
#
01216268 |
|
21-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: hpts needs to still call output even after input. The other stacks, it turns out, actually expect the output to be called and can become stuck if it is not. This is because they run their timer code from there and the input routine does not always assure a timer is running. The real long-term fix here might be to go into the other stacks (rack and bbr) and make sure that a timer is running after input if you don't do output, as well as call the timer functions. This would cut down on calls from hpts. But I think it's too dramatic of a change for the immediate time. Reviewed by: tuexen, glebius Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39738
|
#
2ad584c5 |
|
17-Apr-2023 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: Inconsistent use of hpts_calling flag Gleb has noticed there were some inconsistencies in the way the inp_hpts_calls flag was being used. One such inconsistency results in a bug when we can't allocate enough sendmap entries to entertain a call to rack_output(); basically a timer won't get started like it should. Also in cleaning this up I find that the "no_output" side of input needs to be adjusted to make sure we don't try to re-pace too quickly outside the hpts assurance of 250 useconds. Another thing here is we end up with duplicate calls to tcp_output() which we should not. If packets go from hpts for processing the input side of tcp will call the output side of tcp on the last packet if it is needed. This means that when that occurs a second call to tcp_output would be made that is not needed and if pacing is going on may be harmful. Let's fix all this and explicitly state the contract that hpts is making with transports that care about the flag. Reviewed by: tuexen, glebius Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D39653
|
#
a540cdca |
|
17-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: use queue(9) STAILQ for the input queue Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39574
|
#
35bc0bcc |
|
07-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: reduce argument list to functions that pass a segment The socket argument is superfluous, as a tcpcb always has one and only one socket. Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D39434
|
#
2ff8187e |
|
04-Apr-2023 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: remove dead code tcp_drop_in_pkts() Should have gone in f971e791391.
|
#
69c7c811 |
|
16-Mar-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities. The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints we need to move to using the new inline functions. This adds them and moves rack to now use the tcp_tracepoints. Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D38831
|
#
d68f1542 |
|
11-Jan-2023 |
Gordon Bergling <gbe@FreeBSD.org> |
tcp_hpts: Fix a typo in a source code comment - s/subract/subtract/ MFC after: 3 days
|
#
eaabc937 |
|
14-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: retire TCPDEBUG This subsystem is superseded by modern debugging facilities, e.g. DTrace probes and TCP black box logging. We intentionally leave SO_DEBUG in place, as many utilities may set it on a socket. Also the tcp::debug DTrace probes look at this flag on a socket. Reviewed by: gnn, tuexen Discussed with: rscheff, rrs, jtl Differential revision: https://reviews.freebsd.org/D37694
|
#
e68b3792 |
|
07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: embed inpcb into tcpcb For the TCP protocol inpcb storage specify allocation size that would provide space to most of the data a TCP connection needs, embedding into struct tcpcb several structures, that previously were allocated separately. The most important one is the inpcb itself. With embedding we can provide strong guarantee that with a valid TCP inpcb the tcpcb is always valid and vice versa. Also we reduce number of allocs/frees per connection. The embedded inpcb is placed in the beginning of the struct tcpcb, since in_pcballoc() requires that. However, later we may want to move it around for cache line efficiency, and this can be done with a little effort. The new intotcpcb() macro is ready for such move. The congestion algorithm data, the TCP timers and osd(9) data are also embedded into tcpcb, and temporary struct tcpcb_mem goes away. There was no extra allocation here, but we went through extra pointer every time we accessed this data. One interesting side effect is that now TCP data is allocated from SMR-protected zone. Potentially this allows the TCP stacks or other TCP related modules to utilize that for their own synchronization. Large part of the change was done with sed script: s/tp->ccv->/tp->t_ccv./g s/tp->ccv/\&tp->t_ccv/g s/tp->cc_algo/tp->t_cc/g s/tp->t_timers->tt_/tp->tt_/g s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g Dependency side effect is that code that needs to know struct tcpcb should also know struct inpcb, that added several <netinet/in_pcb.h>. Differential revision: https://reviews.freebsd.org/D37127
|
#
9eb0e832 |
|
08-Nov-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: provide macros to access inpcb and socket from a tcpcb There should be no functional changes with this commit. Reviewed by: rscheff Differential revision: https://reviews.freebsd.org/D37123
|
#
53af6903 |
|
06-Oct-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove INP_TIMEWAIT flag Mechanically cleanup INP_TIMEWAIT from the kernel sources. After 0d7445193ab, this commit shall not cause any functional changes. Note: this flag was very often checked together with INP_DROPPED. If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we will be able to remove most of these checks and turn them to assertions. Some of them can be turned into assertions right now, but that should be carefully done on a case by case basis. Differential revision: https://reviews.freebsd.org/D36400
|
#
d07a5018 |
|
03-Sep-2022 |
Gordon Bergling <gbe@FreeBSD.org> |
tcp_hpts: Correct some typos in source code comments - s/occured/occurred/ - s/the the/the/ MFC after: 3 days
|
#
b33bfe6e |
|
15-Aug-2022 |
Dimitry Andric <dim@FreeBSD.org> |
Fix unused variable warnings in tcp_hpts.c With clang 15, the following -Werror warning is produced: sys/netinet/tcp_hpts.c:1114:10: error: variable 'paced_cnt' set but not used [-Werror,-Wunused-but-set-variable] int32_t paced_cnt = 0; ^ sys/netinet/tcp_hpts.c:1112:11: error: variable 'total_slots_processed' set but not used [-Werror,-Wunused-but-set-variable] uint64_t total_slots_processed = 0; ^ The 'paced_cnt' variable was in tcp_hpts.c when it was first added, and the 'total_slots_processed' variable was added in d7955cc0ffdf9, but both appear to have been debugging aids that have never been used, so remove them. MFC after: 3 days
|
#
db6b3286 |
|
15-Aug-2022 |
Dimitry Andric <dim@FreeBSD.org> |
Adjust function definition in tcp_hpts.c to avoid clang 15 warning With clang 15, the following -Werror warning is produced: sys/netinet/tcp_hpts.c:1594:23: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] tcp_choose_hpts_to_run() ^ void This is because tcp_choose_hpts_to_run() is declared with a (void) argument list, but defined with an empty argument list. Make the definition match the declaration. MFC after: 3 days
|
#
6e6439b2 |
|
14-Apr-2022 |
Randall Stewart <rrs@FreeBSD.org> |
tcp - hpts timing is off when we are above 1200 connections. HPTS timing begins to go off when we reach the threshold of connections (1200 by default) where any returning syscall or LRO stops finding the oldest hpts thread that has not run and instead uses the CPU it is on. This ends up causing quite a lot of times where hpts threads may not run for extended periods of time. On top of all that, which causes heartburn if you are pacing in tcp, you also have the fact that AMD's podded L3 cache may have sets of 8 CPUs that share an L3; hpts is unaware of this and thus on AMD you can generate a lot of cache misses. So to fix this we will get rid of the CPU mode, and always use oldest. But also make HPTS aware of the CPU topology and keep the "oldest" to be within the same L3 cache. This also works nicely for NUMA, coupled with Drew's earlier NUMA changes. Reviewed by: glebius, gallatin, tuexen Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D34916
|
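The "oldest within the same L3 cache" selection from 6e6439b2 could look roughly like this. It is a minimal sketch, assuming each pacer thread records when it last ran and which CPU group it belongs to; `struct pacer`, its fields and `choose_oldest_in_group()` are hypothetical and do not reproduce the kernel's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pacer {
    int cpu_group;      /* hypothetical id of the L3 group of this pacer */
    uint32_t last_ran;  /* microsecond tick when the pacer last ran */
};

/*
 * Return the index of the pacer in my_group that has gone longest
 * without running, or npacers if the group has no pacer at all.
 */
size_t
choose_oldest_in_group(const struct pacer *p, size_t npacers, int my_group,
    uint32_t now)
{
    size_t i, best = npacers;
    uint32_t age, best_age = 0;

    assert(p != NULL || npacers == 0);
    for (i = 0; i < npacers; i++) {
        if (p[i].cpu_group != my_group)
            continue;
        /* Unsigned subtraction stays correct across tick wraparound. */
        age = now - p[i].last_ran;
        if (best == npacers || age > best_age) {
            best = i;
            best_age = age;
        }
    }
    return (best);
}
```

Restricting the search to one group keeps the chosen pacer's data in the cache shared with the calling CPU, which is the cache-miss concern the commit message raises.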
#
1f2aaef2 |
|
09-Apr-2022 |
Gordon Bergling <gbe@FreeBSD.org> |
tcp_htps: Fix a typo in a source code comment - s/postion/position/ MFC after: 3 days
|
#
47ded797 |
|
07-Feb-2022 |
Franco Fichtner <franco@opnsense.org> |
netinet: simplify RSS ifdef statements Approved by: transport (rrs) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D31583
|
#
aac52f94 |
|
18-Jan-2022 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: Warning cleanup from new compiler. The clang compiler recently got an update that generates warnings of unused variables where they were set, and then never used. This revision goes through the tcp stack and cleans all of those up. Reviewed by: Michael Tuexen, Gleb Smirnoff Sponsored by: Netflix Inc. Differential Revision:
|
#
a370832b |
|
26-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove delayed drop KPI No longer needed after tcp_output() can ask caller to drop. Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D33371
|
#
f64dc2ab |
|
26-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: TCP output method can request tcp_drop The advanced TCP stacks (bbr, rack) may decide to drop a TCP connection when they do output on it. The default stack never does this, thus existing framework expects tcp_output() always to return locked and valid tcpcb. Provide KPI extension to satisfy demands of advanced stacks. If the output method returns negative error code, it means that caller must call tcp_drop(). In tcp_var() provide three inline methods to call tcp_output(): - tcp_output() is a drop-in replacement for the default stack, so that default stack can continue using it internally without modifications. For advanced stacks it would perform tcp_drop() and unlock and report that with negative error code. - tcp_output_unlock() handles the negative code and always converts it to positive and always unlocks. - tcp_output_nodrop() just calls the method and leaves the responsibility to drop on the caller. Sweep over the advanced stacks and use new KPI instead of using HPTS delayed drop queue for that. Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D33370
|
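The calling contract described in f64dc2ab can be modelled in user space as below. This is only a sketch of the three wrappers' behaviour: `struct conn`, `conn_drop()`, `conn_unlock()` and the `conn_output*()` names are hypothetical stand-ins for the tcpcb, tcp_drop(), the inpcb lock and the tcp_output*() wrappers, and the real tcp_drop() does considerably more than set a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct conn {
    int (*output)(struct conn *);   /* stack-specific output method */
    bool locked;
    bool dropped;
};

void conn_unlock(struct conn *c) { c->locked = false; }
void conn_drop(struct conn *c) { c->dropped = true; }

/* Just call the method; a negative result means the caller must drop. */
int
conn_output_nodrop(struct conn *c)
{
    assert(c != NULL && c->locked);
    return (c->output(c));
}

/* Drop-in variant: on a negative result, drop, unlock, report it as is. */
int
conn_output(struct conn *c)
{
    int rv = conn_output_nodrop(c);

    if (rv < 0) {
        conn_drop(c);
        conn_unlock(c);
    }
    return (rv);
}

/* Always unlocks and always returns a non-negative error code. */
int
conn_output_unlock(struct conn *c)
{
    int rv = conn_output_nodrop(c);

    if (rv < 0) {
        conn_drop(c);
        rv = -rv;
    }
    conn_unlock(c);
    return (rv);
}

/* Two toy output methods: one succeeds, one asks for a drop. */
int output_ok(struct conn *c) { (void)c; return (0); }
int output_wants_drop(struct conn *c) { (void)c; return (-ECONNRESET); }
```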
#
40fa3e40 |
|
26-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: mechanically substitute call to tfb_tcp_output to new method. Made with sed(1) execution: sed -Ef sed -i "" $(grep --exclude tcp_var.h -lr tcp_output sys/) sed: s/tp->t_fb->tfb_tcp_output\(tp\)/tcp_output(tp)/ s/to tfb_tcp_output\(\)/to tcp_output()/ Reviewed by: rrs, tuexen Differential revision: https://reviews.freebsd.org/D33366
|
#
db0ac6de |
|
02-Dec-2021 |
Cy Schubert <cy@FreeBSD.org> |
Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816" This reverts commit 266f97b5e9a7958e365e78288616a459b40d924a, reversing changes made to a10253cffea84c0c980a36ba6776b00ed96c3e3b. A mismerge of a merge to catch up to main resulted in files being committed which should not have been.
|
#
2e27230f |
|
02-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: rewrite inpcb synchronization Just trust the pcb database, that if we did in_pcbref(), no way an inpcb can go away. And if we never put a dropped inpcb on our queue, and tcp_discardcb() always removes an inpcb to be dropped from the queue, then any inpcb on the queue is valid. Now, to solve LOR between inpcb lock and HPTS queue lock do the following trick. When we are about to process a certain time slot, take the full queue of the head list into on stack list, drop the HPTS lock and work on our queue. This of course opens a race when an inpcb is being removed from the on stack queue, which was already mentioned in comments. To address this race introduce generation count into queues. If we want to remove an inpcb with generation count mismatch, we can't do that, we can only mark it with desired new time slot or -1 for remove. Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33026
|
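The generation-count trick from 2e27230f can be illustrated with a simplified single-threaded model. Everything here is an assumption made for the sketch: `struct slot`, `struct entry`, the field names and the three functions are hypothetical, locking is omitted, and a singly linked list stands in for the kernel's queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct entry {
    struct entry *next;
    uint32_t gencnt;    /* generation of the slot at insert time */
    int32_t want_slot;  /* deferred request: new slot, or -1 to remove */
    bool deferred;      /* set when a request could only be recorded */
};

struct slot {
    struct entry *head;
    uint32_t gencnt;    /* bumped each time the list is taken for work */
};

/* Put an entry on a slot and stamp it with the current generation. */
void
slot_insert(struct slot *s, struct entry *e)
{
    assert(s != NULL && e != NULL);
    e->gencnt = s->gencnt;
    e->deferred = false;
    e->next = s->head;
    s->head = e;
}

/* Take the whole list for processing and start a new generation. */
struct entry *
slot_detach_all(struct slot *s)
{
    struct entry *head;

    assert(s != NULL);
    head = s->head;
    s->head = NULL;
    s->gencnt++;
    return (head);
}

/*
 * Try to remove an entry.  With a generation mismatch the entry is on
 * someone's detached list, so only record the wish and return false.
 */
bool
slot_remove(struct slot *s, struct entry *e)
{
    struct entry **pp;

    assert(s != NULL && e != NULL);
    if (e->gencnt != s->gencnt) {
        e->want_slot = -1;
        e->deferred = true;
        return (false);
    }
    for (pp = &s->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            return (true);
        }
    }
    return (false);     /* not on this slot */
}
```

The worker that owns the detached list is then responsible for honouring `want_slot` once it reaches the entry.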
#
f971e791 |
|
02-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: rename input queue to drop queue and trim dead code The HPTS input queue is in reality used only for "delayed drops". When a TCP stack decides to drop a connection on the output path it can't do that due to locking protocol between main tcp_output() and stacks. So, rack/bbr utilize HPTS to drop the connection in a different context. In the past the queue could also process input packets in context of HPTS thread, but now no stack uses this, so remove this functionality. Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33025
|
#
b0a7c008 |
|
02-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: make struct tcp_hpts_entry private to the module. Also, make some of the functions also private to the module. Remove unused functions discovered after that. Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33024
|
#
50f081ec |
|
02-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp_hpts: provide tcp_in_hpts(). It will hide some internal HPTS knowledge from the consumers. Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33023
|
#
de2d4784 |
|
02-Dec-2021 |
Gleb Smirnoff <glebius@FreeBSD.org> |
SMR protection for inpcbs With introduction of epoch(9) synchronization to network stack the inpcb database became protected by the network epoch together with static network data (interfaces, addresses, etc). However, inpcb aren't static in nature, they are created and destroyed all the time, which creates some traffic on the epoch(9) garbage collector. Fairly new feature of uma(9) - Safe Memory Reclamation allows to safely free memory in page-sized batches, with virtually zero overhead compared to uma_zfree(). However, unlike epoch(9), it puts stricter requirement on the access to the protected memory, needing the critical(9) section to access it. Details: - The database is already built on CK lists, thanks to epoch(9). - For write access nothing is changed. - For a lookup in the database SMR section is now required. Once the desired inpcb is found we need to transition from SMR section to r/w lock on the inpcb itself, with a check that inpcb isn't yet freed. This requires some complexity, since SMR section itself is a critical(9) section. The complexity is hidden from KPI users in inp_smr_lock(). - For an inpcb list traversal (a pcblist sysctl, or broadcast notification) also a new KPI is provided, that hides internals of the database - inp_next(struct inp_iterator *). Reviewed by: rrs Differential revision: https://reviews.freebsd.org/D33022
|
#
7312e4e5 |
|
08-Jul-2021 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: Fix 32 bit platform breakage This fixes the incorrect use of a sysctl add to u64. It was for a useconds time, but on 32 bit platforms it's not a u64. Instead use the long directive. Reviewed by: tuexen Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D31107
|
#
d7955cc0 |
|
06-Jul-2021 |
Randall Stewart <rrs@FreeBSD.org> |
tcp: HPTS performance enhancements HPTS drives both rack and bbr, and yet there have been many complaints about performance. This bit of work restructures hpts to help reduce CPU overhead. It does this by no longer relying on the timer/callout to drive it; instead it uses user return from a system call as well as LRO flushes to drive hpts. The timer becomes a backstop that dynamically adjusts based on how "late" we are. Reviewed by: tuexen, glebius Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D31083
|
#
662c1305 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
net: clean up empty lines in .c and .h files
|
#
4e1a3ff8 |
|
03-Mar-2020 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
tcp_hpts: make RSS kernel compile again. Add proper #includes, and #ifdefs and some style fixes to make RSS kernels compile again. There are still possible issues with uint16_t vs. uint_t cpuid which I am not going near. Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D23726
|
#
7029da5c |
|
26-Feb-2020 |
Pawel Biernacki <kaktus@FreeBSD.org> |
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718
|
#
df341f59 |
|
12-Feb-2020 |
Randall Stewart <rrs@FreeBSD.org> |
Whitespace, remove from three files trailing white space (leftover presents from emacs). Sponsored by: Netflix Inc.
|
#
43e8b279 |
|
07-Nov-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
In TCP HPTS enter the epoch in tcp_hpts_thread() and assert it in the leaf functions.
|
#
3b0b41e6 |
|
10-Jul-2019 |
Randall Stewart <rrs@FreeBSD.org> |
This commit updates rack to what is basically being used at NF as well as sets in some of the groundwork for committing BBR. The hpts system is updated as well as some other needed utilities for the entrance of BBR. This is actually part 1 of 3 more needed commits which will finally complete with BBRv1 being added as a new tcp stack. Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D20834
|
#
4e255d74 |
|
10-May-2019 |
Andrew Gallatin <gallatin@FreeBSD.org> |
Bind TCP HPTS (pacer) threads to NUMA domains Bind the TCP pacer threads to NUMA domains and build per-domain pacer-thread lookup tables. These tables allow us to use the inpcb's NUMA domain information to match an inpcb with a pacer thread on the same domain. The motivation for this is to keep the TCP connection local to a NUMA domain as much as possible. Thanks to jhb for pre-reviewing an earlier version of the patch. Reviewed by: rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20134
|
#
52467047 |
|
04-Feb-2019 |
Warner Losh <imp@FreeBSD.org> |
Regularize the Netflix copyright Use recent best practices for Copyright form at the top of the license: 1. Remove all the All Rights Reserved clauses on our stuff. Where we piggybacked others, use a separate line to make things clear. 2. Use "Netflix, Inc." everywhere. 3. Use a single line for the copyright for grep friendliness. 4. Use date ranges in all places for our stuff. Approved by: Netflix Legal (who gave me the form), adrian@ (pmc files)
|
#
384a5c3c |
|
01-Oct-2018 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Add INP_INFO_WUNLOCK_ASSERT() macro and use it instead of INP_INFO_UNLOCK_ASSERT() in TCP-related code. For encapsulated traffic it is possible, that the code is running in net_epoch_preempt section, and INP_INFO_UNLOCK_ASSERT() is very strict assertion for such case. PR: 231428 Reviewed by: mmacy, tuexen Approved by: re (kib) Differential Revision: https://reviews.freebsd.org/D17335
|
#
6d2b0c01 |
|
06-Sep-2018 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Make tcp_hpts.c compile a LINT kernel with options RSS and PCBGROUPS added by adding the missing include files and changing the type of cpuid which would otherwise cause a false comparison with NETISR_CPUID_NONE. Reviewed by: rrs Approved by: re (marius) Differential Revision: https://reviews.freebsd.org/D16891
|
#
16bbf600 |
|
10-Aug-2018 |
Andrey V. Elsukov <ae@FreeBSD.org> |
Remove unneeded ipsec-related includes. Reviewed by: rrs Differential Revision: https://reviews.freebsd.org/D16637
|
#
6573d758 |
|
03-Jul-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch(9): allow preemptible epochs to compose - Add tracker argument to preemptible epochs - Inline epoch read path in kernel and tied modules - Change in_epoch to take an epoch as argument - Simplify tfb_tcp_do_segment to not take a ti_locked argument, there's no longer any benefit to dropping the pcbinfo lock and trying to do so just adds an error prone branchfest to these functions - Remove cases of same function recursion on the epoch as recursing is no longer free. - Remove the TAILQ_ENTRY and epoch_section from struct thread as the tracker field is now stack or heap allocated as appropriate. Tested by: pho and Limelight Networks Reviewed by: kbowling at llnw dot com Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16066
|
#
f923a734 |
|
18-Jun-2018 |
Randall Stewart <rrs@FreeBSD.org> |
Move the tp set back to where it was before we started playing with the VNET sets. This way we have verified the INP settings before we go to the trouble of de-referencing it. Reviewed by: and suggested by lstewart Sponsored by: Netflix Inc.
|
#
9e58ff6f |
|
18-Jun-2018 |
Matt Macy <mmacy@FreeBSD.org> |
convert inpcbinfo hash and info rwlocks to epoch + mutex - Convert inpcbinfo info & hash locks to epoch for read and mutex for write - Garbage collect code that handled INP_INFO_TRY_RLOCK failures as INP_INFO_RLOCK which can no longer fail When running 64 netperfs sending minimal sized packets on a 2x8x2 reduces unhalted core cycles samples in rwlock rlock/runlock in udp_send from 51% to 3%. Overall packet throughput rate limited by CPU affinity and NIC driver design choices. On the receiver unhalted core cycles samples in in_pcblookup_hash went from 13% to 1.6% Tested by LLNW and pho@ Reviewed by: jtl Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15686
|
#
f994ead3 |
|
18-Jun-2018 |
Randall Stewart <rrs@FreeBSD.org> |
Move to using the inp->vnet pointer as suggested by lstewart. This is far better since the hpts system is using the inp as its basis anyway. Unfortunately his comments came late. Sponsored by: Netflix Inc.
|
#
9293873e |
|
14-Jun-2018 |
Gleb Smirnoff <glebius@FreeBSD.org> |
TCPOUTFLAGS no longer exists since r334843.
|
#
c9b4ac75 |
|
12-Jun-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This fixes missing VNET sets in the hpts system. Basically, running vnets with a TCP stack that uses some of the features is a recipe for panic without this commit. Reported by: Larry Rosenman Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15757
|
#
cff21e48 |
|
11-Jun-2018 |
Jonathan T. Looney <jtl@FreeBSD.org> |
Change RACK dependency on TCPHPTS from a build-time dependency to a load-time dependency. At present, RACK requires the TCPHPTS option to run. However, because modules can be moved from machine to machine, this dependency is really best assessed at load time rather than at build time. Reviewed by: rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D15756
|
#
afbd6cfa |
|
08-Jun-2018 |
Matt Macy <mmacy@FreeBSD.org> |
hpts: remove redundant decl breaking gcc build
|
#
603bbd06 |
|
09-May-2018 |
Warner Losh <imp@FreeBSD.org> |
Minor style nits Use full copyright year. Remove 'All Rights Reserved' from new file (rights holder OK'd) Minor #ifdef motion and #endif tagging Remove __FBSDID macro from comments Sponsored by: Netflix OK'd by: rrs@
|
#
3ee9c3c4 |
|
19-Apr-2018 |
Randall Stewart <rrs@FreeBSD.org> |
This commit brings in the TCP high precision timer system (tcp_hpts). It is the forerunner/foundational work of bringing in both Rack and BBR which use hpts for pacing out packets. The feature is optional and requires the TCPHPTS option to be enabled before the feature will be active. TCP modules that use it must assure that the base component is compiled into the kernel in which they are loaded. MFC after: Never Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D15020
|