History log of /freebsd-current/sys/kern/subr_epoch.c
Revision Date Author Comments
# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 2555f175 31-Jan-2023 Konstantin Belousov <kib@FreeBSD.org>

Move kstack_contains() and GET_STACK_USAGE() to MD machine/stack.h

Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38320


# fa1d803c 20-Jan-2023 Brooks Davis <brooks@FreeBSD.org>

epoch: replace hand coded assertion

The assertion is equivalent to kstack_contains() so use that rather
than spelling it out.

Suggested by: jhb
Reviewed by: jhb
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D38107


# aca2a7fa 07-Mar-2022 Eric van Gyzen <vangyzen@FreeBSD.org>

stack_zero is not needed before stack_save

The man page was recently clarified to commit to this contract.

MFC after: 1 week
Sponsored by: Dell EMC Isilon


# db0ac6de 02-Dec-2021 Cy Schubert <cy@FreeBSD.org>

Revert "wpa: Import wpa_supplicant/hostapd commit 14ab4a816"

This reverts commit 266f97b5e9a7958e365e78288616a459b40d924a, reversing
changes made to a10253cffea84c0c980a36ba6776b00ed96c3e3b.

A mismerge of a merge to catch up to main resulted in files being
committed which should not have been.


# d96fccc5 02-Dec-2021 Gleb Smirnoff <glebius@FreeBSD.org>

epoch: with EPOCH_TRACE add epoch_where_report()
which will report where the epoch was entered and also
mark the tracker, so that exit will also be reported.

Helps to understand epoch entrance/exit scenarios in
complex cases, like network stack. As everything else
under EPOCH_TRACE it is a developer only tool.


# ef0f7ae9 21-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

The old thread priority must be stored as part of the EPOCH(9) tracker.

Else recursive use of EPOCH(9) may cause the wrong priority to be restored.

Bump the __FreeBSD_version due to changing the thread and epoch tracker
structure.

Differential Revision: https://reviews.freebsd.org/D30375
Reviewed by: markj@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
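
One way the recursion problem described above can arise is sketched below in userland C. This is a minimal sketch, not the kernel code: `mock_thread`, `mock_tracker`, and both helper functions are hypothetical names, and the scenario (a waiter lending priority between the outer and inner enter) is an assumption chosen for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's struct thread or epoch tracker. */
struct mock_thread {
	int priority;       /* current priority */
	int saved_priority; /* single per-thread save slot (fragile scheme) */
};

struct mock_tracker {
	int saved_priority; /* one save slot per enter/exit pair */
};

/* Fragile scheme: every enter overwrites the same per-thread slot. */
static int
nested_with_thread_slot(int base, int lent)
{
	struct mock_thread td = { .priority = base, .saved_priority = 0 };

	td.saved_priority = td.priority; /* outer enter saves base */
	td.priority = lent;              /* assumed: a waiter lends priority */
	td.saved_priority = td.priority; /* inner enter overwrites the slot */
	td.priority = td.saved_priority; /* inner exit */
	td.priority = td.saved_priority; /* outer exit restores lent, not base */
	return (td.priority);
}

/* Per-tracker scheme: each nesting level restores what it saved. */
static int
nested_with_tracker_slot(int base, int lent)
{
	struct mock_thread td = { .priority = base, .saved_priority = 0 };
	struct mock_tracker outer, inner;

	outer.saved_priority = td.priority; /* outer enter saves base */
	td.priority = lent;                 /* assumed: a waiter lends priority */
	inner.saved_priority = td.priority; /* inner enter saves lent */
	td.priority = inner.saved_priority; /* inner exit */
	td.priority = outer.saved_priority; /* outer exit restores base */
	return (td.priority);
}
```

With a base priority of 100 and a lent priority of 50, the per-thread slot leaves the thread at 50 after the outer exit, while the per-tracker slot returns it to 100.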


# c82c2006 21-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Accessing the epoch structure should happen after the INIT_CHECK().
Else the epoch pointer may be NULL.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# f3316835 21-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Properly define EPOCH(9) function macro.

No functional change intended.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# cc9bb7a9 21-May-2021 Hans Petter Selasky <hselasky@FreeBSD.org>

Rework for-loop in EPOCH(9) to reduce indentation level.

No functional change intended.

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking


# 7667824a 06-Nov-2020 Kyle Evans <kevans@FreeBSD.org>

epoch: support non-preemptible epochs checking in_epoch()

Previously, non-preemptible epochs could not check; in_epoch() would always
fail, usually because non-preemptible epochs don't imply THREAD_NO_SLEEPING.

For default epochs, it's easy enough to verify that we're in the given
epoch: if we're in a critical section and our record for the given epoch
is active, then we're in it.

This patch also adds some additional INVARIANTS bookkeeping. Notably, we set
and check the recorded thread in epoch_enter/epoch_exit to try and catch
some edge-cases for the caller. It also checks upon freeing that none of the
records had a thread in the epoch, which may make it a little easier to
diagnose some improper use if epoch_free() took place while some other
thread was inside.

This version differs slightly from what was just previously reviewed by the
below-listed, in that in_epoch() will assert that no CPU has this thread
recorded even if it *is* currently in a critical section. This is intended
to catch cases where the caller might have somehow messed up critical
section nesting; we can catch both if they exited the critical section or if
they exited, migrated, then re-entered (on the wrong CPU).

Reviewed by: kib, markj (both previous version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27098


# 826c0793 07-Aug-2020 Hans Petter Selasky <hselasky@FreeBSD.org>

Add full support for dynamic allocation and freeing of epochs.

Make sure to reclaim epoch structures when they are freed to support
dynamic allocation and freeing of epoch structures.

While at it, move the 64 supported epoch control structures to the
static memory domain. This overall simplifies the management and
debugging of system epochs.

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D25960
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 48baf00f 12-Feb-2020 Mateusz Guzik <mjg@FreeBSD.org>

epoch: convert zpcpu_get_cpua(.., curcpu) to zpcpu_get


# 66c6c556 16-Jan-2020 Gleb Smirnoff <glebius@FreeBSD.org>

Change argument order of epoch_call() to more natural, first function,
then its argument.

Reviewed by: imp, cem, jhb


# fedab1b4 13-Jan-2020 Konstantin Belousov <kib@FreeBSD.org>

Code must not unlock a mutex while owning the thread lock.

Reviewed by: hselasky, markj
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23150


# cc79ea3a 18-Dec-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Restore important comment in RCU/EPOCH support in FreeBSD after r355784.

Sponsored by: Mellanox Technologies


# 686bcb5c 15-Dec-2019 Jeff Roberson <jeff@FreeBSD.org>

schedlock 4/4

Don't hold the scheduler lock while doing context switches. Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency. This means that mi_switch() is entered with the thread
locked but returns without. This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
blocked lock in switch.

This change has not been made to 4BSD but in principle it would be
more straightforward.

Discussed with: markj
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22778


# 173c062a 06-Dec-2019 Bjoern A. Zeeb <bz@FreeBSD.org>

Improve EPOCH_TRACE

Two changes to EPOCH_TRACE:
(1) add a sysctl to suppress the backtrace from epoch_trace_report().
Sometimes the log line for the recursion is enough and the
backtrace massively spams the console.
(2) In order to be able to go without the backtrace do not only
print where the previous occurrence happened, but also where
the current one happens. That way we have file:line information
for both and can look at them without the need for getting line
numbers from backtrace and a debugging tool.

Reviewed by: glebius
Sponsored by: Netflix (originally)
Differential Revision: https://reviews.freebsd.org/D22641


# 7993a104 22-Nov-2019 Conrad Meyer <cem@FreeBSD.org>

Add explicit SI_SUB_EPOCH

Add explicit SI_SUB_EPOCH, after SI_SUB_TASKQ and before SI_SUB_SMP
(EARLY_AP_STARTUP). Rename existing "SI_SUB_TASKQ + 1" to SI_SUB_EPOCH.

epoch(9) consumers cannot epoch_alloc() before SI_SUB_EPOCH:SI_ORDER_SECOND,
but likely should allocate before SI_SUB_SMP. Prior to this change,
consumers (well, epoch itself, and net/if.c) just open-coded the
SI_SUB_TASKQ + 1 order to match epoch.c, but this was fragile.

Reviewed by: mmacy
Differential Revision: https://reviews.freebsd.org/D22503


# 5757b59f 29-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Merge td_epochnest with td_no_sleeping.

Epoch itself doesn't rely on the counter and it is provided
merely for sleeping subsystems to check it.

- In functions that sleep use THREAD_CAN_SLEEP() to assert
correctness. With EPOCH_TRACE compiled print epoch info.
- _sleep() was a wrong place to put the assertion for epoch,
right place is sleepq_add(), as there are ways to call the
latter bypassing _sleep().
- Do not increase td_no_sleeping in non-preemptible epochs.
The critical section would trigger all possible safeguards,
no sleeping counter is extraneous.

Reviewed by: kib


# 080e9496 22-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Allow epoch tracker to use the very last byte of the stack. Not sure
this will help to avoid panic in this function, since it will also use
some stack, but makes code more strict.

Submitted by: hselasky


# 77d70e51 21-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Assert that any epoch tracker belongs to the thread stack.

Reviewed by: kib


# 279b9aab 21-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Remove epoch tracker from struct thread. It was an ugly crutch to emulate
locking semantics for if_addr_rlock() and if_maddr_rlock().


# bac06038 15-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

When assertion for a thread not being in an epoch fails also print all
entered epochs. Works with EPOCH_TRACE only.

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D22017


# f6eccf96 13-Oct-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Since EPOCH_TRACE had been moved to opt_global.h, we don't need to waste
extra space in struct thread.


# dd902d01 25-Sep-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Add debugging facility EPOCH_TRACE that checks that epochs entered are
properly nested and warns about recursive entrances. Unlike with locks,
there is nothing fundamentally wrong with such use, the intent of tracer
is to help to review complex epoch-protected code paths, and we mean the
network stack here.

Reviewed by: hselasky
Sponsored by: Netflix
Pull Request: https://reviews.freebsd.org/D21610


# 2fb62b1a 24-Jul-2019 Mark Johnston <markj@FreeBSD.org>

Fix the turnstile_lock() KPI.

turnstile_{lock,unlock}() were added for use in epoch. turnstile_lock()
returned NULL to indicate that the calling thread had lost a race and
the turnstile was no longer associated with the given lock, or the lock
owner. However, reader-writer locks may not have a designated owner,
in which case turnstile_lock() would return NULL and
epoch_block_handler_preempt() would leak spinlocks as a result.

Apply a minimal fix: return the lock owner as a separate return value.

Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21048
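
The ambiguity this fix removes can be sketched in userland C. This is a minimal sketch under stated assumptions: `mock_lock`, `mock_owner`, `lookup_ambiguous()`, and `lookup_with_owner()` are hypothetical names, not the turnstile KPI. It only shows why a single pointer return cannot distinguish "lost the race" from "lock has no designated owner", and how a separate out-parameter resolves that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's turnstile or lock types. */
struct mock_owner {
	int id;
};

struct mock_lock {
	bool still_associated;    /* false models "caller lost the race" */
	struct mock_owner *owner; /* NULL models a lock with no single owner */
};

/*
 * Ambiguous form: NULL means either "lost the race" or "no owner",
 * so a caller cannot tell whether it still holds anything to release.
 */
static struct mock_owner *
lookup_ambiguous(struct mock_lock *lk)
{
	if (!lk->still_associated)
		return (NULL);
	return (lk->owner);
}

/*
 * Unambiguous form: the return value reports success, and the owner
 * (possibly NULL) is passed back through a separate out-parameter.
 */
static bool
lookup_with_owner(struct mock_lock *lk, struct mock_owner **ownerp)
{
	*ownerp = NULL;
	if (!lk->still_associated)
		return (false);
	*ownerp = lk->owner;
	return (true);
}
```

For an ownerless but still-associated lock, `lookup_ambiguous()` returns NULL exactly as it does for a lost race, while `lookup_with_owner()` returns true with a NULL owner, so the caller knows it must still unwind what it acquired.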


# 131b2b76 28-Jun-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Implement API for draining EPOCH(9) callbacks.

The epoch_drain_callbacks() function is used to drain all pending
callbacks which have been invoked by prior epoch_call() function calls
on the same epoch. This function is useful when there are shared
memory structure(s) referred to by the epoch callback(s) which are not
refcounted and are rarely freed. The typical place for calling this
function is right before freeing or invalidating the shared
resource(s) used by the epoch callback(s). This function can sleep and
is not optimized for performance.

Differential Revision: https://reviews.freebsd.org/D20109
MFC after: 1 week
Sponsored by: Mellanox Technologies
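
The usage pattern described above (drain pending callbacks right before invalidating a shared, non-refcounted resource) can be sketched in userland C. This is a minimal sketch, not the epoch(9) implementation: `mock_defer_queue`, `mock_defer_call()`, and `mock_defer_drain()` are hypothetical names, and the mock runs callbacks synchronously instead of waiting for a grace period.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_PENDING 8

typedef void (*mock_callback_t)(void *);

/* Hypothetical fixed-size queue of deferred callbacks. */
struct mock_defer_queue {
	mock_callback_t fn[MOCK_MAX_PENDING];
	void *arg[MOCK_MAX_PENDING];
	size_t count;
};

/* Queue fn(arg) to run later; returns 0 on success, -1 if the queue is full. */
static int
mock_defer_call(struct mock_defer_queue *q, mock_callback_t fn, void *arg)
{
	if (q->count == MOCK_MAX_PENDING)
		return (-1);
	q->fn[q->count] = fn;
	q->arg[q->count] = arg;
	q->count++;
	return (0);
}

/* Run every pending callback in FIFO order; returns how many ran. */
static size_t
mock_defer_drain(struct mock_defer_queue *q)
{
	size_t ran = 0;

	for (size_t i = 0; i < q->count; i++) {
		q->fn[i](q->arg[i]);
		ran++;
	}
	q->count = 0;
	return (ran);
}

/* Example callback: touches shared state that is not refcounted. */
static void
mock_bump(void *arg)
{
	(*(int *)arg)++;
}
```

A caller would queue work with `mock_defer_call(&q, mock_bump, &shared)` and call `mock_defer_drain(&q)` before letting `shared` go out of scope, so that no queued callback can touch it afterwards.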


# f855ec81 12-Feb-2019 Marius Strobl <marius@FreeBSD.org>

Make taskqgroup_attach{,_cpu}(9) work across architectures

So far, intr_{g,s}etaffinity(9) take a single int for identifying
a device interrupt. This approach doesn't work on all architectures
supported, as a single int isn't sufficient to globally specify a
device interrupt. In particular, with multiple interrupt controllers
in one system as found on e.g. arm and arm64 machines, an interrupt
number as returned by rman_get_start(9) may be only unique relative
to the bus and, thus, interrupt controller, a certain device hangs
off from.
In turn, this makes taskqgroup_attach{,_cpu}(9) and - internal to
the gtaskqueue implementation - taskqgroup_attach_deferred{,_cpu}()
not work across architectures. Yet in turn, iflib(4) as gtaskqueue
consumer so far doesn't fit architectures where interrupt numbers
aren't globally unique.
However, at least for intr_setaffinity(..., CPU_WHICH_IRQ, ...) as
employed by the gtaskqueue implementation to bind an interrupt to a
particular CPU, using bus_bind_intr(9) instead is equivalent from
a functional point of view, with bus_bind_intr(9) taking the device
and interrupt resource arguments required for uniquely specifying a
device interrupt.
Thus, change the gtaskqueue implementation to employ bus_bind_intr(9)
instead and intr_{g,s}etaffinity(9) to take the device and interrupt
resource arguments required respectively. This change also moves
struct grouptask from <sys/_task.h> to <sys/gtaskqueue.h> and wraps
struct gtask along with the gtask_fn_t typedef into #ifdef _KERNEL
as userland likes to include <sys/_task.h> or indirectly drags it
in - for better or worse also with _KERNEL defined -, which with
device_t and struct resource dependencies otherwise is no longer
as easily possible now.
The userland inclusion problem probably can be improved a bit by
introducing a _WANT_TASK (as well as a _WANT_MOUNT) akin to the
existing _WANT_PRISON etc., which is orthogonal to this change,
though, and likely needs an exp-run.

While at it:
- Change the gt_cpu member in the grouptask structure to be of type
int as used elsewhere for specifying CPUs (an int16_t may be too
narrow sooner or later),
- move the gtaskqueue_enqueue_fn typedef from <sys/gtaskqueue.h> to
the gtaskqueue implementation as it's only used and needed there,
- change the GTASK_INIT macro to use "gtask" rather than "task" as
argument given that it actually operates on a struct gtask rather
than a struct task, and
- let subr_gtaskqueue.c consistently use __func__ to print function
names.

Reported by: mmel
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D19139


# 91cf4975 13-Nov-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9) revert r340097 - no longer a need for multiple sections per cpu

I spoke with Samy Bahra and recent changes to CK to make ck_epoch_call and
ck_epoch_poll not modify the record have eliminated the need for this.


# 635c1884 13-Nov-2018 Gleb Smirnoff <glebius@FreeBSD.org>

style(9), mostly adjusting overly long lines.


# a760c50c 13-Nov-2018 Gleb Smirnoff <glebius@FreeBSD.org>

With epoch not inlined, there is no point in using _lite KPI. While here,
remove some unnecessary casts.


# 9f360eec 13-Nov-2018 Gleb Smirnoff <glebius@FreeBSD.org>

The dualism between epoch_tracker and epoch_thread is fragile and
unnecessary. So, expose CK types to kernel and use a single normal
structure for epoch_tracker.

Reviewed by: jtl, gallatin


# b79aa45e 13-Nov-2018 Gleb Smirnoff <glebius@FreeBSD.org>

For compatibility KPI functions like if_addr_rlock() that used to have
mutexes but now are converted to epoch(9) use thread-private epoch_tracker.
Embedding a tracker into ifnet(9) or ifnet derived structures creates a
non-reentrant function that will fail miserably if called simultaneously from
two different contexts.
A thread private tracker will provide a single tracker that allows these
functions to be called safely. It doesn't allow nested calls, but this is not
expected from compatibility KPIs.

Reviewed by: markj


# a82296c2 13-Nov-2018 Gleb Smirnoff <glebius@FreeBSD.org>

Uninline epoch(9) entrance and exit. There is no proof that modern
processors would benefit from avoiding a function call, but bloating
code. In fact, clang created an uninlined real function for many
object files in the network stack.

- Move epoch_private.h into subr_epoch.c. Code copied exactly, avoiding
any changes, including style(9).
- Remove private copies of critical_enter/exit.

Reviewed by: kib, jtl
Differential Revision: https://reviews.freebsd.org/D17879


# 10f42d24 02-Nov-2018 Matt Macy <mmacy@FreeBSD.org>

Convert epoch to read / write records per cpu

In discussing D17503 "Run epoch calls sooner and more reliably" with
sbahra@ we came to the conclusion that epoch is currently misusing the
ck_epoch API. It isn't safe to do a "write side" operation (ck_epoch_call
or ck_epoch_poll) in the middle of a "read side" section. Since, by definition,
it's possible to be preempted during the middle of an EPOCH_PREEMPT
epoch the GC task might call ck_epoch_poll or another thread might call
ck_epoch_call on the same section. The right solution is ultimately to change
the way that ck_epoch works for this use case. However, as a stopgap for
12 we agreed to simply have separate records for each use case.

Tested by: pho@

MFC after: 3 days


# 9fec45d8 08-Aug-2018 Matt Macy <mmacy@FreeBSD.org>

epoch_block_wait: don't check TD_RUNNING

struct epoch_thread is not type safe (stack allocated) and thus cannot be dereferenced from another CPU

Reported by: novel@


# 822e50e3 06-Jul-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): simplify initialization

replace manual NUMA aware allocation with a pcpu zone


# 10b8cd7f 04-Jul-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): make nesting assert in epoch_wait_preempt more specific

Reported by: markj


# 6573d758 03-Jul-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): allow preemptible epochs to compose

- Add tracker argument to preemptible epochs
- Inline epoch read path in kernel and tied modules
- Change in_epoch to take an epoch as argument
- Simplify tfb_tcp_do_segment to not take a ti_locked argument,
there's no longer any benefit to dropping the pcbinfo lock
and trying to do so just adds an error prone branchfest to
these functions
- Remove cases of same function recursion on the epoch as
recursing is no longer free.
- Remove the TAILQ_ENTRY and epoch_section from struct
thread as the tracker field is now stack or heap allocated
as appropriate.

Tested by: pho and Limelight Networks
Reviewed by: kbowling at llnw dot com
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16066


# 74333b3d 24-Jun-2018 Matt Macy <mmacy@FreeBSD.org>

fix assert and conditionally allow mutexes to be held across epoch_wait_preempt


# 0bcfb473 23-Jun-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): Don't trigger taskq enqueue before the grouptaskqs are setup

If EARLY_AP_STARTUP is not defined it is possible for an epoch to be
allocated prior to it being possible to call epoch_call without
issue.

Based on patch by andrew@

PR: 229014
Reported by: andrew


# ae25f40b 21-Jun-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): make non-preemptible variant work early boot


# e445381f 29-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): make epoch closer to style(9)


# 13679eba 21-May-2018 Mark Johnston <markj@FreeBSD.org>

Don't pass a section cookie to CK for non-preemptible epoch sections.

They're only useful when multiple threads may share an epoch record,
and that can't happen with non-preemptible sections.

Reviewed by: mmacy
Differential Revision: https://reviews.freebsd.org/D15507


# e339e436 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

subr_epoch.c fix unused variable warnings


# 20ba6811 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): assert that epoch is allocated post-configure


# 70398c2f 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): Make epochs non-preemptible by default

There are risks associated with waiting on a preemptible epoch section.
Change the name to make them not be the default and document the issue
under CAVEATS.

Reported by: markj


# 60b7b90d 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch: actually allocate the counters we've assigned sysctls to

Approved by: sbruno


# 5e68a3df 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch: add non-preemptible "critical" variant

adds:
- epoch_enter_critical() - can be called inside a different epoch,
starts a section that will not acquire any MTX_DEF mutexes or do
anything that might sleep.
- epoch_exit_critical() - corresponding exit call
- epoch_wait_critical() - wait variant that is guaranteed that any
threads in a section are running.
- epoch_global_critical - an epoch_wait_critical safe epoch instance

Requested by: markj
Approved by: sbruno


# a5f10424 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch: skip poll function call in hardclock unless there are callbacks pending

Reported by: mjg
Approved by: sbruno


# c4d901e9 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): schedule pcpu callback task in hardclock if there are callbacks pending

Approved by: sbruno


# 2a45e828 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): eliminate the need to wait when polling for callbacks to run

by using ck's own callback handling mechanism we can simply check which
callbacks have had a grace period elapse

Approved by: sbruno


# d1bcb409 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): fix potential deadlock

Don't acquire a waiting thread's lock while holding our own

Approved by: sbruno


# 766d2253 17-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): restore thread priority on exit if it was changed by a waiter

Reported by: markj
Approved by: sbruno


# fdf71aeb 16-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): make recursion lighter weight

There isn't any real work to do except bump td_epochnest when recursing.
Skip the additional work in this case.

Approved by: sbruno


# b8205686 16-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): Guarantee forward progress on busy sections

Add epoch section to struct thread. We can use this to
enable the epoch counter to advance even if a section is
perpetually occupied by a thread.

Approved by: sbruno


# 0c58f85b 13-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): allow sx locks to be held across epoch_wait()

The INVARIANTS checks in epoch_wait() were intended to
prevent the block handler from returning with locks held.
What it in fact did was prevent anything except Giant
from being held across it. Check that the number of locks
held has not changed instead.

Approved by: sbruno@


# 1f4beb63 13-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): cleanups, additional debug checks, and add global_epoch

- GC the _nopreempt routines
- to really benefit we'd need a separate routine
- they're not currently in use
- they complicate the API for no benefit at this time

- check that we're actually in an epoch section at exit

- handle epoch_call() early in boot

- Fix copyright declaration language

Approved by: sbruno@


# f1401123 12-May-2018 Matt Macy <mmacy@FreeBSD.org>

hwpmc/epoch - don't reference domain if NUMA is not set

It appears that domain information is set correctly independent
of whether or not NUMA is defined. However, there is no memory
backing secondary domains leading to allocation failure.

Reported by: pho@, np@
Approved by: sbruno@


# 8dcbd0ea 11-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): always set inited in epoch_init

- set inited in the !usedomains case

Reported by: jhibbits
Approved by: sbruno


# 4aa302df 11-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): callback task fixes

- initialize the pcpu STAILQ in the NUMA case
- don't enqueue the callback task if there isn't sufficient work to be done

Reported by: pho@
Approved by: sbruno@


# b2cb2896 10-May-2018 Matt Macy <mmacy@FreeBSD.org>

epoch(9): fix priority handling, make callback lists pcpu, and other fixes

- Lend priority to preempted threads in epoch_wait to handle the case
in which we've had priority lent to us. Previously we borrowed the
priority of the lowest priority preempted thread. (pointed out by mjg@)

- Don't attempt to allocate memory per-domain on powerpc, we don't currently
handle empty sockets (as is the case on jhibbits Talos' board).

- Handle deferred callbacks as pcpu lists and poll the lists periodically.
Currently the interval is 1/hz.

- Drop the thread lock when adaptive spinning. Holding the lock starves
other threads and can even lead to lockups.

- Keep a generation count pcpu so that we don't keep spinning if a thread
has left and re-entered an epoch section.

- Actually remove the callback from the callback list so that we don't
double free. Sigh ...

Approved by: sbruno@


# 06bf2a6a 10-May-2018 Matt Macy <mmacy@FreeBSD.org>

Add simple preempt safe epoch API

Read locking is overused in the kernel to guarantee liveness. This API makes
it easy to provide liveness guarantees without atomics.

Includes epoch_test kernel module to stress test the API.

Documentation will follow initial use case.

Test case and improvements to preemption handling in response to discussion
with mjg@

Reviewed by: imp@, shurd@
Approved by: sbruno@