History log of /freebsd-current/sys/kern/kern_lock.c
Revision Date Author Comments
# b92cd6b2 21-May-2024 Ryan Libby <rlibby@FreeBSD.org>

lockmgr: make lockmgr_disowned public and use it

Reviewed by: mckusick, kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45248


# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 9a7f7c26 22-Feb-2023 Mitchell Horne <mhorne@FreeBSD.org>

lockmgr: upgrade panic return checks

We short-circuit lockmgr functions in the face of a kernel panic. Other
lock implementations do this with a SCHEDULER_STOPPED() check, which
covers the additional case where the debugger is active but the system
has not panicked. Update this code to match that behaviour.

Reviewed by: mjg, kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38655


# f902e4bb 11-Sep-2021 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: fix lock profiling in the face of adaptive spinning


# 6a467cc5 23-May-2021 Mateusz Guzik <mjg@FreeBSD.org>

lockprof: pass lock type as an argument instead of reading the spin flag


# eac22dd4 14-Feb-2021 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: shrink struct lock by 8 bytes on LP64

Currently the struct has a 4 byte padding stemming from 3 ints.

1. prio comfortably fits in short, unfortunately there is no dedicated
type for it and plumbing it throughout the codebase is not worth it
right now, instead an assert is added which covers also flags for
safety
2. lk_exslpfail can in principle exceed u_short, but the count is
already not considered reliable and it only ever gets modified
straight to 0. In other words it can be incrementing with an upper
bound of USHRT_MAX

With these in place struct lock shrinks from 48 to 40 bytes.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D28680
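
The padding argument above can be illustrated with a minimal sketch. The
structures below are hypothetical stand-ins, not the real struct lock: the
member names, the helper functions and the exact sizes are assumptions chosen
only to show how narrowing two of three ints removes tail padding on LP64,
with an assert guarding the narrowed priority and a counter that saturates at
USHRT_MAX.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: two 8-byte members followed by three ints. */
struct toy_lock_before {
	void		*obj;		/* 8 bytes on LP64 */
	uintptr_t	 word;		/* 8 bytes on LP64 */
	int		 timo;
	int		 pri;
	int		 exslpfail;	/* 12 bytes of ints, then 4 bytes of padding */
};

/* Hypothetical "after" layout: pri and exslpfail narrowed to 16 bits each. */
struct toy_lock_after {
	void		*obj;
	uintptr_t	 word;
	int		 timo;
	short		 pri;
	unsigned short	 exslpfail;	/* 8 bytes total, no tail padding on LP64 */
};

/* Store a priority, asserting that it still fits the narrower type. */
static void
toy_set_pri(struct toy_lock_after *lk, int pri)
{
	assert(pri >= SHRT_MIN && pri <= SHRT_MAX);
	lk->pri = (short)pri;
}

/* Increment the counter, saturating at USHRT_MAX instead of wrapping. */
static void
toy_exslpfail_inc(struct toy_lock_after *lk)
{
	if (lk->exslpfail < USHRT_MAX)
		lk->exslpfail++;
}
```

On an LP64 target the first structure occupies 32 bytes and the second 24, so
the toy layout saves the same 8 bytes the commit describes for the real one.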


# 38baca17 06-Jan-2021 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: fix upgrade

TRYUPGRADE requests kept failing when they should not have due to wrong
macro used to count readers.

Fixes: f6b091fbbd77cbb0 ("lockmgr: rewrite upgrade to stop always dropping the lock")
Noted by: asomers
Differential Revision: https://reviews.freebsd.org/D27947


# f90d57b8 23-Nov-2020 Mateusz Guzik <mjg@FreeBSD.org>

locks: push lock_delay_arg_init calls down

Minor cleanup to skip doing them when recursing on locks and so that
they can act on found lock value if need be.


# 6fed89b1 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

kern: clean up empty lines in .c and .h files


# 13869889 24-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: add missing 'continue' to account for spuriously failed fcmpset

PR: 248245
Reported by: gbe
Noted by: markj
Fixes: r363415 ("lockmgr: add adaptive spinning")
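
Why a missing 'continue' matters can be sketched in userland C. The example
below is not the lockmgr code: it uses C11 atomic_compare_exchange_weak, which
(like fcmpset) may fail spuriously and refreshes the expected value on
failure. The lock encoding and the function name are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNLOCKED	((uintptr_t)0)
#define TOY_LOCKED	((uintptr_t)1)

/*
 * Try to take a toy exclusive lock. Returns false only when the lock word
 * was actually observed in the held state.
 */
static bool
toy_try_xlock(_Atomic uintptr_t *word)
{
	uintptr_t x = atomic_load_explicit(word, memory_order_relaxed);

	for (;;) {
		if (x == TOY_UNLOCKED) {
			if (atomic_compare_exchange_weak_explicit(word, &x,
			    TOY_LOCKED, memory_order_acquire,
			    memory_order_relaxed))
				return (true);
			/*
			 * The weak CAS failed and refreshed x. The failure may
			 * be spurious, with x still TOY_UNLOCKED, so the value
			 * must be re-evaluated. Without this 'continue' the
			 * code would fall through to the "held" path below.
			 */
			continue;
		}
		return (false);	/* the lock was observed held */
	}
}
```

The 'continue' sends control back to the check on the refreshed value, so a
spurious failure leads to a retry rather than to the contended path.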


# 31ad4050 21-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: add adaptive spinning

It is very conservative: it spins only when LK_ADAPTIVE is passed, only on an
exclusive lock, and never when any waiters are present. The buffer cache remains
non-spinning.

This reduces total sleep times during buildworld etc., but it does not shorten
total real time (culprits are contention in the vm subsystem along with slock +
upgrade which is not covered).

For microbenchmarks: open3_processes -t 52 (open/close of the same file for
writing) ops/s:
before: 258845
after: 801638

Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D25753


# 4aff9f5d 21-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: denote recursion with a bit in lock value

This reduces excessive reads from the lock.

Tested by: pho


# f6b091fb 21-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: rewrite upgrade to stop always dropping the lock

This matches rw and sx locks.


# bdb6d824 21-Jul-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: add a helper for reading the lock value


# 9a79b990 09-Apr-2020 Kirk McKusick <mckusick@FreeBSD.org>

When running with a kernel compiled with DEBUG_LOCKS, before
panicking for recursing on a non-recursive lock, print out the
kernel stack where the lock was originally acquired.


# c1b57fa7 14-Feb-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: rename lock_fast_path to lock_flags

The routine is not much of a fast path and the flags name better describes
its purpose.


# 943c4932 14-Feb-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: retire the unused lockmgr_unlock_fast_path routine


# c00115f1 24-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: don't touch the lock past unlock

This evens it up with other locking primitives.

Note lock profiling still touches the lock, which again is in line with the
rest.

Reviewed by: jeff
Differential Revision: https://reviews.freebsd.org/D23343


# 879e0604 11-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

Add KERNEL_PANICKED macro for use in place of direct panicstr tests


# fea73412 24-Dec-2019 Conrad Meyer <cem@FreeBSD.org>

sleep(9), sleepqueue(9): const'ify wchan pointers

_sleep(9), wakeup(9), sleepqueue(9), et al do not dereference or modify the
channel pointers provided in any way; they are merely used as intptrs into a
dictionary structure to match waiters with wakers. Correctly annotate this
such that _sleep() and wakeup() may be used on const pointers without
invoking ugly patterns like __DECONST(). Plumb const through all of the
underlying sleepqueue bits.

No functional change.

Reviewed by: rlibby
Discussed with: kib, markj
Differential Revision: https://reviews.freebsd.org/D22914


# c8b29d12 11-Dec-2019 Mateusz Guzik <mjg@FreeBSD.org>

vfs: locking primitives which elide ->v_vnlock and shared locking disablement

Both of these features are not needed by many consumers and result in avoidable
reads which in turn puts them on profiles due to cache-line ping ponging.

On top of that the current lockmgr entry point is slower than necessary
single-threaded. As an attempted clean up preparing for other changes,
provide new routines which don't support any of the aforementioned features.

With these patches in place vop_stdlock and vop_stdunlock disappear from
flamegraphs during -j 104 buildkernel.

Reviewed by: jeff (previous version)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22665


# 5fe188b1 30-Nov-2019 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: remove more remnants of adaptive spinning

Sponsored by: The FreeBSD Foundation


# 5b699f16 21-Aug-2019 Mark Johnston <markj@FreeBSD.org>

Add lockmgr(9) probes to the lockstat DTrace provider.

They follow the conventions set by rw and sx lock probes. There is
an additional lockstat:::lockmgr-disown probe.

Update lockstat(1) to report on contention and hold events for
lockmgr locks. Document the new probes in dtrace_lockstat.4, and
deduplicate some of the existing probe descriptions.

Reviewed by: mjg
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21355


# 46713135 14-Jan-2019 Gleb Smirnoff <glebius@FreeBSD.org>

Add flag LK_NEW for lockinit() that is converted to LO_NEW and passed
down to lock_init(). This allows for lockinit() on a not prezeroed
memory.


# 6e8c1ccb 06-Dec-2018 Mateusz Guzik <mjg@FreeBSD.org>

Annotate Giant drop/pickup macros with __predict_false

They are used in important places of the kernel with the lock not being held
the majority of the time.

Sponsored by: The FreeBSD Foundation


# 95ab076d 13-Jul-2018 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: tidy up slock/sunlock similar to other locks


# 02fe8a24 18-May-2018 Matt Macy <mmacy@FreeBSD.org>

remove unused locked variable in lockmgr_unlock_fast_path


# 10391db5 18-May-2018 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: avoid atomic on unlock in the slow path

The code is pretty much guaranteed not to be able to unlock.

This is a minor nit. The code still performs way too many reads.
The altered exclusive-locked condition is supposed to be always
true as well, to be cleaned up at a later date.


# b543c98c 24-Apr-2018 Conrad Meyer <cem@FreeBSD.org>

lockmgr: Add missed neutering during panic

r313683 introduced new lockmgr APIs that missed the panic-time neutering
present in the rest of our locks. Correct that by adding the usual check.

Additionally, move the __lockmgr_args neutering above the assertions at the
top of the function. Drop the interlock unlock because we shouldn't have
an unneutered interlock either. No point trying to unlock it.

PR: 227749
Reported by: jtl
Sponsored by: Dell EMC Isilon


# 83fc34ea 20-Mar-2018 Gleb Smirnoff <glebius@FreeBSD.org>

At this point iwmesg isn't initialized yet, so print the pointer to the lock
rather than panic before panicking.


# 0ad122a9 04-Mar-2018 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: save on sleepq when cmpset fails


# 93d41967 04-Mar-2018 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: whack unused lockmgr_note_exclusive_upgrade


# 1c6987eb 04-Mar-2018 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: start decomposing the main routine

The main routine takes 8 args, 3 of which are almost the same for most uses.
This in particular pushes it above the limit of 6 arguments passable through
registers on amd64 making it impossible to tail call.

This is a prerequisite for further cleanups.

Tested by: pho


# 8a36da99 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/kern: adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.


# 18f23540 17-Nov-2017 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: remove the ADAPTIVE_LOCKMGRS option

The code was never enabled and is very heavy weight.

A revamped adaptive spinning may show up at a later time.

Discussed with: kib


# c4a48867 12-Feb-2017 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: implement fast path

The main lockmgr routine takes 8 arguments which makes it impossible to
tail-call it by the intermediate vop_stdlock/unlock routines.

The routine itself starts with an if-forest and reads from the lock itself
several times.

This slows things down both single- and multi-threaded. With the patch
single-threaded fstats go 4% up and multithreaded up to ~27%.

Note that there is still a lot of room for improvement.

Reviewed by: kib
Tested by: pho


# fc4f686d 01-Jun-2016 Mateusz Guzik <mjg@FreeBSD.org>

Microoptimize locking primitives by avoiding unnecessary atomic ops.

Inline versions of primitives do an atomic op and, if it fails, they fall back to
the actual primitives, which immediately retry the atomic op.

The obvious optimisation is to check if the lock is free and only then proceed
to do an atomic op.

Reviewed by: jhb, vangyzen
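
The optimisation described above is the classic test-before-atomic pattern. A
minimal userland sketch follows, using C11 atomics rather than the kernel's
atomic(9) routines. The lock encoding (0 means free, otherwise an owner id)
and the function names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TTS_FREE	((uintptr_t)0)

/*
 * Try to acquire the toy lock for owner 'tid' (must be nonzero).
 * A plain load rejects a held lock without issuing a read-modify-write,
 * so contended callers do not bounce the cache line in exclusive mode.
 */
static bool
tts_try_lock(_Atomic uintptr_t *word, uintptr_t tid)
{
	uintptr_t expected = TTS_FREE;

	if (atomic_load_explicit(word, memory_order_relaxed) != TTS_FREE)
		return (false);
	return (atomic_compare_exchange_strong_explicit(word, &expected, tid,
	    memory_order_acquire, memory_order_relaxed));
}

/* Release the toy lock; pairs with the acquire in tts_try_lock(). */
static void
tts_unlock(_Atomic uintptr_t *word)
{
	atomic_store_explicit(word, TTS_FREE, memory_order_release);
}
```

The atomic op is still required for correctness. The preceding load only
filters out attempts that are known to fail.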


# e3043798 29-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/kern: spelling fixes in comments.

No functional change.


# ce1c953e 01-Aug-2015 Mark Johnston <markj@FreeBSD.org>

Don't modify curthread->td_locks unless INVARIANTS is enabled.

This field is only used in a KASSERT that verifies that no locks are held
when returning to user mode. Moreover, the td_locks accounting is only
correct when LOCK_DEBUG > 0, which is implied by INVARIANTS.

Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D3205


# a115fb62 22-Jan-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Revert for r277213:

FreeBSD developers need more time to review patches in the surrounding
areas like the TCP stack which are using MPSAFE callouts to restore
distribution of callouts on multiple CPUs.

Bump the __FreeBSD_version instead of reverting it.

Suggested by: kmacy, adrian, glebius and kib
Differential Revision: https://reviews.freebsd.org/D1438


# 1a26c3c0 15-Jan-2015 Hans Petter Selasky <hselasky@FreeBSD.org>

Major callout subsystem cleanup and rewrite:
- Close a migration race where callout_reset() failed to set the
CALLOUT_ACTIVE flag.
- Callout callback functions are now allowed to be protected by
spinlocks.
- Switching the callout CPU number cannot always be done on a
per-callout basis. See the updated timeout(9) manual page for more
information.
- The timeout(9) manual page has been updated to reflect how all the
functions inside the callout API are working. The manual page has
been made function oriented to make it easier to deduce how each of
the functions making up the callout API are working without having
to first read the whole manual page. Group all functions into a
handful of sections which should give a quick top-level overview
when the different functions should be used.
- The CALLOUT_SHAREDLOCK flag and its functionality has been removed
to reduce the complexity in the callout code and to avoid problems
about atomically stopping callouts via callout_stop(). If someone
needs it, it can be re-added. From my quick grep there are no
CALLOUT_SHAREDLOCK clients in the kernel.
- A new callout API function named "callout_drain_async()" has been
added. See the updated timeout(9) manual page for a complete
description.
- Update the callout clients in the "kern/" folder to use the callout
API properly, like cv_timedwait(). Previously there was some custom
sleepqueue code in the callout subsystem, which has been removed,
because we now allow callouts to be protected by spinlocks. This
allows us to tear down the callout like done with regular mutexes,
and a "td_slpmutex" has been added to "struct thread" to atomically
teardown the "td_slpcallout". Further the "TDF_TIMOFAIL" and
"SWT_SLEEPQTIMO" states can now be completely removed. Currently
they are marked as available and will be cleaned up in a follow up
commit.
- Bump the __FreeBSD_version to indicate kernel modules need
recompilation.
- There have been several reports that this patch "seems to squash a
serious bug leading to a callout timeout and panic".

Kernel build testing: all architectures were built
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D1438
Sponsored by: Mellanox Technologies
Reviewed by: jhb, adrian, sbruno and emaste


# e64b4fa8 13-Nov-2014 Konstantin Belousov <kib@FreeBSD.org>

Do not try to dereference thread pointer when the value is not a pointer.

Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 2cba8dd3 04-Nov-2014 John Baldwin <jhb@FreeBSD.org>

Add a new thread state "spinning" to schedgraph and add tracepoints at the
start and stop of spinning waits in lock primitives.


# cc246667 02-Nov-2014 Konstantin Belousov <kib@FreeBSD.org>

Followup to r273966. Fix the build with ADAPTIVE_LOCKMGRS kernel option.

Note that the option is currently not used in any in-tree kernel
configs, including LINTs.

Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 72ba3c08 02-Nov-2014 Konstantin Belousov <kib@FreeBSD.org>

Fix two issues with lockmgr(9) LK_CAN_SHARE() test, which determines
whether the shared request for already shared-locked lock could be
granted. Both problems result in the exclusive locker starvation.

The concurrent exclusive request is indicated by either
LK_EXCLUSIVE_WAITERS or LK_EXCLUSIVE_SPINNERS flags. The reverse
condition, i.e. no exclusive waiters, must check that both flags are
cleared.

Add a flag LK_NODDLKTREAT for shared lock request to indicate that
current thread guarantees that it does not own the lock in shared
mode. This turns back the exclusive lock starvation avoidance code;
see man page update for detailed description.

Use LK_NODDLKTREAT when doing lookup(9).

Reported and tested by: pho
No objections from: attilio
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
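
The flag test described above can be sketched as follows. The flag values and
function names are hypothetical and do not reproduce the real LK_* encoding.
The looser predicate is shown only for contrast and is not claimed to be the
code that was replaced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the two "exclusive" indicators. */
#define TOY_EXCL_WAITERS	0x1u
#define TOY_EXCL_SPINNERS	0x2u
#define TOY_EXCL_ANY		(TOY_EXCL_WAITERS | TOY_EXCL_SPINNERS)

/* "No exclusive request pending": both bits must be clear. */
static bool
toy_no_excl_request(unsigned int x)
{
	return ((x & TOY_EXCL_ANY) == 0);
}

/*
 * Contrast case: true whenever at least one bit is clear. A lock word with
 * only TOY_EXCL_SPINNERS set passes this test, so shared requests would keep
 * being granted ahead of the spinning exclusive locker.
 */
static bool
toy_no_excl_request_loose(unsigned int x)
{
	return ((x & TOY_EXCL_ANY) != TOY_EXCL_ANY);
}
```

With only the spinners bit set, the strict test reports a pending exclusive
request while the loose one does not. That difference is the starvation case
the commit message describes.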


# 575e02d9 29-Aug-2014 Konstantin Belousov <kib@FreeBSD.org>

Add function and wrapper to switch lockmgr and vnode lock back to
auto-promotion of shared to exclusive.

Tested by: hrs, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 54366c0b 25-Nov-2013 Attilio Rao <attilio@FreeBSD.org>

- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
calls to LOCKSTAT_PROFILE_RELEASE_LOCK() directly in the inlined
version of the releasing functions for mutex, rwlock and sxlock.
Failing to do so skips the lockstat_probe_func invocation for
unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
kernel compiled without lock debugging options, potentially every
consumer must be compiled including opt_kdtrace.h.
Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
dependency on opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested. As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while. Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by: EMC / Isilon storage division
Discussed with: rstone
[0] Reported by: rstone
[1] Discussed with: philip


# 7c6fe803 29-Sep-2013 Konstantin Belousov <kib@FreeBSD.org>

Add LK_TRYUPGRADE operation for lockmgr(9), which attempts to
atomically upgrade shared lock to exclusive. On failure, error is
returned and lock is not dropped in the process.

Tested by: pho (previous version)
No objections from: attilio
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (glebius)
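
The semantics described above (upgrade atomically, or fail without dropping
the shared hold) can be modelled with a single compare-and-swap. This is a toy
model, not the lockmgr implementation: the lock word encoding, the macro and
function names, and the use of EBUSY as the failure code are assumptions for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical encoding: bit 0 marks shared mode, bits 4 and up count sharers. */
#define TOY_SHARE_FLAG		((uintptr_t)0x1)
#define TOY_ONE_SHARER		((uintptr_t)0x10)
#define TOY_SHARED(n)		(TOY_SHARE_FLAG | ((uintptr_t)(n) * TOY_ONE_SHARER))

/*
 * Try to turn the caller's shared hold into an exclusive one owned by 'tid'.
 * 'tid' must have bit 0 clear so it cannot be mistaken for a shared word.
 * Succeeds only if the caller is the sole sharer. On failure the lock word
 * is left untouched, so the caller still holds the lock in shared mode.
 */
static int
toy_try_upgrade(_Atomic uintptr_t *word, uintptr_t tid)
{
	uintptr_t expected = TOY_SHARED(1);

	if (atomic_compare_exchange_strong_explicit(word, &expected, tid,
	    memory_order_acquire, memory_order_relaxed))
		return (0);
	return (EBUSY);
}
```

Because the reader count is part of the compared value, a miscounted number
of readers makes the operation fail even when the caller is the only sharer.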


# 7faf4d90 20-Sep-2013 Davide Italiano <davide@FreeBSD.org>

Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With
current lock classes KPI it was really difficult because there was no
way to pass an rmtracker object to the lock/unlock routines. In order
to accomplish the task, modify the aforementioned functions so that
they can return (or pass as argument) a uintptr_t, which is in the rm
case used to hold a pointer to struct rm_priotracker for current
thread. As an added bonus, this fixes rm_sleep() in the rm shared
case, which right now can communicate priotracker structure between
lc_unlock()/lc_lock().

Suggested by: jhb
Reviewed by: jhb
Approved by: re (delphij)


# b5fb43e5 25-Jun-2013 John Baldwin <jhb@FreeBSD.org>

A few mostly cosmetic nits to aid in debugging:
- Call lock_init() first before setting any lock_object fields in
lock init routines. This way if the machine panics due to a duplicate
init the lock's original state is preserved.
- Somewhat similarly, don't decrement td_locks and td_slocks until after
an unlock operation has completed successfully.


# 24150d37 03-Jun-2013 John Baldwin <jhb@FreeBSD.org>

- Fix a couple of inverted panic messages for shared/exclusive mismatches
of a lock within a single thread.
- Fix handling of interlocks in WITNESS by properly requiring the interlock
to be held exactly once if it is specified.


# e63091ea 09-May-2013 Marcel Moolenaar <marcel@FreeBSD.org>

Add option WITNESS_NO_VNODE to suppress printing LORs between VNODE
locks. To support this, VNODE locks are created with the LK_IS_VNODE
flag. This flag is propagated down using the LO_IS_VNODE flag.

Note that WITNESS still records the LOR. Only the printing and the
optional entering into the kernel debugger is bypassed with the
WITNESS_NO_VNODE option.


# 43287e27 06-Jan-2013 Mateusz Guzik <mjg@FreeBSD.org>

lockmgr: unlock interlock (if requested) when dealing with upgrade/downgrade
requests for LK_NOSHARE locks, just like for shared locks.

PR: kern/174969
Reviewed by: attilio
MFC after: 1 week


# cd2fe4e6 22-Dec-2012 Attilio Rao <attilio@FreeBSD.org>

Fixup r240424: On entering KDB backends, the hijacked thread to run
interrupt context can still be idlethread. At that point, without the
panic condition, it can still happen that idlethread then will try to
acquire some locks to carry on some operations.

Skip the idlethread check on block/sleep lock operations when KDB is
active.

Reported by: jh
Tested by: jh
MFC after: 1 week


# 1c7d98d0 05-Dec-2012 Attilio Rao <attilio@FreeBSD.org>

Check for lockmgr recursion in case of disown and downgrade and panic
also in !debugging kernel rather than having "undefined" behaviour.

Tested by: avg
MFC after: 1 week


# e3ae0dfe 12-Sep-2012 Attilio Rao <attilio@FreeBSD.org>

Improve check coverage about idle threads.

Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicking at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in sleepqueues is left because they are not allowed to use
tsleep() either which could happen still.

Reviewed by: bde, jhb, kib
MFC after: 1 week


# f5f9340b 28-Mar-2012 Fabien Thomas <fabient@FreeBSD.org>

Add software PMC support.

New kernel events can be added at various locations for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).

Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.

Sponsored by: NETASQ
MFC after: 1 month


# 2573ea5f 05-Mar-2012 Ivan Voras <ivoras@FreeBSD.org>

Print out process name and thread id in the debugging message.
This is useful because the message can end up in system logs in
non-debugging operation.

Reviewed by: attilio (earlier version)


# 35370593 11-Dec-2011 Andriy Gapon <avg@FreeBSD.org>

panic: add a switch and infrastructure for stopping other CPUs in SMP case

Historical behavior of letting other CPUs merrily go on is the default for
the time being. The new behavior can be switched on via
kern.stop_scheduler_on_panic tunable and sysctl.

Stopping of the CPUs has (at least) the following benefits:
- more of the system state at panic time is preserved intact
- threads and interrupts do not interfere with dumping of the system
state

Only one thread runs uninterrupted after panic if stop_scheduler_on_panic
is set. That thread might call code that is also used in normal context
and that code might use locks to prevent concurrent execution of certain
parts. Those locks might be held by the stopped threads and would never
be released. To work around this issue, it was decided that instead of
explicit checks for panic context, we would rather put those checks
inside the locking primitives.

This change has substantial portions written and re-written by attilio
and kib at various times. Other changes are heavily based on the ideas
and patches submitted by jhb and mdf. bde has provided many insights
into the details and history of the current code.

The new behavior may cause problems for systems that use a USB keyboard
for interfacing with system console. This is because of some unusual
locking patterns in the ukbd code which have to be used because on one
hand ukbd is below syscons, but on the other hand it has to interface
with other usb code that uses regular mutexes/Giant for its concurrency
protection. Dumping to USB-connected disks may also be affected.

PR: amd64/139614 (at least)
In cooperation with: attilio, jhb, kib, mdf
Discussed with: arch@, bde
Tested by: Eugene Grosbein <eugen@grosbein.net>,
gnn,
Steven Hartland <killing@multiplay.co.uk>,
glebius,
Andrew Boyer <aboyer@averesystems.com>
(various versions of the patch)
MFC after: 3 months (or never)


# d576deed 16-Nov-2011 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Constify arguments for locking KPIs where possible.

This enables locking consumers to pass their own structures around as const and
be able to assert locks embedded into those structures.

Reviewed by: ed, kib, jhb


# 6472ac3d 07-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# d0a724c5 01-Aug-2011 Konstantin Belousov <kib@FreeBSD.org>

Fix the LK_NOSHARE lockmgr flag interaction with LK_UPGRADE and
LK_DOWNGRADE lock ops. Namely, the ops should be NOP since LK_NOSHARE
locks are always exclusive.

Reported by: rmacklem
Reviewed by: attilio
Tested by: pho
Approved by: re (kensmith)
MFC after: 1 week


# de5b1952 25-Feb-2011 Alexander Leidinger <netchild@FreeBSD.org>

Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availability if
needed.

Sponsored by: Google Summer of Code 2010
Submitted by: kibab
Reviewed by: arch@ (parts by rwatson, trasz, jhb)
X-MFC after: to be determined in last commit with code from this project


# 58ccf5b4 11-Jan-2011 John Baldwin <jhb@FreeBSD.org>

Remove unneeded includes of <sys/linker_set.h>. Other headers that use
it internally contain nested includes.

Reviewed by: bde


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 3634d5b2 20-Aug-2010 John Baldwin <jhb@FreeBSD.org>

Add dedicated routines to toggle lockmgr flags such as LK_NOSHARE and
LK_CANRECURSE after a lock is created. Use them to implement macros that
otherwise manipulated the flags directly. Assert that the associated
lockmgr lock is exclusively locked by the current thread when manipulating
these flags to ensure the flag updates are safe. This last change required
some minor shuffling in a few filesystems to exclusively lock a brand new
vnode slightly earlier.

Reviewed by: kib
MFC after: 3 days


# 702748e9 18-Jan-2010 Attilio Rao <attilio@FreeBSD.org>

MFC r200447,201703,201709-201710:
In current code, threads performing an interruptible sleep
will leave the waiters flag on forcing the owner to do a wakeup even
when the waiter queue is empty.
That operation may lead to a deadlock in the case of doing a fake wakeup
on the "preferred" queue while the other queue has real waiters on it,
because nobody is going to wake up the 2nd queue waiters and they will
sleep indefinitely.
A similar bug is present for lockmgr in the case the waiters are
sleeping with LK_SLEEPFAIL on.

Add a sleepqueue interface which does report the actual number of waiters
on a specified queue of a waitchannel and track if at least one sleepfail
waiter is present or not. In presence of this or empty "preferred" queue,
wakeup both waiters queues.

Discussed with: kib
Tested by: Pete French <petefrench at ticketswitch dot com>,
Justin Head <justin at encarnate dot com>


# aab9c8c2 06-Jan-2010 Attilio Rao <attilio@FreeBSD.org>

Fix typos.


# c636ba83 06-Jan-2010 Attilio Rao <attilio@FreeBSD.org>

Tweak comments.


# 9dbf7a62 06-Jan-2010 Attilio Rao <attilio@FreeBSD.org>

Exclusive waiters sleeping with LK_SLEEPFAIL on and using interruptible
sleeps/timeout may have left spurious lk_exslpfail counts on, so clean
it up even when accessing a shared queue acquisition, giving to
lk_exslpfail the value of 'upper limit'.
In the worst case scenario, in fact (mixed
interruptible sleep / LK_SLEEPFAIL waiters) what may happen is that both
queues are awakened even if that's not necessary, but still no harm.

Reported by: Lucius Windschuh <lwindschuh at googlemail dot com>
Reviewed by: kib
Tested by: pho, Lucius Windschuh <lwindschuh at googlemail dot com>


# 2028867d 12-Dec-2009 Attilio Rao <attilio@FreeBSD.org>

In current code, threads performing an interruptible sleep (on both
sxlock, via the sx_{s, x}lock_sig() interface, or plain lockmgr), will
leave the waiters flag on forcing the owner to do a wakeup even if
the waiter queue is empty.
That operation may lead to a deadlock in the case of doing a fake wakeup
on the "preferred" (based on the wakeup algorithm) queue while the other
queue has real waiters on it, because nobody is going to wake up the 2nd
queue waiters and they will sleep indefinitely.

A similar bug is present for lockmgr in the case the waiters are
sleeping with LK_SLEEPFAIL on. In this case, even if the waiters queue
is not empty, the waiters won't progress after being awakened but they will
just fail, still not taking care of the 2nd queue waiters (as instead the
lock owner doing the wakeup would expect).

In order to fix this bug in a cheap way (without adding too much locking
and complicating too much the semantic) add a sleepqueue interface which
does report the actual number of waiters on a specified queue of a
waitchannel (sleepq_sleepcnt()) and use it in order to determine if the
exclusive waiters (or shared waiters) are actually present on the lockmgr
(or sx) before giving them precedence in the wakeup algorithm.
This fix alone, however, doesn't solve the LK_SLEEPFAIL bug. In order to
cope with it, add the tracking of how many exclusive LK_SLEEPFAIL waiters
a lockmgr has and if all the waiters on the exclusive waiters queue are
LK_SLEEPFAIL just wake both queues.

The sleepq_sleepcnt() introduction and ABI breakage require
__FreeBSD_version bumping.

Reported by: avg, kib, pho
Reviewed by: kib
Tested by: pho


# fbf9555a 22-Nov-2009 Attilio Rao <attilio@FreeBSD.org>

MFC r199008:
Track lockmgr_disown() in the stack.


# 337c5ff4 06-Nov-2009 Attilio Rao <attilio@FreeBSD.org>

Save the stack when doing a lockmgr_disown() call.

Requested by: kib
MFC: 3 days


# 3f4609ac 12-Oct-2009 Attilio Rao <attilio@FreeBSD.org>

MFC r197643, r197735:
When releasing a read/shared lock we need to use a write memory barrier
in order to avoid, on architectures which don't have strongly ordered
writes, CPU instruction reordering.

Approved by: re (kib)


# 7f9f80ce 03-Oct-2009 Attilio Rao <attilio@FreeBSD.org>

When releasing a lockmgr held in shared mode we need to use a write memory
barrier in order to avoid, on architectures which don't have strongly
ordered writes, CPU instruction reordering.

Diagnosed by: fabio


# c90c9ddd 09-Sep-2009 Attilio Rao <attilio@FreeBSD.org>

Adaptive spinning for locking primitives, in read-mode, has some tuning
SYSCTLs which are inappropriate for daily use of the machine (mostly
useful only to a developer who wants to run benchmarks on it).
Remove them before the release as we do not want to ship with
them in.

Now that the SYSCTLs are gone, instead of using static storage for some
constants, use real numeric constants in order to avoid eventual compiler
dumbness and the risk of sharing storage (and then a cache-line) among
CPUs when doing adaptive spinning together.

Please note that the sys/linker_set.h inclusion in lockmgr and sx lock
support could have been removed, but re@ preferred to keep it in order to
minimize the risk of problems on future merging.

Please note that this patch is not an MFC, but an 'edge case' as a commit
directly to stable/8, which creates a divergence from HEAD.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Approved by: re (kib)


# db0c92ce 09-Sep-2009 Attilio Rao <attilio@FreeBSD.org>

MFC r196772:
Fix adaptive spinning in lockmgr by correctly using GIANT_RESTORE and the
continue statement, and improve adaptive spinning for sx lock by calling
GIANT_SAVE just once.

Approved by: re (kib)


# 67784314 08-Sep-2009 Poul-Henning Kamp <phk@FreeBSD.org>

Revert previous commit and add myself to the list of people who should
know better than to commit with a cat in the area.


# b34421bf 08-Sep-2009 Poul-Henning Kamp <phk@FreeBSD.org>

Add necessary include.


# 8d3635c4 02-Sep-2009 Attilio Rao <attilio@FreeBSD.org>

Fix some bugs related to adaptive spinning:

In the lockmgr support:
- GIANT_RESTORE() is only called when the sleep finishes, so the current
code can end up in a Giant unlock problem. Fix it by calling
GIANT_RESTORE() appropriately when needed. Note that this is not exactly
ideal, because for every iteration of the adaptive spinning we drop and
restore Giant, but the overhead should not be a factor.
- In the lock held in exclusive mode case, after the adaptive spinning is
brought to completion, we should just retry to acquire the lock
instead of falling through. Fix that.
- Fix a style nit

In the sx support:
- Call GIANT_SAVE() before looping. This saves some overhead because
in the current code GIANT_SAVE() is called several times.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 7e2d0af9 17-Aug-2009 Attilio Rao <attilio@FreeBSD.org>

MFC r196334:

* Change the scope of the ASSERT_ATOMIC_LOAD() from a generic check to
a pointer-fetching specific operation check. Consequently, rename the
operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by directly checking
alignment on the word boundary, for all the given specific
architectures. That's a bit too strict for some common cases, but it
ensures safety.
* Add a comment explaining the scope of the macro
* Add a new stub in the lockmgr specific implementation

Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)


# 353998ac 17-Aug-2009 Attilio Rao <attilio@FreeBSD.org>

* Change the scope of the ASSERT_ATOMIC_LOAD() from a generic check to
a pointer-fetching specific operation check. Consequently, rename the
operation ASSERT_ATOMIC_LOAD_PTR().
* Fix the implementation of ASSERT_ATOMIC_LOAD_PTR() by directly checking
alignment on the word boundary, for all the given specific
architectures. That's a bit too strict for some common cases, but it
ensures safety.
* Add a comment explaining the scope of the macro
* Add a new stub in the lockmgr specific implementation

Tested by: marcel (initial version), marius
Reviewed by: rwatson, jhb (comment specific review)
Approved by: re (kib)


# 651175c9 16-Jun-2009 Attilio Rao <attilio@FreeBSD.org>

Introduce support for adaptive spinning in lockmgr.
As it has received little tuning so far, the support is disabled by
default, but it can be enabled with the option ADAPTIVE_LOCKMGRS.
Due to the nature of lockmgrs, adaptive spinning needs to be
selectively enabled for any interested lockmgr.
The support is bi-directional, or, in other words, it works whether
the lock is held in read or write mode. In particular, the
read path can be tuned further using the sysctls
debug.lockmgr.retries and debug.lockmgr.loops. Ideally, such sysctls
should be axed or compiled out before release.

Additionally, note that adaptive spinning doesn't cope well with
LK_SLEEPFAIL. The reason is that many (and probably all) consumers
of LK_SLEEPFAIL are mainly interested in knowing if the interlock was
dropped or not in order to reacquire it and re-test initial conditions.
This directly interacts with adaptive spinning because lockmgr needs
to drop the interlock while spinning in order to avoid a deadlock
(further details in the comments inside the patch).

Final note: finding someone willing to help tune this with
relevant workloads would be both very important and appreciated.

Tested by: jeff, pho
Requested by: many


# f0830182 02-Jun-2009 Attilio Rao <attilio@FreeBSD.org>

Handle lock recursion differently by always checking against LO_RECURSABLE
instead of the lock's own flag itself.

Tested by: pho


# a5aedd68 26-May-2009 Stacey Son <sson@FreeBSD.org>

Add the OpenSolaris dtrace lockstat provider. The lockstat provider
adds probes for mutexes, reader/writer and shared/exclusive locks to
gather contention statistics and other locking information for
dtrace scripts, the lockstat(1M) command and other potential
consumers.

Reviewed by: attilio jhb jb
Approved by: gnn (mentor)


# e5023dd9 12-May-2009 Edward Tomasz Napierala <trasz@FreeBSD.org>

Add missing 'break' statement.

Found with: Coverity Prevent(tm)
CID: 3919


# 1723a064 15-Mar-2009 Jeff Roberson <jeff@FreeBSD.org>

- Wrap lock profiling state variables in #ifdef LOCK_PROFILING blocks.


# 04a28689 14-Mar-2009 Jeff Roberson <jeff@FreeBSD.org>

- Call lock_profile_release when we're transitioning a lock to be owned by
LK_KERNPROC.

Discussed with: attilio


# 8941aad1 06-Feb-2009 John Baldwin <jhb@FreeBSD.org>

Tweak the output of VOP_PRINT/vn_printf() some.
- Align the fifo output in fifo_print() with other vn_printf() output.
- Remove the leading space from lockmgr_printinfo() so its output lines up
in vn_printf().
- lockmgr_printinfo() now ends with a newline, so remove an extra newline
from vn_printf().


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 41313430 10-Sep-2008 John Baldwin <jhb@FreeBSD.org>

Teach WITNESS about the interlocks used with lockmgr. This removes a bunch
of spurious witness warnings since lockmgr grew witness support. Before
this, every time you passed an interlock to a lockmgr lock WITNESS treated
it as a LOR.

Reviewed by: attilio


# 814f26da 22-Aug-2008 John Baldwin <jhb@FreeBSD.org>

Use |= rather than += when aggregating requests to wakeup the swapper.
What we really want is an inclusive or of all the requests, and += can
in theory roll over to 0.
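
The rollover concern can be made concrete with a small userspace sketch.
It is not the kernel code: the helper names are hypothetical, and a
deliberately narrow uint8_t accumulator is assumed so that the wraparound
is well defined and easy to observe (the kernel aggregates into a wider
integer, where the same problem is only theoretical).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical aggregation of per-call results, where each entry is
 * nonzero if that call found a reason to wake the swapper.
 */
static uint8_t
aggregate_sum(const uint8_t *req, int n)
{
	uint8_t acc = 0;

	for (int i = 0; i < n; i++)
		acc += req[i];	/* wraps modulo 256: the answer can be lost */
	return (acc);
}

static uint8_t
aggregate_or(const uint8_t *req, int n)
{
	uint8_t acc = 0;

	for (int i = 0; i < n; i++)
		acc |= req[i];	/* stays nonzero once any request is set */
	return (acc);
}
```

With 256 requests that each report 1, the summing variant wraps back to 0
while the inclusive-or variant still reports that a wakeup is needed.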


# da7bbd2c 05-Aug-2008 John Baldwin <jhb@FreeBSD.org>

If a thread that is swapped out is made runnable, then the setrunnable()
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup(). When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).

With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup. However, this required
grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.

Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been released. The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up. The end result is that consumers of the sleepq API such as
*sleep/wakeup, condition variables, sx locks, and lockmgr, have to wakeup
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().

Discussed with: jeff
Glanced at by: sam
Tested by: Jurgen Weber jurgen - ish com au
MFC after: 2 weeks
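
The deferral pattern described above can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the kernel
implementation: every sk_* name is a hypothetical stand-in, the "queue
lock" is a plain flag, and the broadcast stub simply reports whether any
woken waiter was swapped out. What the sketch shows is the ordering: the
return value is recorded while the queue lock is held, and the swapper is
only kicked once that lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>

static bool sk_queue_locked;	/* stand-in for the sleep queue lock */
static int sk_swapper_wakeups;	/* how many times the swapper was kicked */

static void
sk_queue_lock(void)
{
	sk_queue_locked = true;
}

static void
sk_queue_release(void)
{
	sk_queue_locked = false;
}

/* Stand-in for a broadcast: nonzero means "the swapper must be woken". */
static int
sk_queue_broadcast(int swapped_out_waiters)
{
	assert(sk_queue_locked);
	return (swapped_out_waiters > 0);
}

/* Waking the swapper takes other locks, so it must not nest here. */
static void
sk_kick_swapper(void)
{
	assert(!sk_queue_locked);
	sk_swapper_wakeups++;
}

/* Consumer-side pattern: remember the result, release, then kick. */
static void
sk_wakeup(int swapped_out_waiters)
{
	int wakeup_swapper;

	sk_queue_lock();
	wakeup_swapper = sk_queue_broadcast(swapped_out_waiters);
	sk_queue_release();
	if (wakeup_swapper)
		sk_kick_swapper();
}
```

Because the kick happens outside the queue lock, the lock order reversal
described in the message cannot arise in this arrangement.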


# 96f1567f 25-Jul-2008 Konstantin Belousov <kib@FreeBSD.org>

s/alredy/already/ in the comments and the log message.


# 5047a8fd 25-May-2008 Attilio Rao <attilio@FreeBSD.org>

The "if" semantic is not needed, just fix this.


# 22dd228d 12-Apr-2008 Attilio Rao <attilio@FreeBSD.org>

Use a "rel" memory barrier for disowning the lock as it comes from an
exclusive locking operation.


# e5f94314 12-Apr-2008 Attilio Rao <attilio@FreeBSD.org>

- Re-introduce WITNESS support for lockmgr. The only difference from the
old implementation is that lockmgr*() functions now accept the
LK_NOWITNESS flag, which skips order checking for that particular call.
- Remove a useless stub in witness_checkorder() (the above check
never allows it to be reached) and allow witness_upgrade() to accept
non-try operations too.


# 872b7289 12-Apr-2008 Attilio Rao <attilio@FreeBSD.org>

- Remove a stale comment.
- Add an extra assertion in order to catch malformed requested operations.


# e0f62984 07-Apr-2008 Attilio Rao <attilio@FreeBSD.org>

- Use a different encoding for lockmgr options: make them encoded by
bit in order to allow per-bit checks on the options flag, in particular
in the consumers code [1]
- Re-enable the check against TDP_DEADLKTREAT as the anti-waiters
starvation patch allows exclusive waiters to override new shared
requests.

[1] Requested by: pjd, jeff
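
A minimal sketch of what a per-bit encoding buys the consumers follows.
The SK_* names and values are hypothetical and are not the real LK_*
definitions; the sketch only assumes that each operation and each
modifier occupies its own bit, so that a caller can test one option with
a mask and a sanity check can verify that exactly one operation was
requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operation bits: exactly one must be set per request. */
#define	SK_SHARED	0x0001
#define	SK_EXCLUSIVE	0x0002
#define	SK_RELEASE	0x0004
#define	SK_TYPE_MASK	(SK_SHARED | SK_EXCLUSIVE | SK_RELEASE)

/* Hypothetical modifier bits: any combination may accompany a request. */
#define	SK_NOWAIT	0x0100
#define	SK_SLEEPFAIL	0x0200

/* Per-bit check, usable directly in consumer code. */
static bool
sk_wants(int flags, int bit)
{
	return ((flags & bit) != 0);
}

/* A request is well formed if exactly one operation bit is set. */
static bool
sk_valid_op(int flags)
{
	int op = flags & SK_TYPE_MASK;

	return (op != 0 && (op & (op - 1)) == 0);
}
```

With an enumerated (non-bit) encoding, a test such as
sk_wants(flags, SK_EXCLUSIVE) would instead require masking out the
operation field and comparing it for equality.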


# 047dd67e 06-Apr-2008 Attilio Rao <attilio@FreeBSD.org>

Optimize lockmgr in order to get rid of the pool mutex interlock, of the
state transitioning flags and of msleep(9) calls.
Use, instead, an algorithm very similar to what sx(9) and rwlock(9)
already do and direct accesses to the sleepqueue(9) primitive.

In order to avoid writer starvation a mechanism very similar to what
rwlock(9) uses now is implemented, with the corresponding per-thread
shared lockmgrs counter.

This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and
lockmgr_args_rw(). These two are like the 2 "normal" versions, but they
both accept a rwlock as interlock. In order to realize this, the general
lockmgr manager function "__lockmgr_args()" has been implemented through
the generic lock layer. It supports all the blocking primitives, but
currently only these 2 mappers live.

The patch drops the support for WITNESS for now, but it will probably be
added back soon. Also, there is a small race in the draining code which is
also present in the current CVS stock implementation: if some sharers,
once they wake up, are in the runqueue they can contend for the lock with
the exclusive drainer. This is hard to fix, but the now committed
code mitigates this issue a lot better than the (past) CVS version.
In addition, the KA_HELD and KA_UNHELD assertions have been made mute
because they are dangerous and will no longer be supported
soon.

In order to avoid namespace pollution, stack.h is split into two
parts: one which includes only the "struct stack" definition (_stack.h)
and one defining the KPI. In this way, newly added _lockmgr.h can
just include _stack.h.

The kernel ABI is heavily changed by this commit (the now committed
version of "struct lock" is a lot smaller than the previous one) and
the KPI is broken by the lockmgr_rw() / lockmgr_args_rw() introduction,
so manpages and __FreeBSD_version will be updated accordingly.

Tested by: kris, pho, jeff, danger
Reviewed by: jeff
Sponsored by: Google, Summer of Code program 2007


# 7fbfba7b 01-Mar-2008 Attilio Rao <attilio@FreeBSD.org>

- Handle buffer lock waiters count directly in the buffer cache instead
of relying on the lockmgr support [1]:
* bump the waiters only if the interlock is held
* let brelvp() return the waiters count
* rely on brelvp() instead of BUF_LOCKWAITERS() in order to check
for the waiters number
- Remove a namespace pollution introduced recently with lockmgr.h
including lock.h by including lock.h directly in the consumers and
making it mandatory for using lockmgr.
- Modify flags accepted by lockinit():
* introduce LK_NOPROFILE which disables lock profiling for the
specified lockmgr
* introduce LK_QUIET which disables ktr tracing for the specified
lockmgr [2]
* disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it
can only be used on a per-instance basis
- Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer
used

This patch breaks the KPI, so __FreeBSD_version will be bumped and manpages
updated by further commits. Additionally, the 'struct buf' changes result in
a disturbed ABI as well.

[2] Really, currently there is no ktr tracing in the lockmgr, but it
will be added soon.

[1] Submitted by: kib
Tested by: pho, Andrea Barberio <insomniac at slackware dot it>


# 81c794f9 25-Feb-2008 Attilio Rao <attilio@FreeBSD.org>

Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is
always curthread.

As KPI gets broken by this patch, manpages and __FreeBSD_version will be
updated by further commits.

Tested by: Andrea Barberio <insomniac at slackware dot it>


# 24463dbb 15-Feb-2008 Attilio Rao <attilio@FreeBSD.org>

- Introduce lockmgr_args() in the lockmgr space. This function performs
the same operation as lockmgr() but accepts a custom wmesg, prio and
timo for the particular lock instance, overriding the default values
lkp->lk_wmesg, lkp->lk_prio and lkp->lk_timo.
- Use lockmgr_args() in order to implement BUF_TIMELOCK()
- Cleanup BUF_LOCK()
- Remove LK_INTERNAL as it is no longer used in the lockmgr namespace

Tested by: Andrea Barberio <insomniac at slackware dot it>


# 84887fa3 13-Feb-2008 Attilio Rao <attilio@FreeBSD.org>

- Add real assertions to lockmgr locking primitives.
A couple of notes for this:
* WITNESS support, when enabled, is only used for shared locks in order
to avoid problems with the "disowned" locks
* KA_HELD and KA_UNHELD only exist in the lockmgr namespace in order
to assert for a generic thread (not curthread) owning or not the
lock. Really, this kind of check is bogus but it seems very
widespread in the consumers code. So, for the moment, we cater to this
untrusted behaviour, until the consumers are fixed and the
options can be removed (hopefully during 8.0-CURRENT lifecycle)
* Implementing KA_HELD and KA_UNHELD (not supported natively by
WITNESS) made necessary the introduction of LA_MASKASSERT which
specifies the range for default lock assertion flags
* In other respects, lockmgr_assert() follows exactly what other
locking primitives offer for this operation.

- Build real assertions for buffer cache locks on the top of
lockmgr_assert(). They can be used with the BUF_ASSERT_*(bp)
paradigm.

- Add checks at lock destruction time and use a cookie for verifying
lock integrity at any operation.

- Redefine BUF_LOCKFREE() in order to not use a direct assert but
let it rely on the aforementioned destruction time check.

The KPI is evidently broken, so a __FreeBSD_version bump and
manpage update are necessary and will be committed soon.

Side note: lockmgr_assert() will be used soon in order to implement
real assertions in the vnode namespace replacing the legacy and still
bogus "VOP_ISLOCKED()" way.

Tested by: kris (earlier version)
Reviewed by: jhb


# 2433c488 08-Feb-2008 Attilio Rao <attilio@FreeBSD.org>

Convert all explicit instances of VOP_ISLOCKED(arg, NULL) into
VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should
only accept curthread as argument; this will lead to axing the additional
argument from both functions, making the code cleaner.

Reviewed by: jeff, kib


# 9032b51e 06-Feb-2008 Attilio Rao <attilio@FreeBSD.org>

td cannot be NULL in that place, so just axe out the check.


# 6efc8a16 05-Feb-2008 Attilio Rao <attilio@FreeBSD.org>

Add WITNESS support to lockmgr locking primitive.
This support tries to be as parallel as possible with other locking
primitives, but there are differences; more specifically:
- The base witness support is already equipped for allowing lock
duplication acquisition as lockmgr relies on this.
- In the case of lockmgr_disown() the lock is seen as unlocked by witness
even if it is still held by the "kernel context"
- In the case of upgrading we can have 3 different situations:
* Total unlocking of the shared lock and nothing else
* Real witness upgrade if the owner is the first upgrader
* Shared unlocking and exclusive locking if the owner is not the first
upgrader but it is still allowed to upgrade
- LK_DRAIN is basically handled like an exclusive acquisition

Additionally, new options LK_NODUP and LK_NOWITNESS can now be used with
lockinit(): LK_NOWITNESS disables WITNESS for the specified lock while
LK_NODUP enables duplicated locks tracking. This will require a manpage
update and a __FreeBSD_version bump (addressed by further commits).

This patch also fixes a problem occurring if a lockmgr is held in
exclusive mode and the same owner tries to acquire it in shared mode:
currently there is a spurious shared locking acquisition while what
we really want is a lock downgrade. Probably, this situation can be
better served by failing with an EDEADLK errno return.

Side note: first testing on this patch already revealed several LORs,
so please expect LOR cascades until resolved. NTFS also is
reported broken by the WITNESS introduction. BTW, NTFS is exposing a lock
leak which needs to be fixed, and this patch can help with that if
rightly tweaked.

Tested by: kris, yar, Scot Hetzel <swhetzel at gmail dot com>


# 0e9eb108 23-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

Cleanup lockmgr interface and exported KPI:
- Remove the "thread" argument from the lockmgr() function as it is
always curthread now
- Axe lockcount() function as it is no longer used
- Axe LOCKMGR_ASSERT() as it is really bogus and not currently used.
Hopefully this will soon be replaced by something suitable.
- Remove the prototype for dumplockinfo() as the function is no longer
present

Additionally:
- Introduce a KASSERT() in lockstatus() in order to let it accept only
curthread or NULL as they should only be passed
- Do a little bit of style(9) cleanup on lockmgr.h

The KPI is heavily broken by this change, so manpages and
__FreeBSD_version will be modified accordingly by further commits.

Tested by: matteo


# d1127e66 11-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

lockmgr() function will return successfully when trying to work under
panic but it won't actually lock anything.
This can lead some paths to reach lockmgr_disown() with an inconsistent
lock, which will trigger the related assertions.

Fix those in order to recognize the panic situation and not trigger.

Reported by: pho
Submitted by: kib


# 6edbb3ee 08-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

Fix a last second typo about recent lockmgr_disown() introduction.


# d7a7e179 08-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

Remove explicit calling of lockmgr() with the NULL argument.
Now, lockmgr() function can only be called passing curthread and the
KASSERT() is upgraded accordingly.

In order to support on-the-fly owner switching, the new function
lockmgr_disown() has been introduced and gets used in BUF_KERNPROC().
The KPI is thus changed and the FreeBSD version will be bumped soon.
Unlike the previous code, we assume the idle thread cannot try to
acquire the lockmgr as it cannot sleep, so lose the related check[1]
in BUF_KERNPROC().

Tested by: kris

[1] kib asked for a KASSERT in the lockmgr_disown() about this
condition, but after thinking about it, as this is a well known general
rule, I found it not really necessary.


# 100f2415 27-Dec-2007 Attilio Rao <attilio@FreeBSD.org>

Trim out the now unused option LK_EXCLUPGRADE from the lockmgr namespace.
This option just adds complexity and the new implementation will no longer
support it, so axing it now that it is unused is probably the
best idea.

FreeBSD version is bumped in order to reflect the KPI breakage introduced
by this patch.

In the ports tree, kris found that only old OSKit code uses it, but as
it is thought to work only on the 2.x kernel series, version bumping will
solve any problem.


# 7a1d78fa 27-Dec-2007 Attilio Rao <attilio@FreeBSD.org>

In order to avoid a huge class of deadlocks (in particular in interactions
with the interlock), owner of the lock should be only curthread or at
least, for its limited usage, NULL which identifies LK_KERNPROC.

The thread "extra argument" for the lockmgr interface is going to be
removed in the near future, but for the moment, just let kernel run for
some days with this check on in order to find potential deadlocking
places around the kernel and fix them.


# 9ccca7d1 01-Dec-2007 Robert Watson <rwatson@FreeBSD.org>

Modify stack(9) stack_print() and stack_sbuf_print() routines to use new
linker interfaces for looking up function names and offsets from
instruction pointers. Create two variants of each call: one that is
"DDB-safe" and avoids locking in the linker, and one that is safe for
use in live kernels, by virtue of observing locking, and in particular
safe when kernel modules are being loaded and unloaded simultaneous to
their use. This will allow them to be used outside of debugging
contexts.

Modify two of three current stack(9) consumers to use the DDB-safe
interfaces, as they run in low-level debugging contexts, such as inside
lockmgr(9) and the kernel memory allocator.

Update man page.


# 2c2bebfc 23-Nov-2007 Attilio Rao <attilio@FreeBSD.org>

transferlockers() is a very dangerous and hack-ish function as waiters
should never be moved from one lock to another.
As, luckily, nothing in our tree is using it, axe the function.

This breaks the lockmgr KPI, so interested third-party modules should update
their source code with an appropriate replacement.

Ok'ed by: ups, rwatson
MFC after: 3 days


# f9721b43 18-Nov-2007 Attilio Rao <attilio@FreeBSD.org>

Expand lock class with the "virtual" function lc_assert which will offer
a unified way for all the lock primitives to express lock assertions.
Currently, lockmgrs and rmlocks don't have assertions, so just panic in
that case.
This will be a base for more callout improvements.

Ok'ed by: jhb, jeff


# 431f8906 13-Nov-2007 Julian Elischer <julian@FreeBSD.org>

Generally we are interested in what thread did something as
opposed to what process. Since threads by default have the name of the
process unless overwritten with more useful information, just print the
thread name instead.


# c91fcee7 18-May-2007 John Baldwin <jhb@FreeBSD.org>

Move lock_profile_object_{init,destroy}() into lock_{init,destroy}().


# ab2dab16 30-Mar-2007 John Baldwin <jhb@FreeBSD.org>

- Use lock_init/lock_destroy() to setup the lock_object inside of lockmgr.
We can now use LOCK_CLASS() as a stronger check in lockmgr_chain() as a
result. This required putting back lk_flags as lockmgr's use of flags
conflicted with other flags in lo_flags otherwise.
- Tweak 'show lock' output for lockmgr to match sx, rw, and mtx.


# aa89d8cd 21-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Rename the 'mtx_object', 'rw_object', and 'sx_object' members of mutexes,
rwlocks, and sx locks to 'lock_object'.


# 6d257b6e 21-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Handle the case when a thread is blocked on a lockmgr lock with LK_DRAIN
in DDB's 'show sleepchain'.

MFC after: 3 days


# 6e21afd4 09-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Add two new function pointers 'lc_lock' and 'lc_unlock' to lock classes.
These functions are intended to be used to drop a lock and then reacquire
it when doing a sleep such as msleep(9). Both functions accept a
'struct lock_object *' as their first parameter. The 'lc_unlock' function
returns an integer that is then passed as the second parameter to the
subsequent 'lc_lock' function. This can be used to communicate state.
For example, sx locks and rwlocks use this to indicate if the lock was
share/read locked vs exclusive/write locked.

Currently, spin mutexes and lockmgr locks do not provide working lc_lock
and lc_unlock functions.
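
The unlock/relock handshake described above can be sketched in userspace
as follows. This is not the kernel's struct lock_class: the sk_* types
are hypothetical, the "lock" is just a pair of counters with no real
synchronization, and the convention that 1 means "was exclusively held"
is an assumption made for the example. The sketch shows how the integer
returned by the unlock hook lets generic code reacquire the lock in the
same mode without knowing the lock's type.

```c
#include <assert.h>

/* Hypothetical lock object: tracks holds only, no real synchronization. */
struct sk_lock_object {
	int shared;	/* number of shared holds */
	int exclusive;	/* 1 if exclusively held */
};

/* Hypothetical lock class carrying the two hooks. */
struct sk_lock_class {
	const char *lc_name;
	void (*lc_lock)(struct sk_lock_object *lo, int how);
	int (*lc_unlock)(struct sk_lock_object *lo);
};

static void
sk_rw_lock(struct sk_lock_object *lo, int how)
{
	if (how != 0)
		lo->exclusive = 1;
	else
		lo->shared++;
}

/* Returns 1 if the lock was exclusively held, 0 if it was shared. */
static int
sk_rw_unlock(struct sk_lock_object *lo)
{
	if (lo->exclusive) {
		lo->exclusive = 0;
		return (1);
	}
	assert(lo->shared > 0);
	lo->shared--;
	return (0);
}

static const struct sk_lock_class sk_rw_class = {
	.lc_name = "sketch rw",
	.lc_lock = sk_rw_lock,
	.lc_unlock = sk_rw_unlock,
};

/* Generic sleep path: drop the lock, "sleep", reacquire in the same mode. */
static void
sk_sleep_with(const struct sk_lock_class *lc, struct sk_lock_object *lo)
{
	int how;

	how = lc->lc_unlock(lo);
	/* ... the thread would sleep here ... */
	lc->lc_lock(lo, how);
}
```

The generic sleep path never inspects the lock itself; the mode travels
only through the value handed from one hook to the other.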


# 3ff6d229 09-Mar-2007 John Baldwin <jhb@FreeBSD.org>

Use C99-style struct member initialization for lock classes.


# fe68a916 26-Feb-2007 Kip Macy <kmacy@FreeBSD.org>

general LOCK_PROFILING cleanup

- only collect timestamps when a lock is contested - this reduces the overhead
of collecting profiles from 20x to 5x

- remove unused function from subr_lock.c

- generalize cnt_hold and cnt_lock statistics to be kept for all locks

- NOTE: rwlock profiling generates invalid statistics (and most likely always has)
someone familiar with that should review


# 61bd5e21 12-Nov-2006 Kip Macy <kmacy@FreeBSD.org>

track lock class name in a way that doesn't break WITNESS


# 54e57f76 11-Nov-2006 Kip Macy <kmacy@FreeBSD.org>

show lock class in profiling output for default case where type is not specified when initializing the lock

Approved by: scottl (standing in for mentor rwatson)


# 7c0435b9 10-Nov-2006 Kip Macy <kmacy@FreeBSD.org>

MUTEX_PROFILING has been generalized to LOCK_PROFILING. We now profile
wait (time waited to acquire) and hold times for *all* kernel locks. If
the architecture has a system synchronized TSC, the profiling code will
use that - thereby minimizing profiling overhead. Large chunks of profiling
code have been moved out of line, the overhead measured on the T1 for when
it is compiled in but not enabled is < 1%.

Approved by: scottl (standing in for mentor rwatson)
Reviewed by: des and jhb


# 04aa807c 01-Oct-2006 Tor Egge <tegge@FreeBSD.org>

If the buffer lock has waiters after the buffer has changed identity then
getnewbuf() needs to drop the buffer in order to wake waiters that might
sleep on the buffer in the context of the old identity.


# 462a7add 15-Aug-2006 John Baldwin <jhb@FreeBSD.org>

Add a new 'show sleepchain' ddb command similar to 'show lockchain' except
that it operates on lockmgr and sx locks. This can be useful for tracking
down vnode deadlocks in VFS for example. Note that this command is a bit
more fragile than 'show lockchain' as we have to poke around at the
wait channel of a thread to see if it points to either a struct lock or
a condition variable inside of a struct sx. If td_wchan points to
something unmapped, then this command will terminate early due to a fault,
but no harm will be done.


# be6847d7 15-Aug-2006 John Baldwin <jhb@FreeBSD.org>

Add a 'show lockmgr' command that dumps the relevant details of a lockmgr
lock.


# 338ae526 14-Jul-2006 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Remove duplicated #include.


# 49bdcff5 23-Dec-2005 Jeff Roberson <jeff@FreeBSD.org>

- Remove an unused include.

Submitted by: Antoine Brodin <antoine.brodin@laposte.net>


# c30bf5c3 02-Oct-2005 Robert Watson <rwatson@FreeBSD.org>

Include kdb.h so that kdb_active is declared regardless of KDB being
included in the kernel.

MFC after: 0 days


# 2b59d50c 27-Sep-2005 Robert Watson <rwatson@FreeBSD.org>

In lockstatus(), don't lock and unlock the interlock when testing the
sleep lock status while kdb_active, or we risk contending with the
mutex on another CPU, resulting in a panic when using "show
lockedvnods" while in DDB.

MFC after: 3 days
Reviewed by: jhb
Reported by: kris


# 1f71de49 02-Sep-2005 Suleiman Souhlal <ssouhlal@FreeBSD.org>

Print out a warning and a backtrace if we try to unlock a lockmgr that
we do not hold.

Glanced at by: phk
MFC after: 3 days


# e37a4994 29-Aug-2005 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Add 'depth' argument to CTRSTACK() macro, which allows reducing the number
of ktr slots used. If 'depth' is equal to 0, the whole stack will be
logged, just like before.


# 7499fd8d 02-Aug-2005 Jeff Roberson <jeff@FreeBSD.org>

- Fix a problem that slipped through review; the stack member of the lockmgr
structure should have the lk_ prefix.
- Add stack_print(lkp->lk_stack) to the information printed with
lockmgr_printinfo().


# e8ddb61d 02-Aug-2005 Jeff Roberson <jeff@FreeBSD.org>

- Replace the series of DEBUG_LOCKS hacks which tried to save the vn_lock
caller by saving the stack of the last locker/unlocker in lockmgr. We
also put the stack in KTR at the moment.

Contributed by: Antoine Brodin <antoine.brodin@laposte.net>


# 436901a8 11-Apr-2005 Jeff Roberson <jeff@FreeBSD.org>

- Differentiate two UPGRADE panics so I have a better idea of what's going
on here.


# a96ab770 06-Apr-2005 Jeff Roberson <jeff@FreeBSD.org>

- Remove dead code.


# 20728d8f 03-Apr-2005 Jeff Roberson <jeff@FreeBSD.org>

- Slightly restructure acquire() so I can add more ktr information and
an assert to help find two strange bugs.
- Remove some nearby spls.


# c4c0ec5b 30-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- Add a LK_NOSHARE flag which forces all shared lock requests to be
treated as exclusive lock requests.

Sponsored by: Isilon Systems, Inc.


# b641353e 30-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- Remove apause(). It makes no sense with our present mutex implementation
since simply unlocking a mutex does not ensure that one of the waiters
will run and acquire it. We're more likely to reacquire the mutex
before anyone else has a chance. It has also bit me three times now, as
it's not safe to drop the interlock before sleeping in many cases.

Sponsored by: Isilon Systems, Inc.


# bf5c2a19 27-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- Don't bump the count twice in the LK_DRAIN case.

Sponsored by: Isilon Systems, Inc.


# f158df07 24-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- Restore COUNT() in all of its original glory. Don't make it dependent
on DEBUG as ufs will soon grow a dependency on this count.

Discussed with: bde
Sponsored by: Isilon Systems, Inc.


# 92e251ca 24-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- Complete the implementation of td_locks. Track the number of outstanding
lockmgr locks that this thread owns. This is complicated due to
LK_KERNPROC and because lockmgr tolerates unlocking an unlocked lock.

Sponsored by: Isilon Systems, Inc.


# f5f0da0a 15-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- transferlockers() requires the interlock to be SMP safe.

Sponsored by: Isilon Systems, Inc.


# 04186764 25-Jan-2005 Jeff Roberson <jeff@FreeBSD.org>

- Include LK_INTERLOCK in LK_EXTFLG_MASK so that it makes its way into
acquire.
- Correct the condition that causes us to skip apause() to only require
the presence of LK_INTERLOCK.

Sponsored by: Isilon Systems, Inc.


# 41bd6c15 24-Jan-2005 Jeff Roberson <jeff@FreeBSD.org>

- Do not use APAUSE if LK_INTERLOCK is set. We lose synchronization
if the lockmgr interlock is dropped after the caller's interlock
is dropped.
- Change some lockmgr KTRs to be slightly more helpful.

Sponsored By: Isilon Systems, Inc.


# 9454b2d8 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for copyright notices, minor format tweaks as necessary


# d8b8e875 29-Nov-2004 Paul Saab <ps@FreeBSD.org>

When upgrading the shared lock to an exclusive lock, if we discover
that the exclusive lock is already held, then we call panic. Don't
clobber internal lock state before panic'ing. This change improves
debugging if this case were to happen.

Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson


# 4cef6d5a 26-Aug-2004 Alexander Kabaev <kan@FreeBSD.org>

Reintroduce slightly modified patch from kern/69964. Check for
LK_HAVE_EXL in both acquire invocations.

MFC after: 5 days


# cffdaf2d 22-Aug-2004 Alexander Kabaev <kan@FreeBSD.org>

Temporarily back out r1.74 as it seems to cause a number of regressions
according to numerous reports. It might get reintroduced some time later
when an exact failure mode is understood better.


# c8b87621 16-Aug-2004 Alexander Kabaev <kan@FreeBSD.org>

Upgrading a lock does not play well together with acquiring an exclusive lock
and can lead to two threads being granted exclusive access. Check that no one
has the same lock in exclusive mode before proceeding to acquire it.

The LK_WANT_EXCL and LK_WANT_UPGRADE bits act as mini-locks and can block
other threads. Normally this is not a problem since the mini locks are
upgraded to full locks and the release of the locks will unblock the other
threads. However if a thread resets the bits without obtaining a full lock
other threads are not awoken. Add missing wakeups for these cases.

PR: kern/69964
Submitted by: Stephan Uphoff <ups at tree dot com>
Very good catch by: Stephan Uphoff <ups at tree dot com>


# ff381670 23-Jul-2004 Robert Watson <rwatson@FreeBSD.org>

Don't include a "\n" in KTR output, it confuses automatic parsing.


# fa2a4d05 02-Jun-2004 Tim J. Robbins <tjr@FreeBSD.org>

Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by: jhb


# c969c60c 05-Jan-2004 Alexander Kabaev <kan@FreeBSD.org>

Add pid to the info printed in lockmgr_printinfo. This makes VFS
diagnostic messages slightly more useful.


# 6ff1481d 15-Jul-2003 Don Lewis <truckman@FreeBSD.org>

Rearrange the SYSINIT order to call lockmgr_init() earlier so that
the runtime lockmgr initialization code in lockinit() can be eliminated.

Reviewed by: jhb


# 857d9c60 12-Jul-2003 Don Lewis <truckman@FreeBSD.org>

Extend the mutex pool implementation to permit the creation and use of
multiple mutex pools with different options and sizes. Mutex pools can
be created with either the default sleep mutexes or with spin mutexes.
A dynamically created mutex pool can now be destroyed if it is no longer
needed.

Create two pools by default, one that matches the existing pool that
uses the MTX_NOWITNESS option that should be used for building higher
level locks, and a new pool with witness checking enabled.

Modify the users of the existing mutex pool to use the appropriate pool
in the new implementation.

Reviewed by: jhb


# 677b542e 10-Jun-2003 David E. O'Brien <obrien@FreeBSD.org>

Use __FBSDID().


# c06394f5 11-Mar-2003 John Baldwin <jhb@FreeBSD.org>

Use the KTR_LOCK mask for logging events via KTR in lockmgr() rather
than KTR_LOCKMGR. lockmgr locks are locks just like other locks.


# 26306795 04-Mar-2003 John Baldwin <jhb@FreeBSD.org>

Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls
to WITNESS_WARN().


# 17661e5a 24-Feb-2003 Jeff Roberson <jeff@FreeBSD.org>

- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
these fields instead.
- Hold the vnode interlock while locking bufs on the clean/dirty queues.
This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by: arch, mckusick


# 5e8feb5b 16-Feb-2003 Jeff Roberson <jeff@FreeBSD.org>

- Add a WITNESS_SLEEP() for the appropriate cases in lockmgr().


# 822ded67 05-Feb-2003 Julian Elischer <julian@FreeBSD.org>

The lockmanager has to keep track of locks per thread, not per process.

Submitted by: david Xu (davidxu@)
Reviewed by: jhb@


# 6f8132a8 31-Jan-2003 Julian Elischer <julian@FreeBSD.org>

Reversion of commit by Davidxu plus fixes since applied.

I'm not convinced there is anything major wrong with the patch but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.


# 0dbb100b 26-Jan-2003 David Xu <davidxu@FreeBSD.org>

Move UPCALL related data structure out of kse, introduce a new
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.

A thread that owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and take those contexts back
to userland. Any thread without an upcall structure has to export its
context and exit at the user boundary.

Any thread running in user mode owns an upcall structure. When it enters
the kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in the kernel, a new UPCALL thread is created and
the upcall structure is transferred to the new UPCALL thread. If the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in the kernel, no UPCALL thread will be created.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit; when all upcalls in a ksegrp are removed, the group is
automatically shut down. An upcall owner thread also exits when the process
is in the exiting state. When an owner thread exits, the upcall it owns is
also removed.

KSE is a pure scheduler entity. It represents a virtual cpu. When a thread
is running, it always has a KSE associated with it. The scheduler is free to
assign a KSE to a thread according to thread priority; if the thread priority
changes, the KSE can be moved from one thread to another.

When a ksegrp is created, there are always N KSEs created in the group, where
N is the number of physical cpus in the current system. This makes it
possible that even if a userland UTS is only single-CPU safe, threads in the
kernel can still execute on different cpus in parallel. Userland calls
kse_create to add more upcall structures to a ksegrp to increase concurrency
in userland itself; the kernel is not restricted by the number of upcalls
userland provides.

The code hasn't been tested under SMP by author due to lack of hardware.

Reviewed by: julian


# c6964d3b 30-Nov-2002 Kirk McKusick <mckusick@FreeBSD.org>

Remove a race condition / deadlock from snapshots. When
converting from individual vnode locks to the snapshot
lock, be sure to pass any waiting processes along to the
new lock as well. This transfer is done by a new function
in the lock manager, transferlockers(from_lock, to_lock);
Thanks to Lamont Granquist <lamont@scriptkiddie.org> for
his help in pounding on snapshots beyond all reason and
finding this deadlock.

Sponsored by: DARPA & NAI Labs.


# 3a096f6c 17-Oct-2002 Kirk McKusick <mckusick@FreeBSD.org>

Have lockinit() initialize the debugging fields of a lock
when DEBUG_LOCKS is defined.

Sponsored by: DARPA & NAI Labs.


# 8302d183 27-Aug-2002 Bruce Evans <bde@FreeBSD.org>

Include <sys/lockmgr.h> for the definitions of the locking interfaces that
are implemented here instead of depending on namespace pollution in
<sys/lock.h>. Fixed nearby include messes (1 disordered include and 1
unused include).


# 93b0017f 25-Aug-2002 Philippe Charnier <charnier@FreeBSD.org>

Replace various spellings with FALLTHROUGH which is lint()able


# 7181624a 29-May-2002 Jeff Roberson <jeff@FreeBSD.org>

Record the file, line, and pid of the last successful shared lock holder. This
is useful as a last effort in debugging file system deadlocks. This is enabled
via 'options DEBUG_LOCKS'


# 6008862b 04-Apr-2002 John Baldwin <jhb@FreeBSD.org>

Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on: i386, alpha, sparc64


# 04858e7e 05-Mar-2002 Eivind Eklund <eivind@FreeBSD.org>

Change wmesg to const char * instead of char *


# 23b59018 20-Dec-2001 Matthew Dillon <dillon@FreeBSD.org>

Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget()
against VM_WAIT in the pageout code. Both fixes involve adjusting
the lockmgr's timeout capability so locks obtained with timeouts do not
interfere with locks obtained without a timeout.

Hopefully MFC: before the 4.5 release


# f2860039 13-Nov-2001 Matthew Dillon <dillon@FreeBSD.org>

Create a mutex pool API for short term leaf mutexes.
Replace the manual mutex pool in kern_lock.c (lockmgr locks) with the new API.
Replace the mutexes embedded in sxlocks with the new API.


# 61d80e90 11-Oct-2001 John Baldwin <jhb@FreeBSD.org>

Add missing includes of sys/ktr.h.


# 6a40ecce 10-Oct-2001 John Baldwin <jhb@FreeBSD.org>

Malloc mutexes pre-zero'd, as random garbage (including 0xdeadcode) may
trigger the check to make sure we don't initialize a mutex twice.


# bce98419 13-Sep-2001 John Baldwin <jhb@FreeBSD.org>

Fix locking on td_flags for TDF_DEADLKTREAT. If the comments in the code
are true that curthread can change during this function, then this flag
needs to become a KSE flag, not a thread flag.


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 3f085c22 10-Aug-2001 John Baldwin <jhb@FreeBSD.org>

If we've panic'd already, then just bail in lockmgr rather than blocking or
possibly panic'ing again.


# 6157b69f 27-Apr-2001 Alfred Perlstein <alfred@FreeBSD.org>

Instead of asserting that a mutex is not still locked after unlocking it,
assert that the mutex is owned and not recursed prior to unlocking it.

This should give a clearer diagnostic when a programming error is caught.


# 98689e1e 20-Apr-2001 Alfred Perlstein <alfred@FreeBSD.org>

Assert that when using an interlock mutex it is not recursed when lockmgr()
is called.

Ok'd by: jhb


# 1375ed7e 13-Apr-2001 Alfred Perlstein <alfred@FreeBSD.org>

convert if/panic -> KASSERT, explain what triggered the assertion


# 1681b00a 07-Apr-2001 Jake Burkholder <jake@FreeBSD.org>

Fix a precedence bug. ! has higher precedence than &.
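
The log does not quote the expression that was fixed, so the following is only a minimal sketch of this class of bug. The flag name and helper functions are hypothetical and do not come from kern_lock.c.

```c
#include <stdbool.h>

#define SKETCH_FLAG_BUSY 0x04u	/* hypothetical flag bit, not a real LK_* flag */

/*
 * Buggy form: unary ! binds tighter than binary &, so this parses as
 * (!flags) & SKETCH_FLAG_BUSY. Since !flags is 0 or 1, masking it with
 * 0x04 always yields 0.
 */
static bool
not_busy_buggy(unsigned flags)
{
	return !flags & SKETCH_FLAG_BUSY;
}

/* Fixed form: test the bit first, then negate the result. */
static bool
not_busy_fixed(unsigned flags)
{
	return !(flags & SKETCH_FLAG_BUSY);
}
```

With these definitions the buggy helper reports "busy" for every input, while the fixed helper returns true exactly when the bit is clear.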


# 635962af 09-Feb-2001 John Baldwin <jhb@FreeBSD.org>

Proc locking.


# 9ed346ba 08-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 1b367556 23-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Convert all simplelocks to mutexes and remove the simplelock implementations.


# d1c1b841 21-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.


# 96fde7da 30-Nov-2000 Jake Burkholder <jake@FreeBSD.org>

Use msleep instead of mtx_exit; tsleep; mtx_enter, which is not safe.


# d8881ca3 20-Oct-2000 John Baldwin <jhb@FreeBSD.org>

- machine/mutex.h -> sys/mutex.h
- The initial lock_mtx mutex used in the lockmgr code is initialized very
early, so use MUTEX_DECLARE() and MTX_COLD.


# 9722d88f 12-Oct-2000 Jason Evans <jasone@FreeBSD.org>

For lockmgr mutex protection, use an array of mutexes that are allocated
and initialized during boot. This avoids bloating sizeof(struct lock).
As a side effect, it is no longer necessary to enforce the assumption that
lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been
removed.

Idea taken from: BSD/OS.
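
A userspace sketch of the general idea, assuming POSIX threads rather than kernel mutexes: a fixed array of mutexes is set up once, and each lock object is mapped to one slot by hashing its address, so the object itself carries no mutex. The pool size, hash, and all names here are hypothetical and are not taken from the commit.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_POOL_SIZE 32	/* hypothetical; must be a power of two */

static pthread_mutex_t sketch_pool[SKETCH_POOL_SIZE];

/* Initialize every mutex in the pool once, e.g. at startup. */
static void
sketch_pool_init(void)
{
	for (size_t i = 0; i < SKETCH_POOL_SIZE; i++)
		pthread_mutex_init(&sketch_pool[i], NULL);
}

/* Map an object address to a pool slot (simple, assumed hash). */
static size_t
sketch_pool_index(const void *obj)
{
	uintptr_t p = (uintptr_t)obj;

	return (size_t)((p >> 4) ^ (p >> 12)) & (SKETCH_POOL_SIZE - 1);
}

/* Return the shared mutex that protects the given object. */
static pthread_mutex_t *
sketch_pool_find(const void *obj)
{
	return &sketch_pool[sketch_pool_index(obj)];
}
```

The same object always maps to the same mutex, and unrelated objects may share one. That sharing is the trade-off for keeping the per-object structure small.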


# a18b1f1d 03-Oct-2000 Jason Evans <jasone@FreeBSD.org>

Convert lockmgr locks from using simple locks to using mutexes.

Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.


# 92b123a0 22-Sep-2000 Paul Saab <ps@FreeBSD.org>

Move MAXCPU from machine/smp.h to machine/param.h to fix breakage
with !SMP kernels. Also, replace NCPUS with MAXCPU since they are
redundant.


# c866ec47 16-Sep-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Make LINT compile.


# db5f635a 16-Mar-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Eliminate the undocumented, experimental, non-delivering and highly
dangerous MAX_PERF option.


# 6bdfe06a 11-Dec-1999 Eivind Eklund <eivind@FreeBSD.org>

Lock reporting and assertion changes.
* lockstatus() and VOP_ISLOCKED() get a new process argument and a new
return value: LK_EXCLOTHER, when the lock is held exclusively by another
process.
* The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them
* Extend the vnode_if.src format to allow more exact specification than
locked/unlocked.

This commit should not do any semantic changes unless you are using
DEBUG_VFS_LOCKS.

Discussed with: grog, mch, peter, phk
Reviewed by: peter


# 99c9d349 10-Nov-1999 Alan Cox <alc@FreeBSD.org>

Correct a locking error in apause: It should always hold
the simple lock when it returns.

Also, eliminate spinning on a uniprocessor. It's pointless.

Submitted by: bde,
Assar Westerlund <assar@sics.se>


# e701df7d 26-Sep-1999 Matthew Dillon <dillon@FreeBSD.org>

Fix process p_locks accounting. Conversions of the owner to LK_KERNPROC
caused p_locks to be improperly accounted.

Submitted by: Tor.Egge@fast.no


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 33638e93 28-Jun-1999 Kirk McKusick <mckusick@FreeBSD.org>

When requesting an exclusive lock with LK_NOWAIT, do not panic
if LK_RECURSIVE is not set, as we will simply return that the
lock is busy and not actually deadlock. This allows processes
to use polling locks against buffers that they may already
hold exclusively locked.
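
The polling behaviour described above has a rough userspace analogue in POSIX threads, sketched here as an illustration only. This is not the lockmgr API: `pthread_mutex_trylock()` on a default mutex stands in for an LK_NOWAIT request, and the names are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a buffer lock; not a lockmgr lock. */
static pthread_mutex_t sketch_buf_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Polling acquire: returns 0 when the lock was taken, or EBUSY when it
 * is already held (even by the caller) instead of blocking or failing
 * hard.
 */
static int
sketch_buf_lock_poll(void)
{
	return pthread_mutex_trylock(&sketch_buf_lock);
}

/* Release a lock obtained by a successful poll. */
static int
sketch_buf_unlock(void)
{
	return pthread_mutex_unlock(&sketch_buf_lock);
}
```

A caller that already holds the lock simply sees the busy result on a second poll and can move on to another buffer.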


# 67812eac 25-Jun-1999 Kirk McKusick <mckusick@FreeBSD.org>

Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.


# 50d3b68c 14-Mar-1999 Julian Elischer <julian@FreeBSD.org>

fix breakage for alphas.
Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>


# beef8a36 11-Mar-1999 Julian Elischer <julian@FreeBSD.org>

This solves a deadlock that can occur when read()ing into a file-mmap()
space. When doing this, it is possible for another process to attempt
to get an exclusive lock on the vnode and deadlock the mmap/read
combination when the uiomove() call tries to obtain a second
shared lock on the vnode. There is still a potential deadlock
situation with write()/mmap().
Submitted by: Matt Dillon <dillon@freebsd.org>
Reviewed by: Luoqi Chen <luoqi@freebsd.org>
Delimited by tags PRE_MATT_MMAP_LOCK and POST_MATT_MMAP_LOCK
in kern/kern_lock.c kern/kern_subr.c


# 15a1057c 20-Jan-1999 Eivind Eklund <eivind@FreeBSD.org>

Add 'options DEBUG_LOCKS', which stores extra information in struct
lock, and add some macros and function parameters to make sure that
the information gets to the point where it can be put in the lock
structure.

While I'm here, add DEBUG_VFS_LOCKS to LINT.


# 219cbf59 09-Jan-1999 Eivind Eklund <eivind@FreeBSD.org>

KNFize, by bde.


# 5526d2d9 08-Jan-1999 Eivind Eklund <eivind@FreeBSD.org>

Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as
discussed on -hackers.

Introduce 'KASSERT(assertion, ("panic message", args))' for simple
check + panic.

Reviewed by: msmith
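
The double parentheses in the KASSERT form above let a two-argument macro forward a whole printf-style argument list. A userspace sketch of that pattern follows. The SKETCH_ names are hypothetical, and the stand-in panic routine records the message and returns so the example can be exercised, whereas a real kernel panic() does not return.

```c
#include <stdarg.h>
#include <stdio.h>

static char sketch_panic_msg[128];	/* last formatted message */
static int sketch_panicked;		/* set once an assertion fails */

/* Stand-in for panic(): formats the message and records the failure. */
static void
sketch_panic(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(sketch_panic_msg, sizeof(sketch_panic_msg), fmt, ap);
	va_end(ap);
	sketch_panicked = 1;
}

/*
 * msg is a parenthesized argument list, e.g. ("bad value %d", v), so
 * "sketch_panic msg" expands to a normal variadic call.
 */
#define SKETCH_KASSERT(exp, msg) do {					\
	if (!(exp))							\
		sketch_panic msg;					\
} while (0)
```

A call such as `SKETCH_KASSERT(count >= 0, ("negative count %d", count));` does nothing when the condition holds and formats the message only on failure.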


# 9fcdafae 26-Nov-1998 Eivind Eklund <eivind@FreeBSD.org>

Staticize.


# ab36c3d3 16-Apr-1998 Bruce Evans <bde@FreeBSD.org>

Really finish supporting compiling with `gcc -ansi'.


# 9b2e5bad 07-Mar-1998 John Dyson <dyson@FreeBSD.org>

Some kern_lock code improvements. Add missing wakeup, and enable
disabling some diagnostics when memory or speed is at a premium.


# e9fe146b 10-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Include SIMPLELOCK_DEBUG functions even if SMP if compiling LINT; give
an error for the combination if _not_ compiling LINT.


# 0b08f5f7 05-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Back out DIAGNOSTIC changes.


# 47cfdb16 04-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Turn DIAGNOSTIC into a new-style option.


# 4a11ca4e 07-Nov-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Remove a bunch of variables which were unused both in GENERIC and LINT.

Found by: -Wunused


# 55b211e3 28-Oct-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes.


# 99448ed1 20-Sep-1997 John Dyson <dyson@FreeBSD.org>

Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the usage
of malloc by half. The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs. Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.


# 0e61ac7b 22-Aug-1997 Poul-Henning Kamp <phk@FreeBSD.org>

typo in comment.


# 891e0f24 18-Aug-1997 John Dyson <dyson@FreeBSD.org>

Allow lockmgr to work without a current process. Disallowing that
was a mistake in the lockmgr rewrite.


# 7cbfd031 17-Aug-1997 Steve Passe <fsmp@FreeBSD.org>

Added includes of smp.h for SMP.
This eliminates a bazillion warnings about implicit s_lock & friends.


# 03e9c6c1 17-Aug-1997 John Dyson <dyson@FreeBSD.org>

Fix kern_lock so that it will work. Additionally, clean-up some of the
VM systems usage of the kernel lock (lockmgr) code. This is a first
pass implementation, and is expected to evolve as needed. The API
for the lock manager code has not changed, but the underlying implementation
has changed significantly. This change should not materially affect
our current SMP or UP code without non-standard parameters being used.


# 248fcb66 04-Aug-1997 Steve Passe <fsmp@FreeBSD.org>

pushed down "volatility" of simplelock to actual int inside the struct.

Submitted by: bde@zeta.org.au, smp@csn.net


# 6898627c 01-Apr-1997 Bruce Evans <bde@FreeBSD.org>

Fixed commented-out Lite2 sysctl debug.lockpausetime.

Removed unused #includes.


# 17a8bb9d 25-Mar-1997 Peter Wemm <peter@FreeBSD.org>

Add missing $Id$
Note; the RCS file has also been reconstructed to have a CSRG vendor branch.


# 356b94e0 25-Mar-1997 Peter Wemm <peter@FreeBSD.org>

Replace original rev 1.3; Author: bde; Date: 1997/02/25 17:24:43;
Fix counting of simplelocks in SIMPLELOCK_DEBUG
Fix style regression


# 4bdb9b11 25-Mar-1997 Peter Wemm <peter@FreeBSD.org>

Replace original rev 1.2; Author: mpp; Date: 1997/02/12 06:52:30
Add missing #include <sys/systm.h>


# a1ce9d5c 25-Mar-1997 Peter Wemm <peter@FreeBSD.org>

Replace original revision 1.1; Author dyson; Date: 1997/02/10 02:28:15
Changes from Lite2:
- DEBUG -> SIMPLELOCK_DEBUG
- cosmetic fixes
- bzero of lock at init time -> explicit init of members.


# 53bf4bb2 25-Mar-1997 Peter Wemm <peter@FreeBSD.org>

Import 4.4BSD-Lite2 onto CSRG branch